Advanced Usage

Adding Custom Simulation

A simulation consists of a pipeline taking in pool configurations and market data sampled as a time-series. For each pool configuration, a run consists of applying the strategy to the given configuration and stream of market data. A report of various metrics can then be created from the results of all runs.

To flexibly handle future use cases, the pipeline concept has not been formalized into a configurable object, but the basic template can be seen in the implementation of the helper function run_pipeline(). It takes a param sampler, a price sampler, and a strategy. The pipeline iterates over the pool with parameters set from the param sampler; for each set of parameters, the strategy is applied to each time-series sample produced by the price sampler.
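The loop structure described above can be sketched in plain Python. This is only an illustration, not the body of run_pipeline(): the function name run_pipeline_sketch, the dict-based pool, and the toy strategy are all assumptions made for the example.

```python
from typing import Callable, Iterable, Iterator


def run_pipeline_sketch(
    pool: dict,
    param_sampler: Iterable[dict],
    price_sampler: Callable[[], Iterator[float]],
    strategy: Callable[[dict, dict, Iterator[float]], dict],
) -> list:
    """Apply the strategy once per parameter set and collect the results."""
    results = []
    for params in param_sampler:
        # Each run gets its own copy of the pool, configured for this run.
        run_pool = {**pool, **params}
        # The strategy consumes a fresh pass over the price time series.
        results.append(strategy(run_pool, params, price_sampler()))
    return results


def mean_price_strategy(pool, params, prices):
    """Toy strategy: walk the whole price series and report its mean."""
    prices = list(prices)
    return {"A": pool["A"], "mean_price": sum(prices) / len(prices)}


param_sets = [{"A": 100}, {"A": 200}]
results = run_pipeline_sketch(
    {"A": 50, "fee": 0.0004},
    param_sets,
    lambda: iter([1.0, 1.5, 0.5]),
    mean_price_strategy,
)
print(results)
```

Each entry in results corresponds to one run, i.e., one set of pool parameters applied to the full price series.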

Typically you would use run_pipeline() by creating a function that:

  1. instantiates PoolMetaDataInterface from a pool address and chain label

  2. creates a SimPool using the pool data

  3. instantiates a param_sampler, price_sampler, and strategy

  4. invokes run_pipeline(), returning result metrics

Other auxiliary arguments may need to be passed in to instantiate all necessary objects.

The main pipeline, which was developed for the specific use case of optimizing Curve pools for the best reward-risk tradeoff, is the volume-limited arbitrage pipeline.

The simple pipeline provides an easier starting point for creating a custom pipeline.

The SimPool interface

To set up arbitrage strategies, the SimPool interface exposes:

  1. price: (method)

  2. trade: (method)

  3. assets: (property)

Given market price(s), any strategy that checks the “price” and then exchanges one asset for another can be implemented. While the name SimPool suggests a pool, this object can be any type of market or venue where assets are exchanged.

For example, one could implement:

class CollateralizedDebtPosition(SimPool):
    """
    A simple Aave-style collateralized debt position.
    """

    def price(self, debt_token, collateral_token, use_fee=True):
        """
        Returns the effective price for collateral from liquidating
        the position.
        """

    def trade(self, debt_token, collateral_token, size):
        """
        Liquidate the position by paying `size` amount of the debt.
        """

    @property
    def assets(self):
        """
        Return a :class:`SimAssets` instance with information on
        the tradable assets (debt and collateral in this example).
        """

The available implementations wrap a Curve pool into an appropriate SimPool, letting strategies define tradable assets more flexibly. Expected use cases taking advantage of these abstractions include trading LP tokens or even baskets of tokens, routing through multiple pools, and trading between two competing pools of different types.
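To make the price/trade/assets trio concrete, here is a minimal runnable sketch of a toy constant-product market. It deliberately does not subclass SimPool, and the class name, the coin_in/coin_out argument names, the fee handling, and the return values are assumptions for illustration rather than the library's conventions.

```python
class ConstantProductVenue:
    """Toy x*y=k market exposing price, trade, and assets."""

    def __init__(self, reserves: dict, fee: float = 0.003):
        self.reserves = dict(reserves)
        self.fee = fee

    def price(self, coin_in, coin_out, use_fee=True):
        """Marginal amount of coin_out received per unit of coin_in."""
        rate = self.reserves[coin_out] / self.reserves[coin_in]
        return rate * (1 - self.fee) if use_fee else rate

    def trade(self, coin_in, coin_out, size):
        """Swap `size` of coin_in for coin_out; returns (amount_out, fee_paid)."""
        fee_paid = size * self.fee
        x, y = self.reserves[coin_in], self.reserves[coin_out]
        # Keep x*y constant using the input amount net of the fee.
        amount_out = y - x * y / (x + size - fee_paid)
        self.reserves[coin_in] = x + size
        self.reserves[coin_out] = y - amount_out
        return amount_out, fee_paid

    @property
    def assets(self):
        """Symbols of the tradable assets (a stand-in for SimAssets)."""
        return tuple(self.reserves)


venue = ConstantProductVenue({"DAI": 1000.0, "USDC": 1000.0})
price_before = venue.price("DAI", "USDC", use_fee=False)
amount_out, fee_paid = venue.trade("DAI", "USDC", 100.0)
price_after = venue.price("DAI", "USDC", use_fee=False)
print(venue.assets, price_before, amount_out, price_after)
```

A strategy only needs these three members: it compares price() against the market price, calls trade() when the gap is worth closing, and uses assets to know what can be exchanged.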

The Strategy and Trader interfaces

The Strategy callable is what coordinates the different moving parts of the system:

def __call__(self, sim_pool, parameters, price_sampler):
    """
    Computes and executes trades at each timestep.
    """

The parameters configure the pool and the price_sampler provides market tick data that pushes the pool through a simulation run.

The Strategy base class houses an implementation to do this based on customizing an injected Trader. The Trader class assumes typical logic has a compute step and then a trade execution step, but since only the process_time_sample method is invoked in a strategy, this isn’t mandatory in your custom implementation.
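The division of labor between a strategy and its trader can be sketched as follows. The classes below are self-contained toys, not the library's base classes; the names ThresholdTrader, SketchStrategy, compute_trades, and do_trades, and the dict-based pool, are hypothetical. Only process_time_sample and the __call__ signature come from the description above.

```python
class ThresholdTrader:
    """Toy trader with a compute step followed by an execution step."""

    def __init__(self, pool, threshold=0.01):
        self.pool = pool
        self.threshold = threshold

    def compute_trades(self, market_price):
        """Propose a trade when the pool price deviates beyond the threshold."""
        gap = market_price - self.pool["price"]
        return [gap] if abs(gap) > self.threshold else []

    def do_trades(self, trades):
        """Execute trades by moving the pool price to the market price."""
        for gap in trades:
            self.pool["price"] += gap
        return trades

    def process_time_sample(self, market_price):
        """The only method the strategy invokes."""
        return self.do_trades(self.compute_trades(market_price))


class SketchStrategy:
    """Toy strategy that injects a trader and drives it with price samples."""

    trader_class = ThresholdTrader

    def __call__(self, sim_pool, parameters, price_sampler):
        sim_pool.update(parameters)  # configure the pool for this run
        trader = self.trader_class(sim_pool)
        return [trader.process_time_sample(price) for price in price_sampler]


pool = {"price": 1.0}
log = SketchStrategy()(pool, {"fee": 0.0004}, [1.0, 1.05, 1.055, 0.9])
print([len(trades) for trades in log])
```

Because the strategy only calls process_time_sample, a custom trader is free to replace the two-step split with any other logic.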

Adding Custom Metrics

Custom simulation metrics can be created by adding new metric classes to curvesim.metrics.metrics. This involves three main requirements:

  1. Subclassing one of the Generic Metric Classes found in curvesim.metrics.base.

  2. Adding Methods to compute the metric(s).

  3. Specifying the Metric Configuration in the config property.

Each requirement is described in detail below. Once completed, the new metric class can be passed to a simulation pipeline, where it will be automatically initialized, calculated, and included in the results output.

Basic Example

The Timestamp metric provides a simple example that incorporates each of the required elements. It subclasses the generic Metric class, adds the _get_timestamp() method, and defines a minimal config property that specifies _get_timestamp() as the function used to generate the metric:

from pandas import DataFrame

from curvesim.metrics.base import Metric


class Timestamp(Metric):
    """Simple pass-through metric to record timestamps."""

    @property
    def config(self):
        return {"functions": {"metrics": self._get_timestamp}}

    def _get_timestamp(self, price_sample, **kwargs):
        return DataFrame(price_sample.timestamp)

While this metric simply “passes through” the timestamps recorded throughout each simulation run, any of the major components could be modified to produce something more interesting. The following sections explain how each component can be expanded to generate more informative metrics.

Generic Metric Classes

Each metric is a subclass of one of four generic classes found in curvesim.metrics.base. When creating a new metric, you should subclass whichever best suits your needs:

  1. Metric

    The most basic of the generic metric classes. Used for metrics that are computed with a single function, regardless of the type of pool used in a simulation.

  2. PoolMetric

    Generic metric class with distinct configurations for different pool types. Used for metrics that require unique functions or configurations depending on the type of pool used in a simulation. See Pool Config Specification.

  3. PricingMetric

    Basic Metric class with added functionality for calculations involving market prices:

    • get_market_price()

      Returns exchange rate for two coins identified by their pool indices.

    • numeraire and numeraire_idx attributes:

      The numeraire to be used in pricing calculations and its numeric coin index. The numeraire is automatically selected from a list of preferred numeraires, or defaults to the first of the coin_names passed to the metric at instantiation. See curvesim.metrics.base.get_numeraire().

  4. PoolPricingMetric

    PoolMetric class with added functionality for calculations involving market prices (same as above).
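As an illustration of looking up an exchange rate by pool indices, the sketch below assumes market prices are stored as a flat list with one entry per coin pair, ordered as in itertools.combinations(range(n_coins), 2). The function name market_price and the reciprocal rule for the reverse direction are assumptions; this is not the implementation of get_market_price().

```python
from itertools import combinations


def market_price(prices, n_coins, coin_in, coin_out):
    """Look up the price for an ordered coin pair from a flat per-pair list."""
    pairs = list(combinations(range(n_coins), 2))
    if (coin_in, coin_out) in pairs:
        return prices[pairs.index((coin_in, coin_out))]
    # Reverse direction: assume the rate is the reciprocal of the listed one.
    return 1 / prices[pairs.index((coin_out, coin_in))]


pairs = list(combinations(range(3), 2))
prices = [2.0, 4.0, 0.5]  # made-up prices for pairs (0, 1), (0, 2), (1, 2)

print(pairs)
print(market_price(prices, 3, 0, 2))
print(market_price(prices, 3, 2, 1))
```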

Adding Methods

Methods for computing (and, optionally, summarizing) your metric(s) should be added to your new subclass and referenced in the config property.

Metric Function (required)

The function for computing your metric(s) is executed at the end of each simulation run (i.e., after each timepoint is simulated with a given set of pool parameters). The function should take the data provided by the StateLog and return a DataFrame with named columns for each computed metric (“sub-metric”) and rows for each timestamp.

If summary functions or plotting specs are included in the config property, they must reference each sub-metric using the column names used in the DataFrame.
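One way to catch a mismatch early is to compare the names used in a config against the columns your metric function returns. The helper below is a hypothetical check, not part of the library; it assumes the config is a nested dict with "functions"/"summary" and "plot" entries keyed by sub-metric name.

```python
def unknown_sub_metrics(config, columns):
    """Return config sub-metric names that are missing from the metric columns."""
    referenced = set(config.get("functions", {}).get("summary", {}))
    for section in config.get("plot", {}).values():
        referenced.update(section)  # keys of the "metrics"/"summary" plot dicts
    return sorted(referenced - set(columns))


columns = ["arb_profit", "pool_fees"]
config = {
    "functions": {
        "metrics": None,  # placeholder; the metric function itself is not checked
        "summary": {"arb_profit": "sum", "pool_fee": "sum"},  # note the typo
    },
    "plot": {
        "metrics": {
            "arb_profit": {"title": "Profit", "style": "time_series", "resample": "sum"},
        },
    },
}

missing = unknown_sub_metrics(config, columns)
print(missing)
```

Here the misspelled "pool_fee" is reported because no column with that name exists.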

Data Inputs

At the end of each simulation run, the StateLog passes the following data to each metric as keyword arguments. Your function signature should include any of the keywords you need for your computation and **kwargs to “soak up” any unused keywords.

  • pool_parameters (DataFrame)

    The parameters of the pool used in a simulation run. These vary depending on pool type (e.g., for a stableswap pool, these are A, initial D, and fee), and are returned in a DataFrame with columns for each parameter. For example:

       A             D     fee
    0  100  3.882173e+08  0.0004
    

    See metrics.StateLog.pool_parameters for the parameters recorded for each pool type.

  • pool_state (DataFrame)

    A time series of the pool state recorded at each timepoint in the simulation run. For example:

          balances                                           tokens
    0     [130845201307275888876149751, 1305944797254687...  378440487077049660301217105
    1     [132282500493342273867963383, 1317798299188966...  378440487077049660301217105
    2     [133706765982576658123938807, 1329505526946925...  378440487077049660301217105
    3     [135129521787669164155296759, 1341178732597889...  378440487077049660301217105
    4     [136553908964358298693622775, 1352866792170694...  378440487077049660301217105
    ...                                                 ...                          ...
    1460  [130546859394920751460984594, 1294035642077598...  378440487077049660301217105
    1461  [130546859394920751460984594, 1294035642077598...  378440487077049660301217105
    1462  [129676139388539620009120586, 1303866269827065...  378440487077049660301217105
    1463  [130515100587653360449688394, 1302453178645099...  378440487077049660301217105
    1464  [129771580655313918569313433, 1302453178645094...  378440487077049660301217105
    

    The recorded variables vary with pool type. See metrics.StateLog.pool_state for the variables recorded for each pool type.

    Note

    If your calculations depend on pool state, you must call self.set_pool_state(pool_state_row) before performing a calculation for each timestamp. set_pool_state() is a built-in method in the PoolMetric class, and takes one row of the pool_state DataFrame as input for each timestamp.

  • price_sample (DataFrame)

    The information provided by the price_sampler at each timepoint. Currently, this includes the timestamp, market prices, and market volumes. Prices and volumes are given for each pairwise combination of coins, ordered as in itertools.combinations(range(n_coins), 2):

         timestamp                  prices                                        volumes
    0    2023-03-23 23:30:00+00:00  [0.9972223936856817, 0.9934336361010216, ...  [6372460371.611408, 32388718876.53451,  ...
    1    2023-03-24 00:30:00+00:00  [0.9974647037626924, 0.9953008467903304, ...  [6405220209.779885, 32298840369.832382, ...
    2    2023-03-24 01:30:00+00:00  [0.9983873712830038, 0.9968781445095656, ...  [6428761178.953415, 31924323767.57396,  ...
    3    2023-03-24 02:30:00+00:00  [0.998974908950286, 0.9971146840056136,  ...  [6478213966.455348, 31834217713.8281,   ...
    4    2023-03-24 03:30:00+00:00  [0.9954604997820208, 0.993597773487017,  ...  [6476018037.815129, 31880343748.124725, ...
    ...                        ...                                           ...                                          ...
    1460 2023-05-23 19:30:00+00:00  [0.9995590221398217, 0.9996802980794983, ...  [2450447658.4796195, 19720280583.1984,  ...
    1461 2023-05-23 20:30:00+00:00  [0.999792588099074, 0.9998231064202561,  ...  [3767115607.6887126, 9745029505.401602, ...
    1462 2023-05-23 21:30:00+00:00  [1.002580556630733, 1.001640822363833,   ...  [3238172226.196708, 20213110441.90307,  ...
    1463 2023-05-23 22:30:00+00:00  [0.9992115557645646, 0.9991726268082701, ...  [3806396776.6569495, 18785423624.570637,...
    1464 2023-05-23 23:30:00+00:00  [1.0000347245259464, 1.0001933464807435, ...  [1618332387.3201604, 20024972704.395084,...
    
  • trade_data (DataFrame)

    The information provided by the pipeline strategy at each timepoint. Currently, this includes the executed trades (format: coin_in index, coin_out index, coin_in amount, coin_out amount, fee), total volume, and post-trade price error between pool price and market price:

          trades                                             volume                     price_errors
    0     [(1, 2, 1425272746997353459744768, 14246897852...  2864693070085621213560832  [0.003204099273566685, 0.005952357013628284, 0...
    1     [(1, 2, 1423136006037754555138048, 14221320488...  2860435192104139546951680  [0.001272223599089517, 0.0037650550769624536, ...
    2     [(1, 2, 1409378240580197696929792, 14079575628...  2833643729814581952905216  [0.00030753025199858897, 0.001863116663974429,...
    3     [(1, 2, 1407807687734395949547520, 14059536943...  2830563492826901980905472  [0.00034208257763679306, 0.0012934752817830297...
    4     [(1, 2, 1409207476296572163063808, 14069028812...  2833594652985706701389824  [2.394374749004058e-05, 0.004466502992504617, ...
    ...                                                 ...                        ...                                                ...
    1460  [(0, 1, 90998886739193884573696, 9096431673643...    90998886739193884573696  [5.417643337612965e-05, -0.0001157222448445738...
    1461                                                 []                          0  [-0.0001793895258761502, -0.000258530585602323...
    1462  [(1, 2, 862811250032832837320704, 862453153772...  1733689111071210703159296  [0.0005053090828606166, 0.001416255647489928, ...
    1463  [(0, 2, 697608304688585215311872, 697307118016...   838961199113740440567808  [0.000331361959104326, 0.0004680864443290522, ...
    1464  [(2, 0, 743639722611787945738240, 743519932339...   743639722612382454775808  [-2.4884112744816278e-05, -2.3648798199493726e...
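The keyword-argument convention can be illustrated without the library. In this sketch, plain dicts stand in for the four DataFrames, and the function name count_trades and all values are made up; the point is only that a metric function names the inputs it needs and lets **kwargs absorb the rest.

```python
def count_trades(trade_data, **kwargs):
    """Toy metric function: number of executed trades at each timepoint.

    Only `trade_data` is needed; **kwargs absorbs the other inputs.
    """
    return {"trade_count": [len(trades) for trades in trade_data["trades"]]}


# Plain-Python stand-ins for the four inputs passed at the end of a run.
# Each trade is (coin_in index, coin_out index, amount in, amount out, fee).
run_data = {
    "pool_parameters": {"A": [100], "fee": [0.0004]},
    "pool_state": {"tokens": [10, 10, 10]},
    "price_sample": {"timestamp": ["t0", "t1", "t2"]},
    "trade_data": {
        "trades": [[(1, 2, 50, 49, 1)], [], [(0, 1, 20, 19, 1), (2, 0, 5, 4, 1)]],
    },
}

result = count_trades(**run_data)
print(result)
```

Without **kwargs in the signature, the same call would raise a TypeError for the unexpected keywords. A real metric function would return a DataFrame with one named column per sub-metric rather than a dict.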
    

Summary Functions (optional)

Summary functions take the per-timestamp metrics computed by your metric function and compute a single value for each run. As outlined below, summary functions may be specified by a string referring to a pandas.DataFrame method, or a dict mapping a summary statistic’s name to a custom function.

Summary functions are specified individually for each sub-metric computed by your metric function (i.e., for each column in the returned DataFrame).

If you specify a custom summary function, it should take the column of per-timestamp values for your sub-metric as an argument and return a single value. For example, the PoolValue metric defines a summary function that takes the per-timestamp pool values (indexed by timestamp) as input and returns a single value summarizing each run:

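from numpy import exp, log, timedelta64

def compute_annualized_returns(self, data):
    """Computes annualized returns from a series of pool values."""
    # 365 days stands in for one year: a "Y"-unit timedelta64 has no fixed
    # length, and recent pandas versions reject it in timedelta arithmetic.
    year_multipliers = timedelta64(365, "D") / data.index.to_series().diff()
    log_returns = log(data).diff()

    return exp((log_returns * year_multipliers).mean()) - 1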
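The arithmetic can be checked by hand with a pure-Python equivalent. The sketch below is not the library code; it assumes a 365-day year and uses made-up pool values that grow by 0.01% per day, so the annualized return should equal 1.0001**365 - 1.

```python
from datetime import datetime, timedelta
from math import exp, log

YEAR = timedelta(days=365)  # assumption: a 365-day year


def annualized_returns(timestamps, values):
    """Mean of per-step log returns scaled to a year, converted back to a rate."""
    scaled = [
        log(v1 / v0) * (YEAR / (t1 - t0))  # log return times steps per year
        for t0, t1, v0, v1 in zip(timestamps, timestamps[1:], values, values[1:])
    ]
    return exp(sum(scaled) / len(scaled)) - 1


start = datetime(2023, 1, 1)
timestamps = [start + timedelta(days=i) for i in range(11)]
values = [1000.0 * 1.0001**i for i in range(11)]  # grows 0.01% per day

result = annualized_returns(timestamps, values)
expected = 1.0001**365 - 1
print(result, expected)
```

Scaling each log return by the number of such intervals in a year lets unevenly spaced timestamps contribute on a common annual basis before averaging.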

Metric Configuration

Each metric’s config property specifies how to compute, summarize, and/or plot recorded data. The formatting specifications are outlined below.

Config Specification

The general config specification is:

{
    "functions": {
        "metrics": function returning all sub_metrics,
        "summary": {
            "sub_metric1": str, list of str, or dict,
            "sub_metric2": str, list of str, or dict,
        },
    },
    "plot": {
        "metrics": {
            "sub_metric1": {
                "title": str,
                "style": str,
                "resample": str,
            },

            "sub_metric2": {
                "title": str,
                "style": str,
                "resample": str,
            },
        },

        "summary": {
            "sub_metric1": {
                "title": str,
                "style": str,
            },

            "sub_metric2": {
                "title": str,
                "style": str,
            },
        },
    },
}

Pool Config Specification

For PoolMetric subclasses, a pool_config property must be specified to map pool-types to individual configs in the above format:

{
    PoolClass1: {
        "functions": ...
        "plot": ...
    },

    PoolClass2: {
        "functions": ...
        "plot": ...
    },
}
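The idea of selecting a config by pool type can be sketched as below. The pool classes and the PoolKindSketch metric are hypothetical stand-ins, and the exact-type dictionary lookup is an assumption about how such a mapping could be resolved, not a description of the PoolMetric internals.

```python
class StableswapPool:
    """Hypothetical stand-in for a stableswap pool class."""


class CryptoswapPool:
    """Hypothetical stand-in for a cryptoswap pool class."""


class PoolKindSketch:
    """Toy metric that picks its config from pool_config by pool type."""

    def __init__(self, pool):
        # Assumption: lookup by exact type; subclass handling is not modeled.
        self.config = self.pool_config[type(pool)]

    @property
    def pool_config(self):
        return {
            StableswapPool: {"functions": {"metrics": self._stableswap_kind}},
            CryptoswapPool: {"functions": {"metrics": self._cryptoswap_kind}},
        }

    def _stableswap_kind(self, **kwargs):
        return "stableswap"

    def _cryptoswap_kind(self, **kwargs):
        return "cryptoswap"


metric = PoolKindSketch(CryptoswapPool())
kind = metric.config["functions"]["metrics"]()
print(kind)
```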

Functions

Functions used to compute metrics and/or summary statistics. Includes two sub-keys:

  • config["functions"]["metrics"] (required):

    A single function that computes all sub-metrics and returns them in a single DataFrame

  • config["functions"]["summary"] (optional):

    A dict mapping sub-metric names to functions for computing summary statistics. Functions can be specified using either:

    • a string referring to a pandas.DataFrame method (e.g., “sum”, “mean”, “median”)

    • a sub-dict mapping a summary statistic’s name to a function

For example, the ArbMetrics config specifies functions as follows:

"functions": {
    "metrics": self.compute_metrics,
    "summary": {
        "arb_profit": "sum",
        "pool_fees": "sum",
        "pool_volume": "sum",
        "price_error": "median",
    },
}

When summary functions are specified as strings, the string is used to specify both the function and the summary statistic’s name in the results DataFrame. If a summary function is specified with a dict, the key specifies the summary statistic’s name, and the value is the function to compute the statistic:

"pool_value": {"annualized_returns": self.compute_annualized_returns}

Finally, multiple summary statistics can be specified for each sub-metric by using either a list of strings or a dict with multiple items. For example:

"pool_balance": ["median", "min"]

Or, if we sought to rename the summary statistics:

"pool_balance": {"Median": "median", "Minimum": "min"}
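The three accepted forms (string, list of strings, dict) can all be reduced to a single name-to-function mapping. The sketch below shows one way to do that in plain Python; normalize_summary, summarize, and the BUILTIN_STATS table (a stand-in for pandas methods) are hypothetical and do not reflect how the library resolves these specs internally.

```python
from statistics import median

# Stand-ins for the pandas methods that a string such as "sum" refers to.
BUILTIN_STATS = {"sum": sum, "min": min, "max": max, "median": median}


def normalize_summary(spec):
    """Return {statistic name: function or method name} for one sub-metric."""
    if isinstance(spec, str):
        return {spec: spec}  # the string names both statistic and function
    if isinstance(spec, dict):
        return dict(spec)  # keys rename the statistics
    return {name: name for name in spec}  # list of strings


def summarize(values, spec):
    """Compute every summary statistic requested for one column of values."""
    results = {}
    for name, func in normalize_summary(spec).items():
        if isinstance(func, str):
            func = BUILTIN_STATS[func]
        results[name] = func(values)
    return results


pool_balance = [3, 1, 2]
by_list = summarize(pool_balance, ["median", "min"])
renamed = summarize(pool_balance, {"Median": "median", "Minimum": "min"})
custom = summarize(pool_balance, {"range": lambda v: max(v) - min(v)})
print(by_list, renamed, custom)
```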

Plot (optional)

Plotting specifications for metrics and/or summary statistics.

At minimum, the plot key specifies a title, style, and (for sub-metrics, but not summary statistics) a resampling function. Take for example this sub-section of the ArbMetrics config:

"plot": {
    "metrics": {
        "arb_profit": {
            "title": f"Daily Arbitrageur Profit (in {self.numeraire})",
            "style": "time_series",
            "resample": "sum",
        },
        "pool_fees": {
            "title": f"Daily Pool Fees (in {self.numeraire})",
            "style": "time_series",
            "resample": "sum",
        },
    },

    "summary": {
        "arb_profit": {
            "title": f"Total Arbitrageur Profit (in {self.numeraire})",
            "style": "point_line",
        },
        "pool_fees": {
            "title": f"Total Pool Fees (in {self.numeraire})",
            "style": "point_line",
        },
    },
}

Plot: Title

The title key specifies the title that will be shown above each plot. Because config is a property, we can use f-strings or other executable code to define this or any other entry.

Plot: Style

The style key indicates the plot style, as defined in plot.styles.

Currently, the following styles are supported:

  • line - a line plot

  • point_line - a line plot with each individual point also marked

  • time_series - a line plot with the x-axis set to the “timestamp” metric

  • histogram - a normalized histogram with “Frequency” as the y-axis

Note that any of the style properties can be overridden by specifying additional properties in the plot config (see Plot: Additional Properties below). For histograms, the metric must be specified as the x-axis variable.

Plot: Resample

The resample key defines what function to apply when the metric time-series are downsampled before plotting. Because the full metric dataset can be very large, we resample each metric to a sampling frequency of 1 day.

Any pandas function that returns a single value per time-bin is supported: sum, mean, std, sem, max, min, median, first, or last.

See pandas resampling docs for more details.

Downsampling can be overridden by specifying "resample": False.
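To see what daily downsampling with a given function does to a metric, consider this plain-Python stand-in for a pandas resample. The helper resample_daily and the hourly data are made up for illustration; the library itself relies on pandas for this step.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def resample_daily(timestamps, values, func):
    """Group values by calendar day and reduce each bin to a single value."""
    bins = defaultdict(list)
    for timestamp, value in zip(timestamps, values):
        bins[timestamp.date()].append(value)
    return {day: func(bin_values) for day, bin_values in sorted(bins.items())}


start = datetime(2023, 3, 24)
timestamps = [start + timedelta(hours=h) for h in range(48)]
hourly_fees = [1.0] * 48  # made-up fee collected each hour
hourly_error = [h % 24 for h in range(48)]  # made-up error pattern

daily_fees = resample_daily(timestamps, hourly_fees, sum)
daily_worst_error = resample_daily(timestamps, hourly_error, max)
print(daily_fees)
print(daily_worst_error)
```

The choice of function should match the metric's meaning: additive quantities such as fees or volume suit "sum", while levels such as prices or errors are usually better served by "mean", "median", or "last".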

Plot: Additional Properties

Each sub-metric or summary statistic’s plot can be further customized by providing additional keys, which are passed as keyword arguments to altair.Chart.

For example, in the ArbMetrics config["plot"]["metrics"] entry, the encoding for the price_error sub-metric is altered to specify the metric as the x-axis and truncate the x-axis scale:

"price_error": {
    "title": "Price Error",
    "style": "histogram",
    "encoding": {
        "x": {
            "title": "Price Error (binned)",
            "shorthand": "price_error",
            "scale": Scale(domain=[0, 0.05], clamp=True),
        },
    },
},

In the above example, the sub-dict "x" is first passed to altair.X (i.e., the relevant Altair class constructor), and the resulting "encoding" entry is then passed to altair.Chart as a keyword argument.