Advanced Usage¶
Adding Custom Simulation¶
A simulation consists of a pipeline taking in pool configurations and market data sampled as a time-series. For each pool configuration, a run consists of applying the strategy to the given configuration and stream of market data. A report of various metrics can then be created from the results of all runs.
To flexibly handle future use cases, the pipeline concept has not been formalized into
a configurable object, but the basic template can be understood in the implementation
of the helper function run_pipeline(). It takes in a param sampler, a price sampler, and a strategy.
The pipeline iterates over the pool with parameters set from the param sampler; for each
set of parameters, the strategy is applied on each time series sample produced by the
price sampler.
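The iteration described above can be sketched roughly as follows. This is a stand-alone illustration, not the library's run_pipeline(): the names run_pipeline_sketch and toy_strategy, the list-based samplers, and the assumption that the param sampler yields (pool, parameters) pairs are all hypothetical.

```python
def run_pipeline_sketch(param_sampler, price_sampler, strategy):
    """Apply `strategy` once per pool configuration and collect the results.

    Hypothetical stand-in for the pattern described above; the real
    run_pipeline() may differ in signature and behavior.
    """
    results = []
    for pool, parameters in param_sampler:
        # One run: the strategy itself walks through the price samples.
        results.append(strategy(pool, parameters, price_sampler))
    return results


def toy_strategy(pool, parameters, price_sampler):
    """Toy strategy: record a fee-adjusted quote for each price sample."""
    quotes = [price * (1 - parameters["fee"]) for price in price_sampler]
    return {"parameters": parameters, "quotes": quotes}


# Hypothetical samplers: plain lists instead of curvesim sampler objects.
param_sampler = [("pool", {"fee": 0.0}), ("pool", {"fee": 0.5})]
price_sampler = [1.0, 2.0, 4.0]

results = run_pipeline_sketch(param_sampler, price_sampler, toy_strategy)
for run in results:
    print(run["parameters"], run["quotes"])
```

Each element of results corresponds to one run, that is, one set of pool parameters applied to the full price series.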
Typically you would use run_pipeline() by creating a function that:
instantiates PoolMetaDataInterface from a pool address and chain label
creates a SimPool using the pool data
instantiates a param_sampler, price_sampler, and strategy
invokes run_pipeline(), returning result metrics
Other auxiliary args may need to be passed in to instantiate all necessary objects.
The main pipeline, which was developed for the specific use-case of optimizing Curve pools
for best reward-risk tradeoff, is the
volume limited arbitrage pipeline.
The simple pipeline provides an easier starting
point for creating a custom pipeline.
The SimPool interface¶
To set up arbitrage strategies, the SimPool interface exposes:
price: (method)
trade: (method)
assets: (property)
Given market price(s), any strategy that checks the “price” and then exchanges one
asset for another can be implemented. While the name SimPool suggests a pool, this object
can be any type of market or venue where assets are exchanged.
For example, one could implement:
class CollateralizedDebtPosition(SimPool):
    """
    A simple Aave-style collateralized debt position.
    """

    def price(self, debt_token, collateral_token, use_fee=True):
        """
        Returns the effective price for collateral from liquidating
        the position.
        """

    def trade(self, debt_token, collateral_token, size):
        """
        Liquidate the position by paying `size` amount of the debt.
        """

    @property
    def assets(self):
        """
        Return a :class:`SimAssets` instance with information on
        the tradable assets (debt and collateral in this example).
        """
The available implementations wrap a Curve pool into an appropriate SimPool, letting
strategies more flexibly define tradable assets. Expected use-cases taking advantage
of these abstractions include trading LP tokens or even baskets of tokens, routing through
multiple pools, and trading between two competing pools of different types.
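As a rough, runnable illustration of the "check the price, then exchange" pattern, the sketch below uses a toy constant-product market. It only mirrors the price/trade/assets surface described above: the class ToyConstantProductMarket, its constructor, and the plain-list assets are hypothetical and do not use curvesim's SimPool or SimAssets.

```python
class ToyConstantProductMarket:
    """Two-asset constant-product market (x * y = k), no fees.

    Hypothetical example: it mirrors the price/trade/assets surface
    described above but does not subclass curvesim's SimPool.
    """

    def __init__(self, reserves):
        self.reserves = dict(reserves)  # e.g. {"USD": 1000.0, "ETH": 10.0}

    def price(self, coin_in, coin_out):
        """Marginal price of coin_in in units of coin_out."""
        return self.reserves[coin_out] / self.reserves[coin_in]

    def trade(self, coin_in, coin_out, size):
        """Swap `size` of coin_in for coin_out; return the amount received."""
        k = self.reserves[coin_in] * self.reserves[coin_out]
        self.reserves[coin_in] += size
        amount_out = self.reserves[coin_out] - k / self.reserves[coin_in]
        self.reserves[coin_out] -= amount_out
        return amount_out

    @property
    def assets(self):
        """Names of the tradable assets (a plain list here, not SimAssets)."""
        return list(self.reserves)


market = ToyConstantProductMarket({"USD": 1000.0, "ETH": 10.0})
market_price = 0.008  # assumed external price of USD in ETH

# Check the price, then trade only if the venue quotes better than the market.
received = 0.0
price_before = market.price("USD", "ETH")
if price_before > market_price:
    received = market.trade("USD", "ETH", 100.0)
price_after = market.price("USD", "ETH")
print(price_before, received, price_after)
```

Because the strategy only touches price, trade, and assets, the same loop would work against any venue exposing those three members.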
The Strategy and Trader interfaces¶
The Strategy callable is what coordinates the different moving parts of the system:
def __call__(self, sim_pool, parameters, price_sampler):
    """
    Computes and executes trades at each timestep.
    """
The parameters configure the pool and the price_sampler provides market tick data that pushes the pool through a simulation run.
The Strategy base class houses an implementation to do this based on customizing an injected Trader. The Trader class assumes typical logic has a compute step and then a trade execution step, but since only the process_time_sample method is invoked in a strategy, this isn’t mandatory in your custom implementation.
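The division of labor between a strategy and its injected trader might look like the sketch below. All names here (StrategySketch, ThresholdTrader, compute_trades, do_trades, the dict used as a pool) are hypothetical stand-ins; only the __call__ signature and the process_time_sample method name come from the description above.

```python
class ThresholdTrader:
    """Hypothetical trader: a compute step followed by an execution step."""

    def __init__(self, pool, threshold):
        self.pool = pool
        self.threshold = threshold

    def compute_trades(self, market_price):
        """Propose a trade when the pool price drifts past the threshold."""
        gap = market_price - self.pool["price"]
        return [gap] if abs(gap) > self.threshold else []

    def do_trades(self, trades):
        """Execute trades by moving the pool price by the traded gap."""
        for gap in trades:
            self.pool["price"] += gap
        return len(trades)

    def process_time_sample(self, market_price):
        """The only method the strategy calls."""
        trades = self.compute_trades(market_price)
        return self.do_trades(trades)


class StrategySketch:
    """Hypothetical strategy that drives an injected trader class."""

    def __init__(self, trader_class, threshold):
        self.trader_class = trader_class
        self.threshold = threshold

    def __call__(self, sim_pool, parameters, price_sampler):
        sim_pool.update(parameters)  # the parameters configure the pool
        trader = self.trader_class(sim_pool, self.threshold)
        return [trader.process_time_sample(p) for p in price_sampler]


pool = {"price": 1.0}  # a dict stands in for a SimPool
strategy = StrategySketch(ThresholdTrader, threshold=0.05)
trade_counts = strategy(pool, {"fee": 0.0004}, [1.01, 1.10, 1.11, 0.90])
print(trade_counts)
```

A custom trader could collapse the compute and execution steps into process_time_sample directly, since that is the only entry point the strategy relies on.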
Adding Custom Metrics¶
Custom simulation metrics can be created by adding new metric classes to
curvesim.metrics.metrics. This involves three main requirements:
Subclassing one of the Generic Metric Classes found in curvesim.metrics.base.
Adding Methods to compute the metric(s).
Specifying the Metric Configuration in the config property.
Each requirement is described in detail below. Once completed, the new
metric class can be passed to a simulation pipeline, where it
will be automatically initialized, calculated, and included in the
results output.
Basic Example¶
The Timestamp metric provides a simple example that incorporates each
of the required elements. It subclasses the generic
Metric class, adds the
_get_timestamp() method, and defines a minimal config property
that specifies _get_timestamp() as the function used to generate the
metric:
from pandas import DataFrame

from curvesim.metrics.base import Metric


class Timestamp(Metric):
    """Simple pass-through metric to record timestamps."""

    @property
    def config(self):
        return {"functions": {"metrics": self._get_timestamp}}

    def _get_timestamp(self, price_sample, **kwargs):
        return DataFrame(price_sample.timestamp)
While this metric simply “passes through” the timestamps recorded throughout each simulation run, any of the major components could be modified to produce something more interesting. The following sections explain how each component can be expanded to generate more informative metrics.
Generic Metric Classes¶
Each metric is a subclass of one of four generic classes found in
curvesim.metrics.base. When creating a new metric, you should subclass
whichever best suits your needs:
Metric
    The most basic of the generic metric classes. Used for metrics that are computed with a single function, regardless of the type of pool used in a simulation.
PoolMetric
    Generic metric class with distinct configurations for different pool-types. Used for metrics that require unique functions or configurations depending on the type of pool used in a simulation. See Pool Config Specification.
PricingMetric
    Basic Metric class with added functionality for calculations involving market prices:
    get_market_price(): Returns exchange rate for two coins identified by their pool indices.
    numeraire and numeraire_idx attributes: The numeraire to be used in pricing calculations and its numeric coin index. The numeraire is automatically selected from a list of preferred numeraires, or defaults to the first of the coin_names passed to the metric at instantiation. See curvesim.metrics.base.get_numeraire().
PoolPricingMetric
    PoolMetric class with added functionality for calculations involving market prices (same as above).
Adding Methods¶
Methods for computing (and, optionally, summarizing) your metric(s) should be added to your new subclass and referenced in the config property.
Metric Function (required)¶
The function for computing your metric(s) is executed at the end of each simulation run (i.e., after each timepoint is simulated with a given set of pool parameters). The function should take the data provided by the StateLog and return a DataFrame with named columns for each computed metric (“sub-metric”) and rows for each timestamp.
If summary functions or plotting specs are included in the config property, they must reference each sub-metric using the column names used in the DataFrame.
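As a hedged sketch of these requirements, the example below defines a metric function that returns one column per sub-metric and one row per timestamp, with summary keys matching the column names. The class VolumeMetricSketch, the sub-metric names, the timestamp index, and the input values are made up for illustration; a real metric would subclass one of the generic metric classes.

```python
import pandas as pd


class VolumeMetricSketch:
    """Hypothetical metric; a real one would subclass a generic metric class."""

    @property
    def config(self):
        return {
            "functions": {
                "metrics": self.compute_metrics,
                # Keys must match the DataFrame's column names.
                "summary": {"volume": "sum", "n_trades": ["mean", "max"]},
            }
        }

    def compute_metrics(self, price_sample, trade_data, **kwargs):
        """Return one row per timestamp and one column per sub-metric."""
        return pd.DataFrame(
            {
                "volume": trade_data["volume"].to_numpy(),
                "n_trades": trade_data["trades"].map(len).to_numpy(),
            },
            # Assumption: rows are indexed by the sampled timestamps.
            index=pd.DatetimeIndex(price_sample["timestamp"]),
        )


# Made-up inputs shaped like the data passed at the end of a run.
timestamps = pd.to_datetime(["2023-03-23 23:30", "2023-03-24 00:30"], utc=True)
price_sample = pd.DataFrame({"timestamp": timestamps})
trade_data = pd.DataFrame(
    {"trades": [[(1, 2, 100, 99, 1)], []], "volume": [100, 0]}
)

metric = VolumeMetricSketch()
result = metric.config["functions"]["metrics"](
    price_sample=price_sample, trade_data=trade_data, pool_state=None
)
print(result)
```

The unused pool_state keyword is absorbed by **kwargs, so the function only names the inputs it actually needs.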
Data Inputs¶
At the end of each simulation run, the StateLog passes the following
data to each metric as keyword arguments. Your function signature should include
any of the keywords you need for your computation and **kwargs to “soak up”
any unused keywords.
pool_parameters (DataFrame)
    The parameters of the pool used in a simulation run. These vary depending on pool type (e.g., for a stableswap pool, these are A, initial D, and fee), and are returned in a DataFrame with columns for each parameter. For example:

         A             D     fee
    0  100  3.882173e+08  0.0004

    See metrics.StateLog.pool_parameters for the parameters recorded for each pool type.
pool_state (DataFrame)
    A time series of the pool state recorded at each timepoint in the simulation run. For example:

          balances                                           tokens
    0     [130845201307275888876149751, 1305944797254687...  378440487077049660301217105
    1     [132282500493342273867963383, 1317798299188966...  378440487077049660301217105
    2     [133706765982576658123938807, 1329505526946925...  378440487077049660301217105
    3     [135129521787669164155296759, 1341178732597889...  378440487077049660301217105
    4     [136553908964358298693622775, 1352866792170694...  378440487077049660301217105
    ...   ...                                                ...
    1460  [130546859394920751460984594, 1294035642077598...  378440487077049660301217105
    1461  [130546859394920751460984594, 1294035642077598...  378440487077049660301217105
    1462  [129676139388539620009120586, 1303866269827065...  378440487077049660301217105
    1463  [130515100587653360449688394, 1302453178645099...  378440487077049660301217105
    1464  [129771580655313918569313433, 1302453178645094...  378440487077049660301217105

    The recorded variables vary with pool type. See metrics.StateLog.pool_state for the parameters recorded for each pool type.

    Note
    If your calculations depend on pool state, you must call self.set_pool_state(pool_state_row) before performing a calculation for each timestamp. set_pool_state() is a built-in method in the PoolMetric class, and takes one row of the pool_state DataFrame as input for each timestamp.
price_sample (DataFrame)
    The information provided by the price_sampler at each timepoint. Currently, this includes the timestamp, market prices, and market volumes. Prices and volumes are given for each pairwise combination of coins, ordered as in itertools.combinations(range(n_coins), 2):

          timestamp                  prices                                         volumes
    0     2023-03-23 23:30:00+00:00  [0.9972223936856817, 0.9934336361010216, ...  [6372460371.611408, 32388718876.53451, ...
    1     2023-03-24 00:30:00+00:00  [0.9974647037626924, 0.9953008467903304, ...  [6405220209.779885, 32298840369.832382, ...
    2     2023-03-24 01:30:00+00:00  [0.9983873712830038, 0.9968781445095656, ...  [6428761178.953415, 31924323767.57396, ...
    3     2023-03-24 02:30:00+00:00  [0.998974908950286, 0.9971146840056136, ...   [6478213966.455348, 31834217713.8281, ...
    4     2023-03-24 03:30:00+00:00  [0.9954604997820208, 0.993597773487017, ...   [6476018037.815129, 31880343748.124725, ...
    ...   ...                        ...                                            ...
    1460  2023-05-23 19:30:00+00:00  [0.9995590221398217, 0.9996802980794983, ...  [2450447658.4796195, 19720280583.1984, ...
    1461  2023-05-23 20:30:00+00:00  [0.999792588099074, 0.9998231064202561, ...   [3767115607.6887126, 9745029505.401602, ...
    1462  2023-05-23 21:30:00+00:00  [1.002580556630733, 1.001640822363833, ...    [3238172226.196708, 20213110441.90307, ...
    1463  2023-05-23 22:30:00+00:00  [0.9992115557645646, 0.9991726268082701, ...  [3806396776.6569495, 18785423624.570637,...
    1464  2023-05-23 23:30:00+00:00  [1.0000347245259464, 1.0001933464807435, ...  [1618332387.3201604, 20024972704.395084,...
trade_data (DataFrame)
    The information provided by the pipeline strategy at each timepoint. Currently, this includes the executed trades (format: coin_in index, coin_out index, coin_in amount, coin_out amount, fee), total volume, and post-trade price error between pool price and market price:

          trades                                             volume                     price_errors
    0     [(1, 2, 1425272746997353459744768, 14246897852...  2864693070085621213560832  [0.003204099273566685, 0.005952357013628284, 0...
    1     [(1, 2, 1423136006037754555138048, 14221320488...  2860435192104139546951680  [0.001272223599089517, 0.0037650550769624536, ...
    2     [(1, 2, 1409378240580197696929792, 14079575628...  2833643729814581952905216  [0.00030753025199858897, 0.001863116663974429,...
    3     [(1, 2, 1407807687734395949547520, 14059536943...  2830563492826901980905472  [0.00034208257763679306, 0.0012934752817830297...
    4     [(1, 2, 1409207476296572163063808, 14069028812...  2833594652985706701389824  [2.394374749004058e-05, 0.004466502992504617, ...
    ...   ...                                                ...                        ...
    1460  [(0, 1, 90998886739193884573696, 9096431673643...  90998886739193884573696    [5.417643337612965e-05, -0.0001157222448445738...
    1461  []                                                 0                          [-0.0001793895258761502, -0.000258530585602323...
    1462  [(1, 2, 862811250032832837320704, 862453153772...  1733689111071210703159296  [0.0005053090828606166, 0.001416255647489928, ...
    1463  [(0, 2, 697608304688585215311872, 697307118016...  838961199113740440567808   [0.000331361959104326, 0.0004680864443290522, ...
    1464  [(2, 0, 743639722611787945738240, 743519932339...  743639722612382454775808   [-2.4884112744816278e-05, -2.3648798199493726e...
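Two details of these inputs can be illustrated with a small stand-alone sketch: how the pairwise ordering from itertools.combinations maps list positions to coin pairs, and how **kwargs absorbs the inputs a function does not use. The helper names, coin names, and values are hypothetical, and plain dicts stand in for the DataFrames.

```python
from itertools import combinations


def pair_labels(coin_names):
    """Label each pairwise entry in the order used for prices and volumes."""
    n_coins = len(coin_names)
    return [
        (coin_names[i], coin_names[j]) for i, j in combinations(range(n_coins), 2)
    ]


def first_pair_prices(price_sample, **kwargs):
    """Hypothetical metric function that uses only price_sample.

    **kwargs soaks up pool_parameters, pool_state, and trade_data.
    """
    return [prices[0] for prices in price_sample["prices"]]


labels = pair_labels(["DAI", "USDC", "USDT"])
print(labels)

# Plain dicts stand in for the DataFrames described above (made-up values).
inputs = {
    "pool_parameters": {"A": 100, "fee": 0.0004},
    "pool_state": None,
    "price_sample": {"prices": [[0.997, 0.993, 0.996], [0.998, 0.995, 0.997]]},
    "trade_data": None,
}
prices_0_1 = first_pair_prices(**inputs)
print(prices_0_1)
```

For three coins, position 0 in each prices or volumes list refers to the pair (0, 1), position 1 to (0, 2), and position 2 to (1, 2).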
Summary Functions (optional)¶
Summary functions take the per-timestamp metrics computed by your metric function and compute a single value for each run. As outlined below, summary functions may be specified by a string referring to a pandas.DataFrame method, or a dict mapping a summary statistic’s name to a custom function.
Summary functions are specified individually for each sub-metric computed by your metric function (i.e., for each column in the returned DataFrame).
If you specify a custom summary function, it should take the column of
per-timestamp values for your sub-metric as an argument and return a single
value. For example, the PoolValue metric takes a pandas.DataFrame as input,
and returns a single value which summarizes each run:
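from numpy import exp, log, timedelta64

# Method of the PoolValue metric class (excerpt).
def compute_annualized_returns(self, data):
    """Computes annualized returns from a series of pool values."""
    # A fixed 365-day year is used here: numpy's "Y" unit has no fixed
    # length, and recent pandas versions reject it in timedelta arithmetic.
    year_multipliers = timedelta64(365, "D") / data.index.to_series().diff()
    log_returns = log(data).diff()
    return exp((log_returns * year_multipliers).mean()) - 1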
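To make the specification styles concrete, the sketch below normalizes each form (a string, a list of strings as allowed by the config specification, or a dict) into a mapping from statistic name to function. The helper normalize_summary_spec and the value_range function are hypothetical; they illustrate the naming rules only and are not how curvesim resolves the config internally.

```python
def normalize_summary_spec(spec):
    """Map a summary spec (str, list of str, or dict) to {stat_name: function}.

    Hypothetical helper illustrating the naming rules; strings are left as
    strings, to be resolved to pandas methods elsewhere.
    """
    if isinstance(spec, str):
        return {spec: spec}
    if isinstance(spec, dict):
        return dict(spec)
    return {name: name for name in spec}


def value_range(values):
    """Custom summary function: reduce a column of values to one number."""
    return max(values) - min(values)


summary = {
    "arb_profit": "sum",
    "pool_balance": ["median", "min"],
    "pool_value": {"range": value_range},
}
normalized = {sub: normalize_summary_spec(spec) for sub, spec in summary.items()}
for sub_metric, stats in normalized.items():
    print(sub_metric, list(stats))
```

In every case the dict key becomes the statistic's name in the results, which is why a string spec such as "sum" yields a statistic that is also called "sum".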
Metric Configuration¶
Each metric’s config property specifies how to compute, summarize,
and/or plot recorded data. The formatting specifications are outlined below.
Config Specification¶
The general config specification is:
{
    "functions": {
        "metrics": function returning all sub_metrics,
        "summary": {
            "sub_metric1": str, list of str, or dict,
            "sub_metric2": str, list of str, or dict,
        },
    },
    "plot": {
        "metrics": {
            "sub_metric1": {
                "title": str,
                "style": str,
                "resample": str,
            },
            "sub_metric2": {
                "title": str,
                "style": str,
                "resample": str,
            },
        },
        "summary": {
            "sub_metric1": {
                "title": str,
                "style": str,
            },
            "sub_metric2": {
                "title": str,
                "style": str,
            },
        },
    },
}
Pool Config Specification¶
For PoolMetric subclasses, a
pool_config property must be specified to map pool-types to individual configs
in the above format:
{
    PoolClass1: {
        "functions": ...
        "plot": ...
    },
    PoolClass2: {
        "functions": ...
        "plot": ...
    },
}
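One way such a mapping could be resolved is sketched below. The lookup rule (an exact match on the pool's class), the class PoolMetricSketch, and the stub pool classes are assumptions made for illustration; the actual PoolMetric class may select its config differently.

```python
class PoolClass1:
    """Hypothetical pool type."""


class PoolClass2:
    """Hypothetical pool type."""


class PoolMetricSketch:
    """Hypothetical metric that picks its config from the pool's type."""

    def __init__(self, pool):
        self.pool = pool

    @property
    def pool_config(self):
        return {
            PoolClass1: {"functions": {"metrics": self.compute_for_class1}},
            PoolClass2: {"functions": {"metrics": self.compute_for_class2}},
        }

    @property
    def config(self):
        # Assumed lookup rule: exact match on the pool's class.
        return self.pool_config[type(self.pool)]

    def compute_for_class1(self, **kwargs):
        return "computed with PoolClass1 logic"

    def compute_for_class2(self, **kwargs):
        return "computed with PoolClass2 logic"


metric = PoolMetricSketch(PoolClass2())
selected = metric.config["functions"]["metrics"]
print(selected())
```

Keeping one complete config per pool type means each entry can differ in its functions, its summary statistics, and its plots.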
Functions¶
Functions used to compute metrics and/or summary statistics. Includes two sub-keys:
config["functions"]["metrics"] (required): A single function that computes all sub-metrics and returns them in a single DataFrame.
config["functions"]["summary"] (optional): A dict mapping sub-metric names to functions for computing summary statistics. Functions can be specified using either:
    a string referring to a pandas.DataFrame method (e.g., “sum”, “mean”, “median”)
    a sub-dict mapping a summary statistic’s name to a function
For example, the ArbMetrics config specifies functions as follows:
"functions": {
    "metrics": self.compute_metrics,
    "summary": {
        "arb_profit": "sum",
        "pool_fees": "sum",
        "pool_volume": "sum",
        "price_error": "median",
    },
}
When summary functions are specified as strings, the string is used to specify both the function and the summary statistic’s name in the results DataFrame. If a summary function is specified with a dict, the key specifies the summary statistic’s name, and the value is the function to compute the statistic:
"pool_value": {"annualized_returns": self._compute_annualized_returns}
Finally, multiple summary statistics can be specified for each sub-metric by using either a list of strings or a dict with multiple items. For example:
"pool_balance": ["median", "min"]
Or, if we sought to rename the summary statistics:
"pool_balance": {"Median": "median", "Minimum": "min"}
Plot (optional)¶
Plotting specifications for metrics and/or summary statistics.
At minimum, the plot key specifies a title, style, and (for sub-metrics, but
not summary statistics) a resampling function. Take for example this sub-section of
the ArbMetrics config:
"plot": {
    "metrics": {
        "arb_profit": {
            "title": f"Daily Arbitrageur Profit (in {self.numeraire})",
            "style": "time_series",
            "resample": "sum",
        },
        "pool_fees": {
            "title": f"Daily Pool Fees (in {self.numeraire})",
            "style": "time_series",
            "resample": "sum",
        },
    },
    "summary": {
        "arb_profit": {
            "title": f"Total Arbitrageur Profit (in {self.numeraire})",
            "style": "point_line",
        },
        "pool_fees": {
            "title": f"Total Pool Fees (in {self.numeraire})",
            "style": "point_line",
        },
    },
}
Plot: Title
The title key specifies the title that will be shown above each plot. Because
config is a property, we can use f-strings or other executable code to define
this or any other entry.
Plot: Style
The style key indicates the plot style, as defined in
plot.styles.
Currently, the following styles are supported:
line - a line plot
point_line - a line plot with each individual point also marked
time_series - a line plot with the x-axis set to the “timestamp” metric
histogram - a normalized histogram with “Frequency” as the y-axis
Note that any of the style properties can be overridden by specifying additional properties in the plot config (see Plot: Additional Properties below). For histograms, the metric must be specified as the x-axis variable.
Plot: Resample
The resample key defines what function to apply when the metric time-series are
downsampled before plotting. Because the full metric dataset can be very large,
we resample each metric to a sampling frequency of 1 day.
Any pandas function that returns a single value per time-bin is supported: sum, mean, std, sem, max, min, median, first, or last.
See pandas resampling docs for more details.
Downsampling can be overridden by specifying "resample": False.
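The effect of the resample key can be previewed with plain pandas. The sketch below downsamples a made-up hourly sub-metric to one value per day; that the plotting code calls resample in exactly this way is an assumption, and the series name and values are invented.

```python
import pandas as pd

# Made-up hourly sub-metric: 48 hourly values of 1.0 over two days.
index = pd.date_range(
    "2023-03-24", periods=48, freq=pd.Timedelta(hours=1), tz="UTC"
)
arb_profit = pd.Series(1.0, index=index, name="arb_profit")

daily_sum = arb_profit.resample("1D").sum()    # as with "resample": "sum"
daily_last = arb_profit.resample("1D").last()  # as with "resample": "last"
print(daily_sum)
```

Additive quantities such as profits or fees are naturally resampled with sum, while level-type quantities such as a pool value are usually better served by last or mean.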
Plot: Additional Properties
Each sub-metric or summary statistic’s plot can be further customized by providing additional keys, which are passed as keyword arguments to altair.Chart.
For example, in the ArbMetrics config["plot"]["metrics"] entry, the encoding for the price_error sub-metric is altered to specify the metric as the x-axis and truncate the x-axis scale:
"price_error": {
    "title": "Price Error",
    "style": "histogram",
    "encoding": {
        "x": {
            "title": "Price Error (binned)",
            "shorthand": "price_error",
            "scale": Scale(domain=[0, 0.05], clamp=True),
        },
    },
},
In the above example, the "encoding" key would be passed to altair.Chart as a keyword argument after the sub-dict "x" was passed to altair.X (i.e., the relevant Altair class constructor).