market maker

In an influential paper, Avellaneda and Stoikov set out a strategy for managing market maker inventory risk. The optimal bid and ask quotes are obtained from a set of formulas built around a small number of parameters. The rationale behind the strategy is, in Avellaneda and Stoikov’s words, to perform a ‘balancing act between the dealer’s personal risk considerations and the market environment’ [ibid.]. A wide variety of RL techniques have been developed to allow an agent to learn from the rewards it receives as a result of its successive interactions with the environment. A notable example is Google’s AlphaGo project, in which a deep reinforcement learning algorithm was given the rules of the game of Go and then taught itself to play so well that it defeated the human world champion.

The procedure, therefore, has two steps, which are applied at each time increment as follows. The first chart shows the evolution of the price, the indifference price, and the bid and ask quotes. For asymptotic expansions when T is large, see the paper by Guéant, Lehalle, and Fernandez-Tapia or Guéant’s book The Financial Mathematics of Market Liquidity. There are many exciting models out there with different approaches, and with HFTs dominating the market-making scene in recent years, there is a lot for our team to explore. Note that the inventory target is the percentage of the total inventory value that you want allocated to the base asset. For example, if you are trading BTC-USD but want to keep your inventory 100% in BTC, you set this value to 100.


To put it simply, as the trading session nears its end, the reservation price approaches the market mid-price, reducing the risk of holding inventory too far from the desired target. Depending on the market situation, however, this kind of approach might skew the market maker’s inventory in one direction, leaving the trader in a losing position as the asset value moves against them. The next parameter, denoted by the letter eta (η), controls how aggressively the order amount is set to reach the inventory target. It is inversely proportional to the asymmetry between the bid and ask order amounts. The Avellaneda Market Making Strategy is designed to scale inventory and keep it at a specific target defined by the user. To achieve this, the strategy optimizes both the bid and ask spreads and their order amounts to maximize profitability.

On Hummingbot, the value of q is calculated from the target inventory percentage you are aiming for. Adjust the settings by opening the strategy config file with a text editor. You can directly override the orders placed by setting order_amount and order_level_parameter. When placing orders, if the order’s size, determined by the order price and quantity, is below the exchange’s minimum order size, the orders will not be created.

After that, use config order_book_depth_factor and config risk_factor to set your custom values. On Hummingbot, you choose the asset inventory target, and the bot calculates the value of q. This parameter measures the difference between the current inventory position and the desired one. For now, it is enough to know that by using a large κ value you are assuming that the order book is denser, so your optimal spread will have to be smaller because there is more competition in the market. The paper contains a lot of mathematical detail explaining how the authors arrive at this factor by assuming exponential arrival rates.
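To make the role of q more concrete, here is a minimal sketch of how an inventory deviation could be derived from a target base percentage. The function name `inventory_deviation` and the normalization by total portfolio value are assumptions made for illustration; this is not Hummingbot's actual formula.

```python
def inventory_deviation(base_balance, quote_balance, price, target_base_pct):
    """Signed distance between current and target base allocation.

    Hypothetical normalization: the result is a fraction of total portfolio
    value (positive = holding too much base asset, negative = too little).
    """
    base_value = base_balance * price          # base holdings in quote units
    total_value = base_value + quote_balance   # whole portfolio in quote units
    if total_value <= 0:
        raise ValueError("portfolio value must be positive")
    target_base_value = total_value * target_base_pct / 100.0
    return (base_value - target_base_value) / total_value


# 1 BTC and 20,000 USD at a price of 20,000 with a 50% target: on target.
q = inventory_deviation(1.0, 20_000.0, 20_000.0, 50)
print(q)
```

A positive q would push the reservation price below the mid-price (favouring sells), and a negative q would push it above.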

Market Making With Signals Through Deep Reinforcement Learning

The Volatility Sensibility parameter triggers a recalculation of gamma, kappa, and eta once volatility has changed by the volatility sensibility threshold, expressed as a percentage. For example, when the parameter is set to 0, gamma, kappa, and eta are recalculated each time an order is created. In expert mode, the user needs to define the algorithm’s basic parameters, as described in the foundation paper, directly, and no recalculation of parameters takes place.


However, because the classification problem is imbalanced, we replace the categorical cross-entropy loss with the focal loss function. It is necessary to pay more attention to the minority cases and capture the patterns of these valuable long and short signals. Then, the model, trained daily or weekly, can predict trading actions and the probability of each choice at every tick. The next step is to trade the securities based on the information yielded by the predictions. Instead of investing the same proportion consistently, we devise an optimization scheme using the fractional Kelly growth criterion under risk control, which is achieved through the risk measure value at risk (VaR). Based on the estimates of historical VaR and returns for successful/failed actions, we provide a theoretical closed-form solution for the optimal investment proportion.
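As an illustration of why focal loss helps with imbalanced labels, the sketch below implements the standard per-sample focal loss, -α(1 - p_t)^γ log(p_t), in plain Python. The parameter names `focusing` (the focal γ, renamed here to avoid confusion with the risk-aversion γ) and `alpha`, and their default values, are assumptions; the source does not state which settings were used.

```python
import math


def focal_loss(probs, target, focusing=2.0, alpha=1.0):
    """Focal loss for a single sample.

    probs    -- predicted class probabilities (e.g. long, short, none)
    target   -- index of the true class
    focusing -- exponent that down-weights well-classified samples
    alpha    -- class weighting factor (assumed constant here)
    """
    p_t = min(max(probs[target], 1e-12), 1.0)  # clamp to avoid log(0)
    return -alpha * (1.0 - p_t) ** focusing * math.log(p_t)


# A confidently correct sample contributes far less than a badly missed one.
easy = focal_loss([0.9, 0.05, 0.05], target=0)
hard = focal_loss([0.1, 0.45, 0.45], target=0)
print(easy, hard)
```

With `focusing=0` and `alpha=1` the expression reduces to ordinary cross-entropy, which makes the effect of the modulating factor easy to check.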

Optimal high-frequency trading with limit and market orders

However, adding secure points to a WANET can be costly in terms of price and time, so minimizing the number of secure points is of utmost importance. Graph theory provides a solid foundation for tackling the emerging problems in WANETs. A vertex cover (VC) is a set of vertices such that every edge is incident to at least one vertex in the set. The minimum weighted connected VC (MWCVC) problem can be defined as finding the VC of connected nodes that has the minimum total weight. MWCVC is a very suitable infrastructure for energy-efficient link monitoring and virtual backbone formation.

These are additional parameters that you can reconfigure and use to customize the behavior of your strategy further. To change a setting, run the command config followed by the parameter name, e.g. config max_order_age. “Forecasting prices from level-I quotes in the presence of hidden liquidity.” This paper surveys recent developments in the literature regarding deep RL methods for building human-level agents, and provides an overview of constructing a framework for prospective autonomous systems. 5) Why do you opt for a discretized large action space instead of simply using a continuous action space and an appropriate RL algorithm, especially given that there is a great selection of RL algorithms capable of tackling continuous action spaces? We also plan to compare the performance of the Alpha-AS models with that of leading RL models in the literature that do not work with the Avellaneda-Stoikov procedure.

Extensions to the AS model have been proposed, most notably the Guéant-Lehalle-Fernandez-Tapia approximation and a recent variation of it by Bergault et al., which are currently used by major market making agents. Nevertheless, in practice, deviations from the model scenarios are to be expected. Under real trading conditions, therefore, there is room for improvement upon the orders generated by the closed-form AS model and its variants. In the AS reservation price formula, t_j is the current time upon arrival of the jth market tick, p_m is the current market mid-price, I is the current size of the inventory held, γ is a constant that models the agent’s risk aversion, and σ² is the variance of the market mid-price, a measure of volatility. Low-rank approximation algorithms aim to use the convex nuclear norm constraint of linear matrices to recover ill-conditioned entries caused by multiple sampling rates and sensor drop-out. However, these existing algorithms are often limited in handling high dimensionality and rank minimization relaxation.
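The symbols defined above for the reservation price (t_j, p_m, I, γ, σ²) can be tied together in a short sketch. It uses the standard closed form from the Avellaneda-Stoikov paper, r = p_m - I·γ·σ²·(T - t) for the reservation price and γσ²(T - t) + (2/γ)·ln(1 + γ/κ) for the total spread, where κ is the order book liquidity parameter. The function name `as_quotes`, the argument names, and the sample values are illustrative assumptions.

```python
import math


def as_quotes(mid_price, inventory, gamma, sigma, kappa, time_left):
    """Return (reservation_price, bid, ask) from the AS closed form.

    time_left is T - t, the remaining fraction of the trading session.
    """
    risk_term = gamma * sigma ** 2 * time_left
    # Holding positive inventory shifts the reservation price below the mid.
    reservation_price = mid_price - inventory * risk_term
    # Total spread, placed symmetrically around the reservation price.
    spread = risk_term + (2.0 / gamma) * math.log(1.0 + gamma / kappa)
    return (reservation_price,
            reservation_price - spread / 2.0,
            reservation_price + spread / 2.0)


# Long 5 units halfway through the session: quotes are skewed downwards.
r, bid, ask = as_quotes(mid_price=100.0, inventory=5, gamma=0.1,
                        sigma=2.0, kappa=1.5, time_left=0.5)
print(r, bid, ask)
```

Setting `time_left` to 0 makes the inventory term vanish, so the reservation price coincides with the mid-price at the end of the session; a larger `kappa` shrinks the logarithmic term and therefore the spread.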

Two Models on Limit Order Trading

The data for the first use of the genetic algorithm was the full day of trading on 8th December 2020. Our algorithm works through 10 generations of instances of the AS model, which we will refer to as individuals, each with a different chromosomal makeup. In the first generation, 45 individuals were created by assigning to each of the four genes random values within the defined ranges. These individuals are run through the orderbook data and are then ranked according to the Sharpe ratio they have attained. For each subsequent generation, 45 new individuals are run through the data and then added to the cumulative population, retaining all the individuals from previous generations. The 10 generations thus yield a total of 450 individuals, ranked by their Sharpe ratio.
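The generational loop described above can be sketched as follows. The population sizes (10 generations of 45 individuals with four genes) come from the text; everything else is an assumption made for illustration: the gene ranges, the way new individuals are bred (uniform crossover between top-ranked parents plus random mutation), and the `fitness` callback, which stands in for a backtest that returns a Sharpe ratio.

```python
import random

GENE_RANGES = [(0.0, 1.0)] * 4  # hypothetical ranges for the four genes


def random_individual(rng):
    return [rng.uniform(lo, hi) for lo, hi in GENE_RANGES]


def breed(parent_a, parent_b, rng, mutation_rate=0.1):
    """Assumed operator: uniform crossover followed by random mutation."""
    child = [rng.choice(pair) for pair in zip(parent_a, parent_b)]
    for i, (lo, hi) in enumerate(GENE_RANGES):
        if rng.random() < mutation_rate:
            child[i] = rng.uniform(lo, hi)
    return child


def evolve(fitness, generations=10, size=45, n_parents=10, seed=0):
    """Return the cumulative population as (score, genes), best first."""
    rng = random.Random(seed)
    newcomers = [random_individual(rng) for _ in range(size)]
    population = []
    for _ in range(generations):
        # Score the newcomers and keep everyone from earlier generations.
        population.extend((fitness(ind), ind) for ind in newcomers)
        population.sort(key=lambda pair: pair[0], reverse=True)
        parents = [ind for _, ind in population[:n_parents]]
        newcomers = []
        for _ in range(size):
            parent_a, parent_b = rng.sample(parents, 2)
            newcomers.append(breed(parent_a, parent_b, rng))
    return population


# Toy fitness in place of a Sharpe ratio from an orderbook backtest.
ranked = evolve(lambda genes: -sum((g - 0.5) ** 2 for g in genes))
print(len(ranked), ranked[0])
```

The first entry of the returned list plays the role of the best-performing individual whose chromosome would be retained.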

The mean Max DD for the Gen-AS model over the entire test period was visibly the lowest, and its standard deviation was also by far the lowest among all models. In comparison, both the mean and the standard deviation of the Max DD for the Alpha-AS models were very high. Nevertheless, the differences in Max DD performance between Gen-AS and either of the Alpha-AS models, over all test days, are not statistically significant, despite the large differences in means. The latter are a result of extreme outliers for the Alpha-AS models from days on which these obtained a very poor (i.e., high) value for Max DD. The medians, however, are very similar to the median for the Gen-AS model.

One of the most active areas of research in algorithmic trading is, broadly, the application of machine learning algorithms to derive trading decisions based on underlying trends in the volatile and hard-to-predict activity of securities markets. Machine learning is being applied to time series prediction (for instance, of next-day prices); risk management (e.g., substituting an ML model for the commonly used Principal Components Analysis approach); and the improvement or discovery of factors in factor investing [10–13]. Machine learning approaches have been explored to obtain dynamic limit order placement strategies that attempt to adapt in real time to changing market conditions. As regards market making, the AS algorithm, or versions of it, have been used as benchmarks against which to measure the improved performance of the machine learning algorithms proposed, working either with simulated data or in backtests with real data.

High-frequency trading and market performance

The Asymmetric dampened P&L penalizes speculative positions, as speculative profits are not added while losses are discounted. Single feature importance (SFI) is an out-of-sample estimator of the individual importance of each feature that avoids the substitution effect found with MDI and MDA. Is the sum of the corresponding quantity over all of the orderbook levels. s′ is the state the MDP has transitioned to when taking action a from state s, at which it arrived in the previous iteration. To maximize trade profitability, spreads should be enlarged such that the expected future value of the account is maximized.

  • In the literature, reinforcement learning approaches to market making typically employ models that act directly on the agent’s order prices, without taking advantage of knowledge we may have of market behaviour or indeed findings in market-making theory.
  • α is the learning rate (α ∈ [0, 1]), which reduces to a fraction the amount of change that is applied to Q_i from the observation of the latest reward and the expectation of optimal future rewards.
  • Then, the model trained daily or weekly can predict trading actions and the probability of each choice at every tick.
  • Following the approach in López de Prado , where random forests are applied to an automatic classification task, we performed a selection from among our market features , based on a random forest classifier.
  • To start filling Alpha-AS memory replay buffer and training the model (Section 5.2).
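The role of the learning rate α described in the list above can be seen in a minimal tabular Q-learning update. This is a textbook sketch, not the double DQN used by the Alpha-AS models: the function name `q_update`, the dictionary-based Q-table, and the discount factor `discount` are assumptions introduced for illustration.

```python
from collections import defaultdict


def q_update(q_table, state, action, reward, next_state, actions,
             alpha=0.1, discount=0.9):
    """Apply one Q-learning step and return the updated Q(state, action)."""
    # Expectation of optimal future rewards from the state s' just reached.
    best_next = max(q_table[(next_state, a)] for a in actions)
    td_target = reward + discount * best_next
    # alpha scales how much of the correction is applied to the old estimate.
    q_table[(state, action)] += alpha * (td_target - q_table[(state, action)])
    return q_table[(state, action)]


q_table = defaultdict(float)  # unseen (state, action) pairs start at 0.0
first = q_update(q_table, "s", 0, reward=1.0, next_state="s2",
                 actions=[0, 1], alpha=0.5)
second = q_update(q_table, "s", 0, reward=1.0, next_state="s2",
                  actions=[0, 1], alpha=0.5)
print(first, second)
```

With α = 0.5 each update closes half of the gap between the current estimate and the target, so repeated identical observations move the value progressively closer to the observed reward.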

Maximum drawdown is the largest loss of portfolio value registered between any two points of a full day of trading. Similarly, on the Sortino ratio, one or the other of the two Alpha-AS models performed better, that is, obtained better downside-risk-adjusted returns, than all the baseline models on 25 (12+13) of the 30 days. Again, on 9 of the 12 days for which Alpha-AS-1 had the best Sharpe ratio, Alpha-AS-2 had the second best; and for 10 of the 13 test days for which Alpha-AS-2 obtained the best Sortino ratio, Alpha-AS-1 performed second best.
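The maximum drawdown definition above translates into a single pass over the portfolio values. The sketch below reports the drawdown in absolute value units and assumes the series is ordered in time; the function name `max_drawdown` and the sample series are illustrative.

```python
def max_drawdown(values):
    """Largest peak-to-later-trough loss in a time-ordered value series."""
    peak = values[0]
    worst = 0.0
    for value in values:
        peak = max(peak, value)           # highest value seen so far
        worst = max(worst, peak - value)  # loss relative to that peak
    return worst


# The peak of 120 is followed by a low of 80, a drawdown of 40.
print(max_drawdown([100, 120, 90, 110, 80]))
```

A higher value is worse, which is why the text describes a high Max DD as a poor result.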

Double DQN is a deep RL approach, more specifically deep Q-learning, that relies on two neural networks, as we shall see shortly (in Section 4.1.7). In this paper we present a double DQN applied to the market-making decision process. The cumulative profit resulting from a market maker’s operations comes from the successive execution of trades on both sides of the spread. This profit from the spread is endangered when the market maker’s buy and sell operations are not balanced overall in volume, since this will increase the dealer’s asset inventory.

The RL agents (Alpha-AS) developed to use the Avellaneda-Stoikov equations to determine their actions are described in Section 4.1. An agent that simply applies the Avellaneda-Stoikov procedure with fixed parameters (Gen-AS), and the genetic algorithm used to obtain said parameters, are presented in Section 4.2. With the risk aversion parameter, you tell the bot how much inventory risk you want to take. A value close to 1 indicates that you don’t want to take too much inventory risk, and Hummingbot will “push” the reservation price further to reach the inventory target. In beginner mode, the user is asked to enter min and max spread limits, and their aversion to inventory risk on a scale from 0 to 1.

Finally, the best-performing model overall, with its corresponding parameter values contained in its chromosome, is retained for subsequent application to the problem at hand. In our case, it will be the AS model used as a baseline against which to compare the performance of our Alpha-AS model. Data normalization for features and labeling for signals are required for classification. Instead of simply labeling the mid-price movement as in Kercheval and Zhang and Tsantekidis et al., we consider the direct trading actions, including long, short, and none. This approach is inspired by the previous application of deep learning to trade signals in the context of VIX futures (Avellaneda et al., 2021).

Overall performance is more meaningfully obtained from the other indicators (Sharpe, Sortino and P&L-to-MAP), which show that, at the end of the day, the Alpha-AS models’ strategy pays off. The usual approach in algorithmic trading research is to use machine learning algorithms to determine the buy and sell orders directly. In contrast, we propose maintaining the Avellaneda-Stoikov procedure as the basis upon which to determine the orders to be placed. We use a reinforcement learning algorithm, a double DQN, to adjust, at each trading step, the values of the parameters that are modelled as constants in the AS procedure.

You will need to hold a sufficient inventory of quote and/or base currencies on the exchange to place orders of the exchange’s minimum order size. So far, the model has only been applied to a single currency pair, but he has already worked with Guéant and Bergault on extending it to multi-currency portfolios. Pulling all of that together was mathematically complicated because client flows are discrete while trading on liquidity pools is continuous. For instance, even after comments about reference formatting, some references have missing publications, years, issues, or even author names. Also, there seems to be a large number of arXiv or SSRN preprints listed for references that have actually been published, either as working papers by some institutions or even in peer-reviewed journals.