├── .gitignore ├── README.md ├── TP 1 - Courbes Intraday_final.ipynb ├── TP 2 - Volatility and Correlation_final.ipynb ├── TP 3 - Stochastic Approximation Algorithms Applied to Optimal Execution.ipynb ├── TP 4 - Market Making.ipynb └── imgs ├── DynamicalMarketMaking.png ├── EppsEffect.png ├── MarketMaking.png ├── MarketMaking2.png ├── MultipleDFillrate.png ├── MultipleDFillrate2.png ├── MultipleDtFillrate.png ├── SignaturePlot.png └── SingleFillrate.png /.gitignore: -------------------------------------------------------------------------------- 1 | Data/ 2 | .DS_Store 3 | *.pdf 4 | .ipynb_checkpoints/ 5 | .vscode/ -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # ML_Optimal_Trading 2 | 3 | This is a repository for TPs at SU for the module https://finance.math.upmc.fr/enseignements/2_stat_2_trading_quantitatif/ 4 | -------------------------------------------------------------------------------- /TP 3 - Stochastic Approximation Algorithms Applied to Optimal Execution.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": { 6 | "id": "hN3rRJOm9oTv" 7 | }, 8 | "source": [ 9 | "# TP3: Stochastic Approximation Algorithms Applied to Optimal Execution" 10 | ] 11 | }, 12 | { 13 | "cell_type": "markdown", 14 | "metadata": { 15 | "id": "SJLIcC4y9oT2" 16 | }, 17 | "source": [ 18 | "### By Zhiyuan XU and Wenjun LIU" 19 | ] 20 | }, 21 | { 22 | "cell_type": "code", 23 | "execution_count": 3, 24 | "metadata": { 25 | "id": "7AjbwzNa9oT3" 26 | }, 27 | "outputs": [], 28 | "source": [ 29 | "import numpy as np\n", 30 | "import pandas as pd \n", 31 | "import matplotlib.pylab as plt\n", 32 | "from matplotlib.dates import DateFormatter\n", 33 | "from matplotlib.ticker import Formatter\n", 34 | "from matplotlib import cbook, dates\n", 35 | "\n", 36 | "from 
datetime import datetime\n", 37 | "import statsmodels.api as sm\n", 38 | "from tqdm import tqdm\n", 39 | "import os" 40 | ] 41 | }, 42 | { 43 | "cell_type": "code", 44 | "execution_count": 5, 45 | "metadata": { 46 | "id": "E3UZx6o69oT3" 47 | }, 48 | "outputs": [], 49 | "source": [ 50 | "%matplotlib inline" 51 | ] 52 | }, 53 | { 54 | "cell_type": "markdown", 55 | "metadata": { 56 | "id": "5MGb822M9oT4" 57 | }, 58 | "source": [ 59 | "## 1. Optimal split of orders across liquidity pools" 60 | ] 61 | }, 62 | { 63 | "cell_type": "markdown", 64 | "metadata": { 65 | "id": "svVBL83X9oT4" 66 | }, 67 | "source": [ 68 | "Given the objective function $\n", 69 | "\\Phi(r) = \\Phi\\left(r_{1}, \\ldots, r_{N}\\right):=\\sum_{i=1}^{N} \\varphi_{i}\\left(r_{i}\\right) := \\sum_{i=1}^{N} \\rho_{i} \\mathbb{E}\\left[ \\min \\left(r_{i} V, D_{i}\\right)\\right]\n", 70 | "$, we aim at solving \n", 71 | "$\n", 72 | "\\max _{r \\in \\mathcal{P}_{N}} \\Phi(r)\n", 73 | "$ \n", 74 | "\n", 75 | "where\n", 76 | "$$\n", 77 | "\\mathcal{P}_{N}:=\\left\\{r=\\left(r_{i}\\right)_{1 \\leq i \\leq N} \\in \\mathbb{R}_{+}^{N} \\mid \\sum_{i=1}^{N} r_{i}=1\\right\\}\n", 78 | "$$\n", 79 | "\n", 80 | "The Lagrangian approach gives \n", 81 | "\n", 82 | "$$\n", 83 | "\\begin{aligned}\n", 84 | "r^{*} \\in \\arg \\max _{\\mathcal{P}_{N}} \\Phi &\\iff \\forall i \\in I_{N}, \\quad \\varphi_{i}^{\\prime}\\left(r_{i}^{*}\\right)=\\frac{1}{N} \\sum_{j=1}^{N} \\varphi_{j}^{\\prime}\\left(r_{j}^{*}\\right) \\\\\n", 85 | "&\\iff \\forall i \\in I_{N}, \\quad \\mathbb{E}\\left[V\\left(\\rho_{i} \\mathbb{1}_{\\left\\{r_{i}^{*} V < D_{i}\\right\\}}-\\frac{1}{N} \\sum_{j=1}^{N} \\rho_{j} \\mathbb{1}_{\\left\\{r_{j}^{*} V < D_{j}\\right\\}}\\right)\\right]=0\n", "\\end{aligned}\n", "$$" […] 161 | " while j >= 0 and tot > 0:\n", 162 | " cr_oracle += rho_rebate[j] * (min(tot, dark_pools_arg[j][i]))\n", 163 | " tot -= min(tot, dark_pools[j][i])\n", 164 | " j -= 1\n", 165 | " return cr_oracle" 166 | ] 167 | }, 168 | { 169 | "cell_type": "code", 170 | "execution_count": 5, 171 | "metadata": { 172 | "id": "qR8GtiCg9oT5" 173 | }, 174 | "outputs": [], 175 | "source": [ 176 | "# choose a horizon T 
(seconds)\n", 177 | "T = \"5S\"\n", 178 | "eps = 1e-8\n", 179 | "\n", 180 | "# read all data\n", 181 | "df_trades_V = get_asset_V(stock_data[0])\n", 182 | "trades = get_pools(stock_data[1:])" 183 | ] 184 | }, 185 | { 186 | "cell_type": "code", 187 | "execution_count": 6, 188 | "metadata": { 189 | "id": "eezyJtvY9oT6", 190 | "scrolled": true 191 | }, 192 | "outputs": [], 193 | "source": [ 194 | "window = 50\n", 195 | "log_cr = pd.DataFrame(columns=[\"Oracle\", \"Optimization\"])\n", 196 | "x_vlines = [0]\n", 197 | " \n", 198 | "for date, trades_date in df_trades_V.groupby(level=0):\n", 199 | " V = trades_date.droplevel(0).resample(T).sum().Volume\n", 200 | " t_index = V.index\n", 201 | " t_index_real = []\n", 202 | " V = V.to_numpy(copy=True)\n", 203 | " dark_pools = []\n", 204 | " \n", 205 | " # reset r for each day\n", 206 | " r = np.array([1/N]*N)\n", 207 | " \n", 208 | " # compute Di of each dark pool (pseudo-data generation)\n", 209 | " for i in range(N):\n", 210 | " Si = trades[i].loc[date].resample(T).sum().Volume.to_numpy(copy=True)\n", 211 | " Si.resize(V.shape)\n", 212 | " dark_pools.append(Si)\n", 213 | " dark_pools = np.array(dark_pools)\n", 214 | " \n", 215 | " ESi = np.mean(dark_pools, axis=0, keepdims=True) + eps\n", 216 | " dark_pools = beta_scale.reshape(-1, 1)*((1-alpha_comb.reshape(-1, 1))*V + alpha_comb.reshape(-1, 1)*dark_pools*V.mean()/ESi)\n", 217 | "\n", 218 | " cr_opti_day = []\n", 219 | " cr_oracle_day = []\n", 220 | " \n", 221 | " for i in range(V.shape[0]): \n", 222 | " if V[i] == 0:\n", 223 | " continue\n", 224 | " cr_opti, mean_rho = 0, 0\n", 225 | "\n", 226 | " for j in range(N):\n", 227 | " cr_opti += rho_rebate[j]*min(r[j]*V[i], dark_pools[j, i])\n", 228 | " if r[j]*V[i] < dark_pools[j][i]:\n", 229 | " mean_rho += rho_rebate[j] / N\n", 230 | " \n", 231 | " dark_pool = dark_pools[:, i]\n", 232 | " r += ((r*V[i] < dark_pool)*rho_rebate - mean_rho)*0.5\n", 233 | " r[r > 1] = 1\n", 234 | " r[r < 0] = 0\n", 235 | " r /= r.sum()\n", 236 | 
" \n", 237 | " cr_opti_day.append(cr_opti)\n", 238 | " cr_oracle_day.append(compute_oracle_cost(V[i], dark_pools, rho_rebate, N))\n", 239 | " t_index_real.append(t_index[i])\n", 240 | "\n", 241 | " x_vlines.append(x_vlines[-1] + len(cr_opti_day))\n", 242 | " log_cr_day = pd.DataFrame(\n", 243 | " list(zip(cr_oracle_day, cr_opti_day)), columns=[\"Oracle\", \"Optimization\"], index=t_index_real)\n", 244 | " log_cr_day.loc[:, \"CR_ratio\"] = log_cr_day.loc[:, \"Optimization\"] / (log_cr_day.loc[:, \"Oracle\"] + eps)\n", 245 | " log_cr = pd.concat([log_cr, log_cr_day.rolling(window).mean()])" 246 | ] 247 | }, 248 | { 249 | "cell_type": "code", 250 | "execution_count": 15, 251 | "metadata": { 252 | "id": "lgNVVumT9oT6", 253 | "outputId": "9b5a7967-515d-44f6-8cae-76cb2965ad18" 254 | }, 255 | "outputs": [ 256 | { 257 | "data": { 258 | "image/png": "\n", 259 | "text/plain": [ 260 | "
" 261 | ] 262 | }, 263 | "metadata": { 264 | "needs_background": "light" 265 | }, 266 | "output_type": "display_data" 267 | } 268 | ], 269 | "source": [ 270 | "log_cr_roll = log_cr\n", 271 | "\n", 272 | "class MyFormatter(Formatter):\n", 273 | " def __init__(self, dates, fmt='%m-%d'):\n", 274 | " self.dates = dates\n", 275 | " self.fmt = fmt\n", 276 | "\n", 277 | " def __call__(self, x, pos=0):\n", 278 | " \"\"\"Return the label for time x at position pos.\"\"\"\n", 279 | " ind = int(round(x))\n", 280 | " if ind >= len(self.dates) or ind < 0:\n", 281 | " return ''\n", 282 | " return self.dates[ind].strftime(self.fmt)\n", 283 | " \n", 284 | "fig, ax = plt.subplots(1, 1, figsize=(9, 5))\n", 285 | "ax.xaxis.set_major_formatter(MyFormatter(log_cr_roll.index))\n", 286 | "ax.plot(range(log_cr_roll.shape[0]), log_cr_roll.loc[:, \"CR_ratio\"], '--', color='r')\n", 287 | "ax.axhline(1, 0, log_cr_roll.shape[0], color='k')\n", 288 | "for x in x_vlines:\n", 289 | " ax.axvline(x, color='b', linestyle='--')\n", 290 | "ax.set_ylim(0, 1.2)\n", 291 | "fig.autofmt_xdate()\n", 292 | "plt.show()" 293 | ] 294 | }, 295 | { 296 | "cell_type": "markdown", 297 | "metadata": { 298 | "id": "g6cgmlnX9oT7" 299 | }, 300 | "source": [ 301 | "The performance is shown by the red curve in the figure above. Its shape differs from that of the reference. After careful examination, we attribute this to the uniform initialization, which can consume all available liquidity at the market open when $\\sum \\beta < 1$: the traded volumes are similar across the three assets, so no weight update occurs at the beginning.\n", 302 | "\n", 303 | "In addition, no trades occur in some time slots during the day (cf. the intraday volume curve of TP 1). The cost reduction is 0 whenever no trade happens, which lowers the measured performance and produces a daily \"V\" shape. We therefore skip such slots when recording results, and obtain a performance that increases over each day. 
\n", 304 | "\n", 305 | "Overall, the weight update scheme helps to improve the performance (performance curve moves upward) but the shape of the curve remains the same for different settings of $\\gamma_n$" 306 | ] 307 | }, 308 | { 309 | "cell_type": "markdown", 310 | "metadata": { 311 | "id": "y1cljsqJ9oT8" 312 | }, 313 | "source": [ 314 | "## 2. Optimal posting price of limit orders: learning by trading" 315 | ] 316 | }, 317 | { 318 | "cell_type": "code", 319 | "execution_count": 8, 320 | "metadata": { 321 | "id": "06WWf9FD9oT8" 322 | }, 323 | "outputs": [], 324 | "source": [ 325 | "import statsmodels.api as sm\n", 326 | "\n", 327 | "\n", 328 | "def Find_tau_bid(BestBid,Traded,tick,delta):\n", 329 | " tau = []\n", 330 | " p = 0\n", 331 | " while p+1 < len(Traded):\n", 332 | " tra = Traded[p+1:]\n", 333 | " condi = (tra <= BestBid[p] - tick*delta)\n", 334 | " if condi.any():\n", 335 | " new_tau = condi.argmax()\n", 336 | " tau.append(new_tau+1)\n", 337 | " p += new_tau+1\n", 338 | " else:\n", 339 | " return tau\n", 340 | " return tau\n", 341 | "\n", 342 | "\n", 343 | "def Find_tau_ask(BestAsk,Traded,tick,delta):\n", 344 | " tau = []\n", 345 | " p = 0\n", 346 | " while p+1 < len(Traded):\n", 347 | " tra = Traded[p+1:]\n", 348 | " condi = (tra >= BestAsk[p] + tick*delta)\n", 349 | " if condi.any():\n", 350 | " new_tau = condi.argmax()\n", 351 | " tau.append(new_tau+1)\n", 352 | " p += new_tau+1\n", 353 | " else:\n", 354 | " return tau\n", 355 | " return tau" 356 | ] 357 | }, 358 | { 359 | "cell_type": "code", 360 | "execution_count": 9, 361 | "metadata": { 362 | "id": "JWeedcXi9oT9" 363 | }, 364 | "outputs": [], 365 | "source": [ 366 | "def estimate_lambda(tau,T):\n", 367 | " new_tau = ((pd.Series(tau)).apply(lambda x:x if x < T else T))\n", 368 | " return (new_tau < T).sum()/new_tau.sum()" 369 | ] 370 | }, 371 | { 372 | "cell_type": "code", 373 | "execution_count": 10, 374 | "metadata": { 375 | "id": "AAT9m1PQ9oT9" 376 | }, 377 | "outputs": [], 378 | "source": [ 
379 | "def find_list(Best,Traded,tick,T,mode = 'bid'):\n", 380 | " lamb_list = []\n", 381 | " for delta in range(5):\n", 382 | " if mode == 'bid':\n", 383 | " tau = Find_tau_bid(Best,Traded,tick,delta)\n", 384 | " elif mode == 'ask':\n", 385 | " tau = Find_tau_ask(Best,Traded,tick,delta)\n", 386 | " lamb = estimate_lambda(tau,T)\n", 387 | " lamb_list.append(lamb)\n", 388 | " return lamb_list\n", 389 | "\n", 390 | "def reg(lamb_list):\n", 391 | " y = np.log(lamb_list)\n", 392 | " x = list(range(5))\n", 393 | " x = sm.add_constant(x)\n", 394 | " rlm_model = sm.RLM(y, x, M=sm.robust.norms.HuberT())\n", 395 | " model = rlm_model.fit()\n", 396 | " A = np.exp(model.params[0])\n", 397 | " a = -model.params[1]\n", 398 | " return A,a" 399 | ] 400 | }, 401 | { 402 | "cell_type": "code", 403 | "execution_count": 11, 404 | "metadata": { 405 | "id": "CZzpcAvJ9oT9" 406 | }, 407 | "outputs": [], 408 | "source": [ 409 | "# available stocks: \"LVMH\" \"SANOFI\" \"TOTAL\" \"BOUYGUES\"\n", 410 | "Chemin = \"Data\"\n", 411 | "StockName = \"LVMH\"\n", 412 | "stock = pd.read_hdf(Chemin+'/'+StockName+'.h5')\n", 413 | "stock = stock[stock.index.month == 2]\n", 414 | "delta_max = 4\n", 415 | "T = 15\n", 416 | "tick = 0.05\n", 417 | "BestBid = stock['BidPrice'].to_numpy()\n", 418 | "BestAsk = stock['AskPrice'].to_numpy()\n", 419 | "Traded = stock['TradedPrice'].to_numpy()" 420 | ] 421 | }, 422 | { 423 | "cell_type": "code", 424 | "execution_count": 12, 425 | "metadata": { 426 | "id": "45Q5wPyR9oT-", 427 | "outputId": "f8a51699-d448-4ffa-d48b-20fcef2f879a" 428 | }, 429 | "outputs": [ 430 | { 431 | "data": { 432 | "text/plain": [ 433 | "" 434 | ] 435 | }, 436 | "execution_count": 12, 437 | "metadata": {}, 438 | "output_type": "execute_result" 439 | }, 440 | { 441 | "data": { 442 | "image/png": "\n", 443 | "text/plain": [ 444 | "
" 445 | ] 446 | }, 447 | "metadata": { 448 | "needs_background": "light" 449 | }, 450 | "output_type": "display_data" 451 | } 452 | ], 453 | "source": [ 454 | "lamb_bid = find_list(stock['BidPrice'].to_numpy(),Traded,tick,T,mode = 'bid') \n", 455 | "lamb_ask = find_list(stock['AskPrice'].to_numpy(),Traded,tick,T,mode = 'ask')\n", 456 | "A_bid,a_bid = reg(lamb_bid)\n", 457 | "A_ask,a_ask = reg(lamb_ask)\n", 458 | "\n", 459 | "xx = np.linspace(0,4,100)\n", 460 | "\n", 461 | "f, axs = plt.subplots(1, 2, figsize=(20,6))\n", 462 | "axs[0].plot(lamb_bid,'--o',label = 'lambda')\n", 463 | "axs[0].plot(xx,A_bid * np.exp(-a_bid*xx),'r',label = 'regression')\n", 464 | "axs[0].set_title(\"Estimator of lambda and its parametrization on the bid side\")\n", 465 | "axs[0].set_xlabel('\delta')\n", 466 | "axs[0].set_ylabel('Estimated lambda')\n", 467 | "axs[0].legend(shadow=True, fancybox=True)\n", 468 | "\n", 469 | "axs[1].plot(lamb_ask,'--o',label = 'lambda')\n", 470 | "axs[1].plot(xx,A_ask * np.exp(-a_ask*xx),'r',label = 'regression')\n", 471 | "axs[1].set_title('Estimator of lambda and its parametrization on the ask side')\n", 472 | "axs[1].set_xlabel('\delta')\n", 473 | "axs[1].set_ylabel('Estimated lambda')\n", 474 | "axs[1].legend(shadow=True, fancybox=True)" 475 | ] 476 | }, 477 | { 478 | "cell_type": "code", 479 | "execution_count": null, 480 | "metadata": { 481 | "id": "kmXZaeTI9oT-" 482 | }, 483 | "outputs": [], 484 | "source": [] 485 | }, 486 | { 487 | "cell_type": "markdown", 488 | "metadata": { 489 | "id": "QvqWwJw_9xAZ" 490 | }, 491 | "source": [ 492 | "We see here that $\\lambda$ indeed decreases with $\\delta$ and closely follows an exponential form\n", 493 | "$$\\lambda (x) = Ae^{-ax}$$\n", 494 | "Both the ask and bid sides have this property. This confirms the model's assumption."
495 | ] 496 | }, 497 | { 498 | "cell_type": "code", 499 | "execution_count": null, 500 | "metadata": { 501 | "id": "0xWBiKx0_Cx8" 502 | }, 503 | "outputs": [], 504 | "source": [] 505 | } 506 | ], 507 | "metadata": { 508 | "colab": { 509 | "collapsed_sections": [], 510 | "name": "TP 3 - Stochastic Approximation Algorithms Applied to Optimal Execution_Wenjun LIU&&Zhiyuan XU.ipynb", 511 | "provenance": [] 512 | }, 513 | "kernelspec": { 514 | "display_name": "Python 3", 515 | "language": "python", 516 | "name": "python3" 517 | }, 518 | "language_info": { 519 | "codemirror_mode": { 520 | "name": "ipython", 521 | "version": 3 522 | }, 523 | "file_extension": ".py", 524 | "mimetype": "text/x-python", 525 | "name": "python", 526 | "nbconvert_exporter": "python", 527 | "pygments_lexer": "ipython3", 528 | "version": "3.8.5-final" 529 | } 530 | }, 531 | "nbformat": 4, 532 | "nbformat_minor": 4 533 | } -------------------------------------------------------------------------------- /imgs/DynamicalMarketMaking.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/gaelwjl/ML_Optimal_Trading/d834f97a60eb6e15eed86696c32709f37d43c64b/imgs/DynamicalMarketMaking.png -------------------------------------------------------------------------------- /imgs/EppsEffect.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/gaelwjl/ML_Optimal_Trading/d834f97a60eb6e15eed86696c32709f37d43c64b/imgs/EppsEffect.png -------------------------------------------------------------------------------- /imgs/MarketMaking.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/gaelwjl/ML_Optimal_Trading/d834f97a60eb6e15eed86696c32709f37d43c64b/imgs/MarketMaking.png -------------------------------------------------------------------------------- /imgs/MarketMaking2.png: 
-------------------------------------------------------------------------------- https://raw.githubusercontent.com/gaelwjl/ML_Optimal_Trading/d834f97a60eb6e15eed86696c32709f37d43c64b/imgs/MarketMaking2.png -------------------------------------------------------------------------------- /imgs/MultipleDFillrate.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/gaelwjl/ML_Optimal_Trading/d834f97a60eb6e15eed86696c32709f37d43c64b/imgs/MultipleDFillrate.png -------------------------------------------------------------------------------- /imgs/MultipleDFillrate2.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/gaelwjl/ML_Optimal_Trading/d834f97a60eb6e15eed86696c32709f37d43c64b/imgs/MultipleDFillrate2.png -------------------------------------------------------------------------------- /imgs/MultipleDtFillrate.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/gaelwjl/ML_Optimal_Trading/d834f97a60eb6e15eed86696c32709f37d43c64b/imgs/MultipleDtFillrate.png -------------------------------------------------------------------------------- /imgs/SignaturePlot.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/gaelwjl/ML_Optimal_Trading/d834f97a60eb6e15eed86696c32709f37d43c64b/imgs/SignaturePlot.png -------------------------------------------------------------------------------- /imgs/SingleFillrate.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/gaelwjl/ML_Optimal_Trading/d834f97a60eb6e15eed86696c32709f37d43c64b/imgs/SingleFillrate.png --------------------------------------------------------------------------------
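The weight update used in section 1 of the TP 3 notebook above (a stochastic-approximation step, then clipping and renormalization) can be sketched on synthetic data as follows. This is a minimal illustration under stated assumptions, not the notebook's code: the helper names `update_split` and `oracle_cost_reduction`, the random order sizes `V` and pool liquidities `D`, the rebate values in `rho`, and the step size `gamma` are all hypothetical. As in the notebook, the update omits the factor $V$ and approximates the projection onto the simplex by clipping to $[0, 1]$ and dividing by the sum.

```python
import numpy as np


def oracle_cost_reduction(V, D, rho):
    """Best achievable cost reduction when D is known in advance:
    fill the pools greedily, highest rebate first."""
    remaining, total = float(V), 0.0
    for j in np.argsort(rho)[::-1]:
        executed = min(remaining, D[j])
        total += rho[j] * executed
        remaining -= executed
    return total


def update_split(r, V, D, rho, gamma):
    """One stochastic-approximation step on the allocation r.

    A pool that absorbed its whole order (r_i * V < D_i) gains weight in
    proportion to its rebate; the mean is subtracted so that the
    increments sum to zero.
    """
    filled = (r * V < D).astype(float)
    H = rho * filled - np.mean(rho * filled)
    # Heuristic projection (assumption): clip, then renormalize to sum to 1.
    r_new = np.clip(r + gamma * H, 0.0, 1.0)
    return r_new / r_new.sum()


rng = np.random.default_rng(0)
N = 3
rho = np.array([0.01, 0.03, 0.05])  # hypothetical rebates
r = np.full(N, 1.0 / N)             # uniform initialization

ratios = []
for n in range(1, 2001):
    V = rng.uniform(50.0, 150.0)        # synthetic order size
    D = rng.uniform(0.0, 80.0, size=N)  # synthetic pool liquidity
    achieved = float(np.sum(rho * np.minimum(r * V, D)))
    oracle = oracle_cost_reduction(V, D, rho)
    ratios.append(achieved / oracle if oracle > 0 else 1.0)
    r = update_split(r, V, D, rho, gamma=0.5 / n)

print("final allocation:", np.round(r, 3))
print("mean ratio to oracle (last 500 steps):", round(float(np.mean(ratios[-500:])), 3))
```

The ratio of achieved to oracle cost reduction plays the role of the notebook's `CR_ratio`. It cannot exceed 1, because the oracle allocation is chosen with knowledge of `D`.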