├── .gitignore
├── LICENSE.md
├── Procfile_dash
├── Procfile_restapi
├── README.md
├── art
│   └── homepage.png
├── assets
│   ├── favicon.ico
│   ├── s1.css
│   └── style.css
├── dash_app.py
├── dash_app_functions.py
├── docker
│   └── Dockerfile
├── requirements.txt
├── rest_api.py
├── rest_api_models.py
├── runtime.txt
├── start.sh
├── stock_pattern_analyzer
│   ├── __init__.py
│   ├── data.py
│   ├── search_index.py
│   ├── search_model.py
│   └── visualization.py
├── symbols.txt.example
└── tests
    ├── measurements.py
    └── rest_api_stress_test.py

/.gitignore:
--------------------------------------------------------------------------------
.idea/
__pycache__/
*.pk
symbols.txt


--------------------------------------------------------------------------------
/LICENSE.md:
--------------------------------------------------------------------------------
Stock Pattern Analyzer Tool License - Version 1.0

Copyright 2023 Gabor Vecsei

The following terms and conditions govern the use, modification, and distribution of the trading tool (the "Tool")
developed by Gabor Vecsei (the "Author").

1. Grant of License

Subject to the terms and conditions of this license, the Author hereby grants you a worldwide, non-exclusive,
royalty-free, revocable license to use, modify, and distribute the Tool for non-commercial purposes.

2. Attribution

When using, modifying, or distributing the Tool, you must give appropriate credit to the Author by clearly mentioning the following:

- The name of the Author
- The title of the Tool
- The URL or link to the original repository

3. Non-Commercial Use

You are not permitted to use the Tool, in whole or in part, for commercial purposes without obtaining a separate commercial license from the Author.
Commercial purposes include, but are not limited to, selling, licensing, or distributing the Tool for financial gain.

4. No Derivative Works for Commercial Purposes

You may not distribute modified versions of the Tool for commercial purposes without obtaining a separate commercial license from the Author.
However, you are allowed to make modifications to the Tool for personal use or non-commercial research purposes.

5. No Warranty

The Tool is provided "as is" without any warranties or guarantees of any kind, whether expressed or implied.
The Author shall not be held liable for any damages or liabilities arising from the use, modification, or distribution of the Tool.

6. Entire Agreement

This license constitutes the entire agreement between the parties regarding the use, modification,
and distribution of the Tool and supersedes any prior agreements or understandings, whether written or oral.

For any inquiries or requests regarding commercial use or obtaining a commercial license, please contact
the Author at vecseigabor.x@gmail.com.

--------------------------------------------------------------------------------
/Procfile_dash:
--------------------------------------------------------------------------------
web: gunicorn dash_app:server
--------------------------------------------------------------------------------
/Procfile_restapi:
--------------------------------------------------------------------------------
web: uvicorn rest_api:app --host=0.0.0.0 --port=${PORT:-5000}
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# Stocks Pattern Analyzer

![homepage](art/homepage.png)

> *As I am not a frontend guy, the client does not look good at all on mobile devices.
This is the best I could do. Help is greatly appreciated.*

## Run it locally

Include ticker symbols with the `symbols.txt` file - put each symbol on a new line. Check `symbols.txt.example` for
an example use case.

There are 2 special symbols which you can use as shortcuts:
- `$SP500` to include all S&P 500 symbols
- `$CURRENCY_PAIRS` to include currency pairs where the base currency is EUR

### Build & Run with Docker

(Execute these in the root folder of the project)

```shell script
# Build the image
$ docker build -t stock -f docker/Dockerfile .
# Run it
$ docker run --rm --name stock -v $(pwd):/code -p 8050:8050 stock start.sh
```

After this you can access it at `localhost:8050`

> *Disclaimer*: in a proper setup you would create 2 different images, one for the RestAPI and one for the Client App.
Then with a `docker-compose.yml` you could create the services. But just like with Heroku, this is a toy and local
deployment, so I won't do fancy stuff here.

### Run directly

- `python rest_api.py`
- Wait until the data creation and search model creation is done (1-2 mins)
- `python dash_app.py`
- The environment variable `$REST_API_URL` controls the connection with the RestAPI. It should be the base URL
- Enjoy :sunglasses:

## Deployment to Heroku (toy deployment)

First of all, this is a mono-repo, which is not ideal, but the deployment is just an example.
This is why a multi-buildpack solution is used with `heroku-community/multi-procfile`.
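
Once both apps are deployed (commands below), you can sanity-check the RestAPI through its readiness endpoint.
A minimal check, assuming the `stock-restapi` app name used in the commands that follow; the JSON body mirrors
the `IsReadyResponse` model in `rest_api_models.py` and should eventually return something like:

```shell script
$ curl https://stock-restapi.herokuapp.com/is_ready
{"is_ready":true}
```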

```shell script
$ heroku create stock-restapi --remote restapi
$ heroku buildpacks:add -a stock-restapi heroku/python
$ heroku buildpacks:add -a stock-restapi -i 1 heroku-community/multi-procfile
$ heroku config:set -a stock-restapi PROCFILE=Procfile_restapi
$ git push restapi master
$
$ heroku create stock-dash-client --remote dash
$ heroku buildpacks:add -a stock-dash-client heroku/python
$ heroku buildpacks:add -a stock-dash-client -i 1 heroku-community/multi-procfile
$ heroku config:set -a stock-dash-client PROCFILE=Procfile_dash
$ heroku config:set -a stock-dash-client REST_API_URL=https://stock-restapi.herokuapp.com  # the base URL where the RestAPI is reachable
$ git push dash master
```

Heroku files:
- `runtime.txt` describes the Python version
- `Procfile_restapi` is the Heroku Procfile for the RestAPI app
- `Procfile_dash` is the Heroku Procfile for the Dash Client app

## TODOs

- Backend
  - Proper logging and getting rid of `print`s
  - RAM and speed measurements for the different Search Models
- Frontend
  - React frontend instead of the dash app

--------------------------------------------------------------------------------
/art/homepage.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/gaborvecsei/Stocks-Pattern-Analyzer/72ebded1862419c8c65a95f4d6c8277b9b693143/art/homepage.png


--------------------------------------------------------------------------------
/assets/favicon.ico:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/gaborvecsei/Stocks-Pattern-Analyzer/72ebded1862419c8c65a95f4d6c8277b9b693143/assets/favicon.ico


--------------------------------------------------------------------------------
/assets/s1.css:
--------------------------------------------------------------------------------
/* Table of contents
––––––––––––––––––––––––––––––––––––––––––––––––
- Plotly.js
- Grid
- Base Styles
- Typography
- Links
- Buttons
- Forms
- Lists
- Code
- Tables
- Spacing
- Utilities
- Clearing
- Media Queries

*/

/* Plotly.js
–––––––––––––––––––––––––––––––––––––––––––––––– */
/* plotly.js's modebar's z-index is 1001 by default
 * https://github.com/plotly/plotly.js/blob/7e4d8ab164258f6bd48be56589dacd9bdd7fded2/src/css/_modebar.scss#L5
 * In case a dropdown is above the graph, the dropdown's options
 * will be rendered below the modebar
 * Increase the select option's z-index

*/

/* This was actually not quite right -
   dropdowns were overlapping each other (edited October 26)

.Select {
  z-index: 1002;
}*/

/* Grid
–––––––––––––––––––––––––––––––––––––––––––––––––– */
.container {
  position: relative;
  width: 100%;
  max-width: 960px;
  margin: 0 auto;
  padding: 0 20px;
  box-sizing: border-box;
}
.column,
.columns {
  width: 100%;
  float: left;
  box-sizing: border-box;
}

/* For devices larger than 400px */
@media (min-width: 400px) {
  .container {
    width: 85%;
    padding: 0;
  }
}

/* For devices larger than 550px */
@media (min-width: 550px) {
  .container {
    width: 80%;
  }
  .column,
  .columns {
    margin-left: 4%;
  }
  .one.column,
  .one.columns {
    width: 4.66666666667%;
  }
  .two.columns {
    width: 13.3333333333%;
  }
  .three.columns {
    width: 22%;
  }
  .four.columns {
    width: 30.6666666667%;
  }
  .five.columns {
    width: 39.3333333333%;
  }
  .six.columns {
    width: 48%;
  }
  .seven.columns {
    width: 56.6666666667%;
  }
  .eight.columns {
    width: 65.3333333333%;
  }
  .nine.columns {
    width: 74%;
  }
  .ten.columns {
    width: 82.6666666667%;
  }
  .eleven.columns {
    width: 91.3333333333%;
  }
  .twelve.columns {
    width: 100%;
    margin-left: 0;
  }

  .one-third.column {
    width: 30.6666666667%;
  }
  .two-thirds.column {
    width: 65.3333333333%;
  }

  .one-half.column {
    width: 48%;
  }

  /* Offsets */
  .offset-by-one.column,
  .offset-by-one.columns {
    margin-left: 8.66666666667%;
  }
  .offset-by-two.column,
  .offset-by-two.columns {
    margin-left: 17.3333333333%;
  }
  .offset-by-three.column,
  .offset-by-three.columns {
    margin-left: 26%;
  }
  .offset-by-four.column,
  .offset-by-four.columns {
    margin-left: 34.6666666667%;
  }
  .offset-by-five.column,
  .offset-by-five.columns {
    margin-left: 43.3333333333%;
  }
  .offset-by-six.column,
  .offset-by-six.columns {
    margin-left: 52%;
  }
  .offset-by-seven.column,
  .offset-by-seven.columns {
    margin-left: 60.6666666667%;
  }
  .offset-by-eight.column,
  .offset-by-eight.columns {
    margin-left: 69.3333333333%;
  }
  .offset-by-nine.column,
  .offset-by-nine.columns {
    margin-left: 78%;
  }
  .offset-by-ten.column,
  .offset-by-ten.columns {
    margin-left: 86.6666666667%;
  }
  .offset-by-eleven.column,
  .offset-by-eleven.columns {
    margin-left: 95.3333333333%;
  }

  .offset-by-one-third.column,
  .offset-by-one-third.columns {
    margin-left: 34.6666666667%;
  }
  .offset-by-two-thirds.column,
  .offset-by-two-thirds.columns {
    margin-left: 69.3333333333%;
  }

  .offset-by-one-half.column,
  .offset-by-one-half.columns {
    margin-left: 52%;
  }
}

/* Base Styles
–––––––––––––––––––––––––––––––––––––––––––––––––– */
/* NOTE
html is set to 62.5% so that all the REM measurements throughout Skeleton
are based on 10px sizing. So basically 1.5rem = 15px :) */
html {
  font-size: 62.5%;
}
body {
  font-size: 1.5em; /* currently ems cause chrome bug misinterpreting rems on body element */
  line-height: 1.6;
  font-weight: 400;
  font-family: "Open Sans", "HelveticaNeue", "Helvetica Neue", Helvetica, Arial,
    sans-serif;
  color: rgb(50, 50, 50);
}

/* Typography
–––––––––––––––––––––––––––––––––––––––––––––––––– */
h1,
h2,
h3,
h4,
h5,
h6 {
  margin-top: 0;
  margin-bottom: 0;
  font-weight: 300;
}
h1 {
  font-size: 4.5rem;
  line-height: 1.2;
  letter-spacing: -0.1rem;
  margin-bottom: 2rem;
}
h2 {
  font-size: 3.6rem;
  line-height: 1.25;
  letter-spacing: -0.1rem;
  margin-bottom: 1.8rem;
  margin-top: 1.8rem;
}
h3 {
  font-size: 3rem;
  line-height: 1.3;
  letter-spacing: -0.1rem;
  margin-bottom: 1.5rem;
  margin-top: 1.5rem;
}
h4 {
  font-size: 2.6rem;
  line-height: 1.35;
  letter-spacing: -0.08rem;
  margin-bottom: 1.2rem;
  margin-top: 1.2rem;
}
h5 {
  font-size: 2.2rem;
  line-height: 1.5;
  letter-spacing: -0.05rem;
  margin-bottom: 0.6rem;
  margin-top: 0.6rem;
}
h6 {
  font-size: 2rem;
  line-height: 1.6;
  letter-spacing: 0;
  margin-bottom: 0.75rem;
  margin-top: 0.75rem;
}

p {
  margin-top: 0;
}

/* Blockquotes
–––––––––––––––––––––––––––––––––––––––––––––––––– */
blockquote {
  border-left: 4px lightgrey solid;
  padding-left: 1rem;
  margin-top: 2rem;
  margin-bottom: 2rem;
  margin-left: 0rem;
}

/* Links
–––––––––––––––––––––––––––––––––––––––––––––––––– */
a {
  color: #1eaedb;
  text-decoration: underline;
  cursor: pointer;
}
a:hover {
  color: #0fa0ce;
}

/* Buttons
–––––––––––––––––––––––––––––––––––––––––––––––––– */
.button,
button,
input[type="submit"],
input[type="reset"],
input[type="button"] {
  display: inline-block;
  height: 38px;
  padding: 0 30px;
  color: #555;
  text-align: center;
  font-size: 11px;
  font-weight: 600;
  line-height: 38px;
  letter-spacing: 0.1rem;
  text-transform: uppercase;
  text-decoration: none;
  white-space: nowrap;
  background-color: transparent;
  border-radius: 4px;
  border: 1px solid #bbb;
  cursor: pointer;
  box-sizing: border-box;
}
.button:hover,
button:hover,
input[type="submit"]:hover,
input[type="reset"]:hover,
input[type="button"]:hover,
.button:focus,
button:focus,
input[type="submit"]:focus,
input[type="reset"]:focus,
input[type="button"]:focus {
  color: #333;
  border-color: #888;
  outline: 0;
}
.button.button-primary,
button.button-primary,
input[type="submit"].button-primary,
input[type="reset"].button-primary,
input[type="button"].button-primary {
  color: #fff;
  background-color: #33c3f0;
  border-color: #33c3f0;
}
.button.button-primary:hover,
button.button-primary:hover,
input[type="submit"].button-primary:hover,
input[type="reset"].button-primary:hover,
input[type="button"].button-primary:hover,
.button.button-primary:focus,
button.button-primary:focus,
input[type="submit"].button-primary:focus,
input[type="reset"].button-primary:focus,
input[type="button"].button-primary:focus {
  color: #fff;
  background-color: #1eaedb;
  border-color: #1eaedb;
}

/* Forms
–––––––––––––––––––––––––––––––––––––––––––––––––– */
input[type="email"],
input[type="number"],
input[type="search"],
input[type="text"],
input[type="tel"],
input[type="url"],
input[type="password"],
textarea,
select {
  height: 38px;
  padding: 6px 10px; /* The 6px vertically centers text on FF, ignored by Webkit */
  background-color: #fff;
  border: 1px solid #d1d1d1;
  border-radius: 4px;
  box-shadow: none;
  box-sizing: border-box;
  font-family: inherit;
  font-size: inherit; /*https://stackoverflow.com/questions/6080413/why-doesnt-input-inherit-the-font-from-body*/
}
/* Removes awkward default styles on some inputs for iOS */
input[type="email"],
input[type="number"],
input[type="search"],
input[type="text"],
input[type="tel"],
input[type="url"],
input[type="password"],
textarea {
  -webkit-appearance: none;
  -moz-appearance: none;
  appearance: none;
}
textarea {
  min-height: 65px;
  padding-top: 6px;
  padding-bottom: 6px;
}
input[type="email"]:focus,
input[type="number"]:focus,
input[type="search"]:focus,
input[type="text"]:focus,
input[type="tel"]:focus,
input[type="url"]:focus,
input[type="password"]:focus,
textarea:focus,
select:focus {
  border: 1px solid #33c3f0;
  outline: 0;
}
label,
legend {
  display: block;
  margin-bottom: 0px;
}
fieldset {
  padding: 0;
  border-width: 0;
}
input[type="checkbox"],
input[type="radio"] {
  display: inline;
}
label > .label-body {
  display: inline-block;
  margin-left: 0.5rem;
  font-weight: normal;
}

/* Lists
–––––––––––––––––––––––––––––––––––––––––––––––––– */
ul {
  list-style: circle inside;
}
ol {
  list-style: decimal inside;
}
ol,
ul {
  padding-left: 0;
  margin-top: 0;
}
ul ul,
ul ol,
ol ol,
ol ul {
  margin: 1.5rem 0 1.5rem 3rem;
  font-size: 90%;
}
li {
  margin-bottom: 1rem;
}

/* Tables
–––––––––––––––––––––––––––––––––––––––––––––––––– */
table {
}
th,
td {
  padding: 12px 15px;
  text-align: left;
  border-bottom: 1px solid #e1e1e1;
}
th:first-child,
td:first-child {
  padding-left: 0;
}
th:last-child,
td:last-child {
  padding-right: 0;
}

/* Spacing
–––––––––––––––––––––––––––––––––––––––––––––––––– */
button,
.button {
  margin-bottom: 0rem;
}
input,
textarea,
select,
fieldset {
  margin-bottom: 0rem;
}
pre,
dl,
figure,
table,
form {
  margin-bottom: 0rem;
}
p,
ul,
ol {
  margin-bottom: 0.75rem;
}

/* Utilities
–––––––––––––––––––––––––––––––––––––––––––––––––– */
.u-full-width {
  width: 100%;
  box-sizing: border-box;
}
.u-max-full-width {
  max-width: 100%;
  box-sizing: border-box;
}
.u-pull-right {
  float: right;
}
.u-pull-left {
  float: left;
}

/* Misc
–––––––––––––––––––––––––––––––––––––––––––––––––– */
hr {
  margin-top: 3rem;
  margin-bottom: 3.5rem;
  border-width: 0;
  border-top: 1px solid #e1e1e1;
}

/* Clearing
–––––––––––––––––––––––––––––––––––––––––––––––––– */

/* Self Clearing Goodness */
.container:after,
.row:after,
.u-cf {
  content: "";
  display: table;
  clear: both;
}

/* Media Queries
–––––––––––––––––––––––––––––––––––––––––––––––––– */
/*
Note: The best way to structure the use of media queries is to create the queries
near the relevant code. For example, if you wanted to change the styles for buttons
on small devices, paste the mobile query code up in the buttons section and style it
there.
*/

/* Larger than mobile, screen sizes larger than 400px */
@media (min-width: 400px) {
}

/* Larger than phablet (also point when grid becomes active), screen larger than 550px */
@media (min-width: 550px) {
  .one.column,
  .one.columns {
    width: 8%;
  }
  .two.columns {
    width: 16.25%;
  }
  .three.columns {
    width: 22%;
  }
  .four.columns {
    width: calc(100% / 3);
  }
  .five.columns {
    width: calc(100% * 5 / 12);
  }
  .six.columns {
    width: 49.75%;
  }
  .seven.columns {
    width: calc(100% * 7 / 12);
  }
}

/* Smaller than tablet, screens up to 550px wide */
@media (max-width: 550px) {
  .flex-display {
    display: block !important;
  }
  .pretty_container {
    margin: 0 !important;
    margin-bottom: 25px !important;
  }
  #individual_graph,
  #count_graph,
  #aggregate_graph {
    position: static !important;
  }
  .container-display {
    display: flex;
  }

  .mini_container {
    margin-bottom: 25px !important;
    border-radius: 5px;
    background-color: #f9f9f9;
    padding: 15px;
    position: relative;
    box-shadow: 2px 2px 2px lightgrey;
  }

  h3 {
    font-size: 2.5rem;
  }
  h5 {
    font-size: 2rem;
  }
  h6 {
    font-size: 1.25rem;
  }
  p {
    font-size: 11px;
  }
}

/* Larger than desktop */
@media (min-width: 1000px) {
}

/* Larger than Desktop HD */
@media (min-width: 1200px) {
}
--------------------------------------------------------------------------------
/assets/style.css:
--------------------------------------------------------------------------------
.js-plotly-plot .plotly .modebar {
  padding-top: 5%;
  margin-right: 3.5%;
}

body {
  background-color: #f2f2f2;
  margin-top: 2%;
  margin-bottom: 2%;
  margin-left: 5%;
  margin-right: 5%;
}

.two.columns {
  width: 16.25%;
}

.column,
.columns {
  margin-left: 0.5%;
}

.pretty_container {
  border-radius: 5px;
  background-color: #f9f9f9;
  margin: 10px;
  padding: 15px;
  position: relative;
  box-shadow: 2px 2px 2px lightgrey;
}

.bare_container {
  margin: 0 0 0 0;
  padding: 0 0 0 0;
}

.dcc_control {
  margin: 0;
  padding: 5px;
  width: calc(80%);
}

.control_label {
  margin: 0;
  padding: 10px;
  padding-bottom: 0px;
  margin-bottom: 0px;
  width: calc(100% - 40px);
}

.rc-slider {
  margin-left: 0px;
  padding-left: 0px;
}

.flex-display {
  display: flex;
}

.container-display {
  display: flex;
}

#individual_graph,
#aggregate_graph {
  width: calc(100% - 30px);
  position: absolute;
}

#count_graph {
  position: absolute;
  height: calc(100% - 30px);
  width: calc(100% - 30px);
}

#countGraphContainer {
  flex: 5;
  position: relative;
}

#header {
  align-items: center;
}

#learn-more-button {
  text-align: center;
  height: 100%;
  padding: 0 20px;
  text-transform: none;
  font-size: 15px;
  float: right;
  margin-right: 10px;
  margin-top: 30px;
}
#title {
  text-align: center;
}

.mini_container {
  border-radius: 5px;
  background-color: #f9f9f9;
  margin: 10px;
  padding: 15px;
  position: relative;
  box-shadow: 2px 2px 2px lightgrey;
}

#right-column {
  display: flex;
  flex-direction: column;
}

#wells {
  flex: 1;
}

#gas {
  flex: 1;
}

#aggregate_data {
  align-items: center;
}

#oil {
  flex: 1;
}

#water {
  flex: 1;
}

#tripleContainer {
  display: flex;
  flex: 3;
}

#mainContainer {
  display: flex;
  flex-direction: column;
}

#pie_graph > div > div > svg:nth-child(3) > g.infolayer > g.legend {
  pointer-events: all;
  transform: translate(30px, 349px);
}
--------------------------------------------------------------------------------
/dash_app.py:
--------------------------------------------------------------------------------
import dash_core_components as dcc
import dash_html_components as html
import dash_table
from dash import Dash
from dash.dependencies import Input, Output

import stock_pattern_analyzer as spa
from dash_app_functions import get_search_window_sizes, get_symbols, search_most_recent

app = Dash(__name__, meta_tags=[{"name": "viewport", "content": "width=device-width"}])
app.title = "Stock Patterns"
server = app.server

##### Header #####

header_div = html.Div([html.Div([html.H3("📈")], className="one-third column"),
                       html.Div([html.Div([html.H3("Stock Patterns", style={"margin-bottom": "0px"}),
                                           html.H5("Find historical patterns and use for forecasting",
                                                   style={"margin-top": "0px"})])],
                                className="one-half column", id="title"),
                       html.Div([html.A(html.Button("Gabor Vecsei"), href="https://www.gaborvecsei.com/")],
                                className="one-third column",
                                id="learn-more-button")],
                      id="header", className="row flex-display", style={"margin-bottom": "25px"})

##### Explanation #####

explanation_div = html.Div([dcc.Markdown("""Select a stock symbol and a time-frame. This tool finds similar patterns in
historical data.

The most similar patterns are visualized with an extended *time-frame/'future data'*, which can be an
indication of future price movement for the selected (anchor) stock.""")])

##### Settings container #####

symbol_dropdown_id = "id-symbol-dropdown"
available_symbols = get_symbols()
default_symbol = "AAPL" if "AAPL" in available_symbols else available_symbols[0]
symbol_dropdown = dcc.Dropdown(id=symbol_dropdown_id,
                               options=[{"label": x, "value": x} for x in available_symbols],
                               multi=False,
                               value=default_symbol,
                               className="dcc_control")

window_size_dropdown_id = "id-window-size-dropdown"
window_sizes = get_search_window_sizes()
window_size_dropdown = dcc.Dropdown(id=window_size_dropdown_id,
                                    options=[{"label": f"{x} days", "value": x} for x in window_sizes],
                                    multi=False,
                                    value=window_sizes[2],
                                    className="dcc_control")

future_size_input_id = "id-future-size-input"
MAX_FUTURE_WINDOW_SIZE = 10
future_size_input = dcc.Input(id=future_size_input_id, type="number", min=0, max=MAX_FUTURE_WINDOW_SIZE, value=5,
                              className="dcc_control")

top_k_input_id = "id-top-k-input"
MAX_TOP_K_VALUE = 10
top_k_input = dcc.Input(id=top_k_input_id, type="number", min=0, max=MAX_TOP_K_VALUE, value=5, className="dcc_control")

offset_checkbox_id = "id-offset-checkbox"
offset_checkbox = dcc.Checklist(id=offset_checkbox_id, options=[{"label": "Use Offset", "value": "offset"}],
                                value=["offset"], className="dcc_control")

settings_div = html.Div([html.P("Symbol (anchor)", className="control_label"),
                         symbol_dropdown,
                         html.P("Search window size", className="control_label"),
                         window_size_dropdown,
                         html.P(f"Future window size (max. {MAX_FUTURE_WINDOW_SIZE})", className="control_label"),
                         future_size_input,
                         html.P(f"Patterns to match (max. {MAX_TOP_K_VALUE})", className="control_label"),
                         top_k_input,
                         html.P("Offset the matched patterns for easy comparison (to the anchor's last market close)",
                                className="control_label"),
                         offset_checkbox],
                        className="pretty_container three columns",
                        id="id-settings-div")

##### Stats & Graph #####

graph_id = "id-graph"
stats_and_graph_div = html.Div([html.Div(id="id-stats-container", className="row container-display"),
                                html.Div([dcc.Graph(id=graph_id)], id="id-graph-div", className="pretty_container")],
                               id="id-graph-container", className="nine columns")

##### Matched Stocks List #####

matched_table_id = "id-matched-list"
table_columns = ["Index",
                 "Match distance",
                 "Symbol",
                 "Pattern Start Date",
                 "Pattern End Date",
                 "Pattern Start Close Value ($)",
                 "Pattern End Close Value ($)",
                 "Pattern Future Close Value ($)"]
table = dash_table.DataTable(id=matched_table_id, columns=[{"id": c, "name": c} for c in table_columns], page_size=5)
matched_div = html.Div([html.Div([html.H6("Matched (most similar) patterns"), table],
                                 className="pretty_container")],
                       id="id-matched-list-container",
                       className="eleven columns")

##### Reference Links #####

css_link = html.A("[1] Style of the page (css)",
                  href="https://github.com/plotly/dash-sample-apps/tree/master/apps/dash-oil-and-gas")
yahoo_data_link = html.A("[2] Yahoo data", href="https://finance.yahoo.com")
gabor_github_link = html.A("[3] Gabor Vecsei GitHub", href="https://github.com/gaborvecsei")
reference_links_div = html.Div([html.Div([html.H6("References"),
                                          css_link,
                                          html.Br(),
                                          yahoo_data_link,
                                          html.Br(),
                                          gabor_github_link],
                                         className="pretty_container")],
                               className="four columns")

##### Layout #####

app.layout = html.Div([header_div,
                       explanation_div,
                       html.Div([settings_div,
                                 stats_and_graph_div],
                                className="row flex-display"),
                       html.Div([matched_div], className="row flex-display"),
                       reference_links_div],
                      id="mainContainer",
                      style={"display": "flex", "flex-direction": "column"})


##### Callbacks #####

@app.callback([Output(graph_id, "figure"),
               Output(matched_table_id, "data")],
              [Input(symbol_dropdown_id, "value"),
               Input(window_size_dropdown_id, "value"),
               Input(future_size_input_id, "value"),
               Input(top_k_input_id, "value"),
               Input(offset_checkbox_id, "value")])
def update_plot_and_table(symbol_value, window_size_value, future_size_value, top_k_value, checkbox_value):
    # RestAPI search
    ret = search_most_recent(symbol=symbol_value,
                             window_size=window_size_value,
                             top_k=top_k_value,
                             future_size=future_size_value)

    # Parse the response and build the HTML table rows
    # (the returned values are stored in reverse chronological order, most recent first)
    table_rows = []
    values = []
    symbols = []
    start_end_dates = []
    for i, match in enumerate(ret.matches):
        values.append(match.values)
        symbols.append(match.symbol)
        start_end_dates.append((match.start_date, match.end_date))
        row_values = [i + 1,
                      match.distance,
                      match.symbol,
                      match.end_date,
                      match.start_date,
                      match.values[-1],
                      match.values[window_size_value - 1],
                      match.values[0]]
        row_dict = {c: v for c, v in zip(table_columns, row_values)}
        table_rows.append(row_dict)

    offset_traces = len(checkbox_value) > 0

    # Visualize the data on a graph
    fig = spa.visualize_graph(match_values_list=values,
                              match_symbols=symbols,
                              match_str_dates=start_end_dates,
                              window_size=window_size_value,
                              future_size=future_size_value,
                              anchor_symbol=ret.anchor_symbol,
                              anchor_values=ret.anchor_values,
                              show_legend=False,
                              offset_traces=offset_traces)

    return fig, table_rows


if __name__ == "__main__":
    app.run_server(debug=False, host="0.0.0.0")
--------------------------------------------------------------------------------
/dash_app_functions.py:
--------------------------------------------------------------------------------
import os

import requests

from rest_api_models import (TopKSearchResponse, SearchWindowSizeResponse, DataRefreshResponse,
                             AvailableSymbolsResponse)

BASE_URL = os.environ.get("REST_API_URL", default="http://localhost:8001")


def get_search_window_sizes() -> list:
    res = requests.get(f"{BASE_URL}/search/sizes")
    res = SearchWindowSizeResponse.parse_obj(res.json())
    return res.sizes


def get_symbols() -> list:
    res = requests.get(f"{BASE_URL}/data/symbols")
    res = AvailableSymbolsResponse.parse_obj(res.json())
    return res.symbols


def search_most_recent(symbol: str, window_size: int, top_k: int, future_size: int) -> TopKSearchResponse:
    url = f"{BASE_URL}/search/recent/?symbol={symbol.upper()}&window_size={window_size}&top_k={top_k}&future_size={future_size}"
    res = requests.get(url)
    res = TopKSearchResponse.parse_obj(res.json())
    return res


def get_last_refresh_date() -> str:
    url = f"{BASE_URL}/refresh/when"
    res = requests.get(url)
    # Parse the JSON body, not the raw Response object
    res = DataRefreshResponse.parse_obj(res.json())
    date_str = res.date.strftime("%Y/%m/%d, %H:%M:%S")
    return date_str
--------------------------------------------------------------------------------
/docker/Dockerfile:
--------------------------------------------------------------------------------
FROM python:3.9.2-slim

RUN apt-get update

# Requirements are copied first (not the whole project), so a code change won't always trigger a pip install.
# The install is only re-triggered when requirements.txt changes.
COPY ./requirements.txt /requirements.txt
RUN pip install -r /requirements.txt

WORKDIR /code

ENTRYPOINT ["/bin/bash"]
--------------------------------------------------------------------------------
/requirements.txt:
--------------------------------------------------------------------------------
numpy
yfinance
pandas
tqdm
fastapi
dash
plotly
matplotlib
pydantic
apscheduler
uvicorn[standard]
scikit-learn
gunicorn
scipy
faiss-cpu
psutil
--------------------------------------------------------------------------------
/rest_api.py:
--------------------------------------------------------------------------------
from datetime import datetime
import itertools
from pathlib import Path
import threading
from typing import Optional, Set, Tuple

from apscheduler.schedulers.asyncio import AsyncIOScheduler
from fastapi import FastAPI, HTTPException, Response
import numpy as np
import pandas as pd

from rest_api_models import (
    AvailableSymbolsResponse,
    DataRefreshResponse,
    IsReadyResponse,
    MatchResponse,
    SearchWindowSizeResponse,
    SuccessResponse,
    TopKSearchResponse,
)
import stock_pattern_analyzer as spa

app = FastAPI()


def _get_sp500_ticker_list() -> set:
    table = pd.read_html('https://en.wikipedia.org/wiki/List_of_S%26P_500_companies')
    df = table[0]
    symbols = set(df["Symbol"].values)
    return symbols


def _get_currency_pairs_symbol_list(base_currency: str) -> set:
    table = pd.read_html("https://en.wikipedia.org/wiki/Currency_pair")
    df = table[2]
    currency_symbols: Set[str] = set(df["ISO 4217 code"].values)
    currency_pairs: Set[Tuple[str, str]] = set(itertools.product([base_currency.upper()], currency_symbols))
    # This is the ticker format that Yahoo Finance uses for currency pairs
    currency_pair_str_list: Set[str] = {f"{x}{y}=X" for x, y in currency_pairs}
    return currency_pair_str_list


AVAILABLE_SEARCH_WINDOW_SIZES = list(range(6, 17, 2)) + [5, 20, 25, 30, 45]
AVAILABLE_SEARCH_WINDOW_SIZES = sorted(AVAILABLE_SEARCH_WINDOW_SIZES)

user_defined_tickers_file_path = Path("symbols.txt")
user_defined_tickers: Set[str] = set()
if user_defined_tickers_file_path.exists():
    user_defined_tickers = set(user_defined_tickers_file_path.read_text().split("\n"))
else:
    raise FileNotFoundError("We need a symbols.txt - check the readme")

if "$SP500" in user_defined_tickers:
    sp500_symbols = _get_sp500_ticker_list()
    user_defined_tickers.remove("$SP500")
    user_defined_tickers = user_defined_tickers.union(sp500_symbols)

if "$CURRENCY_PAIRS" in user_defined_tickers:
    currency_pair_symbols = _get_currency_pairs_symbol_list("EUR")
    user_defined_tickers.remove("$CURRENCY_PAIRS")
    user_defined_tickers = user_defined_tickers.union(currency_pair_symbols)

SYMBOL_LIST = sorted(user_defined_tickers)

PERIOD_YEARS = 20


def _prepare_data(force_update: bool = False) -> spa.RawStockDataHolder:
    return spa.initialize_data_holder(tickers=SYMBOL_LIST, period_years=PERIOD_YEARS, force_update=force_update)


data_holder: spa.RawStockDataHolder = _prepare_data()
search_tree_dict: dict = {}
refresh_scheduler: AsyncIOScheduler = AsyncIOScheduler()
last_refreshed: Optional[datetime] = None


def _date_to_str(date):
    return pd.to_datetime(date).strftime("%Y-%m-%d")


def _find_and_remove_files(folder_path: str, file_pattern: str) -> list:
    # Materialize the glob generator first, otherwise it would be exhausted
    # by the unlink loop and an empty list would be returned
    paths = list(Path(folder_path).glob(file_pattern))
    for p in paths:
        p.unlink()
    return paths


@app.get("/")
def root():
    return Response(content="Welcome to the stock pattern matcher RestAPI")


@app.get("/is_ready", response_model=IsReadyResponse)
def is_ready():
    if (data_holder is None) or not data_holder.is_filled:
        return IsReadyResponse(is_ready=False)

    if len(search_tree_dict) == 0:
        return IsReadyResponse(is_ready=False)

    return IsReadyResponse(is_ready=True)


@app.get("/data/symbols", response_model=AvailableSymbolsResponse, tags=["data"])
def get_available_symbols():
    return AvailableSymbolsResponse(symbols=SYMBOL_LIST)


@app.get("/data/refresh", response_model=SuccessResponse, include_in_schema=False)
def refresh_data():
    # TODO: hardcoded file prefix and folder
    _find_and_remove_files(".", "data_holder_*.pk")
    global data_holder
    data_holder = _prepare_data()
    print("Data refreshed")
    return SuccessResponse(message="Existing data holder files removed, and a new one is created")


@app.get("/refresh", response_model=SuccessResponse, include_in_schema=False)
def refresh_everything():
    refresh_data()
    refresh_search()
    global last_refreshed
    last_refreshed = datetime.now()
    return SuccessResponse()


@app.get("/refresh/when", response_model=DataRefreshResponse, tags=["refresh"])
def when_was_data_refreshed():
    return DataRefreshResponse(date=last_refreshed)


@app.get("/search/prepare/{window_size}", response_model=SuccessResponse, include_in_schema=False)
def prepare_search_tree(window_size: int, force_update: bool = False):
    global search_tree_dict
    search_tree_dict[window_size] = spa.initialize_search_tree(data_holder=data_holder,
                                                               window_size=window_size,
                                                               force_update=force_update)
    return SuccessResponse()


@app.get("/search/prepare", response_model=SuccessResponse, include_in_schema=False)
def prepare_all_search_trees(force_update: bool = False):
    # TODO: The parallel creation of the search windows gives a Memory error on Heroku free dynos
    # with concurrent.futures.ThreadPoolExecutor() as pool:
    #     futures = {}
    #     for w in AVAILABLE_SEARCH_WINDOW_SIZES:
    #         f = pool.submit(prepare_search_tree, window_size=w, force_update=force_update)
    #         futures[f] = w
    #
    #     for f in concurrent.futures.as_completed(futures):
    #         w = futures[f]
    #         try:
    #             f.result()
    #             print(f"Search tree with size {w} prepared")
    #         except Exception as e:
    #             print(f"There was a problem with size {w}, could not create it")

    # TODO: Sequential creation is used because this way Heroku won't crash (because of the RAM limit)
    for w in AVAILABLE_SEARCH_WINDOW_SIZES:
        prepare_search_tree(window_size=w, force_update=force_update)
        print(f"Search tree with size {w} prepared")
    return SuccessResponse()


@app.get("/search/refresh", response_model=SuccessResponse, include_in_schema=False)
def refresh_search():
    # TODO: hardcoded file prefix and folder
    _find_and_remove_files(".", "search_tree_*.pk")
    prepare_all_search_trees()
    print("Search trees are refreshed")
    return SuccessResponse()


@app.get("/search/sizes", response_model=SearchWindowSizeResponse, tags=["search"])
def get_available_search_window_sizes():
    return SearchWindowSizeResponse(sizes=AVAILABLE_SEARCH_WINDOW_SIZES)


@app.get("/search/recent/", response_model=TopKSearchResponse, tags=["search"])
async def search_most_recent(symbol: str, window_size: int = 5, top_k: int = 5, future_size: int = 5):
    symbol = symbol.upper()
    try:
        label = data_holder.symbol_to_label[symbol]
    except KeyError:
        raise HTTPException(status_code=400, detail=f"Ticker symbol {symbol} is not supported")
    # Values are stored most-recent-first, so the first window_size items are the latest ones
    most_recent_values = data_holder.values[label][:window_size]

    try:
        search_tree = search_tree_dict[window_size]
    except KeyError:
        raise HTTPException(status_code=400, detail=f"No prepared {window_size} day search window")

    top_k_indices, top_k_distances = search_tree.search(values=most_recent_values, k=top_k + 1)
    # We need to discard the first item, as that is our search sequence
    top_k_indices = top_k_indices[1:]
    top_k_distances = top_k_distances[1:]

    forecast_values = []
    matches = []

    for index, distance in zip(top_k_indices, top_k_distances):
        ticker = search_tree.get_window_symbol(index)
        start_date, end_date = search_tree.get_start_end_date(index)

        start_date_str = _date_to_str(start_date)
        end_date_str = _date_to_str(end_date)

        window_with_future_values = search_tree.get_window_values(index=index, future_length=future_size)
        todays_value = window_with_future_values[-window_size]
        future_value = window_with_future_values[0]
        diff_from_today = todays_value - future_value

        match = MatchResponse(symbol=ticker,
                              distance=distance,
                              start_date=start_date_str,
                              end_date=end_date_str,
                              todays_value=todays_value,
                              future_value=future_value,
                              change=diff_from_today,
                              values=window_with_future_values.tolist())

        matches.append(match)

        forecast_values.append(diff_from_today)

    tmp = np.where(np.array(forecast_values) < 0, 0, 1)
    forecast_confidence = np.sum(tmp) / len(tmp)
    forecast_type = "gain"
    if forecast_confidence <= 0.5:
        forecast_type = "loss"
        forecast_confidence = 1 - forecast_confidence

    top_k_match = TopKSearchResponse(matches=matches,
                                     forecast_type=forecast_type,
                                     forecast_confidence=forecast_confidence,
                                     anchor_symbol=symbol,
                                     window_size=window_size,
                                     top_k=top_k,
                                     future_size=future_size,
                                     anchor_values=most_recent_values.tolist())

    return top_k_match


@app.on_event("startup")
def startup_event():
    # Download and prepare new data when the app starts.
    # This runs in a background thread, as the app needs to start up in less than 60 secs (for Heroku)
    threading.Thread(target=refresh_everything).start()

    # Refresh data after every market close
    # TODO: set the timezones and add multiple refresh jobs for the multiple market closes
    refresh_scheduler.add_job(func=refresh_everything, trigger="cron", day="*", hour=8, minute=35)
    refresh_scheduler.add_job(func=refresh_everything, trigger="cron", day="*", hour=15, minute=35)
    refresh_scheduler.start()
--------------------------------------------------------------------------------
/rest_api_models.py:
--------------------------------------------------------------------------------
from datetime import datetime
from typing import List, Optional

from pydantic import BaseModel


class SuccessResponse(BaseModel):
    message: str = "Successful"


class SearchWindowSizeResponse(BaseModel):
    sizes: List[int]


class AvailableSymbolsResponse(BaseModel):
    symbols: List[str]


class MatchResponse(BaseModel):
    symbol: str
    distance: float
    start_date: str
    end_date: str
    todays_value: Optional[float]
    future_value: Optional[float]
    change: Optional[float]
    values: Optional[List[float]]


class TopKSearchResponse(BaseModel):
    matches: List[MatchResponse] = []
    forecast_type: str
    forecast_confidence: float
    anchor_symbol: str
    anchor_values: Optional[List[float]]
    window_size: int
    top_k: int
    future_size: int


class DataRefreshResponse(BaseModel):
    message: str = "Last (most recent) refresh"
    date: datetime


class IsReadyResponse(BaseModel):
    is_ready: bool
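# Usage sketch (illustrative, not part of the repo): parsing a /search/recent
# response from the RestAPI into these models. The URL assumes the default
# local address used by dash_app_functions.py.
#
#   import requests
#   raw = requests.get("http://localhost:8001/search/recent/?symbol=AAPL").json()
#   parsed = TopKSearchResponse.parse_obj(raw)
#   print(parsed.forecast_type, parsed.forecast_confidence)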
--------------------------------------------------------------------------------
/runtime.txt:
--------------------------------------------------------------------------------
python-3.8.7
--------------------------------------------------------------------------------
/start.sh:
--------------------------------------------------------------------------------
#!/bin/bash

set -m

# Run the RestAPI ("debug" deployment)
uvicorn rest_api:app --host 0.0.0.0 --port 8001 &

# Wait a bit for the restapi to start-up
sleep 3

# Run the Dash client app ("debug" deployment)
python /code/dash_app.py

fg %1
--------------------------------------------------------------------------------
/stock_pattern_analyzer/__init__.py:
--------------------------------------------------------------------------------
from .data import RawStockDataHolder, initialize_data_holder
from .search_index import MemoryEfficientIndex, cKDTreeIndex, FastIndex
from .search_model import SearchModel, initialize_search_tree
from .visualization import visualize_graph
--------------------------------------------------------------------------------
/stock_pattern_analyzer/data.py:
--------------------------------------------------------------------------------
import concurrent.futures
import pickle
from datetime import datetime
from pathlib import Path
from typing import Tuple

import numpy as np
import pandas as pd
import yfinance
from tqdm import tqdm


class RawStockDataHolder:
    def __init__(self, ticker_symbols: list, period_years: int = 5, interval: int = 1):
        self.ticker_symbols = ticker_symbols
        self.period_years = period_years
        self.interval = interval

        # interval is in days, so a larger interval means fewer samples over the period
        max_values_per_stock = (self.period_years * 365) // self.interval
        nb_ticker_symbols = len(self.ticker_symbols)

        self.dates = np.zeros((nb_ticker_symbols, max_values_per_stock))
        self.values = np.zeros((nb_ticker_symbols, max_values_per_stock), dtype=np.float32)
        self.nb_of_valid_values = np.zeros(nb_ticker_symbols, dtype=np.int32)

        self.symbol_to_label = {symbol: label for label, symbol in enumerate(ticker_symbols)}
        self.label_to_symbol = {label: symbol for symbol, label in self.symbol_to_label.items()}

        self.is_filled = False

    def _download_stock_data(self, symbol: str) -> pd.DataFrame:
        ticker = yfinance.Ticker(symbol)
        period_str = f"{self.period_years}y"
        interval_str = f"{self.interval}d"
        # [::-1] reverses the history, so the most recent value comes first
        ticker_df = ticker.history(period=period_str, interval=interval_str, rounding=True)[::-1]
        if ticker_df.empty:
            raise ValueError(f"{symbol} does not have enough data")
        return ticker_df

    def _get_stock_data_for_symbol(self, symbol: str) -> Tuple[np.ndarray, np.ndarray, int]:
        ticker_df = self._download_stock_data(symbol=symbol)
        close_values = ticker_df["Close"].values
        dates = ticker_df.index.values
        label = self.symbol_to_label[symbol]
        return close_values, dates, label

    def fill(self):
        """
        Fills the data holder with the defined stock data
        Returns:
            None
        """

        pbar = tqdm(desc="Symbol data download", total=len(self.ticker_symbols))

        with concurrent.futures.ThreadPoolExecutor() as pool:
            future_to_symbol = {}
            for symbol in self.ticker_symbols:
                future = pool.submit(self._get_stock_data_for_symbol, symbol=symbol)
                future_to_symbol[future] = symbol

            for future in concurrent.futures.as_completed(future_to_symbol):
                completed_symbol = future_to_symbol[future]
                try:
                    close_values, dates, label = future.result()
                    self.values[label, :len(close_values)] = close_values
                    self.dates[label, :len(dates)] = dates
                    self.nb_of_valid_values[label] = len(dates)
                except ValueError as e:
                    print(f"ERROR with {completed_symbol}: {e}")
                    continue

                pbar.update(1)
        self.is_filled = True
        pbar.close()

    def create_filename_for_today(self) -> str:
        current_date = datetime.now().strftime("%Y_%m_%d")
        file_name = f"data_holder_{self.period_years}y_{self.interval}d_{current_date}.pk"
        return file_name

    def serialize(self) -> str:
        if not self.is_filled:
            raise ValueError("You need to fill the class with data first")

        file_name = self.create_filename_for_today()
        with open(file_name, "wb") as f:
            pickle.dump(self, f)

        return file_name

    @staticmethod
    def load(file_name: str) -> "RawStockDataHolder":
        with open(file_name, "rb") as f:
            obj = pickle.load(f)
        return obj


def initialize_data_holder(tickers: list, period_years: int, force_update: bool = False):
    data_holder = RawStockDataHolder(ticker_symbols=tickers,
                                     period_years=period_years,
                                     interval=1)

    file_path = Path(data_holder.create_filename_for_today())

    if (not file_path.exists()) or force_update:
        data_holder.fill()
        data_holder.serialize()
    else:
        data_holder = RawStockDataHolder.load(str(file_path))
    return data_holder
--------------------------------------------------------------------------------
/stock_pattern_analyzer/search_index.py:
--------------------------------------------------------------------------------
import abc
import pickle
from typing import Tuple

import faiss
import numpy as np
from scipy.spatial import cKDTree


class _BaseIndex(abc.ABC):

    def __init__(self):
        self.index = None

    @abc.abstractmethod
    def create(self, X: np.ndarray) -> None:
        """
        This method creates the self.index object (index/search-tree)
        Args:
            X: Data [n_rows, n_features]

        Returns:
            None
        """
        raise NotImplementedError()

    @abc.abstractmethod
    def query(self, q: np.ndarray, k: int) -> Tuple[np.ndarray, np.ndarray]:
        """
        This method allows us to query from the index
        Args:
            q: query vector
            k: number of matches to return

        Returns:
            Results as a tuple: distances, indices (from X)
        """
        raise NotImplementedError()

    @classmethod
    @abc.abstractmethod
    def load(cls, file_path: str) -> "_BaseIndex":
        raise NotImplementedError()

    @abc.abstractmethod
    def serialize(self, file_path: str) -> None:
        raise NotImplementedError()


class FastIndex(_BaseIndex):

    def __init__(self):
        super().__init__()

    def create(self, X: np.ndarray):
        self.index = faiss.IndexFlatL2(X.shape[-1])
        self.index.add(X)

    def query(self, q: np.ndarray, k: int):
        distances, indices = self.index.search(q, k)
        return distances[0], indices[0]

    @classmethod
    def load(cls, file_path: str):
        # Store and return the loaded index (previously the result was discarded)
        obj = cls()
        obj.index = faiss.read_index(str(file_path))
        return obj

    def serialize(self, file_path: str):
        faiss.write_index(self.index, str(file_path))

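# Usage sketch (illustrative only, with made-up data): build and query a FastIndex.
# faiss expects float32 input, matching how SearchModel casts its windows.
#
#   import numpy as np
#   idx = FastIndex()
#   idx.create(np.random.rand(1000, 20).astype(np.float32))
#   distances, indices = idx.query(np.random.rand(1, 20).astype(np.float32), k=5)
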
class MemoryEfficientIndex(_BaseIndex):

    def __init__(self):
        super().__init__()

    def create(self, X: np.ndarray):
        d = X.shape[-1]
        # TODO: refine this as this is just a dummy selection for "m"
        if d % 4 == 0:
            m = 4
        elif d % 5 == 0:
            m = 5
        elif d % 2 == 0:
            m = 2
        else:
            raise ValueError("This case is not handled, cannot find a good value for m")
        quantizer = faiss.IndexFlatL2(d)
        self.index = faiss.IndexIVFPQ(quantizer, d, 100, m, 8)
        self.index.train(X)
        self.index.add(X)

    def query(self, q: np.ndarray, k: int):
        distances, indices = self.index.search(q, k)
        return distances[0], indices[0]

    @classmethod
    def load(cls, file_path: str):
        obj = cls()
        obj.index = faiss.read_index(str(file_path))
        return obj

    def serialize(self, file_path: str):
        faiss.write_index(self.index, str(file_path))


class cKDTreeIndex(_BaseIndex):

    def __init__(self):
        super().__init__()

    def create(self, X: np.ndarray):
        self.index = cKDTree(data=X)

    def query(self, q: np.ndarray, k: int):
        top_k_distances, top_k_indices = self.index.query(x=q, k=k)
        return top_k_distances, top_k_indices

    @classmethod
    def load(cls, file_path: str):
        obj = cls()
        with open(file_path, "rb") as f:
            obj.index = pickle.load(f)
        return obj

    def serialize(self, file_path: str):
        with open(file_path, "wb") as f:
            pickle.dump(self.index, f)
--------------------------------------------------------------------------------
/stock_pattern_analyzer/search_model.py:
--------------------------------------------------------------------------------
import pickle
from datetime import datetime
from pathlib import Path

import numpy as np
from sklearn.preprocessing import minmax_scale

from .data import RawStockDataHolder
from .search_index import MemoryEfficientIndex

MINIMUM_WINDOW_SIZE = 5


class SearchModel:
    def __init__(self, data_holder: RawStockDataHolder, window_size: int):
        if window_size < MINIMUM_WINDOW_SIZE:
            raise ValueError(f"Window size is too small. Minimum is {MINIMUM_WINDOW_SIZE}")

        self.window_size = window_size
        self._data_holder = data_holder

        # This is the object we can use for querying
        self.index = None
        # TODO: solve this more efficiently without wasting memory
        # This stores the start and end indices in the original array of the windows
        self.start_end_indices_in_original_array = None
        # This stores the ticker symbol label associated with every window
        self.labels = None
        # This shows if the index is created or not
        self.is_built = False

    def _create_windows(self):
        """
        Create the sliding windows from the stock data
        Returns:
            windows as a numpy array [n_samples, window_size]
        """

        if not self._data_holder.is_filled:
            raise ValueError("Data holder needs to be filled first")

        windows = []
        self.start_end_indices_in_original_array = []
        self.labels = []

        # TODO: this for-loop should be vectorized
        for symbol in self._data_holder.ticker_symbols:
            label = self._data_holder.symbol_to_label[symbol]
            nb_valid_values = self._data_holder.nb_of_valid_values[label]

            symbol_values = self._data_holder.values[label][:nb_valid_values]
            # Vectorized sliding window creation
            window_indices = np.arange(symbol_values.shape[0] - self.window_size + 1)[:, None] + np.arange(
                self.window_size)
            windows.extend(symbol_values[window_indices])
            self.start_end_indices_in_original_array.extend(window_indices[:, (0, -1)])
            self.labels.extend([label] * len(window_indices))

        self.start_end_indices_in_original_array = np.array(self.start_end_indices_in_original_array)
        self.labels = np.array(self.labels)

        windows = np.array(windows)
        # Separate windows should be normalized, so they are comparable within a given window size (time-frame)
        windows = minmax_scale(windows, feature_range=(0, 1), axis=1)
        windows = np.nan_to_num(windows)

        return windows

    def build_index(self):
        """
        Build the search index

        Returns:
            None
        """

        X = self._create_windows()
        self.index = MemoryEfficientIndex()
        self.index.create(X.astype(np.float32))
        self.is_built = True

    def search(self, values: np.ndarray, k: int = 5) -> tuple:
        """
        Search in the data
        Args:
            values: "query" data - not (min-max) scaled
            k: This is how many matches will be returned

        Returns:
            tuple: indices, distances
        """

        if not self.is_built:
            raise ValueError("You need to build the search tree first")

        values = minmax_scale(values, feature_range=(0, 1))
        if len(values.shape) == 1:
            values = values.reshape(1, -1)
        values = values.astype(np.float32)

        top_k_distances, top_k_indices = self.index.query(q=values, k=k)
        top_k_distances = top_k_distances.ravel()
        top_k_indices = top_k_indices.ravel()

        return top_k_indices, top_k_distances

    def get_window_symbol_label(self, index: int):
        return self.labels[index]

    def get_window_symbol(self, index: int) -> str:
        label = self.get_window_symbol_label(index)
        return self._data_holder.label_to_symbol[label]

    def _get_label_and_start_end_indices(self, index: int, future_length: int):
        start_index, end_index = self.start_end_indices_in_original_array[index]
        label = self.get_window_symbol_label(index)

        if future_length > 0:
start_index -= future_length 120 | if start_index < 0: 121 | start_index = 0 122 | return label, start_index, end_index 123 | 124 | def get_window_dates(self, index: int, future_length: int = 0) -> np.ndarray: 125 | label, start_index, end_index = self._get_label_and_start_end_indices(index, future_length) 126 | dates = self._data_holder.dates[label][start_index:end_index + 1] 127 | return dates 128 | 129 | def get_window_values(self, index: int, future_length: int = 0): 130 | label, start_index, end_index = self._get_label_and_start_end_indices(index, future_length) 131 | values = self._data_holder.values[label][start_index:end_index + 1] 132 | return values 133 | 134 | def get_start_end_date(self, index: int, future_length: int = 0) -> tuple: 135 | dates = self.get_window_dates(index, future_length) 136 | return dates[0], dates[-1] 137 | 138 | def create_filename_for_today(self) -> str: 139 | current_date = datetime.now().strftime("%Y_%m_%d") 140 | file_name = f"search_tree_{self.window_size}win_{current_date}.pk" 141 | return file_name 142 | 143 | def serialize(self) -> str: 144 | if not self.is_built: 145 | raise ValueError("You need to build the tree first") 146 | 147 | file_name = self.create_filename_for_today() 148 | with open(file_name, "wb") as f: 149 | pickle.dump(self, f) 150 | 151 | return file_name 152 | 153 | @staticmethod 154 | def load(file_name: str) -> "SearchModel": 155 | with open(file_name, "rb") as f: 156 | obj = pickle.load(f) 157 | return obj 158 | 159 | 160 | def initialize_search_tree(data_holder: RawStockDataHolder, window_size: int, force_update: bool = False): 161 | search_tree = SearchModel(data_holder=data_holder, window_size=window_size) 162 | 163 | file_path = Path(search_tree.create_filename_for_today()) 164 | 165 | if (not file_path.exists()) or force_update: 166 | search_tree.build_index() 167 | # TODO: implement serialization 168 | # search_tree.serialize() 169 | else: 170 | search_tree = SearchModel.load(str(file_path)) 171 | return search_tree 172 | -------------------------------------------------------------------------------- /stock_pattern_analyzer/visualization.py: -------------------------------------------------------------------------------- 1 | from typing import List, Tuple 2 | 3 | import numpy as np 4 | from plotly import graph_objs 5 | from sklearn.preprocessing import minmax_scale 6 | 7 | FIG_BG_COLOR = "#F9F9F9" 8 | ANCHOR_COLOR = "#FF372D" 9 | VALUES_COLOR = "#89D4F5" 10 | 11 | 12 | def visualize_graph(match_values_list: List[np.ndarray], 13 | match_symbols: List[str], 14 | match_str_dates: List[Tuple[str, str]], 15 | window_size: int, 16 | future_size: int, 17 | anchor_symbol: str, 18 | anchor_values: np.ndarray, 19 | show_legend: bool = True, 20 | offset_traces: bool = False) -> graph_objs.Figure: 21 | nb_matches = len(match_symbols) 22 | opacity_values = np.linspace(0.2, 1.0, nb_matches)[::-1] 23 | 24 | anchor_original_values = anchor_values[::-1] 25 | minmax_anchor_values = minmax_scale(anchor_original_values) 26 | 27 | fig = graph_objs.Figure() 28 | 29 | assert len(match_values_list) == len(match_symbols), "Something is fishy" 30 | 31 | # Draw all matches 32 | for i in range(nb_matches): 33 | match_values = match_values_list[i] 34 | match_symbol = match_symbols[i] 35 | match_start_date, match_end_date = match_str_dates[i] 36 | 37 | x = list(range(1, len(match_values) + 1)) 38 | original_values = match_values[::-1] 39 | minmax_matched_values = minmax_scale(original_values) 40 | if offset_traces: 41 | diff = 
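The broadcasting trick in `_create_windows` is compact enough to be easy to miss. The same construction in isolation, on synthetic data (nothing repo-specific):

```python
import numpy as np
from sklearn.preprocessing import minmax_scale

values = np.arange(10, dtype=float)  # stand-in for one symbol's close prices
window_size = 4

# (n_windows, 1) + (window_size,) broadcasts to an (n_windows, window_size) index grid
window_indices = np.arange(values.shape[0] - window_size + 1)[:, None] + np.arange(window_size)
windows = values[window_indices]

print(windows.shape)               # (7, 4)
print(window_indices[:, (0, -1)])  # start/end index of every window in the original array

# axis=1 scales each window independently to [0, 1], just like _create_windows does
normalized = minmax_scale(windows, feature_range=(0, 1), axis=1)
```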
--------------------------------------------------------------------------------
/stock_pattern_analyzer/visualization.py:
--------------------------------------------------------------------------------
1 | from typing import List, Tuple
2 | 
3 | import numpy as np
4 | from plotly import graph_objs
5 | from sklearn.preprocessing import minmax_scale
6 | 
7 | FIG_BG_COLOR = "#F9F9F9"
8 | ANCHOR_COLOR = "#FF372D"
9 | VALUES_COLOR = "#89D4F5"
10 | 
11 | 
12 | def visualize_graph(match_values_list: List[np.ndarray],
13 |                     match_symbols: List[str],
14 |                     match_str_dates: List[Tuple[str, str]],
15 |                     window_size: int,
16 |                     future_size: int,
17 |                     anchor_symbol: str,
18 |                     anchor_values: np.ndarray,
19 |                     show_legend: bool = True,
20 |                     offset_traces: bool = False) -> graph_objs.Figure:
21 |     nb_matches = len(match_symbols)
22 |     opacity_values = np.linspace(0.2, 1.0, nb_matches)[::-1]
23 | 
24 |     anchor_original_values = anchor_values[::-1]  # values arrive most-recent-first; reverse so the plot runs oldest to newest
25 |     minmax_anchor_values = minmax_scale(anchor_original_values)
26 | 
27 |     fig = graph_objs.Figure()
28 | 
29 |     assert len(match_values_list) == len(match_symbols), "Match values and match symbols must have the same length"
30 | 
31 |     # Draw all matches
32 |     for i in range(nb_matches):
33 |         match_values = match_values_list[i]
34 |         match_symbol = match_symbols[i]
35 |         match_start_date, match_end_date = match_str_dates[i]
36 | 
37 |         x = list(range(1, len(match_values) + 1))
38 |         original_values = match_values[::-1]
39 |         minmax_matched_values = minmax_scale(original_values)
40 |         if offset_traces:
41 |             diff = minmax_anchor_values[window_size - 1] - minmax_matched_values[window_size - 1]
42 |             minmax_matched_values = minmax_matched_values + diff
43 |         trace_name = f"{i}) {match_symbol} ({match_start_date} - {match_end_date})"
44 |         trace = graph_objs.Scatter(x=x,
45 |                                    y=minmax_matched_values,
46 |                                    name=trace_name,
47 |                                    meta=trace_name,
48 |                                    mode="lines",
49 |                                    line=dict(color=VALUES_COLOR),
50 |                                    opacity=opacity_values[i],
51 |                                    customdata=original_values,
52 |                                    hovertemplate="%{meta}<br>Norm. val.: %{y:.2f}<br>Value: %{customdata:.2f}$")
53 |         fig.add_trace(trace)
54 | 
55 |     # Draw the anchor series
56 |     x = list(range(1, len(anchor_values) + 1))
57 |     trace_name = f"Anchor ({anchor_symbol})"
58 |     trace = graph_objs.Scatter(x=x,
59 |                                y=minmax_anchor_values,
60 |                                name=trace_name,
61 |                                meta=trace_name,
62 |                                mode="lines+markers",
63 |                                line=dict(color=ANCHOR_COLOR),
64 |                                customdata=anchor_original_values,
65 |                                hovertemplate="%{meta}<br>Norm. val.: %{y:.2f}<br>Value: %{customdata:.2f}$")
66 |     fig.add_trace(trace)
67 | 
68 |     # Add "last market close" line
69 |     fig.add_vline(x=window_size, line_dash="dash", line_color="black",
70 |                   annotation_text="Last market close (for selected symbol)")
71 | 
72 |     # Style the figure
73 |     fig.update_xaxes(showspikes=True, spikecolor="black", spikesnap="cursor", spikemode="across")
74 |     # fig.update_yaxes(showspikes=True, spikecolor="black", spikethickness=1)
75 | 
76 |     x_axis_tick_labels = list(range(-window_size, future_size + 1))
77 |     fig.update_layout(title=f"Similar patterns for {anchor_symbol} based on historical market close data",
78 |                       yaxis=dict(title="Normalized Value"),
79 |                       xaxis=dict(title="Days",
80 |                                  tickmode="array",
81 |                                  tickvals=list(range(len(x_axis_tick_labels))),
82 |                                  ticktext=x_axis_tick_labels),
83 |                       autosize=True,
84 |                       plot_bgcolor=FIG_BG_COLOR,
85 |                       paper_bgcolor=FIG_BG_COLOR,
86 |                       legend=dict(font=dict(size=9), orientation="h", yanchor="bottom", y=-0.5),
87 |                       showlegend=show_legend,
88 |                       spikedistance=1000,
89 |                       hoverdistance=100)
90 | 
91 |     return fig
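A sketch of calling `visualize_graph` directly with synthetic inputs. In the app the data comes from the REST API; the shapes below (each match spanning `window_size + future_size` values, the anchor spanning `window_size`) are inferred from how the function indexes and annotates the traces, and the symbols and dates are just placeholders:

```python
import numpy as np
from stock_pattern_analyzer.visualization import visualize_graph

window_size, future_size, k = 10, 5, 3
rng = np.random.default_rng(42)

# Matched series cover the window plus the "future" days; the anchor covers only the window
match_values = [rng.random(window_size + future_size) for _ in range(k)]
match_symbols = ["AAPL", "MSFT", "TSLA"]
match_dates = [("2021-01-04", "2021-01-25")] * k
anchor_values = rng.random(window_size)

fig = visualize_graph(match_values_list=match_values,
                      match_symbols=match_symbols,
                      match_str_dates=match_dates,
                      window_size=window_size,
                      future_size=future_size,
                      anchor_symbol="AAPL",
                      anchor_values=anchor_values,
                      offset_traces=True)  # shift each match so it meets the anchor at the last close
fig.show()
```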
--------------------------------------------------------------------------------
/symbols.txt.example:
--------------------------------------------------------------------------------
1 | $SP500
2 | GME
3 | TSLA
4 | 
5 | 
--------------------------------------------------------------------------------
/tests/measurements.py:
--------------------------------------------------------------------------------
1 | import os
2 | import time
3 | import json
4 | 
5 | import numpy as np
6 | from tqdm import tqdm
7 | 
8 | import stock_pattern_analyzer as spa
9 | 
10 | NB_STOCKS = 100
11 | NB_DAYS_PER_STOCK = 5 * 365
12 | WINDOW_SIZES = [5, 10, 20, 50, 100]
13 | 
14 | 
15 | def create_windows(X, window_size: int):
16 |     window_indices = np.arange(X.shape[0] - window_size + 1)[:, None] + np.arange(window_size)
17 |     return X[window_indices]
18 | 
19 | 
20 | def over_estimate_memory_footprint(model):
21 |     """
22 |     A good over-estimate of the memory footprint is the file size of the serialized index
23 |     Based on https://stackoverflow.com/a/565382/5108062
24 |     """
25 | 
26 |     tmp_filename = "test.pk"
27 |     model.serialize(tmp_filename)
28 |     size_of_the_file = os.path.getsize(tmp_filename)
29 |     os.remove(tmp_filename)
30 |     return size_of_the_file
31 | 
32 | 
33 | def measure_build_time(model_class, windowed_data):
34 |     start_time = time.time()
35 |     model = model_class()
36 |     model.create(windowed_data)
37 |     end_time = time.time()
38 |     build_time = end_time - start_time
39 |     return model, build_time
40 | 
41 | 
42 | def estimate_query_speed(model, windowed_data, N: int):
43 |     query = windowed_data[:1]  # keep the query 2D: the faiss-based indexes expect a (1, d) array
44 |     start_time = time.time()
45 |     for i in range(N):
46 |         _ = model.query(query, k=10)
47 |     end_time = time.time()
48 |     query_time = (end_time - start_time) / N  # average seconds per query
49 |     return query_time
50 | 
51 | 
52 | def perform_measurements():
53 |     max_values = NB_STOCKS * NB_DAYS_PER_STOCK
54 |     X = np.random.random(max_values).astype(np.float32)  # the faiss-based indexes require float32
55 | 
56 |     res_dict = {}
57 | 
58 |     for model_class in tqdm([spa.cKDTreeIndex, spa.FastIndex, spa.MemoryEfficientIndex]):
59 |         res_dict[model_class.__name__] = {}
60 |         for window_size in WINDOW_SIZES:
61 |             res_dict[model_class.__name__][window_size] = {}
62 |             data = create_windows(X, window_size)
63 | 
64 |             model, build_time = measure_build_time(model_class, data)
65 |             res_dict[model_class.__name__][window_size]["build_time"] = build_time
66 | 
67 |             memory_footprint = over_estimate_memory_footprint(model)
68 |             res_dict[model_class.__name__][window_size]["memory_footprint"] = memory_footprint
69 | 
70 |             query_speed = estimate_query_speed(model, data, 10)
71 |             res_dict[model_class.__name__][window_size]["query_speed"] = query_speed
72 | 
73 |     with open("measurement_results.json", "w") as f:
74 |         json.dump(res_dict, f)
75 | 
76 | 
77 | if __name__ == "__main__":
78 |     perform_measurements()
--------------------------------------------------------------------------------
/tests/rest_api_stress_test.py:
--------------------------------------------------------------------------------
1 | import concurrent.futures
2 | import time
3 | 
4 | import numpy as np
5 | import requests
6 | 
7 | BASE_URL = "http://localhost:8001"
8 | 
9 | 
10 | def search_recent(symbol: str, window_size: int, future_size: int, top_k: int):
11 |     s = time.time()
12 |     url = f"{BASE_URL}/search/recent/?symbol={symbol.upper()}&window_size={window_size}&top_k={top_k}&future_size={future_size}"
13 |     _ = requests.get(url)
14 |     e = time.time()
15 |     return e - s
16 | 
17 | 
18 | def print_stats(request_execution_times: list, start_time, end_time, N: int):
19 |     request_execution_times = np.array(request_execution_times)
20 |     print("Statistics:")
21 | 
22 |     execution_time = end_time - start_time
23 |     print(f"Execution time: {execution_time:.4f}")
24 |     print(f"Requests/sec: {N / execution_time:.4f}")
25 | 
26 |     print(f"(single) Request execution time: {request_execution_times.mean():.4f}+/-{request_execution_times.std():.4f}")
27 | 
28 | 
29 | def test_most_recent_search(N: int):
30 |     print(f"Recent search test with {N} requests")
31 | 
32 |     # # Sequential requests
33 |     # print("Sequential requests")
34 |     # 
35 |     # start_time = time.time()
36 |     # request_execution_times = []
37 |     # for i in range(N):
38 |     #     exec_time = search_recent("AAPL", window_size=5, future_size=5, top_k=5)
39 |     #     request_execution_times.append(exec_time)
40 |     # end_time = time.time()
41 |     # 
42 |     # print_stats(request_execution_times, start_time, end_time, N)
43 |     # print("-" * 30)
44 | 
45 |     # Concurrent requests
46 |     print("Concurrent requests")
47 | 
48 |     start_time = time.time()
49 |     request_execution_times = []
50 |     with concurrent.futures.ThreadPoolExecutor(max_workers=None) as pool:
51 |         futures = {}
52 |         for i in range(N):
53 |             # Previous manual tests showed there is no latency when using bigger sizes
54 |             # (tested to the maximum allowed window length)
55 |             f = pool.submit(search_recent, symbol="AAPL", window_size=5, future_size=5, top_k=5)
56 |             futures[f] = i
57 | 
58 |         for future in concurrent.futures.as_completed(futures):
59 |             try:
60 |                 exec_time = future.result()
61 |                 request_execution_times.append(exec_time)
62 |             except Exception as e:
63 |                 print(e)
64 | 
65 |     end_time = time.time()
66 | 
67 |     print_stats(request_execution_times, start_time, end_time, N)
68 |     print("-" * 30)
69 | 
70 | 
71 | if __name__ == "__main__":
72 |     test_most_recent_search(100)
--------------------------------------------------------------------------------
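
A small sketch for comparing the three index implementations from the `measurement_results.json` file that `tests/measurements.py` writes (the key layout follows `perform_measurements` above; note that JSON turns the integer window sizes into strings):

```python
import json

with open("measurement_results.json") as f:
    results = json.load(f)

# results[model_name][window_size] holds build_time, memory_footprint and query_speed
for model_name, per_window in results.items():
    for window_size, metrics in per_window.items():
        print(f"{model_name:>22} | window={window_size:>3} | "
              f"build={metrics['build_time']:.3f}s | "
              f"mem={metrics['memory_footprint'] / 1024 ** 2:.1f}MB | "
              f"query={metrics['query_speed'] * 1000:.2f}ms")
```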