├── JP Morgan Python Training ├── requirements.txt ├── notebooks │ ├── 6_financial_data.ipynb │ ├── mandlebrot.py │ ├── 9_3d_plotting.ipynb │ ├── 1_basic.ipynb │ ├── 4_webapi.ipynb │ ├── 8_altman_z_double_prime.ipynb │ ├── 3_flights.ipynb │ └── 7_advanced_plotting.ipynb ├── README.md └── data │ ├── data_norm.csv │ └── data_raw.csv ├── Quantum Use Cases └── readme.md ├── Quantum Machine Learning in Finance └── Readme.md ├── imgs └── Intro to Finance with Python.png ├── Classical Use Cases └── Multimodal Transportation Optimization │ ├── model data.xlsx │ ├── Solution.txt │ └── README.md ├── Machine Learning For Finance ├── Regression Based Machine Learning for Algorithmic Trading │ ├── RN_11_01.pdf │ ├── empirical properties of asset returns.pdf │ ├── Trend Following - Penalized Regression Techniques.py │ ├── Pairs Trading - Lasso Regression.py │ ├── Pairs Trading - Ridge Regression.py │ ├── Pairs Trading - Elastic Net.py │ ├── Pairs Trading - Bayesian Ridge Regression.py │ ├── Pairs Trading scikit-learn Linear.py │ ├── Pairs Trading statsmodels Linear Post 2008.py │ ├── Pairs Trading statsmodels Linear Pre 2008.py │ └── Pairs Trading with Kalman Filters.py └── Classification Based Machine Learning for Algorithmic Trading │ ├── img │ ├── Bias.PNG │ ├── Good.PNG │ ├── SVM 1.png │ ├── SVM 2.png │ ├── SVM 3.png │ ├── SVM 4.png │ ├── irises.png │ ├── recall.jpg │ ├── Polynomial.png │ ├── Variance.PNG │ ├── pr_or_roc.png │ ├── precision.jpg │ ├── tnr_and_fpr.png │ ├── holdout_method.PNG │ ├── confusion matrix.jpg │ ├── confusion_matrix.jpg │ ├── cross_validation.PNG │ ├── precision-recall.png │ ├── holdout_method_w_validation.PNG │ └── iris_petal_length_and_width.png │ ├── Predict Next Day Return │ ├── data │ │ ├── YMA.csv │ │ └── XMA.csv │ ├── spyder_LDA.py │ ├── spyder_LogisticRegression.py │ └── spyder_QDA.py │ └── default_predictions │ ├── Kernel SVM.py │ ├── Linear SVM.py │ ├── Naive Bayes.py │ ├── SGDClassifier.py │ ├── PAC.py │ ├── GPC.py │ ├── QDA.py │ ├── Decision Tree.py │ ├── Logistic regression.py │ ├── Template.py │ ├── LDA.py │ ├── KNN.py │ ├── Random forest.py │ └── dataset_2.csv ├── .github └── workflows │ └── jekyll-gh-pages.yml └── README.md /JP Morgan Python Training/requirements.txt: -------------------------------------------------------------------------------- 1 | -------------------------------------------------------------------------------- /Quantum Use Cases/readme.md: -------------------------------------------------------------------------------- 1 | # Quantum Use Cases -------------------------------------------------------------------------------- /Quantum Machine Learning in Finance/Readme.md: -------------------------------------------------------------------------------- 1 | # QML in Finance 2 | -------------------------------------------------------------------------------- /imgs/Intro to Finance with Python.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/MonitSharma/Quantum-Finance-and-Numerical-Methods/HEAD/imgs/Intro to Finance with Python.png -------------------------------------------------------------------------------- /Classical Use Cases/Multimodal Transportation Optimization/model data.xlsx: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/MonitSharma/Quantum-Finance-and-Numerical-Methods/HEAD/Classical Use Cases/Multimodal Transportation Optimization/model data.xlsx 
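The "model data.xlsx" workbook above holds the inputs for the multimodal transportation model, but its internal sheet layout is not documented in this dump. A minimal inspection sketch, assuming pandas with openpyxl installed and using the path from the tree above (this is an added example, not one of the repo's scripts):

import pandas as pd

# list the sheets first, since the workbook's layout is not documented here
xls = pd.ExcelFile("Classical Use Cases/Multimodal Transportation Optimization/model data.xlsx")
print(xls.sheet_names)
# preview the first sheet as a DataFrame
print(xls.parse(xls.sheet_names[0]).head())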
-------------------------------------------------------------------------------- /Machine Learning For Finance/Regression Based Machine Learning for Algorithmic Trading/RN_11_01.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/MonitSharma/Quantum-Finance-and-Numerical-Methods/HEAD/Machine Learning For Finance/Regression Based Machine Learning for Algorithmic Trading/RN_11_01.pdf -------------------------------------------------------------------------------- /Machine Learning For Finance/Classification Based Machine Learning for Algorithmic Trading/img/Bias.PNG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/MonitSharma/Quantum-Finance-and-Numerical-Methods/HEAD/Machine Learning For Finance/Classification Based Machine Learning for Algorithmic Trading/img/Bias.PNG -------------------------------------------------------------------------------- /Machine Learning For Finance/Classification Based Machine Learning for Algorithmic Trading/img/Good.PNG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/MonitSharma/Quantum-Finance-and-Numerical-Methods/HEAD/Machine Learning For Finance/Classification Based Machine Learning for Algorithmic Trading/img/Good.PNG -------------------------------------------------------------------------------- /Machine Learning For Finance/Classification Based Machine Learning for Algorithmic Trading/img/SVM 1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/MonitSharma/Quantum-Finance-and-Numerical-Methods/HEAD/Machine Learning For Finance/Classification Based Machine Learning for Algorithmic Trading/img/SVM 1.png -------------------------------------------------------------------------------- /Machine Learning For Finance/Classification Based Machine Learning for Algorithmic Trading/img/SVM 2.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/MonitSharma/Quantum-Finance-and-Numerical-Methods/HEAD/Machine Learning For Finance/Classification Based Machine Learning for Algorithmic Trading/img/SVM 2.png -------------------------------------------------------------------------------- /Machine Learning For Finance/Classification Based Machine Learning for Algorithmic Trading/img/SVM 3.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/MonitSharma/Quantum-Finance-and-Numerical-Methods/HEAD/Machine Learning For Finance/Classification Based Machine Learning for Algorithmic Trading/img/SVM 3.png -------------------------------------------------------------------------------- /Machine Learning For Finance/Classification Based Machine Learning for Algorithmic Trading/img/SVM 4.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/MonitSharma/Quantum-Finance-and-Numerical-Methods/HEAD/Machine Learning For Finance/Classification Based Machine Learning for Algorithmic Trading/img/SVM 4.png -------------------------------------------------------------------------------- /Machine Learning For Finance/Classification Based Machine Learning for Algorithmic Trading/img/irises.png: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/MonitSharma/Quantum-Finance-and-Numerical-Methods/HEAD/Machine Learning For Finance/Classification Based Machine Learning for Algorithmic Trading/img/irises.png -------------------------------------------------------------------------------- /Machine Learning For Finance/Classification Based Machine Learning for Algorithmic Trading/img/recall.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/MonitSharma/Quantum-Finance-and-Numerical-Methods/HEAD/Machine Learning For Finance/Classification Based Machine Learning for Algorithmic Trading/img/recall.jpg -------------------------------------------------------------------------------- /Machine Learning For Finance/Classification Based Machine Learning for Algorithmic Trading/img/Polynomial.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/MonitSharma/Quantum-Finance-and-Numerical-Methods/HEAD/Machine Learning For Finance/Classification Based Machine Learning for Algorithmic Trading/img/Polynomial.png -------------------------------------------------------------------------------- /Machine Learning For Finance/Classification Based Machine Learning for Algorithmic Trading/img/Variance.PNG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/MonitSharma/Quantum-Finance-and-Numerical-Methods/HEAD/Machine Learning For Finance/Classification Based Machine Learning for Algorithmic Trading/img/Variance.PNG -------------------------------------------------------------------------------- /Machine Learning For Finance/Classification Based Machine Learning for Algorithmic Trading/img/pr_or_roc.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/MonitSharma/Quantum-Finance-and-Numerical-Methods/HEAD/Machine Learning For Finance/Classification Based Machine Learning for Algorithmic Trading/img/pr_or_roc.png -------------------------------------------------------------------------------- /Machine Learning For Finance/Classification Based Machine Learning for Algorithmic Trading/img/precision.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/MonitSharma/Quantum-Finance-and-Numerical-Methods/HEAD/Machine Learning For Finance/Classification Based Machine Learning for Algorithmic Trading/img/precision.jpg -------------------------------------------------------------------------------- /Machine Learning For Finance/Classification Based Machine Learning for Algorithmic Trading/img/tnr_and_fpr.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/MonitSharma/Quantum-Finance-and-Numerical-Methods/HEAD/Machine Learning For Finance/Classification Based Machine Learning for Algorithmic Trading/img/tnr_and_fpr.png -------------------------------------------------------------------------------- /Machine Learning For Finance/Classification Based Machine Learning for Algorithmic Trading/img/holdout_method.PNG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/MonitSharma/Quantum-Finance-and-Numerical-Methods/HEAD/Machine Learning For Finance/Classification Based Machine Learning for Algorithmic Trading/img/holdout_method.PNG 
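The images in this img folder illustrate the evaluation ideas (confusion matrix, precision/recall, TNR/FPR, holdout vs. cross-validation) that the classification scripts later in this repo compute with scikit-learn. A minimal reference sketch of those metrics; the y_true/y_pred arrays are made-up examples, not data from this repo:

import numpy as np
from sklearn.metrics import confusion_matrix, precision_score, recall_score

y_true = np.array([0, 0, 1, 1, 1, 0, 1, 0])   # hypothetical labels
y_pred = np.array([0, 1, 1, 1, 0, 0, 1, 0])   # hypothetical predictions

cm = confusion_matrix(y_true, y_pred)          # rows = true class, columns = predicted class
tn, fp, fn, tp = cm.ravel()
print(cm)
print("precision = {0:.4f}".format(precision_score(y_true, y_pred)))  # tp / (tp + fp)
print("recall    = {0:.4f}".format(recall_score(y_true, y_pred)))     # tp / (tp + fn)
print("TNR = {0:.4f}  FPR = {1:.4f}".format(tn / (tn + fp), fp / (fp + tn)))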
-------------------------------------------------------------------------------- /Machine Learning For Finance/Classification Based Machine Learning for Algorithmic Trading/img/confusion matrix.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/MonitSharma/Quantum-Finance-and-Numerical-Methods/HEAD/Machine Learning For Finance/Classification Based Machine Learning for Algorithmic Trading/img/confusion matrix.jpg -------------------------------------------------------------------------------- /Machine Learning For Finance/Classification Based Machine Learning for Algorithmic Trading/img/confusion_matrix.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/MonitSharma/Quantum-Finance-and-Numerical-Methods/HEAD/Machine Learning For Finance/Classification Based Machine Learning for Algorithmic Trading/img/confusion_matrix.jpg -------------------------------------------------------------------------------- /Machine Learning For Finance/Classification Based Machine Learning for Algorithmic Trading/img/cross_validation.PNG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/MonitSharma/Quantum-Finance-and-Numerical-Methods/HEAD/Machine Learning For Finance/Classification Based Machine Learning for Algorithmic Trading/img/cross_validation.PNG -------------------------------------------------------------------------------- /Machine Learning For Finance/Classification Based Machine Learning for Algorithmic Trading/img/precision-recall.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/MonitSharma/Quantum-Finance-and-Numerical-Methods/HEAD/Machine Learning For Finance/Classification Based Machine Learning for Algorithmic Trading/img/precision-recall.png -------------------------------------------------------------------------------- /Machine Learning For Finance/Classification Based Machine Learning for Algorithmic Trading/img/holdout_method_w_validation.PNG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/MonitSharma/Quantum-Finance-and-Numerical-Methods/HEAD/Machine Learning For Finance/Classification Based Machine Learning for Algorithmic Trading/img/holdout_method_w_validation.PNG -------------------------------------------------------------------------------- /Machine Learning For Finance/Classification Based Machine Learning for Algorithmic Trading/img/iris_petal_length_and_width.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/MonitSharma/Quantum-Finance-and-Numerical-Methods/HEAD/Machine Learning For Finance/Classification Based Machine Learning for Algorithmic Trading/img/iris_petal_length_and_width.png -------------------------------------------------------------------------------- /Machine Learning For Finance/Regression Based Machine Learning for Algorithmic Trading/empirical properties of asset returns.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/MonitSharma/Quantum-Finance-and-Numerical-Methods/HEAD/Machine Learning For Finance/Regression Based Machine Learning for Algorithmic Trading/empirical properties of asset returns.pdf -------------------------------------------------------------------------------- /JP 
Morgan Python Training/notebooks/6_financial_data.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "code", 5 | "execution_count": null, 6 | "metadata": {}, 7 | "outputs": [], 8 | "source": [] 9 | } 10 | ], 11 | "metadata": { 12 | "kernelspec": { 13 | "display_name": "Python 3", 14 | "language": "python", 15 | "name": "python3" 16 | }, 17 | "language_info": { 18 | "codemirror_mode": { 19 | "name": "ipython", 20 | "version": 3 21 | }, 22 | "file_extension": ".py", 23 | "mimetype": "text/x-python", 24 | "name": "python", 25 | "nbconvert_exporter": "python", 26 | "pygments_lexer": "ipython3", 27 | "version": "3.7.3" 28 | } 29 | }, 30 | "nbformat": 4, 31 | "nbformat_minor": 4 32 | } 33 | -------------------------------------------------------------------------------- /JP Morgan Python Training/README.md: -------------------------------------------------------------------------------- 1 | 2 | 3 | # Python Training 4 | **This Python training is for JPMorgan business analysts and traders, as well as select clients.** 5 | 6 | 7 | 8 | 9 | ## Overview 10 | This course is designed to be an introduction to numerical computing and data visualization in Python. It is not designed to be a complete course in Computer Science or programming, but rather a motivational demonstration of how relatively complex topics can be accessible even to those without formal programming backgrounds. 11 | 12 | 13 | 14 | ## Platform Attribution 15 | This repository relies on the [Binder](https://mybinder.readthedocs.io/en/latest/about.html) project, which is generously supported by [Google Cloud Platform](https://cloud.google.com/), [OVH](https://www.ovh.com/world/), [GESIS Notebooks](https://notebooks.gesis.org) and the [Turing Institute](https://www.turing.ac.uk). 16 | 17 | 18 | ## Data Attribution 19 | 20 | - Financial data provided by [IEX Cloud](https://iexcloud.io) 21 | - Airport and route data provided by [OpenFlights.org](https://openflights.org/data.html#license) 22 | 23 | 24 | 25 | -------------------------------------------------------------------------------- /JP Morgan Python Training/notebooks/mandlebrot.py: -------------------------------------------------------------------------------- 1 | # coding: utf-8 2 | 3 | import numpy as np 4 | 5 | 6 | def mandelbrot(X, Y, max_iterations=1000, verbose=True): 7 | """Computes the Mandelbrot set. 8 | 9 | Returns a matrix with the escape iteration number of the Mandelbrot 10 | sequence. The matrix contains a cell for every (x, y) pair formed from the 11 | elements of the input vectors X and Y.
At most max_iterations iterations are 12 | performed for each point. 13 | :param X: set of x coordinates 14 | :param Y: set of y coordinates 15 | :param max_iterations: maximum number of iterations to perform before 16 | forcing the sequence to stop 17 | :param verbose: flag indicating whether to print on the console which line 18 | number is being computed 19 | :return: Matrix containing the escape iteration number for every point 20 | specified in input 21 | """ 22 | 23 | # init the output array 24 | out_arr = np.zeros((len(Y), len(X))) 25 | 26 | # Iterate over the y coordinates 27 | for i, y in enumerate(Y): 28 | if verbose: 29 | print("computing line {} of {}".format(i + 1, len(Y))) 30 | for j, x in enumerate(X): 31 | n = 0 32 | c = x + 1j*y 33 | z = c 34 | while (n < max_iterations) and (abs(z) <= 2): 35 | z = z*z + c 36 | n += 1 37 | out_arr[i, j] = n 38 | 39 | return out_arr 40 | -------------------------------------------------------------------------------- /.github/workflows/jekyll-gh-pages.yml: -------------------------------------------------------------------------------- 1 | # Sample workflow for building and deploying a Jekyll site to GitHub Pages 2 | name: Deploy Jekyll with GitHub Pages dependencies preinstalled 3 | 4 | on: 5 | # Runs on pushes targeting the default branch 6 | push: 7 | branches: ["main"] 8 | 9 | # Allows you to run this workflow manually from the Actions tab 10 | workflow_dispatch: 11 | 12 | # Sets permissions of the GITHUB_TOKEN to allow deployment to GitHub Pages 13 | permissions: 14 | contents: read 15 | pages: write 16 | id-token: write 17 | 18 | # Allow one concurrent deployment 19 | concurrency: 20 | group: "pages" 21 | cancel-in-progress: true 22 | 23 | jobs: 24 | # Build job 25 | build: 26 | runs-on: ubuntu-latest 27 | steps: 28 | - name: Checkout 29 | uses: actions/checkout@v3 30 | - name: Setup Pages 31 | uses: actions/configure-pages@v2 32 | - name: Build with Jekyll 33 | uses: actions/jekyll-build-pages@v1 34 | with: 35 | source: ./ 36 | destination: ./_site 37 | - name: Upload artifact 38 | uses: actions/upload-pages-artifact@v1 39 | 40 | # Deployment job 41 | deploy: 42 | environment: 43 | name: github-pages 44 | url: ${{ steps.deployment.outputs.page_url }} 45 | runs-on: ubuntu-latest 46 | needs: build 47 | steps: 48 | - name: Deploy to GitHub Pages 49 | id: deployment 50 | uses: actions/deploy-pages@v1 51 | -------------------------------------------------------------------------------- /Machine Learning For Finance/Classification Based Machine Learning for Algorithmic Trading/Predict Next Day Return/data/YMA.csv: -------------------------------------------------------------------------------- 1 | Date,YMA 2 | 16/3/2017,97.94 3 | 17/3/2017,97.92 4 | 20/3/2017,97.95 5 | 21/3/2017,97.96 6 | 22/3/2017,97.98 7 | 23/3/2017,98 8 | 24/3/2017,98.01 9 | 27/3/2017,98.04 10 | 28/3/2017,98.06 11 | 29/3/2017,98.04 12 | 30/3/2017,98.06 13 | 31/3/2017,98.05 14 | 3/4/2017,98.07 15 | 4/4/2017,98.11 16 | 5/4/2017,98.12 17 | 6/4/2017,98.15 18 | 7/4/2017,98.19 19 | 10/4/2017,98.18 20 | 11/4/2017,98.19 21 | 12/4/2017,98.21 22 | 13/4/2017,98.22 23 | 18/4/2017,98.21 24 | 19/4/2017,98.23 25 | 20/4/2017,98.2 26 | 21/4/2017,98.18 27 | 24/4/2017,98.16 28 | 25/4/2017,98.15 29 | 26/4/2017,98.12 30 | 27/4/2017,98.14 31 | 28/4/2017,98.17 32 | 1/5/2017,98.16 33 | 2/5/2017,98.14 34 | 3/5/2017,98.15 35 | 4/5/2017,98.1 36 | 5/5/2017,98.1 37 | 8/5/2017,98.08 38 | 9/5/2017,98.08 39 | 10/5/2017,98.11 40 | 11/5/2017,98.13 41 | 12/5/2017,98.15 42 | 15/5/2017,98.18 43 | 16/5/2017,98.18 44 | 17/5/2017,98.22 45 | 18/5/2017,98.21 46 | 19/5/2017,98.22 47 | 22/5/2017,98.21 48
| 23/5/2017,98.24 49 | 24/5/2017,98.23 50 | 25/5/2017,98.26 51 | 26/5/2017,98.28 52 | 29/5/2017,98.27 53 | 30/5/2017,98.3 54 | 31/5/2017,98.32 55 | 1/6/2017,98.31 56 | 2/6/2017,98.29 57 | 5/6/2017,98.3 58 | 6/6/2017,98.31 59 | 7/6/2017,98.26 60 | 8/6/2017,98.25 61 | 9/6/2017,98.24 62 | 12/6/2017,98.25 63 | 13/6/2017,98.245 64 | 14/6/2017,98.235 65 | 15/6/2017,98.23 66 | 16/6/2017,98.19 67 | 19/6/2017,98.18 68 | 20/6/2017,98.16 69 | 21/6/2017,98.18 70 | 22/6/2017,98.19 71 | 23/6/2017,98.19 72 | 26/6/2017,98.18 73 | 27/6/2017,98.19 74 | 28/6/2017,98.12 75 | 29/6/2017,98.08 76 | 30/6/2017,98.02 77 | 3/7/2017,98.01 78 | 4/7/2017,98.07 79 | 5/7/2017,98.05 80 | -------------------------------------------------------------------------------- /Machine Learning For Finance/Classification Based Machine Learning for Algorithmic Trading/Predict Next Day Return/data/XMA.csv: -------------------------------------------------------------------------------- 1 | Date,XMA 2 | 16/3/2017,97.14 3 | 17/3/2017,97.095 4 | 20/3/2017,97.135 5 | 21/3/2017,97.15 6 | 22/3/2017,97.205 7 | 23/3/2017,97.22 8 | 24/3/2017,97.21 9 | 27/3/2017,97.255 10 | 28/3/2017,97.265 11 | 29/3/2017,97.24 12 | 30/3/2017,97.275 13 | 31/3/2017,97.27 14 | 3/4/2017,97.29 15 | 4/4/2017,97.36 16 | 5/4/2017,97.36 17 | 6/4/2017,97.39 18 | 7/4/2017,97.42 19 | 10/4/2017,97.395 20 | 11/4/2017,97.44 21 | 12/4/2017,97.47 22 | 13/4/2017,97.5 23 | 18/4/2017,97.485 24 | 19/4/2017,97.52 25 | 20/4/2017,97.47 26 | 21/4/2017,97.435 27 | 24/4/2017,97.375 28 | 25/4/2017,97.38 29 | 26/4/2017,97.34 30 | 27/4/2017,97.36 31 | 28/4/2017,97.395 32 | 1/5/2017,97.395 33 | 2/5/2017,97.365 34 | 3/5/2017,97.38 35 | 4/5/2017,97.325 36 | 5/5/2017,97.32 37 | 8/5/2017,97.295 38 | 9/5/2017,97.285 39 | 10/5/2017,97.31 40 | 11/5/2017,97.315 41 | 12/5/2017,97.335 42 | 15/5/2017,97.38 43 | 16/5/2017,97.385 44 | 17/5/2017,97.44 45 | 18/5/2017,97.47 46 | 19/5/2017,97.475 47 | 22/5/2017,97.455 48 | 23/5/2017,97.495 49 | 24/5/2017,97.46 50 | 25/5/2017,97.505 51 | 26/5/2017,97.535 52 | 29/5/2017,97.53 53 | 30/5/2017,97.55 54 | 31/5/2017,97.56 55 | 1/6/2017,97.55 56 | 2/6/2017,97.535 57 | 5/6/2017,97.555 58 | 6/6/2017,97.575 59 | 7/6/2017,97.565 60 | 8/6/2017,97.545 61 | 9/6/2017,97.55 62 | 12/6/2017,97.5425 63 | 13/6/2017,97.555 64 | 14/6/2017,97.5525 65 | 15/6/2017,97.5975 66 | 16/6/2017,97.55 67 | 19/6/2017,97.55 68 | 20/6/2017,97.54 69 | 21/6/2017,97.565 70 | 22/6/2017,97.58 71 | 23/6/2017,97.585 72 | 26/6/2017,97.58 73 | 27/6/2017,97.6 74 | 28/6/2017,97.495 75 | 29/6/2017,97.445 76 | 30/6/2017,97.35 77 | 3/7/2017,97.335 78 | 4/7/2017,97.38 79 | 5/7/2017,97.37 80 | -------------------------------------------------------------------------------- /Machine Learning For Finance/Classification Based Machine Learning for Algorithmic Trading/Predict Next Day Return/spyder_LDA.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | """ 3 | Created on Wed Jul 5 16:51:39 2017 4 | 5 | @author: AnthonyN 6 | 7 | https://www.quantstart.com/articles/Forecasting-Financial-Time-Series-Part-1 8 | 9 | Predicting Price Returns 10 | """ 11 | 12 | import numpy as np 13 | import pandas as pd 14 | lags = 5 15 | start_test = pd.to_datetime('2017-06-18') 16 | 17 | from sklearn.discriminant_analysis import LinearDiscriminantAnalysis as LDA 18 | 19 | ts = pd.read_csv('data\XMA.csv', index_col='Date') 20 | ts.index = pd.to_datetime(ts.index) 21 | tslag = ts[['XMA']].copy() 22 | 23 | for i in range(0,lags): 24 | tslag["Lag_" + str(i+1)] = 
tslag["XMA"].shift(i+1) 25 | tslag["returns"] = tslag["XMA"].pct_change() 26 | 27 | # Create the lagged percentage returns columns 28 | for i in range(0,lags): 29 | tslag["Lag_" + str(i+1)] = tslag["Lag_" + str(i+1)].pct_change() 30 | tslag.fillna(0, inplace=True) 31 | 32 | tslag["Direction"] = np.sign(tslag["returns"]) 33 | # Use the prior two days of returns as predictor values, with direction as the response 34 | X = tslag[["Lag_1", "Lag_2"]] 35 | y = tslag["Direction"] 36 | 37 | # Create training and test sets 38 | X_train = X[X.index < start_test] 39 | X_test = X[X.index >= start_test] 40 | y_train = y[y.index < start_test] 41 | y_test = y[y.index >= start_test] 42 | 43 | # Create prediction DataFrame 44 | pred = pd.DataFrame(index=y_test.index) 45 | 46 | lda = LDA() 47 | lda.fit(X_train, y_train) 48 | y_pred = lda.predict(X_test) 49 | 50 | # pred = (1.0 + y_pred * y_test)/2.0 51 | pred = (1.0 + (y_pred == y_test))/2.0 52 | hit_rate = np.mean(pred) 53 | print('LDA {:.4f}'.format(hit_rate)) 54 | 55 | 56 | 57 | -------------------------------------------------------------------------------- /Machine Learning For Finance/Classification Based Machine Learning for Algorithmic Trading/Predict Next Day Return/spyder_LogisticRegression.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | """ 3 | Created on Wed Jul 5 16:51:39 2017 4 | 5 | @author: AnthonyN 6 | 7 | https://www.quantstart.com/articles/Forecasting-Financial-Time-Series-Part-1 8 | 9 | Predicting Price Returns 10 | """ 11 | 12 | import numpy as np 13 | import pandas as pd 14 | lags = 5 15 | start_test = pd.to_datetime('2017-06-18') 16 | 17 | from sklearn.linear_model import LogisticRegression 18 | 19 | ts = pd.read_csv('data\XMA.csv', index_col='Date') 20 | ts.index = pd.to_datetime(ts.index) 21 | tslag = ts[['XMA']].copy() 22 | 23 | for i in range(0,lags): 24 | tslag["Lag_" + str(i+1)] = tslag["XMA"].shift(i+1) 25 | tslag["returns"] = tslag["XMA"].pct_change() 26 | 27 | # Create the lagged percentage returns columns 28 | for i in range(0,lags): 29 | tslag["Lag_" + str(i+1)] = tslag["Lag_" + str(i+1)].pct_change() 30 | tslag.fillna(0, inplace=True) 31 | 32 | 33 | tslag["Direction"] = np.sign(tslag["returns"]) 34 | # Use the prior two days of returns as predictor values, with direction as the response 35 | X = tslag[["Lag_1", "Lag_2"]] 36 | y = tslag["Direction"] 37 | 38 | # Create training and test sets 39 | X_train = X[X.index < start_test] 40 | X_test = X[X.index >= start_test] 41 | y_train = y[y.index < start_test] 42 | y_test = y[y.index >= start_test] 43 | 44 | # Create prediction DataFrame 45 | pred = pd.DataFrame(index=y_test.index) 46 | 47 | lr = LogisticRegression() 48 | lr.fit(X_train, y_train) 49 | y_pred = lr.predict(X_test) 50 | 51 | # pred = (1.0 + y_pred * y_test)/2.0 52 | pred = (1.0 + (y_pred == y_test))/2.0 53 | hit_rate = np.mean(pred) 54 | print('Logistic Regresstion {:.4f}'.format(hit_rate)) -------------------------------------------------------------------------------- /Machine Learning For Finance/Classification Based Machine Learning for Algorithmic Trading/default_predictions/Kernel SVM.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | """ 3 | Created on Sun Jun 25 22:02:07 2017 4 | 5 | @author: Anthony 6 | """ 7 | import numpy as np 8 | import pandas as pd 9 | df = pd.read_csv("dataset_2.csv") 10 | 11 | df['default'].describe() 12 | sum(df['default'] == 0) 13 | 
sum(df['default'] == 1) 14 | 15 | X = df.iloc[:, 1:6].values 16 | y = df['default'].values 17 | 18 | 19 | # Splitting the dataset into the Training set and Test set 20 | from sklearn.model_selection import train_test_split 21 | X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, 22 | random_state=0) 23 | 24 | shuffle_index = np.random.permutation(len(X_train)) 25 | X_train, y_train = X_train[shuffle_index], y_train[shuffle_index] 26 | 27 | 28 | # Feature Scaling 29 | from sklearn.preprocessing import StandardScaler 30 | sc_X = StandardScaler() 31 | X_train = sc_X.fit_transform(X_train) 32 | X_test = sc_X.transform(X_test) 33 | 34 | from sklearn.svm import SVC 35 | clf = SVC(kernel = 'rbf', random_state=0) 36 | clf.fit(X_train, y_train) 37 | 38 | # Cross Validation 39 | from sklearn.model_selection import cross_val_score 40 | from sklearn.metrics import confusion_matrix 41 | from sklearn.model_selection import cross_val_predict 42 | cross_val_score(clf, X_train, y_train, cv=3, scoring='accuracy') 43 | y_train_pred = cross_val_predict(clf, X_train, y_train, cv=3) 44 | cm = confusion_matrix(y_train, y_train_pred) 45 | print(cm) 46 | 47 | from sklearn.metrics import precision_score, recall_score 48 | print("precision score = {0:.4f}".format(precision_score(y_train, y_train_pred))) 49 | print("recall score = {0:.4f}".format(recall_score(y_train, y_train_pred))) 50 | -------------------------------------------------------------------------------- /Machine Learning For Finance/Classification Based Machine Learning for Algorithmic Trading/default_predictions/Linear SVM.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | """ 3 | Created on Sun Jun 25 22:02:07 2017 4 | 5 | @author: Anthony 6 | """ 7 | import numpy as np 8 | import pandas as pd 9 | df = pd.read_csv("dataset_2.csv") 10 | 11 | df['default'].describe() 12 | sum(df['default'] == 0) 13 | sum(df['default'] == 1) 14 | 15 | X = df.iloc[:, 1:6].values 16 | y = df['default'].values 17 | 18 | 19 | # Splitting the dataset into the Training set and Test set 20 | from sklearn.model_selection import train_test_split 21 | X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, 22 | random_state=0) 23 | 24 | shuffle_index = np.random.permutation(len(X_train)) 25 | X_train, y_train = X_train[shuffle_index], y_train[shuffle_index] 26 | 27 | 28 | # Feature Scaling 29 | from sklearn.preprocessing import StandardScaler 30 | sc_X = StandardScaler() 31 | X_train = sc_X.fit_transform(X_train) 32 | X_test = sc_X.transform(X_test) 33 | 34 | from sklearn.svm import SVC 35 | clf = SVC(kernel = 'linear', random_state=0) 36 | clf.fit(X_train, y_train) 37 | 38 | # Cross Validation 39 | from sklearn.model_selection import cross_val_score 40 | from sklearn.metrics import confusion_matrix 41 | from sklearn.model_selection import cross_val_predict 42 | cross_val_score(clf, X_train, y_train, cv=3, scoring='accuracy') 43 | y_train_pred = cross_val_predict(clf, X_train, y_train, cv=3) 44 | cm = confusion_matrix(y_train, y_train_pred) 45 | print(cm) 46 | 47 | from sklearn.metrics import precision_score, recall_score 48 | print("precision score = {0:.4f}".format(precision_score(y_train, y_train_pred))) 49 | print("recall score = {0:.4f}".format(recall_score(y_train, y_train_pred))) 50 | -------------------------------------------------------------------------------- /Machine Learning For Finance/Classification Based Machine Learning for Algorithmic 
Trading/default_predictions/Naive Bayes.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | """ 3 | Created on Sun Jun 25 22:02:07 2017 4 | 5 | @author: Anthony 6 | """ 7 | import numpy as np 8 | import pandas as pd 9 | df = pd.read_csv("dataset_2.csv") 10 | 11 | df['default'].describe() 12 | print(sum(df['default'] == 0)) 13 | print(sum(df['default'] == 1)) 14 | 15 | X = df.iloc[:, 1:6].values 16 | y = df['default'].values 17 | 18 | 19 | # Splitting the dataset into the Training set and Test set 20 | from sklearn.model_selection import train_test_split 21 | X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, 22 | random_state=0) 23 | 24 | shuffle_index = np.random.permutation(len(X_train)) 25 | X_train, y_train = X_train[shuffle_index], y_train[shuffle_index] 26 | 27 | 28 | # Feature Scaling 29 | from sklearn.preprocessing import StandardScaler 30 | sc_X = StandardScaler() 31 | X_train = sc_X.fit_transform(X_train) 32 | X_test = sc_X.transform(X_test) 33 | 34 | from sklearn.naive_bayes import GaussianNB 35 | clf = GaussianNB() 36 | clf.fit(X_train, y_train) 37 | 38 | # Cross Validation 39 | from sklearn.model_selection import cross_val_score 40 | from sklearn.metrics import confusion_matrix 41 | from sklearn.model_selection import cross_val_predict 42 | cross_val_score(clf, X_train, y_train, cv=3, scoring='accuracy') 43 | y_train_pred = cross_val_predict(clf, X_train, y_train, cv=3) 44 | cm = confusion_matrix(y_train, y_train_pred) 45 | print(cm) 46 | 47 | from sklearn.metrics import precision_score, recall_score 48 | print("precision score = {0:.4f}".format(precision_score(y_train, y_train_pred))) 49 | print("recall score = {0:.4f}".format(recall_score(y_train, y_train_pred))) 50 | -------------------------------------------------------------------------------- /Machine Learning For Finance/Classification Based Machine Learning for Algorithmic Trading/default_predictions/SGDClassifier.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | """ 3 | Created on Sun Jun 25 22:02:07 2017 4 | 5 | @author: Anthony 6 | """ 7 | import numpy as np 8 | import pandas as pd 9 | df = pd.read_csv("dataset_2.csv") 10 | 11 | df['default'].describe() 12 | sum(df['default'] == 0) 13 | sum(df['default'] == 1) 14 | 15 | X = df.iloc[:, 1:6].values 16 | y = df['default'].values 17 | 18 | 19 | # Splitting the dataset into the Training set and Test set 20 | from sklearn.model_selection import train_test_split 21 | X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, 22 | random_state=0) 23 | 24 | shuffle_index = np.random.permutation(len(X_train)) 25 | X_train, y_train = X_train[shuffle_index], y_train[shuffle_index] 26 | 27 | 28 | # Feature Scaling 29 | from sklearn.preprocessing import StandardScaler 30 | sc_X = StandardScaler() 31 | X_train = sc_X.fit_transform(X_train) 32 | X_test = sc_X.transform(X_test) 33 | 34 | from sklearn import linear_model 35 | clf = linear_model.SGDClassifier(random_state=0) 36 | clf.fit(X_train, y_train) 37 | 38 | # Cross Validation 39 | from sklearn.model_selection import cross_val_score 40 | from sklearn.metrics import confusion_matrix 41 | from sklearn.model_selection import cross_val_predict 42 | cross_val_score(clf, X_train, y_train, cv=3, scoring='accuracy') 43 | y_train_pred = cross_val_predict(clf, X_train, y_train, cv=3) 44 | cm = confusion_matrix(y_train, y_train_pred) 45 | print(cm) 46 | 47 | from
sklearn.metrics import precision_score, recall_score 48 | print("precision score = {0:.4f}".format(precision_score(y_train, y_train_pred))) 49 | print("recall score = {0:.4f}".format(recall_score(y_train, y_train_pred))) 50 | -------------------------------------------------------------------------------- /Machine Learning For Finance/Classification Based Machine Learning for Algorithmic Trading/default_predictions/PAC.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | """ 3 | Created on Sun Jun 25 22:02:07 2017 4 | 5 | @author: Anthony 6 | """ 7 | import numpy as np 8 | import pandas as pd 9 | df = pd.read_csv("dataset_2.csv") 10 | 11 | df['default'].describe() 12 | sum(df['default'] == 0) 13 | sum(df['default'] == 1) 14 | 15 | X = df.iloc[:, 1:6].values 16 | y = df['default'].values 17 | 18 | 19 | # Splitting the dataset into the Training set and Test set 20 | from sklearn.model_selection import train_test_split 21 | X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, 22 | random_state=0) 23 | 24 | shuffle_index = np.random.permutation(len(X_train)) 25 | X_train, y_train = X_train[shuffle_index], y_train[shuffle_index] 26 | 27 | 28 | # Feature Scaling 29 | from sklearn.preprocessing import StandardScaler 30 | sc_X = StandardScaler() 31 | X_train = sc_X.fit_transform(X_train) 32 | X_test = sc_X.transform(X_test) 33 | 34 | from sklearn import linear_model 35 | clf = linear_model.PassiveAggressiveClassifier(random_state=0) 36 | clf.fit(X_train, y_train) 37 | 38 | # Cross Validation 39 | from sklearn.model_selection import cross_val_score 40 | from sklearn.metrics import confusion_matrix 41 | from sklearn.model_selection import cross_val_predict 42 | cross_val_score(clf, X_train, y_train, cv=3, scoring='accuracy') 43 | y_train_pred = cross_val_predict(clf, X_train, y_train, cv=3) 44 | cm = confusion_matrix(y_train, y_train_pred) 45 | print(cm) 46 | 47 | from sklearn.metrics import precision_score, recall_score 48 | print("precision score = {0:.4f}".format(precision_score(y_train, y_train_pred))) 49 | print("recall score = {0:.4f}".format(recall_score(y_train, y_train_pred))) 50 | -------------------------------------------------------------------------------- /Machine Learning For Finance/Classification Based Machine Learning for Algorithmic Trading/default_predictions/GPC.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | """ 3 | Created on Sun Jun 25 22:02:07 2017 4 | 5 | @author: Anthony 6 | """ 7 | import numpy as np 8 | import pandas as pd 9 | df = pd.read_csv("dataset_2.csv") 10 | 11 | df['default'].describe() 12 | sum(df['default'] == 0) 13 | sum(df['default'] == 1) 14 | 15 | X = df.iloc[:, 1:6].values 16 | y = df['default'].values 17 | 18 | 19 | # Splitting the dataset into the Training set and Test set 20 | from sklearn.model_selection import train_test_split 21 | X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, 22 | random_state=0) 23 | 24 | shuffle_index = np.random.permutation(len(X_train)) 25 | X_train, y_train = X_train[shuffle_index], y_train[shuffle_index] 26 | 27 | 28 | # Feature Scaling 29 | from sklearn.preprocessing import StandardScaler 30 | sc_X = StandardScaler() 31 | X_train = sc_X.fit_transform(X_train) 32 | X_test = sc_X.transform(X_test) 33 | 34 | from sklearn import gaussian_process 35 | clf = gaussian_process.GaussianProcessClassifier(random_state=0) 36 | clf.fit(X_train, y_train) 37 | 
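# note: GaussianProcessClassifier is kernel-based and its training cost grows roughly cubically with the number of samples, so it is only practical on fairly small datasets; the evaluation below reuses the same template as the other scripts in this folder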
38 | # Cross Validation 39 | from sklearn.model_selection import cross_val_score 40 | from sklearn.metrics import confusion_matrix 41 | from sklearn.model_selection import cross_val_predict 42 | cross_val_score(clf, X_train, y_train, cv=3, scoring='accuracy') 43 | y_train_pred = cross_val_predict(clf, X_train, y_train, cv=3) 44 | cm = confusion_matrix(y_train, y_train_pred) 45 | print(cm) 46 | 47 | from sklearn.metrics import precision_score, recall_score 48 | print("precision score = {0:.4f}".format(precision_score(y_train, y_train_pred))) 49 | print("recall score = {0:.4f}".format(recall_score(y_train, y_train_pred))) 50 | -------------------------------------------------------------------------------- /Machine Learning For Finance/Classification Based Machine Learning for Algorithmic Trading/default_predictions/QDA.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | """ 3 | Created on Sun Jun 25 22:02:07 2017 4 | 5 | @author: Anthony 6 | """ 7 | import numpy as np 8 | import pandas as pd 9 | df = pd.read_csv("dataset_2.csv") 10 | 11 | df['default'].describe() 12 | sum(df['default'] == 0) 13 | sum(df['default'] == 1) 14 | 15 | X = df.iloc[:, 1:6].values 16 | y = df['default'].values 17 | 18 | 19 | # Splitting the dataset into the Training set and Test set 20 | from sklearn.model_selection import train_test_split 21 | X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, 22 | random_state=0) 23 | 24 | shuffle_index = np.random.permutation(len(X_train)) 25 | X_train, y_train = X_train[shuffle_index], y_train[shuffle_index] 26 | 27 | 28 | # Feature Scaling 29 | from sklearn.preprocessing import StandardScaler 30 | sc_X = StandardScaler() 31 | X_train = sc_X.fit_transform(X_train) 32 | X_test = sc_X.transform(X_test) 33 | 34 | from sklearn import discriminant_analysis 35 | clf = discriminant_analysis.QuadraticDiscriminantAnalysis() 36 | clf.fit(X_train, y_train) 37 | 38 | # Cross Validation 39 | from sklearn.model_selection import cross_val_score 40 | from sklearn.metrics import confusion_matrix 41 | from sklearn.model_selection import cross_val_predict 42 | cross_val_score(clf, X_train, y_train, cv=3, scoring='accuracy') 43 | y_train_pred = cross_val_predict(clf, X_train, y_train, cv=3) 44 | cm = confusion_matrix(y_train, y_train_pred) 45 | print(cm) 46 | 47 | from sklearn.metrics import precision_score, recall_score 48 | print("precision score = {0:.4f}".format(precision_score(y_train, y_train_pred))) 49 | print("recall score = {0:.4f}".format(recall_score(y_train, y_train_pred))) 50 | -------------------------------------------------------------------------------- /Machine Learning For Finance/Classification Based Machine Learning for Algorithmic Trading/Predict Next Day Return/spyder_QDA.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | """ 3 | Created on Wed Jul 5 16:51:39 2017 4 | 5 | @author: AnthonyN 6 | 7 | https://www.quantstart.com/articles/Forecasting-Financial-Time-Series-Part-1 8 | 9 | Predicting Price Returns 10 | """ 11 | 12 | import numpy as np 13 | import pandas as pd 14 | lags = 5 15 | start_test = pd.to_datetime('2017-06-18') 16 | 17 | from sklearn.discriminant_analysis import QuadraticDiscriminantAnalysis as QDA 18 | 19 | ts = pd.read_csv('data\XMA.csv', index_col='Date') 20 | ts.index = pd.to_datetime(ts.index) 21 | tslag = ts[['XMA']].copy() 22 | 23 | for i in range(0,lags): 24 | tslag["Lag_" + str(i+1)] = 
tslag["XMA"].shift(i+1) 25 | tslag["returns"] = tslag["XMA"].pct_change() 26 | 27 | # Create the lagged percentage returns columns 28 | for i in range(0,lags): 29 | tslag["Lag_" + str(i+1)] = tslag["Lag_" + str(i+1)].pct_change() 30 | tslag.fillna(0, inplace=True) 31 | 32 | 33 | tslag["Direction"] = np.sign(tslag["returns"]) 34 | # Use the prior two days of returns as predictor values, with direction as the response 35 | X = tslag[["Lag_1", "Lag_2"]] 36 | y = tslag["Direction"] 37 | 38 | # Create training and test sets 39 | X_train = X[X.index < start_test] 40 | X_test = X[X.index >= start_test] 41 | y_train = y[y.index < start_test] 42 | y_test = y[y.index >= start_test] 43 | 44 | # Create prediction DataFrame 45 | pred = pd.DataFrame(index=y_test.index) 46 | 47 | qda = QDA() 48 | qda.fit(X_train, y_train) 49 | y_pred = qda.predict(X_test) 50 | 51 | # pred = (1.0 + y_pred * y_test)/2.0 52 | pred = (1.0 + (y_pred == y_test))/2.0 53 | hit_rate = np.mean(pred) 54 | print('QDA {:.4f}'.format(hit_rate)) 55 | 56 | 57 | 58 | 59 | 60 | 61 | 62 | 63 | 64 | 65 | 66 | 67 | 68 | -------------------------------------------------------------------------------- /Machine Learning For Finance/Classification Based Machine Learning for Algorithmic Trading/default_predictions/Decision Tree.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | """ 3 | Created on Sun Jun 25 22:02:07 2017 4 | 5 | @author: Anthony 6 | """ 7 | import numpy as np 8 | import pandas as pd 9 | df = pd.read_csv("dataset_2.csv") 10 | 11 | df['default'].describe() 12 | sum(df['default'] == 0) 13 | sum(df['default'] == 1) 14 | 15 | X = df.iloc[:, 1:6].values 16 | y = df['default'].values 17 | 18 | 19 | # Splitting the dataset into the Training set and Test set 20 | from sklearn.model_selection import train_test_split 21 | X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, 22 | random_state=0) 23 | 24 | shuffle_index = np.random.permutation(len(X_train)) 25 | X_train, y_train = X_train[shuffle_index], y_train[shuffle_index] 26 | 27 | 28 | # Feature Scaling 29 | from sklearn.preprocessing import StandardScaler 30 | sc_X = StandardScaler() 31 | X_train = sc_X.fit_transform(X_train) 32 | X_test = sc_X.transform(X_test) 33 | 34 | from sklearn.tree import DecisionTreeClassifier 35 | clf = DecisionTreeClassifier(criterion='entropy',random_state=0) 36 | clf.fit(X_train, y_train) 37 | 38 | # Cross Validation 39 | from sklearn.model_selection import cross_val_score 40 | from sklearn.metrics import confusion_matrix 41 | from sklearn.model_selection import cross_val_predict 42 | cross_val_score(clf, X_train, y_train, cv=3, scoring='accuracy') 43 | y_train_pred = cross_val_predict(clf, X_train, y_train, cv=3) 44 | cm = confusion_matrix(y_train, y_train_pred) 45 | print(cm) 46 | 47 | from sklearn.metrics import precision_score, recall_score 48 | print("precision score = {0:.4f}".format(precision_score(y_train, y_train_pred))) 49 | print("recall score = {0:.4f}".format(recall_score(y_train, y_train_pred))) 50 | -------------------------------------------------------------------------------- /Machine Learning For Finance/Classification Based Machine Learning for Algorithmic Trading/default_predictions/Logistic regression.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | """ 3 | Created on Thu Aug 17 19:12:52 2017 4 | 5 | @author: Anthony 6 | 7 | Default Predictions 8 | """ 9 | 10 | import numpy as np 11 
| import pandas as pd 12 | df = pd.read_csv("dataset_2.csv") 13 | 14 | df['default'].describe() 15 | sum(df['default'] == 0) 16 | sum(df['default'] == 1) 17 | 18 | X = df.iloc[:, 1:6].values 19 | y = df['default'].values 20 | 21 | # Splitting the dataset into the Training set and Test set 22 | from sklearn.model_selection import train_test_split 23 | X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, 24 | random_state=0) 25 | 26 | shuffle_index = np.random.permutation(len(X_train)) 27 | X_train, y_train = X_train[shuffle_index], y_train[shuffle_index] 28 | 29 | # Feature Scaling 30 | from sklearn.preprocessing import StandardScaler 31 | sc_X = StandardScaler() 32 | X_train = sc_X.fit_transform(X_train) 33 | X_test = sc_X.transform(X_test) 34 | 35 | # Machine Learning Algorithm 36 | from sklearn import linear_model 37 | clf = linear_model.LogisticRegression(random_state=0) 38 | clf.fit(X_train, y_train) 39 | 40 | # Cross Validation 41 | from sklearn.model_selection import cross_val_score 42 | from sklearn.metrics import confusion_matrix 43 | from sklearn.model_selection import cross_val_predict 44 | cross_val_score(clf, X_train, y_train, cv=3, scoring='accuracy') 45 | y_train_pred = cross_val_predict(clf, X_train, y_train, cv=3) 46 | cm = confusion_matrix(y_train, y_train_pred) 47 | print(cm) 48 | 49 | from sklearn.metrics import precision_score, recall_score 50 | print("precision score = {0:.4f}".format(precision_score(y_train, y_train_pred))) 51 | print("recall score = {0:.4f}".format(recall_score(y_train, y_train_pred))) 52 | 53 | # Predicting the Test set results 54 | y_pred = clf.predict(X_test) 55 | 56 | # Making the Confusion Matrix 57 | from sklearn.metrics import confusion_matrix 58 | cm = confusion_matrix(y_test, y_pred) 59 | print(cm) 60 | print("precision score = {0:.4f}".format(precision_score(y_test, y_pred))) 61 | print("recall score = {0:.4f}".format(recall_score(y_test, y_pred))) -------------------------------------------------------------------------------- /Machine Learning For Finance/Classification Based Machine Learning for Algorithmic Trading/default_predictions/Template.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | """ 3 | Created on Thu Aug 17 19:12:52 2017 4 | 5 | @author: Anthony 6 | 7 | Default Predictions 8 | """ 9 | 10 | import numpy as np 11 | import pandas as pd 12 | df = pd.read_csv("dataset_2.csv") 13 | 14 | df['default'].describe() 15 | sum(df['default'] == 0) 16 | sum(df['default'] == 1) 17 | 18 | X = df.iloc[:, 1:6].values 19 | y = df['default'].values 20 | 21 | # Splitting the dataset into the Training set and Test set 22 | from sklearn.model_selection import train_test_split 23 | X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, 24 | random_state=0) 25 | 26 | shuffle_index = np.random.permutation(len(X_train)) 27 | X_train, y_train = X_train[shuffle_index], y_train[shuffle_index] 28 | 29 | # Feature Scaling 30 | from sklearn.preprocessing import StandardScaler 31 | sc_X = StandardScaler() 32 | X_train = sc_X.fit_transform(X_train) 33 | X_test = sc_X.transform(X_test) 34 | 35 | # Machine Learning Algorithm 36 | from sklearn import linear_model 37 | clf = linear_model.LogisticRegression(random_state=0) 38 | clf.fit(X_train, y_train) 39 | 40 | # Cross Validation 41 | from sklearn.model_selection import cross_val_score 42 | from sklearn.metrics import confusion_matrix 43 | from sklearn.model_selection import cross_val_predict 44 | 
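# note: the bare cross_val_score(...) expression below returns the three per-fold accuracies; they are echoed only when the script is run line-by-line in an IPython/Spyder console, so wrap the call in print(...) to see them in a plain run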
cross_val_score(clf, X_train, y_train, cv=3, scoring='accuracy') 45 | y_train_pred = cross_val_predict(clf, X_train, y_train, cv=3) 46 | cm = confusion_matrix(y_train, y_train_pred) 47 | print(cm) 48 | 49 | from sklearn.metrics import precision_score, recall_score 50 | print("precision score = {0:.4f}".format(precision_score(y_train, y_train_pred))) 51 | print("recall score = {0:.4f}".format(recall_score(y_train, y_train_pred))) 52 | 53 | # Predicting the Test set results 54 | y_pred = clf.predict(X_test) 55 | 56 | # Making the Confusion Matrix 57 | from sklearn.metrics import confusion_matrix 58 | cm = confusion_matrix(y_test, y_pred) 59 | print(cm) 60 | print("precision score = {0:.4f}".format(precision_score(y_test, y_pred))) 61 | print("recall score = {0:.4f}".format(recall_score(y_test, y_pred))) 62 | 63 | ''' 64 | Logistic Regression Training set: 65 | precision score = 0.7788 66 | recall score = 0.8351 67 | 68 | Logistic Regression Test set: 69 | precision score = 0.8158 70 | recall score = 0.8378 71 | ''' 72 | 73 | 74 | 75 | -------------------------------------------------------------------------------- /Machine Learning For Finance/Classification Based Machine Learning for Algorithmic Trading/default_predictions/LDA.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | """ 3 | Created on Thu Aug 17 19:12:52 2017 4 | 5 | @author: Anthony 6 | 7 | Default Predictions 8 | """ 9 | 10 | import numpy as np 11 | import pandas as pd 12 | df = pd.read_csv("dataset_2.csv") 13 | 14 | df['default'].describe() 15 | sum(df['default'] == 0) 16 | sum(df['default'] == 1) 17 | 18 | X = df.iloc[:, 1:6].values 19 | y = df['default'].values 20 | 21 | # Splitting the dataset into the Training set and Test set 22 | from sklearn.model_selection import train_test_split 23 | X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, 24 | random_state=0) 25 | 26 | shuffle_index = np.random.permutation(len(X_train)) 27 | X_train, y_train = X_train[shuffle_index], y_train[shuffle_index] 28 | 29 | # Feature Scaling 30 | from sklearn.preprocessing import StandardScaler 31 | sc_X = StandardScaler() 32 | X_train = sc_X.fit_transform(X_train) 33 | X_test = sc_X.transform(X_test) 34 | 35 | # Machine Learning Algorithm 36 | from sklearn import discriminant_analysis 37 | clf = discriminant_analysis.LinearDiscriminantAnalysis() 38 | clf.fit(X_train, y_train) 39 | 40 | # Cross Validation 41 | from sklearn.model_selection import cross_val_score 42 | from sklearn.metrics import confusion_matrix 43 | from sklearn.model_selection import cross_val_predict 44 | cross_val_score(clf, X_train, y_train, cv=3, scoring='accuracy') 45 | y_train_pred = cross_val_predict(clf, X_train, y_train, cv=3) 46 | cm = confusion_matrix(y_train, y_train_pred) 47 | print(cm) 48 | 49 | from sklearn.metrics import precision_score, recall_score 50 | print("precision score = {0:.4f}".format(precision_score(y_train, y_train_pred))) 51 | print("recall score = {0:.4f}".format(recall_score(y_train, y_train_pred))) 52 | 53 | # Predicting the Test set results 54 | y_pred = clf.predict(X_test) 55 | 56 | # Making the Confusion Matrix 57 | from sklearn.metrics import confusion_matrix 58 | cm = confusion_matrix(y_test, y_pred) 59 | print(cm) 60 | print("precision score = {0:.4f}".format(precision_score(y_test, y_pred))) 61 | print("recall score = {0:.4f}".format(recall_score(y_test, y_pred))) 62 | 63 | ''' 64 | Logistic Regression Training set: 65 | precision score = 0.7788 66 | 
recall score = 0.8351 67 | 68 | Logistic Regression Test set: 69 | precision score = 0.8158 70 | recall score = 0.8378 71 | ''' 72 | 73 | 74 | 75 | -------------------------------------------------------------------------------- /Machine Learning For Finance/Classification Based Machine Learning for Algorithmic Trading/default_predictions/KNN.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | """ 3 | Created on Thu Aug 17 19:12:52 2017 4 | 5 | @author: Anthony 6 | 7 | Default Predictions 8 | """ 9 | 10 | import numpy as np 11 | import pandas as pd 12 | df = pd.read_csv("dataset_2.csv") 13 | 14 | df['default'].describe() 15 | sum(df['default'] == 0) 16 | sum(df['default'] == 1) 17 | 18 | X = df.iloc[:, 1:6].values 19 | y = df['default'].values 20 | 21 | # Splitting the dataset into the Training set and Test set 22 | from sklearn.model_selection import train_test_split 23 | X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, 24 | random_state=0) 25 | 26 | shuffle_index = np.random.permutation(len(X_train)) 27 | X_train, y_train = X_train[shuffle_index], y_train[shuffle_index] 28 | 29 | # Feature Scaling 30 | from sklearn.preprocessing import StandardScaler 31 | sc_X = StandardScaler() 32 | X_train = sc_X.fit_transform(X_train) 33 | X_test = sc_X.transform(X_test) 34 | 35 | # Machine Learning Algorithm 36 | from sklearn import neighbors 37 | clf = neighbors.KNeighborsClassifier(n_neighbors=2, algorithm='ball_tree') 38 | clf.fit(X_train, y_train) 39 | 40 | # Cross Validation 41 | from sklearn.model_selection import cross_val_score 42 | from sklearn.metrics import confusion_matrix 43 | from sklearn.model_selection import cross_val_predict 44 | cross_val_score(clf, X_train, y_train, cv=3, scoring='accuracy') 45 | y_train_pred = cross_val_predict(clf, X_train, y_train, cv=3) 46 | cm = confusion_matrix(y_train, y_train_pred) 47 | print(cm) 48 | 49 | from sklearn.metrics import precision_score, recall_score 50 | print("precision score = {0:.4f}".format(precision_score(y_train, y_train_pred))) 51 | print("recall score = {0:.4f}".format(recall_score(y_train, y_train_pred))) 52 | 53 | # Predicting the Test set results 54 | y_pred = clf.predict(X_test) 55 | 56 | # Making the Confusion Matrix 57 | from sklearn.metrics import confusion_matrix 58 | cm = confusion_matrix(y_test, y_pred) 59 | print(cm) 60 | print("precision score = {0:.4f}".format(precision_score(y_test, y_pred))) 61 | print("recall score = {0:.4f}".format(recall_score(y_test, y_pred))) 62 | 63 | ''' 64 | Logistic Regression Training set: 65 | precision score = 0.7788 66 | recall score = 0.8351 67 | 68 | Logistic Regression Test set: 69 | precision score = 0.8158 70 | recall score = 0.8378 71 | ''' 72 | 73 | 74 | 75 | -------------------------------------------------------------------------------- /Machine Learning For Finance/Classification Based Machine Learning for Algorithmic Trading/default_predictions/Random forest.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | """ 3 | Created on Thu Aug 17 19:12:52 2017 4 | 5 | @author: Anthony 6 | 7 | Default Predictions 8 | """ 9 | 10 | import numpy as np 11 | import pandas as pd 12 | df = pd.read_csv("dataset_2.csv") 13 | 14 | df['default'].describe() 15 | sum(df['default'] == 0) 16 | sum(df['default'] == 1) 17 | 18 | X = df.iloc[:, 1:6].values 19 | y = df['default'].values 20 | 21 | # Splitting the dataset into the Training set and Test set 
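# note: test_size=0.25 with random_state=0 reproduces the same 75/25 split used by every script in this folder, keeping results comparable across models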
22 | from sklearn.model_selection import train_test_split 23 | X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, 24 | random_state=0) 25 | 26 | shuffle_index = np.random.permutation(len(X_train)) 27 | X_train, y_train = X_train[shuffle_index], y_train[shuffle_index] 28 | 29 | # Feature Scaling 30 | from sklearn.preprocessing import StandardScaler 31 | sc_X = StandardScaler() 32 | X_train = sc_X.fit_transform(X_train) 33 | X_test = sc_X.transform(X_test) 34 | 35 | # Machine Learning Algorithm 36 | from sklearn.ensemble import RandomForestClassifier 37 | clf = RandomForestClassifier(random_state=0) 38 | clf.fit(X_train, y_train) 39 | 40 | # Cross Validation 41 | from sklearn.model_selection import cross_val_score 42 | from sklearn.metrics import confusion_matrix 43 | from sklearn.model_selection import cross_val_predict 44 | cross_val_score(clf, X_train, y_train, cv=3, scoring='accuracy') 45 | y_train_pred = cross_val_predict(clf, X_train, y_train, cv=3) 46 | cm = confusion_matrix(y_train, y_train_pred) 47 | print(cm) 48 | 49 | from sklearn.metrics import precision_score, recall_score 50 | print("precision score = {0:.4f}".format(precision_score(y_train, y_train_pred))) 51 | print("recall score = {0:.4f}".format(recall_score(y_train, y_train_pred))) 52 | 53 | # Predicting the Test set results 54 | y_pred = clf.predict(X_test) 55 | 56 | # Making the Confusion Matrix 57 | from sklearn.metrics import confusion_matrix 58 | cm = confusion_matrix(y_test, y_pred) 59 | print(cm) 60 | print("precision score = {0:.4f}".format(precision_score(y_test, y_pred))) 61 | print("recall score = {0:.4f}".format(recall_score(y_test, y_pred))) 62 | 63 | ''' 64 | Logistic Regression Training set: 65 | precision score = 0.7788 66 | recall score = 0.8351 67 | 68 | Logistic Regression Test set: 69 | precision score = 0.8158 70 | recall score = 0.8378 71 | ''' 72 | 73 | 74 | 75 | -------------------------------------------------------------------------------- /Classical Use Cases/Multimodal Transportation Optimization/Solution.txt: -------------------------------------------------------------------------------- 1 | Solution 2 | Number of goods: 8 3 | Total cost: 196959.0 4 | Transportation cost: 6645.0 5 | Warehouse cost: 1410.0 6 | Tax cost: 188904.0 7 | ------------------------------------ 8 | Goods-1 Category: Honey 9 | Start date: 2018-02-01 10 | Arrival date: 2018-02-12 11 | Route: 12 | (1)Date: 2018-02-01 From: Singapore Warehouse To: Malaysia Warehouse By: Truck 13 | (2)Date: 2018-02-02 From: Malaysia Warehouse To: Malaysia Port By: Truck 14 | (3)Date: 2018-02-03 From: Malaysia Port To: Shanghai Port By: Sea 15 | (4)Date: 2018-02-10 From: Shanghai Port To: Shanghai Warehouse By: Truck 16 | (5)Date: 2018-02-11 From: Shanghai Warehouse To: Wuxi Warehouse By: Truck 17 | ------------------------------------ 18 | Goods-2 Category: Furniture 19 | Start date: 2018-02-02 20 | Arrival date: 2018-02-11 21 | Route: 22 | (1)Date: 2018-02-02 From: Malaysia Warehouse To: Malaysia Port By: Truck 23 | (2)Date: 2018-02-03 From: Malaysia Port To: Shanghai Port By: Sea 24 | (3)Date: 2018-02-10 From: Shanghai Port To: Shanghai Warehouse By: Truck 25 | ------------------------------------ 26 | Goods-3 Category: Paper plates 27 | Start date: 2018-02-03 28 | Arrival date: 2018-02-15 29 | Route: 30 | (1)Date: 2018-02-03 From: Singapore Warehouse To: Malaysia Warehouse By: Truck 31 | (2)Date: 2018-02-06 From: Malaysia Warehouse To: Malaysia Port By: Truck 32 | (3)Date: 2018-02-07 From: Malaysia Port To: Shanghai 
Port By: Sea 33 | (4)Date: 2018-02-14 From: Shanghai Port To: Shanghai Warehouse By: Truck 34 | ------------------------------------ 35 | Goods-4 Category: Pharmaceutical drugs 36 | Start date: 2018-02-04 37 | Arrival date: 2018-02-15 38 | Route: 39 | (1)Date: 2018-02-04 From: Singapore Warehouse To: Malaysia Warehouse By: Truck 40 | (2)Date: 2018-02-06 From: Malaysia Warehouse To: Malaysia Port By: Truck 41 | (3)Date: 2018-02-07 From: Malaysia Port To: Shanghai Port By: Sea 42 | (4)Date: 2018-02-14 From: Shanghai Port To: Shanghai Warehouse By: Truck 43 | ------------------------------------ 44 | Goods-5 Category: Cigarette 45 | Start date: 2018-02-05 46 | Arrival date: 2018-02-15 47 | Route: 48 | (1)Date: 2018-02-05 From: Wuxi Warehouse To: Shanghai Warehouse By: Truck 49 | (2)Date: 2018-02-06 From: Shanghai Warehouse To: Shanghai Port By: Truck 50 | (3)Date: 2018-02-07 From: Shanghai Port To: Malaysia Port By: Sea 51 | (4)Date: 2018-02-14 From: Malaysia Port To: Malaysia Warehouse By: Truck 52 | ------------------------------------ 53 | Goods-6 Category: Apple 54 | Start date: 2018-02-06 55 | Arrival date: 2018-02-16 56 | Route: 57 | (1)Date: 2018-02-06 From: Shanghai Warehouse To: Shanghai Port By: Truck 58 | (2)Date: 2018-02-07 From: Shanghai Port To: Malaysia Port By: Sea 59 | (3)Date: 2018-02-14 From: Malaysia Port To: Malaysia Warehouse By: Truck 60 | (4)Date: 2018-02-15 From: Malaysia Warehouse To: Singapore Warehouse By: Truck 61 | ------------------------------------ 62 | Goods-7 Category: Durian 63 | Start date: 2018-02-07 64 | Arrival date: 2018-02-08 65 | Route: 66 | (1)Date: 2018-02-07 From: Malaysia Warehouse To: Singapore Warehouse By: Truck 67 | ------------------------------------ 68 | Goods-8 Category: Furniture 69 | Start date: 2018-02-08 70 | Arrival date: 2018-02-09 71 | Route: 72 | (1)Date: 2018-02-08 From: Wuxi Warehouse To: Shanghai Warehouse By: Truck -------------------------------------------------------------------------------- /JP Morgan Python Training/notebooks/9_3d_plotting.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "code", 5 | "execution_count": null, 6 | "metadata": {}, 7 | "outputs": [], 8 | "source": [ 9 | "import pandas as pd\n", 10 | "import numpy as np\n", 11 | "\n", 12 | "import plotly.graph_objs as go\n", 13 | "\n", 14 | "df = pd.read_csv('../data/yield_curve.csv')\n", 15 | "xlist = list(df[\"x\"].dropna())\n", 16 | "ylist = list(df[\"y\"].dropna())\n", 17 | "\n", 18 | "del df[\"x\"]\n", 19 | "del df[\"y\"]\n", 20 | "\n", 21 | "zlist = []\n", 22 | "for row in df.iterrows():\n", 23 | " index, data = row\n", 24 | " zlist.append(data.tolist())\n" 25 | ] 26 | }, 27 | { 28 | "cell_type": "code", 29 | "execution_count": null, 30 | "metadata": {}, 31 | "outputs": [], 32 | "source": [ 33 | "z_secondary_beginning = [z[1] for z in zlist if z[0] == \"None\"]\n", 34 | "z_secondary_end = [z[0] for z in zlist if z[0] != \"None\"]\n", 35 | "z_secondary = z_secondary_beginning + z_secondary_end\n", 36 | "x_secondary = [\"3-month\"] * len(z_secondary_beginning) + [\"1-month\"] * len(\n", 37 | " z_secondary_end\n", 38 | ")\n", 39 | "y_secondary = ylist\n", 40 | "\n", 41 | "trace1 = dict(\n", 42 | " type=\"surface\",\n", 43 | " x=xlist,\n", 44 | " y=ylist,\n", 45 | " z=zlist,\n", 46 | " hoverinfo=\"x+y+z\",\n", 47 | " lighting={\n", 48 | " \"ambient\": 0.95,\n", 49 | " \"diffuse\": 0.99,\n", 50 | " \"fresnel\": 0.01,\n", 51 | " \"roughness\": 0.01,\n", 52 | " \"specular\": 
0.01,\n",    53 | "    },\n",    54 | "    colorscale=[\n",    55 | "        [0, \"rgb(230,245,254)\"],\n",    56 | "        [0.4, \"rgb(123,171,203)\"],\n",    57 | "        [0.8, \"rgb(40,119,174)\"],\n",    58 | "        [1, \"rgb(37,61,81)\"],\n",    59 | "    ],\n",    60 | "    opacity=0.7,\n",    61 | "    showscale=False,\n",    62 | "    scene=\"scene\",\n",    63 | ")\n",    64 | "\n",    65 | "trace2 = dict(\n",    66 | "    type=\"scatter3d\",\n",    67 | "    mode=\"lines\",\n",    68 | "    x=x_secondary,\n",    69 | "    y=y_secondary,\n",    70 | "    z=z_secondary,\n",    71 | "    hoverinfo=\"x+y+z\",\n",    72 | "    line=dict(color=\"#444444\"),\n",    73 | ")\n",    74 | "\n",    75 | "data = [trace1, trace2]\n",    76 | "layout = dict(\n",    77 | "    autosize=True,\n",    78 | "    font=dict(size=12, color=\"#CCCCCC\"),\n",    79 | "    margin=dict(t=5, l=5, b=5, r=5),\n",    80 | "    showlegend=False,\n",    81 | "    hovermode=\"closest\",\n",    82 | "    scene=dict(\n",    83 | "        aspectmode=\"manual\",\n",    84 | "        aspectratio=dict(x=2, y=5, z=1.5),\n",    85 | "        xaxis={\n",    86 | "            \"showgrid\": True,\n",    87 | "            \"title\": \"\",\n",    88 | "            \"type\": \"category\",\n",    89 | "            \"zeroline\": False,\n",    90 | "            \"categoryorder\": \"array\",\n",    91 | "            \"categoryarray\": list(reversed(xlist)),\n",    92 | "        },\n",    93 | "        yaxis={\"showgrid\": True, \"title\": \"\", \"type\": \"date\", \"zeroline\": False},\n",    94 | "    ),\n",    95 | ")\n",    96 | "figure = dict(data=data, layout=layout)\n",    97 | "\n",    98 | "go.FigureWidget(figure)"    99 | ]   100 | },   101 | {   102 | "cell_type": "code",   103 | "execution_count": null,   104 | "metadata": {},   105 | "outputs": [],   106 | "source": []   107 | }   108 | ],   109 | "metadata": {   110 | "kernelspec": {   111 | "display_name": "Python 3",   112 | "language": "python",   113 | "name": "python3"   114 | },   115 | "language_info": {   116 | "codemirror_mode": {   117 | "name": "ipython",   118 | "version": 3   119 | },   120 | "file_extension": ".py",   121 | "mimetype": "text/x-python",   122 | "name": "python",   123 | "nbconvert_exporter": "python",   124 | "pygments_lexer": "ipython3",   125 | "version": "3.7.3"   126 | }   127 | },   128 | "nbformat": 4,   129 | "nbformat_minor": 2   130 | }   131 | -------------------------------------------------------------------------------- /JP Morgan Python Training/notebooks/1_basic.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "# Getting started with Jupyter" 8 | ] 9 | }, 10 | { 11 | "cell_type": "markdown", 12 | "metadata": {}, 13 | "source": [ 14 | "Before getting into the code, what is Jupyter, why are we using it, and why Python?\n", 15 | "\n", 16 | "Jupyter comes in three parts: a kernel, a webserver, and the notebook app. Any code you run is sent to and executed on the kernel. The web server handles sending code & results between the kernel and the notebook. Last, but most importantly for us, the notebook app delivers the results of code executed on the server, allows for easy sharing of code, and provides a unified environment, among **several** other things. For those of you who are interested, you can read more about what Jupyter is and the impact it's making [here from Project Jupyter](https://jupyter.org), [here from Nature](https://www.nature.com/news/interactive-notebooks-sharing-the-code-1.16261), [here from O'Reilly](https://www.oreilly.com/ideas/what-is-jupyter), and [here from the Jupyter documentation](https://jupyter-notebook-beginner-guide.readthedocs.io/en/latest/what_is_jupyter.html). 
Note, Jupyter Notebooks **do not exclusively** run Python code.\n", 17 | "\n", 18 | "Unlike Jupyter, Python has been around for decades, but has grown in popularity only [recently](https://bit.ly/2u1z6gh). As a result, many projects are now developed in Python first, and Python has a strong open-source culture.\n", 19 | "\n", 20 | "Let's get into writing code. Most importantly, how do I execute code in a Jupyter notebook? (Ctrl + Enter)\n", 21 | "\n", 22 | "Let's try some simple operations. Each code input section below, e.g. [1], is referred to as a \"cell\"." 23 | ] 24 | }, 25 | { 26 | "cell_type": "code", 27 | "execution_count": null, 28 | "metadata": {}, 29 | "outputs": [], 30 | "source": [ 31 | "a = 3\n", 32 | "b = 4\n", 33 | "a + b" 34 | ] 35 | }, 36 | { 37 | "cell_type": "code", 38 | "execution_count": null, 39 | "metadata": {}, 40 | "outputs": [], 41 | "source": [ 42 | "a / b" 43 | ] 44 | }, 45 | { 46 | "cell_type": "code", 47 | "execution_count": null, 48 | "metadata": {}, 49 | "outputs": [], 50 | "source": [ 51 | "a // b" 52 | ] 53 | }, 54 | { 55 | "cell_type": "markdown", 56 | "metadata": {}, 57 | "source": [ 58 | "Notice that we don't declare types the way some other languages require. We can see what type each object is by simply calling `type` on it." 59 | ] 60 | }, 61 | { 62 | "cell_type": "code", 63 | "execution_count": null, 64 | "metadata": {}, 65 | "outputs": [], 66 | "source": [ 67 | "print(type(a))\n", 68 | "print(type(b))" 69 | ] 70 | }, 71 | { 72 | "cell_type": "code", 73 | "execution_count": null, 74 | "metadata": {}, 75 | "outputs": [], 76 | "source": [ 77 | "c = a / b\n", 78 | "type(c)" 79 | ] 80 | }, 81 | { 82 | "cell_type": "code", 83 | "execution_count": null, 84 | "metadata": {}, 85 | "outputs": [], 86 | "source": [ 87 | "a = \"some string\"\n", 88 | "type(a)" 89 | ] 90 | }, 91 | { 92 | "cell_type": "markdown", 93 | "metadata": {}, 94 | "source": [ 95 | "We can easily redefine `a` and run this cell again."
96 | ] 97 | }, 98 | { 99 | "cell_type": "code", 100 | "execution_count": null, 101 | "metadata": {}, 102 | "outputs": [], 103 | "source": [ 104 | "a = 15\n", 105 | "# a has been reset to 15\n", 106 | "a + b" 107 | ] 108 | } 109 | ], 110 | "metadata": { 111 | "kernelspec": { 112 | "display_name": "Python 3", 113 | "language": "python", 114 | "name": "python3" 115 | }, 116 | "language_info": { 117 | "codemirror_mode": { 118 | "name": "ipython", 119 | "version": 3 120 | }, 121 | "file_extension": ".py", 122 | "mimetype": "text/x-python", 123 | "name": "python", 124 | "nbconvert_exporter": "python", 125 | "pygments_lexer": "ipython3", 126 | "version": "3.7.3" 127 | } 128 | }, 129 | "nbformat": 4, 130 | "nbformat_minor": 4 131 | } 132 | -------------------------------------------------------------------------------- /JP Morgan Python Training/data/data_norm.csv: -------------------------------------------------------------------------------- 1 | date,close 2 | 2019-01-25,1.0 3 | 2019-01-28,1.0069908814589668 4 | 2019-01-29,0.9617021276595745 5 | 2019-01-30,0.9805471124620061 6 | 2019-01-31,1.0200607902735563 7 | 2019-02-01,1.0088145896656535 8 | 2019-02-04,1.03161094224924 9 | 2019-02-05,1.0446808510638297 10 | 2019-02-06,1.0382978723404255 11 | 2019-02-07,0.9361702127659575 12 | 2019-02-08,0.9121580547112463 13 | 2019-02-11,0.9188449848024317 14 | 2019-02-12,0.9237082066869301 15 | 2019-02-13,0.9458966565349545 16 | 2019-02-14,0.941033434650456 17 | 2019-02-15,0.9492401215805472 18 | 2019-02-19,0.9620060790273556 19 | 2019-02-20,0.9534954407294833 20 | 2019-02-21,0.9349544072948329 21 | 2019-02-22,0.9638297872340427 22 | 2019-02-25,0.9723404255319149 23 | 2019-02-26,0.9425531914893618 24 | 2019-02-27,0.9243161094224924 25 | 2019-02-28,0.9355623100303952 26 | 2019-03-01,0.9306990881458967 27 | 2019-03-04,0.9270516717325228 28 | 2019-03-05,0.9431610942249241 29 | 2019-03-06,0.9361702127659575 30 | 2019-03-07,0.915501519756839 31 | 2019-03-08,0.9130699088145897 32 | 2019-03-11,0.9382978723404256 33 | 2019-03-12,0.9471124620060791 34 | 2019-03-13,0.9513677811550153 35 | 2019-03-14,0.9431610942249241 36 | 2019-03-15,0.948936170212766 37 | 2019-03-18,0.9446808510638298 38 | 2019-03-19,0.9504559270516718 39 | 2019-03-20,0.9899696048632219 40 | 2019-03-21,0.9911854103343465 41 | 2019-03-22,1.003647416413374 42 | 2019-03-25,0.9905775075987844 43 | 2019-03-26,1.0048632218844986 44 | 2019-03-27,0.9811550151975684 45 | 2019-03-28,0.9990881458966565 46 | 2019-03-29,0.9993920972644378 47 | 2019-04-01,1.0164133738601824 48 | 2019-04-02,1.0258358662613982 49 | 2019-04-03,1.044984802431611 50 | 2019-04-04,1.0462006079027357 51 | 2019-04-05,1.0553191489361702 52 | 2019-04-08,1.0595744680851065 53 | 2019-04-09,1.0680851063829788 54 | 2019-04-10,1.0562310030395137 55 | 2019-04-11,1.0510638297872341 56 | 2019-04-12,1.0446808510638297 57 | 2019-04-15,1.0550151975683892 58 | 2019-04-16,1.0474164133738602 59 | 2019-04-17,1.0480243161094225 60 | 2019-04-18,1.0455927051671732 61 | 2019-04-22,1.0452887537993922 62 | 2019-04-23,1.2088145896656537 63 | 2019-04-24,1.194224924012158 64 | 2019-04-25,1.1696048632218845 65 | 2019-04-26,1.1753799392097266 66 | 2019-04-29,1.2091185410334346 67 | 2019-04-30,1.2130699088145895 68 | 2019-05-01,1.194224924012158 69 | 2019-05-02,1.2142857142857144 70 | 2019-05-03,1.2401215805471124 71 | 2019-05-06,1.2227963525835865 72 | 2019-05-07,1.1738601823708206 73 | 2019-05-08,1.1726443768996961 74 | 2019-05-09,1.1790273556231003 75 | 2019-05-10,1.1686930091185412 76 | 
2019-05-13,1.1121580547112464 77 | 2019-05-14,1.1224924012158055 78 | 2019-05-15,1.1519756838905775 79 | 2019-05-16,1.1641337386018236 80 | 2019-05-17,1.1398176291793314 81 | 2019-05-20,1.1291793313069909 82 | 2019-05-21,1.138905775075988 83 | 2019-05-22,1.1726443768996961 84 | 2019-05-23,1.1303951367781155 85 | 2019-05-24,1.137082066869301 86 | 2019-05-28,1.1334346504559272 87 | 2019-05-29,1.1200607902735564 88 | 2019-05-30,1.1288753799392097 89 | 2019-05-31,1.107598784194529 90 | 2019-06-03,1.0465045592705167 91 | 2019-06-04,1.0972644376899696 92 | 2019-06-05,1.1042553191489362 93 | 2019-06-06,1.1121580547112464 94 | 2019-06-07,1.152887537993921 95 | 2019-06-10,1.1440729483282674 96 | 2019-06-11,1.1310030395136779 97 | 2019-06-12,1.1395136778115502 98 | 2019-06-13,1.1045592705167175 99 | 2019-06-14,1.0987841945288754 100 | 2019-06-17,1.107598784194529 101 | 2019-06-18,1.1139817629179332 102 | 2019-06-19,1.1030395136778115 103 | 2019-06-20,1.0772036474164133 104 | 2019-06-21,1.064437689969605 105 | 2019-06-24,1.0814589665653496 106 | 2019-06-25,1.0553191489361702 107 | 2019-06-26,1.070516717325228 108 | 2019-06-27,1.0562310030395137 109 | 2019-06-28,1.060790273556231 110 | 2019-07-01,1.0966565349544073 111 | 2019-07-02,1.1009118541033436 112 | 2019-07-03,1.0948328267477205 113 | 2019-07-05,1.101823708206687 114 | 2019-07-08,1.1079027355623101 115 | 2019-07-09,1.1443768996960486 116 | 2019-07-10,1.138905775075988 117 | 2019-07-11,1.1310030395136779 118 | 2019-07-12,1.1501519756838907 119 | 2019-07-15,1.1756838905775076 120 | 2019-07-16,1.154711246200608 121 | 2019-07-17,1.1458966565349546 122 | 2019-07-18,1.1446808510638298 123 | 2019-07-19,1.1176291793313071 124 | 2019-07-22,1.1422492401215805 125 | 2019-07-23,1.1519756838905775 126 | 2019-07-24,1.1772036474164134 127 | -------------------------------------------------------------------------------- /Machine Learning For Finance/Regression Based Machine Learning for Algorithmic Trading/Trend Following - Penalized Regression Techniques.py: -------------------------------------------------------------------------------- 1 | ''' 2 | Penalized Regression Techniques: Lasso, Ridge, ElasticNet 3 | 4 | Penalized Regression Approach in Multi Asset Trend Following Strategy 5 | 6 | We illustrate an application of Lasso by estimating the 1-day returns of assets in a cross-asset momentum model. We attempt to predict the returns of 4 assets: S&P 500, 7-10Y Treasury Bond Index, US dollar (DXY) and Gold. For predictor variables, we choose lagged 1M, 3M, 6M and 12M returns of these same 4 assets, yielding a total of 16 variables. 7 | 8 | Reference: JP Morgan Big Data and AI Strategies 9 | ''' 10 | 11 | import pandas as pd 12 | import numpy as np 13 | from sklearn import linear_model 14 | 15 | def initialize(context): 16 | 17 | #set_slippage(slippage.FixedSlippage(spread=0)) 18 | #set_commission(commission.PerTrade(cost=1)) 19 | set_symbol_lookup_date('2014-01-01') 20 | set_benchmark(symbol('IEF')) 21 | 22 | context.securities = symbol('IEF') 23 | context.reg_params = [symbol('SPY'), symbol('IEF'), symbol('UUP'), symbol('GLD')] 24 | 25 | context.inLong = False 26 | context.inShort = False 27 | 28 | schedule_function(my_rebalance, date_rules.month_start(days_offset=0), time_rules.market_open(hours=1)) 29 | 30 | # Record tracking variables at the end of each day. 
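    # A note on the estimator used in my_rebalance below: roughly speaking,
    # scikit-learn's Lasso minimizes
    #
    #     (1 / (2 * n_samples)) * ||y - X @ w||_2^2  +  alpha * ||w||_1
    #
    # so with the 16 lagged-return factors described in the docstring, the
    # L1 penalty shrinks weak coefficients to exactly zero and keeps only
    # the strongest signals (alpha = 0.001 below; this is a sketch of the
    # objective, not a claim about the backtester's behaviour). The
    # leverage recording mentioned above is scheduled here: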
31 | schedule_function(my_record_vars, date_rules.every_day(), time_rules.market_close()) 32 | 33 | 34 | def before_trading_start(context, data): 35 | pass 36 | 37 | def my_record_vars(context,data): 38 | record(leverage=context.account.leverage) 39 | 40 | def my_rebalance(context,data): 41 | """ 42 | Execute orders according to our schedule_function() timing. 43 | """ 44 | # dropping the last day to prevent look forward 45 | price_history = data.history(context.reg_params, fields='price', 46 | bar_count=750, frequency='1d') 47 | 48 | # Use returns history of the 4 tickers with different look back period as factor 49 | returns_mat_1m = price_history[:-1].pct_change(20).dropna() 50 | returns_mat_1m.columns = ['SPY_1m', 'IEF_1m', 'UUP_1m', 'GLD_1m'] 51 | returns_mat_3m = price_history[:-1].pct_change(60).dropna() 52 | returns_mat_3m.columns = ['SPY_3m', 'IEF_3m', 'UUP_3m', 'GLD_3m'] 53 | returns_mat_6m = price_history[:-1].pct_change(120).dropna() 54 | returns_mat_6m.columns = ['SPY_6m', 'IEF_6m', 'UUP_6m', 'GLD_6m'] 55 | returns_mat_12m = price_history[:-1].pct_change(240).dropna() 56 | returns_mat_12m.columns = ['SPY_12m', 'IEF_12m', 'UUP_12m', 'GLD_12m'] 57 | X = returns_mat_1m.join(returns_mat_3m).join(returns_mat_6m).join(returns_mat_12m).dropna() 58 | # X = X.dropna(axis=1) 59 | 60 | y = price_history[context.securities].pct_change().dropna()[-501:-1] 61 | 62 | # Transform by standardising the factors 63 | from sklearn.preprocessing import StandardScaler 64 | sc_X = StandardScaler() 65 | X_train = sc_X.fit_transform(X[-501:-1]) 66 | sc_y = StandardScaler() 67 | y_train = sc_y.fit_transform(y) 68 | 69 | #reg = linear_model.LinearRegression(normalize=True) 70 | reg = linear_model.Lasso(alpha = 0.001, normalize=True) 71 | #reg = linear_model.Ridge (alpha = .5) 72 | reg.fit(X_train, y_train) 73 | 74 | res = reg.predict(sc_X.fit_transform(X[-501:-1])[-1]) 75 | #record('predict', res, leverage=context.account.leverage) 76 | 77 | if get_open_orders(): 78 | return 79 | 80 | # currently no position and predicted returns is positive, then open long 81 | if (res > 0) and (not context.inLong): 82 | # if (res > 0): 83 | context.inLong = True 84 | context.inShort = False 85 | order_target_percent(context.securities, 1.0) 86 | # print('scenario 3', context.inShort, context.inLong) 87 | return 88 | 89 | # currently no position and predicted returns is negative, then open short 90 | if (res < 0) and (not context.inShort): 91 | # if (res < 0): 92 | context.inLong = False 93 | context.inShort = True 94 | order_target_percent(context.securities, -1.0) 95 | # print('scenario 4', context.inShort, context.inLong) 96 | return 97 | 98 | 99 | 100 | 101 | -------------------------------------------------------------------------------- /Machine Learning For Finance/Regression Based Machine Learning for Algorithmic Trading/Pairs Trading - Lasso Regression.py: -------------------------------------------------------------------------------- 1 | ''' 2 | Lasso Regression 3 | 4 | ''' 5 | 6 | import numpy as np 7 | import pandas as pd 8 | from zipline.utils import tradingcalendar 9 | import pytz 10 | from sklearn import linear_model 11 | reg = linear_model.Lasso(alpha = .1) 12 | 13 | def initialize(context): 14 | # Quantopian backtester specific variables 15 | set_slippage(slippage.FixedSlippage(spread=0)) 16 | set_commission(commission.PerTrade(cost=1)) 17 | set_symbol_lookup_date('2014-01-01') 18 | 19 | context.stock_pairs = [(sid(5885), sid(4283))] 20 | # set_benchmark(context.y) 21 | 22 | context.num_pairs = 
len(context.stock_pairs) 23 | # strategy specific variables 24 | context.lookback = 20 # used for regression 25 | context.z_window = 20 # used for zscore calculation, must be <= lookback 26 | 27 | context.spread = np.ndarray((context.num_pairs, 0)) 28 | # context.hedgeRatioTS = np.ndarray((context.num_pairs, 0)) 29 | context.inLong = [False] * context.num_pairs 30 | context.inShort = [False] * context.num_pairs 31 | 32 | # Only do work 30 minutes before close 33 | schedule_function(func=check_pair_status, date_rule=date_rules.every_day(), time_rule=time_rules.market_close(minutes=30)) 34 | 35 | # Will be called on every trade event for the securities you specify. 36 | def handle_data(context, data): 37 | # Our work is now scheduled in check_pair_status 38 | pass 39 | 40 | def check_pair_status(context, data): 41 | if get_open_orders(): 42 | return 43 | 44 | prices = history(35, '1d', 'price').iloc[-context.lookback::] 45 | 46 | new_spreads = np.ndarray((context.num_pairs, 1)) 47 | 48 | for i in range(context.num_pairs): 49 | 50 | (stock_y, stock_x) = context.stock_pairs[i] 51 | 52 | Y = prices[stock_y] 53 | X = prices[stock_x] 54 | 55 | try: 56 | hedge = hedge_ratio(Y, X, add_const=True) 57 | except ValueError as e: 58 | log.debug(e) 59 | return 60 | 61 | # context.hedgeRatioTS = np.append(context.hedgeRatioTS, hedge) 62 | 63 | new_spreads[i, :] = Y[-1] - hedge * X[-1] 64 | 65 | if context.spread.shape[1] > context.z_window: 66 | # Keep only the z-score lookback period 67 | spreads = context.spread[i, -context.z_window:] 68 | 69 | zscore = (spreads[-1] - spreads.mean()) / spreads.std() 70 | 71 | if context.inShort[i] and zscore < 0.0: 72 | order_target(stock_y, 0) 73 | order_target(stock_x, 0) 74 | context.inShort[i] = False 75 | context.inLong[i] = False 76 | record(X_pct=0, Y_pct=0) 77 | return 78 | 79 | if context.inLong[i] and zscore > 0.0: 80 | order_target(stock_y, 0) 81 | order_target(stock_x, 0) 82 | context.inShort[i] = False 83 | context.inLong[i] = False 84 | record(X_pct=0, Y_pct=0) 85 | return 86 | 87 | if zscore < -1.0 and (not context.inLong[i]): 88 | # Only trade if NOT already in a trade 89 | y_target_shares = 1 90 | X_target_shares = -hedge 91 | context.inLong[i] = True 92 | context.inShort[i] = False 93 | 94 | (y_target_pct, x_target_pct) = computeHoldingsPct( y_target_shares,X_target_shares, Y[-1], X[-1] ) 95 | order_target_percent( stock_y, y_target_pct * (1.0/context.num_pairs) ) 96 | order_target_percent( stock_x, x_target_pct * (1.0/context.num_pairs) ) 97 | record(Y_pct=y_target_pct, X_pct=x_target_pct) 98 | return 99 | 100 | if zscore > 1.0 and (not context.inShort[i]): 101 | # Only trade if NOT already in a trade 102 | y_target_shares = -1 103 | X_target_shares = hedge 104 | context.inShort[i] = True 105 | context.inLong[i] = False 106 | 107 | (y_target_pct, x_target_pct) = computeHoldingsPct( y_target_shares, X_target_shares, Y[-1], X[-1] ) 108 | order_target_percent( stock_y, y_target_pct * (1.0/context.num_pairs)) 109 | order_target_percent( stock_x, x_target_pct * (1.0/context.num_pairs)) 110 | record(Y_pct=y_target_pct, X_pct=x_target_pct) 111 | 112 | context.spread = np.hstack([context.spread, new_spreads]) 113 | 114 | def hedge_ratio(Y, X, add_const=True): 115 | reg.fit(X.reshape(-1,1), Y) 116 | return reg.coef_ 117 | 118 | def computeHoldingsPct(yShares, xShares, yPrice, xPrice): 119 | yDol = yShares * yPrice 120 | xDol = xShares * xPrice 121 | notionalDol = abs(yDol) + abs(xDol) 122 | y_target_pct = yDol / notionalDol 123 | x_target_pct = xDol / 
notionalDol 124 | return (y_target_pct, x_target_pct) 125 | -------------------------------------------------------------------------------- /Machine Learning For Finance/Regression Based Machine Learning for Algorithmic Trading/Pairs Trading - Ridge Regression.py: -------------------------------------------------------------------------------- 1 | ''' 2 | Ridge Regression 3 | 4 | ''' 5 | 6 | 7 | import numpy as np 8 | import pandas as pd 9 | from zipline.utils import tradingcalendar 10 | import pytz 11 | from sklearn import linear_model 12 | reg = linear_model.Ridge (alpha = .5) 13 | 14 | def initialize(context): 15 | # Quantopian backtester specific variables 16 | set_slippage(slippage.FixedSlippage(spread=0)) 17 | set_commission(commission.PerTrade(cost=1)) 18 | set_symbol_lookup_date('2014-01-01') 19 | 20 | context.stock_pairs = [(sid(5885), sid(4283))] 21 | # set_benchmark(context.y) 22 | 23 | context.num_pairs = len(context.stock_pairs) 24 | # strategy specific variables 25 | context.lookback = 20 # used for regression 26 | context.z_window = 20 # used for zscore calculation, must be <= lookback 27 | 28 | context.spread = np.ndarray((context.num_pairs, 0)) 29 | # context.hedgeRatioTS = np.ndarray((context.num_pairs, 0)) 30 | context.inLong = [False] * context.num_pairs 31 | context.inShort = [False] * context.num_pairs 32 | 33 | # Only do work 30 minutes before close 34 | schedule_function(func=check_pair_status, date_rule=date_rules.every_day(), time_rule=time_rules.market_close(minutes=30)) 35 | 36 | # Will be called on every trade event for the securities you specify. 37 | def handle_data(context, data): 38 | # Our work is now scheduled in check_pair_status 39 | pass 40 | 41 | def check_pair_status(context, data): 42 | if get_open_orders(): 43 | return 44 | 45 | prices = history(35, '1d', 'price').iloc[-context.lookback::] 46 | 47 | new_spreads = np.ndarray((context.num_pairs, 1)) 48 | 49 | for i in range(context.num_pairs): 50 | 51 | (stock_y, stock_x) = context.stock_pairs[i] 52 | 53 | Y = prices[stock_y] 54 | X = prices[stock_x] 55 | 56 | try: 57 | hedge = hedge_ratio(Y, X, add_const=True) 58 | except ValueError as e: 59 | log.debug(e) 60 | return 61 | 62 | # context.hedgeRatioTS = np.append(context.hedgeRatioTS, hedge) 63 | 64 | new_spreads[i, :] = Y[-1] - hedge * X[-1] 65 | 66 | if context.spread.shape[1] > context.z_window: 67 | # Keep only the z-score lookback period 68 | spreads = context.spread[i, -context.z_window:] 69 | 70 | zscore = (spreads[-1] - spreads.mean()) / spreads.std() 71 | 72 | if context.inShort[i] and zscore < 0.0: 73 | order_target(stock_y, 0) 74 | order_target(stock_x, 0) 75 | context.inShort[i] = False 76 | context.inLong[i] = False 77 | record(X_pct=0, Y_pct=0) 78 | return 79 | 80 | if context.inLong[i] and zscore > 0.0: 81 | order_target(stock_y, 0) 82 | order_target(stock_x, 0) 83 | context.inShort[i] = False 84 | context.inLong[i] = False 85 | record(X_pct=0, Y_pct=0) 86 | return 87 | 88 | if zscore < -1.0 and (not context.inLong[i]): 89 | # Only trade if NOT already in a trade 90 | y_target_shares = 1 91 | X_target_shares = -hedge 92 | context.inLong[i] = True 93 | context.inShort[i] = False 94 | 95 | (y_target_pct, x_target_pct) = computeHoldingsPct( y_target_shares,X_target_shares, Y[-1], X[-1] ) 96 | order_target_percent( stock_y, y_target_pct * (1.0/context.num_pairs) ) 97 | order_target_percent( stock_x, x_target_pct * (1.0/context.num_pairs) ) 98 | record(Y_pct=y_target_pct, X_pct=x_target_pct) 99 | return 100 | 101 | if zscore > 1.0 and (not 
context.inShort[i]): 102 | # Only trade if NOT already in a trade 103 | y_target_shares = -1 104 | X_target_shares = hedge 105 | context.inShort[i] = True 106 | context.inLong[i] = False 107 | 108 | (y_target_pct, x_target_pct) = computeHoldingsPct( y_target_shares, X_target_shares, Y[-1], X[-1] ) 109 | order_target_percent( stock_y, y_target_pct * (1.0/context.num_pairs)) 110 | order_target_percent( stock_x, x_target_pct * (1.0/context.num_pairs)) 111 | record(Y_pct=y_target_pct, X_pct=x_target_pct) 112 | 113 | context.spread = np.hstack([context.spread, new_spreads]) 114 | 115 | def hedge_ratio(Y, X, add_const=True): 116 | reg.fit(X.reshape(-1,1), Y) 117 | return reg.coef_ 118 | 119 | def computeHoldingsPct(yShares, xShares, yPrice, xPrice): 120 | yDol = yShares * yPrice 121 | xDol = xShares * xPrice 122 | notionalDol = abs(yDol) + abs(xDol) 123 | y_target_pct = yDol / notionalDol 124 | x_target_pct = xDol / notionalDol 125 | return (y_target_pct, x_target_pct) 126 | -------------------------------------------------------------------------------- /Machine Learning For Finance/Regression Based Machine Learning for Algorithmic Trading/Pairs Trading - Elastic Net.py: -------------------------------------------------------------------------------- 1 | ''' 2 | Elastic Net 3 | http://scikit-learn.org/stable/modules/linear_model.html#elastic-net 4 | ''' 5 | 6 | 7 | import numpy as np 8 | import pandas as pd 9 | from zipline.utils import tradingcalendar 10 | import pytz 11 | from sklearn import linear_model 12 | # reg = linear_model.Ridge (alpha = .5) 13 | # reg = linear_model.BayesianRidge() 14 | reg = linear_model.ElasticNet() 15 | 16 | def initialize(context): 17 | # Quantopian backtester specific variables 18 | set_slippage(slippage.FixedSlippage(spread=0)) 19 | set_commission(commission.PerTrade(cost=1)) 20 | set_symbol_lookup_date('2014-01-01') 21 | 22 | context.stock_pairs = [(sid(5885), sid(4283))] 23 | # set_benchmark(context.y) 24 | 25 | context.num_pairs = len(context.stock_pairs) 26 | # strategy specific variables 27 | context.lookback = 20 # used for regression 28 | context.z_window = 20 # used for zscore calculation, must be <= lookback 29 | 30 | context.spread = np.ndarray((context.num_pairs, 0)) 31 | # context.hedgeRatioTS = np.ndarray((context.num_pairs, 0)) 32 | context.inLong = [False] * context.num_pairs 33 | context.inShort = [False] * context.num_pairs 34 | 35 | # Only do work 30 minutes before close 36 | schedule_function(func=check_pair_status, date_rule=date_rules.every_day(), time_rule=time_rules.market_close(minutes=30)) 37 | 38 | # Will be called on every trade event for the securities you specify. 
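# Rough sketch of the objective the ElasticNet estimator above minimizes
# (per the scikit-learn page linked in the module docstring; with no
# arguments passed, the defaults are alpha = 1.0 and l1_ratio = 0.5):
#
#     (1 / (2 * n_samples)) * ||y - X @ w||_2^2
#         + alpha * l1_ratio * ||w||_1
#         + 0.5 * alpha * (1 - l1_ratio) * ||w||_2^2
#
# i.e. a blend of the Lasso (L1) and Ridge (L2) penalties; in the
# single-factor hedge_ratio() regression below, the penalty mainly
# shrinks the estimated hedge ratio toward zero.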
39 | def handle_data(context, data): 40 | # Our work is now scheduled in check_pair_status 41 | pass 42 | 43 | def check_pair_status(context, data): 44 | if get_open_orders(): 45 | return 46 | 47 | prices = history(35, '1d', 'price').iloc[-context.lookback::] 48 | 49 | new_spreads = np.ndarray((context.num_pairs, 1)) 50 | 51 | for i in range(context.num_pairs): 52 | 53 | (stock_y, stock_x) = context.stock_pairs[i] 54 | 55 | Y = prices[stock_y] 56 | X = prices[stock_x] 57 | 58 | try: 59 | hedge = hedge_ratio(Y, X, add_const=True) 60 | except ValueError as e: 61 | log.debug(e) 62 | return 63 | 64 | # context.hedgeRatioTS = np.append(context.hedgeRatioTS, hedge) 65 | 66 | new_spreads[i, :] = Y[-1] - hedge * X[-1] 67 | 68 | if context.spread.shape[1] > context.z_window: 69 | # Keep only the z-score lookback period 70 | spreads = context.spread[i, -context.z_window:] 71 | 72 | zscore = (spreads[-1] - spreads.mean()) / spreads.std() 73 | 74 | if context.inShort[i] and zscore < 0.0: 75 | order_target(stock_y, 0) 76 | order_target(stock_x, 0) 77 | context.inShort[i] = False 78 | context.inLong[i] = False 79 | record(X_pct=0, Y_pct=0) 80 | return 81 | 82 | if context.inLong[i] and zscore > 0.0: 83 | order_target(stock_y, 0) 84 | order_target(stock_x, 0) 85 | context.inShort[i] = False 86 | context.inLong[i] = False 87 | record(X_pct=0, Y_pct=0) 88 | return 89 | 90 | if zscore < -1.0 and (not context.inLong[i]): 91 | # Only trade if NOT already in a trade 92 | y_target_shares = 1 93 | X_target_shares = -hedge 94 | context.inLong[i] = True 95 | context.inShort[i] = False 96 | 97 | (y_target_pct, x_target_pct) = computeHoldingsPct( y_target_shares,X_target_shares, Y[-1], X[-1] ) 98 | order_target_percent( stock_y, y_target_pct * (1.0/context.num_pairs) ) 99 | order_target_percent( stock_x, x_target_pct * (1.0/context.num_pairs) ) 100 | record(Y_pct=y_target_pct, X_pct=x_target_pct) 101 | return 102 | 103 | if zscore > 1.0 and (not context.inShort[i]): 104 | # Only trade if NOT already in a trade 105 | y_target_shares = -1 106 | X_target_shares = hedge 107 | context.inShort[i] = True 108 | context.inLong[i] = False 109 | 110 | (y_target_pct, x_target_pct) = computeHoldingsPct( y_target_shares, X_target_shares, Y[-1], X[-1] ) 111 | order_target_percent( stock_y, y_target_pct * (1.0/context.num_pairs)) 112 | order_target_percent( stock_x, x_target_pct * (1.0/context.num_pairs)) 113 | record(Y_pct=y_target_pct, X_pct=x_target_pct) 114 | 115 | context.spread = np.hstack([context.spread, new_spreads]) 116 | 117 | def hedge_ratio(Y, X, add_const=True): 118 | reg.fit(X.reshape(-1,1), Y) 119 | return reg.coef_ 120 | 121 | def computeHoldingsPct(yShares, xShares, yPrice, xPrice): 122 | yDol = yShares * yPrice 123 | xDol = xShares * xPrice 124 | notionalDol = abs(yDol) + abs(xDol) 125 | y_target_pct = yDol / notionalDol 126 | x_target_pct = xDol / notionalDol 127 | return (y_target_pct, x_target_pct) 128 | -------------------------------------------------------------------------------- /Machine Learning For Finance/Regression Based Machine Learning for Algorithmic Trading/Pairs Trading - Bayesian Ridge Regression.py: -------------------------------------------------------------------------------- 1 | ''' 2 | Bayesian Ridge Regression 3 | http://scikit-learn.org/stable/modules/linear_model.html#bayesian-regression 4 | ''' 5 | 6 | 7 | import numpy as np 8 | import pandas as pd 9 | from zipline.utils import tradingcalendar 10 | import pytz 11 | from sklearn import linear_model 12 | # reg = linear_model.Ridge (alpha 
= .5) 13 | reg = linear_model.BayesianRidge() 14 | 15 | 16 | def initialize(context): 17 | # Quantopian backtester specific variables 18 | set_slippage(slippage.FixedSlippage(spread=0)) 19 | set_commission(commission.PerTrade(cost=1)) 20 | set_symbol_lookup_date('2014-01-01') 21 | 22 | context.stock_pairs = [(sid(5885), sid(4283))] 23 | # set_benchmark(context.y) 24 | 25 | context.num_pairs = len(context.stock_pairs) 26 | # strategy specific variables 27 | context.lookback = 20 # used for regression 28 | context.z_window = 20 # used for zscore calculation, must be <= lookback 29 | 30 | context.spread = np.ndarray((context.num_pairs, 0)) 31 | # context.hedgeRatioTS = np.ndarray((context.num_pairs, 0)) 32 | context.inLong = [False] * context.num_pairs 33 | context.inShort = [False] * context.num_pairs 34 | 35 | # Only do work 30 minutes before close 36 | schedule_function(func=check_pair_status, date_rule=date_rules.every_day(), time_rule=time_rules.market_close(minutes=30)) 37 | 38 | # Will be called on every trade event for the securities you specify. 39 | def handle_data(context, data): 40 | # Our work is now scheduled in check_pair_status 41 | pass 42 | 43 | def check_pair_status(context, data): 44 | if get_open_orders(): 45 | return 46 | 47 | prices = history(35, '1d', 'price').iloc[-context.lookback::] 48 | 49 | new_spreads = np.ndarray((context.num_pairs, 1)) 50 | 51 | for i in range(context.num_pairs): 52 | 53 | (stock_y, stock_x) = context.stock_pairs[i] 54 | 55 | Y = prices[stock_y] 56 | X = prices[stock_x] 57 | 58 | try: 59 | hedge = hedge_ratio(Y, X, add_const=True) 60 | except ValueError as e: 61 | log.debug(e) 62 | return 63 | 64 | # context.hedgeRatioTS = np.append(context.hedgeRatioTS, hedge) 65 | 66 | new_spreads[i, :] = Y[-1] - hedge * X[-1] 67 | 68 | if context.spread.shape[1] > context.z_window: 69 | # Keep only the z-score lookback period 70 | spreads = context.spread[i, -context.z_window:] 71 | 72 | zscore = (spreads[-1] - spreads.mean()) / spreads.std() 73 | 74 | if context.inShort[i] and zscore < 0.0: 75 | order_target(stock_y, 0) 76 | order_target(stock_x, 0) 77 | context.inShort[i] = False 78 | context.inLong[i] = False 79 | record(X_pct=0, Y_pct=0) 80 | return 81 | 82 | if context.inLong[i] and zscore > 0.0: 83 | order_target(stock_y, 0) 84 | order_target(stock_x, 0) 85 | context.inShort[i] = False 86 | context.inLong[i] = False 87 | record(X_pct=0, Y_pct=0) 88 | return 89 | 90 | if zscore < -1.0 and (not context.inLong[i]): 91 | # Only trade if NOT already in a trade 92 | y_target_shares = 1 93 | X_target_shares = -hedge 94 | context.inLong[i] = True 95 | context.inShort[i] = False 96 | 97 | (y_target_pct, x_target_pct) = computeHoldingsPct( y_target_shares,X_target_shares, Y[-1], X[-1] ) 98 | order_target_percent( stock_y, y_target_pct * (1.0/context.num_pairs) ) 99 | order_target_percent( stock_x, x_target_pct * (1.0/context.num_pairs) ) 100 | record(Y_pct=y_target_pct, X_pct=x_target_pct) 101 | return 102 | 103 | if zscore > 1.0 and (not context.inShort[i]): 104 | # Only trade if NOT already in a trade 105 | y_target_shares = -1 106 | X_target_shares = hedge 107 | context.inShort[i] = True 108 | context.inLong[i] = False 109 | 110 | (y_target_pct, x_target_pct) = computeHoldingsPct( y_target_shares, X_target_shares, Y[-1], X[-1] ) 111 | order_target_percent( stock_y, y_target_pct * (1.0/context.num_pairs)) 112 | order_target_percent( stock_x, x_target_pct * (1.0/context.num_pairs)) 113 | record(Y_pct=y_target_pct, X_pct=x_target_pct) 114 | 115 | context.spread 
= np.hstack([context.spread, new_spreads])   116 |    117 | def hedge_ratio(Y, X, add_const=True):   118 |     reg.fit(X.values.reshape(-1, 1), Y)   119 |     return reg.coef_   120 |    121 | def computeHoldingsPct(yShares, xShares, yPrice, xPrice):   122 |     yDol = yShares * yPrice   123 |     xDol = xShares * xPrice   124 |     notionalDol = abs(yDol) + abs(xDol)   125 |     y_target_pct = yDol / notionalDol   126 |     x_target_pct = xDol / notionalDol   127 |     return (y_target_pct, x_target_pct)   128 | -------------------------------------------------------------------------------- /JP Morgan Python Training/notebooks/4_webapi.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "# Accessing data from a web API" 8 | ] 9 | }, 10 | { 11 | "cell_type": "markdown", 12 | "metadata": {}, 13 | "source": [ 14 | "One of the best things about \"web 2.0\" is more formal, external-facing APIs that allow developers to build tools on top of data. These APIs provide a reliable connection to data over a standard protocol, returning responses in an easy, machine-readable format. There are tons of APIs out there. Sometimes you will need to register for a developer key (see the St. Louis Fed's API [here](https://research.stlouisfed.org/docs/api/)) and other times the APIs are free (see Coindesk's API [here](https://www.coindesk.com/api)). ProgrammableWeb has a great [directory](https://www.programmableweb.com/apis/directory).\n", 15 | "\n", 16 | "## With great power comes great responsibility. We should be responsible and polite internet citizens when hitting websites programmatically\n", 17 | "\n", 18 | "Below we will use Coindesk's API to get daily prices for bitcoin. Coindesk's API, like many others, returns a JSON (JavaScript Object Notation) object, which we can easily turn into a dataframe. Here we will need the `json` and `requests` libraries." 19 | ] 20 | }, 21 | { 22 | "cell_type": "code", 23 | "execution_count": null, 24 | "metadata": {}, 25 | "outputs": [], 26 | "source": [ 27 | "import datetime\n", 28 | "import pandas as pd\n", 29 | "import json\n", 30 | "import requests\n", 31 | "\n", 32 | "coindeskURL = 'https://api.coindesk.com/v1/bpi/historical/close.json?'\n", 33 | "\n", 34 | "# from API\n", 35 | "start = datetime.date(2019, 1, 1)\n", 36 | "end = datetime.date(2019, 7, 1)\n", 37 | "\n", 38 | "url = f'{coindeskURL}start={start:%Y-%m-%d}&end={end:%Y-%m-%d}'\n", 39 | "\n", 40 | "print(f'Hitting this url: {url}')\n", 41 | "\n", 42 | "result = requests.get(url)\n", 43 | "result.content" 44 | ] 45 | }, 46 | { 47 | "cell_type": "markdown", 48 | "metadata": {}, 49 | "source": [ 50 | "Digging into this a little bit, there are a few critical components to accessing data. Not all APIs will return the same structure of data. Many times you can look at what is returned by just hitting the URL with your browser. 
Let's try that\n", 51 | "\n", 52 | "[https://api.coindesk.com/v1/bpi/historical/close.json?start=2019-01-01&end=2019-07-01](https://api.coindesk.com/v1/bpi/historical/close.json?start=2019-01-01&end=2019-07-01)\n" 53 | ] 54 | }, 55 | { 56 | "cell_type": "markdown", 57 | "metadata": {}, 58 | "source": [ 59 | "In this case we get:\n", 60 | "```\n", 61 | "{\n", 62 | " \"bpi\":\n", 63 | " {\n", 64 | " \"2019-01-01\":3869.47,\n", 65 | " \"2019-01-02\":3941.2167,\n", 66 | " ...\n", 67 | " },\n", 68 | " \"disclaimer\": \"This data was produced from the CoinDesk Bitcoin Price Index.\",\n", 69 | " \"time\":\n", 70 | " {\n", 71 | " \"updated\":\"Jul 2, 2019 00:03:00 UTC\",\n", 72 | " \"updatedISO\":\"2019-07-02T00:03:00+00:00\"\n", 73 | " }\n", 74 | "}\n", 75 | "```" 76 | ] 77 | }, 78 | { 79 | "cell_type": "markdown", 80 | "metadata": {}, 81 | "source": [ 82 | "`bpi`, `disclaimer`, and `time` are all data that we can reference. The data we're primarily interested in is `bpi`. Different APIs can break up data differently; we will see examples in other APIs later." 83 | ] 84 | }, 85 | { 86 | "cell_type": "markdown", 87 | "metadata": {}, 88 | "source": [ 89 | "Let's take a look at the JSON result in Python." 90 | ] 91 | }, 92 | { 93 | "cell_type": "code", 94 | "execution_count": null, 95 | "metadata": {}, 96 | "outputs": [], 97 | "source": [ 98 | "jsondata = json.loads(result.content)\n", 99 | "jsondata" 100 | ] 101 | }, 102 | { 103 | "cell_type": "markdown", 104 | "metadata": {}, 105 | "source": [ 106 | "We can then wrap that up into a pandas dataframe! Note: pandas does have a `read_json` helper as well, but sometimes it is not well suited given the structure of the JSON output." 107 | ] 108 | }, 109 | { 110 | "cell_type": "code", 111 | "execution_count": null, 112 | "metadata": {}, 113 | "outputs": [], 114 | "source": [ 115 | "# note the json gets read with the disclaimer and time outputs as well\n", 116 | "data = pd.read_json(result.content)\n", 117 | "data" 118 | ] 119 | }, 120 | { 121 | "cell_type": "code", 122 | "execution_count": null, 123 | "metadata": {}, 124 | "outputs": [], 125 | "source": [ 126 | "data = pd.DataFrame({'Bitcoin Price Index': jsondata['bpi']})\n", 127 | "data" 128 | ] 129 | }, 130 | { 131 | "cell_type": "code", 132 | "execution_count": null, 133 | "metadata": {}, 134 | "outputs": [], 135 | "source": [ 136 | "%matplotlib inline\n", 137 | "data.plot()" 138 | ] 139 | } 140 | ], 141 | "metadata": { 142 | "kernelspec": { 143 | "display_name": "Python 3", 144 | "language": "python", 145 | "name": "python3" 146 | }, 147 | "language_info": { 148 | "codemirror_mode": { 149 | "name": "ipython", 150 | "version": 3 151 | }, 152 | "file_extension": ".py", 153 | "mimetype": "text/x-python", 154 | "name": "python", 155 | "nbconvert_exporter": "python", 156 | "pygments_lexer": "ipython3", 157 | "version": "3.7.3" 158 | } 159 | }, 160 | "nbformat": 4, 161 | "nbformat_minor": 4 162 | } 163 | -------------------------------------------------------------------------------- /Machine Learning For Finance/Regression Based Machine Learning for Algorithmic Trading/Pairs Trading scikit-learn Linear.py: -------------------------------------------------------------------------------- 1 | ''' 2 | Anthony NG 3 | @ 2017 4 | 5 | ## MIT License 6 | 7 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 8 | 9 | ''' 10 | 11 | import numpy as np 12 | import pandas as pd 13 | from zipline.utils import tradingcalendar 14 | import pytz 15 | from sklearn.linear_model import LinearRegression 16 | reg = LinearRegression(fit_intercept=True) 17 | 18 | def initialize(context): 19 | # Quantopian backtester specific variables 20 | set_slippage(slippage.FixedSlippage(spread=0)) 21 | set_commission(commission.PerTrade(cost=1)) 22 | set_symbol_lookup_date('2014-01-01') 23 | 24 | context.stock_pairs = [(sid(5885), sid(4283))] 25 | # set_benchmark(context.y) 26 | 27 | context.num_pairs = len(context.stock_pairs) 28 | # strategy specific variables 29 | context.lookback = 20 # used for regression 30 | context.z_window = 20 # used for zscore calculation, must be <= lookback 31 | 32 | context.spread = np.ndarray((context.num_pairs, 0)) 33 | # context.hedgeRatioTS = np.ndarray((context.num_pairs, 0)) 34 | context.inLong = [False] * context.num_pairs 35 | context.inShort = [False] * context.num_pairs 36 | 37 | # Only do work 30 minutes before close 38 | schedule_function(func=check_pair_status, date_rule=date_rules.every_day(), time_rule=time_rules.market_close(minutes=30)) 39 | 40 | # Will be called on every trade event for the securities you specify. 41 | def handle_data(context, data): 42 | # Our work is now scheduled in check_pair_status 43 | pass 44 | 45 | def check_pair_status(context, data): 46 | if get_open_orders(): 47 | return 48 | 49 | prices = history(35, '1d', 'price').iloc[-context.lookback::] 50 | 51 | new_spreads = np.ndarray((context.num_pairs, 1)) 52 | 53 | for i in range(context.num_pairs): 54 | 55 | (stock_y, stock_x) = context.stock_pairs[i] 56 | 57 | Y = prices[stock_y] 58 | X = prices[stock_x] 59 | 60 | try: 61 | hedge = hedge_ratio(Y, X, add_const=True) 62 | except ValueError as e: 63 | log.debug(e) 64 | return 65 | 66 | # context.hedgeRatioTS = np.append(context.hedgeRatioTS, hedge) 67 | 68 | new_spreads[i, :] = Y[-1] - hedge * X[-1] 69 | 70 | if context.spread.shape[1] > context.z_window: 71 | # Keep only the z-score lookback period 72 | spreads = context.spread[i, -context.z_window:] 73 | 74 | zscore = (spreads[-1] - spreads.mean()) / spreads.std() 75 | 76 | if context.inShort[i] and zscore < 0.0: 77 | order_target(stock_y, 0) 78 | order_target(stock_x, 0) 79 | context.inShort[i] = False 80 | context.inLong[i] = False 81 | record(X_pct=0, Y_pct=0) 82 | return 83 | 84 | if context.inLong[i] and zscore > 0.0: 85 | order_target(stock_y, 0) 86 | order_target(stock_x, 0) 87 | context.inShort[i] = False 88 | context.inLong[i] = False 89 | record(X_pct=0, Y_pct=0) 90 | return 91 | 92 | if zscore < -1.0 and (not context.inLong[i]): 93 | # Only trade if NOT already in a trade 94 | y_target_shares = 1 95 | X_target_shares = -hedge 96 | context.inLong[i] = True 97 | context.inShort[i] = False 98 | 99 | (y_target_pct, x_target_pct) = computeHoldingsPct( y_target_shares,X_target_shares, Y[-1], X[-1] ) 100 | order_target_percent( stock_y, y_target_pct * (1.0/context.num_pairs) ) 101 | order_target_percent( stock_x, x_target_pct * (1.0/context.num_pairs) ) 102 | record(Y_pct=y_target_pct, X_pct=x_target_pct) 103 | return 104 | 105 | if zscore > 1.0 and (not context.inShort[i]): 106 | # Only trade if NOT already in a trade 107 | y_target_shares = 
-1 108 | X_target_shares = hedge 109 | context.inShort[i] = True 110 | context.inLong[i] = False 111 | 112 | (y_target_pct, x_target_pct) = computeHoldingsPct( y_target_shares, X_target_shares, Y[-1], X[-1] ) 113 | order_target_percent( stock_y, y_target_pct * (1.0/context.num_pairs)) 114 | order_target_percent( stock_x, x_target_pct * (1.0/context.num_pairs)) 115 | record(Y_pct=y_target_pct, X_pct=x_target_pct) 116 | 117 | context.spread = np.hstack([context.spread, new_spreads]) 118 | 119 | def hedge_ratio(Y, X, add_const=True): 120 | reg.fit(X.reshape(-1,1), Y) 121 | return reg.coef_ 122 | 123 | def computeHoldingsPct(yShares, xShares, yPrice, xPrice): 124 | yDol = yShares * yPrice 125 | xDol = xShares * xPrice 126 | notionalDol = abs(yDol) + abs(xDol) 127 | y_target_pct = yDol / notionalDol 128 | x_target_pct = xDol / notionalDol 129 | return (y_target_pct, x_target_pct) 130 | -------------------------------------------------------------------------------- /Machine Learning For Finance/Regression Based Machine Learning for Algorithmic Trading/Pairs Trading statsmodels Linear Post 2008.py: -------------------------------------------------------------------------------- 1 | ''' 2 | Anthony NG 3 | @ 2017 4 | 5 | ## MIT License 6 | 7 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 8 | 9 | ''' 10 | 11 | import numpy as np 12 | import statsmodels.api as sm 13 | import pandas as pd 14 | from zipline.utils import tradingcalendar 15 | import pytz 16 | 17 | 18 | def initialize(context): 19 | # Quantopian backtester specific variables 20 | set_slippage(slippage.FixedSlippage(spread=0)) 21 | set_commission(commission.PerTrade(cost=1)) 22 | set_symbol_lookup_date('2014-01-01') 23 | 24 | context.stock_pairs = [(sid(5885), sid(4283))] 25 | # set_benchmark(context.y) 26 | 27 | context.num_pairs = len(context.stock_pairs) 28 | # strategy specific variables 29 | context.lookback = 20 # used for regression 30 | context.z_window = 20 # used for zscore calculation, must be <= lookback 31 | 32 | context.spread = np.ndarray((context.num_pairs, 0)) 33 | # context.hedgeRatioTS = np.ndarray((context.num_pairs, 0)) 34 | context.inLong = [False] * context.num_pairs 35 | context.inShort = [False] * context.num_pairs 36 | 37 | # Only do work 30 minutes before close 38 | schedule_function(func=check_pair_status, date_rule=date_rules.every_day(), time_rule=time_rules.market_close(minutes=30)) 39 | 40 | # Will be called on every trade event for the securities you specify. 
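# Sketch of the regression solved in hedge_ratio() at the bottom of this
# file: with add_const=True, statsmodels fits, by ordinary least squares,
#
#     Y_t = beta_0 + beta_1 * X_t + eps_t
#
# and returns the slope beta_1 (model.params[1]). check_pair_status()
# then tracks the spread Y_t - beta_1 * X_t and trades on its z-score
# over the last z_window days.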
41 | def handle_data(context, data): 42 | # Our work is now scheduled in check_pair_status 43 | pass 44 | 45 | def check_pair_status(context, data): 46 | if get_open_orders(): 47 | return 48 | 49 | prices = history(35, '1d', 'price').iloc[-context.lookback::] 50 | 51 | new_spreads = np.ndarray((context.num_pairs, 1)) 52 | 53 | for i in range(context.num_pairs): 54 | 55 | (stock_y, stock_x) = context.stock_pairs[i] 56 | 57 | Y = prices[stock_y] 58 | X = prices[stock_x] 59 | 60 | try: 61 | hedge = hedge_ratio(Y, X, add_const=True) 62 | except ValueError as e: 63 | log.debug(e) 64 | return 65 | 66 | # context.hedgeRatioTS = np.append(context.hedgeRatioTS, hedge) 67 | 68 | new_spreads[i, :] = Y[-1] - hedge * X[-1] 69 | 70 | if context.spread.shape[1] > context.z_window: 71 | # Keep only the z-score lookback period 72 | spreads = context.spread[i, -context.z_window:] 73 | 74 | zscore = (spreads[-1] - spreads.mean()) / spreads.std() 75 | 76 | if context.inShort[i] and zscore < 0.0: 77 | order_target(stock_y, 0) 78 | order_target(stock_x, 0) 79 | context.inShort[i] = False 80 | context.inLong[i] = False 81 | record(X_pct=0, Y_pct=0) 82 | return 83 | 84 | if context.inLong[i] and zscore > 0.0: 85 | order_target(stock_y, 0) 86 | order_target(stock_x, 0) 87 | context.inShort[i] = False 88 | context.inLong[i] = False 89 | record(X_pct=0, Y_pct=0) 90 | return 91 | 92 | if zscore < -1.0 and (not context.inLong[i]): 93 | # Only trade if NOT already in a trade 94 | y_target_shares = 1 95 | X_target_shares = -hedge 96 | context.inLong[i] = True 97 | context.inShort[i] = False 98 | 99 | (y_target_pct, x_target_pct) = computeHoldingsPct( y_target_shares,X_target_shares, Y[-1], X[-1] ) 100 | order_target_percent( stock_y, y_target_pct * (1.0/context.num_pairs) ) 101 | order_target_percent( stock_x, x_target_pct * (1.0/context.num_pairs) ) 102 | record(Y_pct=y_target_pct, X_pct=x_target_pct) 103 | return 104 | 105 | if zscore > 1.0 and (not context.inShort[i]): 106 | # Only trade if NOT already in a trade 107 | y_target_shares = -1 108 | X_target_shares = hedge 109 | context.inShort[i] = True 110 | context.inLong[i] = False 111 | 112 | (y_target_pct, x_target_pct) = computeHoldingsPct( y_target_shares, X_target_shares, Y[-1], X[-1] ) 113 | order_target_percent( stock_y, y_target_pct * (1.0/context.num_pairs)) 114 | order_target_percent( stock_x, x_target_pct * (1.0/context.num_pairs)) 115 | record(Y_pct=y_target_pct, X_pct=x_target_pct) 116 | 117 | context.spread = np.hstack([context.spread, new_spreads]) 118 | 119 | def hedge_ratio(Y, X, add_const=True): 120 | if add_const: 121 | X = sm.add_constant(X) 122 | model = sm.OLS(Y, X).fit() 123 | return model.params[1] 124 | model = sm.OLS(Y, X).fit() 125 | return model.params.values 126 | 127 | def computeHoldingsPct(yShares, xShares, yPrice, xPrice): 128 | yDol = yShares * yPrice 129 | xDol = xShares * xPrice 130 | notionalDol = abs(yDol) + abs(xDol) 131 | y_target_pct = yDol / notionalDol 132 | x_target_pct = xDol / notionalDol 133 | return (y_target_pct, x_target_pct) 134 | -------------------------------------------------------------------------------- /Machine Learning For Finance/Regression Based Machine Learning for Algorithmic Trading/Pairs Trading statsmodels Linear Pre 2008.py: -------------------------------------------------------------------------------- 1 | ''' 2 | Anthony NG 3 | @ 2017 4 | 5 | ## MIT License 6 | 7 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF 
MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 8 | 9 | ''' 10 | 11 | import numpy as np 12 | import statsmodels.api as sm 13 | import pandas as pd 14 | from zipline.utils import tradingcalendar 15 | import pytz 16 | 17 | 18 | def initialize(context): 19 | # Quantopian backtester specific variables 20 | set_slippage(slippage.FixedSlippage(spread=0)) 21 | set_commission(commission.PerTrade(cost=1)) 22 | set_symbol_lookup_date('2014-01-01') 23 | 24 | context.stock_pairs = [(sid(5885), sid(4283))] 25 | # set_benchmark(context.y) 26 | 27 | context.num_pairs = len(context.stock_pairs) 28 | # strategy specific variables 29 | context.lookback = 20 # used for regression 30 | context.z_window = 20 # used for zscore calculation, must be <= lookback 31 | 32 | context.spread = np.ndarray((context.num_pairs, 0)) 33 | # context.hedgeRatioTS = np.ndarray((context.num_pairs, 0)) 34 | context.inLong = [False] * context.num_pairs 35 | context.inShort = [False] * context.num_pairs 36 | 37 | # Only do work 30 minutes before close 38 | schedule_function(func=check_pair_status, date_rule=date_rules.every_day(), time_rule=time_rules.market_close(minutes=30)) 39 | 40 | # Will be called on every trade event for the securities you specify. 41 | def handle_data(context, data): 42 | # Our work is now scheduled in check_pair_status 43 | pass 44 | 45 | def check_pair_status(context, data): 46 | if get_open_orders(): 47 | return 48 | 49 | prices = history(35, '1d', 'price').iloc[-context.lookback::] 50 | 51 | new_spreads = np.ndarray((context.num_pairs, 1)) 52 | 53 | for i in range(context.num_pairs): 54 | 55 | (stock_y, stock_x) = context.stock_pairs[i] 56 | 57 | Y = prices[stock_y] 58 | X = prices[stock_x] 59 | 60 | try: 61 | hedge = hedge_ratio(Y, X, add_const=True) 62 | except ValueError as e: 63 | log.debug(e) 64 | return 65 | 66 | # context.hedgeRatioTS = np.append(context.hedgeRatioTS, hedge) 67 | 68 | new_spreads[i, :] = Y[-1] - hedge * X[-1] 69 | 70 | if context.spread.shape[1] > context.z_window: 71 | # Keep only the z-score lookback period 72 | spreads = context.spread[i, -context.z_window:] 73 | 74 | zscore = (spreads[-1] - spreads.mean()) / spreads.std() 75 | 76 | if context.inShort[i] and zscore < 0.0: 77 | order_target(stock_y, 0) 78 | order_target(stock_x, 0) 79 | context.inShort[i] = False 80 | context.inLong[i] = False 81 | record(X_pct=0, Y_pct=0) 82 | return 83 | 84 | if context.inLong[i] and zscore > 0.0: 85 | order_target(stock_y, 0) 86 | order_target(stock_x, 0) 87 | context.inShort[i] = False 88 | context.inLong[i] = False 89 | record(X_pct=0, Y_pct=0) 90 | return 91 | 92 | if zscore < -1.0 and (not context.inLong[i]): 93 | # Only trade if NOT already in a trade 94 | y_target_shares = 1 95 | X_target_shares = -hedge 96 | context.inLong[i] = True 97 | context.inShort[i] = False 98 | 99 | (y_target_pct, x_target_pct) = computeHoldingsPct( y_target_shares,X_target_shares, Y[-1], X[-1] ) 100 | order_target_percent( stock_y, y_target_pct * (1.0/context.num_pairs) ) 101 | order_target_percent( stock_x, x_target_pct * (1.0/context.num_pairs) ) 102 | record(Y_pct=y_target_pct, X_pct=x_target_pct) 103 | return 104 | 105 | if zscore > 1.0 and (not context.inShort[i]): 106 | # Only trade if NOT already in a trade 107 | 
y_target_shares = -1   108 |                 X_target_shares = hedge   109 |                 context.inShort[i] = True   110 |                 context.inLong[i] = False   111 |    112 |                 (y_target_pct, x_target_pct) = computeHoldingsPct( y_target_shares, X_target_shares, Y[-1], X[-1] )   113 |                 order_target_percent( stock_y, y_target_pct * (1.0/context.num_pairs))   114 |                 order_target_percent( stock_x, x_target_pct * (1.0/context.num_pairs))   115 |                 record(Y_pct=y_target_pct, X_pct=x_target_pct)   116 |    117 |     context.spread = np.hstack([context.spread, new_spreads])   118 |    119 | def hedge_ratio(Y, X, add_const=True):   120 |     if add_const:   121 |         X = sm.add_constant(X)   122 |         model = sm.OLS(Y, X).fit()   123 |         return model.params[1]   124 |     model = sm.OLS(Y, X).fit()   125 |     return model.params.values   126 |    127 | def computeHoldingsPct(yShares, xShares, yPrice, xPrice):   128 |     yDol = yShares * yPrice   129 |     xDol = xShares * xPrice   130 |     notionalDol = abs(yDol) + abs(xDol)   131 |     y_target_pct = yDol / notionalDol   132 |     x_target_pct = xDol / notionalDol   133 |     return (y_target_pct, x_target_pct)   134 | -------------------------------------------------------------------------------- /JP Morgan Python Training/notebooks/8_altman_z_double_prime.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "# Using Financial Data Example #1: Calculating Altman Z\" Score" 8 | ] 9 | }, 10 | { 11 | "cell_type": "markdown", 12 | "metadata": {}, 13 | "source": [ 14 | "Professor Altman first formulated his famous \"Z Score\" in [1968](http://www.defaultrisk.com/_pdf6j4/Financial_Ratios_Discriminant_Anlss_n_Prdctn_o_Crprt_Bnkrptc.pdf) while at NYU. The \"Z Score\" attempts to quantify the likelihood that a company defaults. After several iterations, the Altman Z\" (Double Prime) Score was developed to better quantify a company's credit risk. Professor Altman's presentation [here](http://pages.stern.nyu.edu/~ealtman/3-%20CopCrScoringModels.pdf) walks through several models, including this one. When it was initially developed there were some crude cutoffs for the scores - above 2.6 the firm was \"healthy\", between 1.1 and 2.6 it was in the \"grey area\", and below 1.1 the firm was at risk of bankruptcy. However, over time that crude scale was refined. One of the unique things about the Altman Z\" Score today is that we have a mapping to conventional credit ratings. Below we walk through the example of calculating the score from scratch using [IEX data](https://iextrading.com/developer/docs/)."
15 | ] 16 | }, 17 | { 18 | "cell_type": "markdown", 19 | "metadata": {}, 20 | "source": [ 21 | "$$ Z” = 6.56x_1 +3.26x_2 + 6.72x_3 + 1.05x_4 $$\n", 22 | "$$ \\textrm{Where:} $$\n", 23 | "$$ x_1 = \\textrm{Working Capital / Total Assets} $$\n", 24 | "$$ x_2 = \\textrm{Retained Earnings / Total Assets} $$\n", 25 | "$$ x_3 = \\textrm{EBIT / Total Assets} $$\n", 26 | "$$ x_4 = \\textrm{Market Value of Equity / Total Liabilities} $$" 27 | ] 28 | }, 29 | { 30 | "cell_type": "code", 31 | "execution_count": null, 32 | "metadata": {}, 33 | "outputs": [], 34 | "source": [ 35 | "import pyEX as p\n", 36 | "c = p.Client(api_token=\"pk_353fe2ce67cd4c16b30a748ff783c865\", version=\"v1\")" 37 | ] 38 | }, 39 | { 40 | "cell_type": "code", 41 | "execution_count": null, 42 | "metadata": {}, 43 | "outputs": [], 44 | "source": [ 45 | "ticker = \"aapl\"\n", 46 | "incomeStatement = c.incomeStatementDF(ticker)\n", 47 | "balanceSheet = c.balanceSheetDF(ticker)\n", 48 | "cfStatement = c.cashFlowDF(ticker)\n", 49 | "stats = c.keyStats(ticker)" 50 | ] 51 | }, 52 | { 53 | "cell_type": "code", 54 | "execution_count": null, 55 | "metadata": {}, 56 | "outputs": [], 57 | "source": [ 58 | "x1 = ( balanceSheet[\"currentAssets\"][0] - balanceSheet[\"totalCurrentLiabilities\"][0] ) / balanceSheet[\"totalAssets\"][0]" 59 | ] 60 | }, 61 | { 62 | "cell_type": "code", 63 | "execution_count": null, 64 | "metadata": {}, 65 | "outputs": [], 66 | "source": [ 67 | "x2 = balanceSheet[\"retainedEarnings\"][0] / balanceSheet[\"totalAssets\"][0]" 68 | ] 69 | }, 70 | { 71 | "cell_type": "code", 72 | "execution_count": null, 73 | "metadata": {}, 74 | "outputs": [], 75 | "source": [ 76 | "x3 = incomeStatement[\"ebit\"][0] / balanceSheet[\"totalAssets\"][0]" 77 | ] 78 | }, 79 | { 80 | "cell_type": "code", 81 | "execution_count": null, 82 | "metadata": {}, 83 | "outputs": [], 84 | "source": [ 85 | "x4 = stats[\"marketcap\"] / balanceSheet[\"totalLiabilities\"][0]" 86 | ] 87 | }, 88 | { 89 | "cell_type": "code", 90 | "execution_count": null, 91 | "metadata": {}, 92 | "outputs": [], 93 | "source": [ 94 | "6.56 * x1 + 3.26 * x2 + 6.72 * x3 + 1.05 * x4" 95 | ] 96 | }, 97 | { 98 | "cell_type": "code", 99 | "execution_count": null, 100 | "metadata": {}, 101 | "outputs": [], 102 | "source": [ 103 | "def altmanZDoublePrime( ticker ):\n", 104 | " '''\n", 105 | " Calculate the Altman Z\" Score for a given ticker\n", 106 | " \n", 107 | " ticker = string, user input for which to calculate the Z-score. 
Not case sensitive.\n", 108 | " '''\n", 109 | " incomeStatement = c.incomeStatementDF(ticker)\n", 110 | " balanceSheet = c.balanceSheetDF(ticker)\n", 111 | " cfStatement = c.cashFlowDF(ticker)\n", 112 | " stats = c.keyStats(ticker)\n", 113 | " x1 = ( balanceSheet[\"currentAssets\"][0] - balanceSheet[\"totalCurrentLiabilities\"][0] ) / balanceSheet[\"totalAssets\"][0]\n", 114 | " x2 = balanceSheet[\"retainedEarnings\"][0] / balanceSheet[\"totalAssets\"][0]\n", 115 | " x3 = incomeStatement[\"ebit\"][0] / balanceSheet[\"totalAssets\"][0]\n", 116 | " x4 = stats[\"marketcap\"] / balanceSheet[\"totalLiabilities\"][0]\n", 117 | " return 6.56 * x1 + 3.26 * x2 + 6.72 * x3 + 1.05 * x4" 118 | ] 119 | }, 120 | { 121 | "cell_type": "code", 122 | "execution_count": null, 123 | "metadata": {}, 124 | "outputs": [], 125 | "source": [ 126 | "altmanZDoublePrime(\"aapl\")" 127 | ] 128 | }, 129 | { 130 | "cell_type": "code", 131 | "execution_count": null, 132 | "metadata": {}, 133 | "outputs": [], 134 | "source": [ 135 | "import numpy as np\n", 136 | "\n", 137 | "def altmanZDPImpliedRating( ticker ):\n", 138 | " '''\n", 139 | " Calculate the implied credit rating from a company's Altman Z\" Score \n", 140 | " \n", 141 | " ticker = string, user input for which to calculate the Z-score. Not case sensitive.\n", 142 | " '''\n", 143 | " adjZScore = 3.25 + altmanZDoublePrime( ticker )\n", 144 | " zMap = [ 8.15, 7.6, 7.3, 7., 6.85, 6.65, 6.4, 6.25, 5.85, 5.65, 5.25, 4.95, 4.75, 4.5, 4.15, 3.75, 3.2, 2.5, 1.75 ]\n", 145 | " scores = [ \"AAA\", \"AA+\", \"AA\", \"AA-\", \"A+\", \"A\", \"A-\", \"BBB+\", \"BBB\", \"BBB-\", \"BB+\", \"BB\", \"BB-\", \"B+\", \"B\", \"B-\", \"CCC+\", \"CCC\", \"CCC-\", \"D\" ] \n", 146 | " return scores[ zMap.index( np.array( zMap )[ np.array( zMap ) < adjZScore ].max() ) ]" 147 | ] 148 | }, 149 | { 150 | "cell_type": "code", 151 | "execution_count": null, 152 | "metadata": {}, 153 | "outputs": [], 154 | "source": [ 155 | "altmanZDPImpliedRating(\"aapl\")" 156 | ] 157 | }, 158 | { 159 | "cell_type": "code", 160 | "execution_count": null, 161 | "metadata": {}, 162 | "outputs": [], 163 | "source": [] 164 | } 165 | ], 166 | "metadata": { 167 | "kernelspec": { 168 | "display_name": "Python 3", 169 | "language": "python", 170 | "name": "python3" 171 | }, 172 | "language_info": { 173 | "codemirror_mode": { 174 | "name": "ipython", 175 | "version": 3 176 | }, 177 | "file_extension": ".py", 178 | "mimetype": "text/x-python", 179 | "name": "python", 180 | "nbconvert_exporter": "python", 181 | "pygments_lexer": "ipython3", 182 | "version": "3.7.3" 183 | } 184 | }, 185 | "nbformat": 4, 186 | "nbformat_minor": 2 187 | } 188 | -------------------------------------------------------------------------------- /JP Morgan Python Training/notebooks/3_flights.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "# Example #2 - Working with files" 8 | ] 9 | }, 10 | { 11 | "cell_type": "markdown", 12 | "metadata": {}, 13 | "source": [ 14 | "Let's make sure our data files are there. The `os` library will be very helpful." 
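As a small aside that is not in the original notebook: the standard-library `pathlib` module can do the same directory checks in a more object-oriented style. A minimal sketch, assuming the notebook's `../data` layout:

```python
from pathlib import Path

# pathlib equivalent of the listdir/join calls used in this notebook.
# The '../data' layout is the notebook's assumption; adjust as needed.
data_dir = Path('..') / 'data'

print([p.name for p in Path('.').iterdir()])
if data_dir.exists():
    print(sorted(p.name for p in data_dir.glob('*.csv')))
```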
15 | ] 16 | }, 17 | { 18 | "cell_type": "code", 19 | "execution_count": null, 20 | "metadata": {}, 21 | "outputs": [], 22 | "source": [ 23 | "from os import listdir\n", 24 | "from os.path import join\n", 25 | "\n", 26 | "listdir('.')" 27 | ] 28 | }, 29 | { 30 | "cell_type": "code", 31 | "execution_count": null, 32 | "metadata": {}, 33 | "outputs": [], 34 | "source": [ 35 | "listdir('../data')" 36 | ] 37 | }, 38 | { 39 | "cell_type": "markdown", 40 | "metadata": {}, 41 | "source": [ 42 | "We will rely on `pandas`'s built-in `read_csv` function. " 43 | ] 44 | }, 45 | { 46 | "cell_type": "code", 47 | "execution_count": null, 48 | "metadata": {}, 49 | "outputs": [], 50 | "source": [ 51 | "import pandas as pd\n", 52 | "pd.read_csv?" 53 | ] 54 | }, 55 | { 56 | "cell_type": "markdown", 57 | "metadata": {}, 58 | "source": [ 59 | "What other \"read\" functions does `pandas` have?" 60 | ] 61 | }, 62 | { 63 | "cell_type": "code", 64 | "execution_count": null, 65 | "metadata": {}, 66 | "outputs": [], 67 | "source": [ 68 | "import re\n", 69 | "regex = re.compile(r'read')\n", 70 | "list(filter(regex.match, dir(pd)))" 71 | ] 72 | }, 73 | { 74 | "cell_type": "markdown", 75 | "metadata": {}, 76 | "source": [ 77 | "Let's focus on `read_csv` for now, and investigate some data from [openflights.org](https://openflights.org)" 78 | ] 79 | }, 80 | { 81 | "cell_type": "code", 82 | "execution_count": null, 83 | "metadata": {}, 84 | "outputs": [], 85 | "source": [ 86 | "routes = pd.read_csv(join('..', 'data', 'routes.csv'))\n", 87 | "airports = pd.read_csv(join('..', 'data', 'airports.csv'))" 88 | ] 89 | }, 90 | { 91 | "cell_type": "code", 92 | "execution_count": null, 93 | "metadata": {}, 94 | "outputs": [], 95 | "source": [ 96 | "routes.head()" 97 | ] 98 | }, 99 | { 100 | "cell_type": "code", 101 | "execution_count": null, 102 | "metadata": {}, 103 | "outputs": [], 104 | "source": [ 105 | "routes.info()" 106 | ] 107 | }, 108 | { 109 | "cell_type": "markdown", 110 | "metadata": {}, 111 | "source": [ 112 | "Looks like there are some missing data points...can we visualize what that looks like?" 113 | ] 114 | }, 115 | { 116 | "cell_type": "code", 117 | "execution_count": null, 118 | "metadata": {}, 119 | "outputs": [], 120 | "source": [ 121 | "import missingno as msno" 122 | ] 123 | }, 124 | { 125 | "cell_type": "markdown", 126 | "metadata": {}, 127 | "source": [ 128 | "`missingno` is a library that wraps together handy data utilities to get a visual summary of data completeness. We will only scratch the surface of this library."
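If you just want the raw numbers behind that picture, plain `pandas` can tabulate missingness without any extra library. A small sketch on stand-in data (in the notebook you would call this on `routes`):

```python
import pandas as pd

# Stand-in frame; the column names only loosely mimic the routes data.
df = pd.DataFrame({
    "airline": ["2B", "2B", None, "AA"],
    "codeshare": [None, None, "Y", None],
    "stops": [0, 0, 0, None],
})

print(df.isna().sum())            # missing count per column
print(df.isna().mean().round(2))  # fraction missing per column
```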
We've got ~68000 airline routes and ~7000 airports around the world. Let's see if this list is real by trying to find everyone's favorite airport - LaGuardia (LGA)." 171 | ] 172 | }, 173 | { 174 | "cell_type": "code", 175 | "execution_count": null, 176 | "metadata": {}, 177 | "outputs": [], 178 | "source": [ 179 | "airports[airports['IATA'] == 'LGA']" 180 | ] 181 | }, 182 | { 183 | "cell_type": "markdown", 184 | "metadata": {}, 185 | "source": [ 186 | "Awesome. How about finding some other things like the highest airport in the world, highest airport in the country that comes last alphabetically, and the highest airport in Australia." 187 | ] 188 | }, 189 | { 190 | "cell_type": "code", 191 | "execution_count": null, 192 | "metadata": {}, 193 | "outputs": [], 194 | "source": [ 195 | "highest = airports.sort_values('altitude', ascending=False).iloc[0]\n", 196 | "highestInLastAlpha = airports.sort_values(['country', 'altitude'], ascending=[False, False]).iloc[0]\n", 197 | "highestInAustralia = airports[airports['country'] == 'Australia'].sort_values('altitude', ascending=False).iloc[0]" 198 | ] 199 | }, 200 | { 201 | "cell_type": "code", 202 | "execution_count": null, 203 | "metadata": {}, 204 | "outputs": [], 205 | "source": [ 206 | "print(f'Highest airport: \\n{highest}\\n\\n')\n", 207 | "print(f'Highest airport in the country that comes last alphabetically: \\n{highestInLastAlpha}\\n\\n')\n", 208 | "print(f'Highest airport in Australia: \\n{highestInAustralia}\\n\\n')" 209 | ] 210 | }, 211 | { 212 | "cell_type": "markdown", 213 | "metadata": {}, 214 | "source": [ 215 | "Let's do some more practical things with this data like finding the 10 busiest routes in the world." 216 | ] 217 | }, 218 | { 219 | "cell_type": "code", 220 | "execution_count": null, 221 | "metadata": {}, 222 | "outputs": [], 223 | "source": [ 224 | "busiest = routes.groupby(['source', 'dest']).count()['airline_id'].nlargest(10)\n", 225 | "busiest" 226 | ] 227 | }, 228 | { 229 | "cell_type": "markdown", 230 | "metadata": {}, 231 | "source": [ 232 | "We can merge in the names of the airports." 
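Before running the real merge below, a toy version of the same lookup-join pattern may help; all names and codes here are invented:

```python
import pandas as pd

# Toy version of the lookup-merge used below: attach airport names to
# route endpoints via a left join on the IATA code (data invented).
routes = pd.DataFrame({"source": ["JFK", "LGA"], "dest": ["LAX", "ORD"]})
airports = pd.DataFrame({"IATA": ["JFK", "LGA", "LAX", "ORD"],
                         "name": ["Kennedy", "LaGuardia", "Los Angeles", "O'Hare"]})

for sd in ["source", "dest"]:
    routes = routes.merge(airports.set_index("IATA"), how="left",
                          left_on=sd, right_index=True)

# pandas auto-suffixes the colliding "name" columns on the second pass:
# name_x is the source airport, name_y the destination.
print(routes)
```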
233 | ] 234 | }, 235 | { 236 | "cell_type": "code", 237 | "execution_count": null, 238 | "metadata": {}, 239 | "outputs": [], 240 | "source": [ 241 | "busiest = pd.DataFrame(busiest).reset_index()\n", 242 | "busiest.head()" 243 | ] 244 | }, 245 | { 246 | "cell_type": "code", 247 | "execution_count": null, 248 | "metadata": {}, 249 | "outputs": [], 250 | "source": [ 251 | "for sd in ['source', 'dest']:\n", 252 | " busiest = busiest.merge(airports[['IATA', 'name']].set_index('IATA'), how='left', left_on=sd, right_on='IATA')\n", 253 | "\n", 254 | "busiest" 255 | ] 256 | }, 257 | { 258 | "cell_type": "code", 259 | "execution_count": null, 260 | "metadata": {}, 261 | "outputs": [], 262 | "source": [] 263 | } 264 | ], 265 | "metadata": { 266 | "kernelspec": { 267 | "display_name": "Python 3", 268 | "language": "python", 269 | "name": "python3" 270 | }, 271 | "language_info": { 272 | "codemirror_mode": { 273 | "name": "ipython", 274 | "version": 3 275 | }, 276 | "file_extension": ".py", 277 | "mimetype": "text/x-python", 278 | "name": "python", 279 | "nbconvert_exporter": "python", 280 | "pygments_lexer": "ipython3", 281 | "version": "3.7.3" 282 | } 283 | }, 284 | "nbformat": 4, 285 | "nbformat_minor": 4 286 | } 287 | -------------------------------------------------------------------------------- /Machine Learning For Finance/Regression Based Machine Learning for Algorithmic Trading/Pairs Trading with Kalman Filters.py: -------------------------------------------------------------------------------- 1 | """ 2 | Pairs Trading with Kalman Filters 3 | 4 | Credit: Quantopian 5 | """ 6 | 7 | 8 | import numpy as np 9 | import pandas as pd 10 | from pykalman import KalmanFilter 11 | import statsmodels.api as sm 12 | 13 | 14 | def initialize(context): 15 | # Quantopian backtester specific variables 16 | set_slippage(slippage.FixedSlippage(spread=0)) 17 | set_commission(commission.PerShare(cost=0.01, min_trade_cost=1.0)) 18 | 19 | context.pairs = [ 20 | KalmanPairTrade(sid(5885), sid(4283), 21 | initial_bars=300, freq='1m', delta=1e-3, maxlen=300), 22 | ] 23 | 24 | context.security_list = [sid(5885), sid(4283)] 25 | 26 | weight = 1.8 / len(context.pairs) 27 | for pair in context.pairs: 28 | pair.leverage = weight 29 | 30 | for minute in range(10, 390, 90): 31 | for pair in context.pairs: 32 | schedule_function(pair.trading_logic, 33 | time_rule=time_rules.market_open(minutes=minute)) 34 | 35 | 36 | class KalmanPairTrade(object): 37 | 38 | def __init__(self, y, x, leverage=1.0, initial_bars=10, 39 | freq='1d', delta=1e-3, maxlen=3000): 40 | self._y = y 41 | self._x = x 42 | self.maxlen = maxlen 43 | self.initial_bars = initial_bars 44 | self.freq = freq 45 | self.delta = delta 46 | self.leverage = leverage 47 | self.Y = KalmanMovingAverage(self._y, maxlen=self.maxlen) 48 | self.X = KalmanMovingAverage(self._x, maxlen=self.maxlen) 49 | self.kf = None 50 | self.entry_dt = pd.Timestamp('1900-01-01', tz='utc') 51 | 52 | @property 53 | def name(self): 54 | return "{}~{}".format(self._y.symbol, self._x.symbol) 55 | 56 | def trading_logic(self, context, data): 57 | try: 58 | if self.kf is None: 59 | self.initialize_filters(context, data) 60 | return 61 | self.update(context, data) 62 | if get_open_orders(self._x) or get_open_orders(self._y): 63 | return 64 | spreads = self.mean_spread() 65 | 66 | zscore = (spreads[-1] - spreads.mean()) / spreads.std() 67 | 68 | reference_pos = context.portfolio.positions[self._y].amount 69 | 70 | now = get_datetime() 71 | if reference_pos: 72 | if (now - self.entry_dt).days > 
20: 73 | order_target(self._y, 0.0) 74 | order_target(self._x, 0.0) 75 | return 76 | # Do a PNL check to make sure a reversion at least covered trading costs 77 | # I do this because parameter drift often causes trades to be exited 78 | # before the original spread has become profitable. 79 | pnl = self.get_pnl(context, data) 80 | if zscore > -0.0 and reference_pos > 0 and pnl > 0: 81 | order_target(self._y, 0.0) 82 | order_target(self._x, 0.0) 83 | 84 | elif zscore < 0.0 and reference_pos < 0 and pnl > 0: 85 | order_target(self._y, 0.0) 86 | order_target(self._x, 0.0) 87 | 88 | else: 89 | if zscore > 1.5: 90 | order_target_percent(self._y, -self.leverage / 2.) 91 | order_target_percent(self._x, self.leverage / 2.) 92 | self.entry_dt = now 93 | if zscore < -1.5: 94 | order_target_percent(self._y, self.leverage / 2.) 95 | order_target_percent(self._x, -self.leverage / 2.) 96 | self.entry_dt = now 97 | except Exception as e: 98 | log.debug("[{}] {}".format(self.name, str(e))) 99 | 100 | def update(self, context, data): 101 | prices = np.log(data.history(context.security_list, 'price', 1, '1m')) 102 | self.X.update(prices) 103 | self.Y.update(prices) 104 | self.kf.update(self.means_frame().iloc[-1]) 105 | 106 | 107 | def mean_spread(self): 108 | means = self.means_frame() 109 | beta, alpha = self.kf.state_mean 110 | return means[self._y] - (beta * means[self._x] + alpha) 111 | 112 | 113 | def means_frame(self): 114 | mu_Y = self.Y.state_means 115 | mu_X = self.X.state_means 116 | return pd.DataFrame([mu_Y, mu_X]).T 117 | 118 | 119 | def initialize_filters(self, context, data): 120 | prices = np.log(data.history(context.security_list, 'price', self.initial_bars, self.freq)) 121 | self.X.update(prices) 122 | self.Y.update(prices) 123 | 124 | # Drops the initial 0 mean value from the kalman filter 125 | self.X.state_means = self.X.state_means.iloc[-self.initial_bars:] 126 | self.Y.state_means = self.Y.state_means.iloc[-self.initial_bars:] 127 | self.kf = KalmanRegression(self.Y.state_means, self.X.state_means, 128 | delta=self.delta, maxlen=self.maxlen) 129 | 130 | def get_pnl(self, context, data): 131 | x = self._x 132 | y = self._y 133 | prices = data.history(context.security_list, 'price', 1, '1d').iloc[-1] 134 | positions = context.portfolio.positions 135 | dx = prices[x] - positions[x].cost_basis 136 | dy = prices[y] - positions[y].cost_basis 137 | return (positions[x].amount * dx + 138 | positions[y].amount * dy) 139 | 140 | 141 | 142 | def handle_data(context, data): 143 | record(market_exposure=context.account.net_leverage, 144 | leverage=context.account.leverage) 145 | 146 | 147 | class KalmanMovingAverage(object): 148 | """ 149 | Estimates the moving average of a price process 150 | via Kalman Filtering. 151 | 152 | See http://pykalman.github.io/ for docs on the 153 | filtering process. 
154 | """ 155 | 156 | def __init__(self, asset, observation_covariance=1.0, initial_value=0, 157 | initial_state_covariance=1.0, transition_covariance=0.05, 158 | initial_window=20, maxlen=3000, freq='1d'): 159 | 160 | self.asset = asset 161 | self.freq = freq 162 | self.initial_window = initial_window 163 | self.maxlen = maxlen 164 | self.kf = KalmanFilter(transition_matrices=[1], 165 | observation_matrices=[1], 166 | initial_state_mean=initial_value, 167 | initial_state_covariance=initial_state_covariance, 168 | observation_covariance=observation_covariance, 169 | transition_covariance=transition_covariance) 170 | self.state_means = pd.Series([self.kf.initial_state_mean], name=self.asset) 171 | self.state_vars = pd.Series([self.kf.initial_state_covariance], name=self.asset) 172 | 173 | 174 | def update(self, observations): 175 | for dt, observation in observations[self.asset].iteritems(): 176 | self._update(dt, observation) 177 | 178 | def _update(self, dt, observation): 179 | mu, cov = self.kf.filter_update(self.state_means.iloc[-1], 180 | self.state_vars.iloc[-1], 181 | observation) 182 | self.state_means[dt] = mu.flatten()[0] 183 | self.state_vars[dt] = cov.flatten()[0] 184 | if self.state_means.shape[0] > self.maxlen: 185 | self.state_means = self.state_means.iloc[-self.maxlen:] 186 | if self.state_vars.shape[0] > self.maxlen: 187 | self.state_vars = self.state_vars.iloc[-self.maxlen:] 188 | 189 | 190 | class KalmanRegression(object): 191 | """ 192 | Uses a Kalman Filter to estimate regression parameters 193 | in an online fashion. 194 | 195 | Estimated model: y ~ beta * x + alpha 196 | """ 197 | 198 | def __init__(self, initial_y, initial_x, delta=1e-5, maxlen=3000): 199 | self._x = initial_x.name 200 | self._y = initial_y.name 201 | self.maxlen = maxlen 202 | trans_cov = delta / (1 - delta) * np.eye(2) 203 | obs_mat = np.expand_dims( 204 | np.vstack([[initial_x], [np.ones(initial_x.shape[0])]]).T, axis=1) 205 | 206 | self.kf = KalmanFilter(n_dim_obs=1, n_dim_state=2, 207 | initial_state_mean=np.zeros(2), 208 | initial_state_covariance=np.ones((2, 2)), 209 | transition_matrices=np.eye(2), 210 | observation_matrices=obs_mat, 211 | observation_covariance=1.0, 212 | transition_covariance=trans_cov) 213 | state_means, state_covs = self.kf.filter(initial_y.values) 214 | self.means = pd.DataFrame(state_means, 215 | index=initial_y.index, 216 | columns=['beta', 'alpha']) 217 | self.state_cov = state_covs[-1] 218 | 219 | def update(self, observations): 220 | x = observations[self._x] 221 | y = observations[self._y] 222 | mu, self.state_cov = self.kf.filter_update( 223 | self.state_mean, self.state_cov, y, 224 | observation_matrix=np.array([[x, 1.0]])) 225 | mu = pd.Series(mu, index=['beta', 'alpha'], 226 | name=observations.name) 227 | self.means = self.means.append(mu) 228 | if self.means.shape[0] > self.maxlen: 229 | self.means = self.means.iloc[-self.maxlen:] 230 | 231 | def get_spread(self, observations): 232 | x = observations[self._x] 233 | y = observations[self._y] 234 | return y - (self.means.beta[-1] * x + self.means.alpha[-1]) 235 | 236 | @property 237 | def state_mean(self): 238 | return self.means.iloc[-1] 239 | 240 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | [←←Back to Homepage](https://monitsharma.github.io/) 2 | 3 | 4 | # Quantum Finance and Numerical Methods 5 | 6 | 7 | Welcome to the **[Quantum Finance and Numerical 
Methods](https://github.com/MonitSharma/Quantum-Finance-and-Numerical-Methods)** repository! This repository contains a variety of materials related to the intersection of **finance**, **economics**, and **quantum computing**. The focus of these materials is on the application of quantum techniques to financial and economic analysis, with an emphasis on topics such as **financial modeling, data visualization**, and **economic forecasting**. 8 | 9 | ![](https://previews.123rf.com/images/radiantskies/radiantskies1212/radiantskies121203577/17020484-abstract-word-cloud-for-quantum-finance-with-related-tags-and-terms.jpg) 10 | ---- 11 | 12 | These materials are intended for readers with a background in finance and economics, as well as some familiarity with programming and data analysis. They are primarily designed for educational and research purposes, but may also be useful for practitioners looking to expand their skills in this emerging field. 13 | 14 | ![](https://previews.123rf.com/images/radiantskies/radiantskies1211/radiantskies121100688/16446015-abstract-word-cloud-for-derivative-with-related-tags-and-terms.jpg) 15 | 16 | The materials in this repository are well-documented and tested, and should be relatively easy to use. However, some familiarity with programming languages such as Python is assumed. A list of dependencies and required software packages is included in the README file. 17 | 18 | If you have any questions or encounter any issues while using these materials, please don't hesitate to open an issue on the repository or contact the maintainers directly. We are always happy to help! 19 | 20 | For more information, check out the Wiki pages or browse through the individual files and directories in the repository. 21 | 22 | 23 | ## Contents 24 | 25 | To get started with **quantum computing in finance**, it is important to first have a strong foundation in **classical computational finance**. This includes concepts such as *financial modeling, data analysis*, and *programming*, as well as a deep understanding of *financial and economic principles* . By starting with classical computational finance, you will be better equipped to understand the unique challenges and opportunities that quantum computing presents in the finance industry, and you will have the skills and knowledge needed to effectively apply quantum techniques to financial problems. 26 | 27 | There are many resources available for learning classical computational finance, including online courses, textbooks, and open source repositories. Some specific topics you may want to focus on include financial modeling with tools such as Python and R, data visualization and analysis, and economic forecasting. By building a strong foundation in these areas, you will be well-prepared to explore the exciting possibilities of quantum computing in finance. 28 | 29 | ![](https://previews.123rf.com/images/radiantskies/radiantskies1211/radiantskies121101250/16498372-abstract-word-cloud-for-financial-instrument-with-related-tags-and-terms.jpg) 30 | 31 | ------- 32 | 33 | 34 | ### [Introduction to Finance with Python](https://github.com/MonitSharma/Quantum-Finance-and-Numerical-Methods/tree/main/Introduction%20to%20Finance%20with%20Python) : 35 | 36 | 37 | 38 | 39 | Welcome to the Introduction to Finance with Python course! This course is designed for students and professionals with little or no background in finance, but who have a strong foundation in programming and data analysis. 
Through a combination of lectures, exercises, and projects, this course will introduce you to the basics of financial concepts and provide you with the skills needed to apply these concepts using Python. 40 | 41 | ![](https://yodalearning.com/wp-content/uploads/Python-for-Finance-Course.png) 42 | 43 | Throughout the course, you will learn how to use Python to solve financial problems such as calculating return on investment, analyzing stock prices, and building financial models. You will also have the opportunity to work with real-world financial data and gain hands-on experience with tools such as Pandas, NumPy, and Matplotlib. 44 | 45 | By the end of the course, you will have a solid understanding of financial concepts and the ability to use Python to apply these concepts in practice. Whether you are looking to start a career in finance or simply want to learn more about financial analysis, this course is the perfect place to start. 46 | 47 | 48 | ------- 49 | 50 | 51 | ### [Machine Learning for Finance](https://github.com/MonitSharma/Quantum-Finance-and-Numerical-Methods/tree/main/Machine%20Learning%20For%20Finance) 52 | 53 | 54 | Are you interested in using machine learning to solve complex financial problems? Look no further than the Machine Learning for Finance course! This course is designed for students and professionals with a background in finance and an interest in using cutting-edge machine learning techniques to analyze and make predictions about financial data. 55 | 56 | ![](https://monkeylearn.com/static/f3e1fac5db48d225d0be591917f13e93/Machine-Learning-in-Finance-Social@2x.png) 57 | 58 | This course will teach you how to use machine learning algorithms for tasks such as classification, regression, and clustering to build predictive models for financial applications. You will also learn how to apply these techniques to real-world financial data using tools such as Python and TensorFlow. 59 | 60 | In addition to covering the basics of machine learning, this course will delve into more advanced topics such as natural language processing and deep learning, and how they can be applied to financial data. By the end of the course, you will have a deep understanding of the potential of machine learning in finance and the skills to apply these techniques to your own projects. 61 | 62 | Whether you are a financial analyst, data scientist, or simply curious about the intersection of machine learning and finance, this course has something for you. 63 | 64 | 65 | --- 66 | 67 | 68 | ### [Foundations of Computational Economics](https://github.com/MonitSharma/Quantum-Finance-and-Numerical-Methods/blob/main/Computational%20Economics/README.md) 69 | 70 | 71 | The **Foundation of Computational Economics** course is the perfect starting point for students and professionals who want to learn how to use computational techniques to analyze and solve economic problems. With a focus on the basics, this course will introduce you to the fundamental concepts and tools used in computational economics, including programming languages such as Python and R, and techniques such as simulation and optimization. 72 | 73 | 74 | 75 | Whether you are new to economics or simply want to learn more about the intersection of computation and economics, this course is for you. With a combination of lectures, exercises, and projects, you will gain a strong foundation in computational economics and be well-prepared to move on to more advanced topics in the field. 
76 | 77 | 78 | 79 | [![](https://fedor.iskh.me/assets/img/github_logo.png)](https://github.com/fediskhakov/CompEcon) 80 | 81 | ### [Open Course Repository](https://github.com/fediskhakov/CompEcon) 82 | 83 | 84 | [![](https://fedor.iskh.me/assets/img/binder_logo.png)](https://mybinder.org/v2/gh/fediskhakov/CompEcon/HEAD) 85 | 86 | ### [Launch Course notebooks on Binder](https://mybinder.org/v2/gh/fediskhakov/CompEcon/HEAD) 87 | 88 | [![](https://fedor.iskh.me/assets/img/youtube_logo.png)](https://www.youtube.com/playlist?list=PLWs2dSJolpWKBCBPVymYBm0BM8VPksmLv) 89 | 90 | ### [All course videos on one page](https://www.youtube.com/playlist?list=PLWs2dSJolpWKBCBPVymYBm0BM8VPksmLv) 91 | 92 | 93 | --------- 94 | 95 | ## [JP Morgan Python training](https://github.com/MonitSharma/Quantum-Finance-and-Numerical-Methods/tree/main/JP%20Morgan%20Python%20Training) 96 | 97 | ![](https://camo.githubusercontent.com/23d2429ae3232063b0d95bf9f06b13ad77656cd0b7ce3e0370c6ba234e64caf5/68747470733a2f2f75706c6f61642e77696b696d656469612e6f72672f77696b6970656469612f636f6d6d6f6e732f7468756d622f612f61662f4a5f505f4d6f7267616e5f4c6f676f5f323030385f312e7376672f3132383070782d4a5f505f4d6f7267616e5f4c6f676f5f323030385f312e7376672e706e67) 98 | 99 | 100 | This Python training is for JPMorgan business analysts and traders, as well as select clients. This course is designed to be an introduction to numerical computing and data visualization in Python. It is not designed to be a complete course in Computer Science or programming, but rather a motivational demonstration of how relatively complex topics can be accessible even to those without formal programming backgrounds. 101 | 102 | 103 | --------- 104 | 105 | ## [Classical Use Cases](https://github.com/MonitSharma/Quantum-Finance-and-Numerical-Methods/tree/main/Classical%20Use%20Cases) 106 | 107 | This folder contains a collection of case studies demonstrating the application of Python programming and numerical methods to solve real-world problems in finance and supply chain management. 108 | 109 | 110 | To make the most of these case studies, you should have a strong foundation in Python programming and numerical methods. Familiarity with concepts such as linear algebra, optimization, and differential equations will be helpful. 111 | 112 | 113 | --------- 114 | 115 | ## [Quantum Finance](https://github.com/MonitSharma/Quantum-Finance-and-Numerical-Methods/tree/main/Quantum%20Finance) 116 | 117 | Welcome to the Quantum Finance repository! This repository contains a collection of materials related to the intersection of quantum computing and finance. 118 | 119 | ![](https://platform.qureca.com/wp-content/uploads/2021/05/Quantum-Computing-For-Finance-1.png) 120 | 121 | Inside, you will find resources such as lecture notes, exercises, and projects that cover topics such as quantum algorithms for portfolio optimization, supply chain and option call price prediction, and the applications of quantum computing to financial risk analysis. 122 | 123 | Whether you are a finance professional, quantum computing enthusiast, or simply interested in the potential of this emerging field, this repository has something for you. With a focus on hands-on learning, you will have the opportunity to gain practical experience with tools such as Qiskit and TensorFlow Quantum, and apply these techniques to real-world financial data. 124 | 125 | I hope that these materials will provide you with a strong foundation in quantum finance and inspire you to continue learning and exploring this exciting field.
Thank you for visiting, and happy learning! 126 | 127 | --------- 128 | 129 | ## [Quantum Machine Learning in Finance](https://github.com/MonitSharma/Quantum-Finance-and-Numerical-Methods/tree/main/Quantum%20Machine%20Learning%20in%20Finance) 130 | 131 | The intersection of quantum computing, machine learning, and finance is a rapidly growing field with exciting potential for innovation and discovery. This repository is dedicated to providing materials and resources for learning about quantum machine learning in finance, including lecture notes, exercises, and projects. 132 | 133 | Inside, you will find a wide range of topics, from the basics of quantum computing and machine learning, to more advanced concepts such as quantum algorithms for financial prediction and quantum-enhanced feature extraction for trading strategies. With a focus on hands-on learning, you will have the opportunity to gain practical experience with tools such as Qiskit and TensorFlow Quantum, and apply these techniques to real-world financial data. 134 | 135 | Whether you are a finance professional looking to learn about the potential of quantum machine learning, or a machine learning practitioner interested in applying your skills to the finance industry, this repository has something for you. I hope that these materials will provide you with a strong foundation in quantum machine learning in finance and inspire you to continue learning and exploring this exciting field. Thank you for visiting, and happy learning! 136 | 137 | 138 | 139 | ------ 140 | 141 | ## [Quantum Use Cases](https://github.com/MonitSharma/Quantum-Finance-and-Numerical-Methods/tree/main/Quantum%20Use%20Cases) 142 | 143 | Welcome to the Real-World Use Cases in Finance Using Quantum Computing folder! This folder contains a collection of case studies demonstrating the application of quantum computing to solve real-world problems in finance. 144 | 145 | Inside, you will find a variety of case studies covering topics such as quantum algorithms for portfolio optimization, quantum machine learning for financial prediction, and the applications of quantum computing to financial risk analysis. Each case study includes detailed explanations and code examples to help you understand and replicate the results.
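Since these case studies lean on tools like Qiskit, a quick install smoke test can save time before diving in. The two-qubit Bell circuit below is my suggestion, not part of the case studies themselves:

```python
from qiskit import QuantumCircuit

# Smallest meaningful Qiskit smoke test: a two-qubit Bell pair.
qc = QuantumCircuit(2, 2)
qc.h(0)           # put qubit 0 in superposition
qc.cx(0, 1)       # entangle qubit 1 with qubit 0
qc.measure([0, 1], [0, 1])
print(qc.draw())  # ASCII circuit diagram; no simulator required
```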
146 | 147 | 148 | 149 | 150 | 151 | 152 | 153 | -------------------------------------------------------------------------------- /JP Morgan Python Training/data/data_raw.csv: -------------------------------------------------------------------------------- 1 | date,change,changeOverTime,changePercent,close,high,label,low,open,uClose,uHigh,uLow,uOpen,uVolume,volume 2 | 2019-01-25,0.0,0.0,0.0,32.9,33.62,"Jan 25, 19",31.98,31.99,32.9,33.62,31.98,31.99,22522440,22522440 3 | 2019-01-28,0.23,0.006991,0.6991,33.13,33.2,"Jan 28, 19",32.12,32.65,33.13,33.2,32.12,32.65,21750772,21750772 4 | 2019-01-29,-1.49,-0.038298,-4.4974,31.64,33.55,"Jan 29, 19",31.46,33.33,31.64,33.55,31.46,33.33,18849804,18849804 5 | 2019-01-30,0.62,-0.019453,1.9595,32.26,32.38,"Jan 30, 19",31.42,32.04,32.26,32.38,31.42,32.04,17142508,17142508 6 | 2019-01-31,1.3,0.020061,4.0298,33.56,33.69,"Jan 31, 19",32.79,33.07,33.56,33.69,32.79,33.07,21211309,21211309 7 | 2019-02-01,-0.37,0.008815,-1.1025,33.19,34.09,"Feb 1, 19",32.96,33.56,33.19,34.09,32.96,33.56,18816582,18816582 8 | 2019-02-04,0.75,0.031611,2.2597,33.94,34.18,"Feb 4, 19",33.24,33.34,33.94,34.18,33.24,33.34,14244111,14244111 9 | 2019-02-05,0.43,0.044681,1.2669,34.37,34.57,"Feb 5, 19",33.92,34.29,34.37,34.57,33.92,34.29,17610232,17610232 10 | 2019-02-06,-0.21,0.038298,-0.611,34.16,35.25,"Feb 6, 19",33.75,35.05,34.16,35.25,33.75,35.05,34058034,34058034 11 | 2019-02-07,-3.36,-0.06383,-9.8361,30.8,31.73,"Feb 7, 19",30.31,31.17,30.8,31.73,30.31,31.17,69764059,69764059 12 | 2019-02-08,-0.79,-0.087842,-2.5649,30.01,30.74,"Feb 8, 19",29.42,30.47,30.01,30.74,29.42,30.47,40669782,40669782 13 | 2019-02-11,0.22,-0.081155,0.7331,30.23,30.44,"Feb 11, 19",29.65,30.17,30.23,30.44,29.65,30.17,28838151,28838151 14 | 2019-02-12,0.16,-0.076292,0.5293,30.39,30.8,"Feb 12, 19",30.23,30.44,30.39,30.8,30.23,30.44,20315266,20315266 15 | 2019-02-13,0.73,-0.054103,2.4021,31.12,31.84,"Feb 13, 19",30.55,30.57,31.12,31.84,30.55,30.57,29683308,29683308 16 | 2019-02-14,-0.16,-0.058967,-0.5141,30.96,31.28,"Feb 14, 19",30.6,30.86,30.96,31.28,30.6,30.86,15321099,15321099 17 | 2019-02-15,0.27,-0.05076,0.8721,31.23,31.8,"Feb 15, 19",30.97,31.2,31.23,31.8,30.97,31.2,17591516,17591516 18 | 2019-02-19,0.42,-0.037994,1.3449,31.65,32.11,"Feb 19, 19",31.15,31.23,31.65,32.11,31.15,31.23,14391680,14391680 19 | 2019-02-20,-0.28,-0.046505,-0.8847,31.37,31.93,"Feb 20, 19",31.21,31.71,31.37,31.93,31.21,31.71,16871139,16871139 20 | 2019-02-21,-0.61,-0.065046,-1.9445,30.76,31.48,"Feb 21, 19",30.6,31.36,30.76,31.48,30.6,31.36,13944892,13944892 21 | 2019-02-22,0.95,-0.03617,3.0884,31.71,31.73,"Feb 22, 19",30.81,30.81,31.71,31.73,30.81,30.81,15413439,15413439 22 | 2019-02-25,0.28,-0.02766,0.883,31.99,32.71,"Feb 25, 19",31.88,31.99,31.99,32.71,31.88,31.99,15061331,15061331 23 | 2019-02-26,-0.98,-0.057447,-3.0635,31.01,31.96,"Feb 26, 19",30.99,31.89,31.01,31.96,30.99,31.89,17519063,17519063 24 | 2019-02-27,-0.6,-0.075684,-1.9349,30.41,31.0,"Feb 27, 19",29.9,30.95,30.41,31.0,29.9,30.95,24639117,24639117 25 | 2019-02-28,0.37,-0.064438,1.2167,30.78,30.79,"Feb 28, 19",30.01,30.25,30.78,30.79,30.01,30.25,15242855,15242855 26 | 2019-03-01,-0.16,-0.069301,-0.5198,30.62,31.19,"Mar 1, 19",30.28,31.17,30.62,31.19,30.28,31.17,12360706,12360706 27 | 2019-03-04,-0.12,-0.072948,-0.3919,30.5,31.26,"Mar 4, 19",30.06,30.78,30.5,31.26,30.06,30.78,15920363,15920363 28 | 2019-03-05,0.53,-0.056839,1.7377,31.03,31.23,"Mar 5, 19",30.39,30.5,31.03,31.23,30.39,30.5,13073531,13073531 29 | 2019-03-06,-0.23,-0.06383,-0.7412,30.8,31.33,"Mar 6, 
19",30.58,30.94,30.8,31.33,30.58,30.94,10938592,10938592 30 | 2019-03-07,-0.68,-0.084498,-2.2078,30.12,30.84,"Mar 7, 19",30.01,30.76,30.12,30.84,30.01,30.76,15779860,15779860 31 | 2019-03-08,-0.08,-0.08693,-0.2656,30.04,30.21,"Mar 8, 19",29.41,29.64,30.04,30.21,29.41,29.64,11964309,11964309 32 | 2019-03-11,0.83,-0.061702,2.763,30.87,30.91,"Mar 11, 19",30.24,30.24,30.87,30.91,30.24,30.24,16013214,16013214 33 | 2019-03-12,0.29,-0.052888,0.9394,31.16,31.41,"Mar 12, 19",30.88,31.15,31.16,31.41,30.88,31.15,12324334,12324334 34 | 2019-03-13,0.14,-0.048632,0.4493,31.3,31.48,"Mar 13, 19",31.04,31.31,31.3,31.48,31.04,31.31,10201337,10201337 35 | 2019-03-14,-0.27,-0.056839,-0.8626,31.03,31.55,"Mar 14, 19",30.94,31.28,31.03,31.55,30.94,31.28,12090588,12090588 36 | 2019-03-15,0.19,-0.051064,0.6123,31.22,31.41,"Mar 15, 19",30.71,31.04,31.22,31.41,30.71,31.04,17522706,17522706 37 | 2019-03-18,-0.14,-0.055319,-0.4484,31.08,31.58,"Mar 18, 19",30.84,31.25,31.08,31.58,30.84,31.25,13172614,13172614 38 | 2019-03-19,0.19,-0.049544,0.6113,31.27,31.5,"Mar 19, 19",30.88,31.15,31.27,31.5,30.88,31.15,15557439,15557439 39 | 2019-03-20,1.3,-0.01003,4.1573,32.57,32.65,"Mar 20, 19",31.16,31.24,32.57,32.65,31.16,31.24,22373782,22373782 40 | 2019-03-21,0.04,-0.008815,0.1228,32.61,32.69,"Mar 21, 19",32.03,32.31,32.61,32.69,32.03,32.31,13346906,13346906 41 | 2019-03-22,0.41,0.003647,1.2573,33.02,34.21,"Mar 22, 19",32.34,32.5,33.02,34.21,32.34,32.5,28034683,28034683 42 | 2019-03-25,-0.43,-0.009422,-1.3022,32.59,33.3,"Mar 25, 19",32.28,32.83,32.59,33.3,32.28,32.83,15272347,15272347 43 | 2019-03-26,0.47,0.004863,1.4422,33.06,33.86,"Mar 26, 19",32.92,32.98,33.06,33.86,32.92,32.98,17252299,17252299 44 | 2019-03-27,-0.78,-0.018845,-2.3593,32.28,33.45,"Mar 27, 19",31.95,32.93,32.28,33.45,31.95,32.93,13669409,13669409 45 | 2019-03-28,0.59,-0.000912,1.8278,32.87,32.93,"Mar 28, 19",31.73,32.29,32.87,32.93,31.73,32.29,17750596,17750596 46 | 2019-03-29,0.01,-0.000608,0.0304,32.88,33.24,"Mar 29, 19",32.47,33.1,32.88,33.24,32.47,33.1,13529256,13529256 47 | 2019-04-01,0.56,0.016413,1.7032,33.44,33.68,"Apr 1, 19",32.7,33.16,33.44,33.68,32.7,33.16,12499656,12499656 48 | 2019-04-02,0.31,0.025836,0.927,33.75,33.89,"Apr 2, 19",33.23,33.44,33.75,33.89,33.23,33.44,11638013,11638013 49 | 2019-04-03,0.63,0.044985,1.8667,34.38,34.76,"Apr 3, 19",33.81,34.0,34.38,34.76,33.81,34.0,18040972,18040972 50 | 2019-04-04,0.04,0.046201,0.1163,34.42,35.13,"Apr 4, 19",33.9,34.7,34.42,35.13,33.9,34.7,14604127,14604127 51 | 2019-04-05,0.3,0.055319,0.8716,34.72,34.79,"Apr 5, 19",34.37,34.55,34.72,34.79,34.37,34.55,9571670,9571670 52 | 2019-04-08,0.14,0.059574,0.4032,34.86,35.06,"Apr 8, 19",34.51,34.79,34.86,35.06,34.51,34.79,10655009,10655009 53 | 2019-04-09,0.28,0.068085,0.8032,35.14,35.39,"Apr 9, 19",34.8,34.84,35.14,35.39,34.8,34.84,13889711,13889711 54 | 2019-04-10,-0.39,0.056231,-1.1098,34.75,35.27,"Apr 10, 19",34.51,35.26,34.75,35.27,34.51,35.26,11648786,11648786 55 | 2019-04-11,-0.17,0.051064,-0.4892,34.58,34.87,"Apr 11, 19",34.41,34.75,34.58,34.87,34.41,34.75,10982651,10982651 56 | 2019-04-12,-0.21,0.044681,-0.6073,34.37,34.83,"Apr 12, 19",34.11,34.67,34.37,34.83,34.11,34.67,12713801,12713801 57 | 2019-04-15,0.34,0.055015,0.9892,34.71,35.03,"Apr 15, 19",34.34,34.38,34.71,35.03,34.34,34.38,10248382,10248382 58 | 2019-04-16,-0.25,0.047416,-0.7203,34.46,34.99,"Apr 16, 19",34.23,34.84,34.46,34.99,34.23,34.84,9396320,9396320 59 | 2019-04-17,0.02,0.048024,0.058,34.48,34.9,"Apr 17, 19",34.2,34.73,34.48,34.9,34.2,34.73,9022995,9022995 60 | 
2019-04-18,-0.08,0.045593,-0.232,34.4,34.86,"Apr 18, 19",34.32,34.67,34.4,34.86,34.32,34.67,9806113,9806113 61 | 2019-04-22,-0.01,0.045289,-0.0291,34.39,34.61,"Apr 22, 19",33.82,34.4,34.39,34.61,33.82,34.4,19704288,19704288 62 | 2019-04-23,5.38,0.208815,15.6441,39.77,40.53,"Apr 23, 19",36.91,36.93,39.77,40.53,36.91,36.93,104262474,104262474 63 | 2019-04-24,-0.48,0.194225,-1.2069,39.29,39.95,"Apr 24, 19",38.8,39.86,39.29,39.95,38.8,39.86,30266862,30266862 64 | 2019-04-25,-0.81,0.169605,-2.0616,38.48,40.13,"Apr 25, 19",38.19,39.26,38.48,40.13,38.19,39.26,26044758,26044758 65 | 2019-04-26,0.19,0.17538,0.4938,38.67,39.34,"Apr 26, 19",38.18,38.59,38.67,39.34,38.18,38.59,15270496,15270496 66 | 2019-04-29,1.11,0.209119,2.8704,39.78,39.97,"Apr 29, 19",38.63,38.63,39.78,39.97,38.63,38.63,19679975,19679975 67 | 2019-04-30,0.13,0.21307,0.3268,39.91,40.92,"Apr 30, 19",39.65,39.79,39.91,40.92,39.65,39.79,22912022,22912022 68 | 2019-05-01,-0.62,0.194225,-1.5535,39.29,40.07,"May 1, 19",39.26,40.0,39.29,40.07,39.26,40.0,14962555,14962555 69 | 2019-05-02,0.66,0.214286,1.6798,39.95,39.99,"May 2, 19",38.84,39.24,39.95,39.99,38.84,39.24,13419117,13419117 70 | 2019-05-03,0.85,0.240122,2.1277,40.8,40.82,"May 3, 19",39.96,40.48,40.8,40.82,39.96,40.48,15577111,15577111 71 | 2019-05-06,-0.57,0.222796,-1.3971,40.23,40.44,"May 6, 19",39.45,39.69,40.23,40.44,39.45,39.69,14517359,14517359 72 | 2019-05-07,-1.61,0.17386,-4.002,38.62,40.15,"May 7, 19",38.12,39.9,38.62,40.15,38.12,39.9,19283131,19283131 73 | 2019-05-08,-0.04,0.172644,-0.1036,38.58,39.15,"May 8, 19",38.33,38.45,38.58,39.15,38.33,38.45,9168440,9168440 74 | 2019-05-09,0.21,0.179027,0.5443,38.79,39.02,"May 9, 19",37.82,38.11,38.79,39.02,37.82,38.11,10010669,10010669 75 | 2019-05-10,-0.34,0.168693,-0.8765,38.45,39.16,"May 10, 19",37.86,38.68,38.45,39.16,37.86,38.68,12258962,12258962 76 | 2019-05-13,-1.86,0.112158,-4.8375,36.59,37.64,"May 13, 19",36.37,37.5,36.59,37.64,36.37,37.5,16829698,16829698 77 | 2019-05-14,0.34,0.122492,0.9292,36.93,37.52,"May 14, 19",36.6,37.04,36.93,37.52,36.6,37.04,11125110,11125110 78 | 2019-05-15,0.97,0.151976,2.6266,37.9,38.14,"May 15, 19",36.64,36.67,37.9,38.14,36.64,36.67,11523055,11523055 79 | 2019-05-16,0.4,0.164134,1.0554,38.3,38.72,"May 16, 19",38.05,38.11,38.3,38.72,38.05,38.11,10104398,10104398 80 | 2019-05-17,-0.8,0.139818,-2.0888,37.5,38.12,"May 17, 19",37.47,37.83,37.5,38.12,37.47,37.83,9090265,9090265 81 | 2019-05-20,-0.35,0.129179,-0.9333,37.15,37.72,"May 20, 19",36.92,37.12,37.15,37.72,36.92,37.12,9411902,9411902 82 | 2019-05-21,0.32,0.138906,0.8614,37.47,37.86,"May 21, 19",37.33,37.47,37.47,37.86,37.33,37.47,8861388,8861388 83 | 2019-05-22,1.11,0.172644,2.9624,38.58,39.32,"May 22, 19",37.24,37.41,38.58,39.32,37.24,37.41,21105429,21105429 84 | 2019-05-23,-1.39,0.130395,-3.6029,37.19,38.29,"May 23, 19",36.8,38.15,37.19,38.29,36.8,38.15,18398792,18398792 85 | 2019-05-24,0.22,0.137082,0.5916,37.41,37.85,"May 24, 19",37.27,37.47,37.41,37.85,37.27,37.47,9303870,9303870 86 | 2019-05-28,-0.12,0.133435,-0.3208,37.29,38.04,"May 28, 19",37.23,37.41,37.29,38.04,37.23,37.41,11770272,11770272 87 | 2019-05-29,-0.44,0.120061,-1.1799,36.85,37.4,"May 29, 19",36.52,37.02,36.85,37.4,36.52,37.02,12270334,12270334 88 | 2019-05-30,0.29,0.128875,0.787,37.14,37.29,"May 30, 19",36.6,37.0,37.14,37.29,36.6,37.0,7413128,7413128 89 | 2019-05-31,-0.7,0.107599,-1.8848,36.44,36.96,"May 31, 19",36.3,36.62,36.44,36.96,36.3,36.62,8556067,8556067 90 | 2019-06-03,-2.01,0.046505,-5.5159,34.43,36.9,"Jun 3, 
19",34.04,36.45,34.43,36.9,34.04,36.45,22217080,22217080 91 | 2019-06-04,1.67,0.097264,4.8504,36.1,36.15,"Jun 4, 19",34.97,35.18,36.1,36.15,34.97,35.18,15357128,15357128 92 | 2019-06-05,0.23,0.104255,0.6371,36.33,36.71,"Jun 5, 19",35.95,36.5,36.33,36.71,35.95,36.5,12798679,12798679 93 | 2019-06-06,0.26,0.112158,0.7157,36.59,36.75,"Jun 6, 19",35.96,36.26,36.59,36.75,35.96,36.26,8862950,8862950 94 | 2019-06-07,1.34,0.152888,3.6622,37.93,38.31,"Jun 7, 19",36.8,36.87,37.93,38.31,36.8,36.87,15399731,15399731 95 | 2019-06-10,-0.29,0.144073,-0.7646,37.64,38.64,"Jun 10, 19",37.62,38.35,37.64,38.64,37.62,38.35,10036385,10036385 96 | 2019-06-11,-0.43,0.131003,-1.1424,37.21,38.26,"Jun 11, 19",36.85,38.09,37.21,38.26,36.85,38.09,9528602,9528602 97 | 2019-06-12,0.28,0.139514,0.7525,37.49,37.61,"Jun 12, 19",37.05,37.13,37.49,37.61,37.05,37.13,6795451,6795451 98 | 2019-06-13,-1.15,0.104559,-3.0675,36.34,37.04,"Jun 13, 19",35.76,37.04,36.34,37.04,35.76,37.04,21679417,21679417 99 | 2019-06-14,-0.19,0.098784,-0.5228,36.15,36.49,"Jun 14, 19",36.04,36.36,36.15,36.49,36.04,36.36,7791057,7791057 100 | 2019-06-17,0.29,0.107599,0.8022,36.44,36.68,"Jun 17, 19",36.11,36.26,36.44,36.68,36.11,36.26,7895968,7895968 101 | 2019-06-18,0.21,0.113982,0.5763,36.65,37.51,"Jun 18, 19",36.55,36.71,36.65,37.51,36.55,36.71,12705487,12705487 102 | 2019-06-19,-0.36,0.10304,-0.9823,36.29,36.68,"Jun 19, 19",35.81,36.63,36.29,36.68,35.81,36.63,12059707,12059707 103 | 2019-06-20,-0.85,0.077204,-2.3422,35.44,36.65,"Jun 20, 19",35.33,36.35,35.44,36.65,35.33,36.35,19676060,19676060 104 | 2019-06-21,-0.42,0.064438,-1.1851,35.02,35.75,"Jun 21, 19",34.99,35.5,35.02,35.75,34.99,35.5,12853620,12853620 105 | 2019-06-24,0.56,0.081459,1.5991,35.58,35.67,"Jun 24, 19",34.82,35.19,35.58,35.67,34.82,35.19,9216830,9216830 106 | 2019-06-25,-0.86,0.055319,-2.4171,34.72,35.73,"Jun 25, 19",34.57,35.52,34.72,35.73,34.57,35.52,11196392,11196392 107 | 2019-06-26,0.5,0.070517,1.4401,35.22,35.38,"Jun 26, 19",34.92,34.98,35.22,35.38,34.92,34.98,8744768,8744768 108 | 2019-06-27,-0.47,0.056231,-1.3345,34.75,35.64,"Jun 27, 19",34.61,35.44,34.75,35.64,34.61,35.44,12655890,12655890 109 | 2019-06-28,0.15,0.06079,0.4317,34.9,34.99,"Jun 28, 19",34.28,34.75,34.9,34.99,34.28,34.75,15505289,15505289 110 | 2019-07-01,1.18,0.096657,3.3811,36.08,36.25,"Jul 1, 19",35.22,35.5,36.08,36.25,35.22,35.5,14416765,14416765 111 | 2019-07-02,0.14,0.100912,0.388,36.22,36.6,"Jul 2, 19",35.88,36.17,36.22,36.6,35.88,36.17,8398931,8398931 112 | 2019-07-03,-0.2,0.094833,-0.5522,36.02,36.39,"Jul 3, 19",35.94,36.29,36.02,36.39,35.94,36.29,5372634,5372634 113 | 2019-07-05,0.23,0.101824,0.6385,36.25,36.35,"Jul 5, 19",35.6,36.0,36.25,36.35,35.6,36.0,6084815,6084815 114 | 2019-07-08,0.2,0.107903,0.5517,36.45,36.64,"Jul 8, 19",35.87,36.04,36.45,36.64,35.87,36.04,8058474,8058474 115 | 2019-07-09,1.2,0.144377,3.2922,37.65,37.67,"Jul 9, 19",36.29,36.35,37.65,37.67,36.29,36.35,14217970,14217970 116 | 2019-07-10,-0.18,0.138906,-0.4781,37.47,38.03,"Jul 10, 19",36.88,37.9,37.47,38.03,36.88,37.9,11831813,11831813 117 | 2019-07-11,-0.26,0.131003,-0.6939,37.21,37.87,"Jul 11, 19",36.97,37.66,37.21,37.87,36.97,37.66,8549479,8549479 118 | 2019-07-12,0.63,0.150152,1.6931,37.84,37.9,"Jul 12, 19",37.17,37.37,37.84,37.9,37.17,37.37,8048674,8048674 119 | 2019-07-15,0.84,0.175684,2.2199,38.68,38.97,"Jul 15, 19",37.94,38.0,38.68,38.97,37.94,38.0,12738673,12738673 120 | 2019-07-16,-0.69,0.154711,-1.7839,37.99,38.79,"Jul 16, 19",37.82,38.78,37.99,38.79,37.82,38.78,10979576,10979576 121 | 
2019-07-17,-0.29,0.145897,-0.7634,37.7,38.23,"Jul 17, 19",37.56,37.86,37.7,38.23,37.56,37.86,7926602,7926602 122 | 2019-07-18,-0.04,0.144681,-0.1061,37.66,37.79,"Jul 18, 19",37.0,37.39,37.66,37.79,37.0,37.39,11125887,11125887 123 | 2019-07-19,-0.89,0.117629,-2.3633,36.77,38.09,"Jul 19, 19",36.73,37.96,36.77,38.09,36.73,37.96,10842400,10842400 124 | 2019-07-22,0.81,0.142249,2.2029,37.58,37.69,"Jul 22, 19",36.83,36.92,37.58,37.69,36.83,36.92,8415150,8415150 125 | 2019-07-23,0.32,0.151976,0.8515,37.9,38.02,"Jul 23, 19",36.82,37.87,37.9,38.02,36.82,37.87,10741460,10741460 126 | 2019-07-24,0.83,0.177204,2.19,38.73,38.8,"Jul 24, 19",37.76,38.0,38.73,38.8,37.76,38.0,12488470,12488470 127 | -------------------------------------------------------------------------------- /JP Morgan Python Training/notebooks/7_advanced_plotting.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "# Advanced Plotting: Beyond matplotlib" 8 | ] 9 | }, 10 | { 11 | "cell_type": "markdown", 12 | "metadata": {}, 13 | "source": [ 14 | "Alright, so our prior charts were plotted using `matplotlib` which helps us see the data, but the charts don't look amazing and aren't interactive at all. What other options do we have? [seaborn](https://seaborn.pydata.org/index.html) is a wrapper on top of `matplotlib` that makes much prettier plots (with a ton of statistical plotting capabilities) and [plotly](https://plot.ly/) is a great cross-platform plotting library that we can easily set up. Below we'll use `seaborn` and `plotly` to make some pretty pictures. " 15 | ] 16 | }, 17 | { 18 | "cell_type": "markdown", 19 | "metadata": {}, 20 | "source": [ 21 | "First, let's import some libraries." 22 | ] 23 | }, 24 | { 25 | "cell_type": "code", 26 | "execution_count": null, 27 | "metadata": {}, 28 | "outputs": [], 29 | "source": [ 30 | "import pandas as pd\n", 31 | "import datetime as dt\n", 32 | "import seaborn as sns\n", 33 | "import numpy as np\n", 34 | "import matplotlib.pyplot as plt\n", 35 | "import datetime as dt\n", 36 | "\n", 37 | "%matplotlib inline" 38 | ] 39 | }, 40 | { 41 | "cell_type": "markdown", 42 | "metadata": {}, 43 | "source": [ 44 | "Since we don't have access to some of the data that we do within JPM, we'll have to generate some data to plot. In this case, we're going to generate an approximation of Russell2k implied volatility and HYG (iShares iBoxx High Yield Corporate Bond ETF) prices." 45 | ] 46 | }, 47 | { 48 | "cell_type": "code", 49 | "execution_count": null, 50 | "metadata": {}, 51 | "outputs": [], 52 | "source": [ 53 | "vols = [ 0.3, 0.06 ]\n", 54 | "dailyVols = vols / np.sqrt( 252 )\n", 55 | "corr = -0.4\n", 56 | "covars = [ \n", 57 | " [ dailyVols[ 0 ] ** 2, dailyVols[ 0 ] * dailyVols[ 1 ] * corr ],\n", 58 | " [ dailyVols[ 0 ] * dailyVols[ 1 ] * corr, dailyVols[ 1 ] ** 2 ]\n", 59 | "]\n", 60 | "randomSeries = np.random.multivariate_normal( ( 0.001, 0 ), covars, 500 ).T" 61 | ] 62 | }, 63 | { 64 | "cell_type": "code", 65 | "execution_count": null, 66 | "metadata": {}, 67 | "outputs": [], 68 | "source": [ 69 | "randomSeries" 70 | ] 71 | }, 72 | { 73 | "cell_type": "markdown", 74 | "metadata": {}, 75 | "source": [ 76 | "We've got two return series, but we need to convert them to a time series for what they're meant to represent. 
" 77 | ] 78 | }, 79 | { 80 | "cell_type": "code", 81 | "execution_count": null, 82 | "metadata": {}, 83 | "outputs": [], 84 | "source": [ 85 | "rtyVol = 0.2 * ( 1 + randomSeries[ 0 ] ).cumprod()\n", 86 | "hygPrice = 80 * ( 1 + randomSeries[ 1 ] ).cumprod()" 87 | ] 88 | }, 89 | { 90 | "cell_type": "markdown", 91 | "metadata": {}, 92 | "source": [ 93 | "Let's see if they make sense... Often the easiest way to do that is to plot them. Many of the plotting libraries set up to operate / plot a `DataFrame` natively." 94 | ] 95 | }, 96 | { 97 | "cell_type": "code", 98 | "execution_count": null, 99 | "metadata": {}, 100 | "outputs": [], 101 | "source": [ 102 | "df = pd.DataFrame(np.array([rtyVol, hygPrice]).T, columns=[\"RTY.3m.Proxy.Implied.Vol\", \"HYG.spot\"])" 103 | ] 104 | }, 105 | { 106 | "cell_type": "code", 107 | "execution_count": null, 108 | "metadata": {}, 109 | "outputs": [], 110 | "source": [ 111 | "df.head()" 112 | ] 113 | }, 114 | { 115 | "cell_type": "code", 116 | "execution_count": null, 117 | "metadata": {}, 118 | "outputs": [], 119 | "source": [ 120 | "df.plot()" 121 | ] 122 | }, 123 | { 124 | "cell_type": "markdown", 125 | "metadata": {}, 126 | "source": [ 127 | "So... problem #1, these two series not similar magnitudes. We need to plot these on difference axes and while we're at it let's make it look a little better. " 128 | ] 129 | }, 130 | { 131 | "cell_type": "code", 132 | "execution_count": null, 133 | "metadata": {}, 134 | "outputs": [], 135 | "source": [ 136 | "plt.style.use('seaborn')\n", 137 | "df.plot(secondary_y=[\"HYG.spot\"], legend=True)" 138 | ] 139 | }, 140 | { 141 | "cell_type": "markdown", 142 | "metadata": {}, 143 | "source": [ 144 | "If we flip around RTY implied vol, we can see this inverse relationship a bit better. " 145 | ] 146 | }, 147 | { 148 | "cell_type": "code", 149 | "execution_count": null, 150 | "metadata": {}, 151 | "outputs": [], 152 | "source": [ 153 | "df[\"RTY.3m.Proxy.Implied.Vol\"] = df[\"RTY.3m.Proxy.Implied.Vol\"] * -1\n", 154 | "df.plot(secondary_y=[\"HYG.spot\"], legend=True)" 155 | ] 156 | }, 157 | { 158 | "cell_type": "markdown", 159 | "metadata": {}, 160 | "source": [ 161 | "We can look at the scatter on levels pretty easily with `matplotlib` using `matplotlib.pyplot.scatter`." 162 | ] 163 | }, 164 | { 165 | "cell_type": "code", 166 | "execution_count": null, 167 | "metadata": {}, 168 | "outputs": [], 169 | "source": [ 170 | "plt.scatter( df[ df.columns[ 0 ] ], df[ df.columns[ 1 ] ] )" 171 | ] 172 | }, 173 | { 174 | "cell_type": "markdown", 175 | "metadata": {}, 176 | "source": [ 177 | "It looks like there is a relationship there (there should be, we generated the series with a negative correlation). Let's explore that a bit. As we said before, `seaborn` comes with a bunch of good statistical tools. In fact, it has has quick and easy way to generate a regression plot with `sns.regplot`." 178 | ] 179 | }, 180 | { 181 | "cell_type": "code", 182 | "execution_count": null, 183 | "metadata": {}, 184 | "outputs": [], 185 | "source": [ 186 | "fig, ax = plt.subplots( sharex=True )\n", 187 | "sns.regplot( x=\"HYG.spot\", y=\"RTY.3m.Proxy.Implied.Vol\", data=df.diff(), ax=ax )" 188 | ] 189 | }, 190 | { 191 | "cell_type": "markdown", 192 | "metadata": {}, 193 | "source": [ 194 | "Unfortunately, the underlying regression data is not exposed in `seaborn`, so we will need to generate it ourselves using `scipy`." 
195 | ] 196 | }, 197 | { 198 | "cell_type": "code", 199 | "execution_count": null, 200 | "metadata": {}, 201 | "outputs": [], 202 | "source": [ 203 | "from scipy import stats\n", 204 | "\n", 205 | "diff = df.diff().dropna()\n", 206 | "slope, intercept, rvalue, pvalue, stderr = stats.linregress(diff[\"HYG.spot\"], diff[\"RTY.3m.Proxy.Implied.Vol\"])\n", 207 | "print( \"R^2 = {r:.3f}\".format( r=rvalue ) )\n", 208 | "print( 'y = {m:.3f}x {sign} {b:.3f}'.format( m=slope, sign=\"+\" if intercept >= 0 else \"-\", b=abs(intercept) ) )" 209 | ] 210 | }, 211 | { 212 | "cell_type": "markdown", 213 | "metadata": {}, 214 | "source": [ 215 | "What if I want to interact with the plot.... zoom in, inspect the values, etc. This is where `plotly` shines." 216 | ] 217 | }, 218 | { 219 | "cell_type": "code", 220 | "execution_count": null, 221 | "metadata": {}, 222 | "outputs": [], 223 | "source": [ 224 | "from plotly.offline import download_plotlyjs, init_notebook_mode, iplot\n", 225 | "import plotly.tools as tls\n", 226 | "init_notebook_mode(connected=True)\n", 227 | "\n", 228 | "# Here we can convert our matplotlib object to a plotly object\n", 229 | "plotlyFig = tls.mpl_to_plotly(fig)\n", 230 | "\n", 231 | "# Add annotation so you have the regression stats\n", 232 | "plotlyFig['layout']['annotations'] = [\n", 233 | " dict(\n", 234 | " x=18,\n", 235 | " y=-2,\n", 236 | " showarrow=False,\n", 237 | " text='R^2 = {:.3f}'.format( rvalue )\n", 238 | " ),\n", 239 | " dict(\n", 240 | " x=18,\n", 241 | " y=-2.6,\n", 242 | " showarrow=False,\n", 243 | " text='y = {m:.3f}x {sign} {b:.3f}'.format( m=slope, sign=\"+\" if intercept >= 0 else \"-\", b=abs(intercept) )\n", 244 | " )\n", 245 | "]\n", 246 | "iplot(plotlyFig)" 247 | ] 248 | }, 249 | { 250 | "cell_type": "markdown", 251 | "metadata": {}, 252 | "source": [ 253 | "2d plots are cool but 3d plots.... Below is an example of plotting a vol surface from start to finish." 
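Before the full vol-surface example, a toy case may help isolate the `go.Surface` ingredients: an x vector, a y vector, and a z matrix of heights. This sketch uses synthetic bowl-shaped data and assumes a reasonably recent plotly (where `fig.show()` is available):

```python
import numpy as np
import plotly.graph_objs as go

# Toy go.Surface: x and y vectors plus a z matrix of heights.
x = np.linspace(-5, 5, 50)
y = np.linspace(-5, 5, 50)
xx, yy = np.meshgrid(x, y)
r = np.sqrt(xx**2 + yy**2) + 1e-9   # small offset avoids 0/0 at the origin
z = np.sin(r) / r

fig = go.Figure(data=[go.Surface(x=x, y=y, z=z)],
                layout=dict(title="Toy surface", width=600, height=500))
fig.show()  # in a notebook this renders an interactive 3d plot
```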
254 | ] 255 | }, 256 | { 257 | "cell_type": "code", 258 | "execution_count": null, 259 | "metadata": {}, 260 | "outputs": [], 261 | "source": [ 262 | "c = [0.8023,0.814,0.8256,0.8372,0.8488,0.8605,0.8721,0.8837,0.8953,0.907,0.9186,0.9302,0.9419,0.9535,0.9651,0.9767,0.9884,1,1.0116,1.0233,1.0349,1.0465,1.0581,1.0698,1.0814,1.093,1.1047,1.1163,1.1279,1.1395,1.1512,1.1628,1.1744,1.186,1.1977,1.2093]\n", 263 | "i = [ dt.datetime(2019,8,2), dt.datetime(2019,8,9), dt.datetime(2019,8,16), dt.datetime(2019,8,23), dt.datetime(2019,8,30), dt.datetime(2019,9,6), dt.datetime(2019,9,20), dt.datetime(2019,10,18), dt.datetime(2019,11,15), dt.datetime(2019,12,20), dt.datetime(2019,12,31), dt.datetime(2020,1,17), dt.datetime(2020,3,20), dt.datetime(2020,3,31), dt.datetime(2020,6,19) ]\n", 264 | "d = [ [0.4244,0.4016,0.3796,0.3584,0.3381,0.3187,0.3002,0.2827,0.2662,0.2508,0.2363,0.2229,0.2105,0.1991,0.1888,0.1794,0.171,0.1636,0.157,0.1513,0.1465,0.1425,0.1393,0.1368,0.135,0.1338,0.1331,0.1329,0.133,0.1334,0.1341,0.135,0.136,0.1372,0.1384,0.1396],\n", 265 | " [0.4006,0.3777,0.3556,0.3343,0.3139,0.2944,0.2759,0.2583,0.2418,0.2263,0.2118,0.1983,0.1859,0.1745,0.1641,0.1547,0.1463,0.1388,0.1322,0.1265,0.1216,0.1176,0.1143,0.1118,0.11,0.1088,0.1081,0.1078,0.108,0.1084,0.1092,0.1101,0.1112,0.1123,0.1136,0.1149],\n", 266 | " [0.3431,0.3257,0.3089,0.2927,0.277,0.2621,0.2477,0.2341,0.2212,0.2089,0.1974,0.1866,0.1765,0.1671,0.1584,0.1504,0.1431,0.1364,0.1303,0.1248,0.1198,0.1154,0.1116,0.1083,0.1055,0.1032,0.1013,0.0998,0.0986,0.0977,0.0971,0.0966,0.0964,0.0963,0.0963,0.0965],\n", 267 | " [0.3124,0.298,0.284,0.2705,0.2574,0.2449,0.2328,0.2213,0.2104,0.1999,0.1901,0.1807,0.1719,0.1637,0.156,0.1488,0.1421,0.136,0.1304,0.1253,0.1206,0.1165,0.1129,0.1098,0.1072,0.1049,0.1031,0.1015,0.1003,0.0994,0.0988,0.0983,0.098,0.0979,0.0978,0.0979],\n", 268 | " [0.2955,0.283,0.2708,0.259,0.2475,0.2365,0.2259,0.2158,0.206,0.1968,0.1879,0.1795,0.1716,0.1641,0.1571,0.1505,0.1443,0.1385,0.1332,0.1284,0.124,0.1201,0.1167,0.1137,0.1111,0.1089,0.1071,0.1056,0.1044,0.1034,0.1027,0.1022,0.1019,0.1018,0.1017,0.1018],\n", 269 | " [0.2866,0.2752,0.264,0.2532,0.2427,0.2326,0.2228,0.2134,0.2044,0.1958,0.1876,0.1798,0.1724,0.1653,0.1587,0.1524,0.1465,0.141,0.1359,0.1313,0.1271,0.1233,0.12,0.1171,0.1145,0.1124,0.1106,0.1091,0.1079,0.1069,0.1062,0.1057,0.1054,0.1052,0.1051,0.1051],\n", 270 | " [0.2729,0.2632,0.2538,0.2446,0.2357,0.227,0.2187,0.2106,0.2028,0.1954,0.1882,0.1813,0.1748,0.1685,0.1626,0.1569,0.1516,0.1465,0.1418,0.1376,0.1336,0.1301,0.127,0.1242,0.1218,0.1197,0.1179,0.1164,0.1152,0.1143,0.1135,0.113,0.1126,0.1124,0.1122,0.1122],\n", 271 | " [0.2548,0.2473,0.24,0.2329,0.226,0.2192,0.2126,0.2063,0.2001,0.1941,0.1883,0.1827,0.1773,0.1721,0.167,0.1622,0.1576,0.1532,0.149,0.1452,0.1416,0.1383,0.1354,0.1327,0.1303,0.1281,0.1262,0.1246,0.1232,0.122,0.121,0.1202,0.1195,0.119,0.1186,0.1183],\n", 272 | " [0.2482,0.242,0.2359,0.2299,0.2241,0.2184,0.2128,0.2074,0.2021,0.1969,0.1919,0.187,0.1823,0.1777,0.1733,0.169,0.1649,0.1608,0.157,0.1535,0.1501,0.1471,0.1442,0.1415,0.1391,0.1369,0.135,0.1332,0.1316,0.1303,0.1291,0.128,0.1271,0.1263,0.1257,0.1252],\n", 273 | " [0.2331,0.2278,0.2227,0.2176,0.2127,0.2078,0.2031,0.1984,0.1939,0.1895,0.1851,0.1809,0.1768,0.1728,0.1689,0.1652,0.1615,0.158,0.1547,0.1516,0.1486,0.1459,0.1433,0.1409,0.1387,0.1367,0.1349,0.1332,0.1317,0.1303,0.1291,0.128,0.127,0.1261,0.1254,0.1247],\n", 274 | " 
[0.2313,0.2262,0.2212,0.2163,0.2115,0.2068,0.2022,0.1977,0.1932,0.1889,0.1847,0.1806,0.1766,0.1727,0.1689,0.1653,0.1617,0.1582,0.155,0.152,0.1491,0.1464,0.1438,0.1415,0.1393,0.1373,0.1354,0.1337,0.1322,0.1308,0.1296,0.1284,0.1274,0.1266,0.1258,0.1251],\n", 275 | " [0.2302,0.2253,0.2206,0.216,0.2114,0.2069,0.2026,0.1983,0.1941,0.19,0.186,0.1821,0.1783,0.1745,0.1709,0.1674,0.164,0.1607,0.1576,0.1546,0.1518,0.1492,0.1467,0.1444,0.1423,0.1403,0.1384,0.1367,0.1351,0.1337,0.1324,0.1313,0.1302,0.1293,0.1285,0.1277],\n", 276 | " [0.229,0.2249,0.2209,0.2169,0.213,0.2091,0.2054,0.2017,0.1981,0.1946,0.1911,0.1877,0.1845,0.1812,0.1781,0.175,0.172,0.1691,0.1664,0.1637,0.1611,0.1587,0.1563,0.1541,0.152,0.15,0.148,0.1462,0.1445,0.143,0.1415,0.1401,0.1388,0.1376,0.1365,0.1355],\n", 277 | " [0.2289,0.2249,0.2209,0.2171,0.2133,0.2096,0.2059,0.2023,0.1988,0.1953,0.192,0.1887,0.1855,0.1823,0.1793,0.1763,0.1733,0.1705,0.1678,0.1651,0.1626,0.1602,0.1579,0.1557,0.1535,0.1515,0.1496,0.1478,0.1461,0.1445,0.143,0.1415,0.1402,0.139,0.1379,0.1368],\n", 278 | " [0.2239,0.2207,0.2175,0.2143,0.2112,0.2082,0.2051,0.2022,0.1993,0.1964,0.1936,0.1909,0.1882,0.1855,0.1829,0.1804,0.1779,0.1755,0.1731,0.1708,0.1685,0.1663,0.1642,0.1622,0.1601,0.1582,0.1563,0.1545,0.1527,0.151,0.1494,0.1478,0.1464,0.1449,0.1436,0.1423]\n", 279 | "]\n", 280 | "df2 = pd.DataFrame(d, columns=c, index=i)" 281 | ] 282 | }, 283 | { 284 | "cell_type": "code", 285 | "execution_count": null, 286 | "metadata": {}, 287 | "outputs": [], 288 | "source": [ 289 | "df2.head()" 290 | ] 291 | }, 292 | { 293 | "cell_type": "code", 294 | "execution_count": null, 295 | "metadata": {}, 296 | "outputs": [], 297 | "source": [ 298 | "import plotly.graph_objs as go\n", 299 | "\n", 300 | "fig = go.Figure(\n", 301 | " data=[ go.Surface(\n", 302 | " z=df2.values.tolist(),\n", 303 | " y=df2.columns.values,\n", 304 | " x=df2.index.astype(str).values.tolist()\n", 305 | " )],\n", 306 | " layout=dict(\n", 307 | " title = 'Vol Surface', \n", 308 | " autosize = True,\n", 309 | " width = 900,\n", 310 | " height = 700,\n", 311 | " margin = dict(\n", 312 | " l = 65,\n", 313 | " r = 50,\n", 314 | " b = 65,\n", 315 | " t = 90\n", 316 | " ),\n", 317 | " scene = dict(\n", 318 | " aspectratio = dict(\n", 319 | " x = 1,\n", 320 | " y = 1,\n", 321 | " z = 0.667\n", 322 | " )\n", 323 | " )\n", 324 | " ))\n", 325 | "\n", 326 | "go.FigureWidget(fig)" 327 | ] 328 | }, 329 | { 330 | "cell_type": "code", 331 | "execution_count": null, 332 | "metadata": {}, 333 | "outputs": [], 334 | "source": [] 335 | } 336 | ], 337 | "metadata": { 338 | "kernelspec": { 339 | "display_name": "Python 3", 340 | "language": "python", 341 | "name": "python3" 342 | }, 343 | "language_info": { 344 | "codemirror_mode": { 345 | "name": "ipython", 346 | "version": 3 347 | }, 348 | "file_extension": ".py", 349 | "mimetype": "text/x-python", 350 | "name": "python", 351 | "nbconvert_exporter": "python", 352 | "pygments_lexer": "ipython3", 353 | "version": "3.7.3" 354 | } 355 | }, 356 | "nbformat": 4, 357 | "nbformat_minor": 4 358 | } 359 | -------------------------------------------------------------------------------- /Machine Learning For Finance/Classification Based Machine Learning for Algorithmic Trading/default_predictions/dataset_2.csv: -------------------------------------------------------------------------------- 1 | default,WC/TA,Ret Earnings/TA,EBIT/TA,Mk val Equity/BV Debt,Sales/TA 2 | 0,0.014136297,0.141362968,0.026897097,0.300231865,0.931279621 3 | 
0,-0.002690364,0.165253546,0.009049405,0.194708277,1.133621393 4 | 0,0.279427408,0.300884418,0.068960277,2.064435974,1.064067106 5 | 0,0.077886424,0.281782346,0.068927342,0.543863435,0.884408637 6 | 0,-0.280907494,-0.437343823,0.041193498,0.217049033,1.55862342 7 | 0,0.008500262,0.339084013,0.118597413,2.313229626,1.223902619 8 | 0,-0.134008354,0.175647878,0.013380172,0.435381088,1.429392481 9 | 0,0.064015839,0.16185448,0.045482044,0.422352513,1.673431227 10 | 0,0.150773528,0.128367027,0.005109412,0.184104589,0.667716225 11 | 0,0.27925075,0.129400667,0.039989219,1.37780425,0.963918741 12 | 0,0.311554025,0.16671001,0.021876398,0.294468891,1.881370214 13 | 0,0.33680102,-0.36636738,-0.061704637,3.362862626,1.384926956 14 | 0,0.006470113,0.227580015,0.10119225,2.481619152,1.34539256 15 | 0,0.313933982,0.312049965,0.046403593,0.791311216,1.869202777 16 | 0,-0.088881427,0.362384881,0.119987529,0.603369092,2.146249041 17 | 0,-0.166099513,0.029246758,0.029108802,0.528125197,0.56180447 18 | 0,0.271805192,0.511390622,-0.009237271,0.535137064,0.2755484 19 | 0,-0.594816027,-0.876758204,0.234113153,0.117531685,1.028376652 20 | 0,-0.166594615,-0.016659461,0.045161191,0.18226472,0.976168787 21 | 0,0.130235797,0.359175895,0.082272717,1.131706276,1.470439222 22 | 0,0.071501855,0.325151922,0.091626549,1.687393731,3.83963381 23 | 0,-0.01450878,-0.00449167,0.03996398,0.183389313,0.939061684 24 | 0,-0.15209002,0.109717909,-0.00462572,0.157342665,0.932303374 25 | 0,-0.068014223,0.216050714,0.051544816,0.115770846,1.37721647 26 | 0,-0.096611753,0.011680235,0.013186708,1.835163558,0.716105448 27 | 0,-0.124730906,0.259033999,-0.173525803,1.531807476,0.781402218 28 | 0,-0.15067833,0.002594707,-0.020090444,0.793170442,0.725517088 29 | 0,0.19276966,0.249427512,0.059986901,2.197728767,0.982069427 30 | 0,-0.193954499,0.031579916,0.039361604,0.143629646,0.896805897 31 | 0,0.179406475,0.154676259,0.050584532,0.374928558,3.362972122 32 | 0,0.240271836,0.458327219,0.055586286,1.398143203,1.311222244 33 | 0,-0.014953912,-0.011858521,-0.037633441,0.086629696,2.040282958 34 | 0,-0.159102493,0.12938269,0.021157853,0.295407694,1.306498811 35 | 0,0.278732341,0.294514446,0.134911544,1.343612228,1.190148912 36 | 0,-0.058856334,0.161618978,0.042555017,0.27180739,2.393955643 37 | 0,-0.138784461,0.29006109,0.059288847,0.227696599,0.941729323 38 | 0,0.33557047,0.450335571,0.066946309,1.04302226,4.470637584 39 | 0,0.008999497,0.073152338,0.047435897,0.319120166,1.334690799 40 | 0,0.08173058,0.215929204,0.03193707,0.329559599,0.584936087 41 | 0,-0.0062533,0.260559662,0.050488384,0.502226426,0.921743004 42 | 0,0.645762605,0.441621154,0.036691057,1.069263255,1.381449929 43 | 0,0.32072795,0.654697446,0.117960241,5.536153062,1.423384813 44 | 0,0.133937272,0.354355902,0.073129081,1.272427288,1.175054983 45 | 0,0.045954752,0.292087216,0.031487093,0.422955492,2.666537559 46 | 0,0.037554796,0.315052438,0.034104808,1.274088505,0.595822262 47 | 0,0.240026197,0.061850167,0.044013743,1.691614783,1.07924349 48 | 0,0.182224057,0.074279466,0.046937522,0.261972378,0.737292927 49 | 0,0.412815983,0.380925236,0.135362669,2.836324746,1.291149194 50 | 0,0.051843118,0.017859001,0.033117176,0.862845913,4.316988591 51 | 0,0.123268522,-0.112746219,0.013368916,0.546543655,1.692514932 52 | 0,0.182824465,0.134568414,0.014938857,0.113965424,3.058364495 53 | 0,0.309113454,0.324472552,-0.000208745,0.313031793,1.955027656 54 | 0,0.180566,0.062418566,-0.038335853,1.657304923,1.652560595 55 | 0,0.427122427,0.093428555,0.047694393,0.556913498,1.20429744 56 | 
0,0.108783443,-0.14470806,-0.062005721,1.507711201,1.616523641 57 | 0,0.307188361,0.202165764,0.051239263,0.520158308,3.22154514 58 | 0,0.215979411,0.290584697,0.05688894,0.77782214,3.550826056 59 | 0,-0.327696948,-0.00570588,-0.076311965,2.786034929,0.669665616 60 | 0,0.179854821,0.272178541,0.063612973,0.8598448,2.404437314 61 | 0,0.165686394,0.411546338,0.034990603,0.538434523,1.104485567 62 | 0,-0.048189073,0.345917741,0.056067117,0.648272472,1.692449355 63 | 0,0.196318229,0.315782698,0.180033471,1.905424453,2.161560247 64 | 0,0.08598076,0.257555138,0.061989676,3.364782057,2.364969498 65 | 0,0.043165961,0.303440504,-0.021636712,1.185740328,1.367125058 66 | 0,-0.004236136,0.039854073,0.083808908,0.525093897,1.012469794 67 | 0,-0.138408368,0.121310756,0.033463775,0.202570364,0.18792744 68 | 0,0.615184758,0.208718245,0.082419169,0.54981689,1.583573903 69 | 0,0.248976554,-0.018235951,-0.114625977,10.52777161,0.28619278 70 | 0,0.255033573,0.032828554,0.104604433,2.185722112,1.037185965 71 | 0,0.080483582,0.216248796,0.130669199,2.939500685,2.732236765 72 | 0,0.340305393,0.216561741,0.082336909,0.739465775,1.077256607 73 | 0,0.685749121,-0.079113511,-0.029147978,0.974580567,0.531024565 74 | 0,0.188465433,0.208898141,0.103903519,0.305810398,1.250551547 75 | 0,0.436169767,0.173085775,0.032662082,0.302067511,1.198010509 76 | 0,0.22685096,0.089140384,0.125122526,0.226358462,1.180409336 77 | 0,0.092032374,-0.136073461,0.017067092,2.234315993,0.929502106 78 | 0,-0.119403623,0.263761697,0.058676992,0.430931963,0.152456612 79 | 0,0.121496083,0.005641479,-0.011782067,0.040804678,0.272510679 80 | 0,0.170070076,0.083539565,0.065582885,0.444196002,1.22203118 81 | 0,0.490564733,0.394145828,0.115616719,0.547978096,0.818969346 82 | 0,0.093935813,0.367508863,0.180893966,1.291906786,1.277080637 83 | 0,0.259890627,0.143266361,0.072835079,0.11144843,0.648805112 84 | 0,0.349286854,-0.063945238,-0.082480062,0.0778398,0.601145399 85 | 0,0.515676361,0.147223645,0.084827212,0.199704762,0.705079781 86 | 0,0.057272895,-0.568835337,-0.119886349,0.080803507,0.622056773 87 | 0,0.381360133,0.155340844,0.02115734,0.120000746,1.903759491 88 | 0,0.120315131,0.329682369,0.077230167,2.767265803,0.386925174 89 | 0,0.607130778,-0.061321546,0.05644794,1.776818113,2.310873467 90 | 0,0.060658148,0.046220286,0.01672634,0.077673285,0.559438084 91 | 0,-0.017680937,0.217649621,0.002248622,0.247638833,1.119102833 92 | 0,0.28698375,0.188252012,0.06077949,2.719409699,0.64598667 93 | 0,0.533405562,0.508125677,0.108342362,3.575675676,1.012639942 94 | 0,0.112725016,0.147237741,0.013035382,0.271146599,0.907386716 95 | 0,0.171013536,0.21319552,0.057190746,0.633106013,1.106127943 96 | 0,0.353083945,0.37060096,0.084524877,1.256511499,0.736347808 97 | 0,0.491978198,0.223570424,0.03892452,5.310880995,0.958764277 98 | 0,0.029296675,0.071233943,0.014218107,0.112024694,1.156808916 99 | 0,-0.000310538,-0.037410413,0.033123467,0.027988027,0.866159833 100 | 0,0.051057622,0.165638883,0.020555666,0.187412416,1.290564286 101 | 0,0.124533229,0.061809636,-0.020472647,0.23079053,0.837132785 102 | 0,0.028291038,-0.179966279,0.095044169,1.444178329,0.827302466 103 | 0,0.305541842,0.299740805,0.012589484,0.659202465,0.970069119 104 | 0,0.239943692,0.094818614,0.027603753,0.711635103,1.344723903 105 | 0,0.257410049,0.22808686,0.020763988,0.493913824,0.645585671 106 | 0,-0.05219159,0.020029048,0.00850477,0.20506818,1.11611207 107 | 0,0.241638737,0.190233762,0.043895748,0.347607185,1.984147187 108 | 0,-0.079207413,0.023785996,-0.035984871,0.24030945,1.123417547 
109 | 0,0.285139968,0.170972598,0.051131667,2.41001303,0.527815244 110 | 0,0.127000162,0.123565541,0.013886644,0.124872739,0.818099779 111 | 0,0.239457618,0.215462161,0.02236263,1.039361903,0.421269345 112 | 0,0.257984854,0.22604544,0.00642081,0.466340214,0.596476786 113 | 0,-0.043988351,0.234858418,0.028409891,0.388282256,0.813470192 114 | 0,0.098367747,-0.019320549,-0.016193265,0.242299307,0.847841975 115 | 0,0.02476794,-0.003320891,0.048372226,1.568017189,0.740406563 116 | 0,-0.013524175,0.107055333,0.030043638,0.643581746,1.295604818 117 | 0,-0.001419509,0.263452069,0.047642284,0.808167198,1.931331234 118 | 0,0.441176471,-0.127638615,0.017309316,7.676895216,0.448212778 119 | 0,0.316633152,0.314211007,0.067436704,0.612440147,1.620190529 120 | 0,0.024151734,0.073716271,0.02274992,0.327057628,1.177326645 121 | 0,0.379624558,0.263606035,0.108524623,6.314862499,0.83613006 122 | 0,0.298733256,0.407464603,0.087756488,1.119160438,1.172854471 123 | 0,0.141929872,0.145877944,0.044499465,0.527524462,1.364070307 124 | 0,0.445885957,0.237520191,0.21573251,2.443773262,1.761892277 125 | 0,0.141816411,0.280562743,0.123533985,0.465121155,0.675359971 126 | 0,-0.001215308,0.154558469,0.034637739,0.374623532,0.890169516 127 | 0,-0.127935032,-0.247173055,-0.07409991,0.428153777,0.722036717 128 | 0,0.279805109,0.278080764,0.0151983,0.617001874,1.40227373 129 | 0,0.109867697,0.051005558,0.084526312,1.14816002,0.793624921 130 | 0,0.401733957,0.433395665,0.089672758,1.747429982,1.458368041 131 | 0,0.307316568,0.268659915,0.124328957,0.866740038,2.433515259 132 | 0,0.203222796,0.201251315,0.243746288,1.308342097,0.941213136 133 | 0,0.562479973,0.309498095,0.0566349,2.279280159,1.325520123 134 | 1,-0.326106568,-0.084625906,0.026860286,0.132065505,1.804673632 135 | 1,0.046977163,0.018068612,0.019500283,0.070163708,1.698861645 136 | 1,0.005568431,0.066649597,0.019953749,0.061008513,1.341219213 137 | 1,0.151718733,0.117344082,0.069309148,0.464713475,1.097413504 138 | 1,-0.068802112,0.033483883,-0.016234551,0.137298998,0.740089513 139 | 1,0.233751481,-0.189267156,-0.199311267,1.042773504,1.128229117 140 | 1,0.331867664,0.225832937,0.041315545,0.518969789,0.838880334 141 | 1,-0.020509576,0.0860343,-0.061910444,0.091128873,0.923725287 142 | 1,-0.002683749,-0.058483362,-0.02762025,0.083808188,1.142478314 143 | 1,-0.141264675,0.006580579,0.007662527,0.087078534,0.884753027 144 | 1,-0.086376157,-0.007486404,0.017727241,0.08255134,0.874073028 145 | 1,0.184071813,-0.011142769,0.034723402,0.022629188,0.778758127 146 | 1,-0.598667563,-0.231462797,0.004852503,0.050697917,0.838858461 147 | 1,-0.097262014,-0.046489648,-0.017367162,0.173288191,1.212836341 148 | 1,-0.042167845,-0.00692349,0.022961187,0.052215023,1.038185977 149 | 1,-0.028655068,-0.086206636,0.017820766,0.062830396,0.478278583 150 | 1,-0.086192266,0.022426961,-0.0601285,0.082201369,0.92423324 151 | 1,0.010823898,-0.009353736,0.017975763,0.053482664,0.654970563 152 | 1,-0.170427148,-0.142645339,-0.029101956,0.058631763,1.680887429 153 | 1,-0.207191533,-0.025619445,0.043337625,0.012988213,0.765034702 154 | 1,0.00721982,0.021916001,0.016290405,0.058488646,1.293777029 155 | 1,-0.207418346,-0.151233495,0.017103892,0.061426484,0.870022585 156 | 1,0.356129566,0.074597553,0.044569836,0.452639959,1.500273658 157 | 1,-0.144390192,-0.008625315,0.02906852,0.084635337,0.733218866 158 | 1,0.175578056,-0.027952653,0.013330819,0.19752819,0.864588471 159 | 1,-0.153622222,0.0088,0.017888889,0.011563039,0.6286 160 | 1,0.463453242,0.11514142,0.077954095,0.787127718,0.417749625 161 | 
1,0.159727959,-0.183059342,0.019454439,0.032367357,0.49010904 162 | 1,0.108694152,0.042974762,0.013076793,0.048307283,0.751488594 163 | 1,-0.040905462,-0.167296705,-0.007687237,0.214611504,1.434880067 164 | 1,-0.070787606,-0.058126944,0.001662246,0.05995462,1.48897819 165 | 1,0.189489813,0.006232076,0.029608579,0.07911067,1.885794382 166 | 1,-0.464579612,-1.125223299,-0.088265441,0.151422125,0.673734149 167 | 1,0.223234149,-0.05346218,-0.040913515,0.7627172,1.471565629 168 | 1,-0.339130435,-0.100673161,0.011166548,0.026303278,1.201520551 169 | 1,-0.440571155,-0.154116146,0.020301532,0.155241918,1.417477664 170 | 1,-0.149781451,0.044867364,0.041739792,0.255596285,0.596199192 171 | 1,-0.055076487,-0.196691747,-0.177860019,0.669686059,0.334285067 172 | 1,-0.580071017,-1.166972687,-0.231250471,0.403282299,1.774649746 173 | 1,0.067522834,-0.384098259,-0.309156943,0.238533674,1.080579091 174 | 1,-0.210911361,-0.328598288,-0.006576467,0.271751739,0.905534312 175 | 1,-0.683093528,-0.98225851,-0.270441471,0.104710543,3.334601487 176 | 1,0.100843955,0.023940751,0.04323114,0.130992196,1.452204616 177 | 1,0.118016988,0.154524665,0.002695198,0.144059631,0.364505064 178 | 1,-0.572798721,-0.450646064,-0.014253364,0.139989189,0.527640869 179 | 1,-0.52939472,-0.097351748,0.003870134,0.054648252,0.324464168 180 | 1,-0.356081553,-0.005980326,-0.03932523,0.246379842,0.551951308 181 | 1,-0.221860228,-0.296777818,-0.32298126,0.060940548,0.568417492 182 | 1,-0.079494415,-0.148944655,-0.044509446,1.180403943,0.691490518 183 | 1,0.017172609,-0.052049095,0.060801092,0.616125192,0.804233945 184 | 1,-0.344285865,-0.06755471,0.018331748,0.107212353,1.115868485 185 | 1,-0.030204693,0.045681478,-0.090114828,0.151875275,0.631469463 186 | 1,-0.995195563,-2.402722357,-0.609262753,0.528158262,3.859165908 187 | 1,-0.294852236,-0.209601614,0.032554935,0.085815842,0.727552351 188 | 1,0.402398613,0.114816921,-0.081276767,9.282478168,0.070194515 189 | 1,0.558440534,0.008531788,-0.010585442,0.009747131,0.046751187 190 | 1,0.03891824,-0.11005911,-0.061332845,0.102223163,0.096929435 191 | 1,-0.293569003,-0.076257668,-0.046850505,0.147321185,1.032473692 192 | 1,0.092650903,0.05828571,0.042147809,0.224888247,1.430553426 193 | 1,0.033793784,-0.088307844,0.032889327,0.218563354,1.412678836 194 | 1,-0.210605193,-0.019304271,-0.059396099,0.081216764,0.428454489 195 | 1,-0.323415634,-0.236109338,0.034365165,0.091901851,0.591058297 196 | 1,-0.695935345,-0.072249665,0.006522832,0.055009645,0.46930043 197 | 1,-0.315537941,0.056649959,-0.032346427,0.046930355,0.749562886 198 | 1,-0.303186287,0.028414063,0.040922428,0.123230617,0.68713646 199 | 1,-0.313841504,-0.165717603,0.034639788,0.103060157,0.675967901 200 | 1,-0.190649536,-0.247573162,-0.00856531,0.51484542,0.522947894 201 | 1,0.003055929,-0.046378988,-0.003121789,0.111779196,0.453910799 202 | 1,-0.37845092,-0.655578988,0.00872316,0.109114926,0.593270706 203 | 1,0.226815072,0.050395657,0.018332276,0.416086233,0.689502263 204 | 1,0.086520328,0.131795429,0.018864469,0.20792291,0.62313284 205 | 1,-0.017665764,0.095866211,-0.035331527,0.188478865,0.789424096 206 | 1,-0.102159031,0.148367562,-0.154423381,0.107789554,0.295155345 207 | 1,-0.294231092,-0.808653362,-0.001970485,0.155349225,0.650176086 208 | 1,0.169516729,-0.348921933,-0.121412639,0.130534007,0.431747212 209 | 1,-1.799940003,-2.495545223,-0.203269837,0.27587456,1.048897555 210 | 1,0.290694816,-0.011109062,0.022863465,0.265836141,0.391306352 211 | 1,0.211653387,-0.271912351,-0.052913347,0.274289482,0.698207171 212 | 
1,0.061663004,0.038729404,-0.005817134,0.089552917,0.76401857 213 | 1,0.124833997,-0.171609857,0.150066401,0.207049087,2.777039988 214 | 1,-0.050938733,0.006270518,-0.012024639,0.093571797,0.690937258 215 | 1,-0.303662692,-0.175772694,-0.014449988,0.073655334,0.62025432 216 | 1,0.096478248,-2.161882214,-0.154779521,1.901439942,1.370227878 217 | 1,-0.714361904,-0.119834668,-0.012563974,0.015671849,0.737110389 218 | 1,0.102284758,0.060305665,0.015368955,0.061442155,1.378036542 219 | 1,-0.163449035,0.17140142,-0.00550165,0.062617885,0.60683205 220 | 1,-0.223063126,0.019117197,-0.012380244,0.273247566,1.243142745 221 | 1,-0.347972973,0.034409409,0.026026026,0.827566161,1.471471472 222 | 1,0.182130851,-0.111454022,0.061919651,0.154770465,1.242484724 223 | 1,-0.763929928,-0.277543267,-0.00453778,0.020961221,0.338222879 224 | 1,-0.269862099,0.068998425,0.026053347,0.128857391,0.415278904 225 | 1,-0.632038804,-0.402653304,-0.06986605,0.073557306,1.155215059 226 | 1,-0.176377953,-0.151088467,-0.018712367,0.117557052,1.820842983 227 | 1,0.130714702,0.024599275,0.007721994,0.329454236,0.838395955 228 | 1,-0.287917935,-0.097386563,-0.043318691,0.1723752,1.019124545 229 | 1,-0.02580933,-0.503145204,-0.289371321,0.396830077,2.03449934 230 | 1,-0.343667395,-0.042696242,-0.012642225,0.048785239,1.342144581 231 | 1,-0.489274008,-0.028667873,0.024240497,0.042781361,1.309145834 232 | 1,0.444169834,-0.170043601,0.044813355,0.195668415,0.544731749 233 | 1,-1.244822557,-0.953107672,0.002712281,0.046795761,0.947333526 234 | 1,-0.380563482,-0.066430385,0.008523557,0.067859001,0.994986969 235 | 1,-0.445737695,-0.123454346,-0.021699265,0.042284804,1.48857524 236 | 1,-0.178884483,-0.05218193,0.010343265,0.038262963,0.935913373 237 | 1,-0.141240669,-0.029507084,0.002527596,0.038443219,0.206147944 238 | 1,-0.421855277,0.040624437,0.014374404,0.024743814,0.176769108 239 | 1,0.078963353,-0.041987244,-0.101753898,0.037754869,0.447800668 240 | 1,-0.020601233,-0.092956517,-0.006350592,0.102670864,0.382170331 241 | 1,0.28043338,0.045108467,-0.029040915,0.124503571,0.97385421 242 | 1,0.510861881,0.1161964,0.115565764,0.202710235,0.404412131 243 | 1,0.549348464,0.13395808,0.080094061,0.104025248,0.545834266 244 | 1,0.33910893,0.044863695,0.054276033,0.119634666,0.46779091 245 | 1,0.363453637,0.030094191,0.056545662,0.117730989,0.675183528 246 | 1,0.301112755,0.102460939,0.070239024,0.134074405,0.959385139 247 | 1,0.311725966,0.118566962,0.07412727,0.214427823,0.353622722 248 | 1,0.488148796,0.144027949,0.080405012,0.151212586,0.5145179 249 | 1,0.424965583,-0.096662118,-0.012125788,0.099608126,0.501681122 250 | 1,0.270756033,0.060202162,0.002339864,0.159736488,0.969792663 251 | 1,0.331126864,0.081958965,-0.018774932,0.190940171,0.89339026 252 | 1,0.084277012,0.039155657,0.02888422,0.087359124,0.51686962 253 | 1,0.295234063,-0.123725187,0.058428663,0.110043324,1.439241896 254 | 1,0.297603599,0.054387419,0.067335839,0.149459658,1.202837827 255 | 1,0.35463052,0.062715779,0.08178702,0.216583017,0.59931058 256 | 1,0.079070283,-0.022623646,-0.002598165,0.053217598,0.544501734 257 | 1,-0.093031332,-0.132605692,-0.029065294,0.35125297,1.114514997 258 | 1,-0.474959058,-0.071890661,0.030516117,0.165951754,0.546315031 259 | 1,-0.141341853,0.235527157,-0.028498403,0.300744701,0.174440895 260 | 1,-0.007322142,-0.272230216,-0.053429257,0.073328966,0.435779377 261 | 1,-0.470761947,-0.033058335,-0.025195116,0.067955138,0.188025875 262 | 1,0.243927367,-2.710239078,-0.23793044,1.300541772,1.237420473 263 | 
1,-0.162320511,0.069058253,0.093886568,0.904284867,1.103731451 264 | 1,-0.084527815,-0.037923381,-0.076423902,0.061640462,1.511672038 265 | 1,-0.612760417,-0.248177083,-0.057552083,0.021287441,2.252994792 266 | 1,-0.329573531,-0.078472446,0.000429692,0.309340356,0.792297776 267 | 1,0.072774233,-0.016914818,0.203382964,0.31174631,1.06760863 268 | -------------------------------------------------------------------------------- /Classical Use Cases/Multimodal Transportation Optimization/README.md: -------------------------------------------------------------------------------- 1 | # Multi-modal Transportation Optimization 2 | A project that uses mathematical programming to minimize multi-modal transportation cost in goods delivery and supply chain management. 3 | ## Catalogue 4 | [**Project Overview**](#project-overview)
5 | [**Problem Statement**](#problem-statement)
6 | [**Assumptions**](#assumptions)
7 | [**Dimension & Matrixing**](#dimension--matrixing)
8 | [**Decision Variables**](#decision-variables)
9 | [**Parameters**](#parameters)
10 | [**Mathematical Modelling**](#mathematical-modelling)
11 | [**Optimization Result & Solution**](#optimization-result--solution)
12 | [**Model Use & Extension Guide**](#model-use--extension-guide) 13 | 14 | ## Project Overview 15 | In delivery services, many different transportation tools such as trucks, airplanes and ships are available. Different choices of routes and transportation tools lead to different costs. To minimize cost, we should consider goods consolidation (occasions when different goods share a journey), the different transportation costs, delivery time constraints, etc. This project uses mathematical programming to model this situation and solve for the overall cost-minimizing routes. The model construction offers a choice of two mathematical programming frameworks, **DOcplex** and **CVXPY**. 16 | 17 | **DOcplex** is a commercial programming framework developed by IBM that directly calls the **CPLEX** solver. If you want to use this framework option, **CPLEX Studio** needs to be installed beforehand. **CVXPY** is an open-source programming framework originally developed by a Stanford research team; detailed info can be found [**here**](https://github.com/cvxgrp/cvxpy). With this option, you can choose from a list of solvers integrated with the **CVXPY** framework; the **CBC** solver (open-source, needs to be installed first) is recommended for this model, details can be seen [**here**](https://www.cvxpy.org/tutorial/advanced/index.html#choosing-a-solver). 18 | 19 |
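The model-building snippets shown later in this README all use the **DOcplex** option. Purely for orientation, here is a minimal, illustrative sketch of what the same variable-matrix pattern could look like under the **CVXPY** option; the dimensions below are placeholders, not values from the data, and the actual construction lives in the model code of this repo.

```python
import numpy as np
import cvxpy as cp

# Illustrative dimensions only; the real values are derived from model data.xlsx
portDim, timeDim, goodsDim = 16, 30, 8

# Mirror the DOcplex var-list pattern: scalar boolean variables arranged into
# a 4-D numpy object array X[start port, end port, time, goods]
x = np.array([cp.Variable(boolean=True)
              for _ in range(portDim * portDim * timeDim * goodsDim)])
x = x.reshape(portDim, portDim, timeDim, goodsDim)

# The objective and constraints are then built with numpy sums of CVXPY
# expressions, e.g.:
# prob = cp.Problem(cp.Minimize(objective), constraints)
# prob.solve(solver=cp.CBC)  # CBC must be installed separately
```

The DOcplex equivalents of these declarations are shown in the Decision Variables section below.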
20 | 21 | ## Problem Statement 22 | In our simulated case, there are 8 goods, 4 cities/countries (Shanghai, Wuxi, Singapore, Malaysia), 16 ports and 4 transportation tools. The 8 goods originate from different cities and have different destinations. Each city/country has 4 ports: the airport, railway station, seaport and warehouse. There are in total 50 direct routes connecting different ports. Each route has a specific transportation tool, transportation cost, transit time and weekly schedule. The warehouse in each city allows goods to be deposited for a period of time, either to fit certain transportation schedules or to wait for other goods to be transported together. Goods may have different order dates and different delivery deadlines. Given all these criteria, how can we find routes for all goods that minimize the overall cost? (A rough estimate of the model size this implies is sketched below.) 23 | 24 |
25 | *(Diagram of the cities, ports and connecting routes, omitted here. The diagram is only used to show the basic idea of the problem; the routes are not complete.)*
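To get a feel for the scale of the resulting model, here is a quick back-of-the-envelope count of the binary variables in the decision matrix introduced later; the 60-day planning horizon is an assumed value for illustration, not taken from the case data.

```python
# Rough size of the decision matrix X (see the Decision Variables section)
portDim, goodsDim = 16, 8   # 16 ports and 8 goods, from the case above
timeDim = 60                # assumed planning horizon in days (illustrative)

numX = portDim * portDim * timeDim * goodsDim
print(numX)                 # 122880 binary variables
```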
26 | 27 | ## Assumptions 28 | Before model building, some assumptions should be made to simplify the case because real-world delivery problems consist of too many unmeasurable factors that can affect the delivery process and final outcomes. Here are the main assumptions:
29 | 1. The delivery process is **deterministic**; no random effects appear in delivery time, cost, etc. 30 | 2. Goods can be transported in **normal containers**; no special containers (refrigerated, thermostatic etc.) are needed. 31 | 3. Containers only constrain the goods' **volume**, and all goods are **divisible in terms of volume**. (No bin-packing problem needs to be considered.) 32 | 4. The model only evaluates the **major carriage routes**. The first and last mile between the end user and the origin/destination shipping point are not considered. (**From warehouse to warehouse**.) 33 | 5. There is **only one transportation tool available between each pair of ports**. For instance, we can only go directly from one airport to another airport in a different city by flight, while a direct journey by ship, railway or truck is infeasible. 34 | 6. The overall cost is restricted to the 3 most important parts: **transportation cost**, **warehouse cost** and **goods tariff**. 35 | 7. The minimum unit of time in the model is a **day**, and there is **at most one transit per route per day**. 36 | 37 | ## Dimension & Matrixing 38 | To make the model logic clearer and the calculation more efficient, we use the concept of matrixing to build the necessary components of the model. In our case, there are 4 dimensions in total:
39 | 1. **Start Port:**    ***i***
40 | Indicating the start port of a direct transport route. The dimension length equals the total number of ports in the data. 41 | 2. **End Port:**    ***j***
42 | Indicating the end port of a direct transport route. The dimension length equals the total number of ports in the data. 43 | 3. **Time:**    ***t***
44 | Indicating the departure time of a direct transport. The dimension length equals the total number of days between the earliest order date and the latest delivery deadline date of all goods in the data. 45 | 4. **Goods:**    ***k***
46 | Indicating the goods to be transported. The dimension length equals the total number of goods in the data. 47 | 48 | All the variable or parameter matrices to be introduced in the later parts will have one or more of these 4 dimensions. 49 | ## Decision Variables 50 | As mentioned above, we will use the concept of **variable matrix**, a list of variables deployed in the form of a matrix or multi-dimensional array. In our model, 3 variable matrices will be introduced:
51 | 1. **Decision Variable Matrix:**    ***X***
52 | The most important variable matrix in the model. It's a 4 dimensional matrix, each dimension representing start port, end port, time and goods respectively. Each element in the matrix is a binary variable, representing whether a route is taken by a specific goods. For example, element ***Xi,j,t,k*** represents whether **goods k** travels from **port i** to **port j** at **time t**. 53 | ```python 54 | varList1 = model.binary_var_list(portDim * portDim * timeDim * goodsDim,name = 'x') 55 | x = np.array(varList1).reshape(portDim, portDim, timeDim, goodsDim) 56 | ``` 57 | 58 | 2. **Container Number Matrix:**    ***Y***
59 | A variable matrix used to support the decision variable matrix. It's a 3 dimensional matrix, with each dimension representing start port, end port and time respectively. Each element in the matrix is an integer variable, representing the number of containers needed in a specific route. For example, ***Yi,j,t*** represents the number of containers needed to load all the goods travelling simultaneously from **port i** to **port j** at **time t**. This matrix is introduced to work around the "linear operators only" limitation of mathematical programming: computing the container number directly would require a **roundup()** operation, which is not linear. 60 | ```python 61 | varList2 = model.integer_var_list(portDim * portDim * timeDim,name = 'y') 62 | y = np.array(varList2).reshape(portDim, portDim, timeDim) 63 | ``` 64 | 65 | 3. **Route Usage Matrix:**    ***Z***
66 | A variable matrix used to support the decision variable matrix. It's a 3 dimensional matrix, with each dimension representing start port, end port and time respectively. Each element in the matrix is a binary variable, representing whether a route is used or not. For instance, ***Zi,j,t*** represents whether the route from **port i** to **port j** at **time t** is used or not (no matter by which goods). It's introduced for a similar purpose to ***Yi,j,t***. 67 | ```python 68 | varList3 = model.binary_var_list(portDim * portDim * timeDim,name = 'z') 69 | z = np.array(varList3).reshape(portDim, portDim, timeDim) 70 | ``` 71 | 72 | ## Parameters 73 | Similar to the decision variables, the following parameter arrays or matrices are introduced for the sake of later model building:
74 | 75 | 1. **Per Container Cost:**    ***C***
76 | A 3 dimensional parameter matrix, each dimension representing start port, end port and time. ***Ci,j,t*** in the matrix represents the overall transportation cost per container from **port i** to **port j** at **time t**. This overall cost includes the handling cost, bunker/fuel cost, documentation cost, equipment cost and extra cost from [**model data.xlsx**](model%20data.xlsx). For an infeasible route, the cost element is set to big M (an extremely large number), which effectively rules that choice out. 77 | 78 | 2. **Route Fixed Cost:**    ***FC***
79 | A 3 dimensional parameter matrix, each dimension representing start port, end port and time. ***FCi,j,t*** in the matrix represents the fixed transportation cost to travel from **port i** to **port j** at **time t**, regardless of the number or volume of goods. For an infeasible route, the cost element is set to big M as well. 80 | 81 | 3. **Warehouse Cost:**    ***wh***
82 | A one dimension array with dimension start port. ***whi*** represents the warehouse cost per cubic meter per day at **port i**. The warehouse cost for ports with no warehouse function (airport, railway station etc.) is set to big M. 83 | 84 | 4. **Transportation Time:**    ***T***
85 | A 3 dimensional parameter matrix, each dimension representing start port, end port and time. ***Ti,j,t*** in the matrix represents the overall transportation time from **port i** to **port j** at **time t**. This overall time includes the customs clearance time, handling time, transit time and extra time from [**model data.xlsx**](model%20data.xlsx). For an infeasible route, the time element is set to big M. 86 | 87 | 5. **Tax Percentage:**    ***tax***
88 | A one dimension array with dimension goods. ***taxk*** represents the tax percentage for **goods k** imposed by its destination country. If the goods only go through domestic transit, the tax percentage for such goods is set to 0. 89 | 90 | 6. **Transit Duty:**    ***td***
91 | A two dimensional matrix, each dimension representing start port and end port. ***tdi,j*** represents the transit duty (tax imposed on goods passing through a country) percentage for goods going from **port i** to **port j**. If port i and port j belong to the same country, the transit duty percentage is set to 0. For simplicity, the transit duty is set equal across all goods (this can easily be extended). 92 | 93 | 7. **Container Volume:**    ***ctnV***
94 | A two dimensional matrix, each dimension representing start port and end port. ***ctnVi,j*** represents the volume of a container on the route from **port i** to **port j**. 95 | 96 | 8. **Goods Volume:**    ***V***
97 | A one dimension array with dimension goods. ***Vk*** represents the volume of **goods k**. 98 | 99 | 9. **Goods Value:**    ***val***
100 | A one dimension array with dimension goods. ***valk*** represents the value of **goods k**. 101 | 102 | 10. **Order Date:**    ***ord***
103 | A one dimension array with dimension goods. ***ordk*** represents the order date of **goods k**. 104 | 105 | 11. **Deadline Date:**    ***ddl***
106 | A one dimension array with dimension goods. ***ddlk*** represents the deadline delivery date of **goods k**. 107 | 108 | 12. **Origin Port:**    ***OP***
109 | A one dimension array with dimension goods. ***OPk*** represents the port where **goods k** starts from. 110 | 111 | 13. **Destination Port:**    ***DP***
112 | A one dimension array with dimension goods. ***DPk*** represents the destination port of **goods k**. 113 | 114 | The data for all the above parameter matrices is imported from **model data.xlsx** with the functions **transform()** and **set_param()**. For more details, please refer to the code in [**multi-modal transpotation.py**](multi-modal%20transpotation.py). 115 | 116 | ## Mathematical Modelling 117 | With all the variables and parameters defined above, we can build up the objective and constraints to form an integer programming model. 118 | ### Objective 119 | The objective of the model is to minimize the overall cost, which includes 3 parts: **transportation cost**, **warehouse cost** and **tax cost**. Firstly, the **transportation cost** includes the container cost and the route fixed cost. The container cost equals the number of containers used on each route times the per-container cost, while the route fixed cost equals the sum of the fixed costs of all used routes. Secondly, the **warehouse cost** equals each goods' volume times its days of storage times the warehouse fee per cubic meter per day, summed over all warehouses. Finally, the **tax cost** equals the sum of the import tariff and transit duty of all goods. The mathematical formulation and Python implementation are attached below. 120 |
$$\min \;\; \sum_{i,j,t} \big( C_{i,j,t}\, Y_{i,j,t} + FC_{i,j,t}\, Z_{i,j,t} \big) \;+\; \sum_{i,k} wh_i\, V_k\, stay_{i,k} \;+\; \sum_{k} tax_k\, val_k \;+\; \sum_{i,j,t,k} td_{i,j}\, val_k\, X_{i,j,t,k}$$

$$\text{where} \quad stay_{i,k} \;=\; \sum_{j,t} t\, X_{i,j,t,k} \;-\; \sum_{j,t} \big( t + T_{j,i,t} \big)\, X_{j,i,t,k} \;-\; ord_k\, [\, i = OP_k\, ]$$

denotes the number of days **goods k** is stored at **port i** (the stay time at the destination port is fixed to 0; see constraint 5 below).
122 | 123 | ```python 124 | transportCost = np.sum(y*perCtnCost) + np.sum(z*tranFixedCost) 125 | warehouseCost = warehouse_fee(x)[0] # For details, please refer to the warehouse_fee() function in the code. 126 | taxCost = np.sum(taxPct*kValue) + np.sum(np.sum(np.dot(x,kValue),axis=2)*transitDuty) 127 | model.minimize(transportCost + warehouseCost + taxCost) 128 | ``` 129 | 130 | ### Constraints 131 | 1. For each goods k, it must be shipped out from its origin to another node and shipped to its destination. 132 |
$$\sum_{j,t} X_{OP_k,\,j,\,t,\,k} = 1, \qquad \sum_{i,t} X_{i,\,DP_k,\,t,\,k} = 1 \qquad \forall\,k$$
133 | 134 | ```python 135 | model.add_constraints(np.sum(x[OriginPort[k],:,:,k]) == 1 for k in range(goodsDim)) 136 | model.add_constraints(np.sum(x[:,DestinationPort[k],:,k]) == 1 for k in range(goodsDim)) 137 | ``` 138 | 139 | 2. For each goods k, it must not be shipped out from its destination or shipped into its origin. 140 |
$$\sum_{i,t} X_{i,\,OP_k,\,t,\,k} = 0, \qquad \sum_{j,t} X_{DP_k,\,j,\,t,\,k} = 0 \qquad \forall\,k$$
141 | 142 | ```python 143 | model.add_constraints(np.sum(x[:,OriginPort[k],:,k]) == 0 for k in range(goodsDim)) 144 | model.add_constraints(np.sum(x[DestinationPort[k],:,:,k]) == 0 for k in range(goodsDim)) 145 | ``` 146 | 147 | 3. For each goods k at transition point j (neither origin nor destination), ship-in times must equal ship-out times. 148 |
$$\sum_{i,t} X_{i,\,j,\,t,\,k} = \sum_{i,t} X_{j,\,i,\,t,\,k} \qquad \forall\,k,\; \forall\,j \notin \{OP_k,\, DP_k\}$$
149 | 150 | ```python 151 | for k in range(goodsDim): 152 | for j in range(portDim): 153 | if (j != OriginPort[k]) & (j != DestinationPort[k]): 154 | model.add_constraint(np.sum(x[:,j,:,k])==np.sum(x[j,:,:,k])) 155 | ``` 156 | 157 | 4. Each goods k can transit into or out of any given port at most once. 158 |
$$\sum_{j,t} X_{i,\,j,\,t,\,k} \leq 1 \quad \forall\,i,\,k, \qquad \sum_{i,t} X_{i,\,j,\,t,\,k} \leq 1 \quad \forall\,j,\,k$$
159 | 160 | ```python 161 | model.add_constraints(np.sum(x[i,:,:,k]) <= 1 for k in range(goodsDim) for i in range(portDim)) 162 | model.add_constraints(np.sum(x[:,j,:,k]) <= 1 for k in range(goodsDim) for j in range(portDim)) 163 | ``` 164 | 165 | 5. For each goods k at a transition port j, the ship-out time should be after the ship-in time. For goods k at its origin port, the ship-out time should be after the order date. (Equivalently, the stay time net of the order date must be non-negative, because there is no ship-in time at the origin.) 166 |
$$stay_{i,k} \;\geq\; 0 \qquad \forall\,i,\,k \qquad \text{(with } stay_{i,k} \text{ as defined under the Objective)}$$
167 | 168 | ```python 169 | startTime = np.arange(timeDim).reshape(1,1,timeDim,1)*x 170 | arrTime = startTime + tranTime.reshape(portDim,portDim,timeDim,1)*x 171 | stayTime = np.sum(startTime,axis=(1,2)) - np.sum(arrTime,axis=(0,2)) 172 | stayTime[OriginPort,range(goodsDim)] -= OrderDate # ship-out time at the origin port should be after the order date 173 | stayTime[DestinationPort,range(goodsDim)] = 0 # stay time at the destination port is not considered 174 | model.add_constraints(stayTime[i,k] >= 0 for i in range(portDim) for k in range(goodsDim)) 175 | ``` 176 | 177 | 6. On each route at time t, the total volume of the containers should be at least the total volume of the goods. 178 |
$$ctnV_{i,j}\; Y_{i,j,t} \;\geq\; \sum_{k} V_k\, X_{i,j,t,k} \qquad \forall\,i,\,j,\,t$$
179 | 180 | ```python 181 | numCtn = np.dot(x,kVol) / ctnVol.reshape(portDim,portDim,1) 182 | model.add_constraints(y[i,j,t] - numCtn[i,j,t] >= 0 \ 183 | for i in range(portDim) for j in range(portDim) for t in range(timeDim)) 184 | ``` 185 | 186 | 7. Check whether a route is used at time t. Because ***Zi,j,t*** is a binary variable: if a route is used, the sum of ***Xi,j,t,k*** over all goods k at **i,j,t** must be greater than 0, and multiplying that sum by a small constant scales it into [0,1], which forces ***Zi,j,t*** up to 1. 187 |
$$Z_{i,j,t} \;\geq\; 10^{-4} \sum_{k} X_{i,j,t,k} \qquad \forall\,i,\,j,\,t$$
188 | 189 | ```python 190 | model.add_constraints(z[i,j,t] >= np.sum(x[i,j,t,:])*1e-4 \ 191 | for i in range(portDim) for j in range(portDim) for t in range(timeDim)) 192 | ``` 193 | 194 | 8. For each goods k, it should be shipped to its destination port before the deadline delivery date. 195 |
$$\sum_{i,t} \big( t + T_{i,\,DP_k,\,t} \big)\, X_{i,\,DP_k,\,t,\,k} \;\leq\; ddl_k \qquad \forall\,k$$
196 | 197 | ```python 198 | arrTime = np.arange(timeDim).reshape(1,1,timeDim,1)*x + tranTime.reshape(portDim,portDim,timeDim,1)*x 199 | model.add_constraints(np.sum(arrTime[:,DestinationPort[k],:,k]) <= kDDL[k] for k in range(goodsDim)) 200 | ``` 201 | 202 | ## Optimization Result & Solution 203 | With the objective & constraints built, the model is now complete! To make the result easier to understand, we process it with the function **txt_solution()** and save it into the text file [**Solution.txt**](Solution.txt). The minimized cost as well as the optimal routes for all goods are presented in it. 204 | 205 | ## Model Use & Extension Guide 206 | Now that you know all the details of the model, if you want to use it, just download [**multi-modal transportation.py**](multi-modal%20transportation.py) & [**model data.xlsx**](model%20data.xlsx) from this repo. You can reset all the parameter data in the Excel file to fit your own case. After that, just run the Python file and your **Solution.txt** will be generated! 207 | 208 | The code is written in an OOP style, so you can easily modify or extend the **MMT** class to fit more complex situations. You can also try modifying the assumptions, for example by turning it into a stochastic optimization problem. 209 | 210 | Finally, I hope this long documentation helps, and thanks to all my MSBA teammates (**Derek, Teng Lei, Max, Veronica, Sophia**) for their help and contributions to this project! 211 | --------------------------------------------------------------------------------