├── 1. Pandas Basics.ipynb
├── 2. Pandas replication of Google Trends paper.ipynb
├── 3. Backtesting using Zipline.ipynb
├── README.md
└── data
    ├── GoogleTrendsData.csv
    └── PreisMoatStanley_ScientificReports_3_1684_2013.dat

/README.md:
--------------------------------------------------------------------------------
Financial Analysis in Python
============================

Tutorial at PyData Boston 2013, July 27, 2013 at 4:30 pm.
http://pydata.org/bos2013/schedule/

You can view the video of the talk [here](https://vimeo.com/73875233).

*Thomas Wiecki, Quantopian Inc. and Brown University*

Thomas Wiecki is a Quantitative
Researcher at [Quantopian Inc](https://www.quantopian.com) -- a Boston-based
startup providing you with the first browser-based algorithmic
trading platform -- and a PhD student at Brown University where he
studies Computational Cognitive Neuroscience. He specializes in
Bayesian Inference, Machine Learning, Scientific Computing in Python,
algorithmic trading and Computational Psychiatry.

Description
-----------

This tutorial will provide hands-on experience with various data
analysis tools relevant for financial analysis in Python. We will
first see how financial data can be imported from various sources such
as Yahoo! Finance. Pandas, Matplotlib and statsmodels can be used for
basic and more advanced time-series analysis. While rudimentary
backtesting of investment strategies on historical data can be carried
out using Pandas, a more realistic simulation that accounts for
transaction costs and slippage and avoids look-ahead bias introduces
various complexities. We will then see how Zipline, an open-source
streaming-based financial simulator written in Python, can make
realistic backtesting much easier.
After going through some simple
example algorithms we will see how statistical Python libraries like
scikit-learn can easily be incorporated with Zipline to build
state-of-the-art trading algorithms. Finally, I will briefly show how
the same algorithm code can be run with minimal code changes on
Quantopian -- a free, browser-based platform for developing
algorithmic trading models.

The target audience for the tutorial includes all new Python users,
though we recommend that users also attend the NumPy and IPython
session in the introductory track.

What you will learn
-------------------

* Time-series analysis using Pandas
* Using Google Trends to predict market movements
* Building your own trading strategies using Zipline
* Common trading strategies:
    * Momentum trading
    * Mean-reversion

Student Instructions
--------------------

If you are familiar with Git, you may simply clone this repository
to obtain all the materials (IPython notebooks and data) for the
tutorial. Alternatively, you may [download a zip
file](https://github.com/twiecki/financial-analysis-python-tutorial/archive/master.zip)
containing the materials. A third option is to simply view static
notebooks by clicking on the titles of each section below.

I strongly encourage you to set up the environment on your own
computer so that you can follow along during the tutorial.

After you have the materials, from a command shell cd into the financial-analysis-python-tutorial directory and execute:
```
ipython notebook --pylab=inline
```

This should open a new browser window from which you can access the notebooks.

Outline
-------

[1. Pandas basics](http://nbviewer.ipython.org/urls/raw.github.com/twiecki/financial-analysis-python-tutorial/master/1.%2520Pandas%2520Basics.ipynb)
* Creating/Loading time-series data
* Series and DataFrame: First steps
* Data alignment
* Plotting basics
* Common financial analyses (returns, correlations, ...)

[2. Pandas replication of Google Trends paper](http://nbviewer.ipython.org/urls/raw.github.com/twiecki/financial-analysis-python-tutorial/master/2.%2520Pandas%2520replication%2520of%2520Google%2520Trends%2520paper.ipynb)
* Replication of recent paper: [Quantifying Trading Behavior in Financial Markets Using Google Trends](http://www.nature.com/srep/2013/130425/srep01684/pdf/srep01684.pdf)
* Uses Google search trends to predict market movements

[3. Backtesting using Zipline](http://nbviewer.ipython.org/urls/raw.github.com/twiecki/financial-analysis-python-tutorial/master/3.%2520Backtesting%2520using%2520Zipline.ipynb)
* What gets modeled? Why?
* Stream-based computing
* My first algorithm
* Example momentum trade algorithm
* Example mean-reversion algorithm

[4. Quantopian: Community, Data, Infrastructure, Live Trading](https://www.quantopian.com)
* Quick intro
* Example algorithm

Required Packages
-----------------

* Python 2.7 (*Python 3 is not supported at this point!*)
* pandas >= 0.11.1
* NumPy >= 1.6.1
* SciPy >= 0.7.0
* Matplotlib >= 1.0.0
* Zipline == 0.5.10

The easiest way to install all the necessary packages (except Zipline) is
to use Continuum Analytics' [Anaconda](http://docs.continuum.io/anaconda/install.html).

Zipline can then be installed via pip:
```
pip install zipline
```
--------------------------------------------------------------------------------
/data/GoogleTrendsData.csv:
--------------------------------------------------------------------------------
Date,djia,debt
2004-01-14,10485.18,0.21
2004-01-22,10528.66,0.21
2004-01-28,10702.51,0.21
2004-02-04,10499.18,0.213333
2004-02-11,10579.03,0.2
2004-02-19,10714.88,0.203333
2004-02-25,10609.62,0.2
2004-03-03,10678.14,0.2
2004-03-10,10529.48,0.196667
2004-03-17,10102.89,0.196667
2004-03-24,10064.75,0.193333
2004-03-31,10329.63,0.173333
2004-04-07,10558.37,0.193333
2004-04-14,10515.56,0.18
2004-04-21,10437.85,0.183333
2004-04-28,10444.73,0.196667
2004-05-05,10314.0,0.2
2004-05-12,9990.02,0.18
2004-05-19,9906.91,0.18
2004-05-26,9958.43,0.173333
2004-06-03,10202.65,0.18
2004-06-09,10391.08,0.18
2004-06-16,10334.73,0.196667
2004-06-23,10371.47,0.183333
2004-06-30,10357.09,0.183333
2004-07-08,10219.34,0.193333
2004-07-14,10238.22,0.176667
2004-07-21,10094.06,0.186667
2004-07-28,9961.92,0.19
2004-08-04,10179.16,0.2
2004-08-11,9814.66,0.19
2004-08-18,9954.55,0.173333
2004-08-25,10073.05,0.176667
2004-09-01,10122.52,0.18
2004-09-09,10342.79,0.176667
2004-09-15,10314.76,0.186667
2004-09-22,10204.89,0.186667
2004-09-29,9988.54,0.196667
2004-10-06,10216.54,0.21
2004-10-13,10081.97,0.2
2004-10-20,9956.32,0.203333
2004-10-27,9749.99,0.206667
2004-11-03,10054.39,0.2
2004-11-10,10391.31,0.226667
2004-11-17,10550.24,0.213333
2004-11-24,10489.42,0.216667
2004-12-01,10475.9,0.183333
2004-12-08,10547.06,0.206667
2004-12-15,10638.32,0.196667
2004-12-22,10661.6,0.17
2004-12-29,10776.13,0.133333
2005-01-05,10729.43,0.16
2005-01-12,10621.03,0.19
2005-01-20,10628.79,0.193333
2005-01-26,10368.61,0.186667
2005-02-02,10489.94,0.196667
2005-02-09,10715.76,0.196667
2005-02-16,10791.13,0.19
2005-02-24,10611.2,0.18
2005-03-02,10766.23,0.19
2005-03-09,10936.86,0.19
2005-03-16,10804.51,0.19
2005-03-23,10565.39,0.18
2005-03-30,10485.65,0.18
2005-04-06,10421.14,0.18
2005-04-13,10448.56,0.186667
2005-04-20,10071.25,0.2
2005-04-27,10242.47,0.19
2005-05-04,10251.7,0.19
2005-05-11,10384.34,0.186667
2005-05-18,10252.29,0.18
2005-05-25,10523.56,0.18
2005-06-02,10467.48,0.163333
2005-06-08,10467.03,0.163333
2005-06-15,10522.56,0.166667
2005-06-22,10609.11,0.183333
2005-06-29,10290.78,0.17
2005-07-07,10371.8,0.173333
2005-07-13,10519.72,0.166667
2005-07-20,10574.99,0.173333
2005-07-27,10596.48,0.176667
2005-08-03,10623.15,0.176667
2005-08-10,10536.93,0.17
2005-08-17,10634.38,0.166667
2005-08-24,10569.89,0.163333
2005-08-31,10463.05,0.156667
2005-09-08,10589.24,0.166667
2005-09-14,10682.94,0.156667
2005-09-21,10557.63,0.18
2005-09-28,10443.63,0.186667
2005-10-05,10535.48,0.186667
2005-10-12,10238.76,0.18
2005-10-19,10348.1,0.183333
2005-10-26,10385.0,0.17
2005-11-02,10440.07,0.17
2005-11-09,10586.23,0.173333
2005-11-16,10697.17,0.163333
2005-11-23,10820.28,0.176667
2005-11-30,10890.72,0.14
2005-12-07,10835.01,0.173333
2005-12-14,10767.77,0.17
2005-12-21,10836.53,0.16
2005-12-29,10777.77,0.133333
2006-01-05,10847.41,0.146667
2006-01-11,11011.9,0.18
2006-01-19,10896.32,0.18
2006-01-25,10688.77,0.18
2006-02-01,10899.92,0.18
2006-02-08,10798.27,0.18
2006-02-15,10892.32,0.18
2006-02-23,11069.06,0.173333
2006-03-01,11097.55,0.18
2006-03-08,10958.59,0.18
2006-03-15,11076.02,0.183333
2006-03-22,11274.53,0.196667
2006-03-29,11250.11,0.19
2006-04-05,11144.94,0.19
2006-04-12,11141.33,0.19
2006-04-19,11073.78,0.19
2006-04-26,11336.32,0.19
2006-05-03,11343.29,0.19
2006-05-10,11584.54,0.18
2006-05-17,11428.77,0.176667
2006-05-24,11125.33,0.17
2006-06-01,11094.43,0.163333
2006-06-07,11048.72,0.156667
2006-06-14,10792.58,0.163333
2006-06-21,10942.11,0.17
2006-06-28,11045.28,0.156667
2006-07-05,11228.02,0.16
2006-07-12,11103.55,0.16
2006-07-19,10747.36,0.21
2006-07-26,11051.05,0.193333
2006-08-02,11185.68,0.18
2006-08-09,11219.38,0.18
2006-08-16,11097.87,0.17
2006-08-23,11345.05,0.17
2006-08-30,11352.01,0.17
2006-09-07,11469.28,0.153333
2006-09-13,11396.84,0.16
2006-09-20,11555.0,0.163333
2006-09-27,11575.81,0.17
2006-10-04,11670.35,0.183333
2006-10-11,11857.81,0.18
2006-10-18,11980.6,0.18
2006-10-25,12116.91,0.183333
2006-11-01,12086.5,0.18
2006-11-08,12105.55,0.183333
2006-11-15,12131.88,0.18
2006-11-22,12316.54,0.18
2006-11-29,12121.71,0.153333
2006-12-06,12283.85,0.18
2006-12-13,12328.48,0.173333
2006-12-20,12441.27,0.17
2006-12-28,12407.63,0.14
2007-01-05,12474.52,0.14
2007-01-10,12423.49,0.166667
2007-01-18,12582.59,0.176667
2007-01-24,12477.16,0.176667
2007-01-31,12490.78,0.2
2007-02-07,12661.74,0.19
2007-02-14,12552.55,0.18
2007-02-22,12786.64,0.173333
2007-02-28,12632.26,0.183333
2007-03-07,12050.41,0.18
2007-03-14,12318.62,0.186667
2007-03-21,12226.17,0.18
2007-03-28,12469.07,0.18
2007-04-04,12382.3,0.186667
2007-04-11,12569.14,0.186667
2007-04-18,12720.46,0.186667
2007-04-25,12919.4,0.183333
2007-05-02,13062.91,0.196667
2007-05-09,13312.97,0.196667
2007-05-16,13346.78,0.183333
2007-05-23,13542.88,0.183333
2007-05-31,13521.34,0.173333
2007-06-06,13676.32,0.17
2007-06-13,13424.96,0.18
2007-06-20,13612.98,0.183333
2007-06-27,13352.05,0.18
2007-07-04,13535.43,0.18
2007-07-11,13649.97,0.163333
2007-07-18,13950.98,0.19
2007-07-25,13943.42,0.183333
2007-08-01,13358.31,0.19
2007-08-08,13468.78,0.186667
2007-08-15,13236.53,0.18
2007-08-22,13121.35,0.18
2007-08-29,13322.13,0.173333
2007-09-06,13448.86,0.18
2007-09-12,13127.85,0.163333
2007-09-19,13403.42,0.18
2007-09-26,13759.06,0.193333
2007-10-03,14087.55,0.2
2007-10-10,14043.73,0.2
2007-10-17,13984.8,0.19
2007-10-24,13566.97,0.2
2007-10-31,13870.26,0.2
2007-11-07,13543.4,0.21
2007-11-14,12987.55,0.213333
2007-11-21,12958.44,0.21
2007-11-28,12743.44,0.173333
2007-12-05,13314.57,0.226667
2007-12-12,13727.03,0.226667
2007-12-19,13167.2,0.196667
2007-12-26,13550.04,0.17
2008-01-02,13264.82,0.17
2008-01-09,12827.49,0.2
2008-01-16,12778.15,0.226667
2008-01-24,11971.19,0.226667
2008-01-30,12383.89,0.226667
2008-02-06,12635.16,0.23
2008-02-13,12240.01,0.22
2008-02-21,12337.22,0.206667
2008-02-27,12570.22,0.21
2008-03-05,12258.9,0.226667
2008-03-12,11740.15,0.22
2008-03-19,11972.25,0.21
2008-03-26,12548.64,0.21
2008-04-02,12262.89,0.213333
2008-04-09,12612.43,0.21
2008-04-16,12302.06,0.21
2008-04-23,12825.02,0.21
2008-04-30,12871.75,0.206667
2008-05-07,12969.54,0.203333
2008-05-14,12876.31,0.2
2008-05-21,13028.16,0.19
2008-05-29,12548.35,0.183333
2008-06-04,12503.82,0.18
2008-06-11,12280.32,0.193333
2008-06-18,12269.08,0.19
2008-06-25,11842.36,0.19
2008-07-02,11350.01,0.2
2008-07-09,11231.96,0.18
2008-07-16,11055.19,0.213333
2008-07-23,11467.34,0.21
2008-07-30,11131.08,0.2
2008-08-06,11284.15,0.21
2008-08-13,11782.35,0.196667
2008-08-20,11479.39,0.19
2008-08-27,11386.25,0.2
2008-09-04,11516.92,0.203333
2008-09-10,11510.74,0.203333
2008-09-17,10917.51,0.226667
2008-09-24,11015.69,0.26
2008-10-01,10365.45,0.303333
2008-10-08,9955.5,0.283333
2008-10-15,9387.61,0.316667
2008-10-22,9265.43,0.28
2008-10-29,8175.77,0.24
2008-11-05,9319.83,0.25
2008-11-12,8870.54,0.243333
2008-11-19,8273.58,0.263333
2008-11-26,8443.39,0.253333
2008-12-03,8149.09,0.22
2008-12-10,8934.18,0.253333
2008-12-17,8564.53,0.236667
2008-12-24,8519.69,0.216667
2008-12-31,8483.93,0.17
2009-01-07,8952.89,0.216667
2009-01-14,8473.97,0.256667
2009-01-22,7949.09,0.24
2009-01-28,8116.03,0.236667
2009-02-04,7936.75,0.24
2009-02-11,8270.87,0.256667
2009-02-19,7552.6,0.25
2009-02-25,7114.78,0.263333
2009-03-04,6763.29,0.27
2009-03-11,6547.05,0.253333
2009-03-18,7216.97,0.233333
2009-03-25,7775.86,0.236667
2009-04-01,7522.02,0.25
2009-04-08,7975.85,0.23
2009-04-15,8057.81,0.226667
2009-04-22,7841.73,0.243333
2009-04-29,8025.0,0.23
2009-05-06,8426.74,0.22
2009-05-13,8418.77,0.21
2009-05-20,8504.08,0.206667
2009-05-28,8473.49,0.21
2009-06-03,8721.44,0.203333
2009-06-10,8764.49,0.216667
2009-06-17,8612.13,0.21
2009-06-24,8339.01,0.22
2009-07-01,8529.38,0.22
2009-07-08,8324.87,0.23
2009-07-15,8331.68,0.23
2009-07-22,8848.15,0.23
2009-07-29,9108.51,0.22
2009-08-05,9286.56,0.213333
2009-08-12,9337.95,0.223333
2009-08-19,9135.34,0.226667
2009-08-26,9509.28,0.21
2009-09-02,9496.28,0.243333
2009-09-10,9497.34,0.213333
2009-09-16,9626.8,0.21
2009-09-23,9778.86,0.22
2009-09-30,9789.36,0.216667
2009-10-07,9599.75,0.21
2009-10-14,9885.8,0.21
2009-10-21,10092.19,0.23
2009-10-28,9867.96,0.22
2009-11-04,9789.44,0.21
2009-11-11,10226.94,0.21
2009-11-18,10406.96,0.21
2009-11-25,10450.95,0.226667
2009-12-02,10344.84,0.213333
2009-12-09,10390.11,0.226667
2009-12-16,10501.05,0.21
2009-12-23,10414.14,0.203333
2009-12-30,10547.08,0.146667
2010-01-06,10583.96,0.18
2010-01-13,10663.99,0.216667
2010-01-21,10725.43,0.22
2010-01-27,10196.86,0.22
2010-02-03,10185.53,0.253333
2010-02-10,9908.39,0.27
2010-02-18,10268.81,0.243333
2010-02-24,10383.38,0.243333
2010-03-03,10403.79,0.24
2010-03-10,10552.52,0.236667
2010-03-17,10642.15,0.223333
2010-03-24,10785.89,0.22
2010-03-31,10895.86,0.243333
2010-04-07,10973.55,0.21
2010-04-14,11005.97,0.223333
2010-04-21,11092.05,0.23
2010-04-28,11205.03,0.213333
2010-05-05,11151.83,0.246667
2010-05-12,10785.14,0.233333
2010-05-19,10625.83,0.22
2010-05-26,10066.57,0.2
2010-06-03,10024.02,0.196667
2010-06-09,9816.49,0.186667
2010-06-16,10190.89,0.2
2010-06-23,10442.41,0.19
2010-06-30,10138.52,0.193333
2010-07-08,9743.62,0.19
2010-07-14,10216.27,0.173333
2010-07-21,10154.43,0.18
2010-07-28,10525.43,0.186667
2010-08-04,10674.38,0.18
2010-08-11,10698.75,0.173333
2010-08-18,10302.01,0.18
2010-08-25,10174.41,0.17
2010-09-01,10009.73,0.17
2010-09-09,10340.69,0.163333
2010-09-15,10544.13,0.166667
2010-09-22,10753.62,0.173333
2010-09-29,10812.04,0.176667
2010-10-06,10751.27,0.18
2010-10-13,11010.34,0.18
2010-10-20,11143.69,0.18
2010-10-27,11164.05,0.186667
2010-11-03,11124.62,0.19
2010-11-10,11406.84,0.2
2010-11-17,11201.97,0.2
2010-11-24,11178.58,0.193333
2010-12-01,11052.49,0.14
2010-12-08,11362.19,0.186667
2010-12-15,11428.56,0.18
2010-12-22,11478.13,0.16
2010-12-29,11555.03,0.12
2011-01-05,11670.75,0.13
2011-01-12,11637.45,0.186667
2011-01-20,11837.93,0.17
2011-01-26,11980.52,0.18
2011-02-02,11891.93,0.19
2011-02-09,12161.63,0.176667
2011-02-16,12268.19,0.173333
2011-02-24,12212.79,0.18
2011-03-02,12226.34,0.17

--------------------------------------------------------------------------------
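The `Date,djia,debt` layout of `data/GoogleTrendsData.csv` can be read with very little code. The following is a minimal sketch, not the notebook's own code: it uses only the standard library (the tutorial itself works with pandas), embeds the first six rows of the file so that it runs without the repository, and the helper names `load_rows`, `weekly_returns` and `search_signal` as well as the three-row trailing window are illustrative assumptions.

```python
import csv

# First rows of data/GoogleTrendsData.csv, copied from the file listing.
SAMPLE = """Date,djia,debt
2004-01-14,10485.18,0.21
2004-01-22,10528.66,0.21
2004-01-28,10702.51,0.21
2004-02-04,10499.18,0.213333
2004-02-11,10579.03,0.2
2004-02-19,10714.88,0.203333
"""


def load_rows(lines):
    """Parse CSV lines into (date, djia, debt) tuples (hypothetical helper)."""
    reader = csv.DictReader(lines)
    return [(r["Date"], float(r["djia"]), float(r["debt"])) for r in reader]


def weekly_returns(rows):
    """Simple return of the DJIA column from each row to the next."""
    return [(cur[1] - prev[1]) / prev[1] for prev, cur in zip(rows, rows[1:])]


def search_signal(rows, window=3):
    """Search volume minus its mean over the previous `window` rows.

    The window length is an illustrative assumption, not taken from the README.
    """
    debts = [r[2] for r in rows]
    return [
        debts[i] - sum(debts[i - window:i]) / window
        for i in range(window, len(debts))
    ]


rows = load_rows(SAMPLE.splitlines())
returns = weekly_returns(rows)
signal = search_signal(rows)

for (date, _, _), value in zip(rows[3:], signal):
    print(date, "search volume vs. trailing mean:", round(value, 6))
```

To run this on the full file, pass an open file handle instead of the embedded sample, for example `load_rows(open("data/GoogleTrendsData.csv"))`.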