├── .gitignore ├── .travis.yml ├── README.md ├── docs ├── .buildinfo ├── .doctrees │ ├── environment.pickle │ ├── index.doctree │ ├── intro.doctree │ ├── readme.doctree │ └── tmp.doctree ├── Makefile ├── build │ ├── .buildinfo │ └── .doctrees │ │ ├── asset_class.doctree │ │ ├── environment.pickle │ │ ├── index.doctree │ │ ├── intro.doctree │ │ └── readme.doctree └── source │ ├── .doctrees │ ├── asset_class.doctree │ ├── index.doctree │ ├── intro.doctree │ ├── modules.doctree │ ├── readme.doctree │ ├── visualize_wealth.analyze.doctree │ ├── visualize_wealth.classify.doctree │ ├── visualize_wealth.construct_portfolio.doctree │ ├── visualize_wealth.doctree │ └── visualize_wealth.utils.doctree │ ├── asset_class.rst │ ├── conf.py │ ├── images │ └── subclass_overtime.png │ ├── index.rst │ ├── intro.rst │ ├── readme.rst │ └── visualize_wealth.rst ├── requirements.txt ├── run_tests ├── setup.py ├── test_data ├── estimating when splits have occurred.xlsx ├── panel from weight file test.xlsx ├── test_analyze.xlsx ├── test_ret_calcs.xlsx ├── test_splits.xlsx ├── transaction-costs.xlsx └── ~$panel from weight file test.xlsx ├── test_module ├── __init__.py ├── test_analyze.py ├── test_construct_portfolio.py └── test_utils.py └── visualize_wealth ├── __init__.py ├── analyze.py ├── classify.py ├── construct_portfolio.py └── utils.py /.gitignore: -------------------------------------------------------------------------------- 1 | .DS_Store 2 | *~ 3 | build/ 4 | *.pyc 5 | *.dropbox 6 | *.egg-info/ 7 | dist/ 8 | docs/ 9 | .coverage 10 | -------------------------------------------------------------------------------- /.travis.yml: -------------------------------------------------------------------------------- 1 | language: python 2 | python: 3 | - 2.7 4 | # command to install dependencies 5 | before_install: 6 | - wget http://repo.continuum.io/miniconda/Miniconda-latest-Linux-x86_64.sh -O miniconda.sh 7 | - chmod +x miniconda.sh 8 | - ./miniconda.sh -b 9 | - export PATH=/home/travis/miniconda/bin:$PATH 10 | - conda update --yes conda 11 | 12 | install: 13 | - conda install --yes python=$TRAVIS_PYTHON_VERSION atlas numpy scipy pytest 14 | - conda install --yes python=$TRAVIS_PYTHON_VERSION matplotlib nose dateutil 15 | - conda install --yes python=$TRAVIS_PYTHON_VERSION pandas statsmodels pytables xlrd 16 | 17 | # - python setup.py install 18 | # - pip install -r preamble.txt 19 | # - pip install -r requirements.txt 20 | # - pip install -r denouement.txt 21 | 22 | # command to run tests 23 | script: 24 | - py.test ./test_module/test_analyze.py -v 25 | - py.test ./test_module/test_utils.py -v 26 | - py.test ./test_module/test_construct_portfolio.py -v 27 | 28 | # the body of this script was found by @dan-blanchard at https://gist.github.com/dan-blanchard/7045057 29 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | #`visualize_wealth` README.md [![Build Status](https://travis-ci.org/benjaminmgross/visualize-wealth.svg?branch=master)](https://travis-ci.org/benjaminmgross/visualize-wealth) 2 | 3 | A library built in Python to construct, backtest, analyze, and evaluate portfolios and their benchmarks, with comprehensive documentation and manual calculations to illustrate all underlying methodologies and statistics. 
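For a quick taste of the API before the detailed walkthroughs below, here is a minimal end-to-end sketch. It only uses functions demonstrated later in this README, and assumes the package is already installed:

    #a minimal sketch: random trade blotter -> panel -> portfolio path
    import visualize_wealth.construct_portfolio as vwcp

    #random blotter of 20 trades for each of two tickers
    blotter = vwcp.generate_random_portfolio_blotter(['EFA', 'EEM'], 20)
    panel = vwcp.panel_from_blotter(blotter)

    #cumulative portfolio values with columns ['Open', 'Close'],
    #starting from an investment of 1000.
    port_df = vwcp.pfp_from_blotter(panel, 1000.)

Each step is explained in the [Portfolio Construction Examples](portfolio-construction-examples) section below.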
4 | 5 | ##License 6 | 7 | This program is free software and is distributed under the 8 | [GNU General Public License version 3](http://www.gnu.org/licenses/quick-guide-gplv3.html) ("GNU GPL v3") 9 | 10 | © Benjamin M. Gross 2013 11 | 12 | **NOTE:** Because so much of the underlying technology I'm continuing to build has become the building blocks 13 | for [my financial technology startup](http://www.visualizewealth.com), I've forked this repo (as of 5.2015) and made new 14 | changes private. I might continue to push some of the bigger changes to this repo to keep it open source, but 15 | we'll see. 16 | 17 | ##Dependencies 18 | 19 | - `numpy` & `scipy`: The building blocks of everything quant 20 | - `pandas`: extensively used (`numpy` and `scipy` obviously, but 21 | `pandas` depends on those) 22 | - `tables`: for HDFStore price extraction 23 | - `urllib2`: for Yahoo! API calls to append price `DataFrame`s with 24 | Dividends 25 | 26 | For a full list of dependencies, see the `requirements.txt` file in 27 | the root folder. 28 | 29 | 30 | ##Installation 31 | 32 | To install the `visualize_wealth` modules onto your computer, go into 33 | your folder of choice (say `Downloads`), and: 34 | 35 | 1. Clone the repository 36 | 37 | $ cd ~/Downloads 38 | $ git clone https://github.com/benjaminmgross/wealth-viz 39 | 40 | 2. `cd` into the `wealth-viz` directory 41 | 42 | $ cd wealth-viz 43 | 44 | 3. Install the package 45 | 46 | $ python setup.py install 47 | 48 | 4. Check your install. From anywhere on your machine, you should be able to open 49 | `IPython` and import the library, for example: 50 | 51 | $ cd ~/ 52 | $ ipython 53 | 54 | IPython 1.1.0 -- An enhanced Interactive Python. 55 | ? -> Introduction and overview of IPython's features. 56 | %quickref -> Quick reference. 57 | help -> Python's own help system. 58 | object? -> Details about 'object', use 'object??' for extra details. 59 | 60 | In [1]: import visualize_wealth 61 | 62 | **"Ligget Se!"** 63 | 64 | ##Documentation 65 | 66 | The `README.md` file has fairly good examples, but I've gone to great lengths to autogenerate documentation for the code using [Sphinx](http://sphinx-doc.org/). Therefore, aside from the docstrings, when you `git clone` the repository, use these instructions to generate the auto-documentation: 67 | 68 | 1. `cd /path-to-wealth-viz/` 69 | 2. `sphinx-build -b html ./docs/source/ ./docs/build/` 70 | 71 | Now that the autogenerated documentation is complete, you can `cd` into: 72 | 73 | $ cd visualize_wealth/docs/build/ 74 | 75 | and find full `.html` browsable code documentation (that's pretty beautiful... if I do say so my damn self) with live links, function explanations (that also have live links to their respective definitions on the web), etc. 76 | 77 | Also, I've created an Excel spreadsheet that illustrates almost all of the `analyze.py` portfolio statistic calculations. That spreadsheet can be found in: 78 | 79 | visualize_wealth > tests > test_analyze.xlsx 80 | 81 | In fact, the unit testing for the `analyze.py` portfolio statistics tests the Python calculations against this same Excel spreadsheet, so you can really get into the guts of how these things are calculated. 82 | 83 | 84 | ##[Portfolio Construction Examples](portfolio-construction-examples) 85 | 86 | Portfolios can (generally) be constructed in one of three ways: 87 | 88 | 1. The Blotter Method 89 | 2. The Weight Allocation Method 90 | 3. The Initial Allocation with specific Rebalancing Period Method 91 | 92 | ### 1.
[The Blotter Method](blotter-method-examples) 93 | 94 | **The blotter method:** In finance, a spreadsheet of "buys/sells", "Prices", "Dates" etc. is called a "trade blotter." This would also be the easiest way for an investor to actually analyze the past performance of her portfolio, because trade confirmations provide this exact data. 95 | 96 | This method is most effectively achieved by providing an Excel / `.csv` file with the following format: 97 | 98 | | Date |Buy / Sell| Price |Ticker| 99 | |:-------|:---------|:------|:-----| 100 | |9/4/2001| 50 | 123.45| EFA | 101 | |5/5/2003| 65 | 107.71| EEM | 102 | |6/6/2003|-15 | 118.85| EEM | 103 | 104 | where "Buys" can be distinguished from "Sells" because buys are positive (+) and sells are negative (-). 105 | 106 | For example, let's say I wanted to generate a random portfolio containing the following tickers and respective asset classes, using the `generate_random_portfolio_blotter` method: 107 | 108 | |Ticker | Description | Asset Class | Price Start| 109 | |:-------|:-------------------------|:-------------------|:-----------| 110 | | IWB | iShares Russell 1000 | US Equity | 5/19/2000 | 111 | | IWR | iShares Russell Midcap | US Equity | 8/27/2001 | 112 | | IWM | iShares Russell 2000 | US Equity | 5/26/2000 | 113 | | EFA | iShares EAFE | Foreign Dev Equity | 8/27/2001 | 114 | | EEM | iShares EAFE EM | Foreign EM Equity | 4/15/2003 | 115 | | TIP | iShares TIPS | Fixed Income | 12/5/2003 | 116 | | TLT | iShares LT Treasuries | Fixed Income | 7/31/2002 | 117 | | IEF | iShares MT Treasuries | Fixed Income | 7/31/2002 | 118 | | SHY | iShares ST Treasuries | Fixed Income | 7/31/2002 | 119 | | LQD | iShares Inv Grade | Fixed Income | 7/31/2002 | 120 | | IYR | iShares Real Estate | Alternative | 6/19/2000 | 121 | | GLD | iShares Gold Index | Alternative | 11/18/2004 | 122 | | GSG | iShares Commodities | Alternative | 7/21/2006 | 123 | 124 | I could construct a portfolio of random trades (i.e. the "blotter method"), say 20 trades for each asset, by executing the following: 125 | 126 | #import the modules 127 | In [5]: import visualize_wealth.construct_portfolio as vwcp 128 | 129 | In [6]: ticks = ['IWB','IWR','IWM','EFA','EEM','TIP','TLT','IEF', 130 | 'SHY','LQD','IYR','GLD','GSG'] 131 | In [7]: num_trades = 20 132 | 133 | #construct the random trade blotter 134 | In [8]: blotter = vwcp.generate_random_portfolio_blotter(ticks, num_trades) 135 | 136 | #construct the portfolio panel 137 | In [9]: port_panel = vwcp.panel_from_blotter(blotter) 138 | 139 | Now I have a `pandas.Panel`.
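If you have real trades rather than randomly generated ones, the same construction path works from your own blotter file. A minimal sketch, assuming `panel_from_blotter` accepts any `DataFrame` in the four-column format shown above (the file name here is hypothetical):

    import pandas
    import visualize_wealth.construct_portfolio as vwcp

    #read a trade blotter saved in the Date | Buy / Sell | Price | Ticker format,
    #with the dates as the index
    my_blotter = pandas.read_csv('my_trades.csv', index_col = 0,
                                 parse_dates = True)

    #then the same construction call as the random example above
    port_panel = vwcp.panel_from_blotter(my_blotter)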
Before we construct the cumulative portfolio values, let's examine the dimensions of the panel (which are generally the same for all construction methods, although the columns of the `minor_axis` differ because the methods call for different optimized calculations): 140 | 141 | #tickers are `panel.items` 142 | In [10]: port_panel.items 143 | Out[10]: Index([u'EEM', u'EFA', u'GLD', u'GSG', u'IEF', u'IWB', u'IWM', u'IWR', 144 | u'IYR', u'LQD', u'SHY', u'TIP', u'TLT'], dtype=object) 145 | 146 | #dates are along the `panel.major_axis` 147 | In [12]: port_panel.major_axis 148 | Out[12]: 149 | <class 'pandas.tseries.index.DatetimeIndex'> 150 | [2000-07-06 00:00:00, ..., 2013-10-30 00:00:00] 151 | Length: 3351, Freq: None, Timezone: None 152 | 153 | #price data, cumulative investment, dividends, and split ratios are `panel.minor_axis` 154 | In [13]: port_panel.minor_axis 155 | Out[13]: Index([u'Open', u'High', u'Low', u'Close', u'Volume', u'Adj Close', 156 | u'Dividends', u'Splits', u'contr_withdrawal', u'cum_investment', 157 | u'cum_shares'], dtype=object) 158 | 159 | There is a lot of information to be gleaned from this data object, but the most common goal would be to convert this `pandas.Panel` to a portfolio `pandas.DataFrame` with columns `['Open', 'Close']`, so it can be compared against other assets or combinations of assets. In this case, use `pfp_from_blotter` (which stands for "portfolio_from_panel" plus the portfolio construction method [i.e. blotter, weights, or initial allocation], which in this case was "the blotter method"). 160 | 161 | #construct the portfolio series 162 | In [14]: port_df = vwcp.pfp_from_blotter(port_panel, 1000.) 163 | 164 | In [117]: port_df.head() 165 | Out[117]: 166 | Close Open 167 | Date 168 | 2000-07-06 1000.000000 988.744754 169 | 2000-07-07 1006.295307 1000.190767 170 | 2000-07-10 1012.876765 1005.723006 171 | 2000-07-11 1011.636780 1011.064479 172 | 2000-07-12 1031.953453 1016.978253 173 | 174 | ### 2. [The Weight Allocation Method](weight-allocation-method-examples) 175 | 176 | A commonplace way to test portfolio management strategies using a 177 | group of underlying assets is to construct aggregate portfolio 178 | performance, given a specified weighting allocation to specific assets 179 | on specified dates. Specifically, those percentage 180 | allocations oftentimes represent a recommended allocation at some point in time, 181 | based on some "view" derived from either the output of a model or some qualitative 182 | analysis. Therefore, having an engine that is capable of taking in a weighting file (say, a `.csv`) with the following format: 183 | 184 | |Date | Ticker 1 | Ticker 2 | Ticker 3 | Ticker 4 | 185 | |:-------|:---------:|:---------:|:--------:|:--------:| 186 | |1/1/2002| 5% | 20% | 30% | 45% | 187 | |6/3/2003| 40% | 10% | 40% | 10% | 188 | |7/8/2003| 25% | 25% | 25% | 25% | 189 | 190 | and turning the above allocation file into a cumulative portfolio 191 | value that can then be analyzed and compared (both in isolation and 192 | relative to specified benchmarks) is highly valuable in the process of 193 | portfolio strategy creation. 194 | 195 | A quick example of a weighting allocation file can be found in the 196 | Excel file `visualize_wealth/tests/panel from weight file test.xlsx`, 197 | where the tab `rebal_weights` represents one of these specific 198 | weighting files. 199 | 200 | To construct a portfolio using the **Weight Allocation Method**, 201 | a process such as the following would be carried out.
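First, the weight `DataFrame` itself. A short sketch of assembling one by hand instead of reading it from a `.csv`, mirroring the example table above (the tickers stand in for Tickers 1-4, and I'm assuming the weights are stored as decimals that sum to one on each date):

    import pandas

    #each row is a rebalance date, each column a ticker's target weight
    weight_df = pandas.DataFrame(
        {'EFA': [.05, .40, .25], 'EEM': [.20, .10, .25],
         'TIP': [.30, .40, .25], 'TLT': [.45, .10, .25]},
        index = pandas.to_datetime(['1/1/2002', '6/3/2003', '7/8/2003']))

With `weight_df` in hand, the construction process is as follows: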
202 | 203 | #import the library 204 | import visualize_wealth.construct_portfolio as vwcp 205 | 206 | If we didn't have the prices already, there's a function for that: 207 | 208 | #fetch the prices and put them into a pandas.Panel 209 | price_panel = vwcp.fetch_data_for_weight_allocation_method(weight_df) 210 | 211 | #construct the panel that will go into the portfolio constructor 212 | 213 | port_panel = vwcp.panel_from_weight_file(weight_df, price_panel, 214 | start_value = 1000.) 215 | 216 | Construct the `pandas.DataFrame` for the portfolio, starting at a 217 | `start_value` of 1000, with columns `['Open', 'Close']`: 218 | 219 | portfolio = vwcp.pfp_from_weight_file(port_panel) 220 | 221 | Now a portfolio with an `index` of daily values and columns 222 | `['Open', 'Close']` has been created, upon which analytics and 223 | performance analysis can be done. 224 | 225 | ### 3. [The Initial Allocation & Rebalancing Method](initial-allocation-method-examples) 226 | 227 | The standard method of portfolio construction that prevails in many 228 | circles to this day is static allocation with a given interval of 229 | rebalancing. For instance, if I wanted to implement Oppenheimer's 230 | [The New 60/40](https://www.oppenheimerfunds.com/digitalAssets/Discover-the-New-60-40-43f7f642-e0aa-40d9-a3fc-00f31be5a4fa.pdf) 231 | static portfolio, rebalancing on a yearly interval, my weighting 232 | scheme would be as follows: 233 | 234 | | Ticker | Name | Asset Class | Allocation | 235 | |:-------|:-------------------------|:-------------------|:-----------| 236 | | IWB | iShares Russell 1000 | US Equity | 15% | 237 | | IWR | iShares Russell Midcap | US Equity | 7.5% | 238 | | IWM | iShares Russell 2000 | US Equity | 7.5% | 239 | | SCZ | iShares EAFE Small Cap | Foreign Dev Equity | 7.5% | 240 | | EFA | iShares EAFE | Foreign Dev Equity | 12.5% | 241 | | EEM | iShares EAFE EM | Foreign EM Equity | 10% | 242 | | TIP | iShares TIPS | Fixed Income | 5% | 243 | | TLT | iShares LT Treasuries | Fixed Income | 2.5% | 244 | | IEF | iShares MT Treasuries | Fixed Income | 2.5% | 245 | | SHY | iShares ST Treasuries | Fixed Income | 5% | 246 | | HYG | iShares High Yield | Fixed Income | 2.5% | 247 | | LQD | iShares Inv Grade | Fixed Income | 2.5% | 248 | | PCY | PowerShares EM Sovereign | Fixed Income | 2% | 249 | | BWX | SPDR intl Treasuries | Fixed Income | 2% | 250 | | MBB | iShares MBS | Fixed Income | 1% | 251 | | PFF | iShares Preferred Equity | Alternative | 2.5% | 252 | | IYR | iShares Real Estate | Alternative | 5% | 253 | | GLD | iShares Gold Index | Alternative | 2.5% | 254 | | GSG | iShares Commodities | Alternative | 5% | 255 | 256 | To implement such a weighting scheme, we can use the same worksheet 257 | `visualize_wealth/tests/panel from weight file test.xlsx`, and the 258 | tab `static_allocation`. Note there is only a single row of 259 | weights, as this will be the "static allocation" to be rebalanced to 260 | at some given interval.
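One cheap sanity check before construction: the static weights should sum to one, just as the allocation table above totals 100%. A sketch, to be run once `static_alloc` has been parsed from the workbook as shown in the next step:

    import numpy

    #static_alloc holds a single row of target weights
    assert numpy.allclose(static_alloc.sum().sum(), 1.), "weights must sum to 1"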
261 | 262 | #import the construct_portfolio library 263 | import visualize_wealth.construct_portfolio as vwcp 264 | 265 | Let's use the `static_allocation` provided in the `panel from weight 266 | file test.xlsx` workbook (this step also requires `pandas` to be imported): 267 | 268 | f = pandas.ExcelFile('tests/panel from weight file test.xlsx') 269 | static_alloc = f.parse('static_allocation', index_col = 0, 270 | header = 0) 271 | 272 | Again, assuming we don't have the prices and need to download them, use 273 | the `fetch_data_for_initial_allocation_method`: 274 | 275 | price_panel = vwcp.fetch_data_for_initial_allocation_method(static_alloc) 276 | 277 | Construct the `panel` for the portfolio while determining the desired 278 | rebalance frequency: 279 | 280 | panel = vwcp.panel_from_initial_weights(weight_series = static_alloc, 281 | price_panel = price_panel, rebal_frequency = 'quarterly') 282 | 283 | 284 | Construct the final portfolio with columns `['Open', 'Close']`: 285 | 286 | portfolio = vwcp.pfp_from_weight_file(panel) 287 | 288 | Take a look at the portfolio series: 289 | 290 | In [10]: portfolio.head() 291 | Out[10]: 292 | 293 | Close Open 294 | Date 295 | 2007-12-12 1000.000000 1007.885932 296 | 2007-12-13 991.329125 990.717915 297 | 2007-12-14 978.157960 983.057829 298 | 2007-12-17 961.705069 969.797167 299 | 2007-12-18 969.794966 972.365687 300 | 301 | 302 | 303 | 304 | 305 | 306 | -------------------------------------------------------------------------------- /docs/.buildinfo: -------------------------------------------------------------------------------- 1 | # Sphinx build info version 1 2 | # This file hashes the configuration used when building these files. When it is not found, a full rebuild will be done. 3 | config: 0555db9e86a52ceab584e8f2f5bcac04 4 | tags: fbb0d17656682115ca4d033fb2f83ba1 5 | -------------------------------------------------------------------------------- /docs/.doctrees/environment.pickle: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/benjaminmgross/visualize-wealth/76f3f0fd815a25994ab9140a9d85cf7aae81f9fd/docs/.doctrees/environment.pickle -------------------------------------------------------------------------------- /docs/.doctrees/index.doctree: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/benjaminmgross/visualize-wealth/76f3f0fd815a25994ab9140a9d85cf7aae81f9fd/docs/.doctrees/index.doctree -------------------------------------------------------------------------------- /docs/.doctrees/intro.doctree: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/benjaminmgross/visualize-wealth/76f3f0fd815a25994ab9140a9d85cf7aae81f9fd/docs/.doctrees/intro.doctree -------------------------------------------------------------------------------- /docs/.doctrees/readme.doctree: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/benjaminmgross/visualize-wealth/76f3f0fd815a25994ab9140a9d85cf7aae81f9fd/docs/.doctrees/readme.doctree -------------------------------------------------------------------------------- /docs/.doctrees/tmp.doctree: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/benjaminmgross/visualize-wealth/76f3f0fd815a25994ab9140a9d85cf7aae81f9fd/docs/.doctrees/tmp.doctree --------------------------------------------------------------------------------
/docs/Makefile: -------------------------------------------------------------------------------- 1 | # Makefile for Sphinx documentation 2 | # 3 | 4 | # You can set these variables from the command line. 5 | SPHINXOPTS = 6 | SPHINXBUILD = sphinx-build 7 | PAPER = 8 | BUILDDIR = build 9 | 10 | # User-friendly check for sphinx-build 11 | ifeq ($(shell which $(SPHINXBUILD) >/dev/null 2>&1; echo $$?), 1) 12 | $(error The '$(SPHINXBUILD)' command was not found. Make sure you have Sphinx installed, then set the SPHINXBUILD environment variable to point to the full path of the '$(SPHINXBUILD)' executable. Alternatively you can add the directory with the executable to your PATH. If you don't have Sphinx installed, grab it from http://sphinx-doc.org/) 13 | endif 14 | 15 | # Internal variables. 16 | PAPEROPT_a4 = -D latex_paper_size=a4 17 | PAPEROPT_letter = -D latex_paper_size=letter 18 | ALLSPHINXOPTS = -d $(BUILDDIR)/doctrees $(PAPEROPT_$(PAPER)) $(SPHINXOPTS) source 19 | # the i18n builder cannot share the environment and doctrees with the others 20 | I18NSPHINXOPTS = $(PAPEROPT_$(PAPER)) $(SPHINXOPTS) source 21 | 22 | .PHONY: help clean html dirhtml singlehtml pickle json htmlhelp qthelp devhelp epub latex latexpdf text man changes linkcheck doctest gettext 23 | 24 | help: 25 | @echo "Please use \`make ' where is one of" 26 | @echo " html to make standalone HTML files" 27 | @echo " dirhtml to make HTML files named index.html in directories" 28 | @echo " singlehtml to make a single large HTML file" 29 | @echo " pickle to make pickle files" 30 | @echo " json to make JSON files" 31 | @echo " htmlhelp to make HTML files and a HTML help project" 32 | @echo " qthelp to make HTML files and a qthelp project" 33 | @echo " devhelp to make HTML files and a Devhelp project" 34 | @echo " epub to make an epub" 35 | @echo " latex to make LaTeX files, you can set PAPER=a4 or PAPER=letter" 36 | @echo " latexpdf to make LaTeX files and run them through pdflatex" 37 | @echo " latexpdfja to make LaTeX files and run them through platex/dvipdfmx" 38 | @echo " text to make text files" 39 | @echo " man to make manual pages" 40 | @echo " texinfo to make Texinfo files" 41 | @echo " info to make Texinfo files and run them through makeinfo" 42 | @echo " gettext to make PO message catalogs" 43 | @echo " changes to make an overview of all changed/added/deprecated items" 44 | @echo " xml to make Docutils-native XML files" 45 | @echo " pseudoxml to make pseudoxml-XML files for display purposes" 46 | @echo " linkcheck to check all external links for integrity" 47 | @echo " doctest to run all doctests embedded in the documentation (if enabled)" 48 | 49 | clean: 50 | rm -rf $(BUILDDIR)/* 51 | 52 | html: 53 | $(SPHINXBUILD) -b html $(ALLSPHINXOPTS) $(BUILDDIR)/html 54 | @echo 55 | @echo "Build finished. The HTML pages are in $(BUILDDIR)/html." 56 | 57 | dirhtml: 58 | $(SPHINXBUILD) -b dirhtml $(ALLSPHINXOPTS) $(BUILDDIR)/dirhtml 59 | @echo 60 | @echo "Build finished. The HTML pages are in $(BUILDDIR)/dirhtml." 61 | 62 | singlehtml: 63 | $(SPHINXBUILD) -b singlehtml $(ALLSPHINXOPTS) $(BUILDDIR)/singlehtml 64 | @echo 65 | @echo "Build finished. The HTML page is in $(BUILDDIR)/singlehtml." 66 | 67 | pickle: 68 | $(SPHINXBUILD) -b pickle $(ALLSPHINXOPTS) $(BUILDDIR)/pickle 69 | @echo 70 | @echo "Build finished; now you can process the pickle files." 71 | 72 | json: 73 | $(SPHINXBUILD) -b json $(ALLSPHINXOPTS) $(BUILDDIR)/json 74 | @echo 75 | @echo "Build finished; now you can process the JSON files." 
76 | 77 | htmlhelp: 78 | $(SPHINXBUILD) -b htmlhelp $(ALLSPHINXOPTS) $(BUILDDIR)/htmlhelp 79 | @echo 80 | @echo "Build finished; now you can run HTML Help Workshop with the" \ 81 | ".hhp project file in $(BUILDDIR)/htmlhelp." 82 | 83 | qthelp: 84 | $(SPHINXBUILD) -b qthelp $(ALLSPHINXOPTS) $(BUILDDIR)/qthelp 85 | @echo 86 | @echo "Build finished; now you can run "qcollectiongenerator" with the" \ 87 | ".qhcp project file in $(BUILDDIR)/qthelp, like this:" 88 | @echo "# qcollectiongenerator $(BUILDDIR)/qthelp/VisualizeWealth.qhcp" 89 | @echo "To view the help file:" 90 | @echo "# assistant -collectionFile $(BUILDDIR)/qthelp/VisualizeWealth.qhc" 91 | 92 | devhelp: 93 | $(SPHINXBUILD) -b devhelp $(ALLSPHINXOPTS) $(BUILDDIR)/devhelp 94 | @echo 95 | @echo "Build finished." 96 | @echo "To view the help file:" 97 | @echo "# mkdir -p $$HOME/.local/share/devhelp/VisualizeWealth" 98 | @echo "# ln -s $(BUILDDIR)/devhelp $$HOME/.local/share/devhelp/VisualizeWealth" 99 | @echo "# devhelp" 100 | 101 | epub: 102 | $(SPHINXBUILD) -b epub $(ALLSPHINXOPTS) $(BUILDDIR)/epub 103 | @echo 104 | @echo "Build finished. The epub file is in $(BUILDDIR)/epub." 105 | 106 | latex: 107 | $(SPHINXBUILD) -b latex $(ALLSPHINXOPTS) $(BUILDDIR)/latex 108 | @echo 109 | @echo "Build finished; the LaTeX files are in $(BUILDDIR)/latex." 110 | @echo "Run \`make' in that directory to run these through (pdf)latex" \ 111 | "(use \`make latexpdf' here to do that automatically)." 112 | 113 | latexpdf: 114 | $(SPHINXBUILD) -b latex $(ALLSPHINXOPTS) $(BUILDDIR)/latex 115 | @echo "Running LaTeX files through pdflatex..." 116 | $(MAKE) -C $(BUILDDIR)/latex all-pdf 117 | @echo "pdflatex finished; the PDF files are in $(BUILDDIR)/latex." 118 | 119 | latexpdfja: 120 | $(SPHINXBUILD) -b latex $(ALLSPHINXOPTS) $(BUILDDIR)/latex 121 | @echo "Running LaTeX files through platex and dvipdfmx..." 122 | $(MAKE) -C $(BUILDDIR)/latex all-pdf-ja 123 | @echo "pdflatex finished; the PDF files are in $(BUILDDIR)/latex." 124 | 125 | text: 126 | $(SPHINXBUILD) -b text $(ALLSPHINXOPTS) $(BUILDDIR)/text 127 | @echo 128 | @echo "Build finished. The text files are in $(BUILDDIR)/text." 129 | 130 | man: 131 | $(SPHINXBUILD) -b man $(ALLSPHINXOPTS) $(BUILDDIR)/man 132 | @echo 133 | @echo "Build finished. The manual pages are in $(BUILDDIR)/man." 134 | 135 | texinfo: 136 | $(SPHINXBUILD) -b texinfo $(ALLSPHINXOPTS) $(BUILDDIR)/texinfo 137 | @echo 138 | @echo "Build finished. The Texinfo files are in $(BUILDDIR)/texinfo." 139 | @echo "Run \`make' in that directory to run these through makeinfo" \ 140 | "(use \`make info' here to do that automatically)." 141 | 142 | info: 143 | $(SPHINXBUILD) -b texinfo $(ALLSPHINXOPTS) $(BUILDDIR)/texinfo 144 | @echo "Running Texinfo files through makeinfo..." 145 | make -C $(BUILDDIR)/texinfo info 146 | @echo "makeinfo finished; the Info files are in $(BUILDDIR)/texinfo." 147 | 148 | gettext: 149 | $(SPHINXBUILD) -b gettext $(I18NSPHINXOPTS) $(BUILDDIR)/locale 150 | @echo 151 | @echo "Build finished. The message catalogs are in $(BUILDDIR)/locale." 152 | 153 | changes: 154 | $(SPHINXBUILD) -b changes $(ALLSPHINXOPTS) $(BUILDDIR)/changes 155 | @echo 156 | @echo "The overview file is in $(BUILDDIR)/changes." 157 | 158 | linkcheck: 159 | $(SPHINXBUILD) -b linkcheck $(ALLSPHINXOPTS) $(BUILDDIR)/linkcheck 160 | @echo 161 | @echo "Link check complete; look for any errors in the above output " \ 162 | "or in $(BUILDDIR)/linkcheck/output.txt." 
163 | 164 | doctest: 165 | $(SPHINXBUILD) -b doctest $(ALLSPHINXOPTS) $(BUILDDIR)/doctest 166 | @echo "Testing of doctests in the sources finished, look at the " \ 167 | "results in $(BUILDDIR)/doctest/output.txt." 168 | 169 | xml: 170 | $(SPHINXBUILD) -b xml $(ALLSPHINXOPTS) $(BUILDDIR)/xml 171 | @echo 172 | @echo "Build finished. The XML files are in $(BUILDDIR)/xml." 173 | 174 | pseudoxml: 175 | $(SPHINXBUILD) -b pseudoxml $(ALLSPHINXOPTS) $(BUILDDIR)/pseudoxml 176 | @echo 177 | @echo "Build finished. The pseudo-XML files are in $(BUILDDIR)/pseudoxml." 178 | -------------------------------------------------------------------------------- /docs/build/.buildinfo: -------------------------------------------------------------------------------- 1 | # Sphinx build info version 1 2 | # This file hashes the configuration used when building these files. When it is not found, a full rebuild will be done. 3 | config: a18bbf59b31827e0b324c1c64a7a8468 4 | tags: 645f666f9bcd5a90fca523b33c5a78b7 5 | -------------------------------------------------------------------------------- /docs/build/.doctrees/asset_class.doctree: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/benjaminmgross/visualize-wealth/76f3f0fd815a25994ab9140a9d85cf7aae81f9fd/docs/build/.doctrees/asset_class.doctree -------------------------------------------------------------------------------- /docs/build/.doctrees/environment.pickle: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/benjaminmgross/visualize-wealth/76f3f0fd815a25994ab9140a9d85cf7aae81f9fd/docs/build/.doctrees/environment.pickle -------------------------------------------------------------------------------- /docs/build/.doctrees/index.doctree: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/benjaminmgross/visualize-wealth/76f3f0fd815a25994ab9140a9d85cf7aae81f9fd/docs/build/.doctrees/index.doctree -------------------------------------------------------------------------------- /docs/build/.doctrees/intro.doctree: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/benjaminmgross/visualize-wealth/76f3f0fd815a25994ab9140a9d85cf7aae81f9fd/docs/build/.doctrees/intro.doctree -------------------------------------------------------------------------------- /docs/build/.doctrees/readme.doctree: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/benjaminmgross/visualize-wealth/76f3f0fd815a25994ab9140a9d85cf7aae81f9fd/docs/build/.doctrees/readme.doctree -------------------------------------------------------------------------------- /docs/source/.doctrees/asset_class.doctree: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/benjaminmgross/visualize-wealth/76f3f0fd815a25994ab9140a9d85cf7aae81f9fd/docs/source/.doctrees/asset_class.doctree -------------------------------------------------------------------------------- /docs/source/.doctrees/index.doctree: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/benjaminmgross/visualize-wealth/76f3f0fd815a25994ab9140a9d85cf7aae81f9fd/docs/source/.doctrees/index.doctree -------------------------------------------------------------------------------- /docs/source/.doctrees/intro.doctree: 
-------------------------------------------------------------------------------- https://raw.githubusercontent.com/benjaminmgross/visualize-wealth/76f3f0fd815a25994ab9140a9d85cf7aae81f9fd/docs/source/.doctrees/intro.doctree -------------------------------------------------------------------------------- /docs/source/.doctrees/modules.doctree: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/benjaminmgross/visualize-wealth/76f3f0fd815a25994ab9140a9d85cf7aae81f9fd/docs/source/.doctrees/modules.doctree -------------------------------------------------------------------------------- /docs/source/.doctrees/readme.doctree: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/benjaminmgross/visualize-wealth/76f3f0fd815a25994ab9140a9d85cf7aae81f9fd/docs/source/.doctrees/readme.doctree -------------------------------------------------------------------------------- /docs/source/.doctrees/visualize_wealth.analyze.doctree: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/benjaminmgross/visualize-wealth/76f3f0fd815a25994ab9140a9d85cf7aae81f9fd/docs/source/.doctrees/visualize_wealth.analyze.doctree -------------------------------------------------------------------------------- /docs/source/.doctrees/visualize_wealth.classify.doctree: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/benjaminmgross/visualize-wealth/76f3f0fd815a25994ab9140a9d85cf7aae81f9fd/docs/source/.doctrees/visualize_wealth.classify.doctree -------------------------------------------------------------------------------- /docs/source/.doctrees/visualize_wealth.construct_portfolio.doctree: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/benjaminmgross/visualize-wealth/76f3f0fd815a25994ab9140a9d85cf7aae81f9fd/docs/source/.doctrees/visualize_wealth.construct_portfolio.doctree -------------------------------------------------------------------------------- /docs/source/.doctrees/visualize_wealth.doctree: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/benjaminmgross/visualize-wealth/76f3f0fd815a25994ab9140a9d85cf7aae81f9fd/docs/source/.doctrees/visualize_wealth.doctree -------------------------------------------------------------------------------- /docs/source/.doctrees/visualize_wealth.utils.doctree: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/benjaminmgross/visualize-wealth/76f3f0fd815a25994ab9140a9d85cf7aae81f9fd/docs/source/.doctrees/visualize_wealth.utils.doctree -------------------------------------------------------------------------------- /docs/source/asset_class.rst: -------------------------------------------------------------------------------- 1 | asset\_class 2 | ============ 3 | 4 | A simple library that uses r-squared maximization techniques and asset 5 | sub class ETFs (that I personally chose) to determine asset class 6 | information, as well as historical asset subclass information for a 7 | given asset 8 | 9 | Installation 10 | ------------ 11 | 12 | :: 13 | 14 | $git clone https://github.com/benjaminmgross/asset_class 15 | $ cd asset_class 16 | $python setup.py install 17 | 18 | Quickstart 19 | ---------- 20 | 21 | Let's say we had some fund, for instance the 
Franklin Templeton Growth 22 | Allocation Fund A -- 23 | ticker FGTIX -- against which we wanted to do historical attribution. 24 | 25 | In just a couple of keystrokes, we can come up with quarterly 26 | attribution analysis to see where returns were coming from: 27 | 28 | :: 29 | 30 | import pandas.io.data as web 31 | import asset_class 32 | 33 | fgtix = web.DataReader('FGTIX', 'yahoo', start = '01/01/2000')['Adj Close'] 34 | rolling_attr = asset_class.asset_class_and_subclass_by_interval(fgtix, 'quarterly') 35 | 36 | And that's it. Let's see the subclass attributions that the adjusted 37 | r-squared optimization algorithm came up with. 38 | 39 | :: 40 | 41 | import matplotlib.pyplot as plt 42 | 43 | #create the stacked area graph 44 | fig = plt.figure() 45 | ax = plt.subplot2grid((1,1), (0,0)) 46 | stack_coll = ax.stackplot(rolling_attr.index, rolling_attr.values.transpose()) 47 | ax.set_ylim(0, 1.) 48 | proxy_rects = [plt.Rectangle( (0,0), 1, 1, 49 | fc = pc.get_facecolor()[0]) for pc in stack_coll] 50 | ax.legend(proxy_rects, rolling_attr.columns.values.tolist(), ncol = 3, 51 | loc = 8, bbox_to_anchor = (0.5, -0.15)) 52 | plt.title("Asset Subclass Attribution Over Time", fontsize = 16) 53 | plt.show() 54 | 55 | .. figure:: ./images/subclass_overtime.png 56 | :alt: sub\_classes 57 | 58 | sub\_classes 59 | Dependencies 60 | ------------ 61 | 62 | Obvious Ones: 63 | ~~~~~~~~~~~~~ 64 | 65 | ``pandas``, ``numpy``, and ``scipy.optimize`` (which uses the ``TNC`` method to 66 | optimize the objective function of r-squared) 67 | 68 | Not So Obvious: 69 | ~~~~~~~~~~~~~~~ 70 | 71 | Another one of my open source repositories, 72 | ``visualize_wealth`` 73 | > But that's just for adjusted r-squared functionality; you could easily 74 | clone and hack it yourself without that library 75 | 76 | Status 77 | ------ 78 | 79 | Still very much a WIP, although I've added 80 | `Sphinx <http://sphinx-doc.org/>`__ docstrings to auto-generate 81 | documentation 82 | 83 | To Do: 84 | ------ 85 | 86 | - Given a ``pandas.DataFrame`` of asset prices, and asset price 87 | weights, return an aggregated asset class ``pandas.DataFrame`` on a 88 | quarterly basis 89 | 90 | - Write the ``Best Fitting Benchmark`` algorithm, either for use in 91 | this library or from the private ``strat_check`` repository that uses 92 | this module 93 | 94 | 95 | -------------------------------------------------------------------------------- /docs/source/conf.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | # 3 | # Visualize Wealth documentation build configuration file, created by 4 | # sphinx-quickstart on Mon Jan 6 08:48:51 2014. 5 | # 6 | # This file is execfile()d with the current directory set to its 7 | # containing dir. 8 | # 9 | # Note that not all possible configuration values are present in this 10 | # autogenerated file. 11 | # 12 | # All configuration values have a default; values that are commented out 13 | # serve to show the default.
14 | 15 | import sys 16 | import os 17 | 18 | # on_rtd is whether we are on readthedocs.org, this line of code grabbed from docs.readthedocs.org 19 | on_rtd = os.environ.get('READTHEDOCS', None) == 'True' 20 | 21 | if not on_rtd: # only import and set the theme if we're building docs locally 22 | import sphinx_rtd_theme 23 | html_theme = 'sphinx_rtd_theme' 24 | html_theme_path = [sphinx_rtd_theme.get_html_theme_path()] 25 | 26 | # otherwise, readthedocs.org uses their theme by default, so no need to specify it 27 | 28 | # If extensions (or modules to document with autodoc) are in another directory, 29 | # add these directories to sys.path here. If the directory is relative to the 30 | # documentation root, use os.path.abspath to make it absolute, like shown here. 31 | #sys.path.insert(0, os.path.abspath('.')) 32 | 33 | # -- General configuration ------------------------------------------------ 34 | 35 | # If your documentation needs a minimal Sphinx version, state it here. 36 | #needs_sphinx = '1.0' 37 | 38 | # Add any Sphinx extension module names here, as strings. They can be 39 | # extensions coming with Sphinx (named 'sphinx.ext.*') or your custom 40 | # ones. 41 | todo_include_todos = True 42 | extensions = [ 43 | 'sphinx.ext.autodoc', 44 | 'sphinx.ext.doctest', 45 | 'sphinx.ext.intersphinx', 46 | 'sphinx.ext.todo', 47 | 'sphinx.ext.pngmath', 48 | 'sphinx.ext.mathjax', 49 | 'sphinx.ext.viewcode', 50 | ] 51 | 52 | # Add any paths that contain templates here, relative to this directory. 53 | templates_path = ['_templates'] 54 | 55 | # The suffix of source filenames. 56 | source_suffix = '.rst' 57 | 58 | # The encoding of source files. 59 | #source_encoding = 'utf-8-sig' 60 | 61 | # The master toctree document. 62 | master_doc = 'index' 63 | 64 | # General information about the project. 65 | project = u'Visualize Wealth' 66 | copyright = u'2014, Benjamin M. Gross' 67 | 68 | # The version info for the project you're documenting, acts as replacement for 69 | # |version| and |release|, also used in various other places throughout the 70 | # built documents. 71 | # 72 | # The short X.Y version. 73 | version = '0.0.1' 74 | # The full version, including alpha/beta/rc tags. 75 | release = '0.0.1' 76 | 77 | # The language for content autogenerated by Sphinx. Refer to documentation 78 | # for a list of supported languages. 79 | #language = None 80 | 81 | # There are two options for replacing |today|: either, you set today to some 82 | # non-false value, then it is used: 83 | #today = '' 84 | # Else, today_fmt is used as the format for a strftime call. 85 | #today_fmt = '%B %d, %Y' 86 | 87 | # List of patterns, relative to source directory, that match files and 88 | # directories to ignore when looking for source files. 89 | exclude_patterns = [] 90 | 91 | # The reST default role (used for this markup: `text`) to use for all 92 | # documents. 93 | #default_role = None 94 | 95 | # If true, '()' will be appended to :func: etc. cross-reference text. 96 | #add_function_parentheses = True 97 | 98 | # If true, the current module name will be prepended to all description 99 | # unit titles (such as .. function::). 100 | #add_module_names = True 101 | 102 | # If true, sectionauthor and moduleauthor directives will be shown in the 103 | # output. They are ignored by default. 104 | #show_authors = False 105 | 106 | # The name of the Pygments (syntax highlighting) style to use. 107 | pygments_style = 'sphinx' 108 | 109 | # A list of ignored prefixes for module index sorting. 
110 | #modindex_common_prefix = [] 111 | 112 | # If true, keep warnings as "system message" paragraphs in the built documents. 113 | #keep_warnings = False 114 | 115 | 116 | # -- Options for HTML output ---------------------------------------------- 117 | 118 | # The theme to use for HTML and HTML Help pages. See the documentation for 119 | # a list of builtin themes. 120 | #html_theme = 'sphinxdoc' 121 | 122 | # Theme options are theme-specific and customize the look and feel of a theme 123 | # further. For a list of options available for each theme, see the 124 | # documentation. 125 | #html_theme_options = {} 126 | 127 | # Add any paths that contain custom themes here, relative to this directory. 128 | #html_theme_path = [] 129 | 130 | # The name for this set of Sphinx documents. If None, it defaults to 131 | # " v documentation". 132 | #html_title = None 133 | 134 | # A shorter title for the navigation bar. Default is the same as html_title. 135 | #html_short_title = None 136 | 137 | # The name of an image file (relative to this directory) to place at the top 138 | # of the sidebar. 139 | #html_logo = None 140 | 141 | # The name of an image file (within the static path) to use as favicon of the 142 | # docs. This file should be a Windows icon file (.ico) being 16x16 or 32x32 143 | # pixels large. 144 | #html_favicon = None 145 | 146 | # Add any paths that contain custom static files (such as style sheets) here, 147 | # relative to this directory. They are copied after the builtin static files, 148 | # so a file named "default.css" will overwrite the builtin "default.css". 149 | html_static_path = ['_static'] 150 | 151 | # Add any extra paths that contain custom files (such as robots.txt or 152 | # .htaccess) here, relative to this directory. These files are copied 153 | # directly to the root of the documentation. 154 | #html_extra_path = [] 155 | 156 | # If not '', a 'Last updated on:' timestamp is inserted at every page bottom, 157 | # using the given strftime format. 158 | #html_last_updated_fmt = '%b %d, %Y' 159 | 160 | # If true, SmartyPants will be used to convert quotes and dashes to 161 | # typographically correct entities. 162 | #html_use_smartypants = True 163 | 164 | # Custom sidebar templates, maps document names to template names. 165 | #html_sidebars = {} 166 | 167 | # Additional templates that should be rendered to pages, maps page names to 168 | # template names. 169 | #html_additional_pages = {} 170 | 171 | # If false, no module index is generated. 172 | #html_domain_indices = True 173 | 174 | # If false, no index is generated. 175 | #html_use_index = True 176 | 177 | # If true, the index is split into individual pages for each letter. 178 | #html_split_index = False 179 | 180 | # If true, links to the reST sources are added to the pages. 181 | #html_show_sourcelink = True 182 | 183 | # If true, "Created using Sphinx" is shown in the HTML footer. Default is True. 184 | #html_show_sphinx = True 185 | 186 | # If true, "(C) Copyright ..." is shown in the HTML footer. Default is True. 187 | #html_show_copyright = True 188 | 189 | # If true, an OpenSearch description file will be output, and all pages will 190 | # contain a tag referring to it. The value of this option must be the 191 | # base URL from which the finished HTML is served. 192 | #html_use_opensearch = '' 193 | 194 | # This is the file name suffix for HTML files (e.g. ".xhtml"). 195 | #html_file_suffix = None 196 | 197 | # Output file base name for HTML help builder. 
198 | htmlhelp_basename = 'VisualizeWealthdoc' 199 | 200 | 201 | # -- Options for LaTeX output --------------------------------------------- 202 | 203 | latex_elements = { 204 | # The paper size ('letterpaper' or 'a4paper'). 205 | #'papersize': 'letterpaper', 206 | 207 | # The font size ('10pt', '11pt' or '12pt'). 208 | #'pointsize': '10pt', 209 | 210 | # Additional stuff for the LaTeX preamble. 211 | #'preamble': '', 212 | } 213 | 214 | # Grouping the document tree into LaTeX files. List of tuples 215 | # (source start file, target name, title, 216 | # author, documentclass [howto, manual, or own class]). 217 | latex_documents = [ 218 | ('index', 'VisualizeWealth.tex', u'Visualize Wealth Documentation', 219 | u'Benjamin M. Gross', 'manual'), 220 | ] 221 | 222 | # The name of an image file (relative to this directory) to place at the top of 223 | # the title page. 224 | #latex_logo = None 225 | 226 | # For "manual" documents, if this is true, then toplevel headings are parts, 227 | # not chapters. 228 | #latex_use_parts = False 229 | 230 | # If true, show page references after internal links. 231 | #latex_show_pagerefs = False 232 | 233 | # If true, show URL addresses after external links. 234 | #latex_show_urls = False 235 | 236 | # Documents to append as an appendix to all manuals. 237 | #latex_appendices = [] 238 | 239 | # If false, no module index is generated. 240 | #latex_domain_indices = True 241 | 242 | 243 | # -- Options for manual page output --------------------------------------- 244 | 245 | # One entry per manual page. List of tuples 246 | # (source start file, name, description, authors, manual section). 247 | man_pages = [ 248 | ('index', 'visualizewealth', u'Visualize Wealth Documentation', 249 | [u'Benjamin M. Gross'], 1) 250 | ] 251 | 252 | # If true, show URL addresses after external links. 253 | #man_show_urls = False 254 | 255 | 256 | # -- Options for Texinfo output ------------------------------------------- 257 | 258 | # Grouping the document tree into Texinfo files. List of tuples 259 | # (source start file, target name, title, author, 260 | # dir menu entry, description, category) 261 | texinfo_documents = [ 262 | ('index', 'VisualizeWealth', u'Visualize Wealth Documentation', 263 | u'Benjamin M. Gross', 'VisualizeWealth', 'One line description of project.', 264 | 'Miscellaneous'), 265 | ] 266 | 267 | # Documents to append as an appendix to all manuals. 268 | #texinfo_appendices = [] 269 | 270 | # If false, no module index is generated. 271 | #texinfo_domain_indices = True 272 | 273 | # How to display URL addresses: 'footnote', 'no', or 'inline'. 274 | #texinfo_show_urls = 'footnote' 275 | 276 | # If true, do not generate a @detailmenu in the "Top" node's menu. 277 | #texinfo_no_detailmenu = False 278 | 279 | 280 | # Example configuration for intersphinx: refer to the Python standard library. 281 | intersphinx_mapping = {'http://docs.python.org/': None} 282 | -------------------------------------------------------------------------------- /docs/source/images/subclass_overtime.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/benjaminmgross/visualize-wealth/76f3f0fd815a25994ab9140a9d85cf7aae81f9fd/docs/source/images/subclass_overtime.png -------------------------------------------------------------------------------- /docs/source/index.rst: -------------------------------------------------------------------------------- 1 | .. 
_index: 2 | 3 | ********************************************************************************** 4 | Why The Visualize Wealth Library 5 | Exists 6 | ********************************************************************************** 7 | 8 | If you're anything like me, you sit at home waiting with bated breath... nervously 9 | wringing your hands... waiting for your brokerage account statement 10 | to arrive. Pure adrenaline comes over you when those comprehensive illustrations of "how your portfolio has performed" 11 | over the past month, year, several years... and what exactly you've done well, and 12 | how to continue improving as someone who "manages your own wealth"... 13 | are finally in your hot little hands. Clearly I jest, but if you want to 14 | really **dig** into portfolio management, there aren't a lot of tools 15 | out there that help you understand academic concepts (like Sharpe 16 | Ratios, Risk Adjusted Excess Returns, etc.) *within the context* of your 17 | portfolio. 18 | 19 | Be the Change 20 | ============= 21 | 22 | So I've put together a fairly comprehensive "construction engine" for 23 | :ref:`constructing-portfolios` that can create 24 | portfolios for you in three different ways (more on that later). Then, I've also put together a module that has all of the 25 | back-end calculations for :ref:`analyzing-portfolio-performance`. 26 | With those two basic tools, you should be able to begin the process of 27 | constructing more comprehensive analytical tools... that's what 28 | I'm doing, anyway. This Git repository is the backend 29 | for a visualization engine that I'm creating to help investors and 30 | financial advisors better understand the impact of the choices they make. 31 | 32 | But enough about me... enjoy, and please feel free to email me at 33 | benjaminMgross [at] gmail [dot] com. 34 | 35 | Contents: 36 | 37 | .. toctree:: 38 | :maxdepth: 2 39 | 40 | intro 41 | readme 42 | asset_class 43 | modules 44 | setup 45 | visualize_wealth 46 | 47 | Indices and tables 48 | ================== 49 | 50 | * :ref:`genindex` 51 | * :ref:`modindex` 52 | * :ref:`search` 53 | -------------------------------------------------------------------------------- /docs/source/intro.rst: -------------------------------------------------------------------------------- 1 | .. _intro: 2 | 3 | ***************** 4 | Modules Explained 5 | ***************** 6 | 7 | **Optimization of Functions:** 8 | 9 | In all cases, functions have been optimized to ensure the most 10 | expeditious calculation times possible, so most of the lag time the 11 | user experiences will be due to the ``Yahoo!`` API calls. 12 | 13 | 14 | **Examples:** 15 | 16 | Fairly comprehensive examples can be found in the ``README`` section 17 | of this documentation, which also resides on the Splash Page 18 | of this 19 | project's GitHub Repository. 20 | 21 | 22 | **Testing**: 23 | 24 | The folder under ``visualize_wealth/tests/`` has fairly significant 25 | tests that illustrate the calculations of: 26 | 27 | 1. Portfolio construction methods 28 | 29 | 2. The portfolio statistic calculations used in the :mod:`analyze` module 30 | 31 | All the files are Microsoft Excel files (extension `.xlsx`), so 32 | seeing cell calculations and the steps to arrive at a final value 33 | should be pretty straightforward. 34 | 35 | .. _constructing-portfolios: 36 | 37 | Constructing portfolios 38 | ======================= 39 | 40 | In general, I've provided for three ways to construct a portfolio using 41 | this package: 42 | 43 | 1. 
**The Blotter Method:** Provide trades given tickers, buy / sell 44 | dates, and prices. In general, this method most closely approximates 45 | how you would reconstruct your own portfolio performance (if you had 46 | to). You would essentially take trade confirmations of buys and 47 | sells, calculate dividends and splits, and then arrive at an 48 | aggregate portfolio path. Specific examples can be found under 49 | `blotter method examples <./readme.html#the-blotter-method>`_. 50 | 51 | 2. **The Weight Allocation Method:** Provide a weighting scheme with 52 | dates along the first column, with column titles as tickers, and 53 | percentage allocations to each ticker for a given date as values. 54 | Specific examples can be found under 55 | `weight allocation examples 56 | <./readme.html#the-weight-allocation-method>`_. 57 | 58 | 3. **The Initial Allocation Method (with specific rebalancing 59 | periods):** Provide an initial allocation scheme representing the 60 | static weights to rebalance to at some given interval, and then 61 | define the rebalancing interval as 'weekly', 'monthly', 62 | 'quarterly', or 'yearly'. Specific examples can be found under 63 | `initial allocation examples 64 | <./readme.html#the-initial-allocation-rebalancing-method>`_. 65 | 66 | Much more detailed examples of these three methods, as well as 67 | specific examples of code that would leverage this package, can be 68 | found in the `Portfolio Construction Examples <./readme.html#portfolio-construction-examples>`_. 69 | 70 | 71 | In the cases where prices are not available to the investor, helper 72 | functions for all of the construction methods are available that use 73 | Yahoo!'s API to pull down relevant 74 | price series to be incorporated into the portfolio series calculation. 75 | 76 | .. _analyzing-portfolio-performance: 77 | 78 | Analyzing portfolio performance 79 | =============================== 80 | 81 | In general, there are myriad statistics and analyses that 82 | can be done to analyze portfolio performance. I don't go very 83 | deeply into the uses of any of these statistics, but any number of 84 | them can be Googled, if there aren't live links already provided. 85 | 86 | Fairly extensive formulas are provided for each of the performance statistics 87 | inside of the :mod:`analyze` module. Formulas were inserted using the 88 | MathJax rendering capability that Sphinx 89 | provides. 90 | 91 | 92 | .. _construct-portfolio-documentation: 93 | 94 | ``construct_portfolio.py`` documentation 95 | ======================================== 96 | 97 | Full documentation for each of the functions of :mod:`visualize_wealth.construct_portfolio` 98 | 99 | .. automodule:: visualize_wealth.construct_portfolio 100 | :members: 101 | 102 | .. _asset-class-documentation: 103 | 104 | ``classify.py`` documentation 105 | ================================= 106 | 107 | Full documentation for each of the functions of :mod:`visualize_wealth.classify` 108 | 109 | .. automodule:: visualize_wealth.classify 110 | :members: 111 | 112 | .. _analyze-documentation: 113 | 114 | ``analyze.py`` documentation 115 | ============================ 116 | 117 | Full documentation for each of the functions of :mod:`visualize_wealth.analyze` 118 | 119 | .. automodule:: visualize_wealth.analyze 120 | :members: 121 | 122 | .. _utils-documentation: 123 | 124 | ``utils.py`` documentation 125 | =========================== 126 | 127 | Full documentation for each of the functions of :mod:`visualize_wealth.utils` 128 | 129 | .. 
automodule:: visualize_wealth.utils 130 | :members: 131 | 132 | 133 | Sphinx Customizations 134 | ===================== 135 | 136 | The documentation for this module was created using Sphinx 137 | . I keep a couple of commands here that I 138 | use when re-committing to GitHub, or 139 | regenerating the documentation. It serves 140 | as a reference for myself, but I've posted it here in case other 141 | people might find it useful as well. 142 | 143 | .. _convert-markdown-to-rst: 144 | 145 | Convert `README.md` to `.rst` format 146 | ------------------------------------ 147 | 148 | I use Pandoc to convert my 149 | `README <./readme.html>`_ (that's in Markdown format) into ``.rst`` format (that 150 | can be interpreted by Sphinx). 151 | 152 | .. code:: bash 153 | 154 | $ cd visualize_wealth 155 | $ pandoc README.md -f markdown -t rst -o docs/source/readme.rst 156 | 157 | 158 | .. _build-sphinx-documentation: 159 | 160 | Build Sphinx Documentation 161 | -------------------------- 162 | 163 | .. code:: bash 164 | 165 | #rebuild the package 166 | $ cd visualize_wealth 167 | $ python setup.py install 168 | 169 | #rebuild the documentation 170 | $ sphinx-build -b html docs/source/ docs/build/ 171 | 172 | .. _sphinx-customizations: 173 | 174 | Sphinx Customizations in ``conf.py`` 175 | ------------------------------------ 176 | 177 | Current customizations I use in order to make the docs look like they 178 | currently do 179 | 180 | .. code:: python 181 | 182 | #use the sphinxdoc theme 183 | html_theme = 'sphinxdoc' 184 | 185 | #enable todos 186 | todo_include_todos = True 187 | extensions = ['sphinx.ext.autodoc',..., 'sphinx.ext.todo'] 188 | 189 | -------------------------------------------------------------------------------- /docs/source/readme.rst: -------------------------------------------------------------------------------- 1 | README.md 2 | ========= 3 | 4 | A library built in Python to construct, backtest, analyze, and evaluate 5 | portfolios and their benchmarks, with comprehensive documentation and 6 | manual calculations to illustrate all underlying methodologies and 7 | statistics. 8 | 9 | License 10 | ------- 11 | 12 | This program is free software and is distributed under the GNU General 13 | Public License version 14 | 3 ("GNU GPL v3") 15 | 16 | © Benjamin M. Gross 2013 17 | 18 | Dependencies 19 | ------------ 20 | 21 | ``pandas``: extensively used (``numpy`` and ``scipy`` obviously, but 22 | ``pandas`` depends on those) 23 | 24 | ``urllib2``: for Yahoo! API calls to append price ``DataFrame``\ s with 25 | Dividends 26 | 27 | Installation 28 | ------------ 29 | 30 | To install the ``visualize_wealth`` modules onto your computer, go into 31 | your folder of choice (say ``Downloads``), and: 32 | 33 | 1. Clone the repository 34 | 35 | :: 36 | 37 | $ cd ~/Downloads 38 | $ git clone https://github.com/benjaminmgross/grind-my-ass-ets 39 | 40 | 2. ``cd`` into the ``grind-my-ass-ets`` directory 41 | 42 | :: 43 | 44 | $ cd grind-my-ass-ets 45 | 46 | 3. Install the package 47 | 48 | :: 49 | 50 | $ python setup.py install 51 | 52 | 4. Check your install. From anywhere on your machine, you should be able to open 53 | ``IPython`` and import the library, for example: 54 | 55 | :: 56 | 57 | $ cd ~/ 58 | $ ipython 59 | 60 | IPython 1.1.0 -- An enhanced Interactive Python. 61 | ? -> Introduction and overview of IPython's features. 62 | %quickref -> Quick reference. 63 | help -> Python's own help system. 64 | object? -> Details about 'object', use 'object??' for extra details.
70 | Documentation 71 | ------------- 72 | 73 | The ``README.md`` file has fairly good examples, but I've gone to great 74 | lengths to autogenerate documentation for the code using 75 | `Sphinx `__. Therefore, aside from the 76 | docstrings, when you ``git clone`` the repository, you can ``cd`` into: 77 | 78 | :: 79 | 80 | $ cd visualize_wealth/docs/build/ 81 | 82 | and find full ``.html`` browseable code documentation (that's pretty 83 | f\*cking beautiful... if I do say so my damn self) with live links, 84 | function explanations (that also have live links to their respective 85 | definition on the web), etc. 86 | 87 | Also I've created an Excel spreadsheet that illustrates almost all of 88 | the ``analyze.py`` portfolio statistic calculations. That spreadsheet 89 | can be found in: 90 | 91 | :: 92 | 93 | visualize_wealth > tests > test_analyze.xlsx 94 | 95 | In fact, the unit testing for the ``analyze.py`` portfolio statistics 96 | tests the python calculations against this same excel spreadsheet, so 97 | you can really get into the guts of how these things are calculated. 98 | 99 | `Portfolio Construction Examples `__ 100 | --------------------------------------------------------------------- 101 | 102 | Portfolios can (generally) be constructed in one of three ways: 103 | 104 | 1. The Blotter Method 105 | 2. Weight Allocation Method 106 | 3. Initial Allocation with specific Rebalancing Period Method 107 | 108 | 1. `The Blotter Method `__ 109 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 110 | 111 | **The blotter method:** In finance, a spreadsheet of "buys/sells", 112 | "Prices", "Dates" etc. is called a "trade blotter." This also would be 113 | the easiest way for an investor to actually analyze the past performance 114 | of her portfolio, because trade confirmations provide this exact data. 115 | 116 | This method is most effectively achieved by providing an Excel / 117 | ``.csv`` file with the following format: 118 | 119 | +------------+--------------+----------+----------+ 120 | | Date | Buy / Sell | Price | Ticker | 121 | +============+==============+==========+==========+ 122 | | 9/4/2001 | 50 | 123.45 | EFA | 123 | +------------+--------------+----------+----------+ 124 | | 5/5/2003 | 65 | 107.71 | EEM | 125 | +------------+--------------+----------+----------+ 126 | | 6/6/2003 | -15 | 118.85 | EEM | 127 | +------------+--------------+----------+----------+ 128 | 129 | where "Buys" can be distinguished from "Sells" because buys are positive 130 | (+) and sells are negative (-).
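A minimal sketch of loading such a file yourself (``my_trades.csv`` is a hypothetical file name; ``construct_portfolio.format_blotter`` performs a similar sign fix when sells arrive as positive values alongside a 'Buy/Sell' flag):

::

    import pandas

    #dates in the first column become the index
    blotter = pandas.DataFrame.from_csv('my_trades.csv')

    #positive values are buys, negative values are sells
    buys = blotter[blotter['Buy / Sell'] > 0.]
    sells = blotter[blotter['Buy / Sell'] < 0.]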
131 | 132 | For example, let's say I wanted to generate a random portfolio 133 | containing the following tickers and respective asset classes, using the 134 | ``generate_random_portfolio_blotter`` method: 135 | 136 | +----------+--------------------------+----------------------+---------------+ 137 | | Ticker | Description | Asset Class | Price Start | 138 | +==========+==========================+======================+===============+ 139 | | IWB | iShares Russell 1000 | US Equity | 5/19/2000 | 140 | +----------+--------------------------+----------------------+---------------+ 141 | | IWR | iShares Russell Midcap | US Equity | 8/27/2001 | 142 | +----------+--------------------------+----------------------+---------------+ 143 | | IWM | iShares Russell 2000 | US Equity | 5/26/2000 | 144 | +----------+--------------------------+----------------------+---------------+ 145 | | EFA | iShares EAFE | Foreign Dev Equity | 8/27/2001 | 146 | +----------+--------------------------+----------------------+---------------+ 147 | | EEM | iShares EAFE EM | Foreign EM Equity | 4/15/2003 | 148 | +----------+--------------------------+----------------------+---------------+ 149 | | TIP | iShares TIPS | Fixed Income | 12/5/2003 | 150 | +----------+--------------------------+----------------------+---------------+ 151 | | TLT | iShares LT Treasuries | Fixed Income | 7/31/2002 | 152 | +----------+--------------------------+----------------------+---------------+ 153 | | IEF | iShares MT Treasuries | Fixed Income | 7/31/2002 | 154 | +----------+--------------------------+----------------------+---------------+ 155 | | SHY | iShares ST Treasuries | Fixed Income | 7/31/2002 | 156 | +----------+--------------------------+----------------------+---------------+ 157 | | LQD | iShares Inv Grade | Fixed Income | 7/31/2002 | 158 | +----------+--------------------------+----------------------+---------------+ 159 | | IYR | iShares Real Estate | Alternative | 6/19/2000 | 160 | +----------+--------------------------+----------------------+---------------+ 161 | | GLD | iShares Gold Index | Alternative | 11/18/2004 | 162 | +----------+--------------------------+----------------------+---------------+ 163 | | GSG | iShares Commodities | Alternative | 7/21/2006 | 164 | +----------+--------------------------+----------------------+---------------+ 165 | 166 | I could construct a portfolio of random trades (i.e. the "blotter 167 | method"), say 20 trades for each asset, by executing the following: 168 | 169 | :: 170 | 171 | #import the modules 172 | In [5]: import visualize_wealth.construct_portfolio as vwcp 173 | 174 | In [6]: ticks = ['IWB','IWR','IWM','EFA','EEM','TIP','TLT','IEF', 175 | 'SHY','LQD','IYR','GLD','GSG'] 176 | In [7]: num_trades = 20 177 | 178 | #construct the random trade blotter 179 | In [8]: blotter = vwcp.generate_random_portfolio_blotter(ticks, num_trades) 180 | 181 | #construct the portfolio panel 182 | In [9]: port_panel = vwcp.panel_from_blotter(blotter) 183 | 184 | Now I have a ``pandas.Panel``.
Before we construct the cumulative 185 | portfolio values, let's examine the dimensions of the panel (which are 186 | generally the same for all construction methods, although the columns of 187 | the ``minor_axis`` differ because the methods call for different 188 | optimized calculations): 189 | 190 | :: 191 | 192 | #tickers are `panel.items` 193 | In [10]: port_panel.items 194 | Out[10]: Index([u'EEM', u'EFA', u'GLD', u'GSG', u'IEF', u'IWB', u'IWM', u'IWR', 195 | u'IYR', u'LQD', u'SHY', u'TIP', u'TLT'], dtype=object) 196 | 197 | #dates are along the `panel.major_axis` 198 | In [12]: port_panel.major_axis 199 | Out[12]: 200 | <class 'pandas.tseries.index.DatetimeIndex'> 201 | [2000-07-06 00:00:00, ..., 2013-10-30 00:00:00] 202 | Length: 3351, Freq: None, Timezone: None 203 | 204 | #price data, cumulative investment, dividends, and split ratios are `panel.minor_axis` 205 | In [13]: port_panel.minor_axis 206 | Out[13]: Index([u'Open', u'High', u'Low', u'Close', u'Volume', u'Adj Close', 207 | u'Dividends', u'Splits', u'contr_withdrawal', u'cum_investment', 208 | u'cum_shares'], dtype=object) 209 | 210 | There is a lot of information to be gleaned from this data object, but 211 | the most common goal would be to convert this ``pandas.Panel`` to a 212 | Portfolio ``pandas.DataFrame`` with columns ``['Open', 'Close']``, so it 213 | can be compared against other assets or combinations of assets. In this 214 | case, use ``pfp_from_blotter`` (which stands for 215 | "portfolio\_from\_panel" plus the portfolio construction method [i.e. blotter, 216 | weights, or initial allocation], which in this case was "the blotter 217 | method"). 218 | 219 | :: 220 | 221 | #construct the portfolio series 222 | In [14]: port_df = vwcp.pfp_from_blotter(port_panel, 1000.) 223 | 224 | In [15]: port_df.head() 225 | Out[15]: 226 | Close Open 227 | Date 228 | 2000-07-06 1000.000000 988.744754 229 | 2000-07-07 1006.295307 1000.190767 230 | 2000-07-10 1012.876765 1005.723006 231 | 2000-07-11 1011.636780 1011.064479 232 | 2000-07-12 1031.953453 1016.978253 233 |
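Staying with this example, ``construct_portfolio`` also has helpers for layering costs onto a constructed series. A minimal sketch, borrowing the ``mngmt_fee`` call (and its keyword arguments) from ``test_module/test_construct_portfolio.py``; see the module documentation for the exact fee accrual convention:

::

    #net a 100 bps management fee out of the portfolio series
    In [16]: net_of_fees = vwcp.mngmt_fee(price_series = port_df['Close'],
                                          bps_cost = 100., frequency = 'daily')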
234 | 2. `The Weight Allocation Method `__ 235 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 236 | 237 | A commonplace way to test portfolio management strategies using a group 238 | of underlying assets is to construct aggregate portfolio performance, 239 | given a specified weighting allocation to specific assets on specified 240 | dates. Specifically, those (oftentimes) percentage allocations 241 | represent a recommended allocation at some point in time, based on some 242 | "view" derived from either the output of a model or some qualitative 243 | analysis. Therefore, having an engine that is capable of taking in a 244 | weighting file (say, a ``.csv``) with the following format: 245 | 246 | +------------+------------+------------+------------+------------+ 247 | | Date | Ticker 1 | Ticker 2 | Ticker 3 | Ticker 4 | 248 | +============+============+============+============+============+ 249 | | 1/1/2002 | 5% | 20% | 30% | 45% | 250 | +------------+------------+------------+------------+------------+ 251 | | 6/3/2003 | 40% | 10% | 40% | 10% | 252 | +------------+------------+------------+------------+------------+ 253 | | 7/8/2003 | 25% | 25% | 25% | 25% | 254 | +------------+------------+------------+------------+------------+ 255 | 256 | and turning the above allocation file into a cumulative portfolio value 257 | that can then be analyzed and compared (both in isolation and relative 258 | to specified benchmarks) is highly valuable in the process of portfolio 259 | strategy creation. 260 | 261 | A quick example of a weighting allocation file can be found in the Excel 262 | File ``visualize_wealth/tests/panel from weight file test.xlsx``, where 263 | the tab ``rebal_weights`` represents one of these specific weighting 264 | files. 265 | 266 | To construct a portfolio using the **Weight Allocation Method**, a 267 | process such as the following would be carried out: 268 | 269 | :: 270 | 271 | #import the library 272 | import visualize_wealth.construct_portfolio as vwcp 273 | 274 | If we didn't have the prices already, there's a function for that: 275 | 276 | :: 277 | 278 | #fetch the prices and put them into a pandas.Panel 279 | price_panel = vwcp.fetch_data_for_weight_allocation_method(weight_df) 280 | 281 | #construct the panel that will go into the portfolio constructor 282 | 283 | port_panel = vwcp.panel_from_weight_file(weight_df, price_panel, 284 | start_value = 1000.) 285 | 286 | Construct the ``pandas.DataFrame`` for the portfolio, starting at 287 | ``start_value`` of 1000 with columns ``['Open', 'Close']``: 288 | 289 | :: 290 | 291 | portfolio = vwcp.pfp_from_weight_file(port_panel) 292 | 293 | Now a portfolio with ``index`` of daily values and columns 294 | ``['Open', 'Close']`` has been created upon which analytics and 295 | performance analysis can be done. 296 |
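Note that nothing here requires Excel: the engine just needs the weights as a ``pandas.DataFrame`` indexed by rebalance date. A minimal hand-built sketch of the table above (the ticker names are placeholders):

::

    import pandas

    dates = pandas.DatetimeIndex(['1/1/2002', '6/3/2003', '7/8/2003'])

    #one row per rebalance date, one column per ticker
    weight_df = pandas.DataFrame({'Ticker 1': [.05, .40, .25],
                                  'Ticker 2': [.20, .10, .25],
                                  'Ticker 3': [.30, .40, .25],
                                  'Ticker 4': [.45, .10, .25]},
                                 index = dates)

    #each row is a complete allocation, so it should sum to 1
    assert ((weight_df.sum(axis = 1) - 1.).abs() < 1e-9).all()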
297 | 3. `The Initial Allocation & Rebalancing Method `__ 298 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 299 | 300 | The standard method of portfolio construction that pervades in many 301 | circles to this day is static allocation with a given interval of 302 | rebalancing. For instance, if I wanted to implement Oppenheimer's `The 303 | New 304 | 60/40 `__ 305 | static portfolio, rebalancing on a yearly interval, my weighting scheme 306 | would be as follows: 307 | 308 | +----------+----------------------------+----------------------+--------------+ 309 | | Ticker | Name | Asset Class | Allocation | 310 | +==========+============================+======================+==============+ 311 | | IWB | iShares Russell 1000 | US Equity | 15% | 312 | +----------+----------------------------+----------------------+--------------+ 313 | | IWR | iShares Russell Midcap | US Equity | 7.5% | 314 | +----------+----------------------------+----------------------+--------------+ 315 | | IWM | iShares Russell 2000 | US Equity | 7.5% | 316 | +----------+----------------------------+----------------------+--------------+ 317 | | SCZ | iShares EAFE Small Cap | Foreign Dev Equity | 7.5% | 318 | +----------+----------------------------+----------------------+--------------+ 319 | | EFA | iShares EAFE | Foreign Dev Equity | 12.5% | 320 | +----------+----------------------------+----------------------+--------------+ 321 | | EEM | iShares EAFE EM | Foreign EM Equity | 10% | 322 | +----------+----------------------------+----------------------+--------------+ 323 | | TIP | iShares TIPS | Fixed Income | 5% | 324 | +----------+----------------------------+----------------------+--------------+ 325 | | TLT | iShares LT Treasuries | Fixed Income | 2.5% | 326 | +----------+----------------------------+----------------------+--------------+ 327 | | IEF | iShares MT Treasuries | Fixed Income | 2.5% | 328 | +----------+----------------------------+----------------------+--------------+ 329 | | SHY | iShares ST Treasuries | Fixed Income | 5% | 330 | +----------+----------------------------+----------------------+--------------+ 331 | | HYG | iShares High Yield | Fixed Income | 2.5% | 332 | +----------+----------------------------+----------------------+--------------+ 333 | | LQD | iShares Inv Grade | Fixed Income | 2.5% | 334 | +----------+----------------------------+----------------------+--------------+ 335 | | PCY | PowerShares EM Sovereign | Fixed Income | 2% | 336 | +----------+----------------------------+----------------------+--------------+ 337 | | BWX | SPDR intl Treasuries | Fixed Income | 2% | 338 | +----------+----------------------------+----------------------+--------------+ 339 | | MBB | iShares MBS | Fixed Income | 1% | 340 | +----------+----------------------------+----------------------+--------------+ 341 | | PFF | iShares Preferred Equity | Alternative | 2.5% | 342 | +----------+----------------------------+----------------------+--------------+ 343 | | IYR | iShares Real Estate | Alternative | 5% | 344 | +----------+----------------------------+----------------------+--------------+ 345 | | GLD | iShares Gold Index | Alternative | 2.5% | 346 | +----------+----------------------------+----------------------+--------------+ 347 | | GSG | iShares Commodities | Alternative | 5% | 348 | +----------+----------------------------+----------------------+--------------+ 349 | 350 | To implement such a weighting scheme, we can use the same worksheet 351 | ``visualize_wealth/tests/panel from weight file test.xlsx``, and the 352 | tab ``static_allocation``. Note there is only a single row of weights, 353 | as this will be the "static allocation" to be rebalanced to at some 354 | given interval.
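Under the hood, a rebalancing interval boils down to chunking the price index into (start, end) periods; ``zipped_time_chunks`` in ``utils.py`` (which ``construct_portfolio.py`` imports) does exactly that kind of chunking. A small sketch, with the call signature mirroring ``test_module/test_utils.py``:

::

    import pandas
    import visualize_wealth.utils as vwu

    index = pandas.DatetimeIndex(start = '06/01/2000', freq = 'D',
                                 periods = 100)

    #a list of (period start, period end) pandas.Timestamp tuples
    chunks = vwu.zipped_time_chunks(index = index,
                                    interval = 'quarterly',
                                    incl_T = False)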
355 | 356 | :: 357 | 358 | #import the construct_portfolio library 359 | import visualize_wealth.construct_portfolio as vwcp 360 | 361 | Let's use the ``static_allocation`` provided in the 362 | ``panel from weight file test.xlsx`` workbook: 363 | 364 | :: 365 | 366 | f = pandas.ExcelFile('tests/panel from weight file test.xlsx') 367 | static_alloc = f.parse('static_allocation', index_col = 0, 368 | header = 0) 369 | 370 | Again, assuming we don't have the prices and need to download them, use 371 | ``fetch_data_for_initial_allocation_method``: 372 | 373 | :: 374 | 375 | price_panel = vwcp.fetch_data_for_initial_allocation_method(static_alloc) 376 | 377 | Construct the ``panel`` for the portfolio while determining the desired 378 | rebalance frequency: 379 | 380 | :: 381 | 382 | panel = vwcp.panel_from_initial_weights(weight_series = static_alloc, 383 | price_panel = price_panel, rebal_frequency = 'quarterly') 384 | 385 | Construct the final portfolio with columns ``['Open', 'Close']``: 386 | 387 | :: 388 | 389 | portfolio = vwcp.pfp_from_weight_file(panel) 390 | 391 | Take a look at the portfolio series: 392 | 393 | :: 394 | 395 | In [10]: portfolio.head() 396 | Out[10]: 397 | 398 | Close Open 399 | Date 400 | 2007-12-12 1000.000000 1007.885932 401 | 2007-12-13 991.329125 990.717915 402 | 2007-12-14 978.157960 983.057829 403 | 2007-12-17 961.705069 969.797167 404 | 2007-12-18 969.794966 972.365687 405 | 406 | ToDo List: 407 | ---------- 408 | 409 | - occasionally ``generate_random_asset_path`` will return with an 410 | Assertion Error 411 | 412 | - Add the following statistics to the ``analyze.py`` library: 413 | 414 | - Absolute Alpha: :math:`R_p - R_b` (struck through; ``analyze.alpha`` now exists) 415 | 416 | - Treynor ratio: 417 | 418 | .. math:: \textrm{T.R.} \triangleq \frac{r_i - r_f}{\beta_i} 419 | 420 | - Information Ratio or Appraisal Ratio: 421 | 422 | .. math:: \textrm{I.R.} \triangleq \frac{\alpha}{\omega} 423 | 424 | i.e. absolute alpha / tracking error. Other formulations include 425 | Jensen's Alpha / Idiosyncratic Vol 426 | 427 | - Up / Down beta (or 428 | `Dual-Beta `__) 429 | 430 | - Best broad asset classes to determine "best fit portfolio" 431 | 432 | +----------+-------------------------------+---------------------+ 433 | | Ticker | Name | Price Data Begins | 434 | +==========+===============================+=====================+ 435 | | VTSMX | Vanguard Total Stock Market | 6/20/1996 | 436 | +----------+-------------------------------+---------------------+ 437 | | VBMFX | Vanguard Total Bond Market | 6/4/1990 | 438 | +----------+-------------------------------+---------------------+ 439 | | VGTSX | Vanguard Total Intl Stock | 6/28/1996 | 440 | +----------+-------------------------------+---------------------+ 441 | 442 | - Rebuild Process: 443 | 444 | 1. If the ``README.md`` file is altered, run: 445 | 446 | :: 447 | 448 | $ pandoc -f markdown -t rst README.md -o docs/source/readme.rst 449 | 450 | 2. Then rebuild the Sphinx documentation: 451 | 452 | :: 453 | 454 | $ sphinx-build -b html docs/source/ docs/build/ 455 | 456 | 457 | -------------------------------------------------------------------------------- /docs/source/visualize_wealth.rst: -------------------------------------------------------------------------------- 1 | visualize_wealth package 2 | ======================== 3 | 4 | Submodules 5 | ---------- 6 | 7 | visualize_wealth.analyze module 8 | ------------------------------- 9 | 10 | ..
automodule:: visualize_wealth.analyze 11 | :members: 12 | :undoc-members: 13 | :show-inheritance: 14 | 15 | visualize_wealth.classify module 16 | -------------------------------- 17 | 18 | .. automodule:: visualize_wealth.classify 19 | :members: 20 | :undoc-members: 21 | :show-inheritance: 22 | 23 | visualize_wealth.construct_portfolio module 24 | ------------------------------------------- 25 | 26 | .. automodule:: visualize_wealth.construct_portfolio 27 | :members: 28 | :undoc-members: 29 | :show-inheritance: 30 | 31 | visualize_wealth.utils module 32 | ----------------------------- 33 | 34 | .. automodule:: visualize_wealth.utils 35 | :members: 36 | :undoc-members: 37 | :show-inheritance: 38 | 39 | 40 | Module contents 41 | --------------- 42 | 43 | .. automodule:: visualize_wealth 44 | :members: 45 | :undoc-members: 46 | :show-inheritance: 47 | -------------------------------------------------------------------------------- /requirements.txt: -------------------------------------------------------------------------------- 1 | 2 | chardet>=1.0.1 3 | cython>=0.21.1 4 | h5py>=2.3.1 5 | ipdb>=0.8 6 | ipython>=3.0.0 7 | matplotlib>=1.4.2 8 | numpy>=1.9.1 9 | numexpr>=2.4 10 | pandas>=0.14.1 11 | py>=1.4.26 12 | pytest>=2.6.4 13 | pytest-cov>=1.8.1 14 | scipy>=0.14.0 15 | tables>=3.1.1 16 | xlrd>=0.9.3 -------------------------------------------------------------------------------- /run_tests: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | 3 | declare -a fList=( 4 | test_analyze.py 5 | test_construct_portfolio.py 6 | test_utils.py 7 | ) 8 | 9 | for nm in "${fList[@]}" 10 | do 11 | echo testing "$nm" 12 | py.test ./test_module/"$nm" -v 13 | done 14 | -------------------------------------------------------------------------------- /setup.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | # encoding: utf-8 3 | 4 | from setuptools import setup 5 | 6 | setup(name='visualize_wealth', 7 | version='0.1', 8 | description='Portfolio Construction and Analysis', 9 | author='Benjamin M. 
Gross', 10 | author_email='benjaminMgross@gmail.com', 11 | url='https://github.com/benjaminmgross/wealth-viz', 12 | packages=['visualize_wealth']) 13 | -------------------------------------------------------------------------------- /test_data/estimating when splits have occurred.xlsx: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/benjaminmgross/visualize-wealth/76f3f0fd815a25994ab9140a9d85cf7aae81f9fd/test_data/estimating when splits have occurred.xlsx -------------------------------------------------------------------------------- /test_data/panel from weight file test.xlsx: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/benjaminmgross/visualize-wealth/76f3f0fd815a25994ab9140a9d85cf7aae81f9fd/test_data/panel from weight file test.xlsx -------------------------------------------------------------------------------- /test_data/test_analyze.xlsx: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/benjaminmgross/visualize-wealth/76f3f0fd815a25994ab9140a9d85cf7aae81f9fd/test_data/test_analyze.xlsx -------------------------------------------------------------------------------- /test_data/test_ret_calcs.xlsx: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/benjaminmgross/visualize-wealth/76f3f0fd815a25994ab9140a9d85cf7aae81f9fd/test_data/test_ret_calcs.xlsx -------------------------------------------------------------------------------- /test_data/test_splits.xlsx: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/benjaminmgross/visualize-wealth/76f3f0fd815a25994ab9140a9d85cf7aae81f9fd/test_data/test_splits.xlsx -------------------------------------------------------------------------------- /test_data/transaction-costs.xlsx: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/benjaminmgross/visualize-wealth/76f3f0fd815a25994ab9140a9d85cf7aae81f9fd/test_data/transaction-costs.xlsx -------------------------------------------------------------------------------- /test_data/~$panel from weight file test.xlsx: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/benjaminmgross/visualize-wealth/76f3f0fd815a25994ab9140a9d85cf7aae81f9fd/test_data/~$panel from weight file test.xlsx -------------------------------------------------------------------------------- /test_module/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/benjaminmgross/visualize-wealth/76f3f0fd815a25994ab9140a9d85cf7aae81f9fd/test_module/__init__.py -------------------------------------------------------------------------------- /test_module/test_analyze.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | # encoding: utf-8 3 | 4 | """ 5 | .. module:: visualize_wealth.test_module.test_analyze.py 6 | 7 | .. moduleauthor:: Benjamin M. 
Gross 8 | 9 | """ 10 | 11 | import pytest 12 | import pandas 13 | from pandas.util import testing 14 | import visualize_wealth.analyze as analyze 15 | 16 | @pytest.fixture 17 | def test_file(): 18 | return pandas.ExcelFile('./test_data/test_analyze.xlsx') 19 | 20 | @pytest.fixture 21 | def man_calcs(test_file): 22 | return test_file.parse('calcs', index_col = 0) 23 | 24 | @pytest.fixture 25 | def stat_calcs(test_file): 26 | return test_file.parse('results', index_col = 0) 27 | 28 | @pytest.fixture 29 | def prices(test_file): 30 | tmp = test_file.parse('calcs', index_col = 0) 31 | return tmp[['S&P 500', 'VGTSX']] 32 | 33 | def test_active_return(prices, stat_calcs): 34 | man_ar = stat_calcs.loc['active_return', 'VGTSX'] 35 | 36 | testing.assert_almost_equal(man_ar, analyze.active_return( 37 | series = prices['VGTSX'], 38 | benchmark = prices['S&P 500'], 39 | freq = 'daily') 40 | ) 41 | 42 | def test_active_returns(man_calcs, prices): 43 | active_returns = analyze.active_returns(series = prices['VGTSX'], 44 | benchmark = prices['S&P 500']) 45 | 46 | testing.assert_series_equal(man_calcs['Active Return'], active_returns) 47 | 48 | def test_log_returns(man_calcs, prices): 49 | testing.assert_series_equal(man_calcs['S&P 500 Log Ret'], 50 | analyze.log_returns(prices['S&P 500']) 51 | ) 52 | 53 | def test_linear_returns(man_calcs, prices): 54 | testing.assert_series_equal(man_calcs['S&P 500 Lin Ret'], 55 | analyze.linear_returns(prices['S&P 500']) 56 | ) 57 | 58 | def test_drawdown(man_calcs, prices): 59 | testing.assert_series_equal(man_calcs['VGTSX Drawdown'], 60 | analyze.drawdown(prices['VGTSX']) 61 | ) 62 | 63 | def test_r2(man_calcs, prices): 64 | log_rets = analyze.log_returns(prices).dropna() 65 | pandas_rsq = pandas.ols(x = log_rets['S&P 500'], 66 | y = log_rets['VGTSX']).r2 67 | 68 | analyze_rsq = analyze.r2(benchmark = log_rets['S&P 500'], 69 | series = log_rets['VGTSX']) 70 | 71 | testing.assert_almost_equal(pandas_rsq, analyze_rsq) 72 | 73 | def test_r2_adj(man_calcs, prices): 74 | log_rets = analyze.log_returns(prices).dropna() 75 | pandas_rsq = pandas.ols(x = log_rets['S&P 500'], 76 | y = log_rets['VGTSX']).r2_adj 77 | 78 | analyze_rsq = analyze.r2_adj(benchmark = log_rets['S&P 500'], 79 | series = log_rets['VGTSX']) 80 | 81 | testing.assert_almost_equal(pandas_rsq, analyze_rsq) 82 | 83 | def test_cumulative_turnover(test_file, stat_calcs): 84 | alloc_df = test_file.parse('alloc_df', index_col = 0) 85 | cols = alloc_df.columns[alloc_df.columns!='Daily TO'] 86 | alloc_df = alloc_df[cols].dropna() 87 | asset_wt_df = test_file.parse('asset_wt_df', index_col = 0) 88 | testing.assert_almost_equal(analyze.cumulative_turnover(alloc_df, asset_wt_df), 89 | stat_calcs.loc['cumulative_turnover', 'S&P 500'] 90 | ) 91 | 92 | def test_mctr(test_file): 93 | mctr_prices = test_file.parse('mctr', index_col = 0) 94 | mctr_manual = test_file.parse('mctr_results', index_col = 0) 95 | cols = ['BSV','VBK','VBR','VOE','VOT'] 96 | mctr = analyze.mctr(mctr_prices[cols], mctr_prices['Portfolio']) 97 | testing.assert_series_equal(mctr, mctr_manual.loc['mctr', cols]) 98 | 99 | def test_risk_contribution(test_file): 100 | mctr_prices = test_file.parse('mctr', index_col = 0) 101 | mctr_manual = test_file.parse('mctr_results', index_col = 0) 102 | cols = ['BSV','VBK','VBR','VOE','VOT'] 103 | mctr = analyze.mctr(mctr_prices[cols], mctr_prices['Portfolio']) 104 | weights = pandas.Series( [.2, .2, .2, .2, .2], index = cols, name = 'risk_contribution') 105 | 106 | 
testing.assert_series_equal(analyze.risk_contribution(mctr, weights), 107 | mctr_manual.loc['risk_contribution', :] 108 | ) 109 | 110 | def test_risk_contribution_as_proportion(test_file): 111 | mctr_prices = test_file.parse('mctr', index_col = 0) 112 | mctr_manual = test_file.parse('mctr_results', index_col = 0) 113 | cols = ['BSV','VBK','VBR','VOE','VOT'] 114 | mctr = analyze.mctr(mctr_prices[cols], mctr_prices['Portfolio']) 115 | weights = pandas.Series( [.2, .2, .2, .2, .2], index = cols, name = 'risk_contribution') 116 | 117 | testing.assert_series_equal( 118 | analyze.risk_contribution_as_proportion(mctr, weights), 119 | mctr_manual.loc['risk_contribution_as_proportion'] 120 | ) 121 | 122 | def test_alpha(prices, stat_calcs): 123 | man_alpha = stat_calcs.loc['alpha', 'VGTSX'] 124 | 125 | testing.assert_almost_equal(man_alpha, analyze.alpha(series = prices['VGTSX'], 126 | benchmark = prices['S&P 500']) 127 | ) 128 | 129 | def test_annualized_return(prices, stat_calcs): 130 | man_ar = stat_calcs.loc['annualized_return', 'VGTSX'] 131 | 132 | testing.assert_almost_equal( 133 | man_ar, analyze.annualized_return(series = prices['VGTSX'], freq = 'daily') 134 | ) 135 | 136 | def test_annualized_vol(prices, stat_calcs): 137 | man_ar = stat_calcs.loc['annualized_vol', 'VGTSX'] 138 | 139 | testing.assert_almost_equal( 140 | man_ar, analyze.annualized_vol(series = prices['VGTSX'], freq = 'daily') 141 | ) 142 | 143 | def test_appraisal_ratio(prices, stat_calcs): 144 | man_ar = stat_calcs.loc['appraisal_ratio', 'VGTSX'] 145 | 146 | testing.assert_almost_equal(man_ar, analyze.appraisal_ratio( 147 | series = prices['VGTSX'], 148 | benchmark = prices['S&P 500'], 149 | freq = 'daily', 150 | rfr = 0.0) 151 | ) 152 | 153 | def test_beta(prices, stat_calcs): 154 | man_beta = stat_calcs.loc['beta', 'VGTSX'] 155 | 156 | testing.assert_almost_equal(man_beta, analyze.beta(series = prices['VGTSX'], 157 | benchmark = prices['S&P 500']) 158 | ) 159 | 160 | def test_cvar_cf(prices, stat_calcs): 161 | man_cvar_cf = stat_calcs.loc['cvar_cf', 'VGTSX'] 162 | 163 | testing.assert_almost_equal( 164 | man_cvar_cf, analyze.cvar_cf(series = prices['VGTSX'], p = 0.01) 165 | ) 166 | 167 | def test_cvar_norm(prices, stat_calcs): 168 | man_cvar_norm = stat_calcs.loc['cvar_norm', 'VGTSX'] 169 | 170 | testing.assert_almost_equal( 171 | man_cvar_norm, analyze.cvar_norm(series = prices['VGTSX'], p = 0.01) 172 | ) 173 | 174 | def test_downcapture(prices, stat_calcs): 175 | man_dc = stat_calcs.loc['downcapture', 'VGTSX'] 176 | 177 | testing.assert_almost_equal( 178 | man_dc, analyze.downcapture(series = prices['VGTSX'], 179 | benchmark = prices['S&P 500']) 180 | ) 181 | 182 | def test_downside_deviation(prices, stat_calcs): 183 | man_dd = stat_calcs.loc['downside_deviation', 'VGTSX'] 184 | 185 | testing.assert_almost_equal( 186 | man_dd, analyze.downside_deviation(series = prices['VGTSX']) 187 | ) 188 | 189 | def test_geometric_difference(): 190 | a, b = 1. , 1. 191 | assert analyze.geometric_difference(a, b) == 0. 192 | a, b = pandas.Series({'a': 1.}), pandas.Series({'a': 1.}) 193 | assert analyze.geometric_difference(a, b).values == 0. 
194 | 195 | def test_idiosyncratic_as_proportion(prices, stat_calcs): 196 | man_iap = stat_calcs.loc['idiosyncratic_as_proportion', 'VGTSX'] 197 | 198 | testing.assert_almost_equal( 199 | man_iap, analyze.idiosyncratic_as_proportion( 200 | series = prices['VGTSX'], benchmark = prices['S&P 500']) 201 | ) 202 | 203 | def test_idiosyncratic_risk(prices, stat_calcs): 204 | man_ir = stat_calcs.loc['idiosyncratic_risk', 'VGTSX'] 205 | 206 | testing.assert_almost_equal( 207 | man_ir, analyze.idiosyncratic_risk( 208 | series = prices['VGTSX'], benchmark = prices['S&P 500']) 209 | ) 210 | 211 | def test_information_ratio(prices, stat_calcs): 212 | man_ir = stat_calcs.loc['information_ratio', 'VGTSX'] 213 | 214 | testing.assert_almost_equal(man_ir, analyze.information_ratio( 215 | series = prices['VGTSX'], 216 | benchmark = prices['S&P 500'], 217 | freq = 'daily') 218 | ) 219 | 220 | def test_jensens_alpha(prices, stat_calcs): 221 | man_ja = stat_calcs.loc['jensens_alpha', 'VGTSX'] 222 | 223 | testing.assert_almost_equal( 224 | man_ja, analyze.jensens_alpha( 225 | series = prices['VGTSX'], benchmark = prices['S&P 500']) 226 | ) 227 | 228 | def test_max_drawdown(prices, stat_calcs): 229 | man_md = stat_calcs.loc['max_drawdown', 'VGTSX'] 230 | 231 | testing.assert_almost_equal( 232 | man_md, analyze.max_drawdown(series = prices['VGTSX']) 233 | ) 234 | 235 | def test_mean_absolute_tracking_error(prices, stat_calcs): 236 | man_mate = stat_calcs.loc['mean_absolute_tracking_error', 'VGTSX'] 237 | 238 | testing.assert_almost_equal( 239 | man_mate, analyze.mean_absolute_tracking_error( 240 | series = prices['VGTSX'], benchmark = prices['S&P 500']) 241 | ) 242 | 243 | def test_median_downcapture(prices, stat_calcs): 244 | man_md = stat_calcs.loc['median_downcapture', 'VGTSX'] 245 | 246 | testing.assert_almost_equal( 247 | man_md, analyze.median_downcapture( 248 | series = prices['VGTSX'], benchmark = prices['S&P 500']) 249 | ) 250 | 251 | def test_median_upcapture(prices, stat_calcs): 252 | man_uc = stat_calcs.loc['median_upcapture', 'VGTSX'] 253 | 254 | testing.assert_almost_equal( 255 | man_uc, analyze.median_upcapture( 256 | series = prices['VGTSX'], benchmark = prices['S&P 500']) 257 | ) 258 | 259 | def test_risk_adjusted_excess_return(prices, stat_calcs): 260 | man_raer = stat_calcs.loc['risk_adjusted_excess_return', 'VGTSX'] 261 | 262 | testing.assert_almost_equal( 263 | man_raer, analyze.risk_adjusted_excess_return( 264 | series = prices['VGTSX'], benchmark = prices['S&P 500'], 265 | rfr = 0.0, freq = 'daily') 266 | ) 267 | 268 | def test_adj_sharpe_ratio(prices, stat_calcs): 269 | man_asr = stat_calcs.loc['adj_sharpe_ratio', 'VGTSX'] 270 | 271 | testing.assert_almost_equal( 272 | man_asr, analyze.adj_sharpe_ratio( 273 | series = prices['VGTSX'], 274 | rfr = 0.0, 275 | freq = 'daily') 276 | ) 277 | 278 | def test_sharpe_ratio(prices, stat_calcs): 279 | man_sr = stat_calcs.loc['sharpe_ratio', 'VGTSX'] 280 | 281 | testing.assert_almost_equal(man_sr, analyze.sharpe_ratio( 282 | series = prices['VGTSX'], 283 | rfr = 0.0, 284 | freq = 'daily') 285 | ) 286 | 287 | def test_sortino_ratio(prices, stat_calcs): 288 | man_sr = stat_calcs.loc['sortino_ratio', 'VGTSX'] 289 | 290 | testing.assert_almost_equal(man_sr, analyze.sortino_ratio( 291 | series = prices['VGTSX'], 292 | rfr = 0.0, 293 | freq = 'daily') 294 | ) 295 | 296 | def test_systematic_as_proportion(prices, stat_calcs): 297 | man_sap = stat_calcs.loc['systematic_as_proportion', 'VGTSX'] 298 | 299 | testing.assert_almost_equal( 300 | man_sap, 
analyze.systematic_as_proportion( 301 | series = prices['VGTSX'], benchmark = prices['S&P 500']) 302 | ) 303 | 304 | def test_systematic_risk(prices, stat_calcs): 305 | man_sr = stat_calcs.loc['systematic_risk', 'VGTSX'] 306 | 307 | testing.assert_almost_equal( 308 | man_sr, analyze.systematic_risk( 309 | series = prices['VGTSX'], benchmark = prices['S&P 500']) 310 | ) 311 | 312 | def test_tracking_error(prices, stat_calcs): 313 | man_te = stat_calcs.loc['tracking_error', 'VGTSX'] 314 | 315 | testing.assert_almost_equal( 316 | man_te, analyze.tracking_error( 317 | series = prices['VGTSX'], benchmark = prices['S&P 500']) 318 | ) 319 | 320 | def test_ulcer_index(prices, stat_calcs): 321 | man_ui = stat_calcs.loc['ulcer_index', 'VGTSX'] 322 | 323 | testing.assert_almost_equal( 324 | man_ui, analyze.ulcer_index(series = prices['VGTSX']) 325 | ) 326 | 327 | def test_upcapture(prices, stat_calcs): 328 | man_uc = stat_calcs.loc['upcapture', 'VGTSX'] 329 | 330 | testing.assert_almost_equal( 331 | man_uc, analyze.upcapture( 332 | series = prices['VGTSX'], benchmark = prices['S&P 500']) 333 | ) 334 | 335 | def test_upside_deviation(prices, stat_calcs): 336 | man_ud = stat_calcs.loc['upside_deviation', 'VGTSX'] 337 | 338 | testing.assert_almost_equal( 339 | man_ud, analyze.upside_deviation( 340 | series = prices['VGTSX'], 341 | freq = 'daily') 342 | ) 343 | -------------------------------------------------------------------------------- /test_module/test_construct_portfolio.py: -------------------------------------------------------------------------------- 1 | 2 | #!/usr/bin/env python 3 | # encoding: utf-8 4 | 5 | """ 6 | .. module:: visualize_wealth.test_module.test_construct_portfolio.py 7 | 8 | .. moduleauthor:: Benjamin M. Gross 9 | 10 | """ 11 | import os 12 | import pytest 13 | import numpy 14 | import pandas 15 | import tempfile 16 | import datetime 17 | from pandas.util import testing 18 | from visualize_wealth import construct_portfolio as cp 19 | 20 | @pytest.fixture 21 | def test_file(): 22 | f = './test_data/panel from weight file test.xlsx' 23 | return pandas.ExcelFile(f) 24 | 25 | @pytest.fixture 26 | def tc_file(): 27 | f = './test_data/transaction-costs.xlsx' 28 | return pandas.ExcelFile(f) 29 | 30 | @pytest.fixture 31 | def rebal_weights(test_file): 32 | return test_file.parse('rebal_weights', index_col = 0) 33 | 34 | @pytest.fixture 35 | def panel(test_file, rebal_weights): 36 | tickers = ['EEM', 'EFA', 'IYR', 'IWV', 'IEF', 'IYR', 'SHY'] 37 | d = {} 38 | for ticker in tickers: 39 | d[ticker] = test_file.parse(ticker, index_col = 0) 40 | 41 | return cp.panel_from_weight_file(rebal_weights, 42 | pandas.Panel(d), 43 | 1000. 
44 | ) 45 | 46 | @pytest.fixture 47 | def manual_index(panel, test_file): 48 | man_calc = test_file.parse('index_result', 49 | index_col = 0 50 | ) 51 | return man_calc 52 | 53 | @pytest.fixture 54 | def manual_tc_bps(tc_file): 55 | man_tcosts = tc_file.parse('tc_bps', index_col = 0) 56 | man_tcosts = man_tcosts.fillna(0.0) 57 | return man_tcosts 58 | 59 | @pytest.fixture 60 | def manual_tc_cps(tc_file): 61 | man_tcosts = tc_file.parse('tc_cps', index_col = 0) 62 | man_tcosts = man_tcosts.fillna(0.0) 63 | return man_tcosts 64 | 65 | @pytest.fixture 66 | def manual_mngmt_fee(tc_file): 67 | return tc_file.parse('mgmt_fee', index_col = 0) 68 | 69 | def test_mngmt_fee(panel, tc_file, manual_mngmt_fee): 70 | index = cp.pfp_from_weight_file(panel) 71 | 72 | vw_mfee = cp.mngmt_fee(price_series = index['Close'], 73 | bps_cost = 100., 74 | frequency = 'daily' 75 | ) 76 | 77 | testing.assert_series_equal(manual_mngmt_fee['daily_index'], 78 | vw_mfee 79 | ) 80 | 81 | def test_pfp(panel, manual_index): 82 | #import ipdb; ipdb.set_trace() 83 | lib_calc = cp.pfp_from_weight_file(panel) 84 | 85 | # hack because names weren't matching up 86 | mn_series = manual_index['Close'] 87 | lb_series = lib_calc['Close'] 88 | mn_series.index.name = lb_series.index.name 89 | 90 | testing.assert_series_equal(mn_series, 91 | lb_series 92 | ) 93 | return lib_calc 94 | 95 | def test_tc_bps(rebal_weights, panel, manual_tc_bps): 96 | vw_tcosts = cp.tc_bps(weight_df = rebal_weights, 97 | share_panel = panel, 98 | bps = 10., 99 | ) 100 | cols = ['EEM', 'EFA', 'IEF', 'IWV', 'IYR', 'SHY'] 101 | testing.assert_frame_equal(manual_tc_bps[cols], vw_tcosts) 102 | 103 | def test_net_bps(rebal_weights, panel, manual_tc_bps, manual_index): 104 | 105 | index = test_pfp(panel, manual_index) 106 | index = index['Close'] 107 | 108 | vw_tcosts = cp.tc_bps(weight_df = rebal_weights, 109 | share_panel = panel, 110 | bps = 10., 111 | ) 112 | 113 | net_tcs = cp.net_tcs(tc_df = vw_tcosts, 114 | price_index = index 115 | ) 116 | 117 | testing.assert_series_equal(manual_tc_bps['adj_index'], 118 | net_tcs 119 | ) 120 | 121 | def test_net_cps(rebal_weights, panel, manual_tc_cps, manual_index): 122 | index = test_pfp(panel, manual_index) 123 | index = index['Close'] 124 | 125 | vw_tcosts = cp.tc_cps(weight_df = rebal_weights, 126 | share_panel = panel, 127 | cps = 10., 128 | ) 129 | 130 | net_tcs = cp.net_tcs(tc_df = vw_tcosts, 131 | price_index = index 132 | ) 133 | 134 | testing.assert_series_equal(manual_tc_cps['adj_index'], 135 | net_tcs 136 | ) 137 | 138 | def test_tc_cps(rebal_weights, panel, manual_tc_cps): 139 | cols = ['EEM', 'EFA', 'IEF', 'IWV', 'IYR', 'SHY'] 140 | vw_tcosts = cp.tc_cps(weight_df = rebal_weights, 141 | share_panel = panel, 142 | cps = 10., 143 | ) 144 | 145 | testing.assert_frame_equal(manual_tc_cps[cols], vw_tcosts) 146 | 147 | def test_funs(): 148 | """ 149 | >>> import pandas.util.testing as put 150 | >>> xl_file = pandas.ExcelFile('../tests/test_splits.xlsx') 151 | >>> blotter = xl_file.parse('blotter', index_col = 0) 152 | >>> cols = ['Close', 'Adj Close', 'Dividends'] 153 | >>> price_df = xl_file.parse('calc_sheet', index_col = 0) 154 | >>> price_df = price_df[cols] 155 | >>> split_frame = calculate_splits(price_df) 156 | 157 | >>> shares_owned = blotter_and_price_df_to_cum_shares(blotter, 158 | ... split_frame) 159 | >>> test_vals = xl_file.parse( 160 | ... 'share_balance', index_col = 0)['cum_shares'] 161 | >>> put.assert_almost_equal(shares_owned['cum_shares'].dropna(), 162 | ... 
test_vals) 163 | True 164 | >>> f = '../tests/panel from weight file test.xlsx' 165 | >>> xl_file = pandas.ExcelFile(f) 166 | >>> weight_df = xl_file.parse('rebal_weights', index_col = 0) 167 | >>> tickers = ['EEM', 'EFA', 'IYR', 'IWV', 'IEF', 'IYR', 'SHY'] 168 | >>> d = {} 169 | >>> for ticker in tickers: 170 | ... d[ticker] = xl_file.parse(ticker, index_col = 0) 171 | >>> panel = panel_from_weight_file(weight_df, pandas.Panel(d), 172 | ... 1000.) 173 | >>> portfolio = pfp_from_weight_file(panel) 174 | >>> manual_calcs = xl_file.parse('index_result', index_col = 0) 175 | >>> put.assert_series_equal(manual_calcs['Close'], 176 | ... portfolio['Close']) 177 | """ 178 | return None -------------------------------------------------------------------------------- /test_module/test_utils.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | # encoding: utf-8 3 | 4 | """ 5 | .. module:: visualize_wealth.test_module.test_utils.py 6 | 7 | .. moduleauthor:: Benjamin M. Gross 8 | 9 | """ 10 | import os 11 | import pytest 12 | import numpy 13 | import pandas 14 | import tempfile 15 | import datetime 16 | 17 | from pandas.util.testing import (assert_frame_equal, 18 | assert_series_equal, 19 | assert_index_equal, 20 | assert_almost_equal 21 | ) 22 | 23 | import visualize_wealth.utils as utils 24 | 25 | 26 | @pytest.fixture 27 | def populate_store(): 28 | name = './test_data/tmp.h5' 29 | store = pandas.HDFStore(name, mode = 'w') 30 | 31 | #two weeks of data before today, delete one week, then update 32 | delta = datetime.timedelta(14) 33 | today = datetime.datetime.date(datetime.datetime.today()) 34 | index = pandas.DatetimeIndex(start = today - delta, 35 | freq = 'b', 36 | periods = 10 37 | ) 38 | 39 | store.put('TICK', pandas.Series(numpy.ones(len(index), ), 40 | index = index, 41 | name = 'Close') 42 | ) 43 | 44 | store.put('TOCK', pandas.Series(numpy.ones(len(index), ), 45 | index = index, 46 | name = 'Close') 47 | ) 48 | store.close() 49 | return {'name': name, 'index': index} 50 | 51 | @pytest.fixture 52 | def populate_updated(): 53 | name = './test_data/tmp.h5' 54 | store = pandas.HDFStore(name, mode = 'w') 55 | 56 | #two weeks of data before today, delete one week, then update 57 | delta = datetime.timedelta(14) 58 | today = datetime.datetime.date(datetime.datetime.today()) 59 | index = pandas.DatetimeIndex(start = today - delta, 60 | freq = 'b', 61 | periods = 10 62 | ) 63 | 64 | store.put('TICK', pandas.Series(numpy.ones(len(index), ), 65 | index = index, 66 | name = 'Close') 67 | ) 68 | 69 | store.put('TOCK', pandas.Series(numpy.ones(len(index), ), 70 | index = index, 71 | name = 'Close') 72 | ) 73 | 74 | #truncate the index for updating 75 | ind = index[:5] 76 | n = len(ind) 77 | 78 | #store the Master IND3X 79 | store.put('IND3X', pandas.Series(ind, 80 | index = ind) 81 | ) 82 | 83 | cols = ['Open', 'High', 'Low', 'Close', 'Volume', 'Adj Close'] 84 | cash = pandas.DataFrame(numpy.ones([n, len(cols)]), 85 | index = ind, 86 | columns = cols 87 | ) 88 | 89 | #store the CA5H 90 | store.put('CA5H', cash) 91 | store.close() 92 | return {'name': name, 'index': index} 93 | 94 | 95 | def test_create_store_master_index(populate_store): 96 | index = populate_store['index'] 97 | index = pandas.Series(index, index = index) 98 | 99 | utils.create_store_master_index(populate_store['name']) 100 | store = pandas.HDFStore(populate_store['name'], mode = 'r+') 101 | assert_series_equal(store.get('IND3X'), index) 102 | store.close() 103 | 
os.remove(populate_store['name']) 104 | 105 | 106 | def test_union_store_indexes(populate_store): 107 | store = pandas.HDFStore(populate_store['name'], mode = 'r+') 108 | index = populate_store['index'] 109 | union = utils.union_store_indexes(store) 110 | assert_index_equal(index, union) 111 | store.close() 112 | os.remove(populate_store['name']) 113 | 114 | 115 | def test_create_store_cash(populate_store): 116 | index = populate_store['index'] 117 | utils.create_store_cash(populate_store['name']) 118 | store = pandas.HDFStore(populate_store['name'], mode = 'r+') 119 | cols = ['Open', 'High', 'Low', 'Close', 'Volume', 'Adj Close'] 120 | n = len(index) 121 | 122 | cash = pandas.DataFrame(numpy.ones([n, len(cols)]), 123 | index = index, 124 | columns = cols 125 | ) 126 | 127 | assert_frame_equal(store.get('CA5H'), cash) 128 | store.close() 129 | os.remove(populate_store['name']) 130 | 131 | 132 | def test_update_store_master_and_cash(populate_updated): 133 | index = populate_updated['index'] 134 | index = pandas.Series(index, index = index) 135 | 136 | utils.update_store_master_index(populate_updated['name']) 137 | utils.update_store_cash(populate_updated['name']) 138 | 139 | store = pandas.HDFStore(populate_updated['name'], mode = 'r+') 140 | assert_series_equal(store.get('IND3X'), index) 141 | 142 | cols = ['Open', 'High', 'Low', 'Close', 'Volume', 'Adj Close'] 143 | n = len(index) 144 | cash = pandas.DataFrame(numpy.ones([n, len(cols)]), 145 | index = index, 146 | columns = cols 147 | ) 148 | 149 | assert_frame_equal(store.get('CA5H'), cash) 150 | store.close() 151 | os.remove(populate_updated['name']) 152 | 153 | 154 | 155 | def test_rets_to_price(): 156 | dts = ['1/1/2000', '1/2/2000', '1/3/2000'] 157 | 158 | index = pandas.DatetimeIndex( 159 | pandas.Timestamp(dt) for dt in dts 160 | ) 161 | 162 | series = pandas.Series([numpy.nan, 0., 0.], 163 | index = index 164 | ) 165 | 166 | log = utils.rets_to_price( 167 | series, 168 | ret_typ = 'log', 169 | start_value = 100. 170 | ) 171 | 172 | lin = utils.rets_to_price( 173 | series, 174 | ret_typ = 'linear', 175 | start_value = 100. 176 | ) 177 | 178 | man = pandas.Series([100., 100., 100.], 179 | index = index 180 | ) 181 | 182 | assert_series_equal(log, man) 183 | assert_series_equal(lin, man) 184 | 185 | df = pandas.DataFrame({'a': series, 'b': series}) 186 | log = utils.rets_to_price( 187 | df, 188 | ret_typ = 'log', 189 | start_value = 100. 190 | ) 191 | 192 | lin = utils.rets_to_price( 193 | df, 194 | ret_typ = 'linear', 195 | start_value = 100. 196 | ) 197 | 198 | man = pandas.DataFrame({'a': man, 'b': man}) 199 | 200 | assert_frame_equal(log, man) 201 | assert_frame_equal(lin, man) 202 | 203 | with pytest.raises(TypeError): 204 | utils.rets_to_price(pandas.Panel(), 205 | ret_typ = 'log', 206 | start_value = 100. 
207 | ) 208 | 209 | #@pytest.mark.newtest 210 | def test_strip_vals(): 211 | l = [' TLT', ' HYY ', 'IEF '] 212 | strpd = utils.strip_vals(l) 213 | res = ['TLT', 'HYY', 'IEF'] 214 | assert strpd == res 215 | 216 | @pytest.mark.newtest 217 | def test_zipped_time_chunks(): 218 | pts = pandas.Timestamp 219 | 220 | index = pandas.DatetimeIndex( 221 | start = '06/01/2000', 222 | freq = 'D', 223 | periods = 100 224 | ) 225 | res = [('06-01-2000', '06-30-2000'), 226 | ('07-01-2000', '07-31-2000'), 227 | ('08-01-2000', '08-31-2000')] 228 | 229 | mc = list(((pts(x), pts(y)) for x, y in res)) 230 | lc = utils.zipped_time_chunks( 231 | index = index, 232 | interval = 'monthly', 233 | incl_T = False 234 | ) 235 | assert mc == lc 236 | 237 | res = [('06-01-2000', '06-30-2000'), 238 | ('07-01-2000', '07-31-2000'), 239 | ('08-01-2000', '08-31-2000'), 240 | ('09-01-2000', '09-08-2000')] 241 | 242 | mc = list(((pts(x), pts(y)) for x, y in res)) 243 | lc = utils.zipped_time_chunks( 244 | index = index, 245 | interval = 'monthly', 246 | incl_T = True 247 | ) 248 | assert mc == lc 249 | 250 | res = [('06-01-2000', '06-30-2000')] 251 | mc = list(((pts(x), pts(y)) for x, y in res)) 252 | lc = utils.zipped_time_chunks( 253 | index = index, 254 | interval = 'quarterly', 255 | incl_T = False 256 | ) 257 | assert mc == lc 258 | 259 | res = [('06-01-2000', '06-30-2000')] 260 | mc = list(((pts(x), pts(y)) for x, y in res)) 261 | lc = utils.zipped_time_chunks( 262 | index = index, 263 | interval = 'quarterly', 264 | incl_T = False 265 | ) 266 | assert mc == lc 267 | 268 | res = [('06-01-2000', '06-30-2000'), 269 | ('07-01-2000', '09-08-2000')] 270 | mc = list(((pts(x), pts(y)) for x, y in res)) 271 | lc = utils.zipped_time_chunks( 272 | index = index, 273 | interval = 'quarterly', 274 | incl_T = True 275 | ) 276 | assert mc == lc 277 | 278 | mc = [] 279 | lc = utils.zipped_time_chunks( 280 | index = index, 281 | interval = 'yearly', 282 | incl_T = False 283 | ) 284 | assert mc == lc 285 | 286 | res = [('06-01-2000', '09-08-2000')] 287 | mc = list(((pts(x), pts(y)) for x, y in res)) 288 | lc = utils.zipped_time_chunks( 289 | index = index, 290 | interval = 'yearly', 291 | incl_T = True 292 | ) 293 | assert mc == lc 294 | 295 | """ 296 | def test_update_store_cash(populate_updated): 297 | index = populate_updated['index'] 298 | 299 | utils.update_store_cash(populate_updated['name']) 300 | store = pandas.HDFStore(populate_updated['name'], mode = 'r+') 301 | cols = ['Open', 'High', 'Low', 'Close', 'Volume', 'Adj Close'] 302 | n = len(index) 303 | cash = pandas.DataFrame(numpy.ones([n, len(cols)]), 304 | index = index, 305 | columns = cols 306 | ) 307 | 308 | assert_frame_equal(store.get('CA5H'), cash) 309 | store.close() 310 | os.remove(populate_updated['name']) 311 | """ -------------------------------------------------------------------------------- /visualize_wealth/__init__.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | # encoding: utf-8 3 | """ 4 | .. module:: __init__.py 5 | :synopsis: initialization file for ``visualize_wealth`` 6 | 7 | .. moduleauthor:: Benjamin M. 
Gross 8 | 9 | """ 10 | import visualize_wealth.construct_portfolio 11 | import visualize_wealth.utils 12 | import visualize_wealth.classify 13 | import visualize_wealth.analyze 14 | -------------------------------------------------------------------------------- /visualize_wealth/classify.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | # encoding: utf-8 3 | """ 4 | .. module:: visualize_wealth.classify.py 5 | 6 | Created by Benjamin M. Gross 7 | 8 | """ 9 | 10 | import argparse 11 | import datetime 12 | import numpy 13 | import pandas 14 | import os 15 | 16 | def classify_series_with_store(series, trained_series, store_path, 17 | calc_meth = 'x-inv-x', n = None): 18 | """ 19 | Determine the asset class of price series from an existing 20 | HDFStore with prices 21 | 22 | :ARGS: 23 | 24 | series: :class:`pandas.Series` or `pandas.DataFrame` of the 25 | price series to determine the asset class of 26 | 27 | trained_series: :class:`pandas.Series` of tickers 28 | and their respective asset classes 29 | 30 | store_path: :class:`string` of the location of the HDFStore 31 | to find asset prices 32 | 33 | calc_meth: :class:`string` of either ['x-inv-x', 'inv-x', 'exp-x'] 34 | to determine which calculation method is used 35 | 36 | n: :class:`integer` of the number of highest r-squared assets 37 | to include when classifying a new asset 38 | 39 | :RETURNS: 40 | 41 | :class:`string` of the tickers that have been estimated 42 | based on the method provided 43 | """ 44 | from .utils import index_intersect 45 | from .analyze import log_returns, r2_adj 46 | 47 | if series.name in trained_series.index: 48 | return trained_series[series.name] 49 | else: 50 | try: 51 | store = pandas.HDFStore(path = store_path, mode = 'r') 52 | except IOError: 53 | print store_path + " is not a valid path to HDFStore" 54 | return 55 | rsq_d = {} 56 | ys = log_returns(series) 57 | 58 | for key in store.keys(): 59 | key = key.strip('/') 60 | p = store.get(key) 61 | xs = log_returns(p['Adj Close']) 62 | ind = index_intersect(xs, ys) 63 | rsq_d[key] = r2_adj(benchmark = ys[ind], series = xs[ind]) 64 | rsq_df = pandas.Series(rsq_d) 65 | store.close() 66 | if not n: 67 | n = len(trained_series.unique()) + 1 68 | 69 | return __weighting_method_agg_fun(series = rsq_df, 70 | trained_series = trained_series, 71 | n = n, calc_meth = calc_meth) 72 | 73 | def classify_series_with_online(series, trained_series, 74 | calc_meth = 'x-inv-x', n = None): 75 | """ 76 | Determine the asset class of price series from an existing 77 | HDFStore with prices 78 | 79 | :ARGS: 80 | 81 | series: :class:`pandas.Series` or `pandas.DataFrame` of the 82 | price series to determine the asset class of 83 | 84 | trained_series: :class:`pandas.Series` of tickers 85 | and their respective asset classes 86 | 87 | calc_meth: :class:`string` of either ['x-inv-x', 'inv-x', 'exp-x'] 88 | to determine which calculation method is used 89 | 90 | n: :class:`integer` of the number of highest r-squared assets 91 | to include when classifying a new asset 92 | 93 | :RETURNS: 94 | 95 | :class:`string` of the tickers that have been estimated 96 | based on the method provided 97 | """ 98 | from .utils import tickers_to_dict, index_intersect 99 | from .analyze import log_returns, r2_adj 100 | 101 | if series.name in trained_series.index: 102 | return trained_series[series.name] 103 | else: 104 | price_dict = tickers_to_dict(trained_series.index) 105 | rsq_d = {} 106 | ys = log_returns(series) 107 | 108 | for key 
in price_dict.keys(): 109 | p = price_dict[key] 110 | xs = log_returns(p['Adj Close']) 111 | ind = index_intersect(xs, ys) 112 | rsq_d[key] = r2_adj(benchmark = xs[ind], series = ys[ind]) 113 | rsq_df = pandas.Series(rsq_d) 114 | 115 | if not n: 116 | n = len(trained_series.unique()) + 1 117 | 118 | return __weighting_method_agg_fun(series = rsq_df, 119 | trained_series = trained_series, 120 | n = n, calc_meth = calc_meth) 121 | 122 | def knn_exp_weighted(series, trained_series, n = None): 123 | """ 124 | Training data is an m x n matrix with 'training_tickers' as columns 125 | and rows of r-squared for different tickers, and asset_class is an 126 | n x 1 result of the asset class 127 | 128 | :ARGS: 129 | 130 | series: :class:`pandas.Series` or :class:`pandas.DataFrame` of 131 | r-squared values 132 | 133 | trained_series: :class:`pandas.Series` of the columns 134 | and their respective asset classes 135 | 136 | n: :class:`integer` of the number of highest r-squared assets 137 | to include when classifying a new asset 138 | 139 | :RETURNS: 140 | 141 | :class:`string` of the tickers that have been estimated 142 | based on the n closest neighbors 143 | """ 144 | if not n: 145 | n = len(trained_series.unique()) + 1 146 | 147 | return __weighting_method_agg_fun(series, trained_series, n, 148 | calc_meth = 'exp-x') 149 | 150 | def knn_inverse_weighted(series, trained_series, n = None): 151 | """ 152 | Training data is an m x n matrix with 'training_tickers' as columns 153 | and rows of r-squared for different tickers, and asset_class is an 154 | n x 1 result of the asset class 155 | 156 | :ARGS: 157 | 158 | series: :class:`pandas.Series` or :class:`pandas.DataFrame` of 159 | r-squared values 160 | 161 | trained_series: :class:`pandas.Series` of the columns 162 | and their respective asset classes 163 | 164 | n: :class:`integer` of the number of highest r-squared assets 165 | to include when classifying a new asset 166 | 167 | :RETURNS: 168 | 169 | :class:`string` of the tickers that have been estimated 170 | based on the n closest neighbors 171 | """ 172 | if not n: 173 | n = len(trained_series.unique()) + 1 174 | 175 | return __weighting_method_agg_fun(series, trained_series, n, 176 | calc_meth = 'inv-x') 177 | 178 | def knn_wt_inv_weighted(series, trained_series, n = None): 179 | """ 180 | Training data is an m x n matrix with 'training_tickers' as columns 181 | and rows of r-squared for different tickers, and asset_class is an 182 | n x 1 result of the asset class 183 | 184 | :ARGS: 185 | 186 | series: :class:`pandas.Series` or :class:`pandas.DataFrame` of 187 | r-squared values 188 | 189 | trained_series: :class:`pandas.Series` of the columns 190 | and their respective asset classes 191 | 192 | n: :class:`integer` of the number of highest r-squared assets 193 | to include when classifying a new asset 194 | 195 | :RETURNS: 196 | 197 | :class:`string` of the tickers that have been estimated 198 | based on the n closest neighbors 199 | """ 200 | if not n: 201 | n = len(trained_series.unique()) + 1 202 | 203 | return __weighting_method_agg_fun(series, trained_series, n, 204 | calc_meth = 'x-inv-x') 205 | 206 |
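# Usage sketch for the knn_* classifiers above (hypothetical tickers and
# asset classes; `series` holds adjusted r-squared values keyed by the
# trained tickers, which is what the classify_series_* functions build
# before delegating here):
#
# >>> trained = pandas.Series({'IWB': 'US Equity', 'TLT': 'Fixed Income'})
# >>> rsqs = pandas.Series({'IWB': 0.95, 'TLT': 0.10})
# >>> knn_wt_inv_weighted(rsqs, trained, n = 2)
# 'US Equity'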
207 | 208 | def __weighting_method_agg_fun(series, trained_series, n, calc_meth): 209 | """ 210 | Dispatch function for the different calculation methods to determine 211 | the asset class based on a Series or DataFrame of r-squared values 212 | 213 | :ARGS: 214 | 215 | series: :class:`pandas.Series` or :class:`pandas.DataFrame` of 216 | r-squared values 217 | 218 | trained_series: :class:`pandas.Series` of the columns 219 | and their respective asset classes 220 | 221 | n: :class:`integer` of the number of highest r-squared assets 222 | to include when classifying a new asset 223 | 224 | calc_meth: :class:`string` of either ['x-inv-x', 'inv-x', 'exp-x'] 225 | to determine which calculation method is used 226 | 227 | :RETURNS: 228 | 229 | :class:`string` of the asset class that has been estimated 230 | based on the n closest neighbors, or a :class:`Series` of asset 231 | classes in the case when a :class:`DataFrame` has been provided instead of a :class:`Series` 232 | 233 | """ 234 | def weighting_method_agg_fun(series, trained_series, n, calc_meth): 235 | weight_map = {'x-inv-x': lambda x: x.div(1. - x), 236 | 'inv-x': lambda x: 1./(1. - x), 237 | 'exp-x': lambda x: numpy.exp(x) 238 | } 239 | 240 | key_map = trained_series[series.index] 241 | series = series.rename(index = key_map) 242 | wts = weight_map[calc_meth](series) 243 | wts = wts.sort(ascending = False, inplace = False) 244 | grp = wts[:n].groupby(wts[:n].index).sum() 245 | return grp.argmax() 246 | 247 | if isinstance(series, pandas.DataFrame): 248 | return series.apply( 249 | lambda x: weighting_method_agg_fun(x, 250 | trained_series, n, calc_meth), axis = 1) 251 | else: 252 | return weighting_method_agg_fun(series, trained_series, n, calc_meth) 253 | 254 | -------------------------------------------------------------------------------- /visualize_wealth/construct_portfolio.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | # encoding: utf-8 3 | """ 4 | .. module:: visualize_wealth.construct_portfolio.py 5 | :synopsis: Engine to construct portfolios using three general methodologies 6 | 7 | .. moduleauthor:: Benjamin M. Gross 8 | """ 9 | import argparse 10 | import logging 11 | import pandas 12 | import numpy 13 | import pandas.io.data 14 | import datetime 15 | import urllib2 16 | 17 | from .utils import (_open_store, 18 | tradeplus_tchunks, 19 | zipped_time_chunks 20 | ) 21 | 22 | def format_blotter(blotter_file): 23 | """ 24 | Pass in either a location of a blotter file (in ``.csv`` format) or a 25 | blotter :class:`pandas.DataFrame` with all positive values, and 26 | return a :class:`pandas.DataFrame` where Sell values have been 27 | made negative 28 | 29 | :ARGS: 30 | 31 | blotter_file: :class:`pandas.DataFrame` with at least index 32 | (dates of Buy / Sell) columns = ['Buy/Sell', 'Shares'] or a 33 | string of the file location to such a formatted file 34 | 35 | :RETURNS: 36 | 37 | blotter: of type :class:`pandas.DataFrame` where sell values 38 | have been made negative 39 | 40 | """ 41 | if isinstance(blotter_file, str): 42 | blot = pandas.DataFrame.from_csv(blotter_file) 43 | elif isinstance(blotter_file, pandas.DataFrame): 44 | blot = blotter_file.copy() 45 | #map to ascii 46 | blot['Buy/Sell'] = map(lambda x: x.encode('ascii', 'ignore'), 47 | blot['Buy/Sell']) 48 | #remove whitespaces 49 | blot['Buy/Sell'] = map(str.strip, blot['Buy/Sell']) 50 | 51 | #if the Sell values are not negative, make them negative 52 | if ((blot['Buy/Sell'] == 'Sell') & (blot['Shares'] > 0.)).any(): 53 | idx = (blot['Buy/Sell'] == 'Sell') & (blot['Shares'] > 0.)
54 | sub = blot[idx] 55 | sub['Shares'] = -1.*sub['Shares'] 56 | blot.update(sub) 57 | 58 | return blot 59 | 60 | def append_price_frame_with_dividends(ticker, start_date, end_date=None): 61 | """ 62 | Given a ticker, start_date, & end_date, return a 63 | :class:`pandas.DataFrame` with a Dividend Column appended to it 64 | 65 | :ARGS: 66 | 67 | ticker: :meth:`str` of ticker 68 | 69 | start_date: :class:`datetime.datetime` or string of format 70 | "mm/dd/yyyy" 71 | 72 | end_date: a :class:`datetime.datetime` or string of format 73 | "mm/dd/yyyy" 74 | 75 | :RETURNS: 76 | 77 | price_df: a :class:`pandas.DataFrame` with columns ['Close', 78 | 'Adj Close', 'Dividends'] 79 | 80 | .. code:: python 81 | 82 | import visualize_wealth.construct_portfolio as vwcp 83 | frame_with_divs = vwcp.append_price_frame_with_dividends('EEM', 84 | '01/01/2000', '01/01/2013') 85 | 86 | .. warning:: Requires Internet Connectivity 87 | 88 | Because the function calls the `Yahoo! API 89 | `_ internet connectivity is 90 | required for the function to work properly 91 | """ 92 | reader = pandas.io.data.DataReader 93 | 94 | if isinstance(start_date, str): 95 | start_date = datetime.datetime.strptime(start_date, "%m/%d/%Y") 96 | 97 | if end_date == None: 98 | end = datetime.datetime.today() 99 | elif isinstance(end_date, str): 100 | end = datetime.datetime.strptime(end_date, "%m/%d/%Y") 101 | else: 102 | end = end_date 103 | 104 | #construct the dividend data series 105 | b_str = 'http://ichart.finance.yahoo.com/table.csv?s=' 106 | 107 | if end_date == None: 108 | end_date = datetime.datetime.today() 109 | 110 | a = '&a=' + str(start_date.month) 111 | b = '&b=' + str(start_date.day) 112 | c = '&c=' + str(start_date.year) 113 | d = '&d=' + str(end.month) 114 | e = '&e=' + str(end.day) 115 | f = '&f=' + str(end.year) 116 | tail = '&g=v&ignore=.csv' 117 | url = b_str + ticker + a + b + c + d + e + f + tail 118 | socket = urllib2.urlopen(url) 119 | div_df = pandas.io.parsers.read_csv(socket, index_col = 0) 120 | 121 | price_df = reader(ticker, data_source = 'yahoo', 122 | start = start_date, end = end_date) 123 | 124 | return price_df.join(div_df).fillna(0.0) 125 | 126 | def calculate_splits(price_df, tol = .1): 127 | """ 128 | Given a ``price_df`` of the format 129 | :meth:`append_price_frame_with_dividends`, return a 130 | :class:`pandas.DataFrame` with a split factor column named 'Splits' 131 | 132 | :ARGS: 133 | 134 | price_df: a :class:`pandas.DataFrame` with columns ['Close', 135 | 'Adj Close', 'Dividends'] 136 | 137 | tol: :class:`float` of the tolerance to determine whether a 138 | split has occurred 139 | 140 | :RETURNS: 141 | 142 | price: :class:`pandas.DataFrame` with columns ['Close', 143 | 'Adj Close', 'Dividends', 'Splits'] 144 | 145 | .. code:: 146 | 147 | price_df_with_divs_and_split_ratios = vwcp.calculate_splits( 148 | price_df_with_divs, tol = 0.1) 149 | 150 | .. note:: Calculating Splits 151 | 152 | This function specifically looks at the ratios of close to 153 | adjusted close to determine whether a split has occurred.
To 154 | see the manual calculations of this function, see 155 | ``test_data/estimating when splits have 156 | occurred.xlsx`` 157 | 158 | """ 159 | div_mul = 1 - price_df['Dividends'].shift(-1).div(price_df['Close']) 160 | rev_cp = div_mul[::-1].cumprod()[::-1] 161 | rev_cp[-1] = 1.0 162 | est_adj = price_df['Adj Close'].div(rev_cp) 163 | eps = est_adj.div(price_df['Close']) 164 | spl_mul = eps.div(eps.shift(1)) 165 | did_split = numpy.abs(spl_mul - 1) > tol 166 | splits = spl_mul[did_split] 167 | for date in splits.index: 168 | if splits[date] > 1.0: 169 | splits[date] = numpy.round(splits[date], 0) 170 | elif splits[date] < 1.0: 171 | splits[date] = 1./numpy.round(1./splits[date], 0) 172 | splits.name = 'Splits' 173 | return price_df.join(splits) 174 | 175 | def blotter_and_price_df_to_cum_shares(blotter_df, price_df): 176 | """ 177 | Given a blotter :class:`pandas.DataFrame` of dates, purchases (+/-), 178 | and a price :class:`pandas.DataFrame` with 'Close', 'Adj Close', 'Dividends', 179 | & 'Splits', calculate the cumulative share balance for the position 180 | 181 | :ARGS: 182 | 183 | blotter_df: a :class:`pandas.DataFrame` where index is 184 | buy/sell dates 185 | 186 | price_df: a :class:`pandas.DataFrame` with columns ['Close', 187 | 'Adj Close', 'Dividends', 'Splits'] 188 | 189 | :RETURNS: 190 | 191 | :class:`pandas.DataFrame` containing contributions, withdrawals, 192 | price values 193 | 194 | .. code:: python 195 | 196 | agg_stats_for_single_asset = vwcp.blotter_and_price_df_to_cum_shares( 197 | single_asset_blotter, split_adj_price_frame) 198 | 199 | .. note:: Calculating Position Value 200 | 201 | The sole reason you can't take the number of trades for a 202 | given asset, apply a :meth:`cumsum`, and then multiply by 203 | 'Close' for a given day is because of splits. Therefore, once 204 | this function has run, taking the cumulative shares and then 205 | multiplying by close **is** an appropriate way to determine 206 | aggregate position value for any given day 207 | 208 | """ 209 | blotter_df = blotter_df.sort_index() 210 | #make sure all dates in the blotter file are also in the price file 211 | #consider, if those dates aren't in price frame, assign the 212 | #"closest date" value 213 | 214 | msg = "Buy/Sell Dates not in Price File" 215 | assert blotter_df.index.isin(price_df.index).all(), msg 216 | 217 | #now cumsum the buy/sell chunks and mul by splits for total shares 218 | bs_series = pandas.Series() 219 | start_dts = blotter_df.index 220 | end_dts = blotter_df.index[1:].append( 221 | pandas.DatetimeIndex([price_df.index[-1]])) 222 | 223 | dt_chunks = zip(start_dts, end_dts) 224 | end = 0.
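#descriptive note: each pass of the loop below covers one inter-trade
#chunk; the chunk is seeded with the prior cumulative balance plus the
#new trade, multiplied through any split factors via cumprod, and
#forward-filled so every day in the chunk carries a share count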
225 | 226 | for i, chunk in enumerate(dt_chunks): 227 | #print str(i) + ' of ' + str(len(dt_chunks)) + ' total' 228 | tmp = price_df[chunk[0]:chunk[1]][:-1] 229 | if chunk[1] == price_df.index[-1]: 230 | tmp = price_df[chunk[0]:chunk[1]] 231 | splits = tmp[pandas.notnull(tmp['Splits'])] 232 | vals = numpy.append(blotter_df['Buy/Sell'][chunk[0]] + end, 233 | splits['Splits'].values) 234 | dts = pandas.DatetimeIndex([chunk[0]]).append( 235 | splits['Splits'].index) 236 | tmp_series = pandas.Series(vals, index = dts) 237 | tmp_series = tmp_series.cumprod() 238 | tmp_series = tmp_series[tmp.index].ffill() 239 | bs_series = bs_series.append(tmp_series) 240 | end = bs_series[-1] 241 | 242 | bs_series.name = 'cum_shares' 243 | 244 | #construct the contributions, withdrawals, & cumulative investment 245 | 246 | #if a trade is missing a price, assign the 'Close' of that day 247 | no_price = blotter_df['Price'][pandas.isnull(blotter_df['Price'])] 248 | blotter_df.ix[no_price.index, 'Price'] = price_df.ix[no_price.index, 249 | 'Close'] 250 | 251 | contr = blotter_df['Buy/Sell'].mul(blotter_df['Price']) 252 | cum_inv = contr.cumsum() 253 | contr = contr[price_df.index].fillna(0.0) 254 | cum_inv = cum_inv[price_df.index].ffill() 255 | res = pandas.DataFrame({'cum_shares':bs_series, 256 | 'contr_withdrawal':contr, 'cum_investment':cum_inv}) 257 | 258 | return price_df.join(res) 259 | 260 | def construct_random_trades(split_df, num_trades): 261 | """ 262 | Create random trades on random trade dates, but never allow 263 | shares to go negative 264 | 265 | :ARGS: 266 | 267 | split_df: :class:`pandas.DataFrame` that has 'Close', 268 | 'Dividends', 'Splits' 269 | 270 | :RETURNS: 271 | 272 | blotter_frame: :class:`pandas.DataFrame` a blotter with random 273 | trades, num_trades 274 | 275 | .. note:: Why Create Random Trades? 276 | 277 | One disappointing aspect of any type of financial software is 278 | the fact that you **need** to have a portfolio to view what 279 | the software does (which never seemed like an appropriate 280 | "necessary" condition to me). Therefore, I've created a 281 | comprehensive ability to create random trades for single assets, 282 | as well as random portfolios of assets, to avoid the 283 | "unnecessary condition" of having a portfolio to understand 284 | how to analyze one. 285 | """ 286 | ind = numpy.sort(numpy.random.randint(0, len(split_df), 287 | size = num_trades)) 288 | #This unique makes sure there aren't double trade day entries 289 | #which breaks the function blotter_and_price_df_to_cum_shares 290 | ind = numpy.unique(ind) 291 | dates = split_df.index[ind] 292 | 293 | #construct random execution prices within the day's [Low, High] range 294 | prices = [] 295 | for date in dates: 296 | u_lim = split_df.loc[date, 'High'] 297 | l_lim = split_df.loc[date, 'Low'] 298 | prices.append(numpy.random.rand()*(u_lim - l_lim) + l_lim) 299 | 300 | trades = numpy.random.randint(-100, 100, size = len(ind)) 301 | trades = numpy.round(trades, -1) 302 | 303 | while numpy.any(trades.cumsum() < 0): 304 | trades[numpy.argmin(trades)] *= -1. 305 | 306 | return pandas.DataFrame({'Buy/Sell':trades, 'Price':prices}, 307 | index = dates) 308 | 309 | def blotter_to_cum_shares(blotter_series, ticker, start_date, 310 | end_date, tol): 311 | """ 312 | Aggregation function for :meth:`append_price_frame_with_dividends`, 313 | :meth:`calculate_splits`, and 314 | :meth:`blotter_and_price_df_to_cum_shares`. Only blotter, ticker, 315 | start_date, & end_date are needed.
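A minimal usage sketch (the trade dates and quantities below are
hypothetical, and internet connectivity is required):

.. code:: python

    import pandas
    import visualize_wealth.construct_portfolio as vwcp

    blotter = pandas.Series([100., -50.],
        index = pandas.to_datetime(['1/4/2010', '6/1/2012']))
    cum_df = vwcp.blotter_to_cum_shares(blotter, 'EEM',
        '01/01/2010', '01/01/2013', tol = .1)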
316 | 317 | :ARGS: 318 | 319 | blotter_series: a :class:`pandas.Series` with index of dates 320 | and values of quantity 321 | 322 | ticker: :class:`str` the ticker for which the buys and sells 323 | occurs 324 | 325 | start_date: a :class:`string` or :class:`datetime.datetime` 326 | 327 | end_date: :class:`string` or :class:`datetime.datetime` 328 | 329 | tol: :class:`float` the tolerance to find the split dates 330 | (.1 recommended) 331 | 332 | :RETURNS: 333 | 334 | :class:`pandas.DataFrame` containing contributions, 335 | withdrawals, price values 336 | 337 | .. warning:: Requires Internet Connectivity 338 | 339 | Because the function calls the `Yahoo! API 340 | `_ internet connectivity is required 341 | for the function to work properly 342 | 343 | """ 344 | 345 | price_df = append_price_frame_with_dividends(ticker, start_date, 346 | end_date) 347 | 348 | split_df = calculate_splits(price_df) 349 | return blotter_and_price_df_to_cum_shares(blotter_series, split_df) 350 | 351 | def generate_random_asset_path(ticker, start_date, num_trades): 352 | """ 353 | Allows the user to input a ticker, start date, and num_trades to 354 | generate a :class:`pandas.DataFrame` with columns 'Open', 'Close', 355 | 'cum_withdrawals', 'cum_shares' (i.e. bypasses the need for a price 356 | :class:`pandas.DataFrame` to generate an asset path, as is required 357 | in :meth:`construct_random_trades`) 358 | 359 | :ARGS: 360 | 361 | ticker: :class:`string` of the ticker to generate the path 362 | 363 | start_date: :class:`string` of format 'mm/dd/yyyy' or 364 | :class:`datetime` 365 | 366 | num_trades: :class:`int` of the number of trades to generate 367 | 368 | :RETURNS: 369 | 370 | :class:`pandas.DataFrame` with the additional columns 371 | 'cum_shares', 'contr_withdrawal', 'Splits', 'Dividends' 372 | 373 | .. warning:: Requires Internet Connectivity 374 | 375 | Because the function calls the `Yahoo! API 376 | `_ internet connectivity is required 377 | for the function to work properly 378 | """ 379 | if isinstance(start_date, str): 380 | start_date = datetime.datetime.strptime(start_date, "%m/%d/%Y") 381 | end_date = datetime.datetime.today() 382 | prices = append_price_frame_with_dividends(ticker, start_date) 383 | blotter = construct_random_trades(prices, num_trades) 384 | #blotter.to_csv('../tests/' + ticker + '.csv') 385 | return blotter_to_cum_shares(blotter_series = blotter, 386 | ticker = ticker, start_date = start_date, 387 | end_date = end_date, tol = .1) 388 | 389 | def generate_random_portfolio_blotter(tickers, num_trades): 390 | """ 391 | :meth:`generate_random_asset_path`, for multiple assets, given a 392 | list of tickers and a number of trades (to be used for all tickers). 393 | Execution prices will be the 'Close' of that ticker in the price 394 | DataFrame that is collected 395 | 396 | :ARGS: 397 | 398 | tickers: a :class:`list` with the tickers to be used 399 | 400 | num_trades: :class:`integer`, the number of trades to randomly 401 | generate for each ticker 402 | 403 | :RETURNS: 404 | 405 | :class:`pandas.DataFrame` with columns 'Ticker', 'Buy/Sell' 406 | (+ for buys, - for sells) and 'Price' 407 | 408 | .. warning:: Requires Internet Connectivity 409 | 410 | Because the function calls the `Yahoo!
API 411 | `_ internet connectivity is required 412 | for the function to work properly 413 | 414 | """ 415 | blot_d = {} 416 | price_d = {} 417 | for ticker in tickers: 418 | tmp = append_price_frame_with_dividends( 419 | ticker, start_date = datetime.datetime(1990, 1, 1)) 420 | price_d[ticker] = calculate_splits(tmp) 421 | blot_d[ticker] = construct_random_trades(price_d[ticker], 422 | num_trades) 423 | ind = [] 424 | agg_d = {'Ticker':[], 'Buy/Sell':[], 'Price':[]} 425 | 426 | for ticker in tickers: 427 | for date in blot_d[ticker].index: 428 | ind.append(date) 429 | agg_d['Ticker'].append(ticker) 430 | agg_d['Buy/Sell'].append( 431 | blot_d[ticker].loc[date, 'Buy/Sell']) 432 | agg_d['Price'].append( 433 | blot_d[ticker].loc[date, 'Price']) 434 | 435 | return pandas.DataFrame(agg_d, index = ind) 436 | 437 | def panel_from_blotter(blotter_df): 438 | """ 439 | The aggregation function to construct a portfolio given a blotter 440 | of tickers, trades, and number of shares. 441 | 442 | :ARGS: 443 | 444 | agg_blotter_df: a :class:`pandas.DataFrame` with columns 445 | ['Ticker', 'Buy/Sell', 'Price'], where the 'Buy/Sell' column 446 | is the quantity of shares, (+) for buy, (-) for sell 447 | 448 | :RETURNS: 449 | 450 | :class:`pandas.Panel` with dimensions [tickers, dates, 451 | price data] 452 | 453 | .. note:: What to Do with your Panel 454 | 455 | The :class:`pandas.Panel` returned by this function has all of 456 | the necessary information to do some fairly exhaustive 457 | analysis. Cumulative investment, portfolio value (simply the 458 | ``cum_shares``*``close`` for all assets), closes, opens, etc. 459 | You've got a world of information about "your portfolio" with 460 | this object... get diggin! 461 | """ 462 | tickers = pandas.unique(blotter_df['Ticker']) 463 | start_date = blotter_df.sort_index().index[0] 464 | end_date = datetime.datetime.today() 465 | val_d = {} 466 | for ticker in tickers: 467 | blotter_series = blotter_df[blotter_df['Ticker'] == ticker] 468 | blotter_series = blotter_series.sort_index() 469 | val_d[ticker] = blotter_to_cum_shares(blotter_series, ticker, 470 | start_date, end_date, tol = .1) 471 | 472 | return pandas.Panel(val_d) 473 | 474 | def fetch_data_from_store_weight_alloc_method(weight_df, store_path): 475 | """ 476 | To speed up calculation time and allow for off-line functionality, 477 | provide a :class:`pandas.DataFrame` weight_df and point the function 478 | to an HDFStore 479 | 480 | :ARGS: 481 | 482 | weight_df: a :class:`pandas.DataFrame` with dates as index and 483 | tickers as columns 484 | 485 | store_path: :class:`string` of the location to an HDFStore 486 | 487 | :RETURNS: 488 | 489 | :class:`pandas.Panel` where: 490 | 491 | * :meth:`panel.items` are tickers 492 | 493 | * :meth:`panel.major_axis` dates 494 | 495 | * :meth:`panel.minor_axis` price information, specifically: 496 | ['Open', 'Close', 'Adj Close'] 497 | 498 | """ 499 | 500 | store = _open_store(store_path) 501 | beg_port = weight_df.index.min() 502 | 503 | d = {} 504 | for ticker in weight_df.columns: 505 | try: 506 | d[ticker] = store.get(ticker) 507 | except KeyError as key: 508 | logging.exception("store.get({0}) ticker failed".format(ticker)) 509 | 510 | panel = pandas.Panel(d) 511 | 512 | #Check to make sure the earliest "full data date" is b/f first trade 513 | #first_price = max(map(lambda x: panel.loc[x, :, 514 | # 'Adj Close'].dropna().index.min(), panel.items)) 515 | 516 | #print the number of consecutive nans 517 | #for ticker in weight_df.columns: 518 |
# print ticker + " " + str(vwa.consecutive(panel.loc[ticker, 519 | # first_price:, 'Adj Close'].isnull().astype(int)).max()) 520 | 521 | store.close() 522 | return panel.ffill() 523 | 524 | def fetch_data_for_weight_allocation_method(weight_df): 525 | """ 526 | To be used with `The Weight Allocation Method 527 | <./readme.html#the-weight-allocation-method>`_. Given a weight_df 528 | with index of allocation dates and columns of percentage 529 | allocations, fetch the data using Yahoo!'s API and return a panel 530 | of dimensions [tickers, dates, price data], where ``price_data`` 531 | has columns ``['Open', 'Close','Adj Close'].`` 532 | 533 | :ARGS: 534 | 535 | weight_df: a :class:`pandas.DataFrame` with dates as index and 536 | tickers as columns 537 | 538 | :RETURNS: 539 | 540 | :class:`pandas.Panel` where: 541 | 542 | * :meth:`panel.items` are tickers 543 | 544 | * :meth:`panel.major_axis` dates 545 | 546 | * :meth:`panel.minor_axis` price information, specifically: 547 | ['Open', 'Close', 'Adj Close'] 548 | 549 | .. warning:: Requires Internet Connectivity 550 | 551 | Because the function calls the `Yahoo! API 552 | `_ internet connectivity is required 553 | for the function to work properly 554 | """ 555 | reader = pandas.io.data.DataReader 556 | beg_port = weight_df.index.min() 557 | 558 | d = {} 559 | for ticker in weight_df.columns: 560 | try: 561 | d[ticker] = reader(ticker, 'yahoo', start = beg_port) 562 | except: 563 | print "didn't work for "+ticker+"!" 564 | 565 | #pull the data from Yahoo! 566 | panel = pandas.Panel(d) 567 | 568 | #Check to make sure the earliest "full data date" is b/f first trade 569 | #first_price = max(map(lambda x: panel.loc[x, :, 570 | # 'Adj Close'].dropna().index.min(), panel.items)) 571 | 572 | #print the number of consecutive nans 573 | #for ticker in weight_df.columns: 574 | # print ticker + " " + str(vwa.consecutive(panel.loc[ticker, 575 | # first_price:, 'Adj Close'].isnull().astype(int)).max()) 576 | 577 | return panel.ffill() 578 | 579 | def fetch_data_from_store_initial_alloc_method( 580 | initial_weights, store_path, start_date = '01/01/2000'): 581 | """ 582 | To speed up calculation time and allow for off-line functionality, 583 | provide a :class:`pandas.Series` initial_weights and point the function 584 | to an HDFStore 585 | 586 | :ARGS: 587 | 588 | initial_weights: a :class:`pandas.Series` with tickers as index 589 | and weights as values 590 | 591 | store_path: :class:`string` of the location to an HDFStore 592 | 593 | :RETURNS: 594 | 595 | :class:`pandas.Panel` where: 596 | 597 | * :meth:`panel.items` are tickers 598 | 599 | * :meth:`panel.major_axis` dates 600 | 601 | * :meth:`panel.minor_axis` price information, specifically: 602 | ['Open', 'Close', 'Adj Close'] 603 | 604 | """ 605 | msg = "Not all tickers in HDFStore" 606 | store = pandas.HDFStore(store_path) 607 | #assert vwu.check_store_for_tickers(initial_weights.index, store), msg 608 | #beg_port = datetime.sdat 609 | 610 | d = {} 611 | for ticker in initial_weights.index: 612 | try: 613 | d[ticker] = store.get(ticker) 614 | except KeyError as key: 615 | logging.exception("store.get({0}) ticker failed".format(ticker)) 616 | 617 | store.close() 618 | panel = pandas.Panel(d) 619 | 620 | #Check to make sure the earliest "full data date" is b/f first trade 621 | #first_price = max(map(lambda x: panel.loc[x, :, 622 | # 'Adj Close'].dropna().index.min(), panel.items)) 623 | 624 | #print the number of consecutive nans 625 | #for ticker in initial_weights.index: 626 | # print ticker + " " +
str(vwa.consecutive(panel.loc[ticker, 627 | # first_price:, 'Adj Close'].isnull().astype(int)).max()) 628 | 629 | return panel.ffill() 630 | 631 | def fetch_data_for_initial_allocation_method(initial_weights, 632 | start_date = '01/01/2000'): 633 | """ 634 | To be used with `The Initial Allocation Method 635 | <./readme.html#the-initial-allocation-rebalancing-method>`_ 636 | Given initial_weights :class:`pandas.Series` with index of tickers 637 | and values of initial allocation percentages, fetch the data using 638 | Yahoo!'s API and return a panel of dimensions [tickers, dates, 639 | price data], where ``price_data`` has columns ``['Open', 'Close', 640 | 'Adj Close'].`` 641 | 642 | :ARGS: 643 | 644 | initial_weights :class:`pandas.Series` with tickers as index 645 | and weights as values 646 | 647 | :RETURNS: 648 | 649 | :class:`pandas.Panel` where: 650 | 651 | * :meth:`panel.items` are tickers 652 | 653 | * :meth:`panel.major_axis` dates 654 | 655 | * :meth:`panel.minor_axis` price information, specifically: 656 | ['Open', 'Close', 'Adj Close'] 657 | """ 658 | reader = pandas.io.data.DataReader 659 | d_0 = datetime.datetime.strptime(start_date, "%m/%d/%Y") 660 | 661 | d = {} 662 | for ticker in initial_weights.index: 663 | try: 664 | d[ticker] = reader(ticker, 'yahoo', start = d_0) 665 | except: 666 | print "Didn't work for " + ticker + "!" 667 | 668 | panel = pandas.Panel(d) 669 | 670 | #Check to make sure the earliest "full data date" is b/f first trade 671 | #first_price = max(map(lambda x: panel.loc[x, :, 672 | # 'Adj Close'].dropna().index.min(), panel.items)) 673 | 674 | #print the number of consecutive nans 675 | #for ticker in initial_weights.index: 676 | # print ticker + " " + str(vwa.consecutive(panel.loc[ticker, 677 | # first_price: , 'Adj Close'].isnull().astype(int)).max()) 678 | 679 | return panel.ffill() 680 | 681 | 682 | def panel_from_weight_file(weight_df, price_panel, start_value): 683 | """ 684 | Returns a :class:`pandas.Panel` with the intermediate calculation 685 | steps of n0, c0_ac, and adj_q to calculate a portfolio's adjusted 686 | price path when provided a pandas.DataFrame of weight allocations and a 687 | starting value of the index 688 | 689 | :ARGS: 690 | 691 | weight_df of :class:`pandas.DataFrame` of a weight allocation 692 | with tickers for columns, index of dates and weight allocations 693 | to each of the tickers 694 | 695 | price_panel of :class:`pandas.Panel` with dimensions [tickers, 696 | index, price data] 697 | 698 | :RETURNS: 699 | 700 | :class:`pandas.Panel` with dimensions (tickers, dates, 701 | price data) 702 | 703 | """ 704 | #cols correspond to 'value_calcs!'
in "panel from weight file test.xlsx" 705 | cols = ['ac_c', 'c0_ac0', 'n0', 'Adj_Q'] 706 | 707 | #create the intervals spanning the trade dates 708 | 709 | index = price_panel.major_axis 710 | w_ind = weight_df.index 711 | time_chunks = tradeplus_tchunks(weight_index = w_ind, 712 | price_index = index 713 | ) 714 | 715 | p_val = start_value 716 | l = [] 717 | f_dt = w_ind[0] 718 | 719 | #for beg, fin in zip(int_beg, int_fin): 720 | 721 | for beg, fin in time_chunks: 722 | 723 | close = price_panel.loc[:, beg:fin, 'Close'] 724 | opn = price_panel.loc[:, beg:fin, 'Open'] 725 | adj = price_panel.loc[:, beg:fin, 'Adj Close'] 726 | 727 | n = len(close) 728 | cl_f = price_panel.loc[:, f_dt, 'Close'] 729 | ac_f = price_panel.loc[:, f_dt, 'Adj Close'] 730 | 731 | c0_ac0 = cl_f.div(ac_f) 732 | n0 = p_val*weight_df.xs(f_dt).div(cl_f) 733 | 734 | ac_c = adj.div(close) 735 | 736 | c0_ac0 = pandas.DataFrame(numpy.tile(c0_ac0, [n, 1]), 737 | index = close.index, 738 | columns = c0_ac0.index 739 | ) 740 | 741 | n0 = pandas.DataFrame(numpy.tile(n0, [n, 1]), 742 | index = close.index, 743 | columns = n0.index 744 | ) 745 | 746 | adj_q = c0_ac0.mul(ac_c).mul(n0) 747 | p_val = adj_q.xs(fin).mul(close.xs(fin)).sum() 748 | vac = adj_q.mul(close) 749 | vao = adj_q.mul(opn) 750 | 751 | panel = pandas.Panel.from_dict({'ac_c': ac_c, 752 | 'c0_ac0': c0_ac0, 753 | 'n0': n0, 754 | 'Adj_Q': adj_q, 755 | 'Value at Close': vac, 756 | 'Value at Open': vao} 757 | ) 758 | 759 | #set items and minor appropriately for pfp constructors 760 | panel = panel.transpose(2, 1, 0) 761 | l.append(panel) 762 | f_dt = fin 763 | 764 | agg = pandas.concat(l, axis = 1) 765 | return pandas.concat([agg, price_panel], 766 | join = 'inner', 767 | axis = 2 768 | ) 769 | 770 | def mngmt_fee(price_series, bps_cost, frequency): 771 | """ 772 | Extract management fees from repr(price_series) 773 | of repr(bps_cost) every repr(frequency) 774 | 775 | :ARGS: 776 | 777 | price_series: :class:`DataFrame` of 'Open', 'Close' 778 | or :class:`pandas.Series` of 'Close' 779 | 780 | bps_cost: :class:`float` of the management fee 781 | in bps 782 | 783 | frequency: :class:`string` of the frequency to 784 | charge the management fee in ['yearly', 'quarterly', 785 | 'monthly', 'daily'] 786 | 787 | :RETURNS: 788 | 789 | same as repr(price_series) 790 | """ 791 | def time_dist(date, interval): 792 | """ 793 | Return the proportion of time left to 794 | the end of the interval, from the current 795 | date 796 | """ 797 | 798 | return None 799 | 800 | ln = lambda x, y: x.div(y).apply(numpy.log) 801 | 802 | fac = {'daily': 252., 803 | 'weekly': 52., 804 | 'monthly': 12., 805 | 'quarterly': 4., 806 | 'yearly': 1. 807 | } 808 | 809 | per_fee = bps_cost/10000./fac[frequency] 810 | 811 | if frequency is 'daily': 812 | p_ln = ln(x = price_series, 813 | y = price_series.shift(1) 814 | ) 815 | 816 | p_ln[0] = 0. 817 | 818 | fee = numpy.log(1. 
- per_fee) 819 | 820 | # charge the daily fee on the first day 821 | ret_p = price_series[0] 822 | 823 | cum_ret = (p_ln + fee).cumsum() 824 | return ret_p*numpy.exp(cum_ret) 825 | 826 | else: 827 | tcs = zipped_time_chunks(price_series.index, 828 | frequency 829 | ) 830 | 831 | p_o, p_e = tcs[0][0], tcs[0][1] 832 | rem_t = (p_e - p_o).days 833 | return None 834 | 835 | 836 | # determine the first fee 837 | 838 | # extract the first fee 839 | 840 | # create the log changes 841 | 842 | # create the fee costs 843 | 844 | # sum them 845 | 846 | # re-create the price series 847 | 848 | # return None 849 | 850 | def _tc_helper(weight_df, share_panel, tau, meth): 851 | """ 852 | Helper function for the tc_* functions 853 | 854 | Estimate the cumulative rolling transaction costs by ticker using 855 | either the cents-per-share or basis-point method of calculation. Can 856 | be used to directly subtract against tickers / asset classes to 857 | determine the asset and asset class impact of transaction costs. 858 | 859 | :ARGS: 860 | 861 | weight_df: :class:`pandas.DataFrame` weight allocation 862 | 863 | share_panel: :class:`pandas.Panel` with dimensions 864 | (tickers, dates, price/share data) 865 | 866 | tau: :class:`float` of the cost per share or basis points 867 | 868 | meth: :class:`string` in ['bps', 'cps'] 869 | 870 | :RETURNS: 871 | 872 | :class:`pandas.DataFrame` of the cumulative transaction 873 | cost for each ticker 874 | """ 875 | def cps_cost(**kwargs): 876 | shares = kwargs['shares'] 877 | shares_prev = kwargs['shares_prev'] 878 | tau = kwargs['tau']/100. 879 | 880 | share_diff = abs(shares - shares_prev) 881 | return share_diff * tau 882 | 883 | def bps_cost(**kwargs): 884 | shares = kwargs['shares'] 885 | shares_prev = kwargs['shares_prev'] 886 | prices = kwargs['prices'] 887 | tau = kwargs['tau']/10000. 888 | 889 | share_diff = abs(shares - shares_prev) 890 | return share_diff.mul(prices) * tau 891 | 892 | meth_d = {'cps': cps_cost, 893 | 'bps': bps_cost 894 | } 895 | 896 | adj_q = share_panel.loc[:, :, 'Adj_Q'] 897 | price = share_panel.loc[:, :, 'Close'] 898 | 899 | tchunks = tradeplus_tchunks(weight_index = weight_df.index, 900 | price_index = share_panel.major_axis 901 | ) 902 | 903 | #slight finagle to get the tradeplus to be what we need 904 | sper, fper = zip(*tchunks) 905 | sper = sper[1:] 906 | fper = fper[:-1] 907 | 908 | t_o = weight_df.index[0] 909 | 910 | d = {t_o: meth_d[meth](**{'shares': adj_q.loc[t_o, :], 911 | 'shares_prev': 0., 912 | 'prices': price.loc[t_o, :], 913 | 'tau': tau} 914 | ) 915 | } 916 | 917 | for beg, fin in zip(fper, sper): 918 | d[fin] = meth_d[meth](**{'shares': adj_q.loc[fin, :], 919 | 'shares_prev': adj_q.loc[beg, :], 920 | 'tau':tau, 921 | 'prices': price.loc[fin, :]} 922 | ) 923 | 924 | tcost = pandas.DataFrame(d).transpose() 925 | cumcost = tcost.reindex(share_panel.major_axis) 926 | return cumcost.fillna(0.) 927 | 928 | def tc_cps(weight_df, share_panel, cps = 10.): 929 | """ 930 | Estimate the cumulative rolling transaction costs by ticker using 931 | the cents per share method of calculation. Can 932 | be used to directly subtract against tickers / asset classes to 933 | determine the asset and asset class impact of transaction costs.
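A hypothetical call, assuming ``weight_df`` and ``panel`` were
produced by :meth:`panel_from_weight_file`:

.. code:: python

    costs = vwcp.tc_cps(weight_df, panel, cps = 10.)
    net_index = vwcp.net_tcs(costs,
        vwcp.pfp_from_weight_file(panel)['Close'])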
934 | 935 | :ARGS: 936 | 937 | weight_df: :class:`pandas.DataFrame` weight allocation 938 | 939 | share_panel: :class:`pandas.Panel` with dimensions 940 | (tickers, dates, price/share data) 941 | 942 | cps: :class:`float` of the transaction cost in cents per share 943 | 944 | :RETURNS: 945 | 946 | :class:`pandas.DataFrame` of the cumulative transaction 947 | cost for each ticker 948 | """ 949 | return _tc_helper(weight_df = weight_df, 950 | share_panel = share_panel, 951 | meth = 'cps', 952 | tau = cps 953 | ) 954 | 955 | def tc_bps(weight_df, share_panel, bps = 10.): 956 | """ 957 | Estimate the cumulative rolling transaction costs by ticker as 958 | basis points of the total value of the transaction. Can 959 | be used to directly subtract against tickers / asset classes to 960 | determine the asset and asset class impact of transaction costs. 961 | 962 | :ARGS: 963 | 964 | weight_df: :class:`pandas.DataFrame` weight allocation 965 | 966 | share_panel: :class:`pandas.Panel` with dimensions 967 | (tickers, dates, price/share data) 968 | 969 | bps: :class:`float` of the transaction cost per trade, in basis points 970 | 971 | :RETURNS: 972 | 973 | :class:`pandas.DataFrame` of the cumulative transaction 974 | cost for each ticker 975 | """ 976 | return _tc_helper(weight_df = weight_df, 977 | share_panel = share_panel, 978 | meth = 'bps', 979 | tau = bps 980 | ) 981 | 982 | def net_tcs(tc_df, price_index): 983 | """ 984 | Incorporate transaction costs calculated using tc_cps or 985 | tc_bps into the value of an index (i.e. return the index 986 | value had transaction costs been accounted for using the 987 | given method). 988 | 989 | :ARGS: 990 | 991 | tc_df: :class:`pandas.DataFrame` of transaction costs using 992 | either tc_cps or tc_bps 993 | 994 | price_index: :class:`pandas.Series` on which the 995 | transaction costs were calculated 996 | 997 | :RETURNS: 998 | 999 | :class:`pandas.Series` of the adjusted index value 1000 | 1001 | """ 1002 | #log returns are so ugly 1003 | ln = lambda x, y: x.div(y).apply(numpy.log) 1004 | 1005 | tc_sum = tc_df.sum(axis = 1) 1006 | tc_ln = numpy.log(1. - tc_sum.div(price_index)) 1007 | 1008 | p_ln = ln(x = price_index, 1009 | y = price_index.shift(1) 1010 | ) 1011 | 1012 | ln_sum = tc_ln.add(p_ln) 1013 | ln_sum[0] = 0. 1014 | p_o = price_index[0] - tc_sum[0] 1015 | return p_o * numpy.exp(ln_sum.cumsum()) 1016 | 1017 | def weight_df_from_initial_weights(weight_series, price_panel, 1018 | rebal_frequency, start_value = 1000., start_date = None): 1019 | """ 1020 | Returns a :class:`pandas.DataFrame` of weights that are used 1021 | to construct the portfolio.
Useful in determining tactical 1022 | over / under weightings relative to other portfolios 1023 | 1024 | :ARGS: 1025 | 1026 | weight_series of :class:`pandas.Series` of a weight allocation 1027 | with an index of tickers, and a name of the initial allocation 1028 | 1029 | price_panel of type :class:`pandas.Panel` with dimensions 1030 | [tickers, index, price data] 1031 | 1032 | start_value: of type :class:`float` of the value to start the 1033 | index 1034 | 1035 | rebal_frequency: :class:`string` of 'weekly', 'monthly', 1036 | 'quarterly', 'yearly' 1037 | 1038 | :RETURNS: 1039 | 1040 | weight_df: of type :class:`pandas.DataFrame` of the rebalance 1041 | weights and dates 1042 | """ 1043 | 1044 | return initial_weight_help_fn(weight_series, price_panel, 1045 | rebal_frequency, start_value, start_date, ret_val = 'weights') 1046 | 1047 | 1048 | def panel_from_initial_weights(weight_series, price_panel, 1049 | rebal_frequency, start_value = 1000, start_date = None): 1050 | """ 1051 | Returns a :class:`pandas.Panel` of the portfolio construction values when 1052 | provided a pandas.Series of initial weight allocations, the date of 1053 | those initial weight allocations (series.name), a starting value 1054 | of the index, and a rebalance frequency (this is the classical 1055 | "static" construction methodology, rebalancing at some specified 1056 | interval) 1057 | 1058 | :ARGS: 1059 | 1060 | weight_series of :class:`pandas.Series` of a weight allocation 1061 | with an index of tickers, and a name of the initial allocation 1062 | 1063 | price_panel of type :class:`pandas.Panel` with dimensions 1064 | [tickers, index, price data] 1065 | 1066 | start_value: of type :class:`float` of the value to start the 1067 | index 1068 | 1069 | rebal_frequency: :class:`string` of 'weekly', 'monthly', 1070 | 'quarterly', 'yearly' 1071 | 1072 | :RETURNS: 1073 | 1074 | :class:`pandas.Panel` of the same form as that returned by 1075 | :meth:`panel_from_weight_file` 1076 | """ 1077 | return initial_weight_help_fn(weight_series, price_panel, 1078 | rebal_frequency, start_value, start_date, ret_val = 'panel') 1079 | 1080 | def initial_weight_help_fn(weight_series, price_panel, 1081 | rebal_frequency, start_value = 1000., start_date = None, ret_val = 'panel'): 1082 | 1083 | #determine the first valid date and make it the start_date 1084 | first_valid = numpy.max(price_panel.loc[:, :, 'Close'].apply( 1085 | pandas.Series.first_valid_index)) 1086 | 1087 | if start_date == None: 1088 | d_0 = first_valid 1089 | index = price_panel.loc[:, d_0:, :].major_axis 1090 | 1091 | else: 1092 | #make sure the start_date begins after all assets are valid 1093 | if isinstance(start_date, str): 1094 | start_date = datetime.datetime.strptime(start_date, 1095 | "%m/%d/%Y") 1096 | assert start_date > first_valid, ( 1097 | "first_valid index doesn't occur until after start_date") 1098 | index = price_panel.loc[:, start_date:, :].major_axis 1099 | 1100 | #the weight_series must be of type Series, but sometimes can be a 1101 | #``pandas.DataFrame`` with len(columns) = 1 1102 | msg = "Initial Allocation is not Series" 1103 | if isinstance(weight_series, pandas.DataFrame): 1104 | assert len(weight_series.columns) == 1, msg 1105 | weight_series = weight_series[weight_series.columns[0]] 1106 | 1107 | 1108 | interval_dict = {'weekly':lambda x: x[:-1].week != x[1:].week, 1109 | 'monthly': lambda x: x[:-1].month != x[1:].month, 1110 | 'quarterly':lambda x: x[:-1].quarter != x[1:].quarter, 1111 | 'yearly':lambda x: x[:-1].year != x[1:].year} 1112 | 1113 | #create a boolean
array of rebalancing dates 1114 | ind = numpy.append(True, interval_dict[rebal_frequency](index)) 1115 | weight_df = pandas.DataFrame(numpy.tile(weight_series.values, 1116 | [len(index[ind]), 1]), index = index[ind], 1117 | columns = weight_series.index) 1118 | 1119 | if ret_val == 'panel': 1120 | return panel_from_weight_file(weight_df, price_panel, start_value) 1121 | else: 1122 | return weight_df 1123 | 1124 | 1125 | 1126 | def pfp_from_weight_file(panel_from_weight_file): 1127 | """ 1128 | pfp stands for "Portfolio from Panel", so this takes the final 1129 | ``pandas.Panel`` that is created in the portfolio construction 1130 | process when a weight file is given and generates a portfolio path 1131 | of 'Open' and 'Close' 1132 | 1133 | :ARGS: 1134 | 1135 | panel_from_weight_file: a :class:`pandas.Panel` that was 1136 | generated using ``panel_from_weight_file`` 1137 | 1138 | :RETURNS: 1139 | 1140 | portfolio prices in a :class:`pandas.DataFrame` with columns 1141 | ['Open', 'Close'] 1142 | 1143 | .. note:: The Holy Grail of the Portfolio Path 1144 | 1145 | The portfolio path is what goes into all of the :mod:`analyze` 1146 | functions. So once the `pfp_from_`... has been created, you've 1147 | got all of the necessary bits to begin calculating performance 1148 | metrics on a portfolio 1149 | """ 1150 | adj_q = panel_from_weight_file.loc[:, :, 'Adj_Q'] 1151 | close = panel_from_weight_file.loc[:, :, 'Close'] 1152 | opn = panel_from_weight_file.loc[:, :, 'Open'] 1153 | 1154 | ind_close = adj_q.mul(close).sum(axis = 1) 1155 | ind_open = adj_q.mul(opn).sum(axis = 1) 1156 | port_df = pandas.DataFrame({'Open': ind_open, 1157 | 'Close': ind_close} 1158 | ) 1159 | 1160 | return port_df 1161 | 1162 | def pfp_from_blotter(panel_from_blotter, start_value = 1000.): 1163 | """ 1164 | pfp stands for "Portfolio from Panel", so this takes the final 1165 | :class:`pandas.Panel` that is created in the portfolio construction 1166 | process when a blotter is given and generates a portfolio path of 1167 | 'Open' and 'Close' 1168 | 1169 | :ARGS: 1170 | 1171 | panel_from_blotter: a :class:`pandas.Panel` that was generated 1172 | using :meth:`panel_from_blotter` 1173 | 1174 | start_value: :class:`float` of the starting value, default=1000 1175 | 1176 | :RETURNS: 1177 | 1178 | portfolio prices in a :class:`pandas.DataFrame` with columns 1179 | ['Open', 'Close'] 1180 | 1181 | .. note:: The Holy Grail of the Portfolio Path 1182 | 1183 | The portfolio path is what goes into all of the :mod:`analyze` 1184 | functions. So once the `pfp_from_`... has been created, 1185 | you've got all of the necessary bits to begin calculating 1186 | performance metrics on your portfolio! 1187 | 1188 | .. note:: Another way to think of Portfolio Path 1189 | 1190 | This "Portfolio Path" is really nothing more than a series of 1191 | prices that, should you have made the trades given in the 1192 | blotter, would have been the experience of someone 1193 | investing `start_value` in your strategy when your strategy 1194 | first begins, up until today.
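A hypothetical end-to-end sketch using the random-blotter helpers
above (requires internet connectivity):

.. code:: python

    blotter = vwcp.generate_random_portfolio_blotter(['EEM', 'TIP'], 20)
    panel = vwcp.panel_from_blotter(blotter)
    port_df = vwcp.pfp_from_blotter(panel, start_value = 1000.)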
1195 | """ 1196 | 1197 | panel = panel_from_blotter.copy() 1198 | index = panel.major_axis 1199 | price_df = pandas.DataFrame(numpy.zeros([len(index), 2]), 1200 | index = index, columns = ['Close', 'Open']) 1201 | 1202 | price_df.loc[index[0], 'Close'] = start_value 1203 | 1204 | #first determine the log returns for the series 1205 | cl_to_cl_end_val = panel.ix[:, :, 'cum_shares'].mul( 1206 | panel.ix[:, :, 'Close']).add(panel.ix[:, :, 'cum_shares'].mul( 1207 | panel.ix[:, :, 'Dividends'])).sub( 1208 | panel.ix[:, :, 'contr_withdrawal']).sum(axis = 1) 1209 | 1210 | cl_to_cl_beg_val = panel.ix[:, :, 'cum_shares'].mul( 1211 | panel.ix[:, :, 'Close']).add(panel.ix[:, :, 'cum_shares'].mul( 1212 | panel.ix[:, :, 'Dividends'])).sum(axis = 1).shift(1) 1213 | 1214 | op_to_cl_end_val = panel.ix[:, :, 'cum_shares'].mul( 1215 | panel.ix[:, :, 'Close']).add(panel.ix[:, :, 'cum_shares'].mul( 1216 | panel.ix[:, :, 'Dividends'])).sum(axis = 1) 1217 | 1218 | op_to_cl_beg_val = panel.ix[:, :, 'cum_shares'].mul( 1219 | panel.ix[:, :, 'Open']).sum(axis = 1) 1220 | 1221 | cl_to_cl = cl_to_cl_end_val.div(cl_to_cl_beg_val).apply(numpy.log) 1222 | op_to_cl = op_to_cl_end_val.div(op_to_cl_beg_val).apply(numpy.log) 1223 | price_df.loc[index[1]:, 'Close'] = start_value*numpy.exp( 1224 | cl_to_cl[1:].cumsum()) 1225 | price_df['Open'] = price_df['Close'].div(numpy.exp(op_to_cl)) 1226 | 1227 | return price_df 1228 | 1229 | 1230 | if __name__ == '__main__': 1231 | import sys 1232 | usage = sys.argv[0] + " file_loc" 1233 | description = "description" 1234 | parser = argparse.ArgumentParser( 1235 | description = description, usage = usage) 1236 | parser.add_argument('arg_1', nargs = 1, type = str, help = 'help_1') 1237 | parser.add_argument('arg_2', nargs = 1, type = int, help = 'help_2') 1238 | args = parser.parse_args() 1239 | -------------------------------------------------------------------------------- /visualize_wealth/utils.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | # encoding: utf-8 3 | """ 4 | .. module:: visualize_wealth.utils.py 5 | 6 | .. moduleauthor:: Benjamin M. Gross 7 | 8 | """ 9 | import datetime 10 | import logging 11 | import pandas 12 | import numpy 13 | import os 14 | 15 | def append_dfs(prv_df, nxt_df): 16 | """ 17 | Return a single, sorted :class:`DataFrame` where prv_df 18 | is "stacked" with nxt_df and any overlapping dates 19 | are removed 20 | """ 21 | ind_a, ind_b = prv_df.index, nxt_df.index 22 | apnd = nxt_df.loc[~ind_b.isin(ind_a), :] 23 | return prv_df.append(apnd) 24 | 25 | def exchange_acs_for_ticker(weight_df, ticker_class_dict, date, asset_class, ticker, weight): 26 | """ 27 | It's common to wonder: what would happen if I took all tickers within a 28 | given asset class, zeroed them out, and used some other ticker beginning 29 | at some date.
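A hypothetical call (the date, asset class, ticker, and weight below
are illustrative), swapping every 'US Equity' holding for 'VTI' at a
.4 weight from 2010 forward:

.. code:: python

    new_weights = exchange_acs_for_ticker(weight_df, ticker_class_dict,
        '01/04/2010', 'US Equity', 'VTI', .4)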
30 | 31 | :ARGS: 32 | 33 | weight_df: :class:`DataFrame` of the weight allocation frame 34 | 35 | ticker_class_dict: :class:`dictionary` of the tickers and the asset 36 | classes of each ticker 37 | 38 | date: :class:`string` of the date to zero out the existing tickers 39 | within an asset class and add ``ticker`` 40 | 41 | asset_class: :class:`string` of the 'asset_class' to exchange all 42 | tickers for 'ticker' 43 | 44 | ticker: :class:`string` the ticker to add to the weight_df 45 | 46 | weight: :class:`float` of the weight to assign to ``ticker`` 47 | 48 | :RETURNS: 49 | 50 | :class:`DataFrame` of the :class:PortfolioObject's rebal_weights, with 51 | ticker representing weight, beginning on date (or the first trade before) 52 | 53 | """ 54 | 55 | d = ticker_class_dict 56 | ind = weight_df.index 57 | 58 | #if the date is exact, use it, otherwise pick the previous one 59 | 60 | if ind[ind.searchsorted(date)] != pandas.Timestamp(date): 61 | dt = ind[ind.searchsorted(date) - 1] 62 | else: 63 | dt = pandas.Timestamp(date) 64 | 65 | #get the tickers with the given asset class 66 | l = [] 67 | for key, value in d.iteritems(): 68 | if value == asset_class: l.append(key) 69 | 70 | weight_df.loc[dt: , l] = 0. 71 | s = weight_df.sum(axis = 1) 72 | weight_df = weight_df.apply(lambda x: x.div(s)) 73 | 74 | return ticker_and_weight_into_weight_df(weight_df, ticker, weight, dt) 75 | 76 | def ticker_and_weight_into_weight_df(weight_df, ticker, weight, date): 77 | """ 78 | A helper function to insert a ticker, and its respective weight, into a 79 | :class:`DataFrame` ``weight_df`` given a dynamic allocation strategy or 80 | a :class:`Series` given a static allocation strategy 81 | 82 | :ARGS: 83 | 84 | weight_df: :class:`pandas.DataFrame` to be used as a weight allocation 85 | to construct a portfolio 86 | 87 | ticker: :class:`string` to insert into the weight_df 88 | 89 | weight: :class:`float` of the weight to assign the ticker 90 | 91 | date: :class:`string`, :class:`datetime` or :class:`Timestamp` to first 92 | allocate ``weight`` to ``ticker``, going forward. 93 | 94 | :RETURNS: 95 | 96 | :class:`pandas.DataFrame` where the weight_df weights have been 97 | proportionally re-distributed on or after ``date`` 98 | 99 | """ 100 | ret_df = weight_df.copy() 101 | ret_df[date:] = ret_df*(1. - weight) 102 | ret_df[ticker] = 0. 103 | ret_df.loc[date: , ticker] = weight 104 | return ret_df 105 | 106 | def epoch_to_datetime(pandas_obj): 107 | """ 108 | Convert string epochs to `pandas.DatetimeIndex` 109 | 110 | :ARGS: 111 | 112 | either a :class:`DataFrame` or :class:`Series` where index can be 113 | converted to datetimes 114 | 115 | :RETURNS: 116 | 117 | same as input type, but with index converted into Timestamps 118 | """ 119 | pandas_obj.index = pandas.to_datetime( pandas_obj.index.astype('int64'), 120 | unit = 'ms' 121 | ) 122 | return pandas_obj 123 | 124 | 125 | def append_store_prices(ticker_list, store_path, start = '01/01/1990'): 126 | """ 127 | Given an existing store located at ``path``, check to make sure 128 | the tickers in ``ticker_list`` are not already in the data 129 | set, and then insert the tickers into the store.
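A hypothetical call (the store path below is illustrative):

.. code:: python

    import visualize_wealth.utils as vwu
    vwu.append_store_prices(['EEM', 'TIP'], '/tmp/prices.h5',
        start = '01/01/1990')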
130 | 131 | :ARGS: 132 | 133 | ticker_list: :class:`list` of tickers to add to the 134 | :class:`pandas.HDStore` 135 | 136 | store_path: :class:`string` of the path to the 137 | :class:`pandas.HDStore` 138 | 139 | start: :class:`string` of the date to begin the price data 140 | 141 | :RETURNS: 142 | 143 | :class:`NoneType` but appends the store and comments the 144 | successes and failures 145 | """ 146 | store = _open_store(store_path) 147 | store_keys = map(lambda x: x.strip('/'), store.keys()) 148 | not_in_store = numpy.setdiff1d(ticker_list, store_keys) 149 | new_prices = tickers_to_dict(not_in_store, start = start) 150 | 151 | #attempt to add the new values to the store 152 | for val in new_prices.keys(): 153 | try: 154 | store.put(val, new_prices[val]) 155 | logging.log(20, "{0} has been stored".format(val)) 156 | except: 157 | logging.warning("{0} didn't store".format(val)) 158 | 159 | store.close() 160 | return None 161 | 162 | def check_store_for_tickers(ticker_list, store): 163 | """ 164 | Determine which, if any, of the :class:`list` `ticker_list` are 165 | inside of the HDFStore. If all tickers are located in the store 166 | returns True, otherwise returns False (provides a "check" to see if 167 | other functions can be run) 168 | 169 | :ARGS: 170 | 171 | ticker_list: iterable of tickers to be found in the store located 172 | at :class:`string` store_path 173 | 174 | store: :class:`HDFStore` of the location to the HDFStore 175 | 176 | :RETURNS: 177 | 178 | :class:`bool` True if all tickers are found in the store and 179 | False if not all the tickers are found in the HDFStore 180 | 181 | """ 182 | if isinstance(ticker_list, pandas.Index): 183 | #pandas.Index is not sortable, so must tolist() it 184 | ticker_list = ticker_list.tolist() 185 | 186 | store_keys = map(lambda x: x.strip('/'), store.keys()) 187 | not_in_store = numpy.setdiff1d(ticker_list, store_keys) 188 | 189 | #if len(not_in_store) == 0, all tickers are present 190 | if not len(not_in_store): 191 | #print "All tickers in store" 192 | ret_val = True 193 | else: 194 | for ticker in not_in_store: 195 | print "store does not contain " + ticker 196 | ret_val = False 197 | return ret_val 198 | 199 | def check_store_path_for_tickers(ticker_list, store_path): 200 | """ 201 | Determine which, if any, of the :class:`list` `ticker_list` are 202 | inside of the HDFStore.
If all tickers are located in the store 203 | returns True, otherwise returns False (provides a "check" to see if 204 | other functions can be run) 205 | 206 | :ARGS: 207 | 208 | ticker_list: iterable of tickers to be found in the store located 209 | at :class:`string` store_path 210 | 211 | store_path: :class:`string` of the location to the HDFStore 212 | 213 | :RETURNS: 214 | 215 | :class:`bool` True if all tickers are found in the store and 216 | False if not all the tickers are found in the HDFStore 217 | """ 218 | store = _open_store(store_path) 219 | 220 | if isinstance(ticker_list, pandas.Index): 221 | #pandas.Index is not sortable, so must tolist() it 222 | ticker_list = ticker_list.tolist() 223 | 224 | store_keys = map(lambda x: x.strip('/'), store.keys()) 225 | not_in_store = numpy.setdiff1d(ticker_list, store_keys) 226 | store.close() 227 | 228 | #if len(not_in_store) == 0, all tickers are present 229 | if not len(not_in_store): 230 | print "All tickers in store" 231 | ret_val = True 232 | else: 233 | for ticker in not_in_store: 234 | print "store does not contain " + ticker 235 | ret_val = False 236 | return ret_val 237 | 238 | def check_trade_price_start(weight_df, price_df): 239 | """ 240 | Check to ensure that initial weights / trade dates are after 241 | the first available price for the same ticker 242 | 243 | :ARGS: 244 | 245 | weight_df: :class:`pandas.DataFrame` of the weights to 246 | rebalance the portfolio 247 | 248 | price_df: :class:`pandas.DataFrame` of the prices for each 249 | of the tickers 250 | 251 | :RETURNS: 252 | 253 | :class:`pandas.Series` of boolean values for each ticker 254 | where True indicates the first allocation takes place 255 | after the first price (as desired) and False the converse 256 | """ 257 | #make sure all of the weight_df tickers are in price_df 258 | intrsct = set(weight_df.columns).intersection(set(price_df.columns)) 259 | 260 | if set(weight_df.columns) != intrsct: 261 | 262 | raise KeyError, "Not all tickers in weight_df are in price_df" 263 | 264 | 265 | ret_d = {} 266 | for ticker in weight_df.columns: 267 | first_alloc = (weight_df[ticker] > 0).argmax() 268 | first_price = price_df[ticker].notnull().argmax() 269 | ret_d[ticker] = first_alloc >= first_price 270 | 271 | return pandas.Series(ret_d) 272 | 273 | def create_data_store(ticker_list, store_path): 274 | """ 275 | Creates the ETF store to run the training of the logistic 276 | classification tree 277 | 278 | :ARGS: 279 | 280 | ticker_list: iterable of tickers 281 | 282 | store_path: :class:`str` of path to ``HDFStore`` 283 | """ 284 | #check to make sure the store doesn't already exist 285 | if os.path.isfile(store_path): 286 | print "File " + store_path + " already exists" 287 | return 288 | 289 | store = pandas.HDFStore(store_path, 'w') 290 | success = 0 291 | for ticker in ticker_list: 292 | try: 293 | tmp = tickers_to_dict(ticker, 'yahoo', start = '01/01/2000') 294 | store.put(ticker, tmp) 295 | print ticker + " added to store" 296 | success += 1 297 | except: 298 | print "unable to add " + ticker + " to store" 299 | store.close() 300 | 301 | if success == 0: #none of it worked, delete the store 302 | print "Creation Failed" 303 | os.remove(store_path) 304 | print 305 | return None 306 | 307 | def first_price_date_get_prices(ticker_list): 308 | """ 309 | Given a list of tickers, pull down prices and return the first valid price 310 | date for each ticker in the list 311 | 312 | :ARGS: 313 | 314 | ticker_list: :class:`string` or :class:`list` of tickers 315 | 316 | :RETURNS: 317 |
318 | :class:`string` of 'dd-mm-yyyy' or :class:`list` of said strings 319 | """ 320 | 321 | #pull down the data into a DataFrame 322 | df = tickers_to_frame(ticker_list) 323 | return first_price_date_from_prices(df) 324 | 325 | def first_price_date_from_prices(frame): 326 | """ 327 | Given a :class:`pandas.DataFrame` of prices, return the first date that a 328 | price exists for each of the tickers 329 | 330 | :ARGS: 331 | 332 | frame: :class:`pandas.DataFrame` or :class:`pandas.Series` of prices 333 | 334 | :RETURNS: 335 | 336 | :class:`string` of 'dd-mm-yyyy' or :class:`list` of said strings 337 | """ 338 | 339 | fvi = pandas.Series.first_valid_index 340 | if isinstance(frame, pandas.Series): 341 | return fvi(frame) 342 | else: 343 | return frame.apply(fvi, axis = 0) 344 | 345 | def first_valid_date(prices): 346 | """ 347 | Helper function to determine the first valid date from a set of 348 | different prices. Can take either a :class:`dict` of 349 | :class:`pandas.DataFrame`s where each key is a ticker's 'Open', 350 | 'High', 'Low', 'Close', 'Adj Close' or a single 351 | :class:`pandas.DataFrame` where each column is a different ticker 352 | 353 | :ARGS: 354 | 355 | prices: either :class:`dictionary` or :class:`pandas.DataFrame` 356 | 357 | :RETURNS: 358 | 359 | :class:`pandas.Timestamp` 360 | """ 361 | iter_dict = { pandas.DataFrame: lambda x: x.columns, 362 | dict: lambda x: x.keys() } 363 | try: 364 | each_first = map(lambda x: prices[x].first_valid_index(), 365 | iter_dict[ type(prices) ](prices) ) 366 | return max(each_first) 367 | except KeyError: 368 | print "prices must be a DataFrame or dictionary" 369 | return 370 | 371 | def gen_gbm_price_series(num_years, N, price_0, vol, drift): 372 | """ 373 | Return a price series generated using GBM 374 | 375 | :ARGS: 376 | 377 | num_years: number of years (if 20 trading days, then 20/252) 378 | 379 | N: number of total periods 380 | 381 | price_0: starting price for the security 382 | 383 | vol: the volatility of the security 384 | 385 | drift: the expected return of the security 386 | 387 | :RETURNS: 388 | 389 | :class:`pandas.Series` of length N of the simulated price series 390 | 391 | """ 392 | dt = num_years/float(N) 393 | e1 = (drift - 0.5*vol**2)*dt 394 | e2 = (vol*numpy.sqrt(dt)) 395 | cum_shocks = numpy.cumsum(numpy.random.randn(N,)) 396 | cum_drift = numpy.arange(1, N + 1) 397 | 398 | return pandas.Series(numpy.append( 399 | price_0, price_0*numpy.exp(cum_drift*e1 + cum_shocks*e2)[:-1])) 400 | 401 | def index_intersect(arr_a, arr_b): 402 | """ 403 | Return the intersection of two :class:`pandas` objects, either a 404 | :class:`pandas.Series` or a :class:`pandas.DataFrame` 405 | 406 | :ARGS: 407 | 408 | arr_a: :class:`pandas.DataFrame` or :class:`pandas.Series` 409 | arr_b: :class:`pandas.DataFrame` or :class:`pandas.Series` 410 | 411 | :RETURNS: 412 | 413 | :class:`pandas.DatetimeIndex` of the intersection of the two 414 | :class:`pandas` objects 415 | """ 416 | arr_a = arr_a.sort_index() 417 | arr_a = arr_a.dropna() 418 | arr_b = arr_b.sort_index() 419 | arr_b = arr_b.dropna() 420 | if not arr_a.index.equals(arr_b.index): 421 | return arr_a.index & arr_b.index 422 | else: 423 | return arr_a.index 424 | 425 | def index_multi_union(frame_list): 426 | """ 427 | Returns the index union of multiple 428 | :class:`pandas.DataFrame`'s or :class:`pandas.Series` 429 | 430 | :ARGS: 431 | 432 | frame_list: :class:`list` containing either ``DataFrame``'s or 433 | ``Series`` 434 | 435 | :RETURNS: 436 | 437 | :class:`pandas.DatetimeIndex` of the objects'
union 438 | """ 439 | #check to make sure all objects are Series or DataFrames 440 | 441 | 442 | 443 | return reduce(lambda x, y: x | y, 444 | map(lambda x: x.dropna().index, 445 | frame_list) 446 | ) 447 | 448 | def index_multi_intersect(frame_list): 449 | """ 450 | Returns the index intersection of multiple 451 | :class:`pandas.DataFrame`'s or :class:`pandas.Series` 452 | 453 | :ARGS: 454 | 455 | frame_list: :class:`list` containing either ``DataFrame``'s or 456 | ``Series`` 457 | 458 | :RETURNS: 459 | 460 | :class:`pandas.DatetimeIndex` of the objects' intersection 461 | """ 462 | 463 | return reduce(lambda x, y: x & y, 464 | map(lambda x: x.dropna().index, 465 | frame_list) 466 | ) 467 | 468 | def join_on_index(df_list, index): 469 | """ 470 | pandas doesn't currently have the ability to :meth:`concat` several 471 | :class:`DataFrame`'s on a provided :class:`DatetimeIndex`. 472 | This is a quick function to provide that functionality 473 | 474 | :ARGS: 475 | 476 | df_list: :class:`list` of :class:`DataFrame`'s 477 | 478 | index: :class:`Index` on which to join all of the DataFrames 479 | """ 480 | return pandas.concat( 481 | map( lambda x: x.reindex(index), df_list), 482 | axis = 1 483 | ) 484 | 485 | def normalized_price(price_df): 486 | """ 487 | Return the normalized price of a :class:`pandas.Series` or 488 | :class:`pandas.DataFrame` 489 | 490 | :ARGS: 491 | 492 | price_df: :class:`pandas.Series` or :class:`pandas.DataFrame` 493 | 494 | :RETURNS: 495 | 496 | same as the input 497 | """ 498 | null_d = {pandas.DataFrame: lambda x: pandas.isnull(x).any().any(), 499 | pandas.Series: lambda x: pandas.isnull(x).any() 500 | } 501 | 502 | calc_d = {pandas.DataFrame: lambda x: x.div(x.iloc[0, :]), 503 | pandas.Series: lambda x: x.div(x[0]) 504 | } 505 | 506 | typ = type(price_df) 507 | if null_d[typ](price_df): 508 | raise ValueError, "cannot contain null values" 509 | 510 | return calc_d[typ](price_df) 511 | 512 | def rets_to_price(rets, ret_typ = 'log', start_value = 100.): 513 | """ 514 | Take a series of returns ``rets``, of type ``ret_typ``, and 515 | convert them into prices 516 | 517 | :ARGS: 518 | 519 | rets: :class:`Series` or :class:`DataFrame` of returns 520 | 521 | ret_typ: :class:`string` of the return type, 522 | either ['log', 'linear'] 523 | 524 | :RETURNS: 525 | 526 | same as provided type 527 | """ 528 | def _rets_to_price(rets, ret_typ, start_value): 529 | 530 | typ_d = {'log': lambda x: start_value * numpy.exp(x.cumsum()), 531 | 'linear': lambda x: start_value * (1. + x).cumprod() 532 | } 533 | 534 | fv = rets.first_valid_index() 535 | fd = rets.index[0] 536 | 537 | if fv == fd: # no nulls at the beginning 538 | p = typ_d[ret_typ](rets) 539 | p = normalized_price(p) * start_value 540 | 541 | else: 542 | cp = rets.copy() # copy to prepend with 0. 543 | loc = cp.index.get_loc(fv) 544 | fd = cp.index[loc - 1] 545 | cp[fd] = 0.
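# descriptive note: a 0. return has just been prepended one index
# slot ahead of the first valid return, so the cumulative calc on
# the next line starts the reconstructed price at start_value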
546 | p = typ_d[ret_typ](cp[fd:]) 547 | return p 548 | 549 | if isinstance(rets, pandas.Series): 550 | return _rets_to_price(rets = rets, 551 | ret_typ = ret_typ, 552 | start_value = start_value 553 | ) 554 | elif isinstance(rets, pandas.DataFrame): 555 | return rets.apply( 556 | lambda x: _rets_to_price(rets = x, 557 | ret_typ = ret_typ, 558 | start_value = start_value 559 | ), axis = 0 560 | ) 561 | 562 | else: 563 | raise TypeError, "rets must be Series or DataFrame" 564 | 565 | 566 | def perturbate_asset(frame, key, eps): 567 | """ 568 | Perturbate an asset within a weight allocation frame in the amount eps 569 | 570 | :ARGS: 571 | 572 | frame: :class:`pandas.DataFrame` of a weight_allocation frame 573 | 574 | key: :class:`string` of the asset to perturbate 575 | 576 | eps: :class:`float` of the amount to perturbate in relative terms 577 | 578 | :RETURNS: 579 | 580 | :class:`pandas.DataFrame` of the perturbed weight_df 581 | """ 582 | from .analyze import linear_returns 583 | 584 | pert_series = pandas.Series(numpy.zeros_like(frame[key]), 585 | index = frame.index 586 | ) 587 | 588 | lin_ret = linear_returns(frame[key]) 589 | lin_ret = lin_ret.mul(1. + eps) 590 | pert_series[0] = p_o = frame[key][0] 591 | pert_series[1:] = p_o * (1. + lin_ret[1:]) 592 | ret_frame = frame.copy() 593 | ret_frame[key] = pert_series 594 | return ret_frame 595 | 596 | 597 | def setup_trained_hdfstore(trained_data, store_path): 598 | """ 599 | The ``HDFStore`` doesn't work properly when it's compiled by different 600 | versions, so the appropriate thing to do is to set up the trained data 601 | locally (and not store the ``.h5`` file on GitHub). 602 | 603 | :ARGS: 604 | 605 | trained_data: :class:`pandas.Series` with tickers in the index and 606 | asset classes for values 607 | 608 | store_path: :class:`str` of where to create the ``HDFStore`` 609 | """ 610 | 611 | create_data_store(trained_data.index, store_path) 612 | return None 613 | 614 | def tickers_to_dict(ticker_list, api = 'yahoo', start = '01/01/1990'): 615 | """ 616 | Utility function to return ticker data where the input is either a 617 | ticker, or a list of tickers. 618 | 619 | :ARGS: 620 | 621 | ticker_list: :class:`list` in the case of multiple tickers or 622 | :class:`str` in the case of one ticker 623 | 624 | api: :class:`string` identifying which api to call the data 625 | from. Either 'yahoo' or 'google' 626 | 627 | start: :class:`string` of the desired start date 628 | 629 | :RETURNS: 630 | 631 | :class:`dictionary` of (ticker, price_df) mappings or a 632 | :class:`pandas.DataFrame` when the ``ticker_list`` is 633 | :class:`str` 634 | """ 635 | if isinstance(ticker_list, (str, unicode)): 636 | return __get_data(ticker_list, api = api, start = start) 637 | else: 638 | d = {} 639 | for ticker in ticker_list: 640 | d[ticker] = __get_data(ticker, api = api, start = start) 641 | return d 642 | 643 | def tickers_to_frame(ticker_list, api = 'yahoo', start = '01/01/1990', 644 | join_col = 'Adj Close'): 645 | """ 646 | Utility function to return ticker data where the input is either a 647 | ticker, or a list of tickers. 648 | 649 | :ARGS: 650 | 651 | ticker_list: :class:`list` in the case of multiple tickers or 652 | :class:`str` in the case of one ticker 653 | 654 | api: :class:`string` identifying which api to call the data 655 | from.

def perturbate_asset(frame, key, eps):
    """
    Perturb an asset within a weight allocation frame by the relative
    amount ``eps``

    :ARGS:

        frame: :class:`pandas.DataFrame` of a weight_allocation frame

        key: :class:`string` of the asset (column) to perturb

        eps: :class:`float` of the amount to perturb, in relative terms

    :RETURNS:

        :class:`pandas.DataFrame` of the perturbed weight_df
    """
    from .analyze import linear_returns

    pert_series = pandas.Series(numpy.zeros_like(frame[key]),
                                index = frame.index
    )

    lin_ret = linear_returns(frame[key])
    lin_ret = lin_ret.mul(1. + eps)
    pert_series[0] = p_o = frame[key][0]
    pert_series[1:] = p_o * (1. + lin_ret[1:])
    ret_frame = frame.copy()
    ret_frame[key] = pert_series
    return ret_frame


def setup_trained_hdfstore(trained_data, store_path):
    """
    An ``HDFStore`` compiled by different versions doesn't work
    properly, so the appropriate thing to do is to set up the trained
    data locally (and not store the ``.h5`` file on GitHub).

    :ARGS:

        trained_data: :class:`pandas.Series` with tickers in the index and
        asset classes for values

        store_path: :class:`str` of where to create the ``HDFStore``
    """
    create_data_store(trained_data.index, store_path)
    return None

def tickers_to_dict(ticker_list, api = 'yahoo', start = '01/01/1990'):
    """
    Utility function to return ticker data where the input is either a
    ticker, or a list of tickers.

    :ARGS:

        ticker_list: :class:`list` in the case of multiple tickers or
        :class:`str` in the case of one ticker

        api: :class:`string` identifying which api to call the data
        from. Either 'yahoo' or 'google'

        start: :class:`string` of the desired start date

    :RETURNS:

        :class:`dictionary` of (ticker, price_df) mappings or a
        :class:`pandas.DataFrame` when the ``ticker_list`` is
        :class:`str`
    """
    if isinstance(ticker_list, (str, unicode)):
        return __get_data(ticker_list, api = api, start = start)
    else:
        d = {}
        for ticker in ticker_list:
            d[ticker] = __get_data(ticker, api = api, start = start)
        return d

def tickers_to_frame(ticker_list, api = 'yahoo', start = '01/01/1990',
                     join_col = 'Adj Close'):
    """
    Utility function to return ticker data where the input is either a
    ticker, or a list of tickers.

    :ARGS:

        ticker_list: :class:`list` in the case of multiple tickers or
        :class:`str` in the case of one ticker

        api: :class:`string` identifying which api to call the data
        from. Either 'yahoo' or 'google'

        start: :class:`string` of the desired start date

        join_col: :class:`string` of the column on which to aggregate
        the :class:`pandas.DataFrame`

    :RETURNS:

        :class:`pandas.DataFrame` of the ``join_col`` for each ticker,
        or a :class:`pandas.Series` when the ``ticker_list`` is
        :class:`str`
    """
    if isinstance(ticker_list, (str, unicode)):
        return __get_data(ticker_list, api = api, start = start)[join_col]
    else:
        d = {}
        for ticker in ticker_list:
            tmp = __get_data(ticker,
                             api = api,
                             start = start
            )
            d[ticker] = tmp[join_col]

        return pandas.DataFrame(d)

def ticks_to_frame_from_store(ticker_list, store_path, join_col = 'Adj Close'):
    """
    Utility function to return ticker data from a local ``HDFStore``,
    where the input is either a ticker, or a list of tickers.

    :ARGS:

        ticker_list: :class:`list` in the case of multiple tickers or
        :class:`str` in the case of one ticker

        store_path: :class:`str` of the path to the store

        join_col: :class:`string` of the column on which to aggregate
        the :class:`pandas.DataFrame`

    :RETURNS:

        :class:`pandas.DataFrame` of the ``join_col`` for each ticker,
        or a :class:`pandas.Series` when the ``ticker_list`` is
        :class:`str`
    """
    store = _open_store(store_path)

    if isinstance(ticker_list, (str, unicode)):
        ret_series = store[ticker_list][join_col]
        store.close()
        return ret_series
    else:
        d = {}
        for ticker in ticker_list:
            d[ticker] = store[ticker][join_col]
        store.close()
        price_df = pandas.DataFrame(d)
        d_o = first_valid_date(price_df)
        price_df = price_df.loc[d_o:, :]

        return price_df

def create_store_master_index(store_path):
    """
    Add a master index, key = 'IND3X', to the HDFStore located at
    store_path

    :ARGS:

        store_path: :class:`string` the location of the ``HDFStore`` file

    :RETURNS:

        :class:`NoneType` but updates the ``HDF5`` file

    """
    store = _open_store(store_path)

    keys = store.keys()

    if '/IND3X' in keys:
        print "'IND3X' already exists in HDFStore at {0}".format(store_path)
        store.close()
        return
    else:
        union = union_store_indexes(store)
        store.put('IND3X', pandas.Series(union, index = union))
        store.close()

def union_store_indexes(store):
    """
    Return the union of the indexes of all objects within a store

    :ARGS:

        store: :class:`HDFStore`

    :RETURNS:

        :class:`pandas.DatetimeIndex` of the union of all indexes within
        the store

    """
    key_iter = iter(store.keys())
    ind = store.get(key_iter.next()).index
    union = ind.copy()

    for key in key_iter:
        union = union | store.get(key).index
    return union
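
# A hypothetical workflow sketch; the store path './prices.h5' is an
# assumption for illustration (the store must already hold per-ticker
# price frames):
#
# >>> create_store_master_index('./prices.h5')
# >>> store = _open_store('./prices.h5')
# >>> master_index = store.get('IND3X').index
# >>> store.close()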

def create_store_cash(store_path):
    """
    Create a cash price, key = u'CA5H', in the HDFStore located at
    store_path

    :ARGS:

        store_path: :class:`string` the location of the ``HDFStore`` file

    :RETURNS:

        :class:`NoneType` but updates the ``HDF5`` file, and prints to
        screen which values would not update

    """
    store = _open_store(store_path)
    keys = store.keys()
    if '/CA5H' in keys:
        logging.log(1, "CA5H prices already exist")
        store.close()
        return

    if '/IND3X' not in keys:
        m_index = union_store_indexes(store)
    else:
        # 'IND3X' is stored as a Series whose values equal its index
        m_index = store.get('IND3X')

    cols = ['Open', 'High', 'Low', 'Close', 'Volume', 'Adj Close']
    n_dates, n_cols = len(m_index), len(cols)

    df = pandas.DataFrame(numpy.ones([n_dates, n_cols]),
                          index = m_index,
                          columns = cols
    )
    store.put('CA5H', df)
    store.close()
    return

def update_store_master_index(store_path):
    """
    Intelligently update the store key 'IND3X'; this can only be done
    after the prices at the store path have been updated

    :ARGS:

        store_path: :class:`string` the location of the ``HDFStore`` file
    """
    store = _open_store(store_path)

    try:
        stored_data = store.get('IND3X')
    except KeyError:
        logging.exception("store doesn't contain IND3X")
        store.close()
        raise

    last_stored_date = stored_data.dropna().index.max()
    today = datetime.datetime.date(datetime.datetime.today())
    if last_stored_date < pandas.Timestamp(today):

        union_ind = union_store_indexes(store)
        tmp = pandas.Series(union_ind, index = union_ind)

        #need to drop duplicates because there's 1 row of overlap
        tmp = stored_data.append(tmp)
        tmp.drop_duplicates(inplace = True)
        store.put('IND3X', tmp)

    store.close()
    return None

def update_store_cash(store_path):
    """
    Intelligently update the values of CA5H based on existing keys in the
    store, and existing columns of the CA5H values

    :ARGS:

        store_path: :class:`string` the location of the ``HDFStore`` file

    :RETURNS:

        :class:`NoneType` but updates the ``HDF5`` file, and prints to
        screen which values would not update

    """
    store = _open_store(store_path)
    td = datetime.datetime.today()

    try:
        master_ind = store.get('IND3X')
        cash = store.get('CA5H')
    except KeyError:
        print "store doesn't contain {0} and / or {1}".format(
            'CA5H', 'IND3X')
        store.close()
        raise

    last_cash_dt = cash.dropna().index.max()
    today = datetime.datetime.date(td)
    if last_cash_dt < pandas.Timestamp(today):
        try:
            n = len(master_ind)
            cols = ['Open', 'High', 'Low',
                    'Close', 'Volume', 'Adj Close']

            cash = pandas.DataFrame(
                numpy.ones([n, len(cols)]),
                index = master_ind,
                columns = cols
            )
            store.put('CA5H', cash)
        except:
            print "Error updating cash"

    store.close()
    return None

def strip_vals(keys):
    """
    Return a whitespace-stripped value for each key in keys

    :ARGS:

        keys: :class:`list` of string values (usually tickers)

    :RETURNS:

        same as input class with whitespace stripped out
    """
    return [x.strip() for x in keys]
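
# For example (illustrative values, not from the test suite):
#
# >>> strip_vals([' IBM ', 'GOOG\n'])
# ['IBM', 'GOOG']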

def update_store_prices(store_path, store_keys = None):
    """
    Update to the most recent prices for all keys of an existing store,
    located at ``store_path``.

    :ARGS:

        store_path: :class:`string` the location of the ``HDFStore`` file

        store_keys: :class:`list` of keys to update

    :RETURNS:

        :class:`NoneType` but updates the ``HDF5`` file, and prints to
        screen which values would not update

    .. note::

        If special keys exist (like CA5H, or IND3X), then keys can be
        passed to update to ensure that the store does not try to update
        those keys

    """
    def _cleaned_keys(keys):
        """
        Remove the CA5H and IND3X keys from the list
        if they are present
        """
        blk_lst = ['IND3X', 'CA5H', '/IND3X', '/CA5H']

        for key in blk_lst:
            try:
                keys.remove(key)
                print "{0} removed".format(key)
            except ValueError:
                print "{0} not in keys".format(key)

        return keys

    reader = pandas.io.data.DataReader
    strftime = datetime.datetime.strftime

    store = _open_store(store_path)

    if not store_keys:
        store_keys = store.keys()

    store_keys = _cleaned_keys(store_keys)
    for key in store_keys:
        stored_data = store.get(key)
        last_stored_date = stored_data.dropna().index.max()
        today = datetime.datetime.date(datetime.datetime.today())
        if last_stored_date < pandas.Timestamp(today):
            try:
                tmp = reader(key.strip('/'), 'yahoo', start = strftime(
                    last_stored_date, format = '%m/%d/%Y'))

                #need to drop duplicates because there's 1 row of overlap
                tmp = stored_data.append(tmp)
                tmp["index"] = tmp.index
                tmp.drop_duplicates(cols = "index", inplace = True)
                tmp = tmp[tmp.columns[tmp.columns != "index"]]
                store.put(key, tmp)
            except:
                print "could not update {0}".format(key)
                logging.exception("could not update {0}".format(key))

    store.close()
    return None


def zipped_time_chunks(index, interval, incl_T = False):
    """
    Given a period interval, return a zipped list of
    (first day, last day) tuples, containing only full periods

    .. note::

        The function assumes indexes are of daily frequency

    :ARGS:

        index: :class:`pandas.DatetimeIndex`

        interval: :class:`string` either 'weekly',
        'monthly', 'quarterly', or 'yearly'

        incl_T: :class:`bool` of whether to include the final
        (possibly partial) period
    """
    time_d = {'weekly': lambda x: x.week,
              'monthly': lambda x: x.month,
              'quarterly': lambda x: x.quarter,
              'yearly': lambda x: x.year}

    prv = time_d[interval](index[:-1])
    nxt = time_d[interval](index[1:])
    ind = prv != nxt

    if ind[0]: # index started on the last day of a period
        index = index.copy()[1:] # remove first elem
        prv = time_d[interval](index[:-1])
        nxt = time_d[interval](index[1:])
        ind = prv != nxt

    if incl_T:
        if not ind[-1]: # doesn't already end on True
            ind = numpy.append(ind, True)

    ldop = index[ind] # last day of period
    f_ind = numpy.append(True, ind[:-1])
    fdop = index[f_ind] # first day of period
    return zip(fdop, ldop)
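
# An illustrative sketch of the (first day, last day) pairs produced
# for a monthly interval over business days; the exact Timestamp repr
# varies by pandas version:
#
# >>> ind = pandas.date_range('01/01/2013', '03/31/2013', freq = 'B')
# >>> zipped_time_chunks(ind, 'monthly')[0]
# (Timestamp('2013-01-01 00:00:00'), Timestamp('2013-01-31 00:00:00'))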

def tradeplus_tchunks(weight_index, price_index):
    """
    Return zipped time intervals of trade signal and trade signal + 1

    :ARGS:

        weight_index: :class:`pandas.DatetimeIndex` of the weight allocation
        frame of generated signals

        price_index: :class:`pandas.DatetimeIndex` for all the price data

    :RETURNS:

        :class:`list` of (int_beg, int_fin) tuples, where int_beg is the
        t + 1 date after the weight signal and int_fin is the next weight
        signal (or the last date in the price_index)

    .. note::

        Consecutive, non-overlapping intervals are commonly used for
        things such as optimizing share calculation algorithms,
        transaction cost calculations, etc.
    """
    locs = [price_index.get_loc(key) + 1 for key in weight_index]
    do = pandas.DatetimeIndex([weight_index[0]])
    int_beg = price_index[locs[1:]]
    int_beg = do.append(int_beg)

    int_fin = weight_index[1:]
    dT = pandas.DatetimeIndex([price_index[-1]])
    int_fin = int_fin.append(dT)
    return zip(int_beg, int_fin)

def _open_store(store_path):
    """
    Open an HDFStore located at store_path, with the appropriate error
    handling

    :ARGS:

        store_path: :class:`string` where the store is located

    :RETURNS:

        :class:`HDFStore` instance
    """
    try:
        store = pandas.HDFStore(path = store_path, mode = 'r+')
        return store
    except IOError:
        logging.exception(
            "{0} is not a valid path to an HDFStore Object".format(store_path)
        )
        raise


def __get_data(ticker, api, start):
    """
    Helper function to get Yahoo! data, with exceptions built in and
    messages that confirm success for given tickers

    :ARGS:

        ticker: either a :class:`string` of a ticker or a :class:`list`
        of tickers

        api: :class:`string` the api from which to get the data,
        'yahoo' or 'google'

        start: :class:`string` the start date to start the data
        series

    :RETURNS:

        :class:`pandas.DataFrame` of the price data, or :class:`NoneType`
        when the call fails
    """
    reader = pandas.io.data.DataReader
    try:
        data = reader(ticker, api, start = start)
        return data
    except:
        print "failed for " + ticker
        return

--------------------------------------------------------------------------------