├── files ├── py4fi.pdf ├── basic-docinfo.html ├── images │ └── spx_volatility.png ├── py4fi.asciidoc ├── code │ ├── bsm_mcs_euro.py │ ├── why_python.txt │ ├── why_python.txt~ │ └── why_python.ipynb ├── nb_parse.py ├── basic-theme.yml ├── why_python.asciidoc └── py4fi.html ├── output ├── tpqad_output_01.png └── tpqad_output_02.png ├── LICENSE.txt ├── .gitignore ├── README.md └── tpqad.ipynb /files/py4fi.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/yhilpisch/tpqad/master/files/py4fi.pdf -------------------------------------------------------------------------------- /files/basic-docinfo.html: -------------------------------------------------------------------------------- 1 | -------------------------------------------------------------------------------- /output/tpqad_output_01.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/yhilpisch/tpqad/master/output/tpqad_output_01.png -------------------------------------------------------------------------------- /output/tpqad_output_02.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/yhilpisch/tpqad/master/output/tpqad_output_02.png -------------------------------------------------------------------------------- /files/images/spx_volatility.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/yhilpisch/tpqad/master/files/images/spx_volatility.png -------------------------------------------------------------------------------- /files/py4fi.asciidoc: -------------------------------------------------------------------------------- 1 | = Python for Finance 2 | Dr. Yves J. 
Hilpisch | 3 | The Python Quants GmbH | www.tpq.io 4 | :doctype: book 5 | :source-highlighter: coderay 6 | // :coderay-css: inline 7 | :listing-caption: Listing 8 | :pdf-page-size: A4 9 | :icons: font 10 | :toc: left 11 | :toclevels: 2 12 | :numbered: 13 | :auto-fit: 14 | :docinfo: 15 | // :imagesdir: images 16 | 17 | 18 | include::why_python.asciidoc[] 19 | -------------------------------------------------------------------------------- /files/code/bsm_mcs_euro.py: -------------------------------------------------------------------------------- 1 | # 2 | # Monte Carlo valuation of European call option 3 | # in Black-Scholes-Merton model 4 | # bsm_mcs_euro.py 5 | # 6 | # Python for Finance, 2nd ed. 7 | # (c) Dr. Yves J. Hilpisch 8 | # 9 | import math 10 | import numpy as np 11 | 12 | # Parameter Values 13 | S0 = 100. # initial index level 14 | K = 105. # strike price 15 | T = 1.0 # time-to-maturity 16 | r = 0.05 # riskless short rate 17 | sigma = 0.2 # volatility 18 | 19 | I = 100000 # number of simulations 20 | 21 | # Valuation Algorithm 22 | z = np.random.standard_normal(I) # pseudo-random numbers 23 | # index values at maturity 24 | ST = S0 * np.exp((r - 0.5 * sigma ** 2) * T + sigma * math.sqrt(T) * z) 25 | hT = np.maximum(ST - K, 0) # payoff at maturity 26 | C0 = math.exp(-r * T) * np.mean(hT) # Monte Carlo estimator 27 | 28 | # Result Output 29 | print('Value of the European call option %5.3f.' % C0) 30 | -------------------------------------------------------------------------------- /LICENSE.txt: -------------------------------------------------------------------------------- 1 | The MIT License 2 | 3 | Copyright (c) 2018 Dr. Yves J. 
Hilpisch | The Python Quants GmbH | http://tpq.io 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in 13 | all copies or substantial portions of the Software. 14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN 21 | THE SOFTWARE. -------------------------------------------------------------------------------- /.gitignore: -------------------------------------------------------------------------------- 1 | # Specifics 2 | *.cfg 3 | *.swp 4 | _build/ 5 | 6 | # Byte-compiled / optimized / DLL files 7 | __pycache__/ 8 | *.py[cod] 9 | *$py.class 10 | 11 | # C extensions 12 | *.so 13 | 14 | # Distribution / packaging 15 | .Python 16 | build/ 17 | develop-eggs/ 18 | dist/ 19 | downloads/ 20 | eggs/ 21 | .eggs/ 22 | lib/ 23 | lib64/ 24 | parts/ 25 | sdist/ 26 | var/ 27 | wheels/ 28 | *.egg-info/ 29 | .installed.cfg 30 | *.egg 31 | MANIFEST 32 | 33 | # PyInstaller 34 | # Usually these files are written by a python script from a template 35 | # before PyInstaller builds the exe, so as to inject date/other infos into it. 
36 | *.manifest 37 | *.spec 38 | 39 | # Installer logs 40 | pip-log.txt 41 | pip-delete-this-directory.txt 42 | 43 | # Unit test / coverage reports 44 | htmlcov/ 45 | .tox/ 46 | .coverage 47 | .coverage.* 48 | .cache 49 | nosetests.xml 50 | coverage.xml 51 | *.cover 52 | .hypothesis/ 53 | .pytest_cache/ 54 | 55 | # Translations 56 | *.mo 57 | *.pot 58 | 59 | # Django stuff: 60 | *.log 61 | local_settings.py 62 | db.sqlite3 63 | 64 | # Flask stuff: 65 | instance/ 66 | .webassets-cache 67 | 68 | # Scrapy stuff: 69 | .scrapy 70 | 71 | # Sphinx documentation 72 | docs/_build/ 73 | 74 | # PyBuilder 75 | target/ 76 | 77 | # Jupyter Notebook 78 | .ipynb_checkpoints 79 | 80 | # pyenv 81 | .python-version 82 | 83 | # celery beat schedule file 84 | celerybeat-schedule 85 | 86 | # SageMath parsed files 87 | *.sage.py 88 | 89 | # Environments 90 | .env 91 | .venv 92 | env/ 93 | venv/ 94 | ENV/ 95 | env.bak/ 96 | venv.bak/ 97 | 98 | # Spyder project settings 99 | .spyderproject 100 | .spyproject 101 | 102 | # Rope project settings 103 | .ropeproject 104 | 105 | # mkdocs documentation 106 | /site 107 | 108 | # mypy 109 | .mypy_cache/ -------------------------------------------------------------------------------- /files/code/why_python.txt: -------------------------------------------------------------------------------- 1 | # tag::MONTE_CARLO[] 2 | In [1]: import math 3 | import numpy as np # <1> 4 | 5 | In [2]: S0 = 100. # <2> 6 | K = 105. # <2> 7 | T = 1.0 # <2> 8 | r = 0.05 # <2> 9 | sigma = 0.2 # <2> 10 | 11 | In [3]: I = 100000 # <2> 12 | 13 | In [4]: np.random.seed(1000) # <3> 14 | 15 | In [5]: z = np.random.standard_normal(I) # <4> 16 | 17 | In [6]: ST = S0 * np.exp((r - sigma ** 2 / 2) * T + sigma * math.sqrt(T) * z) # <5> 18 | 19 | In [7]: hT = np.maximum(ST - K, 0) # <6> 20 | 21 | In [8]: C0 = math.exp(-r * T) * np.mean(hT) # <7> 22 | 23 | In [9]: print('Value of the European call option: {:5.3f}.'.format(C0)) # <8> 24 | Value of the European call option: 8.019. 
25 | 26 | # end::MONTE_CARLO[] 27 | 28 | In [10]: %run bsm_mcs_euro.py 29 | Value of the European call option 7.989. 30 | 31 | # tag::SPX_EXAMPLE[] 32 | In [11]: import numpy as np # <1> 33 | import pandas as pd # <1> 34 | from pylab import plt, mpl # <2> 35 | 36 | In [12]: plt.style.use('seaborn') # <2> 37 | mpl.rcParams['font.family'] = 'serif' # <2> 38 | %matplotlib inline 39 | 40 | In [13]: data = pd.read_csv('http://hilpisch.com/tr_eikon_eod_data.csv', 41 | index_col=0, parse_dates=True) # <3> 42 | data = pd.DataFrame(data['.SPX']) # <4> 43 | data.dropna(inplace=True) # <4> 44 | data.info() # <5> 45 | 46 | DatetimeIndex: 2138 entries, 2010-01-04 to 2018-06-29 47 | Data columns (total 1 columns): 48 | .SPX 2138 non-null float64 49 | dtypes: float64(1) 50 | memory usage: 33.4 KB 51 | 52 | In [14]: data['rets'] = np.log(data / data.shift(1)) # <6> 53 | data['vola'] = data['rets'].rolling(252).std() * np.sqrt(252) # <7> 54 | 55 | In [15]: data[['.SPX', 'vola']].plot(subplots=True, figsize=(10, 6)); # <8> 56 | plt.savefig('../images/spx_volatility.png') 57 | # end::SPX_EXAMPLE[] 58 | 59 | # tag::PARADIGM_1[] 60 | In [16]: import math 61 | loops = 2500000 62 | a = range(1, loops) 63 | def f(x): 64 | return 3 * math.log(x) + math.cos(x) ** 2 65 | %timeit r = [f(x) for x in a] 66 | 1.21 s ± 8.27 ms per loop (mean ± std. dev. of 7 runs, 1 loop each) 67 | 68 | # end::PARADIGM_1[] 69 | 70 | # tag::PARADIGM_2[] 71 | In [17]: import numpy as np 72 | a = np.arange(1, loops) 73 | %timeit r = 3 * np.log(a) + np.cos(a) ** 2 74 | 94.7 ms ± 1.55 ms per loop (mean ± std. dev. of 7 runs, 10 loops each) 75 | 76 | # end::PARADIGM_2[] 77 | 78 | # tag::PARADIGM_3[] 79 | In [18]: import numexpr as ne 80 | ne.set_num_threads(1) 81 | f = '3 * log(a) + cos(a) ** 2' 82 | %timeit r = ne.evaluate(f) 83 | 37.5 ms ± 936 µs per loop (mean ± std. dev. 
of 7 runs, 10 loops each) 84 | 85 | # end::PARADIGM_3[] 86 | 87 | # tag::PARADIGM_4[] 88 | In [19]: ne.set_num_threads(4) 89 | %timeit r = ne.evaluate(f) 90 | 12.8 ms ± 300 µs per loop (mean ± std. dev. of 7 runs, 100 loops each) 91 | 92 | # end::PARADIGM_4[] 93 | 94 | -------------------------------------------------------------------------------- /files/code/why_python.txt~: -------------------------------------------------------------------------------- 1 | # tag::MONTE_CARLO[] 2 | In [1]: import math 3 | import numpy as np # <1> 4 | 5 | In [2]: S0 = 100. # <2> 6 | K = 105. # <2> 7 | T = 1.0 # <2> 8 | r = 0.05 # <2> 9 | sigma = 0.2 # <2> 10 | 11 | In [3]: I = 100000 # <2> 12 | 13 | In [4]: np.random.seed(1000) # <3> 14 | 15 | In [5]: z = np.random.standard_normal(I) # <4> 16 | 17 | In [6]: ST = S0 * np.exp((r - sigma ** 2 / 2) * T + sigma * math.sqrt(T) * z) # <5> 18 | 19 | In [7]: hT = np.maximum(ST - K, 0) # <6> 20 | 21 | In [8]: C0 = math.exp(-r * T) * np.mean(hT) # <7> 22 | 23 | In [9]: print('Value of the European call option: {:5.3f}.'.format(C0)) # <8> 24 | Value of the European call option: 8.019. 25 | 26 | # end::MONTE_CARLO[] 27 | 28 | In [10]: %run bsm_mcs_euro.py 29 | Value of the European call option 7.989. 
30 | 31 | # tag::SPX_EXAMPLE[] 32 | In [11]: import numpy as np # <1> 33 | import pandas as pd # <1> 34 | from pylab import plt, mpl # <2> 35 | 36 | In [12]: plt.style.use('seaborn') # <2> 37 | mpl.rcParams['font.family'] = 'serif' # <2> 38 | %matplotlib inline 39 | 40 | In [13]: data = pd.read_csv('http://hilpisch.com/tr_eikon_eod_data.csv', 41 | index_col=0, parse_dates=True) # <3> 42 | data = pd.DataFrame(data['.SPX']) # <4> 43 | data.dropna(inplace=True) # <4> 44 | data.info() # <5> 45 | 46 | DatetimeIndex: 2138 entries, 2010-01-04 to 2018-06-29 47 | Data columns (total 1 columns): 48 | .SPX 2138 non-null float64 49 | dtypes: float64(1) 50 | memory usage: 33.4 KB 51 | 52 | In [14]: data['rets'] = np.log(data / data.shift(1)) # <6> 53 | data['vola'] = data['rets'].rolling(252).std() * np.sqrt(252) # <7> 54 | 55 | In [15]: data[['.SPX', 'vola']].plot(subplots=True, figsize=(10, 6)); # <8> 56 | plt.savefig('../images/spx_volatility.png') 57 | # end::SPX_EXAMPLE[] 58 | 59 | # tag::PARADIGM_1[] 60 | In [16]: import math 61 | loops = 2500000 62 | a = range(1, loops) 63 | def f(x): 64 | return 3 * math.log(x) + math.cos(x) ** 2 65 | %timeit r = [f(x) for x in a] 66 | 1.21 s ± 8.27 ms per loop (mean ± std. dev. of 7 runs, 1 loop each) 67 | 68 | # end::PARADIGM_1[] 69 | 70 | # tag::PARADIGM_2[] 71 | In [17]: import numpy as np 72 | a = np.arange(1, loops) 73 | %timeit r = 3 * np.log(a) + np.cos(a) ** 2 74 | 94.7 ms ± 1.55 ms per loop (mean ± std. dev. of 7 runs, 10 loops each) 75 | 76 | # end::PARADIGM_2[] 77 | 78 | # tag::PARADIGM_3[] 79 | In [18]: import numexpr as ne 80 | ne.set_num_threads(1) 81 | f = '3 * log(a) + cos(a) ** 2' 82 | %timeit r = ne.evaluate(f) 83 | 37.5 ms ± 936 µs per loop (mean ± std. dev. of 7 runs, 10 loops each) 84 | 85 | # end::PARADIGM_3[] 86 | 87 | # tag::PARADIGM_4[] 88 | In [19]: ne.set_num_threads(4) 89 | %timeit r = ne.evaluate(f) 90 | 12.8 ms ± 300 µs per loop (mean ± std. dev. 
of 7 runs, 100 loops each) 91 | 92 | # end::PARADIGM_4[] 93 | 94 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | 2 | The Python Quants
3 | 4 | # tpqad 5 | 6 | ## Asciidoctor with a Twist 7 | 8 | This repository provides a few (unpolished) helper scripts to write larger documents with Asciidoctor that import (Python) code from Jupyter Notebook semi-automatically. 9 | 10 | The approach introduced has been used, for example, to write the book Python for Finance (2nd ed., O'Reilly). See http://py4fi.tpq.io. 11 | 12 | The repository is authored and maintained by The Python Quants GmbH. © Dr. Yves J. Hilpisch. MIT License. 13 | 14 | ## Requirements 15 | 16 | The following requires a standard Python 3.x installation as well as the installation of Asciidoctor; see https://asciidoctor.org. 17 | 18 | ## Basic Idea 19 | 20 | The basic idea of the approach is to use Asciidoctor as the basis. Code can be written and stored in Jupyter Notebooks. Tags are added that later (after a parsing step) are used to include code snippets automatically into the Asciidoctor files. The same holds true for images/plots that are created during the execution of a Jupyter Notebook. 21 | 22 | ## Sources 23 | 24 | The files in the `files` folder are taken from earlier draft versions of **Python for Finance** (2nd ed., O'Reilly, http://py4fi.tpq.io). They have been modified for the simplified example to follow. 25 | 26 | ## Files and Steps 27 | 28 | To render the example files (see `files` folder) with Asciidoctor, do the following. 29 | 30 | **First**, execute the Jupyter Notebook (`code/why_python.ipynb`) and save it. Note the tags added in `raw` cells. Plots are saved automatically (in `images` folder). 31 | 32 | **Second**, on the shell in the `files` folder, parse the executed and saved Jupyter Notebook via 33 | 34 | python nb_parse.py code/why_python.ipynb 35 | 36 | This creates a text file with location and name `code/why_python.txt`. 37 | 38 | **Third**, on the shell, execute 39 | 40 | asciidoctor -a stem=latexmath py4fi.asciidoc 41 | 42 | to render a **HTML version**. 
43 | 44 | **Fourth**, on the shell, execute 45 | 46 | asciidoctor-pdf -r asciidoctor-mathematical -a pdf-style=basic-theme.yml py4fi.asciidoc 47 | rm *.png 48 | 49 | to render a **PDF version** and delete temporary `png` files. 50 | 51 | ## `nb_parse.py` 52 | 53 | The file `nb_parse.py` is at the core of the example and the approach presented. It has grown over the course of writing **Python for Finance** (2nd ed.). However, it has not been reengineered or optimized in any way. Therefore, use it with caution and perhaps take it as a starting point for your custom conversions. 54 | 55 | ## Example Output 56 | 57 | The following screenshot shows example output from the **HTML version**. 58 | 59 | 60 | 61 | The following screenshot shows example output from the **PDF version**. 62 | 63 | 64 | 65 | The Python Quants
66 | 67 | http://tpq.io | @dyjh | training@tpq.io 68 | -------------------------------------------------------------------------------- /tpqad.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": { 6 | "slideshow": { 7 | "slide_type": "slide" 8 | } 9 | }, 10 | "source": [ 11 | "\"The
" 12 | ] 13 | }, 14 | { 15 | "cell_type": "markdown", 16 | "metadata": {}, 17 | "source": [ 18 | "# tpqad" 19 | ] 20 | }, 21 | { 22 | "cell_type": "markdown", 23 | "metadata": {}, 24 | "source": [ 25 | "## Asciidoctor with a Twist" 26 | ] 27 | }, 28 | { 29 | "cell_type": "markdown", 30 | "metadata": {}, 31 | "source": [ 32 | "This repository provides a few (unpolished) helper scripts to write larger documents with Asciidoctor that import (Python) code from Jupyter Notebook semi-automatically.\n", 33 | "\n", 34 | "The approach introduced hase been used e.g. to write the book Python for Finance (2nd ed., O'Reilly). See http://py4fi.tpq.io.\n", 35 | "\n", 36 | "The repository is authored and maintained by The Python Quants GmbH. © Dr. Yves J. Hilpisch. MIT License." 37 | ] 38 | }, 39 | { 40 | "cell_type": "markdown", 41 | "metadata": {}, 42 | "source": [ 43 | "## Requirements" 44 | ] 45 | }, 46 | { 47 | "cell_type": "markdown", 48 | "metadata": {}, 49 | "source": [ 50 | "The following requires a standard Python 3.x installation as well as the installation of Asciidoctor — see https://asciidoctor.org." 51 | ] 52 | }, 53 | { 54 | "cell_type": "markdown", 55 | "metadata": {}, 56 | "source": [ 57 | "## Basic Idea" 58 | ] 59 | }, 60 | { 61 | "cell_type": "markdown", 62 | "metadata": {}, 63 | "source": [ 64 | "The basic idea of the approach is to use Asciidoctor as the basis. Codes can be written and stored in Jupyter Notebooks. Tags are added that later (after a parsing step) are used to include code snippets automatically into the Asciidoctor files. The same holds true for images/plots that are created during the execution of a Jupyter Notebook." 
65 | ] 66 | }, 67 | { 68 | "cell_type": "markdown", 69 | "metadata": {}, 70 | "source": [ 71 | "## Sources" 72 | ] 73 | }, 74 | { 75 | "cell_type": "markdown", 76 | "metadata": {}, 77 | "source": [ 78 | "The files in the `files` folder are taken from earlier draft versions of **Python for Finance** (2nd ed., O'Reilly, http://py4fi.tpq.io). They have been modified for the simplified example to follow." 79 | ] 80 | }, 81 | { 82 | "cell_type": "markdown", 83 | "metadata": {}, 84 | "source": [ 85 | "## Files and Steps" 86 | ] 87 | }, 88 | { 89 | "cell_type": "markdown", 90 | "metadata": {}, 91 | "source": [ 92 | "To render the example files (see `files` folder) with Asciidoctor, do the following.\n", 93 | "\n", 94 | "**First**, execute the Jupyter Notebook (`code/why_python.ipynb`) and save it. Note the tags added in `raw` cells. Plots are saved automatically (in `images` folder).\n", 95 | "\n", 96 | "**Second**, on the shell in the `files` folder, parse the executed and saved Jupyter Notebook via\n", 97 | "\n", 98 | " python nb_parse.py code/why_python.ipynb\n", 99 | " \n", 100 | "This creates a text file with location and name `code/why_python.txt`.\n", 101 | "\n", 102 | "**Third**, on the shell, execute\n", 103 | "\n", 104 | " asciidoctor -a stem=latexmath py4fi.asciidoc\n", 105 | " \n", 106 | "to render a **HTML version**.\n", 107 | "\n", 108 | "**Fourth**, on the shell, execute\n", 109 | "\n", 110 | " asciidoctor-pdf -r asciidoctor-mathematical -a pdf-style=basic-theme.yml py4fi.asciidoc\n", 111 | " rm *.png\n", 112 | " \n", 113 | "to render a **PDF version** and delete temporary `png` files." 114 | ] 115 | }, 116 | { 117 | "cell_type": "markdown", 118 | "metadata": {}, 119 | "source": [ 120 | "## `nb_parse.py`" 121 | ] 122 | }, 123 | { 124 | "cell_type": "markdown", 125 | "metadata": {}, 126 | "source": [ 127 | "The file `nb_parse.py` is at the core of the example and the approach presented. 
It has grown over the course of writing **Python for Finance** (2nd ed.). However, it has not been reengineered or optimized in any way. Therefore, use it with caution and perhaps take it as a starting point for your custom conversions." 128 | ] 129 | }, 130 | { 131 | "cell_type": "markdown", 132 | "metadata": {}, 133 | "source": [ 134 | "## Example Output" 135 | ] 136 | }, 137 | { 138 | "cell_type": "markdown", 139 | "metadata": {}, 140 | "source": [ 141 | "The following screenshot shows example output from the **HTML version**.\n", 142 | "\n", 143 | "" 144 | ] 145 | }, 146 | { 147 | "cell_type": "markdown", 148 | "metadata": {}, 149 | "source": [ 150 | "The following screenshot shows example output from the **PDF version**.\n", 151 | "\n", 152 | "" 153 | ] 154 | }, 155 | { 156 | "cell_type": "markdown", 157 | "metadata": { 158 | "slideshow": { 159 | "slide_type": "slide" 160 | } 161 | }, 162 | "source": [ 163 | "\"The
\n", 164 | "\n", 165 | "http://tpq.io | @dyjh | training@tpq.io" 166 | ] 167 | } 168 | ], 169 | "metadata": { 170 | "anaconda-cloud": {}, 171 | "kernelspec": { 172 | "display_name": "Python 3", 173 | "language": "python", 174 | "name": "python3" 175 | }, 176 | "language_info": { 177 | "codemirror_mode": { 178 | "name": "ipython", 179 | "version": 3 180 | }, 181 | "file_extension": ".py", 182 | "mimetype": "text/x-python", 183 | "name": "python", 184 | "nbconvert_exporter": "python", 185 | "pygments_lexer": "ipython3", 186 | "version": "3.6.7" 187 | } 188 | }, 189 | "nbformat": 4, 190 | "nbformat_minor": 1 191 | } 192 | -------------------------------------------------------------------------------- /files/nb_parse.py: -------------------------------------------------------------------------------- 1 | # 2 | # Script to Convert Jupyter Notebook 3 | # to IPython Style Text File 4 | # 5 | # (c) The Python Quants GmbH 6 | # WORK IN PROGRESS | DRAFT VERSION 7 | # FOR ILLUSTRATION PURPOSES ONLY 8 | import re 9 | import os 10 | import sys 11 | import json 12 | 13 | ansi_escape = re.compile(r'\x1B\[[0-?]*[ -/]*[@-~]') 14 | 15 | 16 | def parse(argv): 17 | if len(argv) < 1: 18 | raise ValueError('Please provide a file') 19 | path = argv[0] 20 | 21 | if not os.path.isfile(path): 22 | raise ValueError('Can not open %s. No such file or directory.' % path) 23 | 24 | parts = path.split('.') 25 | if len(parts) < 2 or len(parts) > 2 or parts[1] != 'ipynb': 26 | raise ValueError('Invalid filename') 27 | target = parts[0] + '.' 
+ 'txt' 28 | 29 | notebook = read_in(argv[0]) 30 | 31 | with open(target, 'w') as target_file: 32 | for cell in notebook['cells']: 33 | if cell['cell_type'] == 'raw': 34 | for l in cell['source']: 35 | target_file.write(l) 36 | if l.startswith('# end'): 37 | target_file.write('\n\n') 38 | else: 39 | target_file.write('\n') 40 | 41 | if cell['cell_type'] == 'code': 42 | if 'execution_count' in cell: 43 | execution_count = cell['execution_count'] 44 | else: 45 | execution_count = "" 46 | prefix = 'In [%s]: ' % execution_count 47 | len_prefix = len(prefix) 48 | code = '' 49 | first = 0 50 | for line in cell['source']: 51 | code = code + ' ' * len_prefix * first + line 52 | first = 1 53 | text_input = prefix + code 54 | target_file.write(text_input) 55 | if cell['outputs']: 56 | target_file.write('\n') 57 | else: 58 | target_file.write('\n\n') 59 | 60 | for output in cell['outputs']: 61 | if 'execution_count' in output: 62 | execution_count = output['execution_count'] 63 | else: 64 | execution_count = '' 65 | suffix = 'Out[%s]: ' % execution_count 66 | len_suffix = len(suffix) 67 | if 'data' in output and 'text/plain' in output['data'] \ 68 | and 'image/png' not in output['data']: 69 | result = "" 70 | first = 0 71 | lb = 0 72 | for line in output['data']['text/plain']: 73 | add_lines = list() 74 | if len(line) + len_suffix <= 81: 75 | add_lines.append(line) 76 | else: 77 | add_lines = break_long_lines(line, len_suffix) 78 | for l in add_lines: 79 | if len(add_lines) >= 2: 80 | lb = 1 81 | result += (' ' * len_suffix * first + l + 82 | lb * '\n') 83 | first = 1 84 | if len(add_lines) == 2: 85 | result = result[:-1] 86 | text_output = suffix + result 87 | target_file.write(text_output) 88 | target_file.write('\n\n') 89 | 90 | if 'name' in output: 91 | result = "" 92 | first = 0 93 | result = "" 94 | first = 0 95 | lb = 0 96 | for line in output['text']: 97 | add_lines = list() 98 | if len(line) + len_suffix <= 81: 99 | add_lines.append(line) 100 | else: 101 | add_lines 
= break_long_lines(line, len_suffix) 102 | lb = 1 103 | for l in add_lines: 104 | result += ' ' * (len_prefix + 0) * \ 105 | first + l + lb * '\n' 106 | first = 1 107 | if len(add_lines) == 2: 108 | result = result[:-1] 109 | lb = 0 110 | 111 | text_output = (len_prefix + 0) * ' ' + result 112 | target_file.write(text_output) 113 | target_file.write('\n') 114 | 115 | if 'ename' in output: 116 | result = "" 117 | first = 0 118 | target_file.write('\n') 119 | for line in output['traceback']: 120 | line = ansi_escape.sub('', line) 121 | result = (result + ' ' * len_prefix * first + 122 | line + '\n') 123 | first = 1 124 | text_output = len_prefix * ' ' + result 125 | target_file.write(text_output) 126 | target_file.write('\n') 127 | 128 | 129 | def break_long_lines(line, len_suffix): 130 | comma = False 131 | # line.replace('\n', '') 132 | if line[len_suffix + 1:].find(' ') >= 0: 133 | line_parts = line.split(' ') 134 | elif line[len_suffix + 1:].find(',') >= 0: 135 | line_parts = line.split(',') 136 | comma = True 137 | else: 138 | line_parts = [line[start:start + 70] for 139 | start in range(0, len(line) + 1, 70)] 140 | p_line = '' 141 | lines = list() 142 | splitter = ',' if comma else ' ' 143 | for l in line_parts: 144 | if len(p_line) + len(l) + len_suffix < 81: 145 | if len(p_line) == 0: 146 | p_line = l 147 | else: 148 | p_line = p_line + splitter + l 149 | else: 150 | lines.append(p_line + splitter) 151 | p_line = splitter + l 152 | lines.append(p_line) 153 | return lines 154 | 155 | 156 | def read_in(path): 157 | with open(path, 'r') as f: 158 | notebook = f.read() 159 | 160 | notebook = json.loads(notebook) 161 | return notebook 162 | 163 | 164 | if __name__ == '__main__': 165 | parse(sys.argv[1:]) 166 | import os 167 | fn = sys.argv[1].split('.')[0] + '.txt' 168 | os.rename(fn, fn + '~') 169 | with open(fn + '~', 'r') as f: 170 | with open(fn, 'w') as o: 171 | for line in f: 172 | if line.find('# plt.save') >= 0: 173 | pass 174 | else: 175 | o.write(line) 
176 | # os.remove(sys.argv[2] + '~') 177 | -------------------------------------------------------------------------------- /files/basic-theme.yml: -------------------------------------------------------------------------------- 1 | font: 2 | catalog: 3 | # Noto Serif supports Latin, Latin-1 Supplement, Latin Extended-A, Greek, Cyrillic, Vietnamese & an assortment of symbols 4 | Noto Serif: 5 | normal: notoserif-regular-subset.ttf 6 | bold: notoserif-bold-subset.ttf 7 | italic: notoserif-italic-subset.ttf 8 | bold_italic: notoserif-bold_italic-subset.ttf 9 | # M+ 1mn supports ASCII and the circled numbers used for conums 10 | M+ 1mn: 11 | normal: mplus1mn-regular-ascii-conums.ttf 12 | bold: mplus1mn-bold-ascii.ttf 13 | italic: mplus1mn-italic-ascii.ttf 14 | bold_italic: mplus1mn-bold_italic-ascii.ttf 15 | # M+ 1p supports Latin, Latin-1 Supplement, Latin Extended, Greek, Cyrillic, Vietnamese, Japanese & an assortment of symbols 16 | # It also provides arrows for ->, <-, => and <= replacements in case these glyphs are missing from font 17 | M+ 1p Fallback: 18 | normal: mplus1p-regular-fallback.ttf 19 | bold: mplus1p-regular-fallback.ttf 20 | italic: mplus1p-regular-fallback.ttf 21 | bold_italic: mplus1p-regular-fallback.ttf 22 | fallbacks: 23 | - M+ 1p Fallback 24 | page: 25 | background_color: ffffff 26 | layout: portrait 27 | margin: [0.5in, 0.67in, 0.67in, 0.67in] 28 | # margin_inner and margin_outer keys are used for recto/verso print margins when media=press 29 | margin_inner: 0.75in 30 | margin_outer: 0.59in 31 | size: A4 32 | base: 33 | align: justify 34 | # color as hex string (leading # is optional) 35 | font_color: 333333 36 | # color as RGB array 37 | #font_color: [51, 51, 51] 38 | # color as CMYK array (approximated) 39 | #font_color: [0, 0, 0, 0.92] 40 | #font_color: [0, 0, 0, 92%] 41 | font_family: Noto Serif 42 | # choose one of these font_size/line_height_length combinations 43 | #font_size: 14 44 | #line_height_length: 20 45 | #font_size: 11.25 46 | 
#line_height_length: 18 47 | #font_size: 11.2 48 | #line_height_length: 16 49 | font_size: 12 50 | #line_height_length: 15 51 | # correct line height for Noto Serif metrics 52 | line_height_length: 11 53 | #font_size: 11.25 54 | #line_height_length: 18 55 | line_height: $base_line_height_length / $base_font_size 56 | font_size_large: round($base_font_size * 1.25) 57 | font_size_small: round($base_font_size * 0.85) 58 | font_size_min: $base_font_size * 0.75 59 | font_style: normal 60 | border_color: eeeeee 61 | border_radius: 4 62 | border_width: 0.5 63 | # FIXME vertical_rhythm is weird; we should think in terms of ems 64 | #vertical_rhythm: $base_line_height_length * 2 / 3 65 | # correct line height for Noto Serif metrics (comes with built-in line height) 66 | vertical_rhythm: $base_line_height_length 67 | horizontal_rhythm: $base_line_height_length 68 | # QUESTION should vertical_spacing be block_spacing instead? 69 | vertical_spacing: $vertical_rhythm 70 | link: 71 | font_color: 428bca 72 | # literal is currently used for inline monospaced in prose and table cells 73 | literal: 74 | font_color: b12146 75 | font_family: M+ 1mn 76 | menu_caret_content: " \u203a " 77 | heading: 78 | align: left 79 | #font_color: 181818 80 | font_color: $base_font_color 81 | font_family: $base_font_family 82 | font_style: bold 83 | # h1 is used for part titles (book doctype only) 84 | h1_font_size: floor($base_font_size * 2.6) 85 | # h2 is used for chapter titles (book doctype only) 86 | h2_font_size: floor($base_font_size * 2.15) 87 | h3_font_size: round($base_font_size * 1.7) 88 | h4_font_size: $base_font_size_large 89 | h5_font_size: $base_font_size 90 | h6_font_size: $base_font_size_small 91 | #line_height: 1.4 92 | # correct line height for Noto Serif metrics (comes with built-in line height) 93 | line_height: 1 94 | margin_top: $vertical_rhythm * 0.4 95 | margin_bottom: $vertical_rhythm * 0.9 96 | title_page: 97 | align: right 98 | logo: 99 | top: 10% 100 | title: 101 | top: 
55% 102 | font_size: $heading_h1_font_size 103 | font_color: 999999 104 | line_height: 0.9 105 | subtitle: 106 | font_size: $heading_h3_font_size 107 | font_style: bold_italic 108 | line_height: 1 109 | authors: 110 | margin_top: $base_font_size * 1.25 111 | font_size: $base_font_size_large 112 | font_color: 181818 113 | revision: 114 | margin_top: $base_font_size * 1.25 115 | block: 116 | margin_top: 0 117 | margin_bottom: $vertical_rhythm 118 | caption: 119 | align: left 120 | font_size: $base_font_size * 0.95 121 | font_style: italic 122 | # FIXME perhaps set line_height instead of / in addition to margins? 123 | margin_inside: $vertical_rhythm / 3 124 | #margin_inside: $vertical_rhythm / 4 125 | margin_outside: 0 126 | lead: 127 | font_size: $base_font_size_large 128 | line_height: 1.4 129 | abstract: 130 | font_color: 5c6266 131 | font_size: $lead_font_size 132 | line_height: $lead_line_height 133 | font_style: italic 134 | first_line_font_style: bold 135 | title: 136 | align: center 137 | font_color: $heading_font_color 138 | font_family: $heading_font_family 139 | font_size: $heading_h4_font_size 140 | font_style: $heading_font_style 141 | admonition: 142 | column_rule_color: $base_border_color 143 | column_rule_width: $base_border_width 144 | padding: [0, $horizontal_rhythm, 0, $horizontal_rhythm] 145 | #icon: 146 | # tip: 147 | # name: fa-lightbulb-o 148 | # stroke_color: 111111 149 | # size: 24 150 | label: 151 | text_transform: uppercase 152 | font_style: bold 153 | blockquote: 154 | font_color: $base_font_color 155 | font_size: $base_font_size - 1 156 | line_height: $base_line_height - 2 157 | border_color: $base_border_color 158 | border_width: 5 159 | # FIXME disable negative padding bottom once margin collapsing is implemented 160 | padding: [0, $horizontal_rhythm, $block_margin_bottom * -0.75, $horizontal_rhythm + $blockquote_border_width / 2] 161 | cite_font_size: $base_font_size_small 162 | cite_font_color: 999999 163 | # code is used for source 
blocks (perhaps change to source or listing?) 164 | code: 165 | font_color: $base_font_color 166 | font_family: $literal_font_family 167 | font_size: $base_font_size - 1 # ceil($base_font_size) 168 | padding: $code_font_size 169 | line_height: 1.25 170 | # line_gap is an experimental property to control how a background color is applied to an inline block element 171 | line_gap: 3.8 172 | background_color: f5f5f5 173 | border_color: cccccc 174 | border_radius: $base_border_radius 175 | border_width: 0.75 176 | conum: 177 | font_family: M+ 1mn 178 | font_color: $literal_font_color 179 | font_size: $base_font_size 180 | line_height: 4 / 3 181 | example: 182 | border_color: $base_border_color 183 | border_radius: $base_border_radius 184 | border_width: 0.75 185 | background_color: transparent 186 | # FIXME reenable padding bottom once margin collapsing is implemented 187 | padding: [$vertical_rhythm, $horizontal_rhythm, 0, $horizontal_rhythm] 188 | image: 189 | align: left 190 | prose: 191 | margin_top: $block_margin_top 192 | margin_bottom: $block_margin_bottom 193 | sidebar: 194 | border_color: $page_background_color 195 | border_radius: $base_border_radius 196 | border_width: $base_border_width 197 | background_color: eeeeee 198 | # FIXME reenable padding bottom once margin collapsing is implemented 199 | padding: [$vertical_rhythm, $vertical_rhythm * 1.25, 0, $vertical_rhythm * 1.25] 200 | title: 201 | align: center 202 | font_color: $heading_font_color 203 | font_family: $heading_font_family 204 | font_size: $heading_h4_font_size 205 | font_style: $heading_font_style 206 | thematic_break: 207 | border_color: $base_border_color 208 | border_style: solid 209 | border_width: $base_border_width 210 | margin_top: $vertical_rhythm * 0.5 211 | margin_bottom: $vertical_rhythm * 1.5 212 | description_list: 213 | term_font_style: italic 214 | term_spacing: $vertical_rhythm / 4 215 | description_indent: $horizontal_rhythm * 1.25 216 | outline_list: 217 | indent: 
$horizontal_rhythm * 1.5 218 | #marker_font_color: 404040 219 | # NOTE outline_list_item_spacing applies to list items that do not have complex content 220 | item_spacing: $vertical_rhythm / 2 221 | table: 222 | background_color: $page_background_color 223 | #head_background_color: 224 | #head_font_color: $base_font_color 225 | head_font_style: bold 226 | even_row_background_color: f9f9f9 227 | #odd_row_background_color: 228 | foot_background_color: f0f0f0 229 | border_color: dddddd 230 | border_width: $base_border_width 231 | cell_padding: 3 232 | toc: 233 | indent: $horizontal_rhythm 234 | line_height: 1.4 235 | dot_leader: 236 | #content: ". " 237 | font_color: a9a9a9 238 | #levels: 2 3 239 | # NOTE in addition to footer, header is also supported 240 | footer: 241 | font_size: $base_font_size_small 242 | # NOTE if background_color is set, background and border will span width of page 243 | border_color: dddddd 244 | border_width: 0.25 245 | height: $base_line_height_length * 2.5 246 | line_height: 1 247 | padding: [$base_line_height_length / 2, 1, 0, 1] 248 | vertical_align: top 249 | #image_vertical_align: or 250 | # additional attributes for content: 251 | # * {page-count} 252 | # * {page-number} 253 | # * {document-title} 254 | # * {document-subtitle} 255 | # * {chapter-title} 256 | # * {section-title} 257 | # * {section-or-chapter-title} 258 | recto: 259 | #columns: "<50% =0% >50%" 260 | right: 261 | content: '{page-number}' 262 | #content: '{section-or-chapter-title} | {page-number}' 263 | #content: '{document-title} | {page-number}' 264 | #center: 265 | # content: '{page-number}' 266 | verso: 267 | #columns: $footer_recto_columns 268 | left: 269 | content: $footer_recto_right_content 270 | #content: '{page-number} | {chapter-title}' 271 | #center: 272 | # content: '{page-number}' -------------------------------------------------------------------------------- /files/why_python.asciidoc: 
-------------------------------------------------------------------------------- 1 | 2 | [[why_python_for_finance]] 3 | == Why Python for Finance 4 | 5 | [role="align_me_right"] 6 | [quote, Hugo Banziger] 7 | ____ 8 | Banks are essentially technology firms. 9 | ____ 10 | 11 | 12 | 13 | === Finance and Python Syntax 14 | 15 | Most people who take their first steps with Python in a finance context often start by attacking an algorithmic problem. This is similar to a scientist who, for example, wants to solve a differential equation, evaluate an integral or simply visualize some data. In general, at this stage, little thought is spent on topics like a formal development process, testing, documentation or deployment. However, this seems to be precisely the stage when people fall in love with Python. A major reason for this might be that Python syntax is generally quite close to the mathematical syntax used to describe scientific problems or financial algorithms. 16 | 17 | This aspect can be illustrated by a financial algorithm, namely the valuation of a European call option by Monte Carlo simulation. The example considers a Black-Scholes-Merton (BSM) setup in which the option's underlying risk factor follows a geometric Brownian motion. 18 | 19 | Assume the following numerical _parameter values_ for the valuation: 20 | 21 | * initial stock index level latexmath:[S_0 = 100] 22 | * strike price of the European call option latexmath:[K = 105] 23 | * time-to-maturity latexmath:[T = 1] year 24 | * constant, riskless short rate latexmath:[r = 0.05] 25 | * constant volatility latexmath:[\sigma = 0.2] 26 | 27 | 28 | In the BSM model, the index level at maturity is a random variable, given by <<bsm_rv>> with latexmath:[z] being a standard normally distributed random variable. 
29 | 30 | [[bsm_rv]] 31 | [latexmath] 32 | .Black-Scholes-Merton (1973) index level at maturity 33 | ++++ 34 | \begin{equation} 35 | S_T = S_0 \exp \left( \left( r - \frac{1}{2} \sigma^2 \right) T + \sigma \sqrt{T} z \right) 36 | \end{equation} 37 | ++++ 38 | 39 | The following is an _algorithmic description_ of the Monte Carlo valuation procedure: 40 | 41 | . draw latexmath:[I] pseudo-random numbers latexmath:[z(i), i \in \{1, 2, ..., I \}], from the standard normal distribution 42 | . calculate all resulting index levels at maturity latexmath:[S_T(i)] for given latexmath:[z(i)] and <<bsm_rv>> 43 | . calculate all intrinsic values (payoffs) of the option at maturity as latexmath:[h_T(i)= \max \left(S_T(i) - K, 0 \right)] 44 | . estimate the option present value via the Monte Carlo estimator as given in <<bsm_mcs_est>> 45 | 46 | [[bsm_mcs_est]] 47 | [latexmath] 48 | .Monte Carlo estimator for European option 49 | ++++ 50 | \begin{equation} 51 | C_0 \approx e^{-rT} \frac{1}{I} \sum_{i=1}^{I} h_T(i) 52 | \end{equation} 53 | ++++ 54 | 55 | This problem and algorithm are now translated into Python code. The code below implements the required steps. 56 | 57 | [source,python] 58 | ---- 59 | include::code/why_python.txt[tags=MONTE_CARLO] 60 | ---- 61 | <1> `NumPy` is used here as the main package. 62 | <2> The model and simulation parameter values are defined. 63 | <3> The seed value for the random number generator is fixed. 64 | <4> Standard normally distributed random numbers are drawn. 65 | <5> End-of-period values are simulated. 66 | <6> The option payoffs at maturity are calculated. 67 | <7> The Monte Carlo estimator is evaluated. 68 | <8> The resulting value estimate is printed. 
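As a plausibility check on the Monte Carlo estimate, the simulated value can be compared with the analytical Black-Scholes-Merton call price. The following is a minimal sketch that uses only the standard library. The names `norm_cdf` and `bsm_call_value` are assumptions made for illustration and are not part of the chapter's code files.

```python
import math


def norm_cdf(x):
    """Standard normal CDF via the error function (hypothetical helper)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bsm_call_value(S0, K, T, r, sigma):
    """Analytical BSM value of a European call option (illustrative sketch)."""
    d1 = (math.log(S0 / K) + (r + 0.5 * sigma ** 2) * T) / (sigma * math.sqrt(T))
    d2 = d1 - sigma * math.sqrt(T)
    # present value: S0 * N(d1) minus the discounted strike times N(d2)
    return S0 * norm_cdf(d1) - K * math.exp(-r * T) * norm_cdf(d2)


# same parameter values as in the Monte Carlo example
C0_exact = bsm_call_value(S0=100., K=105., T=1.0, r=0.05, sigma=0.2)
print('Analytical value of the European call option: {:5.3f}.'.format(C0_exact))
```

With 100,000 simulated paths, the Monte Carlo estimator should typically land within a few cents of this benchmark of roughly 8.02, although the exact deviation depends on the random numbers drawn.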
69 | 70 | Three aspects are worth highlighting: 71 | 72 | syntax:: 73 | the Python syntax is indeed quite close to the mathematical syntax, e.g., when it comes to the parameter value assignments 74 | translation:: 75 | every mathematical and/or algorithmic statement can generally be translated into a _single_ line of Python code 76 | vectorization:: 77 | one of the strengths of `NumPy` is the compact, vectorized syntax, e.g., allowing for 100,000 calculations within a single line of code 78 | 79 | This code can be used in an interactive environment like `IPython` or `Jupyter Notebook`. However, code that is meant to be reused regularly typically gets organized in so-called _modules_ (or _scripts_), which are single Python files (technically text files) with the suffix `.py`. Such a module could in this case look like <<bsm_mcs_euro>> and could be saved as a file named `bsm_mcs_euro.py`. 80 | 81 | 82 | [[bsm_mcs_euro]] 83 | .Monte Carlo valuation of European call option 84 | ==== 85 | [source, python] 86 | ---- 87 | include::code/bsm_mcs_euro.py[] 88 | ---- 89 | ==== 90 | 91 | The algorithmic example in this subsection illustrates that Python, with its very syntax, is well suited to complement the classic duo of scientific languages, English and mathematics. It seems that adding `Python` to the set of scientific languages makes it more well-rounded. One then has: 92 | 93 | * *English* for _writing, talking_ about scientific and financial problems, etc. 94 | * *Mathematics* for _concisely, exactly describing and modeling_ abstract aspects, algorithms, complex quantities, etc. 95 | * *Python* for _technically modeling and implementing_ abstract aspects, algorithms, complex quantities, etc. 96 | 97 | .Mathematics and Python Syntax 98 | [TIP] 99 | ==== 100 | There is hardly any programming language that comes as close to mathematical syntax as Python. 
Numerical algorithms are therefore in general straightforward to translate from the mathematical representation into the `Pythonic` implementation. This makes prototyping, development and code maintenance in finance quite efficient with Python. 101 | ==== 102 | 103 | In some areas, it is common practice to use _pseudo-code_ and thereby to introduce a fourth language family member. The role of pseudo-code is to represent, for example, financial algorithms in a more technical fashion that is both still close to the mathematical representation and already quite close to the technical implementation. In addition to the algorithm itself, pseudo-code takes into account how computers work in principle. 104 | 105 | This practice generally stems from the fact that with most (compiled) programming languages the technical implementation is quite "`far away`" from its formal, mathematical representation. The majority of programming languages make it necessary to include so many elements that are only technically required that it is hard to see the equivalence between the mathematics and the code. 106 | 107 | Nowadays, Python is often used in a _pseudo-code way_ since its syntax is almost analogous to the mathematics and since the technical "`overhead`" is kept to a minimum. This is accomplished by a number of high-level concepts embodied in the language that not only have their advantages but also come in general with risks and/or other costs. However, it is safe to say that with Python you can, whenever the need arises, follow the same strict implementation and coding practices that other languages might require from the outset. In that sense, Python can provide the best of both worlds: _high-level abstraction_ and _rigorous implementation_. 
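To make the point about combining high-level abstraction with rigorous implementation more concrete, the valuation can also be wrapped in a function with type hints, input validation and a dedicated, seeded random number generator. This is only a sketch under stated assumptions: the function name `mcs_call_value`, its signature and the use of the standard library's `random` module instead of `NumPy` are choices made here for illustration.

```python
import math
import random


def mcs_call_value(S0: float, K: float, T: float, r: float, sigma: float,
                   I: int = 100_000, seed: int = 1000) -> float:
    """Monte Carlo value of a European call option in the BSM model.

    Hypothetical helper; pure Python and therefore slower than a
    vectorized NumPy version.
    """
    if min(S0, K, T, sigma) <= 0:
        raise ValueError('S0, K, T and sigma must be positive.')
    if I < 1:
        raise ValueError('I must be a positive integer.')
    rng = random.Random(seed)  # dedicated generator for reproducibility
    drift = (r - 0.5 * sigma ** 2) * T
    vol = sigma * math.sqrt(T)
    payoff_sum = 0.0
    for _ in range(I):
        # index level at maturity for one standard normal draw
        ST = S0 * math.exp(drift + vol * rng.gauss(0.0, 1.0))
        payoff_sum += max(ST - K, 0.0)  # payoff at maturity
    return math.exp(-r * T) * payoff_sum / I  # discounted mean payoff


C0 = mcs_call_value(S0=100., K=105., T=1.0, r=0.05, sigma=0.2)
print('Value of the European call option: {:5.3f}.'.format(C0))
```

Because the standard library generator produces a different random number stream than `NumPy`, the printed estimate will not match the notebook output digit for digit, even with the same seed value.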
108 | 109 | 110 | === Efficiency and Productivity Through Python 111 | 112 | At a high level, benefits from using Python can be measured in three dimensions: 113 | 114 | efficiency:: 115 | how can Python help in getting results faster, in saving costs, and in saving time? 116 | productivity:: 117 | how can Python help in getting more done with the same resources (people, assets, etc.)? 118 | quality:: 119 | what does Python allow one to do that alternative technologies do not allow for? 120 | 121 | A discussion of these aspects can by nature not be exhaustive. However, it can highlight some arguments as a starting point. 122 | 123 | ==== Shorter Time-to-Results 124 | 125 | A field where the efficiency of Python becomes quite obvious is interactive data analytics. This is a field that benefits tremendously from such powerful tools as `IPython`, `Jupyter Notebook` and packages like `pandas`. 126 | 127 | Consider a finance student, writing her master's thesis and interested in S&P 500 index values. She wants to analyze historical index levels for, say, a few years to see how the volatility of the index has fluctuated over time. She wants to find evidence that volatility, in contrast to some typical model assumptions, fluctuates over time and is far from being constant. The results should also be visualized. She mainly has to do the following: 128 | 129 | * retrieve index level data from the web 130 | * calculate the annualized rolling standard deviation of the log returns (volatility) 131 | * plot the index level data and the volatility results 132 | 133 | These tasks are complex enough that not too long ago one would have considered them to be something for professional financial analysts only. Today, even the finance student can easily cope with such problems. The code below shows how exactly this works -- without worrying about syntax details at this stage (everything is explained in detail in subsequent chapters). 
134 | 135 | [source,python] 136 | ---- 137 | include::code/why_python.txt[tags=SPX_EXAMPLE] 138 | ---- 139 | <1> This imports `NumPy` and `pandas`. 140 | <2> This imports `matplotlib` and configures the plotting style and approach for `Jupyter`. 141 | <3> `pd.read_csv()` allows the retrieval of remotely or locally stored data sets in `CSV` form. 142 | <4> A sub-set of the data is picked and `NaN` ("not a number") values eliminated. 143 | <5> This shows some meta-information about the data set. 144 | <6> The log returns are calculated in vectorized fashion ("`no looping`" on the Python level). 145 | <7> The rolling, annualized volatility is derived. 146 | <8> This finally plots the two time series. 147 | 148 | <<spx_vola>> shows the graphical result of this brief interactive session. It can be considered almost amazing that a few lines of code suffice to implement three rather complex tasks typically encountered in financial analytics: data gathering, complex and repeated mathematical calculations as well as visualization of the results. The example illustrates that `pandas` makes working with whole time series almost as simple as doing mathematical operations on floating-point numbers. 149 | 150 | [[spx_vola]] 151 | .S&P 500 closing values and annualized volatility 152 | image::images/spx_volatility.png[] 153 | 154 | Translated to a professional finance context, the example implies that financial analysts can -- when applying the right Python tools and packages that provide high-level abstractions -- focus on their own domain and not on the technical intricacies. Analysts can react faster, providing valuable insights almost in real-time and making sure they are one step ahead of the competition. This example of _increased efficiency_ can easily translate into measurable bottom-line effects. 155 | 156 | ==== Ensuring High Performance 157 | 158 | In general, it is accepted that Python has a rather concise syntax and that it is relatively efficient to code with. 
However, due to the very nature of Python being an interpreted language, the _prejudice_ persists that Python is often too slow for compute-intensive tasks in finance. Indeed, depending on the specific implementation approach, Python can be really slow. But it _does not have to be slow_ -- it can be highly performant in almost any application area. In principle, one can distinguish at least three different strategies for better performance: 159 | 160 | idioms and paradigms:: 161 | in general, many different ways can lead to the same result in Python, but sometimes with rather different performance characteristics; "`simply`" choosing the right way (e.g., a specific implementation approach, such as the judicious use of data structures, avoiding loops through vectorization or the use of a specific package such as `pandas`) can improve results significantly 162 | compiling:: 163 | nowadays, there are several performance packages available that provide compiled versions of important functions or that compile Python code statically or dynamically (at runtime or call time) to machine code, which can make such functions orders of magnitude faster than pure Python code; popular ones are `Cython` and `Numba` 164 | parallelization:: 165 | many computational tasks, in particular in finance, can significantly benefit from parallel execution; this is nothing special to Python but something that can easily be accomplished with it 166 | 167 | [TIP] 168 | .Performance Computing with Python 169 | ==== 170 | Python per se is not a high-performance computing technology. However, Python has developed into an ideal platform to access current performance technologies. In that sense, Python has become something like a _glue language_ for performance computing technologies. 171 | ==== 172 | 173 | Later chapters illustrate all three strategies in detail. This subsection sticks to a simple, but still realistic, example that touches upon all three strategies. 
A quite common task in financial analytics is to evaluate complex mathematical expressions on large arrays of numbers. To this end, Python itself provides everything needed. 174 | 175 | [source,python] 176 | ---- 177 | include::code/why_python.txt[tags=PARADIGM_1] 178 | ---- 179 | 180 | The Python interpreter needs about 1.2 seconds in this case to evaluate the function `f` roughly 2,500,000 times. The same task can be implemented using `NumPy`, which provides optimized (i.e., _pre-compiled_) functions to handle such array-based operations. 181 | 182 | [source,python] 183 | ---- 184 | include::code/why_python.txt[tags=PARADIGM_2] 185 | ---- 186 | 187 | Using `NumPy` considerably reduces the execution time to about 95 milliseconds. However, there is even a package specifically dedicated to this kind of task. It is called `numexpr`, for "`numerical expressions.`" It _compiles_ the expression to improve upon the performance of the general `NumPy` functionality by, for example, avoiding in-memory copies of `ndarray` objects along the way. 188 | 189 | [source,python] 190 | ---- 191 | include::code/why_python.txt[tags=PARADIGM_3] 192 | ---- 193 | 194 | Using this more specialized approach further reduces execution time to about 40 milliseconds. However, `numexpr` also has built-in capabilities to parallelize the execution of the respective operation. This allows us to use multiple threads of a CPU. 195 | 196 | [source,python] 197 | ---- 198 | include::code/why_python.txt[tags=PARADIGM_4] 199 | ---- 200 | 201 | Parallelization brings execution time further down to below 15 milliseconds in this case, with four threads utilized. Overall, this is a performance improvement of more than 90 times. Note, in particular, that this kind of improvement is possible without altering the basic problem/algorithm and without knowing any detail about compiling or parallelization approaches. The capabilities are accessible from a high level even by non-experts. 
However, one has to be aware, of course, of which capabilities and options exist. 202 | 203 | The example shows that Python provides a number of options to make more out of existing resources -- i.e., to _increase productivity_. With the parallel approach, about three times as many calculations can be accomplished in the same amount of time as compared to the sequential approach -- in this case simply by telling Python to use multiple available CPU threads instead of just one. 204 | 205 | 206 | === Conclusions 207 | 208 | Python as a language -- and even more so as an ecosystem -- is an ideal technological framework for the financial industry as a whole and the individual working in finance alike. It is characterized by a number of benefits, like an elegant syntax, efficient development approaches and usability for prototyping _as well as_ production. With its large number of available packages, libraries and tools, Python seems to have answers to most questions raised by recent developments in the financial industry in terms of analytics, data volumes and frequency, compliance and regulation, as well as technology itself. It has the potential to provide a _single, powerful, consistent framework_ with which to streamline end-to-end development and production efforts even across larger financial institutions. 209 | 210 | In addition, Python has become the programming language of choice for _artificial intelligence_ in general and _machine and deep learning_ in particular. Python is therefore the right language both for _data-driven finance_ and for _AI-first finance_, two recent trends that are about to reshape finance and the financial industry in fundamental ways. 211 | 212 | 213 | === Further Resources 214 | 215 | The following books cover several aspects only touched upon in this chapter in more detail (e.g. 
Python tools, derivatives analytics, machine learning in general, machine learning in finance): 216 | 217 | * Hilpisch, Yves (2015): _Derivatives Analytics with Python_. Wiley Finance, Chichester, England. http://dawp.tpq.io[]. 218 | * López de Prado, Marcos (2018): _Advances in Financial Machine Learning_. John Wiley & Sons, Hoboken. 219 | * VanderPlas, Jake (2016): _Python Data Science Handbook_. O'Reilly, Beijing et al. 220 | 221 | When it comes to algorithmic trading, the author's company offers a range of online training programs that focus on Python and other tools and techniques required in this rapidly growing field: 222 | 223 | * http://pyalgo.tpq.io[] -- online training course 224 | * http://certificate.tpq.io[] -- online training program 225 | 226 | Sources referenced in this chapter are, among others, the following: 227 | 228 | * Ding, Cubillas (2010): "`Optimizing the OTC Pricing and Valuation Infrastructure.`" _Celent study_. 229 | * Lewis, Michael (2014): _Flash Boys_. W. W. Norton & Company, New York. 230 | * Patterson, Scott (2010): _The Quants._ Crown Business, New York. 231 | -------------------------------------------------------------------------------- /files/code/why_python.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "\"The
" 8 | ] 9 | }, 10 | { 11 | "cell_type": "markdown", 12 | "metadata": {}, 13 | "source": [ 14 | "# Why Python for Finance?" 15 | ] 16 | }, 17 | { 18 | "cell_type": "markdown", 19 | "metadata": {}, 20 | "source": [ 21 | "## Finance and Python Syntax" 22 | ] 23 | }, 24 | { 25 | "cell_type": "raw", 26 | "metadata": {}, 27 | "source": [ 28 | "# tag::MONTE_CARLO[]" 29 | ] 30 | }, 31 | { 32 | "cell_type": "code", 33 | "execution_count": 1, 34 | "metadata": { 35 | "uuid": "a95c7301-39f7-4d51-937a-334f051c2d9e" 36 | }, 37 | "outputs": [], 38 | "source": [ 39 | "import math\n", 40 | "import numpy as np # <1>" 41 | ] 42 | }, 43 | { 44 | "cell_type": "code", 45 | "execution_count": 2, 46 | "metadata": { 47 | "uuid": "1447b7bb-ed26-4c0f-9e0f-222dfd5d0c9b" 48 | }, 49 | "outputs": [], 50 | "source": [ 51 | "S0 = 100. # <2>\n", 52 | "K = 105. # <2>\n", 53 | "T = 1.0 # <2>\n", 54 | "r = 0.05 # <2>\n", 55 | "sigma = 0.2 # <2>" 56 | ] 57 | }, 58 | { 59 | "cell_type": "code", 60 | "execution_count": 3, 61 | "metadata": { 62 | "uuid": "a95c7301-39f7-4d51-937a-334f051c2d9e" 63 | }, 64 | "outputs": [], 65 | "source": [ 66 | "I = 100000 # <2>" 67 | ] 68 | }, 69 | { 70 | "cell_type": "code", 71 | "execution_count": 4, 72 | "metadata": { 73 | "uuid": "a95c7301-39f7-4d51-937a-334f051c2d9e" 74 | }, 75 | "outputs": [], 76 | "source": [ 77 | "np.random.seed(1000) # <3>" 78 | ] 79 | }, 80 | { 81 | "cell_type": "code", 82 | "execution_count": 5, 83 | "metadata": { 84 | "uuid": "a95c7301-39f7-4d51-937a-334f051c2d9e" 85 | }, 86 | "outputs": [], 87 | "source": [ 88 | "z = np.random.standard_normal(I) # <4>" 89 | ] 90 | }, 91 | { 92 | "cell_type": "code", 93 | "execution_count": 6, 94 | "metadata": { 95 | "uuid": "a95c7301-39f7-4d51-937a-334f051c2d9e" 96 | }, 97 | "outputs": [], 98 | "source": [ 99 | "ST = S0 * np.exp((r - sigma ** 2 / 2) * T + sigma * math.sqrt(T) * z) # <5>" 100 | ] 101 | }, 102 | { 103 | "cell_type": "code", 104 | "execution_count": 7, 105 | "metadata": { 106 | "uuid": 
"a95c7301-39f7-4d51-937a-334f051c2d9e" 107 | }, 108 | "outputs": [], 109 | "source": [ 110 | "hT = np.maximum(ST - K, 0) # <6>" 111 | ] 112 | }, 113 | { 114 | "cell_type": "code", 115 | "execution_count": 8, 116 | "metadata": { 117 | "uuid": "a95c7301-39f7-4d51-937a-334f051c2d9e" 118 | }, 119 | "outputs": [], 120 | "source": [ 121 | "C0 = math.exp(-r * T) * np.mean(hT) # <7>" 122 | ] 123 | }, 124 | { 125 | "cell_type": "code", 126 | "execution_count": 9, 127 | "metadata": { 128 | "uuid": "84aab05d-40de-4ef1-b3a8-eb08f86e1662" 129 | }, 130 | "outputs": [ 131 | { 132 | "name": "stdout", 133 | "output_type": "stream", 134 | "text": [ 135 | "Value of the European call option: 8.019.\n" 136 | ] 137 | } 138 | ], 139 | "source": [ 140 | "print('Value of the European call option: {:5.3f}.'.format(C0)) # <8>" 141 | ] 142 | }, 143 | { 144 | "cell_type": "raw", 145 | "metadata": {}, 146 | "source": [ 147 | "# end::MONTE_CARLO[]" 148 | ] 149 | }, 150 | { 151 | "cell_type": "code", 152 | "execution_count": 10, 153 | "metadata": {}, 154 | "outputs": [ 155 | { 156 | "name": "stdout", 157 | "output_type": "stream", 158 | "text": [ 159 | "Value of the European call option 7.989.\n" 160 | ] 161 | } 162 | ], 163 | "source": [ 164 | "%run bsm_mcs_euro.py" 165 | ] 166 | }, 167 | { 168 | "cell_type": "markdown", 169 | "metadata": {}, 170 | "source": [ 171 | "## Time-to-Results" 172 | ] 173 | }, 174 | { 175 | "cell_type": "raw", 176 | "metadata": {}, 177 | "source": [ 178 | "# tag::SPX_EXAMPLE[]" 179 | ] 180 | }, 181 | { 182 | "cell_type": "code", 183 | "execution_count": 11, 184 | "metadata": { 185 | "uuid": "e16924db-8402-4bcb-a9c7-d5753346c72a" 186 | }, 187 | "outputs": [], 188 | "source": [ 189 | "import numpy as np # <1>\n", 190 | "import pandas as pd # <1>\n", 191 | "from pylab import plt, mpl # <2>" 192 | ] 193 | }, 194 | { 195 | "cell_type": "code", 196 | "execution_count": 12, 197 | "metadata": {}, 198 | "outputs": [], 199 | "source": [ 200 | "plt.style.use('seaborn') # <2>\n", 
201 | "mpl.rcParams['font.family'] = 'serif' # <2>\n", 202 | "%matplotlib inline" 203 | ] 204 | }, 205 | { 206 | "cell_type": "code", 207 | "execution_count": 13, 208 | "metadata": { 209 | "uuid": "22071d72-094b-4b51-aa39-cfa793d58623" 210 | }, 211 | "outputs": [ 212 | { 213 | "name": "stdout", 214 | "output_type": "stream", 215 | "text": [ 216 | "\n", 217 | "DatetimeIndex: 2138 entries, 2010-01-04 to 2018-06-29\n", 218 | "Data columns (total 1 columns):\n", 219 | ".SPX 2138 non-null float64\n", 220 | "dtypes: float64(1)\n", 221 | "memory usage: 33.4 KB\n" 222 | ] 223 | } 224 | ], 225 | "source": [ 226 | "data = pd.read_csv('http://hilpisch.com/tr_eikon_eod_data.csv',\n", 227 | " index_col=0, parse_dates=True) # <3>\n", 228 | "data = pd.DataFrame(data['.SPX']) # <4>\n", 229 | "data.dropna(inplace=True) # <4>\n", 230 | "data.info() # <5>" 231 | ] 232 | }, 233 | { 234 | "cell_type": "code", 235 | "execution_count": 14, 236 | "metadata": { 237 | "uuid": "304b2f50-5b81-4e88-8fa7-ea69995cf2fe" 238 | }, 239 | "outputs": [], 240 | "source": [ 241 | "data['rets'] = np.log(data / data.shift(1)) # <6>\n", 242 | "data['vola'] = data['rets'].rolling(252).std() * np.sqrt(252) # <7>" 243 | ] 244 | }, 245 | { 246 | "cell_type": "code", 247 | "execution_count": 15, 248 | "metadata": { 249 | "uuid": "b4e61939-c7fe-4a1e-a8c9-6fd755edcf86" 250 | }, 251 | "outputs": [ 252 | { 253 | "data": { 254 | "image/png": "\n", 255 | "text/plain": [ 256 | "" 257 | ] 258 | }, 259 | "metadata": {}, 260 | "output_type": "display_data" 261 | } 262 | ], 263 | "source": [ 264 | "data[['.SPX', 'vola']].plot(subplots=True, figsize=(10, 6)); # <8>\n", 265 | "plt.savefig('../images/spx_volatility.png')" 266 | ] 267 | }, 268 | { 269 | "cell_type": "raw", 270 | "metadata": {}, 271 | "source": [ 272 | "# end::SPX_EXAMPLE[]" 273 | ] 274 | }, 275 | { 276 | "cell_type": "markdown", 277 | "metadata": {}, 278 | "source": [ 279 | "## Idioms & Paradigms" 280 | ] 281 | }, 282 | { 283 | "cell_type": "raw", 284 | 
"metadata": {}, 285 | "source": [ 286 | "# tag::PARADIGM_1[]" 287 | ] 288 | }, 289 | { 290 | "cell_type": "code", 291 | "execution_count": 16, 292 | "metadata": { 293 | "uuid": "beca497d-4c20-4240-b003-4c79c26154be" 294 | }, 295 | "outputs": [ 296 | { 297 | "name": "stdout", 298 | "output_type": "stream", 299 | "text": [ 300 | "1.21 s ± 8.27 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)\n" 301 | ] 302 | } 303 | ], 304 | "source": [ 305 | "import math\n", 306 | "loops = 2500000\n", 307 | "a = range(1, loops)\n", 308 | "def f(x):\n", 309 | " return 3 * math.log(x) + math.cos(x) ** 2\n", 310 | "%timeit r = [f(x) for x in a]" 311 | ] 312 | }, 313 | { 314 | "cell_type": "raw", 315 | "metadata": {}, 316 | "source": [ 317 | "# end::PARADIGM_1[]" 318 | ] 319 | }, 320 | { 321 | "cell_type": "raw", 322 | "metadata": {}, 323 | "source": [ 324 | "# tag::PARADIGM_2[]" 325 | ] 326 | }, 327 | { 328 | "cell_type": "code", 329 | "execution_count": 17, 330 | "metadata": { 331 | "uuid": "931fd1fc-cc54-4045-802f-acb8f094a23f" 332 | }, 333 | "outputs": [ 334 | { 335 | "name": "stdout", 336 | "output_type": "stream", 337 | "text": [ 338 | "94.7 ms ± 1.55 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)\n" 339 | ] 340 | } 341 | ], 342 | "source": [ 343 | "import numpy as np\n", 344 | "a = np.arange(1, loops)\n", 345 | "%timeit r = 3 * np.log(a) + np.cos(a) ** 2" 346 | ] 347 | }, 348 | { 349 | "cell_type": "raw", 350 | "metadata": {}, 351 | "source": [ 352 | "# end::PARADIGM_2[]" 353 | ] 354 | }, 355 | { 356 | "cell_type": "raw", 357 | "metadata": {}, 358 | "source": [ 359 | "# tag::PARADIGM_3[]" 360 | ] 361 | }, 362 | { 363 | "cell_type": "code", 364 | "execution_count": 18, 365 | "metadata": { 366 | "uuid": "8d1602b3-a490-4d1a-97d9-112825d86185" 367 | }, 368 | "outputs": [ 369 | { 370 | "name": "stdout", 371 | "output_type": "stream", 372 | "text": [ 373 | "37.5 ms ± 936 µs per loop (mean ± std. dev. 
of 7 runs, 10 loops each)\n" 374 | ] 375 | } 376 | ], 377 | "source": [ 378 | "import numexpr as ne\n", 379 | "ne.set_num_threads(1)\n", 380 | "f = '3 * log(a) + cos(a) ** 2'\n", 381 | "%timeit r = ne.evaluate(f)" 382 | ] 383 | }, 384 | { 385 | "cell_type": "raw", 386 | "metadata": {}, 387 | "source": [ 388 | "# end::PARADIGM_3[]" 389 | ] 390 | }, 391 | { 392 | "cell_type": "raw", 393 | "metadata": {}, 394 | "source": [ 395 | "# tag::PARADIGM_4[]" 396 | ] 397 | }, 398 | { 399 | "cell_type": "code", 400 | "execution_count": 19, 401 | "metadata": { 402 | "uuid": "6994f16a-1802-4abe-851d-a48e0ead154d" 403 | }, 404 | "outputs": [ 405 | { 406 | "name": "stdout", 407 | "output_type": "stream", 408 | "text": [ 409 | "12.8 ms ± 300 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)\n" 410 | ] 411 | } 412 | ], 413 | "source": [ 414 | "ne.set_num_threads(4)\n", 415 | "%timeit r = ne.evaluate(f)" 416 | ] 417 | }, 418 | { 419 | "cell_type": "raw", 420 | "metadata": {}, 421 | "source": [ 422 | "# end::PARADIGM_4[]" 423 | ] 424 | }, 425 | { 426 | "cell_type": "markdown", 427 | "metadata": {}, 428 | "source": [ 429 | "\"The
\n", 430 | "\n", 431 | "http://tpq.io | @dyjh | training@tpq.io" 432 | ] 433 | } 434 | ], 435 | "metadata": { 436 | "kernelspec": { 437 | "display_name": "Python 3", 438 | "language": "python", 439 | "name": "python3" 440 | }, 441 | "language_info": { 442 | "codemirror_mode": { 443 | "name": "ipython", 444 | "version": 3 445 | }, 446 | "file_extension": ".py", 447 | "mimetype": "text/x-python", 448 | "name": "python", 449 | "nbconvert_exporter": "python", 450 | "pygments_lexer": "ipython3", 451 | "version": "3.6.7" 452 | } 453 | }, 454 | "nbformat": 4, 455 | "nbformat_minor": 1 456 | } 457 | -------------------------------------------------------------------------------- /files/py4fi.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | Python for Finance 10 | 11 | 433 | 434 | 525 | 526 | 527 | 547 |
548 |
549 |

1. Why Python for Finance

550 |
551 |
552 |
553 |
554 |

Banks are essentially technology firms.

555 |
556 |
557 |
558 | — Hugo Banziger 559 |
560 |
561 |
562 |

1.1. Finance and Python Syntax

563 |
564 |

Most people who take their first steps with Python in a finance context often start by attacking an algorithmic problem. This is similar to a scientist who, for example, wants to solve a differential equation, evaluate an integral or simply visualize some data. In general, at this stage, little thought is spent on topics like a formal development process, testing, documentation or deployment. However, this seems to be precisely the stage when people fall in love with Python. A major reason for this might be that Python syntax is generally quite close to the mathematical syntax used to describe scientific problems or financial algorithms.

565 |
566 |
567 |

This aspect can be illustrated by a financial algorithm, namely the valuation of a European call option by Monte Carlo simulation. The example considers a Black-Scholes-Merton (BSM) setup in which the option’s underlying risk factor follows a geometric Brownian motion.

568 |
569 |
570 |

Assume the following numerical parameter values for the valuation:

571 |
572 |
573 |
    574 |
  • 575 |

    initial stock index level \(S_0 = 100\)

    576 |
  • 577 |
  • 578 |

    strike price of the European call option \(K = 105\)

    579 |
  • 580 |
  • 581 |

    time-to-maturity \(T = 1\) year

    582 |
  • 583 |
  • 584 |

    constant, riskless short rate \(r = 0.05\)

    585 |
  • 586 |
  • 587 |

    constant volatility \(\sigma = 0.2\)

    588 |
  • 589 |
590 |
591 |
592 |

In the BSM model, the index level at maturity is a random variable, given by Black-Scholes-Merton (1973) index level at maturity with \(z\) being a standard normally distributed random variable.

593 |
594 |
595 |
Black-Scholes-Merton (1973) index level at maturity
596 |
\[\begin{equation} S_T = S_0 \exp \left( \left( r - \frac{1}{2} \sigma^2 \right) T + \sigma \sqrt{T} z \right) \end{equation}\]
601 |
602 |
603 |

The following is an algorithmic description of the Monte Carlo valuation procedure:

604 |
605 |
606 |
    607 |
  1. 608 |

    draw \(I\) pseudo-random numbers \(z(i), i \in \{1, 2, ..., I \}\), from the standard normal distribution

    609 |
  2. 610 |
  3. 611 |

    calculate all resulting index levels at maturity \(S_T(i)\) for the given \(z(i)\), using the equation Black-Scholes-Merton (1973) index level at maturity

    612 |
  4. 613 |
  5. 614 |

    calculate all payoffs (intrinsic values) of the option at maturity as \(h_T(i)= \max \left(S_T(i) - K, 0 \right)\)

    615 |
  6. 616 |
  7. 617 |

    estimate the present value of the option via the Monte Carlo estimator given in the equation Monte Carlo estimator for European option

    618 |
  8. 619 |
620 |
621 |
622 |
Monte Carlo estimator for European option
623 |
\[\begin{equation} C_0 \approx e^{-rT} \frac{1}{I} \sum_{i=1}^{I} h_T(i) \end{equation}\]
628 |
629 |
630 |

This problem and the algorithm are now translated into Python code. The code below implements the required steps.

631 |
632 |
633 |
634 |
In [1]: import math
        import numpy as np  (1)

In [2]: S0 = 100.  (2)
        K = 105.  (2)
        T = 1.0  (2)
        r = 0.05  (2)
        sigma = 0.2  (2)

In [3]: I = 100000  (2)

In [4]: np.random.seed(1000)  (3)

In [5]: z = np.random.standard_normal(I)  (4)

In [6]: ST = S0 * np.exp((r - sigma ** 2 / 2) * T + sigma * math.sqrt(T) * z)  (5)

In [7]: hT = np.maximum(ST - K, 0)  (6)

In [8]: C0 = math.exp(-r * T) * np.mean(hT)  (7)

In [9]: print('Value of the European call option: {:5.3f}.'.format(C0))  (8)
        Value of the European call option: 8.019.
657 |
658 |
659 |
(1) NumPy is used here as the main package.
(2) The model and simulation parameter values are defined.
(3) The seed value for the random number generator is fixed.
(4) Standard normally distributed random numbers are drawn.
(5) End-of-period values are simulated.
(6) The option payoffs at maturity are calculated.
(7) The Monte Carlo estimator is evaluated.
(8) The resulting value estimate is printed.
694 |
695 |
696 |
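As a plausibility check, the Monte Carlo estimate can be compared with the closed-form Black-Scholes-Merton value of a European call option. The following is a minimal sketch that is not part of the original session; the function names norm_cdf and bsm_call_value are chosen here for illustration only, and nothing beyond the standard library is assumed.

```python
import math


def norm_cdf(x):
    """Standard normal CDF, expressed via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bsm_call_value(S0, K, T, r, sigma):
    """Closed-form BSM value of a European call option (no dividends)."""
    d1 = ((math.log(S0 / K) + (r + 0.5 * sigma ** 2) * T) /
          (sigma * math.sqrt(T)))
    d2 = d1 - sigma * math.sqrt(T)
    return S0 * norm_cdf(d1) - K * math.exp(-r * T) * norm_cdf(d2)


# same parameter values as in the Monte Carlo example
C0_exact = bsm_call_value(S0=100., K=105., T=1.0, r=0.05, sigma=0.2)
print('Closed-form value of the European call option: {:5.3f}.'.format(C0_exact))
```

The result should lie close to the simulated value of 8.019; the remaining gap is Monte Carlo sampling error, which shrinks as the number of simulations grows.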

Three aspects are worth highlighting:

697 |
698 |
699 |
700 |
syntax
701 |
702 |

the Python syntax is indeed quite close to the mathematical syntax, e.g., when it comes to the parameter value assignments

703 |
704 |
translation
705 |
706 |

every mathematical and/or algorithmic statement can generally be translated into a single line of Python code

707 |
708 |
vectorization
709 |
710 |

one of the strengths of NumPy is the compact, vectorized syntax, e.g., allowing for 100,000 calculations within a single line of code

711 |
712 |
713 |
714 |
715 |
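To make the vectorization point concrete, the sketch below values the same option twice on identical random numbers: once with an explicit Python loop and once with the vectorized NumPy expression. It is an illustration under assumptions, not the author's code: it uses NumPy's newer default_rng generator instead of np.random.seed, and a deliberately small number of paths so that the loop stays fast.

```python
import math
import numpy as np

S0, K, T, r, sigma = 100., 105., 1.0, 0.05, 0.2
I = 10000  # small on purpose; the loop version is slow for large I

rng = np.random.default_rng(1000)  # assumption: Generator API, not np.random.seed
z = rng.standard_normal(I)

# loop version: one path at a time on the Python level
hT_loop = []
for zi in z:
    STi = S0 * math.exp((r - sigma ** 2 / 2) * T + sigma * math.sqrt(T) * zi)
    hT_loop.append(max(STi - K, 0.0))
C0_loop = math.exp(-r * T) * sum(hT_loop) / I

# vectorized version: all paths in a single expression
ST = S0 * np.exp((r - sigma ** 2 / 2) * T + sigma * math.sqrt(T) * z)
hT = np.maximum(ST - K, 0)
C0_vec = math.exp(-r * T) * np.mean(hT)

print('loop: {:7.4f} | vectorized: {:7.4f}'.format(C0_loop, C0_vec))
```

Both versions implement the same estimator, so the two values agree up to floating-point rounding; only the amount of code and the execution speed differ.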

This code can be used in an interactive environment like IPython or Jupyter Notebook. However, code that is meant to be reused regularly typically gets organized in so-called modules (or scripts), which are single Python files (technically text files) with the suffix .py. Such a module could in this case look like the one shown in Monte Carlo valuation of European call option and could be saved as a file named bsm_mcs_euro.py.

716 |
717 |
718 |
Example 1. Monte Carlo valuation of European call option
719 |
720 |
721 |
722 |
#
# Monte Carlo valuation of European call option
# in Black-Scholes-Merton model
# bsm_mcs_euro.py
#
# Python for Finance, 2nd ed.
# (c) Dr. Yves J. Hilpisch
#
import math
import numpy as np

# Parameter Values
S0 = 100.  # initial index level
K = 105.  # strike price
T = 1.0  # time-to-maturity
r = 0.05  # riskless short rate
sigma = 0.2  # volatility

I = 100000  # number of simulations

# Valuation Algorithm
z = np.random.standard_normal(I)  # pseudo-random numbers
# index values at maturity
ST = S0 * np.exp((r - 0.5 * sigma ** 2) * T + sigma * math.sqrt(T) * z)
hT = np.maximum(ST - K, 0)  # payoff at maturity
C0 = math.exp(-r * T) * np.mean(hT)  # Monte Carlo estimator

# Result Output
print('Value of the European call option %5.3f.' % C0)
751 |
752 |
753 |
754 |
755 |
756 |
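When such a module is meant to be imported by other code, it often helps to wrap the valuation in a function and to guard the script part. The following is one possible sketch, not the layout used in the original file; the function name bsm_mcs_value and its seed parameter are hypothetical additions for illustration.

```python
import math
import numpy as np


def bsm_mcs_value(S0, K, T, r, sigma, I=100000, seed=None):
    """Monte Carlo estimate of a European call value in the BSM model.

    seed is a hypothetical convenience parameter; passing an integer
    makes the estimate reproducible.
    """
    rng = np.random.default_rng(seed)
    z = rng.standard_normal(I)  # pseudo-random numbers
    # index values at maturity
    ST = S0 * np.exp((r - 0.5 * sigma ** 2) * T + sigma * math.sqrt(T) * z)
    hT = np.maximum(ST - K, 0)  # payoff at maturity
    return math.exp(-r * T) * np.mean(hT)  # Monte Carlo estimator


if __name__ == '__main__':
    # executed only when the file is run as a script, not on import
    C0 = bsm_mcs_value(S0=100., K=105., T=1.0, r=0.05, sigma=0.2, seed=1000)
    print('Value of the European call option %5.3f.' % C0)
```

With this layout, another script can use the function via an import, while running the file directly still prints a value as before.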

The algorithmic example in this subsection illustrates that Python, with its very syntax, is well suited to complement the classic duo of scientific languages, English and mathematics. It seems that adding Python to the set of scientific languages makes that set more well rounded. One then has:

757 |
758 |
759 |
    760 |
  • 761 |

    English for writing, talking about scientific and financial problems, etc.

    762 |
  • 763 |
  • 764 |

    Mathematics for concisely, exactly describing and modeling abstract aspects, algorithms, complex quantities, etc.

    765 |
  • 766 |
  • 767 |

    Python for technically modeling and implementing abstract aspects, algorithms, complex quantities, etc.

    768 |
  • 769 |
770 |
771 |
772 | 773 | 774 | 777 | 783 | 784 |
775 | 776 | 778 |
Mathematics and Python Syntax
779 |
780 |

There is hardly any programming language that comes as close to mathematical syntax as Python. Numerical algorithms are therefore in general straightforward to translate from the mathematical representation into the Pythonic implementation. This makes prototyping, development and code maintenance in finance quite efficient with Python.

781 |
782 |
785 |
786 |
787 |

In some areas, it is common practice to use pseudo-code and therewith to introduce a fourth language family member. The role of pseudo-code is to represent, for example, financial algorithms in a more technical fashion that is both still close to the mathematical representation and already quite close to the technical implementation. In addition to the algorithm itself, pseudo-code takes into account how computers work in principle.

788 |
789 |
790 |

This practice generally has its cause in the fact that with most (compiled) programming languages the technical implementation is quite “far away” from its formal, mathematical representation. The majority of programming languages make it necessary to include so many elements that are only technically required that it is hard to see the equivalence between the mathematics and the code.

791 |
792 |
793 |

Nowadays, Python is often used in a pseudo-code way since its syntax is almost analogous to the mathematics and since the technical “overhead” is kept to a minimum. This is accomplished by a number of high-level concepts embodied in the language that not only have their advantages but also come in general with risks and/or other costs. However, it is safe to say that with Python you can, whenever the need arises, follow the same strict implementation and coding practices that other languages might require from the outset. In that sense, Python can provide the best of both worlds: high-level abstraction and rigorous implementation.
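What "strict implementation and coding practices" might look like is sketched below, under the assumption that type hints, immutable parameter objects and explicit input validation are the practices one has in mind. The class name CallOptionSpec and the function payoff are hypothetical and do not appear elsewhere in this chapter.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CallOptionSpec:
    """Immutable, validated parameter set for a European call option."""
    S0: float     # initial index level
    K: float      # strike price
    T: float      # time-to-maturity in years
    r: float      # riskless short rate
    sigma: float  # volatility

    def __post_init__(self) -> None:
        if self.S0 <= 0 or self.K <= 0:
            raise ValueError('S0 and K must be positive')
        if self.T <= 0:
            raise ValueError('T must be positive')
        if self.sigma <= 0:
            raise ValueError('sigma must be positive')


def payoff(spec: CallOptionSpec, ST: float) -> float:
    """Payoff of the call at maturity for a given index level ST."""
    return max(ST - spec.K, 0.0)


spec = CallOptionSpec(S0=100., K=105., T=1.0, r=0.05, sigma=0.2)
print(payoff(spec, 110.0))
```

The quick, script-like style shown earlier and this more defensive style can coexist in one code base, which is the sense in which both worlds are available.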

794 |
795 |
796 |
797 |

1.2. Efficiency and Productivity Through Python

798 |
799 |

At a high level, benefits from using Python can be measured in three dimensions:

800 |
801 |
802 |
803 |
efficiency
804 |
805 |

how can Python help in getting results faster, in saving costs, and in saving time?

806 |
807 |
productivity
808 |
809 |

how can Python help in getting more done with the same resources (people, assets, etc.)?

810 |
811 |
quality
812 |
813 |

what does Python allow one to do that alternative technologies do not?

814 |
815 |
816 |
817 |
818 |

A discussion of these aspects cannot, by its nature, be exhaustive. However, it can highlight some arguments as a starting point.

819 |
820 |
821 |

1.2.1. Shorter Time-to-Results

822 |
823 |

A field where the efficiency of Python becomes quite obvious is interactive data analytics. This is a field that benefits tremendously from such powerful tools as IPython, Jupyter Notebook and packages like pandas.

824 |
825 |
826 |

Consider a finance student, writing her master’s thesis and interested in S&P 500 index values. She wants to analyze historical index levels for, say, a few years to see how the volatility of the index has fluctuated over time. She wants to find evidence that volatility, in contrast to some typical model assumptions, fluctuates over time and is far from being constant. The results should also be visualized. She mainly has to do the following:

827 |
828 |
829 |
    830 |
  • 831 |

    retrieve index level data from the web

    832 |
  • 833 |
  • 834 |

    calculate the annualized rolling standard deviation of the log returns (volatility)

    835 |
  • 836 |
  • 837 |

    plot the index level data and the volatility results

    838 |
  • 839 |
840 |
841 |
842 |

These tasks are complex enough that not too long ago one would have considered them to be something for professional financial analysts only. Today, even the finance student can easily cope with such problems. The code below shows how exactly this works — without worrying about syntax details at this stage (everything is explained in detail in subsequent chapters).

843 |
844 |
845 |
846 |
In [11]: import numpy as np  (1)
         import pandas as pd  (1)
         from pylab import plt, mpl  (2)

In [12]: plt.style.use('seaborn')  (2)
         mpl.rcParams['font.family'] = 'serif'  (2)
         %matplotlib inline

In [13]: data = pd.read_csv('http://hilpisch.com/tr_eikon_eod_data.csv',
                           index_col=0, parse_dates=True)  (3)
         data = pd.DataFrame(data['.SPX'])  (4)
         data.dropna(inplace=True)  (4)
         data.info()  (5)
         <class 'pandas.core.frame.DataFrame'>
         DatetimeIndex: 2138 entries, 2010-01-04 to 2018-06-29
         Data columns (total 1 columns):
         .SPX    2138 non-null float64
         dtypes: float64(1)
         memory usage: 33.4 KB

In [14]: data['rets'] = np.log(data / data.shift(1))  (6)
         data['vola'] = data['rets'].rolling(252).std() * np.sqrt(252)  (7)

In [15]: data[['.SPX', 'vola']].plot(subplots=True, figsize=(10, 6));  (8)
         plt.savefig('../images/spx_volatility.png')
871 |
872 |
873 |
(1) This imports NumPy and pandas.
(2) This imports matplotlib and configures the plotting style and approach for Jupyter.
(3) pd.read_csv() allows the retrieval of remotely or locally stored data sets in CSV form.
(4) A sub-set of the data is picked and NaN ("not a number") values eliminated.
(5) This shows some meta-information about the data set.
(6) The log returns are calculated in vectorized fashion (“no looping” on the Python level).
(7) The rolling, annualized volatility is derived.
(8) This finally plots the two time series.
908 |
909 |
910 |

S&P 500 closing values and annualized volatility shows the graphical result of this brief interactive session. It can be considered almost amazing that a few lines of code suffice to implement three rather complex tasks typically encountered in financial analytics: data gathering, complex and repeated mathematical calculations as well as visualization of the results. The example illustrates that pandas makes working with whole time series almost as simple as doing mathematical operations on floating-point numbers.

911 |
912 |
913 |
914 | spx volatility 915 |
916 |
Figure 1. S&P 500 closing values and annualized volatility
917 |
918 |
919 |
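The volatility calculation can also be tried without network access. The sketch below applies the same two steps (log returns, then rolling annualized standard deviation) to a synthetic price series. The data is simulated, not S&P 500 data; the column name price, the seed and the assumed true volatility of 20% are arbitrary choices made for illustration.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n = 1000          # number of synthetic trading days
true_vol = 0.2    # assumed annualized volatility of the simulated series

# simulate daily log returns and build a price path from them
daily = rng.normal(0.0, true_vol / np.sqrt(252), n)
dates = pd.bdate_range('2010-01-04', periods=n)
data = pd.DataFrame({'price': 1000 * np.exp(np.cumsum(daily))}, index=dates)

# log returns, calculated in vectorized fashion
data['rets'] = np.log(data['price'] / data['price'].shift(1))
# rolling standard deviation over 252 days, annualized with sqrt(252)
data['vola'] = data['rets'].rolling(252).std() * np.sqrt(252)

print(data[['price', 'vola']].dropna().tail())
```

The first 252 entries of the vola column are NaN because a full window of returns is not yet available; afterwards the estimate fluctuates around the assumed 20%, even though the simulated volatility is constant.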

Translated to a professional finance context, the example implies that financial analysts can focus on their own domain and not on the technical intricacies, provided they apply the right Python tools and packages with high-level abstractions. Analysts can react faster, providing valuable insights almost in real-time and making sure they are one step ahead of the competition. This example of increased efficiency can easily translate into measurable bottom-line effects.

920 |
921 |
922 |
923 |

1.2.2. Ensuring High Performance

924 |
925 |

In general, it is accepted that Python has a rather concise syntax and that it is relatively efficient to code with. However, because Python is an interpreted language, the prejudice persists that it is often too slow for compute-intensive tasks in finance. Indeed, depending on the specific implementation approach, Python can be really slow. But it does not have to be slow: it can perform well in almost any application area. In principle, one can distinguish at least three different strategies for better performance:

926 |
927 |
928 |
929 |
idioms and paradigms
930 |
931 |

in general, many different ways can lead to the same result in Python, but sometimes with rather different performance characteristics; “simply” choosing the right way (e.g., a specific implementation approach, such as the judicious use of data structures, avoiding loops through vectorization or the use of a specific package such as pandas) can improve results significantly

932 |
933 |
compiling
934 |
935 |

nowadays, there are several performance packages available that provide compiled versions of important functions or that compile Python code statically or dynamically (at runtime or call time) to machine code, which can make such functions orders of magnitude faster than pure Python code; popular ones are Cython and Numba

936 |
937 |
parallelization
938 |
939 |

many computational tasks, in particular in finance, can significantly benefit from parallel execution; this is nothing special to Python but something that can easily be accomplished with it

940 |
941 |
942 |
943 |
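The parallelization strategy can be sketched with the standard library alone. The example below splits an array into chunks and evaluates a NumPy expression on the chunks in a thread pool. It is a hedged illustration, not the approach used later in this chapter: the function name evaluate and the choice of four workers are assumptions, and whether threads actually yield a speed-up depends on the workload and on how much of it runs outside Python's global interpreter lock.

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np


def evaluate(chunk):
    """Evaluate the expression on one chunk of the input array."""
    return 3 * np.log(chunk) + np.cos(chunk) ** 2


a = np.arange(1, 1_000_000)
chunks = np.array_split(a, 4)  # one chunk per worker thread

with ThreadPoolExecutor(max_workers=4) as pool:
    parts = list(pool.map(evaluate, chunks))  # order of chunks is preserved

r_parallel = np.concatenate(parts)
r_serial = evaluate(a)
print(np.allclose(r_parallel, r_serial))
```

The algorithm itself is unchanged; only the way the work is distributed differs, which is why the chunked and the single-call results agree.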
944 | 945 | 946 | 949 | 955 | 956 |
947 | 948 | 950 |
Performance Computing with Python
951 |
952 |

Python per se is not a high-performance computing technology. However, Python has developed into an ideal platform to access current performance technologies. In that sense, Python has become something like a glue language for performance computing technologies.

953 |
954 |
957 |
958 |
959 |

Later chapters illustrate all three strategies in detail. This sub-section sticks to a simple, but still realistic, example that touches upon all three strategies. A quite common task in financial analytics is to evaluate complex mathematical expressions on large arrays of numbers. To this end, Python itself provides everything needed.

960 |
961 |
962 |
963 |
In [16]: import math
         loops = 2500000
         a = range(1, loops)
         def f(x):
             return 3 * math.log(x) + math.cos(x) ** 2
         %timeit r = [f(x) for x in a]
         1.21 s ± 8.27 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
970 |
971 |
972 |
973 |

The Python interpreter needs about 1.2 seconds in this case to evaluate the function f roughly 2.5 million times. The same task can be implemented using NumPy, which provides optimized (i.e., pre-compiled) functions to handle such array-based operations.

974 |
975 |
976 |
977 |
In [17]: import numpy as np
         a = np.arange(1, loops)
         %timeit r = 3 * np.log(a) + np.cos(a) ** 2
         94.7 ms ± 1.55 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
981 |
982 |
983 |
984 |

Using NumPy considerably reduces the execution time to about 95 milliseconds. However, there is even a package specifically dedicated to this kind of task. It is called numexpr, for "numerical expressions." It compiles the expression to improve upon the performance of the general NumPy functionality by, for example, avoiding in-memory copies of ndarray objects along the way.

985 |
986 |
987 |
988 |
In [18]: import numexpr as ne
         ne.set_num_threads(1)
         f = '3 * log(a) + cos(a) ** 2'
         %timeit r = ne.evaluate(f)
         37.5 ms ± 936 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
993 |
994 |
995 |
996 |

Using this more specialized approach further reduces execution time to below 40 milliseconds. However, numexpr also has built-in capabilities to parallelize the execution of the respective operation. This allows us to use multiple threads of a CPU.

997 |
998 |
999 |
1000 |
In [19]: ne.set_num_threads(4)
         %timeit r = ne.evaluate(f)
         12.8 ms ± 300 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
1003 |
1004 |
1005 |
1006 |

Parallelization brings execution time further down to below 15 milliseconds in this case, with four threads utilized. Overall, this is a performance improvement of more than 90 times. Note, in particular, that this kind of improvement is possible without altering the basic problem/algorithm and without knowing any detail about compiling or parallelization approaches. The capabilities are accessible from a high level even by non-experts. However, one has to be aware, of course, of which capabilities and options exist.
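The speed-up factors follow directly from the timings reported by %timeit in the four sessions above. The short calculation below only restates that arithmetic; the dictionary labels are chosen here for readability.

```python
# mean timings in milliseconds, copied from the %timeit outputs above
timings_ms = {
    'pure Python': 1210.0,
    'NumPy': 94.7,
    'numexpr (1 thread)': 37.5,
    'numexpr (4 threads)': 12.8,
}

base = timings_ms['pure Python']
for name, t in timings_ms.items():
    # speed-up relative to the pure Python list comprehension
    print('{:22s} {:8.1f} ms   speed-up {:5.1f}x'.format(name, t, base / t))

# gain from multithreading alone: one thread versus four threads
threads_gain = timings_ms['numexpr (1 thread)'] / timings_ms['numexpr (4 threads)']
print('4 threads vs. 1 thread: {:.2f}x'.format(threads_gain))
```

The overall factor is 1210 / 12.8, i.e. about 94.5, while the step from one to four threads accounts for a factor of about 2.9.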

1007 |
1008 |
1009 |

The example shows that Python provides a number of options to make more out of existing resources, i.e., to increase productivity. With the parallel approach, roughly three times as many calculations can be accomplished in the same amount of time as with the sequential approach (12.8 versus 37.5 milliseconds per evaluation), in this case simply by telling numexpr to use multiple available CPU threads instead of just one.
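The %timeit magic is only available in IPython and Jupyter. In a plain Python script, a comparable measurement of the first two variants can be sketched with the standard library's timeit module. The array size and the number of repetitions below are arbitrary choices made to keep the run short, and absolute timings will differ from machine to machine.

```python
import math
import timeit

import numpy as np

n = 250_000  # smaller than in the sessions above to keep the run short


def f(x):
    return 3 * math.log(x) + math.cos(x) ** 2


a_list = range(1, n)
a_arr = np.arange(1, n)

# both variants compute the same numbers
r_list = [f(x) for x in a_list]
r_arr = 3 * np.log(a_arr) + np.cos(a_arr) ** 2

# average time per evaluation over a few repetitions
reps = 3
t_list = timeit.timeit(lambda: [f(x) for x in a_list], number=reps) / reps
t_arr = timeit.timeit(lambda: 3 * np.log(a_arr) + np.cos(a_arr) ** 2,
                      number=reps) / reps

print('pure Python: {:8.2f} ms'.format(t_list * 1000))
print('NumPy:       {:8.2f} ms'.format(t_arr * 1000))
```

Checking that both variants return the same values before timing them guards against comparing the speed of two different calculations.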

1010 |
1011 |
1012 |
1013 |
1014 |

1.3. Conclusions

1015 |
1016 |

Python as a language, and even more so as an ecosystem, is an ideal technological framework for the financial industry as a whole and the individual working in finance alike. It is characterized by a number of benefits, like an elegant syntax, efficient development approaches and usability for prototyping as well as production. With its huge number of available packages, libraries and tools, Python seems to have answers to most questions raised by recent developments in the financial industry in terms of analytics, data volumes and frequency, compliance and regulation, as well as technology itself. It has the potential to provide a single, powerful, consistent framework with which to streamline end-to-end development and production efforts even across larger financial institutions.

1017 |
1018 |
1019 |

In addition, Python has become the programming language of choice for artificial intelligence in general and machine and deep learning in particular. Python is therefore the right language for both data-driven finance and AI-first finance, two recent trends that are about to reshape finance and the financial industry in fundamental ways.

1020 |
1021 |
1022 |
1023 |

1.4. Further Resources

1024 |
1025 |

The following books cover several aspects only touched upon in this chapter in more detail (e.g. Python tools, derivatives analytics, machine learning in general, machine learning in finance):

1026 |
1027 |
1028 |
    1029 |
  • 1030 |

    Hilpisch, Yves (2015): Derivatives Analytics with Python. Wiley Finance, Chichester, England. http://dawp.tpq.io.

    1031 |
  • 1032 |
  • 1033 |

    López de Prado, Marcos (2018): Advances in Financial Machine Learning. John Wiley & Sons, Hoboken.

    1034 |
  • 1035 |
  • 1036 |

    VanderPlas, Jake (2016): Python Data Science Handbook. O’Reilly, Beijing et al.

    1037 |
  • 1038 |
1039 |
1040 |
1041 |

When it comes to algorithmic trading, the author’s company offers a range of online training programs that focus on Python and other tools and techniques required in this rapidly growing field:

1042 |
1043 |
1044 | 1052 |
1053 |
1054 |

1068 |
1069 |
1070 |
1071 |
1072 |
1073 | 1078 | 1093 | 1094 | 1095 | --------------------------------------------------------------------------------