├── .gitignore ├── files ├── small.mmap ├── random-matrix.csv ├── random-matrix.npy ├── data_structure.pkl ├── matlab_test_data_01.mat ├── matlab_test_data_02.mat ├── data_structure.pkl_01.npy ├── data_structure.pkl_02.npy └── test.mat ├── images ├── ndarray.png ├── ep15_logo.png ├── reference.png ├── storage_index.png ├── storage_simple.png ├── ndarray_with_details.png └── predictive_modeling_data_flow.png ├── README.md ├── 01 - Preliminaries.ipynb ├── 06_Memmapping.ipynb ├── 05_Sparse_Matrices.ipynb ├── 05_1_Sparse_Graphs_in_Python.ipynb ├── 03_Numpy_Operations.ipynb └── 02_Introduction to Numpy.ipynb /.gitignore: -------------------------------------------------------------------------------- 1 | sol_021.py 2 | -------------------------------------------------------------------------------- /files/small.mmap: -------------------------------------------------------------------------------- 1 | (B,B -------------------------------------------------------------------------------- /images/ndarray.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/leriomaggio/numpy_ep2015/master/images/ndarray.png -------------------------------------------------------------------------------- /images/ep15_logo.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/leriomaggio/numpy_ep2015/master/images/ep15_logo.png -------------------------------------------------------------------------------- /images/reference.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/leriomaggio/numpy_ep2015/master/images/reference.png -------------------------------------------------------------------------------- /files/random-matrix.csv: -------------------------------------------------------------------------------- 1 | 0.31318 0.20088 0.41317 2 | 0.73103 0.06485 0.65212 3 | 0.48175 0.95090 0.55600 4 
| -------------------------------------------------------------------------------- /files/random-matrix.npy: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/leriomaggio/numpy_ep2015/master/files/random-matrix.npy -------------------------------------------------------------------------------- /files/data_structure.pkl: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/leriomaggio/numpy_ep2015/master/files/data_structure.pkl -------------------------------------------------------------------------------- /images/storage_index.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/leriomaggio/numpy_ep2015/master/images/storage_index.png -------------------------------------------------------------------------------- /images/storage_simple.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/leriomaggio/numpy_ep2015/master/images/storage_simple.png -------------------------------------------------------------------------------- /files/matlab_test_data_01.mat: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/leriomaggio/numpy_ep2015/master/files/matlab_test_data_01.mat -------------------------------------------------------------------------------- /files/matlab_test_data_02.mat: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/leriomaggio/numpy_ep2015/master/files/matlab_test_data_02.mat -------------------------------------------------------------------------------- /files/data_structure.pkl_01.npy: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/leriomaggio/numpy_ep2015/master/files/data_structure.pkl_01.npy 
-------------------------------------------------------------------------------- /files/data_structure.pkl_02.npy: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/leriomaggio/numpy_ep2015/master/files/data_structure.pkl_02.npy -------------------------------------------------------------------------------- /images/ndarray_with_details.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/leriomaggio/numpy_ep2015/master/images/ndarray_with_details.png -------------------------------------------------------------------------------- /images/predictive_modeling_data_flow.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/leriomaggio/numpy_ep2015/master/images/predictive_modeling_data_flow.png -------------------------------------------------------------------------------- /files/test.mat: -------------------------------------------------------------------------------- 1 | MATLAB 5.0 MAT-file Platform: posix, Created on: Tue Jul 21 09:22:44 2015IMPa  2 | (2 )3  *4 !+5",6#-7 -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Training Description 2 | 3 | It is very hard to be a scientist without knowing how to write code, 4 | and nowadays **Python** is probably the programming language of 5 | choice in many research fields. 6 | This is mainly because the Python ecosystem includes a lot of tools and libraries 7 | for many research tasks: `pandas` for *data analysis* , 8 | `networkx` for *social network analysis*, `nltk` for *natural language processing*, 9 | `scikit-learn` for *machine learning*, and so on. 10 | 11 | Most of these libraries relies (or are built on top of) `numpy`. 
12 | Therefore, `numpy` is a crucial component of the common Python 13 | stack used for numerical analysis and data science. 14 | 15 | NumPy code tends to be much cleaner (and faster) than 16 | "straight" Python code that tries to accomplish the same task. 17 | Moreover, the underlying algorithms have 18 | been designed with high performance in mind. 19 | 20 | This training provides most of the essential concepts 21 | needed to become confident with NumPy data structures and functions. 22 | In addition, some examples of data analysis libraries and code 23 | will be presented, where NumPy takes a central role. 24 | 25 | Here is a list of software used to develop and test the code examples presented 26 | during the training: 27 | 28 | * Python 3.x (2.x would work as well) 29 | * IPython 2.3+ (with **notebook support**) or Jupyter: 30 | * `pip install "ipython[notebook]"` 31 | * `pip install jupyter` 32 | * numpy 1.9+ 33 | * scipy 0.14+ 34 | * scikit-learn 0.15+ 35 | * pandas 0.8+ 36 | 37 | # Target Audience 38 | 39 | The training is meant to be mostly introductory, so it is well suited 40 | for **beginners**. However, a working proficiency in Python programming is 41 | required. 42 | 43 | # License and Sharing Material 44 | 45 | Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License. 46 | -------------------------------------------------------------------------------- /01 - Preliminaries.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "

NumPy Array Tutorial @ EuroPython 2015

\n", 8 | "\n", 9 | "" 10 | ] 11 | }, 12 | { 13 | "cell_type": "markdown", 14 | "metadata": {}, 15 | "source": [ 16 | "#Goal of this Tutorial\n", 17 | "\n", 18 | "- **Introduce the basics of Numpy**, and some more advanced stuff;\n", 19 | "- **Provide some concrete examples** where Numpy takes a central role.\n", 20 | " " 21 | ] 22 | }, 23 | { 24 | "cell_type": "markdown", 25 | "metadata": {}, 26 | "source": [ 27 | "# Schedule\n", 28 | "\n", 29 | "Outline:\n", 30 | "\n", 31 | "**11:00 - 11:15** Preliminaries\n", 32 | "\n", 33 | "- Making sure your computer is set-up\n", 34 | "\n", 35 | "** PART 1 ** (11:15 - 12:15)\n", 36 | "\n", 37 | "**11:15 - 11:45** Introduction to Numpy\n", 38 | "\n", 39 | "- What is Numpy?\n", 40 | "- Introduction to Numpy Arrays\n", 41 | "- Numpy Data Types\n", 42 | "- Record Array\n", 43 | "- Slicing and Indexing\n", 44 | "\n", 45 | "** 11:45 - 12:15** Numpy Operations\n", 46 | "\n", 47 | "- Linear Algebra\n", 48 | "- Array and Matrix\n", 49 | "- Reshaping and Resizing\n", 50 | "- File I/O\n", 51 | "- Data Processing\n", 52 | "\n", 53 | "** 12:15 - 12:25 ** Short Break\n", 54 | "\n", 55 | "** PART 2 ** (12:25 - 13:30)\n", 56 | "\n", 57 | "** 12:25 - 12:55 ** Advanced Numpy Functions\n", 58 | "\n", 59 | "- Sparse Arrays with Scipy\n", 60 | "- Using Numpy for graph matrices\n", 61 | "- Memmap and Serialization\n", 62 | "\n", 63 | "** 12:55 - 13:25 ** Connecting Numpy with the Rest of the world\n", 64 | "\n", 65 | "- Machine Learning with scikit-learn\n", 66 | "\n", 67 | "** 13:25 - 13:30 ** A look at the future (of Numpy)" 68 | ] 69 | }, 70 | { 71 | "cell_type": "markdown", 72 | "metadata": {}, 73 | "source": [ 74 | "# Requirements" 75 | ] 76 | }, 77 | { 78 | "cell_type": "markdown", 79 | "metadata": {}, 80 | "source": [ 81 | "This tutorial requires the following packages:\n", 82 | "\n", 83 | "- Python version 2.7, 3.4+\n", 84 | "- `numpy` version 1.5 or later: http://www.numpy.org/\n", 85 | "- `scipy` version 0.9 or later: 
http://www.scipy.org/\n", 86 | "- `matplotlib` version 1.0 or later: http://matplotlib.org/\n", 87 | "- `ipython` version 1.0 or later, with notebook support: http://ipython.org\n", 88 | "\n", 89 | "(and for the *second part* of the tutorial):\n", 90 | "\n", 91 | "- `scikit-learn` version 0.12 or later: http://scikit-learn.org\n", 92 | "- `networkx` version 1.9.1 or later: https://networkx.github.io\n", 93 | "\n", 94 | "The easiest way to get these is to use an all-in-one installer such as [Anaconda](http://www.continuum.io/downloads) from Continuum. It is available for multiple architectures." 95 | ] 96 | }, 97 | { 98 | "cell_type": "markdown", 99 | "metadata": { 100 | "collapsed": true 101 | }, 102 | "source": [ 103 | "# How to set up your environment" 104 | ] 105 | }, 106 | { 107 | "cell_type": "markdown", 108 | "metadata": {}, 109 | "source": [ 110 | "## The simplest way" 111 | ] 112 | }, 113 | { 114 | "cell_type": "markdown", 115 | "metadata": {}, 116 | "source": [ 117 | "The easiest way to get these packages is to use the [conda](https://store.continuum.io) environment manager. \n", 118 | "\n", 119 | "I suggest downloading and installing [miniconda](http://conda.pydata.org/miniconda.html).\n", 120 | "\n", 121 | "The following command will install all required packages:\n", 122 | "\n", 123 | " $ conda install numpy scipy matplotlib scikit-learn ipython-notebook\n", 124 | " \n", 125 | "Alternatively, you can download and install the (very large) **Anaconda software distribution**, found at [https://store.continuum.io/](https://store.continuum.io/)." 126 | ] 127 | }, 128 | { 129 | "cell_type": "markdown", 130 | "metadata": {}, 131 | "source": [ 132 | "## The \"longest\" way" 133 | ] 134 | }, 135 | { 136 | "cell_type": "markdown", 137 | "metadata": {}, 138 | "source": [ 139 | "1. Create your **Virtual Environment** (highly suggested)\n", 140 | "\n", 141 | " - `$ virtualenv -p python3 numpy_training`\n", 142 | " - `$ source numpy_training/bin/activate`\n", 143 | "\n", 144 | "2. 
Install the packages with **pip**\n", 145 | " - `pip install numpy`\n", 146 | " - `pip install scipy`\n", 147 | " - `pip install matplotlib`\n", 148 | " - `pip install \"ipython[all]\" # don't forget the quotation marks!`\n", 149 | " - `pip install scikit-learn`" 150 | ] 151 | }, 152 | { 153 | "cell_type": "markdown", 154 | "metadata": {}, 155 | "source": [ 156 | "## Alternatives\n", 157 | "\n", 158 | "- **Linux**: If you're on Linux, you can use your distribution's package manager \n", 159 | "\n", 160 | " - Type, for example, `apt-get install python3-numpy` or `yum install numpy`.\n", 161 | " \n", 162 | " \n", 163 | "\n", 164 | "- **Mac**: If you're on OSX, there are similar tools such as MacPorts or HomeBrew which contain pre-compiled versions of these packages.\n", 165 | "\n", 166 | " - Just type `brew install numpy` in your terminal (if you're using HomeBrew)\n", 167 | "\n", 168 | "\n", 169 | "\n", 170 | "- **Windows**: Windows can be challenging: the best bet is probably to use a package installer such as Anaconda, above." 171 | ] 172 | }, 173 | { 174 | "cell_type": "markdown", 175 | "metadata": {}, 176 | "source": [ 177 | "### Python Version" 178 | ] 179 | }, 180 | { 181 | "cell_type": "markdown", 182 | "metadata": {}, 183 | "source": [ 184 | "I'm currently running this tutorial with **Python 3** on **Anaconda**" 185 | ] 186 | }, 187 | { 188 | "cell_type": "code", 189 | "execution_count": 10, 190 | "metadata": { 191 | "collapsed": false 192 | }, 193 | "outputs": [ 194 | { 195 | "name": "stdout", 196 | "output_type": "stream", 197 | "text": [ 198 | "Python 3.4.3 :: Anaconda 2.3.0 (x86_64)\r\n" 199 | ] 200 | } 201 | ], 202 | "source": [ 203 | "!python --version" 204 | ] 205 | }, 206 | { 207 | "cell_type": "markdown", 208 | "metadata": {}, 209 | "source": [ 210 | "# How to test if everything is Up&Running" 211 | ] 212 | }, 213 | { 214 | "cell_type": "markdown", 215 | "metadata": {}, 216 | "source": [ 217 | "## 1. 
Try running iPython with notebook support" 218 | ] 219 | }, 220 | { 221 | "cell_type": "code", 222 | "execution_count": null, 223 | "metadata": { 224 | "collapsed": false 225 | }, 226 | "outputs": [], 227 | "source": [ 228 | "!ipython notebook # run this in your terminal" 229 | ] 230 | }, 231 | { 232 | "cell_type": "markdown", 233 | "metadata": {}, 234 | "source": [ 235 | "## 2. Try to import everything" 236 | ] 237 | }, 238 | { 239 | "cell_type": "code", 240 | "execution_count": 2, 241 | "metadata": { 242 | "collapsed": true 243 | }, 244 | "outputs": [], 245 | "source": [ 246 | "import numpy as np\n", 247 | "import scipy as sp\n", 248 | "import matplotlib.pyplot as plt\n", 249 | "import pandas as pd\n", 250 | "import sklearn" 251 | ] 252 | }, 253 | { 254 | "cell_type": "markdown", 255 | "metadata": {}, 256 | "source": [ 257 | "## 3. Check Installed Versions " 258 | ] 259 | }, 260 | { 261 | "cell_type": "code", 262 | "execution_count": 6, 263 | "metadata": { 264 | "collapsed": false 265 | }, 266 | "outputs": [ 267 | { 268 | "name": "stdout", 269 | "output_type": "stream", 270 | "text": [ 271 | "numpy: 1.9.2\n", 272 | "scipy: 0.15.1\n", 273 | "matplotlib: 1.4.3\n", 274 | "iPython: 3.2.0\n", 275 | "scikit-learn: 0.16.1\n" 276 | ] 277 | } 278 | ], 279 | "source": [ 280 | "import numpy\n", 281 | "print('numpy:', numpy.__version__)\n", 282 | "\n", 283 | "import scipy\n", 284 | "print('scipy:', scipy.__version__)\n", 285 | "\n", 286 | "import matplotlib\n", 287 | "print('matplotlib:', matplotlib.__version__)\n", 288 | "\n", 289 | "import IPython\n", 290 | "print('iPython:', IPython.__version__)\n", 291 | "\n", 292 | "import sklearn\n", 293 | "print('scikit-learn:', sklearn.__version__)" 294 | ] 295 | }, 296 | { 297 | "cell_type": "markdown", 298 | "metadata": {}, 299 | "source": [ 300 | "### 4. 
Enable the inline visualisation of plots" 301 | ] 302 | }, 303 | { 304 | "cell_type": "code", 305 | "execution_count": 8, 306 | "metadata": { 307 | "collapsed": false 308 | }, 309 | "outputs": [], 310 | "source": [ 311 | "%matplotlib inline" 312 | ] 313 | }, 314 | { 315 | "cell_type": "markdown", 316 | "metadata": {}, 317 | "source": [ 318 | "
\n", 319 | "
\n", 320 | "

If everything worked up to this point, you're ready to start!

" 321 | ] 322 | } 323 | ], 324 | "metadata": { 325 | "kernelspec": { 326 | "display_name": "Python 3", 327 | "language": "python", 328 | "name": "python3" 329 | }, 330 | "language_info": { 331 | "codemirror_mode": { 332 | "name": "ipython", 333 | "version": 3 334 | }, 335 | "file_extension": ".py", 336 | "mimetype": "text/x-python", 337 | "name": "python", 338 | "nbconvert_exporter": "python", 339 | "pygments_lexer": "ipython3", 340 | "version": "3.4.3" 341 | } 342 | }, 343 | "nbformat": 4, 344 | "nbformat_minor": 0 345 | } 346 | -------------------------------------------------------------------------------- /06_Memmapping.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "# Memmapping" 8 | ] 9 | }, 10 | { 11 | "cell_type": "markdown", 12 | "metadata": {}, 13 | "source": [ 14 | "The numpy package makes it possible to memory map large contiguous chunks of binary files as shared memory for all the Python processes running on a given host:" 15 | ] 16 | }, 17 | { 18 | "cell_type": "code", 19 | "execution_count": 1, 20 | "metadata": { 21 | "collapsed": false 22 | }, 23 | "outputs": [], 24 | "source": [ 25 | "import numpy as np" 26 | ] 27 | }, 28 | { 29 | "cell_type": "markdown", 30 | "metadata": {}, 31 | "source": [ 32 | "* Creating a `numpy.memmap` instance with the `w+` mode creates a file on the filesystem and zeros its content. " 33 | ] 34 | }, 35 | { 36 | "cell_type": "code", 37 | "execution_count": 2, 38 | "metadata": { 39 | "collapsed": false 40 | }, 41 | "outputs": [ 42 | { 43 | "name": "stdout", 44 | "output_type": "stream", 45 | "text": [ 46 | "[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 
0.]\n" 47 | ] 48 | } 49 | ], 50 | "source": [ 51 | "# Cleanup any existing file from past session (necessary for windows)\n", 52 | "import os\n", 53 | "\n", 54 | "current_dir = os.path.abspath(os.path.curdir)\n", 55 | "mmap_filepath = os.path.join(current_dir, 'files', 'small.mmap')\n", 56 | "if os.path.exists(mmap_filepath):\n", 57 | " os.unlink(mmap_filepath)\n", 58 | "\n", 59 | "mm_w = np.memmap(mmap_filepath, shape=10, dtype=np.float32, mode='w+')\n", 60 | "print(mm_w)" 61 | ] 62 | }, 63 | { 64 | "cell_type": "markdown", 65 | "metadata": {}, 66 | "source": [ 67 | "* This binary file can then be mapped as a new numpy array by all the engines having access to the same filesystem. \n", 68 | "* The `mode='r+'` opens this shared memory area in read write mode:" 69 | ] 70 | }, 71 | { 72 | "cell_type": "code", 73 | "execution_count": 3, 74 | "metadata": { 75 | "collapsed": false 76 | }, 77 | "outputs": [ 78 | { 79 | "name": "stdout", 80 | "output_type": "stream", 81 | "text": [ 82 | "[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]\n" 83 | ] 84 | } 85 | ], 86 | "source": [ 87 | "mm_r = np.memmap('files/small.mmap', dtype=np.float32, mode='r+')\n", 88 | "print(mm_r)" 89 | ] 90 | }, 91 | { 92 | "cell_type": "code", 93 | "execution_count": 4, 94 | "metadata": { 95 | "collapsed": false 96 | }, 97 | "outputs": [ 98 | { 99 | "name": "stdout", 100 | "output_type": "stream", 101 | "text": [ 102 | "[ 42. 0. 0. 0. 0. 0. 0. 0. 0. 0.]\n" 103 | ] 104 | } 105 | ], 106 | "source": [ 107 | "mm_w[0] = 42\n", 108 | "print(mm_w)" 109 | ] 110 | }, 111 | { 112 | "cell_type": "code", 113 | "execution_count": 5, 114 | "metadata": { 115 | "collapsed": false 116 | }, 117 | "outputs": [ 118 | { 119 | "name": "stdout", 120 | "output_type": "stream", 121 | "text": [ 122 | "[ 42. 0. 0. 0. 0. 0. 0. 0. 0. 
0.]\n" 123 | ] 124 | } 125 | ], 126 | "source": [ 127 | "print(mm_r)" 128 | ] 129 | }, 130 | { 131 | "cell_type": "markdown", 132 | "metadata": {}, 133 | "source": [ 134 | "* Memory mapped arrays opened with `mode='r+'` can be modified, and the modifications are shared \n", 135 | " - across all the processes that map the same file" 136 | ] 137 | }, 138 | { 139 | "cell_type": "code", 140 | "execution_count": 12, 141 | "metadata": { 142 | "collapsed": false 143 | }, 144 | "outputs": [], 145 | "source": [ 146 | "mm_r[1] = 43" 147 | ] 148 | }, 149 | { 150 | "cell_type": "code", 151 | "execution_count": 13, 152 | "metadata": { 153 | "collapsed": false 154 | }, 155 | "outputs": [ 156 | { 157 | "name": "stdout", 158 | "output_type": "stream", 159 | "text": [ 160 | "[ 42. 43. 0. 0. 0. 0. 0. 0. 0. 0.]\n" 161 | ] 162 | } 163 | ], 164 | "source": [ 165 | "print(mm_r)" 166 | ] 167 | }, 168 | { 169 | "cell_type": "markdown", 170 | "metadata": {}, 171 | "source": [ 172 | "### Memmap Operations" 173 | ] 174 | }, 175 | { 176 | "cell_type": "markdown", 177 | "metadata": {}, 178 | "source": [ 179 | "Memmap arrays generally behave very much like regular in-memory numpy arrays:" 180 | ] 181 | }, 182 | { 183 | "cell_type": "code", 184 | "execution_count": 14, 185 | "metadata": { 186 | "collapsed": false 187 | }, 188 | "outputs": [ 189 | { 190 | "name": "stdout", 191 | "output_type": "stream", 192 | "text": [ 193 | "85.0\n", 194 | "sum=85.0, mean=8.5, std=17.0014705657959\n" 195 | ] 196 | } 197 | ], 198 | "source": [ 199 | "print(mm_r.sum())\n", 200 | "print(\"sum={0}, mean={1}, std={2}\".format(mm_r.sum(), \n", 201 | " np.mean(mm_r), np.std(mm_r)))" 202 | ] 203 | }, 204 | { 205 | "cell_type": "markdown", 206 | "metadata": {}, 207 | "source": [ 208 | "Before allocating more data let us define a couple of utility functions from the previous exercise (and more) to monitor what is used by which engine and what is still free on the cluster as a whole:" 209 | ] 210 | }, 211 | { 212 | "cell_type": "markdown", 
213 | "metadata": {}, 214 | "source": [ 215 | "* Let's allocate an 80MB memmap array:" 216 | ] 217 | }, 218 | { 219 | "cell_type": "code", 220 | "execution_count": 15, 221 | "metadata": { 222 | "collapsed": false 223 | }, 224 | "outputs": [ 225 | { 226 | "data": { 227 | "text/plain": [ 228 | "memmap([ 0., 0., 0., ..., 0., 0., 0.])" 229 | ] 230 | }, 231 | "execution_count": 15, 232 | "metadata": {}, 233 | "output_type": "execute_result" 234 | } 235 | ], 236 | "source": [ 237 | "# Cleanup any existing file from past session (necessary for Windows)\n", 238 | "import os\n", 239 | "if os.path.exists('files/big.mmap'):\n", 240 | " os.unlink('files/big.mmap')\n", 241 | "\n", 242 | "np.memmap('files/big.mmap', shape=10 * int(1e6), dtype=np.float64, mode='w+')" 243 | ] 244 | }, 245 | { 246 | "cell_type": "markdown", 247 | "metadata": {}, 248 | "source": [ 249 | "No significant memory was used in this operation, as we just asked the OS to allocate the buffer on the hard drive and maintain a virtual memory area as a cheap reference to this buffer.\n", 250 | "\n", 251 | "Let's open new references to the same buffer from all the engines at once:" 252 | ] 253 | }, 254 | { 255 | "cell_type": "code", 256 | "execution_count": 17, 257 | "metadata": { 258 | "collapsed": false 259 | }, 260 | "outputs": [ 261 | { 262 | "name": "stdout", 263 | "output_type": "stream", 264 | "text": [ 265 | "CPU times: user 393 µs, sys: 577 µs, total: 970 µs\n", 266 | "Wall time: 773 µs\n" 267 | ] 268 | } 269 | ], 270 | "source": [ 271 | "%time big_mmap = np.memmap('files/big.mmap', dtype=np.float64, mode='r+')" 272 | ] 273 | }, 274 | { 275 | "cell_type": "code", 276 | "execution_count": 18, 277 | "metadata": { 278 | "collapsed": false 279 | }, 280 | "outputs": [ 281 | { 282 | "data": { 283 | "text/plain": [ 284 | "memmap([ 0., 0., 0., ..., 0., 0., 0.])" 285 | ] 286 | }, 287 | "execution_count": 18, 288 | "metadata": {}, 289 | "output_type": "execute_result" 290 | } 291 | ], 292 | "source": [ 293 | 
"big_mmap" 294 | ] 295 | }, 296 | { 297 | "cell_type": "markdown", 298 | "metadata": {}, 299 | "source": [ 300 | "* Let's trigger an actual load of the data from the drive into the in-memory disk cache of the OS. This can take some time depending on the speed of the hard drive (at 100MB/s to 300MB/s, roughly 0.3s to 0.8s for this 80MB dataset):" 301 | ] 302 | }, 303 | { 304 | "cell_type": "code", 305 | "execution_count": 19, 306 | "metadata": { 307 | "collapsed": false 308 | }, 309 | "outputs": [ 310 | { 311 | "name": "stdout", 312 | "output_type": "stream", 313 | "text": [ 314 | "CPU times: user 39.4 ms, sys: 89.6 ms, total: 129 ms\n", 315 | "Wall time: 602 ms\n" 316 | ] 317 | }, 318 | { 319 | "data": { 320 | "text/plain": [ 321 | "memmap(0.0)" 322 | ] 323 | }, 324 | "execution_count": 19, 325 | "metadata": {}, 326 | "output_type": "execute_result" 327 | } 328 | ], 329 | "source": [ 330 | "%time np.sum(big_mmap)" 331 | ] 332 | }, 333 | { 334 | "cell_type": "markdown", 335 | "metadata": {}, 336 | "source": [ 337 | "* Now that the data is in the OS cache, the same operation runs much faster:" 338 | ] 339 | }, 340 | { 341 | "cell_type": "code", 342 | "execution_count": 20, 343 | "metadata": { 344 | "collapsed": false 345 | }, 346 | "outputs": [ 347 | { 348 | "name": "stdout", 349 | "output_type": "stream", 350 | "text": [ 351 | "CPU times: user 16.6 ms, sys: 2.2 ms, total: 18.8 ms\n", 352 | "Wall time: 16.3 ms\n" 353 | ] 354 | }, 355 | { 356 | "data": { 357 | "text/plain": [ 358 | "memmap(0.0)" 359 | ] 360 | }, 361 | "execution_count": 20, 362 | "metadata": {}, 363 | "output_type": "execute_result" 364 | } 365 | ], 366 | "source": [ 367 | "%time np.sum(big_mmap)" 368 | ] 369 | }, 370 | { 371 | "cell_type": "markdown", 372 | "metadata": {}, 373 | "source": [ 374 | "This strategy makes memory mapping very attractive for loading the read-only datasets of machine learning problems, especially when the same data is reused over and over by concurrent processes, as can be the case when doing learning curve analysis or grid search 
(**Hyperparameter Optimisation** & **Model Selection**).\n", 375 | "\n", 376 | "This matters most for multiple, **embarrassingly** parallel processes (such as **Grid Search**)." 377 | ] 378 | }, 379 | { 380 | "cell_type": "markdown", 381 | "metadata": {}, 382 | "source": [ 383 | "## Memmapping Nested Numpy-based Data Structures with Joblib" 384 | ] 385 | }, 386 | { 387 | "cell_type": "markdown", 388 | "metadata": {}, 389 | "source": [ 390 | "joblib is a utility library included in the sklearn package. Among other things, it provides tools to serialize objects that comprise large numpy arrays and reload them as memmap-backed data structures.\n", 391 | "\n", 392 | "To demonstrate it, let's create an arbitrary Python data structure involving numpy arrays:" 393 | ] 394 | }, 395 | { 396 | "cell_type": "code", 397 | "execution_count": 21, 398 | "metadata": { 399 | "collapsed": false 400 | }, 401 | "outputs": [ 402 | { 403 | "data": { 404 | "text/plain": [ 405 | "(array([[ 0., 0., 0., 0.],\n", 406 | " [ 0., 0., 0., 0.],\n", 407 | " [ 0., 0., 0., 0.]], dtype=float32), array([[1, 1, 1, 1],\n", 408 | " [1, 1, 1, 1],\n", 409 | " [1, 1, 1, 1]]))" 410 | ] 411 | }, 412 | "execution_count": 21, 413 | "metadata": {}, 414 | "output_type": "execute_result" 415 | } 416 | ], 417 | "source": [ 418 | "import numpy as np\n", 419 | "\n", 420 | "class MyDataStructure(object):\n", 421 | " \n", 422 | " def __init__(self, shape):\n", 423 | " self.float_zeros = np.zeros(shape, dtype=np.float32)\n", 424 | " self.integer_ones = np.ones(shape, dtype=np.int64)\n", 425 | " \n", 426 | "data_structure = MyDataStructure((3, 4))\n", 427 | "data_structure.float_zeros, data_structure.integer_ones" 428 | ] 429 | }, 430 | { 431 | "cell_type": "markdown", 432 | "metadata": {}, 433 | "source": [ 434 | "We can now persist this data structure to disk:" 435 | ] 436 | }, 437 | { 438 | "cell_type": "code", 439 | "execution_count": 22, 440 | "metadata": { 441 | "collapsed": false 442 | }, 443 | 
"outputs": [ 444 | { 445 | "data": { 446 | "text/plain": [ 447 | "['files/data_structure.pkl',\n", 448 | " 'files/data_structure.pkl_01.npy',\n", 449 | " 'files/data_structure.pkl_02.npy']" 450 | ] 451 | }, 452 | "execution_count": 22, 453 | "metadata": {}, 454 | "output_type": "execute_result" 455 | } 456 | ], 457 | "source": [ 458 | "from sklearn.externals import joblib\n", 459 | "joblib.dump(data_structure, 'files/data_structure.pkl')" 460 | ] 461 | }, 462 | { 463 | "cell_type": "code", 464 | "execution_count": 23, 465 | "metadata": { 466 | "collapsed": false 467 | }, 468 | "outputs": [ 469 | { 470 | "name": "stdout", 471 | "output_type": "stream", 472 | "text": [ 473 | "-rw-r--r-- 1 valerio staff 267 Jul 21 10:17 files/data_structure.pkl\r\n", 474 | "-rw-r--r-- 1 valerio staff 176 Jul 21 10:17 files/data_structure.pkl_01.npy\r\n", 475 | "-rw-r--r-- 1 valerio staff 128 Jul 21 10:17 files/data_structure.pkl_02.npy\r\n" 476 | ] 477 | } 478 | ], 479 | "source": [ 480 | "!ls -l files/data_structure*" 481 | ] 482 | }, 483 | { 484 | "cell_type": "markdown", 485 | "metadata": {}, 486 | "source": [ 487 | "A memmapped copy of this datastructure can then be loaded:" 488 | ] 489 | }, 490 | { 491 | "cell_type": "code", 492 | "execution_count": 24, 493 | "metadata": { 494 | "collapsed": false 495 | }, 496 | "outputs": [ 497 | { 498 | "data": { 499 | "text/plain": [ 500 | "(memmap([[ 0., 0., 0., 0.],\n", 501 | " [ 0., 0., 0., 0.],\n", 502 | " [ 0., 0., 0., 0.]], dtype=float32), memmap([[1, 1, 1, 1],\n", 503 | " [1, 1, 1, 1],\n", 504 | " [1, 1, 1, 1]]))" 505 | ] 506 | }, 507 | "execution_count": 24, 508 | "metadata": {}, 509 | "output_type": "execute_result" 510 | } 511 | ], 512 | "source": [ 513 | "memmaped_data_structure = joblib.load('files/data_structure.pkl', \n", 514 | " mmap_mode='r+')\n", 515 | "memmaped_data_structure.float_zeros, memmaped_data_structure.integer_ones" 516 | ] 517 | } 518 | ], 519 | "metadata": { 520 | "kernelspec": { 521 | "display_name": "Python 3", 
522 | "language": "python", 523 | "name": "python3" 524 | }, 525 | "language_info": { 526 | "codemirror_mode": { 527 | "name": "ipython", 528 | "version": 3 529 | }, 530 | "file_extension": ".py", 531 | "mimetype": "text/x-python", 532 | "name": "python", 533 | "nbconvert_exporter": "python", 534 | "pygments_lexer": "ipython3", 535 | "version": "3.4.3" 536 | } 537 | }, 538 | "nbformat": 4, 539 | "nbformat_minor": 0 540 | } 541 | -------------------------------------------------------------------------------- /05_Sparse_Matrices.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": { 6 | "slideshow": { 7 | "slide_type": "skip" 8 | } 9 | }, 10 | "source": [ 11 | "This notebook was put together by [Jake Vanderplas](http://www.vanderplas.com) for PyCon 2014. Source and license info is on [GitHub](https://github.com/jakevdp/sklearn_pycon2014/)." 12 | ] 13 | }, 14 | { 15 | "cell_type": "markdown", 16 | "metadata": { 17 | "slideshow": { 18 | "slide_type": "slide" 19 | } 20 | }, 21 | "source": [ 22 | "# Scipy Sparse Matrices" 23 | ] 24 | }, 25 | { 26 | "cell_type": "markdown", 27 | "metadata": {}, 28 | "source": [ 29 | "**Sparse Matrices** are very nice in some situations. \n", 30 | "\n", 31 | "For example, in some machine learning tasks, especially those associated\n", 32 | "with textual analysis, the data may be mostly zeros. \n", 33 | "\n", 34 | "Storing all these zeros is very inefficient. 
\n", 35 | "\n", 36 | "We can create and manipulate sparse matrices as follows:" 37 | ] 38 | }, 39 | { 40 | "cell_type": "code", 41 | "execution_count": 2, 42 | "metadata": { 43 | "collapsed": true 44 | }, 45 | "outputs": [], 46 | "source": [ 47 | "import numpy as np" 48 | ] 49 | }, 50 | { 51 | "cell_type": "code", 52 | "execution_count": 4, 53 | "metadata": { 54 | "collapsed": false, 55 | "slideshow": { 56 | "slide_type": "subslide" 57 | } 58 | }, 59 | "outputs": [ 60 | { 61 | "name": "stdout", 62 | "output_type": "stream", 63 | "text": [ 64 | "[[ 0.92071168 0.66941621 0.30097014 0.8668366 0.94764952]\n", 65 | " [ 0.16978456 0.59292571 0.78884569 0.76910071 0.56415941]\n", 66 | " [ 0.096867 0.96869327 0.8643055 0.0297782 0.11921581]\n", 67 | " [ 0.22387061 0.71015351 0.45882072 0.34433871 0.85566776]\n", 68 | " [ 0.22217957 0.83387745 0.40605966 0.41212024 0.65548993]\n", 69 | " [ 0.53416368 0.92406734 0.66444729 0.57218427 0.48198361]\n", 70 | " [ 0.37469397 0.33167227 0.9107519 0.03360275 0.20205017]\n", 71 | " [ 0.39939621 0.61025928 0.14715445 0.86871212 0.25921407]\n", 72 | " [ 0.07210422 0.99690991 0.31477122 0.49698491 0.34563232]\n", 73 | " [ 0.10310154 0.3806856 0.77690381 0.46116052 0.43330533]]\n" 74 | ] 75 | } 76 | ], 77 | "source": [ 78 | "# Create a random array with a lot of zeros\n", 79 | "X = np.random.random((10, 5))\n", 80 | "print(X)" 81 | ] 82 | }, 83 | { 84 | "cell_type": "code", 85 | "execution_count": 5, 86 | "metadata": { 87 | "collapsed": false, 88 | "slideshow": { 89 | "slide_type": "subslide" 90 | } 91 | }, 92 | "outputs": [ 93 | { 94 | "name": "stdout", 95 | "output_type": "stream", 96 | "text": [ 97 | "[[ 0.92071168 0. 0. 0.8668366 0.94764952]\n", 98 | " [ 0. 0. 0.78884569 0.76910071 0. ]\n", 99 | " [ 0. 0.96869327 0.8643055 0. 0. ]\n", 100 | " [ 0. 0.71015351 0. 0. 0.85566776]\n", 101 | " [ 0. 0.83387745 0. 0. 0. ]\n", 102 | " [ 0. 0.92406734 0. 0. 0. ]\n", 103 | " [ 0. 0. 0.9107519 0. 0. ]\n", 104 | " [ 0. 0. 0. 0.86871212 0. 
]\n", 105 | " [ 0. 0.99690991 0. 0. 0. ]\n", 106 | " [ 0. 0. 0.77690381 0. 0. ]]\n" 107 | ] 108 | } 109 | ], 110 | "source": [ 111 | "X[X < 0.7] = 0\n", 112 | "print(X)" 113 | ] 114 | }, 115 | { 116 | "cell_type": "code", 117 | "execution_count": 6, 118 | "metadata": { 119 | "collapsed": false, 120 | "slideshow": { 121 | "slide_type": "subslide" 122 | } 123 | }, 124 | "outputs": [ 125 | { 126 | "name": "stdout", 127 | "output_type": "stream", 128 | "text": [ 129 | " (0, 0)\t0.920711681384\n", 130 | " (0, 3)\t0.866836604396\n", 131 | " (0, 4)\t0.947649515452\n", 132 | " (1, 2)\t0.788845688727\n", 133 | " (1, 3)\t0.769100712548\n", 134 | " (2, 1)\t0.968693269052\n", 135 | " (2, 2)\t0.864305496772\n", 136 | " (3, 1)\t0.710153508323\n", 137 | " (3, 4)\t0.855667757095\n", 138 | " (4, 1)\t0.833877448584\n", 139 | " (5, 1)\t0.924067342994\n", 140 | " (6, 2)\t0.910751902907\n", 141 | " (7, 3)\t0.868712121221\n", 142 | " (8, 1)\t0.996909907387\n", 143 | " (9, 2)\t0.776903807028\n" 144 | ] 145 | } 146 | ], 147 | "source": [ 148 | "from scipy import sparse\n", 149 | "\n", 150 | "# turn X into a csr (Compressed-Sparse-Row) matrix\n", 151 | "X_csr = sparse.csr_matrix(X)\n", 152 | "print(X_csr)" 153 | ] 154 | }, 155 | { 156 | "cell_type": "code", 157 | "execution_count": 7, 158 | "metadata": { 159 | "collapsed": false, 160 | "slideshow": { 161 | "slide_type": "subslide" 162 | } 163 | }, 164 | "outputs": [ 165 | { 166 | "name": "stdout", 167 | "output_type": "stream", 168 | "text": [ 169 | "[[ 0.92071168 0. 0. 0.8668366 0.94764952]\n", 170 | " [ 0. 0. 0.78884569 0.76910071 0. ]\n", 171 | " [ 0. 0.96869327 0.8643055 0. 0. ]\n", 172 | " [ 0. 0.71015351 0. 0. 0.85566776]\n", 173 | " [ 0. 0.83387745 0. 0. 0. ]\n", 174 | " [ 0. 0.92406734 0. 0. 0. ]\n", 175 | " [ 0. 0. 0.9107519 0. 0. ]\n", 176 | " [ 0. 0. 0. 0.86871212 0. ]\n", 177 | " [ 0. 0.99690991 0. 0. 0. ]\n", 178 | " [ 0. 0. 0.77690381 0. 0. 
]]\n", 179 | ] 180 | } 181 | ], 182 | "source": [ 183 | "# convert the sparse matrix to a dense array\n", 184 | "print(X_csr.toarray())" 185 | ] 186 | }, 187 | { 188 | "cell_type": "code", 189 | "execution_count": 8, 190 | "metadata": { 191 | "collapsed": false, 192 | "slideshow": { 193 | "slide_type": "subslide" 194 | } 195 | }, 196 | "outputs": [ 197 | { 198 | "data": { 199 | "text/plain": [ 200 | "True" 201 | ] 202 | }, 203 | "execution_count": 8, 204 | "metadata": {}, 205 | "output_type": "execute_result" 206 | } 207 | ], 208 | "source": [ 209 | "# Sparse matrices support linear algebra:\n", 210 | "y = np.random.random(X_csr.shape[1])\n", 211 | "z1 = X_csr.dot(y)\n", 212 | "z2 = X.dot(y)\n", 213 | "np.allclose(z1, z2)" 214 | ] 215 | }, 216 | { 217 | "cell_type": "markdown", 218 | "metadata": { 219 | "slideshow": { 220 | "slide_type": "subslide" 221 | } 222 | }, 223 | "source": [ 224 | "* The CSR representation can be very efficient for computations, but it is not as good for adding elements. \n", 225 | "\n", 226 | "* For that, the **LIL** (List of Lists) representation is better:" 227 | ] 228 | }, 229 | { 230 | "cell_type": "code", 231 | "execution_count": 9, 232 | "metadata": { 233 | "collapsed": false, 234 | "slideshow": { 235 | "slide_type": "fragment" 236 | } 237 | }, 238 | "outputs": [ 239 | { 240 | "name": "stdout", 241 | "output_type": "stream", 242 | "text": [ 243 | " (0, 2)\t2.0\n", 244 | " (1, 1)\t2.0\n", 245 | " (1, 2)\t3.0\n", 246 | " (1, 3)\t4.0\n", 247 | " (2, 0)\t2.0\n", 248 | " (2, 3)\t5.0\n", 249 | " (2, 4)\t6.0\n", 250 | " (3, 0)\t3.0\n", 251 | " (3, 1)\t4.0\n", 252 | " (3, 4)\t7.0\n", 253 | " (4, 2)\t6.0\n", 254 | " (4, 3)\t7.0\n", 255 | "[[ 0. 0. 2. 0. 0.]\n", 256 | " [ 0. 2. 3. 4. 0.]\n", 257 | " [ 2. 0. 0. 5. 6.]\n", 258 | " [ 3. 4. 0. 0. 7.]\n", 259 | " [ 0. 0. 6. 7. 
0.]]\n" 260 | ] 261 | } 262 | ], 263 | "source": [ 264 | "# Create an empty LIL matrix and add some items\n", 265 | "X_lil = sparse.lil_matrix((5, 5))\n", 266 | "\n", 267 | "for i, j in np.random.randint(0, 5, (15, 2)):\n", 268 | " X_lil[i, j] = i + j\n", 269 | "\n", 270 | "print(X_lil)\n", 271 | "print(X_lil.toarray())" 272 | ] 273 | }, 274 | { 275 | "cell_type": "markdown", 276 | "metadata": { 277 | "slideshow": { 278 | "slide_type": "subslide" 279 | } 280 | }, 281 | "source": [ 282 | "* Often, once an LIL matrix is created, it is useful to convert it to a CSR format \n", 283 | " * **Note**: many scikit-learn algorithms require CSR or CSC format" 284 | ] 285 | }, 286 | { 287 | "cell_type": "code", 288 | "execution_count": 10, 289 | "metadata": { 290 | "collapsed": false, 291 | "slideshow": { 292 | "slide_type": "fragment" 293 | } 294 | }, 295 | "outputs": [ 296 | { 297 | "name": "stdout", 298 | "output_type": "stream", 299 | "text": [ 300 | " (0, 2)\t2.0\n", 301 | " (1, 1)\t2.0\n", 302 | " (1, 2)\t3.0\n", 303 | " (1, 3)\t4.0\n", 304 | " (2, 0)\t2.0\n", 305 | " (2, 3)\t5.0\n", 306 | " (2, 4)\t6.0\n", 307 | " (3, 0)\t3.0\n", 308 | " (3, 1)\t4.0\n", 309 | " (3, 4)\t7.0\n", 310 | " (4, 2)\t6.0\n", 311 | " (4, 3)\t7.0\n" 312 | ] 313 | } 314 | ], 315 | "source": [ 316 | "X_csr = X_lil.tocsr()\n", 317 | "print(X_csr)" 318 | ] 319 | }, 320 | { 321 | "cell_type": "markdown", 322 | "metadata": { 323 | "slideshow": { 324 | "slide_type": "subslide" 325 | } 326 | }, 327 | "source": [ 328 | "There are several other sparse formats that can be useful for various problems:\n", 329 | "\n", 330 | "- `CSC` (compressed sparse column)\n", 331 | "- `BSR` (block sparse row)\n", 332 | "- `COO` (coordinate)\n", 333 | "- `DIA` (diagonal)\n", 334 | "- `DOK` (dictionary of keys)" 335 | ] 336 | }, 337 | { 338 | "cell_type": "markdown", 339 | "metadata": {}, 340 | "source": [ 341 | "## CSC - Compressed Sparse Column\n", 342 | "\n", 343 | "**Advantages of the CSC format**\n", 344 | "\n", 345 | 
" * efficient arithmetic operations CSC + CSC, CSC * CSC, etc.\n", 346 | " * efficient column slicing\n", 347 | " * fast matrix vector products (CSR, BSR may be faster)\n", 348 | "\n", 349 | "**Disadvantages of the CSC format**\n", 350 | "\n", 351 | " * slow row slicing operations (consider CSR)\n", 352 | " * changes to the sparsity structure are expensive (consider LIL or DOK)" 353 | ] 354 | }, 355 | { 356 | "cell_type": "markdown", 357 | "metadata": {}, 358 | "source": [ 359 | "## BSR - Block Sparse Row\n", 360 | "\n", 361 | "The Block Sparse Row (`BSR`) format is very similar to the Compressed Sparse Row (`CSR`) format. \n", 362 | "\n", 363 | "BSR is appropriate for sparse matrices with *dense sub-matrices* like the example below. \n", 364 | "\n", 365 | "Block matrices often arise in *vector-valued* finite element discretizations. \n", 366 | "\n", 367 | "In such cases, BSR is **considerably more efficient** than CSR and CSC for many sparse arithmetic operations." 368 | ] 369 | }, 370 | { 371 | "cell_type": "code", 372 | "execution_count": 12, 373 | "metadata": { 374 | "collapsed": false 375 | }, 376 | "outputs": [ 377 | { 378 | "data": { 379 | "text/plain": [ 380 | "array([[1, 1, 0, 0, 2, 2],\n", 381 | " [1, 1, 0, 0, 2, 2],\n", 382 | " [0, 0, 0, 0, 3, 3],\n", 383 | " [0, 0, 0, 0, 3, 3],\n", 384 | " [4, 4, 5, 5, 6, 6],\n", 385 | " [4, 4, 5, 5, 6, 6]])" 386 | ] 387 | }, 388 | "execution_count": 12, 389 | "metadata": {}, 390 | "output_type": "execute_result" 391 | } 392 | ], 393 | "source": [ 394 | "from scipy.sparse import bsr_matrix\n", 395 | "\n", 396 | "indptr = np.array([0, 2, 3, 6])\n", 397 | "indices = np.array([0, 2, 2, 0, 1, 2])\n", 398 | "data = np.array([1, 2, 3, 4, 5, 6]).repeat(4).reshape(6, 2, 2)\n", 399 | "bsr_matrix((data,indices,indptr), shape=(6, 6)).toarray()" 400 | ] 401 | }, 402 | { 403 | "cell_type": "markdown", 404 | "metadata": {}, 405 | "source": [ 406 | "## COO - Coordinate Sparse Matrix\n", 407 | "\n", 408 | "**Advantages of the COO 
format**\n", 409 | "\n", 410 | " * facilitates fast conversion among sparse formats\n", 411 | " * permits duplicate entries (see example)\n", 412 | " * very fast conversion to and from CSR/CSC formats\n", 413 | "\n", 414 | "**Disadvantages of the COO format**\n", 415 | "\n", 416 | " * does not directly support arithmetic operations and slicing\n", 417 | " \n", 418 | "**Intended Usage**\n", 419 | "\n", 420 | " * COO is a fast format for constructing sparse matrices\n", 421 | " * Once a matrix has been constructed, convert to CSR or CSC format for fast arithmetic and matrix vector\n", 422 | " operations\n", 423 | " * By default when converting to CSR or CSC format, duplicate (i,j) entries will be summed together. \n", 424 | " This facilitates efficient construction of finite element matrices and the like.\n" 425 | ] 426 | }, 427 | { 428 | "cell_type": "markdown", 429 | "metadata": {}, 430 | "source": [ 431 | "## DOK - Dictionary of Keys\n", 432 | "\n", 433 | "Sparse matrices can be used in arithmetic operations: they support addition, subtraction, multiplication, division, and matrix power.\n", 434 | "\n", 435 | "The DOK format allows for efficient O(1) access to individual elements. Duplicates are not allowed. A DOK matrix can be efficiently converted to a `coo_matrix` once constructed." 
436 | ] 437 | }, 438 | { 439 | "cell_type": "code", 440 | "execution_count": 15, 441 | "metadata": { 442 | "collapsed": false 443 | }, 444 | "outputs": [ 445 | { 446 | "data": { 447 | "text/plain": [ 448 | "array([[ 0., 1., 2., 3., 4.],\n", 449 | " [ 0., 2., 3., 4., 5.],\n", 450 | " [ 0., 0., 4., 5., 6.],\n", 451 | " [ 0., 0., 0., 6., 7.],\n", 452 | " [ 0., 0., 0., 0., 8.]], dtype=float32)" 453 | ] 454 | }, 455 | "execution_count": 15, 456 | "metadata": {}, 457 | "output_type": "execute_result" 458 | } 459 | ], 460 | "source": [ 461 | "from scipy.sparse import dok_matrix\n", 462 | "S = dok_matrix((5, 5), dtype=np.float32)\n", 463 | "for i in range(5):\n", 464 | "    for j in range(i, 5):\n", 465 | "        S[i,j] = i+j\n", 466 | "\n", 467 | "S.toarray()" 468 | ] 469 | }, 470 | { 471 | "cell_type": "markdown", 472 | "metadata": {}, 473 | "source": [ 474 | "The ``scipy.sparse`` submodule also has a lot of functions for sparse matrices\n", 475 | "including linear algebra, sparse solvers, graph algorithms, and much more." 
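, "\n",
"\n",
"As a closing illustration, here is a minimal sketch (toy values chosen only for illustration, not taken from the notebook) that builds a COO matrix containing a duplicate entry, converts it to CSR, where duplicates are summed, and solves a small linear system with `scipy.sparse.linalg.spsolve`:\n",
"\n",
"```python\n",
"import numpy as np\n",
"from scipy import sparse\n",
"from scipy.sparse.linalg import spsolve\n",
"\n",
"# Toy data (assumed): entry (0, 0) is given twice on purpose\n",
"rows = np.array([0, 0, 1, 2, 2])\n",
"cols = np.array([0, 0, 1, 0, 2])\n",
"vals = np.array([1.0, 3.0, 2.0, 1.0, 5.0])\n",
"\n",
"A_coo = sparse.coo_matrix((vals, (rows, cols)), shape=(3, 3))\n",
"A_csr = A_coo.tocsr()  # the two (0, 0) entries are summed: 1 + 3 = 4\n",
"print(A_csr.toarray())\n",
"\n",
"b = np.array([4.0, 2.0, 6.0])\n",
"x = spsolve(A_csr, b)  # sparse direct solve of A x = b\n",
"print(np.allclose(A_csr.dot(x), b))\n",
"```\n",
"\n",
"COO is used only for construction here; the arithmetic and the solve happen after the conversion to CSR."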
476 | ] 477 | } 478 | ], 479 | "metadata": { 480 | "kernelspec": { 481 | "display_name": "Python 3", 482 | "language": "python", 483 | "name": "python3" 484 | }, 485 | "language_info": { 486 | "codemirror_mode": { 487 | "name": "ipython", 488 | "version": 3 489 | }, 490 | "file_extension": ".py", 491 | "mimetype": "text/x-python", 492 | "name": "python", 493 | "nbconvert_exporter": "python", 494 | "pygments_lexer": "ipython3", 495 | "version": "3.4.3" 496 | } 497 | }, 498 | "nbformat": 4, 499 | "nbformat_minor": 0 500 | } 501 | -------------------------------------------------------------------------------- /05_1_Sparse_Graphs_in_Python.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "## Scipy Sparse Matrix\n", 8 | "\n", 9 | "There are several sparse formats that can be useful for various problems:\n", 10 | "\n", 11 | "- `CSR` (compressed sparse row)\n", 12 | "- `CSC` (compressed sparse column)\n", 13 | "- `BSR` (block sparse row)\n", 14 | "- `COO` (coordinate)\n", 15 | "- `DIA` (diagonal)\n", 16 | "- `DOK` (dictionary of keys)\n", 17 | "\n", 18 | "The ``scipy.sparse`` submodule also has a lot of functions for sparse matrices\n", 19 | "including linear algebra, sparse solvers, graph algorithms, and much more." 20 | ] 21 | }, 22 | { 23 | "cell_type": "markdown", 24 | "metadata": {}, 25 | "source": [ 26 | "### Preamble\n", 27 | "The following notebook contains code and explanations extracted from the post [\"Sparse Graphs in Python: Playing with Word Ladders\"](http://jakevdp.github.io/blog/2012/10/14/scipy-sparse-graph-module-word-ladders/ \"Scipy Sparse Graphs\") on the blog [Pythonic Perambulations](http://jakevdp.github.io) by Jake Vanderplas." 
28 | ] 29 | }, 30 | { 31 | "cell_type": "markdown", 32 | "metadata": {}, 33 | "source": [ 34 | "Let's Play with Word Ladders\n", 35 | "----------------------------\n", 36 | "\n", 37 | "[Word Ladders](http://en.wikipedia.org/wiki/Word_ladder) is a game invented by the famous English writer and mathematician **Charles Lutwidge Dodgson**, better known by the name of [Lewis Carroll](http://en.wikipedia.org/wiki/Lewis_Carroll).\n", 38 | "\n", 39 | "A *Word Ladder* puzzle begins with two words, and to solve the puzzle one must find a chain of other words to link the two, in which two adjacent words (that is, words in successive steps) differ by only one letter.\n", 40 | "\n", 41 | "For example, Lewis Carroll found that only six steps were required for the word `APE` to *evolve* to `MAN` [\\[1\\]][1]:\n", 42 | "\n", 43 | "`APE -> APT -> OPT -> OAT -> MAT -> MAN`\n", 44 | "\n", 45 | "---\n", 46 | "[1]: http://books.google.com/books?isbn=0198662645 \"Oxford Guide to Word Games\"" 47 | ] 48 | }, 49 | { 50 | "cell_type": "markdown", 51 | "metadata": {}, 52 | "source": [ 53 | "### Math Games and Computer Science\n", 54 | "\n", 55 | "Like many other mathematical games, Word Ladders have attracted the interest of many computer scientists. The most famous algorithmic contribution to this game was provided by [Don Knuth](http://en.wikipedia.org/wiki/Donald_Knuth). \n", 56 | "\n", 57 | "He studied five-letter word ladders, as he believed that three-letter word ladders were too easy, and that six-letter word ladders were less interesting, since relatively few pairs of English words could be connected [\\[2\\]][2].\n", 58 | "\n", 59 | "Knuth used a fixed collection of 5,757 of the most common English five-letter words, and discovered that most words were connected to each other. \n", 60 | "Only 671 words of the collection did not form a word ladder with any other words. 
\n", 61 | "He called these words *aloof*, because the word `aloof` is itself an example of such a word [\\[2\\]][2].\n", 62 | "\n", 63 | "---\n", 64 | "[2]: http://books.google.com/books?isbn=0883855550 \"The Edge of the Universe: Celebrating Ten Years of Math Horizons\"" 65 | ] 66 | }, 67 | { 68 | "cell_type": "code", 69 | "execution_count": 1, 70 | "metadata": { 71 | "collapsed": false 72 | }, 73 | "outputs": [ 74 | { 75 | "name": "stdout", 76 | "output_type": "stream", 77 | "text": [ 78 | "Populating the interactive namespace from numpy and matplotlib\n" 79 | ] 80 | } 81 | ], 82 | "source": [ 83 | "# Inject Numpy and Matplotlib in the current namespace\n", 84 | "%pylab inline" 85 | ] 86 | }, 87 | { 88 | "cell_type": "code", 89 | "execution_count": 2, 90 | "metadata": { 91 | "collapsed": false 92 | }, 93 | "outputs": [], 94 | "source": [ 95 | "import numpy as np\n", 96 | "from scipy.sparse import csgraph" 97 | ] 98 | }, 99 | { 100 | "cell_type": "code", 101 | "execution_count": 3, 102 | "metadata": { 103 | "collapsed": false 104 | }, 105 | "outputs": [ 106 | { 107 | "data": { 108 | "text/plain": [ 109 | "235886" 110 | ] 111 | }, 112 | "execution_count": 3, 113 | "metadata": {}, 114 | "output_type": "execute_result" 115 | } 116 | ], 117 | "source": [ 118 | "# Obtaining the list of words\n", 119 | "wordlist = open('/usr/share/dict/words').read().split()\n", 120 | "len(wordlist)" 121 | ] 122 | }, 123 | { 124 | "cell_type": "code", 125 | "execution_count": 4, 126 | "metadata": { 127 | "collapsed": false 128 | }, 129 | "outputs": [ 130 | { 131 | "data": { 132 | "text/plain": [ 133 | "(1135,)" 134 | ] 135 | }, 136 | "execution_count": 4, 137 | "metadata": {}, 138 | "output_type": "execute_result" 139 | } 140 | ], 141 | "source": [ 142 | "wordlist = filter(lambda w: len(w) == 3, wordlist) #keep 3-letter words\n", 143 | "wordlist = filter(str.isalpha, list(wordlist)) # no punctuation\n", 144 | "wordlist = filter(str.islower, list(wordlist)) # no proper nouns or 
acronyms\n", 145 | "\n", 146 | "wordlist = np.sort(list(wordlist))\n", 147 | "wordlist.shape" 148 | ] 149 | }, 150 | { 151 | "cell_type": "markdown", 152 | "metadata": {}, 153 | "source": [ 154 | "## Word Ladders as a Graph" 155 | ] 156 | }, 157 | { 158 | "cell_type": "markdown", 159 | "metadata": {}, 160 | "source": [ 161 | "Before detailing the adopted representation, we need to introduce the **Ladder relationship** $\\lambda$:\n", 162 | "\n", 163 | "> #### The Ladder Relationship:\n", 164 | "> Given two strings $s_i, s_j \\in \\Sigma^*$ such that\n", 165 | "> $s_i = \\langle a_1 a_2 \\cdots a_k \\rangle$ and $s_j = \\langle b_1 b_2 \\cdots b_l \\rangle$, we say that\n", 166 | "> the *Ladder relationship* holds for $s_i$ and $s_j$ if and only if \n", 167 | "> \n", 168 | "$$\n", 169 | " k = l \\land \\exists! n, 1 \\leq n \\leq k : a_n \\neq b_n\n", 170 | "$$ \n", 171 | "> In other words, the two strings $s_i$ and $s_j$ have the same length and differ in exactly *one* position. \n", 172 | "> We indicate this relationship by $s_i \\lambda s_j$\n", 173 | "\n", 174 | "The *Ladder Relation* defines a relationship among words (i.e., two words are related if they differ by only one character) that can be used to derive a Graph-based representation of the problem at hand.\n", 175 | "\n", 176 | "In more detail, we may represent the *Word Ladders* as an undirected (and unweighted) Graph $G=(V,E)$, where the set of vertices $V \\subseteq \\Sigma^*$, while the set $E$ of edges is defined as\n", 177 | "$E = \\{(u, v): u,v \\in V \\wedge u \\lambda v \\}$\n", 178 | "\n", 179 | "Back to our example, the set $V$ will simply correspond to the obtained `wordlist`, so the set of edges $E$ will be represented by all the pairs of words in `wordlist` for which the Ladder relationship $\\lambda$ holds.\n", 180 | "\n", 181 | "In order to derive this set of edges, we may apply one of the several available [String Metrics](http://en.wikipedia.org/wiki/String_metric), such as the [Levenshtein 
distance](http://en.wikipedia.org/wiki/Levenshtein_distance) or the [Smith–Waterman Algorithm](http://en.wikipedia.org/wiki/Smith–Waterman_algorithm). \n", 182 | "\n", 183 | "In particular, these two metrics are the two most widely adopted methods for string/sequence alignment. These algorithms accept two input strings of any length and compute their alignment in a time proportional to the product of their lengths (i.e., $O(k*l)$). \n", 184 | "\n", 185 | "However, the $\\lambda$ relationship is more restrictive, since it considers only strings of the same length. \n", 186 | "\n", 187 | "Thus, applying either of the two algorithms above will result in a computation that is quadratic in the length of the input strings ($O(N^2)$).\n", 188 | "\n", 189 | "Alternatively, we may compute the [Hamming Distance](http://en.wikipedia.org/wiki/Hamming_distance), a distance metric borrowed from Information Theory. \n", 190 | "\n", 191 | "The Hamming distance between two words of the same length corresponds to the number of positions at which the corresponding letters in the words differ.\n", 192 | "\n", 193 | "A **naive** Python implementation would be:\n", 194 | " \n", 195 | "```python\n", 196 | "# Python 3: the built-in zip is lazy, so itertools.izip is not needed\n", 197 | "def hamming_distance(s1, s2):\n", 198 | "    if len(s1) != len(s2):\n", 199 | "        raise ValueError('The two input strings must be of the same length!')\n", 200 | "    return sum(c1 != c2 for c1, c2 in zip(s1, s2))\n", 201 | "```\n", 202 | " \n", 203 | "Therefore the `hamming_distance` function works in time that is linear w.r.t. the length of the input words ($O(N)$)." 
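, "\n",
"\n",
"As a quick sanity check, the following minimal sketch applies the function to the rungs of Carroll's `APE` to `MAN` ladder. The helper name `is_ladder_step` is introduced here only for illustration and is not part of the original post:\n",
"\n",
"```python\n",
"def hamming_distance(s1, s2):\n",
"    if len(s1) != len(s2):\n",
"        raise ValueError('The two input strings must be of the same length!')\n",
"    return sum(c1 != c2 for c1, c2 in zip(s1, s2))\n",
"\n",
"def is_ladder_step(s1, s2):\n",
"    # Hypothetical helper: True when the Ladder relationship holds\n",
"    return len(s1) == len(s2) and hamming_distance(s1, s2) == 1\n",
"\n",
"ladder = ['ape', 'apt', 'opt', 'oat', 'mat', 'man']\n",
"# Every pair of consecutive rungs must differ in exactly one position\n",
"print(all(is_ladder_step(a, b) for a, b in zip(ladder, ladder[1:])))\n",
"```\n",
"\n",
"Checking the length first keeps `is_ladder_step` from raising on words of different sizes, which simply are not related by $\\lambda$."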
204 | ] 205 | }, 206 | { 207 | "cell_type": "markdown", 208 | "metadata": {}, 209 | "source": [ 210 | "## Solving Word Ladders using Numpy/Scipy" 211 | ] 212 | }, 213 | { 214 | "cell_type": "markdown", 215 | "metadata": {}, 216 | "source": [ 217 | "Now we need to figure out how to efficiently find all pairs of\n", 218 | "words which differ by a single letter.\n", 219 | "\n", 220 | "We'll do that with a numpy **type-wrangling** trick:\n", 221 | "\n", 222 | "* Converting the set of three-letter words (i.e., `wordlist`) into a `[wordlist.size x wordlist.itemsize]` matrix of 8-bit integers \n", 223 | " - (i.e., the raw bytes of each word: one `ASCII` code per letter for byte strings; under Python 3, NumPy stores 4 bytes per character, hence 12 columns)" 224 | ] 225 | }, 226 | { 227 | "cell_type": "code", 228 | "execution_count": 5, 229 | "metadata": { 230 | "collapsed": false 231 | }, 232 | "outputs": [ 233 | { 234 | "data": { 235 | "text/plain": [ 236 | "(1135, 12)" 237 | ] 238 | }, 239 | "execution_count": 5, 240 | "metadata": {}, 241 | "output_type": "execute_result" 242 | } 243 | ], 244 | "source": [ 245 | "word_bytes = np.ndarray((wordlist.size, wordlist.itemsize), \n", 246 | " dtype='int8',\n", 247 | " buffer=wordlist.data)\n", 248 | "word_bytes.shape # 1135 x 12 under Python 3 (4 bytes per character); 1135 x 3 for byte strings" 249 | ] 250 | }, 251 | { 252 | "cell_type": "code", 253 | "execution_count": 6, 254 | "metadata": { 255 | "collapsed": false 256 | }, 257 | "outputs": [ 258 | { 259 | "data": { 260 | "text/plain": [ 261 | "643545" 262 | ] 263 | }, 264 | "execution_count": 6, 265 | "metadata": {}, 266 | "output_type": "execute_result" 267 | } 268 | ], 269 | "source": [ 270 | "from scipy.spatial.distance import pdist\n", 271 | "from scipy import sparse\n", 272 | "\n", 273 | "hamming_dist = pdist(word_bytes, metric='hamming')\n", 274 | "hamming_dist.size" 275 | ] 276 | }, 277 | { 278 | "cell_type": "markdown", 279 | "metadata": {}, 280 | "source": [ 281 | "The `pdist(X, metric='hamming')` function of `scipy` calculates a length-normalized version of the *Hamming distance*, namely: \n", 282 | "\n", 283 | " sum(c1 != c2 for 
c1, c2 in zip(s1, s2)) / len(s1)\n", 284 | " \n", 285 | "The function returns a *condensed* representation of the distance matrix, that is, an `M x 1` array, where \n", 286 | "`M = N(N-1)/2` corresponds to the total number of compared pairs.\n", 287 | "\n", 288 | "In our case, the `hamming_dist` size is equal to `643545` (`(1135*1134)/2`)\n", 289 | "\n", 290 | "This structure is very important for our Graph-based representation model: we are going to manipulate this structure in order to define the *adjacency matrix* for the edge set $E$. \n", 291 | "\n", 292 | "In more detail, the definition of the adjacency matrix requires the following two steps:\n", 293 | "\n", 294 | "1. Transform the condensed array into an `N x N` square matrix;\n", 295 | " \n", 296 | "2. Manipulate the square (distance) matrix in order to match the adjacency matrix representation.\n", 297 | " * i.e., a matrix $A$ such that \n", 298 | " $ A_{i,j} = \\begin{cases} 1 &\\mbox{if } (i,j) \\in E \\equiv w_i \\lambda\\ w_j \\\\ 0 & \\mbox{otherwise} \\end{cases} $\n", 299 | " \n", 300 | "Therefore, to end up with the desired matrix representation, we may apply the `scipy.spatial.distance.squareform(X)` function, which transforms a condensed input vector-form array `X` into a square matrix `Y`. \n", 301 | "In order to correctly represent the content of the matrix $A$, we leverage numpy's element-wise boolean comparisons on arrays, i.e., `hamming_dist < 1.01/wordlist.itemsize`. \n", 302 | "In fact, this expression matches all the pairs of words in the `wordlist` satisfying the $\\lambda$ relationship, namely those whose *normalized* hamming distance is equal to `1/wordlist.itemsize`, which is $\\frac{1}{3}$ when each letter takes one byte (**remember:** we're considering only three-letter words).\n", 303 | "\n", 304 | "Finally, since the resulting adjacency matrix will be very sparse, we're going to represent it efficiently by using the `scipy.sparse.csr_matrix` (*Compressed Sparse Row Matrix*) function." 
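, "\n",
"\n",
"The whole pipeline can be sketched on a toy word list (four words picked here for illustration, stored as byte strings so that each letter takes exactly one byte; this is an assumption that differs from the Python 3 `str` array used in this notebook):\n",
"\n",
"```python\n",
"import numpy as np\n",
"from scipy import sparse\n",
"from scipy.spatial.distance import pdist, squareform\n",
"\n",
"# Hypothetical toy list, not the notebook's dictionary\n",
"words = np.array(['ape', 'apt', 'opt', 'man'], dtype='S3')\n",
"codes = words.view(np.uint8).reshape(words.size, words.itemsize)\n",
"\n",
"d = pdist(codes, metric='hamming')  # condensed form: N(N-1)/2 entries\n",
"adj = squareform(d < 1.01 / words.itemsize)  # boolean N x N adjacency\n",
"graph = sparse.csr_matrix(adj)\n",
"\n",
"print(d.size)\n",
"print(graph.toarray().astype(int))\n",
"```\n",
"\n",
"The threshold is applied to the condensed array *before* calling `squareform`: the diagonal that `squareform` fills in is then `False`, so no word ends up adjacent to itself."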
305 | ] 306 | }, 307 | { 308 | "cell_type": "code", 309 | "execution_count": 7, 310 | "metadata": { 311 | "collapsed": false 312 | }, 313 | "outputs": [ 314 | { 315 | "data": { 316 | "text/plain": [ 317 | "(1135, 1135)" 318 | ] 319 | }, 320 | "execution_count": 7, 321 | "metadata": {}, 322 | "output_type": "execute_result" 323 | } 324 | ], 325 | "source": [ 326 | "from scipy.spatial.distance import squareform\n", 327 | "hamming_dist_sqform = squareform(hamming_dist < 1.01/wordlist.itemsize)\n", 328 | "hamming_dist_sqform.shape" 329 | ] 330 | }, 331 | { 332 | "cell_type": "markdown", 333 | "metadata": {}, 334 | "source": [ 335 | "### Insights on the `squareform` scipy function" 336 | ] 337 | }, 338 | { 339 | "cell_type": "code", 340 | "execution_count": 8, 341 | "metadata": { 342 | "collapsed": false 343 | }, 344 | "outputs": [], 345 | "source": [ 346 | "squareform?" 347 | ] 348 | }, 349 | { 350 | "cell_type": "code", 351 | "execution_count": 9, 352 | "metadata": { 353 | "collapsed": false 354 | }, 355 | "outputs": [ 356 | { 357 | "data": { 358 | "text/plain": [ 359 | "(1135, 1135)" 360 | ] 361 | }, 362 | "execution_count": 9, 363 | "metadata": {}, 364 | "output_type": "execute_result" 365 | } 366 | ], 367 | "source": [ 368 | "graph = sparse.csr_matrix(hamming_dist_sqform)\n", 369 | "graph.shape" 370 | ] 371 | }, 372 | { 373 | "cell_type": "raw", 374 | "metadata": {}, 375 | "source": [ 376 | "To get a feeling for what this graph looks like, let's visualize it with matplotlib:" 377 | ] 378 | }, 379 | { 380 | "cell_type": "code", 381 | "execution_count": 10, 382 | "metadata": { 383 | "collapsed": false 384 | }, 385 | "outputs": [ 386 | { 387 | "data": { 388 | "image/png": 
"iVBORw0KGgoAAAANSUhEUgAAAeUAAAHaCAYAAAA+HFEaAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAIABJREFUeJzt3X+sbllZH/DvUwaZCTPQDtSGRtpr0KDFy0CRHxWMhzYx\nqVEgYoLJ0Ez4lTRtjWlH+wdU7xnKXySmRi2mJGXEisYIQkEjJOA9MsqEAWaYuTPA2JgJWu0obaQo\nzQDC6h/vfu/dd9/9Y+2914/nedb3k5zcc895z/uu389ea++9toQQQERERPX9rdoJICIiogMGZSIi\nIiUYlImIiJRgUCYiIlKCQZmIiEgJBmUiIiIlVAZlETkTkefVTocmInIqIrfXTodmInJORC7VTkdr\n2DavxbZIW6kMygBC90VXsDxIK7bNawn0jq+kWLFGIyLvFZFPisiDIvKG7me/KCKf6H52WiotVojI\nm0TkYRG5C8Azu5+9XkTuEZFPi8i7ReSG7ufnROR3ReR+EfmwiDy9auLreZyIvL1rUx8SketF5A0T\nZfZLIvI2EblbRP5IRE5E5J0i8hkRubN2Rkrq2s/nROTOrs29S0S+X0T+QET+UESeLyI3i8j7ujZ2\nt4ic773FLSLyse61r6+WkYq6MnxYRN4J4BKAG4ZtsXvdM0Tkd7rx8KMi8sy6Ka9ruKogIj8hIhdE\n5KKI/KyI3Ccil0Tk+d3v59qhfSGEIl8A/k737w04NNibez97HICLAM53/78I4B+XSpvGLwDPA/AA\ngOsB3ATgfwD4dwBu7r3mPwL4N933HwDwL7rvXwPgvbXzUKHMzgH4GoBnd///dQC3zpTZnQB+tfv+\nZQC+BOBZOMxyPgngltp5qlB2/fz/117ZvBfAzwH4qe5nLwVwX/f9KYBPA3gCgKcA+GMAT6udp0pl\n+HUALwDwD8faYvf9RwB8W/f9CwF8pHbaFZTbpd7/bwdwoYsD/6X72fceXwPg58faoZev664N09n8\nuIi8ovv+WwB8O4DndrPm6wA8DcB34hCw6dAIfzOE8BiAx0Tk/TgMludF5C0AngzgRgAf7F7/IgDH\n8v0VAG8tnF4tHgkhPNB9/ykcOvxUmQGHgxkAeBDAoyGEhwBARB7q/vb+AmnW4pFB/j/c/fwSgG/F\nIdD8MACEEC6KyFNE5CYclq/fF0L4CoCviMhFHALTfy+dAQU+H0K4R0TOYaQtisgTAXwPgN8QkePf\nfFPxVNrxawAQQrhLRJ4kIk8G8GJc2w5vDCH8dc2EplIkKIvICYB/BuBFIYTHuk77nTgcEX13COH/\ndsuF15dIjxEBhyA8dCeAl4cQLonIbQC+r/e7sde35iu977+Ow8rMsMxOeq/5avfvNwZ/+w0U6h+K\nDPN/LJuAw2rW1xHfxr6RMF2WfLn3/bAtXo/DKcO/DCE8t2iqdPsbXH0qdS4OHK9fcDvWlTqn/CQc\nGuJjIvIdOMzqnoRDA/6SiPw9AP+8UFqs+CiAV3TnRG8C8EPdz28C8KiIPB7Aq3uv/xiAH+2+v7X7\nezq4EVeXGS9M2uYuHNrW8UD7CyGEv8JhgHy5iDxBRJ6Cw0HPJ2olUjHpyusREfkRAJCDZ1dOV21/\nDuCbu3PFTwDwg73fvQoAROQlAL4YQvgSxtuhi1kyUG4m8EEA/1JEPgPgYQB347AseB+AzwH4EwC/\nXygtJoQQ7hORX8ehnP4CwD04BJOfAvBxAF/o/r2x+5MfA3CniPxk9/rXFE+0DmMB96cxXmbD1w//\ntrXgPZf/AOAOAO8QkftxOKC+rfe7B3A4B/hUAG8OITyaOa1axbSnWwH8ooj8BwCPx2GJ9gE0KoTw\nNRF5Mw5j3J8C+Gzv14+JyL04xKrXdj87xXg7dEG6k+VERERqdKc5bw8h3Fs7LSWVvCWKG4IQERHN\nKHkhCzcEISKiKCGEl9ZOQw27ZsrCDUGIiIiS2TtTfm0I4S+7H
ZLuEZH3AHhj97PHAfiwiJwPIfDe\nYyIiogV7g/LuDUFEhEvaRETUlBDC6L3Wm5evBxuCPAeHbfaOG4L80xDCLQB+GxEbgly4cAEXLlwA\nAFy8eLHEtm5X/Tv83srXhQsXqqfB4hfLjWXHsrPz5aHcLl68eDnOHWPdlD0z5bENQd6LazcEubj0\nRqenpxARhHBl0jz8/1r9vx++1/H7sZ8RERGldHJygpOTk8v/v+OOOyZfuycoJ90QZCwgjwXm2GDN\ngEtERNZsDsohhK8C+IGRX/3exOujLm/vB92xYMoAe0X/yIvisdy2Y9ltx7LbprVyq76jl4iEqWXm\nne87uXxN01hWpA3bJHnTtem0F3qlNrVUvdXc8vXxffvvv+ezPOHgR9qwTVJL1ATlmKXqqcA5FmRj\nPqvmeWceBBAR1aF5/FUTlJfMLWHNnYOOed8SfzPEo//tNHcoTVhOVJqVNne8kFgjE0F5zzmlpdn1\nlve1EFC1NrgUcpe/l7LTPPCQTxbGxiOt/UN9UB67f3nsNVO/W5pde+U9fzl5KjtPedE4gJJtGvuH\nuqA8vPgqZkORPcvXRGQD+ze1QEVQ7gfiuYuvhgH6+EVEROSBiqC89giY9y0SEZFHKoLyFlOBudbV\n1ERERHuZDcrAeGD2ejU1ERH5pz4oj+261X8kVv93nPESEZFle54SVUTMrls8x0xERB6YnCkvvZ4z\nZiIissjFTFnrzixERERrqJ8px1oK3gzaRNQqjn92qArKex+lOLz4q/9ePOdMRK3i+GeHqqCc41GK\nax/rSEREVIuqoDxn66Ygc1dsExERaaIqKM8tX2/ZivP479xDLIiIiLRQFZRTLV/HPl1Ku1yzeU2r\nBJrSQjaVaENsp1SK1A5WIhJKpsFqgCYiIh+6ODR6pKdqppwbA3IanDUQEeWhJiiXGOiHAXnvLVit\n4oENEVEeaoLy0kC/NWjO/V2OW7BiPlcLC2nUyEu5eckHkSdqgvLSALE1aMbu7pV6gLIwm9x7MV2r\nLNRtDC/5IPJETVCeGyByB0yea16P5UVElJ6aoDwnZwBoecZHRES6qAnKJYPjcGOR1md9PDBpl6a6\n15SW2lgWepSuCzVB+RgYl66ITlFAx89qPRgfsRzapanuNaWlNpaFHqXrQk1QPlq6IrrEUraIXPM1\n9rq170tEZbDP7We5DFOnfc/7rf1bdUG5trGLvpb+v4RHvURlle5zlgPYFMvjVuq073m/tX/LoDwi\nxdXYHjtpKR7LzmOexuzNp9VyCiFkTbvVcvGkVB0wKE8YC8xLldL/fe5O6pnHsrMy69hb7nvrzko5\njcmZdg3l4q1PrlVqXGJQnjEMzEuVErPMrbVha5vhlBiENNWFlrSkGHi0rTJpKdsUauZFw4FBbSUC\nc3NPiVoytXS99/nOREDejWq4CY4u3uvDe/7m7M07nxK1wlRAnrsq3NOROOWVe4mTbXFeyfLxfp65\n5faWsx8zKM8Y3jM9VRFj91iP/Z/28TzApeJt5mLxtEipz9NQ1xrSsFbKNpVj3GBQnjE8Elxasth7\n6xTN8z7ApeLlAAPIVy9eyshLPvZYWwYp21SO1QITQbnGFpxH/d2/+pU5tqkIlcFynze1ckNXeDkI\n05aPGm1uaxmkSmvqOjARlEs2vJhdxDjYxctRVgw6cVp8RvhWnvNWkraDhDla02oiKGultVI14ZKz\nH57L23PeaJmmgzIG5Z7YiukvZS/tk035eS/zFPnzXkakj6U2p+mgjEG5Z65i1mwaskX2G9INdZC1\nNHWoHFLkz3sZWVbzmpmc2Oa2YVCOtDVgp3j/FNhByskx8Hk8qNKep1Lpq33NDF2hoU0yKM9Ye99x\nigvCNDSK3Lzfb5xj4NO2UYOFA9G9tKeP0tNQ5wzKM9bed7x2r+yYz/Qo59XTnssvZd5q729tiaaD\nISqjZp0zKCdwvPCL9zGvY
21g91Sf1sq+plJlVbN91fhszf2pZv9oNiiveQzjnJi9srXS3Ck08nZ/\ntJd8eFFzzKjx2RbGyBqaDcpLDWJs6XlNILcQmDWlz9J5Zk3ltoeXfBB5wkc3JsZHPBIR0Rw+urGg\n/vnlqQ1GtvK+3Og1f6nz5bWcrMpVH6znNjEoF8INIJZ5zV/qfHktp9Ss32espZ55cFAWg3IhnmdL\nmtJC23mrRy1BzTqWY1kMygtyXyS09f01PQFI28YWsTSlWUNatFxd7vkAdi9PebEqdx0wKC/IHfxy\nb98Za28+LT6MXtPBhKbZSO1y4XL/NA150dJnasldB+qvvrZwa9FWnvNGRKRV7bHX9NXXXoNW60eb\nRES1aI4r6oOyN8dgPNyW0zMegJAF3tqpt/y04rraCbBu7TLI1Lacsa+1yGo+ai9xaeS5TLzly1t+\nWsGZ8k6pLorJfvEAj5pX46B2LZZJeiX7JscB/RiUE9gamEvula1lMOWgUFeK8mcdplWyb2oZB2ga\ng3IiKZ6d7Hlp8Cj37TaeA4aWFZnat0wRecagnFCKAY+Bef9755AjvXsP4kp97phUdcjgTl5tbdsM\nyomlHKg8D1jWZls5gn2tgy9N10FYOAC11E4pjZr9Q/3mIQk/x8QAUIKmstCUFrKpRBtiO6WUTG8e\nkkrt3Vu2/n7v4x7HaJqlakpLDdbzriH9Jfo2A3J7arXtZoJyTUsb/S91+JxLp60MqlpZz7v19NPV\nNIwHWtSaMDAoFzRWyTGVnvuhDEREQL3xQOvBQI3yYFAubFjJMbPk/muOy9nDLw+85GOIjyL0LVd9\ntFTPnBxcsTkoi8ipiNyeMjE0L3b5e21n3tP5Uw4cXjsmH0XoW676YD23ac9MmS2mktTnoPd0fosD\nh5cZiJd8EHm3pq+uCsoi8iYReVhE7gLwzO5nrxeRe0Tk0yLybhG5ofv5ORH5XRG5X0Q+LCJPX/NZ\ntWke8FKdh9aax9zpqnEgkSNPGg+I9uZTa5uMwZ3qfNtTB2v6anRQFpHnAXgVgFsA/ACA5+MwW/7N\nEMILQgjPAfBZAK/r/uTnAdwZQrgFwLsA/Fx0qhTIMeDluLVpze+Gn6/1diSNwWYvSzuN7bE3n6nL\nqWT5tLB3vTUWT6+teXTj9+IQgB8D8JiIvB+AADgvIm8B8GQANwL4YPf6FwF4Rff9rwB469Qbn56e\nXv7+5OQEJycnK5JlR+kBZ7jhwTEID39Gdo3VKV1RulxYF+XElLWW/nF2doazs7Oo10bv6CUiPw7g\n5hDChe7/PwPgzwD8awAvDyFcEpHbAHxfCOG1IvIFAE8LIfyNiDwewJ+FEP7uyPsW2dHLg7nGdQzQ\nLEu7NAweKXnLD9mmqT2m2tHrowBeISLXi8hNAH6o+/lNAB7tAu+re6//GIAf7b6/tft72mF4a1T/\ne02NzRotadZSh6l4yw/ZZuXUyKq9r0XkjQBuA/AXAD4P4F4A/w/AvwfwBQAfB3BjN1P+BwDuBPDU\n7vWvCSH8z5H35Ex5h7HzxLVpOkigfViXROnNzZSbeSAFERGRBuYfSKFlefFIW3qGYnb80p6HOZbT\nXhLLiUqz1Oa0ptVEUNZ2646Fmf3SFdbaynSNEo/p88ByHZNNFsbGI639w0RQBmxVdilLj3xc+pnW\nRrlGrk05tJTL3nRo6zdaytW72uVc+/NjpewfqfLMc8oO9RsHy5aISJe5c8prNg8hI5YeUMFATaQD\nr26nITPL15qlWLaIfY+tn8WOT6RP6VMlVpaVW8agnECKgBf7HsfXDTtXzJabVEaOsvZYf9rzVCp9\nJQ+YeXA+T0ObZFA2ati5Yh5AMfU6SitHGXusN+150p4+Sk9DnTMoK7TmaG3utbGBugYt6Yjlafab\n7CrRgqdtiKzZ2rZ59XUDpi4m0fIQi5YvdrGed+vpJ5qSs22b39GLrthy9DV2MYmmGYqlQb3
", 389 | "text/plain": [ 390 | "" 391 | ] 392 | }, 393 | "metadata": {}, 394 | "output_type": "display_data" 395 | } 396 | ], 397 | "source": [ 398 | "%matplotlib inline\n", 399 | "fig = plt.figure(figsize=(8, 8))\n", 400 | "ax = fig.add_subplot(1, 1, 1)\n", 401 | "ax.matshow(graph.toarray(), cmap=plt.cm.binary)\n", 402 | "\n", 403 | "# Label axes with the words\n", 404 | "def format_func(x, *args):\n", 405 | " return wordlist[max(0, min(int(x), graph.shape[0]-1))]\n", 406 | " \n", 407 | "ax.xaxis.set_major_formatter(plt.FuncFormatter(format_func))\n", 408 | "ax.yaxis.set_major_formatter(plt.FuncFormatter(format_func))" 409 | ] 410 | }, 411 | { 412 | "cell_type": "code", 413 | "execution_count": 11, 414 | "metadata": { 415 | "collapsed": false 416 | }, 417 | "outputs": [ 418 | { 419 | "data": { 420 | "image/png": "
", 421 | "text/plain": [ 422 | "" 423 | ] 424 | }, 425 | "execution_count": 11, 426 | "metadata": {}, 427 | "output_type": "execute_result" 428 | } 429 | ], 430 | "source": [ 431 | "# Zoom in on the diagonal\n", 432 | "ax.set_xlim(93, 104)\n", 433 | "ax.set_ylim(104, 93)\n", 434 | "fig" 435 | ] 436 | } 437 | ], 438 | "metadata": { 439 | "kernelspec": { 440 | "display_name": "Python 3", 441 | "language": "python", 442 | "name": "python3" 443 | }, 444 | "language_info": { 445 | "codemirror_mode": { 446 | "name": "ipython", 447 | "version": 3 448 | }, 449 | "file_extension": ".py", 450 | "mimetype": "text/x-python", 451 | "name": "python", 452 | "nbconvert_exporter": "python", 453 | "pygments_lexer": "ipython3", 454 | "version": "3.4.3" 455 | } 456 | }, 457 | "nbformat": 4, 458 | "nbformat_minor": 0 459 | } 460 | -------------------------------------------------------------------------------- /03_Numpy_Operations.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": { 6 | "slideshow": { 7 | "slide_type": "slide" 8 | } 9 | }, 10 | "source": [ 11 | "# Linear algebra" 12 | ] 13 | }, 14 | { 15 | "cell_type": "markdown", 16 | "metadata": {}, 17 | "source": [ 18 | "**Vectorizing** code is the key to writing efficient numerical calculations with Python/NumPy. \n", 19 | "\n", 20 | "That means that as much of a program as possible should be formulated in terms of matrix and vector operations, like **matrix-matrix multiplication**." 
21 | ] 22 | }, 23 | { 24 | "cell_type": "code", 25 | "execution_count": 14, 26 | "metadata": { 27 | "collapsed": false, 28 | "slideshow": { 29 | "slide_type": "fragment" 30 | } 31 | }, 32 | "outputs": [], 33 | "source": [ 34 | "import numpy as np" 35 | ] 36 | }, 37 | { 38 | "cell_type": "markdown", 39 | "metadata": { 40 | "slideshow": { 41 | "slide_type": "subslide" 42 | } 43 | }, 44 | "source": [ 45 | "## Scalar-array operations" 46 | ] 47 | }, 48 | { 49 | "cell_type": "markdown", 50 | "metadata": {}, 51 | "source": [ 52 | "We can use the usual arithmetic operators to multiply, add, subtract, and divide arrays with scalar numbers." 53 | ] 54 | }, 55 | { 56 | "cell_type": "code", 57 | "execution_count": 15, 58 | "metadata": { 59 | "collapsed": false, 60 | "slideshow": { 61 | "slide_type": "fragment" 62 | } 63 | }, 64 | "outputs": [], 65 | "source": [ 66 | "v1 = np.arange(0, 5)" 67 | ] 68 | }, 69 | { 70 | "cell_type": "code", 71 | "execution_count": 16, 72 | "metadata": { 73 | "collapsed": false, 74 | "slideshow": { 75 | "slide_type": "fragment" 76 | } 77 | }, 78 | "outputs": [ 79 | { 80 | "data": { 81 | "text/plain": [ 82 | "array([0, 2, 4, 6, 8])" 83 | ] 84 | }, 85 | "execution_count": 16, 86 | "metadata": {}, 87 | "output_type": "execute_result" 88 | } 89 | ], 90 | "source": [ 91 | "v1 * 2" 92 | ] 93 | }, 94 | { 95 | "cell_type": "code", 96 | "execution_count": 17, 97 | "metadata": { 98 | "collapsed": false, 99 | "slideshow": { 100 | "slide_type": "fragment" 101 | } 102 | }, 103 | "outputs": [ 104 | { 105 | "data": { 106 | "text/plain": [ 107 | "array([2, 3, 4, 5, 6])" 108 | ] 109 | }, 110 | "execution_count": 17, 111 | "metadata": {}, 112 | "output_type": "execute_result" 113 | } 114 | ], 115 | "source": [ 116 | "v1 + 2" 117 | ] 118 | }, 119 | { 120 | "cell_type": "code", 121 | "execution_count": 28, 122 | "metadata": { 123 | "collapsed": false 124 | }, 125 | "outputs": [], 126 | "source": [ 127 | "A = np.array([[n+m*10 for n in range(5)] for m in range(5)])" 
128 | ] 129 | }, 130 | { 131 | "cell_type": "code", 132 | "execution_count": 29, 133 | "metadata": { 134 | "collapsed": false, 135 | "slideshow": { 136 | "slide_type": "subslide" 137 | } 138 | }, 139 | "outputs": [ 140 | { 141 | "name": "stdout", 142 | "output_type": "stream", 143 | "text": [ 144 | "A * 2: \n", 145 | " [[ 0 2 4 6 8]\n", 146 | " [20 22 24 26 28]\n", 147 | " [40 42 44 46 48]\n", 148 | " [60 62 64 66 68]\n", 149 | " [80 82 84 86 88]]\n", 150 | "A + 2: \n", 151 | " [[ 2 3 4 5 6]\n", 152 | " [12 13 14 15 16]\n", 153 | " [22 23 24 25 26]\n", 154 | " [32 33 34 35 36]\n", 155 | " [42 43 44 45 46]]\n" 156 | ] 157 | } 158 | ], 159 | "source": [ 160 | "print('A * 2: ', '\\n', A * 2)\n", 161 | "print('A + 2: ', '\\n', A + 2)" 162 | ] 163 | }, 164 | { 165 | "cell_type": "markdown", 166 | "metadata": { 167 | "slideshow": { 168 | "slide_type": "subslide" 169 | } 170 | }, 171 | "source": [ 172 | "## Element-wise array-array operations" 173 | ] 174 | }, 175 | { 176 | "cell_type": "markdown", 177 | "metadata": {}, 178 | "source": [ 179 | "When we add, subtract, multiply, or divide arrays with each other, the default behaviour is an **element-wise** operation:" 180 | ] 181 | }, 182 | { 183 | "cell_type": "code", 184 | "execution_count": 30, 185 | "metadata": { 186 | "collapsed": false, 187 | "slideshow": { 188 | "slide_type": "fragment" 189 | } 190 | }, 191 | "outputs": [ 192 | { 193 | "data": { 194 | "text/plain": [ 195 | "array([[ 0, 1, 4, 9, 16],\n", 196 | " [ 100, 121, 144, 169, 196],\n", 197 | " [ 400, 441, 484, 529, 576],\n", 198 | " [ 900, 961, 1024, 1089, 1156],\n", 199 | " [1600, 1681, 1764, 1849, 1936]])" 200 | ] 201 | }, 202 | "execution_count": 30, 203 | "metadata": {}, 204 | "output_type": "execute_result" 205 | } 206 | ], 207 | "source": [ 208 | "A * A # element-wise multiplication" 209 | ] 210 | }, 211 | { 212 | "cell_type": "code", 213 | "execution_count": 31, 214 | "metadata": { 215 | "collapsed": false, 216 | "slideshow": { 217 | "slide_type": 
"fragment" 218 | } 219 | }, 220 | "outputs": [ 221 | { 222 | "data": { 223 | "text/plain": [ 224 | "array([ 0, 1, 4, 9, 16])" 225 | ] 226 | }, 227 | "execution_count": 31, 228 | "metadata": {}, 229 | "output_type": "execute_result" 230 | } 231 | ], 232 | "source": [ 233 | "v1 * v1" 234 | ] 235 | }, 236 | { 237 | "cell_type": "markdown", 238 | "metadata": { 239 | "slideshow": { 240 | "slide_type": "subslide" 241 | } 242 | }, 243 | "source": [ 244 | "If we multiply arrays with different but compatible shapes, NumPy broadcasts the smaller one: here each row of `A` is multiplied element-wise by `v1`:" 245 | ] 246 | }, 247 | { 248 | "cell_type": "code", 249 | "execution_count": 32, 250 | "metadata": { 251 | "collapsed": false, 252 | "slideshow": { 253 | "slide_type": "fragment" 254 | } 255 | }, 256 | "outputs": [ 257 | { 258 | "data": { 259 | "text/plain": [ 260 | "((5, 5), (5,))" 261 | ] 262 | }, 263 | "execution_count": 32, 264 | "metadata": {}, 265 | "output_type": "execute_result" 266 | } 267 | ], 268 | "source": [ 269 | "A.shape, v1.shape" 270 | ] 271 | }, 272 | { 273 | "cell_type": "code", 274 | "execution_count": 33, 275 | "metadata": { 276 | "collapsed": false, 277 | "slideshow": { 278 | "slide_type": "fragment" 279 | } 280 | }, 281 | "outputs": [ 282 | { 283 | "data": { 284 | "text/plain": [ 285 | "array([[ 0, 1, 4, 9, 16],\n", 286 | " [ 0, 11, 24, 39, 56],\n", 287 | " [ 0, 21, 44, 69, 96],\n", 288 | " [ 0, 31, 64, 99, 136],\n", 289 | " [ 0, 41, 84, 129, 176]])" 290 | ] 291 | }, 292 | "execution_count": 33, 293 | "metadata": {}, 294 | "output_type": "execute_result" 295 | } 296 | ], 297 | "source": [ 298 | "A * v1" 299 | ] 300 | }, 301 | { 302 | "cell_type": "markdown", 303 | "metadata": { 304 | "slideshow": { 305 | "slide_type": "subslide" 306 | } 307 | }, 308 | "source": [ 309 | "## Matrix algebra" 310 | ] 311 | }, 312 | { 313 | "cell_type": "markdown", 314 | "metadata": {}, 315 | "source": [ 316 | "What about **matrix multiplication**? \n", 317 | "\n", 318 | "There are two ways.
\n", 319 | "\n", 320 | "We can either use the `np.dot` function, which applies a **matrix-matrix**, **matrix-vector**, or **inner vector multiplication** to its two arguments: " 321 | ] 322 | }, 323 | { 324 | "cell_type": "code", 325 | "execution_count": 34, 326 | "metadata": { 327 | "collapsed": false, 328 | "slideshow": { 329 | "slide_type": "fragment" 330 | } 331 | }, 332 | "outputs": [ 333 | { 334 | "data": { 335 | "text/plain": [ 336 | "array([[ 300, 310, 320, 330, 340],\n", 337 | " [1300, 1360, 1420, 1480, 1540],\n", 338 | " [2300, 2410, 2520, 2630, 2740],\n", 339 | " [3300, 3460, 3620, 3780, 3940],\n", 340 | " [4300, 4510, 4720, 4930, 5140]])" 341 | ] 342 | }, 343 | "execution_count": 34, 344 | "metadata": {}, 345 | "output_type": "execute_result" 346 | } 347 | ], 348 | "source": [ 349 | "np.dot(A, A)" 350 | ] 351 | }, 352 | { 353 | "cell_type": "code", 354 | "execution_count": 35, 355 | "metadata": { 356 | "collapsed": false, 357 | "slideshow": { 358 | "slide_type": "fragment" 359 | } 360 | }, 361 | "outputs": [ 362 | { 363 | "data": { 364 | "text/plain": [ 365 | "array([ 30, 130, 230, 330, 430])" 366 | ] 367 | }, 368 | "execution_count": 35, 369 | "metadata": {}, 370 | "output_type": "execute_result" 371 | } 372 | ], 373 | "source": [ 374 | "np.dot(A, v1)" 375 | ] 376 | }, 377 | { 378 | "cell_type": "code", 379 | "execution_count": 36, 380 | "metadata": { 381 | "collapsed": false, 382 | "slideshow": { 383 | "slide_type": "fragment" 384 | } 385 | }, 386 | "outputs": [ 387 | { 388 | "data": { 389 | "text/plain": [ 390 | "30" 391 | ] 392 | }, 393 | "execution_count": 36, 394 | "metadata": {}, 395 | "output_type": "execute_result" 396 | } 397 | ], 398 | "source": [ 399 | "np.dot(v1, v1)" 400 | ] 401 | }, 402 | { 403 | "cell_type": "markdown", 404 | "metadata": {}, 405 | "source": [ 406 | "### The `Matrix` Array Type" 407 | ] 408 | }, 409 | { 410 | "cell_type": "markdown", 411 | "metadata": { 412 | "slideshow": { 413 | "slide_type": "subslide" 414 | } 415 | }, 
416 | "source": [ 417 | "Alternatively, we can cast the array objects to the type `matrix`. \n", 418 | "\n", 419 | "This changes the behavior of the standard arithmetic operators `+, -, *` to use matrix algebra." 420 | ] 421 | }, 422 | { 423 | "cell_type": "code", 424 | "execution_count": 37, 425 | "metadata": { 426 | "collapsed": true 427 | }, 428 | "outputs": [], 429 | "source": [ 430 | "from numpy import matrix" 431 | ] 432 | }, 433 | { 434 | "cell_type": "code", 435 | "execution_count": 38, 436 | "metadata": { 437 | "collapsed": false, 438 | "slideshow": { 439 | "slide_type": "fragment" 440 | } 441 | }, 442 | "outputs": [], 443 | "source": [ 444 | "M = matrix(A)\n", 445 | "v = matrix(v1).T # make it a column vector" 446 | ] 447 | }, 448 | { 449 | "cell_type": "code", 450 | "execution_count": 39, 451 | "metadata": { 452 | "collapsed": false, 453 | "slideshow": { 454 | "slide_type": "fragment" 455 | } 456 | }, 457 | "outputs": [ 458 | { 459 | "data": { 460 | "text/plain": [ 461 | "matrix([[0],\n", 462 | " [1],\n", 463 | " [2],\n", 464 | " [3],\n", 465 | " [4]])" 466 | ] 467 | }, 468 | "execution_count": 39, 469 | "metadata": {}, 470 | "output_type": "execute_result" 471 | } 472 | ], 473 | "source": [ 474 | "v" 475 | ] 476 | }, 477 | { 478 | "cell_type": "code", 479 | "execution_count": 40, 480 | "metadata": { 481 | "collapsed": false, 482 | "slideshow": { 483 | "slide_type": "subslide" 484 | } 485 | }, 486 | "outputs": [ 487 | { 488 | "data": { 489 | "text/plain": [ 490 | "matrix([[ 300, 310, 320, 330, 340],\n", 491 | " [1300, 1360, 1420, 1480, 1540],\n", 492 | " [2300, 2410, 2520, 2630, 2740],\n", 493 | " [3300, 3460, 3620, 3780, 3940],\n", 494 | " [4300, 4510, 4720, 4930, 5140]])" 495 | ] 496 | }, 497 | "execution_count": 40, 498 | "metadata": {}, 499 | "output_type": "execute_result" 500 | } 501 | ], 502 | "source": [ 503 | "M * M" 504 | ] 505 | }, 506 | { 507 | "cell_type": "code", 508 | "execution_count": 41, 509 | "metadata": { 510 | "collapsed": false, 511 
| "slideshow": { 512 | "slide_type": "fragment" 513 | } 514 | }, 515 | "outputs": [ 516 | { 517 | "data": { 518 | "text/plain": [ 519 | "matrix([[ 30],\n", 520 | " [130],\n", 521 | " [230],\n", 522 | " [330],\n", 523 | " [430]])" 524 | ] 525 | }, 526 | "execution_count": 41, 527 | "metadata": {}, 528 | "output_type": "execute_result" 529 | } 530 | ], 531 | "source": [ 532 | "M * v" 533 | ] 534 | }, 535 | { 536 | "cell_type": "code", 537 | "execution_count": 42, 538 | "metadata": { 539 | "collapsed": false, 540 | "slideshow": { 541 | "slide_type": "subslide" 542 | } 543 | }, 544 | "outputs": [ 545 | { 546 | "data": { 547 | "text/plain": [ 548 | "matrix([[30]])" 549 | ] 550 | }, 551 | "execution_count": 42, 552 | "metadata": {}, 553 | "output_type": "execute_result" 554 | } 555 | ], 556 | "source": [ 557 | "# inner product\n", 558 | "v.T * v" 559 | ] 560 | }, 561 | { 562 | "cell_type": "code", 563 | "execution_count": 43, 564 | "metadata": { 565 | "collapsed": false, 566 | "slideshow": { 567 | "slide_type": "fragment" 568 | } 569 | }, 570 | "outputs": [ 571 | { 572 | "data": { 573 | "text/plain": [ 574 | "matrix([[ 30],\n", 575 | " [131],\n", 576 | " [232],\n", 577 | " [333],\n", 578 | " [434]])" 579 | ] 580 | }, 581 | "execution_count": 43, 582 | "metadata": {}, 583 | "output_type": "execute_result" 584 | } 585 | ], 586 | "source": [ 587 | "# with matrix objects, standard matrix algebra applies\n", 588 | "v + M*v" 589 | ] 590 | }, 591 | { 592 | "cell_type": "markdown", 593 | "metadata": { 594 | "slideshow": { 595 | "slide_type": "subslide" 596 | } 597 | }, 598 | "source": [ 599 | "If we try to add, subtract, or multiply objects with incompatible shapes, we get an error:" 600 | ] 601 | }, 602 | { 603 | "cell_type": "code", 604 | "execution_count": 44, 605 | "metadata": { 606 | "collapsed": false, 607 | "slideshow": { 608 | "slide_type": "fragment" 609 | } 610 | }, 611 | "outputs": [], 612 | "source": [ 613 | "v = matrix([1,2,3,4,5,6]).T" 614 | ] 615 | }, 616 | { 617 | 
"cell_type": "code", 618 | "execution_count": 45, 619 | "metadata": { 620 | "collapsed": false, 621 | "slideshow": { 622 | "slide_type": "fragment" 623 | } 624 | }, 625 | "outputs": [ 626 | { 627 | "data": { 628 | "text/plain": [ 629 | "((5, 5), (6, 1))" 630 | ] 631 | }, 632 | "execution_count": 45, 633 | "metadata": {}, 634 | "output_type": "execute_result" 635 | } 636 | ], 637 | "source": [ 638 | "# M is 5x5 but v now has 6 rows, so the shapes no longer align\n", 639 | "M.shape, v.shape" 640 | ] 641 | }, 642 | { 643 | "cell_type": "code", 644 | "execution_count": 46, 645 | "metadata": { 646 | "collapsed": false, 647 | "slideshow": { 648 | "slide_type": "subslide" 649 | } 650 | }, 651 | "outputs": [ 652 | { 653 | "ename": "ValueError", 654 | "evalue": "shapes (5,5) and (6,1) not aligned: 5 (dim 1) != 6 (dim 0)", 655 | "output_type": "error", 656 | "traceback": [ 657 | "---------------------------------------------------------------------------", 658 | "ValueError Traceback (most recent call last)", 659 | " in ()\n----> 1 M * v\n", 660 | "/Users/valerio/anaconda/lib/python3.4/site-packages/numpy/matrixlib/defmatrix.py in __mul__(self, other)\n 339 if isinstance(other, (N.ndarray, list, tuple)) :\n 340 # This promotes 1-D vectors to row vectors\n--> 341 return N.dot(self, asmatrix(other))\n 342 if isscalar(other) or not hasattr(other, '__rmul__') :\n 343 return N.dot(self, other)\n", 661 | "ValueError: shapes (5,5) and (6,1) not aligned: 5 (dim 1) != 6 (dim 0)" 662 | ] 663 | } 664 | ], 665 | "source": [ 666 | "M * v" 667 | ] 668 | }, 669 | { 670 | "cell_type": "markdown", 671 | "metadata": { 672 | "slideshow": { 673 | "slide_type": "subslide" 674 | } 675 | }, 676 | "source": [ 677 | "See also the related functions: `np.inner`, `np.outer`, `np.cross`, `np.kron`, `np.tensordot`. \n", 678 | "\n", 679 | "Try for example `help(np.inner)`."
680 | ] 681 | }, 682 | { 683 | "cell_type": "markdown", 684 | "metadata": { 685 | "slideshow": { 686 | "slide_type": "subslide" 687 | } 688 | }, 689 | "source": [ 690 | "## Array/Matrix transformations" 691 | ] 692 | }, 693 | { 694 | "cell_type": "markdown", 695 | "metadata": {}, 696 | "source": [ 697 | "Above we used the `.T` attribute to transpose the matrix object `v`. We could also have used the `transpose` function to accomplish the same thing. \n", 698 | "\n", 699 | "Other mathematical functions that transform matrix objects are:" 700 | ] 701 | }, 702 | { 703 | "cell_type": "code", 704 | "execution_count": 47, 705 | "metadata": { 706 | "collapsed": false, 707 | "slideshow": { 708 | "slide_type": "fragment" 709 | } 710 | }, 711 | "outputs": [ 712 | { 713 | "data": { 714 | "text/plain": [ 715 | "matrix([[ 0.+1.j, 0.+2.j],\n", 716 | " [ 0.+3.j, 0.+4.j]])" 717 | ] 718 | }, 719 | "execution_count": 47, 720 | "metadata": {}, 721 | "output_type": "execute_result" 722 | } 723 | ], 724 | "source": [ 725 | "C = matrix([[1j, 2j], [3j, 4j]])\n", 726 | "C" 727 | ] 728 | }, 729 | { 730 | "cell_type": "code", 731 | "execution_count": 49, 732 | "metadata": { 733 | "collapsed": false, 734 | "slideshow": { 735 | "slide_type": "fragment" 736 | } 737 | }, 738 | "outputs": [ 739 | { 740 | "data": { 741 | "text/plain": [ 742 | "matrix([[ 0.-1.j, 0.-2.j],\n", 743 | " [ 0.-3.j, 0.-4.j]])" 744 | ] 745 | }, 746 | "execution_count": 49, 747 | "metadata": {}, 748 | "output_type": "execute_result" 749 | } 750 | ], 751 | "source": [ 752 | "np.conjugate(C)" 753 | ] 754 | }, 755 | { 756 | "cell_type": "markdown", 757 | "metadata": { 758 | "slideshow": { 759 | "slide_type": "subslide" 760 | } 761 | }, 762 | "source": [ 763 | "* **Hermitian conjugate**: transpose + conjugate" 764 | ] 765 | }, 766 | { 767 | "cell_type": "code", 768 | "execution_count": 51, 769 | "metadata": { 770 | "collapsed": false, 771 | "slideshow": { 772 | "slide_type": "fragment" 773 | } 774 | }, 775 | "outputs": [ 776 | { 
777 | "data": { 778 | "text/plain": [ 779 | "matrix([[ 0.-1.j, 0.-3.j],\n", 780 | " [ 0.-2.j, 0.-4.j]])" 781 | ] 782 | }, 783 | "execution_count": 51, 784 | "metadata": {}, 785 | "output_type": "execute_result" 786 | } 787 | ], 788 | "source": [ 789 | "C.H" 790 | ] 791 | }, 792 | { 793 | "cell_type": "markdown", 794 | "metadata": { 795 | "slideshow": { 796 | "slide_type": "subslide" 797 | } 798 | }, 799 | "source": [ 800 | "We can extract the real and imaginary parts of complex-valued arrays using `real` and `imag`:" 801 | ] 802 | }, 803 | { 804 | "cell_type": "code", 805 | "execution_count": 52, 806 | "metadata": { 807 | "collapsed": false, 808 | "slideshow": { 809 | "slide_type": "fragment" 810 | } 811 | }, 812 | "outputs": [ 813 | { 814 | "data": { 815 | "text/plain": [ 816 | "matrix([[ 0., 0.],\n", 817 | " [ 0., 0.]])" 818 | ] 819 | }, 820 | "execution_count": 52, 821 | "metadata": {}, 822 | "output_type": "execute_result" 823 | } 824 | ], 825 | "source": [ 826 | "np.real(C) # same as: C.real" 827 | ] 828 | }, 829 | { 830 | "cell_type": "code", 831 | "execution_count": 53, 832 | "metadata": { 833 | "collapsed": false, 834 | "slideshow": { 835 | "slide_type": "fragment" 836 | } 837 | }, 838 | "outputs": [ 839 | { 840 | "data": { 841 | "text/plain": [ 842 | "matrix([[ 1., 2.],\n", 843 | " [ 3., 4.]])" 844 | ] 845 | }, 846 | "execution_count": 53, 847 | "metadata": {}, 848 | "output_type": "execute_result" 849 | } 850 | ], 851 | "source": [ 852 | "np.imag(C) # same as: C.imag" 853 | ] 854 | }, 855 | { 856 | "cell_type": "markdown", 857 | "metadata": { 858 | "slideshow": { 859 | "slide_type": "subslide" 860 | } 861 | }, 862 | "source": [ 863 | "* Or the complex argument and absolute value" 864 | ] 865 | }, 866 | { 867 | "cell_type": "code", 868 | "execution_count": 54, 869 | "metadata": { 870 | "collapsed": false, 871 | "slideshow": { 872 | "slide_type": "fragment" 873 | } 874 | }, 875 | "outputs": [ 876 | { 877 | "data": { 878 | "text/plain": [ 879 | "array([[ 
0.78539816, 1.10714872],\n", 880 | " [ 1.24904577, 1.32581766]])" 881 | ] 882 | }, 883 | "execution_count": 54, 884 | "metadata": {}, 885 | "output_type": "execute_result" 886 | } 887 | ], 888 | "source": [ 889 | "np.angle(C+1) # heads up MATLAB Users, angle is used instead of arg" 890 | ] 891 | }, 892 | { 893 | "cell_type": "code", 894 | "execution_count": 56, 895 | "metadata": { 896 | "collapsed": false 897 | }, 898 | "outputs": [ 899 | { 900 | "name": "stdout", 901 | "output_type": "stream", 902 | "text": [ 903 | "Help on function angle in module numpy.lib.function_base:\n", 904 | "\n", 905 | "angle(z, deg=0)\n", 906 | " Return the angle of the complex argument.\n", 907 | " \n", 908 | " Parameters\n", 909 | " ----------\n", 910 | " z : array_like\n", 911 | " A complex number or sequence of complex numbers.\n", 912 | " deg : bool, optional\n", 913 | " Return angle in degrees if True, radians if False (default).\n", 914 | " \n", 915 | " Returns\n", 916 | " -------\n", 917 | " angle : {ndarray, scalar}\n", 918 | " The counterclockwise angle from the positive real axis on\n", 919 | " the complex plane, with dtype as numpy.float64.\n", 920 | " \n", 921 | " See Also\n", 922 | " --------\n", 923 | " arctan2\n", 924 | " absolute\n", 925 | " \n", 926 | " \n", 927 | " \n", 928 | " Examples\n", 929 | " --------\n", 930 | " >>> np.angle([1.0, 1.0j, 1+1j]) # in radians\n", 931 | " array([ 0. 
, 1.57079633, 0.78539816])\n", 932 | " >>> np.angle(1+1j, deg=True) # in degrees\n", 933 | " 45.0\n", 934 | "\n" 935 | ] 936 | } 937 | ], 938 | "source": [ 939 | "help(np.angle)" 940 | ] 941 | }, 942 | { 943 | "cell_type": "code", 944 | "execution_count": 55, 945 | "metadata": { 946 | "collapsed": false, 947 | "slideshow": { 948 | "slide_type": "fragment" 949 | } 950 | }, 951 | "outputs": [ 952 | { 953 | "data": { 954 | "text/plain": [ 955 | "matrix([[ 1., 2.],\n", 956 | " [ 3., 4.]])" 957 | ] 958 | }, 959 | "execution_count": 55, 960 | "metadata": {}, 961 | "output_type": "execute_result" 962 | } 963 | ], 964 | "source": [ 965 | "np.abs(C)" 966 | ] 967 | }, 968 | { 969 | "cell_type": "markdown", 970 | "metadata": { 971 | "slideshow": { 972 | "slide_type": "slide" 973 | } 974 | }, 975 | "source": [ 976 | "## Matrix computations" 977 | ] 978 | }, 979 | { 980 | "cell_type": "markdown", 981 | "metadata": { 982 | "slideshow": { 983 | "slide_type": "subslide" 984 | } 985 | }, 986 | "source": [ 987 | "### Inverse: `np.linalg.inv`" 988 | ] 989 | }, 990 | { 991 | "cell_type": "code", 992 | "execution_count": 58, 993 | "metadata": { 994 | "collapsed": false 995 | }, 996 | "outputs": [ 997 | { 998 | "data": { 999 | "text/plain": [ 1000 | "matrix([[ 0.+2.j , 0.-1.j ],\n", 1001 | " [ 0.-1.5j, 0.+0.5j]])" 1002 | ] 1003 | }, 1004 | "execution_count": 58, 1005 | "metadata": {}, 1006 | "output_type": "execute_result" 1007 | } 1008 | ], 1009 | "source": [ 1010 | "np.linalg.inv(C) # equivalent to C.I " 1011 | ] 1012 | }, 1013 | { 1014 | "cell_type": "code", 1015 | "execution_count": 59, 1016 | "metadata": { 1017 | "collapsed": false 1018 | }, 1019 | "outputs": [ 1020 | { 1021 | "data": { 1022 | "text/plain": [ 1023 | "matrix([[ 1.00000000e+00+0.j, 0.00000000e+00+0.j],\n", 1024 | " [ 2.22044605e-16+0.j, 1.00000000e+00+0.j]])" 1025 | ] 1026 | }, 1027 | "execution_count": 59, 1028 | "metadata": {}, 1029 | "output_type": "execute_result" 1030 | } 1031 | ], 1032 | "source": [ 1033 | "C.I 
* C" 1034 | ] 1035 | }, 1036 | { 1037 | "cell_type": "markdown", 1038 | "metadata": { 1039 | "slideshow": { 1040 | "slide_type": "subslide" 1041 | } 1042 | }, 1043 | "source": [ 1044 | "### Determinant: `np.linalg.det`" 1045 | ] 1046 | }, 1047 | { 1048 | "cell_type": "code", 1049 | "execution_count": 60, 1050 | "metadata": { 1051 | "collapsed": false 1052 | }, 1053 | "outputs": [ 1054 | { 1055 | "data": { 1056 | "text/plain": [ 1057 | "(2.0000000000000004+0j)" 1058 | ] 1059 | }, 1060 | "execution_count": 60, 1061 | "metadata": {}, 1062 | "output_type": "execute_result" 1063 | } 1064 | ], 1065 | "source": [ 1066 | "np.linalg.det(C)" 1067 | ] 1068 | }, 1069 | { 1070 | "cell_type": "code", 1071 | "execution_count": 61, 1072 | "metadata": { 1073 | "collapsed": false 1074 | }, 1075 | "outputs": [ 1076 | { 1077 | "data": { 1078 | "text/plain": [ 1079 | "(0.49999999999999972+0j)" 1080 | ] 1081 | }, 1082 | "execution_count": 61, 1083 | "metadata": {}, 1084 | "output_type": "execute_result" 1085 | } 1086 | ], 1087 | "source": [ 1088 | "np.linalg.det(C.I)" 1089 | ] 1090 | }, 1091 | { 1092 | "cell_type": "markdown", 1093 | "metadata": { 1094 | "slideshow": { 1095 | "slide_type": "slide" 1096 | } 1097 | }, 1098 | "source": [ 1099 | "## Reshaping, resizing and stacking arrays" 1100 | ] 1101 | }, 1102 | { 1103 | "cell_type": "markdown", 1104 | "metadata": {}, 1105 | "source": [ 1106 | "The shape of a NumPy array can be modified without copying the underlying data, which makes it a fast operation even for large arrays." 
1107 | ] 1108 | }, 1109 | { 1110 | "cell_type": "code", 1111 | "execution_count": 65, 1112 | "metadata": { 1113 | "collapsed": false 1114 | }, 1115 | "outputs": [ 1116 | { 1117 | "data": { 1118 | "text/plain": [ 1119 | "array([[ 0, 1, 2, 3, 4],\n", 1120 | " [10, 11, 12, 13, 14],\n", 1121 | " [20, 21, 22, 23, 24],\n", 1122 | " [30, 31, 32, 33, 34],\n", 1123 | " [40, 41, 42, 43, 44]])" 1124 | ] 1125 | }, 1126 | "execution_count": 65, 1127 | "metadata": {}, 1128 | "output_type": "execute_result" 1129 | } 1130 | ], 1131 | "source": [ 1132 | "A" 1133 | ] 1134 | }, 1135 | { 1136 | "cell_type": "code", 1137 | "execution_count": 66, 1138 | "metadata": { 1139 | "collapsed": false, 1140 | "slideshow": { 1141 | "slide_type": "subslide" 1142 | } 1143 | }, 1144 | "outputs": [], 1145 | "source": [ 1146 | "n, m = A.shape" 1147 | ] 1148 | }, 1149 | { 1150 | "cell_type": "code", 1151 | "execution_count": 67, 1152 | "metadata": { 1153 | "collapsed": false, 1154 | "slideshow": { 1155 | "slide_type": "fragment" 1156 | } 1157 | }, 1158 | "outputs": [ 1159 | { 1160 | "data": { 1161 | "text/plain": [ 1162 | "array([[ 0, 1, 2, 3, 4, 10, 11, 12, 13, 14, 20, 21, 22, 23, 24, 30, 31,\n", 1163 | " 32, 33, 34, 40, 41, 42, 43, 44]])" 1164 | ] 1165 | }, 1166 | "execution_count": 67, 1167 | "metadata": {}, 1168 | "output_type": "execute_result" 1169 | } 1170 | ], 1171 | "source": [ 1172 | "B = A.reshape((1,n*m))\n", 1173 | "B" 1174 | ] 1175 | }, 1176 | { 1177 | "cell_type": "code", 1178 | "execution_count": 68, 1179 | "metadata": { 1180 | "collapsed": false, 1181 | "slideshow": { 1182 | "slide_type": "fragment" 1183 | } 1184 | }, 1185 | "outputs": [ 1186 | { 1187 | "data": { 1188 | "text/plain": [ 1189 | "array([[ 5, 5, 5, 5, 5, 10, 11, 12, 13, 14, 20, 21, 22, 23, 24, 30, 31,\n", 1190 | " 32, 33, 34, 40, 41, 42, 43, 44]])" 1191 | ] 1192 | }, 1193 | "execution_count": 68, 1194 | "metadata": {}, 1195 | "output_type": "execute_result" 1196 | } 1197 | ], 1198 | "source": [ 1199 | "B[0,0:5] = 5 # 
modify the array\n", 1200 | "\n", 1201 | "B" 1202 | ] 1203 | }, 1204 | { 1205 | "cell_type": "code", 1206 | "execution_count": 69, 1207 | "metadata": { 1208 | "collapsed": false, 1209 | "slideshow": { 1210 | "slide_type": "fragment" 1211 | } 1212 | }, 1213 | "outputs": [ 1214 | { 1215 | "data": { 1216 | "text/plain": [ 1217 | "array([[ 5, 5, 5, 5, 5],\n", 1218 | " [10, 11, 12, 13, 14],\n", 1219 | " [20, 21, 22, 23, 24],\n", 1220 | " [30, 31, 32, 33, 34],\n", 1221 | " [40, 41, 42, 43, 44]])" 1222 | ] 1223 | }, 1224 | "execution_count": 69, 1225 | "metadata": {}, 1226 | "output_type": "execute_result" 1227 | } 1228 | ], 1229 | "source": [ 1230 | "A # and the original variable is also changed. B is only a different view of the same data" 1231 | ] 1232 | }, 1233 | { 1234 | "cell_type": "markdown", 1235 | "metadata": {}, 1236 | "source": [ 1237 | "### Flattening" 1238 | ] 1239 | }, 1240 | { 1241 | "cell_type": "markdown", 1242 | "metadata": { 1243 | "slideshow": { 1244 | "slide_type": "subslide" 1245 | } 1246 | }, 1247 | "source": [ 1248 | "We can also use the function `flatten` to turn a higher-dimensional array into a vector, but this function creates a copy of the data." 
1249 | ] 1250 | }, 1251 | { 1252 | "cell_type": "code", 1253 | "execution_count": 70, 1254 | "metadata": { 1255 | "collapsed": false, 1256 | "slideshow": { 1257 | "slide_type": "fragment" 1258 | } 1259 | }, 1260 | "outputs": [ 1261 | { 1262 | "data": { 1263 | "text/plain": [ 1264 | "array([ 5, 5, 5, 5, 5, 10, 11, 12, 13, 14, 20, 21, 22, 23, 24, 30, 31,\n", 1265 | " 32, 33, 34, 40, 41, 42, 43, 44])" 1266 | ] 1267 | }, 1268 | "execution_count": 70, 1269 | "metadata": {}, 1270 | "output_type": "execute_result" 1271 | } 1272 | ], 1273 | "source": [ 1274 | "B = A.flatten()\n", 1275 | "\n", 1276 | "B" 1277 | ] 1278 | }, 1279 | { 1280 | "cell_type": "code", 1281 | "execution_count": 71, 1282 | "metadata": { 1283 | "collapsed": false, 1284 | "slideshow": { 1285 | "slide_type": "subslide" 1286 | } 1287 | }, 1288 | "outputs": [ 1289 | { 1290 | "data": { 1291 | "text/plain": [ 1292 | "array([10, 10, 10, 10, 10, 10, 11, 12, 13, 14, 20, 21, 22, 23, 24, 30, 31,\n", 1293 | " 32, 33, 34, 40, 41, 42, 43, 44])" 1294 | ] 1295 | }, 1296 | "execution_count": 71, 1297 | "metadata": {}, 1298 | "output_type": "execute_result" 1299 | } 1300 | ], 1301 | "source": [ 1302 | "B[0:5] = 10\n", 1303 | "\n", 1304 | "B" 1305 | ] 1306 | }, 1307 | { 1308 | "cell_type": "code", 1309 | "execution_count": 72, 1310 | "metadata": { 1311 | "collapsed": false, 1312 | "slideshow": { 1313 | "slide_type": "fragment" 1314 | } 1315 | }, 1316 | "outputs": [ 1317 | { 1318 | "data": { 1319 | "text/plain": [ 1320 | "array([[ 5, 5, 5, 5, 5],\n", 1321 | " [10, 11, 12, 13, 14],\n", 1322 | " [20, 21, 22, 23, 24],\n", 1323 | " [30, 31, 32, 33, 34],\n", 1324 | " [40, 41, 42, 43, 44]])" 1325 | ] 1326 | }, 1327 | "execution_count": 72, 1328 | "metadata": {}, 1329 | "output_type": "execute_result" 1330 | } 1331 | ], 1332 | "source": [ 1333 | "A # A has not changed: B's data is a copy of A's, not a reference to the same data" 1334 | ] 1335 | }, 1336 | { 1337 | "cell_type": "markdown", 1338 | "metadata": {}, 1339 | 
"source": [ 1340 | "### `np.ravel`" 1341 | ] 1342 | }, 1343 | { 1344 | "cell_type": "code", 1345 | "execution_count": 98, 1346 | "metadata": { 1347 | "collapsed": false 1348 | }, 1349 | "outputs": [ 1350 | { 1351 | "data": { 1352 | "text/plain": [ 1353 | "array([1, 2, 3, 4, 5, 6])" 1354 | ] 1355 | }, 1356 | "execution_count": 98, 1357 | "metadata": {}, 1358 | "output_type": "execute_result" 1359 | } 1360 | ], 1361 | "source": [ 1362 | "a = np.array([[1, 2, 3], [4, 5, 6]])\n", 1363 | "a.ravel()" 1364 | ] 1365 | }, 1366 | { 1367 | "cell_type": "code", 1368 | "execution_count": 99, 1369 | "metadata": { 1370 | "collapsed": false 1371 | }, 1372 | "outputs": [ 1373 | { 1374 | "data": { 1375 | "text/plain": [ 1376 | "array([[1, 4],\n", 1377 | " [2, 5],\n", 1378 | " [3, 6]])" 1379 | ] 1380 | }, 1381 | "execution_count": 99, 1382 | "metadata": {}, 1383 | "output_type": "execute_result" 1384 | } 1385 | ], 1386 | "source": [ 1387 | "a.T" 1388 | ] 1389 | }, 1390 | { 1391 | "cell_type": "code", 1392 | "execution_count": 100, 1393 | "metadata": { 1394 | "collapsed": false 1395 | }, 1396 | "outputs": [ 1397 | { 1398 | "data": { 1399 | "text/plain": [ 1400 | "array([1, 4, 2, 5, 3, 6])" 1401 | ] 1402 | }, 1403 | "execution_count": 100, 1404 | "metadata": {}, 1405 | "output_type": "execute_result" 1406 | } 1407 | ], 1408 | "source": [ 1409 | "a.T.ravel()" 1410 | ] 1411 | }, 1412 | { 1413 | "cell_type": "markdown", 1414 | "metadata": { 1415 | "slideshow": { 1416 | "slide_type": "slide" 1417 | } 1418 | }, 1419 | "source": [ 1420 | "## Adding a new dimension: `np.newaxis`" 1421 | ] 1422 | }, 1423 | { 1424 | "cell_type": "markdown", 1425 | "metadata": {}, 1426 | "source": [ 1427 | "With `newaxis`, we can insert new dimensions in an array, for example converting a vector to a column or row matrix:" 1428 | ] 1429 | }, 1430 | { 1431 | "cell_type": "code", 1432 | "execution_count": 74, 1433 | "metadata": { 1434 | "collapsed": false, 1435 | "slideshow": { 1436 | "slide_type": "fragment" 1437 
| } 1438 | }, 1439 | "outputs": [], 1440 | "source": [ 1441 | "v = np.array([1,2,3])" 1442 | ] 1443 | }, 1444 | { 1445 | "cell_type": "code", 1446 | "execution_count": 75, 1447 | "metadata": { 1448 | "collapsed": false, 1449 | "slideshow": { 1450 | "slide_type": "fragment" 1451 | } 1452 | }, 1453 | "outputs": [ 1454 | { 1455 | "data": { 1456 | "text/plain": [ 1457 | "(3,)" 1458 | ] 1459 | }, 1460 | "execution_count": 75, 1461 | "metadata": {}, 1462 | "output_type": "execute_result" 1463 | } 1464 | ], 1465 | "source": [ 1466 | "np.shape(v)" 1467 | ] 1468 | }, 1469 | { 1470 | "cell_type": "code", 1471 | "execution_count": 77, 1472 | "metadata": { 1473 | "collapsed": false, 1474 | "slideshow": { 1475 | "slide_type": "fragment" 1476 | } 1477 | }, 1478 | "outputs": [ 1479 | { 1480 | "data": { 1481 | "text/plain": [ 1482 | "array([[1],\n", 1483 | " [2],\n", 1484 | " [3]])" 1485 | ] 1486 | }, 1487 | "execution_count": 77, 1488 | "metadata": {}, 1489 | "output_type": "execute_result" 1490 | } 1491 | ], 1492 | "source": [ 1493 | "# make a column matrix of the vector v\n", 1494 | "v[:, np.newaxis]" 1495 | ] 1496 | }, 1497 | { 1498 | "cell_type": "code", 1499 | "execution_count": 78, 1500 | "metadata": { 1501 | "collapsed": false, 1502 | "slideshow": { 1503 | "slide_type": "subslide" 1504 | } 1505 | }, 1506 | "outputs": [ 1507 | { 1508 | "data": { 1509 | "text/plain": [ 1510 | "(3, 1)" 1511 | ] 1512 | }, 1513 | "execution_count": 78, 1514 | "metadata": {}, 1515 | "output_type": "execute_result" 1516 | } 1517 | ], 1518 | "source": [ 1519 | "# column matrix\n", 1520 | "v[:,np.newaxis].shape" 1521 | ] 1522 | }, 1523 | { 1524 | "cell_type": "code", 1525 | "execution_count": 80, 1526 | "metadata": { 1527 | "collapsed": false, 1528 | "slideshow": { 1529 | "slide_type": "fragment" 1530 | } 1531 | }, 1532 | "outputs": [ 1533 | { 1534 | "data": { 1535 | "text/plain": [ 1536 | "(1, 3)" 1537 | ] 1538 | }, 1539 | "execution_count": 80, 1540 | "metadata": {}, 1541 | "output_type": 
"execute_result" 1542 | } 1543 | ], 1544 | "source": [ 1545 | "# row matrix\n", 1546 | "v[np.newaxis,:].shape" 1547 | ] 1548 | }, 1549 | { 1550 | "cell_type": "markdown", 1551 | "metadata": { 1552 | "slideshow": { 1553 | "slide_type": "slide" 1554 | } 1555 | }, 1556 | "source": [ 1557 | "# Stacking and repeating arrays" 1558 | ] 1559 | }, 1560 | { 1561 | "cell_type": "markdown", 1562 | "metadata": {}, 1563 | "source": [ 1564 | "Using function `repeat`, `tile`, `vstack`, `hstack`, and `concatenate` we can create larger vectors and matrices from smaller ones:" 1565 | ] 1566 | }, 1567 | { 1568 | "cell_type": "markdown", 1569 | "metadata": { 1570 | "slideshow": { 1571 | "slide_type": "subslide" 1572 | } 1573 | }, 1574 | "source": [ 1575 | "## `np.tile` and `np.repeat`" 1576 | ] 1577 | }, 1578 | { 1579 | "cell_type": "code", 1580 | "execution_count": 82, 1581 | "metadata": { 1582 | "collapsed": false 1583 | }, 1584 | "outputs": [], 1585 | "source": [ 1586 | "a = np.array([[1, 2], [3, 4]])" 1587 | ] 1588 | }, 1589 | { 1590 | "cell_type": "code", 1591 | "execution_count": 83, 1592 | "metadata": { 1593 | "collapsed": false, 1594 | "slideshow": { 1595 | "slide_type": "fragment" 1596 | } 1597 | }, 1598 | "outputs": [ 1599 | { 1600 | "data": { 1601 | "text/plain": [ 1602 | "array([1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4])" 1603 | ] 1604 | }, 1605 | "execution_count": 83, 1606 | "metadata": {}, 1607 | "output_type": "execute_result" 1608 | } 1609 | ], 1610 | "source": [ 1611 | "# repeat each element 3 times\n", 1612 | "np.repeat(a, 3)" 1613 | ] 1614 | }, 1615 | { 1616 | "cell_type": "code", 1617 | "execution_count": 84, 1618 | "metadata": { 1619 | "collapsed": false, 1620 | "slideshow": { 1621 | "slide_type": "fragment" 1622 | } 1623 | }, 1624 | "outputs": [ 1625 | { 1626 | "data": { 1627 | "text/plain": [ 1628 | "array([[1, 2, 1, 2, 1, 2],\n", 1629 | " [3, 4, 3, 4, 3, 4]])" 1630 | ] 1631 | }, 1632 | "execution_count": 84, 1633 | "metadata": {}, 1634 | "output_type": 
"execute_result" 1635 | } 1636 | ], 1637 | "source": [ 1638 | "# tile the matrix 3 times \n", 1639 | "np.tile(a, 3)" 1640 | ] 1641 | }, 1642 | { 1643 | "cell_type": "markdown", 1644 | "metadata": { 1645 | "slideshow": { 1646 | "slide_type": "subslide" 1647 | } 1648 | }, 1649 | "source": [ 1650 | "## `np.concatenate`" 1651 | ] 1652 | }, 1653 | { 1654 | "cell_type": "code", 1655 | "execution_count": 85, 1656 | "metadata": { 1657 | "collapsed": false, 1658 | "slideshow": { 1659 | "slide_type": "-" 1660 | } 1661 | }, 1662 | "outputs": [], 1663 | "source": [ 1664 | "b = np.array([[5, 6]])" 1665 | ] 1666 | }, 1667 | { 1668 | "cell_type": "code", 1669 | "execution_count": 86, 1670 | "metadata": { 1671 | "collapsed": false, 1672 | "slideshow": { 1673 | "slide_type": "fragment" 1674 | } 1675 | }, 1676 | "outputs": [ 1677 | { 1678 | "data": { 1679 | "text/plain": [ 1680 | "array([[1, 2],\n", 1681 | " [3, 4],\n", 1682 | " [5, 6]])" 1683 | ] 1684 | }, 1685 | "execution_count": 86, 1686 | "metadata": {}, 1687 | "output_type": "execute_result" 1688 | } 1689 | ], 1690 | "source": [ 1691 | "np.concatenate((a, b), axis=0)" 1692 | ] 1693 | }, 1694 | { 1695 | "cell_type": "code", 1696 | "execution_count": 87, 1697 | "metadata": { 1698 | "collapsed": false, 1699 | "slideshow": { 1700 | "slide_type": "fragment" 1701 | } 1702 | }, 1703 | "outputs": [ 1704 | { 1705 | "data": { 1706 | "text/plain": [ 1707 | "array([[1, 2, 5],\n", 1708 | " [3, 4, 6]])" 1709 | ] 1710 | }, 1711 | "execution_count": 87, 1712 | "metadata": {}, 1713 | "output_type": "execute_result" 1714 | } 1715 | ], 1716 | "source": [ 1717 | "np.concatenate((a, b.T), axis=1)" 1718 | ] 1719 | }, 1720 | { 1721 | "cell_type": "markdown", 1722 | "metadata": { 1723 | "slideshow": { 1724 | "slide_type": "subslide" 1725 | } 1726 | }, 1727 | "source": [ 1728 | "## `np.hstack` and `np.vstack`" 1729 | ] 1730 | }, 1731 | { 1732 | "cell_type": "code", 1733 | "execution_count": 88, 1734 | "metadata": { 1735 | "collapsed": false 1736 | }, 
1737 | "outputs": [ 1738 | { 1739 | "data": { 1740 | "text/plain": [ 1741 | "array([[1, 2],\n", 1742 | " [3, 4],\n", 1743 | " [5, 6]])" 1744 | ] 1745 | }, 1746 | "execution_count": 88, 1747 | "metadata": {}, 1748 | "output_type": "execute_result" 1749 | } 1750 | ], 1751 | "source": [ 1752 | "np.vstack((a,b))" 1753 | ] 1754 | }, 1755 | { 1756 | "cell_type": "code", 1757 | "execution_count": 89, 1758 | "metadata": { 1759 | "collapsed": false 1760 | }, 1761 | "outputs": [ 1762 | { 1763 | "data": { 1764 | "text/plain": [ 1765 | "array([[1, 2, 5],\n", 1766 | " [3, 4, 6]])" 1767 | ] 1768 | }, 1769 | "execution_count": 89, 1770 | "metadata": {}, 1771 | "output_type": "execute_result" 1772 | } 1773 | ], 1774 | "source": [ 1775 | "np.hstack((a,b.T))" 1776 | ] 1777 | }, 1778 | { 1779 | "cell_type": "markdown", 1780 | "metadata": { 1781 | "slideshow": { 1782 | "slide_type": "slide" 1783 | } 1784 | }, 1785 | "source": [ 1786 | "# Copy and \"deep copy\"" 1787 | ] 1788 | }, 1789 | { 1790 | "cell_type": "markdown", 1791 | "metadata": {}, 1792 | "source": [ 1793 | "To achieve high performance, assignments in Python usually do not copy the underlying objects. 
\n", 1794 | "\n", 1795 | "This is important, for example, when objects are passed between functions, to avoid an excessive amount of memory copying when it is not necessary (technical term: **pass by reference**).\n", 1796 | "\n", 1797 | "" 1798 | ] 1799 | }, 1800 | { 1801 | "cell_type": "code", 1802 | "execution_count": 90, 1803 | "metadata": { 1804 | "collapsed": false 1805 | }, 1806 | "outputs": [ 1807 | { 1808 | "data": { 1809 | "text/plain": [ 1810 | "array([[1, 2],\n", 1811 | " [3, 4]])" 1812 | ] 1813 | }, 1814 | "execution_count": 90, 1815 | "metadata": {}, 1816 | "output_type": "execute_result" 1817 | } 1818 | ], 1819 | "source": [ 1820 | "A = np.array([[1, 2], [3, 4]])\n", 1821 | "\n", 1822 | "A" 1823 | ] 1824 | }, 1825 | { 1826 | "cell_type": "code", 1827 | "execution_count": 91, 1828 | "metadata": { 1829 | "collapsed": false, 1830 | "slideshow": { 1831 | "slide_type": "subslide" 1832 | } 1833 | }, 1834 | "outputs": [], 1835 | "source": [ 1836 | "# now B is referring to the same array data as A \n", 1837 | "B = A " 1838 | ] 1839 | }, 1840 | { 1841 | "cell_type": "code", 1842 | "execution_count": 92, 1843 | "metadata": { 1844 | "collapsed": false, 1845 | "slideshow": { 1846 | "slide_type": "fragment" 1847 | } 1848 | }, 1849 | "outputs": [ 1850 | { 1851 | "data": { 1852 | "text/plain": [ 1853 | "array([[10, 2],\n", 1854 | " [ 3, 4]])" 1855 | ] 1856 | }, 1857 | "execution_count": 92, 1858 | "metadata": {}, 1859 | "output_type": "execute_result" 1860 | } 1861 | ], 1862 | "source": [ 1863 | "# changing B affects A\n", 1864 | "B[0,0] = 10\n", 1865 | "\n", 1866 | "B" 1867 | ] 1868 | }, 1869 | { 1870 | "cell_type": "code", 1871 | "execution_count": 93, 1872 | "metadata": { 1873 | "collapsed": false, 1874 | "slideshow": { 1875 | "slide_type": "fragment" 1876 | } 1877 | }, 1878 | "outputs": [ 1879 | { 1880 | "data": { 1881 | "text/plain": [ 1882 | "array([[10, 2],\n", 1883 | " [ 3, 4]])" 1884 | ] 1885 | }, 1886 | "execution_count": 93, 1887 | "metadata": {}, 1888 | 
"output_type": "execute_result" 1889 | } 1890 | ], 1891 | "source": [ 1892 | "A" 1893 | ] 1894 | }, 1895 | { 1896 | "cell_type": "markdown", 1897 | "metadata": { 1898 | "slideshow": { 1899 | "slide_type": "subslide" 1900 | } 1901 | }, 1902 | "source": [ 1903 | "* If we want to **avoid** this behavior, so that when we get a new completely independent object `B` copied from `A`, then we need to do a so-called **deep copy** using the function `np.copy`:" 1904 | ] 1905 | }, 1906 | { 1907 | "cell_type": "code", 1908 | "execution_count": 94, 1909 | "metadata": { 1910 | "collapsed": false, 1911 | "slideshow": { 1912 | "slide_type": "fragment" 1913 | } 1914 | }, 1915 | "outputs": [], 1916 | "source": [ 1917 | "B = np.copy(A)" 1918 | ] 1919 | }, 1920 | { 1921 | "cell_type": "code", 1922 | "execution_count": 95, 1923 | "metadata": { 1924 | "collapsed": false, 1925 | "slideshow": { 1926 | "slide_type": "fragment" 1927 | } 1928 | }, 1929 | "outputs": [ 1930 | { 1931 | "data": { 1932 | "text/plain": [ 1933 | "array([[-5, 2],\n", 1934 | " [ 3, 4]])" 1935 | ] 1936 | }, 1937 | "execution_count": 95, 1938 | "metadata": {}, 1939 | "output_type": "execute_result" 1940 | } 1941 | ], 1942 | "source": [ 1943 | "# now, if we modify B, A is not affected\n", 1944 | "B[0,0] = -5\n", 1945 | "\n", 1946 | "B" 1947 | ] 1948 | }, 1949 | { 1950 | "cell_type": "code", 1951 | "execution_count": 96, 1952 | "metadata": { 1953 | "collapsed": false, 1954 | "slideshow": { 1955 | "slide_type": "fragment" 1956 | } 1957 | }, 1958 | "outputs": [ 1959 | { 1960 | "data": { 1961 | "text/plain": [ 1962 | "array([[10, 2],\n", 1963 | " [ 3, 4]])" 1964 | ] 1965 | }, 1966 | "execution_count": 96, 1967 | "metadata": {}, 1968 | "output_type": "execute_result" 1969 | } 1970 | ], 1971 | "source": [ 1972 | "A" 1973 | ] 1974 | }, 1975 | { 1976 | "cell_type": "markdown", 1977 | "metadata": {}, 1978 | "source": [ 1979 | "# Exercise: Shape manipulations" 1980 | ] 1981 | }, 1982 | { 1983 | "cell_type": "markdown", 1984 | 
"metadata": {}, 1985 | "source": [ 1986 | "* Look at the docstring for `reshape`, especially the notes section which\n", 1987 | "has some more information about copies and views.\n", 1988 | "\n", 1989 | "* Use `flatten` as an alternative to `ravel`. What is the difference?\n", 1990 | "(Hint: check which one returns a view and which a copy)\n", 1991 | "\n", 1992 | "* Experiment with `transpose` for dimension shuffling." 1993 | ] 1994 | } 1995 | ], 1996 | "metadata": { 1997 | "kernelspec": { 1998 | "display_name": "Python 3", 1999 | "language": "python", 2000 | "name": "python3" 2001 | }, 2002 | "language_info": { 2003 | "codemirror_mode": { 2004 | "name": "ipython", 2005 | "version": 3 2006 | }, 2007 | "file_extension": ".py", 2008 | "mimetype": "text/x-python", 2009 | "name": "python", 2010 | "nbconvert_exporter": "python", 2011 | "pygments_lexer": "ipython3", 2012 | "version": "3.4.3" 2013 | } 2014 | }, 2015 | "nbformat": 4, 2016 | "nbformat_minor": 0 2017 | } 2018 | -------------------------------------------------------------------------------- /02_Introduction to Numpy.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "# What is Numpy" 8 | ] 9 | }, 10 | { 11 | "cell_type": "markdown", 12 | "metadata": {}, 13 | "source": [ 14 | "NumPy is the fundamental package for scientific computing with Python. \n", 15 | "It is a package that provide high-performance vector, matrix and higher-dimensional data structures for Python. 
\n", 16 | "It is implemented in C and Fortran, so when calculations are **vectorized**, performance is very good.\n", 17 | "\n", 18 | "So, in a nutshell:\n", 19 | "\n", 20 | "* a powerful Python extension for N-dimensional arrays\n", 21 | "* a tool for integrating C/C++ and Fortran code\n", 22 | "* designed for scientific computation: linear algebra and Signal Analysis\n", 23 | "\n", 24 | "If you are a MATLAB® user we recommend reading [Numpy for MATLAB Users](http://www.scipy.org/NumPy_for_Matlab_Users) and [Benefit of Open Source Python versus commercial packages](http://www.scipy.org/NumPyProConPage). \n", 25 | "\n", 26 | "I'm a supporter of the **Open Science Movement**, thus I humbly suggest you take a look at the [Science Code Manifesto](http://sciencecodemanifesto.org/)" 27 | ] 28 | }, 29 | { 30 | "cell_type": "markdown", 31 | "metadata": {}, 32 | "source": [ 33 | "# Getting Started with Numpy Arrays" 34 | ] 35 | }, 36 | { 37 | "cell_type": "markdown", 38 | "metadata": {}, 39 | "source": [ 40 | "NumPy's main object is the **homogeneous** ***multidimensional array***. It is a table of elements (usually numbers), all of the same type. \n", 41 | "\n", 42 | "In NumPy, dimensions are called **axes**. \n", 43 | "\n", 44 | "The number of axes is called **rank**. \n", 45 | "\n", 46 | "The most important attributes of an ndarray object are:\n", 47 | "\n", 48 | "* **ndarray.ndim** - the number of axes (dimensions) of the array. \n", 49 | "* **ndarray.shape** - the dimensions of the array. For a matrix with n rows and m columns, shape will be (n,m). \n", 50 | "* **ndarray.size** - the total number of elements of the array. \n", 51 | "* **ndarray.dtype** - the type of the elements in the array; numpy.int32, numpy.int16, and numpy.float64 are some examples. \n", 52 | "* **ndarray.itemsize** - the size in bytes of each element of the array. 
For example, an element of type float64 has itemsize 8 (=64/8) " 53 | ] 54 | }, 55 | { 56 | "cell_type": "markdown", 57 | "metadata": {}, 58 | "source": [ 59 | "To use `numpy`, import the module, for example:" 60 | ] 61 | }, 62 | { 63 | "cell_type": "code", 64 | "execution_count": 2, 65 | "metadata": { 66 | "collapsed": true 67 | }, 68 | "outputs": [], 69 | "source": [ 70 | "import numpy as np # conventional import alias" 71 | ] 72 | }, 73 | { 74 | "cell_type": "markdown", 75 | "metadata": {}, 76 | "source": [ 77 | "### Terminology Assumption" 78 | ] 79 | }, 80 | { 81 | "cell_type": "markdown", 82 | "metadata": {}, 83 | "source": [ 84 | "In the `numpy` package the terminology used for vectors, matrices and higher-dimensional data sets is *array*. " 85 | ] 86 | }, 87 | { 88 | "cell_type": "markdown", 89 | "metadata": {}, 90 | "source": [ 91 | "### Reference Documentation" 92 | ] 93 | }, 94 | { 95 | "cell_type": "markdown", 96 | "metadata": {}, 97 | "source": [ 98 | "* On the web: [http://docs.scipy.org](http://docs.scipy.org)/\n", 99 | "\n", 100 | "* Interactive help:" 101 | ] 102 | }, 103 | { 104 | "cell_type": "code", 105 | "execution_count": null, 106 | "metadata": { 107 | "collapsed": true 108 | }, 109 | "outputs": [], 110 | "source": [ 111 | "np.array?" 112 | ] 113 | }, 114 | { 115 | "cell_type": "markdown", 116 | "metadata": {}, 117 | "source": [ 118 | "If you're looking for something, search the docstrings by keyword:" 119 | ] 120 | }, 121 | { 122 | "cell_type": "code", 123 | "execution_count": null, 124 | "metadata": { 125 | "collapsed": false 126 | }, 127 | "outputs": [], 128 | "source": [ 129 | "np.lookfor('create array')" 130 | ] 131 | }, 132 | { 133 | "cell_type": "code", 134 | "execution_count": null, 135 | "metadata": { 136 | "collapsed": true 137 | }, 138 | "outputs": [], 139 | "source": [ 140 | "np.con*?" 
141 | ] 142 | }, 143 | { 144 | "cell_type": "markdown", 145 | "metadata": {}, 146 | "source": [ 147 | "#### Help is your friend" 148 | ] 149 | }, 150 | { 151 | "cell_type": "markdown", 152 | "metadata": {}, 153 | "source": [ 154 | "Whenever in doubt, there is the `help` function to the rescue" 155 | ] 156 | }, 157 | { 158 | "cell_type": "code", 159 | "execution_count": null, 160 | "metadata": { 161 | "collapsed": false, 162 | "scrolled": false 163 | }, 164 | "outputs": [], 165 | "source": [ 166 | "# For example, try \n", 167 | "help(np.ndarray)" 168 | ] 169 | }, 170 | { 171 | "cell_type": "markdown", 172 | "metadata": {}, 173 | "source": [ 174 | "## Numpy Array Object" 175 | ] 176 | }, 177 | { 178 | "cell_type": "markdown", 179 | "metadata": {}, 180 | "source": [ 181 | "`NumPy` has a multidimensional array object called ndarray. It consists of two parts as follows:\n", 182 | " \n", 183 | " * The actual data\n", 184 | " * Some metadata describing the data\n", 185 | " \n", 186 | " \n", 187 | "The majority of array operations leave the raw data untouched. The only aspect that changes is the metadata." 188 | ] 189 | }, 190 | { 191 | "cell_type": "markdown", 192 | "metadata": {}, 193 | "source": [ 194 | "" 195 | ] 196 | }, 197 | { 198 | "cell_type": "markdown", 199 | "metadata": {}, 200 | "source": [ 201 | "## Creating `numpy` arrays" 202 | ] 203 | }, 204 | { 205 | "cell_type": "markdown", 206 | "metadata": {}, 207 | "source": [ 208 | "There are a number of ways to initialize new numpy arrays, for example from\n", 209 | "\n", 210 | "* a Python list or tuples\n", 211 | "* using functions that are dedicated to generating numpy arrays, such as `arange`, `linspace`, etc." 
212 | ] 213 | }, 214 | { 215 | "cell_type": "markdown", 216 | "metadata": {}, 217 | "source": [ 218 | "### From lists" 219 | ] 220 | }, 221 | { 222 | "cell_type": "markdown", 223 | "metadata": {}, 224 | "source": [ 225 | "For example, to create new vector and matrix arrays from Python lists we can use the `numpy.array` function." 226 | ] 227 | }, 228 | { 229 | "cell_type": "code", 230 | "execution_count": 3, 231 | "metadata": { 232 | "collapsed": false 233 | }, 234 | "outputs": [ 235 | { 236 | "data": { 237 | "text/plain": [ 238 | "array([1, 2, 3, 4])" 239 | ] 240 | }, 241 | "execution_count": 3, 242 | "metadata": {}, 243 | "output_type": "execute_result" 244 | } 245 | ], 246 | "source": [ 247 | "# a vector: the argument to the array function is a Python list\n", 248 | "v = np.array([1,2,3,4])\n", 249 | "v" 250 | ] 251 | }, 252 | { 253 | "cell_type": "code", 254 | "execution_count": 4, 255 | "metadata": { 256 | "collapsed": false 257 | }, 258 | "outputs": [ 259 | { 260 | "data": { 261 | "text/plain": [ 262 | "array([[1, 2],\n", 263 | " [3, 4]])" 264 | ] 265 | }, 266 | "execution_count": 4, 267 | "metadata": {}, 268 | "output_type": "execute_result" 269 | } 270 | ], 271 | "source": [ 272 | "# a matrix: the argument to the array function is a nested Python list\n", 273 | "M = np.array([[1, 2], [3, 4]])\n", 274 | "M" 275 | ] 276 | }, 277 | { 278 | "cell_type": "markdown", 279 | "metadata": {}, 280 | "source": [ 281 | "The `v` and `M` objects are both of the type `ndarray` that the `numpy` module provides." 
 282 | ] 283 | }, 284 | { 285 | "cell_type": "code", 286 | "execution_count": 5, 287 | "metadata": { 288 | "collapsed": false 289 | }, 290 | "outputs": [ 291 | { 292 | "name": "stdout", 293 | "output_type": "stream", 294 | "text": [ 295 | "Type of v: \n", 296 | "Type of M: \n" 297 | ] 298 | } 299 | ], 300 | "source": [ 301 | "print('Type of v: ', type(v))\n", 302 | "print('Type of M: ', type(M))" 303 | ] 304 | }, 305 | { 306 | "cell_type": "markdown", 307 | "metadata": {}, 308 | "source": [ 309 | "The only difference between the `v` and `M` arrays is their shape. \n", 310 | "\n", 311 | "To inspect it, we can use the `numpy.shape` function:" 312 | ] 313 | }, 314 | { 315 | "cell_type": "code", 316 | "execution_count": 12, 317 | "metadata": { 318 | "collapsed": false 319 | }, 320 | "outputs": [ 321 | { 322 | "name": "stdout", 323 | "output_type": "stream", 324 | "text": [ 325 | "Shape of v: (4,)\n", 326 | "Shape of M: (2, 2)\n" 327 | ] 328 | } 329 | ], 330 | "source": [ 331 | "print('Shape of v: ', np.shape(v))\n", 332 | "print('Shape of M: ', np.shape(M))" 333 | ] 334 | }, 335 | { 336 | "cell_type": "markdown", 337 | "metadata": {}, 338 | "source": [ 339 | "Alternatively, we can get the shape of an array from the `ndarray.shape` **property**:" 340 | ] 341 | }, 342 | { 343 | "cell_type": "code", 344 | "execution_count": 10, 345 | "metadata": { 346 | "collapsed": false 347 | }, 348 | "outputs": [ 349 | { 350 | "data": { 351 | "text/plain": [ 352 | "((4,), (2, 2))" 353 | ] 354 | }, 355 | "execution_count": 10, 356 | "metadata": {}, 357 | "output_type": "execute_result" 358 | } 359 | ], 360 | "source": [ 361 | "v.shape, M.shape" 362 | ] 363 | }, 364 | { 365 | "cell_type": "markdown", 366 | "metadata": {}, 367 | "source": [ 368 | "Similarly, we can get the **size** of the two `ndarrays`, namely the *total number of elements* in each array." 
 369 | ] 370 | }, 371 | { 372 | "cell_type": "code", 373 | "execution_count": 13, 374 | "metadata": { 375 | "collapsed": false 376 | }, 377 | "outputs": [ 378 | { 379 | "name": "stdout", 380 | "output_type": "stream", 381 | "text": [ 382 | "Size of v: 4\n", 383 | "Size of M: 4\n" 384 | ] 385 | } 386 | ], 387 | "source": [ 388 | "print('Size of v:', v.size)\n", 389 | "print('Size of M:', M.size)" 390 | ] 391 | }, 392 | { 393 | "cell_type": "markdown", 394 | "metadata": {}, 395 | "source": [ 396 | "#### More properties of the `numpy` array" 397 | ] 398 | }, 399 | { 400 | "cell_type": "code", 401 | "execution_count": 32, 402 | "metadata": { 403 | "collapsed": false 404 | }, 405 | "outputs": [ 406 | { 407 | "data": { 408 | "text/plain": [ 409 | "8" 410 | ] 411 | }, 412 | "execution_count": 32, 413 | "metadata": {}, 414 | "output_type": "execute_result" 415 | } 416 | ], 417 | "source": [ 418 | "M.itemsize # bytes per element" 419 | ] 420 | }, 421 | { 422 | "cell_type": "code", 423 | "execution_count": 33, 424 | "metadata": { 425 | "collapsed": false 426 | }, 427 | "outputs": [ 428 | { 429 | "data": { 430 | "text/plain": [ 431 | "32" 432 | ] 433 | }, 434 | "execution_count": 33, 435 | "metadata": {}, 436 | "output_type": "execute_result" 437 | } 438 | ], 439 | "source": [ 440 | "M.nbytes # total number of bytes" 441 | ] 442 | }, 443 | { 444 | "cell_type": "code", 445 | "execution_count": 34, 446 | "metadata": { 447 | "collapsed": false 448 | }, 449 | "outputs": [ 450 | { 451 | "data": { 452 | "text/plain": [ 453 | "2" 454 | ] 455 | }, 456 | "execution_count": 34, 457 | "metadata": {}, 458 | "output_type": "execute_result" 459 | } 460 | ], 461 | "source": [ 462 | "M.ndim # number of dimensions" 463 | ] 464 | }, 465 | { 466 | "cell_type": "markdown", 467 | "metadata": {}, 468 | "source": [ 469 | "## Using array-generating functions" 470 | ] 471 | }, 472 | { 473 | "cell_type": "markdown", 474 | "metadata": {}, 475 | "source": [ 476 | "For larger arrays it is impractical to 
initialize the data manually, using explicit Python lists. \n", 477 | "\n", 478 | "Instead we can use one of the many **functions** in `numpy` that generate arrays of different forms. \n", 479 | "\n", 480 | "Some of the more common ones are: \n", 481 | "\n", 482 | "* `np.arange`; \n", 483 | "* `np.linspace`; \n", 484 | "* `np.logspace`; \n", 485 | "* `np.mgrid`;\n", 486 | "* `np.random.rand`;\n", 487 | "* `np.diag`;\n", 488 | "* `np.zeros`;\n", 489 | "* `np.ones`;\n", 490 | "* `np.empty`;\n", 491 | "* `np.tile`." 492 | ] 493 | }, 494 | { 495 | "cell_type": "markdown", 496 | "metadata": {}, 497 | "source": [ 498 | "### `np.arange`" 499 | ] 500 | }, 501 | { 502 | "cell_type": "code", 503 | "execution_count": 15, 504 | "metadata": { 505 | "collapsed": false 506 | }, 507 | "outputs": [ 508 | { 509 | "name": "stdout", 510 | "output_type": "stream", 511 | "text": [ 512 | "[0 1 2 3 4 5 6 7 8 9]\n" 513 | ] 514 | } 515 | ], 516 | "source": [ 517 | "# create a range (the stop value is excluded)\n", 518 | "x = np.arange(0, 10, 1) # arguments: start, stop, step\n", 519 | "print(x)" 520 | ] 521 | }, 522 | { 523 | "cell_type": "code", 524 | "execution_count": 17, 525 | "metadata": { 526 | "collapsed": false 527 | }, 528 | "outputs": [ 529 | { 530 | "name": "stdout", 531 | "output_type": "stream", 532 | "text": [ 533 | "[ -1.00000000e+00 -9.00000000e-01 -8.00000000e-01 -7.00000000e-01\n", 534 | " -6.00000000e-01 -5.00000000e-01 -4.00000000e-01 -3.00000000e-01\n", 535 | " -2.00000000e-01 -1.00000000e-01 -2.22044605e-16 1.00000000e-01\n", 536 | " 2.00000000e-01 3.00000000e-01 4.00000000e-01 5.00000000e-01\n", 537 | " 6.00000000e-01 7.00000000e-01 8.00000000e-01 9.00000000e-01]\n" 538 | ] 539 | } 540 | ], 541 | "source": [ 542 | "x = np.arange(-1, 1, 0.1) # floating point step-wise range generation\n", 543 | "print(x)" 544 | ] 545 | }, 546 | { 547 | "cell_type": "markdown", 548 | "metadata": {}, 549 | "source": [ 550 | "### `np.linspace` and `np.logspace`" 551 | ] 552 | }, 553 | { 554 | "cell_type": "code", 555 | 
"execution_count": 18, 556 | "metadata": { 557 | "collapsed": false 558 | }, 559 | "outputs": [ 560 | { 561 | "data": { 562 | "text/plain": [ 563 | "array([ 0. , 0.41666667, 0.83333333, 1.25 ,\n", 564 | " 1.66666667, 2.08333333, 2.5 , 2.91666667,\n", 565 | " 3.33333333, 3.75 , 4.16666667, 4.58333333,\n", 566 | " 5. , 5.41666667, 5.83333333, 6.25 ,\n", 567 | " 6.66666667, 7.08333333, 7.5 , 7.91666667,\n", 568 | " 8.33333333, 8.75 , 9.16666667, 9.58333333, 10. ])" 569 | ] 570 | }, 571 | "execution_count": 18, 572 | "metadata": {}, 573 | "output_type": "execute_result" 574 | } 575 | ], 576 | "source": [ 577 | "# using linspace, both end points **ARE included**\n", 578 | "np.linspace(0, 10, 25)" 579 | ] 580 | }, 581 | { 582 | "cell_type": "code", 583 | "execution_count": 36, 584 | "metadata": { 585 | "collapsed": false 586 | }, 587 | "outputs": [ 588 | { 589 | "data": { 590 | "text/plain": [ 591 | "array([ 1.00000000e+00, 2.27278564e+00, 5.16555456e+00,\n", 592 | " 1.17401982e+01, 2.66829540e+01, 6.06446346e+01,\n", 593 | " 1.37832255e+02, 3.13263169e+02, 7.11980032e+02,\n", 594 | " 1.61817799e+03])" 595 | ] 596 | }, 597 | "execution_count": 36, 598 | "metadata": {}, 599 | "output_type": "execute_result" 600 | } 601 | ], 602 | "source": [ 603 | "np.logspace(0, np.e**2, 10, base=np.e)" 604 | ] 605 | }, 606 | { 607 | "cell_type": "markdown", 608 | "metadata": {}, 609 | "source": [ 610 | "### `np.mgrid`" 611 | ] 612 | }, 613 | { 614 | "cell_type": "code", 615 | "execution_count": 21, 616 | "metadata": { 617 | "collapsed": true 618 | }, 619 | "outputs": [], 620 | "source": [ 621 | "x, y = np.mgrid[0:5, 0:5] # similar to meshgrid in MATLAB" 622 | ] 623 | }, 624 | { 625 | "cell_type": "code", 626 | "execution_count": 22, 627 | "metadata": { 628 | "collapsed": false 629 | }, 630 | "outputs": [ 631 | { 632 | "data": { 633 | "text/plain": [ 634 | "array([[0, 0, 0, 0, 0],\n", 635 | " [1, 1, 1, 1, 1],\n", 636 | " [2, 2, 2, 2, 2],\n", 637 | " [3, 3, 3, 3, 3],\n", 638 | " [4, 4, 4, 
4, 4]])" 639 | ] 640 | }, 641 | "execution_count": 22, 642 | "metadata": {}, 643 | "output_type": "execute_result" 644 | } 645 | ], 646 | "source": [ 647 | "x" 648 | ] 649 | }, 650 | { 651 | "cell_type": "code", 652 | "execution_count": 23, 653 | "metadata": { 654 | "collapsed": false 655 | }, 656 | "outputs": [ 657 | { 658 | "data": { 659 | "text/plain": [ 660 | "array([[0, 1, 2, 3, 4],\n", 661 | " [0, 1, 2, 3, 4],\n", 662 | " [0, 1, 2, 3, 4],\n", 663 | " [0, 1, 2, 3, 4],\n", 664 | " [0, 1, 2, 3, 4]])" 665 | ] 666 | }, 667 | "execution_count": 23, 668 | "metadata": {}, 669 | "output_type": "execute_result" 670 | } 671 | ], 672 | "source": [ 673 | "y" 674 | ] 675 | }, 676 | { 677 | "cell_type": "markdown", 678 | "metadata": {}, 679 | "source": [ 680 | "### `np.random.rand` & `np.random.randn`" 681 | ] 682 | }, 683 | { 684 | "cell_type": "code", 685 | "execution_count": 24, 686 | "metadata": { 687 | "collapsed": false 688 | }, 689 | "outputs": [ 690 | { 691 | "data": { 692 | "text/plain": [ 693 | "array([[ 0.33658948, 0.28564552, 0.73183017, 0.7395105 , 0.66427382],\n", 694 | " [ 0.25942094, 0.43844615, 0.48250402, 0.24063916, 0.90171053],\n", 695 | " [ 0.51114245, 0.49587249, 0.61832302, 0.71996951, 0.22064571],\n", 696 | " [ 0.38625609, 0.44313367, 0.74975323, 0.57600147, 0.80771956],\n", 697 | " [ 0.84511666, 0.6064582 , 0.62365173, 0.62766319, 0.80129396]])" 698 | ] 699 | }, 700 | "execution_count": 24, 701 | "metadata": {}, 702 | "output_type": "execute_result" 703 | } 704 | ], 705 | "source": [ 706 | "# uniform random numbers in the half-open interval [0, 1)\n", 707 | "np.random.rand(5,5)" 708 | ] 709 | }, 710 | { 711 | "cell_type": "code", 712 | "execution_count": 25, 713 | "metadata": { 714 | "collapsed": false 715 | }, 716 | "outputs": [ 717 | { 718 | "data": { 719 | "text/plain": [ 720 | "array([[ 0.65782724, 0.65168367, 0.58525852, 0.33781734, -0.00700978],\n", 721 | " [ 0.61574011, 0.59150639, -0.33797592, -0.2509655 , 0.77237429],\n", 722 | " [-0.15693266, -0.38377945, 
-0.28140147, 0.90558314, 0.25437408],\n", 723 | " [-1.136108 , 2.43964939, 0.28583627, -0.27540796, -0.57253111],\n", 724 | " [-0.79080395, 0.50525127, 2.1113386 , -0.33769711, -0.64914575]])" 725 | ] 726 | }, 727 | "execution_count": 25, 728 | "metadata": {}, 729 | "output_type": "execute_result" 730 | } 731 | ], 732 | "source": [ 733 | "# standard normal distributed random numbers\n", 734 | "np.random.randn(5,5)" 735 | ] 736 | }, 737 | { 738 | "cell_type": "markdown", 739 | "metadata": {}, 740 | "source": [ 741 | "### `np.diag`" 742 | ] 743 | }, 744 | { 745 | "cell_type": "code", 746 | "execution_count": 27, 747 | "metadata": { 748 | "collapsed": false 749 | }, 750 | "outputs": [ 751 | { 752 | "data": { 753 | "text/plain": [ 754 | "array([[1, 0, 0],\n", 755 | " [0, 2, 0],\n", 756 | " [0, 0, 3]])" 757 | ] 758 | }, 759 | "execution_count": 27, 760 | "metadata": {}, 761 | "output_type": "execute_result" 762 | } 763 | ], 764 | "source": [ 765 | "# a diagonal matrix\n", 766 | "np.diag([1,2,3])" 767 | ] 768 | }, 769 | { 770 | "cell_type": "code", 771 | "execution_count": 29, 772 | "metadata": { 773 | "collapsed": false 774 | }, 775 | "outputs": [ 776 | { 777 | "data": { 778 | "text/plain": [ 779 | "array([[0, 1, 0, 0],\n", 780 | " [0, 0, 2, 0],\n", 781 | " [0, 0, 0, 3],\n", 782 | " [0, 0, 0, 0]])" 783 | ] 784 | }, 785 | "execution_count": 29, 786 | "metadata": {}, 787 | "output_type": "execute_result" 788 | } 789 | ], 790 | "source": [ 791 | "# diagonal with offset from the main diagonal\n", 792 | "np.diag([1,2,3], k=1) " 793 | ] 794 | }, 795 | { 796 | "cell_type": "markdown", 797 | "metadata": {}, 798 | "source": [ 799 | "### `np.eye`" 800 | ] 801 | }, 802 | { 803 | "cell_type": "code", 804 | "execution_count": 50, 805 | "metadata": { 806 | "collapsed": false 807 | }, 808 | "outputs": [ 809 | { 810 | "data": { 811 | "text/plain": [ 812 | "array([[ 1., 0., 0.],\n", 813 | " [ 0., 1., 0.],\n", 814 | " [ 0., 0., 1.]])" 815 | ] 816 | }, 817 | "execution_count": 50, 818 | 
"metadata": {}, 819 | "output_type": "execute_result" 820 | } 821 | ], 822 | "source": [ 823 | "# a diagonal matrix with ones on the main diagonal\n", 824 | "np.eye(3) # 3 is the " 825 | ] 826 | }, 827 | { 828 | "cell_type": "markdown", 829 | "metadata": {}, 830 | "source": [ 831 | "### `np.zeros` and `np.ones`" 832 | ] 833 | }, 834 | { 835 | "cell_type": "code", 836 | "execution_count": 30, 837 | "metadata": { 838 | "collapsed": false 839 | }, 840 | "outputs": [ 841 | { 842 | "data": { 843 | "text/plain": [ 844 | "array([[ 0., 0., 0.],\n", 845 | " [ 0., 0., 0.],\n", 846 | " [ 0., 0., 0.]])" 847 | ] 848 | }, 849 | "execution_count": 30, 850 | "metadata": {}, 851 | "output_type": "execute_result" 852 | } 853 | ], 854 | "source": [ 855 | "np.zeros((3,3))" 856 | ] 857 | }, 858 | { 859 | "cell_type": "code", 860 | "execution_count": 31, 861 | "metadata": { 862 | "collapsed": false 863 | }, 864 | "outputs": [ 865 | { 866 | "data": { 867 | "text/plain": [ 868 | "array([[ 1., 1., 1.],\n", 869 | " [ 1., 1., 1.],\n", 870 | " [ 1., 1., 1.]])" 871 | ] 872 | }, 873 | "execution_count": 31, 874 | "metadata": {}, 875 | "output_type": "execute_result" 876 | } 877 | ], 878 | "source": [ 879 | "np.ones((3, 3))" 880 | ] 881 | }, 882 | { 883 | "cell_type": "markdown", 884 | "metadata": {}, 885 | "source": [ 886 | "### DIY" 887 | ] 888 | }, 889 | { 890 | "cell_type": "markdown", 891 | "metadata": {}, 892 | "source": [ 893 | "***Try by yourself*** the following commands:\n", 894 | "\n", 895 | " np.zeros((3,4))\n", 896 | " np.ones((3,4))\n", 897 | " np.empty((2,3))\n", 898 | " np.eye(5)\n", 899 | " np.diag(np.arange(5))\n", 900 | " np.tile(np.array([[6, 7], [8, 9]]), (2, 2))" 901 | ] 902 | }, 903 | { 904 | "cell_type": "markdown", 905 | "metadata": {}, 906 | "source": [ 907 | "## So, why is it useful then?" 
 908 | ] 909 | }, 910 | { 911 | "cell_type": "markdown", 912 | "metadata": {}, 913 | "source": [ 914 | "So far the `numpy.ndarray` looks an awful lot like a Python list (or nested list). \n", 915 | "\n", 916 | "*Why not simply use Python lists for computations instead of creating a new array type?*" 917 | ] 918 | }, 919 | { 920 | "cell_type": "markdown", 921 | "metadata": {}, 922 | "source": [ 923 | "There are several reasons:\n", 924 | "\n", 925 | "* Python lists are very general. \n", 926 | " - They can contain any kind of object. \n", 927 | " - They are dynamically typed. \n", 928 | " - They do not support mathematical functions such as matrix and dot multiplications, etc. \n", 929 | " - Implementing such functions for Python lists would not be very efficient because of the dynamic typing.\n", 930 | " \n", 931 | " \n", 932 | "* Numpy arrays are **statically typed** and **homogeneous**. \n", 933 | " - The type of the elements is determined when the array is created.\n", 934 | " \n", 935 | " \n", 936 | "* Numpy arrays are memory efficient.\n", 937 | " - Because of the static typing, mathematical functions such as multiplication and addition of `numpy` arrays can be implemented efficiently in a compiled language (C and Fortran are used)." 
938 | ] 939 | }, 940 | { 941 | "cell_type": "code", 942 | "execution_count": 51, 943 | "metadata": { 944 | "collapsed": true 945 | }, 946 | "outputs": [], 947 | "source": [ 948 | "L = range(1000)" 949 | ] 950 | }, 951 | { 952 | "cell_type": "code", 953 | "execution_count": 52, 954 | "metadata": { 955 | "collapsed": false 956 | }, 957 | "outputs": [ 958 | { 959 | "name": "stdout", 960 | "output_type": "stream", 961 | "text": [ 962 | "1000 loops, best of 3: 558 µs per loop\n" 963 | ] 964 | } 965 | ], 966 | "source": [ 967 | "%timeit [i**2 for i in L]" 968 | ] 969 | }, 970 | { 971 | "cell_type": "code", 972 | "execution_count": 53, 973 | "metadata": { 974 | "collapsed": true 975 | }, 976 | "outputs": [], 977 | "source": [ 978 | "a = np.arange(1000)" 979 | ] 980 | }, 981 | { 982 | "cell_type": "code", 983 | "execution_count": 54, 984 | "metadata": { 985 | "collapsed": false 986 | }, 987 | "outputs": [ 988 | { 989 | "name": "stdout", 990 | "output_type": "stream", 991 | "text": [ 992 | "The slowest run took 52.96 times longer than the fastest. This could mean that an intermediate result is being cached \n", 993 | "100000 loops, best of 3: 2.19 µs per loop\n" 994 | ] 995 | } 996 | ], 997 | "source": [ 998 | "%timeit a**2" 999 | ] 1000 | }, 1001 | { 1002 | "cell_type": "markdown", 1003 | "metadata": {}, 1004 | "source": [ 1005 | "## Exercises" 1006 | ] 1007 | }, 1008 | { 1009 | "cell_type": "markdown", 1010 | "metadata": {}, 1011 | "source": [ 1012 | "### Simple arrays" 1013 | ] 1014 | }, 1015 | { 1016 | "cell_type": "markdown", 1017 | "metadata": {}, 1018 | "source": [ 1019 | "* Create simple one and two dimensional arrays. First, redo the examples\n", 1020 | "from above. And then create your own.\n", 1021 | "\n", 1022 | "* Use the functions `len`, `shape` and `ndim` on some of those arrays and\n", 1023 | "observe their output." 
1024 | ] 1025 | }, 1026 | { 1027 | "cell_type": "markdown", 1028 | "metadata": {}, 1029 | "source": [ 1030 | "### Creating arrays using functions" 1031 | ] 1032 | }, 1033 | { 1034 | "cell_type": "markdown", 1035 | "metadata": {}, 1036 | "source": [ 1037 | "* Experiment with `arange`, `linspace`, `ones`, `zeros`, `eye` and `diag`.\n", 1038 | "\n", 1039 | "* Create different kinds of arrays with random numbers.\n", 1040 | "\n", 1041 | "* Try setting the seed before creating an array with random values \n", 1042 | " - *hint*: use `np.random.seed`\n", 1043 | "\n", 1044 | "* Look at the function `np.empty`. What does it do? When might this be\n", 1045 | "useful?" 1046 | ] 1047 | }, 1048 | { 1049 | "cell_type": "markdown", 1050 | "metadata": {}, 1051 | "source": [ 1052 | "# Basic Data Type" 1053 | ] 1054 | }, 1055 | { 1056 | "cell_type": "markdown", 1057 | "metadata": {}, 1058 | "source": [ 1059 | "You may have noticed that, in some instances, array elements are\n", 1060 | "displayed with a trailing dot (e.g. `2.` vs `2`). 
This is due to a\n", 1061 | "difference in the data-type used:" 1062 | ] 1063 | }, 1064 | { 1065 | "cell_type": "code", 1066 | "execution_count": 59, 1067 | "metadata": { 1068 | "collapsed": false 1069 | }, 1070 | "outputs": [ 1071 | { 1072 | "data": { 1073 | "text/plain": [ 1074 | "dtype('int64')" 1075 | ] 1076 | }, 1077 | "execution_count": 59, 1078 | "metadata": {}, 1079 | "output_type": "execute_result" 1080 | } 1081 | ], 1082 | "source": [ 1083 | "a = np.array([1, 2, 3])\n", 1084 | "a.dtype" 1085 | ] 1086 | }, 1087 | { 1088 | "cell_type": "code", 1089 | "execution_count": 60, 1090 | "metadata": { 1091 | "collapsed": false 1092 | }, 1093 | "outputs": [ 1094 | { 1095 | "data": { 1096 | "text/plain": [ 1097 | "dtype('float64')" 1098 | ] 1099 | }, 1100 | "execution_count": 60, 1101 | "metadata": {}, 1102 | "output_type": "execute_result" 1103 | } 1104 | ], 1105 | "source": [ 1106 | "b = np.array([1., 2., 3.])\n", 1107 | "b.dtype" 1108 | ] 1109 | }, 1110 | { 1111 | "cell_type": "markdown", 1112 | "metadata": {}, 1113 | "source": [ 1114 | "### Note\n", 1115 | "\n", 1116 | "Different data-types allow us to store data more compactly in memory,\n", 1117 | "but most of the time we simply work with floating point numbers. Note\n", 1118 | "that, in the example above, NumPy auto-detects the data-type from the\n", 1119 | "input." 
 1120 | ] 1121 | }, 1122 | { 1123 | "cell_type": "markdown", 1124 | "metadata": {}, 1125 | "source": [ 1126 | "You can explicitly specify which data-type you want:" 1127 | ] 1128 | }, 1129 | { 1130 | "cell_type": "code", 1131 | "execution_count": 61, 1132 | "metadata": { 1133 | "collapsed": false 1134 | }, 1135 | "outputs": [ 1136 | { 1137 | "data": { 1138 | "text/plain": [ 1139 | "dtype('float64')" 1140 | ] 1141 | }, 1142 | "execution_count": 61, 1143 | "metadata": {}, 1144 | "output_type": "execute_result" 1145 | } 1146 | ], 1147 | "source": [ 1148 | "c = np.array([1, 2, 3], dtype=float)\n", 1149 | "c.dtype" 1150 | ] 1151 | }, 1152 | { 1153 | "cell_type": "markdown", 1154 | "metadata": {}, 1155 | "source": [ 1156 | "The **default** data type of array-creation functions such as `np.ones` is floating point:" 1157 | ] 1158 | }, 1159 | { 1160 | "cell_type": "code", 1161 | "execution_count": 62, 1162 | "metadata": { 1163 | "collapsed": false 1164 | }, 1165 | "outputs": [ 1166 | { 1167 | "data": { 1168 | "text/plain": [ 1169 | "dtype('float64')" 1170 | ] 1171 | }, 1172 | "execution_count": 62, 1173 | "metadata": {}, 1174 | "output_type": "execute_result" 1175 | } 1176 | ], 1177 | "source": [ 1178 | "a = np.ones((3, 3))\n", 1179 | "a.dtype" 1180 | ] 1181 | }, 1182 | { 1183 | "cell_type": "markdown", 1184 | "metadata": {}, 1185 | "source": [ 1186 | "## Basic Data Types" 1187 | ] 1188 | }, 1189 | { 1190 | "cell_type": "markdown", 1191 | "metadata": {}, 1192 | "source": [ 1193 | " bool | This stores boolean (True or False) as a byte\n", 1194 | "\n", 1195 | " int_ | This is a platform integer (normally either int32 or int64)\n", 1196 | " int8 | This is an integer ranging from -128 to 127\n", 1197 | " int16 | This is an integer ranging from -32768 to 32767\n", 1198 | " int32 | This is an integer ranging from -2 ** 31 to 2 ** 31 - 1\n", 1199 | " int64 | This is an integer ranging from -2 ** 63 to 2 ** 63 - 1\n", 1200 | " \n", 1201 | " uint8 | This is an unsigned integer ranging from 0 to 255\n", 1202 | " uint16 | This is an 
unsigned integer ranging from 0 to 65535\n", 1203 | " uint32 | This is an unsigned integer ranging from 0 to 2 ** 32 - 1\n", 1204 | " uint64 | This is an unsigned integer ranging from 0 to 2 ** 64 - 1\n", 1205 | "\n", 1206 | " float16 | This is a half precision float with sign bit, 5 bits exponent, and 10 bits mantissa\n", 1207 | " float32 | This is a single precision float with sign bit, 8 bits exponent, and 23 bits mantissa\n", 1208 | " float64 or float | This is a double precision float with sign bit, 11 bits exponent, and 52 bits mantissa\n", 1209 | " complex64 | This is a complex number represented by two 32-bit floats (real and imaginary components)\n", 1210 | " complex128 | This is a complex number represented by two 64-bit floats (real and imaginary components)\n", 1211 | " (or complex)\n" 1212 | ] 1213 | }, 1214 | { 1215 | "cell_type": "markdown", 1216 | "metadata": {}, 1217 | "source": [ 1218 | "## Conversions and Type Casting" 1219 | ] 1220 | }, 1221 | { 1222 | "cell_type": "code", 1223 | "execution_count": 63, 1224 | "metadata": { 1225 | "collapsed": false 1226 | }, 1227 | "outputs": [ 1228 | { 1229 | "data": { 1230 | "text/plain": [ 1231 | "42.0" 1232 | ] 1233 | }, 1234 | "execution_count": 63, 1235 | "metadata": {}, 1236 | "output_type": "execute_result" 1237 | } 1238 | ], 1239 | "source": [ 1240 | "np.float64(42) # int to float" 1241 | ] 1242 | }, 1243 | { 1244 | "cell_type": "code", 1245 | "execution_count": 64, 1246 | "metadata": { 1247 | "collapsed": false 1248 | }, 1249 | "outputs": [ 1250 | { 1251 | "data": { 1252 | "text/plain": [ 1253 | "42" 1254 | ] 1255 | }, 1256 | "execution_count": 64, 1257 | "metadata": {}, 1258 | "output_type": "execute_result" 1259 | } 1260 | ], 1261 | "source": [ 1262 | "np.int8(42.0) # float to int8" 1263 | ] 1264 | }, 1265 | { 1266 | "cell_type": "code", 1267 | "execution_count": 67, 1268 | "metadata": { 1269 | "collapsed": false 1270 | }, 1271 | "outputs": [ 1272 | { 1273 | "data": { 1274 | "text/plain": [ 1275 | 
"True" 1276 | ] 1277 | }, 1278 | "execution_count": 67, 1279 | "metadata": {}, 1280 | "output_type": "execute_result" 1281 | } 1282 | ], 1283 | "source": [ 1284 | "np.bool(42) # int to bool" 1285 | ] 1286 | }, 1287 | { 1288 | "cell_type": "code", 1289 | "execution_count": 68, 1290 | "metadata": { 1291 | "collapsed": false 1292 | }, 1293 | "outputs": [ 1294 | { 1295 | "data": { 1296 | "text/plain": [ 1297 | "False" 1298 | ] 1299 | }, 1300 | "execution_count": 68, 1301 | "metadata": {}, 1302 | "output_type": "execute_result" 1303 | } 1304 | ], 1305 | "source": [ 1306 | "np.bool(0) # \"special\" int to bool" 1307 | ] 1308 | }, 1309 | { 1310 | "cell_type": "code", 1311 | "execution_count": 69, 1312 | "metadata": { 1313 | "collapsed": false 1314 | }, 1315 | "outputs": [ 1316 | { 1317 | "data": { 1318 | "text/plain": [ 1319 | "True" 1320 | ] 1321 | }, 1322 | "execution_count": 69, 1323 | "metadata": {}, 1324 | "output_type": "execute_result" 1325 | } 1326 | ], 1327 | "source": [ 1328 | "np.bool(42.0) # float to bool" 1329 | ] 1330 | }, 1331 | { 1332 | "cell_type": "code", 1333 | "execution_count": 70, 1334 | "metadata": { 1335 | "collapsed": false 1336 | }, 1337 | "outputs": [ 1338 | { 1339 | "data": { 1340 | "text/plain": [ 1341 | "1.0" 1342 | ] 1343 | }, 1344 | "execution_count": 70, 1345 | "metadata": {}, 1346 | "output_type": "execute_result" 1347 | } 1348 | ], 1349 | "source": [ 1350 | "np.float(True) # bool to float" 1351 | ] 1352 | }, 1353 | { 1354 | "cell_type": "code", 1355 | "execution_count": 71, 1356 | "metadata": { 1357 | "collapsed": false 1358 | }, 1359 | "outputs": [ 1360 | { 1361 | "data": { 1362 | "text/plain": [ 1363 | "0.0" 1364 | ] 1365 | }, 1366 | "execution_count": 71, 1367 | "metadata": {}, 1368 | "output_type": "execute_result" 1369 | } 1370 | ], 1371 | "source": [ 1372 | "np.float(False)" 1373 | ] 1374 | }, 1375 | { 1376 | "cell_type": "code", 1377 | "execution_count": 72, 1378 | "metadata": { 1379 | "collapsed": false 1380 | }, 1381 | 
"outputs": [ 1382 | { 1383 | "data": { 1384 | "text/plain": [ 1385 | "array([0, 1, 2, 3, 4, 5, 6], dtype=uint16)" 1386 | ] 1387 | }, 1388 | "execution_count": 72, 1389 | "metadata": {}, 1390 | "output_type": "execute_result" 1391 | } 1392 | ], 1393 | "source": [ 1394 | "np.arange(7, dtype=np.uint16)" 1395 | ] 1396 | }, 1397 | { 1398 | "cell_type": "code", 1399 | "execution_count": 74, 1400 | "metadata": { 1401 | "collapsed": false 1402 | }, 1403 | "outputs": [ 1404 | { 1405 | "ename": "TypeError", 1406 | "evalue": "can't convert complex to int", 1407 | "output_type": "error", 1408 | "traceback": [ 1409 | "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", 1410 | "\u001b[0;31mTypeError\u001b[0m Traceback (most recent call last)", 1411 | "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m()\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mnp\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mint\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;36m42.0\u001b[0m \u001b[0;34m+\u001b[0m \u001b[0;36m1.j\u001b[0m\u001b[0;34m)\u001b[0m \u001b[0;31m# complex to int\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", 1412 | "\u001b[0;31mTypeError\u001b[0m: can't convert complex to int" 1413 | ] 1414 | } 1415 | ], 1416 | "source": [ 1417 | "np.int(42.0 + 1.j) # complex to int" 1418 | ] 1419 | }, 1420 | { 1421 | "cell_type": "code", 1422 | "execution_count": 73, 1423 | "metadata": { 1424 | "collapsed": false 1425 | }, 1426 | "outputs": [ 1427 | { 1428 | "ename": "TypeError", 1429 | "evalue": "can't convert complex to float", 1430 | "output_type": "error", 1431 | "traceback": [ 1432 | "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", 1433 | "\u001b[0;31mTypeError\u001b[0m Traceback (most recent call last)", 1434 | "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m()\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m 
\u001b[0mnp\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mfloat\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;36m42.0\u001b[0m \u001b[0;34m+\u001b[0m \u001b[0;36m1.j\u001b[0m\u001b[0;34m)\u001b[0m \u001b[0;31m# complex to float\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", 1435 | "\u001b[0;31mTypeError\u001b[0m: can't convert complex to float" 1436 | ] 1437 | } 1438 | ], 1439 | "source": [ 1440 | "np.float(42.0 + 1.j) # complex to float" 1441 | ] 1442 | }, 1443 | { 1444 | "cell_type": "code", 1445 | "execution_count": 75, 1446 | "metadata": { 1447 | "collapsed": false 1448 | }, 1449 | "outputs": [ 1450 | { 1451 | "ename": "TypeError", 1452 | "evalue": "can't convert complex to float", 1453 | "output_type": "error", 1454 | "traceback": [ 1455 | "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", 1456 | "\u001b[0;31mTypeError\u001b[0m Traceback (most recent call last)", 1457 | "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m()\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mnp\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mfloat\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;36m42.0\u001b[0m \u001b[0;34m+\u001b[0m \u001b[0;36m0.j\u001b[0m\u001b[0;34m)\u001b[0m \u001b[0;31m# complex to float\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", 1458 | "\u001b[0;31mTypeError\u001b[0m: can't convert complex to float" 1459 | ] 1460 | } 1461 | ], 1462 | "source": [ 1463 | "np.float(42.0 + 0.j) # complex to float" 1464 | ] 1465 | }, 1466 | { 1467 | "cell_type": "code", 1468 | "execution_count": 77, 1469 | "metadata": { 1470 | "collapsed": false 1471 | }, 1472 | "outputs": [ 1473 | { 1474 | "name": "stdout", 1475 | "output_type": "stream", 1476 | "text": [ 1477 | "(42+0j)\n" 1478 | ] 1479 | } 1480 | ], 1481 | "source": [ 1482 | "cn = np.complex(42.0) # Btw, you can convert a float to a complex..\n", 1483 | "print(cn)" 1484 | ] 1485 | }, 1486 | { 1487 | "cell_type": "code", 1488 | "execution_count": 79, 1489 | "metadata": { 1490 | "collapsed": 
false 1491 | }, 1492 | "outputs": [ 1493 | { 1494 | "data": { 1495 | "text/plain": [ 1496 | "42.0" 1497 | ] 1498 | }, 1499 | "execution_count": 79, 1500 | "metadata": {}, 1501 | "output_type": "execute_result" 1502 | } 1503 | ], 1504 | "source": [ 1505 | "# Extracting the real part...\n", 1506 | "cn.real" 1507 | ] 1508 | }, 1509 | { 1510 | "cell_type": "code", 1511 | "execution_count": 80, 1512 | "metadata": { 1513 | "collapsed": false 1514 | }, 1515 | "outputs": [ 1516 | { 1517 | "data": { 1518 | "text/plain": [ 1519 | "0.0" 1520 | ] 1521 | }, 1522 | "execution_count": 80, 1523 | "metadata": {}, 1524 | "output_type": "execute_result" 1525 | } 1526 | ], 1527 | "source": [ 1528 | "# ... and the imaginary part\n", 1529 | "cn.imag" 1530 | ] 1531 | }, 1532 | { 1533 | "cell_type": "markdown", 1534 | "metadata": {}, 1535 | "source": [ 1536 | "## Numerical Types and Representation" 1537 | ] 1538 | }, 1539 | { 1540 | "cell_type": "markdown", 1541 | "metadata": {}, 1542 | "source": [ 1543 | "The **numerical dtype** of an array should be selected very carefully, as it directly affects the numerical representation of its elements, that is: \n", 1544 | "\n", 1545 | " * the number of **bytes** used; \n", 1546 | " * the *numerical range*." 1547 | ] 1548 | }, 1549 | { 1550 | "cell_type": "markdown", 1551 | "metadata": {}, 1552 | "source": [ 1553 | "So, then: **What happens if I try to represent a number that is out of range?**\n", 1554 | "\n", 1555 | "Let's have a go with **integers**, i.e., `int8` and `uint8`." 1556 | ] 1557 | }, 1558 | { 1559 | "cell_type": "code", 1560 | "execution_count": 27, 1561 | "metadata": { 1562 | "collapsed": false 1563 | }, 1564 | "outputs": [ 1565 | { 1566 | "data": { 1567 | "text/plain": [ 1568 | "array([0, 0, 0, 0], dtype=int8)" 1569 | ] 1570 | }, 1571 | "execution_count": 27, 1572 | "metadata": {}, 1573 | "output_type": "execute_result" 1574 | } 1575 | ], 1576 | "source": [ 1577 | "x = np.zeros(4, 'int8') # Integer ranging from -128 to 127\n", 1578 | "x" 
1579 | ] 1580 | }, 1581 | { 1582 | "cell_type": "code", 1583 | "execution_count": 28, 1584 | "metadata": { 1585 | "collapsed": false 1586 | }, 1587 | "outputs": [ 1588 | { 1589 | "data": { 1590 | "text/plain": [ 1591 | "array([127, 0, 0, 0], dtype=int8)" 1592 | ] 1593 | }, 1594 | "execution_count": 28, 1595 | "metadata": {}, 1596 | "output_type": "execute_result" 1597 | } 1598 | ], 1599 | "source": [ 1600 | "x[0] = 127\n", 1601 | "x" 1602 | ] 1603 | }, 1604 | { 1605 | "cell_type": "code", 1606 | "execution_count": 29, 1607 | "metadata": { 1608 | "collapsed": false 1609 | }, 1610 | "outputs": [ 1611 | { 1612 | "data": { 1613 | "text/plain": [ 1614 | "array([-128, 0, 0, 0], dtype=int8)" 1615 | ] 1616 | }, 1617 | "execution_count": 29, 1618 | "metadata": {}, 1619 | "output_type": "execute_result" 1620 | } 1621 | ], 1622 | "source": [ 1623 | "x[0] = 128\n", 1624 | "x" 1625 | ] 1626 | }, 1627 | { 1628 | "cell_type": "code", 1629 | "execution_count": 30, 1630 | "metadata": { 1631 | "collapsed": false 1632 | }, 1633 | "outputs": [ 1634 | { 1635 | "data": { 1636 | "text/plain": [ 1637 | "array([-128, -127, 0, 0], dtype=int8)" 1638 | ] 1639 | }, 1640 | "execution_count": 30, 1641 | "metadata": {}, 1642 | "output_type": "execute_result" 1643 | } 1644 | ], 1645 | "source": [ 1646 | "x[1] = 129\n", 1647 | "x" 1648 | ] 1649 | }, 1650 | { 1651 | "cell_type": "code", 1652 | "execution_count": 31, 1653 | "metadata": { 1654 | "collapsed": false 1655 | }, 1656 | "outputs": [ 1657 | { 1658 | "data": { 1659 | "text/plain": [ 1660 | "array([-128, -127, 1, 0], dtype=int8)" 1661 | ] 1662 | }, 1663 | "execution_count": 31, 1664 | "metadata": {}, 1665 | "output_type": "execute_result" 1666 | } 1667 | ], 1668 | "source": [ 1669 | "x[2] = 257 # i.e. 
(128 x 2) + 1\n", 1670 | "x" 1671 | ] 1672 | }, 1673 | { 1674 | "cell_type": "code", 1675 | "execution_count": 32, 1676 | "metadata": { 1677 | "collapsed": false 1678 | }, 1679 | "outputs": [ 1680 | { 1681 | "data": { 1682 | "text/plain": [ 1683 | "array([0, 0, 0, 0], dtype=uint8)" 1684 | ] 1685 | }, 1686 | "execution_count": 32, 1687 | "metadata": {}, 1688 | "output_type": "execute_result" 1689 | } 1690 | ], 1691 | "source": [ 1692 | "ux = np.zeros(4, 'uint8') # Integer ranging from 0 to 255\n", 1693 | "ux" 1694 | ] 1695 | }, 1696 | { 1697 | "cell_type": "code", 1698 | "execution_count": 33, 1699 | "metadata": { 1700 | "collapsed": false 1701 | }, 1702 | "outputs": [ 1703 | { 1704 | "data": { 1705 | "text/plain": [ 1706 | "array([255, 0, 1, 1], dtype=uint8)" 1707 | ] 1708 | }, 1709 | "execution_count": 33, 1710 | "metadata": {}, 1711 | "output_type": "execute_result" 1712 | } 1713 | ], 1714 | "source": [ 1715 | "ux[0] = 255\n", 1716 | "ux[1] = 256\n", 1717 | "ux[2] = 257\n", 1718 | "ux[3] = 513 # (256 x 2) + 1\n", 1719 | "ux" 1720 | ] 1721 | }, 1722 | { 1723 | "cell_type": "markdown", 1724 | "metadata": {}, 1725 | "source": [ 1726 | "## Data Type Object" 1727 | ] 1728 | }, 1729 | { 1730 | "cell_type": "markdown", 1731 | "metadata": {}, 1732 | "source": [ 1733 | "**Data type objects** are instances of the `numpy.dtype` class. \n", 1734 | "\n", 1735 | "Once again, arrays have a data type. \n", 1736 | "
\n", 1737 | "To be precise, *every element* in a NumPy array has the same data type. \n", 1738 | "\n", 1739 | "The data type object can tell you the size of a single element in bytes.\n", 1740 | "
\n", 1741 | "(**Recall**: The size in bytes is given by the `itemsize` attribute of the dtype class)" 1742 | ] 1743 | }, 1744 | { 1745 | "cell_type": "code", 1746 | "execution_count": 81, 1747 | "metadata": { 1748 | "collapsed": false 1749 | }, 1750 | "outputs": [ 1751 | { 1752 | "name": "stdout", 1753 | "output_type": "stream", 1754 | "text": [ 1755 | "a itemsize: 2\n", 1756 | "a.dtype.itemsize: 2\n" 1757 | ] 1758 | } 1759 | ], 1760 | "source": [ 1761 | "a = np.arange(7, dtype=np.uint16)\n", 1762 | "print('a itemsize: ', a.itemsize)\n", 1763 | "print('a.dtype.itemsize: ', a.dtype.itemsize)" 1764 | ] 1765 | }, 1766 | { 1767 | "cell_type": "markdown", 1768 | "metadata": {}, 1769 | "source": [ 1770 | "The dtype also exposes its `byteorder`, i.e. whether the data is stored as **big-endian** or **little-endian**." 1771 | ] 1772 | }, 1773 | { 1774 | "cell_type": "code", 1775 | "execution_count": 82, 1776 | "metadata": { 1777 | "collapsed": false 1778 | }, 1779 | "outputs": [ 1780 | { 1781 | "data": { 1782 | "text/plain": [ 1783 | "'='" 1784 | ] 1785 | }, 1786 | "execution_count": 82, 1787 | "metadata": {}, 1788 | "output_type": "execute_result" 1789 | } 1790 | ], 1791 | "source": [ 1792 | "a.dtype.byteorder" 1793 | ] 1794 | }, 1795 | { 1796 | "cell_type": "markdown", 1797 | "metadata": {}, 1798 | "source": [ 1799 | "### Note:\n", 1800 | "\n", 1801 | "**Byte Order** can be one of:\n", 1802 | "\n", 1803 | "* `=` native\n", 1804 | "* `<` little-endian\n", 1805 | "* `>` big-endian\n", 1806 | "* `|` not applicable" 1807 | ] 1808 | }, 1809 | { 1810 | "cell_type": "markdown", 1811 | "metadata": {}, 1812 | "source": [ 1813 | "### Character Codes\n", 1814 | "\n", 1815 | "Character codes are included for backward compatibility with **Numeric**. \n", 1816 | "
\n", 1817 | "Numeric is the predecessor of NumPy. Using character codes is not recommended, but they still pop up in several places. \n", 1818 | "\n", 1819 | "You should use the **dtype** objects instead. \n", 1820 | "\n", 1821 | " integer i\n", 1822 | " unsigned integer u\n", 1823 | " single precision float f\n", 1824 | " double precision float d\n", 1825 | " bool ?\n", 1826 | " complex D\n", 1827 | " string S\n", 1828 | " unicode U" 1829 | ] 1830 | }, 1831 | { 1832 | "cell_type": "markdown", 1833 | "metadata": {}, 1834 | "source": [ 1835 | "### `dtypes` properties" 1836 | ] 1837 | }, 1838 | { 1839 | "cell_type": "code", 1840 | "execution_count": 83, 1841 | "metadata": { 1842 | "collapsed": true 1843 | }, 1844 | "outputs": [], 1845 | "source": [ 1846 | "t = np.dtype('float64')" 1847 | ] 1848 | }, 1849 | { 1850 | "cell_type": "code", 1851 | "execution_count": 84, 1852 | "metadata": { 1853 | "collapsed": false 1854 | }, 1855 | "outputs": [ 1856 | { 1857 | "data": { 1858 | "text/plain": [ 1859 | "'d'" 1860 | ] 1861 | }, 1862 | "execution_count": 84, 1863 | "metadata": {}, 1864 | "output_type": "execute_result" 1865 | } 1866 | ], 1867 | "source": [ 1868 | "t.char" 1869 | ] 1870 | }, 1871 | { 1872 | "cell_type": "code", 1873 | "execution_count": 85, 1874 | "metadata": { 1875 | "collapsed": false 1876 | }, 1877 | "outputs": [ 1878 | { 1879 | "data": { 1880 | "text/plain": [ 1881 | "numpy.float64" 1882 | ] 1883 | }, 1884 | "execution_count": 85, 1885 | "metadata": {}, 1886 | "output_type": "execute_result" 1887 | } 1888 | ], 1889 | "source": [ 1890 | "t.type" 1891 | ] 1892 | }, 1893 | { 1894 | "cell_type": "code", 1895 | "execution_count": 86, 1896 | "metadata": { 1897 | "collapsed": false 1898 | }, 1899 | "outputs": [ 1900 | { 1901 | "data": { 1902 | "text/plain": [ 1903 | "'