├── .gitignore
├── DiffusionMaps_template.ipynb
├── Introduction to NumPy.ipynb
├── Introduction to Python.ipynb
├── PCA_template.ipynb
├── README.md
├── ReinforcementLearning_template.ipynb
├── data
│   ├── data1.mat
│   ├── guo.xlsx
│   ├── pca_ped_25x50.mat
│   ├── pca_toy_4d.npy
│   └── pca_toy_samples.npy
├── mllab
│   └── pca
│       ├── __init__.py
│       ├── __pycache__
│       │   ├── __init__.cpython-310.pyc
│       │   ├── __init__.cpython-35.pyc
│       │   ├── __init__.cpython-36.pyc
│       │   └── __init__.cpython-38.pyc
│       ├── hog_ref.mat
│       ├── pca_toy_colors.npy
│       └── ped.tar.gz
├── requirements.txt
├── requirements_w_version.txt
└── solution_example_pictures
    ├── Task2_10_c.png
    ├── Task2_2.png
    ├── Task2_4.png
    ├── Task2_7_a1.png
    ├── Task2_7_b1.png
    ├── Task2_9.png
    ├── Task3_3.png
    ├── Task3_6_a.png
    ├── Task3_6_b.png
    ├── Task3_7.png
    ├── Task3_8_c_example.png
    ├── Task4_2_a.png
    ├── Task4_2_b.png
    ├── Task4_3_b2.png
    ├── Task4_3b.png
    ├── Task4_4.png
    ├── Task4_4b.png
    ├── Task4_5_b1.png
    ├── Task4_5_b2.png
    ├── Task4_5_b3.png
    ├── Task4_5_b4.png
    ├── Task4_6.png
    ├── Task4_7.png
    ├── Task5_1.png
    ├── Task5_10.png
    ├── Task5_12.png
    ├── Task5_13.png
    ├── Task5_3.png
    ├── Task5_4a.png
    ├── Task5_4b.png
    ├── Task5_6.png
    ├── Task5_7.png
    ├── Task5_8a.png
    ├── Task5_8b.png
    ├── Task5_9a.png
    ├── Task5_9b.png
    ├── Task5_9c.png
    ├── Task6_3.png
    ├── Task6_4a.png
    ├── Task6_4b.png
    ├── Task6_4c.png
    ├── Task6_5.png
    ├── Task7_1a.png
    ├── Task7_1b.png
    ├── Task7_2a.png
    ├── Task7_2b.png
    ├── Task7_3a.png
    ├── Task7_3b.png
    ├── Task7_4a.png
    └── Task7_4b.png
/.gitignore:
--------------------------------------------------------------------------------
1 | .ipynb_checkpoints
2 | solutions
3 | *.gz
4 |
--------------------------------------------------------------------------------
/DiffusionMaps_template.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "markdown",
5 | "metadata": {},
6 | "source": [
7 | "# Diffusion maps for single-cell data analysis"
8 | ]
9 | },
10 | {
11 | "cell_type": "markdown",
12 | "metadata": {},
13 | "source": [
14 | "This notebook serves as a template for the tasks regarding nonlinear dimensionality reduction (chapter 6)."
15 | ]
16 | },
17 | {
18 | "cell_type": "code",
19 | "execution_count": null,
20 | "metadata": {},
21 | "outputs": [],
22 | "source": [
23 | "%matplotlib inline\n",
24 | "import time\n",
25 | "import numpy as np\n",
26 | "from sklearn import datasets\n",
27 | "from sklearn.manifold import Isomap\n",
28 | "from scipy.spatial.distance import pdist\n",
29 | "from scipy.spatial.distance import squareform\n",
30 | "from scipy.sparse.linalg import eigs\n",
31 | "from scipy.io import loadmat \n",
32 | "from pandas import read_excel\n",
33 | "from math import ceil\n",
34 | "from sklearn.decomposition import PCA\n",
35 | "from sklearn.manifold import TSNE\n",
36 | "from scipy.io import loadmat\n",
37 | "import matplotlib.pyplot as plt\n",
38 | "from mpl_toolkits.mplot3d import Axes3D\n",
39 | "from sklearn.cluster import KMeans"
40 | ]
41 | },
42 | {
43 | "cell_type": "markdown",
44 | "metadata": {},
45 | "source": [
46 | "## Introduction"
47 | ]
48 | },
49 | {
50 | "cell_type": "markdown",
51 | "metadata": {},
52 | "source": [
53 | "#### Task 5.1: Apply the Isomap algorithm to the Swissroll data set"
54 | ]
55 | },
56 | {
57 | "cell_type": "code",
58 | "execution_count": null,
59 | "metadata": {},
60 | "outputs": [],
61 | "source": [
62 | "# your code goes here\n",
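"\n",
"# A minimal sketch, not the official solution; the sample size, noise level and\n",
"# number of neighbours below are assumptions.\n",
"X, t = datasets.make_swiss_roll(n_samples=1500, noise=0.05)\n",
"\n",
"# 3D view of the Swiss roll, coloured by the position along the roll\n",
"fig = plt.figure(figsize=(10, 4))\n",
"ax = fig.add_subplot(121, projection='3d')\n",
"ax.scatter(X[:, 0], X[:, 1], X[:, 2], c=t, s=10)\n",
"\n",
"# the 2D Isomap embedding should 'unroll' the manifold\n",
"emb = Isomap(n_neighbors=10, n_components=2).fit_transform(X)\n",
"ax2 = fig.add_subplot(122)\n",
"ax2.scatter(emb[:, 0], emb[:, 1], c=t, s=10)\n",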
63 | "\n",
64 | "\n"
65 | ]
66 | },
67 | {
68 | "cell_type": "markdown",
69 | "metadata": {},
70 | "source": [
71 | "#### Task 5.2: Implement the diffusion maps algorithm"
72 | ]
73 | },
74 | {
75 | "cell_type": "markdown",
76 | "metadata": {},
77 | "source": [
78 | "It is recommended to solve this task by defining a class for diffusion maps and implementing a fit_transform function, which returns the embedding of a given data set. This standardizes the code when comparing diffusion maps with other dimensionality reduction methods from sklearn.\n",
79 | "\n",
80 | "You might want to use scipy.spatial.distance.pdist and scipy.spatial.distance.squareform to efficiently create the Gaussian kernel matrix K.\n",
81 | "scipy.sparse.linalg.eigs can compute the eigenvalues of P for you. Don't forget to sort them accordingly and to cut away possible imaginary parts (they are zero in theory, but numerically there may be small imaginary values present).\n"
82 | ]
83 | },
84 | {
85 | "cell_type": "code",
86 | "execution_count": null,
87 | "metadata": {},
88 | "outputs": [],
89 | "source": [
90 | "# your code goes here\n",
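"\n",
"# A minimal sketch of the approach described above (an illustration, not the\n",
"# official solution): Gaussian kernel via pdist/squareform, row-normalised\n",
"# transition matrix P, and its leading eigenpairs via scipy.sparse.linalg.eigs.\n",
"class DiffusionMaps:\n",
"\n",
"    def __init__(self, sigma=1.0, n_components=2):\n",
"        self.sigma = sigma\n",
"        self.n_components = n_components\n",
"\n",
"    def fit_transform(self, X):\n",
"        # Gaussian kernel matrix K from the pairwise Euclidean distances\n",
"        D = squareform(pdist(X))\n",
"        K = np.exp(-D**2 / (2 * self.sigma**2))\n",
"        # row-normalise K to obtain the transition matrix P\n",
"        P = K / K.sum(axis=1, keepdims=True)\n",
"        # largest eigenvalues/eigenvectors of P; drop the numerical imaginary parts\n",
"        evals, evecs = eigs(P, k=self.n_components + 1)\n",
"        order = np.argsort(-evals.real)\n",
"        evals, evecs = evals.real[order], evecs.real[:, order]\n",
"        # drop the trivial constant eigenvector (eigenvalue 1); one common\n",
"        # convention scales the remaining eigenvectors by their eigenvalues\n",
"        return evecs[:, 1:] * evals[1:]\n",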
91 | "\n",
92 | "\n"
93 | ]
94 | },
95 | {
96 | "cell_type": "markdown",
97 | "metadata": {},
98 | "source": [
99 | "#### Task 5.3: Perform a diffusion map analysis on the Buettner data set. \n",
100 | "\n",
101 | "After creating a suitable Dataset class we load in the data for tasks 5.3 and 5.4."
102 | ]
103 | },
104 | {
105 | "cell_type": "code",
106 | "execution_count": null,
107 | "metadata": {},
108 | "outputs": [],
109 | "source": [
110 | "class Dataset:\n",
111 | " \"\"\"\n",
112 | "    Small data container used in the later tasks\n",
113 | " \n",
114 | " Parameters\n",
115 | " ----------\n",
116 | " data: input data [n_cells, n_genes]\n",
117 | " stage_names: names of the cell embryonic stages (time points)\n",
118 | " labels: assignment of each sample to a cell stage\n",
119 | "    num_stages: number of cell stages\n",
120 | " \"\"\"\n",
121 | "\n",
122 | " def __init__(self, data, stage_names, labels):\n",
123 | " self.data = data\n",
124 | " self.stage_names = stage_names\n",
125 | " self.labels = labels\n",
126 | " self.num_stages = max(labels)+1"
127 | ]
128 | },
129 | {
130 | "cell_type": "code",
131 | "execution_count": null,
132 | "metadata": {},
133 | "outputs": [],
134 | "source": [
135 | "def load_buettner_data(): \n",
136 | "    # load the Buettner data (used in tasks 5.3 and 5.4)\n",
137 | " file = loadmat('data//data1.mat')\n",
138 | " data = file.get('in_X')\n",
139 | " data = np.array(data)\n",
140 | "\n",
141 | " labels = file.get('true_labs')\n",
142 | " labels = labels[:,0] -1\n",
143 | "\n",
144 | " stage_names = ['1', '2', '3']\n",
145 | "\n",
146 | " adata = Dataset(data, stage_names, labels)\n",
147 | " return adata"
148 | ]
149 | },
150 | {
151 | "cell_type": "code",
152 | "execution_count": null,
153 | "metadata": {},
154 | "outputs": [],
155 | "source": [
156 | "# Run Diffusion Maps on the data set and visualize the results\n",
157 | "# your code goes here\n",
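"\n",
"# Sketch only, assuming the DiffusionMaps class sketched above; the bandwidth\n",
"# sigma is an arbitrary choice, not a value from the book.\n",
"adata = load_buettner_data()\n",
"emb = DiffusionMaps(sigma=10.0, n_components=2).fit_transform(adata.data)\n",
"\n",
"for stage in range(adata.num_stages):\n",
"    mask = adata.labels == stage\n",
"    plt.scatter(emb[mask, 0], emb[mask, 1], label=adata.stage_names[stage], s=15)\n",
"plt.xlabel('DC1')\n",
"plt.ylabel('DC2')\n",
"plt.legend()\n",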
158 | "\n",
159 | "\n",
160 | "\n"
161 | ]
162 | },
163 | {
164 | "cell_type": "markdown",
165 | "metadata": {},
166 | "source": [
167 | "#### Task 5.4: Perform a PCA and Isomap analysis of the data set."
168 | ]
169 | },
170 | {
171 | "cell_type": "code",
172 | "execution_count": null,
173 | "metadata": {},
174 | "outputs": [],
175 | "source": [
176 | "# your code goes here\n",
177 | "\n",
178 | "\n",
179 | "\n"
180 | ]
181 | },
182 | {
183 | "cell_type": "markdown",
184 | "metadata": {},
185 | "source": [
186 | "## Single-cell data analysis\n",
187 | "\n",
188 | "In the following, we will apply Diffusion maps to the Guo data. To this end, we first load the data set. To familiarize yourself with the data set, take a look at the _guo.xlsx_ file in the data directory. There, you will find the necessary information:\n",
189 | "\n",
190 | "1. the input data, which is a matrix with a certain number of cells as row number and a certain number of genes as column number,\n",
191 | "2. the names of the measured genes and\n",
192 | "3. an assignment of each cell to an embryonic stage. These assignments have to be converted into numerical labels to use them for the scatter plots.\n",
193 | "\n",
194 | "### Pre-processing\n",
195 | "#### Task 5.5: Pre-process the Guo data.\n",
196 | "\n",
197 | "Take a look at the file guo.xlsx. The naming annotation in the first column refers to the embryonic stage, embryo number, and individual cell number. For example, 64C 2.7 refers to the 7th cell harvested from the 2nd embryo collected from the 64-cell stage. In the first row, you will find the names of the measured genes.\n",
198 | "In the following code, the data is cleaned and normalized according to the description in Section 5.5.2.\n"
199 | ]
200 | },
201 | {
202 | "cell_type": "code",
203 | "execution_count": null,
204 | "metadata": {},
205 | "outputs": [],
206 | "source": [
207 | "def load_guo_data():\n",
208 | " # load guo data\n",
209 | " data_frame = read_excel('data//guo.xlsx', sheet_name='Sheet1')\n",
210 | "\n",
211 | " # data\n",
212 | " adata = data_frame.to_numpy()\n",
213 | " data = adata[:,1:]\n",
214 | " embryonic_stages = adata[:,0]\n",
215 | "\n",
216 | " # genes\n",
217 | " genes_tmp = data_frame.axes[1][1:]\n",
218 | " genes_names = [genes_tmp[k] for k in range(genes_tmp.size)]\n",
219 | " \n",
220 | " \n",
221 | " # your code goes here\n",
222 | " # Remove 1-cell stage cells\n",
223 | " # Remove cells with values bigger than 28 \n",
224 | " # Normalization\n",
225 | " # Treat background expression values\n",
226 | " # Round\n",
227 | " \n",
228 | " \n",
229 | " \n",
230 | " \n",
231 | " # stage_names and creating labels\n",
232 | " stage_names = ['2C', '4C', '8C', '16C', '32C', '64C']\n",
233 | "\n",
234 | "    labels = np.array([next(i for i, sname in enumerate(stage_names) if ename.startswith(sname))\n",
235 | "                       for ename in embryonic_stages])\n",
236 | "\n",
237 | " adata = Dataset(data, stage_names, labels)\n",
238 | " return adata"
239 | ]
240 | },
241 | {
242 | "cell_type": "markdown",
243 | "metadata": {},
244 | "source": [
245 | "#### Task 5.6: Perform a Diffusion map analysis of the pre-processed Guo data."
246 | ]
247 | },
248 | {
249 | "cell_type": "code",
250 | "execution_count": null,
251 | "metadata": {},
252 | "outputs": [],
253 | "source": [
254 | "# your code goes here\n",
255 | "\n",
256 | "\n",
257 | "\n",
258 | "\n",
259 | "\n"
260 | ]
261 | },
262 | {
263 | "cell_type": "markdown",
264 | "metadata": {},
265 | "source": [
266 | "#### Task 5.7: Comparison with the un-pre-processed data.\n",
267 | "Now we just remove cells with values bigger than 28 and round the original data. Thus, we skip the cleaning and normalization steps."
268 | ]
269 | },
270 | {
271 | "cell_type": "code",
272 | "execution_count": null,
273 | "metadata": {},
274 | "outputs": [],
275 | "source": [
276 | "# your code goes here\n",
277 | "# load guo data\n",
278 | "data_frame = read_excel('data//guo.xlsx', sheet_name='Sheet1')\n",
279 | "\n",
280 | "# data\n",
281 | "adata = data_frame.to_numpy()\n",
282 | "data = adata[:,1:]\n",
283 | "embryonic_stages = adata[:,0]\n",
284 | "\n",
285 | "# genes\n",
286 | "genes_tmp = data_frame.axes[1][1:]\n",
287 | "genes_names = [genes_tmp[k] for k in range(genes_tmp.size)]\n",
288 | "\n",
289 | "# your code goes here\n",
290 | "# Remove cells with values bigger than 28 \n",
291 | "\n",
292 | "\n",
293 | "# Round\n",
294 | "\n",
295 | "\n",
296 | "\n",
297 | "\n",
298 | "# stage_names and creating labels\n",
299 | "stage_names = ['1C', '2C', '4C', '8C', '16C', '32C', '64C']\n",
300 | "\n",
301 | "labels = np.array([next(i for i, sname in enumerate(stage_names) if ename.startswith(sname))\n",
302 | "                   for ename in embryonic_stages])\n",
303 | "\n",
304 | "adata = Dataset(data, stage_names,labels)\n",
305 | "\n",
306 | "\n",
307 | "# Run diffusion maps and visualize the results\n",
308 | "# your code goes here\n"
309 | ]
310 | },
311 | {
312 | "cell_type": "markdown",
313 | "metadata": {},
314 | "source": [
315 | "The non-pre-processed Guo data is less appropriate for analysis. In particular, the 1-cell stage cells deliver a distorted picture. The branching of the PE and EPI lineages cannot be detected here."
316 | ]
317 | },
318 | {
319 | "cell_type": "markdown",
320 | "metadata": {},
321 | "source": [
322 | "### Comparison with other dimensionality reduction methods\n",
323 | "\n",
324 | "#### Task 5.8: Compare Diffusion maps with PCA and tSNE."
325 | ]
326 | },
327 | {
328 | "cell_type": "code",
329 | "execution_count": null,
330 | "metadata": {},
331 | "outputs": [],
332 | "source": [
333 | "# your code goes here\n",
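"\n",
"# Sketch only; it assumes 'adata' holds the pre-processed Guo data from\n",
"# load_guo_data(), and the t-SNE perplexity is an arbitrary choice.\n",
"X = adata.data.astype(float)\n",
"embeddings = {\n",
"    'PCA': PCA(n_components=2).fit_transform(X),\n",
"    't-SNE': TSNE(n_components=2, perplexity=30).fit_transform(X),\n",
"}\n",
"\n",
"fig, axes = plt.subplots(1, 2, figsize=(10, 4))\n",
"for ax, (name, emb) in zip(axes, embeddings.items()):\n",
"    ax.scatter(emb[:, 0], emb[:, 1], c=adata.labels, s=15)\n",
"    ax.set_title(name)\n",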
334 | "\n",
335 | "\n",
336 | "\n"
337 | ]
338 | },
339 | {
340 | "cell_type": "markdown",
341 | "metadata": {},
342 | "source": [
343 | "### Parameter selection\n",
344 | "\n",
345 | "#### Task 5.9: Bandwidth comparison\n",
346 | "We run Diffusion Maps with different bandwidth parameters sigma."
347 | ]
348 | },
349 | {
350 | "cell_type": "code",
351 | "execution_count": null,
352 | "metadata": {},
353 | "outputs": [],
354 | "source": [
355 | "# your code goes here\n",
356 | "\n",
357 | "\n",
358 | "\n",
359 | "\n"
360 | ]
361 | },
362 | {
363 | "cell_type": "markdown",
364 | "metadata": {},
365 | "source": [
366 | "#### Task 5.10: Implement the Lafon rule for $\\sigma$ and plot the embedding with the $\\sigma$ chosen by this rule."
367 | ]
368 | },
369 | {
370 | "cell_type": "code",
371 | "execution_count": null,
372 | "metadata": {},
373 | "outputs": [],
374 | "source": [
375 | "# your code goes here\n",
376 | "\n",
377 | "\n",
378 | "\n"
379 | ]
380 | },
381 | {
382 | "cell_type": "markdown",
383 | "metadata": {},
384 | "source": [
385 | "### Cell group detection\n",
386 | "\n",
387 | "Now, we want to apply spectral clustering to detect cell groups in the single-cell data.\n",
388 | "\n",
389 | "#### Task 5.11: Implement the spectral clustering algorithm using k-means with k as input."
390 | ]
391 | },
392 | {
393 | "cell_type": "code",
394 | "execution_count": null,
395 | "metadata": {},
396 | "outputs": [],
397 | "source": [
398 | "# your code goes here\n",
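"\n",
"# Sketch of one common spectral-clustering variant (an assumption, not\n",
"# necessarily the exact variant from the book): k-means on the leading\n",
"# eigenvector coordinates of the transition matrix P.\n",
"def spectral_clustering(X, k, sigma=1.0):\n",
"    D = squareform(pdist(X))\n",
"    K = np.exp(-D**2 / (2 * sigma**2))\n",
"    P = K / K.sum(axis=1, keepdims=True)\n",
"    # first k eigenvectors of P, sorted by decreasing eigenvalue\n",
"    evals, evecs = eigs(P, k=k)\n",
"    coords = evecs.real[:, np.argsort(-evals.real)]\n",
"    # cluster the points in the spectral embedding\n",
"    return KMeans(n_clusters=k, n_init=10).fit_predict(coords)\n",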
399 | "\n",
400 | "\n"
401 | ]
402 | },
403 | {
404 | "cell_type": "markdown",
405 | "metadata": {},
406 | "source": [
407 | "#### Task 5.12: Plot the first 20 eigenvalues of transition matrix $P$ for the Guo data and identify k."
408 | ]
409 | },
410 | {
411 | "cell_type": "code",
412 | "execution_count": null,
413 | "metadata": {},
414 | "outputs": [],
415 | "source": [
416 | "# your code goes here\n",
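"\n",
"# Sketch only, assuming 'adata' holds the pre-processed Guo data; the bandwidth\n",
"# sigma is an arbitrary choice. A gap in the spectrum suggests the number of clusters k.\n",
"sigma = 2.0\n",
"D = squareform(pdist(adata.data.astype(float)))\n",
"K = np.exp(-D**2 / (2 * sigma**2))\n",
"P = K / K.sum(axis=1, keepdims=True)\n",
"evals = np.sort(eigs(P, k=20, return_eigenvectors=False).real)[::-1]\n",
"plt.plot(np.arange(1, 21), evals, 'o-')\n",
"plt.xlabel('index')\n",
"plt.ylabel('eigenvalue')\n",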
417 | "\n",
418 | "\n",
419 | "\n"
420 | ]
421 | },
422 | {
423 | "cell_type": "markdown",
424 | "metadata": {},
425 | "source": [
426 | "#### Task 5.13: Perform the spectral clustering algorithm for the Guo data."
427 | ]
428 | },
429 | {
430 | "cell_type": "code",
431 | "execution_count": null,
432 | "metadata": {},
433 | "outputs": [],
434 | "source": [
435 | "# your code goes here\n",
436 | "\n",
437 | "\n",
438 | "\n"
439 | ]
440 | }
441 | ],
442 | "metadata": {
443 | "kernelspec": {
444 | "display_name": "Python 3 (ipykernel)",
445 | "language": "python",
446 | "name": "python3"
447 | },
448 | "language_info": {
449 | "codemirror_mode": {
450 | "name": "ipython",
451 | "version": 3
452 | },
453 | "file_extension": ".py",
454 | "mimetype": "text/x-python",
455 | "name": "python",
456 | "nbconvert_exporter": "python",
457 | "pygments_lexer": "ipython3",
458 | "version": "3.10.11"
459 | }
460 | },
461 | "nbformat": 4,
462 | "nbformat_minor": 4
463 | }
464 |
--------------------------------------------------------------------------------
/Introduction to NumPy.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "markdown",
5 | "metadata": {},
6 | "source": [
7 | "**A general introduction to Jupyter Notebooks**\n",
8 | "\n",
9 | "Some useful shortcuts (the full list can be seen by navigating to Help > Keyboard Shortcuts):\n",
10 | "\n",
11 | "* Enter enters the edit mode for the selected cell.\n",
12 | "* Shift + Enter evaluates the current cell.\n",
13 | "* Esc allows you to navigate between cells using the arrow keys, or K (up) and J (down).\n",
14 | "* When navigating between cells,\n",
15 | " * A inserts a cell above, B inserts one below.\n",
16 | " * D + D deletes the cell.\n",
17 | " * C copies the current cell, V pastes the copied cell below the currently selected cell.\n",
18 | " * X copies the cell and deletes it afterwards.\n",
19 | " * Y sets the cell type to code, M switches it to Markdown.\n",
20 | "* Ctrl + S saves the Notebook and creates a checkpoint. Going to File > Revert to Checkpoint you can go back in time to the contents at previous checkpoints."
21 | ]
22 | },
23 | {
24 | "cell_type": "markdown",
25 | "metadata": {},
26 | "source": [
27 | "\n",
28 | "\n",
29 | "NumPy is a package that implements scientific computing tools directly into Python. You can start using it with"
30 | ]
31 | },
32 | {
33 | "cell_type": "code",
34 | "execution_count": null,
35 | "metadata": {},
36 | "outputs": [],
37 | "source": [
38 | "import numpy as np"
39 | ]
40 | },
41 | {
42 | "cell_type": "markdown",
43 | "metadata": {},
44 | "source": [
45 | "# Working with arrays\n",
46 | "\n",
47 | "The essential new data type that NumPy introduces is the array. On the outside it is, essentially, a list of lists, designed to represent a *$k$-dimensional array* $A\\in\\mathbb{R}^{d_1 \\times d_2 \\times \\cdots \\times d_k}$. Of course, NumPy includes many built-in functions that allow one to work with arrays very efficiently.\n",
48 | "\n",
49 | "Creating an array is simple. For instance you can easily create an array of zeros with"
50 | ]
51 | },
52 | {
53 | "cell_type": "code",
54 | "execution_count": null,
55 | "metadata": {},
56 | "outputs": [],
57 | "source": [
58 | "A = np.zeros((3, 4))\n",
59 | "\n",
60 | "print(A)"
61 | ]
62 | },
63 | {
64 | "cell_type": "markdown",
65 | "metadata": {},
66 | "source": [
67 | "The array `A` will have shape $(3, 4)$, meaning that it lives in $\\mathbb{R}^{3\\times 4}$, or in practical terms that it is a list of three elements, which are themselves lists of four `float`s. The shape of any array is always a tuple of `int`s, and will determine whether we can perform some operations on the array (just as it happens with matrices). You can always check the shape of an array using `.shape` as follows:"
68 | ]
69 | },
70 | {
71 | "cell_type": "code",
72 | "execution_count": null,
73 | "metadata": {},
74 | "outputs": [],
75 | "source": [
76 | "shape = (3, 4)\n",
77 | "B = np.zeros(shape)\n",
78 | "\n",
79 | "print('The shape is', B.shape, '- Is this what we wanted?', B.shape == shape)"
80 | ]
81 | },
82 | {
83 | "cell_type": "markdown",
84 | "metadata": {},
85 | "source": [
86 | "There are other possible constructions to create new arrays: for instance, `np.ones` does the same as `np.zeros` but with ones instead (or, in general, `np.full(shape, x)` will give you an array of shape `shape` where every element has value `x`).\n",
87 | "\n",
88 | "A similar function is `np.eye(d)`, which returns a two-dimensional *identity matrix* of shape $(d, d)$, like"
89 | ]
90 | },
91 | {
92 | "cell_type": "code",
93 | "execution_count": null,
94 | "metadata": {},
95 | "outputs": [],
96 | "source": [
97 | "print(np.eye(4))"
98 | ]
99 | },
100 | {
101 | "cell_type": "markdown",
102 | "metadata": {},
103 | "source": [
104 | "Two other important constructors are `np.arange` and `np.linspace`. They are similar in the sense that they will return an array of evenly spaced values, but they are also different:\n",
105 | "\n",
106 | "- `np.arange(start, stop, step)` will return all values of `start` $ + n$ `step` in the interval $[$ `start`$, $ `stop` $)$. This is just like the `range` function we already know, but will return an array instead of a list, and does work when `step` is a `float` and not just an integer.\n",
107 | "- `np.linspace(start, stop, num, endpoint)` creates `num` equally spaced values on the half-open interval $[$ `start`$, $ `stop`$)$ (if `endpoint` is `False`) or on the closed interval $[$ `start`$, $ `stop`$]$ (if `endpoint` is `True`, which is the default). So we now know the shape that will be returned, but not the step size of the increments (although it can be trivially calculated).\n",
108 | "\n",
109 | "In general, since both functions can be used for the same purposes, it is better to use `arange` to get integers and `linspace` to get floats. Note that you can use `dtype` to make sure that your array consists of integers:"
110 | ]
111 | },
112 | {
113 | "cell_type": "code",
114 | "execution_count": null,
115 | "metadata": {},
116 | "outputs": [],
117 | "source": [
118 | "print('arange: ', np.arange(1, 10, 2.0))\n",
119 | "print('arange of ints:', np.arange(1, 10, 2.0, dtype = 'int'))\n",
120 | "print('linspace: ', np.linspace(0, 1, 9))"
121 | ]
122 | },
123 | {
124 | "cell_type": "code",
125 | "execution_count": null,
126 | "metadata": {},
127 | "outputs": [],
128 | "source": [
129 | "print(np.eye(3, dtype = 'int'))"
130 | ]
131 | },
132 | {
133 | "cell_type": "markdown",
134 | "metadata": {},
135 | "source": [
136 | "And of course, we can create an array from a standard list:"
137 | ]
138 | },
139 | {
140 | "cell_type": "code",
141 | "execution_count": null,
142 | "metadata": {},
143 | "outputs": [],
144 | "source": [
145 | "np.array([n**2 for n in range(10)])"
146 | ]
147 | },
148 | {
149 | "cell_type": "markdown",
150 | "metadata": {},
151 | "source": [
152 | "# Random number generation\n",
153 | "\n",
154 | "NumPy provides functions to generate random data drawn from specific distributions. The most typical samples are:"
155 | ]
156 | },
157 | {
158 | "cell_type": "code",
159 | "execution_count": null,
160 | "metadata": {},
161 | "outputs": [],
162 | "source": [
163 | "np.random.random(10)"
164 | ]
165 | },
166 | {
167 | "cell_type": "markdown",
168 | "metadata": {},
169 | "source": [
170 | "returns an array of `10` random numbers from a uniform distribution in the $[0,1)$ interval."
171 | ]
172 | },
173 | {
174 | "cell_type": "code",
175 | "execution_count": null,
176 | "metadata": {},
177 | "outputs": [],
178 | "source": [
179 | "np.random.randint(1, 365, 12)"
180 | ]
181 | },
182 | {
183 | "cell_type": "markdown",
184 | "metadata": {},
185 | "source": [
186 | "returns `12` random integers between `1` and `364` (the upper bound `365` is exclusive)."
187 | ]
188 | },
189 | {
190 | "cell_type": "code",
191 | "execution_count": null,
192 | "metadata": {},
193 | "outputs": [],
194 | "source": [
195 | "np.random.randn(10)"
196 | ]
197 | },
198 | {
199 | "cell_type": "markdown",
200 | "metadata": {},
201 | "source": [
202 | "returns an array of normally distributed ($\\mu = 0, \\sigma = 1$) real numbers."
203 | ]
204 | },
205 | {
206 | "cell_type": "markdown",
207 | "metadata": {},
208 | "source": [
209 | "In all cases, the number of samples can be replaced by the shape of the array that we want, as in"
210 | ]
211 | },
212 | {
213 | "cell_type": "code",
214 | "execution_count": null,
215 | "metadata": {},
216 | "outputs": [],
217 | "source": [
218 | "np.random.random((3, 4))"
219 | ]
220 | },
221 | {
222 | "cell_type": "markdown",
223 | "metadata": {},
224 | "source": [
225 | "Another random function that can be useful is `shuffle`, which will, as its name says, shuffle the elements of an array in place."
226 | ]
227 | },
228 | {
229 | "cell_type": "code",
230 | "execution_count": null,
231 | "metadata": {},
232 | "outputs": [],
233 | "source": [
234 | "N = np.arange(10)\n",
235 | "print(N)\n",
236 | "np.random.shuffle(N)\n",
237 | "print(N)"
238 | ]
239 | },
240 | {
241 | "cell_type": "markdown",
242 | "metadata": {},
243 | "source": [
244 | "If you don't want that much randomness, you can fix a seed. This means that, although the results will still look *random* (as in unpredictable), they will not change if you repeat the experiment. Note that every time you execute a random function the internal random state advances, so you have to set the seed again (or always perform the commands in the same order). So, if you evaluate the next cell, then shuffle, then set the seed again and shuffle once more, the result should not change (you should be getting `[2 9 6 4 0 3 1 7 8 5]` for the shuffled list)."
245 | ]
246 | },
247 | {
248 | "cell_type": "code",
249 | "execution_count": null,
250 | "metadata": {},
251 | "outputs": [],
252 | "source": [
253 | "np.random.seed(1)"
254 | ]
255 | },
256 | {
257 | "cell_type": "markdown",
258 | "metadata": {},
259 | "source": [
260 | "# Shapes\n",
261 | "\n",
262 | "The shape is an essential property of the array, that can however be modified easily. Consider for instance the array"
263 | ]
264 | },
265 | {
266 | "cell_type": "code",
267 | "execution_count": null,
268 | "metadata": {},
269 | "outputs": [],
270 | "source": [
271 | "A = np.arange(1, 13)\n",
272 | "print(A)"
273 | ]
274 | },
275 | {
276 | "cell_type": "markdown",
277 | "metadata": {},
278 | "source": [
279 | "We might want to get a matrix from this array. We can do so by using the `reshape` function as follows:"
280 | ]
281 | },
282 | {
283 | "cell_type": "code",
284 | "execution_count": null,
285 | "metadata": {},
286 | "outputs": [],
287 | "source": [
288 | "print(A.reshape((3, 4)))"
289 | ]
290 | },
291 | {
292 | "cell_type": "markdown",
293 | "metadata": {},
294 | "source": [
295 | "Note, however, that `A` itself has not changed, as we have only printed a *view* of `A`, and if we print it again we indeed see this."
296 | ]
297 | },
298 | {
299 | "cell_type": "code",
300 | "execution_count": null,
301 | "metadata": {},
302 | "outputs": [],
303 | "source": [
304 | "print(A)"
305 | ]
306 | },
307 | {
308 | "cell_type": "markdown",
309 | "metadata": {},
310 | "source": [
311 | "If one wants to change the shape, there are two options."
312 | ]
313 | },
314 | {
315 | "cell_type": "code",
316 | "execution_count": null,
317 | "metadata": {},
318 | "outputs": [],
319 | "source": [
320 | "B = np.copy(A)\n",
321 | "B = B.reshape((3, 4))\n",
322 | "print(B)"
323 | ]
324 | },
325 | {
326 | "cell_type": "code",
327 | "execution_count": null,
328 | "metadata": {},
329 | "outputs": [],
330 | "source": [
331 | "C = np.copy(A)\n",
332 | "C.shape = (3, 4)\n",
333 | "print(C)"
334 | ]
335 | },
336 | {
337 | "cell_type": "markdown",
338 | "metadata": {},
339 | "source": [
340 | "# Slicing arrays\n",
341 | "\n",
342 | "The advantage of working with NumPy is the ease to create other arrays and get data from an array. Consider for instance"
343 | ]
344 | },
345 | {
346 | "cell_type": "code",
347 | "execution_count": null,
348 | "metadata": {},
349 | "outputs": [],
350 | "source": [
351 | "A = np.arange(1, 21).reshape((4, 5))\n",
352 | "print(A)"
353 | ]
354 | },
355 | {
356 | "cell_type": "markdown",
357 | "metadata": {},
358 | "source": [
359 | "In the case of lists we could use the syntax `l[start:stop:step]`, and something similar happens here. Of course, we have more dimensions here, so whenever we want to take a slice across several dimensions we must specify each dimension and separate with commas. Let's see an example:"
360 | ]
361 | },
362 | {
363 | "cell_type": "code",
364 | "execution_count": null,
365 | "metadata": {},
366 | "outputs": [],
367 | "source": [
368 | "A[0]"
369 | ]
370 | },
371 | {
372 | "cell_type": "markdown",
373 | "metadata": {},
374 | "source": [
375 | "As one should expect, we get the first list. We can take more than one by doing"
376 | ]
377 | },
378 | {
379 | "cell_type": "code",
380 | "execution_count": null,
381 | "metadata": {},
382 | "outputs": [],
383 | "source": [
384 | "A[::2]"
385 | ]
386 | },
387 | {
388 | "cell_type": "markdown",
389 | "metadata": {},
390 | "source": [
391 | "which gives us the even-indexed lists. But we can slice across the result at the same time to get, for instance, the odd-indexed elements of those lists as"
392 | ]
393 | },
394 | {
395 | "cell_type": "code",
396 | "execution_count": null,
397 | "metadata": {},
398 | "outputs": [],
399 | "source": [
400 | "A[::2, 1::2]"
401 | ]
402 | },
403 | {
404 | "cell_type": "markdown",
405 | "metadata": {},
406 | "source": [
407 | "In general, you can use `:` to act as a placeholder for *all the elements in this dimension*, as in"
408 | ]
409 | },
410 | {
411 | "cell_type": "code",
412 | "execution_count": null,
413 | "metadata": {},
414 | "outputs": [],
415 | "source": [
416 | "A[:, 2]"
417 | ]
418 | },
419 | {
420 | "cell_type": "markdown",
421 | "metadata": {},
422 | "source": [
423 | "which gets you the third column. You can also choose the elements that you want by indexing with lists."
424 | ]
425 | },
426 | {
427 | "cell_type": "code",
428 | "execution_count": null,
429 | "metadata": {},
430 | "outputs": [],
431 | "source": [
432 | "print(A[[0, -1], :])\n",
433 | "print(A[[0, -1], :][:, [0, -1]])"
434 | ]
435 | },
436 | {
437 | "cell_type": "markdown",
438 | "metadata": {},
439 | "source": [
440 | "As you may notice from above, list indexing may not work as expected and require additional work. For instance, one may want the following code to work and return the *inside* of the matrix (rows 1 and 2, columns 1, 2 and 3). However..."
441 | ]
442 | },
443 | {
444 | "cell_type": "code",
445 | "execution_count": null,
446 | "metadata": {},
447 | "outputs": [],
448 | "source": [
449 | "xs, ys = [1, 2], [1, 2, 3]\n",
450 | "print(A[xs, ys])"
451 | ]
452 | },
453 | {
454 | "cell_type": "markdown",
455 | "metadata": {},
456 | "source": [
457 | "Fortunately enough, this is easy to solve using the `ix_` function to turn `xs` and `ys` into the right shapes. We get"
458 | ]
459 | },
460 | {
461 | "cell_type": "code",
462 | "execution_count": null,
463 | "metadata": {},
464 | "outputs": [],
465 | "source": [
466 | "print(A[np.ix_(xs, ys)])"
467 | ]
468 | },
469 | {
470 | "cell_type": "markdown",
471 | "metadata": {},
472 | "source": [
473 | "In general you can also use `A[B]`, where `B` is a Boolean array of the same shape."
474 | ]
475 | },
476 | {
477 | "cell_type": "code",
478 | "execution_count": null,
479 | "metadata": {},
480 | "outputs": [],
481 | "source": [
482 | "B = (A % 5 > 2)\n",
483 | "print('The Boolean array:\\n', B)\n",
484 | "print('The sliced array:\\n', A[B])"
485 | ]
486 | },
487 | {
488 | "cell_type": "markdown",
489 | "metadata": {},
490 | "source": [
491 | "Note that using Boolean slicing may return a flattened array."
492 | ]
493 | },
494 | {
495 | "cell_type": "markdown",
496 | "metadata": {},
497 | "source": [
498 | "# Working with slices\n",
499 | "\n",
500 | "Now we will apply our knowledge of slices to easily modify lists. Let us use, as an example, the Legendre polynomials. Recall that they are orthogonal in $L^2([-1, 1])$, and can be defined from the recurrence relation\n",
501 | "$$ L_0(x) = 1,\\;\\; L_1(x) = x,\\;\\; n L_n(x) = (2n-1) x L_{n-1}(x) - (n-1) L_{n-2}(x) \\text{ for } n \\geq 2.$$\n",
502 | "\n",
503 | "Moreover we can define the *integrated* Legendre polynomials by\n",
504 | "$$ \\hat{L}_n(x) = \\int_{-1}^x L_{n-1}(t)\\, dt = \\frac{1}{2n-1} (L_n(x) - L_{n-2}(x)) $$\n",
505 | "as well as the *normalized* integrated Legendre polynomials as\n",
506 | "$$ K_0(x) = \\frac{1-x}{2},\\;\\; K_1(x) = \\frac{1+x}{2}, \\;\\; K_n(x) = (-1)^n \\gamma_n \\hat{L}_n(x) $$\n",
507 | "where $\\gamma_n = \\sqrt{(2n-3)(2n-1)(2n+1)/4}$. Let us then build them. We write `N` for the number of polynomials we want to build, and `npoints` for the precision (we will, of course, calculate the values of the polynomials at a finite number of points)."
508 | ]
509 | },
510 | {
511 | "cell_type": "code",
512 | "execution_count": null,
513 | "metadata": {},
514 | "outputs": [],
515 | "source": [
516 | "N, npoints = 9, 120"
517 | ]
518 | },
519 | {
520 | "cell_type": "markdown",
521 | "metadata": {},
522 | "source": [
523 | "Of course we want those points to be equally spaced, hence it is a good moment to use a `linspace`."
524 | ]
525 | },
526 | {
527 | "cell_type": "code",
528 | "execution_count": null,
529 | "metadata": {},
530 | "outputs": [],
531 | "source": [
532 | "X = np.linspace(-1, 1, npoints)"
533 | ]
534 | },
535 | {
536 | "cell_type": "markdown",
537 | "metadata": {},
538 | "source": [
539 | "We now build a matrix for the values of $L_n$ to be stored simultaneously."
540 | ]
541 | },
542 | {
543 | "cell_type": "code",
544 | "execution_count": null,
545 | "metadata": {},
546 | "outputs": [],
547 | "source": [
548 | "L = np.zeros((N, npoints))"
549 | ]
550 | },
551 | {
552 | "cell_type": "markdown",
553 | "metadata": {},
554 | "source": [
555 | "The initial conditions allow us to create $L_0$ and $L_1$ easily:"
556 | ]
557 | },
558 | {
559 | "cell_type": "code",
560 | "execution_count": null,
561 | "metadata": {},
562 | "outputs": [],
563 | "source": [
564 | "L[0] = np.ones((npoints))\n",
565 | "L[1] = X"
566 | ]
567 | },
568 | {
569 | "cell_type": "markdown",
570 | "metadata": {},
571 | "source": [
572 | "For the recurrence relation, we now have a small loop:"
573 | ]
574 | },
575 | {
576 | "cell_type": "code",
577 | "execution_count": null,
578 | "metadata": {},
579 | "outputs": [],
580 | "source": [
581 | "for n in range(2, N):\n",
582 | " L[n] = ((2*n - 1)*X*L[n - 1] - (n - 1)*L[n - 2])/n"
583 | ]
584 | },
585 | {
586 | "cell_type": "markdown",
587 | "metadata": {},
588 | "source": [
589 | "Let us print our results using the matplotlib library:"
590 | ]
591 | },
592 | {
593 | "cell_type": "code",
594 | "execution_count": null,
595 | "metadata": {},
596 | "outputs": [],
597 | "source": [
598 | "import matplotlib.pyplot as plt"
599 | ]
600 | },
601 | {
602 | "cell_type": "code",
603 | "execution_count": null,
604 | "metadata": {},
605 | "outputs": [],
606 | "source": [
607 | "for n in range(N):\n",
608 | " plt.plot(X, L[n])"
609 | ]
610 | },
611 | {
612 | "cell_type": "markdown",
613 | "metadata": {},
614 | "source": [
615 | "Now on to `Lhat`. The procedure will be very similar. Note that, to define $\\hat{L}_1$, we add a float `1` to a vector `X`. The dimensions clearly do not match; however, NumPy understands what we mean and performs the operation on every element of `X`. This is called *broadcasting*: NumPy takes a smaller-dimensional array (in this case even a `float`) and *augments* it by turning it into a bigger shape. Implicitly this is what happens when we multiply an array by a `float` (the difference being that this seems natural, although it isn't natural for a programming language)."
616 | ]
617 | },
618 | {
619 | "cell_type": "code",
620 | "execution_count": null,
621 | "metadata": {},
622 | "outputs": [],
623 | "source": [
624 | "Lhat = np.copy(L)\n",
625 | "Lhat[1] = 0.5*(1 + X)\n",
626 | "\n",
627 | "for n in range(2, N):\n",
628 | " Lhat[n] -= L[n - 2]\n",
629 | " Lhat[n] /= 2*n - 1\n",
630 | "\n",
631 | "for n in range(1, N):\n",
632 | " plt.plot(X, Lhat[n])"
633 | ]
634 | },
635 | {
636 | "cell_type": "markdown",
637 | "metadata": {},
638 | "source": [
639 | "We finally arrive at $K_n$. This case is surprisingly simple once we define the appropriate functions: **we do not even need a loop!**"
640 | ]
641 | },
642 | {
643 | "cell_type": "code",
644 | "execution_count": null,
645 | "metadata": {},
646 | "outputs": [],
647 | "source": [
648 | "gamma = lambda n : np.sqrt(0.25*(2*n - 3)*(2*n - 1)*(2*n + 1))\n",
649 | "scale = np.array([((-1)**n)*gamma(n) for n in range(2, N)])\n",
650 | "\n",
651 | "K = np.copy(Lhat)\n",
652 | "K[0] = 0.5*(1 - X)\n",
653 | "K[2:] *= scale[:, np.newaxis]\n",
654 | "\n",
655 | "for n in range(N):\n",
656 | " plt.plot(X, K[n])"
657 | ]
658 | },
659 | {
660 | "cell_type": "markdown",
661 | "metadata": {},
662 | "source": [
663 | "Why does this work? Recall that we mentioned before how it is possible to *broadcast* arrays. In this case, since we are only doing a scaling, it is a good idea to take advantage of this. The shapes of our objects are"
664 | ]
665 | },
666 | {
667 | "cell_type": "code",
668 | "execution_count": null,
669 | "metadata": {},
670 | "outputs": [],
671 | "source": [
672 | "print(K[2:].shape)\n",
673 | "print(scale.shape)"
674 | ]
675 | },
676 | {
677 | "cell_type": "markdown",
678 | "metadata": {},
679 | "source": [
680 | "The issue is that `scale` is missing a dimension. If it was $(7, 1)$, then NumPy would understand that it has to broadcast in that dimension, and do the multiplication a total of `npoints` times. So we artificially add a dimension using `np.newaxis`, which gives us"
681 | ]
682 | },
683 | {
684 | "cell_type": "code",
685 | "execution_count": null,
686 | "metadata": {},
687 | "outputs": [],
688 | "source": [
689 | "print(scale[:, np.newaxis].shape)"
690 | ]
691 | },
692 | {
693 | "cell_type": "markdown",
694 | "metadata": {},
695 | "source": [
696 | "Note that the position of the new dimension is important! Indeed,"
697 | ]
698 | },
699 | {
700 | "cell_type": "code",
701 | "execution_count": null,
702 | "metadata": {},
703 | "outputs": [],
704 | "source": [
705 | "print(scale[np.newaxis, :].shape)"
706 | ]
707 | },
708 | {
709 | "cell_type": "markdown",
710 | "metadata": {},
711 | "source": [
712 | "will not be broadcast correctly. While this is not so important here (it will throw an error if you try it), it could be critical if, for instance, `K[2:]` were a square matrix and broadcasting could be done in both directions, since the results would be very different. Now that you know this, go back to the construction of $\\hat{L}$ and try to remove the loop. The subtraction of $L_{n-2}$ should be easy by slicing appropriately, and the division by $2n-1$ can be done by broadcasting appropriately (one possible loop-free version is sketched below)."
713 | ]
714 | },
715 | {
716 | "cell_type": "markdown",
717 | "metadata": {},
718 | "source": [
719 | "# Plotting possibilities with matplotlib\n",
720 | "\n",
721 | "Besides the standard plot function of matplotlib.pyplot, there are also functions to create, for instance, scatter plots or contour plots."
722 | ]
723 | },
724 | {
725 | "cell_type": "markdown",
726 | "metadata": {},
727 | "source": [
728 | "## Drawing random numbers and making a scatter plot\n",
729 | "\n",
730 | "Here we draw $20$ random two-dimensional vectors distributed i.i.d. according to ${\\cal N}\\left(\\left( \\begin{matrix} 0 \\\\ 0 \\end{matrix} \\right), \\left(\\begin{matrix}\n",
731 | "2 & 0 \\\\ 0 & 1 \\end{matrix}\\right)\\right)$\n",
732 | "and another $20$ distributed according to ${\\cal N}\\left(\\left( \\begin{matrix} 3 \\\\ 3 \\end{matrix} \\right), \\left(\\begin{matrix}\n",
733 | "2 & 0 \\\\ 0 & 1 \\end{matrix}\\right)\\right)$ and plot them with the help of the scatter function."
734 | ]
735 | },
736 | {
737 | "cell_type": "code",
738 | "execution_count": null,
739 | "metadata": {},
740 | "outputs": [],
741 | "source": [
742 | "mean = np.zeros(2)\n",
743 | "cov = np.array([[2,0],[0,1]])\n",
744 | "\n",
745 | "x = np.random.multivariate_normal(mean,cov,20)\n",
746 | "\n",
747 | "plt.scatter(x[:,0],x[:,1],marker='D',c='orange')\n",
748 | "\n",
749 | "y = np.random.multivariate_normal(mean + 3,cov,20)\n",
750 | "\n",
751 | "plt.scatter(y[:,0],y[:,1],marker='+',c='blue')"
752 | ]
753 | },
754 | {
755 | "cell_type": "markdown",
756 | "metadata": {},
757 | "source": [
758 | "## Defining a function to make a contour plot\n",
759 | "\n",
760 | "Next, we show how to make a contour plot of an implicitly given curve, i.e. how to plot the contour $f(x_1,x_2) = 0$ for some $f: \\mathbb{R}^2 \\to \\mathbb{R}$."
761 | ]
762 | },
763 | {
764 | "cell_type": "code",
765 | "execution_count": null,
766 | "metadata": {},
767 | "outputs": [],
768 | "source": [
769 | "def f(x):\n",
770 | " #Values x have to be passed as n times 2 array, where n is the number of different data points to be evaluated\n",
771 | " return np.power(x[:,0],3) + x[:,1]\n",
772 | "\n",
773 | "def PlotContourLine(func, value=0):\n",
774 | " #This plots the contourline func(x) = value\n",
775 | " \n",
776 | " samplenum = 100\n",
777 | " minx = -2\n",
778 | " maxx = 2\n",
779 | " miny = -2\n",
780 | " maxy = 2\n",
781 | " xrange = np.arange(minx, maxx, (maxx-minx)/samplenum)\n",
782 | " yrange = np.arange(miny, maxy, (maxy-miny)/samplenum)\n",
783 | " \n",
784 | " #This generates a two-dimensional mesh\n",
785 | " X, Y = np.meshgrid(xrange,yrange)\n",
786 | " \n",
787 | " argsForf = np.array([X.flatten(),Y.flatten()]).T\n",
788 | " Z = func(argsForf)\n",
789 | " Z = np.reshape(Z,X.shape)\n",
790 | " \n",
791 | " plt.xlim(minx, maxx)\n",
792 | " plt.ylim(miny, maxy)\n",
793 | " plt.xlabel(r'$x_1$')\n",
794 | " plt.ylabel(r'$x_2$')\n",
795 | " #plt.contour(X, Y, Z, alpha=0.5,levels=[value],linestyles='dashed',linewidths=3)\n",
796 | " Z = np.where(Z > value, 1, -1)\n",
797 | " plt.contourf(X, Y, Z, alpha=0.5)\n",
798 | "\n",
799 | "PlotContourLine(func=f,value=0)"
800 | ]
801 | },
802 | {
803 | "cell_type": "markdown",
804 | "metadata": {},
805 | "source": [
806 | "# Boolean slices and in place modification\n",
807 | "\n",
808 | "Something that we did not mention was the fact that we were able to modify array slices *in place*. This means that you can alter the original array by looking at its slices instead, and you can use Boolean slicing to modify the array. Imagine we have some random data"
809 | ]
810 | },
811 | {
812 | "cell_type": "code",
813 | "execution_count": null,
814 | "metadata": {},
815 | "outputs": [],
816 | "source": [
817 | "R = np.random.normal(0, 1, 100).reshape((10, 10))\n",
818 | "print(R[:3, :3])"
819 | ]
820 | },
821 | {
822 | "cell_type": "markdown",
823 | "metadata": {},
824 | "source": [
825 | "We instead want to make sure that all values are non-negative, by resetting to zero all negative values. We could try taking a maximum with some `np.zeros`, but a faster way is just given by slicing and correcting all negative values as"
826 | ]
827 | },
828 | {
829 | "cell_type": "code",
830 | "execution_count": null,
831 | "metadata": {},
832 | "outputs": [],
833 | "source": [
834 | "R[R < 0] = 0\n",
835 | "print(R[:3, :3])"
836 | ]
837 | },
838 | {
839 | "cell_type": "markdown",
840 | "metadata": {},
841 | "source": [
842 | "Note that one can do this for any type of slices (so doing `R[1, :] = 0` would change the first row to zero too)."
843 | ]
844 | },
845 | {
846 | "cell_type": "markdown",
847 | "metadata": {},
848 | "source": [
849 | "# Concatenation of arrays\n",
850 | "\n",
851 | "Imagine you have (several) arrays that you wish to combine. For instance, the following:"
852 | ]
853 | },
854 | {
855 | "cell_type": "code",
856 | "execution_count": null,
857 | "metadata": {},
858 | "outputs": [],
859 | "source": [
860 | "left = np.zeros((3,5))\n",
861 | "middle = np.ones((3, 1))\n",
862 | "right = 2*np.ones((3, 2))\n",
863 | "print(left)\n",
864 | "print(middle)\n",
865 | "print(right)"
866 | ]
867 | },
868 | {
869 | "cell_type": "markdown",
870 | "metadata": {},
871 | "source": [
872 | "Of course, you want to have a matrix with three rows and add the columns from the variables. NumPy provides a function for this:"
873 | ]
874 | },
875 | {
876 | "cell_type": "code",
877 | "execution_count": null,
878 | "metadata": {},
879 | "outputs": [],
880 | "source": [
881 | "np.concatenate((left, middle, right))"
882 | ]
883 | },
884 | {
885 | "cell_type": "markdown",
886 | "metadata": {},
887 | "source": [
888 | "However, by default NumPy assumes that you want to concatenate along the first dimension (i.e., that you want to add rows), so the call above fails because the numbers of columns do not match. The solution is to tell NumPy to concatenate along the second axis (`axis = 1`, since everything is zero-indexed):"
889 | ]
890 | },
891 | {
892 | "cell_type": "code",
893 | "execution_count": null,
894 | "metadata": {},
895 | "outputs": [],
896 | "source": [
897 | "np.concatenate((left, middle, right), axis = 1)"
898 | ]
899 | },
900 | {
901 | "cell_type": "markdown",
902 | "metadata": {},
903 | "source": [
904 | "Note that there are two special functions: `vstack((a, b))` concatenates `a` and `b` vertically (so along `axis = 0`), while `hstack` concatenates them horizontally (`axis = 1`). If you are working along any other dimension (or you want to use always the same function), `concatenate` allows an arbitrary axis."
905 | ]
906 | },
907 | {
908 | "cell_type": "markdown",
909 | "metadata": {},
910 | "source": [
911 | "# Basic linear algebra\n",
912 | "\n",
913 | "Aside from the convenience of working with multidimensional data directly via arrays, the big advantage of using NumPy (and SciPy) is the fact that it provides a very efficient linear algebra implementation. Let us create some arrays first."
914 | ]
915 | },
916 | {
917 | "cell_type": "code",
918 | "execution_count": null,
919 | "metadata": {},
920 | "outputs": [],
921 | "source": [
922 | "A, B, x = np.random.randint(1, 10, (5, 5)), np.random.randint(1, 10, (5, 5)), np.random.randint(1, 10, 5)\n",
923 | "\n",
924 | "print(A)\n",
925 | "print(B)\n",
926 | "print(x)"
927 | ]
928 | },
929 | {
930 | "cell_type": "markdown",
931 | "metadata": {},
932 | "source": [
933 | "The simplest operation is transposition, which is done via"
934 | ]
935 | },
936 | {
937 | "cell_type": "code",
938 | "execution_count": null,
939 | "metadata": {},
940 | "outputs": [],
941 | "source": [
942 | "print(A.T)"
943 | ]
944 | },
945 | {
946 | "cell_type": "markdown",
947 | "metadata": {},
948 | "source": [
949 | "To get the product of a matrix and a vector, one can do"
950 | ]
951 | },
952 | {
953 | "cell_type": "code",
954 | "execution_count": null,
955 | "metadata": {},
956 | "outputs": [],
957 | "source": [
958 | "np.dot(A, x)"
959 | ]
960 | },
961 | {
962 | "cell_type": "markdown",
963 | "metadata": {},
964 | "source": [
965 | "Or, even shorter,"
966 | ]
967 | },
968 | {
969 | "cell_type": "code",
970 | "execution_count": null,
971 | "metadata": {},
972 | "outputs": [],
973 | "source": [
974 | "A.dot(x)"
975 | ]
976 | },
977 | {
978 | "cell_type": "markdown",
979 | "metadata": {},
980 | "source": [
981 | "This also works with two matrices (meaning, two arrays of dimension 2), but in that case it is preferred to do"
982 | ]
983 | },
984 | {
985 | "cell_type": "code",
986 | "execution_count": null,
987 | "metadata": {},
988 | "outputs": [],
989 | "source": [
990 | "A @ B"
991 | ]
992 | },
993 | {
994 | "cell_type": "markdown",
995 | "metadata": {},
996 | "source": [
997 | "Of course the result is the same:"
998 | ]
999 | },
1000 | {
1001 | "cell_type": "code",
1002 | "execution_count": null,
1003 | "metadata": {},
1004 | "outputs": [],
1005 | "source": [
1006 | "(A @ B) == A.dot(B)"
1007 | ]
1008 | },
1009 | {
1010 | "cell_type": "markdown",
1011 | "metadata": {},
1012 | "source": [
1013 | "It is also possible to calculate the Hadamard/element-wise product, with"
1014 | ]
1015 | },
1016 | {
1017 | "cell_type": "code",
1018 | "execution_count": null,
1019 | "metadata": {},
1020 | "outputs": [],
1021 | "source": [
1022 | "np.multiply(A, B)"
1023 | ]
1024 | },
1025 | {
1026 | "cell_type": "markdown",
1027 | "metadata": {},
1028 | "source": [
1029 | "For more advanced uses, the eigenvalues can be calculated with"
1030 | ]
1031 | },
1032 | {
1033 | "cell_type": "code",
1034 | "execution_count": null,
1035 | "metadata": {},
1036 | "outputs": [],
1037 | "source": [
1038 | "np.linalg.eigvals(A)"
1039 | ]
1040 | },
1041 | {
1042 | "cell_type": "markdown",
1043 | "metadata": {},
1044 | "source": [
1045 | "# The takeaway\n",
1046 | "\n",
1047 | "The most important thing you have to learn is that NumPy gives you very powerful tools to operate on arrays. So, whenever possible, you should use NumPy functions and slices instead of `for` loops. And while the saving may not be important now, once the examples start getting bigger (for instance, a $1200 \\times 50 \\times 100 \\times 3$ array for a list of 50$\\times$100 px images in color), vectorization will be the only way to perform the tasks in a reasonable time."
1048 | ]
1049 | },
1050 | {
1051 | "cell_type": "markdown",
1052 | "metadata": {},
1053 | "source": [
1054 | "## Vectorization runtime test\n",
1055 | "Let us construct a 10000-element array with random integers between 0 and 99. We want to count the number of times that the sum of two consecutive values in the array is larger than 100. To this end, we implement two functions to do so: one function using classical non-vectorized python code and one function using slicing/vectorization."
1056 | ]
1057 | },
1058 | {
1059 | "cell_type": "code",
1060 | "execution_count": null,
1061 | "metadata": {},
1062 | "outputs": [],
1063 | "source": [
1064 | "z = np.random.randint(100, size=10000)\n",
1065 | "\n",
1066 | "#1. With a for-loop and elementwise access (classical python, non-vectorized)\n",
1067 | "def checkSumNonVectorized(z):\n",
1068 | " counter = 0\n",
1069 | " for i in range(z.size-1):\n",
1070 | " if (z[i] + z[i+1] > 100):\n",
1071 | " counter += 1\n",
1072 | " return counter\n",
1073 | "\n",
1074 | "#2. Without for loop and with operations on the whole array (numpy style, vectorized, much more efficient!)\n",
1075 | "def checkSumVectorized(z):\n",
1076 | " counter = np.count_nonzero(z[:-1] + z[1:] > 100)\n",
1077 | " return counter\n",
1078 | "\n",
1079 | "print(\"Number of times that z[i] + z[i+1] > 100: \" + str(checkSumNonVectorized(z)))\n",
1080 | "print(\"Number of times that z[i] + z[i+1] > 100: \" + str(checkSumVectorized(z)))"
1081 | ]
1082 | },
1083 | {
1084 | "cell_type": "markdown",
1085 | "metadata": {},
1086 | "source": [
1087 | "Let us check how much faster the vectorized variant really is."
1088 | ]
1089 | },
1090 | {
1091 | "cell_type": "code",
1092 | "execution_count": null,
1093 | "metadata": {},
1094 | "outputs": [],
1095 | "source": [
1096 | "from timeit import timeit\n",
1097 | "setup = 'from __main__ import checkSumNonVectorized, checkSumVectorized, z; import numpy as np'\n",
1098 | "t1 = timeit('checkSumNonVectorized(z)', setup=setup, number = 100)\n",
1099 | "t2 = timeit('checkSumVectorized(z)', setup=setup, number = 100)\n",
1100 | "print('The vectorized variant is {:0.0f} times faster than the non-vectorized one'.format(t1 / t2))"
1101 | ]
1102 | }
1103 | ],
1104 | "metadata": {
1105 | "kernelspec": {
1106 | "display_name": "Python 3 (ipykernel)",
1107 | "language": "python",
1108 | "name": "python3"
1109 | },
1110 | "language_info": {
1111 | "codemirror_mode": {
1112 | "name": "ipython",
1113 | "version": 3
1114 | },
1115 | "file_extension": ".py",
1116 | "mimetype": "text/x-python",
1117 | "name": "python",
1118 | "nbconvert_exporter": "python",
1119 | "pygments_lexer": "ipython3",
1120 | "version": "3.8.3"
1121 | }
1122 | },
1123 | "nbformat": 4,
1124 | "nbformat_minor": 4
1125 | }
1126 |
--------------------------------------------------------------------------------
/Introduction to Python.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "markdown",
5 | "metadata": {},
6 | "source": [
7 | "**A general introduction to Jupyter Notebooks**\n",
8 | "\n",
9 | "Some useful shortcuts (the full list can be seen by navigating to Help > Keyboard Shortcuts):\n",
10 | "\n",
11 | "* Enter enters the edit mode for the selected cell.\n",
12 | "* Shift + Enter evaluates the current cell.\n",
13 | "* Esc allows you to navigate between cells using the arrow keys, or K (up) and J (down).\n",
14 | "* When navigating between cells,\n",
15 | " * A inserts a cell above, B inserts one below.\n",
16 | " * D + D deletes the cell.\n",
17 | " * C copies the current cell, V pastes the copied cell below the currently selected cell.\n",
18 | " * X copies the cell and deletes it afterwards.\n",
19 | " * Y sets the cell type to code, M switches it to Markdown.\n",
20 | "* Ctrl + S saves the Notebook and creates a checkpoint. Going to File > Revert to Checkpoint you can go back in time to the contents at previous checkpoints."
21 | ]
22 | },
23 | {
24 | "cell_type": "markdown",
25 | "metadata": {},
26 | "source": [
27 | "\n",
28 | "\n",
29 | "Python is a general-purpose programming language. It has many properties and qualities; you can find some of them in *The Zen of Python*:"
30 | ]
31 | },
32 | {
33 | "cell_type": "code",
34 | "execution_count": 1,
35 | "metadata": {},
36 | "outputs": [
37 | {
38 | "name": "stdout",
39 | "output_type": "stream",
40 | "text": [
41 | "The Zen of Python, by Tim Peters\n",
42 | "\n",
43 | "Beautiful is better than ugly.\n",
44 | "Explicit is better than implicit.\n",
45 | "Simple is better than complex.\n",
46 | "Complex is better than complicated.\n",
47 | "Flat is better than nested.\n",
48 | "Sparse is better than dense.\n",
49 | "Readability counts.\n",
50 | "Special cases aren't special enough to break the rules.\n",
51 | "Although practicality beats purity.\n",
52 | "Errors should never pass silently.\n",
53 | "Unless explicitly silenced.\n",
54 | "In the face of ambiguity, refuse the temptation to guess.\n",
55 | "There should be one-- and preferably only one --obvious way to do it.\n",
56 | "Although that way may not be obvious at first unless you're Dutch.\n",
57 | "Now is better than never.\n",
58 | "Although never is often better than *right* now.\n",
59 | "If the implementation is hard to explain, it's a bad idea.\n",
60 | "If the implementation is easy to explain, it may be a good idea.\n",
61 | "Namespaces are one honking great idea -- let's do more of those!\n"
62 | ]
63 | }
64 | ],
65 | "source": [
66 | "import this"
67 | ]
68 | },
69 | {
70 | "cell_type": "markdown",
71 | "metadata": {},
72 | "source": [
73 | "# 1. Types\n",
74 | "\n",
75 | "Python has several built-in data types, we will go over the ones we will use more often.\n",
76 | "\n",
77 | "## 1.1. Boolean types\n",
78 | "\n",
79 | "As expected, the possible Boolean types are `True` and `False`. The classical laws of Boolean algebra hold for the operations, which are given by `not`, `and` and `or`:"
80 | ]
81 | },
82 | {
83 | "cell_type": "code",
84 | "execution_count": 2,
85 | "metadata": {},
86 | "outputs": [
87 | {
88 | "name": "stdout",
89 | "output_type": "stream",
90 | "text": [
91 | "Not A: False\n",
92 | "Not B: True\n",
93 | "A and B: False\n",
94 | "A or B: True\n"
95 | ]
96 | }
97 | ],
98 | "source": [
99 | "A, B = True, False\n",
100 | "\n",
101 | "print('Not A: ', not A)\n",
102 | "print('Not B: ', not B)\n",
103 | "print('A and B:', A and B)\n",
104 | "print('A or B: ', A or B)"
105 | ]
106 | },
107 | {
108 | "cell_type": "markdown",
109 | "metadata": {},
110 | "source": [
111 | "Note that Python has some relaxed requirements for Boolean values. The following values are `False`:\n",
112 | "\n",
113 | "* `False`.\n",
114 | "* `None`.\n",
115 | "* `0`, `0.0` or any other numerical zero.\n",
116 | "* `[]`, `()`, `''` or any other empty structure.\n",
117 | "\n",
118 | "Everything else is `True`."
119 | ]
120 | },
121 | {
122 | "cell_type": "code",
123 | "execution_count": 3,
124 | "metadata": {},
125 | "outputs": [
126 | {
127 | "name": "stdout",
128 | "output_type": "stream",
129 | "text": [
130 | "Of course this will be printed.\n",
131 | "This works too, but shouldn't be surprising\n",
132 | "And the above is also True!\n"
133 | ]
134 | }
135 | ],
136 | "source": [
137 | "if True:\n",
138 | " print('Of course this will be printed.')\n",
139 | "if 1:\n",
140 | " print(\"This works too, but shouldn't be surprising\")\n",
141 | "if 'The quick brown fox jumps over the lazy dog' and [2, 0, 1, 9]:\n",
142 | " print(\"And the above is also True!\")\n",
143 | "if False:\n",
144 | " print('Clearly this will not be printed...')\n",
145 | "if [] or () or '':\n",
146 | " print('... and neither will this.')"
147 | ]
148 | },
149 | {
150 | "cell_type": "markdown",
151 | "metadata": {},
152 | "source": [
153 | "## 1.2. Numeric types\n",
154 | "\n",
155 | "The most important numeric types for us are `int` (integers) and `float` (real numbers). Python also natively supports complex numbers, but we do not care about this.\n",
156 | "\n",
157 | "Whenever you define a number by just writing it down, Python will assume it is an integer if it has no decimal part. To make sure it gets treated as a float, you can just append a decimal dot (without needing to add trailing zeroes). For example,"
158 | ]
159 | },
160 | {
161 | "cell_type": "code",
162 | "execution_count": 4,
163 | "metadata": {},
164 | "outputs": [
165 | {
166 | "name": "stdout",
167 | "output_type": "stream",
168 | "text": [
169 | "Is integer an int? True\n",
170 | "Is integer a float? False\n"
171 | ]
172 | }
173 | ],
174 | "source": [
175 | "integer = 1\n",
176 | "\n",
177 | "print('Is integer an int?', isinstance(integer, int))\n",
178 | "print('Is integer a float?', isinstance(integer, float))"
179 | ]
180 | },
181 | {
182 | "cell_type": "code",
183 | "execution_count": 5,
184 | "metadata": {},
185 | "outputs": [
186 | {
187 | "name": "stdout",
188 | "output_type": "stream",
189 | "text": [
190 | "Is realnumber an int? False\n",
191 | "Is realnumber a float? True\n"
192 | ]
193 | }
194 | ],
195 | "source": [
196 | "realnumber = 1.\n",
197 | "\n",
198 | "print('Is realnumber an int?', isinstance(realnumber, int))\n",
199 | "print('Is realnumber a float?', isinstance(realnumber, float))"
200 | ]
201 | },
202 | {
203 | "cell_type": "markdown",
204 | "metadata": {},
205 | "source": [
206 | "Of course you can operate an `int` with a `float` without issues, and the result will be converted to the more *general* type (in this case, `float`)."
207 | ]
208 | },
209 | {
210 | "cell_type": "code",
211 | "execution_count": 6,
212 | "metadata": {},
213 | "outputs": [
214 | {
215 | "name": "stdout",
216 | "output_type": "stream",
217 | "text": [
218 | "2.0\n"
219 | ]
220 | }
221 | ],
222 | "source": [
223 | "print(integer + realnumber)"
224 | ]
225 | },
226 | {
227 | "cell_type": "markdown",
228 | "metadata": {},
229 | "source": [
230 | "Aside from the expected operations between numbers (`a + b`, `a - b`, `a * b`, `a / b`), Python offers some additional and useful functions out-of-the-box:\n",
231 | "\n",
232 | "* `a ** b` gives $a^b$.\n",
233 | "* `a // b` gives $\\lfloor a/b\\rfloor$. `a % b` gives the result of $a \\: (\\text{mod } b)$. Therefore `b * (a // b) + a % b == a`.\n",
234 | "* Integers can be converted to floats with `float(n)`, and similarly floats can be converted to integers using `float(x)`.\n",
235 | "\n",
236 | "\n",
237 | "\n"
238 | ]
239 | },
240 | {
241 | "cell_type": "code",
242 | "execution_count": 7,
243 | "metadata": {},
244 | "outputs": [
245 | {
246 | "name": "stdout",
247 | "output_type": "stream",
248 | "text": [
249 | "True\n"
250 | ]
251 | }
252 | ],
253 | "source": [
254 | "a, b = 78, 13\n",
255 | "print(b * (a // b) + a % b == a)"
256 | ]
257 | },
258 | {
259 | "cell_type": "code",
260 | "execution_count": 8,
261 | "metadata": {},
262 | "outputs": [
263 | {
264 | "name": "stdout",
265 | "output_type": "stream",
266 | "text": [
267 | "1.0\n",
268 | "1\n"
269 | ]
270 | }
271 | ],
272 | "source": [
273 | "print(float(integer))\n",
274 | "print(int(realnumber))"
275 | ]
276 | },
277 | {
278 | "cell_type": "markdown",
279 | "metadata": {},
280 | "source": [
281 | "Numerical values can also be compared, giving raise to Boolean values. The possible operators are `==` for equality, `!=1` for inequality and `>`, `<`, `>=` and `<=` for the obvious operators."
282 | ]
283 | },
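{
"cell_type": "markdown",
"metadata": {},
"source": [
"For instance, a quick illustrative check (feel free to run it in a code cell):\n",
"\n",
"```python\n",
"x, y = 3, 5\n",
"x != y    # True\n",
"x >= y    # False\n",
"```"
]
},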
284 | {
285 | "cell_type": "markdown",
286 | "metadata": {},
287 | "source": [
288 | "## 1.3. Sequential types\n",
289 | "\n",
290 | "That is, structures that behave like sequences. We will look at lists and tuples. The main difference between them is that tuples are **not mutable**, meaning that once they have been defined it is not possible to change the value of its elements.\n",
291 | "\n",
292 | "Lists are defined using square brackets `[ .. ]`, whereas tuples are defined with parenthesis `( .. )`. Note that they can contain not just numbers as elements: Boolean values, other lists and tuples, strings or dictionaries are also allowed."
293 | ]
294 | },
295 | {
296 | "cell_type": "code",
297 | "execution_count": 9,
298 | "metadata": {},
299 | "outputs": [],
300 | "source": [
301 | "l = [1, 1, 2, 3, 5, 8, 13]\n",
302 | "t = (-0.618033988749895, 1.61803398874989)"
303 | ]
304 | },
305 | {
306 | "cell_type": "markdown",
307 | "metadata": {},
308 | "source": [
309 | "Note that there is a *usage* difference between them. While the list `l` gives just the first Fibbonacci numbers, the tuple `t` represents the roots of the polynomial $x^2 - x - 1$. Of course, `l` could be extended by adding numbers (or modified in general, for instance by changing the first value $F_0$ and the rest of them accordingly). However, it doesn't make sense to modify the tuple `t`: the roots are not going to change in any way or form.\n",
310 | "\n",
311 | "Some operations can be done on both lists and tuples. For instance, it is possible to check membership by using `in`:"
312 | ]
313 | },
314 | {
315 | "cell_type": "code",
316 | "execution_count": 10,
317 | "metadata": {},
318 | "outputs": [
319 | {
320 | "name": "stdout",
321 | "output_type": "stream",
322 | "text": [
323 | "True\n"
324 | ]
325 | }
326 | ],
327 | "source": [
328 | "phi = 1.61803398874989\n",
329 | "print(phi in t)"
330 | ]
331 | },
332 | {
333 | "cell_type": "code",
334 | "execution_count": 11,
335 | "metadata": {},
336 | "outputs": [
337 | {
338 | "name": "stdout",
339 | "output_type": "stream",
340 | "text": [
341 | "True\n"
342 | ]
343 | }
344 | ],
345 | "source": [
346 | "print(1 in l)"
347 | ]
348 | },
349 | {
350 | "cell_type": "markdown",
351 | "metadata": {},
352 | "source": [
353 | "Something useful is knowing how long your sequence is, which can be found with `len`:"
354 | ]
355 | },
356 | {
357 | "cell_type": "code",
358 | "execution_count": 12,
359 | "metadata": {},
360 | "outputs": [
361 | {
362 | "name": "stdout",
363 | "output_type": "stream",
364 | "text": [
365 | "List l has 7 elements\n",
366 | "Tuple t has 2 elements\n"
367 | ]
368 | }
369 | ],
370 | "source": [
371 | "print('List l has', len(l), 'elements')\n",
372 | "print('Tuple t has', len(t), 'elements')"
373 | ]
374 | },
375 | {
376 | "cell_type": "markdown",
377 | "metadata": {},
378 | "source": [
379 | "A somewhat unique example of lists are ranges. They allow you to get a range of integers fast and easily. Let us start with a simple example:"
380 | ]
381 | },
382 | {
383 | "cell_type": "code",
384 | "execution_count": 13,
385 | "metadata": {},
386 | "outputs": [
387 | {
388 | "name": "stdout",
389 | "output_type": "stream",
390 | "text": [
391 | "True\n"
392 | ]
393 | }
394 | ],
395 | "source": [
396 | "long_list = [0, 3, 6, 9, 12, 15, 18]\n",
397 | "print(long_list == list(range(0, 20, 3)))"
398 | ]
399 | },
400 | {
401 | "cell_type": "markdown",
402 | "metadata": {},
403 | "source": [
404 | "The syntax is simply\n",
405 | "\n",
406 | "````python\n",
407 | "range(start, end, step)\n",
408 | "````\n",
409 | "\n",
410 | "(see below for an explanation of what these terms mean, although they are rather self-explaining). Note that, in general, there is no need to use `list(range(...))`, this is used here just for the comparison (since the result of `range` is actually a different data type)."
411 | ]
412 | },
413 | {
414 | "cell_type": "code",
415 | "execution_count": 14,
416 | "metadata": {},
417 | "outputs": [
418 | {
419 | "data": {
420 | "text/plain": [
421 | "True"
422 | ]
423 | },
424 | "execution_count": 14,
425 | "metadata": {},
426 | "output_type": "execute_result"
427 | }
428 | ],
429 | "source": [
430 | "0 in range(0, 20, 3)"
431 | ]
432 | },
433 | {
434 | "cell_type": "markdown",
435 | "metadata": {},
436 | "source": [
437 | "We can also choose specific elements of `l` and `t` by giving their indices. Recall that both of them are $0$-indexed, meaning that the first term will be the zeroth. To acces the $i$-th element, just use `l[i]`. If $i$ is negative, the search will start from end to beginning, meaning that `l[-1]` is the last element of the sequence `l`. More complex slices can be taken: from the code\n",
438 | "\n",
439 | " l[start:end:step]\n",
440 | "\n",
441 | "we will get the elements of `l` at positions `start`, `start`$ + $ `step`, `start` $ + 2$ `step`, etc. until we reach `end`. Note that `l[end]` will not be included.\n",
442 | "\n",
443 | "By default, `start` will be zero, and `step` will be one. It is possible to take negative values for `step`, which would mean the list is traversed from right to left. `end` will be in a way $\\text{sign}($ `step` $)\\cdot\\infty$ (meaning that it will go on for as long as possible until it reaches the end of the list)."
444 | ]
445 | },
446 | {
447 | "cell_type": "code",
448 | "execution_count": 15,
449 | "metadata": {},
450 | "outputs": [
451 | {
452 | "name": "stdout",
453 | "output_type": "stream",
454 | "text": [
455 | "[1, 2, 5]\n",
456 | "-0.618033988749895\n"
457 | ]
458 | }
459 | ],
460 | "source": [
461 | "print(l[0:-1:2])\n",
462 | "print(t[0])"
463 | ]
464 | },
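{
"cell_type": "markdown",
"metadata": {},
"source": [
"As mentioned above, a negative `step` traverses the sequence from right to left. A quick sketch with the list `l` defined earlier:\n",
"\n",
"```python\n",
"l[::-1]    # the whole list reversed: [13, 8, 5, 3, 2, 1, 1]\n",
"```"
]
},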
465 | {
466 | "cell_type": "markdown",
467 | "metadata": {},
468 | "source": [
469 | "One can now see that, if `Z` was somehow a list of all the integers such that `Z[n]` is $n$ for all $n\\in\\mathbb{Z}$, then we should expect\n",
470 | "\n",
471 | "````python\n",
472 | "range(start, end, step) == Z[start:end:step]\n",
473 | "````"
474 | ]
475 | },
476 | {
477 | "cell_type": "markdown",
478 | "metadata": {},
479 | "source": [
480 | "An interesting way to define lists is using *list comprehension*. The best way to understand them is with an example:"
481 | ]
482 | },
483 | {
484 | "cell_type": "code",
485 | "execution_count": 16,
486 | "metadata": {},
487 | "outputs": [
488 | {
489 | "name": "stdout",
490 | "output_type": "stream",
491 | "text": [
492 | "[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]\n"
493 | ]
494 | }
495 | ],
496 | "source": [
497 | "squares = [a ** 2 for a in range(10)]\n",
498 | "\n",
499 | "print(squares)"
500 | ]
501 | },
502 | {
503 | "cell_type": "markdown",
504 | "metadata": {},
505 | "source": [
506 | "So the code\n",
507 | "\n",
508 | "````python\n",
509 | "[f(a) for a in list]\n",
510 | "````\n",
511 | "\n",
512 | "returns the list\n",
513 | "\n",
514 | "````python\n",
515 | "[f(list[0]), f(list[1]), ..., f(list[len(list)])]\n",
516 | "````\n",
517 | "\n",
518 | "One can extend this by adding conditions for the `a` we take from `list`, as in the following:"
519 | ]
520 | },
521 | {
522 | "cell_type": "code",
523 | "execution_count": 17,
524 | "metadata": {},
525 | "outputs": [
526 | {
527 | "name": "stdout",
528 | "output_type": "stream",
529 | "text": [
530 | "[0, 4, 16, 36, 64]\n"
531 | ]
532 | }
533 | ],
534 | "source": [
535 | "even_squares = [a for a in squares if a % 2 == 0]\n",
536 | "print(even_squares)"
537 | ]
538 | },
539 | {
540 | "cell_type": "markdown",
541 | "metadata": {},
542 | "source": [
543 | "Of course, multiple (and more complex) conditions can be created by the combined use of `and` and `or`.\n",
544 | "\n",
545 | "Recall that we have said that the difference between lists and tuples is that the former can be modified. Indeed, the following example shows it very clearly. Consider the following list and tuple:"
546 | ]
547 | },
548 | {
549 | "cell_type": "code",
550 | "execution_count": 18,
551 | "metadata": {},
552 | "outputs": [],
553 | "source": [
554 | "a, b, c = 1, 2, 3\n",
555 | "\n",
556 | "t = (a, b, c)\n",
557 | "l = [a, b, c]"
558 | ]
559 | },
560 | {
561 | "cell_type": "markdown",
562 | "metadata": {},
563 | "source": [
564 | "We can modify `l` by just asigning a new value to one of the list elements:"
565 | ]
566 | },
567 | {
568 | "cell_type": "code",
569 | "execution_count": 19,
570 | "metadata": {},
571 | "outputs": [
572 | {
573 | "name": "stdout",
574 | "output_type": "stream",
575 | "text": [
576 | "['a new value', 2, 3]\n"
577 | ]
578 | }
579 | ],
580 | "source": [
581 | "l[0] = 'a new value'\n",
582 | "\n",
583 | "print(l)"
584 | ]
585 | },
586 | {
587 | "cell_type": "markdown",
588 | "metadata": {},
589 | "source": [
590 | "But for tuples this doesn't work, and Python complains instead:"
591 | ]
592 | },
593 | {
594 | "cell_type": "code",
595 | "execution_count": 20,
596 | "metadata": {},
597 | "outputs": [
598 | {
599 | "ename": "TypeError",
600 | "evalue": "'tuple' object does not support item assignment",
601 | "output_type": "error",
602 | "traceback": [
603 | "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
604 | "\u001b[0;31mTypeError\u001b[0m Traceback (most recent call last)",
605 | "\u001b[0;32m/tmp/ipykernel_38826/1131666323.py\u001b[0m in \u001b[0;36m\u001b[0;34m\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mt\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;36m0\u001b[0m\u001b[0;34m]\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0;34m'tuples are immutable'\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m",
606 | "\u001b[0;31mTypeError\u001b[0m: 'tuple' object does not support item assignment"
607 | ]
608 | }
609 | ],
610 | "source": [
611 | "t[0] = 'tuples are immutable'"
612 | ]
613 | },
614 | {
615 | "cell_type": "markdown",
616 | "metadata": {},
617 | "source": [
618 | "We can go further by adding elements to `l` at the end using `append`:"
619 | ]
620 | },
621 | {
622 | "cell_type": "code",
623 | "execution_count": 21,
624 | "metadata": {},
625 | "outputs": [
626 | {
627 | "name": "stdout",
628 | "output_type": "stream",
629 | "text": [
630 | "['a new value', 2, 3, 'this goes last']\n"
631 | ]
632 | }
633 | ],
634 | "source": [
635 | "l.append('this goes last')\n",
636 | "\n",
637 | "print(l)"
638 | ]
639 | },
640 | {
641 | "cell_type": "markdown",
642 | "metadata": {},
643 | "source": [
644 | "And even remove some of the values using `del l[i]` (other functions with similar purpose exist: see `pop` or `remove`):"
645 | ]
646 | },
647 | {
648 | "cell_type": "code",
649 | "execution_count": 22,
650 | "metadata": {},
651 | "outputs": [
652 | {
653 | "name": "stdout",
654 | "output_type": "stream",
655 | "text": [
656 | "['a new value', 'this goes last']\n"
657 | ]
658 | }
659 | ],
660 | "source": [
661 | "del l[1:3]\n",
662 | "\n",
663 | "print(l)"
664 | ]
665 | },
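{
"cell_type": "markdown",
"metadata": {},
"source": [
"For completeness, a short sketch of the `remove` and `pop` methods mentioned above (using a fresh list `m` so that it does not depend on the current state of `l`):\n",
"\n",
"```python\n",
"m = [3, 1, 4, 1, 5]\n",
"m.remove(1)        # deletes the first occurrence of the value 1\n",
"last = m.pop()     # removes and returns the last element (here 5)\n",
"print(m, last)     # [3, 4, 1] 5\n",
"```"
]
},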
666 | {
667 | "cell_type": "markdown",
668 | "metadata": {},
669 | "source": [
670 | "## 1.4. Dictionaries\n",
671 | "\n",
672 | "Dictionaries store data as (key, value) pairs. The keys are (unique) strings or numbers (note that two numbers that Python considers equal, i.e. such that `a == b` returns `True`, are the same key), and the values can be any data type. They are defined using curly braces `{ .. }`. The pairs must be separated by commas `,`. Let us work with the following example dictionary:"
673 | ]
674 | },
675 | {
676 | "cell_type": "code",
677 | "execution_count": 23,
678 | "metadata": {},
679 | "outputs": [],
680 | "source": [
681 | "oeis = {\n",
682 | " 'A000000': [0, 1, 1, 1, 2, 1, 2, 1, 5, 2],\n",
683 | " 'A000001': [1, 2, 2, 1, 1, 2, 1, 2, 2, 1],\n",
684 | " 'A000002': [1, 1, 1, 1, 2, 2, 1, 2, 2, 2],\n",
685 | " 'A000003': [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],\n",
686 | " 'A000004': [1, 2, 2, 3, 2, 4, 2, 4, 3, 4],\n",
687 | "}"
688 | ]
689 | },
690 | {
691 | "cell_type": "markdown",
692 | "metadata": {},
693 | "source": [
694 | "The value for a specific key can now be retrieved by using `oeis[key]`, for example as"
695 | ]
696 | },
697 | {
698 | "cell_type": "code",
699 | "execution_count": 24,
700 | "metadata": {},
701 | "outputs": [
702 | {
703 | "data": {
704 | "text/plain": [
705 | "[1, 1, 1, 1, 2, 2, 1, 2, 2, 2]"
706 | ]
707 | },
708 | "execution_count": 24,
709 | "metadata": {},
710 | "output_type": "execute_result"
711 | }
712 | ],
713 | "source": [
714 | "oeis['A000002']"
715 | ]
716 | },
717 | {
718 | "cell_type": "markdown",
719 | "metadata": {},
720 | "source": [
721 | "Whenever the dictionary is *unknown*, one can retrieve the keys and the values by using `oeis.keys()` and `oeis.values()`. This two functions will return a custom type, which can be turned into a list using `list(..)`. Note however that by default they allow you to do the standard operations which may be of interest: finding the length using `len`, checking membership using `in` and iterating over it using `for _ in`. Note that the keys and values will be returned in the same order."
722 | ]
723 | },
724 | {
725 | "cell_type": "code",
726 | "execution_count": 25,
727 | "metadata": {},
728 | "outputs": [
729 | {
730 | "data": {
731 | "text/plain": [
732 | "5"
733 | ]
734 | },
735 | "execution_count": 25,
736 | "metadata": {},
737 | "output_type": "execute_result"
738 | }
739 | ],
740 | "source": [
741 | "len(oeis.values())"
742 | ]
743 | },
744 | {
745 | "cell_type": "code",
746 | "execution_count": 26,
747 | "metadata": {},
748 | "outputs": [
749 | {
750 | "data": {
751 | "text/plain": [
752 | "True"
753 | ]
754 | },
755 | "execution_count": 26,
756 | "metadata": {},
757 | "output_type": "execute_result"
758 | }
759 | ],
760 | "source": [
761 | "'A000002' in oeis.keys()"
762 | ]
763 | },
764 | {
765 | "cell_type": "markdown",
766 | "metadata": {},
767 | "source": [
768 | "Sometimes it may be desirable to access all pairs of keys and values in the dictionary at the same time. This is possible with the method `oeis.items()`, which will return a list of tuples `(key, value)`:"
769 | ]
770 | },
771 | {
772 | "cell_type": "code",
773 | "execution_count": 27,
774 | "metadata": {},
775 | "outputs": [
776 | {
777 | "name": "stdout",
778 | "output_type": "stream",
779 | "text": [
780 | "The sequence A000000 starts with 0\n",
781 | "The sequence A000001 starts with 1\n",
782 | "The sequence A000002 starts with 1\n",
783 | "The sequence A000003 starts with 0\n",
784 | "The sequence A000004 starts with 1\n"
785 | ]
786 | }
787 | ],
788 | "source": [
789 | "for (key, value) in oeis.items():\n",
790 | " print('The sequence', key, 'starts with', value[0])"
791 | ]
792 | },
793 | {
794 | "cell_type": "markdown",
795 | "metadata": {},
796 | "source": [
797 | "New values can be added to the dictionary (and existing ones can be altered) by just providing the corresponding key:"
798 | ]
799 | },
800 | {
801 | "cell_type": "code",
802 | "execution_count": 28,
803 | "metadata": {},
804 | "outputs": [
805 | {
806 | "name": "stdout",
807 | "output_type": "stream",
808 | "text": [
809 | "{'A000000': [0, 1, 1, 1, 2, 1, 2, 1, 5, 2], 'A000001': [1, 2, 2, 1, 1, 2, 1, 2, 2, 1], 'A000002': [1, 1, 1, 1, 2, 2, 1, 2, 2, 2], 'A000003': [0, 0, 0, 0, 0, 0, 0, 0, 0, 0], 'A000004': [1, 2, 2, 3, 2, 4, 2, 4, 3, 4], 'A000006': [1, 1, 2, 2, 3, 3, 4, 4, 4, 5]}\n"
810 | ]
811 | }
812 | ],
813 | "source": [
814 | "oeis['A000006'] = [1, 1, 2, 2, 3, 3, 4, 4, 4, 5]\n",
815 | "print(oeis)"
816 | ]
817 | },
818 | {
819 | "cell_type": "markdown",
820 | "metadata": {},
821 | "source": [
822 | "# 2. Advanced statements \n",
823 | "\n",
824 | "Loops allow you to do similar work several times in a row according to some conditions. In Python, just like in any other modern language, one may easily use the `for` loop, the `if` conditional and the `while` loop.\n",
825 | "\n",
826 | "## 2.1. The `for` loop\n",
827 | "\n",
828 | "The syntax for the for loop in Python is\n",
829 | "\n",
830 | "```python\n",
831 | "for element in list:\n",
832 | " function(element)\n",
833 | "```\n",
834 | "\n",
835 | "This tells Python to go over every `element` in `list` and, for each one, perform a given action. Of course, at every step you can use the current `element` for whatever you want. For instance, the following short code prints the square of the first 10 natural numbers."
836 | ]
837 | },
838 | {
839 | "cell_type": "code",
840 | "execution_count": 29,
841 | "metadata": {},
842 | "outputs": [
843 | {
844 | "name": "stdout",
845 | "output_type": "stream",
846 | "text": [
847 | "0\n",
848 | "1\n",
849 | "4\n",
850 | "9\n",
851 | "16\n",
852 | "25\n",
853 | "36\n",
854 | "49\n",
855 | "64\n",
856 | "81\n"
857 | ]
858 | }
859 | ],
860 | "source": [
861 | "for n in range(10):\n",
862 | " print(n**2)"
863 | ]
864 | },
865 | {
866 | "cell_type": "markdown",
867 | "metadata": {},
868 | "source": [
869 | "Of course, we take zero to be the first natural number.\n",
870 | "\n",
871 | "Note that in the previous example we iterated over a `range`, which is not exactly a list. Still, Python understands this and the snippet works fine. In general, one may use a `for` loop over any iterator:\n",
872 | "\n",
873 | "- Ranges\n",
874 | "- Lists (including strings, as an case of list of `char`s)\n",
875 | "- Tuples\n",
876 | "- Dictionaries (the iteration will happen over the *keys*)\n",
877 | "\n",
878 | "## 2.2. The `while` loop\n",
879 | "\n",
880 | "The behaviour is similar to the case of `for`, only in this case the loop will continue not until we run out of elements to iterate, but until its argument becomes false. The syntax is\n",
881 | "\n",
882 | "```python\n",
883 | "while boolean_value:\n",
884 | " function()\n",
885 | "```\n",
886 | "\n",
887 | "For example, the following snippet gives us the first natural number such that $e^n > M$."
888 | ]
889 | },
890 | {
891 | "cell_type": "code",
892 | "execution_count": 30,
893 | "metadata": {},
894 | "outputs": [
895 | {
896 | "name": "stdout",
897 | "output_type": "stream",
898 | "text": [
899 | "24 26489122129.84347\n"
900 | ]
901 | }
902 | ],
903 | "source": [
904 | "import math\n",
905 | "M, n = 1e10, 0\n",
906 | "while math.exp(n) <= M:\n",
907 | " n = n + 1\n",
908 | "print(n, math.exp(n))"
909 | ]
910 | },
911 | {
912 | "cell_type": "markdown",
913 | "metadata": {},
914 | "source": [
915 | "## 2.3. The `if` statement\n",
916 | "\n",
917 | "As in other languages, `if`/`else` statements allow you to perform actions based on logic. In Python this is achieved using the syntax\n",
918 | "\n",
919 | "```python\n",
920 | "if condition_one:\n",
921 | " first_function()\n",
922 | "elif condition_two:\n",
923 | " second_function()\n",
924 | "elif condition_three:\n",
925 | " third_function()\n",
926 | "else:\n",
927 | " last_option()\n",
928 | "```\n",
929 | "\n",
930 | "Python will first check whether `condition_one` is `True`. If it is, it will perform the `first_function` and, afterwards, skip the rest. If it is not, it will then check if `condition_two`, and keep doing that for any additional `elif` statement. Finally, if all conditions turn out to be `False`, it will do whatever happens in the `else` part.\n",
931 | "\n",
932 | "Note that one may have a lone `if` condition without any `elif` or `else` clause."
933 | ]
934 | },
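{
"cell_type": "markdown",
"metadata": {},
"source": [
"A small illustrative example:\n",
"\n",
"```python\n",
"n = 7\n",
"if n % 2 == 0:\n",
"    print('n is even')\n",
"elif n % 3 == 0:\n",
"    print('n is odd but divisible by three')\n",
"else:\n",
"    print('n is neither even nor divisible by three')    # this branch runs for n = 7\n",
"```"
]
},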
935 | {
936 | "cell_type": "markdown",
937 | "metadata": {},
938 | "source": [
939 | "## 2.4. Further control\n",
940 | "\n",
941 | "There are other functions that allow you to get even more control over the flow of the program.\n",
942 | "\n",
943 | "### 2.4.1. Continuing\n",
944 | "\n",
945 | "While inside a `for` loop, the command `continue` mandates to finish the current iteration and jump straight to the next one."
946 | ]
947 | },
948 | {
949 | "cell_type": "code",
950 | "execution_count": 31,
951 | "metadata": {},
952 | "outputs": [
953 | {
954 | "name": "stdout",
955 | "output_type": "stream",
956 | "text": [
957 | "1\n",
958 | "3\n",
959 | "5\n",
960 | "7\n",
961 | "9\n"
962 | ]
963 | }
964 | ],
965 | "source": [
966 | "for element in range(10):\n",
967 | " if element % 2 == 0:\n",
968 | " continue\n",
969 | " print(element)"
970 | ]
971 | },
972 | {
973 | "cell_type": "markdown",
974 | "metadata": {},
975 | "source": [
976 | "### 2.4.2. Breaking\n",
977 | "\n",
978 | "When inside a `for` or a `while` loop, using `break` stops the looping process. For instance, one might want to find the first element in a list satisfying a given condition."
979 | ]
980 | },
981 | {
982 | "cell_type": "code",
983 | "execution_count": 32,
984 | "metadata": {},
985 | "outputs": [
986 | {
987 | "name": "stdout",
988 | "output_type": "stream",
989 | "text": [
990 | "Checking for 1\n",
991 | "Checking for 2\n",
992 | "Checking for 4\n",
993 | "Checking for 5\n",
994 | "Checking for 7\n",
995 | "Checking for 8\n",
996 | "Checking for 9\n",
997 | "Found: 9\n"
998 | ]
999 | }
1000 | ],
1001 | "source": [
1002 | "l = [1, 2, 4, 5, 7, 8, 9, 11, 12, 13, 16]\n",
1003 | "\n",
1004 | "for element in l:\n",
1005 | " print('Checking for', element)\n",
1006 | " if element % 3 == 0:\n",
1007 | " print('Found:', element)\n",
1008 | " break"
1009 | ]
1010 | },
1011 | {
1012 | "cell_type": "markdown",
1013 | "metadata": {},
1014 | "source": [
1015 | "Note that using `break` will stop the innermost loop. So, if there are nested loops, only the last one to start will be finished.\n",
1016 | "\n",
1017 | "### 2.4.3. Passing\n",
1018 | "\n",
1019 | "Using `pass` does essentially nothing, but it may be useful as a placeholder for some other function that has to be added."
1020 | ]
1021 | },
1022 | {
1023 | "cell_type": "code",
1024 | "execution_count": 33,
1025 | "metadata": {},
1026 | "outputs": [],
1027 | "source": [
1028 | "for element in range(30):\n",
1029 | " if element % 7 == 0:\n",
1030 | " pass # Do something else here instead"
1031 | ]
1032 | },
1033 | {
1034 | "cell_type": "markdown",
1035 | "metadata": {},
1036 | "source": [
1037 | "### 2.4.4. Asserting\n",
1038 | "\n",
1039 | "The `assert` command forces its argument to be break, otherwise it will raise an error. This is useful when a program raises an error and you want to find it: adding assertions will tell us when what should happen is not happening. The syntax is simple, and allows for a custom error message to be added:\n",
1040 | "\n",
1041 | "```python\n",
1042 | "message_error = 'Oops, something went wrong!'\n",
1043 | "assert condition, message_error\n",
1044 | "```\n",
1045 | "\n",
1046 | "For instance, imagine we have a list of even numbers that we want to divide by two and print. It might make sense to ensure that the numbers are indeed even."
1047 | ]
1048 | },
1049 | {
1050 | "cell_type": "code",
1051 | "execution_count": 34,
1052 | "metadata": {},
1053 | "outputs": [
1054 | {
1055 | "name": "stdout",
1056 | "output_type": "stream",
1057 | "text": [
1058 | "1\n",
1059 | "3\n",
1060 | "4\n",
1061 | "5\n",
1062 | "8\n",
1063 | "13\n",
1064 | "14\n"
1065 | ]
1066 | }
1067 | ],
1068 | "source": [
1069 | "l = [2, 6, 8, 10, 16, 26, 28]\n",
1070 | "\n",
1071 | "for element in l:\n",
1072 | " assert element % 2 == 0, 'Not an even number!'\n",
1073 | " print(element//2)"
1074 | ]
1075 | },
1076 | {
1077 | "cell_type": "markdown",
1078 | "metadata": {},
1079 | "source": [
1080 | "If our list was bad, the assertion would tell us that something is wrong."
1081 | ]
1082 | },
1083 | {
1084 | "cell_type": "code",
1085 | "execution_count": 35,
1086 | "metadata": {},
1087 | "outputs": [
1088 | {
1089 | "name": "stdout",
1090 | "output_type": "stream",
1091 | "text": [
1092 | "1\n",
1093 | "3\n",
1094 | "4\n"
1095 | ]
1096 | },
1097 | {
1098 | "ename": "AssertionError",
1099 | "evalue": "Not an even number!",
1100 | "output_type": "error",
1101 | "traceback": [
1102 | "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
1103 | "\u001b[0;31mAssertionError\u001b[0m Traceback (most recent call last)",
1104 | "\u001b[0;32m/tmp/ipykernel_38826/3618698844.py\u001b[0m in \u001b[0;36m\u001b[0;34m\u001b[0m\n\u001b[1;32m 2\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 3\u001b[0m \u001b[0;32mfor\u001b[0m \u001b[0melement\u001b[0m \u001b[0;32min\u001b[0m \u001b[0ml\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m----> 4\u001b[0;31m \u001b[0;32massert\u001b[0m \u001b[0melement\u001b[0m \u001b[0;34m%\u001b[0m \u001b[0;36m2\u001b[0m \u001b[0;34m==\u001b[0m \u001b[0;36m0\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;34m'Not an even number!'\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 5\u001b[0m \u001b[0mprint\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0melement\u001b[0m\u001b[0;34m//\u001b[0m\u001b[0;36m2\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
1105 | "\u001b[0;31mAssertionError\u001b[0m: Not an even number!"
1106 | ]
1107 | }
1108 | ],
1109 | "source": [
1110 | "l = [2, 6, 8, 11, 16, 26, 28]\n",
1111 | "\n",
1112 | "for element in l:\n",
1113 | " assert element % 2 == 0, 'Not an even number!'\n",
1114 | " print(element//2)"
1115 | ]
1116 | },
1117 | {
1118 | "cell_type": "markdown",
1119 | "metadata": {},
1120 | "source": [
1121 | "### 2.4.5. Trying\n",
1122 | "\n",
1123 | "The `try` command allows you to try to execute a piece of code and provide a fallback in case the code raises an error. The most basic syntax looks like\n",
1124 | "```python\n",
1125 | "try:\n",
1126 | " check_condition(n)\n",
1127 | "except:\n",
1128 | " print('There was an error! Trying option b instead.')\n",
1129 | " b_function(n)\n",
1130 | "else:\n",
1131 | " a_function(n)\n",
1132 | "```\n",
1133 | "Note that the `except` part can have more options depending on the error raised when interpreting the code, for more information one may consult the Python documentation.\n",
1134 | "\n",
1135 | "As an example of when this could be useful, consider the following circumstance: you want a function that receives a number as input and prints its square, but if it receives a list it prints the square of every number. Then you could do"
1136 | ]
1137 | },
1138 | {
1139 | "cell_type": "code",
1140 | "execution_count": 36,
1141 | "metadata": {},
1142 | "outputs": [],
1143 | "source": [
1144 | "def print_squares(arg):\n",
1145 | " try:\n",
1146 | " xs = iter(arg)\n",
1147 | " except:\n",
1148 | " print(arg**2)\n",
1149 | " else:\n",
1150 | " for x in xs:\n",
1151 | " print_squares(x)"
1152 | ]
1153 | },
1154 | {
1155 | "cell_type": "markdown",
1156 | "metadata": {},
1157 | "source": [
1158 | "The code will first try to turn the argument `arg` into an iterator. This is only possible if it is a list, a range or a similar structure, not if it is a number. So if it is a number it will raise a `TypeError` and proceed to the `except` clause, which simply prints the square.\n",
1159 | "\n",
1160 | "If `arg` is indeed an iterator, then the process is repeated for every element inside `arg`."
1161 | ]
1162 | },
1163 | {
1164 | "cell_type": "code",
1165 | "execution_count": 37,
1166 | "metadata": {},
1167 | "outputs": [
1168 | {
1169 | "name": "stdout",
1170 | "output_type": "stream",
1171 | "text": [
1172 | "144\n"
1173 | ]
1174 | }
1175 | ],
1176 | "source": [
1177 | "print_squares(12)"
1178 | ]
1179 | },
1180 | {
1181 | "cell_type": "code",
1182 | "execution_count": 38,
1183 | "metadata": {},
1184 | "outputs": [
1185 | {
1186 | "name": "stdout",
1187 | "output_type": "stream",
1188 | "text": [
1189 | "1\n",
1190 | "4\n",
1191 | "9\n",
1192 | "25\n",
1193 | "64\n"
1194 | ]
1195 | }
1196 | ],
1197 | "source": [
1198 | "print_squares([1, 2, 3, 5, 8])"
1199 | ]
1200 | },
1201 | {
1202 | "cell_type": "code",
1203 | "execution_count": 39,
1204 | "metadata": {},
1205 | "outputs": [
1206 | {
1207 | "name": "stdout",
1208 | "output_type": "stream",
1209 | "text": [
1210 | "4\n",
1211 | "64\n",
1212 | "1024\n",
1213 | "4\n",
1214 | "9\n",
1215 | "25\n"
1216 | ]
1217 | }
1218 | ],
1219 | "source": [
1220 | "print_squares([[2, 8, 32], [2, 3, 5]])"
1221 | ]
1222 | },
1223 | {
1224 | "cell_type": "markdown",
1225 | "metadata": {},
1226 | "source": [
1227 | "### 2.4.6. Using `else` in loops\n",
1228 | "\n",
1229 | "With loops, and with the `try` statement, one may use an else clause in case the loop finishes without a break. Consider for instance the following example."
1230 | ]
1231 | },
1232 | {
1233 | "cell_type": "code",
1234 | "execution_count": 40,
1235 | "metadata": {},
1236 | "outputs": [
1237 | {
1238 | "name": "stdout",
1239 | "output_type": "stream",
1240 | "text": [
1241 | "No multiples of 6 found\n"
1242 | ]
1243 | }
1244 | ],
1245 | "source": [
1246 | "l = [1, 2, 4, 5, 7, 8, 9, 11, 15, 13, 16]\n",
1247 | "p = 6\n",
1248 | "\n",
1249 | "for element in l:\n",
1250 | " if element % p == 0:\n",
1251 | " print('Found:', element)\n",
1252 | " break\n",
1253 | "else:\n",
1254 | " print('No multiples of', p, 'found')"
1255 | ]
1256 | },
1257 | {
1258 | "cell_type": "markdown",
1259 | "metadata": {},
1260 | "source": [
1261 | "If we change the list slightly so as to trigger the `break`, the `else` part will not show up:"
1262 | ]
1263 | },
1264 | {
1265 | "cell_type": "code",
1266 | "execution_count": 41,
1267 | "metadata": {},
1268 | "outputs": [
1269 | {
1270 | "name": "stdout",
1271 | "output_type": "stream",
1272 | "text": [
1273 | "Found: 12\n"
1274 | ]
1275 | }
1276 | ],
1277 | "source": [
1278 | "l = [1, 2, 4, 5, 7, 8, 9, 11, 12, 13, 16]\n",
1279 | "p = 6\n",
1280 | "\n",
1281 | "for element in l:\n",
1282 | " if element % p == 0:\n",
1283 | " print('Found:', element)\n",
1284 | " break\n",
1285 | "else:\n",
1286 | " print('No multiples of', p, 'found')"
1287 | ]
1288 | },
1289 | {
1290 | "cell_type": "markdown",
1291 | "metadata": {},
1292 | "source": [
1293 | "# 3. Functions\n",
1294 | "\n",
1295 | "Functions are the essential part of any programming language, since they allow you to automate the workload and repeat redundant code.\n",
1296 | "\n",
1297 | "## 3.1. Definition\n",
1298 | "\n",
1299 | "The basic syntax for function definition in Python is\n",
1300 | "```python\n",
1301 | "def function_name(argument_one, argument_two, argument_three):\n",
1302 | " temp = argument_one + argument_two + argument_three\n",
1303 | " temp = temp**2\n",
1304 | " return temp\n",
1305 | "```\n",
1306 | "When called with arguments $a$, $b$ and $c$, the above function will return $(a+b+c)^2$. Note that a function can return more than one argument, by using\n",
1307 | "```python\n",
1308 | " return one, two, three\n",
1309 | "```\n",
1310 | "This is useful if, for instance, you want to return a main result but also some intermediate calculation. Note that all the variables defined inside of a function are *forgotten* and removed from memory once the function finishes execution.\n",
1311 | "\n",
1312 | "Note, moreover, than when *receiving* the return of a function, some of them can be ignored. Consider the following example:"
1313 | ]
1314 | },
1315 | {
1316 | "cell_type": "code",
1317 | "execution_count": 42,
1318 | "metadata": {},
1319 | "outputs": [
1320 | {
1321 | "name": "stdout",
1322 | "output_type": "stream",
1323 | "text": [
1324 | "(9, 3)\n"
1325 | ]
1326 | }
1327 | ],
1328 | "source": [
1329 | "def min_squared(a, b, c):\n",
1330 | " minimum = min(a, b, c)\n",
1331 | " return minimum**2, minimum\n",
1332 | "\n",
1333 | "print(min_squared(5, 3, 4))"
1334 | ]
1335 | },
1336 | {
1337 | "cell_type": "markdown",
1338 | "metadata": {},
1339 | "source": [
1340 | "If one only cares about the final result and not about the number chosen to be the minimum, one could use"
1341 | ]
1342 | },
1343 | {
1344 | "cell_type": "code",
1345 | "execution_count": 43,
1346 | "metadata": {},
1347 | "outputs": [
1348 | {
1349 | "name": "stdout",
1350 | "output_type": "stream",
1351 | "text": [
1352 | "9\n"
1353 | ]
1354 | }
1355 | ],
1356 | "source": [
1357 | "result, _ = min_squared(5, 3, 4)\n",
1358 | "print(result)"
1359 | ]
1360 | },
1361 | {
1362 | "cell_type": "markdown",
1363 | "metadata": {},
1364 | "source": [
1365 | "The underscore, `_`, acts here as a placeholder for a parameter we do not care about.\n",
1366 | "\n",
1367 | "By the way, a function does *not* need to have a `return` statement. For instance, it might simply print some stuff and do nothing more, hence there is no need to return anything.\n",
1368 | "\n",
1369 | "## 3.2. Default arguments\n",
1370 | "\n",
1371 | "When defining a function, you may add a default argument in the definition that will be used if, when the function is called, the argument does not appear. Be careful: the arguments with a default option must all appear at the end! The syntax is"
1372 | ]
1373 | },
1374 | {
1375 | "cell_type": "code",
1376 | "execution_count": 44,
1377 | "metadata": {},
1378 | "outputs": [],
1379 | "source": [
1380 | "def fibb(n, verbose = False):\n",
1381 | " a, b = 0, 1\n",
1382 | " for i in range(n):\n",
1383 | " c = a + b\n",
1384 | " a = b\n",
1385 | " b = c\n",
1386 | " if verbose:\n",
1387 | " print(b)\n",
1388 | " return b"
1389 | ]
1390 | },
1391 | {
1392 | "cell_type": "markdown",
1393 | "metadata": {},
1394 | "source": [
1395 | "When defining the arguments, ```verbose = False``` means that, whenever there is no second argument, `False` will be used by default. Now one may call the function in several ways:"
1396 | ]
1397 | },
1398 | {
1399 | "cell_type": "code",
1400 | "execution_count": 45,
1401 | "metadata": {},
1402 | "outputs": [
1403 | {
1404 | "data": {
1405 | "text/plain": [
1406 | "21"
1407 | ]
1408 | },
1409 | "execution_count": 45,
1410 | "metadata": {},
1411 | "output_type": "execute_result"
1412 | }
1413 | ],
1414 | "source": [
1415 | "fibb(7)"
1416 | ]
1417 | },
1418 | {
1419 | "cell_type": "code",
1420 | "execution_count": 46,
1421 | "metadata": {},
1422 | "outputs": [
1423 | {
1424 | "name": "stdout",
1425 | "output_type": "stream",
1426 | "text": [
1427 | "1\n",
1428 | "2\n",
1429 | "3\n",
1430 | "5\n",
1431 | "8\n",
1432 | "13\n",
1433 | "21\n"
1434 | ]
1435 | },
1436 | {
1437 | "data": {
1438 | "text/plain": [
1439 | "21"
1440 | ]
1441 | },
1442 | "execution_count": 46,
1443 | "metadata": {},
1444 | "output_type": "execute_result"
1445 | }
1446 | ],
1447 | "source": [
1448 | "fibb(7, True)"
1449 | ]
1450 | },
1451 | {
1452 | "cell_type": "code",
1453 | "execution_count": 47,
1454 | "metadata": {},
1455 | "outputs": [
1456 | {
1457 | "data": {
1458 | "text/plain": [
1459 | "21"
1460 | ]
1461 | },
1462 | "execution_count": 47,
1463 | "metadata": {},
1464 | "output_type": "execute_result"
1465 | }
1466 | ],
1467 | "source": [
1468 | "fibb(n = 7, verbose = False)"
1469 | ]
1470 | },
1471 | {
1472 | "cell_type": "markdown",
1473 | "metadata": {},
1474 | "source": [
1475 | "When working with several functions, each with several arguments, typing out the name of every argument may be a good option. While it is not immediate to check if an argument has been given, a workaround can be assigning a default value of `argument = None` and checking if it has been redefined using\n",
1476 | "```python\n",
1477 | "if argument is None\n",
1478 | "```\n",
1479 | "\n",
1480 | "## 3.3. Additional arguments\n",
1481 | "\n",
1482 | "Sometimes you may want to leave room for some extra arguments, but you don't know how many they are nor how do they look. Let us start with an example. Consider the following:"
1483 | ]
1484 | },
1485 | {
1486 | "cell_type": "code",
1487 | "execution_count": 48,
1488 | "metadata": {},
1489 | "outputs": [],
1490 | "source": [
1491 | "def zip_list(xs, ys, function):\n",
1492 | " assert len(xs) == len(ys)\n",
1493 | " return [function(xs[i], ys[i]) for i in range(len(xs))]\n",
1494 | "\n",
1495 | "def divide_by(p, q, r = 2):\n",
1496 | " return (p + q)/r\n",
1497 | "\n",
1498 | "def nth_power(x, y, n = 2):\n",
1499 | " return (x + y)**n"
1500 | ]
1501 | },
1502 | {
1503 | "cell_type": "markdown",
1504 | "metadata": {},
1505 | "source": [
1506 | "Essentially, given a couple of lists, we want to get a list of the function applied to couples of elements from the two list in the same order. The code works fine, for instance"
1507 | ]
1508 | },
1509 | {
1510 | "cell_type": "code",
1511 | "execution_count": 49,
1512 | "metadata": {},
1513 | "outputs": [
1514 | {
1515 | "name": "stdout",
1516 | "output_type": "stream",
1517 | "text": [
1518 | "[5.5, 9.0, 13.5, 18.0, 20.5]\n",
1519 | "[121, 324, 729, 1296, 1681]\n"
1520 | ]
1521 | }
1522 | ],
1523 | "source": [
1524 | "xs, ys = [1, 2, 3, 4, 5], [10, 16, 24, 32, 36]\n",
1525 | "\n",
1526 | "print(zip_list(xs, ys, divide_by))\n",
1527 | "print(zip_list(xs, ys, nth_power))"
1528 | ]
1529 | },
1530 | {
1531 | "cell_type": "markdown",
1532 | "metadata": {},
1533 | "source": [
1534 | "This works fine, but what if we want to tune the arguments of the function we apply?\n",
1535 | "\n",
1536 | "### 3.3.1.`*args`\n",
1537 | "\n",
1538 | "The `*args` command allows you to get an undefined amount of simple arguments at the end of a function. For instance,"
1539 | ]
1540 | },
1541 | {
1542 | "cell_type": "code",
1543 | "execution_count": 50,
1544 | "metadata": {},
1545 | "outputs": [],
1546 | "source": [
1547 | "def print_all_arguments(*args):\n",
1548 | " for a in args:\n",
1549 | " print(a)"
1550 | ]
1551 | },
1552 | {
1553 | "cell_type": "code",
1554 | "execution_count": 51,
1555 | "metadata": {},
1556 | "outputs": [
1557 | {
1558 | "name": "stdout",
1559 | "output_type": "stream",
1560 | "text": [
1561 | "12\n",
1562 | "34\n",
1563 | "56\n",
1564 | "78\n",
1565 | "90\n"
1566 | ]
1567 | }
1568 | ],
1569 | "source": [
1570 | "print_all_arguments(12, 34, 56, 78, 90)"
1571 | ]
1572 | },
1573 | {
1574 | "cell_type": "markdown",
1575 | "metadata": {},
1576 | "source": [
1577 | "Note that, when we want to recover the arguments, we drop the asterisk `*` and call `args` simply.\n",
1578 | "\n",
1579 | "### 3.3.2. `**kwargs`\n",
1580 | "\n",
1581 | "Maybe more importantly, the `**kwargs` command recovers all *keyworded* arguments, i.e. those of the form `key = argument`. Following the previous example, a simple modification to `zip_list` allows us to send all *extra* information to be processed to the `function`:"
1582 | ]
1583 | },
1584 | {
1585 | "cell_type": "code",
1586 | "execution_count": 52,
1587 | "metadata": {},
1588 | "outputs": [],
1589 | "source": [
1590 | "def zip_list(xs, ys, function, **kwargs):\n",
1591 | " assert len(xs) == len(ys)\n",
1592 | " return [function(xs[i], ys[i], **kwargs) for i in range(len(xs))]\n",
1593 | "\n",
1594 | "def divide_by(p, q, r = 2):\n",
1595 | " return (p + q)/r\n",
1596 | "\n",
1597 | "def nth_power(x, y, n = 2):\n",
1598 | " return (x + y)**n"
1599 | ]
1600 | },
1601 | {
1602 | "cell_type": "markdown",
1603 | "metadata": {},
1604 | "source": [
1605 | "Now we can use keyworded arguments to modify the functions."
1606 | ]
1607 | },
1608 | {
1609 | "cell_type": "code",
1610 | "execution_count": 53,
1611 | "metadata": {},
1612 | "outputs": [
1613 | {
1614 | "name": "stdout",
1615 | "output_type": "stream",
1616 | "text": [
1617 | "[1.1, 1.8, 2.7, 3.6, 4.1]\n",
1618 | "[14641, 104976, 531441, 1679616, 2825761]\n"
1619 | ]
1620 | }
1621 | ],
1622 | "source": [
1623 | "print(zip_list(xs, ys, divide_by, r = 10))\n",
1624 | "print(zip_list(xs, ys, nth_power, n = 4))"
1625 | ]
1626 | },
1627 | {
1628 | "cell_type": "markdown",
1629 | "metadata": {},
1630 | "source": [
1631 | "The arguments will now be stored in a dictionary. To see how to access them, see the following definition:"
1632 | ]
1633 | },
1634 | {
1635 | "cell_type": "code",
1636 | "execution_count": 54,
1637 | "metadata": {},
1638 | "outputs": [],
1639 | "source": [
1640 | "def print_all_kwarguments(**kwargs):\n",
1641 | " for (k, v) in kwargs.items():\n",
1642 | " print('Key:', k)\n",
1643 | " print('Value', v)"
1644 | ]
1645 | },
1646 | {
1647 | "cell_type": "code",
1648 | "execution_count": 55,
1649 | "metadata": {},
1650 | "outputs": [
1651 | {
1652 | "name": "stdout",
1653 | "output_type": "stream",
1654 | "text": [
1655 | "Key: e\n",
1656 | "Value 2.71\n",
1657 | "Key: pi\n",
1658 | "Value 3.14\n"
1659 | ]
1660 | }
1661 | ],
1662 | "source": [
1663 | "print_all_kwarguments(e = 2.71, pi = 3.14)"
1664 | ]
1665 | },
1666 | {
1667 | "cell_type": "markdown",
1668 | "metadata": {},
1669 | "source": [
1670 | "### 3.3.3. Position vs keyword\n",
1671 | "\n",
1672 | "By now you should have realized that there are two ways of passing arguments to a function (and of defining a function to accept those arguments): by position and by keyword. The (big) advantage of using keywords is that you do not need to remember the specific order of the arguments. But it also makes your code less error-prone (no strange mistakes because you swapped two numbers and didn't notice) and cleaner (easy to identify what every argument means when calling a function)."
1673 | ]
1674 | },
1675 | {
1676 | "cell_type": "code",
1677 | "execution_count": 56,
1678 | "metadata": {},
1679 | "outputs": [],
1680 | "source": [
1681 | "def exponential(base = 2, exponent = 1):\n",
1682 | " return base**exponent"
1683 | ]
1684 | },
1685 | {
1686 | "cell_type": "code",
1687 | "execution_count": 57,
1688 | "metadata": {},
1689 | "outputs": [
1690 | {
1691 | "data": {
1692 | "text/plain": [
1693 | "125"
1694 | ]
1695 | },
1696 | "execution_count": 57,
1697 | "metadata": {},
1698 | "output_type": "execute_result"
1699 | }
1700 | ],
1701 | "source": [
1702 | "exponential(exponent = 3, base = 5)"
1703 | ]
1704 | },
1705 | {
1706 | "cell_type": "markdown",
1707 | "metadata": {},
1708 | "source": [
1709 | "Of course, there are situations where positional arguments make sense and others where you will be better off if you use keyworded arguments for your functions. Note also that, when giving arguments for a function, they must be in the correct order:\n",
1710 | "\n",
1711 | "\n",
1712 | " positional, keyworded, *positional, **keyworded "
1713 | ]
1714 | },
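{
"cell_type": "markdown",
"metadata": {},
"source": [
"A small sketch of this ordering (the function `report` is made up purely for illustration):\n",
"\n",
"```python\n",
"def report(name, value=0, *args, **kwargs):\n",
"    print(name, value, args, kwargs)\n",
"\n",
"report('x', 1, 2, 3, flag=True)    # x 1 (2, 3) {'flag': True}\n",
"```"
]
},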
1715 | {
1716 | "cell_type": "markdown",
1717 | "metadata": {},
1718 | "source": [
1719 | "## 3.4. Lambda functions\n",
1720 | "\n",
1721 | "Lambda functions provide a fast way of defining very simple functions. It is best seen with an example: "
1722 | ]
1723 | },
1724 | {
1725 | "cell_type": "code",
1726 | "execution_count": 58,
1727 | "metadata": {},
1728 | "outputs": [],
1729 | "source": [
1730 | "average = lambda xs : sum(xs)/len(xs)"
1731 | ]
1732 | },
1733 | {
1734 | "cell_type": "markdown",
1735 | "metadata": {},
1736 | "source": [
1737 | "Above, `average` is the name of the function, `lambda` is the keyword to indicate that we are defining such a function, `xs` is the argument (more could be added separating them with commas). The colon `:` marks the separation between arguments and the function itself. Then everything to the right of the colon will be returned."
1738 | ]
1739 | },
1740 | {
1741 | "cell_type": "code",
1742 | "execution_count": 59,
1743 | "metadata": {},
1744 | "outputs": [
1745 | {
1746 | "data": {
1747 | "text/plain": [
1748 | "3.0"
1749 | ]
1750 | },
1751 | "execution_count": 59,
1752 | "metadata": {},
1753 | "output_type": "execute_result"
1754 | }
1755 | ],
1756 | "source": [
1757 | "average(xs)"
1758 | ]
1759 | },
1760 | {
1761 | "cell_type": "markdown",
1762 | "metadata": {},
1763 | "source": [
1764 | "# 4. Packages\n",
1765 | "\n",
1766 | "One of the best things about Python is the fact that it has a lot of packages that already implement most of the programs that one may need."
1767 | ]
1768 | },
1769 | {
1770 | "cell_type": "markdown",
1771 | "metadata": {},
1772 | "source": [
1773 | "## 4.1 Loading packages\n",
1774 | "\n",
1775 | "There are several ways of loading a package into Python, with some differences."
1776 | ]
1777 | },
1778 | {
1779 | "cell_type": "code",
1780 | "execution_count": 60,
1781 | "metadata": {},
1782 | "outputs": [],
1783 | "source": [
1784 | "import numpy"
1785 | ]
1786 | },
1787 | {
1788 | "cell_type": "markdown",
1789 | "metadata": {},
1790 | "source": [
1791 | "This allows you to use every function that `numpy` defines on your document. The syntax for such functions is"
1792 | ]
1793 | },
1794 | {
1795 | "cell_type": "code",
1796 | "execution_count": 61,
1797 | "metadata": {},
1798 | "outputs": [
1799 | {
1800 | "data": {
1801 | "text/plain": [
1802 | "20"
1803 | ]
1804 | },
1805 | "execution_count": 61,
1806 | "metadata": {},
1807 | "output_type": "execute_result"
1808 | }
1809 | ],
1810 | "source": [
1811 | "numpy.multiply(4, 5)"
1812 | ]
1813 | },
1814 | {
1815 | "cell_type": "markdown",
1816 | "metadata": {},
1817 | "source": [
1818 | "When the name of the imported module is longer, it makes sense to import it with a different prefix. This can be done using"
1819 | ]
1820 | },
1821 | {
1822 | "cell_type": "code",
1823 | "execution_count": 62,
1824 | "metadata": {},
1825 | "outputs": [],
1826 | "source": [
1827 | "import numpy as np"
1828 | ]
1829 | },
1830 | {
1831 | "cell_type": "markdown",
1832 | "metadata": {},
1833 | "source": [
1834 | "Now the syntax becomes"
1835 | ]
1836 | },
1837 | {
1838 | "cell_type": "code",
1839 | "execution_count": 63,
1840 | "metadata": {},
1841 | "outputs": [
1842 | {
1843 | "data": {
1844 | "text/plain": [
1845 | "18"
1846 | ]
1847 | },
1848 | "execution_count": 63,
1849 | "metadata": {},
1850 | "output_type": "execute_result"
1851 | }
1852 | ],
1853 | "source": [
1854 | "np.multiply(3, 6)"
1855 | ]
1856 | },
1857 | {
1858 | "cell_type": "markdown",
1859 | "metadata": {},
1860 | "source": [
1861 | "In case you really don't want to deal with prefixes, an option is importing a function directly, via"
1862 | ]
1863 | },
1864 | {
1865 | "cell_type": "code",
1866 | "execution_count": 64,
1867 | "metadata": {},
1868 | "outputs": [],
1869 | "source": [
1870 | "from numpy import multiply"
1871 | ]
1872 | },
1873 | {
1874 | "cell_type": "markdown",
1875 | "metadata": {},
1876 | "source": [
1877 | "Now you can just say"
1878 | ]
1879 | },
1880 | {
1881 | "cell_type": "code",
1882 | "execution_count": 65,
1883 | "metadata": {},
1884 | "outputs": [
1885 | {
1886 | "data": {
1887 | "text/plain": [
1888 | "14"
1889 | ]
1890 | },
1891 | "execution_count": 65,
1892 | "metadata": {},
1893 | "output_type": "execute_result"
1894 | }
1895 | ],
1896 | "source": [
1897 | "multiply(7, 2)"
1898 | ]
1899 | },
1900 | {
1901 | "cell_type": "markdown",
1902 | "metadata": {},
1903 | "source": [
1904 | "It is possible to import several functions from the same statement, using\n",
1905 | "```python\n",
1906 | "from numpy import multiply, array\n",
1907 | "```\n",
1908 | "or just import everything in the package as\n",
1909 | "```python\n",
1910 | "from numpy import *\n",
1911 | "```\n",
1912 | "However, for huge packages like NumPy, this is discouraged."
1913 | ]
1914 | },
1915 | {
1916 | "cell_type": "markdown",
1917 | "metadata": {},
1918 | "source": [
1919 | "## 4.2 Documentation\n",
1920 | "\n",
1921 | "Oftentimes you will be interested in knowing the documentation of some function, for instance to understand what the inputs should be. Of course, navigating to the corresponding website (or searching the problem) is an option. In case you want to stay in your notebook, however, Jupyter provides two ways of displaying the `docstring` (that is, the command documentation) of a function.\n",
1922 | "\n",
1923 | "The first one is simply typing the command, but with a question mark `?` before:"
1924 | ]
1925 | },
1926 | {
1927 | "cell_type": "code",
1928 | "execution_count": 66,
1929 | "metadata": {},
1930 | "outputs": [
1931 | {
1932 | "data": {
1933 | "text/plain": [
1934 | "\u001b[0;31mCall signature:\u001b[0m \u001b[0mmultiply\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m*\u001b[0m\u001b[0margs\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;34m**\u001b[0m\u001b[0mkwargs\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
1935 | "\u001b[0;31mType:\u001b[0m ufunc\n",
1936 | "\u001b[0;31mString form:\u001b[0m \n",
1937 | "\u001b[0;31mFile:\u001b[0m ~/.local/lib/python3.8/site-packages/numpy/__init__.py\n",
1938 | "\u001b[0;31mDocstring:\u001b[0m \n",
1939 | "multiply(x1, x2, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])\n",
1940 | "\n",
1941 | "Multiply arguments element-wise.\n",
1942 | "\n",
1943 | "Parameters\n",
1944 | "----------\n",
1945 | "x1, x2 : array_like\n",
1946 | " Input arrays to be multiplied.\n",
1947 | " If ``x1.shape != x2.shape``, they must be broadcastable to a common\n",
1948 | " shape (which becomes the shape of the output).\n",
1949 | "out : ndarray, None, or tuple of ndarray and None, optional\n",
1950 | " A location into which the result is stored. If provided, it must have\n",
1951 | " a shape that the inputs broadcast to. If not provided or None,\n",
1952 | " a freshly-allocated array is returned. A tuple (possible only as a\n",
1953 | " keyword argument) must have length equal to the number of outputs.\n",
1954 | "where : array_like, optional\n",
1955 | " This condition is broadcast over the input. At locations where the\n",
1956 | " condition is True, the `out` array will be set to the ufunc result.\n",
1957 | " Elsewhere, the `out` array will retain its original value.\n",
1958 | " Note that if an uninitialized `out` array is created via the default\n",
1959 | " ``out=None``, locations within it where the condition is False will\n",
1960 | " remain uninitialized.\n",
1961 | "**kwargs\n",
1962 | " For other keyword-only arguments, see the\n",
1963 | " :ref:`ufunc docs `.\n",
1964 | "\n",
1965 | "Returns\n",
1966 | "-------\n",
1967 | "y : ndarray\n",
1968 | " The product of `x1` and `x2`, element-wise.\n",
1969 | " This is a scalar if both `x1` and `x2` are scalars.\n",
1970 | "\n",
1971 | "Notes\n",
1972 | "-----\n",
1973 | "Equivalent to `x1` * `x2` in terms of array broadcasting.\n",
1974 | "\n",
1975 | "Examples\n",
1976 | "--------\n",
1977 | ">>> np.multiply(2.0, 4.0)\n",
1978 | "8.0\n",
1979 | "\n",
1980 | ">>> x1 = np.arange(9.0).reshape((3, 3))\n",
1981 | ">>> x2 = np.arange(3.0)\n",
1982 | ">>> np.multiply(x1, x2)\n",
1983 | "array([[ 0., 1., 4.],\n",
1984 | " [ 0., 4., 10.],\n",
1985 | " [ 0., 7., 16.]])\n",
1986 | "\n",
1987 | "The ``*`` operator can be used as a shorthand for ``np.multiply`` on\n",
1988 | "ndarrays.\n",
1989 | "\n",
1990 | ">>> x1 = np.arange(9.0).reshape((3, 3))\n",
1991 | ">>> x2 = np.arange(3.0)\n",
1992 | ">>> x1 * x2\n",
1993 | "array([[ 0., 1., 4.],\n",
1994 | " [ 0., 4., 10.],\n",
1995 | " [ 0., 7., 16.]])\n",
1996 | "\u001b[0;31mClass docstring:\u001b[0m\n",
1997 | "Functions that operate element by element on whole arrays.\n",
1998 | "\n",
1999 | "To see the documentation for a specific ufunc, use `info`. For\n",
2000 | "example, ``np.info(np.sin)``. Because ufuncs are written in C\n",
2001 | "(for speed) and linked into Python with NumPy's ufunc facility,\n",
2002 | "Python's help() function finds this page whenever help() is called\n",
2003 | "on a ufunc.\n",
2004 | "\n",
2005 | "A detailed explanation of ufuncs can be found in the docs for :ref:`ufuncs`.\n",
2006 | "\n",
2007 | "**Calling ufuncs:** ``op(*x[, out], where=True, **kwargs)``\n",
2008 | "\n",
2009 | "Apply `op` to the arguments `*x` elementwise, broadcasting the arguments.\n",
2010 | "\n",
2011 | "The broadcasting rules are:\n",
2012 | "\n",
2013 | "* Dimensions of length 1 may be prepended to either array.\n",
2014 | "* Arrays may be repeated along dimensions of length 1.\n",
2015 | "\n",
2016 | "Parameters\n",
2017 | "----------\n",
2018 | "*x : array_like\n",
2019 | " Input arrays.\n",
2020 | "out : ndarray, None, or tuple of ndarray and None, optional\n",
2021 | " Alternate array object(s) in which to put the result; if provided, it\n",
2022 | " must have a shape that the inputs broadcast to. A tuple of arrays\n",
2023 | " (possible only as a keyword argument) must have length equal to the\n",
2024 | " number of outputs; use None for uninitialized outputs to be\n",
2025 | " allocated by the ufunc.\n",
2026 | "where : array_like, optional\n",
2027 | " This condition is broadcast over the input. At locations where the\n",
2028 | " condition is True, the `out` array will be set to the ufunc result.\n",
2029 | " Elsewhere, the `out` array will retain its original value.\n",
2030 | " Note that if an uninitialized `out` array is created via the default\n",
2031 | " ``out=None``, locations within it where the condition is False will\n",
2032 | " remain uninitialized.\n",
2033 | "**kwargs\n",
2034 | " For other keyword-only arguments, see the :ref:`ufunc docs `.\n",
2035 | "\n",
2036 | "Returns\n",
2037 | "-------\n",
2038 | "r : ndarray or tuple of ndarray\n",
2039 | " `r` will have the shape that the arrays in `x` broadcast to; if `out` is\n",
2040 | " provided, it will be returned. If not, `r` will be allocated and\n",
2041 | " may contain uninitialized values. If the function has more than one\n",
2042 | " output, then the result will be a tuple of arrays.\n"
2043 | ]
2044 | },
2045 | "metadata": {},
2046 | "output_type": "display_data"
2047 | }
2048 | ],
2049 | "source": [
2050 | "?multiply"
2051 | ]
2052 | },
2053 | {
2054 | "cell_type": "markdown",
2055 | "metadata": {},
2056 | "source": [
2057 | "The second one, by typing the command and then, while on top of it, pressing Shift+Tab."
2058 | ]
2059 | },
2060 | {
2061 | "cell_type": "code",
2062 | "execution_count": 67,
2063 | "metadata": {},
2064 | "outputs": [
2065 | {
2066 | "data": {
2067 | "text/plain": [
2068 | ""
2069 | ]
2070 | },
2071 | "execution_count": 67,
2072 | "metadata": {},
2073 | "output_type": "execute_result"
2074 | }
2075 | ],
2076 | "source": [
2077 | "multiply"
2078 | ]
2079 | },
2080 | {
2081 | "cell_type": "markdown",
2082 | "metadata": {},
2083 | "source": [
2084 | "## 4.3 Needed packages\n",
2085 | "\n",
2086 | "The following snippet will make sure that you have all the right packages installed."
2087 | ]
2088 | },
2089 | {
2090 | "cell_type": "code",
2091 | "execution_count": 68,
2092 | "metadata": {},
2093 | "outputs": [
2094 | {
2095 | "name": "stdout",
2096 | "output_type": "stream",
2097 | "text": [
2098 | "✗ tensorflow is not installed\n",
2099 | "✓ jupyter correctly installed\n",
2100 | "✓ numpy correctly installed\n",
2101 | "✓ scipy correctly installed\n",
2102 | "✓ scikit-learn correctly installed\n",
2103 | "✓ matplotlib correctly installed\n",
2104 | "✓ pandas correctly installed\n",
2105 | "✓ urllib3 correctly installed\n"
2106 | ]
2107 | }
2108 | ],
2109 | "source": [
2110 | "import subprocess, sys\n",
2111 | "\n",
2112 | "packages = ['tensorflow', \n",
2113 | " 'jupyter', \n",
2114 | " 'numpy', \n",
2115 | " 'scipy', \n",
2116 | " 'scikit-learn', \n",
2117 | " 'matplotlib', \n",
2118 | " 'pandas', \n",
2119 | " 'urllib3']\n",
2120 | "\n",
2121 | "reqs = subprocess.check_output([sys.executable, '-m', 'pip', 'freeze'])\n",
2122 | "packages_iv = [r.decode() for r in reqs.split()]\n",
2123 | "packages_i = [r.split('==')[0] for r in packages_iv]\n",
2124 | "\n",
2125 | "for package in packages:\n",
2126 | " n = package.split('==')\n",
2127 | " if len(n) > 1:\n",
2128 | " [p, v] = n\n",
2129 | " else:\n",
2130 | " p, v = n[0], False\n",
2131 | " if (package in packages_i) or (package in packages_iv):\n",
2132 | " print('✓ ' + p + ' correctly installed')\n",
2133 | " elif v and (p in packages_i):\n",
2134 | "        print('? ' + p + ' is installed, version ' + packages_iv[packages_i.index(p)].split('==')[1] + ' (suggested ' + v + ')')\n",
2135 | " else:\n",
2136 | " print('✗ ' + package + ' is not installed')"
2137 | ]
2138 | },
2139 | {
2140 | "cell_type": "code",
2141 | "execution_count": null,
2142 | "metadata": {},
2143 | "outputs": [],
2144 | "source": []
2145 | },
2146 | {
2147 | "cell_type": "code",
2148 | "execution_count": null,
2149 | "metadata": {},
2150 | "outputs": [],
2151 | "source": []
2152 | }
2153 | ],
2154 | "metadata": {
2155 | "kernelspec": {
2156 | "display_name": "Python 3 (ipykernel)",
2157 | "language": "python",
2158 | "name": "python3"
2159 | },
2160 | "language_info": {
2161 | "codemirror_mode": {
2162 | "name": "ipython",
2163 | "version": 3
2164 | },
2165 | "file_extension": ".py",
2166 | "mimetype": "text/x-python",
2167 | "name": "python",
2168 | "nbconvert_exporter": "python",
2169 | "pygments_lexer": "ipython3",
2170 | "version": "3.8.3"
2171 | }
2172 | },
2173 | "nbformat": 4,
2174 | "nbformat_minor": 4
2175 | }
2176 |
--------------------------------------------------------------------------------
/PCA_template.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "markdown",
5 | "metadata": {},
6 | "source": [
7 | "# Principal Component Analysis\n",
8 | "\n",
9 | "In this exercise sheet we look into how to compute and apply a Principal Component Analysis (PCA)."
10 | ]
11 | },
12 | {
13 | "cell_type": "markdown",
14 | "metadata": {},
15 | "source": [
16 | "## Toy 4D Example\n",
17 | "\n",
18 | "We start by loading our toy example. The data is stored as a Numpy array, it is a $2585\\times 5$ matrix. The last component of each row is the label, the first four components are the coordinates in 4D. Each label is an integer from $\\{0, 1, 2, 3, 4\\}$.\n",
19 | "\n",
20 | "The data contains a noisy 2D plane which is embedded into 4D. We would like to represent the data in its _intrinsic_ 2D form."
21 | ]
22 | },
23 | {
24 | "cell_type": "code",
25 | "execution_count": null,
26 | "metadata": {},
27 | "outputs": [],
28 | "source": [
29 | "!pip install pillow # install the Python package \"pillow\"\n",
30 | "import numpy as np\n",
31 | "import mllab.pca"
32 | ]
33 | },
34 | {
35 | "cell_type": "code",
36 | "execution_count": null,
37 | "metadata": {},
38 | "outputs": [],
39 | "source": [
40 | "pca_toy_4d = np.load(\"data/pca_toy_4d.npy\")\n",
41 | "y = pca_toy_4d[:, -1] # labels\n",
42 | "x = pca_toy_4d[:, :-1] # 4D coordinates"
43 | ]
44 | },
45 | {
46 | "cell_type": "markdown",
47 | "metadata": {},
48 | "source": [
49 | "Let us plot slices from this 4D data. We provide a helper function for this:"
50 | ]
51 | },
52 | {
53 | "cell_type": "code",
54 | "execution_count": null,
55 | "metadata": {},
56 | "outputs": [],
57 | "source": [
58 | "# Show documentation\n",
59 | "mllab.pca.plot_toy_slice?"
60 | ]
61 | },
62 | {
63 | "cell_type": "code",
64 | "execution_count": null,
65 | "metadata": {},
66 | "outputs": [],
67 | "source": [
68 | "mllab.pca.plot_toy_slice(x, y, drop_dim=4)"
69 | ]
70 | },
71 | {
72 | "cell_type": "markdown",
73 | "metadata": {},
74 | "source": [
75 | "We want to remove the noise and recover the 2D information."
76 | ]
77 | },
78 | {
79 | "cell_type": "markdown",
80 | "metadata": {},
81 | "source": [
82 | "### Task 3.1\n",
83 | "\n",
84 | "Write an implementation of the function below. Use a singular value decomposition (SVD), but avoid computing it completely since we only need the first $q$ eigenvectors. You can use a NumPy/SciPy function for this."
85 | ]
86 | },
87 | {
88 | "cell_type": "code",
89 | "execution_count": null,
90 | "metadata": {},
91 | "outputs": [],
92 | "source": [
93 | "from scipy.sparse.linalg import svds\n",
94 | "\n",
95 | "def pca(x, q):\n",
96 | " \"\"\"\n",
97 | " Compute principal components and the coordinates.\n",
98 | " \n",
99 | " Parameters\n",
100 | " ----------\n",
101 | " \n",
102 | " x: (n, d) NumPy array\n",
103 | " q: int\n",
104 | " The number of principal components to compute.\n",
105 | "        Has to be less than `d`.\n",
106 | "\n",
107 | " Returns\n",
108 | " -------\n",
109 | " \n",
110 | " Vq: (d, q) NumPy array, orthonormal vectors (column-wise)\n",
111 | " xq: (n, q) NumPy array, coordinates for x (row-wise)\n",
112 | " \"\"\"\n",
113 | " # your code here\n",
114 | "\n",
115 | " \n",
116 | " "
117 | ]
118 | },
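  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "A minimal sketch of one way to fill in `pca` (not the reference solution): center the data, then let `scipy.sparse.linalg.svds` compute only the $q$ leading singular triplets instead of a full SVD. Whether mean-centering is expected here is an assumption on our part."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def pca_sketch(x, q):\n",
    "    # Center the data (assumption: standard PCA-style mean-centering).\n",
    "    xc = x - x.mean(axis=0)\n",
    "    # svds computes only the q largest singular triplets; it returns the\n",
    "    # singular values in ascending order, so we sort them descending.\n",
    "    u, s, vt = svds(xc, k=q)\n",
    "    order = np.argsort(s)[::-1]\n",
    "    Vq = vt[order].T   # (d, q) orthonormal principal directions\n",
    "    xq = xc @ Vq       # (n, q) coordinates w.r.t. these directions\n",
    "    return Vq, xq"
   ]
  },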
119 | {
120 | "cell_type": "markdown",
121 | "metadata": {},
122 | "source": [
123 | "Now we can compute the 2D representation of `x` using PCA."
124 | ]
125 | },
126 | {
127 | "cell_type": "code",
128 | "execution_count": null,
129 | "metadata": {},
130 | "outputs": [],
131 | "source": [
132 | "V, xq = pca(x, q=2)"
133 | ]
134 | },
135 | {
136 | "cell_type": "markdown",
137 | "metadata": {},
138 | "source": [
139 | "And then plot the coordinates `xq`, which are two dimensional. We provide a helper function for this task. Let us check how to use it:"
140 | ]
141 | },
142 | {
143 | "cell_type": "code",
144 | "execution_count": null,
145 | "metadata": {},
146 | "outputs": [],
147 | "source": [
148 | "mllab.pca.plot_toy_2d?"
149 | ]
150 | },
151 | {
152 | "cell_type": "code",
153 | "execution_count": null,
154 | "metadata": {},
155 | "outputs": [],
156 | "source": [
157 | "xq = mllab.pca.plot_toy_2d(xq, y)"
158 | ]
159 | },
160 | {
161 | "cell_type": "markdown",
162 | "metadata": {},
163 | "source": [
164 | "Hopefully you appreciate the result.\n",
165 | "\n",
166 | "### Task 3.2\n",
167 | "\n",
168 | "Let us see how PCA handles a non-linear transformation. To test this we map our data into 3D by keeping the y-axis as the new z-axis and bending the x-coordinate onto an ellipse."
169 | ]
170 | },
171 | {
172 | "cell_type": "code",
173 | "execution_count": null,
174 | "metadata": {},
175 | "outputs": [],
176 | "source": [
177 | "xyz = mllab.pca.map_on_ellipse(xq, a=32, b=1, gap_angle=90)"
178 | ]
179 | },
180 | {
181 | "cell_type": "code",
182 | "execution_count": null,
183 | "metadata": {},
184 | "outputs": [],
185 | "source": [
186 | "# If you are running the interactive variant with %matplotlib widget, you may need to\n",
187 | "# restart the kernel afterwards and rerun the remaining tasks without it for the plots to display correctly.\n",
188 | "#!pip install ipympl\n",
189 | "#%matplotlib widget\n",
190 | "mllab.pca.plot_toy_3d(xyz, y)"
191 | ]
192 | },
193 | {
194 | "cell_type": "markdown",
195 | "metadata": {},
196 | "source": [
197 | "Now apply PCA to our transformed data and plot the result as before."
198 | ]
199 | },
200 | {
201 | "cell_type": "code",
202 | "execution_count": null,
203 | "metadata": {},
204 | "outputs": [],
205 | "source": [
206 | "%matplotlib inline\n",
207 | "# plot code here\n",
208 | "\n"
209 | ]
210 | },
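  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "A possible sketch for this step, assuming your `pca` function from Task 3.1 works: project the bent 3D data back down to 2D and reuse the plotting helper."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Sketch: reduce the bent 3D data to 2D with PCA and plot it as before.\n",
    "V3, xyz_q = pca(xyz, q=2)\n",
    "mllab.pca.plot_toy_2d(xyz_q, y)"
   ]
  },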
211 | {
212 | "cell_type": "markdown",
213 | "metadata": {},
214 | "source": [
215 | "Could be worse, but undeniably discomforting. Try different axes lengths and gap sizes of the ellipse. What do you observe?"
216 | ]
217 | },
218 | {
219 | "cell_type": "markdown",
220 | "metadata": {},
221 | "source": [
222 | "### Task 3.3\n",
223 | "\n",
224 | "We want to see if PCA can improve the accuracy of separating hyperplanes. First compute the singular values of the Iris dataset, then check what percentage of the variance the first two principal components capture."
225 | ]
226 | },
227 | {
228 | "cell_type": "code",
229 | "execution_count": null,
230 | "metadata": {},
231 | "outputs": [],
232 | "source": [
233 | "from sklearn.datasets import load_iris\n",
234 | "iris = load_iris()\n",
235 | "iris_x = iris['data']\n",
236 | "iris_y = iris['target']"
237 | ]
238 | },
239 | {
240 | "cell_type": "code",
241 | "execution_count": null,
242 | "metadata": {},
243 | "outputs": [],
244 | "source": [
245 | "from scipy.linalg import svdvals\n",
246 | "#Your code here\n",
247 | "\n"
248 | ]
249 | },
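  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "One possible sketch, assuming the variance shares are computed from the squared singular values of the centered data:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Sketch: singular values of the centered Iris data; the squared singular values\n",
    "# are proportional to the variance along the corresponding principal directions.\n",
    "iris_centered = iris_x - iris_x.mean(axis=0)\n",
    "sv = svdvals(iris_centered)\n",
    "explained = sv**2 / np.sum(sv**2)\n",
    "print(sv)\n",
    "print('First two components capture {:.1f}% of the variance'.format(100 * explained[:2].sum()))"
   ]
  },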
250 | {
251 | "cell_type": "markdown",
252 | "metadata": {},
253 | "source": [
254 | "Now apply PCA and compute the first two principal components. Plot the projected 2D data in a scatter plot such that the three labels are recognizable. What do you observe?"
255 | ]
256 | },
257 | {
258 | "cell_type": "code",
259 | "execution_count": null,
260 | "metadata": {},
261 | "outputs": [],
262 | "source": [
263 | "import matplotlib.pyplot as plt\n",
264 | "# your code here\n",
265 | "\n"
266 | ]
267 | },
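  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "A sketch of the scatter plot, assuming the `pca` helper from Task 3.1 (the colors mirror the `plot_1d_iris` helper below):"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Sketch: project Iris onto its first two principal components and color by label.\n",
    "V_iris, iris_2d = pca(iris_x, q=2)\n",
    "for label, color in zip((0, 1, 2), ('red', 'blue', 'green')):\n",
    "    plt.scatter(iris_2d[iris_y == label, 0], iris_2d[iris_y == label, 1],\n",
    "                c=color, s=15, label=iris['target_names'][label])\n",
    "plt.legend()\n",
    "plt.show()"
   ]
  },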
268 | {
269 | "cell_type": "code",
270 | "execution_count": null,
271 | "metadata": {},
272 | "outputs": [],
273 | "source": [
274 | "def plot_1d_iris(a, b, c):\n",
275 | " \"\"\"Show a 1D plot of three 1D datasets a, b and c.\n",
276 | " \n",
277 | "    Plotted top to bottom in the order a, b, c.\"\"\"\n",
278 | " left = min(x.min() for x in (a, b, c))\n",
279 | " right = max(x.max() for x in (a, b, c))\n",
280 | " for i, (x, c) in enumerate(((a, 'red'), (b, 'blue'), (c, 'green'))):\n",
281 | " plt.hlines(i * .3, left, right, linestyles='dotted', colors=[(.8,.8,.8,1)])\n",
282 | " plt.eventplot(x, colors=c, linewidths=.5, linelengths=.25, lineoffsets=(2 - i) * .3)\n",
283 | " plt.axis('off')\n",
284 | "\n",
285 | "# your code here\n",
286 | "\n",
287 | "\n"
288 | ]
289 | },
290 | {
291 | "cell_type": "markdown",
292 | "metadata": {},
293 | "source": [
294 | "Finally, recompute the accuracies and compare the results."
295 | ]
296 | },
297 | {
298 | "cell_type": "code",
299 | "execution_count": null,
300 | "metadata": {},
301 | "outputs": [],
302 | "source": [
303 | "from numpy.linalg import lstsq\n",
304 | "from sklearn.metrics import confusion_matrix, accuracy_score\n",
305 | "from sklearn.svm import LinearSVC\n",
306 | "\n",
307 | "def labels(x1, x2):\n",
308 | " return np.concatenate((np.zeros(x1.shape[0], dtype='int'), np.ones(x2.shape[0], dtype='int')))"
309 | ]
310 | },
311 | {
312 | "cell_type": "code",
313 | "execution_count": null,
314 | "metadata": {},
315 | "outputs": [],
316 | "source": [
317 | "# add your code here\n",
318 | "\n"
319 | ]
320 | },
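  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "A heavily hedged sketch: which class pairs and which classifier (least squares vs. linear SVM) to use follows the earlier part of this sheet; the cell below only illustrates the mechanics for one pair on the 2D PCA coordinates `iris_2d` from the sketch above."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Sketch: separate two of the three classes on the 2D PCA coordinates and\n",
    "# report the (training) accuracy and confusion matrix.\n",
    "x1 = iris_2d[iris_y == 1]\n",
    "x2 = iris_2d[iris_y == 2]\n",
    "x_pair = np.vstack((x1, x2))\n",
    "y_pair = labels(x1, x2)\n",
    "clf = LinearSVC(C=1.0, max_iter=10000).fit(x_pair, y_pair)\n",
    "print('Accuracy:', accuracy_score(y_pair, clf.predict(x_pair)))\n",
    "print(confusion_matrix(y_pair, clf.predict(x_pair)))"
   ]
  },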
321 | {
322 | "cell_type": "markdown",
323 | "metadata": {},
324 | "source": [
325 | "## Pedestrian Classification\n",
326 | "\n",
327 | "__Read the pedestrian dataset into a NumPy array and normalize to [0,1]__ (Task 5.4)"
328 | ]
329 | },
330 | {
331 | "cell_type": "code",
332 | "execution_count": null,
333 | "metadata": {},
334 | "outputs": [],
335 | "source": [
336 | "import mllab.pca\n",
337 | "mllab.pca.load_pedestrian_images?"
338 | ]
339 | },
340 | {
341 | "cell_type": "code",
342 | "execution_count": null,
343 | "metadata": {},
344 | "outputs": [],
345 | "source": [
346 | "# your code here\n",
347 | "\n"
348 | ]
349 | },
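  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "A sketch of loading and normalizing the data. The name `int_train_features` matches its later use in the HOG cell; `int_train_labels` and the test-set names are our own assumptions. Note that `plot_im` below documents a `(1250,)` input, which hints that the eigenpedestrian part may be meant to run on downsampled 25x50 grayscale images (cf. `data/pca_ped_25x50.mat`); the sketch keeps the full 100x50x3 images, which is what the HOG cell later expects."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Sketch: load both classes per split, flatten each 100x50x3 image into a\n",
    "# 15000-dimensional row and scale the pixel values from [0, 255] to [0, 1].\n",
    "def load_split(split):\n",
    "    pos = mllab.pca.load_pedestrian_images(split, True)   # images with pedestrians\n",
    "    neg = mllab.pca.load_pedestrian_images(split, False)  # images without pedestrians\n",
    "    feats = np.vstack((pos, neg)).reshape(-1, 100 * 50 * 3) / 255.0\n",
    "    lbls = np.concatenate((np.ones(len(pos), dtype=int), np.zeros(len(neg), dtype=int)))\n",
    "    return feats, lbls\n",
    "\n",
    "int_train_features, int_train_labels = load_split('train')\n",
    "int_test_features, int_test_labels = load_split('test')"
   ]
  },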
350 | {
351 | "cell_type": "markdown",
352 | "metadata": {},
353 | "source": [
354 | "__Write a function to plot an image__"
355 | ]
356 | },
357 | {
358 | "cell_type": "code",
359 | "execution_count": null,
360 | "metadata": {},
361 | "outputs": [],
362 | "source": [
363 | "import matplotlib.pyplot as plt\n",
364 | "import matplotlib as mpl\n",
365 | "\n",
366 | "def plot_im(im, ax=None, title=None, max_contrast=False):\n",
367 | " \"\"\"\n",
368 | " Plot a normalized image.\n",
369 | " \n",
370 | " Parameters\n",
371 | " ----------\n",
372 | " \n",
373 | " im: (1250,) array-like\n",
374 | " \"\"\"\n",
375 | " # your code here\n",
376 | "\n",
377 | " "
378 | ]
379 | },
380 | {
381 | "cell_type": "markdown",
382 | "metadata": {},
383 | "source": [
384 | "__Plot 10 randomly chosen images showing a pedestrian, and 10 randomly chosen images not showing a pedestrian__"
385 | ]
386 | },
387 | {
388 | "cell_type": "code",
389 | "execution_count": null,
390 | "metadata": {},
391 | "outputs": [],
392 | "source": [
393 | "# your code here\n",
394 | "\n",
395 | "\n"
396 | ]
397 | },
398 | {
399 | "cell_type": "markdown",
400 | "metadata": {},
401 | "source": [
402 | "__Compute the PCA of the full training set for $q=200$__ (Task 5.5)"
403 | ]
404 | },
405 | {
406 | "cell_type": "code",
407 | "execution_count": null,
408 | "metadata": {},
409 | "outputs": [],
410 | "source": [
411 | "from sklearn.decomposition import PCA\n",
412 | "\n",
413 | "\n"
414 | ]
415 | },
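  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "A minimal sketch, assuming the training matrix `int_train_features` from the loading sketch above (the object name `int_pca` is our own choice):"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Sketch: fit a 200-component PCA on the flattened training images.\n",
    "q = 200\n",
    "int_pca = PCA(n_components=q)\n",
    "int_pca.fit(int_train_features)"
   ]
  },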
416 | {
417 | "cell_type": "markdown",
418 | "metadata": {},
419 | "source": [
420 | "__Plot the eigenpedestrians 1-20, 51-60, and 101-110__"
421 | ]
422 | },
423 | {
424 | "cell_type": "code",
425 | "execution_count": null,
426 | "metadata": {},
427 | "outputs": [],
428 | "source": [
429 | "# your code here\n",
430 | "\n",
431 | "\n"
432 | ]
433 | },
434 | {
435 | "cell_type": "markdown",
436 | "metadata": {},
437 | "source": [
438 | "We observe that eigenvectors (eigenpedestrians) with higher index correspond to higher frequencies in the image data. For the first eigenpedestrians we indeed recognize the shape of a human being, whereas the later ones seem to contain high-frequency fluctuations/noise."
439 | ]
440 | },
441 | {
442 | "cell_type": "markdown",
443 | "metadata": {},
444 | "source": [
445 | "__Compute the scores for a linear SVM using increasing numbers of principal components__ (Task 5.6)\n",
446 | "\n",
447 | "Use 10 to 200 components in steps of 5. Train the linear SVM with $C=0.01$ and increase the maximum number of iterations for the solver. You can reuse the computed PCA from above."
448 | ]
449 | },
450 | {
451 | "cell_type": "code",
452 | "execution_count": null,
453 | "metadata": {},
454 | "outputs": [],
455 | "source": [
456 | "from sklearn.svm import LinearSVC\n",
457 | "\n",
458 | "# your code here\n",
459 | "\n",
460 | "\n"
461 | ]
462 | },
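  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "A sketch of the loop, assuming the fitted `int_pca` and the feature/label names from the sketches above. Since the principal components are ordered, truncating the 200-dimensional projection to its first $q$ columns is the same as projecting onto the first $q$ components, so the PCA does not have to be refitted for every $q$."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Sketch: train a linear SVM on the first q principal components for q = 10, 15, ..., 200.\n",
    "qs = list(range(10, 201, 5))\n",
    "train_proj = int_pca.transform(int_train_features)\n",
    "test_proj = int_pca.transform(int_test_features)\n",
    "train_scores, test_scores = [], []\n",
    "for q in qs:\n",
    "    clf = LinearSVC(C=0.01, max_iter=50000)\n",
    "    clf.fit(train_proj[:, :q], int_train_labels)\n",
    "    train_scores.append(clf.score(train_proj[:, :q], int_train_labels))\n",
    "    test_scores.append(clf.score(test_proj[:, :q], int_test_labels))"
   ]
  },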
463 | {
464 | "cell_type": "markdown",
465 | "metadata": {},
466 | "source": [
467 | "Plot the training and test scores over $q$."
468 | ]
469 | },
470 | {
471 | "cell_type": "code",
472 | "execution_count": null,
473 | "metadata": {},
474 | "outputs": [],
475 | "source": [
476 | "# your code here\n",
477 | "\n",
478 | "\n"
479 | ]
480 | },
481 | {
482 | "cell_type": "markdown",
483 | "metadata": {},
484 | "source": [
485 | "We observe that too small values of $q$ lead to worse accuracies on both training and test data (underfitting), whereas high values of $q$ lead to high training accuracies but low test accuracies (overfitting). The sweet spot seems to be somewhere in the middle (roughly $q = 50$)."
486 | ]
487 | },
488 | {
489 | "cell_type": "markdown",
490 | "metadata": {},
491 | "source": [
492 | "### HOG Features\n",
493 | "\n",
494 | "__Implementation of the HOG features__\n",
495 | "\n",
496 | "Finally, we want to see if we can increase the accuracies by using well-tailored features such as the HOG features."
497 | ]
498 | },
499 | {
500 | "cell_type": "code",
501 | "execution_count": null,
502 | "metadata": {},
503 | "outputs": [],
504 | "source": [
505 | "import numpy as np\n",
506 | "import scipy.ndimage as ndimage\n",
507 | "from numpy.linalg import norm\n",
508 | "from scipy.ndimage import convolve\n",
509 | "\n",
510 | "\n",
511 | "class HogFeatures:\n",
512 | " def __init__(self, im_shape, n_bins=9, cell_size=8, blk_size=2, unsigned=True, clip_val=.2):\n",
513 | " self.deg_range = np.pi if unsigned else 2 * np.pi\n",
514 | " self.n_bins = n_bins\n",
515 | " self.bins = np.linspace(0, self.deg_range, n_bins, endpoint=False)\n",
516 | " self.bin_size = self.deg_range / n_bins\n",
517 | " self.cell_size = cell_size\n",
518 | " self.blk_size = blk_size\n",
519 | " self.clip_val = clip_val\n",
520 | "\n",
521 | "\n",
522 | " self.im_h, self.im_w = im_shape\n",
523 | " x, y = np.arange(self.im_w), np.arange(self.im_h)\n",
524 | " \n",
525 | " # Compute logical cell indices of next lower and upper cell\n",
526 | " # w.r.t. to the cell center\n",
527 | " cells_x = np.arange(-cell_size, self.im_w - (cell_size + 1)/2, cell_size)\n",
528 | " self.n_cells_x = len(cells_x) - (2 if cells_x[-1] >= self.im_w else 1)\n",
529 | " x0 = np.digitize(x, cells_x + cell_size / 2) - 2\n",
530 | " Xc = ((x0 + 1) - .5) * cell_size - .5\n",
531 | " f_x = (x - Xc) / cell_size\n",
532 | "\n",
533 | " cells_y = np.arange(-cell_size, self.im_h - (cell_size + 1)/2, cell_size)\n",
534 | " self.n_cells_y = len(cells_y) - (2 if cells_y[-1] >= self.im_h else 1)\n",
535 | " y0 = np.digitize(y, cells_y + cell_size / 2) - 2\n",
536 | " Yc = ((y0 + 1) - .5) * cell_size - .5\n",
537 | " f_y = (y - Yc) / cell_size\n",
538 | " \n",
539 | " self.f_x, self.f_y = np.meshgrid(f_x, f_y)\n",
540 | "\n",
541 | " \n",
542 | " def extract(self, im):\n",
543 | " \"\"\"\n",
544 | "        Extract the HOG features for an image.\n",
545 | " \n",
546 | " Parameters\n",
547 | " ----------\n",
548 | " \n",
549 | " im: ndarray\n",
550 | " An array of shape (height, width, 3).\n",
551 | " \"\"\"\n",
552 | "\n",
553 | " im = np.rollaxis(im.reshape(self.im_h, self.im_w, -1), 2)\n",
554 | " dx = convolve(im, [[[1,0,-1]]], mode='constant')\n",
555 | " dy = convolve(im, [[[-1],[0],[1]]], mode='constant')\n",
556 | " grads_mag = norm(np.stack((dx, dy), axis=-1), axis=3)\n",
557 | " max_grads = np.argmax(np.rollaxis(grads_mag, 0, 3), 2)\n",
558 | " Y, X = np.ogrid[:grads_mag.shape[1], :grads_mag.shape[2]]\n",
559 | " grads_dir = np.arctan2(dy[max_grads, Y, X], dx[max_grads, Y, X]) % self.deg_range\n",
560 | " grads_mag = grads_mag[max_grads, Y, X]\n",
561 | " del dx, dy, max_grads, Y, X\n",
562 | " \n",
563 | " # Compute logical bin indices of next lower (<=) and upper bin (>)\n",
564 | " # w.r.t. to the bin center\n",
565 | " bin0 = np.digitize(grads_dir, self.bins + .5 * self.bin_size) - 1\n",
566 | " bin1 = bin0 + 1\n",
567 | " dirc = (bin0 + .5) * self.bin_size\n",
568 | " f_b = (grads_dir - dirc) / self.bin_size\n",
569 | " del grads_dir\n",
570 | " \n",
571 | " bin0 %= self.n_bins\n",
572 | " bin1 %= self.n_bins\n",
573 | " \n",
574 | " f_x, f_y = self.f_x, self.f_y\n",
575 | "\n",
576 | " hist = np.zeros((self.n_cells_y, self.n_cells_x, self.n_bins))\n",
577 | " bin_labels = np.arange(self.n_bins)\n",
578 | " # Iterate over all cells\n",
579 | " for ci_x in range(self.n_cells_x):\n",
580 | " x_pos = (ci_x * self.cell_size - (self.cell_size + 1) // 2, ci_x * self.cell_size + (self.cell_size + 1) // 2)\n",
581 | " x_pre = slice(max(0, x_pos[0] + self.cell_size), max(0, x_pos[1] + self.cell_size))\n",
582 | " x_pos = slice(max(0, x_pos[0]), x_pos[1])\n",
583 | " for ci_y in range(self.n_cells_y):\n",
584 | " y_pos = (ci_y * self.cell_size - (self.cell_size + 1) // 2, ci_y * self.cell_size + (self.cell_size + 1) // 2)\n",
585 | " y_pre = slice(max(0, y_pos[0] + self.cell_size), max(0, y_pos[1] + self.cell_size))\n",
586 | " y_pos = slice(max(0, y_pos[0]), y_pos[1])\n",
587 | "            # Consider all four surrounding cells\n",
588 | " \n",
589 | " # y-pre x-pre\n",
590 | " m = (y_pre, x_pre)\n",
591 | " g = grads_mag[m] * (1 - f_x[m]) * (1 - f_y[m])\n",
592 | " hist[ci_y, ci_x] += ndimage.sum(g * (1 - f_b[m]), bin0[m], bin_labels) + \\\n",
593 | " ndimage.sum(g * f_b[m], bin1[m], bin_labels)\n",
594 | " # y-pos x-pre\n",
595 | " m = (y_pos, x_pre)\n",
596 | " g = grads_mag[m] * (1 - f_x[m]) * f_y[m]\n",
597 | " hist[ci_y, ci_x] += ndimage.sum(g * (1 - f_b[m]), bin0[m], bin_labels) + \\\n",
598 | " ndimage.sum(g * f_b[m], bin1[m], bin_labels)\n",
599 | " # y-pre x-pos\n",
600 | " m = (y_pre, x_pos)\n",
601 | " g = grads_mag[m] * f_x[m] * (1 - f_y[m])\n",
602 | " hist[ci_y, ci_x] += ndimage.sum(g * (1 - f_b[m]), bin0[m], bin_labels) + \\\n",
603 | " ndimage.sum(g * f_b[m], bin1[m], bin_labels)\n",
604 | " # y-pos x-pos\n",
605 | " m = (y_pos, x_pos)\n",
606 | " g = grads_mag[m] * f_x[m] * f_y[m]\n",
607 | " hist[ci_y, ci_x] += ndimage.sum(g * (1 - f_b[m]), bin0[m], bin_labels) + \\\n",
608 | " ndimage.sum(g * f_b[m], bin1[m], bin_labels)\n",
609 | " \n",
610 | " n_blks_x = self.n_cells_x + 1 - self.blk_size\n",
611 | " n_blks_y = self.n_cells_y + 1 - self.blk_size\n",
612 | " features = np.zeros((n_blks_x, n_blks_y, self.blk_size ** 2 * self.n_bins))\n",
613 | " for bi_x in range(n_blks_x):\n",
614 | " for bi_y in range(n_blks_y):\n",
615 | " blk = hist[bi_y:bi_y+self.blk_size, bi_x:bi_x+self.blk_size].copy()\n",
616 | " blk_norm = norm(blk.flatten())\n",
617 | " if blk_norm > 0:\n",
618 | " blk /= blk_norm\n",
619 | " np.clip(blk, None, self.clip_val, out=blk)\n",
620 | " blk_norm = norm(blk.flatten())\n",
621 | " if blk_norm > 0:\n",
622 | " blk /= blk_norm\n",
623 | " features[bi_x, bi_y] = blk.ravel()\n",
624 | " return features.flatten()\n"
625 | ]
626 | },
627 | {
628 | "cell_type": "markdown",
629 | "metadata": {},
630 | "source": [
631 | "__Compute the HOG features for the training data, then compute the PCA for $q=200$.__ (Task 5.7)"
632 | ]
633 | },
634 | {
635 | "cell_type": "code",
636 | "execution_count": null,
637 | "metadata": {},
638 | "outputs": [],
639 | "source": [
640 | "#! pip install tqdm\n",
641 | "hog = HogFeatures((100, 50))\n",
642 | "\n",
643 | "from tqdm import tqdm\n",
644 | "hog_train_features = []\n",
645 | "for i in tqdm(range(int_train_features.shape[0])):\n",
646 | " im = int_train_features[i].reshape(100, 50, 3)\n",
647 | " hog_train_features.append(hog.extract(im))\n",
648 | "print(\"Computed HoG.\")\n",
649 | "\n",
650 | "q = 200\n",
651 | "hog_train_features = np.array(hog_train_features)\n",
652 | "hog_train_pca = PCA(n_components=q)\n",
653 | "hog_train_pca.fit(hog_train_features)"
654 | ]
655 | },
656 | {
657 | "cell_type": "markdown",
658 | "metadata": {},
659 | "source": [
660 | "__Compute and plot the scores as above, but this time use the HOG features.__"
661 | ]
662 | },
663 | {
664 | "cell_type": "code",
665 | "execution_count": null,
666 | "metadata": {},
667 | "outputs": [],
668 | "source": [
669 | "# your code here\n",
670 | "\n"
671 | ]
672 | },
673 | {
674 | "cell_type": "code",
675 | "execution_count": null,
676 | "metadata": {},
677 | "outputs": [],
678 | "source": [
679 | "# compute scores\n",
680 | "# your code here \n",
681 | "\n"
682 | ]
683 | },
684 | {
685 | "cell_type": "code",
686 | "execution_count": null,
687 | "metadata": {},
688 | "outputs": [],
689 | "source": [
690 | "# Plot the results\n",
691 | "# your code here\n",
692 | "\n"
693 | ]
694 | }
695 | ],
696 | "metadata": {
697 | "kernelspec": {
698 | "display_name": "Python 3 (ipykernel)",
699 | "language": "python",
700 | "name": "python3"
701 | },
702 | "language_info": {
703 | "codemirror_mode": {
704 | "name": "ipython",
705 | "version": 3
706 | },
707 | "file_extension": ".py",
708 | "mimetype": "text/x-python",
709 | "name": "python",
710 | "nbconvert_exporter": "python",
711 | "pygments_lexer": "ipython3",
712 | "version": "3.9.2"
713 | }
714 | },
715 | "nbformat": 4,
716 | "nbformat_minor": 4
717 | }
718 |
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # AlgMathML
2 |
3 | This repository contains the template notebooks and data sets for the tasks from the textbook [Algorithmic Mathematics in Machine Learning](https://epubs.siam.org/doi/10.1137/1.9781611977882) by Bastian Bohn, Jochen Garcke and Michael Griebel.
4 |
5 | How to install Jupyter notebooks and the necessary Python packages (assuming
6 | you are running a Linux-based operating system; for other operating systems please check https://docs.python.org/3/library/venv.html to see how to set up a virtual environment appropriately):
7 |
8 | 1. Make sure you have python3 and pip3 installed.
9 | 2. Install virtualenv via
10 | "sudo -H pip3 install virtualenv"
11 | or use python3's venv instead if you do not have virtualenv installed and do
12 | not have root rights.
13 | 3. Go to the directory of your choice and create a virtual environment via
14 | "virtualenv -p python3 .mlbook"
15 | or
16 | "python3 -m venv .mlbook"
17 | 4. Once created you should activate the environment by running
18 | "source .mlbook/bin/activate"
19 | 5. Then, to install the necessary packages for Python, download the
20 | requirements.txt file from the practical lab website and then run
21 | "pip3 install -r requirements.txt"
22 | 6. Finally, you can run and edit your jupyter-notebooks via
23 | "jupyter-notebook NameOfYourNotebook.ipynb"
24 | 7. When you want to leave the virtual environment use
25 | "deactivate"
26 |
27 | If the above fails, try to use "requirements_w_version.txt" instead in step 5.
28 |
29 |
30 | You can find exemplary images depicting solutions to certain tasks in the folder [solution_example_pictures](solution_example_pictures). Note that your solution might look different depending on your specific implementation and the specific training and test data sets you used.
31 |
32 | In case you are a lecturer and you are interested in reference solutions to these tasks, please contact us at bohn@ins.uni-bonn.de.
33 |
--------------------------------------------------------------------------------
/ReinforcementLearning_template.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "code",
5 | "execution_count": null,
6 | "id": "9d4f0023",
7 | "metadata": {},
8 | "outputs": [],
9 | "source": [
10 | "import random\n",
11 | "import numpy as np\n",
12 | "import gymnasium as gym\n",
13 | "from collections import deque\n",
14 | "from keras.models import Sequential\n",
15 | "from keras.layers import Dense\n",
16 | "from keras.optimizers import Adam"
17 | ]
18 | },
19 | {
20 | "cell_type": "markdown",
21 | "id": "2eaaaf7d",
22 | "metadata": {},
23 | "source": [
24 | "## 6.1. Introduction to Reinforcement Learning"
25 | ]
26 | },
27 | {
28 | "cell_type": "markdown",
29 | "id": "a5c3ce3e",
30 | "metadata": {},
31 | "source": [
32 | "For the problem formulation, we introduce the [gymnasium](https://gymnasium.farama.org/) library. It implements control problems from the past and present of reinforcement learning that have served as milestones in the development of the field. Researchers who work on the same standard problems have the advantage that their work is easier to compare and to transfer. On the other hand, if benchmark problems are too prevalent in a community, they may drive research in a certain, uniform direction that is not as productive anymore. Note that the original gym library is a product of OpenAI, a private company; gymnasium is its community-maintained successor. \n",
33 | "\n",
34 | "gym uses a unifying framework that defines every control problem as an *environment*. The basic building blocks of an environment are `env = gym.make` to create the environment, `env.reset` to start an episode, `env.render` to give a human readable representation of the state of the environment, and `env.step` to perform an action.\n",
35 | "\n",
36 | "We start the exercises with the 4x4 [FrozenLake](https://gymnasium.farama.org/environments/toy_text/frozen_lake/) environment. It is a kind of maze with \"frozen\" traversable squares marked by `F` and \"holes\", i.e. losing terminal squares, marked by `H`. The agent starts at the `S` start square and only receives a reward when it manages to reach the goal square `G`. We mostly look at the deterministic case, in which moving on the frozen lake always works as intended; this is controlled by the argument `is_slippery=False` when creating the environment. If the lake is slippery, a movement in a certain direction may by chance result in the agent arriving at a different square than expected."
37 | ]
38 | },
39 | {
40 | "cell_type": "code",
41 | "execution_count": null,
42 | "id": "3f3ef494",
43 | "metadata": {},
44 | "outputs": [],
45 | "source": [
46 | "env = gym.make(\"FrozenLake-v1\", is_slippery=False, render_mode=\"human\")\n",
47 | "#print(env.action_space)\n",
48 | "#print(env.observation_space)"
49 | ]
50 | },
51 | {
52 | "cell_type": "code",
53 | "execution_count": null,
54 | "id": "0a0af69f",
55 | "metadata": {},
56 | "outputs": [],
57 | "source": [
58 | "starting_state, _ = env.reset()\n",
59 | "#print(starting_state)\n",
60 | "env.render()"
61 | ]
62 | },
63 | {
64 | "cell_type": "markdown",
65 | "id": "34850d84",
66 | "metadata": {},
67 | "source": [
68 | "The `env.action_space` always implements a `sample` method, which returns a valid, random action. We can utilize this to have a look at the dynamics of the system. You can execute the following cell a few times to see what happens. When the agent enters a terminal state, you need to execute `env.reset` to start anew."
69 | ]
70 | },
71 | {
72 | "cell_type": "code",
73 | "execution_count": null,
74 | "id": "47af7c8a",
75 | "metadata": {},
76 | "outputs": [],
77 | "source": [
78 | "state, reward, terminated, truncated, info = env.step(env.action_space.sample())\n",
79 | "print(state, reward, terminated, truncated, info)"
80 | ]
81 | },
82 | {
83 | "cell_type": "markdown",
84 | "id": "cdeb143e",
85 | "metadata": {},
86 | "source": [
87 | "#### Task 1. a) Random Agent:\n",
88 | "We provide the framework for the random agent, a method to rollout a policy"
89 | ]
90 | },
91 | {
92 | "cell_type": "code",
93 | "execution_count": null,
94 | "id": "ebb60f28",
95 | "metadata": {},
96 | "outputs": [],
97 | "source": [
98 | "def rollout(env, agent):\n",
99 | " state, _ = env.reset()\n",
100 | " done = False\n",
101 | " total_reward = 0\n",
102 | " while not done:\n",
103 | " action = agent.action(state)\n",
104 | " state, reward, terminated, truncated, info = env.step(action)\n",
105 | " done = terminated or truncated\n",
106 | " total_reward += reward\n",
107 | " return total_reward\n",
108 | "\n",
109 | "class RandomAgent:\n",
110 | " def __init__(self, action_space, observation_space):\n",
111 | " self.action_space = action_space\n",
112 | " self.observation_space = observation_space\n",
113 | " \n",
114 | "    # We pass the state only for compatibility\n",
115 | " def action(self, state):\n",
116 | " # your code goes here\n",
117 | " return None\n",
118 | " \n",
119 | "def compute_avg_return(env, agent, num_episodes=5000):\n",
120 | " # your code goes here\n",
121 | " return avg_reward"
122 | ]
123 | },
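  {
   "cell_type": "markdown",
   "id": "c0ffee01",
   "metadata": {},
   "source": [
    "One possible completion as a sketch (not the reference solution): the random agent ignores the state and samples a valid action uniformly, and the average return is the empirical mean of the rollout returns."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "c0ffee02",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Sketch of one possible completion of the stubs above.\n",
    "class RandomAgentSketch(RandomAgent):\n",
    "    def action(self, state):\n",
    "        # The state is ignored; sample a valid action uniformly at random.\n",
    "        return self.action_space.sample()\n",
    "\n",
    "def compute_avg_return_sketch(env, agent, num_episodes=5000):\n",
    "    # Empirical mean of the total reward over many rollouts.\n",
    "    return np.mean([rollout(env, agent) for _ in range(num_episodes)])"
   ]
  },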
124 | {
125 | "cell_type": "markdown",
126 | "id": "6d85a08a",
127 | "metadata": {},
128 | "source": [
129 | "Add your code to estimate the `avg_return_random_agent` for the deterministic case and `avg_return_random_agent_slippery` for the stochastic case!"
130 | ]
131 | },
132 | {
133 | "cell_type": "code",
134 | "execution_count": null,
135 | "id": "4b1754c9",
136 | "metadata": {},
137 | "outputs": [],
138 | "source": [
139 | "env = gym.make(\"FrozenLake-v1\", is_slippery=False, render_mode=None)\n",
140 | "# your code goes here"
141 | ]
142 | },
143 | {
144 | "cell_type": "code",
145 | "execution_count": null,
146 | "id": "2ec8e8d4",
147 | "metadata": {},
148 | "outputs": [],
149 | "source": [
150 | "print(\"Estimation for the deterministic case:\", avg_return_random_agent)\n",
151 | "print(\"Estimation for the stochastic case:\", avg_return_random_agent_slippery)"
152 | ]
153 | },
154 | {
155 | "cell_type": "markdown",
156 | "id": "f05c63c3",
157 | "metadata": {},
158 | "source": [
159 | "### 1. b) Iterative Policy Evaluation\n",
160 | "We provide a `set_state` method that changes the state of the environment. This is a pretty unusual way to interact with this framework. Note that the random policy is stochastic, while the environment is not. In the value update we sum the value of each possible action, weighted by the probability that the policy picks that action. The architecture of the agent does not provide access to these inner dynamics, so instead of passing the agent or its dynamics as a variable, we implement iterative policy evaluation just for the random agent, with the probability of `0.25` for each action hard coded.\n",
161 | "\n",
162 | "We also provide `all_states` and `all_actions`, lists of all admissible states and actions for the environment. \n"
163 | ]
164 | },
165 | {
166 | "cell_type": "code",
167 | "execution_count": null,
168 | "id": "6ce56d4a",
169 | "metadata": {},
170 | "outputs": [],
171 | "source": [
172 | "all_states = list(range(env.observation_space.n))\n",
173 | "all_actions = list(range(env.action_space.n))\n",
174 | "\n",
175 | "def set_state(env, state):\n",
176 | " env.reset()\n",
177 | " env.env.env.env.s = state\n",
178 | " return env\n",
179 | "\n",
180 | "def visualize_value_fct(v):\n",
181 | " print(np.round(np.array(list(v.values())).reshape((4,4)),3))"
182 | ]
183 | },
184 | {
185 | "cell_type": "code",
186 | "execution_count": null,
187 | "id": "f5a32c74",
188 | "metadata": {},
189 | "outputs": [],
190 | "source": [
191 | "def iterative_policy_iteration_random_agent(env, all_states, all_actions, discount_rate, \n",
192 | " threshold=0.001, max_iter=10000):\n",
193 | " v = {s: 0 for s in all_states} # value function, initialized to 0\n",
194 | " # your code goes here\n",
195 | " return v"
196 | ]
197 | },
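  {
   "cell_type": "markdown",
   "id": "c0ffee03",
   "metadata": {},
   "source": [
    "A sketch of one possible implementation (not the reference solution). It relies on the deterministic environment and on `set_state`: for every state it tries each action once and applies the random-policy Bellman update with weight `0.25` per action. In FrozenLake, stepping from a terminal square returns the same square with reward 0 and `terminated=True`, so terminal values stay at 0 without special handling."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "c0ffee04",
   "metadata": {},
   "outputs": [],
   "source": [
    "def iterative_policy_evaluation_random_agent_sketch(env, all_states, all_actions, discount_rate,\n",
    "                                                    threshold=0.001, max_iter=10000):\n",
    "    # Sketch for the deterministic case; each action has probability 0.25 under the random policy.\n",
    "    v = {s: 0.0 for s in all_states}\n",
    "    for _ in range(max_iter):\n",
    "        delta = 0.0\n",
    "        v_new = {}\n",
    "        for s in all_states:\n",
    "            total = 0.0\n",
    "            for a in all_actions:\n",
    "                set_state(env, s)  # teleport the environment to state s\n",
    "                s_next, r, terminated, truncated, _ = env.step(a)\n",
    "                total += 0.25 * (r + discount_rate * v[s_next] * (not terminated))\n",
    "            v_new[s] = total\n",
    "            delta = max(delta, abs(v_new[s] - v[s]))\n",
    "        v = v_new\n",
    "        if delta < threshold:\n",
    "            break\n",
    "    return v"
   ]
  },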
198 | {
199 | "cell_type": "code",
200 | "execution_count": null,
201 | "id": "35557fe3",
202 | "metadata": {},
203 | "outputs": [],
204 | "source": [
205 | "v_random = iterative_policy_iteration_random_agent(env, all_states, all_actions, discount_rate=0.9)\n",
206 | "visualize_value_fct(v_random)"
207 | ]
208 | },
209 | {
210 | "cell_type": "markdown",
211 | "id": "e00941b0",
212 | "metadata": {},
213 | "source": [
214 | "### 1. c) Value Iteration\n",
215 | "Use value iteration to find the optimal policy!"
216 | ]
217 | },
218 | {
219 | "cell_type": "code",
220 | "execution_count": null,
221 | "id": "61587995",
222 | "metadata": {},
223 | "outputs": [],
224 | "source": [
225 | "def value_iteration(env, all_states, all_actions, discount_rate, threshold=0.001, max_iter=10000):\n",
226 | " v = {s: 0 for s in all_states} # value function, initialized to 0\n",
227 | " # your code goes here\n",
228 | " return v"
229 | ]
230 | },
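  {
   "cell_type": "markdown",
   "id": "c0ffee05",
   "metadata": {},
   "source": [
    "Analogously, a sketch of value iteration under the same assumptions as above: replace the probability-weighted sum over actions by a maximum."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "c0ffee06",
   "metadata": {},
   "outputs": [],
   "source": [
    "def value_iteration_sketch(env, all_states, all_actions, discount_rate,\n",
    "                           threshold=0.001, max_iter=10000):\n",
    "    v = {s: 0.0 for s in all_states}\n",
    "    for _ in range(max_iter):\n",
    "        delta = 0.0\n",
    "        v_new = {}\n",
    "        for s in all_states:\n",
    "            q_values = []\n",
    "            for a in all_actions:\n",
    "                set_state(env, s)\n",
    "                s_next, r, terminated, truncated, _ = env.step(a)\n",
    "                q_values.append(r + discount_rate * v[s_next] * (not terminated))\n",
    "            v_new[s] = max(q_values)  # greedy backup instead of the policy average\n",
    "            delta = max(delta, abs(v_new[s] - v[s]))\n",
    "        v = v_new\n",
    "        if delta < threshold:\n",
    "            break\n",
    "    return v"
   ]
  },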
231 | {
232 | "cell_type": "code",
233 | "execution_count": null,
234 | "id": "8d68d13d",
235 | "metadata": {},
236 | "outputs": [],
237 | "source": [
238 | "v_optimal = value_iteration(env, all_states, all_actions, discount_rate=0.9)\n",
239 | "visualize_value_fct(v_optimal)"
240 | ]
241 | },
242 | {
243 | "cell_type": "markdown",
244 | "id": "0dd5617c",
245 | "metadata": {},
246 | "source": [
247 | "### 2. a) Sarsa & Q-Learning\n",
248 | "With the language of a Q-table, we can define a more general agent by a Q-function.\n",
249 | "\n",
250 | "*Please do not use* `set_state` *anymore! Instead always start an episode with* `state = env.reset()`!"
251 | ]
252 | },
253 | {
254 | "cell_type": "code",
255 | "execution_count": null,
256 | "id": "dc7e6b00",
257 | "metadata": {},
258 | "outputs": [],
259 | "source": [
260 | "def visualize_q_fct(q):\n",
261 | " acts = {0 : \"L\", 1 : \"D\", 2 : \"R\", 3 : \"U\"} \n",
262 | " for j in range(4):\n",
263 | " print(\"Value for action\", acts[j], \":\")\n",
264 | " print(np.round(np.array([q[i][j] for i in range(16)]).reshape((4,4)), 3))\n",
265 | " for i in range(4):\n",
266 | " print([acts[np.argmax(q[4*i + j])] for j in range(4)])\n",
267 | " \n",
268 | "def argmax_tiebreak(array):\n",
269 | " return np.random.choice(np.where(array == array.max())[0])"
270 | ]
271 | },
272 | {
273 | "cell_type": "code",
274 | "execution_count": null,
275 | "id": "904ec408",
276 | "metadata": {},
277 | "outputs": [],
278 | "source": [
279 | "class Discrete_Q_Agent:\n",
280 | " def __init__(self, action_space, observation_space, epsilon=0.9):\n",
281 | " self.action_space = action_space\n",
282 | " self.observation_space = observation_space\n",
283 | " self.epsilon = epsilon\n",
284 | " self.reset_Q()\n",
285 | " \n",
286 | " def reset_Q(self):\n",
287 | " all_states = list(range(self.observation_space.n))\n",
288 | " self.actions = list(range(self.action_space.n))\n",
289 | " self.Q = {s: np.zeros(self.action_space.n) for s in all_states}\n",
290 | "\n",
291 | " def action(self, state):\n",
292 | "# your code goes here\n",
293 | " return action"
294 | ]
295 | },
296 | {
297 | "cell_type": "code",
298 | "execution_count": null,
299 | "id": "88b55655",
300 | "metadata": {},
301 | "outputs": [],
302 | "source": [
303 | "def Sarsa(env, q_agent, alpha=0.1, gamma=0.99, rollouts=10000):\n",
304 | " # your code goes here\n",
305 | " return q_agent, q_agent.Q"
306 | ]
307 | },
308 | {
309 | "cell_type": "code",
310 | "execution_count": null,
311 | "id": "5bfadf5c",
312 | "metadata": {},
313 | "outputs": [],
314 | "source": [
315 | "def Q_Learning(env, q_agent, alpha=0.1, gamma=0.99, rollouts=10000):\n",
316 | " # your code goes here\n",
317 | " return q_agent, q_agent.Q"
318 | ]
319 | },
320 | {
321 | "cell_type": "code",
322 | "execution_count": null,
323 | "id": "b7993db0",
324 | "metadata": {},
325 | "outputs": [],
326 | "source": [
327 | "env_slippery = gym.make(\"FrozenLake-v1\", is_slippery=True)\n",
328 | "q_agent = Discrete_Q_Agent(env_slippery.action_space, env_slippery.observation_space, epsilon=0.9)\n",
329 | "q_agent, q = Sarsa(env_slippery, q_agent)\n",
330 | "visualize_q_fct(q)"
331 | ]
332 | },
333 | {
334 | "cell_type": "code",
335 | "execution_count": null,
336 | "id": "91535fe6",
337 | "metadata": {},
338 | "outputs": [],
339 | "source": [
340 | "env_slippery = gym.make(\"FrozenLake-v1\", is_slippery=True)\n",
341 | "q_agent = Discrete_Q_Agent(env_slippery.action_space, env_slippery.observation_space, epsilon=0.9)\n",
342 | "q_agent, q = Q_Learning(env_slippery, q_agent)\n",
343 | "visualize_q_fct(q)"
344 | ]
345 | },
346 | {
347 | "cell_type": "markdown",
348 | "id": "0a4eab0e",
349 | "metadata": {},
350 | "source": [
351 | "### 2. b) Cartpole\n",
352 | "Next, try the [Cartpole](https://gymnasium.farama.org/environments/classic_control/cart_pole/) environment. It has a continuous state space, so we need to adjust our methods to accommodate that."
353 | ]
354 | },
355 | {
356 | "cell_type": "code",
357 | "execution_count": null,
358 | "id": "95f036dd",
359 | "metadata": {},
360 | "outputs": [],
361 | "source": [
362 | "# your code goes here"
363 | ]
364 | },
365 | {
366 | "cell_type": "markdown",
367 | "id": "ac79b2d8",
368 | "metadata": {},
369 | "source": [
370 | "### 2. c) Cartpole learning\n",
371 | "The observation space of the Cartpole environment can be accessed with `env.observation_space`. It is a [`Box`](https://gymnasium.farama.org/api/spaces/fundamental/#box) space, which contains lower bounds, upper bounds, number of dimensions, and datatype. The second and fourth dimensions are unbounded. We can make them bounded by clipping every value beyond a certain threshold. Also, the first and third dimensions have wider admissible bounds than is useful during training!\n",
372 | "\n",
373 | "Hint: Binned Q-Learning is not the most efficient or useful algorithm for this problem. With the provided hyperparameters I achieved only a mean reward of ~100 after 50000 rollouts of training without any further tuning. Can you achieve a better result by changing the hyperparameters or employing some additional technique?"
374 | ]
375 | },
376 | {
377 | "cell_type": "code",
378 | "execution_count": null,
379 | "id": "b052a5fd",
380 | "metadata": {},
381 | "outputs": [],
382 | "source": [
383 | "learning_rate = 0.1\n",
384 | "discounting_rate = 0.95\n",
385 | "number_episodes = 50000\n",
386 | "total_reward = 0\n",
387 | "\n",
388 | "q_table = np.zeros([31, 31, 51, 51, 2])\n",
389 | "window_size = np.array([0.25, 0.25, 0.01, 0.1])\n",
390 | "low_clip = [-3.75, -3.75, -0.25, -2.5]\n",
391 | "high_clip = [3.75, 3.75, 0.25, 2.5]\n",
392 | "\n",
393 | "# your code goes here"
394 | ]
395 | },
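  {
   "cell_type": "markdown",
   "id": "c0ffee07",
   "metadata": {},
   "source": [
    "A sketch of the state discretization implied by the hyperparameters above: clip each dimension to its admissible range and map it to a bin index. The table shape `[31, 31, 51, 51, 2]` matches `(high_clip - low_clip) / window_size + 1` bins per dimension. The binned Q-agent and the training loop (`Binned_Q_Agent_Cartpole`, `binned_q_learning`) are still left to you."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "c0ffee08",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Sketch of the binning step only, so that q_table[discretize(state)] yields the\n",
    "# two action values of the binned state.\n",
    "def discretize(state):\n",
    "    clipped = np.clip(state, low_clip, high_clip)\n",
    "    idx = ((clipped - low_clip) / window_size).astype(int)\n",
    "    # Guard against the exact upper boundary landing one bin past the end.\n",
    "    return tuple(np.minimum(idx, np.array(q_table.shape[:-1]) - 1))"
   ]
  },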
396 | {
397 | "cell_type": "code",
398 | "execution_count": null,
399 | "id": "e73ee792",
400 | "metadata": {},
401 | "outputs": [],
402 | "source": [
403 | "env = gym.make(\"CartPole-v1\")\n",
404 | "bagent = Binned_Q_Agent_Cartpole(window_size, q_table)\n",
405 | "binned_q_learning(env, bagent, num_episodes=50000)"
406 | ]
407 | },
408 | {
409 | "cell_type": "markdown",
410 | "id": "93474134",
411 | "metadata": {},
412 | "source": [
413 | "### 3.a) Linear function control\n",
414 | "Implement the linear gradient Sarsa here. Most of the time, after a few thousand episodes the linear policy is able to solve the problem (500 reward), but sometimes it just does not converge. The algorithm is a bit shaky as is! I also needed to add one little tweak: normalize the state by clipping it, just as in the task before, and then dividing by the clip value. This normalizes the state vectors to [-1,1] and stabilizes the algorithm.\n",
415 | "\n",
416 | "Note that for a linear formulation $Q_\\theta(s, a) = \\theta_a^\\top s$, the gradient $\\nabla_{\\theta_a} Q_\\theta(s, a)$ is just the state vector $s$."
417 | ]
418 | },
419 | {
420 | "cell_type": "code",
421 | "execution_count": null,
422 | "id": "a70c03df",
423 | "metadata": {},
424 | "outputs": [],
425 | "source": [
426 | "class Linear_Q_Agent:\n",
427 | " def __init__(self, action_space, observation_space, epsilon=0.9):\n",
428 | " self.action_space = action_space\n",
429 | " self.observation_space = observation_space\n",
430 | " self.epsilon = epsilon\n",
431 | " self.theta = np.zeros((action_space.n, observation_space.shape[0]))\n",
432 | " \n",
433 | " def norm_state(self, state):\n",
434 | " norm_state = state\n",
435 | " norm_state = np.clip(norm_state,low_clip,high_clip)\n",
436 | " norm_state /= high_clip\n",
437 | " return norm_state\n",
438 | "\n",
439 | "# your code goes here"
440 | ]
441 | },
442 | {
443 | "cell_type": "code",
444 | "execution_count": null,
445 | "id": "330b4e00",
446 | "metadata": {},
447 | "outputs": [],
448 | "source": [
449 | "lin_agent = Linear_Q_Agent(env.action_space, env.observation_space)\n",
450 | "lin_agent = Grad_Sarsa(env, lin_agent, rollouts=10000)"
451 | ]
452 | },
453 | {
454 | "cell_type": "markdown",
455 | "id": "ce9638e6",
456 | "metadata": {},
457 | "source": [
458 | "### 3.b) DQN\n",
459 | "As a suggestion, I provided the interfaces for functions, some hyperparameters, and the architecture of the neural net that approximates Q. For this algorithm to somewhat work, I needed at least experience replay. But other techniques may also be interesting and work even better. Please feel free to experiment!\n",
460 | "\n",
461 | "*Note*: 1. Whenever you call `model.predict` or `model.fit` you can gain a lot of performance if you do it as a batch. E.g. use \n",
462 | "```\n",
463 | "X = []\n",
464 | "y = []\n",
465 | "for i in I:\n",
466 | " X.append(get_data(i))\n",
467 | " y.append(get_label(i))\n",
468 | "model.fit(X,y)\n",
469 | "```\n",
470 | "instead of\n",
471 | "```\n",
472 | "for i in I:\n",
473 | " model.fit(get_data(i), get_label(i))\n",
474 | "```"
475 | ]
476 | },
477 | {
478 | "cell_type": "code",
479 | "execution_count": null,
480 | "id": "de617670",
481 | "metadata": {},
482 | "outputs": [],
483 | "source": [
484 | "memory_size = 2000\n",
485 | "epsilon = 0.05\n",
486 | "learning_rate = 0.001\n",
487 | "\n",
488 | "class DQN_Agent:\n",
489 | " def _init_model(self, state_dim, action_dim, learning_rate):\n",
490 | " model = Sequential()\n",
491 | " model.add(Dense(32, input_dim=state_dim, activation='relu'))\n",
492 | " model.add(Dense(32, activation='relu'))\n",
493 | " model.add(Dense(action_dim, activation='linear'))\n",
494 | "        model.compile(loss='mse', optimizer=Adam(learning_rate=learning_rate))\n",
495 | " return model\n",
496 | " \n",
497 | " def action(self, state):\n",
498 | " pass\n",
499 | " \n",
500 | " def remember(self, state, action, reward, next_state, done):\n",
501 | " pass\n",
502 | "\n",
503 | " def learn_from_replay(self, batch_size):\n",
504 | " pass\n",
505 | " \n",
506 | "def DQN(env, agent, replay_batch_size=128, rollouts=2000):\n",
507 | " pass"
508 | ]
509 | },
510 | {
511 | "cell_type": "markdown",
512 | "id": "36f78b5b",
513 | "metadata": {},
514 | "source": [
515 | "### 3.c) Another one\n",
516 | "Browse the [environments](https://gymnasium.farama.org/) to pick another challenge! Maybe even record a video with the [RecordVideo wrapper](https://gymnasium.farama.org/api/wrappers/misc_wrappers/#gymnasium.wrappers.RecordVideo)!"
517 | ]
518 | },
519 | {
520 | "cell_type": "code",
521 | "execution_count": null,
522 | "id": "8c377104",
523 | "metadata": {},
524 | "outputs": [],
525 | "source": []
526 | }
527 | ],
528 | "metadata": {
529 | "kernelspec": {
530 | "display_name": "Python 3 (ipykernel)",
531 | "language": "python",
532 | "name": "python3"
533 | },
534 | "language_info": {
535 | "codemirror_mode": {
536 | "name": "ipython",
537 | "version": 3
538 | },
539 | "file_extension": ".py",
540 | "mimetype": "text/x-python",
541 | "name": "python",
542 | "nbconvert_exporter": "python",
543 | "pygments_lexer": "ipython3",
544 | "version": "3.10.11"
545 | }
546 | },
547 | "nbformat": 4,
548 | "nbformat_minor": 5
549 | }
550 |
--------------------------------------------------------------------------------
/data/data1.mat:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Fraunhofer-SCAI/AlgMathML/848e301d37df72506d1e5173acb7c0fc5b45db74/data/data1.mat
--------------------------------------------------------------------------------
/data/guo.xlsx:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Fraunhofer-SCAI/AlgMathML/848e301d37df72506d1e5173acb7c0fc5b45db74/data/guo.xlsx
--------------------------------------------------------------------------------
/data/pca_ped_25x50.mat:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Fraunhofer-SCAI/AlgMathML/848e301d37df72506d1e5173acb7c0fc5b45db74/data/pca_ped_25x50.mat
--------------------------------------------------------------------------------
/data/pca_toy_4d.npy:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Fraunhofer-SCAI/AlgMathML/848e301d37df72506d1e5173acb7c0fc5b45db74/data/pca_toy_4d.npy
--------------------------------------------------------------------------------
/data/pca_toy_samples.npy:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Fraunhofer-SCAI/AlgMathML/848e301d37df72506d1e5173acb7c0fc5b45db74/data/pca_toy_samples.npy
--------------------------------------------------------------------------------
/mllab/pca/__init__.py:
--------------------------------------------------------------------------------
1 | import tarfile
2 | import re
3 | from PIL import Image
4 | from mpl_toolkits.mplot3d import Axes3D
5 | import matplotlib.pyplot as plt
6 | import scipy.cluster.vq as vq
7 | import numpy as np
8 | from scipy.io import loadmat
9 | from scipy.special import ellipeinc, ellipe
10 | from os import path
11 |
12 |
13 | def plot_toy_slice(x, y, drop_dim):
14 | """
15 | Plot a slice of the 4D input data in our toy example.
16 |
17 |     A 3D scatter plot is generated by dropping the dimension `drop_dim`. The
18 | color of each plotted point is determined using the labels `y`.
19 |
20 | Parameters
21 | ----------
22 |
23 | x: (n, 4) array-like
24 | y: (n,) array-like
25 | The labels.
26 | drop_dim: int
27 | The dimension to cut, 1 indexed.
28 | """
29 | x = np.array(x)
30 | x_3d = np.delete(x, drop_dim - 1, 1)
31 | fig = plt.figure()
32 | ax = fig.add_subplot(111, projection='3d')
33 | y = y.astype(int)
34 | ax.scatter(x_3d[:,0], x_3d[:,1], x_3d[:,2], c=_COLORS[y])
35 | plt.show()
36 |
37 |
38 | def plot_toy_3d(x, y):
39 | """
40 | Show a 3D scatter plot of our toy data.
41 |
42 | Parameters
43 | ----------
44 |
45 | x: (n, 3) array-like
46 | y: (n,) array-like
47 | The labels
48 | """
49 | y = y.astype(int)
50 | fig = plt.figure()
51 | ax = fig.add_subplot(111, projection='3d')
52 | ax.scatter(x[:,0], x[:,1], x[:,2], c=_COLORS[y])
53 | ax.set_aspect('auto')
54 | plt.show()
55 |
56 |
57 | def plot_toy_2d(x, y):
58 | """
59 | Plot the low-dimensional representation of the toy data.
60 |
61 | The function uses a colored scatter plot where each point is
62 | from `x` and the color is determined by the labels `y`.
63 |
64 | Parameters
65 | ----------
66 |
67 | x: (n, 2) array-like
68 | The two dimensional representation.
69 | y: (n,) array-like
70 | The labels.
71 | """
72 | x = np.array(x)
73 | y = np.array(y).reshape(-1).astype(int)
74 | if len(x.shape) != 2 or len(y) != x.shape[0] or x.shape[1] != 2:
75 | raise ValueError("Invalid input shape")
76 | theta = _find_orientation(x, y)
77 | x = _rot_mat(theta).dot(x.T).T
78 | plt.scatter(x[:,0], x[:,1], c=_COLORS[y], s=30)
79 | plt.gca().set_aspect('equal')
80 | plt.show()
81 | return x
82 |
83 |
84 | def hog_test_data():
85 | """
86 | Load test data for the HOG features.
87 |
88 | Can be used to check the implementation.
89 |
90 | Returns
91 | -------
92 |
93 | image: array with shape (25,50)
94 | steps: dictionary with values of intermediate steps
95 | """
96 | data = loadmat(path.join(path.dirname(__file__), 'hog_ref.mat'))
97 | print("Number of bins: {}".format(data.pop('n_bins')[0,0]))
98 | print("Cell size: {}".format(data.pop('cell_size')[0,0]))
99 | print("Block size: {}".format(data.pop('blk_size')[0,0]))
100 | print("Unsigned directions: {}".format(data.pop('unsigned')[0,0] == 1))
101 | print("Clip value: {}".format(data.pop('clip_val')[0,0]))
102 | for key in list(data.keys()):
103 | if key.startswith('__'):
104 | del data[key]
105 |
106 | image = data.pop('image')
107 | return image, data
108 |
109 |
110 | def map_on_ellipse(xq, a=4, b=1, gap_angle=90):
111 |     """Map 2D points onto a bent ellipse in 3D.
112 |
113 | Parameters
114 | ----------
115 |
116 | xq: (n, 2) array-like
117 | The 2D points
118 | a, b: non-negative, scalar
119 |         Ellipse axes.
120 |     gap_angle: scalar in [0,360)
121 |         Angle in degrees of the gap the bent ellipse should have.
122 | E.g. for 0 the data is mapped onto a closed ellipse.
123 |
124 | Returns
125 | -------
126 |
127 | xyz: (n, 3) array-like
128 | Mapped 3D points
129 | """
130 | gap_angle = np.deg2rad(gap_angle) / 2
131 | # Range of the data
132 | x_min = np.min(xq[:,0])
133 | x_max = np.max(xq[:,0])
134 | width = x_max - x_min
135 |
136 | # Compute perimeter of the open ellipse
137 | a = abs(a)
138 | b = abs(b)
139 | major, minor = max(a, b), min(a, b)
140 | ecc = (1 - minor**2 / major**2) ** 2
141 | offset = ellipeinc(gap_angle, ecc)
142 | diameter = major * (4 * ellipe(ecc) - offset)
143 | # Scale ellipse to match the span
144 | scale = width / diameter
145 | a *= scale
146 | b *= scale
147 |
148 | # Find the angles. We use the canonical parametrization.
149 | x_zeroed = xq[:,0] - x_min
150 | alpha = (2 * np.pi - 2 * gap_angle) * x_zeroed / (x_max - x_min) + gap_angle # initial guess
151 | for _ in range(6):
152 | # 6 Newton steps to solve inverse elliptic integral.
153 | # 6 Newton steps ought to be enough for anybody.
154 | f = major * ellipeinc(alpha, ecc) - (x_zeroed + major * offset)
155 | f_prime = major * np.sqrt(1 - ecc * np.sin(alpha)**2)
156 | alpha = alpha - f / f_prime
157 | xy = np.vstack((a * np.cos(alpha), b * np.sin(alpha))).T
158 | z = xq[:,1] # z-axis is the previous y-axis, we only bend in the xy-plane
159 | xyz = np.hstack((xy, z.reshape(-1, 1)))
160 | return xyz
161 |
162 |
163 | def load_pedestrian_images(split, label):
164 | """
165 | Load test/train images from the pedestrian dataset.
166 |
167 |
168 | Parameters
169 | ----------
170 |
171 | split: str
172 | Either 'train' or 'test', to specify if training or testing data shall
173 | be returned.
174 | label: bool
175 | Specifies whether the images with or without pedestrians shall
176 | be returned.
177 |
178 | Returns
179 | -------
180 |
181 | images: ndarray
182 | A NumPy array with shape (N, 100, 50, 3) representing pixel values of
183 | the images. Each image is 100x50 and has three color channels (RGB).
184 | The pixel values are in the range [0, 255].
185 | """
186 | if split not in ('train', 'test'):
187 | raise ValueError("The split must be 'train' or 'test'")
188 | folder = 'ped/{}/{}'.format(split, int(bool(label)))
189 | images = []
190 | for member in _PED_TARFILE.getmembers():
191 | if not member.isfile() or not path.dirname(member.name) == folder:
192 | continue
193 | if not re.match(_IMAGE_FILE_RE, path.basename(member.name)):
194 | continue
195 | image = Image.open(_PED_TARFILE.extractfile(member))
196 | images.append(np.asarray(image))
197 | return np.array(images, dtype="float64")
198 |
199 |
200 | _COLORS = np.load(path.join(path.dirname(__file__), 'pca_toy_colors.npy'))
201 | _PED_TARFILE = tarfile.open(path.join(path.dirname(__file__), "ped.tar.gz"), "r:gz")
202 | _IMAGE_FILE_RE = re.compile(r'^\d{3}\.jpg$')
203 |
204 |
205 | def _rot_mat(theta):
206 | """Compute 2D rotatation matrix."""
207 | c, s = np.cos(theta), np.sin(theta)
208 | R = np.array([[c, -s], [s, c]])
209 | return R
210 |
211 |
212 | def _find_orientation(x, y, eye_color_index=4):
213 | """Find the orientation of the face."""
214 | old = np.seterr(all='raise')
215 | try:
216 | eyes, _ = vq.kmeans2(x[y == eye_color_index], 2)
217 | except:
218 | return 0
219 | finally:
220 | np.seterr(**old)
221 | eye_line = eyes[0] - eyes[1]
222 | rad = np.arctan2(*eye_line) + np.pi / 2
223 | eyes_rot = _rot_mat(rad).dot(eyes.T).T
224 | if eyes_rot[0,1] < 0:
225 | rad += np.pi
226 | return rad
227 |
--------------------------------------------------------------------------------
/mllab/pca/__pycache__/__init__.cpython-310.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Fraunhofer-SCAI/AlgMathML/848e301d37df72506d1e5173acb7c0fc5b45db74/mllab/pca/__pycache__/__init__.cpython-310.pyc
--------------------------------------------------------------------------------
/mllab/pca/__pycache__/__init__.cpython-35.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Fraunhofer-SCAI/AlgMathML/848e301d37df72506d1e5173acb7c0fc5b45db74/mllab/pca/__pycache__/__init__.cpython-35.pyc
--------------------------------------------------------------------------------
/mllab/pca/__pycache__/__init__.cpython-36.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Fraunhofer-SCAI/AlgMathML/848e301d37df72506d1e5173acb7c0fc5b45db74/mllab/pca/__pycache__/__init__.cpython-36.pyc
--------------------------------------------------------------------------------
/mllab/pca/__pycache__/__init__.cpython-38.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Fraunhofer-SCAI/AlgMathML/848e301d37df72506d1e5173acb7c0fc5b45db74/mllab/pca/__pycache__/__init__.cpython-38.pyc
--------------------------------------------------------------------------------
/mllab/pca/hog_ref.mat:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Fraunhofer-SCAI/AlgMathML/848e301d37df72506d1e5173acb7c0fc5b45db74/mllab/pca/hog_ref.mat
--------------------------------------------------------------------------------
/mllab/pca/pca_toy_colors.npy:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Fraunhofer-SCAI/AlgMathML/848e301d37df72506d1e5173acb7c0fc5b45db74/mllab/pca/pca_toy_colors.npy
--------------------------------------------------------------------------------
/mllab/pca/ped.tar.gz:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Fraunhofer-SCAI/AlgMathML/848e301d37df72506d1e5173acb7c0fc5b45db74/mllab/pca/ped.tar.gz
--------------------------------------------------------------------------------
/requirements.txt:
--------------------------------------------------------------------------------
1 | tensorflow
2 | jupyter
3 | numpy
4 | scipy
5 | scikit-learn
6 | matplotlib
7 | pandas
8 | openpyxl
9 | urllib3
10 | tqdm
11 | ipympl
12 | gymnasium[toy-text]
13 |
--------------------------------------------------------------------------------
/requirements_w_version.txt:
--------------------------------------------------------------------------------
1 | absl-py==1.4.0
2 | anyio==3.7.0
3 | appdirs==1.4.4
4 | argcomplete==2.0.0
5 | argon2-cffi==21.3.0
6 | argon2-cffi-bindings==21.2.0
7 | arrow==1.2.3
8 | asttokens==2.2.1
9 | astunparse==1.6.3
10 | attrs==23.1.0
11 | Babel==2.10.3
12 | backcall==0.2.0
13 | Beaker==1.12.1
14 | beautifulsoup4==4.12.2
15 | bleach==6.0.0
16 | blivet==3.5.0
17 | blivet-gui==2.4.1
18 | cachetools==5.3.1
19 | cairocffi==1.3.0
20 | CairoSVG==2.7.0
21 | certifi==2021.10.8
22 | cffi==1.15.1
23 | chardet==5.1.0
24 | charset-normalizer==2.1.0
25 | click==8.1.3
26 | cloudpickle==2.2.1
27 | comm==0.1.3
28 | conda==4.13.0
29 | contourpy==1.1.0
30 | cssselect==1.1.0
31 | cssselect2==0.6.0
32 | cupshelpers==1.0
33 | cycler==0.11.0
34 | dasbus==1.6
35 | debugpy==1.6.7
36 | decorator==5.1.1
37 | defusedxml==0.7.1
38 | distro==1.7.0
39 | et-xmlfile==1.1.0
40 | exceptiongroup==1.1.1
41 | executing==1.2.0
42 | Farama-Notifications==0.0.4
43 | fastjsonschema==2.17.1
44 | fedora-third-party==0.10
45 | file-magic==0.4.0
46 | flatbuffers==23.5.26
47 | fonttools==4.40.0
48 | fqdn==1.5.1
49 | fros==1.1
50 | frozendict==1.2
51 | fs==2.4.11
52 | gast==0.4.0
53 | google-auth==2.20.0
54 | google-auth-oauthlib==1.0.0
55 | google-pasta==0.2.0
56 | grpcio==1.56.0
57 | gymnasium==0.28.1
58 | h5py==3.9.0
59 | humanize==3.13.1
60 | idna==3.3
61 | initial-setup==0.3.95
62 | ipykernel==6.23.3
63 | ipympl==0.9.3
64 | ipython==8.14.0
65 | ipython-genutils==0.2.0
66 | ipywidgets==8.0.6
67 | isoduration==20.11.0
68 | jax==0.4.13
69 | jax-jumpy==1.0.0
70 | jedi==0.18.2
71 | Jinja2==3.1.2
72 | joblib==1.2.0
73 | jsonpointer==2.4
74 | jsonschema==4.17.3
75 | jupyter==1.0.0
76 | jupyter-console==6.6.3
77 | jupyter-events==0.6.3
78 | jupyter_client==8.3.0
79 | jupyter_core==5.3.1
80 | jupyter_server==2.6.0
81 | jupyter_server_terminals==0.4.4
82 | jupyterlab-pygments==0.2.2
83 | jupyterlab-widgets==3.0.7
84 | keras==2.12.0
85 | kiwisolver==1.4.4
86 | koji==1.33.0
87 | langtable==0.0.62
88 | libclang==16.0.0
89 | Mako==1.1.4
90 | Markdown==3.4.1
91 | MarkupSafe==2.1.3
92 | matplotlib==3.7.1
93 | matplotlib-inline==0.1.6
94 | meson==1.0.1
95 | mistune==3.0.1
96 | ml-dtypes==0.2.0
97 | munkres==1.1.2
98 | mutagen==1.45.1
99 | nbclassic==1.0.0
100 | nbclient==0.8.0
101 | nbconvert==7.6.0
102 | nbformat==5.9.0
103 | nest-asyncio==1.5.6
104 | nftables==0.1
105 | notebook==6.5.4
106 | notebook_shim==0.2.3
107 | numpy==1.23.5
108 | oauthlib==3.2.2
109 | olefile==0.46
110 | openpyxl==3.1.2
111 | opt-einsum==3.3.0
112 | overrides==7.3.1
113 | packaging==21.3
114 | pandas==2.0.2
115 | pandocfilters==1.5.0
116 | parso==0.8.3
117 | Paste==3.5.0
118 | pexpect==4.8.0
119 | pickleshare==0.7.5
120 | pid==2.2.3
121 | Pillow==9.5.0
122 | platformdirs==3.8.0
123 | ply==3.11
124 | podman-compose==1.0.6
125 | powerline-status==2.8.3
126 | productmd==1.35
127 | progressbar2==3.53.2
128 | prometheus-client==0.17.0
129 | prompt-toolkit==3.0.38
130 | protobuf==4.23.3
131 | psutil==5.9.5
132 | ptyprocess==0.6.0
133 | pure-eval==0.2.2
134 | py-cpuinfo==8.0.0
135 | pyasn1==0.5.0
136 | pyasn1-modules==0.3.0
137 | pycparser==2.20
138 | pyenchant==3.2.2
139 | pygame==2.1.3
140 | Pygments==2.12.0
141 | pyinotify==0.9.6
142 | pykickstart==3.41
143 | pyOpenSSL==21.0.0
144 | pyparsing==3.0.9
145 | pyrsistent==0.19.3
146 | PySocks==1.7.1
147 | pystray==0.17.3
148 | python-augeas==1.1.0
149 | python-dateutil==2.8.2
150 | python-dotenv==0.19.2
151 | python-gettext==4.0
152 | python-json-logger==2.0.7
153 | python-manatools==0.0.4
154 | python-meh==0.50
155 | python-pam==2.0.2
156 | python-utils==3.1.0
157 | python-xlib==0.33
158 | pytz==2023.3
159 | pyudev==0.23.2
160 | pyxdg==0.27
161 | PyYAML==6.0
162 | pyzmq==25.1.0
163 | qtconsole==5.4.3
164 | QtPy==2.3.1
165 | requests==2.28.1
166 | requests-file==1.5.1
167 | requests-ftp==0.3.1
168 | requests-gssapi==1.2.3
169 | requests-oauthlib==1.3.1
170 | rfc3339-validator==0.1.4
171 | rfc3986-validator==0.1.1
172 | rpmautospec==0.3.5
173 | rsa==4.9
174 | ruamel.yaml==0.17.24
175 | scikit-learn==1.2.2
176 | scipy==1.11.0
177 | scour==0.38.2
178 | Send2Trash==1.8.2
179 | sepolicy==3.5
180 | setroubleshoot==3.3.32
181 | simpleline==1.9.0
182 | six==1.16.0
183 | sniffio==1.3.0
184 | sos==4.4
185 | soupsieve==2.3.2.post1
186 | stack-data==0.6.2
187 | Tempita==0.5.2
188 | tensorboard==2.12.3
189 | tensorboard-data-server==0.7.1
190 | tensorflow==2.12.0
191 | tensorflow-estimator==2.12.0
192 | tensorflow-io-gcs-filesystem==0.32.0
193 | termcolor==2.3.0
194 | terminado==0.17.1
195 | threadpoolctl==3.1.0
196 | tinycss2==1.1.1
197 | toolz==0.11.2
198 | tornado==6.3.2
199 | tqdm==4.65.0
200 | traitlets==5.9.0
201 | typing_extensions==4.6.3
202 | tzdata==2023.3
203 | uri-template==1.3.0
204 | urllib3==1.26.15
205 | wcwidth==0.2.6
206 | webcolors==1.13
207 | webencodings==0.5.1
208 | websocket-client==1.6.1
209 | Werkzeug==2.3.6
210 | widgetsnbextension==4.0.7
211 | wrapt==1.14.1
212 | xcffib==0.11.1
213 | xlrd==2.0.1
214 | yt-dlp==2023.3.4
215 |
--------------------------------------------------------------------------------
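Note: the pinned file above presumably records the exact versions the notebooks were last run against. Below is a minimal sketch, assuming the file is named `requirements_w_version.txt` and sits in the current working directory, that reports any installed package whose version differs from its pin; the path and file name are assumptions taken from this listing.

```python
# Minimal sketch: compare installed package versions against the pins in
# requirements_w_version.txt (path is an assumption; adjust as needed).
from importlib.metadata import version, PackageNotFoundError
from pathlib import Path

pins = {}
for line in Path("requirements_w_version.txt").read_text().splitlines():
    line = line.strip()
    if "==" not in line:
        continue  # skip blank lines and anything that is not a pinned entry
    name, _, wanted = line.partition("==")
    pins[name.strip()] = wanted.strip()

for name, wanted in pins.items():
    try:
        installed = version(name)
    except PackageNotFoundError:
        print(f"{name}: not installed (pinned {wanted})")
        continue
    if installed != wanted:
        print(f"{name}: installed {installed}, pinned {wanted}")
```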
/solution_example_pictures/Task2_10_c.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Fraunhofer-SCAI/AlgMathML/848e301d37df72506d1e5173acb7c0fc5b45db74/solution_example_pictures/Task2_10_c.png
--------------------------------------------------------------------------------
/solution_example_pictures/Task2_2.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Fraunhofer-SCAI/AlgMathML/848e301d37df72506d1e5173acb7c0fc5b45db74/solution_example_pictures/Task2_2.png
--------------------------------------------------------------------------------
/solution_example_pictures/Task2_4.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Fraunhofer-SCAI/AlgMathML/848e301d37df72506d1e5173acb7c0fc5b45db74/solution_example_pictures/Task2_4.png
--------------------------------------------------------------------------------
/solution_example_pictures/Task2_7_a1.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Fraunhofer-SCAI/AlgMathML/848e301d37df72506d1e5173acb7c0fc5b45db74/solution_example_pictures/Task2_7_a1.png
--------------------------------------------------------------------------------
/solution_example_pictures/Task2_7_b1.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Fraunhofer-SCAI/AlgMathML/848e301d37df72506d1e5173acb7c0fc5b45db74/solution_example_pictures/Task2_7_b1.png
--------------------------------------------------------------------------------
/solution_example_pictures/Task2_9.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Fraunhofer-SCAI/AlgMathML/848e301d37df72506d1e5173acb7c0fc5b45db74/solution_example_pictures/Task2_9.png
--------------------------------------------------------------------------------
/solution_example_pictures/Task3_3.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Fraunhofer-SCAI/AlgMathML/848e301d37df72506d1e5173acb7c0fc5b45db74/solution_example_pictures/Task3_3.png
--------------------------------------------------------------------------------
/solution_example_pictures/Task3_6_a.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Fraunhofer-SCAI/AlgMathML/848e301d37df72506d1e5173acb7c0fc5b45db74/solution_example_pictures/Task3_6_a.png
--------------------------------------------------------------------------------
/solution_example_pictures/Task3_6_b.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Fraunhofer-SCAI/AlgMathML/848e301d37df72506d1e5173acb7c0fc5b45db74/solution_example_pictures/Task3_6_b.png
--------------------------------------------------------------------------------
/solution_example_pictures/Task3_7.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Fraunhofer-SCAI/AlgMathML/848e301d37df72506d1e5173acb7c0fc5b45db74/solution_example_pictures/Task3_7.png
--------------------------------------------------------------------------------
/solution_example_pictures/Task3_8_c_example.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Fraunhofer-SCAI/AlgMathML/848e301d37df72506d1e5173acb7c0fc5b45db74/solution_example_pictures/Task3_8_c_example.png
--------------------------------------------------------------------------------
/solution_example_pictures/Task4_2_a.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Fraunhofer-SCAI/AlgMathML/848e301d37df72506d1e5173acb7c0fc5b45db74/solution_example_pictures/Task4_2_a.png
--------------------------------------------------------------------------------
/solution_example_pictures/Task4_2_b.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Fraunhofer-SCAI/AlgMathML/848e301d37df72506d1e5173acb7c0fc5b45db74/solution_example_pictures/Task4_2_b.png
--------------------------------------------------------------------------------
/solution_example_pictures/Task4_3_b2.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Fraunhofer-SCAI/AlgMathML/848e301d37df72506d1e5173acb7c0fc5b45db74/solution_example_pictures/Task4_3_b2.png
--------------------------------------------------------------------------------
/solution_example_pictures/Task4_3b.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Fraunhofer-SCAI/AlgMathML/848e301d37df72506d1e5173acb7c0fc5b45db74/solution_example_pictures/Task4_3b.png
--------------------------------------------------------------------------------
/solution_example_pictures/Task4_4.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Fraunhofer-SCAI/AlgMathML/848e301d37df72506d1e5173acb7c0fc5b45db74/solution_example_pictures/Task4_4.png
--------------------------------------------------------------------------------
/solution_example_pictures/Task4_4b.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Fraunhofer-SCAI/AlgMathML/848e301d37df72506d1e5173acb7c0fc5b45db74/solution_example_pictures/Task4_4b.png
--------------------------------------------------------------------------------
/solution_example_pictures/Task4_5_b1.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Fraunhofer-SCAI/AlgMathML/848e301d37df72506d1e5173acb7c0fc5b45db74/solution_example_pictures/Task4_5_b1.png
--------------------------------------------------------------------------------
/solution_example_pictures/Task4_5_b2.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Fraunhofer-SCAI/AlgMathML/848e301d37df72506d1e5173acb7c0fc5b45db74/solution_example_pictures/Task4_5_b2.png
--------------------------------------------------------------------------------
/solution_example_pictures/Task4_5_b3.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Fraunhofer-SCAI/AlgMathML/848e301d37df72506d1e5173acb7c0fc5b45db74/solution_example_pictures/Task4_5_b3.png
--------------------------------------------------------------------------------
/solution_example_pictures/Task4_5_b4.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Fraunhofer-SCAI/AlgMathML/848e301d37df72506d1e5173acb7c0fc5b45db74/solution_example_pictures/Task4_5_b4.png
--------------------------------------------------------------------------------
/solution_example_pictures/Task4_6.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Fraunhofer-SCAI/AlgMathML/848e301d37df72506d1e5173acb7c0fc5b45db74/solution_example_pictures/Task4_6.png
--------------------------------------------------------------------------------
/solution_example_pictures/Task4_7.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Fraunhofer-SCAI/AlgMathML/848e301d37df72506d1e5173acb7c0fc5b45db74/solution_example_pictures/Task4_7.png
--------------------------------------------------------------------------------
/solution_example_pictures/Task5_1.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Fraunhofer-SCAI/AlgMathML/848e301d37df72506d1e5173acb7c0fc5b45db74/solution_example_pictures/Task5_1.png
--------------------------------------------------------------------------------
/solution_example_pictures/Task5_10.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Fraunhofer-SCAI/AlgMathML/848e301d37df72506d1e5173acb7c0fc5b45db74/solution_example_pictures/Task5_10.png
--------------------------------------------------------------------------------
/solution_example_pictures/Task5_12.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Fraunhofer-SCAI/AlgMathML/848e301d37df72506d1e5173acb7c0fc5b45db74/solution_example_pictures/Task5_12.png
--------------------------------------------------------------------------------
/solution_example_pictures/Task5_13.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Fraunhofer-SCAI/AlgMathML/848e301d37df72506d1e5173acb7c0fc5b45db74/solution_example_pictures/Task5_13.png
--------------------------------------------------------------------------------
/solution_example_pictures/Task5_3.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Fraunhofer-SCAI/AlgMathML/848e301d37df72506d1e5173acb7c0fc5b45db74/solution_example_pictures/Task5_3.png
--------------------------------------------------------------------------------
/solution_example_pictures/Task5_4a.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Fraunhofer-SCAI/AlgMathML/848e301d37df72506d1e5173acb7c0fc5b45db74/solution_example_pictures/Task5_4a.png
--------------------------------------------------------------------------------
/solution_example_pictures/Task5_4b.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Fraunhofer-SCAI/AlgMathML/848e301d37df72506d1e5173acb7c0fc5b45db74/solution_example_pictures/Task5_4b.png
--------------------------------------------------------------------------------
/solution_example_pictures/Task5_6.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Fraunhofer-SCAI/AlgMathML/848e301d37df72506d1e5173acb7c0fc5b45db74/solution_example_pictures/Task5_6.png
--------------------------------------------------------------------------------
/solution_example_pictures/Task5_7.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Fraunhofer-SCAI/AlgMathML/848e301d37df72506d1e5173acb7c0fc5b45db74/solution_example_pictures/Task5_7.png
--------------------------------------------------------------------------------
/solution_example_pictures/Task5_8a.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Fraunhofer-SCAI/AlgMathML/848e301d37df72506d1e5173acb7c0fc5b45db74/solution_example_pictures/Task5_8a.png
--------------------------------------------------------------------------------
/solution_example_pictures/Task5_8b.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Fraunhofer-SCAI/AlgMathML/848e301d37df72506d1e5173acb7c0fc5b45db74/solution_example_pictures/Task5_8b.png
--------------------------------------------------------------------------------
/solution_example_pictures/Task5_9a.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Fraunhofer-SCAI/AlgMathML/848e301d37df72506d1e5173acb7c0fc5b45db74/solution_example_pictures/Task5_9a.png
--------------------------------------------------------------------------------
/solution_example_pictures/Task5_9b.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Fraunhofer-SCAI/AlgMathML/848e301d37df72506d1e5173acb7c0fc5b45db74/solution_example_pictures/Task5_9b.png
--------------------------------------------------------------------------------
/solution_example_pictures/Task5_9c.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Fraunhofer-SCAI/AlgMathML/848e301d37df72506d1e5173acb7c0fc5b45db74/solution_example_pictures/Task5_9c.png
--------------------------------------------------------------------------------
/solution_example_pictures/Task6_3.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Fraunhofer-SCAI/AlgMathML/848e301d37df72506d1e5173acb7c0fc5b45db74/solution_example_pictures/Task6_3.png
--------------------------------------------------------------------------------
/solution_example_pictures/Task6_4a.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Fraunhofer-SCAI/AlgMathML/848e301d37df72506d1e5173acb7c0fc5b45db74/solution_example_pictures/Task6_4a.png
--------------------------------------------------------------------------------
/solution_example_pictures/Task6_4b.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Fraunhofer-SCAI/AlgMathML/848e301d37df72506d1e5173acb7c0fc5b45db74/solution_example_pictures/Task6_4b.png
--------------------------------------------------------------------------------
/solution_example_pictures/Task6_4c.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Fraunhofer-SCAI/AlgMathML/848e301d37df72506d1e5173acb7c0fc5b45db74/solution_example_pictures/Task6_4c.png
--------------------------------------------------------------------------------
/solution_example_pictures/Task6_5.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Fraunhofer-SCAI/AlgMathML/848e301d37df72506d1e5173acb7c0fc5b45db74/solution_example_pictures/Task6_5.png
--------------------------------------------------------------------------------
/solution_example_pictures/Task7_1a.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Fraunhofer-SCAI/AlgMathML/848e301d37df72506d1e5173acb7c0fc5b45db74/solution_example_pictures/Task7_1a.png
--------------------------------------------------------------------------------
/solution_example_pictures/Task7_1b.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Fraunhofer-SCAI/AlgMathML/848e301d37df72506d1e5173acb7c0fc5b45db74/solution_example_pictures/Task7_1b.png
--------------------------------------------------------------------------------
/solution_example_pictures/Task7_2a.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Fraunhofer-SCAI/AlgMathML/848e301d37df72506d1e5173acb7c0fc5b45db74/solution_example_pictures/Task7_2a.png
--------------------------------------------------------------------------------
/solution_example_pictures/Task7_2b.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Fraunhofer-SCAI/AlgMathML/848e301d37df72506d1e5173acb7c0fc5b45db74/solution_example_pictures/Task7_2b.png
--------------------------------------------------------------------------------
/solution_example_pictures/Task7_3a.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Fraunhofer-SCAI/AlgMathML/848e301d37df72506d1e5173acb7c0fc5b45db74/solution_example_pictures/Task7_3a.png
--------------------------------------------------------------------------------
/solution_example_pictures/Task7_3b.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Fraunhofer-SCAI/AlgMathML/848e301d37df72506d1e5173acb7c0fc5b45db74/solution_example_pictures/Task7_3b.png
--------------------------------------------------------------------------------
/solution_example_pictures/Task7_4a.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Fraunhofer-SCAI/AlgMathML/848e301d37df72506d1e5173acb7c0fc5b45db74/solution_example_pictures/Task7_4a.png
--------------------------------------------------------------------------------
/solution_example_pictures/Task7_4b.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Fraunhofer-SCAI/AlgMathML/848e301d37df72506d1e5173acb7c0fc5b45db74/solution_example_pictures/Task7_4b.png
--------------------------------------------------------------------------------