├── .gitignore ├── 0_Jupyter_Notebook--Tips_and_Tricks.ipynb ├── 1_Pragmatic_Python.ipynb ├── 2_Procedural_Python.ipynb ├── 3_Object_Oriented_Python.ipynb ├── 4_Functional_Python.ipynb ├── 5_High_Performance_Python.ipynb ├── 6_Pedantic_Python_Tricks.ipynb ├── README.md ├── images ├── Big_O_magnitudes.png ├── Dataflow_pipeline.png ├── GED_triplet.png ├── LEGB_Variable_Scoping.png ├── Python-logo.png ├── Tensorflow_logo.png ├── The-Big-O.jpg ├── advanced-python.jpg ├── alternative_constructors.jpg ├── be_this_tall_to_use_multithreading.jpg ├── calling-python.jpg ├── concurrency.jpg ├── context_switching_cost.jpg ├── daleks-exterminate.jpg ├── multithreaded_process.png ├── mybinder_screenshot.jpg ├── parallelism.jpg ├── party_emoji.jpg ├── python-word-cloud.jpg ├── python_is_awesome.jpg ├── python_yin_yang.png ├── recursion_call_stack.gif ├── ricer.jpg ├── scientific-ecosystem.png └── thumbs-up.png └── requirements.txt /.gitignore: -------------------------------------------------------------------------------- 1 | .ipynb_checkpoints 2 | __pycache__ 3 | -------------------------------------------------------------------------------- /0_Jupyter_Notebook--Tips_and_Tricks.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "# An Introduction to the Useful Parts of Jupyter Notebook\n", 8 | "A Jupyter notebook contains text cells and executable code cells. \n", 9 | "Fun fact: Jupyter is a portmanteau/acronym for the 3 open-source data science languages: **Ju**lia, **Pyt**hon, **R**.\n", 10 | "\n", 11 | "### Cells\n", 12 | "Cells can be of two types:\n", 13 | "* **Code cells**: contain code to evaluate. 
Any outputs (like printouts or plots) from executing the code are displayed in the bottom half of the code cell.\n", 14 | "\n", 15 | "* **Markdown cells**: contain markdown text that is converted to HTML to produce headers, lists (bulleted or numbered), formatted text (i.e. **bold**, *italics*, ~~strikethrough~~), and formulas.\n", 16 | "\n", 17 | "### Working with Notebooks\n", 18 | "Variables are persisted in memory, so you can run different code cells that use a variable (or class or module) that was executed in a prior code cell. You can click on `Kernel` > `Restart` (or the circular arrow icon) on the toolbar to restart the kernel and start running code against a clean slate. If the notebook server is separate from the machine you are actually on (e.g. the Jupyter notebook runs on an EC2 machine and you are connecting to it via your laptop), then sometimes your notebook gets disconnected. To reconnect your notebook, just refresh the page, and hopefully your variables are still alive in memory. Notebooks are automatically saved periodically (to the `.ipynb_checkpoints/` folder). However, you can choose to save the notebook at any point.\n", 19 | "\n", 20 | "### Editing Notebooks\n", 21 | "Cells can be individually selected, and the up/down keys can be used to navigate the notebook. Once you have the appropriate cell selected, you can switch into edit mode either by pressing the `Enter` key or by double-clicking the cell.\n", 22 | "\n", 23 | "In addition to working within a cell, you can add new cells, delete cells, reorder cells, or clear past results to organize your notebook content. These editing capabilities are provided by the toolbar above (or through an equivalent keyboard shortcut).\n", 24 | "\n", 25 | "\n", 26 | "When you are done editing a cell, you can execute a code cell to produce results, either by clicking the `Run` button or pressing `Shift+Enter`. Try executing the following code cell." 
27 | ] 28 | }, 29 | { 30 | "cell_type": "code", 31 | "execution_count": null, 32 | "metadata": {}, 33 | "outputs": [], 34 | "source": [ 35 | "def greet():\n", 36 | " print('Hello World! Welcome to Notebooks!')\n", 37 | " \n", 38 | "greet()" 39 | ] 40 | }, 41 | { 42 | "cell_type": "code", 43 | "execution_count": null, 44 | "metadata": {}, 45 | "outputs": [], 46 | "source": [] 47 | }, 48 | { 49 | "cell_type": "markdown", 50 | "metadata": {}, 51 | "source": [ 52 | "## Some tips for Jupyter Notebook\n", 53 | "\n", 54 | "Notebooks can render simple Markdown/LaTeX.\n", 55 | "* Good for formulas: $\int_{a}^{b} x^2 dx$\n", 56 | "* Add a picture (or even a video) from your computer or the Internet:\n", 57 | "\n", 58 | "* Render Python's color-coded syntax highlighting with triple backticks followed by the language name: \n", 59 | "\`\`\`python
print(\"Hello world!\") # Python code here
print(range(10))
\\`\\`\\`\n", 60 | "```python\n", 61 | "print(\"Hello world!\") # Python code here\n", 62 | "print(range(10))\n", 63 | "```\n", 64 | "\n", 65 | "## Jupyter Notebook Hotkeys:\n", 66 | "Must not be actually editing the cell itself. Press Escape first.\n", 67 | "* Shift + Enter: run current cell and move to the next cell\n", 68 | "* Ctrl + Enter: run current cell and stay on current cell\n", 69 | "* i + i: stop executing the current running cell. I use this a LOT!\n", 70 | "* 0 + 0: reset the kernel. I use this a LOT!\n", 71 | "* m: make the current cell a LaTex/Markdown cell\n", 72 | "* y: make the current cell a coding cell\n", 73 | "* a: add a cell **a**bove the current cell\n", 74 | "* b: add a cell **b**elow the current cell\n", 75 | "* Shift + m: **m**erge the current cell with the cell below\n", 76 | "* d + d: delete current cell (like the vim command)\n", 77 | "* x: **c**ut current cell. Basically equivalent to deleting current cell (if you never paste)\n", 78 | "* c: **c**opy current cell\n", 79 | "* v: paste cut/copied cell\n", 80 | "* l: add **l**ine number for code cell\n", 81 | "* o: toggle beteween hiding or showing the code cell's **o**utput\n", 82 | "* h: see the cheat sheet of the hotkeys. Basically more of these tips\n", 83 | "\n", 84 | "The Notebook also offers tab completion on a variable/function/class/module name when a code cell is in edit mode." 85 | ] 86 | }, 87 | { 88 | "cell_type": "code", 89 | "execution_count": 1, 90 | "metadata": {}, 91 | "outputs": [ 92 | { 93 | "name": "stdout", 94 | "output_type": "stream", 95 | "text": [ 96 | "hi\n", 97 | "bye\n" 98 | ] 99 | } 100 | ], 101 | "source": [ 102 | "print('hi') # tap L to add line numbers. 
Might need to tap Esc first\n", 103 | "print('bye')" 104 | ] 105 | }, 106 | { 107 | "cell_type": "code", 108 | "execution_count": null, 109 | "metadata": {}, 110 | "outputs": [], 111 | "source": [] 112 | }, 113 | { 114 | "cell_type": "markdown", 115 | "metadata": {}, 116 | "source": [ 117 | "## Getting Help about a Python Function\n", 118 | "To get more details about a function (i.e. function arguments or function definition), you can do 3 things:\n", 119 | "* Go to the function and press Shift + Tab to see the function/class signature (for the arguments and docstring). The function has to be in memory first. Hence, it always works with builtins, but you would need to import it if the function is from a module. Press Shift + Tab + Tab to see more details, which sometimes includes the source code. CAVEAT: If the function has not been loaded into memory, then Shift + Tab won't do anything since that function does not exist (yet) to extract the docstring.\n", 120 | "* Pass the function of interest into ```help()```.\n", 121 | "* Type `??` after the function to see the actual Python source code. Doesn't work for builtins, as those are written in C. Equivalent to Shift + Tab + Tab.\n", 122 | "\n", 123 | "Technically, these strategies work for **any** object, so you can do this on a class or instance object." 
124 | ] 125 | }, 126 | { 127 | "cell_type": "code", 128 | "execution_count": 2, 129 | "metadata": {}, 130 | "outputs": [], 131 | "source": [ 132 | "import pandas as pd # load up the module/class/function" 133 | ] 134 | }, 135 | { 136 | "cell_type": "code", 137 | "execution_count": null, 138 | "metadata": {}, 139 | "outputs": [], 140 | "source": [ 141 | "pd.DataFrame() # press Shift + Tab inside the parentheses\n", 142 | "# press Shift + Tab + Tab to see even more details" 143 | ] 144 | }, 145 | { 146 | "cell_type": "code", 147 | "execution_count": 3, 148 | "metadata": {}, 149 | "outputs": [ 150 | { 151 | "name": "stdout", 152 | "output_type": "stream", 153 | "text": [ 154 | "Help on function mean in module pandas.core.frame:\n", 155 | "\n", 156 | "mean(self, axis=None, skipna=None, level=None, numeric_only=None, **kwargs)\n", 157 | " Return the mean of the values for the requested axis.\n", 158 | " \n", 159 | " Parameters\n", 160 | " ----------\n", 161 | " axis : {index (0), columns (1)}\n", 162 | " Axis for the function to be applied on.\n", 163 | " skipna : bool, default True\n", 164 | " Exclude NA/null values when computing the result.\n", 165 | " level : int or level name, default None\n", 166 | " If the axis is a MultiIndex (hierarchical), count along a\n", 167 | " particular level, collapsing into a Series.\n", 168 | " numeric_only : bool, default None\n", 169 | " Include only float, int, boolean columns. If None, will attempt to use\n", 170 | " everything, then use only numeric data. Not implemented for Series.\n", 171 | " **kwargs\n", 172 | " Additional keyword arguments to be passed to the function.\n", 173 | " \n", 174 | " Returns\n", 175 | " -------\n", 176 | " Series or DataFrame (if level specified)\n", 177 | "\n" 178 | ] 179 | } 180 | ], 181 | "source": [ 182 | "help(pd.DataFrame.mean)" 183 | ] 184 | }, 185 | { 186 | "cell_type": "code", 187 | "execution_count": 4, 188 | "metadata": {}, 189 | "outputs": [], 190 | "source": [ 191 | "# add ?? as a suffix to get source code. 
Press Esc to make the pop-up tab disappear\n", 192 | "pd.DataFrame.mean??" 193 | ] 194 | }, 195 | { 196 | "cell_type": "code", 197 | "execution_count": null, 198 | "metadata": {}, 199 | "outputs": [], 200 | "source": [] 201 | }, 202 | { 203 | "cell_type": "markdown", 204 | "metadata": {}, 205 | "source": [ 206 | "## Cell Magic/Magic functions\n", 207 | "If you're familiar with IPython notebooks, you might have come across the term `cell magic`, which basically makes Jupyter notebooks even more awesome: run arbitrary bash commands, write a cell to disk, load a file into a cell, mix-n-match bash and Python, and time the cell." 208 | ] 209 | }, 210 | { 211 | "cell_type": "code", 212 | "execution_count": 5, 213 | "metadata": {}, 214 | "outputs": [ 215 | { 216 | "name": "stdout", 217 | "output_type": "stream", 218 | "text": [ 219 | "print('hi there')\n", 220 | "hi there\n", 221 | "0_Jupyter_Notebook--Tips_and_Tricks.ipynb\n", 222 | "1_Pragmatic_Python.ipynb\n", 223 | "2_Procedural_Python.ipynb\n", 224 | "3_Object_Oriented_Python.ipynb\n", 225 | "4_Functional_Python.ipynb\n", 226 | "5_High_Performance_Python.ipynb\n", 227 | "6_Pedantic_Python_Tricks.ipynb\n" 228 | ] 229 | } 230 | ], 231 | "source": [ 232 | "# The bang sign (!) will run the rest of the line as a bash command. 
Really useful if you're connected to a \n", 233 | "# remote machine and want to run a simple one-line command but don't want to switch to the Terminal.\n", 234 | "!echo \"print('hi there')\" > temp.py\n", 235 | "!cat temp.py\n", 236 | "!python3 temp.py\n", 237 | "!touch another_temp_file.py\n", 238 | "!rm another_temp_file.py\n", 239 | "!mkdir temp_dir\n", 240 | "!rmdir temp_dir\n", 241 | "!ls *.ipynb | head" 242 | ] 243 | }, 244 | { 245 | "cell_type": "code", 246 | "execution_count": 6, 247 | "metadata": {}, 248 | "outputs": [ 249 | { 250 | "name": "stdout", 251 | "output_type": "stream", 252 | "text": [ 253 | "print('hi there')\n", 254 | "hi there\n" 255 | ] 256 | } 257 | ], 258 | "source": [ 259 | "%%bash\n", 260 | "# %%bash (or %%sh) can run an entire cell as bash commands\n", 261 | "cat temp.py # notice no need to start the line with !\n", 262 | "python3 temp.py\n", 263 | "rm temp.py\n", 264 | "touch another_temp_file.py\n", 265 | "rm another_temp_file.py\n", 266 | "mkdir temp_dir\n", 267 | "rmdir temp_dir" 268 | ] 269 | }, 270 | { 271 | "cell_type": "code", 272 | "execution_count": 7, 273 | "metadata": {}, 274 | "outputs": [ 275 | { 276 | "data": { 277 | "text/plain": [ 278 | "['0_Jupyter_Notebook--Tips_and_Tricks.ipynb',\n", 279 | " '1_Pragmatic_Python.ipynb',\n", 280 | " '2_Procedural_Python.ipynb',\n", 281 | " '3_Object_Oriented_Python.ipynb',\n", 282 | " '4_Functional_Python.ipynb',\n", 283 | " '5_High_Performance_Python.ipynb',\n", 284 | " '6_Pedantic_Python_Tricks.ipynb']" 285 | ] 286 | }, 287 | "execution_count": 7, 288 | "metadata": {}, 289 | "output_type": "execute_result" 290 | } 291 | ], 292 | "source": [ 293 | "# you can do a hacky thing where you save the output of a bash command to a Python variable\n", 294 | "files_in_directory = !ls *.ipynb\n", 295 | "files_in_directory" 296 | ] 297 | }, 298 | { 299 | "cell_type": "code", 300 | "execution_count": 8, 301 | "metadata": { 302 | "scrolled": true 303 | }, 304 | "outputs": [ 305 | { 306 | "name": 
"stdout", 307 | "output_type": "stream", 308 | "text": [ 309 | "{\r\n", 310 | " \"cells\": [\r\n", 311 | " {\r\n", 312 | " \"cell_type\": \"markdown\",\r\n", 313 | " \"metadata\": {},\r\n", 314 | " \"source\": [\r\n", 315 | " \"## Some tips for Jupyter Notebook\\n\",\r\n", 316 | " \"\\n\",\r\n", 317 | " \"Notebooks can render simple Markdown/LaTex.\\n\",\r\n", 318 | " \"* Good for formulas: $\\\\int_{a}^{b} x^2 dx$\\n\",\r\n" 319 | ] 320 | } 321 | ], 322 | "source": [ 323 | "# in a reverse manner, you can send Python variable to bash command by enclosing your Python variable name with { }\n", 324 | "this_file_name = \"'0_Jupyter_Notebook--Tips_and_Tricks.ipynb'\" # had to add extra quotation marks since file name had dashes\n", 325 | "!head {this_file_name}" 326 | ] 327 | }, 328 | { 329 | "cell_type": "code", 330 | "execution_count": 9, 331 | "metadata": {}, 332 | "outputs": [ 333 | { 334 | "name": "stdout", 335 | "output_type": "stream", 336 | "text": [ 337 | "Writing temp.py\n" 338 | ] 339 | } 340 | ], 341 | "source": [ 342 | "%%writefile temp.py\n", 343 | "# This will put all the code from this code cell into a file with the file name you specificed (temp.py).\n", 344 | "# %% must be the 1st line for this to work.\n", 345 | "print('Hello World!')" 346 | ] 347 | }, 348 | { 349 | "cell_type": "code", 350 | "execution_count": null, 351 | "metadata": {}, 352 | "outputs": [], 353 | "source": [ 354 | "%load temp.py\n", 355 | "# load the file" 356 | ] 357 | }, 358 | { 359 | "cell_type": "code", 360 | "execution_count": 10, 361 | "metadata": {}, 362 | "outputs": [], 363 | "source": [ 364 | "%pycat temp.py\n", 365 | "# !cat will print the file but no syntax highlighting. 
%pycat will print the file WITH syntax highlighting" 366 | ] 367 | }, 368 | { 369 | "cell_type": "markdown", 370 | "metadata": {}, 371 | "source": [ 372 | "#### Profiling Runtime" 373 | ] 374 | }, 375 | { 376 | "cell_type": "code", 377 | "execution_count": 11, 378 | "metadata": {}, 379 | "outputs": [ 380 | { 381 | "name": "stdout", 382 | "output_type": "stream", 383 | "text": [ 384 | "hi\n", 385 | "CPU times: user 0 ns, sys: 0 ns, total: 0 ns\n", 386 | "Wall time: 11 µs\n", 387 | "bye\n" 388 | ] 389 | } 390 | ], 391 | "source": [ 392 | "print('hi')\n", 393 | "%time sum(range(10)) # get runtime for only this line; %time can be used on any line in the cell\n", 394 | "print('bye')" 395 | ] 396 | }, 397 | { 398 | "cell_type": "code", 399 | "execution_count": 12, 400 | "metadata": {}, 401 | "outputs": [ 402 | { 403 | "name": "stdout", 404 | "output_type": "stream", 405 | "text": [ 406 | "CPU times: user 188 ms, sys: 0 ns, total: 188 ms\n", 407 | "Wall time: 188 ms\n" 408 | ] 409 | } 410 | ], 411 | "source": [ 412 | "%%time\n", 413 | "# With double percent, you get the runtime of the entire cell. %%time MUST be the 1st line\n", 414 | "summer = 0\n", 415 | "for i in range(1000000):\n", 416 | " summer += i" 417 | ] 418 | }, 419 | { 420 | "cell_type": "code", 421 | "execution_count": 13, 422 | "metadata": {}, 423 | "outputs": [ 424 | { 425 | "name": "stdout", 426 | "output_type": "stream", 427 | "text": [ 428 | "172 ms ± 4.75 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)\n" 429 | ] 430 | } 431 | ], 432 | "source": [ 433 | "%%timeit\n", 434 | "# If you are concerned about the stability of the runtime, run the cell multiple times and get profiling results.\n", 435 | "# %%time will run the code cell exactly once. 
%%timeit will run the cell multiple times.\n", 436 | "sum(range(10000000))" 437 | ] 438 | }, 439 | { 440 | "cell_type": "code", 441 | "execution_count": 14, 442 | "metadata": {}, 443 | "outputs": [ 444 | { 445 | "name": "stdout", 446 | "output_type": "stream", 447 | "text": [ 448 | "HI\n", 449 | "210 ms ± 19.6 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)\n", 450 | "BYE\n" 451 | ] 452 | } 453 | ], 454 | "source": [ 455 | "print(\"HI\")\n", 456 | "%timeit sum(range(10000000)) # you can use %timeit on any line and it will run that line multiple times\n", 457 | "print(\"BYE\")" 458 | ] 459 | }, 460 | { 461 | "cell_type": "code", 462 | "execution_count": 15, 463 | "metadata": {}, 464 | "outputs": [ 465 | { 466 | "name": "stdout", 467 | "output_type": "stream", 468 | "text": [ 469 | "HI\n", 470 | "203 ms ± 22.5 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)\n", 471 | "BYE\n", 472 | "CPU times: user 1.55 s, sys: 15.6 ms, total: 1.56 s\n", 473 | "Wall time: 1.63 s\n" 474 | ] 475 | } 476 | ], 477 | "source": [ 478 | "%%time\n", 479 | "# You can simultaneously use %%time and %time (and %%timeit and %timeit),\n", 480 | "# though typically you only use one of them in 1 cell\n", 481 | "print(\"HI\")\n", 482 | "%timeit sum(range(10000000))\n", 483 | "print(\"BYE\")" 484 | ] 485 | }, 486 | { 487 | "cell_type": "code", 488 | "execution_count": null, 489 | "metadata": {}, 490 | "outputs": [], 491 | "source": [] 492 | }, 493 | { 494 | "cell_type": "markdown", 495 | "metadata": {}, 496 | "source": [ 497 | "## Don't Print *Everything*\n", 498 | "To prevent the code cell from printing random junk, just add a semicolon (;)" 499 | ] 500 | }, 501 | { 502 | "cell_type": "code", 503 | "execution_count": 16, 504 | "metadata": { 505 | "scrolled": true 506 | }, 507 | "outputs": [ 508 | { 509 | "data": { 510 | "text/plain": [ 511 | "[]" 512 | ] 513 | }, 514 | "execution_count": 16, 515 | "metadata": {}, 516 | "output_type": "execute_result" 517 | }, 518 | { 519 | "data": { 520 | 
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAWoAAAD4CAYAAADFAawfAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjMuMSwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy/d3fzzAAAACXBIWXMAAAsTAAALEwEAmpwYAAAdzUlEQVR4nO3dd2BV5eHG8e9LQoAwwgojQEggEEhIEAjbiQtFFMRWrZtabH/aam2FMFRUVBy1WusCd9VaJWEPkTqKCwWE7DDCSJiBkEF2ct/fH9AWLUKAe3Nu7n0+f5FB8nhIHk9O7nmusdYiIiLeq5HTAURE5MRU1CIiXk5FLSLi5VTUIiJeTkUtIuLlAj3xQdu3b28jIiI88aFFRHzSunXrDlhrQ4/3No8UdUREBGvXrvXEhxYR8UnGmB0/9TZd+hAR8XIqahERL6eiFhHxcipqEREvp6IWEfFyKmoRES+nohYR8XIqahERN/huewEvf77VIx/bIze8iIj4i8OVNTy5Iou3v95BeNtgbh7eneAg91arilpE5DR9lr2f6fPT2F1Uzm0jI/jjJdFuL2lQUYuInLJDpVU8sjSD5PW7iOrQgnm/HsGg7m089vlU1CIidWStZXnaXh5YmEZhWTW/HRXFXaOiaBIY4NHPq6IWEamD/cUV3L8wjY/S9xHXJYS3Jw4lJqxVvXxuFbWIyAlYa/lwXR6zlmRQWeNi6mV9+OXZkQQG1N+D5lTUIiI/IbegjKnJqXyx5QBDItsy++o4eoS2qPccKmoRkR+pdVne+mo7T32UTUAjw6xx/fjFkHAaNTKO5FFRi4gcY/O+EqYkpbB+ZyHnR4fy2Pg4wlo3czSTilpEBKiudfHyZ1t5/pMtNG8SwLPXnsVVZ4VhjDNn0cdSUYuI30vNK+K+eRvJ2lvC2P5hPDg2hvYtmjgd6z9U1CLityqqa/nzqk3M/VcOoS2bMPfmBC6O6eh0rP+hohYRv/RNzkESk1LYfrCM64d0I/GyvoQ0a+x0rONSUYuIXympqGb28izeXbOT8LbBvHf7UEZEtXc61gmpqEXEb3yatZ9p81PZV1zB7WdHcu8lvT0youRu3p9QROQMFZRW8fDidBZs2E3vji148YYRDAj33IiSu6moRcRnWWtZnLKHmYvSKamo5u4Le3HnBVEEBTas50xRUYuIT9pbVMGMBWmsytxH/64hPHHNUPp0qp8RJXdTUYuIT7HW8v53uTy2NJNql4vpl/dl4tmRBDh0+7c71KmojTG/B24HLJAK3GatrfBkMBGRU7XjYCmJSal8nXOQYT3aMvvqeCLaN3c61hk7aVEbY7oAvwNirLXlxpgPgOuANz2cTUSkTmpdlje+3MbTK7Np3KgRj18dx3WDu3nF7d/uUNdLH4FAM2NMNRAM7PZcJBGRusveW8LkpBQ25hZyUd8OzBoXR6eQpk7HcquTFrW1dpcx5mlgJ1AOrLTWrvzx+xljJgGTAMLDw92dU0TkB6pqXLz42RZe+HQLLZs25i/XD2BsfGefOYs+Vl0ufbQBrgIigULgQ2PMjdbad459P2vtHGAOQEJCgnV/VBGRIzbkFjJlXgrZ+0q46qwwHhwbS9vmQU7H8pi6XPq4CNhmrc0HMMYkAyOAd074t0RE3Ky8qpZnPs7mtS+20aFlU167JYEL+3rfiJK71aWodwLDjDHBHLn0cSGw1qOpRER+5KutB0hMSmVnQRk3DA1nymV9aNXUO0eU3K0u16jXGGPmAeuBGuB7jl7iEBHxtOKKah5flsXfv91JRLtg3p80jGE92jkdq17V6VEf1toHgQc9nEVE5AdWZexj+oJU8ksquePcHtxzUW+aBQU4Have6c5EEfE6Bw9XMnNxBos37qZPp5bMvTmB+K6tnY7lGBW1iHgNay2LNu5m5qJ0DlfWcO/Fvfn1eT0b3IiSu6moRcQr7C4sZ8aCND7J2s9Z3Vrz5DXx9O7Y0ulYXkFFLSKOcrksf/
9uJ48vy6LWZbn/ihhuHRHRoEeU3E1FLSKO2XaglMSkFNZsK2BkVDseHx9PeLtgp2N5HRW1iNS7mloXr3+5jT+t3ERQYCOemBDHzxN8Z0TJ3VTUIlKvMvcUMyUphZS8Ii6O6ciscf3o2Mq3RpTcTUUtIvWisqaWFz7ZwoufbaV1cGNe+MVALo/rpLPoOlBRi4jHrd95iCnzUti8/zBXD+jC/VfE0MaHR5TcTUUtIh5TVlXD0x9t4o2vttG5VVPeuG0wF0R3cDpWg6OiFhGP+HLLARKTU8gtKOemYd2ZPDqaln4youRuKmoRcaui8moeW5rJP9bmEtm+Of+YNIyhfjai5G4qahFxm5Xpe5mxII2DpVX8+rye3HNRL5o29r8RJXdTUYvIGcsvqWTm4nSWpuyhb+dWvHbLYOK6hjgdy2eoqEXktFlrWbBhFw8tzqCsspb7Lo1m0rk9aBzg3yNK7qaiFpHTsquwnOnzU/ksO5+B4UdGlKI6aETJE1TUInJKXC7Lu2t2MHt5Fi4LD46N4ebhGlHyJBW1iNRZTv5hEpNS+XZ7Aef0as9j4+Po1lYjSp6mohaRk6qpdTF39Tb+vGoTTQMb8dQ18VwzqKtu/64nKmoROaH03UVMSUohbVcxo2M78fC4WDq01IhSfVJRi8hxVVTX8vwnm3n58xzaBAfx0g0DuSyus9Ox/JKKWkT+x7odBUyel8LW/FImDOzK/Vf0pXWwRpScoqIWkf8orazhqY+yeevr7YSFNOOtiUM4r3eo07H8nopaRAD416Z8piansruonFuGR3DfpdE0b6KK8Ab6VxDxc4VlVcxamsm8dXn0CG3Oh3cMJyGirdOx5BgqahE/tjx1D/cvTOdQWRV3XtCT347SiJI3UlGL+KH9JRU8uDCd5Wl7iQ1rxVsTBxMbphElb6WiFvEj1lrmrctj1tJMyqtrmTK6D7efE6kRJS+nohbxE7kFZUybn8rqzQcYHNGG2RPi6RnawulYUgcqahEf53JZ3v56O09+lI0BHrkqlhuGdqeRRpQaDBW1iA/bsr+EKUmprNtxiPN6h/Lo+H50baMRpYZGRS3ig6prXcz5Vw7PrdpMcJMAnvl5f8YP6KIRpQZKRS3iY9J2FXHfvBQy9xQzJr4zM8fGEtqyidOx5AyoqEV8REV1Lc+u2szc1Tm0bR7EKzcN4tLYTk7HEjdQUYv4gG+3FZCYlELOgVKuTejGtMv7EhLc2OlY4iZ1KmpjTGvgVaAfYIGJ1tqvPZhLROqgpKKaJ1dk87dvdtCtbTPe+eVQzu7V3ulY4mZ1PaN+Dlhhrb3GGBME6NfGIg77NHs/05NT2VNcwcSRkfzx0t4EB+mHZF900n9VY0wIcC5wK4C1tgqo8mwsEfkph0qreGRJBsnf76JXhxYk/WYEA8PbOB1LPKgu//uNBPKBN4wx/YF1wN3W2tJj38kYMwmYBBAeHu7unCJ+z1rL0tQ9PLgwnaLyan43Koo7R0XRJFAjSr6uLjf4BwIDgZestQOAUiDxx+9krZ1jrU2w1iaEhmpoXMSd9hVXcMff1nHXe9/TpU0zFv/2bO69JFol7SfqckadB+RZa9ccfXkexylqEXE/ay0frM1l1tJMqmpcTLu8DxNHRhKoESW/ctKittbuNcbkGmOirbXZwIVAhuejifi3nQfLmDo/hS+3HGRoZFuemBBPRPvmTscSB9T1V8S/Bd49+oiPHOA2z0US8W+1LsubX23n6Y+yCWhkeHR8P64fHK4RJT9Wp6K21m4AEjwbRUQ27Sth8rwUNuQWMqpPBx4d34/OIc2cjiUO04MuRbxAVY2Llz/fyvOfbKZFk0Ceu+4sruwfphElAVTUIo7bmFvIlKQUsvaWMLZ/GDPHxtCuhUaU5L9U1CIOKa+q5dlVm5i7OofQlk2Ye3MCF8d0dDqWeCEVtYgDvsk5SGJSCtsPlnH9kHCmXt6HVk01oiTHp6IWqUclFdXMXp7Fu2t20r1dMO/9aigjem
pESU5MRS1STz7J2sf0+WnsK67gV+dEcu/F0TQL0p2FcnIqahEPO3i4koeXZLBww26iO7bkpRsHcVa31k7HkgZERS3iIdZaFqfsYeaidEoqqrnnol783/lRBAXq9m85NSpqEQ/YW1TBjAWprMrcT/9urXlyQjzRnVo6HUsaKBW1iBtZa3n/u1weW5pJtcvFjDF9uW1kJAG6/VvOgIpaxE12HCwlMSmVr3MOMrxHO2ZPiKN7O40oyZlTUYucoVqX5Y0vt/H0ymwaN2rE41fHcd3gbrr9W9xGRS1yBrL3ljA5KYWNuYVc1LcDs8bF0SmkqdOxxMeoqEVOQ1WNixc+3cKLn22hVdPGPH/9AK6I76yzaPEIFbXIKdqQW8jkeRvZtO8w484K44GxsbRtHuR0LPFhKmqROiqvquVPK7N5/cttdGzVlNdvTWBUH40oieepqEXq4KutB0hMSmVnQRm/GBrO1Mv60FIjSlJPVNQiJ1BcUc3jyzL5+7e5RLQL5v1JwxjWo53TscTPqKhFfsKqjH1MX5BKfkkld5zbg3su6q0RJXGEilrkRw4cruShxRks3ribPp1aMvfmBOK7tnY6lvgxFbXIUdZaFm7YzUOL0ymtrOUPF/fmjvN6akRJHKeiFgF2F5YzY0Ean2TtZ0D4kRGlXh01oiTeQUUtfs3lsrz37U5mL8+i1mV54IoYbhkRoREl8SoqavFb2w6UkpiUwpptBZwd1Z7Hr46jW9tgp2OJ/A8VtfidmloXr32xjWc+3kRQYCOenBDPzxK66vZv8VoqavErGbuLmZKUQuquIi6J6cgj4/rRsZVGlMS7qajFL1TW1PLXT7bw0mdbaR3cmBd+MZDL4zrpLFoaBBW1+Lx1Ow4xJSmFLfsPc/XALtw/JoY2GlGSBkRFLT6rrKqGpz7K5s2vttO5VVPeuG0wF0R3cDqWyClTUYtP+mLzARKTU8g7VM7Nw7szeXQfWjTRl7s0TPrKFZ9SVFbNo8sy+GBtHj3aN+eDO4YzJLKt07FEzoiKWnzGirS93L8wjYLSKn5zfk/uvrAXTRtrREkaPhW1NHj5JZXMXJTO0tQ9xHRuxRu3DqZflxCnY4m4jYpaGixrLcnrd/HwkgzKq2q579JoJp3bg8YBGlES36KilgZpV2E505JT+XxTPoO6t+GJCfFEdWjhdCwRj6hzURtjAoC1wC5r7RWeiyTy01wuyztrdvDE8iws8NCVsdw0rDuNNKIkPuxUzqjvBjKBVh7KInJCW/MPk5iUwnfbD3FOr/Y8Nl4jSuIf6lTUxpiuwBjgUeBejyYS+ZHqWhdzV+fw7KrNNGscwNM/68+EgV10+7f4jbqeUT8LTAZ+ckndGDMJmAQQHh5+xsFEANJ2FTElKYX03cWMju3Ew+Ni6dBSI0riX05a1MaYK4D91tp1xpjzf+r9rLVzgDkACQkJ1l0BxT9VVNfy/CebefnzHNoEB/HSDQO5LK6z07FEHFGXM+qRwJXGmMuBpkArY8w71tobPRtN/NXa7QVMTkohJ7+UawZ1ZcaYvrQO1oiS+K+TFrW1diowFeDoGfUfVdLiCYcra3hqRRZvf7ODsJBmvD1xCOf2DnU6lojj9Dhq8Qqfb8pnWnIqu4vKuWV4BPddGk1zjSiJAKdY1Nbaz4DPPJJE/FJhWRWPLMkkaX0ePUOb8+Edw0mI0IiSyLF0yiKOWZ66h/sXpnOorIq7LojirlFRGlESOQ4VtdS7/cUVPLAwnRXpe4kNa8VbEwcTG6YRJZGfoqKWemOt5cN1ecxakkFFjYspo/vwq3MiCdSIksgJqailXuQWlDFtfiqrNx9gcEQbZk+Ip2eoRpRE6kJFLR5V67K8/fV2nvooGwM8clUsNwzViJLIqVBRi8ds2V/ClKRU1u04xHm9Q3ns6ji6tG7mdCyRBkdFLW5XXevilc+38pd/biG4SQDP/Lw/4wdoREnkdKmoxa1S84q4b95GsvaWMCa+MzPHxhLasonTsUQaNB
W1uEVFdS3PrtrM3NU5tG0exCs3DeLS2E5OxxLxCSpqOWNrcg6SmJzKtgOlXJvQjWmX9yUkuLHTsUR8hopaTltJRTVPrsjmb9/soGubZrzzy6Gc3au907FEfI6KWk7Lp9n7mZ6cyp7iCiaOjOSPl/YmOEhfTiKeoO8sOSUFpVU8siSD+d/vIqpDC+b9egSDurdxOpaIT1NRS51Ya1mauocHF6ZTVF7N70ZFceeoKJoEakRJxNNU1HJS+4ormLEgjY8z9hHXJYR3bh9K3856MnqR+qKilp9kreWDtbnMWppJVY2LqZf14Zdna0RJpL6pqOW4dh4sIzE5ha+2HmRIZFuemBBPZPvmTscS8UsqavmBWpflza+28/RH2QQ0Mswa149fDAnXiJKIg1TU8h+b9pUweV4KG3ILuSA6lEfHxxGmESURx6mohaoaFy99tpW/frqZFk0Cee66s7iyf5hGlES8hIraz23MLWRKUgpZe0sY2z+MmWNjaNdCI0oi3kRF7afKq2r586pNvLo6h9CWTZh7cwIXx3R0OpaIHIeK2g99vfUgU5NT2H6wjOuHdGPq5X1p1VQjSiLeSkXtR4orqpm9PIv31uwkvG0w790+lBFRGlES8XYqaj/xz8x9TJ+fxv6SCn51TiT3XhxNsyDd/i3SEKiofdzBw5U8tDiDRRt3E92xJS/fNIizurV2OpaInAIVtY+y1rJo424eWpxBSUU191zUi/87P4qgQN3+LdLQqKh90J6icmbMT+OfWfvp3601T06IJ7pTS6djichpUlH7EJfL8v53uTy+LJNql4sZY/py28hIAnT7t0iDpqL2EdsPlJKYnMI3OQUM79GO2RPi6N5OI0oivkBF3cDV1Lp4/ctt/GnlJoICGjH76jiuHdxNt3+L+BAVdQOWtbeYKfNS2JhXxEV9OzBrXBydQpo6HUtE3ExF3QBV1tTywqdbefHTLYQ0a8zz1w/givjOOosW8VEq6gbm+52HmJKUwqZ9hxl3VhgPjI2lbfMgp2OJiAepqBuIsqoa/rRyE69/uY1OrZry+q0JjOqjESURf3DSojbGdAPeBjoCFphjrX3O08Hkv77acoDE5FR2FpRx47BwpozuQ0uNKIn4jbqcUdcAf7DWrjfGtATWGWM+ttZmeDib3ysqr+bxZZm8/10uEe2CeX/SMIb1aOd0LBGpZyctamvtHmDP0T+XGGMygS6AitqDPs7Yx4wFqeSXVHLHeT34/UW9adpYI0oi/uiUrlEbYyKAAcCa47xtEjAJIDw83B3Z/NKBw5XMXJTOkpQ99OnUkrk3JxDftbXTsUTEQXUuamNMCyAJuMdaW/zjt1tr5wBzABISEqzbEvoJay0LNuziocUZlFXW8oeLe3PHeT01oiQidStqY0xjjpT0u9baZM9G8j+7C8uZPj+VT7PzGRB+ZESpV0eNKInIEXV51IcBXgMyrbXPeD6S/3C5LO9+u5MnlmdR67I8cEUMt4yI0IiSiPxAXc6oRwI3AanGmA1HXzfNWrvMY6n8QE7+YRKTUvl2ewFnR7Xn8avj6NY22OlYIuKF6vKojy8AneK5SU2ti1e/2MafP95EUGAjnpwQz88Suur2bxH5SbozsR5l7C5mctJG0nYVc0lMRx4Z14+OrTSiJCInpqKuB5U1tfz1ky289NlWWgc35sUbBnJZv046ixaROlFRe9i6HUdGlLbsP8zVA7tw/5gY2mhESUROgYraQ0ora3h6ZTZvfrWdsJBmvHnbYM6P7uB0LBFpgFTUHrB6cz5Tk1PJO1TOzcO7M3l0H1o00aEWkdOj9nCjorJqHl2WwQdr8+jRvjkf3DGcIZFtnY4lIg2citpNVqTt5f6FaRSUVvGb83ty94W9NKIkIm6hoj5D+0sqmLkonWWpe4np3Io3bh1Mvy4hTscSER+ioj5N1lqS1+/i4SUZlFfXct+l0Uw6tweNAzSiJCLupaI+DXmHypg2P41/bcpnUPc2PDEhnqgOLZyOJS
I+SkV9ClwuyztrdvDE8iws8NCVsdw0rDuNNKIkIh6koq6jrfmHSUxK4bvthzinV3seG68RJRGpHyrqk6iudTF3dQ7PrtpMs8YBPP2z/kwY2EW3f4tIvVFRn0DariKmJKWQvruYy+M6MfPKWDq01IiSiNQvFfVxVFTX8pd/buaVf+XQJjiIl28cyOh+nZ2OJSJ+SkX9I2u3FzA5KYWc/FJ+NqgrM8bEEBLc2OlYIuLHVNRHHa6s4akVWbz9zQ7CQprx9sQhnNs71OlYIiIqaoDPN+UzLTmV3UXl3DI8gvsujaa5RpRExEv4dRsVllXxyJJMktbn0TO0OfN+PZxB3TWiJCLexW+LelnqHh5YmEZhWTV3XRDFXaOiNKIkIl7J74p6f3EFDyxMZ0X6Xvp1acVbE4cQG6YRJRHxXn5T1NZaPlyXx6wlGVTUuJgyug+/OieSQI0oiYiX84uizi0oY9r8VFZvPsCQiLbMnhBHj1CNKIlIw+DTRV3rsrz99Xae+igbAzxyVSw3DNWIkog0LD5b1Fv2lzB5XgrrdxZyfnQoj46Po0vrZk7HEhE5ZT5X1NW1Ll75fCt/+ecWgpsE8Odr+zPuLI0oiUjD5VNFnZpXxH3zNpK1t4Qx8Z156MpY2rdo4nQsEZEz4hNFXVFdy7OrNjN3dQ7tmgfxyk2DuDS2k9OxRETcosEX9ZqcgyQmp7LtQCnXJnRj2pi+hDTTiJKI+I4GW9QlFdU8sSKLd77ZSbe2zXj39qGMjGrvdCwREbdrkEX9adZ+ps9PZU9xBb88O5I/XNKb4KAG+Z8iInJSDardCkqreGRJBvO/30WvDi1I+s0IBoa3cTqWiIhHNYiittayJGUPMxelU1Reze8u7MWdF/SkSaBGlETE93l9Ue8rrmD6/DRWZe4jvmsI79w+lL6dWzkdS0Sk3nhtUVtr+cd3uTy6LJOqGhfTLu/DxJEaURIR/1OnojbGjAaeAwKAV621sz0ZaufBMhKTU/hq60GGRrbliQnxRLRv7slPKSLitU5a1MaYAOAF4GIgD/jOGLPIWpvh7jC1LssbX27j6ZXZBDZqxKPj+3H94HCNKImIX6vLGfUQYIu1NgfAGPM+cBXg1qIuKqvmlje+ZUNuIaP6dODR8f3oHKIRJRGRuhR1FyD3mJfzgKE/fidjzCRgEkB4ePgpB2nVLJDu7YK5bWQEV/YP04iSiMhRbvtlorV2DjAHICEhwZ7q3zfG8Nx1A9wVR0TEZ9TlIRS7gG7HvNz16OtERKQe1KWovwN6GWMijTFBwHXAIs/GEhGRfzvppQ9rbY0x5i7gI448PO91a226x5OJiAhQx2vU1tplwDIPZxERkePQbX4iIl5ORS0i4uVU1CIiXk5FLSLi5Yy1p3xvysk/qDH5wI7T/OvtgQNujNOQ6Vj8kI7HD+l4/JcvHIvu1trQ473BI0V9Jowxa621CU7n8AY6Fj+k4/FDOh7/5evHQpc+RES8nIpaRMTLeWNRz3E6gBfRsfghHY8f0vH4L58+Fl53jVpERH7IG8+oRUTkGCpqEREv5zVFbYwZbYzJNsZsMcYkOp3HScaYbsaYT40xGcaYdGPM3U5ncpoxJsAY870xZonTWZxmjGltjJlnjMkyxmQaY4Y7nclJxpjfH/0+STPG/N0Y09TpTO7mFUV9zBPoXgbEANcbY2KcTeWoGuAP1toYYBhwp58fD4C7gUynQ3iJ54AV1to+QH/8+LgYY7oAvwMSrLX9ODLFfJ2zqdzPK4qaY55A11pbBfz7CXT9krV2j7V2/dE/l3DkG7GLs6mcY4zpCowBXnU6i9OMMSHAucBrANbaKmttoaOhnBcINDPGBALBwG6H87idtxT18Z5A12+L6VjGmAhgALDG4ShOehaYDLgczuENIoF84I2jl4JeNcY0dzqUU6y1u4CngZ3AHqDIWrvS2VTu5y1FLcdhjGkBJAH3WGuLnc7jBGPMFcB+a+06p7N4iUBgIPCStXYAUAr47e
90jDFtOPLTdyQQBjQ3xtzobCr385ai1hPo/ogxpjFHSvpda22y03kcNBK40hiznSOXxEYZY95xNpKj8oA8a+2/f8Kax5Hi9lcXAdustfnW2mogGRjhcCa385ai1hPoHsMYYzhyDTLTWvuM03mcZK2daq3taq2N4MjXxSfWWp87Y6ora+1eINcYE330VRcCGQ5GctpOYJgxJvjo982F+OAvV+v0nImepifQ/R8jgZuAVGPMhqOvm3b0uStFfgu8e/SkJge4zeE8jrHWrjHGzAPWc+TRUt/jg7eT6xZyEREv5y2XPkRE5CeoqEVEvJyKWkTEy6moRUS8nIpaRMTLqahFRLycilpExMv9P50DX6ms46IMAAAAAElFTkSuQmCC\n", 521 | "text/plain": [ 522 | "
" 523 | ] 524 | }, 525 | "metadata": { 526 | "needs_background": "light" 527 | }, 528 | "output_type": "display_data" 529 | } 530 | ], 531 | "source": [ 532 | "import matplotlib.pyplot as plt\n", 533 | "%matplotlib inline\n", 534 | "\n", 535 | "plt.plot(range(10))" 536 | ] 537 | }, 538 | { 539 | "cell_type": "code", 540 | "execution_count": 17, 541 | "metadata": { 542 | "scrolled": true 543 | }, 544 | "outputs": [ 545 | { 546 | "data": { 547 | "image/png": "iVBORw0KGgoAAAANSUhEUgAAAWoAAAD4CAYAAADFAawfAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjMuMSwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy/d3fzzAAAACXBIWXMAAAsTAAALEwEAmpwYAAAdzUlEQVR4nO3dd2BV5eHG8e9LQoAwwgojQEggEEhIEAjbiQtFFMRWrZtabH/aam2FMFRUVBy1WusCd9VaJWEPkTqKCwWE7DDCSJiBkEF2ct/fH9AWLUKAe3Nu7n0+f5FB8nhIHk9O7nmusdYiIiLeq5HTAURE5MRU1CIiXk5FLSLi5VTUIiJeTkUtIuLlAj3xQdu3b28jIiI88aFFRHzSunXrDlhrQ4/3No8UdUREBGvXrvXEhxYR8UnGmB0/9TZd+hAR8XIqahERL6eiFhHxcipqEREvp6IWEfFyKmoRES+nohYR8XIqahERN/huewEvf77VIx/bIze8iIj4i8OVNTy5Iou3v95BeNtgbh7eneAg91arilpE5DR9lr2f6fPT2F1Uzm0jI/jjJdFuL2lQUYuInLJDpVU8sjSD5PW7iOrQgnm/HsGg7m089vlU1CIidWStZXnaXh5YmEZhWTW/HRXFXaOiaBIY4NHPq6IWEamD/cUV3L8wjY/S9xHXJYS3Jw4lJqxVvXxuFbWIyAlYa/lwXR6zlmRQWeNi6mV9+OXZkQQG1N+D5lTUIiI/IbegjKnJqXyx5QBDItsy++o4eoS2qPccKmoRkR+pdVne+mo7T32UTUAjw6xx/fjFkHAaNTKO5FFRi4gcY/O+EqYkpbB+ZyHnR4fy2Pg4wlo3czSTilpEBKiudfHyZ1t5/pMtNG8SwLPXnsVVZ4VhjDNn0cdSUYuI30vNK+K+eRvJ2lvC2P5hPDg2hvYtmjgd6z9U1CLityqqa/nzqk3M/VcOoS2bMPfmBC6O6eh0rP+hohYRv/RNzkESk1LYfrCM64d0I/GyvoQ0a+x0rONSUYuIXympqGb28izeXbOT8LbBvHf7UEZEtXc61gmpqEXEb3yatZ9p81PZV1zB7WdHcu8lvT0youRu3p9QROQMFZRW8fDidBZs2E3vji148YYRDAj33IiSu6moRcRnWWtZnLKHmYvSKamo5u4Le3HnBVEEBTas50xRUYuIT9pbVMGMBWmsytxH/64hPHHNUPp0qp8RJXdTUYuIT7HW8v53uTy2NJNql4vpl/dl4tmRBDh0+7c71KmojTG/B24HLJAK3GatrfBkMBGRU7XjYCmJSal8nXOQYT3aMvvqeCLaN3c61hk7aVEbY7oAvwNirLXlxpgPgOuANz2cTUSkTmpdlje+3MbTK7Np3KgRj18dx3WDu3nF7d/uUNdLH4FAM2NMNRAM7PZcJBGRusveW8LkpBQ25hZyUd8OzBoXR6eQpk7HcquTFrW1dpcx5mlgJ1AOrLTWrvzx+xljJgGTAMLDw92dU0TkB6pqXLz42RZe+HQLLZs25i/XD2BsfGefOYs+Vl0ufbQBrg
IigULgQ2PMjdbad459P2vtHGAOQEJCgnV/VBGRIzbkFjJlXgrZ+0q46qwwHhwbS9vmQU7H8pi6XPq4CNhmrc0HMMYkAyOAd074t0RE3Ky8qpZnPs7mtS+20aFlU167JYEL+3rfiJK71aWodwLDjDHBHLn0cSGw1qOpRER+5KutB0hMSmVnQRk3DA1nymV9aNXUO0eU3K0u16jXGGPmAeuBGuB7jl7iEBHxtOKKah5flsXfv91JRLtg3p80jGE92jkdq17V6VEf1toHgQc9nEVE5AdWZexj+oJU8ksquePcHtxzUW+aBQU4Have6c5EEfE6Bw9XMnNxBos37qZPp5bMvTmB+K6tnY7lGBW1iHgNay2LNu5m5qJ0DlfWcO/Fvfn1eT0b3IiSu6moRcQr7C4sZ8aCND7J2s9Z3Vrz5DXx9O7Y0ulYXkFFLSKOcrksf/9uJ48vy6LWZbn/ihhuHRHRoEeU3E1FLSKO2XaglMSkFNZsK2BkVDseHx9PeLtgp2N5HRW1iNS7mloXr3+5jT+t3ERQYCOemBDHzxN8Z0TJ3VTUIlKvMvcUMyUphZS8Ii6O6ciscf3o2Mq3RpTcTUUtIvWisqaWFz7ZwoufbaV1cGNe+MVALo/rpLPoOlBRi4jHrd95iCnzUti8/zBXD+jC/VfE0MaHR5TcTUUtIh5TVlXD0x9t4o2vttG5VVPeuG0wF0R3cDpWg6OiFhGP+HLLARKTU8gtKOemYd2ZPDqaln4youRuKmoRcaui8moeW5rJP9bmEtm+Of+YNIyhfjai5G4qahFxm5Xpe5mxII2DpVX8+rye3HNRL5o29r8RJXdTUYvIGcsvqWTm4nSWpuyhb+dWvHbLYOK6hjgdy2eoqEXktFlrWbBhFw8tzqCsspb7Lo1m0rk9aBzg3yNK7qaiFpHTsquwnOnzU/ksO5+B4UdGlKI6aETJE1TUInJKXC7Lu2t2MHt5Fi4LD46N4ebhGlHyJBW1iNRZTv5hEpNS+XZ7Aef0as9j4+Po1lYjSp6mohaRk6qpdTF39Tb+vGoTTQMb8dQ18VwzqKtu/64nKmoROaH03UVMSUohbVcxo2M78fC4WDq01IhSfVJRi8hxVVTX8vwnm3n58xzaBAfx0g0DuSyus9Ox/JKKWkT+x7odBUyel8LW/FImDOzK/Vf0pXWwRpScoqIWkf8orazhqY+yeevr7YSFNOOtiUM4r3eo07H8nopaRAD416Z8piansruonFuGR3DfpdE0b6KK8Ab6VxDxc4VlVcxamsm8dXn0CG3Oh3cMJyGirdOx5BgqahE/tjx1D/cvTOdQWRV3XtCT347SiJI3UlGL+KH9JRU8uDCd5Wl7iQ1rxVsTBxMbphElb6WiFvEj1lrmrctj1tJMyqtrmTK6D7efE6kRJS+nohbxE7kFZUybn8rqzQcYHNGG2RPi6RnawulYUgcqahEf53JZ3v56O09+lI0BHrkqlhuGdqeRRpQaDBW1iA/bsr+EKUmprNtxiPN6h/Lo+H50baMRpYZGRS3ig6prXcz5Vw7PrdpMcJMAnvl5f8YP6KIRpQZKRS3iY9J2FXHfvBQy9xQzJr4zM8fGEtqyidOx5AyoqEV8REV1Lc+u2szc1Tm0bR7EKzcN4tLYTk7HEjdQUYv4gG+3FZCYlELOgVKuTejGtMv7EhLc2OlY4iZ1KmpjTGvgVaAfYIGJ1tqvPZhLROqgpKKaJ1dk87dvdtCtbTPe+eVQzu7V3ulY4mZ1PaN+Dlhhrb3GGBME6NfGIg77NHs/05NT2VNcwcSRkfzx0t4EB+mHZF900n9VY0wIcC5wK4C1tgqo8mwsEfkph0qreGRJBsnf76JXhxYk/WYEA8PbOB1LPKgu//uNBPKBN4wx/YF1wN3W2tJj38kYMwmYBBAeHu7unCJ+z1rL0tQ9PLgwnaLyan43Koo7R0XRJFAjSr6uLjf4BwIDgZestQOAUiDxx+9krZ1jrU2w1iaEhmpoXMSd9hVXcM
ff1nHXe9/TpU0zFv/2bO69JFol7SfqckadB+RZa9ccfXkexylqEXE/ay0frM1l1tJMqmpcTLu8DxNHRhKoESW/ctKittbuNcbkGmOirbXZwIVAhuejifi3nQfLmDo/hS+3HGRoZFuemBBPRPvmTscSB9T1V8S/Bd49+oiPHOA2z0US8W+1LsubX23n6Y+yCWhkeHR8P64fHK4RJT9Wp6K21m4AEjwbRUQ27Sth8rwUNuQWMqpPBx4d34/OIc2cjiUO04MuRbxAVY2Llz/fyvOfbKZFk0Ceu+4sruwfphElAVTUIo7bmFvIlKQUsvaWMLZ/GDPHxtCuhUaU5L9U1CIOKa+q5dlVm5i7OofQlk2Ye3MCF8d0dDqWeCEVtYgDvsk5SGJSCtsPlnH9kHCmXt6HVk01oiTHp6IWqUclFdXMXp7Fu2t20r1dMO/9aigjempESU5MRS1STz7J2sf0+WnsK67gV+dEcu/F0TQL0p2FcnIqahEPO3i4koeXZLBww26iO7bkpRsHcVa31k7HkgZERS3iIdZaFqfsYeaidEoqqrnnol783/lRBAXq9m85NSpqEQ/YW1TBjAWprMrcT/9urXlyQjzRnVo6HUsaKBW1iBtZa3n/u1weW5pJtcvFjDF9uW1kJAG6/VvOgIpaxE12HCwlMSmVr3MOMrxHO2ZPiKN7O40oyZlTUYucoVqX5Y0vt/H0ymwaN2rE41fHcd3gbrr9W9xGRS1yBrL3ljA5KYWNuYVc1LcDs8bF0SmkqdOxxMeoqEVOQ1WNixc+3cKLn22hVdPGPH/9AK6I76yzaPEIFbXIKdqQW8jkeRvZtO8w484K44GxsbRtHuR0LPFhKmqROiqvquVPK7N5/cttdGzVlNdvTWBUH40oieepqEXq4KutB0hMSmVnQRm/GBrO1Mv60FIjSlJPVNQiJ1BcUc3jyzL5+7e5RLQL5v1JwxjWo53TscTPqKhFfsKqjH1MX5BKfkkld5zbg3su6q0RJXGEilrkRw4cruShxRks3ribPp1aMvfmBOK7tnY6lvgxFbXIUdZaFm7YzUOL0ymtrOUPF/fmjvN6akRJHKeiFgF2F5YzY0Ean2TtZ0D4kRGlXh01oiTeQUUtfs3lsrz37U5mL8+i1mV54IoYbhkRoREl8SoqavFb2w6UkpiUwpptBZwd1Z7Hr46jW9tgp2OJ/A8VtfidmloXr32xjWc+3kRQYCOenBDPzxK66vZv8VoqavErGbuLmZKUQuquIi6J6cgj4/rRsZVGlMS7qajFL1TW1PLXT7bw0mdbaR3cmBd+MZDL4zrpLFoaBBW1+Lx1Ow4xJSmFLfsPc/XALtw/JoY2GlGSBkRFLT6rrKqGpz7K5s2vttO5VVPeuG0wF0R3cDqWyClTUYtP+mLzARKTU8g7VM7Nw7szeXQfWjTRl7s0TPrKFZ9SVFbNo8sy+GBtHj3aN+eDO4YzJLKt07FEzoiKWnzGirS93L8wjYLSKn5zfk/uvrAXTRtrREkaPhW1NHj5JZXMXJTO0tQ9xHRuxRu3DqZflxCnY4m4jYpaGixrLcnrd/HwkgzKq2q579JoJp3bg8YBGlES36KilgZpV2E505JT+XxTPoO6t+GJCfFEdWjhdCwRj6hzURtjAoC1wC5r7RWeiyTy01wuyztrdvDE8iws8NCVsdw0rDuNNKIkPuxUzqjvBjKBVh7KInJCW/MPk5iUwnfbD3FOr/Y8Nl4jSuIf6lTUxpiuwBjgUeBejyYS+ZHqWhdzV+fw7KrNNGscwNM/68+EgV10+7f4jbqeUT8LTAZ+ckndGDMJmAQQHh5+xsFEANJ2FTElKYX03cWMju3Ew+Ni6dBSI0riX05a1MaYK4D91tp1xpjzf+r9rLVzgDkACQkJ1l0BxT9VVNfy/CebefnzHNoEB/HSDQO5LK6z07FEHFGXM+qRwJXGmMuBpkArY8w71tobPRtN/NXa7QVMTkohJ7+UawZ1ZcaYvrQO1oiS+K+TFr
W1diowFeDoGfUfVdLiCYcra3hqRRZvf7ODsJBmvD1xCOf2DnU6lojj9Dhq8Qqfb8pnWnIqu4vKuWV4BPddGk1zjSiJAKdY1Nbaz4DPPJJE/FJhWRWPLMkkaX0ePUOb8+Edw0mI0IiSyLF0yiKOWZ66h/sXpnOorIq7LojirlFRGlESOQ4VtdS7/cUVPLAwnRXpe4kNa8VbEwcTG6YRJZGfoqKWemOt5cN1ecxakkFFjYspo/vwq3MiCdSIksgJqailXuQWlDFtfiqrNx9gcEQbZk+Ip2eoRpRE6kJFLR5V67K8/fV2nvooGwM8clUsNwzViJLIqVBRi8ds2V/ClKRU1u04xHm9Q3ns6ji6tG7mdCyRBkdFLW5XXevilc+38pd/biG4SQDP/Lw/4wdoREnkdKmoxa1S84q4b95GsvaWMCa+MzPHxhLasonTsUQaNBW1uEVFdS3PrtrM3NU5tG0exCs3DeLS2E5OxxLxCSpqOWNrcg6SmJzKtgOlXJvQjWmX9yUkuLHTsUR8hopaTltJRTVPrsjmb9/soGubZrzzy6Gc3au907FEfI6KWk7Lp9n7mZ6cyp7iCiaOjOSPl/YmOEhfTiKeoO8sOSUFpVU8siSD+d/vIqpDC+b9egSDurdxOpaIT1NRS51Ya1mauocHF6ZTVF7N70ZFceeoKJoEakRJxNNU1HJS+4ormLEgjY8z9hHXJYR3bh9K3856MnqR+qKilp9kreWDtbnMWppJVY2LqZf14Zdna0RJpL6pqOW4dh4sIzE5ha+2HmRIZFuemBBPZPvmTscS8UsqavmBWpflza+28/RH2QQ0Mswa149fDAnXiJKIg1TU8h+b9pUweV4KG3ILuSA6lEfHxxGmESURx6mohaoaFy99tpW/frqZFk0Cee66s7iyf5hGlES8hIraz23MLWRKUgpZe0sY2z+MmWNjaNdCI0oi3kRF7afKq2r586pNvLo6h9CWTZh7cwIXx3R0OpaIHIeK2g99vfUgU5NT2H6wjOuHdGPq5X1p1VQjSiLeSkXtR4orqpm9PIv31uwkvG0w790+lBFRGlES8XYqaj/xz8x9TJ+fxv6SCn51TiT3XhxNsyDd/i3SEKiofdzBw5U8tDiDRRt3E92xJS/fNIizurV2OpaInAIVtY+y1rJo424eWpxBSUU191zUi/87P4qgQN3+LdLQqKh90J6icmbMT+OfWfvp3601T06IJ7pTS6djichpUlH7EJfL8v53uTy+LJNql4sZY/py28hIAnT7t0iDpqL2EdsPlJKYnMI3OQUM79GO2RPi6N5OI0oivkBF3cDV1Lp4/ctt/GnlJoICGjH76jiuHdxNt3+L+BAVdQOWtbeYKfNS2JhXxEV9OzBrXBydQpo6HUtE3ExF3QBV1tTywqdbefHTLYQ0a8zz1w/givjOOosW8VEq6gbm+52HmJKUwqZ9hxl3VhgPjI2lbfMgp2OJiAepqBuIsqoa/rRyE69/uY1OrZry+q0JjOqjESURf3DSojbGdAPeBjoCFphjrX3O08Hkv77acoDE5FR2FpRx47BwpozuQ0uNKIn4jbqcUdcAf7DWrjfGtATWGWM+ttZmeDib3ysqr+bxZZm8/10uEe2CeX/SMIb1aOd0LBGpZyctamvtHmDP0T+XGGMygS6AitqDPs7Yx4wFqeSXVHLHeT34/UW9adpYI0oi/uiUrlEbYyKAAcCa47xtEjAJIDw83B3Z/NKBw5XMXJTOkpQ99OnUkrk3JxDftbXTsUTEQXUuamNMCyAJuMdaW/zjt1tr5wBzABISEqzbEvoJay0LNuziocUZlFXW8oeLe3PHeT01oiQidStqY0xjjpT0u9baZM9G8j+7C8uZPj+VT7PzGRB+ZESpV0eNKInIEXV51IcBXgMyrbXPeD6S/3C5LO9+u5MnlmdR67I8cEUMt4yI0IiSiPxAXc6oRwI3AanGmA1HXzfNWrvMY6n8QE7+YRKTUvl2ewFnR7Xn8avj6N
Y22OlYIuKF6vKojy8AneK5SU2ti1e/2MafP95EUGAjnpwQz88Suur2bxH5SbozsR5l7C5mctJG0nYVc0lMRx4Z14+OrTSiJCInpqKuB5U1tfz1ky289NlWWgc35sUbBnJZv046ixaROlFRe9i6HUdGlLbsP8zVA7tw/5gY2mhESUROgYraQ0ora3h6ZTZvfrWdsJBmvHnbYM6P7uB0LBFpgFTUHrB6cz5Tk1PJO1TOzcO7M3l0H1o00aEWkdOj9nCjorJqHl2WwQdr8+jRvjkf3DGcIZFtnY4lIg2citpNVqTt5f6FaRSUVvGb83ty94W9NKIkIm6hoj5D+0sqmLkonWWpe4np3Io3bh1Mvy4hTscSER+ioj5N1lqS1+/i4SUZlFfXct+l0Uw6tweNAzSiJCLupaI+DXmHypg2P41/bcpnUPc2PDEhnqgOLZyOJSI+SkV9ClwuyztrdvDE8iws8NCVsdw0rDuNNKIkIh6koq6jrfmHSUxK4bvthzinV3seG68RJRGpHyrqk6iudTF3dQ7PrtpMs8YBPP2z/kwY2EW3f4tIvVFRn0DariKmJKWQvruYy+M6MfPKWDq01IiSiNQvFfVxVFTX8pd/buaVf+XQJjiIl28cyOh+nZ2OJSJ+SkX9I2u3FzA5KYWc/FJ+NqgrM8bEEBLc2OlYIuLHVNRHHa6s4akVWbz9zQ7CQprx9sQhnNs71OlYIiIqaoDPN+UzLTmV3UXl3DI8gvsujaa5RpRExEv4dRsVllXxyJJMktbn0TO0OfN+PZxB3TWiJCLexW+LelnqHh5YmEZhWTV3XRDFXaOiNKIkIl7J74p6f3EFDyxMZ0X6Xvp1acVbE4cQG6YRJRHxXn5T1NZaPlyXx6wlGVTUuJgyug+/OieSQI0oiYiX84uizi0oY9r8VFZvPsCQiLbMnhBHj1CNKIlIw+DTRV3rsrz99Xae+igbAzxyVSw3DNWIkog0LD5b1Fv2lzB5XgrrdxZyfnQoj46Po0vrZk7HEhE5ZT5X1NW1Ll75fCt/+ecWgpsE8Odr+zPuLI0oiUjD5VNFnZpXxH3zNpK1t4Qx8Z156MpY2rdo4nQsEZEz4hNFXVFdy7OrNjN3dQ7tmgfxyk2DuDS2k9OxRETcosEX9ZqcgyQmp7LtQCnXJnRj2pi+hDTTiJKI+I4GW9QlFdU8sSKLd77ZSbe2zXj39qGMjGrvdCwREbdrkEX9adZ+ps9PZU9xBb88O5I/XNKb4KAG+Z8iInJSDardCkqreGRJBvO/30WvDi1I+s0IBoa3cTqWiIhHNYiittayJGUPMxelU1Reze8u7MWdF/SkSaBGlETE93l9Ue8rrmD6/DRWZe4jvmsI79w+lL6dWzkdS0Sk3nhtUVtr+cd3uTy6LJOqGhfTLu/DxJEaURIR/1OnojbGjAaeAwKAV621sz0ZaufBMhKTU/hq60GGRrbliQnxRLRv7slPKSLitU5a1MaYAOAF4GIgD/jOGLPIWpvh7jC1LssbX27j6ZXZBDZqxKPj+3H94HCNKImIX6vLGfUQYIu1NgfAGPM+cBXg1qIuKqvmlje+ZUNuIaP6dODR8f3oHKIRJRGRuhR1FyD3mJfzgKE/fidjzCRgEkB4ePgpB2nVLJDu7YK5bWQEV/YP04iSiMhRbvtlorV2DjAHICEhwZ7q3zfG8Nx1A9wVR0TEZ9TlIRS7gG7HvNz16OtERKQe1KWovwN6GWMijTFBwHXAIs/GEhGRfzvppQ9rbY0x5i7gI448PO91a226x5OJiAhQx2vU1tplwDIPZxERkePQbX4iIl5ORS0i4uVU1CIiXk5FLSLi5Yy1p3xvysk/qDH5wI7T/OvtgQNujNOQ6Vj8kI7HD+l4/JcvHIvu1trQ473BI0V9Jowxa621CU7n8AY6Fj+k4/FDOh7/5evHQpc+RES8nIpaRMTLeWNRz3E6gBfRsfghHY8f0vH4L58+Fl53jVpERH7IG8+oRUTkGCpqEREv5z
VFbYwZbYzJNsZsMcYkOp3HScaYbsaYT40xGcaYdGPM3U5ncpoxJsAY870xZonTWZxmjGltjJlnjMkyxmQaY4Y7nclJxpjfH/0+STPG/N0Y09TpTO7mFUV9zBPoXgbEANcbY2KcTeWoGuAP1toYYBhwp58fD4C7gUynQ3iJ54AV1to+QH/8+LgYY7oAvwMSrLX9ODLFfJ2zqdzPK4qaY55A11pbBfz7CXT9krV2j7V2/dE/l3DkG7GLs6mcY4zpCowBXnU6i9OMMSHAucBrANbaKmttoaOhnBcINDPGBALBwG6H87idtxT18Z5A12+L6VjGmAhgALDG4ShOehaYDLgczuENIoF84I2jl4JeNcY0dzqUU6y1u4CngZ3AHqDIWrvS2VTu5y1FLcdhjGkBJAH3WGuLnc7jBGPMFcB+a+06p7N4iUBgIPCStXYAUAr47e90jDFtOPLTdyQQBjQ3xtzobCr385ai1hPo/ogxpjFHSvpda22y03kcNBK40hiznSOXxEYZY95xNpKj8oA8a+2/f8Kax5Hi9lcXAdustfnW2mogGRjhcCa385ai1hPoHsMYYzhyDTLTWvuM03mcZK2daq3taq2N4MjXxSfWWp87Y6ora+1eINcYE330VRcCGQ5GctpOYJgxJvjo982F+OAvV+v0nImepifQ/R8jgZuAVGPMhqOvm3b0uStFfgu8e/SkJge4zeE8jrHWrjHGzAPWc+TRUt/jg7eT6xZyEREv5y2XPkRE5CeoqEVEvJyKWkTEy6moRUS8nIpaRMTLqahFRLycilpExMv9P50DX6ms46IMAAAAAElFTkSuQmCC\n", 548 | "text/plain": [ 549 | "
" 550 | ] 551 | }, 552 | "metadata": { 553 | "needs_background": "light" 554 | }, 555 | "output_type": "display_data" 556 | } 557 | ], 558 | "source": [ 559 | "plt.plot(range(10)); # notice it doesn't print []" 560 | ] 561 | }, 562 | { 563 | "cell_type": "code", 564 | "execution_count": 18, 565 | "metadata": {}, 566 | "outputs": [ 567 | { 568 | "data": { 569 | "text/plain": [ 570 | "'some documentation here at the end of the cell'" 571 | ] 572 | }, 573 | "execution_count": 18, 574 | "metadata": {}, 575 | "output_type": "execute_result" 576 | } 577 | ], 578 | "source": [ 579 | "\"some documentation here at the end of the cell\"" 580 | ] 581 | }, 582 | { 583 | "cell_type": "code", 584 | "execution_count": 19, 585 | "metadata": {}, 586 | "outputs": [], 587 | "source": [ 588 | "\"some documentation here at the end of the cell\"; # notice it doesn't print" 589 | ] 590 | } 591 | ], 592 | "metadata": { 593 | "kernelspec": { 594 | "display_name": "Python 3", 595 | "language": "python", 596 | "name": "python3" 597 | }, 598 | "language_info": { 599 | "codemirror_mode": { 600 | "name": "ipython", 601 | "version": 3 602 | }, 603 | "file_extension": ".py", 604 | "mimetype": "text/x-python", 605 | "name": "python", 606 | "nbconvert_exporter": "python", 607 | "pygments_lexer": "ipython3", 608 | "version": "3.8.3" 609 | } 610 | }, 611 | "nbformat": 4, 612 | "nbformat_minor": 2 613 | } 614 | -------------------------------------------------------------------------------- /2_Procedural_Python.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "# Procedural Programming (Really Control Flow)\n", 8 | "#### Boring, Academic Definition (Nobody Cares Except Professors and Pedantic, Technical Interviewers...):\n", 9 | "Procedural Programming can be defined as a programming model (derived from structured programming), which focuses on calling procedures. 
**Procedures**, also known as routines, subroutines or **functions**, simply consist of a series of computational steps to be carried out.\n", 10 | "#### Reality:\n", 11 | "Simple, fairly boring code that just works well: tell the computer what to do step-by-step with control flow using conditions (`if/elif/else`), loops/iteration (`for/while/else`), \"jumping\" (`continue/break/return/pass`), and exception handling (`try/except/else/finally`). This notebook will focus on mastering **control flow** inside and outside a function. Don't be like Bruce Lee, be an Airbender: don't be like water, bend the (control) flow to your will." 12 | ] 13 | }, 14 | { 15 | "cell_type": "code", 16 | "execution_count": null, 17 | "metadata": {}, 18 | "outputs": [], 19 | "source": [] 20 | }, 21 | { 22 | "cell_type": "markdown", 23 | "metadata": {}, 24 | "source": [ 25 | "## I want to EXPRESS my STATEMENTS clearly\n", 26 | "Know the difference between an expression and a statement.\n", 27 | "* expression: evaluates to something or returns a value (and thus can be assigned to a variable): `1 + 1`, `range(10)`, all function calls: `sum(range(10))`, `\" \".join([\"Python\", \"is\", \"the\", \"best\"])`\n", 28 | "* statements: do something but are not always assignable: `if`, `try/except`, `pass`, `break`, `continue`, `del`, `return`, `def func`, `class SomeClass`\n", 29 | "* Technically speaking, all expressions are statements. Here's a trick: if something can be assigned to a variable, then it is an expression (and thus a statement). If something cannot be assigned, then it is a statement but not an expression."
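The assignability trick can be checked directly: `compile()` in `"eval"` mode only accepts expressions, so it doubles as an expression tester. A minimal sketch (not from the notebook; names are illustrative):

```python
# Expressions produce a value, so they can be assigned to a variable.
total = sum(range(10))                                # function call -> expression
greeting = " ".join(["Python", "is", "the", "best"])  # also an expression

# compile(..., "eval") only accepts expressions.
compile("1 + 1", "<expr>", "eval")                    # fine: expression

try:
    compile("pass", "<stmt>", "eval")                 # `pass` is a statement
except SyntaxError:
    print("`pass` is a statement, not an expression")
```

Anything that fails to compile in `"eval"` mode is a statement only, which matches the trick above.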
30 | ] 31 | }, 32 | { 33 | "cell_type": "code", 34 | "execution_count": 1, 35 | "metadata": {}, 36 | "outputs": [ 37 | { 38 | "name": "stdout", 39 | "output_type": "stream", 40 | "text": [ 41 | "3\n" 42 | ] 43 | } 44 | ], 45 | "source": [ 46 | "# Expressions/functions can take parentheses to stretch the body across multiple lines\n", 47 | "print(\n", 48 | "    1 \n", 49 | "    + \n", 50 | "    2)" 51 | ] 52 | }, 53 | { 54 | "cell_type": "code", 55 | "execution_count": 2, 56 | "metadata": {}, 57 | "outputs": [ 58 | { 59 | "name": "stdout", 60 | "output_type": "stream", 61 | "text": [ 62 | "Hi\n", 63 | "None Thanks\n", 64 | "None Bye\n" 65 | ] 66 | } 67 | ], 68 | "source": [ 69 | "# In Python 3, print() is a function. In Python 2, print is a statement.\n", 70 | "print(print(print(\"Hi\"), \"Thanks\"), \"Bye\") # this line would NOT run in Python 2" 71 | ] 72 | }, 73 | { 74 | "cell_type": "code", 75 | "execution_count": 3, 76 | "metadata": {}, 77 | "outputs": [], 78 | "source": [ 79 | "# statements do not need parentheses \n", 80 | "def func():\n", 81 | "    return 1 # return is a statement, not a function. 
Don't put return(1) <- R people do this" 82 | ] 83 | }, 84 | { 85 | "cell_type": "code", 86 | "execution_count": 4, 87 | "metadata": {}, 88 | "outputs": [ 89 | { 90 | "ename": "AssertionError", 91 | "evalue": "This is False", 92 | "output_type": "error", 93 | "traceback": [ 94 | "\u001b[1;31m---------------------------------------------------------------------------\u001b[0m", 95 | "\u001b[1;31mAssertionError\u001b[0m Traceback (most recent call last)", 96 | "\u001b[1;32m\u001b[0m in \u001b[0;36m\u001b[1;34m()\u001b[0m\n\u001b[1;32m----> 1\u001b[1;33m \u001b[1;32massert\u001b[0m \u001b[1;32mFalse\u001b[0m\u001b[1;33m,\u001b[0m \u001b[1;34m\"This is False\"\u001b[0m \u001b[1;31m# assert is statement\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m", 97 | "\u001b[1;31mAssertionError\u001b[0m: This is False" 98 | ] 99 | } 100 | ], 101 | "source": [ 102 | "assert False, \"This is False\" # assert is statement" 103 | ] 104 | }, 105 | { 106 | "cell_type": "code", 107 | "execution_count": 5, 108 | "metadata": {}, 109 | "outputs": [ 110 | { 111 | "name": "stderr", 112 | "output_type": "stream", 113 | "text": [ 114 | ":1: SyntaxWarning: assertion is always true, perhaps remove parentheses?\n", 115 | " assert(False, \"This is False\") # what happened?\n" 116 | ] 117 | } 118 | ], 119 | "source": [ 120 | "assert(False, \"This is False\") # what happened?" 121 | ] 122 | }, 123 | { 124 | "cell_type": "code", 125 | "execution_count": null, 126 | "metadata": {}, 127 | "outputs": [], 128 | "source": [] 129 | }, 130 | { 131 | "cell_type": "markdown", 132 | "metadata": {}, 133 | "source": [ 134 | "## `pass` my turn\n", 135 | "`pass` is a no-op (which stands for \"no operation\") and is used in two ways:\n", 136 | "* tell the computer to do nothing\n", 137 | "* temporary placeholder to fill with actual code later.\n", 138 | "\n", 139 | "`pass` is a statement--it cannot be saved. If you like to be *cute/special/different*, you can use ... 
(which is called Ellipsis) instead of pass--but nobody does this." 140 | ] 141 | }, 142 | { 143 | "cell_type": "code", 144 | "execution_count": 1, 145 | "metadata": {}, 146 | "outputs": [], 147 | "source": [ 148 | "try:\n", 149 | "    1 / 0\n", 150 | "except:\n", 151 | "    pass # keep going no matter what. Don't raise a fatal exception" 152 | ] 153 | }, 154 | { 155 | "cell_type": "code", 156 | "execution_count": 2, 157 | "metadata": {}, 158 | "outputs": [], 159 | "source": [ 160 | "def foo():\n", 161 | "    pass # fill in later" 162 | ] 163 | }, 164 | { 165 | "cell_type": "code", 166 | "execution_count": 3, 167 | "metadata": {}, 168 | "outputs": [ 169 | { 170 | "ename": "SyntaxError", 171 | "evalue": "invalid syntax (, line 1)", 172 | "output_type": "error", 173 | "traceback": [ 174 | "\u001b[1;36m File \u001b[1;32m\"\"\u001b[1;36m, line \u001b[1;32m1\u001b[0m\n\u001b[1;33m x = pass\u001b[0m\n\u001b[1;37m ^\u001b[0m\n\u001b[1;31mSyntaxError\u001b[0m\u001b[1;31m:\u001b[0m invalid syntax\n" 175 | ] 176 | } 177 | ], 178 | "source": [ 179 | "x = pass" 180 | ] 181 | }, 182 | { 183 | "cell_type": "code", 184 | "execution_count": 4, 185 | "metadata": {}, 186 | "outputs": [], 187 | "source": [ 188 | "# using Ellipsis instead of pass\n", 189 | "try:\n", 190 | "    1 / 0\n", 191 | "except:\n", 192 | "    ... # keep going no matter what. Don't raise a fatal exception\n", 193 | "\n", 194 | "def foo():\n", 195 | "    ... # fill in later" 196 | ] 197 | }, 198 | { 199 | "cell_type": "code", 200 | "execution_count": null, 201 | "metadata": {}, 202 | "outputs": [], 203 | "source": [] 204 | }, 205 | { 206 | "cell_type": "markdown", 207 | "metadata": {}, 208 | "source": [ 209 | "## Do you want to take a `break` or `continue`?\n", 210 | "What is the difference between break and continue? 
They both exist inside loops (`for` and `while`).\n", 211 | "* `break`: when you want to stop the loop immediately.\n", 212 | "* `continue`: skip the rest of the body of the loop and go immediately back to the top of the loop with the next iteration." 213 | ] 214 | }, 215 | { 216 | "cell_type": "code", 217 | "execution_count": 1, 218 | "metadata": {}, 219 | "outputs": [ 220 | { 221 | "data": { 222 | "text/plain": [ 223 | "[0, 1, 2]" 224 | ] 225 | }, 226 | "execution_count": 1, 227 | "metadata": {}, 228 | "output_type": "execute_result" 229 | } 230 | ], 231 | "source": [ 232 | "accumulator = []\n", 233 | "for i in range(5):\n", 234 | " if i == 3:\n", 235 | " break # immediately \"break\" out of the loop\n", 236 | " accumulator.append(i)\n", 237 | "accumulator" 238 | ] 239 | }, 240 | { 241 | "cell_type": "code", 242 | "execution_count": 2, 243 | "metadata": {}, 244 | "outputs": [ 245 | { 246 | "data": { 247 | "text/plain": [ 248 | "[0, 1, 2, 4]" 249 | ] 250 | }, 251 | "execution_count": 2, 252 | "metadata": {}, 253 | "output_type": "execute_result" 254 | } 255 | ], 256 | "source": [ 257 | "accumulator = []\n", 258 | "for i in range(5):\n", 259 | " if i == 3:\n", 260 | " continue # go back to the top of the loop, you can think of `continue` as `skip`\n", 261 | " accumulator.append(i)\n", 262 | "accumulator" 263 | ] 264 | }, 265 | { 266 | "cell_type": "code", 267 | "execution_count": null, 268 | "metadata": {}, 269 | "outputs": [], 270 | "source": [] 271 | }, 272 | { 273 | "cell_type": "markdown", 274 | "metadata": {}, 275 | "source": [ 276 | "## The power of `else`!\n", 277 | "What `else` is cool?\n", 278 | "* `if/else`\n", 279 | "* `for/else`\n", 280 | "* `while/else`\n", 281 | "* `try/except/else/finally`\n", 282 | "\n", 283 | "#### `if/elif/else` construct\n", 284 | "This is the most common type of control flow and is used for branching. `if` a certain condition is true, do this. `elif` (which stands for \"else if\") a different condition is true, do that. 
If none of the previous conditions are true, then do `else`. You may use multiple (0, 1, 2, ...) `elif` clauses but at most 1 `else` clause.\n", 285 | "\n", 286 | "Be careful about using `if/if/if/...` vs `if/elif/elif/...`, especially for non-mutually exclusive conditions. In the 1st case, multiple clauses can be triggered. In the 2nd case, at most 1 of the clauses will be triggered. Also, in the 2nd case, order of the conditions matters since the first condition triggered will have its body executed. `if/else` has EXACTLY 1 condition that will be triggered. This is also true for `if/elif/elif/.../else`.\n", 287 | "\n", 288 | "Fizz Buzz test: If the number is a multiple of 3, print \"Fizz\". If the number is a multiple of 5, print \"Buzz\". If the number is both a multiple of 3 and 5, print \"FizzBuzz\". Otherwise, just print the original number." 289 | ] 290 | }, 291 | { 292 | "cell_type": "code", 293 | "execution_count": 1, 294 | "metadata": {}, 295 | "outputs": [], 296 | "source": [ 297 | "def fizzbuzz_wrong1(num): # multiple conditions can be triggered\n", 298 | "    if num % 3 == 0:\n", 299 | "        print(\"Fizz\")\n", 300 | "    if num % 5 == 0:\n", 301 | "        print(\"Buzz\")\n", 302 | "    if (num % 3 == 0) and (num % 5 == 0):\n", 303 | "        print(\"FizzBuzz\")\n", 304 | "    else:\n", 305 | "        print(num)\n", 306 | "\n", 307 | "def fizzbuzz_wrong2(num): # EXACTLY 1 condition can be triggered, but the FizzBuzz branch is unreachable\n", 308 | "    if num % 3 == 0:\n", 309 | "        print(\"Fizz\")\n", 310 | "    elif num % 5 == 0:\n", 311 | "        print(\"Buzz\")\n", 312 | "    elif (num % 3 == 0) and (num % 5 == 0):\n", 313 | "        print(\"FizzBuzz\")\n", 314 | "    else:\n", 315 | "        print(num)\n", 316 | "\n", 317 | "def fizzbuzz_correct(num): # EXACTLY 1 condition can be triggered\n", 318 | "    if (num % 3 == 0) and (num % 5 == 0):\n", 319 | "        print(\"FizzBuzz\")\n", 320 | "    elif num % 3 == 0:\n", 321 | "        print(\"Fizz\")\n", 322 | "    elif num % 5 == 0:\n", 323 | "        print(\"Buzz\")\n", 324 | "    else:\n", 325 | "        print(num)" 326 | ] 327 | }, 328 | { 329 | 
"cell_type": "code", 330 | "execution_count": 2, 331 | "metadata": {}, 332 | "outputs": [ 333 | { 334 | "name": "stdout", 335 | "output_type": "stream", 336 | "text": [ 337 | "2\n", 338 | "2\n", 339 | "2\n" 340 | ] 341 | } 342 | ], 343 | "source": [ 344 | "# so far, all look okay\n", 345 | "fizzbuzz_wrong1(2)\n", 346 | "fizzbuzz_wrong2(2)\n", 347 | "fizzbuzz_correct(2)" 348 | ] 349 | }, 350 | { 351 | "cell_type": "code", 352 | "execution_count": 3, 353 | "metadata": {}, 354 | "outputs": [ 355 | { 356 | "name": "stdout", 357 | "output_type": "stream", 358 | "text": [ 359 | "Fizz\n", 360 | "3\n" 361 | ] 362 | } 363 | ], 364 | "source": [ 365 | "fizzbuzz_wrong1(3) # definitely wrong" 366 | ] 367 | }, 368 | { 369 | "cell_type": "code", 370 | "execution_count": 4, 371 | "metadata": {}, 372 | "outputs": [ 373 | { 374 | "name": "stdout", 375 | "output_type": "stream", 376 | "text": [ 377 | "Fizz\n" 378 | ] 379 | } 380 | ], 381 | "source": [ 382 | "fizzbuzz_wrong2(3) # still looks okay" 383 | ] 384 | }, 385 | { 386 | "cell_type": "code", 387 | "execution_count": 5, 388 | "metadata": {}, 389 | "outputs": [ 390 | { 391 | "name": "stdout", 392 | "output_type": "stream", 393 | "text": [ 394 | "Fizz\n" 395 | ] 396 | } 397 | ], 398 | "source": [ 399 | "fizzbuzz_correct(3) # still looks okay" 400 | ] 401 | }, 402 | { 403 | "cell_type": "code", 404 | "execution_count": 6, 405 | "metadata": {}, 406 | "outputs": [ 407 | { 408 | "name": "stdout", 409 | "output_type": "stream", 410 | "text": [ 411 | "Fizz\n", 412 | "Buzz\n", 413 | "FizzBuzz\n" 414 | ] 415 | } 416 | ], 417 | "source": [ 418 | "fizzbuzz_wrong1(15) # faulty implementation: should have printed only \"FizzBuzz\"" 419 | ] 420 | }, 421 | { 422 | "cell_type": "code", 423 | "execution_count": 7, 424 | "metadata": {}, 425 | "outputs": [ 426 | { 427 | "name": "stdout", 428 | "output_type": "stream", 429 | "text": [ 430 | "Fizz\n" 431 | ] 432 | } 433 | ], 434 | "source": [ 435 | "fizzbuzz_wrong2(15) # faulty implementation: 
should have printed \"FizzBuzz\"" 436 | ] 437 | }, 438 | { 439 | "cell_type": "code", 440 | "execution_count": 8, 441 | "metadata": {}, 442 | "outputs": [ 443 | { 444 | "name": "stdout", 445 | "output_type": "stream", 446 | "text": [ 447 | "FizzBuzz\n" 448 | ] 449 | } 450 | ], 451 | "source": [ 452 | "fizzbuzz_correct(15) # correct implementation and order matters in if/elif" 453 | ] 454 | }, 455 | { 456 | "cell_type": "markdown", 457 | "metadata": {}, 458 | "source": [ 459 | "#### `for/else` construct\n", 460 | "Most people are aware of `for` loops in Python. Most Python programmers are not aware of `for/else`. The `else` clause executes only if the loop completes normally; this means that the loop did not encounter a `break` statement. `for/else` is useful when you want to determine whether a loop had:\n", 461 | "* an \"early\" exit due to `break`/`return` (because a certain condition was reached)\n", 462 | "* or a \"natural\" exit because all elements have been looped through.\n", 463 | "\n", 464 | "Instead of using `for/else`, you can use a flag-based approach. \n", 465 | "Fun fact: Raymond Hettinger (a core Python developer) said that, in hindsight, this use of `else` should have been called `nobreak`."
466 | ] 467 | }, 468 | { 469 | "cell_type": "code", 470 | "execution_count": 1, 471 | "metadata": {}, 472 | "outputs": [], 473 | "source": [ 474 | "# early break\n", 475 | "for i in range(10):\n", 476 | "    if i == 5:\n", 477 | "        break\n", 478 | "else: # basically nobreak\n", 479 | "    print(\"ALL ITERATIONS WERE LOOPED\") # not executed" 480 | ] 481 | }, 482 | { 483 | "cell_type": "code", 484 | "execution_count": 2, 485 | "metadata": {}, 486 | "outputs": [ 487 | { 488 | "name": "stdout", 489 | "output_type": "stream", 490 | "text": [ 491 | "ALL ITERATIONS WERE LOOPED\n" 492 | ] 493 | } 494 | ], 495 | "source": [ 496 | "# make sure all elements have been iterated through normally\n", 497 | "for i in range(10):\n", 498 | "    pass\n", 499 | "else:\n", 500 | "    print(\"ALL ITERATIONS WERE LOOPED\") # `else` is executed" 501 | ] 502 | }, 503 | { 504 | "cell_type": "code", 505 | "execution_count": 3, 506 | "metadata": {}, 507 | "outputs": [ 508 | { 509 | "name": "stdout", 510 | "output_type": "stream", 511 | "text": [ 512 | "NO EVEN NUMBER SEEN\n" 513 | ] 514 | } 515 | ], 516 | "source": [ 517 | "# instead of a for/else, the alternative is a flag-based approach to determine if a condition is reached\n", 518 | "even_number_seen_flag = None # this flag is set when a condition is reached\n", 519 | "for i in range(1, 10, 2):\n", 520 | "    if i % 2 == 0:\n", 521 | "        even_number_seen_flag = True\n", 522 | "        break\n", 523 | "if not even_number_seen_flag: # flag was never set, so execute body\n", 524 | "    print(\"NO EVEN NUMBER SEEN\")\n", 525 | "    # perhaps do an additional step" 526 | ] 527 | }, 528 | { 529 | "cell_type": "code", 530 | "execution_count": 4, 531 | "metadata": {}, 532 | "outputs": [ 533 | { 534 | "name": "stdout", 535 | "output_type": "stream", 536 | "text": [ 537 | "NO EVEN NUMBER SEEN\n" 538 | ] 539 | } 540 | ], 541 | "source": [ 542 | "# using for/else instead of flag-based approach\n", 543 | "for i in range(1, 10, 2):\n", 544 | "    if i % 2 == 0:\n", 545 | "        break\n", 546 | 
"else:\n", 547 | "    print(\"NO EVEN NUMBER SEEN\")\n", 548 | "    # perhaps do an additional step" 549 | ] 550 | }, 551 | { 552 | "cell_type": "code", 553 | "execution_count": 5, 554 | "metadata": {}, 555 | "outputs": [], 556 | "source": [ 557 | "# instead of a for/else, the alternative is a flag-based approach to determine if a condition is reached\n", 558 | "even_number_seen_flag = None # this flag is set when a condition is reached\n", 559 | "for i in range(1, 10):\n", 560 | "    if i % 2 == 0:\n", 561 | "        even_number_seen_flag = True\n", 562 | "        break\n", 563 | "if not even_number_seen_flag: # flag is set, so do not execute body\n", 564 | "    print(\"NO EVEN NUMBER SEEN\")\n", 565 | "    # perhaps do an additional step" 566 | ] 567 | }, 568 | { 569 | "cell_type": "code", 570 | "execution_count": 6, 571 | "metadata": {}, 572 | "outputs": [], 573 | "source": [ 574 | "# using for/else instead of flag-based approach\n", 575 | "for i in range(1, 10):\n", 576 | "    if i % 2 == 0:\n", 577 | "        break\n", 578 | "else:\n", 579 | "    print(\"NO EVEN NUMBER SEEN\")\n", 580 | "    # perhaps do an additional step" 581 | ] 582 | }, 583 | { 584 | "cell_type": "markdown", 585 | "metadata": {}, 586 | "source": [ 587 | "Basically using `else` in a `for` loop saves you 2 lines (defining a flag variable and setting a flag variable once the condition is reached) and replaces 1 line (checking the status of the flag variable after the loop).\n", 588 | "\n", 589 | "#### `while/else` construct\n", 590 | "The equivalent `else` clause is available for `while` loops. The `else` clause is only executed if the `while` loop finally hits a False condition, not when a `break` statement is encountered. Basically the same use case as `for/else`--`while/else` allows you to determine how a loop exited: either an early `break` or the natural end of the while loop. \n", 591 | "Fun fact: Raymond Hettinger's suggestion for `nobreak` would also apply to the `else` in `while/else`."
592 | ] 593 | }, 594 | { 595 | "cell_type": "code", 596 | "execution_count": 1, 597 | "metadata": {}, 598 | "outputs": [], 599 | "source": [ 600 | "# early break\n", 601 | "i = 0\n", 602 | "while i < 10:\n", 603 | "    i += 1\n", 604 | "    if i == 5:\n", 605 | "        break\n", 606 | "else: # basically nobreak\n", 607 | "    print(\"A micro-aggression? I am triggered!\") # thankfully not triggered" 608 | ] 609 | }, 610 | { 611 | "cell_type": "code", 612 | "execution_count": 2, 613 | "metadata": {}, 614 | "outputs": [ 615 | { 616 | "name": "stdout", 617 | "output_type": "stream", 618 | "text": [ 619 | "Everything is working normally\n" 620 | ] 621 | } 622 | ], 623 | "source": [ 624 | "# condition naturally became False\n", 625 | "i = 0\n", 626 | "while i < 10:\n", 627 | "    i += 1\n", 628 | "else:\n", 629 | "    print(\"Everything is working normally\") # executed: the condition naturally became False" 630 | ] 631 | }, 632 | { 633 | "cell_type": "code", 634 | "execution_count": 3, 635 | "metadata": {}, 636 | "outputs": [], 637 | "source": [ 638 | "# flag based approach\n", 639 | "early_break = None\n", 640 | "i = 0\n", 641 | "while i < 10:\n", 642 | "    i += 1\n", 643 | "    if i == 5:\n", 644 | "        early_break = True\n", 645 | "        break\n", 646 | "if not early_break:\n", 647 | "    print(\"I did not break out early\")" 648 | ] 649 | }, 650 | { 651 | "cell_type": "code", 652 | "execution_count": 4, 653 | "metadata": {}, 654 | "outputs": [], 655 | "source": [ 656 | "# while/else instead of the flag based approach\n", 657 | "i = 0\n", 658 | "while i < 10:\n", 659 | "    i += 1\n", 660 | "    if i == 5:\n", 661 | "        break\n", 662 | "else:\n", 663 | "    print(\"I did not break out early\")" 664 | ] 665 | }, 666 | { 667 | "cell_type": "markdown", 668 | "metadata": {}, 669 | "source": [ 670 | "Small detour: Many procedural languages (that use braces to denote the loop body) have a `do while` construct. 
Python does not have a `do while` loop since the creators of Python use indentation instead of braces and never settled on a syntax for a `do while` loop. \n", 671 | "Fear not! You can effectively create the same construct as a `do while` loop by using an infinite loop (`while True`) and checking the *negation* of the loop condition at the bottom of the loop body to break out of the infinite loop.\n", 672 | "\n", 673 | "```C\n", 674 | "do { // do while in C\n", 675 | "    do_work(); \n", 676 | "} while (condition);\n", 677 | "```\n", 678 | "\n", 679 | "Effective replacement in Python:\n", 680 | "```python\n", 681 | "while True:\n", 682 | "    do_work() \n", 683 | "    if not condition: # notice that you have to check the opposite of the intended condition\n", 684 | "        break\n", 685 | "```\n", 686 | "If you have never seen a `do while` loop, don't worry about it. Python programmers don't use it. This small detour was purely for completeness and as a neat trick. Python is magic! ✨" 687 | ] 688 | }, 689 | { 690 | "cell_type": "code", 691 | "execution_count": null, 692 | "metadata": {}, 693 | "outputs": [], 694 | "source": [] 695 | }, 696 | { 697 | "cell_type": "markdown", 698 | "metadata": {}, 699 | "source": [ 700 | "#### try/except/else/finally\n", 701 | "You might have heard \"Oh noes! Don't use `try/except`. It is slow!\" \n", 702 | "\\< In Maury Povich's voice after a lie detector test \\> That was a lie! \n", 703 | "`try/except` is super *fast*, as fast as an `if` statement--if (pun intended!) you don't hit an exception. It is only slightly *slower* if an exception is triggered."
704 | ] 705 | }, 706 | { 707 | "cell_type": "code", 708 | "execution_count": 1, 709 | "metadata": {}, 710 | "outputs": [ 711 | { 712 | "name": "stdout", 713 | "output_type": "stream", 714 | "text": [ 715 | "Wall time: 820 ms\n" 716 | ] 717 | } 718 | ], 719 | "source": [ 720 | "%%time\n", 721 | "for i in range(10000000):\n", 722 | "    if i:\n", 723 | "        pass" 724 | ] 725 | }, 726 | { 727 | "cell_type": "code", 728 | "execution_count": 2, 729 | "metadata": {}, 730 | "outputs": [ 731 | { 732 | "name": "stdout", 733 | "output_type": "stream", 734 | "text": [ 735 | "Wall time: 841 ms\n" 736 | ] 737 | } 738 | ], 739 | "source": [ 740 | "%%time\n", 741 | "# no time penalty\n", 742 | "for i in range(10000000):\n", 743 | "    try:\n", 744 | "        i\n", 745 | "    except:\n", 746 | "        pass" 747 | ] 748 | }, 749 | { 750 | "cell_type": "code", 751 | "execution_count": 3, 752 | "metadata": {}, 753 | "outputs": [ 754 | { 755 | "name": "stdout", 756 | "output_type": "stream", 757 | "text": [ 758 | "Wall time: 2.89 s\n" 759 | ] 760 | } 761 | ], 762 | "source": [ 763 | "%%time\n", 764 | "# less than a microsecond runtime penalty per exception raised\n", 765 | "for i in range(10000000):\n", 766 | "    try:\n", 767 | "        raise Exception\n", 768 | "    except:\n", 769 | "        pass" 770 | ] 771 | }, 772 | { 773 | "cell_type": "markdown", 774 | "metadata": {}, 775 | "source": [ 776 | "#### Pokemon Catch 'em All Exception\n", 777 | "When I was a (wee-little) intern many years ago, my mentor asked: do you know about the \"Pokemon catch 'em all\" exception? A Pokemon catch 'em all exception is when you use a bare or overly broad exception clause (`except:`/`except Exception:`/`except BaseException:`) instead of a specific exception type. Always specify a specific exception when you can because you don't want to catch a different exception that you did not expect."
778 | ] 779 | }, 780 | { 781 | "cell_type": "code", 782 | "execution_count": 4, 783 | "metadata": {}, 784 | "outputs": [ 785 | { 786 | "data": { 787 | "text/plain": [ 788 | "1.5" 789 | ] 790 | }, 791 | "execution_count": 4, 792 | "metadata": {}, 793 | "output_type": "execute_result" 794 | } 795 | ], 796 | "source": [ 797 | "def how_divisive(x, y):\n", 798 | "    if (not isinstance(x, (int, float))) or (not isinstance(y, (int, float))):\n", 799 | "        raise TypeError(\"x and y have to be numeric!\")\n", 800 | "    return x / y\n", 801 | "\n", 802 | "how_divisive(3, 2)" 803 | ] 804 | }, 805 | { 806 | "cell_type": "code", 807 | "execution_count": 5, 808 | "metadata": {}, 809 | "outputs": [ 810 | { 811 | "name": "stdout", 812 | "output_type": "stream", 813 | "text": [ 814 | "This is to be expected\n" 815 | ] 816 | } 817 | ], 818 | "source": [ 819 | "# do this\n", 820 | "try:\n", 821 | "    how_divisive(3, \"hi\")\n", 822 | "except TypeError:\n", 823 | "    print(\"This is to be expected\")" 824 | ] 825 | }, 826 | { 827 | "cell_type": "code", 828 | "execution_count": 6, 829 | "metadata": {}, 830 | "outputs": [ 831 | { 832 | "name": "stdout", 833 | "output_type": "stream", 834 | "text": [ 835 | "This is to be expected...or am I being lazy?\n" 836 | ] 837 | } 838 | ], 839 | "source": [ 840 | "# don't do this\n", 841 | "try:\n", 842 | "    how_divisive(3, 0)\n", 843 | "except: # you wrote the code expecting TypeError, but you did not consider ZeroDivisionError\n", 844 | "    print(\"This is to be expected...or am I being lazy?\")" 845 | ] 846 | }, 847 | { 848 | "cell_type": "markdown", 849 | "metadata": {}, 850 | "source": [ 851 | "#### Complex `try/except` examples\n", 852 | "* Multiple `except` clauses: do a different intervention per exception.\n", 853 | "* Multiple exceptions in 1 clause: do the same intervention for the specified exceptions." 
854 | ] 855 | }, 856 | { 857 | "cell_type": "code", 858 | "execution_count": 7, 859 | "metadata": { 860 | "scrolled": true 861 | }, 862 | "outputs": [ 863 | { 864 | "name": "stdout", 865 | "output_type": "stream", 866 | "text": [ 867 | "Can I have your number?\n" 868 | ] 869 | } 870 | ], 871 | "source": [ 872 | "try:\n", 873 | "    1 / 0\n", 874 | "except ZeroDivisionError:\n", 875 | "    print(\"Can I have your number?\")\n", 876 | "except KeyError: # You can have multiple `except` clauses\n", 877 | "    print(\"You trying to steal my key?\")" 878 | ] 879 | }, 880 | { 881 | "cell_type": "code", 882 | "execution_count": 8, 883 | "metadata": {}, 884 | "outputs": [ 885 | { 886 | "name": "stdout", 887 | "output_type": "stream", 888 | "text": [ 889 | "You trying to steal my key?\n" 890 | ] 891 | } 892 | ], 893 | "source": [ 894 | "try:\n", 895 | "    {}[\"unknown_key\"]\n", 896 | "except ZeroDivisionError:\n", 897 | "    print(\"Can I have your number?\")\n", 898 | "except KeyError: # You can have multiple `except` clauses\n", 899 | "    print(\"You trying to steal my key?\")" 900 | ] 901 | }, 902 | { 903 | "cell_type": "code", 904 | "execution_count": 9, 905 | "metadata": {}, 906 | "outputs": [ 907 | { 908 | "name": "stdout", 909 | "output_type": "stream", 910 | "text": [ 911 | "You forgot my name? Or did you simply forget where you put it?\n" 912 | ] 913 | } 914 | ], 915 | "source": [ 916 | "try:\n", 917 | "    x\n", 918 | "except (NameError, IndexError): # you can have multiple exceptions in 1 clause\n", 919 | "    print(\"You forgot my name? 
Or did you simply forget where you put it?\")" 920 | ] 921 | }, 922 | { 923 | "cell_type": "code", 924 | "execution_count": 10, 925 | "metadata": {}, 926 | "outputs": [ 927 | { 928 | "name": "stdout", 929 | "output_type": "stream", 930 | "text": [ 931 | "('You are the best',)\n" 932 | ] 933 | } 934 | ], 935 | "source": [ 936 | "class YouAreExceptional(Exception): # this is how you create a custom exception\n", 937 | "    pass\n", 938 | "\n", 939 | "try:\n", 940 | "    raise YouAreExceptional(\"You are the best\")\n", 941 | "except Exception as e: # this is how you capture an exception and save it\n", 942 | "    print(e.args) # interrogate it for intel!" 943 | ] 944 | }, 945 | { 946 | "cell_type": "markdown", 947 | "metadata": {}, 948 | "source": [ 949 | "#### What else to throw in? `else` and `finally`\n", 950 | "* The `else` clause of a `try/except` executes only if no exception was raised.\n", 951 | "* The `finally` clause always executes, no matter what!\n", 952 | "\n", 953 | "You can have at most one of each, and they are independent of each other: you can use neither, `else` alone, `finally` alone, or both."
954 | ] 955 | }, 956 | { 957 | "cell_type": "code", 958 | "execution_count": 11, 959 | "metadata": {}, 960 | "outputs": [ 961 | { 962 | "name": "stdout", 963 | "output_type": "stream", 964 | "text": [ 965 | "Can I have your number?\n" 966 | ] 967 | } 968 | ], 969 | "source": [ 970 | "# else: not triggered\n", 971 | "try:\n", 972 | " 1 / 0\n", 973 | "except ZeroDivisionError:\n", 974 | " print(\"Can I have your number?\")\n", 975 | "else:\n", 976 | " print(\"Only print if no exception hit\")" 977 | ] 978 | }, 979 | { 980 | "cell_type": "code", 981 | "execution_count": 12, 982 | "metadata": {}, 983 | "outputs": [ 984 | { 985 | "name": "stdout", 986 | "output_type": "stream", 987 | "text": [ 988 | "Only print if no exception hit\n" 989 | ] 990 | } 991 | ], 992 | "source": [ 993 | "# else: triggered\n", 994 | "try:\n", 995 | " 0 / 1\n", 996 | "except ZeroDivisionError:\n", 997 | " print(\"Can I have your number?\")\n", 998 | "else:\n", 999 | " print(\"Only print if no exception hit\")" 1000 | ] 1001 | }, 1002 | { 1003 | "cell_type": "code", 1004 | "execution_count": 13, 1005 | "metadata": {}, 1006 | "outputs": [ 1007 | { 1008 | "name": "stdout", 1009 | "output_type": "stream", 1010 | "text": [ 1011 | "I will always run!\n" 1012 | ] 1013 | }, 1014 | { 1015 | "data": { 1016 | "text/plain": [ 1017 | "1" 1018 | ] 1019 | }, 1020 | "execution_count": 13, 1021 | "metadata": {}, 1022 | "output_type": "execute_result" 1023 | } 1024 | ], 1025 | "source": [ 1026 | "# finally we see `finally`!\n", 1027 | "def always_return_1_no_matter_what():\n", 1028 | " try:\n", 1029 | " assert False, \"This has been Falsified!\"\n", 1030 | " finally: # run this no matter what, even if an Exception is triggered\n", 1031 | " print(\"I will always run!\")\n", 1032 | " return 1\n", 1033 | " \n", 1034 | "always_return_1_no_matter_what()" 1035 | ] 1036 | }, 1037 | { 1038 | "cell_type": "code", 1039 | "execution_count": 14, 1040 | "metadata": {}, 1041 | "outputs": [ 1042 | { 1043 | "name": "stdout", 
1044 | "output_type": "stream", 1045 | "text": [ 1046 | "('This has been Falsified!',)\n", 1047 | "Attempted\n", 1048 | "I will always run!\n" 1049 | ] 1050 | }, 1051 | { 1052 | "data": { 1053 | "text/plain": [ 1054 | "1" 1055 | ] 1056 | }, 1057 | "execution_count": 14, 1058 | "metadata": {}, 1059 | "output_type": "execute_result" 1060 | } 1061 | ], 1062 | "source": [ 1063 | "def always_return_1_no_matter_what():\n", 1064 | "    try:\n", 1065 | "        assert False, \"This has been Falsified!\"\n", 1066 | "    except AssertionError as e:\n", 1067 | "        print(e.args) # the print runs\n", 1068 | "        return 2, print(\"Attempted\") # the return is attempted but not ultimately returned\n", 1069 | "    finally: # run this no matter what, even if an exception was caught or a return was attempted\n", 1070 | "        print(\"I will always run!\") # notice this print comes after the AssertionError's print\n", 1071 | "        return 1\n", 1072 | "\n", 1073 | " \n", 1074 | "always_return_1_no_matter_what()" 1075 | ] 1076 | }, 1077 | { 1078 | "cell_type": "code", 1079 | "execution_count": null, 1080 | "metadata": {}, 1081 | "outputs": [], 1082 | "source": [] 1083 | }, 1084 | { 1085 | "cell_type": "markdown", 1086 | "metadata": {}, 1087 | "source": [ 1088 | "I recently came across an interesting problem. I was writing a unit test for code that I anticipated would raise a specific exception. I wanted to test that:\n", 1089 | "* the correct exception is raised when a certain condition is met\n", 1090 | "* I am alerted if the specific exception is not raised when the condition is met\n", 1091 | "* I am alerted if the wrong exception is raised when the condition is met\n", 1092 | "\n", 1093 | "So the question is: how do I raise an exception when the anticipated exception is *not* raised?" 1094 | ] 1095 | }, 1096 | { 1097 | "cell_type": "code", 1098 | "execution_count": 15, 1099 | "metadata": {}, 1100 | "outputs": [ 1101 | { 1102 | "ename": "TypeError", 1103 | "evalue": "You snake! 
Black adders will come bit you!", 1104 | "output_type": "error", 1105 | "traceback": [ 1106 | "\u001b[1;31m---------------------------------------------------------------------------\u001b[0m", 1107 | "\u001b[1;31mTypeError\u001b[0m Traceback (most recent call last)", 1108 | "\u001b[1;32m\u001b[0m in \u001b[0;36m\u001b[1;34m()\u001b[0m\n\u001b[0;32m 4\u001b[0m \u001b[1;32mreturn\u001b[0m \u001b[0mx\u001b[0m \u001b[1;33m+\u001b[0m \u001b[0my\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 5\u001b[0m \u001b[1;33m\u001b[0m\u001b[0m\n\u001b[1;32m----> 6\u001b[1;33m \u001b[0madder\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;36m1\u001b[0m\u001b[1;33m,\u001b[0m \u001b[1;34m\"2\"\u001b[0m\u001b[1;33m)\u001b[0m \u001b[1;31m# except-able behavior ;-)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m", 1109 | "\u001b[1;32m\u001b[0m in \u001b[0;36madder\u001b[1;34m(x, y)\u001b[0m\n\u001b[0;32m 1\u001b[0m \u001b[1;32mdef\u001b[0m \u001b[0madder\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mx\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0my\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m:\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 2\u001b[0m \u001b[1;32mif\u001b[0m \u001b[0mtype\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mx\u001b[0m\u001b[1;33m)\u001b[0m \u001b[1;33m!=\u001b[0m \u001b[0mtype\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0my\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m:\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[1;32m----> 3\u001b[1;33m \u001b[1;32mraise\u001b[0m \u001b[0mTypeError\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;34m\"You snake! Black adders will come bit you!\"\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m\u001b[0;32m 4\u001b[0m \u001b[1;32mreturn\u001b[0m \u001b[0mx\u001b[0m \u001b[1;33m+\u001b[0m \u001b[0my\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 5\u001b[0m \u001b[1;33m\u001b[0m\u001b[0m\n", 1110 | "\u001b[1;31mTypeError\u001b[0m: You snake! Black adders will come bit you!" 
1111 | ] 1112 | } 1113 | ], 1114 | "source": [ 1115 | "def adder(x, y):\n", 1116 | " if type(x) != type(y):\n", 1117 | " raise TypeError(\"You snake! Black adders will come bit you!\")\n", 1118 | " return x + y\n", 1119 | "\n", 1120 | "adder(1, \"2\") # except-able behavior ;-)" 1121 | ] 1122 | }, 1123 | { 1124 | "cell_type": "code", 1125 | "execution_count": 16, 1126 | "metadata": {}, 1127 | "outputs": [ 1128 | { 1129 | "name": "stdout", 1130 | "output_type": "stream", 1131 | "text": [ 1132 | "Whew, it worked!\n" 1133 | ] 1134 | } 1135 | ], 1136 | "source": [ 1137 | "# basically this is what I wanted but with nicer syntax\n", 1138 | "try:\n", 1139 | " adder(1, \"2\")\n", 1140 | "except TypeError: # this is the specific exception I expected, any other exception types will still be raised\n", 1141 | " print(\"Whew, it worked!\")\n", 1142 | "else:\n", 1143 | " raise Exception(\"Something went wrong! Something exceptional should have happened\")" 1144 | ] 1145 | }, 1146 | { 1147 | "cell_type": "code", 1148 | "execution_count": 17, 1149 | "metadata": {}, 1150 | "outputs": [], 1151 | "source": [ 1152 | "# here's the solution, so much nicer\n", 1153 | "import pytest\n", 1154 | "\n", 1155 | "with pytest.raises(TypeError): # don't raise an exception if the anticipated exception is raised\n", 1156 | " adder(1, \"2\")" 1157 | ] 1158 | }, 1159 | { 1160 | "cell_type": "code", 1161 | "execution_count": 18, 1162 | "metadata": {}, 1163 | "outputs": [ 1164 | { 1165 | "ename": "Failed", 1166 | "evalue": "DID NOT RAISE ", 1167 | "output_type": "error", 1168 | "traceback": [ 1169 | "\u001b[1;31m---------------------------------------------------------------------------\u001b[0m", 1170 | "\u001b[1;31mFailed\u001b[0m Traceback (most recent call last)", 1171 | "\u001b[1;32m\u001b[0m in \u001b[0;36m\u001b[1;34m()\u001b[0m\n\u001b[0;32m 1\u001b[0m \u001b[1;32mwith\u001b[0m 
\u001b[0mpytest\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mraises\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mTypeError\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m:\u001b[0m \u001b[1;31m# raise an exception if no exeptions raised\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[1;32m----> 2\u001b[1;33m \u001b[1;36m1\u001b[0m \u001b[1;33m+\u001b[0m \u001b[1;36m1\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m", 1172 | "\u001b[1;32mC:\\ProgramData\\Anaconda3\\lib\\site-packages\\_pytest\\python_api.py\u001b[0m in \u001b[0;36m__exit__\u001b[1;34m(self, *tp)\u001b[0m\n\u001b[0;32m 610\u001b[0m \u001b[0m__tracebackhide__\u001b[0m \u001b[1;33m=\u001b[0m \u001b[1;32mTrue\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 611\u001b[0m \u001b[1;32mif\u001b[0m \u001b[0mtp\u001b[0m\u001b[1;33m[\u001b[0m\u001b[1;36m0\u001b[0m\u001b[1;33m]\u001b[0m \u001b[1;32mis\u001b[0m \u001b[1;32mNone\u001b[0m\u001b[1;33m:\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[1;32m--> 612\u001b[1;33m \u001b[0mfail\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mself\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mmessage\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m\u001b[0;32m 613\u001b[0m \u001b[0mself\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mexcinfo\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0m__init__\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mtp\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 614\u001b[0m \u001b[0msuppress_exception\u001b[0m \u001b[1;33m=\u001b[0m \u001b[0missubclass\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mself\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mexcinfo\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mtype\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mself\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mexpected_exception\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n", 1173 | "\u001b[1;32mC:\\ProgramData\\Anaconda3\\lib\\site-packages\\_pytest\\outcomes.py\u001b[0m in \u001b[0;36mfail\u001b[1;34m(msg, pytrace)\u001b[0m\n\u001b[0;32m 90\u001b[0m \"\"\"\n\u001b[0;32m 
91\u001b[0m \u001b[0m__tracebackhide__\u001b[0m \u001b[1;33m=\u001b[0m \u001b[1;32mTrue\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[1;32m---> 92\u001b[1;33m \u001b[1;32mraise\u001b[0m \u001b[0mFailed\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mmsg\u001b[0m\u001b[1;33m=\u001b[0m\u001b[0mmsg\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mpytrace\u001b[0m\u001b[1;33m=\u001b[0m\u001b[0mpytrace\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m\u001b[0;32m 93\u001b[0m \u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 94\u001b[0m \u001b[1;33m\u001b[0m\u001b[0m\n", 1174 | "\u001b[1;31mFailed\u001b[0m: DID NOT RAISE " 1175 | ] 1176 | } 1177 | ], 1178 | "source": [ 1179 | "with pytest.raises(TypeError): # raise an exception if the anticipated exception is not raised\n", 1180 | "    1 + 1" 1181 | ] 1182 | }, 1183 | { 1184 | "cell_type": "code", 1185 | "execution_count": 19, 1186 | "metadata": {}, 1187 | "outputs": [ 1188 | { 1189 | "ename": "ZeroDivisionError", 1190 | "evalue": "division by zero", 1191 | "output_type": "error", 1192 | "traceback": [ 1193 | "\u001b[1;31m---------------------------------------------------------------------------\u001b[0m", 1194 | "\u001b[1;31mZeroDivisionError\u001b[0m Traceback (most recent call last)", 1195 | "\u001b[1;32m\u001b[0m in \u001b[0;36m\u001b[1;34m()\u001b[0m\n\u001b[0;32m 1\u001b[0m \u001b[1;32mwith\u001b[0m \u001b[0mpytest\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mraises\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mTypeError\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m:\u001b[0m \u001b[1;31m# raise an exception if the wrong exception is raised\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[1;32m----> 2\u001b[1;33m \u001b[1;36m1\u001b[0m \u001b[1;33m/\u001b[0m \u001b[1;36m0\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m", 1196 | "\u001b[1;31mZeroDivisionError\u001b[0m: division by zero" 1197 | ] 1198 | } 1199 | ], 1200 | "source": [ 1201 | "with pytest.raises(TypeError): # raise an exception if the wrong exception is raised\n", 1202 | 
"    1 / 0" 1203 | ] 1204 | }, 1205 | { 1206 | "cell_type": "markdown", 1207 | "metadata": {}, 1208 | "source": [ 1209 | "#### In Totality: `try/except/else/finally` in its full beauty" 1210 | ] 1211 | }, 1212 | { 1213 | "cell_type": "code", 1214 | "execution_count": 20, 1215 | "metadata": {}, 1216 | "outputs": [ 1217 | { 1218 | "name": "stdout", 1219 | "output_type": "stream", 1220 | "text": [ 1221 | "I'm going to trigger an exception\n", 1222 | "Harley Quinn: I just wanna say something...\n" 1223 | ] 1224 | }, 1225 | { 1226 | "ename": "Exception", 1227 | "evalue": "Triggered!", 1228 | "output_type": "error", 1229 | "traceback": [ 1230 | "\u001b[1;31m---------------------------------------------------------------------------\u001b[0m", 1231 | "\u001b[1;31mException\u001b[0m Traceback (most recent call last)", 1232 | "\u001b[1;32m\u001b[0m in \u001b[0;36m\u001b[1;34m()\u001b[0m\n\u001b[0;32m 11\u001b[0m \u001b[1;32melse\u001b[0m\u001b[1;33m:\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 12\u001b[0m \u001b[0mprint\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;34m\"I'm going to trigger an exception\"\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[1;32m---> 13\u001b[1;33m \u001b[1;32mraise\u001b[0m \u001b[0mException\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;34m\"Triggered!\"\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m\u001b[0;32m 14\u001b[0m \u001b[1;32mfinally\u001b[0m\u001b[1;33m:\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 15\u001b[0m \u001b[0mprint\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;34m\"Harley Quinn: I just wanna say something...\"\u001b[0m\u001b[1;33m)\u001b[0m \u001b[1;31m# does print before the `else` clause raises the Exception\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n", 1233 | "\u001b[1;31mException\u001b[0m: Triggered!"
1234 | ] 1235 | } 1236 | ], 1237 | "source": [ 1238 | "try:\n", 1239 | "    pass\n", 1240 | "except ZeroDivisionError:\n", 1241 | "    print(\"Can I have your number?\")\n", 1242 | "except KeyError: # You can have multiple `except` clauses\n", 1243 | "    print(\"You trying to steal my key?\")\n", 1244 | "except (NameError, IndexError): # you can have multiple exceptions in 1 clause\n", 1245 | "    print(\"You forgot my name? Or did you simply forget where you put it?\")\n", 1246 | "except:\n", 1247 | "    print(\"Catch any remaining exception types\")\n", 1248 | "else:\n", 1249 | "    print(\"I'm going to trigger an exception\")\n", 1250 | "    raise Exception(\"Triggered!\")\n", 1251 | "finally:\n", 1252 | "    print(\"Harley Quinn: I just wanna say something...\") # does print before the `else` clause raises the Exception" 1253 | ] 1254 | }, 1255 | { 1256 | "cell_type": "code", 1257 | "execution_count": 21, 1258 | "metadata": {}, 1259 | "outputs": [ 1260 | { 1261 | "name": "stdout", 1262 | "output_type": "stream", 1263 | "text": [ 1264 | "I'm going to trigger an exception\n", 1265 | "I'm going to silence any exceptions\n" 1266 | ] 1267 | }, 1268 | { 1269 | "data": { 1270 | "text/plain": [ 1271 | "1" 1272 | ] 1273 | }, 1274 | "execution_count": 21, 1275 | "metadata": {}, 1276 | "output_type": "execute_result" 1277 | } 1278 | ], 1279 | "source": [ 1280 | "def always_return_1():\n", 1281 | "    try:\n", 1282 | "        pass\n", 1283 | "    except ZeroDivisionError:\n", 1284 | "        print(\"Can I have your number?\")\n", 1285 | "    except KeyError: # You can have multiple `except` clauses\n", 1286 | "        print(\"You trying to steal my key?\")\n", 1287 | "    except (NameError, IndexError): # you can have multiple exceptions in 1 clause\n", 1288 | "        print(\"You forgot my name? 
Or did you simply forget where you put it?\")\n", 1289 | "    except:\n", 1290 | "        print(\"Catch any remaining exception types\")\n", 1291 | "    else:\n", 1292 | "        print(\"I'm going to trigger an exception\")\n", 1293 | "        raise Exception(\"Triggered!\")\n", 1294 | "    finally:\n", 1295 | "        print(\"I'm going to silence any exceptions\") # the `return` below swallows the in-flight Exception\n", 1296 | "        return 1\n", 1297 | " \n", 1298 | "always_return_1()" 1299 | ] 1300 | }, 1301 | { 1302 | "cell_type": "markdown", 1303 | "metadata": {}, 1304 | "source": [ 1305 | "#### Context for using `finally`: An Actual Use Case\n", 1306 | "When would you use `finally`? `finally` is used when you want to clean up your environment or release resources: close files, close database connections, write final log messages, release locks, etc. For example, if you are connected to a SQL database but hit an exception during a query, you might not have released the database connection. A database can handle only a limited number of connections at any one time, so if Python never releases its connection, other clients may be unable to query the database. The `finally` block can be used to close a database connection, regardless of whether or not the SQL query completed successfully.\n", 1307 | "\n", 1308 | "Well, that sounds awfully like a context manager such as `with open(my_file) as f:`, where the file closes itself (no matter what) when the body of the clause finishes executing. Your hunch is correct: `try/finally` can be used to create your own custom context managers.\n", 1309 | "\n", 1310 | "Here's an example: if you open a file without a context manager and then hit an error, the file (descriptor) is not closed automatically and your message might be lost."
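, "\n", "(Before the file example below, here is a rough sketch of the database pattern just described -- using the standard library's `sqlite3` purely for illustration, with a placeholder query:)\n", "\n", "```python\n", "import sqlite3\n", "\n", "conn = sqlite3.connect(\":memory:\")\n", "try:\n", "    result = conn.execute(\"SELECT 1\").fetchone()  # pretend this query might blow up\n", "    print(result)\n", "finally:\n", "    conn.close()  # runs whether or not the query raised, so the connection is never leaked\n", "```\n"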
1311 | ] 1312 | }, 1313 | { 1314 | "cell_type": "code", 1315 | "execution_count": 1, 1316 | "metadata": {}, 1317 | "outputs": [ 1318 | { 1319 | "ename": "ZeroDivisionError", 1320 | "evalue": "division by zero", 1321 | "output_type": "error", 1322 | "traceback": [ 1323 | "\u001b[1;31m---------------------------------------------------------------------------\u001b[0m", 1324 | "\u001b[1;31mZeroDivisionError\u001b[0m Traceback (most recent call last)", 1325 | "\u001b[1;32m\u001b[0m in \u001b[0;36m\u001b[1;34m()\u001b[0m\n\u001b[0;32m 6\u001b[0m \u001b[0mopened_file\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mclose\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;33m)\u001b[0m \u001b[1;31m# these lines are not executed\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 7\u001b[0m \u001b[1;33m\u001b[0m\u001b[0m\n\u001b[1;32m----> 8\u001b[1;33m \u001b[0mbad_open\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;34m\"silly.txt\"\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m", 1326 | "\u001b[1;32m\u001b[0m in \u001b[0;36mbad_open\u001b[1;34m(filename)\u001b[0m\n\u001b[0;32m 2\u001b[0m \u001b[0mopened_file\u001b[0m \u001b[1;33m=\u001b[0m \u001b[0mopen\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mfilename\u001b[0m\u001b[1;33m,\u001b[0m \u001b[1;34m\"w\"\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 3\u001b[0m \u001b[0mopened_file\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mwrite\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;34m\"what to hear a secret?\"\u001b[0m\u001b[1;33m)\u001b[0m \u001b[1;31m# this message is lost\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[1;32m----> 4\u001b[1;33m \u001b[1;36m1\u001b[0m \u001b[1;33m/\u001b[0m \u001b[1;36m0\u001b[0m \u001b[1;31m# pretend here is some code you wrote but didn't expect to break\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m\u001b[0;32m 5\u001b[0m \u001b[0mopened_file\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mwrite\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;34m\"my very important secret message is 
...\"\u001b[0m\u001b[1;33m)\u001b[0m \u001b[1;31m# these lines are not executed\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 6\u001b[0m \u001b[0mopened_file\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mclose\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;33m)\u001b[0m \u001b[1;31m# these lines are not executed\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n", 1327 | "\u001b[1;31mZeroDivisionError\u001b[0m: division by zero" 1328 | ] 1329 | } 1330 | ], 1331 | "source": [ 1332 | "def bad_open(filename):\n", 1333 | " opened_file = open(filename, \"w\")\n", 1334 | " opened_file.write(\"what to hear a secret?\") # this message is lost\n", 1335 | " 1 / 0 # pretend here is some code you wrote but didn't expect to break\n", 1336 | " opened_file.write(\"my very important secret message is ...\") # these lines are not executed\n", 1337 | " opened_file.close() # these lines are not executed\n", 1338 | "\n", 1339 | "bad_open(\"silly.txt\")" 1340 | ] 1341 | }, 1342 | { 1343 | "cell_type": "code", 1344 | "execution_count": 2, 1345 | "metadata": {}, 1346 | "outputs": [ 1347 | { 1348 | "name": "stdout", 1349 | "output_type": "stream", 1350 | "text": [ 1351 | "\n" 1352 | ] 1353 | } 1354 | ], 1355 | "source": [ 1356 | "with open(\"silly.txt\", \"r\") as f:\n", 1357 | " print(f.read()) # the message is lost; file is empty" 1358 | ] 1359 | }, 1360 | { 1361 | "cell_type": "code", 1362 | "execution_count": 3, 1363 | "metadata": {}, 1364 | "outputs": [ 1365 | { 1366 | "name": "stderr", 1367 | "output_type": "stream", 1368 | "text": [ 1369 | "rm: cannot remove ‘silly.txt’: Permission denied\n" 1370 | ] 1371 | } 1372 | ], 1373 | "source": [ 1374 | "%%bash\n", 1375 | "rm silly.txt # can't even delete the file because file descriptor is not closed" 1376 | ] 1377 | }, 1378 | { 1379 | "cell_type": "code", 1380 | "execution_count": null, 1381 | "metadata": {}, 1382 | "outputs": [], 1383 | "source": [ 1384 | "# restart notebook" 1385 | ] 1386 | }, 1387 | { 1388 | "cell_type": "code", 1389 | 
"execution_count": 1, 1390 | "metadata": {}, 1391 | "outputs": [ 1392 | { 1393 | "ename": "ZeroDivisionError", 1394 | "evalue": "division by zero", 1395 | "output_type": "error", 1396 | "traceback": [ 1397 | "\u001b[1;31m---------------------------------------------------------------------------\u001b[0m", 1398 | "\u001b[1;31mZeroDivisionError\u001b[0m Traceback (most recent call last)", 1399 | "\u001b[1;32m\u001b[0m in \u001b[0;36m\u001b[1;34m()\u001b[0m\n\u001b[0;32m 5\u001b[0m \u001b[0mopened_file\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mwrite\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;34m\"my very important secret message is ...\"\u001b[0m\u001b[1;33m)\u001b[0m \u001b[1;31m# these lines are not executed\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 6\u001b[0m \u001b[1;33m\u001b[0m\u001b[0m\n\u001b[1;32m----> 7\u001b[1;33m \u001b[0mgood_open\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;34m\"silly.txt\"\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m", 1400 | "\u001b[1;32m\u001b[0m in \u001b[0;36mgood_open\u001b[1;34m(filename)\u001b[0m\n\u001b[0;32m 2\u001b[0m \u001b[1;32mwith\u001b[0m \u001b[0mopen\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mfilename\u001b[0m\u001b[1;33m,\u001b[0m \u001b[1;34m\"w\"\u001b[0m\u001b[1;33m)\u001b[0m \u001b[1;32mas\u001b[0m \u001b[0mopened_file\u001b[0m\u001b[1;33m:\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 3\u001b[0m \u001b[0mopened_file\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mwrite\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;34m\"what to hear a secret?\"\u001b[0m\u001b[1;33m)\u001b[0m \u001b[1;31m# this message is lost\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[1;32m----> 4\u001b[1;33m \u001b[1;36m1\u001b[0m \u001b[1;33m/\u001b[0m \u001b[1;36m0\u001b[0m \u001b[1;31m# pretend here is some code you wrote but didn't expect to break\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m\u001b[0;32m 5\u001b[0m 
\u001b[0mopened_file\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mwrite\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;34m\"my very important secret message is ...\"\u001b[0m\u001b[1;33m)\u001b[0m \u001b[1;31m# these lines are not executed\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 6\u001b[0m \u001b[1;33m\u001b[0m\u001b[0m\n", 1401 | "\u001b[1;31mZeroDivisionError\u001b[0m: division by zero" 1402 | ] 1403 | } 1404 | ], 1405 | "source": [ 1406 | "def good_open(filename):\n", 1407 | "    with open(filename, \"w\") as opened_file:\n", 1408 | "        opened_file.write(\"what to hear a secret?\") # this message survives because the file is closed properly on exit\n", 1409 | "        1 / 0 # pretend here is some code you wrote but didn't expect to break\n", 1410 | "        opened_file.write(\"my very important secret message is ...\") # these lines are not executed\n", 1411 | "\n", 1412 | "good_open(\"silly.txt\")" 1413 | ] 1414 | }, 1415 | { 1416 | "cell_type": "code", 1417 | "execution_count": 2, 1418 | "metadata": {}, 1419 | "outputs": [ 1420 | { 1421 | "name": "stdout", 1422 | "output_type": "stream", 1423 | "text": [ 1424 | "what to hear a secret?\n" 1425 | ] 1426 | } 1427 | ], 1428 | "source": [ 1429 | "with open(\"silly.txt\", \"r\") as f:\n", 1430 | "    print(f.read()) # message is kept" 1431 | ] 1432 | }, 1433 | { 1434 | "cell_type": "code", 1435 | "execution_count": 3, 1436 | "metadata": {}, 1437 | "outputs": [], 1438 | "source": [ 1439 | "%%bash\n", 1440 | "rm silly.txt # everything works fine" 1441 | ] 1442 | }, 1443 | { 1444 | "cell_type": "markdown", 1445 | "metadata": {}, 1446 | "source": [ 1447 | "Here's a real context manager I created for runtime profiling. Basically, I wanted to profile the runtime of another person's function, but I didn't want to apply a decorator to their function."
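, "\n", "(The generator-based version follows in the next cell; for contrast, the same idea can be written without `contextlib` as a class with `__enter__`/`__exit__` methods. A rough sketch, with a hypothetical `Timer` class:)\n", "\n", "```python\n", "from datetime import datetime\n", "\n", "class Timer:\n", "    def __enter__(self):\n", "        self.start_time = datetime.now()\n", "        return self  # whatever is returned here is what `as` binds to\n", "\n", "    def __exit__(self, exc_type, exc_value, traceback):\n", "        # runs no matter what, like a `finally` clause\n", "        elapsed = (datetime.now() - self.start_time).total_seconds()\n", "        print(\"Total runtime is {} seconds\".format(round(elapsed)))\n", "        return False  # False means: do not suppress an exception raised in the body\n", "```\n"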
1448 | ] 1449 | }, 1450 | { 1451 | "cell_type": "code", 1452 | "execution_count": 4, 1453 | "metadata": {}, 1454 | "outputs": [], 1455 | "source": [ 1456 | "from contextlib import contextmanager\n", 1457 | "from datetime import datetime\n", 1458 | "import time\n", 1459 | "\n", 1460 | "@contextmanager\n", 1461 | "def custom_context_manager():\n", 1462 | " # set up environment here\n", 1463 | " start_time = datetime.now()\n", 1464 | " print(\"Started code block at {}\".format(start_time))\n", 1465 | " try:\n", 1466 | " yield \"Horology--Pirates of the Caribbean joke\"\n", 1467 | " finally:\n", 1468 | " # release resource here\n", 1469 | " end_time = datetime.now()\n", 1470 | " print(\"Ended code block at {}\".format(end_time))\n", 1471 | " print(\n", 1472 | " \"Total runtime is {} seconds\"\n", 1473 | " .format(round((end_time - start_time).total_seconds()))\n", 1474 | " )" 1475 | ] 1476 | }, 1477 | { 1478 | "cell_type": "code", 1479 | "execution_count": 5, 1480 | "metadata": {}, 1481 | "outputs": [ 1482 | { 1483 | "name": "stdout", 1484 | "output_type": "stream", 1485 | "text": [ 1486 | "Started code block at 2020-08-07 22:48:55.312899\n", 1487 | "Ended code block at 2020-08-07 22:49:00.323800\n", 1488 | "Total runtime is 5 seconds\n", 1489 | "Horology--Pirates of the Caribbean joke\n" 1490 | ] 1491 | } 1492 | ], 1493 | "source": [ 1494 | "with custom_context_manager() as ccm:\n", 1495 | " time.sleep(5) # suppose this is somebody else's function that I am timing\n", 1496 | "\n", 1497 | "print(ccm) # what is ccm? 
It is whatever is yielded" 1498 | ] 1499 | }, 1500 | { 1501 | "cell_type": "code", 1502 | "execution_count": 6, 1503 | "metadata": {}, 1504 | "outputs": [ 1505 | { 1506 | "name": "stdout", 1507 | "output_type": "stream", 1508 | "text": [ 1509 | "Started code block at 2020-08-07 22:50:18.328126\n", 1510 | "Ended code block at 2020-08-07 22:50:24.352074\n", 1511 | "Total runtime is 6 seconds\n" 1512 | ] 1513 | } 1514 | ], 1515 | "source": [ 1516 | "with custom_context_manager(): # I can even time a block of code, not just 1 function \n", 1517 | " time.sleep(1)\n", 1518 | " time.sleep(2)\n", 1519 | " time.sleep(3)" 1520 | ] 1521 | }, 1522 | { 1523 | "cell_type": "code", 1524 | "execution_count": 7, 1525 | "metadata": {}, 1526 | "outputs": [ 1527 | { 1528 | "name": "stdout", 1529 | "output_type": "stream", 1530 | "text": [ 1531 | "Started code block at 2020-08-07 22:51:24.631221\n", 1532 | "Ended code block at 2020-08-07 22:51:25.642154\n", 1533 | "Total runtime is 1 seconds\n" 1534 | ] 1535 | }, 1536 | { 1537 | "ename": "ZeroDivisionError", 1538 | "evalue": "division by zero", 1539 | "output_type": "error", 1540 | "traceback": [ 1541 | "\u001b[1;31m---------------------------------------------------------------------------\u001b[0m", 1542 | "\u001b[1;31mZeroDivisionError\u001b[0m Traceback (most recent call last)", 1543 | "\u001b[1;32m\u001b[0m in \u001b[0;36m\u001b[1;34m()\u001b[0m\n\u001b[0;32m 1\u001b[0m \u001b[1;32mwith\u001b[0m \u001b[0mcustom_context_manager\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m:\u001b[0m \u001b[1;31m# I can even time a block of code, not just 1 function\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 2\u001b[0m \u001b[0mtime\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0msleep\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;36m1\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[1;32m----> 3\u001b[1;33m \u001b[1;36m1\u001b[0m \u001b[1;33m/\u001b[0m 
\u001b[1;36m0\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m", 1544 | "\u001b[1;31mZeroDivisionError\u001b[0m: division by zero" 1545 | ] 1546 | } 1547 | ], 1548 | "source": [ 1549 | "with custom_context_manager(): # finally block will still execute even if the body hits an exception\n", 1550 | "    time.sleep(1)\n", 1551 | "    1 / 0" 1552 | ] 1553 | }, 1554 | { 1555 | "cell_type": "markdown", 1556 | "metadata": {}, 1557 | "source": [ 1558 | "In summary, `finally` is frequently used for releasing resources and (less often) for creating a custom context manager." 1559 | ] 1560 | }, 1561 | { 1562 | "cell_type": "code", 1563 | "execution_count": null, 1564 | "metadata": {}, 1565 | "outputs": [], 1566 | "source": [] 1567 | }, 1568 | { 1569 | "cell_type": "markdown", 1570 | "metadata": {}, 1571 | "source": [ 1572 | "## In CASE You Need Help, just SWITCH things Up\n", 1573 | "\n", 1574 | "Once in a while, you need to make a decision, and the code looks like this, with a long chain of `if/elif/else`:\n", 1575 | "```python\n", 1576 | "# Python\n", 1577 | "def get_month(month):\n", 1578 | "    if month == 1:\n", 1579 | "        return \"January\"\n", 1580 | "    elif month == 2:\n", 1581 | "        return \"February\"\n", 1582 | "    ...\n", 1583 | "    else:\n", 1584 | "        return \"I have no idea!\"\n", 1585 | "```\n", 1586 | "\n", 1587 | "You might also think of it as the `CASE` statement in SQL:\n", 1588 | "``` mysql\n", 1589 | "# SQL\n", 1590 | "CASE\n", 1591 | "    WHEN condition1 THEN result1\n", 1592 | "    WHEN condition2 THEN result2\n", 1593 | "    WHEN conditionN THEN resultN\n", 1594 | "    ELSE result\n", 1595 | "END;\n", 1596 | "```\n", 1597 | "\n", 1598 | "If you suffered with C in ~~skewl~~ school, then it would be the `switch` statement:\n", 1599 | "``` C\n", 1600 | "switch (x) \n", 1601 | "{ \n", 1602 | "    case 1: printf(\"Choice is 1\"); \n", 1603 | "         break; \n", 1604 | "    case 2: printf(\"Choice is 2\"); \n", 1605 | "         break; \n", 1606 | "    case 3: printf(\"Choice is 3\"); \n", 1607 | "         break; \n", 1608 | "    default: 
printf(\"Choice other than 1, 2 and 3\"); \n", 1609 | " break; \n", 1610 | "} \n", 1611 | "```\n", 1612 | "\n", 1613 | "Why suffer when you can do idiomatic Python? Use a dictionary instead." 1614 | ] 1615 | }, 1616 | { 1617 | "cell_type": "code", 1618 | "execution_count": 1, 1619 | "metadata": {}, 1620 | "outputs": [ 1621 | { 1622 | "name": "stdout", 1623 | "output_type": "stream", 1624 | "text": [ 1625 | "March\n", 1626 | "I have no idea!\n" 1627 | ] 1628 | } 1629 | ], 1630 | "source": [ 1631 | "def get_month(month, default_option=\"I have no idea!\"):\n", 1632 | " month_names = {1: \"January\", 2: \"February\", 3: \"March\", }\n", 1633 | " return month_names.get(month, default_option)\n", 1634 | "\n", 1635 | "print(get_month(3))\n", 1636 | "print(get_month(13))" 1637 | ] 1638 | }, 1639 | { 1640 | "cell_type": "code", 1641 | "execution_count": 2, 1642 | "metadata": {}, 1643 | "outputs": [ 1644 | { 1645 | "name": "stdout", 1646 | "output_type": "stream", 1647 | "text": [ 1648 | "On the coolest day of Christmas, my true love sent to me:\n", 1649 | "5 golden rings\n", 1650 | "4 calling birds\n", 1651 | "3 french hens\n", 1652 | "2 turtle doves\n", 1653 | "1 partridge in a pear tree\n" 1654 | ] 1655 | } 1656 | ], 1657 | "source": [ 1658 | "gifts = {5: \"golden rings\", 4: \"calling birds\", 3: \"french hens\", 2: \"turtle doves\", 1: \"partridge in a pear tree\"}\n", 1659 | "\n", 1660 | "print(\"On the coolest day of Christmas, my true love sent to me:\")\n", 1661 | "for day_of_Christmas in range(5, 0, -1):\n", 1662 | " print(day_of_Christmas, gifts[day_of_Christmas])" 1663 | ] 1664 | }, 1665 | { 1666 | "cell_type": "markdown", 1667 | "metadata": {}, 1668 | "source": [ 1669 | "#### Advanced Technique: Dispatch\n", 1670 | "If you want a switch statement on a function (or method), you can use `singledispatch()` (or `singledispatchmethod()`). 
Suppose you create different versions of the same function, where each version does something different based on the *type* of its argument. Each version shares the same function name. When you call the function, it determines the type of the argument and applies the matching version to it. Take a look at `4_Functional_Python.ipynb` for an example."
1671 | ]
1672 | }
1673 | ],
1674 | "metadata": {
1675 | "kernelspec": {
1676 | "display_name": "Python 3",
1677 | "language": "python",
1678 | "name": "python3"
1679 | },
1680 | "language_info": {
1681 | "codemirror_mode": {
1682 | "name": "ipython",
1683 | "version": 3
1684 | },
1685 | "file_extension": ".py",
1686 | "mimetype": "text/x-python",
1687 | "name": "python",
1688 | "nbconvert_exporter": "python",
1689 | "pygments_lexer": "ipython3",
1690 | "version": "3.8.3"
1691 | }
1692 | },
1693 | "nbformat": 4,
1694 | "nbformat_minor": 2
1695 | }
1696 | 
--------------------------------------------------------------------------------
/5_High_Performance_Python.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "markdown",
5 | "metadata": {},
6 | "source": [
7 | "# High-Performance Python\n",
8 | "Once in a while, you will hear that \"oh Python is so slow or not performant enough\" or \"Python is not for enterprise use cases since it is not fast enough\" or \"Python is a language for script kiddies and at best \"glue\" code.\" \n",
9 | "ZOMG! Scala's functional API is awesome and stands on the shoulders of the JVM. Go(lang) has goroutines that beat the pants off of Python's multithreading. Rust--I don't know what it does but it sounds trendy. 🙄 \n",
10 | "\n",
11 | "Spare me the faux concern ~~you pretentious C++ developer~~ ... I mean good question! 
Python supports rapid development and experimentation, expressive and concise syntax, and high-performance frameworks for when you need speed. At a high level, you can speed up Python on 1 machine using vectorization, multi-threading, multi-processing, and asynchronous programming. On multiple machines, you have cluster/distributed computing frameworks: Apache Spark, Dask, Apache Beam, Ray, and others. On cloud services, you have containers and scalable, serverless computing. In this tutorial, I will show you various ways to make Python fast and scalable, so performance is not your bottleneck. "
12 | ]
13 | },
14 | {
15 | "cell_type": "code",
16 | "execution_count": null,
17 | "metadata": {},
18 | "outputs": [],
19 | "source": []
20 | },
21 | {
22 | "cell_type": "markdown",
23 | "metadata": {},
24 | "source": [
25 | "## Debunking the \"Slow\" Python Myth\n",
26 | "What do you mean by slow? Do you mean that Python runs slow, so it would cost more money when renting EC2s? \n",
27 | "Remember that compute time is cheap and developer time is relatively expensive. Suppose a standard laptop has 4 CPUs and 16 GB of RAM. The equivalent EC2 instance on AWS rents for just \\$0.154 per hour; an engineer costs \\$50 per hour. So for each hour of developer time you waste, you could have rented a machine for 300 hours--or you can rent 300 machines for 1 hour. A 40-hour week on EC2 would be \$6--cheaper than your daily lunch! \n",
28 | "Some \"faster\" languages are notoriously verbose, so you end up with pages and pages of code. Python's ease of use and expressiveness gives you a super power: rapid speed of development, which gives you a second super power: the ability to beat your competition to the market. Way of the Pro! Rapid experimentation allows you to out-maneuver your competition with better and newer features. Python's readability offers maintainability since developers come and go. 
Basically, can you write your code faster than somebody else: can you do something in 1 hour that takes somebody else 5? \n",
29 | "\n",
30 | "If you need speed, Guido recommends you delegate the performance-critical part to Cython: \n",
31 | "```\"At some point, you end up with one little piece of your system, as a whole, where you end up spending all your time. If you write that just as a sort of simple-minded Python loop, at some point you will see that that is the bottleneck in your system. It is usually much more effective to take that one piece and replace that one function or module with a little bit of code you wrote in C or C++ rather than rewriting your entire system in a faster language, because for most of what you're doing, the speed of the language is irrelevant.\" --Guido van Rossum``` \n",
32 | "I won't mention Cython in the tutorial since it is rarely ever used--I have never personally met anybody who wrote their own custom Cython code. Things that need to be performant in Python are already written in Cython (or C/C++): numpy, XGBoost, SpaCy, etc. The high-level idea behind Cython is that Python is actually written on top of C, so for speed you can write directly in C (or C++) and import it into Python. \n",
33 | "\n",
34 | "If you are concerned about speed, do you know what is faster than Python? Fortran. Then you can brush up on your Fortran to get hired by ... nobody (or worse, mainframe specialists 🤢). \n",
35 | "\n",
36 | "Speed of the language primarily helps with CPU bound problems and not IO bound problems--in IO bound problems, any programming language will perform roughly the same since, the majority of the time, you are waiting for a network response. \n",
37 | "\n",
38 | "Finally, you don't really care about the raw speed of a language. You care about the expressiveness and functionality of the language: its ecosystem, user-base, and mindshare. Whatever problem you can think of, Python probably has a library for that. Which you can pip install. 
So you can write minimal code, solve your problem, and call it a day. Like Newton, when you use Python, you stand on the shoulders of giants. Things just work! Imagine if you couldn't just import sklearn and run Random Forest but had to write your own random forest algorithm. Actually, don't... \n",
39 | "Here are some libraries just for scientific computing. \n",
40 | "


" 41 | ] 42 | }, 43 | { 44 | "cell_type": "code", 45 | "execution_count": 1, 46 | "metadata": {}, 47 | "outputs": [], 48 | "source": [ 49 | "import antigravity # a cute Easter egg; things in Python are so easy. Python has \"batteries included\"" 50 | ] 51 | }, 52 | { 53 | "cell_type": "markdown", 54 | "metadata": {}, 55 | "source": [ 56 | "## The Big O, Not Just Japanese Batman\n", 57 | "

\n",
58 | "\n",
59 | "Data structures and algorithms (ie your functions) have different performance characteristics that can be measured by how much memory (space) it takes to do something and how many steps (time) it takes to do something. Fancy people use the terms space and (run)time complexity, respectively. Don't be intimidated by the fancy names; I assure you it's a simple idea. \n",
60 | "Big O notation is used to denote the worst-case scenario of how resource utilization (memory or number of steps) will grow with respect to the number of elements in your input data. \n",
61 | "Note: O(n) is pronounced oh-enn.\n",
62 | "\n",
63 | "### Time Complexity of Data Structures\n",
64 | "Let's walk through some simple examples. \n",
65 | "* list:\n",
66 | "  * to append to the end of a list, it is an (amortized) O(1) operation. It is a fast operation.\n",
67 | "  * to append/insert at the beginning of a list, it takes O(n) operations because the new element is inserted at the front and every existing element must be shifted over by one position. It is slow in comparison to appending to the end of the list.\n",
68 | "  * to check if an element is in a list (ie test for membership), it takes O(n) operations since the worst case is that the element is not in the list, so you have to check every element of the list.\n",
69 | "  * to get an element at a known, specific ith index, it is an O(1) operation. Fast\n",
70 | "* dict:\n",
71 | "  * to check if a key is in a dictionary: O(1) operation because dict keys are hashed (and unique).\n",
72 | "  * to check if a value is in a dictionary: O(n) operations because dict values are not indexed by the hash table (and do not have to be unique).\n",
73 | "\n",
74 | "Ideally all your code takes as little memory and as few steps as possible (ie O(1) for instantaneous results). In practice, code uses non-trivial amounts of memory and computation. Sometimes, you can consider a tradeoff between space and time requirements. 
For example, if you are using a list and often have to append to the front and the end of the list, then you are better off with a data structure called a double-ended queue (AKA deque). A deque has fast `append()` and `appendleft()` methods as well as `pop()` and `popleft()`, since these operations are all O(1). However, a deque has slow access to the ith element, which is now an O(n) operation. You have to figure out which data structure gives you the best benefit given tradeoffs/constraints."
75 | ]
76 | },
77 | {
78 | "cell_type": "markdown",
79 | "metadata": {},
80 | "source": [
81 | "### Time Complexity of Algorithms\n",
82 | "As with data structures, algorithms also have space and time complexity. An algorithm is just a fancy word for a set of steps to get you to the right answer: sorting algorithm, sudoku solver algorithm, determining the winner in a tournament algorithm.\n",
83 | "\n",
84 | "When an engineer says something is *(computationally) cheap*, that is fancy lingo meaning the calculation uses little resources (low space complexity) and is fast (low time complexity). If a calculation is *(computationally) expensive*, it means it takes a lot of memory or is slow. Ideally you want your algorithm to be cheap.\n",
85 | "\n",
86 | "There are orders of magnitude for complexity of algorithms (and data structures):\n",
87 | "* `O(1)` (AKA constant time): the best (that you can possibly do)!\n",
88 | "* `O(log(n))` (AKA logarithmic time): good\n",
89 | "* `O(n)` (AKA linear time): still good\n",
90 | "* `O(n*log(n))`: fair, use sparingly\n",
91 | "* `O(n^2)` (AKA quadratic time): bad, use very sparingly. 
If the exponent is any number other than 2 (ie 3, 4, ...), then it is called polynomial time\n",
92 | "* `O(2^n)` (AKA exponential time) and `O(n!)`: horrible, ideally you never have to use an algorithm (or data structure) that has these complexities because it will be slow and resource intensive.\n",
93 | " \n",
94 | "Basically, you want your algorithm (or data structure) to *scale* well with the size of your input data. To visualize how important the scaling factor, Big O, is to your algorithm, here's a chart. Stay in the green. \n",
95 | "The chart and a more elaborate explanation of Big O are available at [Understanding time complexity with Python examples](https://towardsdatascience.com/understanding-time-complexity-with-python-examples-2bda6e8158a7)\n",
96 | "


\n", 97 | "\n", 98 | "Here is simple algorithm that concatenates strings together: it exhibits quadratic, O(n^2) time complexity." 99 | ] 100 | }, 101 | { 102 | "cell_type": "code", 103 | "execution_count": 1, 104 | "metadata": {}, 105 | "outputs": [ 106 | { 107 | "name": "stdout", 108 | "output_type": "stream", 109 | "text": [ 110 | "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ\n", 111 | "['i', 'h', 'b', 's', 'v', 'm', 'U', 'q', 's', 'J']\n" 112 | ] 113 | } 114 | ], 115 | "source": [ 116 | "from string import ascii_letters\n", 117 | "from random import choices\n", 118 | "\n", 119 | "print(ascii_letters)\n", 120 | "lots_of_letters = choices(ascii_letters, k=100000)\n", 121 | "print(lots_of_letters[:10])" 122 | ] 123 | }, 124 | { 125 | "cell_type": "code", 126 | "execution_count": 2, 127 | "metadata": {}, 128 | "outputs": [ 129 | { 130 | "name": "stdout", 131 | "output_type": "stream", 132 | "text": [ 133 | "Wall time: 171 ms\n" 134 | ] 135 | } 136 | ], 137 | "source": [ 138 | "%%time\n", 139 | "# functional style: quadratic time\n", 140 | "from functools import reduce\n", 141 | "\n", 142 | "result_FP = reduce(lambda accumulated_string, char: accumulated_string + char, lots_of_letters)" 143 | ] 144 | }, 145 | { 146 | "cell_type": "code", 147 | "execution_count": 3, 148 | "metadata": {}, 149 | "outputs": [ 150 | { 151 | "name": "stdout", 152 | "output_type": "stream", 153 | "text": [ 154 | "Wall time: 31 ms\n" 155 | ] 156 | } 157 | ], 158 | "source": [ 159 | "%%time\n", 160 | "# procedural style: quadratic time\n", 161 | "result_procedural = \"\"\n", 162 | "for char in lots_of_letters:\n", 163 | " result_procedural += char" 164 | ] 165 | }, 166 | { 167 | "cell_type": "markdown", 168 | "metadata": {}, 169 | "source": [ 170 | "Here's a smarter option that gives you linear, O(n) time complexity." 
171 | ] 172 | }, 173 | { 174 | "cell_type": "code", 175 | "execution_count": 4, 176 | "metadata": {}, 177 | "outputs": [ 178 | { 179 | "name": "stdout", 180 | "output_type": "stream", 181 | "text": [ 182 | "Wall time: 1 ms\n" 183 | ] 184 | } 185 | ], 186 | "source": [ 187 | "%%time\n", 188 | "result_OOP = \"\".join(lots_of_letters) # OOP is much faster since runtime is O(n)" 189 | ] 190 | }, 191 | { 192 | "cell_type": "code", 193 | "execution_count": 5, 194 | "metadata": {}, 195 | "outputs": [ 196 | { 197 | "data": { 198 | "text/plain": [ 199 | "True" 200 | ] 201 | }, 202 | "execution_count": 5, 203 | "metadata": {}, 204 | "output_type": "execute_result" 205 | } 206 | ], 207 | "source": [ 208 | "result_FP == result_procedural == result_OOP # proof that the results are the same" 209 | ] 210 | }, 211 | { 212 | "cell_type": "markdown", 213 | "metadata": {}, 214 | "source": [ 215 | "\"Smart data structures and dumb code works a lot better than the other way around.\" -from *The Cathedral and the Bazaar* by Eric S. Raymond\n", 216 | "\n", 217 | "When possible, pick a good data structure that fits your problem--because there are a lot of data structures that do cool and useful things. Using the right data structure is the low-hanging fruit that requires minimum effort. If your logic is fundamentally complicated, then you will have to get creative and write (and optimize) a smart algorithm--that is more work. Of course, the quote is simply a rule of thumb--right most of the time, but not all the time. Use good judgment.\n", 218 | "\n", 219 | "Let me give you a practical example. Often in technical interview questions, the interviewer will ask you to write/whiteboard an algorithm. This is an interview question I received many years ago when I was a beginner: \"Given a list of words, find which words are anagrams of each other.\" Here is my very crude solution." 
220 | ] 221 | }, 222 | { 223 | "cell_type": "code", 224 | "execution_count": 6, 225 | "metadata": {}, 226 | "outputs": [ 227 | { 228 | "data": { 229 | "text/plain": [ 230 | "{'rats': {'arts', 'star'}, 'arts': {'star'}, 'god': {'dog'}}" 231 | ] 232 | }, 233 | "execution_count": 6, 234 | "metadata": {}, 235 | "output_type": "execute_result" 236 | } 237 | ], 238 | "source": [ 239 | "from collections import Counter\n", 240 | "\n", 241 | "def find_anagrams(list_of_words):\n", 242 | " anagrams = {}\n", 243 | " for i, current_word in enumerate(list_of_words):\n", 244 | " for following_word in list_of_words[i+1:]:\n", 245 | " if Counter(current_word) == Counter(following_word):\n", 246 | " if current_word in anagrams:\n", 247 | " anagrams[current_word].add(following_word)\n", 248 | " else:\n", 249 | " anagrams[current_word] = {following_word}\n", 250 | " return anagrams\n", 251 | "\n", 252 | "find_anagrams([\"rats\", \"arts\", \"star\", \"god\", \"dog\", \"apple\"])" 253 | ] 254 | }, 255 | { 256 | "cell_type": "markdown", 257 | "metadata": {}, 258 | "source": [ 259 | "My algorithm is very inefficient since it has O(n^2) time complexity: it has a double for-loop. And it is duplicating the letter counts (ie calling *Counter()*) multiple times for the same word. Finally, the output isn't great.\n", 260 | "\n", 261 | "The correct way (Way of the Pro) is to find some way to transform the data into what I call a \"canonical form\" or \"canonical representation\" of the data that is invariant to differences that don't matter. For example, in this anagram problem, it does not matter what order the letters are in the word. Hence, to get the \"canonical form\", you sort the letters in the word." 
262 | ]
263 | },
264 | {
265 | "cell_type": "code",
266 | "execution_count": 7,
267 | "metadata": {},
268 | "outputs": [
269 | {
270 | "data": {
271 | "text/plain": [
272 | "[['rats', 'arts', 'star'], ['god', 'dog']]"
273 | ]
274 | },
275 | "execution_count": 7,
276 | "metadata": {},
277 | "output_type": "execute_result"
278 | }
279 | ],
280 | "source": [
281 | "from collections import defaultdict\n",
282 | "\n",
283 | "def find_anagrams_optimized(list_of_words):\n",
284 | "    anagrams = defaultdict(list)\n",
285 | "    for word in list_of_words:\n",
286 | "        word_with_letters_sorted = \"\".join(sorted(word))\n",
287 | "        anagrams[word_with_letters_sorted].append(word)\n",
288 | "    return [val for val in anagrams.values() if len(val) > 1]\n",
289 | "\n",
290 | "find_anagrams_optimized([\"rats\", \"arts\", \"star\", \"god\", \"dog\", \"apple\"])"
291 | ]
292 | },
293 | {
294 | "cell_type": "markdown",
295 | "metadata": {},
296 | "source": [
297 | "Notice that I used both:\n",
298 | "* a better data structure: a `defaultdict` (with *list* as the default factory) allows easy appending vs a regular dict\n",
299 | "* a better algorithm: it has linear O(n) time complexity instead of O(n^2), with no redundant, repeated computations\n",
300 | "\n",
301 | "Moreover, the new function uses half the lines of code to get better results, so it is more readable and maintainable."
302 | ]
303 | },
304 | {
305 | "cell_type": "code",
306 | "execution_count": null,
307 | "metadata": {},
308 | "outputs": [],
309 | "source": []
310 | },
311 | {
312 | "cell_type": "markdown",
313 | "metadata": {},
314 | "source": [
315 | "## Vectorized Calculations with SIMD\n",
316 | "Vectorization with numpy and pandas uses SIMD (Single Instruction, Multiple Data): one CPU instruction operates on many data elements at once. \n",
317 | "It is similar to R's apply--but note that `pd.DataFrame.apply` is *not* vectorized, so it reads like vectorized code without the speed benefit. It's a convenience function. 
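As a quick sketch of what vectorization buys you (this is my own minimal illustration, not a rigorous benchmark; it assumes numpy is installed, and timings vary by machine), compare a pure-Python loop against the equivalent vectorized numpy call:

```python
import math
import time

import numpy as np

data = np.random.rand(1_000_000)

# pure-Python loop: one interpreter round-trip per element
t0 = time.perf_counter()
loop_result = sum(x * x for x in data)
loop_time = time.perf_counter() - t0

# vectorized: a single C-level (SIMD-friendly) pass over the whole array
t0 = time.perf_counter()
vec_result = float(np.dot(data, data))
vec_time = time.perf_counter() - t0

print(f"loop: {loop_time:.4f}s, vectorized: {vec_time:.4f}s")
assert math.isclose(loop_result, vec_result, rel_tol=1e-6)
```

On a typical machine the vectorized version is one to two orders of magnitude faster: the Python loop pays interpreter overhead for every one of the million elements, while `np.dot` does the whole pass in compiled code.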
\n", 318 | "CPU vs GPU: ALU vs FPU \n", 319 | "order of magnitude of speed: CPU, RAM, storage, network: have image " 320 | ] 321 | }, 322 | { 323 | "cell_type": "code", 324 | "execution_count": null, 325 | "metadata": {}, 326 | "outputs": [], 327 | "source": [] 328 | }, 329 | { 330 | "cell_type": "markdown", 331 | "metadata": {}, 332 | "source": [ 333 | "## Multi-threading vs Multi-processing\n", 334 | "Multithreading and multiprocessing is when you hope to speed things up rather than wait for things to execute serially (ie a for loop). Multithreading is when you have multiple threads in 1 process, and thus the threads have access to the same variables and objects in memory--this is called *concurrency*. Multiprocessing is when you have multiple processes where each Python process/interpreter/session has its own memory and is running independently of each other--this is called true *parallelism*. Multithreading is faster to start up (<100 ms), but in most cases the right thing to do is use multiprocessing. Multithreading is used for *IO bound* tasks whereas multiprocessing is used for *CPU bound* tasks.\n", 335 | "\n", 336 | " \n", 337 | " \n", 338 | "
\"Concurrency\" \"Parallelism\"
\n", 339 | "Beautiful visual diagrams from https://medium.com/rungo/achieving-concurrency-in-go-3f84cbf870ca" 340 | ] 341 | }, 342 | { 343 | "cell_type": "markdown", 344 | "metadata": {}, 345 | "source": [ 346 | "### Multithreading\n", 347 | "Within 1 Python process, you can create multiple threads. Python has a Global Interpreter Lock (aptly shortened to GIL) that only allows 1 thread to be run at any given time, so you are switching between threads--this is called context switching. So why would you use multithreading instead of regular ol' serial execution (ie for loop)?\n", 348 | "\"Context\n", 349 | "Multithreading is good for *IO bound* tasks. The typical example of IO bound task is scraping/downloading websites.\n", 350 | "Suppose you have 100 urls you want to scrap and each url takes 1 second to scrap. Serially scraping the urls will take 100 seconds. However, if you multithread, IO tasks release the GIL, so you can scrap multiple urls \"at the same time.\" Really, the bulk of the time is waiting for the response of data from the url (as any CPU time is virtually instantaneous), so multithreading can shrink that runtime significantly. Remember, you are still just using 1 CPU. Scraping websites does not take a lot of CPU power. Heavy number crunching is called *CPU bound* tasks, and multithreading does not help--and in fact, will be slower than serial execution.\n", 351 | "\n", 352 | "A Python process uses 1 CPU, so each thread uses a *time slice* of the CPU. The thread that acquires the GIL gets to run for a few milliseconds, then is paused, and then releases the GIL. Another thread acquires the GIL, gets to run for a few milliseconds, then is paused, and then releases the GIL, etc. In practice, because only 1 thread is executed at any 1 time, multithreading is in some sense running \"serially\"--2 things are happening after each other, 2 things are NOT happening at exactly the same time. 
Imagine if you are the CPU and are multi-tasking multiple threads, you can be making a phone call, responding to an email, watching TV. You are switching between those threads very quickly (which are interweaved), so it gives the *impression* that you are doing them \"at the same time.\" However, you are only doing 1 thing at a time.\n", 353 | "\n", 354 | "You don't know which thread is next, you don't know when the current thread is going to be paused, and you don't know which line of code the next thread will resume from. Well, you might think this is a bad idea--and you are correct.\n", 355 | "* Newbie: multithreading sounds like a bad idea. \n", 356 | "* Intermediate: multithreading sounds like it could be good idea. \n", 357 | "* Pro: multithreading is definitely a bad idea!\n", 358 | "\n", 359 | "In fact, multithreading is such a bad idea that Mozilla has this sign at their office. If you want to skip this entire section on multithreading, feel free to do so. This section is to edify you on *not* using this technique.\n", 360 | "\"Don't \n", 361 | "\n", 362 | "\n", 363 | "\n", 364 | "Why is multithreading a bad idea? Multiple threads have access to the same/shared objects in memory, so different threads can modify the shared object. The shared object can be corrupted due to the unexpected order of thread execution--this is called a **race condition**. The typical example is a bank transaction. Suppose Alice and Bob are 2 threads that share 1 bank account, and both want to withdraw money \"at the same time\" (within milliseconds of each other). Here's is the linearized timeline (since only 1 thread is running at any moment in time) of Alice and Bob triggering a race condition and corrupting the state of the shared object: the bank account balance. 
In blockchain, this is called the double spend problem.\n",
365 | "* beginning bank account balance: \$100\n",
366 | "* Alice: begins action of withdrawing \$60\n",
367 | "* Alice: bank checks that balance (of \$100) >= \\$60: yes!\n",
368 | "* Alice: ATM gives Alice \$60\n",
369 | "* Bob: begins action of withdrawing \$70\n",
370 | "* Bob: bank checks that balance (of \$100) >= \\$70: yes!\n",
371 | "* Bob: ATM gives Bob \$70\n",
372 | "* Alice: bank account decrements balance to \$40 (\\$100 - \\$60)\n",
373 | "* Bob: bank account decrements balance to \$30 (\\$100 - \\$70)\n",
374 | "* ending bank account balance: \$30 (even though \\$130 was successfully cashed out and the bank should have prevented Bob from withdrawing money in the first place due to insufficient funds)\n",
375 | "\n",
376 | "\n",
377 | "Race conditions can also occur with the opposite problem, where Alice and Bob are both depositing money \"at the same time\" (within milliseconds of each other) as multiple threads.\n",
378 | "* beginning bank account balance: \$100\n",
379 | "* Alice: begins action of depositing \$60\n",
380 | "* Alice: bank account gets current balance (\$100) and is about to increment\n",
381 | "* Bob: begins action of depositing \$70\n",
382 | "* Bob: bank account gets current balance (\$100) and is about to increment\n",
383 | "* Bob: deposit is successful. Bank account increments balance to \$170 (\\$100 + \\$70)\n",
384 | "* Alice: deposit is successful. Bank account increments balance to \$160 (\\$100 + \\$60)\n",
385 | "* ending bank account balance: \$160 (because it just so happens that Alice's thread finished last)\n",
386 | "\n",
387 | "As you can see, multithreaded actions on a shared object are not \"thread-safe\", as the actions (ie the functions) are not atomic. Thread 1 might do something halfway and pause, and thread 2 can do something else on the same object. Not even functional programming can save you now. 
This problem would not have occurred if you just did things serially (ie a for loop).\n",
388 | "\n",
389 | "The typical example of multithreading triggering a race condition is the bank transaction. Let's do something more fun: tug of war! A deque is a double-ended queue--you can think of it as a list with fast O(1) operations for `popleft()`, `pop()` (ie `popright()`), `appendleft()`, and `append()` (ie `appendright()`). Notice that the deque length does not always stay at 9,999,999 (deque length - 1 for the popped value) and the resulting deque has lost its order."
390 | ]
391 | },
392 | {
393 | "cell_type": "code",
394 | "execution_count": 1,
395 | "metadata": {},
396 | "outputs": [],
397 | "source": [
398 | "def popleft_and_appendright(dq, iterations, checkpoint_iterations):\n",
399 | "    for i in range(iterations):\n",
400 | "        value = dq.popleft()\n",
401 | "        if i % checkpoint_iterations == 0:\n",
402 | "            print(\"length of deque: {}\".format(len(dq)))\n",
403 | "        dq.append(value)\n",
404 | "    print(\"final length of deque: {}\".format(len(dq)))\n",
405 | "\n",
406 | "def popright_and_appendleft(dq, iterations, checkpoint_iterations):\n",
407 | "    for i in range(iterations):\n",
408 | "        value = dq.pop()\n",
409 | "        if i % checkpoint_iterations == 0:\n",
410 | "            print(\"length of deque: {}\".format(len(dq)))\n",
411 | "        dq.appendleft(value)\n",
412 | "    print(\"final length of deque: {}\".format(len(dq)))"
413 | ]
414 | },
415 | {
416 | "cell_type": "code",
417 | "execution_count": 2,
418 | "metadata": {},
419 | "outputs": [
420 | {
421 | "name": "stdout",
422 | "output_type": "stream",
423 | "text": [
424 | "Threads are ready to start!\n",
425 | "length of deque: 9999999Thread 1 is started\n",
426 | "\n",
427 | "length of deque: 9999998Thread 2 is started\n",
428 | "\n",
429 | "Threads 1 and 2 are probably still running\n",
430 | "length of deque: 9999998\n",
431 | "length of deque: 9999998\n",
432 | "length of deque: 9999998length of deque: 9999998\n",
433 | "\n",
434 | "length of deque: 9999998length of deque: 9999998\n", 435 | "\n", 436 | "length of deque: 9999998length of deque: 9999998\n", 437 | "\n", 438 | "length of deque: 9999998length of deque: 9999998\n", 439 | "\n", 440 | "length of deque: 9999999length of deque: 9999998\n", 441 | "\n", 442 | "length of deque: 9999999\n", 443 | "length of deque: 9999998\n", 444 | "length of deque: 9999998length of deque: 9999998\n", 445 | "\n", 446 | "length of deque: 9999999\n", 447 | "length of deque: 9999998\n", 448 | "final length of deque: 10000000\n", 449 | "Thread 1 is done for sure\n", 450 | "final length of deque: 10000000\n", 451 | "Thread 2 is done for sure\n" 452 | ] 453 | } 454 | ], 455 | "source": [ 456 | "from collections import deque\n", 457 | "from threading import Thread\n", 458 | "\n", 459 | "# double-ended queue (deque) has O(1) operation for popleft/popright, appendleft/appendright\n", 460 | "dq = deque(range(10000000)) # this is the shared object\n", 461 | "\n", 462 | "thread1 = Thread(target=popleft_and_appendright, args=(dq, 10000000, 1000000))\n", 463 | "thread2 = Thread(target=popright_and_appendleft, args=(dq, 10000000, 1000000))\n", 464 | "\n", 465 | "print(\"Threads are ready to start!\")\n", 466 | "thread1.start()\n", 467 | "print(\"Thread 1 is started\")\n", 468 | "thread2.start()\n", 469 | "print(\"Thread 2 is started\")\n", 470 | "\n", 471 | "print(\"Threads 1 and 2 are probably still running\")\n", 472 | "thread1.join()\n", 473 | "print(\"Thread 1 is done for sure\")\n", 474 | "thread2.join()\n", 475 | "print(\"Thread 2 is done for sure\")" 476 | ] 477 | }, 478 | { 479 | "cell_type": "code", 480 | "execution_count": 3, 481 | "metadata": {}, 482 | "outputs": [ 483 | { 484 | "name": "stdout", 485 | "output_type": "stream", 486 | "text": [ 487 | "Are all the elements in order? 
Counter({True: 9819747, False: 180253})\n",
488 | "first 10: [9976631, 31728, 27419, 0, 9980878, 2, 3, 4, 5, 6]\n",
489 | "last 10: [9999987, 9999988, 9999989, 9999990, 9999991, 9999992, 9999993, 9999994, 9999995, 9999996]\n"
490 | ]
491 | }
492 | ],
493 | "source": [
494 | "from collections import Counter\n",
495 | "\n",
496 | "print(\"Are all the elements in order? {}\".format(Counter(x == y for x, y in zip(dq, range(len(dq))))))\n",
497 | "print(\"first 10: {}\".format(list(dq)[:10]))\n",
498 | "print(\"last 10: {}\".format(list(dq)[-10:]))"
499 | ]
500 | },
501 | {
502 | "cell_type": "markdown",
503 | "metadata": {},
504 | "source": [
505 | "Here's another example of the race condition: basically the bank account example. The bank account starts at 0 and should end at 0--but where it ends up, nobody knows! 🤔\n",
506 | "\n",
507 | "Also notice that a thread is started by calling `thread1.start()`. Before that moment, the thread has not executed--it is just sitting there. When the line `thread1.start()` is run, notice that it is \"non-blocking\", which means that the thread runs in the \"background\", so Python can execute the next line before thread1 finishes its work. Just like thread1, `thread2.start()` is non-blocking, so you can run the next line of code (such as `print()`). \n",
508 | "\n",
509 | "So how do you know when a thread is done running (because theoretically Python could have run the final line of the .py file but the thread is still chugging along)? You can call `thread1.join()`, which IS a blocking operation--Python will wait until that thread has completed its task. If the thread is already done, then `.join()` will finish very quickly. If the thread is still running, then `.join()` will continue to run/block for an indeterminate amount of time before the next line is run." 
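To make the non-blocking vs blocking behavior concrete, here is a minimal sketch of my own (the 1-second sleep stands in for real work):

```python
import time
from threading import Thread

def slow_task():
    time.sleep(1)  # stand-in for real work (e.g. waiting on a network response)

t = Thread(target=slow_task)

begin = time.perf_counter()
t.start()  # non-blocking: returns almost immediately while slow_task runs in the background
elapsed_after_start = time.perf_counter() - begin

t.join()   # blocking: waits here until slow_task has finished
elapsed_after_join = time.perf_counter() - begin

print(f"after start(): {elapsed_after_start:.2f}s")  # ~0.00s
print(f"after join():  {elapsed_after_join:.2f}s")   # ~1.00s
```

`start()` returns in microseconds even though the task takes a full second; `join()` is where the main thread actually waits.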
510 | ] 511 | }, 512 | { 513 | "cell_type": "code", 514 | "execution_count": 4, 515 | "metadata": {}, 516 | "outputs": [], 517 | "source": [ 518 | "def incrementer(lst, iterations, checkpoint_iterations):\n", 519 | " for i in range(iterations):\n", 520 | " lst[0] += 1\n", 521 | " if i % checkpoint_iterations == 0:\n", 522 | " print(\"lst: {}\".format(lst))\n", 523 | " print(\"final lst: {}\".format(lst))\n", 524 | "\n", 525 | "def decrementer(lst, iterations, checkpoint_iterations):\n", 526 | " for i in range(iterations):\n", 527 | " lst[0] -= 1\n", 528 | " if i % checkpoint_iterations == 0:\n", 529 | " print(\"lst: {}\".format(lst))\n", 530 | " print(\"final lst: {}\".format(lst))" 531 | ] 532 | }, 533 | { 534 | "cell_type": "code", 535 | "execution_count": 5, 536 | "metadata": {}, 537 | "outputs": [ 538 | { 539 | "name": "stdout", 540 | "output_type": "stream", 541 | "text": [ 542 | "lst: [1]\n", 543 | "lst: [13564]\n", 544 | "lst: [207439]\n", 545 | "lst: [227883]\n", 546 | "lst: [673279]\n", 547 | "lst: [454662]\n", 548 | "lst: [721027]\n", 549 | "lst: [638504]\n", 550 | "lst: [610278]\n", 551 | "lst: [540047]\n", 552 | "lst: [587566]\n", 553 | "lst: [645003]\n", 554 | "lst: [797312]\n", 555 | "lst: [767610]\n", 556 | "lst: [854488]\n", 557 | "lst: [493231]\n", 558 | "lst: [521828]\n", 559 | "lst: [606092]\n", 560 | "lst: [704832]\n", 561 | "lst: [569396]\n", 562 | "final lst: [630574]\n", 563 | "final lst: [-273920]\n" 564 | ] 565 | } 566 | ], 567 | "source": [ 568 | "lst = [0] # this is the shared object\n", 569 | "\n", 570 | "thread1 = Thread(target=incrementer, args=(lst, 10000000, 1000000))\n", 571 | "thread2 = Thread(target=decrementer, args=(lst, 10000000, 1000000))\n", 572 | "\n", 573 | "thread1.start()\n", 574 | "thread2.start()\n", 575 | "\n", 576 | "thread1.join()\n", 577 | "thread2.join()" 578 | ] 579 | }, 580 | { 581 | "cell_type": "markdown", 582 | "metadata": {}, 583 | "source": [ 584 | "The purpose of this example is that multithreading causes 
function calls to be \"non-atomic\". Multiple threads live in the same memory space and have access to the same variables, so 2 (or more) threads compete to access and mutate them. Suppose the first thread is executing some lines that update a list; that thread can be paused at any point in the function--and you have no idea where. A second thread can then be activated and mutate that same variable. By the time Python returns to the first thread, the variable may already have changed.\n", 585 | "\n", 586 | "How do you keep threads from tripping over each other when getting/setting the same variable? You create a lock around a piece of code, such that a thread has to acquire the lock before it can run that piece of code and releases the lock when it is finished. A natural question: wouldn't the threaded code then run effectively sequentially, since no 2 threads can run the lock-protected code at the same time? Yep, you are correct. Adding locks basically makes multithreaded code run sequentially.\n", 587 | "\n", 588 | "The only redeeming thing is that if you run threads that apply the same operation to different elements, then you can use `ThreadPool`, which can also function as a context manager."
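Here is a minimal sketch of the lock idea just described, with the iteration counts shrunk so it runs quickly. Wrapping every mutation in `with lock:` makes the increments and decrements balance exactly (at the cost of running effectively sequentially):

```python
from threading import Lock, Thread

lock = Lock()

def locked_incrementer(lst, iterations):
    for _ in range(iterations):
        with lock:        # acquire before mutating; released on leaving the block
            lst[0] += 1

def locked_decrementer(lst, iterations):
    for _ in range(iterations):
        with lock:
            lst[0] -= 1

lst = [0]  # the shared object
threads = [
    Thread(target=locked_incrementer, args=(lst, 100000)),
    Thread(target=locked_decrementer, args=(lst, 100000)),
]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(lst[0])  # 0 -- deterministic now, unlike the unlocked version
```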
589 | ] 590 | }, 591 | { 592 | "cell_type": "code", 593 | "execution_count": 6, 594 | "metadata": {}, 595 | "outputs": [ 596 | { 597 | "name": "stdout", 598 | "output_type": "stream", 599 | "text": [ 600 | "lst: [1]lst: [2]\n", 601 | "\n", 602 | "lst: [1194685]\n", 603 | "lst: [1313960]\n", 604 | "lst: [2481346]lst: [2504151]\n", 605 | "\n", 606 | "lst: [3845698]lst: [3864301]\n", 607 | "\n", 608 | "lst: [4996320]lst: [5056937]\n", 609 | "\n", 610 | "lst: [6183852]\n", 611 | "lst: [6272983]\n", 612 | "lst: [7384063]\n", 613 | "lst: [7437495]\n", 614 | "lst: [8687036]\n", 615 | "lst: [8695291]\n", 616 | "lst: [9921856]\n", 617 | "lst: [10007207]\n", 618 | "lst: [11182370]\n", 619 | "lst: [11243052]\n", 620 | "final lst: [12417084]\n", 621 | "final lst: [12518556]\n", 622 | "12518556\n" 623 | ] 624 | } 625 | ], 626 | "source": [ 627 | "from multiprocessing.pool import ThreadPool\n", 628 | "\n", 629 | "lst = [0] # this is the shared object\n", 630 | "thread_pool = ThreadPool(2) # creating 2 threads\n", 631 | "thread_pool.map(\n", 632 | " func=lambda lst: incrementer(lst, 10000000, 1000000), # hard coded the arguments `iterations` and `checkpoint_iterations`\n", 633 | " iterable=[lst] * 2, # both elements in the list point to the same object\n", 634 | ")\n", 635 | "thread_pool.close() # remember to close the threadpool\n", 636 | "print(lst[0]) # final result should be 20000000 but it's not" 637 | ] 638 | }, 639 | { 640 | "cell_type": "code", 641 | "execution_count": 7, 642 | "metadata": {}, 643 | "outputs": [ 644 | { 645 | "name": "stdout", 646 | "output_type": "stream", 647 | "text": [ 648 | "lst: [1]lst: [2]\n", 649 | "\n", 650 | "lst: [1195744]\n", 651 | "lst: [1311003]\n", 652 | "lst: [2360466]\n", 653 | "lst: [2598903]\n", 654 | "lst: [3566836]\n", 655 | "lst: [3913564]\n", 656 | "lst: [4893890]\n", 657 | "lst: [5275909]\n", 658 | "lst: [6276431]\n", 659 | "lst: [6571892]\n", 660 | "lst: [7445152]\n", 661 | "lst: [7915186]\n", 662 | "lst: [8760764]\n", 663 | "lst: 
[9127098]\n", 664 | "lst: [10046542]\n", 665 | "lst: [10358581]\n", 666 | "lst: [11315160]\n", 667 | "lst: [11499093]\n", 668 | "final lst: [12558194]\n", 669 | "final lst: [12800597]\n", 670 | "12800597\n" 671 | ] 672 | } 673 | ], 674 | "source": [ 675 | "# context manager syntax--better, no manual closing required!\n", 676 | "lst = [0] # this is the shared object\n", 677 | "with ThreadPool(2) as thread_pool:\n", 678 | " thread_pool.map(\n", 679 | " func=lambda lst: incrementer(lst, 10000000, 1000000), # hard coded the arguments `iterations` and `checkpoint_iterations`\n", 680 | " iterable=[lst] * 2, # both elements in the list point to the same object\n", 681 | ")\n", 682 | "print(lst[0]) # final result should be 20000000 but it's not" 683 | ] 684 | }, 685 | { 686 | "cell_type": "markdown", 687 | "metadata": {}, 688 | "source": [ 689 | "Finally, threads are *not* scalable. The scheduling of the threads (ie, which one comes next) is actually done by the OS, so although a couple of milliseconds seems fast to humans, it is slow at computer timescales--computers operate at the nanosecond level (billions of operations per second). Switching between threads is called **context switching**. When there are a lot of threads (say 30), the OS has to pick which thread out of the 30 runs next. In this case, more time is spent acquiring the GIL than actually doing work. It's like the Hunger Games of threads--may the odds be ever in your favor in acquiring the GIL. At that point, it might be better to use serial execution (a for loop) than 30 threads.\n", 690 | "In practice, acquiring the lock is not a costless operation.\n", 691 | "\n", 692 | "\n", 693 | "So where does that leave us? Threads can mutate the same variables in memory and are non-blocking and may provide some speedup for IO operations. The simple answer is to not use multithreading. 
The more nuanced answer is: don't use multithreading for real, production code--you can use multithreading on simple, toy projects where correctness is not essential. \n", 694 | "\n", 695 | "So why did I show all this *useless* stuff? It was to tee up the *useful*: multiprocessing and asynchronous programming." 696 | ] 697 | }, 698 | { 699 | "cell_type": "code", 700 | "execution_count": null, 701 | "metadata": {}, 702 | "outputs": [], 703 | "source": [] 704 | }, 705 | { 706 | "cell_type": "markdown", 707 | "metadata": {}, 708 | "source": [ 709 | "## Multi-processing\n", 710 | "\n", 711 | "How do you get true parallelism (ie multiple things running at the *exact* same time)? You use multiple processes (multiprocessing), so there is no context switching and no problems with the Global Interpreter Lock (GIL). Each (sub)process is basically a separate Python process/kernel (with its own GIL) and can run on a different CPU--this is scheduled by the OS, so you don't need to do it yourself. You can think of multiprocessing as being able to create multiple Python (sub)processes (say 10) with very little code.\n", 712 | "\n", 713 | "Multiprocessing is very good for \"embarrassingly parallel\" problems, which can be subdivided and worked on separately \n", 714 | "and independently. If you want to see the predicted speedup of using multiprocessing, you can look at [Amdahl's law](https://en.wikipedia.org/wiki/Amdahl%27s_law), a formula for how much faster multiprocessing will be than a regular ol' serial loop. Multiprocessing is not so good where the results of A depend on the results of B, which depend on the results of C. 
For example, multiprocessing is good for running many simulations, but it would not be good for something sequential like withdrawals from a bank account (because you want to know when the bank account runs out of money).\n", 715 | "\n", 716 | "A downside of multiprocessing is that each process has separate memory, not shared memory (which multithreading has). Hence, processes generally don't talk to each other. Communication is still possible through IPC (inter-process communication) or RPC (remote procedure calls), but these are not (explicitly) used often. The other small downside is the start-up time for a new process--it takes on the order of 100 milliseconds whereas a new thread can be started much faster. But 100 milliseconds is approximately a blink of an eye.\n", 717 | "\n", 718 | "At a high level, you can think of multiprocessing as distributed functional programming because you are using `map()` over an iterable and the process pool will send each element of the iterable to a process. The process will apply the function onto the element and then return the result back to you.\n", 719 | "\n", 720 | "#### Sound Impressive While Doing Very Little (AKA Working Smarter)\n", 721 | "I want to share my dirty little secret. I ETLed (extract, transform, load) 150 Terabytes of data *without* Spark. To put that in perspective, that is the equivalent of 3,000 HD movies. I used 1 large machine (64 CPUs and 256 GB of RAM). The ETL job was to parse compressed zip files, perform some data transformation, and store into a database. In fact, my Python code was so performant that the machine running Python was only at ~50% CPU utilization while all the servers running NoSQL (HBase) were maxed out at 100% CPU utilization. NoSQL is known for its rapid read and write speeds, so it was actually quite surprising that NoSQL was the bounding constraint and not Python. 
This is a case study of Python being very performant at parallel computation.\n", 722 | "\n", 723 | "\n", 724 | "My other dirty secret is that multiprocessing is very easy to write. You can do impressive things with very little code. Don't work harder, work smarter. \n", 725 | "Note: `multiprocessing` also supports the context manager syntax, which closes the processes for you." 726 | ] 727 | }, 728 | { 729 | "cell_type": "code", 730 | "execution_count": 1, 731 | "metadata": {}, 732 | "outputs": [ 733 | { 734 | "data": { 735 | "text/plain": [ 736 | "[0, 2, 4, 6, 8, 10, 12, 14, 16, 18]" 737 | ] 738 | }, 739 | "execution_count": 1, 740 | "metadata": {}, 741 | "output_type": "execute_result" 742 | } 743 | ], 744 | "source": [ 745 | "import multiprocessing\n", 746 | "\n", 747 | "def double(x): return 2 * x\n", 748 | "\n", 749 | "# on Windows, multiprocessing starts fresh interpreter processes, so you might need to put your code under if __name__ == \"__main__\":\n", 750 | "pool = multiprocessing.Pool(processes=4) # I decide to use 4 Python processes, which can use the full power of 4 CPUs\n", 751 | "result = pool.map(func=double, iterable=range(10))\n", 752 | "pool.close() # remember to close your process pool when you are done\n", 753 | "result # result is collected in the same order as the input" 754 | ] 755 | }, 756 | { 757 | "cell_type": "code", 758 | "execution_count": 2, 759 | "metadata": {}, 760 | "outputs": [ 761 | { 762 | "data": { 763 | "text/plain": [ 764 | "[0, 2, 4, 6, 8, 10, 12, 14, 16, 18]" 765 | ] 766 | }, 767 | "execution_count": 2, 768 | "metadata": {}, 769 | "output_type": "execute_result" 770 | } 771 | ], 772 | "source": [ 773 | "# context manager syntax, preferred as it automatically closes the process pool even if an exception is hit\n", 774 | "with multiprocessing.Pool(processes=multiprocessing.cpu_count()) as pool: # use all my CPUs\n", 775 | " result = pool.map(func=double, iterable=range(10))\n", 776 | "\n", 777 | "result" 778 | ] 779 | }, 
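Amdahl's law mentioned above is easy to sketch: if a fraction `p` of the job parallelizes perfectly across `n` workers, the predicted speedup is `1 / ((1 - p) + p / n)`. A quick illustration with made-up numbers:

```python
def amdahl_speedup(p, n):
    """Predicted speedup when fraction p of the work runs perfectly parallel on n workers."""
    return 1 / ((1 - p) + p / n)

print(round(amdahl_speedup(0.90, 4), 2))      # 3.08
print(round(amdahl_speedup(0.90, 64), 2))     # 8.77
# the serial 10% caps the speedup below 10x, no matter how many workers you add
print(round(amdahl_speedup(0.90, 10**9), 2))  # 10.0
```

The takeaway: the serial fraction, not the worker count, quickly becomes the bottleneck.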
780 | { 781 | "cell_type": "markdown", 782 | "metadata": {}, 783 | "source": [ 784 | "`pool.map()` is a *blocking* method call. Python stays at that line until all the results are completed. However, you can make it non-blocking by `changing pool.map()` to `pool.map_async()`. Just make sure that you get all the results before you close the process pool." 785 | ] 786 | }, 787 | { 788 | "cell_type": "code", 789 | "execution_count": 3, 790 | "metadata": {}, 791 | "outputs": [ 792 | { 793 | "name": "stdout", 794 | "output_type": "stream", 795 | "text": [ 796 | "False\n" 797 | ] 798 | }, 799 | { 800 | "ename": "ValueError", 801 | "evalue": " not ready", 802 | "output_type": "error", 803 | "traceback": [ 804 | "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", 805 | "\u001b[0;31mValueError\u001b[0m Traceback (most recent call last)", 806 | "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m\u001b[0m\n\u001b[1;32m 3\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 4\u001b[0m \u001b[0mprint\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0masync_result\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mready\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m----> 5\u001b[0;31m \u001b[0mprint\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0masync_result\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0msuccessful\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m)\u001b[0m \u001b[0;31m# closed the process pool before all the results are completed\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 6\u001b[0m \u001b[0;31m#print(async_result.get()) # stuck forever as the results are not available\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", 807 | "\u001b[0;32m/usr/lib/python3.8/multiprocessing/pool.py\u001b[0m in \u001b[0;36msuccessful\u001b[0;34m(self)\u001b[0m\n\u001b[1;32m 753\u001b[0m 
\u001b[0;32mdef\u001b[0m \u001b[0msuccessful\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mself\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 754\u001b[0m \u001b[0;32mif\u001b[0m \u001b[0;32mnot\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mready\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 755\u001b[0;31m \u001b[0;32mraise\u001b[0m \u001b[0mValueError\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m\"{0!r} not ready\"\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mformat\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mself\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 756\u001b[0m \u001b[0;32mreturn\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0m_success\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 757\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n", 808 | "\u001b[0;31mValueError\u001b[0m: not ready" 809 | ] 810 | } 811 | ], 812 | "source": [ 813 | "with multiprocessing.Pool(processes=multiprocessing.cpu_count()) as pool:\n", 814 | " async_result = pool.map_async(func=double, iterable=range(10)) # non-blocking so runs instantaneously\n", 815 | "\n", 816 | "print(async_result.ready())\n", 817 | "print(async_result.successful()) # closed the process pool before all the results are completed\n", 818 | "#print(async_result.get()) # stuck forever as the results are not available" 819 | ] 820 | }, 821 | { 822 | "cell_type": "code", 823 | "execution_count": 4, 824 | "metadata": {}, 825 | "outputs": [ 826 | { 827 | "name": "stdout", 828 | "output_type": "stream", 829 | "text": [ 830 | "True\n", 831 | "True\n", 832 | "[0, 2, 4, 6, 8, 10, 12, 14, 16, 18]\n", 833 | "CPU times: user 0 ns, sys: 78.1 ms, total: 78.1 ms\n", 834 | "Wall time: 82.7 ms\n" 835 | ] 836 | } 837 | ], 838 | "source": [ 839 | 
"%%time\n", 840 | "with multiprocessing.Pool(processes=multiprocessing.cpu_count()) as pool:\n", 841 | " async_result = pool.map_async(func=double, iterable=range(10))\n", 842 | " async_result.wait() # wait until all the results collected\n", 843 | " # async_result.wait(1) # you can control how long you want to wait for collecting results\n", 844 | "\n", 845 | "print(async_result.ready())\n", 846 | "print(async_result.successful())\n", 847 | "print(async_result.get())" 848 | ] 849 | }, 850 | { 851 | "cell_type": "code", 852 | "execution_count": 5, 853 | "metadata": {}, 854 | "outputs": [ 855 | { 856 | "name": "stdout", 857 | "output_type": "stream", 858 | "text": [ 859 | "[0, 2, 4, 6, 8, 10, 12, 14, 16, 18]\n" 860 | ] 861 | } 862 | ], 863 | "source": [ 864 | "with multiprocessing.Pool(processes=multiprocessing.cpu_count()) as pool:\n", 865 | " async_result = pool.map_async(func=double, iterable=range(10))\n", 866 | " print(async_result.get()) # get() is a blocking call but then it's basically equal to pool.map()" 867 | ] 868 | }, 869 | { 870 | "cell_type": "markdown", 871 | "metadata": {}, 872 | "source": [ 873 | "A common question is: how many CPUs to use? It depends on the problem.\n", 874 | "* If your function is **CPU bound** (ie uses a lot of heavy computation), then you can use up to as many processes as there are CPUs. The reason is that 1 function on 1 process will use up all the CPU cycles of 1 CPU.\n", 875 | "* If your function is **IO bound** (basically does reads/writes and does not use much CPU), you can use more processes than the number of CPUs. The reason is that multiple processes can run on the same CPU since each process uses little CPU. IO bound problems, if they are not mission critical, can also be addressed with multithreading (mentioned above) or asynchronous programming (mentioned below).\n", 876 | "\n", 877 | "Some gotchas:\n", 878 | "* An additional constraint is RAM. 
If each process takes a lot of RAM, then together all the processes will take too much RAM and then the OS will terminate the process and you'll get `MemoryError`. Hence you might have to manually fiddle with the number of processes to get maximum performance while not hitting a RAM ceiling.\n", 879 | "* `multiprocessing` uses `pickle` to serialize between data and functions from your Python process to each subprocess. Hence objects that are not pickleable can cause trouble (ie lambda functions and generators)." 880 | ] 881 | }, 882 | { 883 | "cell_type": "code", 884 | "execution_count": 6, 885 | "metadata": {}, 886 | "outputs": [ 887 | { 888 | "ename": "PicklingError", 889 | "evalue": "Can't pickle at 0x7fb73c266a60>: attribute lookup on __main__ failed", 890 | "output_type": "error", 891 | "traceback": [ 892 | "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", 893 | "\u001b[0;31mPicklingError\u001b[0m Traceback (most recent call last)", 894 | "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m\u001b[0m\n\u001b[1;32m 2\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 3\u001b[0m \u001b[0;32mwith\u001b[0m \u001b[0mmultiprocessing\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mPool\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mprocesses\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0mmultiprocessing\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mcpu_count\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m)\u001b[0m \u001b[0;32mas\u001b[0m \u001b[0mpool\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m----> 4\u001b[0;31m \u001b[0masync_result\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mpool\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mmap\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mfunc\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0mdouble\u001b[0m\u001b[0;34m,\u001b[0m 
\u001b[0miterable\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0mrange\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;36m10\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", 895 | "\u001b[0;32m/usr/lib/python3.8/multiprocessing/pool.py\u001b[0m in \u001b[0;36mmap\u001b[0;34m(self, func, iterable, chunksize)\u001b[0m\n\u001b[1;32m 362\u001b[0m \u001b[0;32min\u001b[0m \u001b[0ma\u001b[0m \u001b[0mlist\u001b[0m \u001b[0mthat\u001b[0m \u001b[0;32mis\u001b[0m \u001b[0mreturned\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 363\u001b[0m '''\n\u001b[0;32m--> 364\u001b[0;31m \u001b[0;32mreturn\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0m_map_async\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mfunc\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0miterable\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mmapstar\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mchunksize\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mget\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 365\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 366\u001b[0m \u001b[0;32mdef\u001b[0m \u001b[0mstarmap\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mself\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mfunc\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0miterable\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mchunksize\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0;32mNone\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", 896 | "\u001b[0;32m/usr/lib/python3.8/multiprocessing/pool.py\u001b[0m in \u001b[0;36mget\u001b[0;34m(self, timeout)\u001b[0m\n\u001b[1;32m 766\u001b[0m \u001b[0;32mreturn\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0m_value\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 767\u001b[0m 
\u001b[0;32melse\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 768\u001b[0;31m \u001b[0;32mraise\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0m_value\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 769\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 770\u001b[0m \u001b[0;32mdef\u001b[0m \u001b[0m_set\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mself\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mi\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mobj\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", 897 | "\u001b[0;32m/usr/lib/python3.8/multiprocessing/pool.py\u001b[0m in \u001b[0;36m_handle_tasks\u001b[0;34m(taskqueue, put, outqueue, pool, cache)\u001b[0m\n\u001b[1;32m 535\u001b[0m \u001b[0;32mbreak\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 536\u001b[0m \u001b[0;32mtry\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 537\u001b[0;31m \u001b[0mput\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mtask\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 538\u001b[0m \u001b[0;32mexcept\u001b[0m \u001b[0mException\u001b[0m \u001b[0;32mas\u001b[0m \u001b[0me\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 539\u001b[0m \u001b[0mjob\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0midx\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mtask\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;36m2\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", 898 | "\u001b[0;32m/usr/lib/python3.8/multiprocessing/connection.py\u001b[0m in \u001b[0;36msend\u001b[0;34m(self, obj)\u001b[0m\n\u001b[1;32m 204\u001b[0m 
\u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0m_check_closed\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 205\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0m_check_writable\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 206\u001b[0;31m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0m_send_bytes\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0m_ForkingPickler\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mdumps\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mobj\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 207\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 208\u001b[0m \u001b[0;32mdef\u001b[0m \u001b[0mrecv_bytes\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mself\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mmaxlength\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0;32mNone\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", 899 | "\u001b[0;32m/usr/lib/python3.8/multiprocessing/reduction.py\u001b[0m in \u001b[0;36mdumps\u001b[0;34m(cls, obj, protocol)\u001b[0m\n\u001b[1;32m 49\u001b[0m \u001b[0;32mdef\u001b[0m \u001b[0mdumps\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mcls\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mobj\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mprotocol\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0;32mNone\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 50\u001b[0m \u001b[0mbuf\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mio\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mBytesIO\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m---> 51\u001b[0;31m \u001b[0mcls\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mbuf\u001b[0m\u001b[0;34m,\u001b[0m 
\u001b[0mprotocol\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mdump\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mobj\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 52\u001b[0m \u001b[0;32mreturn\u001b[0m \u001b[0mbuf\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mgetbuffer\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 53\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n", 900 | "\u001b[0;31mPicklingError\u001b[0m: Can't pickle at 0x7fb73c266a60>: attribute lookup on __main__ failed" 901 | ] 902 | } 903 | ], 904 | "source": [ 905 | "double = lambda x: x * 2\n", 906 | "\n", 907 | "with multiprocessing.Pool(processes=multiprocessing.cpu_count()) as pool:\n", 908 | " async_result = pool.map(func=double, iterable=range(10))" 909 | ] 910 | }, 911 | { 912 | "cell_type": "markdown", 913 | "metadata": {}, 914 | "source": [ 915 | "#### Advanced `multiprocessing` Arguments\n", 916 | "Everything above is probably all you need for most use cases. Here are some extra features of multiprocessing in case you want some power features.\n", 917 | "* `pool.map(chunksize=SOME_INTEGER_HERE)`: `chunksize` allows how many adjacent elements of the iterable to be sent to 1 process. Suppose you have 4 processes and the iterable is `range(40)` and `chunksize=5`, then elements 1,2,3,4,5 will be sent to 1 process and 6,7,8,9,10 will be sent to another process, etc. By default, chunksize=1.\n", 918 | "* `multiprocessing.Pool(maxtasksperchild=SOME_INTEGER_HERE)`: `maxtasksperchild` says how many times each process will run a chunk before that process is terminated and restarted. This is helpful in a long running job if you believe there might be a memory leak where the RAM utilization keeps increasing over time. 
By setting a small `maxtasksperchild`, even if there is a small memory leak, the process will be restarted to replace the previous process, so you won't hit your RAM ceiling. By default, maxtasksperchild=None (no limit), so the same processes are used for all the elements of the iterable (ie there are no new processes that replace the old ones).\n", 919 | "* `multiprocessing.Pool(initializer=SOME_FUNCTION_HERE)`: `initializer` is a function that runs whenever a process is started up." 920 | ] 921 | }, 922 | { 923 | "cell_type": "code", 924 | "execution_count": 7, 925 | "metadata": {}, 926 | "outputs": [ 927 | { 928 | "name": "stdout", 929 | "output_type": "stream", 930 | "text": [ 931 | "0 2132\n", 932 | "1 2133\n", 933 | "2 2134\n", 934 | "3 2132\n", 935 | "4 2133\n", 936 | "5 2134\n", 937 | "6 2132\n", 938 | "7 2133\n", 939 | "8 2134\n", 940 | "9 2132\n", 941 | "[None, None, None, None, None, None, None, None, None, None]\n" 942 | ] 943 | } 944 | ], 945 | "source": [ 946 | "import os\n", 947 | "import time\n", 948 | "\n", 949 | "def print_process_id(num):\n", 950 | " time.sleep(num)\n", 951 | " print(num, os.getpid(), flush=True)\n", 952 | "\n", 953 | "with multiprocessing.Pool(processes=3) as pool:\n", 954 | " async_result = pool.map(func=print_process_id, iterable=range(10)) # default chunksize=1, so process id alternates\n", 955 | " print(async_result) # just a bunch of Nones" 956 | ] 957 | }, 958 | { 959 | "cell_type": "code", 960 | "execution_count": 8, 961 | "metadata": {}, 962 | "outputs": [ 963 | { 964 | "name": "stdout", 965 | "output_type": "stream", 966 | "text": [ 967 | "0 2218\n", 968 | "1 2218\n", 969 | "2 2219\n", 970 | "4 2220\n", 971 | "3 2219\n", 972 | "6 2218\n", 973 | "5 2220\n", 974 | "8 2219\n", 975 | "7 2218\n", 976 | "9 2219\n" 977 | ] 978 | } 979 | ], 980 | "source": [ 981 | "with multiprocessing.Pool(processes=3) as pool:\n", 982 | " async_result = pool.map(func=print_process_id, iterable=range(10), chunksize=2) # adjacent elements 
are run on the same process" 983 | ] 984 | }, 985 | { 986 | "cell_type": "code", 987 | "execution_count": 9, 988 | "metadata": {}, 989 | "outputs": [ 990 | { 991 | "name": "stdout", 992 | "output_type": "stream", 993 | "text": [ 994 | "0 2304\n", 995 | "1 2305\n", 996 | "2 2306\n", 997 | "3 2304\n", 998 | "4 2305\n", 999 | "5 2306\n", 1000 | "6 2342\n", 1001 | "7 2351\n", 1002 | "8 2360\n", 1003 | "9 2342\n" 1004 | ] 1005 | } 1006 | ], 1007 | "source": [ 1008 | "with multiprocessing.Pool(processes=3, maxtasksperchild=2) as pool:\n", 1009 | " async_result = pool.map(func=print_process_id, iterable=range(10)) # elements 6 thru 9 go to new processes" 1010 | ] 1011 | }, 1012 | { 1013 | "cell_type": "code", 1014 | "execution_count": 10, 1015 | "metadata": {}, 1016 | "outputs": [ 1017 | { 1018 | "name": "stdout", 1019 | "output_type": "stream", 1020 | "text": [ 1021 | "Started process 2394\n", 1022 | "Started process 2397\n", 1023 | "0Started process 2402 \n", 1024 | "2394\n", 1025 | "1 2397\n", 1026 | "2 2402\n", 1027 | "3 2394\n", 1028 | "Started process 2444\n", 1029 | "4 2397\n", 1030 | "Started process 2457\n", 1031 | "5 2402\n", 1032 | "Started process 2470\n", 1033 | "6 2444\n", 1034 | "7 2457\n", 1035 | "8 2470\n", 1036 | "9 2444\n" 1037 | ] 1038 | } 1039 | ], 1040 | "source": [ 1041 | "def start_process():\n", 1042 | " print(\"Started process {}\".format(os.getpid()), flush=True)\n", 1043 | "\n", 1044 | "# it is now more visible to see that the process is (terminated and) restarted\n", 1045 | "with multiprocessing.Pool(processes=3, initializer=start_process, maxtasksperchild=2) as pool: \n", 1046 | " async_result = pool.map(func=print_process_id, iterable=range(10), chunksize=1)" 1047 | ] 1048 | }, 1049 | { 1050 | "cell_type": "markdown", 1051 | "metadata": {}, 1052 | "source": [ 1053 | "#### Multiprocessing Summary\n", 1054 | "* Multiprocessing gives true parallelism (ie multiple things running at the *exact* same time by running on separate CPUs via separate 
processes).\n", 1055 | "* Multiprocessing is great for CPU-bound problems over an iterable. Basically you loop over an iterable and each element is sent to a different process (which is run on a separate CPU), so you save time on getting the final results.\n", 1056 | "\n", 1057 | "#### Last Thoughts\n", 1058 | "* I use `multiprocessing`. In Python 3, you have the option of using `multiprocessing.Pool` or `concurrent.futures.ProcessPoolExecutor`--both are basically the same thing.\n", 1059 | "* Multiprocessing is used to run multiple things on **1** machine. If your job is super-duper intense (due to needing more CPUs or running out of RAM), then there are 2 options:\n", 1060 | " * **vertical scaling**: get a larger machine with more CPUs and RAM. Run everything on that 1 larger machine\n", 1061 | " * **horizontal scaling**: distribute the computations across multiple machines. Horizontal scaling is what the juicy section on *distributed computing frameworks* is all about (mentioned below)🤩!\n", 1062 | "* If you have dependencies across multiple tasks, you can use Dask delayed (also mentioned in the *distributed computing frameworks* section).\n", 1063 | "* If you have IO-bound problems (but you already know that multithreading is bad), then you want to use asynchronous programming." 1064 | ] 1065 | }, 1066 | { 1067 | "cell_type": "code", 1068 | "execution_count": null, 1069 | "metadata": {}, 1070 | "outputs": [], 1071 | "source": [] 1072 | }, 1073 | { 1074 | "cell_type": "markdown", 1075 | "metadata": {}, 1076 | "source": [ 1077 | "## Distributed Computing Frameworks\n", 1078 | "In some sense, distributed computing (AKA *parallel computing* or *high-performance computing*) is glorified multiprocessing. Basically, distributed computing aims to *distribute* workloads horizontally instead of vertically. Horizontal scaling means using lots of small or medium-sized machines to do the work. Vertical scaling means using 1 very large machine (lots of RAM + CPU). Horizontal scaling is preferable because medium-sized machines are commodities--very easy to get access to and cheap. Large machines are very expensive, and sometimes an instance that large is not available.\n", 1079 | "\n", 1080 | "Like multiprocessing, distributed computing works very well with \"embarrassingly parallel\" problems, which are problems that can be subdivided and worked on separately and independently. Fortunately, distributed computing frameworks can coordinate between the worker machines to aggregate results, which is not an embarrassingly parallel task. Take a look at [Amdahl's law](https://en.wikipedia.org/wiki/Amdahl%27s_law), which gives a formula for the predicted speedup over running a regular serial loop. A common lesson of distributed computing is that it isn't about doing one thing exceedingly well; it's about doing nothing poorly.\n", 1081 | "\n", 1082 | "In my opinion, distributed computing frameworks are glorified `map`/`filter`/`reduce` operations. Basically you want to chain a bunch of transformations together. If you have just 1 transformation, you might be better off with just `multiprocessing`. In distributed computing, you have a lot of elements (basically a list where the elements are distributed over multiple machines) and then you want to do 1 thing to it, get the outputted list, and then do another thing to that outputted list. Often the transformations are written using functional programming syntax and applied lazily, so they don't use memory or CPUs until you want to trigger the actual computation. Basically the purpose of the previous notebook (4_Functional_Python.ipynb) was to get you familiar with functional programming syntax, so you can use distributed computing frameworks.\n", 1083 | "\n", 1084 | "Popular distributed computing frameworks in Python are Apache Spark and Dask. 
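To make Amdahl's law concrete, here is a minimal sketch (illustrative numbers only; `amdahl_speedup` is just a helper for this notebook) of the predicted speedup, where `p` is the fraction of the job that can be parallelized and `n` is the number of workers:

```python
def amdahl_speedup(p, n):
    # predicted speedup = 1 / ((1 - p) + p / n)
    # (1 - p) is the serial part; p / n is the parallel part spread over n workers
    return 1 / ((1 - p) + p / n)

# a 95% parallelizable job caps at 1 / 0.05 = 20x, no matter how many workers
print(round(amdahl_speedup(p=0.95, n=10), 2))    # 6.9
print(round(amdahl_speedup(p=0.95, n=1000), 2))  # 19.63
```

Even with 1000 workers, the serial 5% dominates--that is the "doing nothing poorly" lesson in practice.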
Apache Beam and Ray are interesting, but they do not have mainstream adoption--I added them just for fun. 🎉\n", 1085 | "\n", 1086 | "\n", 1087 | "### MapReduce (old news)\n", 1088 | "explain what is MapReduce and disadvantages \n", 1089 | "Out-of-core computation (often mistakenly called out-of-memory computation): enables large matrix multiplications and querying large databases, such as a Hive query executed with MapReduce \n", 1090 | "A `reduce` operation that is commutative and associative can be partially parallelized using a technique called a combiner. \n", 1091 | "MapReduce on-premise cannot be scaled up, so EMR on AWS gives you the ability to scale up when needed. Even a Spark cluster is apportioned and probably isn't ideal for overallocation. \n", 1092 | "show mapreduce stage picture \n", 1093 | "\n", 1094 | "### Apache Spark (industry standard 🏆)\n", 1095 | "If you want to get a high-demand skill to get a new job, learn Spark. I always say: `If you know Spark, you can't get fired!` \n", 1096 | "Spark is basically MapReduce in memory. The bottleneck in large-scale computation is getting the data to the CPU. RAM is way faster than disk; orders of magnitude; get picture of CPU cache, RAM, disk, network \n", 1097 | "Simple Spark \n", 1098 | "Spark and Dask: often a functional approach; functoolz syntax translates to Dask bags; FP enables embarrassingly parallel execution and laziness. \n", 1099 | "Don't need Scala for speed \n", 1100 | "Weird Java exceptions \n", 1101 | "Show Simple-Spark \n", 1102 | "Weakness: no ability to do multiple actions. The workaround is to split the pipeline into 2 (such as word count and character count simultaneously) and then do a groupby. \n", 1103 | "DataFrame, RDD\n", 1104 | "show Spark architecture picture \n", 1105 | "\n", 1106 | "### Dask (new framework with growing community adoption 🌱)\n", 1107 | "Distributed + Task \n", 1108 | "`If you learn Dask, then you know Spark. And if you know Spark, you can't get fired!` \n", 1109 | "Compare and contrast \n", 1110 | "No weird Java exceptions \n", 1111 | "Show Dask tutorial (tree aggregation example) \n", 1112 | "delayed and futures \n", 1113 | "x = x.map_blocks(compress).persist().map_blocks(decompress) \n", 1114 | "Can do multiple actions \n", 1115 | "DataFrame, array, bag, delayed\n", 1116 | "show Dask architecture picture (basically the same thing) \n", 1117 | "\n", 1118 | "### Apache Beam (theoretically interesting but nobody uses 😞)\n", 1119 | "explain why it is cool: unified API between Batch + Stream, auto-scaling, serverless, templates, can do multiple actions \n", 1120 | "PCollections, ParDo \n", 1121 | "explain its weaknesses: does not support data with schema, no iterative algorithms like K-Means, also no concept of action so have to use trick to trigger finality \n", 1122 | "Throw picture in: 62 machines running for 30 minutes cost \$2.50. \n", 1123 | "

\n", 1124 | "\n", 1125 | "Spark, Dask, Beam \n", 1126 | "Features: \n", 1127 | "* **batch**: first class, first class, first class\n", 1128 | "* stream: first class, kinda, first class\n", 1129 | "* **real-time dashboard (for debugging and throughput)**: yes, yes, yes\n", 1130 | "* **interactive mode/REPL**: yes, yes, yes\n", 1131 | "* **persist (memory and disk)**: memory+disk, only memory, no\n", 1132 | "* **with persist, can execute different sections independently**: yes, yes, no\n", 1133 | "* **can split 1 input to multiple output pipelines**: yes, yes, yes\n", 1134 | "* **simultaneous actions on multiple output pipelines**: no, yes, yes\n", 1135 | "* support for branching logic of pipeline: yes, yes, yes\n", 1136 | "* support for iterative algorithms: yes, yes, no\n", 1137 | "* **GROUP BY logic**: yes, yes, yes\n", 1138 | "* **JOIN logic**: yes, yes, yes\n", 1139 | "* **tagged output pipelines**: no, no, yes\n", 1140 | "* **combiner/aggregate logic**: yes, yes, yes\n", 1141 | "* **SQL support**: yes, kinda, no\n", 1142 | "* distributed numpy array: no, yes, no\n", 1143 | "* ML algorithms: yes, yes, no\n", 1144 | "* futures: no, yes, no\n", 1145 | "* delayed: no, yes, no\n", 1146 | "* support to address \"hot keys\": yes, yes, yes\n", 1147 | "* **shared variables**: broadcast variables and accumulators, no, idk\n", 1148 | "* supports logging for debugging: yes, yes, yes\n", 1149 | "* **supports multiple input/output file types**: yes, yes, yes\n", 1150 | "* click-based or GUI-based template: no, no, yes\n", 1151 | "* **auto scaling**: idk, idk, yes\n", 1152 | "\n", 1153 | "Other support:\n", 1154 | "* integration: many, growing, primarily only GCP\n", 1155 | "* pure Python (3): no, yes, yes\n", 1156 | "* **easy to run/test on your own machine**: yes, yes, yes\n", 1157 | "* performant local cluster on your machine: idk, yes, no\n", 1158 | "* speed: Scala (fast), Python (less fast), Python (less fast)\n", 1159 | "* **documentation**: great, great, lacking\n", 1160 | 
"* user base: big, medium, low\n", 1161 | "\n", 1162 | "\n", 1163 | "Which one to use: Spark is a very hot data engineering skill. I tell my teammates: \"if you know Spark, then you can't be fired.\" Since I work on Dask, I also tell my teammates: \"if you know Dask, then you know Spark. If you know Spark, then you can't be fired.\" \n", 1164 | "the main problem is RAM vs CPU \n", 1165 | "MapReduce for RAM. The better way is to use Spark, which scales horizontally instead of vertically--it is still doing multiprocessing for RAM\n", 1166 | "\n", 1167 | "\n", 1168 | "### Ray (and Actor Model)\n", 1169 | "like a blend of functional and OOP because it is stateful between transformations and dependencies. Like Dask futures \n", 1170 | "Plasma -> Apache Arrow \n", 1171 | "totally novel architecture that overcomes IPC/RPC latency \n", 1172 | "show Ray architecture picture \n", 1173 | "\n", 1174 | "\n", 1175 | "These things are not built in a vacuum: progress stands on the shoulders of giants. Their histories are intertwined. Spark is basically MapReduce but in memory for the speed. Apache Beam originated from Google's internal MillWheel project and was marketed as true streaming. Spark basically copied Beam's approach. Dask was created originally for distributed arrays because Spark doesn't support them. Ray uses Redis and was probably inspired by Akka (framework in Scala). Dask adopted Ray's actor model. Basically copy ideas that work and dump the rest. A lot of frameworks don't make the cut and just fade out. Other frameworks: Apache Apex (dead), Apache Gearpump (basically dead), Apache Samza (~~lame~~ only Java), Apache Flink (seems like a Hadoop thing, so basically for legacy applications) " 1176 | ] 1177 | }, 1178 | { 1179 | "cell_type": "code", 1180 | "execution_count": null, 1181 | "metadata": {}, 1182 | "outputs": [], 1183 | "source": [] 1184 | }, 1185 | { 1186 | "cell_type": "code", 1187 | "execution_count": null, 1188 | "metadata": {}, 1189 | "outputs": [], 1190 | "source": [] 1191 | }, 1192 | { 1193 | "cell_type": "markdown", 1194 | "metadata": {}, 1195 | "source": [ 1196 | "## Asynchronous Programming/Event-Driven Programming\n", 1197 | "asyncio is in some sense serial, with the indirection of a coroutine. asynchronous programming is implemented in 3 ways: callbacks (callback hell from the nested structure), futures/promises (flatter; looks like method chaining), and async/await (flattest) \n", 1198 | "yield from \n", 1199 | "yield and return in the same function in a coroutine \n", 1200 | "you don't need asyncio to do asynchronous programming \n", 1201 | "C10k problem \n", 1202 | "basically a queue--asynchronous because not executed immediately. 
The event loop is the buffer that manages the queue \n", 1203 | "\n", 1204 | "cooperative vs preemptive multitasking \n", 1205 | "https://pybay.com/site_media/slides/raymond2017-keynote/async_examples.html" 1206 | ] 1207 | }, 1208 | { 1209 | "cell_type": "code", 1210 | "execution_count": null, 1211 | "metadata": {}, 1212 | "outputs": [], 1213 | "source": [] 1214 | }, 1215 | { 1216 | "cell_type": "code", 1217 | "execution_count": null, 1218 | "metadata": {}, 1219 | "outputs": [], 1220 | "source": [] 1221 | }, 1222 | { 1223 | "cell_type": "markdown", 1224 | "metadata": {}, 1225 | "source": [ 1226 | "## Virtual Machines vs Containers\n", 1227 | "VM vs Docker/containerization, multi-tenancy \n", 1228 | "serverless computing, scalable/decoupled, notifications and (asynchronous) PubSub connected to AWS Lambda (in some sense distributed but independent multiprocessing) \n", 1229 | "world has moved on from hardware -> co-location -> virtualization (rented VM) -> container -> function: https://read.acloud.guru/the-evolution-from-servers-to-functions-21833b576744 \n", 1230 | "Focus on your business logic and delegate the rest to AWS. Who knows (and who cares) how it's really implemented? Are you in the business of creating your private cloud/Hadoop or solving business problems that make $$$? \n", 1231 | "\n", 1232 | "\n", 1233 | "## What is the Cloud?\n", 1234 | "There's an old saying: \"Nobody gets fired for buying IBM\". That was decades ago. I think the modern version would be \"Nobody gets fired for buying AWS\". The cloud is so versatile with all the services built in, that if you are still buying hardware for your own datacenters (without good reasons), you *should* be fired. One of the clients I worked on is such an old-boys club that they still buy hardware, explain story. \n", 1235 | "AWS is really supply chain maximization, squeezing out efficiencies from elementary pieces: compute, storage, and networking.
With these Lego blocks, they can create managed services that are just copies of Apache/open-source software, and charge for it. The cloud doesn't cost you money--it saves you money. Overall a win-win scenario. Like the Costco or Trader Joe's model. Show all the logos of services from AWS or GCP. Show the open source version and then the AWS service \n", 1236 | "\n", 1237 | "Why do you buy a car? Because it is better, faster, and/or cheaper than you can make it yourself. It just \"works out of the box.\" You could build a car yourself; it would have worse functionality, be slower to get to market, more expensive to build from scratch, and a nightmare to maintain. Just buy it off-the-shelf. Focus on your main problem. Don't roll your own--you must be high if you do. Plus, if something breaks, there's somebody you can ~~complain to~~ ask for help.\n", 1238 | "\n" 1239 | ] 1240 | }, 1241 | { 1242 | "cell_type": "code", 1243 | "execution_count": null, 1244 | "metadata": {}, 1245 | "outputs": [], 1246 | "source": [] 1247 | }, 1248 | { 1249 | "cell_type": "markdown", 1250 | "metadata": {}, 1251 | "source": [ 1252 | "## Extra Resources\n", 1253 | "* Raymond Hettinger, Keynote on Concurrency, PyBay 2017: https://www.youtube.com/watch?v=9zinZmE3Ogk and presentation materials at:\n", 1254 | " * multithreading: https://pybay.com/site_media/slides/raymond2017-keynote/threading.html\n", 1255 | " * multiprocessing: https://pybay.com/site_media/slides/raymond2017-keynote/process.html\n", 1256 | " * asynchronous: https://pybay.com/site_media/slides/raymond2017-keynote/async_examples.html \n", 1257 | "* About asynchronous programming from the creator of Twisted and a key influencer of asyncio: https://glyph.twistedmatrix.com/2014/02/unyielding.html" 1258 | ] 1259 | } 1260 | ], 1261 | "metadata": { 1262 | "kernelspec": { 1263 | "display_name": "Python 3", 1264 | "language": "python", 1265 | "name": "python3" 1266 | }, 1267 | "language_info": { 1268 | "codemirror_mode": { 1269 | "name": "ipython", 1270 | 
"version": 3 1271 | }, 1272 | "file_extension": ".py", 1273 | "mimetype": "text/x-python", 1274 | "name": "python", 1275 | "nbconvert_exporter": "python", 1276 | "pygments_lexer": "ipython3", 1277 | "version": "3.9.7" 1278 | } 1279 | }, 1280 | "nbformat": 4, 1281 | "nbformat_minor": 4 1282 | } 1283 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # So you want to be a ~~millionaire~~ awesome Python programmer? 2 | 3 | * You want to code like the pros. 4 | * You want to write elegant, clean code that your colleagues consider "best practice" (AKA "idiomatic" Python). 5 | * You want to write high-quality code, so that you and your colleagues can maintain it in the future ... so you can go home earlier AND PARTY IT UP. 6 | * You want to be the coolest programmer on your team, get promoted to Senior Principal Python Genius, and make the BIG $$$! 7 | * You want to learn how the descriptor protocol implements magic method calls, which you can hijack by inserting malicious code into super classes such that method calls up the C3 linearization graph destroy your competitor's code! 8 | 9 | Well, if you are reading this notebook for the right reasons, you'll be a **certified Python Pro!** If not..., even better: you'll be a **millionaire watching your competitor crying in ruins!** 10 |

thank goodness it's not Evil Ernie's Smiley

 11 | 12 | # Why Python? 13 | Python takes a multi-paradigm approach: 14 | * Pragmatic problem solving: Lua, PHP, Perl 15 | * Procedural programming: C, Rust, Cython 16 | * Object-oriented programming: Java, C#, Eiffel 17 | * Functional programming: Haskell, Scala, Clojure, F# 18 | * High performance with parallel/array-oriented processing: MATLAB/Octave, Julia, R, Rust, Go 19 | * Asynchronous/Event-driven programming: JavaScript, Go, Erlang, Elixir 20 |

21 | 22 | 23 | Testimonials: 24 | * "It is the most frequently taught first language at universities nowadays, it is number one in the statistical domain, number one in AI programming, number one in scripting and number one in writing system tests. Besides this, Python is also leading in web programming and scientific computing (just to name some other domains). In summary, Python is everywhere." 25 | * "There are several reasons for Python's rapid rise, bloom, and dominance in multiple domains, including web development, scientific computing, testing, data science, machine learning, and more. The reasons include its readable and maintainable code; extensive support for third-party integrations and libraries; modular, dynamic, and portable structure; flexible programming; learning ease and support; user-friendly data structures; productivity and speed; and, most important, community support. The diverse application of Python is a result of its combined features, which give it an edge over other languages." 26 | * "Very few languages can match Python's ability to conform to a developer's coding style rather than forcing him or her to code in a particular way. Python lets more advanced developers use the style they feel is best suited to solve a particular problem." 27 |

28 | 29 | 30 | **TLDR**: high level qualities of Python: 31 | * Ease of learning 32 | * Economy of expression 33 | * Readability and beauty 34 | * Batteries included 35 | * Rapid development cycle 36 | * Interactive prompt 37 | * One way to do it 38 |

 39 | 40 | 41 | Guido van Rossum (retired Benevolent Dictator for Life [BDFL]) emphasizes readability. Here are some other fundamental core tenets of Python. 42 | ```python 43 | import this 44 | # The Zen of Python, by Tim Peters 45 | 46 | # Beautiful is better than ugly. 47 | # Explicit is better than implicit. 48 | # Simple is better than complex. 49 | # Complex is better than complicated. 50 | # Flat is better than nested. 51 | # Sparse is better than dense. 52 | # Readability counts. 53 | # Special cases aren't special enough to break the rules. 54 | # Although practicality beats purity. 55 | # Errors should never pass silently. 56 | # Unless explicitly silenced. 57 | # In the face of ambiguity, refuse the temptation to guess. 58 | # There should be one-- and preferably only one --obvious way to do it. 59 | # Although that way may not be obvious at first unless you're Dutch. 60 | # Now is better than never. 61 | # Although never is often better than *right* now. 62 | # If the implementation is hard to explain, it's a bad idea. 63 | # If the implementation is easy to explain, it may be a good idea. 64 | # Namespaces are one honking great idea -- let's do more of those! 65 | 66 | 67 | # Another Easter egg 68 | from __future__ import braces 69 | # SyntaxError: not a chance 70 | ``` 71 |
 72 | 73 | 74 | # Don't Just Listen, Show Me the Code 75 | mybinder.org allows you to run Jupyter Notebooks for any repo for free! Go to https://mybinder.org and copy+paste this repo url (https://github.com/eugeneh101/advanced-python) into "GitHub repository name or URL". Or you can simply click on this link (https://mybinder.org/v2/gh/eugeneh101/advanced-python/master) or this neat button ([![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/eugeneh101/advanced-python/master)). If you prefer JupyterLab instead of Jupyter Notebook, then click this link: https://mybinder.org/v2/gh/eugeneh101/advanced-python/master?urlpath=lab 76 |

free software!

77 |
 78 | 79 | 80 | # Spoilers 81 | Here's a sneak peek of the topics covered in each notebook. If you see a topic that you especially want to learn, just jump to that notebook! 82 | 83 | **1_Pragmatic_Python.ipynb**: Quickstart Guide to Writing Idiomatic Python 84 | * Warmups 85 | * `enumerate/zip` 86 | * Versatility of `dict`s 87 | * Ternary operators 88 | * List comprehension 89 | * Nested comprehension and double list comprehension 90 | * Dict, set, and tuple comprehension 91 | * `try/except` 92 | * Practical Object-Oriented Programming 93 | * Practical Functional Programming 94 | * Lambda functions 95 | * `map/filter/reduce` 96 | * Generators/lazy iterators 97 | * Generator expressions 98 | * Generator functions 99 | * Infinite generators 100 | * `*args/**kwargs` 101 | * Decorators 102 | * Practical Debugging Tricks 103 | * Helpful libraries 104 | * Pythonic syntax 105 | * String interpolation 106 | * Context manager 107 | * Implicit line concatenation and implicit string concatenation 108 | * Trailing comma 109 | * PEP8: Python's style guide 110 | 111 | **2_Procedural_Python.ipynb**: Mastering Control Flow 112 | * Difference between expression and statement 113 | * `pass/break/continue` 114 | * `if/elif/else` construct 115 | * `for/else` construct 116 | * `while/else` construct 117 | * `try/except/else/finally` construct 118 | * Context manager with `try/finally` 119 | * case/switch statement in Python 120 | 121 | **3_Object_Oriented_Python.ipynb**: Deep Dive into Object-Oriented Programming 122 | * Class/object/method/attribute 123 | * Method types 124 | * Instance method 125 | * Class method 126 | * Static method 127 | * Attribute types 128 | * Instance attribute 129 | * Class attribute 130 | * Classic classes vs new-style classes 131 | * Similarities between a module and a class 132 | * `@property`: accessor (getter) and mutator (setter) 133 | * Fluent interface 134 | * Python's data model: magic/dunder methods 135 | * Objects are really dictionaries underneath 
136 | * Method overriding 137 | * Operator overloading 138 | * Reflected arithmetic operators 139 | * Making an instance callable 140 | * Context manager with `__enter__/__exit__` 141 | * Inheritance 142 | * `super()` for extending a parent class's method 143 | * Advanced inheritance topics 144 | * Advanced OOP tricks 145 | 146 | **4_Functional_Python.ipynb**: Deep Dive into Functional Programming 147 | * All functions return something 148 | * `map/filter/reduce` 149 | * `starmap` 150 | * Immutability (and copying) 151 | * Pure functions 152 | * Higher-order functions 153 | * Decorator deep dive 154 | * Closure 155 | * `*args/**kwargs` 156 | * `@wraps` 157 | * Decorator on top of decorator 158 | * Decorator with arguments 159 | * Recursion 160 | * Tail call optimization's duality to iteration 161 | * Academic FP topics 162 | * Currying 163 | * Partial functions 164 | * Function composition 165 | * Topics specific to Python's functions 166 | * Functions are first class 167 | * Caching 168 | * Combining FP with OOP and Procedural 169 | * Functional programming + method chaining/fluent interface 170 | * `singledispatch()` 171 | * Infinite recursion depth 172 | 173 | **5_High_Performance_Python.ipynb**: Writing Performant Python at Scale 174 | * Debunking the "slow" Python myth 175 | * Big O notation: runtime and space complexity 176 | * Time complexity of data structures 177 | * Time complexity of algorithms 178 | * Vectorization/SIMD parallelization 179 | * Multithreading (concurrency) 180 | * Multiprocessing (parallelism) 181 | * Sound impressive while writing very little code 182 | * Advanced `multiprocessing` arguments 183 | * Multiprocessing summary and last thoughts 184 | * Event-driven/asynchronous programming 185 | * Coroutines 186 | * MapReduce/Spark/Dask 187 | * Ray 188 | 189 | **6_Pedantic_Python_Tricks.ipynb**: Extra Tricks of the Trade 190 | * Name binding 191 | * Reference counting 192 | * Tuple unpacking 193 | * Multiple assignment 194 | * Tuple unpacking 
+ multiple assignment 195 | * Accidentally overwriting a built-in 196 | * Multiple comparison 197 | * Short circuiting with `and` + `or` 198 | * Variable scoping/namespace: LEGB rule 199 | * One value for a given variable name in a given namespace 200 | * (Re)assignment with `global` and `nonlocal` 201 | * `exec` and `eval` 202 | * Don't use mutable default arguments 203 | * Don't fear importing a module multiple times 204 | * Extra fun details 205 | * Specification vs implementation 206 | * Interpreted vs compiled 207 | * Dynamically typed vs statically typed 208 | * Strongly typed vs weakly typed 209 | * Pass by object 210 | * Not (really) one way to do it 211 | * New features: looking into the future 212 | -------------------------------------------------------------------------------- /images/Big_O_magnitudes.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/eugeneh101/Advanced-Python/64cafbc763311a23f93fc39ad237ee779d570c6f/images/Big_O_magnitudes.png -------------------------------------------------------------------------------- /images/Dataflow_pipeline.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/eugeneh101/Advanced-Python/64cafbc763311a23f93fc39ad237ee779d570c6f/images/Dataflow_pipeline.png -------------------------------------------------------------------------------- /images/GED_triplet.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/eugeneh101/Advanced-Python/64cafbc763311a23f93fc39ad237ee779d570c6f/images/GED_triplet.png -------------------------------------------------------------------------------- /images/LEGB_Variable_Scoping.png: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/eugeneh101/Advanced-Python/64cafbc763311a23f93fc39ad237ee779d570c6f/images/LEGB_Variable_Scoping.png -------------------------------------------------------------------------------- /images/Python-logo.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/eugeneh101/Advanced-Python/64cafbc763311a23f93fc39ad237ee779d570c6f/images/Python-logo.png -------------------------------------------------------------------------------- /images/Tensorflow_logo.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/eugeneh101/Advanced-Python/64cafbc763311a23f93fc39ad237ee779d570c6f/images/Tensorflow_logo.png -------------------------------------------------------------------------------- /images/The-Big-O.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/eugeneh101/Advanced-Python/64cafbc763311a23f93fc39ad237ee779d570c6f/images/The-Big-O.jpg -------------------------------------------------------------------------------- /images/advanced-python.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/eugeneh101/Advanced-Python/64cafbc763311a23f93fc39ad237ee779d570c6f/images/advanced-python.jpg -------------------------------------------------------------------------------- /images/alternative_constructors.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/eugeneh101/Advanced-Python/64cafbc763311a23f93fc39ad237ee779d570c6f/images/alternative_constructors.jpg -------------------------------------------------------------------------------- /images/be_this_tall_to_use_multithreading.jpg: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/eugeneh101/Advanced-Python/64cafbc763311a23f93fc39ad237ee779d570c6f/images/be_this_tall_to_use_multithreading.jpg -------------------------------------------------------------------------------- /images/calling-python.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/eugeneh101/Advanced-Python/64cafbc763311a23f93fc39ad237ee779d570c6f/images/calling-python.jpg -------------------------------------------------------------------------------- /images/concurrency.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/eugeneh101/Advanced-Python/64cafbc763311a23f93fc39ad237ee779d570c6f/images/concurrency.jpg -------------------------------------------------------------------------------- /images/context_switching_cost.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/eugeneh101/Advanced-Python/64cafbc763311a23f93fc39ad237ee779d570c6f/images/context_switching_cost.jpg -------------------------------------------------------------------------------- /images/daleks-exterminate.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/eugeneh101/Advanced-Python/64cafbc763311a23f93fc39ad237ee779d570c6f/images/daleks-exterminate.jpg -------------------------------------------------------------------------------- /images/multithreaded_process.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/eugeneh101/Advanced-Python/64cafbc763311a23f93fc39ad237ee779d570c6f/images/multithreaded_process.png -------------------------------------------------------------------------------- /images/mybinder_screenshot.jpg: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/eugeneh101/Advanced-Python/64cafbc763311a23f93fc39ad237ee779d570c6f/images/mybinder_screenshot.jpg -------------------------------------------------------------------------------- /images/parallelism.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/eugeneh101/Advanced-Python/64cafbc763311a23f93fc39ad237ee779d570c6f/images/parallelism.jpg -------------------------------------------------------------------------------- /images/party_emoji.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/eugeneh101/Advanced-Python/64cafbc763311a23f93fc39ad237ee779d570c6f/images/party_emoji.jpg -------------------------------------------------------------------------------- /images/python-word-cloud.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/eugeneh101/Advanced-Python/64cafbc763311a23f93fc39ad237ee779d570c6f/images/python-word-cloud.jpg -------------------------------------------------------------------------------- /images/python_is_awesome.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/eugeneh101/Advanced-Python/64cafbc763311a23f93fc39ad237ee779d570c6f/images/python_is_awesome.jpg -------------------------------------------------------------------------------- /images/python_yin_yang.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/eugeneh101/Advanced-Python/64cafbc763311a23f93fc39ad237ee779d570c6f/images/python_yin_yang.png -------------------------------------------------------------------------------- /images/recursion_call_stack.gif: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/eugeneh101/Advanced-Python/64cafbc763311a23f93fc39ad237ee779d570c6f/images/recursion_call_stack.gif -------------------------------------------------------------------------------- /images/ricer.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/eugeneh101/Advanced-Python/64cafbc763311a23f93fc39ad237ee779d570c6f/images/ricer.jpg -------------------------------------------------------------------------------- /images/scientific-ecosystem.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/eugeneh101/Advanced-Python/64cafbc763311a23f93fc39ad237ee779d570c6f/images/scientific-ecosystem.png -------------------------------------------------------------------------------- /images/thumbs-up.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/eugeneh101/Advanced-Python/64cafbc763311a23f93fc39ad237ee779d570c6f/images/thumbs-up.png -------------------------------------------------------------------------------- /requirements.txt: -------------------------------------------------------------------------------- 1 | autopep8 2 | fluentpy 3 | ipdb 4 | matplotlib 5 | pandas 6 | pytest 7 | toolz 8 | tqdm --------------------------------------------------------------------------------