├── 01_Compare_How_Fast_Are_BuiltIn_Ops.ipynb ├── 02_Compare_eq_UFUNCS.ipynb ├── 03_Compare_Aggregations.ipynb ├── 04_Compare_Broadcasting.ipynb ├── 05_Compare_Where_Select.ipynb ├── 06_Compare_Sorting.ipynb ├── 07_Numpy_Select_Pandas.ipynb ├── 08_Compare_CrossEntropy_Softmax_KL_Mean_Std.ipynb ├── 09_Inverse_Sqrt.ipynb ├── Assets ├── Broadcasting.png ├── ConditionalLogic.png ├── IDCGetStarted.png ├── IDCLandingPage.png ├── IDCLauncher.png ├── IDCTrainingPage.png ├── NumpyAxis0.PNG ├── NumpyAxis1.PNG ├── PairwiseSimple.PNG ├── PairwiseStocks.jpg ├── SimpleLogic.png └── SlowWadeWater.png ├── README.md ├── build ├── lib.linux-x86_64-cpython-39 │ ├── cython_Exact.cpython-39-x86_64-linux-gnu.so │ └── cython_NewtonRecipSqrt.cpython-39-x86_64-linux-gnu.so └── temp.linux-x86_64-cpython-39 │ ├── cython_Exact.o │ └── cython_NewtonRecipSqrt.o ├── cython_Exact.c ├── cython_Exact.cpython-39-x86_64-linux-gnu.so ├── cython_Exact.pyx ├── cython_NewtonRecipSqrt.c ├── cython_NewtonRecipSqrt.cpython-39-x86_64-linux-gnu.so ├── cython_NewtonRecipSqrt.pyx └── setup.py /01_Compare_How_Fast_Are_BuiltIn_Ops.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": { 6 | "tags": [] 7 | }, 8 | "source": [ 9 | "# Introduction to PyTorch built-in functions to replace loopy code\n", 10 | "\n", 11 | "### Replacing Inefficient code\n", 12 | "![SLowWadeWater.PNG](Assets/SlowWadeWater.png)\n", 13 | "\n", 14 | "\n", 15 | "### Why Vectorize \n", 16 | "\n", 17 | "(In Python parlance = cache optimization + SIMD)?\n", 18 | "- __[A New Golden Age for Computer Architecture](https://www.doc.ic.ac.uk/~wl/teachlocal/arch/papers/cacm19golden-age.pdf)__\n", 19 | "Hennessy and Patterson \n", 20 | "\n", 21 | "- *\"Optimizing the memory layout to exploit caches yields a factor of 20, and a final factor of 9 comes from using the hardware extensions for doing single instruction multiple data (SIMD) parallelism 
operations that are able to perform 16 32-bit operations per instruction. All told, the final, highly optimized version runs more than 62,000× faster on a multicore Intel processor compared to the original Python version. This is of course a small example, one might expect programmers to use an optimized library for. Although it exaggerates the usual performance gap, there are likely many programs for which factors of 100 to 1,000 could be achieved\"*\n", 22 | "\n", 23 | "\n", 24 | "\n", 25 | "\n", 26 | "### Learning Objectives: \n", 27 | "\n", 28 | "- Describe why inefficient code, such as time-consuming loops, wastes resources and time\n", 29 | "- Describe why using Python for highly repetitive small tasks is inefficient\n", 30 | "- Describe the additive value of leveraging packages such as PyTorch and NumPy which are powered by oneAPI in a cloud world \n", 31 | "- Describe the importance of keeping oneAPI and 3rd-party packages such as PyTorch, NumPy, SciPy and others up to date\n", 32 | "- Enumerate ways in which PyTorch and NumPy accelerate code\n", 33 | "- Apply loop replacement methodologies in a variety of scenarios\n", 34 | "\n" 35 | ] 36 | }, 37 | { 38 | "cell_type": "markdown", 39 | "metadata": { 40 | "tags": [] 41 | }, 42 | "source": [ 43 | "## To run the lab: These steps can be run on a laptop - NOT REQUIRED for DevCloud!\n", 44 | "Laptop Requirements:\n", 45 | " - conda config --add channels intel\n", 46 | " - conda install numpy\n", 47 | " - conda install scipy\n", 48 | " - conda install pandas\n", 49 | " \n", 50 | "#### Here is a list of topics we will explore in this module:\n", 51 | "- The \"WHY\": why use NumPy as a replacement for “for loops”? It's FAST!\n", 52 | "- PyTorch/NumPy Universal Functions or ufuncs\n", 53 | "- PyTorch/NumPy Broadcasting \n", 54 | "- PyTorch/NumPy Aggregations\n", 55 | "- PyTorch/NumPy Where\n", 56 | "- PyTorch/NumPy Select\n", 57 | "\n", 58 | "Code that is written inefficiently can:\n", 59 | "- Be less 
readable (less pythonic)\n", 60 | "- Be less maintainable (and therefore present a larger security threat cross section)\n", 61 | "- Consume more time\n", 62 | "- Waste energy \n", 63 | "- Waste purchased or leased resources\n", 64 | "\n", 65 | "\n", 66 | "This module will focus on trying to simultaneously make code more readable and more efficient, as measured by how well we accelerate code examples. While the code examples themselves are small, the techniques described are applicable in a wide variety of scenarios in AI.\n", 67 | "\n", 68 | "### Python loops are bad for performance\n", 69 | "**Python is great!** It's a great language for AI. There are many, many advantages in using Python, especially for data science.\n", 70 | "- Easy to program (don’t worry about data types and fussy syntax, at least relative to C/C++ and other languages)\n", 71 | "- FAST for developing code!\n", 72 | "- Leverages a huge array of libraries to conquer any domain\n", 73 | "- Lots of quick answers to common issues on Stack Exchange\n", 74 | "\n", 75 | "\n", 76 | "#### Python, however, is slow for massively repeated small tasks, such as those found in loops! 
**Python loops are SLOW**\n", 77 | "\n", 78 | "- Compared to C, C++, Fortran and other typed languages\n", 79 | "- Python is forced to look up every occurrence and type of variable in a loop to determine what operations it can perform on that data type\n", 80 | "- It cannot usually take advantage of advances in hardware in terms of vector width increases, multiple cores, new instructions from a new HW instruction set, new AI accelerators, effective cache memory layout, and more\n", 81 | "\n", 82 | "\n", 83 | "#### BUT: Python has library remedies to these ills!\n", 84 | "- Importing key libraries shifts the burden of computation to highly efficient code\n", 85 | "- NumPy, for example, through its focus on efficient elementwise operations, gives indirect access to the efficiencies afforded in \"C\" \n", 86 | "- Libraries included in oneAPI, as well as NumPy, SciPy, and Scikit-learn builds powered by oneAPI, give access to modern hardware-level advancements: better cache and memory usage, low-level vector instructions, and more.\n", 87 | "- By leveraging packages such as these powered by oneAPI AND keeping libraries up to date, more capability is added to your underlying frameworks so that moving code, especially in a cloud world, can give you ready access to hardware acceleration, in many cases without having to modify this vectorized code\n", 88 | "- Many routines are written in C (some through the Cython framework)\n", 89 | "- NumPy arrays are densely packed arrays of homogeneous type. Python lists, by contrast, are arrays of pointers to objects, even when all of them are of the same type. So, you get the benefits of not having to check data types, and you also get locality of reference. Also, many NumPy operations are implemented in C, avoiding the general cost of loops in Python, pointer indirection and per-element dynamic type checking. The speed boost depends on which operations you’re performing. 
\n", 90 | "\n", 91 | " \n", 92 | "**Goal of this module: Search and destroy (replace) loops**\n", 93 | "\n", 94 | "Avoid loops if you can - find an alternative if possible. Sometimes it cannot be done - true data dependencies may limit our options. But many, many times there are alternatives.\n", 95 | "\n", 96 | "\n", 97 | "**The problem** \n", 98 | "- Loops isolate your code from hardware and software advances that update frequently.\n", 99 | "- They prevent you from effectively using key underlying resources - it is a waste.\n", 100 | "- They consume your time!\n", 101 | "\n", 102 | "\n", 103 | "### Reference:\n", 104 | "\n", 105 | "- [Video: **Losing your Loops Fast Numerical Computing with NumPy** by Jake VanderPlas ](https://www.youtube.com/watch?v=EEUXKG97YRw). \n", 106 | "\n", 107 | "- [Book: **Python Data Science Handbook** by Jake VanderPlas](https://jakevdp.github.io/PythonDataScienceHandbook/). \n", 108 | "\n", 109 | "- [Book: **Elegant SciPy: The Art of Scientific Python** by Juan Nunez-Iglesias, Stéfan van der Walt, Harriet Dashnow](https://www.amazon.com/Elegant-SciPy-Art-Scientific-Python/dp/1491922877)\n", 110 | "\n", 111 | "- [Article: **The Ultimate NumPy Tutorial for Data Science Beginners**](https://www.analyticsvidhya.com/blog/2020/04/the-ultimate-numpy-tutorial-for-data-science-beginners/) : by Aniruddha April 28, 2020 at www.analyticsvidhya.com\n", 112 | "\n", 113 | "- [Academic Lecture pdf: **Vectorization** by Aaron Birkland Cornell CAC](http://www.cac.cornell.edu/education/training/StampedeJune2013/Vectorization-2013_06_18.pdf)\n", 114 | "\n", 115 | "## Prerequisites: (Already included on Intel DevCloud)\n", 116 | "\n", 117 | " - conda config --add channels intel\n", 118 | " - conda install numpy\n", 119 | " - conda install scipy\n", 120 | " - conda install pandas\n", 121 | "\n", 122 | "\n", 123 | "# Comparison tables of NumPy and PyTorch functionality\n", 124 | "\n", 125 | "### Simple stuff - reference\n", 126 | "\n", 127 | 
"Below is a table comparing ndarry and torch tensor approaches to similar tensor manipulation capabilities bewteen NumPy and PyTorch:\n", 128 | "\n", 129 | " | Function | Numpy | PyTorch | \n", 130 | " | --- | --- | --- | \n", 131 | " | Create array from list | numpy.array() | torch.tensor() | \n", 132 | " | Create array of zeros | numpy.zeros() | torch.zeros() | \n", 133 | " | Create array of ones | numpy.ones() | torch.ones() | \n", 134 | " | Create identity matrix | numpy.identity() | torch.eye() | \n", 135 | " | Create diagonal matrix | numpy.diag() | torch.diag() | \n", 136 | " | Create range of values | numpy.arange() | torch.arange() | \n", 137 | " | Create evenly spaced values | numpy.linspace() | torch.linspace() | \n", 138 | " | Create random values | numpy.random.rand() | torch.rand() | \n", 139 | " | Cast to a different data type | numpy.astype() | torch.to() | \n", 140 | " \n", 141 | "### Aggregations\n", 142 | "\n", 143 | "Where data in a given dimensions is reduced or aggregated down to a smaller dimension or even to a scalar value\n", 144 | "\n", 145 | "| PyTorch Aggregations | NumPy Aggregations |\n", 146 | "| --- | --- |\n", 147 | "|torch.sum(x)|np.sum(x)|\n", 148 | "|torch.mean(x)|np.mean(x)|\n", 149 | "|torch.median(x)|np.median(x)|\n", 150 | "|torch.max(x)|np.max(x)|\n", 151 | "|torch.min(x)|np.min(x)|\n", 152 | "|torch.prod(x)|np.prod(x)|\n", 153 | "|torch.std(x)|np.std(x)|\n", 154 | "|torch.var(x)|np.var(x)|\n", 155 | "|torch.any(x)|np.any(x)|\n", 156 | "|torch.all(x)|np.all(x)|\n", 157 | "|torch.unique(x)|np.unique(x)|\n", 158 | "|\n", 159 | "\n", 160 | "### UFUNCS\n", 161 | "\n", 162 | "NumPy name for Universal Functions, Aggregations can also be considered UFUNCS in NumPy. 
\n", 163 | "\n", 164 | "PyTorch doesn't really use the terminology UFuncs\n", 165 | "\n", 166 | "| PyTorch UFUNCs | NumPy UFUNCs | Description |\n", 167 | "| --- | --- | --- |\n", 168 | "|torch.abs(x)|np.abs(x)| Absolute value |\n", 169 | "|torch.acos(x)|np.arccos(x)| Arc cosine |\n", 170 | "|torch.asin(x)|np.arcsin(x)| Arc sine |\n", 171 | "|torch.atan(x)|np.arctan(x)| Arc tangent |\n", 172 | "|torch.ceil(x)|np.ceil(x)| Ceiling function |\n", 173 | "|torch.cos(x)|np.cos(x)| Cosine function |\n", 174 | "|torch.exp(x)|np.exp(x)| Exponentiation |\n", 175 | "|torch.floor(x)|np.floor(x)| Floor function |\n", 176 | "|torch.log(x)|np.log(x)| Log function |\n", 177 | "|torch.neg(x)|np.negative(x)| Negation |\n", 178 | "|torch.reciprocal(x)|np.reciprocal(x)| Reciprocation |\n", 179 | "|torch.round(x)|np.round(x)| Rounding precision |\n", 180 | "|torch.rsqrt(x)|np.reciprocal(np.sqrt(x))| Reciprocal square root |\n", 181 | "|torch.sin(x)|np.sin(x)| Sine function |\n", 182 | "|torch.sqrt(x)|np.sqrt(x)| Square root function |\n", 183 | "|torch.square(x)|np.square(x)| Square function |\n", 184 | "|torch.tan(x)| np.tan(x)| Tangent function |\n", 185 | "\n", 186 | "\n", 187 | "## SPECIAL FOCUS \n", 188 | "\n", 189 | "Here are four functions we will devote the last module to:\n", 190 | "\n", 191 | "| PyTorch UFUNCs | NumPy UFUNCs | Description |\n", 192 | "| --- | --- | --- |\n", 193 | "|torch.sigmoid(x)|1 / (1 + np.exp(-x))| **Sigmoid Function** |\n", 194 | "| torch.nn.Softmax() | np.exp(x)/np.exp(x).sum()| **Softmax function** |\n", 195 | "|torch.nn.CrossEntropyLoss() | -np.sum(p * np.log(q))| **Cross Entropy Loss** |\n", 196 | "| torch.nn.KLDivLoss() | np.sum(p * np.log(p/q)) | **Kullback-Leibler Divergence (KL)** |\n", 197 | "\n", 198 | "\n", 199 | "\n", 200 | "## Caveat: Regarding mixing NumPy and PyTorch Random numbers\n", 201 | "#### https://tanelp.github.io/posts/a-bug-that-plagues-thousands-of-open-source-ml-projects/\n", 202 | "\n", 203 | "# Exercises:\n", 204 | 
205 | "Do a page search for each **Exercise** in this notebook. Complete all exercises. Code in cells above each exercise may give insight into a solid approach" 206 | ] 207 | }, 208 | { 209 | "cell_type": "code", 210 | "execution_count": null, 211 | "metadata": {}, 212 | "outputs": [], 213 | "source": [ 214 | "import torch\n", 215 | "torch.__version__" 216 | ] 217 | }, 218 | { 219 | "cell_type": "markdown", 220 | "metadata": {}, 221 | "source": [ 222 | "## Why use PyTorch or NumPy as a replacement for loops?\n", 223 | "\n", 224 | "## They are FAST!\n", 225 | "\n", 226 | "In this section we will explore a smattering of different PyTorch/NumPy approaches that lead to accelerations over naive loops\n", 227 | "\n", 228 | "The bigger the loop (more iterations), the better PyTorch/NumPy gets, and the bigger the data (more dimensions), the better PyTorch/NumPy generally gets.\n", 229 | "\n", 230 | "Ultimately, we are hunting for \"BIG LOOPS\". What is a BIG LOOP? One that consumes a lot of time! Sometimes, even a loop with fewer iterations can be time consuming because each iteration takes a long time by itself. We'll call these BIG LOOPS too.\n", 231 | "\n", 232 | "\n", 233 | "#### Compare different ways of computing Log10 of a larger vector\n", 234 | "\n", 235 | "In this next section, we will create a list of 100 million random floating-point numbers. Then we will use a for loop to iterate over its elements, take Log10 and store the value in another list. 
We'll compare the execution speed with that of a direct NumPy Log10 operation.\n", 236 | "\n", 237 | "For this log10 problem, we will compare:\n", 238 | "\n", 239 | "- Naive loop\n", 240 | "- Map function\n", 241 | "- List Comprehension\n", 242 | "- NumPy\n", 243 | "\n", 244 | "\n", 245 | " " 246 | ] 247 | }, 248 | { 249 | "cell_type": "code", 250 | "execution_count": null, 251 | "metadata": { 252 | "ExecuteTime": { 253 | "end_time": "2022-08-08T21:06:57.173636Z", 254 | "start_time": "2022-08-08T21:06:57.148702Z" 255 | } 256 | }, 257 | "outputs": [], 258 | "source": [ 259 | "\n", 260 | "import numpy as np\n", 261 | "import torch\n", 262 | "from math import log10 as lg10\n", 263 | "import time\n", 264 | "import matplotlib.pyplot as plt\n", 265 | "import random\n", 266 | "%matplotlib inline\n", 267 | "\n" 268 | ] 269 | }, 270 | { 271 | "cell_type": "code", 272 | "execution_count": null, 273 | "metadata": {}, 274 | "outputs": [], 275 | "source": [ 276 | "A = np.array((1, 2, 3)) # from tuple\n", 277 | "B = np.array([1, 2, 3]) # from list\n", 278 | "C = np.empty(([2, 2]), dtype=int) # empty\n", 279 | "D = np.zeros((2, 3)) # zero filled\n", 280 | "E = np.ones((2, 3)) # one filled\n", 281 | "F = np.eye(3) # identity matrix\n", 282 | "A,B,C,D,E,F" 283 | ] 284 | }, 285 | { 286 | "cell_type": "code", 287 | "execution_count": null, 288 | "metadata": {}, 289 | "outputs": [], 290 | "source": [ 291 | "A = torch.tensor((1, 2, 3)) # from tuple\n", 292 | "B = torch.tensor([1, 2, 3]) # from list\n", 293 | "C = torch.empty((2,3), dtype=torch.int64) # empty\n", 294 | "D = torch.zeros(2, 3) # zero filled\n", 295 | "E = torch.ones(2, 3) # one filled \n", 296 | "F = torch.eye(3) # identity matrix\n", 297 | "A,B,C,D,E,F" 298 | ] 299 | }, 300 | { 301 | "cell_type": "code", 302 | "execution_count": null, 303 | "metadata": {}, 304 | "outputs": [], 305 | "source": [ 306 | "t = torch.tensor((1, 2, 3))\n", 307 | "torch.cos(t)" 308 | ] 309 | }, 310 | { 311 | "cell_type": "markdown", 312 | 
"metadata": {}, 313 | "source": [ 314 | "## Show PyTorch/Numpy configs\n", 315 | "\n", 316 | "Look to ensure Intel MKL or oneAPI is part of your configuration using show_config" 317 | ] 318 | }, 319 | { 320 | "cell_type": "code", 321 | "execution_count": null, 322 | "metadata": { 323 | "ExecuteTime": { 324 | "end_time": "2022-08-08T23:22:53.508125Z", 325 | "start_time": "2022-08-08T23:22:53.495165Z" 326 | } 327 | }, 328 | "outputs": [], 329 | "source": [ 330 | "torch.__config__\n", 331 | "#!cat /opt/intel/inteloneapi/pytorch/latest/lib/python3.9/site-packages/torch/__config__.py" 332 | ] 333 | }, 334 | { 335 | "cell_type": "markdown", 336 | "metadata": {}, 337 | "source": [ 338 | "torch.__config__.parallel_info()" 339 | ] 340 | }, 341 | { 342 | "cell_type": "code", 343 | "execution_count": null, 344 | "metadata": {}, 345 | "outputs": [], 346 | "source": [ 347 | "print(torch.__config__.show())" 348 | ] 349 | }, 350 | { 351 | "cell_type": "markdown", 352 | "metadata": {}, 353 | "source": [ 354 | "## Now NumPy Configs" 355 | ] 356 | }, 357 | { 358 | "cell_type": "code", 359 | "execution_count": null, 360 | "metadata": {}, 361 | "outputs": [], 362 | "source": [ 363 | "np.show_config()\n" 364 | ] 365 | }, 366 | { 367 | "cell_type": "markdown", 368 | "metadata": {}, 369 | "source": [ 370 | "#### Create a list of 100 million floats whose values lie between 0 and 100\n", 371 | "\n", 372 | "This is the fastest way to get data into tensors!\n", 373 | "\n", 374 | "Create them that way from the start!" 
375 | ] 376 | }, 377 | { 378 | "cell_type": "code", 379 | "execution_count": null, 380 | "metadata": {}, 381 | "outputs": [], 382 | "source": [ 383 | "%%time\n", 384 | "N = 100_000_000 # Number of records to process\n", 385 | "import random\n", 386 | "randomlist = []\n", 387 | "for i in range(0,N):\n", 388 | " n = random.uniform(0.0, 100.0)\n", 389 | " randomlist.append(n)\n", 390 | "print(randomlist[:5])" 391 | ] 392 | }, 393 | { 394 | "cell_type": "code", 395 | "execution_count": null, 396 | "metadata": { 397 | "ExecuteTime": { 398 | "end_time": "2022-08-08T21:07:21.150865Z", 399 | "start_time": "2022-08-08T21:06:57.183651Z" 400 | } 401 | }, 402 | "outputs": [], 403 | "source": [ 404 | "%%time\n", 405 | "NP = 100*(np.random.random(N))\n", 406 | "NP[:5]" 407 | ] 408 | }, 409 | { 410 | "cell_type": "code", 411 | "execution_count": null, 412 | "metadata": {}, 413 | "outputs": [], 414 | "source": [ 415 | "%%time \n", 416 | "PT = 100*torch.rand(N)\n", 417 | "PT[:5]" 418 | ] 419 | }, 420 | { 421 | "cell_type": "markdown", 422 | "metadata": {}, 423 | "source": [ 424 | "#### Create PyTorch/NumPy from Python list\n", 425 | "\n", 426 | "If you MUST convert from a list - here's how\n", 427 | "\n", 428 | "This is the first step towards vectorization" 429 | ] 430 | }, 431 | { 432 | "cell_type": "code", 433 | "execution_count": null, 434 | "metadata": {}, 435 | "outputs": [], 436 | "source": [ 437 | "%%time\n", 438 | "N = 100_000_000 # Number of records to process\n", 439 | "import random\n", 440 | "PyList = []\n", 441 | "for i in range(0,N):\n", 442 | " n = random.uniform(0.0, 100.0)\n", 443 | " PyList.append(n)\n", 444 | "print(PyList[:5])" 445 | ] 446 | }, 447 | { 448 | "cell_type": "markdown", 449 | "metadata": {}, 450 | "source": [ 451 | "#### Convert List to NumPy ndarray" 452 | ] 453 | }, 454 | { 455 | "cell_type": "code", 456 | "execution_count": null, 457 | "metadata": {}, 458 | "outputs": [], 459 | "source": [ 460 | "%%time\n", 461 | "NP = np.array(PyList)\n", 462 | 
"NP[:5]" 463 | ] 464 | }, 465 | { 466 | "cell_type": "markdown", 467 | "metadata": {}, 468 | "source": [ 469 | "#### Convert List to PyTorch Tensor" 470 | ] 471 | }, 472 | { 473 | "cell_type": "code", 474 | "execution_count": null, 475 | "metadata": {}, 476 | "outputs": [], 477 | "source": [ 478 | "%%time\n", 479 | "PT = torch.tensor(PyList)\n", 480 | "PT[:5]" 481 | ] 482 | }, 483 | { 484 | "cell_type": "markdown", 485 | "metadata": {}, 486 | "source": [ 487 | "#### Create PyTorch/NumPy random ints\n", 488 | "\n", 489 | "This is the first step towards vectorization" 490 | ] 491 | }, 492 | { 493 | "cell_type": "code", 494 | "execution_count": null, 495 | "metadata": {}, 496 | "outputs": [], 497 | "source": [ 498 | "%%time\n", 499 | "N = 100_000_000 # Number of records to process\n", 500 | "import random\n", 501 | "PyList = []\n", 502 | "for i in range(0,N):\n", 503 | " n = random.uniform(0.0, 100.0)\n", 504 | " PyList.append(n)\n", 505 | "print(PyList[:5])" 506 | ] 507 | }, 508 | { 509 | "cell_type": "code", 510 | "execution_count": null, 511 | "metadata": {}, 512 | "outputs": [], 513 | "source": [ 514 | "%%time\n", 515 | "#a1 = np.array(L)\n", 516 | "NP = np.random.randint(1, 101, (N+1,))\n", 517 | "print(f\"{NP[:5]}\\nelements: {len(NP)}\")" 518 | ] 519 | }, 520 | { 521 | "cell_type": "code", 522 | "execution_count": null, 523 | "metadata": { 524 | "ExecuteTime": { 525 | "end_time": "2022-08-08T21:07:46.288651Z", 526 | "start_time": "2022-08-08T21:07:21.159841Z" 527 | } 528 | }, 529 | "outputs": [], 530 | "source": [ 531 | "%%time\n", 532 | "#a1 = np.array(L)\n", 533 | "PT = torch.randint(1, 101, (N+1,))\n", 534 | "print(f\"{PT[:5]}\\nelements: {len(PT)}\")" 535 | ] 536 | }, 537 | { 538 | "cell_type": "markdown", 539 | "metadata": {}, 540 | "source": [ 541 | "#### Create an empty PyTorch/NumPy ndarray object " 542 | ] 543 | }, 544 | { 545 | "cell_type": "code", 546 | "execution_count": null, 547 | "metadata": {}, 548 | "outputs": [], 549 | "source": [ 550 | 
"np.empty([N, 2])\n", 551 | "\n", 552 | "# or\n", 553 | "\n", 554 | "np.ndarray(shape=(N,2), dtype=float)" 555 | ] 556 | }, 557 | { 558 | "cell_type": "code", 559 | "execution_count": null, 560 | "metadata": {}, 561 | "outputs": [], 562 | "source": [ 563 | "torch.tensor(np.empty([N, 2]))" 564 | ] 565 | }, 566 | { 567 | "cell_type": "markdown", 568 | "metadata": {}, 569 | "source": [ 570 | "#### Create zero filled NumPy ndarray object " 571 | ] 572 | }, 573 | { 574 | "cell_type": "code", 575 | "execution_count": null, 576 | "metadata": {}, 577 | "outputs": [], 578 | "source": [ 579 | "np.zeros([N, 2])" 580 | ] 581 | }, 582 | { 583 | "cell_type": "markdown", 584 | "metadata": {}, 585 | "source": [ 586 | "#### Create PyTorch ndarray object filled with ones" 587 | ] 588 | }, 589 | { 590 | "cell_type": "code", 591 | "execution_count": null, 592 | "metadata": {}, 593 | "outputs": [], 594 | "source": [ 595 | "#torch.tensor(np.ones([N, 2]))\n", 596 | "\n", 597 | "torch.ones(N, 2)" 598 | ] 599 | }, 600 | { 601 | "cell_type": "markdown", 602 | "metadata": {}, 603 | "source": [ 604 | "#### Create Identity Matrix of given size" 605 | ] 606 | }, 607 | { 608 | "cell_type": "code", 609 | "execution_count": null, 610 | "metadata": {}, 611 | "outputs": [], 612 | "source": [ 613 | "#np.eye(100)\n", 614 | "\n", 615 | "torch.eye(100)" 616 | ] 617 | }, 618 | { 619 | "cell_type": "markdown", 620 | "metadata": {}, 621 | "source": [ 622 | "# Appending using NumPy\n", 623 | "\n", 624 | "Appending is easy in PyTorch/NumPy\n", 625 | "\n", 626 | "Simply use np.append() as in the below example:\n", 627 | "\n", 628 | "```python\n", 629 | "# NumPy\n", 630 | "np.append(a, [7,8,9]) \n", 631 | "\n", 632 | "# PyTorch\n", 633 | "a = torch.cat((a, new_a), dim=1)\n", 634 | "```\n", 635 | "\n", 636 | "### Caveat: Append With NumPy.\n", 637 | "\n", 638 | "Appending process does not occur in the same array. 
Rather a new array is created and filled.\n", 639 | "\n", 640 | "### With Python.\n", 641 | "\n", 642 | "Things are very different. The list filling process stays within the list itself, and no new lists are generated.\n", 643 | "\n", 644 | "# One Solution: Change your mindset!\n", 645 | "\n", 646 | "The goal is not to find a replacement for **append()**!\n", 647 | "\n", 648 | "The goal is to find a replacement for the **loop**\n", 649 | "\n", 650 | "Replace loop with linspace and ufuncs...??? It will depend on your specific loop\n", 651 | "\n", 652 | "##### Get rid of loop at all costs" 653 | ] 654 | }, 655 | { 656 | "cell_type": "code", 657 | "execution_count": null, 658 | "metadata": {}, 659 | "outputs": [], 660 | "source": [] 661 | }, 662 | { 663 | "cell_type": "code", 664 | "execution_count": null, 665 | "metadata": { 666 | "ExecuteTime": { 667 | "end_time": "2022-08-08T21:59:43.995711Z", 668 | "start_time": "2022-08-08T21:59:43.923897Z" 669 | } 670 | }, 671 | "outputs": [], 672 | "source": [ 673 | "a = []\n", 674 | "t1=time.time()\n", 675 | "timing = {}\n", 676 | "N = 10_000_000\n", 677 | "for i in range(N):\n", 678 | " a.append(np.sin(i))\n", 679 | "t2 = time.time()\n", 680 | "print(\"With for loop and appending it took {} seconds\".format(t2-t1))\n", 681 | "timing['loop'] = (t2-t1)\n", 682 | "a[:5]" 683 | ] 684 | }, 685 | { 686 | "cell_type": "code", 687 | "execution_count": null, 688 | "metadata": { 689 | "ExecuteTime": { 690 | "end_time": "2022-08-08T21:59:53.969720Z", 691 | "start_time": "2022-08-08T21:59:53.500964Z" 692 | } 693 | }, 694 | "outputs": [], 695 | "source": [ 696 | "a = np.array([])\n", 697 | "t1=time.time()\n", 698 | "for i in range(N//1000): #otherwise takes TOO long\n", 699 | " a = np.append(a, np.sin(i))\n", 700 | "t2 = time.time()\n", 701 | "print(\"With for loop and appending it took {} seconds\".format(t2-t1))\n", 702 | "timing['numpySilly'] = (t2-t1)\n", 703 | "a[:5]" 704 | ] 705 | }, 706 | { 707 | "cell_type": "code", 708 | 
"execution_count": null, 709 | "metadata": {}, 710 | "outputs": [], 711 | "source": [ 712 | "StopAndReadTheAbove()" 713 | ] 714 | }, 715 | { 716 | "cell_type": "markdown", 717 | "metadata": {}, 718 | "source": [ 719 | "# See Valuable lesson above!\n", 720 | "\n", 721 | "Blindly replacing function calls with NumPy or PyTorch replacements is NOT the goal!\n", 722 | "\n", 723 | "Getting rid of the loop is the goal!\n", 724 | "\n", 725 | "# Below we replaced the entire loop from above\n", 726 | "\n", 727 | "We avoided simple replacement of per row np.append()" 728 | ] 729 | }, 730 | { 731 | "cell_type": "code", 732 | "execution_count": null, 733 | "metadata": { 734 | "ExecuteTime": { 735 | "end_time": "2022-08-08T22:02:24.538584Z", 736 | "start_time": "2022-08-08T22:02:23.882337Z" 737 | } 738 | }, 739 | "outputs": [], 740 | "source": [ 741 | "a = np.linspace(0, N, num=N + 1)\n", 742 | "t1=time.time()\n", 743 | "a = np.sin(a)\n", 744 | "t2 = time.time()\n", 745 | "print(\"With linspace and ufunc it took {} seconds for WAY MORE values!\".format(t2-t1))\n", 746 | "timing['numpy'] = (t2-t1)\n", 747 | "a[:5]" 748 | ] 749 | }, 750 | { 751 | "cell_type": "code", 752 | "execution_count": null, 753 | "metadata": {}, 754 | "outputs": [], 755 | "source": [ 756 | "a = torch.randint(0, N, (N + 1,))\n", 757 | "t1=time.time()\n", 758 | "a = torch.sin(a)\n", 759 | "t2 = time.time()\n", 760 | "print(\"With linspace and ufunc it took {} seconds for WAY MORE values!\".format(t2-t1))\n", 761 | "timing['torch'] = (t2-t1)\n", 762 | "a[:5]" 763 | ] 764 | }, 765 | { 766 | "cell_type": "markdown", 767 | "metadata": {}, 768 | "source": [ 769 | "### Plot the time taken by each operation" 770 | ] 771 | }, 772 | { 773 | "cell_type": "code", 774 | "execution_count": null, 775 | "metadata": {}, 776 | "outputs": [], 777 | "source": [ 778 | "plt.figure(figsize=(10,6))\n", 779 | "plt.title(\"Time taken to compue sin on {:,} records in seconds\".format(N),fontsize=12)\n", 780 | "plt.ylabel(\"Time in 
seconds\",fontsize=12)\n", 781 | "plt.yscale('log')\n", 782 | "plt.xlabel(\"Various types of operations\",fontsize=14)\n", 783 | "plt.grid(True)\n", 784 | "plt.xticks(rotation=-60)\n", 785 | "plt.bar(x = list(timing.keys()), height= list(timing.values()), align='center',tick_label=list(timing.keys()))\n", 786 | "print('Acceleration : {:5.0f} X'.format(timing['loop']/(timing['numpy'])))" 787 | ] 788 | }, 789 | { 790 | "cell_type": "markdown", 791 | "metadata": {}, 792 | "source": [ 793 | "# Time a *Naive* For loop computing Log\n", 794 | "\n", 795 | "This one has a somewhat expensive **log10 function** that is being called on each element\n", 796 | "\n", 797 | "We append the results to the list" 798 | ] 799 | }, 800 | { 801 | "cell_type": "code", 802 | "execution_count": null, 803 | "metadata": { 804 | "ExecuteTime": { 805 | "end_time": "2022-08-08T21:09:19.773733Z", 806 | "start_time": "2022-08-08T21:07:46.297634Z" 807 | } 808 | }, 809 | "outputs": [], 810 | "source": [ 811 | "# Create a blank list for appending elements\n", 812 | "\n", 813 | "timing = {} # Just a blank dictionary to append to\n", 814 | "N = 100_000_000 # Number of records to process\n", 815 | "L = list(100*(np.random.random(N))+1)\n", 816 | "#L = torch.randint(1, 101, (N+1,))\n", 817 | "t1=time.time()\n", 818 | "l2 = []\n", 819 | "#for item in L:\n", 820 | "for item in L:\n", 821 | " l2.append(lg10(item))\n", 822 | "t2 = time.time()\n", 823 | "print(\"With for loop and appending it took {} seconds\".format(t2-t1))\n", 824 | "timing['loop'] = (t2-t1)\n", 825 | "print(\"First few elements of the resulting array:\", l2[:4])" 826 | ] 827 | }, 828 | { 829 | "cell_type": "markdown", 830 | "metadata": {}, 831 | "source": [ 832 | "# Time the *Map* function\n", 833 | "\n", 834 | "One Python alternative to looping is the **map function** that applies a function to each element in a list" 835 | ] 836 | }, 837 | { 838 | "cell_type": "code", 839 | "execution_count": null, 840 | "metadata": { 841 | 
"ExecuteTime": { 842 | "end_time": "2022-08-08T21:10:32.118342Z", 843 | "start_time": "2022-08-08T21:09:19.781708Z" 844 | } 845 | }, 846 | "outputs": [], 847 | "source": [ 848 | "def op1(x):\n", 849 | " return (lg10(x))\n", 850 | "\n", 851 | "t1=time.time()\n", 852 | "\n", 853 | "l2=list(map(op1,L))\n", 854 | "\n", 855 | "t2 = time.time()\n", 856 | "print(\"With list(map) functional method it took {} seconds\".format(t2-t1))\n", 857 | "timing['map'] = (t2-t1)\n", 858 | "print(\"First few elements of the resulting array:\", l2[:4])" 859 | ] 860 | }, 861 | { 862 | "cell_type": "markdown", 863 | "metadata": {}, 864 | "source": [ 865 | "# Time a *List comprehension*\n", 866 | "\n", 867 | "One very popular Python alternative to use a list comprehension instead of the loop" 868 | ] 869 | }, 870 | { 871 | "cell_type": "code", 872 | "execution_count": null, 873 | "metadata": { 874 | "ExecuteTime": { 875 | "end_time": "2022-08-08T21:11:30.751521Z", 876 | "start_time": "2022-08-08T21:10:32.122284Z" 877 | } 878 | }, 879 | "outputs": [], 880 | "source": [ 881 | "t1=time.time()\n", 882 | "l2 = [lg10(i+1) for i in range(len(L))]\n", 883 | "t2 = time.time()\n", 884 | "print(\"With list comprehension, it took {} seconds\".format(t2-t1))\n", 885 | "timing['list comprehension'] = (t2-t1)\n", 886 | "print(\"First few elements of the resulting array:\", l2[:4])" 887 | ] 888 | }, 889 | { 890 | "cell_type": "code", 891 | "execution_count": null, 892 | "metadata": {}, 893 | "outputs": [], 894 | "source": [ 895 | "t1=time.time()\n", 896 | "# Notice that we assume we already converted the list to numpy \n", 897 | "# otherwise we should include that time as well\n", 898 | "####### Insert corrected code below\n", 899 | "\n", 900 | "L = np.linspace(1, 101, N+1)\n", 901 | "a2= np.log10(L)\n", 902 | "\n", 903 | "##################################\n", 904 | "\n", 905 | "t2 = time.time()\n", 906 | "print(\"With direct NumPy log10 method it took {} seconds\".format(t2-t1))\n", 907 | 
"timing['numpy'] = (t2-t1)\n", 908 | "\n", 909 | "print(\"First few elements of the resulting array:\", a2[:4])" 910 | ] 911 | }, 912 | { 913 | "cell_type": "markdown", 914 | "metadata": {}, 915 | "source": [ 916 | "# Time *Torch operation* (vectorized array)\n", 917 | "\n", 918 | "- Exercise: replace ReplaceThisBrokenCode with:\n", 919 | "```python\n", 920 | "a2=np.log10(a1)\n", 921 | "```" 922 | ] 923 | }, 924 | { 925 | "cell_type": "code", 926 | "execution_count": null, 927 | "metadata": { 928 | "ExecuteTime": { 929 | "end_time": "2022-08-08T21:11:51.031297Z", 930 | "start_time": "2022-08-08T21:11:33.891130Z" 931 | } 932 | }, 933 | "outputs": [], 934 | "source": [ 935 | "t1=time.time()\n", 936 | "# Notice that we assume we already converted the list to numpy \n", 937 | "# otherwise we should include that time as well\n", 938 | "\n", 939 | "####### Insert corrected code below\n", 940 | "L = torch.linspace(1, 101, N+1)\n", 941 | "a2=torch.log10(L)\n", 942 | "##################################\n", 943 | "\n", 944 | "t2 = time.time()\n", 945 | "print(\"With direct NumPy log10 method it took {} seconds\".format(t2-t1))\n", 946 | "timing['torch'] = (t2-t1)\n", 947 | "\n", 948 | "print(\"First few elements of the resulting array:\", a2[:4])" 949 | ] 950 | }, 951 | { 952 | "cell_type": "markdown", 953 | "metadata": {}, 954 | "source": [ 955 | "### Plot the time taken by each operation" 956 | ] 957 | }, 958 | { 959 | "cell_type": "code", 960 | "execution_count": null, 961 | "metadata": { 962 | "ExecuteTime": { 963 | "end_time": "2022-08-08T21:11:54.847104Z", 964 | "start_time": "2022-08-08T21:11:51.036287Z" 965 | } 966 | }, 967 | "outputs": [], 968 | "source": [ 969 | "%matplotlib inline\n", 970 | "import matplotlib.pyplot as plt\n", 971 | "plt.figure(figsize=(10,6))\n", 972 | "plt.title(\"Plot of various method of computing Log10, for {:,} elements\".format(L.shape[0]))\n", 973 | "plt.ylabel(\"Time in seconds\",fontsize=12)\n", 974 | "plt.xlabel(\"Various types of 
operations\",fontsize=14)\n", 975 | "plt.xticks(rotation=-60)\n", 976 | "plt.grid(True)\n", 977 | "plt.yscale('log')\n", 978 | "plt.bar(x = range(len(timing)), height=list(timing.values()), align='center', tick_label=list(timing.keys()))\n", 979 | "print('Acceleration : {:4.0f} X'.format(timing['loop']/timing['torch']))" 980 | ] 981 | }, 982 | { 983 | "cell_type": "markdown", 984 | "metadata": {}, 985 | "source": [ 986 | "We see the evidence that NumPy operations over ndarray objects are much faster than regular Python math operations. The exact speed of regular Python operations vary a little but they are almost always much slower compared to the vectorized NumPy operation. This is primarily due to memory layout and having to interpret datatypes in every iteration" 987 | ] 988 | }, 989 | { 990 | "cell_type": "markdown", 991 | "metadata": {}, 992 | "source": [ 993 | "# Shift a vector - Time Series Prediction Task\n", 994 | "\n", 995 | "In AI, we often need to shift data. For example, in time series prediction we typically want to predict a later value in a time sequence, given earlier available feature data. We often select a target variable that is time shifted to predict later labels in time.\n", 996 | "\n", 997 | "Suppose we want to take a vector, a column of numbers, and shift them by a constant. 
Pad the end of the shifted vector with a zero (for simplicity, the timing cells below skip the padding, so the result is one element shorter).\n", 998 | "\n", 999 | "Naive approach with for loop **b[i] = a[i+1]**" 1000 | ] 1001 | }, 1002 | { 1003 | "cell_type": "code", 1004 | "execution_count": null, 1005 | "metadata": { 1006 | "ExecuteTime": { 1007 | "end_time": "2022-08-08T22:14:07.322024Z", 1008 | "start_time": "2022-08-08T22:11:59.675173Z" 1009 | } 1010 | }, 1011 | "outputs": [], 1012 | "source": [ 1013 | "# try naive loop\n", 1014 | "num = 100_000_001\n", 1015 | "a = np.linspace(0, num - 1, num=num)\n", 1016 | "b = np.empty(num-1) # uninitialized output buffer\n", 1017 | "\n", 1018 | "timing = {}\n", 1019 | "\n", 1020 | "t1=time.time()\n", 1021 | "\n", 1022 | "for i in range(len(a)-1):\n", 1023 | " b[i] = a[i+1]\n", 1024 | "\n", 1025 | "t2=time.time()\n", 1026 | "print(\"shift b w loop {} secs\".format(t2-t1))\n", 1027 | "timing['loop'] = (t2-t1)\n", 1028 | "b[:5], b[-5:]" 1029 | ] 1030 | }, 1031 | { 1032 | "cell_type": "markdown", 1033 | "metadata": {}, 1034 | "source": [ 1035 | "List comprehensions are generally faster than explicit \"for loops\"" 1036 | ] 1037 | }, 1038 | { 1039 | "cell_type": "code", 1040 | "execution_count": null, 1041 | "metadata": { 1042 | "ExecuteTime": { 1043 | "end_time": "2022-08-08T22:15:18.319166Z", 1044 | "start_time": "2022-08-08T22:14:07.327976Z" 1045 | } 1046 | }, 1047 | "outputs": [], 1048 | "source": [ 1049 | "# try list comprehension \n", 1050 | "t1=time.time()\n", 1051 | "\n", 1052 | "b = [a[i+1] for i in range(len(a)-1)] # shift b by 1\n", 1053 | "\n", 1054 | "t2=time.time()\n", 1055 | "print(\"shift b w list comprehension {} secs\".format(t2-t1))\n", 1056 | "timing['list comprehension'] = (t2-t1)\n", 1057 | "b[:5], b[-5:]" 1058 | ] 1059 | }, 1060 | { 1061 | "cell_type": "markdown", 1062 | "metadata": {}, 1063 | "source": [ 1064 | "Use NumPy slicing\n", 1065 | "\n", 1066 | "- Exercise: replace ReplaceThisBrokenCode with:\n", 1067 | "```python\n", 1068 | "b = a[1:]\n", 1069 | "```" 1070 | ] 1071 | }, 1072 | { 1073 | "cell_type": "code", 1074 | "execution_count": 
null, 1075 | "metadata": { 1076 | "ExecuteTime": { 1077 | "end_time": "2022-08-08T22:15:39.393844Z", 1078 | "start_time": "2022-08-08T22:15:39.373865Z" 1079 | } 1080 | }, 1081 | "outputs": [], 1082 | "source": [ 1083 | "# try NumPy slicing\n", 1084 | "t1=time.time()\n", 1085 | "####### Insert corrected code below\n", 1086 | "\n", 1087 | "b = a[1:]\n", 1088 | "\n", 1089 | "##################################\n", 1090 | "\n", 1091 | "t2=time.time()\n", 1092 | "print(\"shift b w NumPy slicing {} secs\".format(t2-t1))\n", 1093 | "timing['numpy_slicing'] = (t2-t1)\n" 1094 | ] 1095 | }, 1096 | { 1097 | "cell_type": "code", 1098 | "execution_count": null, 1099 | "metadata": {}, 1100 | "outputs": [], 1101 | "source": [ 1102 | "# try torch.tensor\n", 1103 | "\n", 1104 | "a = torch.linspace(0, num - 1, steps=num)\n", 1105 | "# no preallocation of b is needed: slicing returns a view of a\n", 1106 | "\n", 1107 | "\n", 1108 | "t1=time.time()\n", 1109 | "b = a[1:]\n", 1110 | "t2=time.time()\n", 1111 | "print(\"shift b w PyTorch slicing {} secs\".format(t2-t1))\n", 1112 | "timing['pytorch_slicing'] = (t2-t1)\n", 1113 | "b[:5], b[-5:]" 1114 | ] 1115 | }, 1116 | { 1117 | "cell_type": "markdown", 1118 | "metadata": {}, 1119 | "source": [ 1120 | "Plot the results" 1121 | ] 1122 | }, 1123 | { 1124 | "cell_type": "code", 1125 | "execution_count": null, 1126 | "metadata": { 1127 | "ExecuteTime": { 1128 | "end_time": "2022-08-08T22:18:11.140757Z", 1129 | "start_time": "2022-08-08T22:18:10.622110Z" 1130 | } 1131 | }, 1132 | "outputs": [], 1133 | "source": [ 1134 | "plt.figure(figsize=(10,6))\n", 1135 | "plt.title(\"Time taken to process {:,} records in seconds\".format(num),fontsize=12)\n", 1136 | "plt.ylabel(\"Time in seconds\",fontsize=12)\n", 1137 | "plt.yscale('log')\n", 1138 | "plt.xlabel(\"Various types of operations\",fontsize=14)\n", 1139 | "plt.grid(True)\n", 1140 | "plt.xticks(rotation=-60)\n", 1141 | "plt.bar(x = list(timing.keys()), height= list(timing.values()), align='center',tick_label=list(timing.keys()))\n", 1142 | 
"print('Acceleration : {:5.0f} X'.format(timing['loop']/(timing['pytorch_slicing'])))" 1143 | ] 1144 | }, 1145 | { 1146 | "cell_type": "code", 1147 | "execution_count": null, 1148 | "metadata": {}, 1149 | "outputs": [], 1150 | "source": [ 1151 | "print(\"Done!\")" 1152 | ] 1153 | }, 1154 | { 1155 | "cell_type": "markdown", 1156 | "metadata": {}, 1157 | "source": [ 1158 | "# Notices and Disclaimers\n", 1159 | "\n", 1160 | "Intel technologies may require enabled hardware, software or service activation.\n", 1161 | "No product or component can be absolutely secure. \n", 1162 | "\n", 1163 | "Your costs and results may vary. \n", 1164 | "\n", 1165 | "© Intel Corporation. Intel, the Intel logo, and other Intel marks are trademarks of Intel Corporation or its subsidiaries. Other names and brands may be claimed as the property of others. " 1166 | ] 1167 | }, 1168 | { 1169 | "cell_type": "code", 1170 | "execution_count": null, 1171 | "metadata": {}, 1172 | "outputs": [], 1173 | "source": [] 1174 | } 1175 | ], 1176 | "metadata": { 1177 | "anaconda-cloud": {}, 1178 | "kernelspec": { 1179 | "display_name": "PyTorch GPU", 1180 | "language": "python", 1181 | "name": "pytorch-gpu" 1182 | }, 1183 | "language_info": { 1184 | "codemirror_mode": { 1185 | "name": "ipython", 1186 | "version": 3 1187 | }, 1188 | "file_extension": ".py", 1189 | "mimetype": "text/x-python", 1190 | "name": "python", 1191 | "nbconvert_exporter": "python", 1192 | "pygments_lexer": "ipython3", 1193 | "version": "3.9.19" 1194 | }, 1195 | "nbTranslate": { 1196 | "displayLangs": [ 1197 | "*" 1198 | ], 1199 | "hotkey": "alt-t", 1200 | "langInMainMenu": true, 1201 | "sourceLang": "en", 1202 | "targetLang": "fr", 1203 | "useGoogleTranslate": true 1204 | }, 1205 | "toc": { 1206 | "base_numbering": 1, 1207 | "nav_menu": {}, 1208 | "number_sections": true, 1209 | "sideBar": true, 1210 | "skip_h1_title": false, 1211 | "title_cell": "Table of Contents", 1212 | "title_sidebar": "Contents", 1213 | "toc_cell": false, 1214 
| "toc_position": {}, 1215 | "toc_section_display": true, 1216 | "toc_window_display": false 1217 | } 1218 | }, 1219 | "nbformat": 4, 1220 | "nbformat_minor": 4 1221 | } 1222 | -------------------------------------------------------------------------------- /03_Compare_Aggregations.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": { 6 | "tags": [] 7 | }, 8 | "source": [ 9 | "# Introduction to Pytorch Aggregation by way of NumPy\n", 10 | "\n", 11 | "![Assets/NumpyAxis0.PNG](Assets/NumpyAxis0.PNG)" 12 | ] 13 | }, 14 | { 15 | "cell_type": "markdown", 16 | "metadata": { 17 | "tags": [] 18 | }, 19 | "source": [ 20 | "\n", 21 | "# Exercises:\n", 22 | "\n", 23 | "Do a page search for each **Exercise** in this notebook. Complete all exercises. Code in cells above each exercise may give insight into a solid approach" 24 | ] 25 | }, 26 | { 27 | "cell_type": "code", 28 | "execution_count": null, 29 | "metadata": {}, 30 | "outputs": [], 31 | "source": [ 32 | "import torch\n", 33 | "import numpy as np\n", 34 | "from math import log10 as lg10\n", 35 | "import time\n", 36 | "import matplotlib.pyplot as plt\n", 37 | "import random\n", 38 | "import time\n", 39 | "%matplotlib inline" 40 | ] 41 | }, 42 | { 43 | "cell_type": "markdown", 44 | "metadata": {}, 45 | "source": [ 46 | "Whatever loopy code you have - spend time looking for alternatives such as this. 
The acceleration can be extraordinary" 47 | ] 48 | }, 49 | { 50 | "cell_type": "markdown", 51 | "metadata": {}, 52 | "source": [ 53 | "\n", 54 | "\n", 55 | "# NumPy Aggregation\n", 56 | "\n", 57 | "Aggregation operates on an array and produces a result with fewer dimensions than the original array\n", 58 | "\n", 59 | "Aggregations can typically be applied along a chosen axis to control the direction of the reduction\n", 60 | "\n", 61 | "![Assets/NumpyAxis0.PNG](Assets/NumpyAxis0.PNG)\n", 62 | "\n", 63 | "![Assets/NumpyAxis1.PNG](Assets/NumpyAxis1.PNG)\n", 64 | "\n", 65 | "Common examples in AI are:\n", 66 | "- min\n", 67 | "- max\n", 68 | "- sum\n", 69 | "- mean\n", 70 | "- std ... among others\n", 71 | "\n", 72 | "----------------------------------------------------------------------------------\n", 73 | "| Functions | Description | \n", 74 | "| --- | --- |\n", 75 | "| np.mean() | Compute the arithmetic mean along the specified axis. |\n", 76 | "| np.std() | Compute the standard deviation along the specified axis. |\n", 77 | "| np.var() | Compute the variance along the specified axis. |\n", 78 | "| np.sum() | Sum of array elements over a given axis. |\n", 79 | "| np.prod() | Return the product of array elements over a given axis. |\n", 80 | "| np.cumsum() | Return the cumulative sum of the elements along a given axis. |\n", 81 | "| np.cumprod() | Return the cumulative product of elements along a given axis. |\n", 82 | "| np.min(), np.max() | Return the minimum / maximum of an array, or the minimum / maximum along an axis. |\n", 83 | "| np.argmin(), np.argmax() | Return the indices of the minimum / maximum values along an axis. |\n", 84 | "| np.all() | Test whether all array elements along a given axis evaluate to True. |\n", 85 | "| np.any() | Test whether any array element along a given axis evaluates to True. 
|\n", 86 | "\n", 87 | "\n", 88 | "Specialty calcualtions exist so always eamine your code with a view to simply and remove loops with off the shelf solutions\n", 89 | "\n", 90 | "For example, in AI there re times we need to add the values of the diagonal of special arrays.\n", 91 | "\n", 92 | "For very long vectors these will accelerate noticibly and more so for larger multdimensional arrays\n", 93 | "\n", 94 | "Below is a very partial list comparing PyTorch and NumPy aggregations:\n", 95 | "\n", 96 | "| PyTorch Aggregations | NumPy Aggregations | \n", 97 | "| ---| --- | \n", 98 | "| torch.sum(x) | np.sum(x)| \n", 99 | "| torch.mean(x) | np.mean(x)| \n", 100 | "| torch.median(x) | np.median(x)| \n", 101 | "| torch.max(x) | np.max(x)| \n", 102 | "| torch.min(x) | np.min(x)| \n", 103 | "| torch.prod(x) | np.prod(x)| \n", 104 | "| torch.std(x) | np.std(x)| \n", 105 | "| torch.var(x) | np.var(x)| \n", 106 | "| torch.any(x) | np.any(x)| \n", 107 | "| torch.all(x) | np.all(x)| \n", 108 | "| torch.unique(x) | np.unique(x)| \n", 109 | "| torch.cumsum(x, dim=0) | np.cumsum(x) |\n", 110 | "\n", 111 | "\n", 112 | "Here are more comparison of usefule NumPy functions and their PyTorch counterparts:\n", 113 | "\n", 114 | "| Function | Numpy | PyTorch | \n", 115 | "| Test for NaN values| numpy.isnan()| torch.isnan()| \n", 116 | "| Test for infinite values| numpy.isinf()| torch.isinf()| \n", 117 | "| Test for negative infinite values| numpy.isneginf()| torch.isneginf()| \n", 118 | "| Test for positive infinite values| numpy.isposinf()| torch.isposinf()| \n", 119 | "| Test for finite values| numpy.isfinite()| torch.isfinite()| \n", 120 | "\n", 121 | "Below is a very partial list comparing PyTorch and NumPy aggregations:\n", 122 | "\n", 123 | "Below is a naive approach for addng all the diagnoal elements of a smallish array of 1000 x 1000. 
So the acceleration is reasonable but not outlandish\n" 124 | ] 125 | }, 126 | { 127 | "cell_type": "code", 128 | "execution_count": null, 129 | "metadata": {}, 130 | "outputs": [], 131 | "source": [ 132 | "A = np.arange(1_000_000).reshape(1000, 1000)\n", 133 | "A_torch = torch.tensor(A)\n", 134 | "Diag = 0\n", 135 | "\n", 136 | "t1 = time.time()\n", 137 | "for i in range(len(A)):\n", 138 | " for j in range(len(A)): \n", 139 | " if i == j:\n", 140 | " Diag += A[i,j]\n", 141 | "t2 = time.time()\n", 142 | "Elapsed_Diag_base = t2-t1\n", 143 | "print(\"elapsed time: \", Elapsed_Diag_base)\n", 144 | "print(\"Diag: \", Diag)" 145 | ] 146 | }, 147 | { 148 | "cell_type": "markdown", 149 | "metadata": {}, 150 | "source": [ 151 | "## Exercise:\n", 152 | "\n", 153 | "Use a search engine to find the NumPy method that sums the diagonal elements of this array.\n", 154 | "- Hint: trace\n", 155 | "- Hint: Diag = np.trace(A)" 156 | ] 157 | }, 158 | { 159 | "cell_type": "code", 160 | "execution_count": null, 161 | "metadata": {}, 162 | "outputs": [], 163 | "source": [ 164 | "t1 = time.time()\n", 165 | "### Complete the code below #####\n", 166 | "\n", 167 | "Diag = np.trace(A)\n", 168 | "\n", 169 | "##################### \n", 170 | "t2 = time.time()\n", 171 | "Elapsed_Diag_numpy = t2 - t1\n", 172 | "print(\"elapsed time: \", Elapsed_Diag_numpy)\n", 173 | "print(\"Diag: \", Diag)\n", 174 | "print(\"Acceleration: {:4.0f}X\".format(Elapsed_Diag_base/Elapsed_Diag_numpy))" 175 | ] 176 | }, 177 | { 178 | "cell_type": "code", 179 | "execution_count": null, 180 | "metadata": {}, 181 | "outputs": [], 182 | "source": [ 183 | "t1 = time.time()\n", 184 | "### Complete the code below #####\n", 185 | "\n", 186 | "Diag = torch.trace(A_torch)\n", 187 | "\n", 188 | "##################### \n", 189 | "t2 = time.time()\n", 190 | "Elapsed_Diag_numpy = t2 - t1 # name reused from the cell above; this now holds the PyTorch timing\n", 191 | "print(\"elapsed time: \", Elapsed_Diag_numpy)\n", 192 | "print(\"Diag: \", Diag)\n", 193 | "print(\"Acceleration: 
{:4.0f}X\".format(Elapsed_Diag_base/Elapsed_Diag_numpy))" 194 | ] 195 | }, 196 | { 197 | "cell_type": "markdown", 198 | "metadata": {}, 199 | "source": [ 200 | "# Exercise: Compute Mean & Std of array using NumPy" 201 | ] 202 | }, 203 | { 204 | "cell_type": "code", 205 | "execution_count": null, 206 | "metadata": {}, 207 | "outputs": [], 208 | "source": [ 209 | "rng = np.random.default_rng(2021)\n", 210 | "# random.default_range is the recommended method for generated random's\n", 211 | "# see blog \"Stop using numpy.random.seed()\" for reasoning\n", 212 | "# https://towardsdatascience.com/stop-using-numpy-random-seed-581a9972805f\n", 213 | "\n", 214 | "a = rng.random((10_000_000,))\n", 215 | "a_torch = torch.tensor(a)\n", 216 | "t1 = time.time()\n", 217 | "timing = {}\n", 218 | "S = 0\n", 219 | "for i in range (len(a)):\n", 220 | " S += a[i]\n", 221 | "mean = S/len(a)\n", 222 | "std = 0\n", 223 | "for i in range (len(a)):\n", 224 | " d = a[i] - mean\n", 225 | " std += d*d\n", 226 | "std = np.sqrt(std/len(a))\n", 227 | "timing['loop'] = time.time() - t1\n", 228 | "print(\"mean\", mean)\n", 229 | "print(\"std\", std)\n", 230 | "\n", 231 | "print(timing)" 232 | ] 233 | }, 234 | { 235 | "cell_type": "code", 236 | "execution_count": null, 237 | "metadata": {}, 238 | "outputs": [], 239 | "source": [ 240 | "t1 = time.time()\n", 241 | "print(a.mean())\n", 242 | "print(a.std())\n", 243 | "\n", 244 | "timing['numpy'] = time.time() - t1\n", 245 | "print(timing)\n", 246 | "print(f\"Acceleration {timing['loop']/timing['numpy']:4.1f} X\")" 247 | ] 248 | }, 249 | { 250 | "cell_type": "code", 251 | "execution_count": null, 252 | "metadata": {}, 253 | "outputs": [], 254 | "source": [ 255 | "t1 = time.time()\n", 256 | "print(a_torch.mean())\n", 257 | "print(a_torch.std())\n", 258 | "\n", 259 | "timing['pytorch'] = time.time() - t1\n", 260 | "print(timing)\n", 261 | "print(f\"Acceleration {timing['loop']/timing['numpy']:4.1f} X\")" 262 | ] 263 | }, 264 | { 265 | "cell_type": "code", 
266 | "execution_count": null, 267 | "metadata": {}, 268 | "outputs": [], 269 | "source": [ 270 | "%matplotlib inline\n", 271 | "import matplotlib.pyplot as plt\n", 272 | "plt.figure(figsize=(10,6))\n", 273 | "plt.title(\"Time to compute mean and std: loop versus NumPy and PyTorch [Lower is better]\",fontsize=12)\n", 274 | "plt.ylabel(\"Time in seconds\",fontsize=12)\n", 275 | "plt.yscale('log')\n", 276 | "plt.xlabel(\"Various types of operations\",fontsize=14)\n", 277 | "plt.grid(True)\n", 278 | "plt.bar(x = list(timing.keys()), height= list(timing.values()), align='center',tick_label=list(timing.keys()))" 279 | ] 280 | }, 281 | { 282 | "cell_type": "code", 283 | "execution_count": null, 284 | "metadata": {}, 285 | "outputs": [], 286 | "source": [ 287 | "print(\"Done\")" 288 | ] 289 | }, 290 | { 291 | "cell_type": "markdown", 292 | "metadata": {}, 293 | "source": [ 294 | "# Notices and Disclaimers\n", 295 | "\n", 296 | "Intel technologies may require enabled hardware, software or service activation.\n", 297 | "No product or component can be absolutely secure. \n", 298 | "\n", 299 | "Your costs and results may vary. \n", 300 | "\n", 301 | "© Intel Corporation. Intel, the Intel logo, and other Intel marks are trademarks of Intel Corporation or its subsidiaries. Other names and brands may be claimed as the property of others. 
" 302 | ] 303 | }, 304 | { 305 | "cell_type": "code", 306 | "execution_count": null, 307 | "metadata": {}, 308 | "outputs": [], 309 | "source": [] 310 | } 311 | ], 312 | "metadata": { 313 | "anaconda-cloud": {}, 314 | "kernelspec": { 315 | "display_name": "pytorch-gpu", 316 | "language": "python", 317 | "name": "pytorch-gpu" 318 | }, 319 | "language_info": { 320 | "codemirror_mode": { 321 | "name": "ipython", 322 | "version": 3 323 | }, 324 | "file_extension": ".py", 325 | "mimetype": "text/x-python", 326 | "name": "python", 327 | "nbconvert_exporter": "python", 328 | "pygments_lexer": "ipython3", 329 | "version": "3.9.16" 330 | }, 331 | "nbTranslate": { 332 | "displayLangs": [ 333 | "*" 334 | ], 335 | "hotkey": "alt-t", 336 | "langInMainMenu": true, 337 | "sourceLang": "en", 338 | "targetLang": "fr", 339 | "useGoogleTranslate": true 340 | }, 341 | "toc": { 342 | "base_numbering": 1, 343 | "nav_menu": {}, 344 | "number_sections": true, 345 | "sideBar": true, 346 | "skip_h1_title": false, 347 | "title_cell": "Table of Contents", 348 | "title_sidebar": "Contents", 349 | "toc_cell": false, 350 | "toc_position": {}, 351 | "toc_section_display": true, 352 | "toc_window_display": false 353 | } 354 | }, 355 | "nbformat": 4, 356 | "nbformat_minor": 4 357 | } 358 | -------------------------------------------------------------------------------- /05_Compare_Where_Select.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": { 6 | "tags": [] 7 | }, 8 | "source": [ 9 | "# Introduction to PyTorch - Simple Conditional Logic by way of NumPy\n", 10 | "\n", 11 | "### NumPy and PyTorch ways to vectorize a loop and also handle simple conditional logic\n", 12 | "\n", 13 | "\n", 14 | "One thing that could prevent us from effectively getting vector performance when converting a loop to a vector approach is when the original loop has if then else statements in it - called conditional 
logic\n", 15 | "\n", 16 | "![SimpleLogic.png](Assets/SimpleLogic.png)\n", 17 | "\n", 18 | "One thing that could prevent us from effectively getting vector performance when converting a loop to a vector approach is when the original loop has if then else statements in it - called conditional logic\n", 19 | "\n", 20 | "The Numpy Where allows us to tackle conditional loops in a fast vectorized way\n", 21 | "\n", 22 | "Apply conditional logic to an array to create a new column orupdate contents of an existing column\n", 23 | "\n", 24 | "**Syntax:**\n", 25 | "- numpy.where(condition, [x, y, ]/)\n", 26 | "- Return elements chosen from x or y depending on condition.\n", 27 | "\n", 28 | "To understand what numpy where does, look at the simple example below\n", 29 | "See a simple example below to add 50 to all elements currently greater than 5:\n" 30 | ] 31 | }, 32 | { 33 | "cell_type": "markdown", 34 | "metadata": { 35 | "tags": [] 36 | }, 37 | "source": [ 38 | "\n", 39 | "# Exercises:\n", 40 | "\n", 41 | "Do a page search for each **Exercise** in this notebook. Complete all exercises. 
Code in cells above each exercise may give insight into a solid approach" 42 | ] 43 | }, 44 | { 45 | "cell_type": "code", 46 | "execution_count": null, 47 | "metadata": {}, 48 | "outputs": [], 49 | "source": [ 50 | "import torch\n", 51 | "import numpy as np\n", 52 | "from math import log10 as lg10\n", 53 | "import time\n", 54 | "import matplotlib.pyplot as plt\n", 55 | "import random\n", 56 | "import time\n", 57 | "%matplotlib inline" 58 | ] 59 | }, 60 | { 61 | "cell_type": "markdown", 62 | "metadata": {}, 63 | "source": [] 64 | }, 65 | { 66 | "cell_type": "code", 67 | "execution_count": null, 68 | "metadata": {}, 69 | "outputs": [], 70 | "source": [ 71 | "a = torch.arange(10)\n", 72 | "torch.where(a > 5, a + 50, a )\n", 73 | "# if a > 5 then return a + 50\n", 74 | "# else return a" 75 | ] 76 | }, 77 | { 78 | "cell_type": "markdown", 79 | "metadata": {}, 80 | "source": [ 81 | "This could come in handy for many AI applications, but let's choose labeling data\n", 82 | "\n", 83 | "There may be better ways to binarize data, but here is a simple example of converting continuous data into categorical values\n", 84 | "\n", 85 | "arr = **np.array([11, 1.2, 12, 13, 14, 7.3, 5.4, 12.5])**\n", 86 | "\n", 87 | "Let's say all values 10 and above represent a medical parameter threshold that indicates further testing, while values below 10 indicate normal range\n", 88 | "\n", 89 | "We might like to print the values as words such as \n", 90 | "**['More Testing', 'Normal', 'More Testing', 'More Testing', ...]**\n", 91 | "\n" 92 | ] 93 | }, 94 | { 95 | "cell_type": "code", 96 | "execution_count": null, 97 | "metadata": {}, 98 | "outputs": [], 99 | "source": [ 100 | "arr = np.array([11, 1.2, 12, 13, 14, 7.3, 5.4, 12.5])\n", 101 | "np.where(arr < 10, 'Normal', 'More Testing')" 102 | ] 103 | }, 104 | { 105 | "cell_type": "markdown", 106 | "metadata": {}, 107 | "source": [ 108 | "# PyTorch tensors want numbers!\n", 109 | "\n", 110 | "NumPy accepts the strings used above, but the torch.where cell below will fail because we are passing 
strings in" 111 | ] 112 | }, 113 | { 114 | "cell_type": "code", 115 | "execution_count": null, 116 | "metadata": {}, 117 | "outputs": [], 118 | "source": [ 119 | "arr = np.array([11, 1.2, 12, 13, 14, 7.3, 5.4, 12.5])\n", 120 | "try:\n", 121 | " print(np.where(arr < 10, 'Normal', 'More Testing') )\n", 122 | "except: \n", 123 | " print(\"This where clause crashed due to using strings\")" 124 | ] 125 | }, 126 | { 127 | "cell_type": "code", 128 | "execution_count": null, 129 | "metadata": {}, 130 | "outputs": [], 131 | "source": [ 132 | "arr = torch.tensor([11, 1.2, 12, 13, 14, 7.3, 5.4, 12.5])\n", 133 | "try:\n", 134 | " print(torch.where(arr < 10, 'Normal', 'More Testing') )\n", 135 | "except: \n", 136 | " print(\"This where clause crashed due to using strings\")" 137 | ] 138 | }, 139 | { 140 | "cell_type": "code", 141 | "execution_count": null, 142 | "metadata": {}, 143 | "outputs": [], 144 | "source": [ 145 | "arr = torch.tensor([11, 1.2, 12, 13, 14, 7.3, 5.4, 12.5])\n", 146 | "torch.where(arr < 10, arr*2+1, arr*-1)" 147 | ] 148 | }, 149 | { 150 | "cell_type": "markdown", 151 | "metadata": {}, 152 | "source": [ 153 | "or we could binarize data for use in a classifier" 154 | ] 155 | }, 156 | { 157 | "cell_type": "code", 158 | "execution_count": null, 159 | "metadata": {}, 160 | "outputs": [], 161 | "source": [ 162 | "# Simple Numpy Binarizer Discretizer\n", 163 | "# convert continous data to discrete integer bins\n", 164 | "arr = np.array([11, 1.2, 12, 13, 14, 7.3, 5.4, 12.5])\n", 165 | "print(np.where(arr < 6, 0, np.where(arr < 12, 1, 2)))\n" 166 | ] 167 | }, 168 | { 169 | "cell_type": "code", 170 | "execution_count": null, 171 | "metadata": {}, 172 | "outputs": [], 173 | "source": [ 174 | "# Simple Numpy Binarizer Discretizer\n", 175 | "# convert continous data to discrete integer bins\n", 176 | "arr = torch.tensor([11, 1.2, 12, 13, 14, 7.3, 5.4, 12.5])\n", 177 | "print(torch.where(arr < 6, 0, torch.where(arr < 12, 1, 2)))" 178 | ] 179 | }, 180 | { 181 | 
"cell_type": "markdown", 182 | "metadata": {}, 183 | "source": [ 184 | "### Numpy Where to find rows and columns of conditions\n", 185 | "\n", 186 | "Given a mask of TRUE/FALSE values, we will \n", 187 | "- generate a new array with a 1 at every location TRUE is located\n", 188 | "- generate a -1 at every location a FALSE is located\n", 189 | "\n", 190 | "**Apply a mask**" 191 | ] 192 | }, 193 | { 194 | "cell_type": "code", 195 | "execution_count": null, 196 | "metadata": {}, 197 | "outputs": [], 198 | "source": [ 199 | "#Apply a mask of True/False array to select or manipulate elements\n", 200 | "a = np.ones((3,3)) # a contains all 1's\n", 201 | "\n", 202 | "print(\"initial array a\\n\", a)\n", 203 | "# Given a mask of true/ false values, we will generate \n", 204 | "# a new array with a 1 at every location TRUE is located\n", 205 | "# a -1 at every location a FALSE is located\n", 206 | "mask = [[False,True,True],[False,True,False],[True,False,True]]\n", 207 | "print(\"\\nmask\")\n", 208 | "for el in mask: # simple loop to print the mask\n", 209 | " print(el)\n", 210 | " \n", 211 | "testing_array = np.where(mask,a,-a)\n", 212 | "\n", 213 | "print(\"\\ntesting_array\\n\",testing_array)\n", 214 | "\n", 215 | "# now we can find where all the ones are by row and column\n", 216 | "print(\"row index (where ones are): \",np.where(testing_array > 0)[0])\n", 217 | "print(\"col index (where ones are): \",np.where(testing_array > 0)[1])" 218 | ] 219 | }, 220 | { 221 | "cell_type": "code", 222 | "execution_count": null, 223 | "metadata": {}, 224 | "outputs": [], 225 | "source": [] 226 | }, 227 | { 228 | "cell_type": "markdown", 229 | "metadata": {}, 230 | "source": [ 231 | "This can be used the other way to **create a mask** given a conidtion or threshold" 232 | ] 233 | }, 234 | { 235 | "cell_type": "code", 236 | "execution_count": null, 237 | "metadata": {}, 238 | "outputs": [], 239 | "source": [ 240 | "# create a mask for indexing, to later manipulate arrays\n", 241 | "a = 
np.ones((3,3)) # a contains all 1's\n", 242 | "\n", 243 | "print(\"initial array a\\n\", a)\n", 244 | " \n", 245 | "mask = [[False,True,True],[False,True,False],[True,False,True]]\n", 246 | "print(\"\\nmask\")\n", 247 | "for el in mask: # simple loop to print the mask\n", 248 | " print(el)\n", 249 | " \n", 250 | "testing_array = np.where(mask,a,-a)\n", 251 | "\n", 252 | "print(\"\\ntesting_array\\n\",testing_array)\n", 253 | "\n", 254 | "WentTheOtherWay = np.where(testing_array > 0,True, False)\n", 255 | "\n", 256 | "print(\"\\nWentTheOtherWay\\n\",WentTheOtherWay)\n", 257 | "\n", 258 | "# now we can find where all the ones are by row and column\n", 259 | "print(\"row index (where ones are): \",np.where(WentTheOtherWay > 0)[0])\n", 260 | "print(\"col index (where ones are): \",np.where(WentTheOtherWay > 0)[1])" 261 | ] 262 | }, 263 | { 264 | "cell_type": "code", 265 | "execution_count": null, 266 | "metadata": {}, 267 | "outputs": [], 268 | "source": [ 269 | "# create a mask for indexing, to later manipulate tensors\n", 270 | "a = torch.ones((3,3)) # a contains all 1's\n", 271 | "\n", 272 | "print(\"initial array a\\n\", a)\n", 273 | " \n", 274 | "mask = torch.tensor([[False,True,True],[False,True,False],[True,False,True]])\n", 275 | "print(\"\\nmask\")\n", 276 | "for el in mask: # simple loop to print the mask\n", 277 | " print(el)\n", 278 | " \n", 279 | "testing_array = torch.where(mask,a,-a)\n", 280 | "\n", 281 | "print(\"\\ntesting_array\\n\",testing_array)\n", 282 | "\n", 283 | "WentTheOtherWay = testing_array > 0 # boolean mask, computed in PyTorch rather than NumPy\n", 284 | "\n", 285 | "print(\"\\nWentTheOtherWay\\n\",WentTheOtherWay)\n", 286 | "\n", 287 | "# now we can find where all the ones are by row and column\n", 288 | "print(\"row index (where ones are): \",torch.where(WentTheOtherWay)[0])\n", 289 | "print(\"col index (where ones are): \",torch.where(WentTheOtherWay)[1])" 290 | ] 291 | }, 292 | { 293 | "cell_type": "markdown", 294 | "metadata": {}, 295 | "source": [ 296 | "## 
PyTorch/NumPy.where Multiplication Table Example\n", 297 | "\n", 298 | "Find all locations of a value in a multiplication table\n", 299 | "\n", 300 | "Find all (row, col) locations of the value 24 in a multiplication table " 301 | ] 302 | }, 303 | { 304 | "cell_type": "code", 305 | "execution_count": null, 306 | "metadata": {}, 307 | "outputs": [], 308 | "source": [ 309 | "numLine = np.arange(1, 11).reshape(10,1)\n", 310 | "MultiplicationTable = numLine * numLine.T\n", 311 | "MultiplicationTable\n", 312 | "np.where( MultiplicationTable == 24)" 313 | ] 314 | }, 315 | { 316 | "cell_type": "code", 317 | "execution_count": null, 318 | "metadata": {}, 319 | "outputs": [], 320 | "source": [ 321 | "numLine = torch.arange(1, 11).reshape(10,1)\n", 322 | "MultiplicationTable = numLine * numLine.T\n", 323 | "MultiplicationTable\n", 324 | "torch.where( MultiplicationTable == 24)" 325 | ] 326 | }, 327 | { 328 | "cell_type": "markdown", 329 | "metadata": {}, 330 | "source": [ 331 | "# Exercise:\n", 332 | "\n", 333 | "Find all (row, col) locations of all **multiples of 12** or all **multiples of 9** in a 10x10 multiplication table and make all other values 0. 
Preserve the first row and first column as readable indexes for the table as follows:" 334 | ] 335 | }, 336 | { 337 | "cell_type": "raw", 338 | "metadata": {}, 339 | "source": [ 340 | "array([[ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10],\n", 341 | " [ 2, 0, 0, 0, 0, 12, 0, 0, 18, 0],\n", 342 | " [ 3, 0, 9, 12, 0, 18, 0, 24, 27, 0],\n", 343 | " [ 4, 0, 12, 0, 0, 24, 0, 0, 36, 0],\n", 344 | " [ 5, 0, 0, 0, 0, 0, 0, 0, 45, 0],\n", 345 | " [ 6, 12, 18, 24, 0, 36, 0, 48, 54, 60],\n", 346 | " [ 7, 0, 0, 0, 0, 0, 0, 0, 63, 0],\n", 347 | " [ 8, 0, 24, 0, 0, 48, 0, 0, 72, 0],\n", 348 | " [ 9, 18, 27, 36, 45, 54, 63, 72, 81, 90],\n", 349 | " [10, 0, 0, 0, 0, 60, 0, 0, 90, 0]])" 350 | ] 351 | }, 352 | { 353 | "cell_type": "markdown", 354 | "metadata": {}, 355 | "source": [ 356 | "# Python loopy approach" 357 | ] 358 | }, 359 | { 360 | "cell_type": "code", 361 | "execution_count": null, 362 | "metadata": {}, 363 | "outputs": [], 364 | "source": [ 365 | "N = 10\n", 366 | "timing = {}\n", 367 | "A = np.arange(0, N)\n", 368 | "A_torch = torch.arange(0, N)\n", 369 | "#A_torch = torch.tensor(A)\n", 370 | "t1 = time.time()\n", 371 | "B = np.zeros((N,N))\n", 372 | "B_torch = torch.zeros((N,N))\n", 373 | "for i in range(N):\n", 374 | " row = []\n", 375 | " for j in range(N):\n", 376 | " B[i,j] = (i+1)*(j+1)\n", 377 | " if B[i,j] % 9 != 0 and B[i,j] %12 != 0:\n", 378 | " B[i,j] = 0\n", 379 | " row.append(int(B[i,j]))\n", 380 | " print(row)\n", 381 | "t2 = time.time()\n", 382 | "loop = t2-t1\n", 383 | "timing['loop'] = loop\n", 384 | "print(\"Elapsed \", t2-t1)\n" 385 | ] 386 | }, 387 | { 388 | "cell_type": "code", 389 | "execution_count": null, 390 | "metadata": {}, 391 | "outputs": [], 392 | "source": [ 393 | "## one solution - note: like the loop above, this also zeroes the first row and column instead of preserving them as index labels\n", 394 | "\n", 395 | "arr = np.array([1,2,3,4,5,6,7,8,9,10])\n", 396 | "table = arr.reshape(10,1)*arr\n", 397 | "np.where( (table % 9) == 0, table, np.where( (table % 12) == 0, table, 0))" 398 | ] 399 | }, 400 | { 401 | 
"cell_type": "code", 402 | "execution_count": null, 403 | "metadata": {}, 404 | "outputs": [], 405 | "source": [ 406 | "## one solution - preserves the indexing edges for easy checking\n", 407 | "arr = torch.tensor([1,2,3,4,5,6,7,8,9,10])\n", 408 | "table = arr.reshape(10,1)*arr\n", 409 | "torch.where( (table % 9) == 0, table, torch.where( (table % 12) == 0, table, 0))" 410 | ] 411 | }, 412 | { 413 | "cell_type": "markdown", 414 | "metadata": {}, 415 | "source": [ 416 | "## Numpy Where applied to California Housing data\n", 417 | "\n", 418 | "In AI context, this could be applying categorical classifier to otherwise continuous values. For example, california hosugin dataset the target price varibale is continuous." 419 | ] 420 | }, 421 | { 422 | "cell_type": "markdown", 423 | "metadata": {}, 424 | "source": [ 425 | "## Fictitious scenario\n", 426 | "\n", 427 | "A new stimulous package is considered whereby new house buyers will be given a couon worth 50,000 off toward purchase of hosues in California whose price (prior to coupon) is between 250,0000 and 350,000. Other prices will be unaffected. 
Generate array with the adjusted targets\n" 428 | ] 429 | }, 430 | { 431 | "cell_type": "code", 432 | "execution_count": null, 433 | "metadata": {}, 434 | "outputs": [], 435 | "source": [ 436 | "# Ficticious scenario:\n", 437 | "from sklearn.datasets import fetch_california_housing\n", 438 | "\n", 439 | "california_housing = fetch_california_housing(as_frame=True)\n", 440 | "X = california_housing.data.to_numpy()\n", 441 | "buyerPriceRangeLo = 250_000/100_000\n", 442 | "buyerPriceRangeHi= 350_000/100_000\n", 443 | "T = california_housing.target.to_numpy() \n", 444 | "t1 = time.time()\n", 445 | "timing = {}\n", 446 | "New = np.empty_like(T)\n", 447 | "for i in range(len(T)):\n", 448 | " if ( (T[i] < buyerPriceRangeHi) & (T[i] >= buyerPriceRangeLo) ):\n", 449 | " New[i] = T[i] - 50_000/100_000\n", 450 | " else:\n", 451 | " New[i] = T[i]\n", 452 | "t2 = time.time()\n", 453 | "plt.title( \"California Housing Dataset - conditional Logic Applied\")\n", 454 | "plt.scatter(T, New, color = 'b')\n", 455 | "plt.grid()\n", 456 | "print(\"time elapsed: \", t2-t1)\n", 457 | "timing['loop'] = t2-t1" 458 | ] 459 | }, 460 | { 461 | "cell_type": "markdown", 462 | "metadata": {}, 463 | "source": [ 464 | "## Excercise:\n", 465 | "\n", 466 | "Duplicate the above condition using a Numpy.Where " 467 | ] 468 | }, 469 | { 470 | "cell_type": "code", 471 | "execution_count": null, 472 | "metadata": {}, 473 | "outputs": [], 474 | "source": [ 475 | "t1 = time.time()\n", 476 | "#############################################################################\n", 477 | "### Exercise: Addone moddify code below to compute same results as above loop\n", 478 | "#New = np.where(() & (), (), ()) \n", 479 | "New = np.where((T < buyerPriceRangeHi) & (T >= buyerPriceRangeLo), T - 50_000/100_000, T ) \n", 480 | "\n", 481 | "##############################################################################\n", 482 | "t2 = time.time()\n", 483 | "\n", 484 | "plt.scatter(T, New, color = 'r')\n", 485 | "plt.grid()\n", 
486 | "print(\"time elapsed: \", t2-t1)\n", 487 | "timing['np.where'] = t2-t1\n", 488 | "print(\"Speedup: {:4.1f}X\".format( timing['loop']/timing['np.where']))" 489 | ] 490 | }, 491 | { 492 | "cell_type": "code", 493 | "execution_count": null, 494 | "metadata": {}, 495 | "outputs": [], 496 | "source": [ 497 | "X = torch.tensor(california_housing.data.to_numpy())\n", 498 | "buyerPriceRangeLo = 250_000/100_000\n", 499 | "buyerPriceRangeHi= 350_000/100_000\n", 500 | "T = torch.tensor(california_housing.target.to_numpy() )\n", 501 | "\n", 502 | "t1 = time.time()\n", 503 | "#############################################################################\n", 504 | "### Exercise: Addone moddify code below to compute same results as above loop\n", 505 | "#New = np.where(() & (), (), ()) \n", 506 | "New = torch.where((T < buyerPriceRangeHi) & (T >= buyerPriceRangeLo), T - 50_000/100_000, T ) \n", 507 | "\n", 508 | "##############################################################################\n", 509 | "t2 = time.time()\n", 510 | "\n", 511 | "plt.scatter(T, New, color = 'r')\n", 512 | "plt.grid()\n", 513 | "print(\"time elapsed: \", t2-t1)\n", 514 | "timing['torch.where'] = t2-t1\n", 515 | "print(\"Speedup: {:4.1f}X\".format( timing['loop']/timing['torch.where']))" 516 | ] 517 | }, 518 | { 519 | "cell_type": "code", 520 | "execution_count": null, 521 | "metadata": {}, 522 | "outputs": [], 523 | "source": [ 524 | "%matplotlib inline\n", 525 | "import matplotlib.pyplot as plt\n", 526 | "plt.figure(figsize=(10,6))\n", 527 | "plt.title(\"Plot of various method of computing California Housing Discount Rebate!\")\n", 528 | "plt.ylabel(\"Time in seconds\",fontsize=12)\n", 529 | "plt.xlabel(\"Various types of operations\",fontsize=14)\n", 530 | "plt.xticks(rotation=-60)\n", 531 | "plt.grid(True)\n", 532 | "plt.bar(x = range(len(timing)), height=list(timing.values()), align='center', tick_label=list(timing.keys()))\n", 533 | "print('Acceleration : {:4.0f} 
X'.format(timing['loop']/timing['torch.where']))" 534 | ] 535 | }, 536 | { 537 | "cell_type": "markdown", 538 | "metadata": {}, 539 | "source": [ 540 | "As you can see, NumPy where generated the same data as the original loop, but did so about 13X faster in our run (the speedup will vary with hardware and system load)" 541 | ] 542 | }, 543 | { 544 | "cell_type": "markdown", 545 | "metadata": { 546 | "tags": [] 547 | }, 548 | "source": [ 549 | "# Numpy Select to handle conditional logic\n", 550 | "\n", 551 | "![ConditionalLogic.png](Assets/ConditionalLogic.png)\n", 552 | "![SimpleLogic.png](Assets/SimpleLogic.png)\n", 553 | "\n", 554 | "Apply conditional logic to an array to create a new column or update the contents of an existing column. This method handles more complex conditional scenarios than numpy where.\n", 555 | "\n", 556 | "**Syntax:**\n", 557 | "- numpy.select(condlist, choicelist, default=0)\n", 558 | "- Return an array drawn from elements in choicelist, depending on conditions.\n", 559 | "\n", 560 | "When more than one condition is satisfied for an element, the first one encountered in condlist is used.\n", 561 | "\n", 562 | "This is a very useful function for handling conditionals that would otherwise slow down a map or apply, or make the code harder to read\n", 563 | "\n", 564 | "First we will create some new data\n" 565 | ] 566 | }, 567 | { 568 | "cell_type": "code", 569 | "execution_count": null, 570 | "metadata": {}, 571 | "outputs": [], 572 | "source": [ 573 | "import time\n", 574 | "\n", 575 | "BIG = 10_000_000\n", 576 | "\n", 577 | "np.random.seed(2022)\n", 578 | "A = np.random.randint(0, 11, size=(BIG, 6))\n", 579 | "print(A[:5])" 580 | ] 581 | }, 582 | { 583 | "cell_type": "code", 584 | "execution_count": null, 585 | "metadata": {}, 586 | "outputs": [], 587 | "source": [ 588 | "import time\n", 589 | "\n", 590 | "BIG = 10_000_000\n", 591 | "\n", 592 | "torch.manual_seed(2022) # same seed value, but a different random stream than NumPy\n", 593 | "\n", 594 | "A_torch =torch.randint(0, 11, size=(BIG, 
6))\n", 595 | "print(A_torch[:5])" 596 | ] 597 | }, 598 | { 599 | "cell_type": "markdown", 600 | "metadata": {}, 601 | "source": [ 602 | "Find loops with a large number of iterations\n", 603 | "\n", 604 | "If they contain conditional logic:\n", 605 | "- consider np.where or np.select\n", 606 | "\n", 607 | "else\n", 608 | "- Try to find a Numpy replacement using ufuncs, aggregations, etc\n", 609 | "\n", 610 | "Below is a loop running 10,000,000 (BIG) iterations, with a messy set of conditions\n", 611 | "\n", 612 | "Look for a way to summarize these conditions using a numpy select statement if possible\n", 613 | "\n", 614 | "# Brute Force Approach (Big Loop)" 615 | ] 616 | }, 617 | { 618 | "cell_type": "code", 619 | "execution_count": null, 620 | "metadata": { 621 | "tags": [] 622 | }, 623 | "outputs": [], 624 | "source": [ 625 | "# Naive Python loop over the NumPy array\n", 626 | "timing = {}\n", 627 | "t1 = time.time()\n", 628 | "for i in range(BIG):\n", 629 | "    if A[i,4] == 10:\n", 630 | "        A[i,5] = A[i,2] * A[i,3]\n", 631 | "    elif (A[i,4] < 10) and (A[i,4] >=5):\n", 632 | "        A[i,5] = A[i,2] + A[i,3]\n", 633 | "    elif A[i,4] < 5:\n", 634 | "        A[i,5] = A[i,0] + A[i,1]\n", 635 | "t2 = time.time()\n", 636 | "baseTime = t2- t1\n", 637 | "print(A[:5,:])\n", 638 | "print(\"time: \", baseTime)\n", 639 | "timing['Naive Loop'] = t2 - t1" 640 | ] 641 | }, 642 | { 643 | "cell_type": "markdown", 644 | "metadata": {}, 645 | "source": [ 646 | "# Try Vectorizing with masks \n", 647 | "\n", 648 | "Remove the loop and the references to i, and create a boolean mask for each condition\n" 649 | ] 650 | }, 651 | { 652 | "cell_type": "code", 653 | "execution_count": null, 654 | "metadata": {}, 655 | "outputs": [], 656 | "source": [ 657 | "# Try Vectorizing simply NumPy\n", 658 | "t1 = time.time()\n", 659 | "mask1 = A[:,4] == 10\n", 660 | "A[mask1,5] = A[mask1,2] * A[mask1,3]\n", 661 | "mask2 = (A[:,4] < 10) & (A[:,4] >= 5)  # element-wise mask; use & rather than .any()/and\n", 662 | "A[mask2,5] = A[mask2,2] + A[mask2,3]\n", 663 | "mask3 = A[:,4] < 5\n", 664 | 
"A[mask3,5] = A[mask3,0] + A[mask3,1]\n", 665 | "t2 = time.time()\n", 666 | "print(A[:5,:])\n", 667 | "print(\"time :\", t2-t1)\n", 668 | "\n", 669 | "fastest_time = t2-t1\n", 670 | "Speedup = baseTime / fastest_time\n", 671 | "print(\"Speed up: {:4.0f} X\".format(Speedup))\n", 672 | "timing['Vector Masks NumPy'] = t2 - t1" 673 | ] 674 | }, 675 | { 676 | "cell_type": "code", 677 | "execution_count": null, 678 | "metadata": {}, 679 | "outputs": [], 680 | "source": [ 681 | "# Try Vectorizing simply PyTorch\n", 682 | "t1 = time.time()\n", 683 | "A_torch = torch.tensor(A)\n", 684 | "mask1 = A_torch[:,4] == 10\n", 685 | "A_torch[mask1,5] = A_torch[mask1,2] * A_torch[mask1,3]\n", 686 | "mask2 = (A_torch[:,4] < 10) & (A_torch[:,4] >= 5)  # element-wise mask; use & rather than .any()/and\n", 687 | "A_torch[mask2,5] = A_torch[mask2,2] + A_torch[mask2,3]\n", 688 | "mask3 = A_torch[:,4] < 5\n", 689 | "A_torch[mask3,5] = A_torch[mask3,0] + A_torch[mask3,1]\n", 690 | "t2 = time.time()\n", 691 | "print(A_torch[:5,:])\n", 692 | "print(\"time :\", t2-t1)\n", 693 | "\n", 694 | "fastest_time = t2-t1\n", 695 | "Speedup = baseTime / fastest_time\n", 696 | "print(\"Speed up: {:4.0f} X\".format(Speedup))\n", 697 | "timing['Vector Masks PyTorch'] = t2 - t1" 698 | ] 699 | }, 700 | { 701 | "cell_type": "markdown", 702 | "metadata": {}, 703 | "source": [ 704 | "# Try Vectorizing with select\n", 705 | "\n", 706 | "### Much cleaner logic\n", 707 | "\n", 708 | "- Put the conditions in a list (condlist)\n", 709 | "- Put the corresponding choices in a list (choicelist)\n", 710 | "- result = np.select(condition, choice, default)\n", 711 | "\n", 712 | "PyTorch has no direct equivalent of np.select (torch.select is an indexing function), but you can write one yourself" 713 | ] 714 | }, 715 | { 716 | "cell_type": "code", 717 | "execution_count": null, 718 | "metadata": {}, 719 | "outputs": [], 720 | "source": [ 721 | "from functools import reduce\n", 722 | "def my_select1(c, v, d =0):\n", 723 | "    _c, _v = c.pop(), v.pop()\n", 724 | "    r = my_select1(c, v, d) if len(c) else d\n", 725 | "    return 
torch.where(_c, _v, r)\n", 726 | "\n", 727 | "def my_select2(c, v, d=0):\n", 728 | " zipped = reversed(list(zip(c, v)))\n", 729 | " return reduce(lambda o, a: torch.where(*a, o), zipped, d)" 730 | ] 731 | }, 732 | { 733 | "cell_type": "code", 734 | "execution_count": null, 735 | "metadata": {}, 736 | "outputs": [], 737 | "source": [ 738 | "# np.select(condlist, choicelist, default=0)\n", 739 | "t1 = time.time()\n", 740 | "\n", 741 | "condition = [ (A[:,4] < 10) & (A[:,4] >= 5),\n", 742 | " ( A[:,4] < 5)]\n", 743 | "choice = [ (A[:,2] + A[:,3]), \n", 744 | " (A[:,0] + A[:,1] ) ]\n", 745 | "default = [(A[:,2] * A[:,3])]\n", 746 | "A[:,5] = np.select(condition, choice, default= default )\n", 747 | "\n", 748 | "t2 = time.time()\n", 749 | "print(A[:5,:])\n", 750 | "print(\"time :\", t2-t1)\n", 751 | "fastest_time = t2-t1\n", 752 | "Speedup = baseTime / fastest_time\n", 753 | "print(\"Speed up: {:4.0f} X\".format(Speedup))\n", 754 | "timing['Numpy Select'] = t2 - t1" 755 | ] 756 | }, 757 | { 758 | "cell_type": "code", 759 | "execution_count": null, 760 | "metadata": {}, 761 | "outputs": [], 762 | "source": [ 763 | "# np.select(condlist, choicelist, default=0)\n", 764 | "t1 = time.time()\n", 765 | "\n", 766 | "condition = [ (A_torch[:,4] < 10) & (A_torch[:,4] >= 5),\n", 767 | " ( A_torch[:,4] < 5)] \n", 768 | "\n", 769 | "choice = [ (A_torch[:,2] + A_torch[:,3]), \n", 770 | " (A_torch[:,0] + A_torch[:,1] ) ] \n", 771 | "\n", 772 | "A_torch[:,5] = my_select2(condition, choice, d = (A_torch[:,2] * A_torch[:,3]))\n", 773 | "\n", 774 | "t2 = time.time()\n", 775 | "print(A_torch[:5,:])\n", 776 | "print(\"time :\", t2-t1)\n", 777 | "fastest_time = t2-t1\n", 778 | "Speedup = baseTime / fastest_time\n", 779 | "print(\"Speed up: {:4.0f} X\".format(Speedup))\n", 780 | "timing['PyTorchCustom Select'] = t2 - t1" 781 | ] 782 | }, 783 | { 784 | "cell_type": "code", 785 | "execution_count": null, 786 | "metadata": {}, 787 | "outputs": [], 788 | "source": [] 789 | }, 790 | { 791 | 
"cell_type": "code", 792 | "execution_count": null, 793 | "metadata": {}, 794 | "outputs": [], 795 | "source": [ 796 | "plt.figure(figsize=(10,6))\n", 797 | "plt.title(\"Time taken to process {:,} records in seconds\".format(BIG),fontsize=12)\n", 798 | "plt.ylabel(\"Time in seconds\",fontsize=12)\n", 799 | "plt.xlabel(\"Various types of operations\",fontsize=14)\n", 800 | "plt.grid(True)\n", 801 | "plt.xticks(rotation=-60)\n", 802 | "plt.bar(x = list(timing.keys()), height= list(timing.values()), align='center',tick_label=list(timing.keys()))" 803 | ] 804 | }, 805 | { 806 | "cell_type": "markdown", 807 | "metadata": {}, 808 | "source": [ 809 | "## Exercise: Numpy Select\n", 810 | "\n", 811 | "Find all(rows, cols) of (all multiples of 12, 15, 21) in a multiplication table and make all other values 0 using numpy select" 812 | ] 813 | }, 814 | { 815 | "cell_type": "code", 816 | "execution_count": null, 817 | "metadata": {}, 818 | "outputs": [], 819 | "source": [ 820 | "# numpy.select(condlist, choicelist, default)\n", 821 | "numLine = np.arange(1, 11).reshape(10,1)\n", 822 | "multT = numLine * numLine.T\n", 823 | "\n", 824 | "# condition = [(), (), ()]\n", 825 | "# choice = [(), (), ()]\n", 826 | "# default =[()]\n", 827 | "# res = np.select(condition, choice, default)\n", 828 | "\n", 829 | "# res[0,:] = MultiplicationTable[0,:] # put edges back in to check result\n", 830 | "# res[:,0] = MultiplicationTable[:,0] # put edges back in to check result\n", 831 | "# res" 832 | ] 833 | }, 834 | { 835 | "cell_type": "code", 836 | "execution_count": null, 837 | "metadata": {}, 838 | "outputs": [], 839 | "source": [ 840 | "# numpy approach\n", 841 | "\n", 842 | "numLine = np.arange(1, 11).reshape(10,1)\n", 843 | " \n", 844 | "res = np.select(condition, choice, default)\n", 845 | "# res[0,:] = MultiplicationTable[0,:] # put edges back in to check result\n", 846 | "# res[:,0] = MultiplicationTable[:,0] # put edges back in to check result\n", 847 | "res" 848 | ] 849 | }, 850 | { 
851 | "cell_type": "code", 852 | "execution_count": null, 853 | "metadata": {}, 854 | "outputs": [], 855 | "source": [ 856 | "# PyTorch tensor approach (uses my_select2 defined earlier)\n", 857 | "numLine = torch.arange(1, 11).reshape(10,1)\n", 858 | "multT = numLine * numLine.T\n", 859 | "\n", 860 | "condition = [(multT%12 == 0), (multT%15 == 0), (multT%21 == 0)]\n", 861 | "choice = [(multT), (multT), (multT)]\n", 862 | "default = torch.zeros_like(multT)\n", 863 | "\n", 864 | "res = my_select2(condition, choice, default)\n", 865 | "# res[0,:] = MultiplicationTable[0,:] # put edges back in to check result\n", 866 | "# res[:,0] = MultiplicationTable[:,0] # put edges back in to check result\n", 867 | "res" 868 | ] 869 | }, 870 | { 871 | "cell_type": "code", 872 | "execution_count": null, 873 | "metadata": {}, 874 | "outputs": [], 875 | "source": [] 876 | }, 877 | { 878 | "cell_type": "markdown", 879 | "metadata": {}, 880 | "source": [ 881 | "# List of days\n", 882 | "\n", 883 | "### AKA showcasing fancy slicing\n", 884 | "\n", 885 | "In machine learning, feature engineering needs special preprocessing, particularly for data containing dates and times.\n", 886 | "\n", 887 | "It is fairly common to create new columns from datetime data to explicitly call out Day of Week (DOW), Day of Year (DOY), Day of Month (DOM), Quarter, hour of day, minutes and seconds of day and so on.\n", 888 | "\n", 889 | "This is because some cyclical patterns may need special handling to deal with exceptions. 
For example, reporting revenue for weekdays as opposed to weekends.\n", 890 | "\n", 891 | "In the example below, we will assume the first Saturday falls on day 3 and we want to efficiently report out each weekend day of the month\n", 892 | "\n", 893 | "Goal: grab subset of data for weekend days into a numpy array\n", 894 | "\n", 895 | "Demonstrate the approach using slicing as well as np.where()" 896 | ] 897 | }, 898 | { 899 | "cell_type": "code", 900 | "execution_count": null, 901 | "metadata": {}, 902 | "outputs": [], 903 | "source": [ 904 | "import numpy as np\n", 905 | "# create simple array of data with days info\n", 906 | "a = np.arange(21)\n", 907 | "a_torch = torch.arange(21)\n", 908 | "a,a_torch" 909 | ] 910 | }, 911 | { 912 | "cell_type": "code", 913 | "execution_count": null, 914 | "metadata": {}, 915 | "outputs": [], 916 | "source": [ 917 | "# skip count by 7 starting on day 3\n", 918 | "print(\"Here are the days of the month that are called Saturday\")\n", 919 | "a[3::7], a_torch[3::7]" 920 | ] 921 | }, 922 | { 923 | "cell_type": "code", 924 | "execution_count": null, 925 | "metadata": {}, 926 | "outputs": [], 927 | "source": [ 928 | "# skip count by 7 starting on day 4\n", 929 | "print(\"Here are the days of the month that are called Sunday\")\n", 930 | "a[4::7], a_torch[4::7]" 931 | ] 932 | }, 933 | { 934 | "cell_type": "markdown", 935 | "metadata": {}, 936 | "source": [ 937 | "# Here is a list of all the weekend days" 938 | ] 939 | }, 940 | { 941 | "cell_type": "code", 942 | "execution_count": null, 943 | "metadata": {}, 944 | "outputs": [], 945 | "source": [ 946 | "# indices for weekend days\n", 947 | "start = 3 # say the first Saturday of interest falls on day 3 of the month\n", 948 | "blist = list(zip(a[start::7],a[start+1::7]))\n", 949 | "blist_torch = list(zip(a_torch[start::7].numpy(),a_torch[start+1::7].numpy()))\n", 950 | "blist, blist_torch" 951 | ] 952 | }, 953 | { 954 | "cell_type": "code", 955 | "execution_count": null, 
956 | "metadata": {}, 957 | "outputs": [], 958 | "source": [ 959 | "np.array(blist).flatten()" 960 | ] 961 | }, 962 | { 963 | "cell_type": "code", 964 | "execution_count": null, 965 | "metadata": {}, 966 | "outputs": [], 967 | "source": [ 968 | "idx = np.where((a%7==3) | (a%7==4))\n", 969 | "a[idx]" 970 | ] 971 | }, 972 | { 973 | "cell_type": "code", 974 | "execution_count": null, 975 | "metadata": {}, 976 | "outputs": [], 977 | "source": [ 978 | "idx = torch.where((a_torch%7==3) | (a_torch%7==4))\n", 979 | "a_torch[idx].numpy()" 980 | ] 981 | }, 982 | { 983 | "cell_type": "code", 984 | "execution_count": null, 985 | "metadata": {}, 986 | "outputs": [], 987 | "source": [ 988 | "print(\"Done\")" 989 | ] 990 | }, 991 | { 992 | "cell_type": "code", 993 | "execution_count": null, 994 | "metadata": {}, 995 | "outputs": [], 996 | "source": [] 997 | } 998 | ], 999 | "metadata": { 1000 | "anaconda-cloud": {}, 1001 | "kernelspec": { 1002 | "display_name": "pytorch-gpu", 1003 | "language": "python", 1004 | "name": "pytorch-gpu" 1005 | }, 1006 | "language_info": { 1007 | "codemirror_mode": { 1008 | "name": "ipython", 1009 | "version": 3 1010 | }, 1011 | "file_extension": ".py", 1012 | "mimetype": "text/x-python", 1013 | "name": "python", 1014 | "nbconvert_exporter": "python", 1015 | "pygments_lexer": "ipython3", 1016 | "version": "3.9.16" 1017 | }, 1018 | "nbTranslate": { 1019 | "displayLangs": [ 1020 | "*" 1021 | ], 1022 | "hotkey": "alt-t", 1023 | "langInMainMenu": true, 1024 | "sourceLang": "en", 1025 | "targetLang": "fr", 1026 | "useGoogleTranslate": true 1027 | }, 1028 | "toc": { 1029 | "base_numbering": 1, 1030 | "nav_menu": {}, 1031 | "number_sections": true, 1032 | "sideBar": true, 1033 | "skip_h1_title": false, 1034 | "title_cell": "Table of Contents", 1035 | "title_sidebar": "Contents", 1036 | "toc_cell": false, 1037 | "toc_position": {}, 1038 | "toc_section_display": true, 1039 | "toc_window_display": false 1040 | } 1041 | }, 1042 | "nbformat": 4, 1043 | 
"nbformat_minor": 4 1044 | } 1045 | -------------------------------------------------------------------------------- /06_Compare_Sorting.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "id": "2e0034b0-668b-4d8c-a8c9-75202798d875", 6 | "metadata": {}, 7 | "source": [ 8 | "# Introduction to PyTorch Sorting by way of NumPy\n", 9 | "\n", 10 | "This module explores the reasons to abandon \"roll your own\" sorting algorithms in deference to using either NumPy or PyTorch function equivalents.\n", 11 | "\n", 12 | "We took a stab at creating a table to compare sort related functions between NumPy and PyTorch:\n", 13 | "\n", 14 | "Hennessy & Patterson:\n", 15 | "\"A New Golden Age for Computer Architecture\"\n", 16 | "https://www.doc.ic.ac.uk/~wl/teachlocal/arch/papers/cacm19golden-age.pdf\n", 17 | "\n", 18 | "\n", 19 | "| Function | Description | NumPy | PyTorch |\n", 20 | "| --- | --- | --- | --- |\n", 21 | "| sort | Return a sorted copy of an array. | numpy.sort select kind = ‘quicksort’, ‘mergesort’, ‘heapsort’, ‘stable’ | torch.sort (cannot specify algorithm) |\n", 22 | "| argsort | Returns the indices that would sort an array. | numpy.argsort | torch.argsort |\n", 23 | "| lexsort | Perform an indirect sort on multiple keys. | numpy.lexsort | No direct equivalent (torch.argsort sorts on a single key) |\n", 24 | "| partition | Partially sort an array in-place. | numpy.partition | No direct equivalent (closest: torch.kthvalue, torch.topk) |\n", 25 | "| argpartition | Returns the indices that would partition an array. | numpy.argpartition | torch.topk |\n", 26 | "| msort | Return a copy of an array sorted along the first axis. | numpy.msort | torch.msort |\n", 27 | "| sort_complex | Sort a complex array using the real part first. | numpy.sort_complex | N/A |\n", 28 | "| searchsorted | Find indices where elements should be inserted to maintain order. 
| numpy.searchsorted | torch.searchsorted |\n", 29 | "\n", 30 | "Here is a recent news story about Intel accelerating NumPy quicksort:\n", 31 | "\n", 32 | "**Intel Publishes Blazing Fast AVX-512 Sorting Library, Numpy Switching To It For 10~17x Faster Sorts**\n", 33 | "- https://www.phoronix.com/news/Intel-AVX-512-Quicksort-Numpy\n", 34 | "- Written by Michael Larabel in Intel on 15 February 2023 at 04:00 PM EST.\n", 35 | "\n", 36 | "\n", 37 | "We will only be investigating sort in this module." 38 | ] 39 | }, 40 | { 41 | "cell_type": "code", 42 | "execution_count": null, 43 | "id": "ae5e937f-2e0e-413c-ad26-8d04210f2f5f", 44 | "metadata": {}, 45 | "outputs": [], 46 | "source": [ 47 | "import numpy as np\n", 48 | "import torch\n", 49 | "import time\n", 50 | "BIG = 10_000_000\n", 51 | "np.random.seed(seed=12)\n", 52 | "arr = np.random.rand(BIG)\n", 53 | "orig = arr.copy()" 54 | ] 55 | }, 56 | { 57 | "cell_type": "code", 58 | "execution_count": null, 59 | "id": "34dbff91-9b15-41f7-a74f-94777af61c86", 60 | "metadata": {}, 61 | "outputs": [], 62 | "source": [ 63 | "def heapify(arr, n, i):\n", 64 | "    largest = i \n", 65 | "    l = 2 * i + 1 \n", 66 | "    r = 2 * i + 2 \n", 67 | " \n", 68 | "    if l < n and arr[largest] < arr[l]:\n", 69 | "        largest = l\n", 70 | " \n", 71 | "    if r < n and arr[largest] < arr[r]:\n", 72 | "        largest = r\n", 73 | " \n", 74 | "    if largest != i:\n", 75 | "        arr[i],arr[largest] = arr[largest],arr[i] \n", 76 | "        heapify(arr, n, largest)\n", 77 | "    return\n", 78 | "\n", 79 | "def heapSort(arr):\n", 80 | "    n = len(arr)\n", 81 | " \n", 82 | "    for i in range(n // 2 - 1, -1, -1):\n", 83 | "        heapify(arr, n, i)\n", 84 | " \n", 85 | "    for i in range(n-1, 0, -1):\n", 86 | "        arr[i], arr[0] = arr[0], arr[i] \n", 87 | "        heapify(arr, i, 0)\n", 88 | "    return arr" 89 | ] 90 | }, 91 | { 92 | "cell_type": "markdown", 93 | "id": "5f222785-06ba-42fb-b3e1-75fdc23c8fa6", 94 | "metadata": {}, 95 | "source": [ 96 | "# Heapsort\n", 97 | "\n", 98 | "### The loopy way" 99 | ] 100 | 
}, 101 | { 102 | "cell_type": "code", 103 | "execution_count": null, 104 | "id": "9e6fae24-e8c9-4a8e-96f0-018aa50ef9fa", 105 | "metadata": {}, 106 | "outputs": [], 107 | "source": [ 108 | "np.random.seed(seed=12)\n", 109 | "arr = np.random.rand(BIG)\n", 110 | "timing = {}\n", 111 | "t1 = time.time()\n", 112 | "heapSort(arr)\n", 113 | "t2 = time.time()\n", 114 | "print(\"Sorted array is:\")\n", 115 | "print(arr[:10] )\n", 116 | "timing['heapsort_bruteForce'] = t2 - t1  # exclude the print calls from the measurement\n", 117 | "print('Elapsed time', timing['heapsort_bruteForce'])" 118 | ] 119 | }, 120 | { 121 | "cell_type": "markdown", 122 | "id": "2525893e-31b4-436f-99fe-2ec8ee5beb95", 123 | "metadata": {}, 124 | "source": [ 125 | "### The NumPy way" 126 | ] 127 | }, 128 | { 129 | "cell_type": "code", 130 | "execution_count": null, 131 | "id": "f529fb51-a150-4b36-af1b-fdc2b495f888", 132 | "metadata": {}, 133 | "outputs": [], 134 | "source": [ 135 | "np.random.seed(seed=12)\n", 136 | "arr = np.random.rand(BIG)\n", 137 | "t1 = time.time()\n", 138 | "arr = np.sort(arr, axis=None, kind='heapsort')  # np.sort returns a sorted copy\n", 139 | "t2 = time.time()\n", 140 | "print(\"Sorted array is:\")\n", 141 | "print(arr[:10] )\n", 142 | "timing['heapsort_numpy'] = t2 - t1  # exclude the print calls from the measurement\n", 143 | "print('Elapsed time', timing['heapsort_numpy'])\n", 144 | "print('Numpy Acceleration: {:4.1f} X faster'.format(timing['heapsort_bruteForce']/timing['heapsort_numpy']))" 145 | ] 146 | }, 147 | { 148 | "cell_type": "markdown", 149 | "id": "afd4d5aa-d8c2-4d34-9229-654b0498bc4f", 150 | "metadata": {}, 151 | "source": [ 152 | "### The PyTorch way" 153 | ] 154 | }, 155 | { 156 | "cell_type": "code", 157 | "execution_count": null, 158 | "id": "5f41c4eb-c0c4-4318-85e6-7eab53809d8c", 159 | "metadata": {}, 160 | "outputs": [], 161 | "source": [ 162 | "np.random.seed(seed=12)\n", 163 | "arr = np.random.rand(BIG)\n", 164 | "arr = torch.tensor(arr)\n", 165 | "t1 = time.time()\n", 166 | "arr, _ = torch.sort(arr)  # torch.sort returns (values, indices)\n", 167 | "t2 = time.time()\n", 168 | "print(\"Sorted array is:\")\n", 
169 | "print(arr[:10] )\n", 170 | "timing['heapsort_pytorch'] = t2 - t1  # exclude the print calls from the measurement\n", 171 | "print('Elapsed time', timing['heapsort_pytorch'])\n", 172 | "print('PyTorch Acceleration: {:4.1f} X faster'.format(timing['heapsort_bruteForce']/timing['heapsort_pytorch']))" 173 | ] 174 | }, 175 | { 176 | "cell_type": "markdown", 177 | "id": "5c0cccd1-8532-455c-908f-c0de7dbc2430", 178 | "metadata": {}, 179 | "source": [ 180 | "### Plot the times" 181 | ] 182 | }, 183 | { 184 | "cell_type": "code", 185 | "execution_count": null, 186 | "id": "2be82f0a-e664-4393-a802-f28cd3bc8d4f", 187 | "metadata": {}, 188 | "outputs": [], 189 | "source": [ 190 | "%matplotlib inline\n", 191 | "import matplotlib.pyplot as plt\n", 192 | "plt.figure(figsize=(10,6))\n", 193 | "plt.title(\"Measure acceleration of looping versus PyTorch/NumPy [Lower is better]\",fontsize=12)\n", 194 | "plt.ylabel(\"Time in seconds\",fontsize=12)\n", 195 | "plt.yscale('log')\n", 196 | "plt.xlabel(\"Various types of operations\",fontsize=14)\n", 197 | "plt.grid(True)\n", 198 | "plt.bar(x = list(timing.keys()), height= list(timing.values()), align='center',tick_label=list(timing.keys()))" 199 | ] 200 | }, 201 | { 202 | "cell_type": "markdown", 203 | "id": "c4a92e63-d28c-4e92-98c9-d19b7b1e70d5", 204 | "metadata": {}, 205 | "source": [ 206 | "# Quicksort\n", 207 | "\n", 208 | "### The loopy way" 209 | ] 210 | }, 211 | { 212 | "cell_type": "code", 213 | "execution_count": null, 214 | "id": "a43bb20a-7f3e-45fe-8f31-1bc59e8bdca1", 215 | "metadata": {}, 216 | "outputs": [], 217 | "source": [ 218 | "def quickSort(arr, low, high):\n", 219 | "    if low < high:\n", 220 | "        pivotIndex = partition(arr, low, high)\n", 221 | "        quickSort(arr, low, pivotIndex - 1)\n", 222 | "        quickSort(arr, pivotIndex + 1, high)\n", 223 | "\n", 224 | "def partition(arr, low, high):\n", 225 | "    pivot = arr[high]\n", 226 | "    i = low - 1 # Index of smaller element\n", 227 | "    for j in range(low, high):\n", 228 | "        # If current element is smaller than or 
equal to pivot\n", 229 | " if arr[j] <= pivot:\n", 230 | " i += 1\n", 231 | " arr[i], arr[j] = arr[j], arr[i]\n", 232 | " arr[i+1], arr[high] = arr[high], arr[i+1]\n", 233 | " return i + 1\n" 234 | ] 235 | }, 236 | { 237 | "cell_type": "markdown", 238 | "id": "f44c1a6e-8770-44e1-9a05-08d216e75f39", 239 | "metadata": {}, 240 | "source": [ 241 | "### The Loopy way" 242 | ] 243 | }, 244 | { 245 | "cell_type": "code", 246 | "execution_count": null, 247 | "id": "54be80bc-36d7-472a-b59f-31343465ae02", 248 | "metadata": {}, 249 | "outputs": [], 250 | "source": [ 251 | "import time\n", 252 | "np.random.seed(seed=12)\n", 253 | "arr = np.random.rand(BIG)\n", 254 | "timing = {}\n", 255 | "t1 = time.time()\n", 256 | "quickSort(arr, 0, len(arr)-1)\n", 257 | "t2 = time.time()\n", 258 | "print(\"Sorted array is:\")\n", 259 | "timing['quicksort_bruteForce'] = time.time() - t1\n", 260 | "print('Elapsed time', timing['quicksort_bruteForce'])" 261 | ] 262 | }, 263 | { 264 | "cell_type": "markdown", 265 | "id": "7ec9a126-b0db-4a92-9a4d-1c169b4d158a", 266 | "metadata": {}, 267 | "source": [ 268 | "### The NumPy way" 269 | ] 270 | }, 271 | { 272 | "cell_type": "code", 273 | "execution_count": null, 274 | "id": "7179a7ef-cb42-4064-bfd8-05756e56959f", 275 | "metadata": {}, 276 | "outputs": [], 277 | "source": [ 278 | "np.random.seed(seed=12)\n", 279 | "arr = np.random.rand(BIG)\n", 280 | "t1 = time.time()\n", 281 | "np.sort(arr, axis=None, kind='quicksort') \n", 282 | "t2 = time.time()\n", 283 | "print(\"Sorted array is:\")\n", 284 | "timing['quicksort_numpy'] = time.time() - t1\n", 285 | "print('Elapsed time', timing['quicksort_numpy'])\n", 286 | "print('Numpy Acceleration: {:4.1f} X faster'.format(timing['quicksort_bruteForce']/timing['quicksort_numpy']))" 287 | ] 288 | }, 289 | { 290 | "cell_type": "markdown", 291 | "id": "a690d426-87a2-434e-9a5c-a9bf12eb1a07", 292 | "metadata": {}, 293 | "source": [ 294 | "### The PyTorch way" 295 | ] 296 | }, 297 | { 298 | "cell_type": "code", 299 | 
"execution_count": null, 300 | "id": "41bc5784-5b40-48f1-b604-dccd84546051", 301 | "metadata": {}, 302 | "outputs": [], 303 | "source": [ 304 | "np.random.seed(seed=12)\n", 305 | "arr = np.random.rand(BIG)\n", 306 | "arr = torch.tensor(arr)\n", 307 | "t1 = time.time()\n", 308 | "torch.sort(arr) \n", 309 | "t2 = time.time()\n", 310 | "print(\"Sorted array is:\")\n", 311 | "print(arr[:10] )\n", 312 | "timing['sort_pytorch'] = time.time() - t1\n", 313 | "print('Elapsed time', timing['sort_pytorch'])\n", 314 | "print('Numpy Acceleration: {:4.1f} X faster'.format(timing['quicksort_bruteForce']/timing['sort_pytorch']))" 315 | ] 316 | }, 317 | { 318 | "cell_type": "markdown", 319 | "id": "014690d1-39b9-42ee-9f20-99a4780c9bee", 320 | "metadata": {}, 321 | "source": [ 322 | "### Plot the times" 323 | ] 324 | }, 325 | { 326 | "cell_type": "code", 327 | "execution_count": null, 328 | "id": "402ded87-4f41-442d-bb5c-c586a359a50e", 329 | "metadata": {}, 330 | "outputs": [], 331 | "source": [ 332 | "%matplotlib inline\n", 333 | "import matplotlib.pyplot as plt\n", 334 | "plt.figure(figsize=(10,6))\n", 335 | "plt.title(\"Measure acceleration of looping versus PyTorch/NumPy [Lower is better]\",fontsize=12)\n", 336 | "plt.ylabel(\"Time in seconds\",fontsize=12)\n", 337 | "plt.yscale('log')\n", 338 | "plt.xlabel(\"Various types of operations\",fontsize=14)\n", 339 | "plt.grid(True)\n", 340 | "plt.bar(x = list(timing.keys()), height= list(timing.values()), align='center',tick_label=list(timing.keys()))" 341 | ] 342 | }, 343 | { 344 | "cell_type": "markdown", 345 | "id": "73b3ad9a-50dd-4aa1-908a-ebb083cd84d8", 346 | "metadata": {}, 347 | "source": [ 348 | "# Mergesort\n", 349 | "\n", 350 | "### The Loopy way" 351 | ] 352 | }, 353 | { 354 | "cell_type": "code", 355 | "execution_count": null, 356 | "id": "59ce64dd-770f-4853-87f7-e69844870eb7", 357 | "metadata": {}, 358 | "outputs": [], 359 | "source": [ 360 | "def mergeSort(arr):\n", 361 | " if len(arr) > 1:\n", 362 | " mid = len(arr) // 
2\n", 363 | "        leftHalf = arr[:mid].copy()  # copy: NumPy slices are views into arr\n", 364 | "        rightHalf = arr[mid:].copy()\n", 365 | "\n", 366 | "        mergeSort(leftHalf)\n", 367 | "        mergeSort(rightHalf)\n", 368 | "\n", 369 | "        i = j = k = 0\n", 370 | "\n", 371 | "        while i < len(leftHalf) and j < len(rightHalf):\n", 372 | "            if leftHalf[i] < rightHalf[j]:\n", 373 | "                arr[k] = leftHalf[i]\n", 374 | "                i += 1\n", 375 | "            else:\n", 376 | "                arr[k] = rightHalf[j]\n", 377 | "                j += 1\n", 378 | "            k += 1\n", 379 | "\n", 380 | "        while i < len(leftHalf):\n", 381 | "            arr[k] = leftHalf[i]\n", 382 | "            i += 1\n", 383 | "            k += 1\n", 384 | "\n", 385 | "        while j < len(rightHalf):\n", 386 | "            arr[k] = rightHalf[j]\n", 387 | "            j += 1\n", 388 | "            k += 1" 389 | ] 390 | }, 391 | { 392 | "cell_type": "markdown", 393 | "id": "393b20da-491d-42d3-bba0-cbc13d4d8db0", 394 | "metadata": {}, 395 | "source": [ 396 | "### The Loopy way" 397 | ] 398 | }, 399 | { 400 | "cell_type": "code", 401 | "execution_count": null, 402 | "id": "c3b0a50d-a3f1-4ea9-a80e-ed6bdf1d3669", 403 | "metadata": {}, 404 | "outputs": [], 405 | "source": [ 406 | "import time\n", 407 | "np.random.seed(seed=12)\n", 408 | "arr = np.random.rand(BIG)\n", 409 | "timing = {}\n", 410 | "t1 = time.time()\n", 411 | "mergeSort(arr)\n", 412 | "t2 = time.time()\n", 413 | "print(\"Sorted array is:\", arr[:10])\n", 414 | "timing['mergesort_bruteForce'] = t2 - t1\n", 415 | "print('Elapsed time', timing['mergesort_bruteForce'])" 416 | ] 417 | }, 418 | { 419 | "cell_type": "markdown", 420 | "id": "679d425a-ac4a-4838-a55d-1bd85c51d0bd", 421 | "metadata": {}, 422 | "source": [ 423 | "### The NumPy way" 424 | ] 425 | }, 426 | { 427 | "cell_type": "code", 428 | "execution_count": null, 429 | "id": "8c8563ba-5565-4c7f-ba03-18f21c8f6d36", 430 | "metadata": {}, 431 | "outputs": [], 432 | "source": [ 433 | "np.random.seed(seed=12)\n", 434 | "arr = np.random.rand(BIG)\n", 435 | "t1 = time.time()\n", 436 | "np.sort(arr, axis=None, kind='mergesort')\n", 437 | "t2 = time.time()\n", 438 | "print(\"Sorted 
array is:\")\n", 439 | "timing['mergesort_numpy'] = t2 - t1\n", 440 | "print('Elapsed time', timing['mergesort_numpy'])\n", 441 | "print('Numpy Acceleration: {:4.1f} X faster'.format(timing['mergesort_bruteForce']/timing['mergesort_numpy']))" 442 | ] 443 | }, 444 | { 445 | "cell_type": "markdown", 446 | "id": "07017490-2086-479a-8757-95dbb703af7a", 447 | "metadata": {}, 448 | "source": [ 449 | "### The PyTorch way" 450 | ] 451 | }, 452 | { 453 | "cell_type": "code", 454 | "execution_count": null, 455 | "id": "a8cf1045-1953-4689-b155-d9b62abbfdfe", 456 | "metadata": {}, 457 | "outputs": [], 458 | "source": [ 459 | "np.random.seed(seed=12)\n", 460 | "arr = np.random.rand(BIG)\n", 461 | "arr = torch.tensor(arr)\n", 462 | "t1 = time.time()\n", 463 | "sorted_arr, _ = torch.sort(arr)\n", 464 | "t2 = time.time()\n", 465 | "print(\"Sorted array is:\")\n", 466 | "print(sorted_arr[:10])\n", 467 | "timing['sort_pytorch'] = t2 - t1\n", 468 | "print('Elapsed time', timing['sort_pytorch'])\n", 469 | "print('PyTorch Acceleration: {:4.1f} X faster'.format(timing['mergesort_bruteForce']/timing['sort_pytorch']))" 470 | ] 471 | }, 472 | { 473 | "cell_type": "markdown", 474 | "id": "b3943b0e-879b-4ce8-98eb-38e007c62106", 475 | "metadata": {}, 476 | "source": [ 477 | "### Plot the times" 478 | ] 479 | }, 480 | { 481 | "cell_type": "code", 482 | "execution_count": null, 483 | "id": "5dd7a242-6627-4d77-8da7-8c2d749d4e64", 484 | "metadata": {}, 485 | "outputs": [], 486 | "source": [ 487 | "%matplotlib inline\n", 488 | "import matplotlib.pyplot as plt\n", 489 | "plt.figure(figsize=(10,6))\n", 490 | "plt.title(\"Measure acceleration of looping versus PyTorch/NumPy [Lower is better]\",fontsize=12)\n", 491 | "plt.ylabel(\"Time in seconds\",fontsize=12)\n", 492 | "plt.yscale('log')\n", 493 | "\n", 494 | "plt.xlabel(\"Various types of operations\",fontsize=14)\n", 495 | "plt.grid(True)\n", 496 | "plt.bar(x = list(timing.keys()), height= list(timing.values()), 
align='center',tick_label=list(timing.keys()))" 497 | ] 498 | }, 499 | { 500 | "cell_type": "markdown", 501 | "id": "a6241907-d82c-46ba-9a0b-861459d96eb4", 502 | "metadata": {}, 503 | "source": [ 504 | "# Notices and Disclaimers\n", 505 | "\n", 506 | "Intel technologies may require enabled hardware, software or service activation.\n", 507 | "No product or component can be absolutely secure. \n", 508 | "\n", 509 | "Your costs and results may vary. \n", 510 | "\n", 511 | "© Intel Corporation. Intel, the Intel logo, and other Intel marks are trademarks of Intel Corporation or its subsidiaries. Other names and brands may be claimed as the property of others. " 512 | ] 513 | } 514 | ], 515 | "metadata": { 516 | "kernelspec": { 517 | "display_name": "pytorch-gpu", 518 | "language": "python", 519 | "name": "pytorch-gpu" 520 | }, 521 | "language_info": { 522 | "codemirror_mode": { 523 | "name": "ipython", 524 | "version": 3 525 | }, 526 | "file_extension": ".py", 527 | "mimetype": "text/x-python", 528 | "name": "python", 529 | "nbconvert_exporter": "python", 530 | "pygments_lexer": "ipython3", 531 | "version": "3.9.16" 532 | } 533 | }, 534 | "nbformat": 4, 535 | "nbformat_minor": 5 536 | } 537 | -------------------------------------------------------------------------------- /07_Numpy_Select_Pandas.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "attachments": { 5 | "4830f075-ff79-4f43-bc5a-38e0d17d65bc.png": { 6 | "image/png": 
"iVBORw0KGgoAAAANSUhEUgAAAY8AAADPCAIAAACC18poAAAgAElEQVR4nOy9eZBl2V3n9/2de+/b38s9syozK2vrWrqrWuqWelNLQoDQIJZwCCOHAcEYPCwOxvYEYcaGsCNMOMIEM8bW2GYCDMNgYcuBh4EAY2EYJCG0oG51S63urlbXlrXmnvn27W7n/PzHvffly7dkZWZlVi51Pv2i+uW9591736263/yd3/ktxMzQaDSaA4/Y7wvQaDSaLWHu9wVoNJrNUIodz8+Xa8ulWrHarNue7fq249mebNhu03N9TzEjHrPiMRE3jWQ8loibMdNMxq103JoYzk6O5LKpOBHt91d5WLRaaTQHFMf1qw0nX2qulmqLpfJSoVasNRuu5zh+0/VsVzZt1/Y96SulOGYZiZiZiJnxWCwRM2OmkYiZmURsfCh7+vjQyYmhwUwym4yZprHfX2vnkPZbaTQHDaW41rRnF/NvzS5dv7e2UqxVHMdzfAmWCgAzK98HwFKp1qeEIEGCCIYgEBlERGQZIpdJTo7knj498b5zU9Pjg4nYYbVRtFppNAcLZl4r1b954/7Xrty9PrdWqNme6/lKAWQIAoOIQARmQcRgAgEI3qjwcQ63AMwKRDANMTmSe/bc1Iffc/rJk+PpROwwTgwPq8pqNEeVal2+Obv8xW/PvnN7udZwmWGaIm2ZSrFUShELIoAYzGCEqoTgTSRBjECxQDAAhlR8f6VUqtlNx7dMceHEWDIe25+v9xBotdJoDhDMvLTSfO2dpev3V+uOR0IMpRIzE4OjA6lKw1ku1wrlhuP6asOMqMNK6pwtkRDEikGVpv2tG3PZVCwVi52dGjGMQxYSoNVKozlAOJ5cLBXn8mvVuit9NZhNfuDyzPc8c3ZyZKDcdN+aXXj1O3dvzq81HD/6RPeEjgCOtvO6t4dATIVq8/Vr81OjgxPD2YFM4lF9rd1Bq5VGc4BwPLdsV+pu05NsGsaJsYEPXj75nrPHEzFrGsgmY/WmM58vVxuOEC096gl3vYEQwpdypVS7s1wo1pq59CELazhkpqBGc7RxfdWwfV8yANOg8aHMseFszAytitFccmo0l7CsHa+NEZFUqlRtlqpNKdWDP3CQ0LaVRnOAUJJtx/OlBJgZTdtrOJ5iFiAAjuvXHU8q1WYT9TSOuPd2Cj3zddutNR1fKROHKfxKq5VGc4BQzFKxUiCCp9S9ldLbt5aGMqnRgXTT8d69s/zu3eW67QoBAESiIwKJguVCVtGPgqHACMWNiIiUgutJz5fqsAUvabXSaA4MzAAUI1QXxWvl+t+9eathe9NjA8VK4917qzfn12zXJxItSQogCj7dpUDcOjDADAKYFbNaj3w4NGi10mgODBt93kQkFc/O5/OlRjoZbzhOo+m5UjIzgdqkJpj3CaIwzv2Bp+kzfzzoaLXSaA4MzMysWCklWTERFMOXarlcQ7mGQGOIDCIwKzBBthxYRO06FQWJsgp2BaHuYAYzxCEVK61WGs3BgYhABgnLMCzTICKATWEAYGYiirznwayPIluMGcSsiFqJdO3mFbXCrcJdfPjmgAFarTSaA0Q8Zp0YH3jmialivSmIBLVS/3ogQJJV0/HWyvW1St12/F7hU6HCMTMJgZ6+rUOCViuNZp9hZmYQgYiScfP8zNjIQNrxfHrghI3AipcK1W/Pzlebru340Srhhlj2cCwRKxX6rKg9erTNHKMDHYCp1UqjeaQws1RKMVzXL9ebtucH8sKBoDD7UiViZtwyHuguNwQpRrXhWIZphLELHOlQ54ejuSRHcQ4EMJw83DwgwQQAQjAEkIKZIivJwqKDpF9arTSaRwQzN11/MV+ZXy0Xqo1CpZGvNJqez6olMUQEIYSgIGbqAbZVYCFV685SoWq7PgkCd4en88ZPAAAH8aXsofItXv0yKRtkKGYhiJmgEkiMUeYUMuc4NU1mcpduwMOi1UqjeRS4vlwt1a/fX3nj5sKNubV8uV5rup5
UTAwORSYwj2h9qvbgpTsCVJC6rJiieKpoD/odhIiISECiPktrX4JsQhiCGQQBATKVSKjiOA28B2PfhdxlmFkcgIxCrVYazZ5ju/7NhfzfX7n9rWvz82uVhusppZTiqK5euMjXRhSMvvE47REKG0wmErSuVdS+vRXWwK2PEoGiJUL2wQ64CY4BFFT0g3Ihm8KvwFthdwXSxtALsDK7cy8eAq1WGs3e4nr+3YXCl7518+vv3l0sVJUMauZR+CcghGBWwU9RhDoBTCTAG4I9A088EGiPCsQoGhAEOHSwLmuEDeLHiAIbmEAGhAEVRm0pFiAwJLlFKr0OiiE2itxTEPssFwfIhabRHD2kVKuF+qvv3H/96txSodaSKoADu0oIwVCtWp+dEK2v4rXZTO2DKYxPCA/ycH5xAkgEyc9MIAFZR/VtFF+Bs/YQh90dtFppNHtI0/PenVv+xs17C4WyVEqI9SeOSICg1CYBVTtgU+9SYLdxMF2kNs0LzCwGVJDSE1hwKogjdfMofAON+1D7XGFGq5VGs1cwc6nS+M6dldtLJddXFKbCtNtIj47ABAtiJSh0cEkQQxBAUL0vRxFBOWjcQe0qZP0RXm8PtFppNHuF7XoL+eqdpWLd8YDWHC1ye/MGq4oZbWUVOBrQHTalmGWwnbviFZiZWbadQrYfP8zdEbBipikAvwq/AlaAwSClQjVTHKYcBhaXYkA20LjNzhKUxP6h1Uqj2ROY2fXk/FqlUGlyZ5XOHmWI+2zsPOoWtrS2d4SJhl51ISgZi8dMwC3CKwV6xB0faj8uEVjBXYKz0lqs3Be0Wmk0ewIR2a63WqpVbSdYy2szhXpW/uze2LNDxBa3UNub1nYGsyFoIJPIJi3ySvBqYBWkR4uNvvz1t0wAw86TVwJr20qjOXIws2RUGk7T8XbqoNqKJdVvSw/bKki+ScetscFUNgnhr8GvgkAiWFbs/S1AACvIKvz6/mZE63grjWZPYGbPl9WGbbtuGLjZtrPrTc+9/QZscUznAGYkYrHzU6NPTQ8OW3VRuw2vyCpsmtpRnaH9EhUk+TX4dQRxYfuEtq00mr2ibruFmu3td2sZiiKyBNHU+OCzT82cGo8l/FlRfRduiYI6pMwdcV0bZ5JMskn2MpS9b19D21YazR7hKy5UG+W6LRWsmMGqNS/bLINv4wBE5UKFClfsEJYyprZcnY4MPu4yu5iYkEklT08MPf/UiadPD6UxJwvfsJq32W8G4fEdlp+MDBkZJBVCgH24K+RVEBvZ6S15WLRaaTR7gut65brtOJ4Ia4J2RiMEpUE3ak5n8U8OTB6CaRqGCBxIxEp15C63E2wXREGJUSKYhhhIJ8+dGP/Q06eePj2UE0ti5esofgtulcOEHBHYV+vX1krZaUXZM+DX4BaROrlfZbC0Wmk0uw8zO56s1Gx/Q++/bUEEmKZIJ+O5dCKXTuXS8bghHlyiL/x0YBNRIm6MDKSnRrLTw/GpQZGlWeT/HqWvwlmR7AOiy7LqfznKg1+B8mDEd/SNHhatVhrN7sPMnpRN1w8LLTyIQDI4TG1hBltCDObSMxNDp4+PHBtKDmfiA6lYzKAo+Xkzz3qULg0iJGJiMGXm4sr016h2lYuvUfUK3DUGA4LXzahNXfUCkAR24dfAHqDVSqM5OhAYpgiLs/SODog2h5M2FfTXAoHSCevkxNDTZ48/+8TkqfFsxnIsWRGqyOxF/qr+nSAotKsYzEqSrJNdNsqr1LiHxm1qLrBsMDORIOoqVNN+be3TzGC9kCWUu49BDFqtNJrdRwhKxKxUImaaQknFQrS1Jl3PvIn6O6C95mc2FX/6zLHveu/ZSzNDo4lqzH1TlG6jOQ+/wuwFyTPM3FdoiCiq8wCWkDZ7VXhl5VVJ2iClIEDE4YnBXUEJijdo4XrLHDIg4vsYSKDVSqPZE+JxM5OMtxVd2Eo+DcUt4+LJ8Y89d+7ZM8MDmEPxFS68pup34VcEe1GlULRMnw4TK3B
BKUTFY0K7jQAFMIgUi2hk24n7XFp0JiYmQLFIsjUgRGw7t2E30Wql0ewJliEGs4lk3EQUdLBxHW9j7U8K4qEwNT744lMnL50czqlbWPwLLr1GTp6UD2JESdCCAIQ/iY12UVCQT3FUcSGISqX1OCrRWjRsd1Z1HIQ3hFEIIgqCJkQM1iCTuV8BolqtNJo9QRANZVJDmWSv8njdseyCgGwqcfn0scunRgdogde+qPJfJWcNECREUJAqmC6qsJofgLBzTftxw5CDqEHqhpOga2S3eUXrA8JPMwNMZLCRJmM/C7RrtdJo9gQiSsWtwUwyZgq13jgL3eJBREopyzQnRwcuzQwfy9hG6VVReAVuSZHBEB0rgO0/qI6UnvUQ0vUfqWtvx0E64km7SjcEjqsYx4/DzOxj5o1WK41mTyAiw6BsMpaIGXWn1cq0Q6rC5BbFyjLNqdHBqZFUUq2I2tvcXAQxrU/dAIA4rNjSWs0T6FQjbkua4a5y7MEsj9tmgh0DOg7SGklGlpKT2NduXVqtNJo9gRmWYWTT8UQ8Vre9oGI6t03Pwh4QAMCK2TBoJJfKxg3hrcHJs3IAEoFotAWaEyiMkmqV19tQdYYMxnr3m2CiuLEsDYVzyCCuK5indrjqocKZZBAKocAmYjnEBkDGnt2wB6OzmjWaPYEIlmEMZ1KZhMUbYwLax4QmjFIxUwznkukEyCuRbFLL6qEu2yfa0mNS1j64o7zVhq29fuhi3ZclCPExxEf2t/W8ViuNZq9IJKzpsYHhbJI73UcbYEAxxWOx0cFMOm6QrLFy2sKg+odW9TvcbsIKDBFDYgbxyf1t0qXVSqPZK2KGmBzNzUwMJWJWW9n1TjlhQDGbgjLJRNwUgn1WfuQ6elBOzF7DLMhCYhS5i4iPPMga21u0Wmk0e4UQYiibenJmYnpsMMrw29BIogVz0Ow9iJ96uIp32/hwr6FMrVIQYZUIiiN9AeknIPbTxQ6tVhrNnpKwzPMzY8+cO5ZNxxS38pYBhC1tSKy3WI5amAJhZl60gNf24qBVjmJu9ftrfykO+xNGg6E6jsDBETg09oI34V6lEHTUaV0PGJQ8jsHnkZraV7sK0Gql0ewphiEmR7IvPTlz+dTxdNwMpGRDvGhLWsL3aJXdU11Spdo/AQCQ3cNaH28dqkPvoo3rcaFdnyWCgAIrER/G4Psx+H6Y2T27SVtFRzBoNHtLImadnx772HPnTEO8fWupXA9qH0fxAVAsWSqlWAV1Y4iJCBzl+LUjABkV0Asip6K0GG4540XURav3kmDbMmP3rg05jWSKxAiGXsb4R5E4tr+rgQFarTSaPWcgnXj/+elcOnF8NPf2rcWlfM12PBUu9SmCYEIiZolwVsjMUIzuynuteSKCiM2w3IKAMEEiiBlVkYG2PgwbPsVt/+eNshWcFEIwJSgxhaHnMfoRpM/uV/m9DrRaaTR7CzOYOZ2MXTo1cXwk+75zUwtr1Xyl3nQ9X8qgYbPjyePDuZFcUlADUfmYIKumO3qBAdlybymmeA7J00idVEYGUB0f6F5TFETMisCKZcdgAkEYMNKUPMaJE5yYosQYCWs3b8dDoNVKo9krmOH6fqnWrDYcQTScTR0bzI0NZC6fkrbnNR1PRX4sqVTCMoeyacg6mARD9pmsrcfCtwaYg8heUAPPU3yMoLitnTJ3HYSD2HYyQBYFFlmbYBEDwoSZIiPJZBGJXinZ+4ZWK41m9/GldFy/2vTuLBXeubN0d6kkCC9cnPnA5ZND2aRpiFTCQjZYoAssKBKCmMESRAyCaKtN1Y7kMDdwfVaoJLy68IogBRVYXSR69oAP3GQgsgaQmKLYCAwT6MiBFmgV8ztgaLXSaHYNpZQvue66q4XqnaXijfnC1fsrC2uVcs0WhLrtTYxkn01PiqgqVeR4aqvTwhxIReCW6s43DscLQIaduuDmUf4mmjdBFjjyslNYrG9DfRcisCIIJE9j/GOIDYWO9Y5knYOKViuNZid
093HwpCzVmndXylfnV+4ulO4tFhfylWrDDvRFMc8u5W/MrZ2fHs2m+jqttyUXYat41YBTR/N+dID1PJ+oamigU8FPkslg6YnB97eXVz4UaLXSaLaNlKrhuI7nM8MQwjRF03HnVovX7+ev3F6+OVco1pu+VAwWJCgq+lKvu7cW8kuFSjo5Kna3pp0ChIACBEX2FIeRDVFjQQQV3UFB3DzEAXJIbRGtVhrNNlDMDdtbzJfvr5bz5YZiTlimZRqr5drVeyt3lguliu16UjKDSBAxMYEUGARFXG3a1YYjJYtNygUHnqfNr4OBsBcgg1o9TAlM3Jr+kamMhDBSAEE2oOpQPiAIgsM2EocMrVYazVaRSuUrjbdnF791Y24hX03GY5mk5biqXG+uluv5cqPpeAwlRJAUGFavUsxEghUsQwxkktlUfHOzhgEK4hCYw1pU7XuZJWBE79frVYUf5dDpJZLInKbse5A8AfZRv4nyG2jOgeV6gfjDhlYrjWZL1G1vMV95c3b+y2/evr1UGB/MPH3m+Mz4wFK++rUrd5cLFamUEISN1T4DmJUhxPhA5sL02LHhrNFHrsLaemSooPZCECC6MSK09S5MkenWnSAWPjHBwx+k8e9FYgbsoT4LIeAV2SkyEYme0REHHa1WGs0DkFKtlRtX76188/r8W7cX7y8V4pb1xOTIBy+fPDkxtFauLRWqt5fWZP/SCUSUSyeePDXx5MmJTHKzuHAmg0SKhNXT/glOILq2tP1AFHSZiI9R7jKSMxAWYCF9GpmLKL1BbhEAGUkY8YOQTLMttFppNH1hZsf176+WXr82//q1uVuLhXK96fgyHo+BiJmUglJBCku7tRLaPS3FScSts5MjLz45MzM+uHmjeWGmEB+GEScSwYc754IcbiTu3Bt2hVZMBoFiLJK0rmwGKAaYYW6imUNsAGI/yxbvAK1WGk1vXF+ules359a+eX3+zVuLC2tlTyowW4bwfP/m/NrX3r5zezS7mK/cWMhHvf7aDSImImY2BE2N5F6+fOrSqYlU4kFZLEaK4+NkDSkywDJ0TEWi1CquQK3ONNyukQwQgjxnv8T1W0gcQ3wMYPLW0LgLrwwophhio2QNH7rH/5BdrkbzCHB9v1J37i4X37y5eOX20q3FQqVhK2ZBgsFE5Ct1b7nUsG+kErFqwyk3mkqG2tRuUilmQTSSzTx3Yfr5C9MjuS1UsxMGJac5dwmN+8JZCrKb0d9sE+0/BPtJMBjNBZH/CmQT2QsgQvUdlF6DuwphIT2DzHlYQ/vYGXBnaLXSaEKY2fPlarl+d7l4e7F49d7q9furxVrT9SWYgwIH0VC4vlwqVgWEYgYxQMHeVu+sYFgibl08Of7CxRPHhjNiiyFOsTEa+TC5ZeT/jrwCcdB6maKKnjBAEExh1GcU2s4UtJQPezj7JVS+DXsBpeMAo3kX9gLASJ2msY/ywDMw07t57x4JWq00mhCpeG61/NUrd755bX61VKs0nLrtBgtzbe6h0JIRQjCzHyhUkJgXiRSRYFYAmYaYGMo8+8Tk2alRy9iyS9uIIXsBygUkCq/CXSP2w/VCap8RtogCrILoBWZBBEjIChoN2PfBEsqDsJA+i9HvxuhHKDl56Fzs0Gql0bTwpZxbLX/r+vyV20u+VEEzPyFEW4/2doIIzLAWJ6v2mWA4OJuOn58euTAzluufatMbI4ncJZAFawyVN1C/Ba8MyCDZGcRgEcZ3UnuhKgAwAMUy8ri7pBhkIjGBzEUMv4TB55Ca2d/WNTvmUF60RrMXSMU12603HW5VrAvLd3bYMhR2FQVvLFbQ4UBCLhU/Nz06MbSjOZeZxsAlxEeRO4/qO6hch7sCvw5Vh3KgJLFiqCifBmAJYkjFBKYYRAIiDiMJawDJaWQuIHcJmbOwhva3g+nDoNVKowkhwBTCMIyg0wJFT3WvwO/+8eCtPcwx0xjKppLxnVazEzEkpxAbQfYpDC1w8y6cZXKL8EvwVuAVyXO
hfLAfLAWCYkgkKTZA1gCsQRhZWENITiJ1EvFxWFlQ7NB51tvRaqXRhBhC5NKxXCZmGYZSLIIO7utdHdAqhMcsu+eGgZedFUdOJG663kqx3nTlpgGhm0ICZoqNJMXHkDnLsgnlkbLBebh5OA04FcgaWLGRhDlAiTHERoWZBJmMGMw4jBSMFJOxeZzXoUCrlUYTYplicmzgqdMTtitrDafhONWGY7uSN0zv+tE2iBUAQaJYbV69t/rM2ePD2R124nM9v9p0pEQmYSWspBAWyGCyiM4ADOmw3yRpKygScRhpGHEIK0rhAYADW1pvB2i10mhChBATg9kPXT5zZmK4VvfmVhpv3lieXV502IkWBduLDHdAG00wBth2/NuLa7Pzq6eODSUTse1eT8N2r82tvHN7NW5Zl2YGp7PFLC+S9Dg2QonjMHMQBhkpmGkCiAWIwRK+B5asPFINKA+xEVgDh3oC2EKrlUazTtwyT44NTQ1lfaVKZU4buUqzvlBZY0Ws2hsydAtWRxS7YFZS+cW6fX0+f+lM7dTx4W0Jhu+r6/Nrf/nKtbdvLU2ODadj7rB3K+O/QvV7QiSRPKHiE8LMskiSmSQRAwmwZOkRO/CrcFbZLSI+ShMfh5U7GtaVViuNZgOWKSwzBiAVx8WzubfmMmu1oiO7y2z2mxUGPbaCSFGj6fhX5/K3FwsnxgZNcxshTqvl2hs35t+aXVgo1OOJRLnW8LMN9te4cYeVpPpNYaTZTBLFmGJkxMAAS4JUyie/TrICVmrgGcjmURAqAFqtNJp+GAZGhszx4WRs3rQ9B2HYJ/eZCaI7mRnMnq8WVku3FgvvPz89kEls8dRKqeVi7eq91VLNJsVKsecrMhJQKYIgIZW0WbpwCxREjAata5hhkGBAKbBUZkoYCRixo2FYQXeW12j6whyzjIF0ImFaUSPkACIS3UtsQTMrIopCH4iEIQhNx59fK69V6ts5NzmuX2+6vlJM8KVyfQkRJyMBQQif2yjHmRiQIAXBYMVKBVGtIAMiHiTqHA20Wmk0fSAyyUhYpmUYUUGY7T33FHbZ4nLdzpcbnie3/EnELCMVtwxDCCLps+dLphiLeNjQRgS1rKjTfd5qogMIMpWwQOYhrRTajVYrjaY3zDBMEY9bhrmhRei2jsAMgOq2V6g2XOlv/bOWKZKJmCkEA75Uni8VE9F2XDckIOIQ5tFYEIRWK42mH0QwBMUt0whLA2/rswKAUjKozV633WK16fpbta0IMA1hmSLIBnSkdDyfyYAwmQhqS521mAgiDmFp20qjOfpYppGKm5ZpsFJdi4BbiRQlAKxQazhr5ZqUW5UNIhIkBBERMSvfZ9fzAfFg2ypKcASBWBAswOhVvP1QotcENZq+mELELbMt8iDKAOzVNzTMvOFWWFbgQSIGGraTrzQd19vieQPNCew5QeRL1XQ9BcEwggCJsMwxM3UmLHJwRgIDBhlxCJP5iMwFtW2l0fTFEJSImWboZUfUAqv1puOFaBfaxzAr21drpVq+2uzu8NwbZs9XtitZsSDyJXueYkUgkwEmIcNCNZDc+Qp0lCUrAos4wzgaUgWtVhrNJpimEbdMs3fNT9r4at++/p7CCR0X6s0bc2vlur2V80rFxXozX667UjLBlb7j+YoEkylIiLbmEAIwNr7WL0iYEDESOt5Ko3k8iJmmYbSvCbYnDLbTLVjrtdQFUblmv3tvZSFfUVswrxzfv5cvrZTrnq8ECcnKVVKxAZhEBFZBf5sgX5k3viiYjTJABhuxw1gjtB9H55toNHtBJhXPJmNEpDZ0iNh8Jrg+JkzBATmud3M+f2NurWG7Dzxpud5cKdQqDTtITiRm6bPHwmdiZpBQ0UxQcRgn0XoFM0EFBkMIi3FYa+91o9VKo9mMVNxKJWMgKNVyrnfPAfvNBNHyb/uSlwvVt28t3V8pBS0I+1G33dm5tbmlsufJoOEWQUkWrrJcaQYyJaIziSgGNXwJiPCySBABgrRtpdE
8JihJUMI0RNTMJki7CaILIEhQEKbAFMaCRr0eKEKIMPfFcb0bc6tvzS6Wan3d7a7v314qvHJ17vZSQSkWQgAkiCWLpmc1fUsCQShVdIqNahVVsyLCEVkIbEOrlUazGdWaqtfZEGSI7mqhUKxAJISwLJGImamEFY+ZgkTUUSIKRmAWghhYKda/cXX+27OL1YbTcTCluNawr99f+9qVu2/OLpZrtgqDIZgAn0VDWk1lyU2KLHdC61dwJNDxVhpNX5RSxVqj0mj6UoEoqMHArILpFUetkeMxczibODaUjZnGSrkxv1a2HRWGhjJT9EGAmq537d5yKmFmk7Fnnpi0TAOAYm7Y3lKhcv3+6pu3Fq/eW10t1uV6E0FFADN5inw2QaKzeUUfmCWxQq/QsEOKViuNpi8Nx1utFitO1ZeKgnW4XsRMMZpLT47mwKg5XscopZQQIsgaJELVdr5zZ3lsMJ2MW5PDOU/JtXLjznLp2r2Va/dWFtYqTdeXrdB5ChvrgCElKRZEFoKQUH6AXkUxokdnPqjVSqPpS7Xh3FstFKsNbOoFUoprtnN/pVSpO2vVRpQPuK4UgYUVzA0JVKk7b9xc8Hw5PTpQaTp3l0tzq+VCtdFoer6UANF6rgwFQe0K7EvJipmMNu/+ZoRVZY6OWGm10mj6U67Zd5dKpZotSADcyqoJFweJAbBSjucvF+qrot60PU+upy5HzVDBvO5VF4IYvLBWqTWdVDzWcLxqw3ZcP8iPIUK7p0kpFTi8mOFLqUAsLPgCHPzX7ZNiitICoxOrI6NXWq00mt44rr9YqC7mK67vRwnG3ek1YLDnS9dXBO610te5hZkls2LOlxtr3ADCKE/qnXvMYRSVglQMCIIZdYgOQ642fI7B1L6xOxn7EKPVSqPpBWOtXL8xt5qvNKMp3Xps+sb3gYnD3NXSufdxg/9xW/pxFBjRliwdLPyFEhm4qWQQBS8sCEFKho3lu20ucmcAACAASURBVE5CAAcxWsonaZPyjozzSquVRtMDX6mlQvXOctF2vSCIaWPdhc173mx1DIV/BGrXkdwTbguMLmb2fclEgMEgiqSRuhxTgW0FQYIl/Aqk3b+Q/CFDx1tpND2QUq1VGov5quPJPv71nnUXeo7pSM3p3tt9tA1HZrBSUAxWBLE+EwSgALnxFU0RiZXD7hr7lSOhVIC2rTSanihwtWGXak1fKtMIfqm3LJR+pkrLu73u5t7mmN57A8e85/tSKZAgQSzDfd01TQPnGZGAcmAvwc2DJbZVIvmgom0rjaYHtF4NvZVJ03pYguSbro+s97zpkZ3XSotpaRDRxuh42nCK9g+CO53wwcDwgBvSFonCozDYgz2P0rdRm4VfhrShtlEY/gByFBRXo9l1FMP1pS8VgYMMwa71vk0yYHru6mFqtY/rZYkFnnhiwBRiIJ1MJJJkbzqvay8SAQVnDfkvwVlA+iySJ5A+heQ0jORmRzjAaLXSaHpge27dsdFrqvVIWPdbKSktYZwYGzg3PTqeLJlLYLWlECrFTMqm5j04yyi9gcQxZC9h7CPIXoJ5KAVLq5VG0wMplVQqqmm19UTiXSPKSWTTMGbGBt5/7vjJicGcTAHbuBbFECyJG5ANOKuwl2HEkDgGc3rvrnzv0Gql0fQgbprpeEIQ7aiR4PYgApFQSiEqqh7GUQnKpeJnjo28cGH6g5dPDqZjVG05vx540NaRgLDYjYS7itK3kXsasTEY8b36PnuGViuNpgdNV1brjlLMxEopIsEsWxrwoJ43raJ97XvbgxJaAVytvJz2fB0IQTHLOD6ae/aJqRefnD5zfHg4m6awhjEB4Rl69LzhsMqxaK8TGESikiB2YS+iPsvZCxDHtip8BwatVhpNJ1Kq1VJ1rVyXUhkIilW1EpWx8U03Ox9DJKSUhmGMD2WeODb07PmpZ89NTY0PxE0TAJgVBzVkuKVR/VoUtrbLVsg7g5lIOeyswa1R/PCFjGq10mg6qTadW4vF5VI
dtEkThn5x6pss9PUcs75PAOlk/NTE0LPnJt93bvLM5GguFTeM9SsIWjcHVUKDI1BXFJLqFRgWJg4xERSUA/iHTKgAaLXSaDrwpby/UnxrdmEpX4FClMxMD4oOpS4x6jeG2jcQKIjmtIhymcSTJ8Y+9J7T7zlzfHQgFbO6Hk8GhZUXuF90aBCXJSgMaqdWkjMBBthIkDUII3EY6yBrtdJoNpAvN67cWrp2f7XadHi9+Mp6FFP0ZpMcQN50zPpICssvIBEzp0dz7z07+fzFExdmxgbSiZ7XFqQ/czCrAyFY9esaw0DQp4KBINNQgVkpGCmRvYShFxCfOIz9m7VaaTTrNB3v5kL+jZtLq+WGUtz2QLc/2j19Rd2TuweoQRDzHo+ZQ9nk+anR585PXzp9bHIkZ1mb9NQKvFCgSEdF93k4dLFza0mQmYhgpDl7gUa/C7knYaYOnVRBq5VG08KX6v5K6bWr96/NrdquH6a1iLD0HaFV0SVIbumsZhWUfCFB4J57A91YX0xUzIm4dWZy5LnzU+8/N3Xq2HA6EaNNo1EJxAgTdig85sYiDEQisKyCi2cIIoBgxJE+SyMfweBzMHMPdZv2D61WGg0AMPNqpf76zfm3ZpdqDYeYSVBYmhhqGzVXthafFRSvGs2lX3pq5sOXTx0fyQUdJTYj9HoJtXl+b/cFiBhSpzH6vRj9CMcPX+BCC61WGg0ANB1vdmHtWzfmFlbLilUQ3CSEYN5KK/jtQgAyyfj5EyPve2JyemxAiK3VFwh85lujVRoL8WEMPY/hl5Gcpi2e6EByiC9do9ktpOLFQvXN6wt3Fgq2/3CFCh5kuBCJYPY2kklemB4/NpzdqlQFwQsP7HVD6/9nViADyVMYeolTMzjMUgWtVhoNM1fqzpXby2/cWCiWm8E8iUFdi2Ydq349q+sBUcvS3nvDEQzGQCYxNZJNxa1tXSwg20qL9trfKisa6JqZQfos0qfIiG3nRAcRPRPUPO54vrq7XPjm9ftza2UZNFVmDtJaWm51RIk1ocnCPWrpcVievSVV3XvX4yGEQUPZ5OhgpkdQ1WYoxZKgKCwf2nbU6DxROk/ghzdg5RAbhTiURRc60GqleaxRzPlK463ZhWv31tcBI3rmx/R7303vvcyKwcm4OTGUGc6lt+PzZjBHUZ/hwVWvqScHTitWBAHEIeLct3PrYUKrleaxplq337q18M0b82uVoMUp9bSbAGwsQ7x5FWP0GRNKjEHGcDZ1bCSXSmxrGgiAoXyKcm6AXj1vGBz4xtYvR9CR8Pkche+g0ewMz5f3V4qvvnP35nxeKkUC/aUKG7fvxKoKShAzs2nS2GB6bCBtbLPWH7Nilhzl0rRKi7a/gulo6JAHQILFkbCstFppHluYuVhtvHNn+frcWr3phr6eaGe/D226a/Mt61mElmFMDGfHBzPm9hbpGFCCJSvZmgn28bMDAHOQ4UwgQ+1+EMY+oGeCmscU15O3l4vfurmYrzTbNj+wTOjmGcsdW3rAQCJmTQ7mRgfS7fUVHgwRMYOViCrs9ThJkDZIBDAxkwBBAD16XhxGtG2leUyp2s61ubWb83nb81oPc5Cv0tmNJthF6xXuiHo4x6PWOCASHbuJIILKxYBh0HA2OTWSyya3Wb0zjE5QrWyb8Cqo7SXaLi3cKwDxwCCtQ4FWK83jiC/VSrF2Z75QrdvMEKG+EMI+Wj1tEdr4vqd5hTbliLaGrbeC41ImGX9ienR6YnDT7OVeMADFLHvFWXVdY8v0InEIqy30RquV5nHE9fz7y6V7qyVftRb3W67o3Xy2WzXXGVDMYB7KJM5Pj4wPZ3ZyOFaC1Raf2uhrGCDjaAiWVivN40ilac8u5BfzFaUUUbgUuDsPdOdRwp+ZJRESlnVyfOjc9Fg2uYPI8rB01Wa2FbrMK2o1mj70aC+75rHDdv07S8Wr91erdUcIgAQzYz1UXfX0o2+1i4RqZd4EEfDMzEExGYBHB9OXTh/bRhpzO+G
ypYQKOqL27iIRlHcIfohcWMbuGoz7hVYrzeMFA5V68/q9teViLXCXb5SenvHrW9zbe0zQL0cxEpZ5+vjwxRNjmdROu2MxgyVTKEqIaoR2np7ZCHxnDJCA0Gql0RxClFTLperN+dVyrSkEHsEUKVBDwxBjQ5n3nD1++vjwNsOsWhCgoBSEikrF9716hdC8Y2Zi9SjaIu49Wq00jxe2791fLd5aLNpuUOezNelrT6xBj0CmTvoN2DA3C+eAhGwyfmlm4tLJ8cx2AxfWUZA2s4RCVCkwCFrYOIij/CFmEAsosK/9VppDyYVfe73n9mu/9twjvpJHj1KqUG7cWSoXa03FygwbxHTQM0C0PSh069GhzKxAImaKk+MDL1ycnpkYFNvMtmm7eg9uAcwgA32kCoAAONgeFKZhD8rTarXn9HuuepKKiUzcGEgaT4wlL0+mn5vJPHNiR4vEmqOLJ3mlWL+3WPY8X4SBldylPhz92XMjeu1FLzkIneus5MRQ7uXLpy6fPZ5OPEQ/d2aoZtDQq3U13T1vFMDMAiAhQIDywX6/POzDxYFWq23RcFXDVStV78aK/f+9UwRwZjTxY8+N/cTzY9a28hs0RxfX8++vlu6tFH3Fbf71rTzIHbZVvwEbYFYgGsmlXrg48+LFE2O51MNcPFiCFTMzt/JuNut5AwIpQHpQPvU0Ig8bR/kxvrVm//pf3f/R3333O4uN/b4Wzf7DjFKtcfPeymq5rpQCWAijPV0mylUBEMa0E0GIMMmOqBV6TkRGd+pNUGIhyokRAJiQSyaev3DiI+89PT0+KB7ytyb7gQdKRKH3IujZvDHzRggiCivEMAD4gNdj7fAQcpTVKuDacvNTf3D12/dr+30hmn3G9fylfHWhUPV8nxlEglVXh4jAH0QEIqWgFFQ0hllu6TThYAVQKh576tT4973/3PnpsQe3tHkg0lXSDt9vYittmLMylA/pPSCg9JBw9NUKQMNVP/vZG/eLzn5fiGY/sT1/bq26Vm2GctRnGBEJISxTmCYRsSAS6yGYW33mCSIWM06MD7745MyFE2Px2G64XJQjWBJ4q7M6FYiaBPtHwWv1mKgVgKot//m/u7/fV6HZNxhwPH+pWCnXHSb0KKEQDGNWrLLJ2JMz4+8/N/3UqYnjo7lEzNxayZUwfYdAsZhxcnzow5dPvvDkieyOY0E3HFtBuWDvwYIZXWlkXQXzxx7B94eOw+pl71hurzny1mrzL94ufPa1Fdnn7+XfvVu6utS4eOzhPJ2awwkrxUo5ru/5vghiC7oHBdFXTEOZ5PvPTU2PDRRrzdnF/PX7awv5iu34jFZx0Q1l+QLfV+CzJ6JEzDw5MfShp09+6PKpY8PZXfoCDHahXGweSdF2XRxYYcqDco/GTPCwqlUHmbjxnunMe6YzH704+LP/5w1P9v67+cK1klarxxMSQghhGsI0DGaXgtzAiKB1TKusVbXh3FsuEsCEmGlODGWbrr/sVX2posyWDUIRNLsBAKJEwjx3fPS7nj7zwpPTU+MDu9YYmcDKI+USwEFVPqxr5zoc9rwJqtNASbCPLXrcDjxHRK1avHQ699MvTfze15Z67v3abOUff2SyfUup4d8r2HcLzp2Cc69gz5fcUtMvNfymp1yfFXMqZqTjYiBhPjGWuDCRevF09tndC+NSir9xt/r/vl14a76+UvVqjhxKmTPD8Q+fHfiRZ0YmcttL018su395pfC310vzJXe15mXixnjWungs9UOXhl4+m9txGMcjuEWOp752q/L3tyqzq/a9gl11ZMNVUnHMFElLDCaN4bQ1kbOmBuInhuJnxxJnRhPD6e31XwhW0JIxyxTE1LvlcSthsFhvfuP6/atza8m4ZZlCEDErEoQNvwV7RJASMD028OH3nv7Q06fHBtO7WaaFmaUTRU6tJ9KoXg730MBiEJjYhzoiXvajplYAfvjp4X5qtVRxO7a8+M+/vfnRao6sOXK54l1faf7lO0UAJ4bi//D
F8Z98YfyBQcmbR42/dqf6a5+7e3PVbt+1UvVWqt7rd2u//ZXF//Qjx3/uQ8c3P0WA46nf/sriv/raUrtRWWz4xYZ/bbn552/mJwdi/80PzHz04uBWjtbBnt6ipiv/9deXf/9rS3W3xwTe9pTtqWLDv53vXCHZQeQ9EcVjZlgDLzCcemkWEUnFlYZbb3qCSIigQBUrpaJSCj0+BLAgDGVTz52bfvHJmbHB1C5XlGIW7EK6zGFjVuYoqKpP+BcBAEN54CMSy34EveynRhL9duXrD9c0HABwv+j89391/z/8/avLXdq3dX73q4s/9ZlrHVLVju2p3/z8/K//1b0HHqpq+5/636/99pcX+81/ASyU3V/8o5u/+TdzO7zcbbLFW5SveT/+r6/+L3+70FOqdh1DiETMilkmBSl01BGqFBBl2DGkYteXTUc2Hc/1pepsNEptUy5iRjIee9+5qRefnDk2lNm1CeA6DPbALoCgCXNwAhEGi0Wv9u1hGQYJ9sC78C9/3zmCatU9l2+xi/+C3pqv/9xnb9ScnXgE/qcvzP2Pn5/fim3+mVdWvnC1tMmAhit/+g+vvz1f38p5f+9rS7/x149uYXTzW8TM//kfz7671Oy5dy8QAsmYGTcMtHrtPQAmghCBrgWFqnqXkVFKSebRwdTLl2bOTY9urzfEVmEoF8olAIqCb8Bo62MfaamKtqvgT2b4Vagt/Qs54BxBtbpb6BtXNZLezZnvteXm//zF+R188H/7Su+Jak9+/a/vbaK/v/k3c1cWthGp/wdfX9764Idnk1v0198pvn73kYbsGoaRTcUS8cDhtZX4qXazq60iZ9ewoO1E3DQHcslEzNwDwwqRWvksOIySiNJrOgxE0X6JJAQBsgl5FNI5jqDf6i/eLvTbdWygt986HRMvn809fzJ7djRxciSRjRvpuGBGzZHzJff1e9V/8821W2s9Zm1/9M3Vn//Q8bHsdjvuboO5ovv129WXz+S6d71+t/rZ11b37tTt7PotCjxcHQynzV/8ruMvn8kdH4jFTdFwZdWWdwvO7Jp9ZaH+yu3qYnnns2/LFEO5VDppRqEK6z6fKAQheBP2KO34eNsyIkXWFhBKBgFctd18vcHMe6JWzJAO2AkTAwng4Dp4Y3cLIgaY19cRiMA+5M7v28HhqKnVK7crn3mlr/nwwa5n/r3T6Z/5wMT3XRzsuWQ2bIrhtPX0VPpTz4//kz+e/eK1cscA1+cvXCv+2HPjO7jUmeH4P/meyQ+czqXjxrtLjU9/Yf7VO9WeI798o9xTrfotJgC4MJH8hQ8ff+FkdiBplJr+a3dqv/OVxesrO5l27dEt6jl7/Z0ff+K90+vridmEmU2Yk4PxD0Rf/07e/ssrhc9d6fsLaRNMIYayiaFs0hTkqygxuD1qKtIuAOsd5tdp98q3SgkHkkfMsF2vUrVdXyViezMThA/pMW/abCvIUmSgNW8igCWUv1cy+gg5IjPBuiPfmq//+l/d+0f/R99gKwDfe6FzUezf/OyTP3Bp+IGr+zFT/PL3Tffc9Y07O5nOXJ5M/dkvPPXDT4+MZKyEJZ49kfn9nzr39FS65+B3emVlzxWdv7vRKQ0BH70w+Cc//+QPXR4ey1oxU4xnYz/09PCf/PyT331+YAeXuke3KF/3ujee7r9CEnBqJPGLH5n83D++vPmwnhBRJpGYGs2lk3EiCvJ8W0nL2PmyWVDDk6VU1arbsN1NZu4PAUM5W4pl78x8ZCiG8ujwR10dVttqW6WvAv7Bk4MPExra70G6urxtj4Bl0Kc/eSYdNzZuFD/90vh/8Se3u8ffK/SYYX35ZrnnQzGcNn/jE6e6xSVmin/2idMf/60rxcZeLQ9t6xa1Jd+t8y++OP8r338iZu7VL9G4ZR4fHhzOpapNRyplGEJFEQnt64I96LOPGSAFEkTs+bJQadQa9mAmsSdWjO+APY7yeza71PYyXOvLgvLwPu8Bh/vqt04mbvyXHzuxyYBC3fu
7G+W3Fxq31+z7RafmyKanbO/BK+ul7T/8P3BpaGa4x4P9ZB8xrfZaVnujT1WJTz47mkv2/msdTJk/+uzov+o/f9yc3b1FoxmrO8/8s6+tfu5K4flT2fPjyenB+MxwfAeBoJtgCjE+mJmZGFguVBuOz6p18cRQkQj0+kYMRu+eN4G3i0CKebVcK9ftqd263M4riCIYKPS79et5szEjJ0hs9tA7Uuww8VioVdISv/upJ04M904ufXex8ekvzn/lZnlnJYAq9rYN7I92TUgDRvo8lnWnx7+zq33W/j94toeHq33vDtRqL27R+2YyPatilJryb94t/c2763EbQynz/HjyuZOZl07n3j+TMXZcKRgwTTE1lnvvE8fmVyt3lsq+9Ila5UOxhbkgd70BAFaShPClWipU89WGYn7o6jDdZw7yBL1AqR7Q8wagqGKfAMCKpUPQM8EDz7nxxG984vTlyd4uod/76uJvfn4nUQgtNnGT9eOp471tqKS1jRlQvwndE2PJTT51btO9PdmjW/Qj7x358zfzW/l4seG/eqf66p3qv/y7xYmc9ePPjf3HHzgW3869akFEQ5nUM2enCmW77swurlV2K8ibACk5X23OrZZt17OSu61Xsq7copA+Q3HbtLXf1bcWBBWY4IM9VvJw+9iPjJe9J6dG4v/VP5j+059/qp9UfeaV5Yd8DndGPxvKNLbxz6ls91arXGKz52Rgm0/R3t2iD5zJ/eDloe1+arni/YsvLvx7v/POQmmH1cqEoOnRwQ8+fep956YGMkmlgqlc60HYEA7QFemOfnuDNjN125lbq9SbexAu4BWFswZIgASJ1r8UAgxaf4mOYqIEARArkt4RyG0+OrZV0hLZhJFLGGfHkpcnU8/NZN83s1lu7VLZ/fQX9kGqAKT6rHAftF99e32L/tknTpuC/p+3th2OcCfv/Pz/dePPfuHStvS9RcwyT00Mf/d7z1Trzjeu3bMdf+MqHvWqxdK+sedeAJBSQULtellh5cJdg6yDGLwemYCunjeBSKloe1T1VIFd4t49qA8Rh1WtHr6d1J+9mW/28RA/dzLzH7048d7p9HDabF9c28FCZE/6LRhtq3fTQMJcrfUIAqjaciTT12QuN7fxC3avb1HMFP/Dv3/mR54Z/cwry1++sT2n2I0V+y/ezv/IM6Pb+Ewbybh1cWa8brsN171ya7HpSSNq/h4N6XiwOxxb7coFAEoppZCImRPDmUzqAXEY24Vlk5rLLBsEEV5Hm9+qR8+bsKsgmEAQYEnsB8sIh1irDq9aPTxfutE7/+4HLg19+pNnugWl4R4sQ3oo1Vutbq42RzJ9F9Furm4jQPTR3KKXz+RePpMrNvxXb1dev1e7udKcXbNXqj2+Wgefv1rasVoByKTiT5853nS9puNdvbfaVaF9c9uqYztAJAQG0onjI7m4tduPlV+Hs0R+I6gbEUzxZCsEtONKGdRy8TBATOxB2jre6hAzV+ztXPjPvnuyp+3Tb/x+cfFYsmds+ldnKy+e7rss+NXZytZP8Shv0VDK/Pil4Y9fGg5+dDy1WHHnis6N1ebfz1a+MlvpDi67svBQmboEDGeTzz4xtVqsFyuNxXyVhAjyaYgE0ClfREaroxdRMJKJgKC9jVKWZcxMDI4Ppc3dtmBI2bCXoJphwg0AQSQ5XNDcmHkjgmxmCr9jGG+lGuAH/wI44BxlL/vmlJq9vdQnhnoHOvQzNPaLfhXv/uSNtWofB3y56f/JG2tbP8U+3qK4JU6NJD70xMDPfODY7/3k+f/uh092j3n4MFciGs6mnj5z7MzUqGkaihVoR08EQynOJuKXTk1MjQ7s7Bib4TfgLMHrnZi1KQQA0oZb1Wp1iEn0WQLvGQRUbvp/+MrKHl/R9vjwEwM9f4Xn6/6v/vkdvytowJPqV/7s9rae8INzi37w8nD3RvMhAq9axCxjfDA9nEkIAcXMrGibYhN4u+KmOXNs8D1nj48MpITY1ceKJZwlOHm
Qv5N4C8WQDlQNUqvVoWWyTz2GP3y185GzPfVL//ZWTyfRPnJiKP6Rc73z/v7m3dInf+87f3mlsFbzPKlWq97nrhR+9Hff7U453pw9vUU/+C+v/K9fWri+vCU/2t/3msButxJ0Pzxfer6UEgTqXx20BxQ4kBhK8XAudfnUxImxQdPY7Ugrr4rmffgVBKFW25hmRp4t5cKrBR0oDjWPr9/qA6dz13o9Kn/0+qrrq3/44sTJkbjj8yu3K7/1pYVNinzuIz/3wWNfut5bgN5dav7Sv731kMff01s0u2r/1pcWfutLC0Mp830nMk9PpU4OJ2aG42MZKxUTSctgcKUpb63Zn79a+uxrPay2l07vQjsZz5flhlNpOJGbapOg9o6i7MwAcVg45vhI9vKpidyuNOPacE4FvwRnBSy3FHzQ+8Jd+BWoho5gOKz8B+8b/cyryz0Tg//02/k//faWwqz3l+dOZj/1/Njelbh6NLeo2PC/cK30hWvb83kR4ZPP7nxBsIXtekuFar7SYHBQSyFwn3c/1VE3mQ0DGExkZJLmxZmxs1OjUam/3YMVnDU4K4CMatxAMZMK9JK7JkcchLqLMPs5LHcFWYVbgPIgdscg3Rce35ngE+PJn3hubOvjP/X8NgY/Mn75Y9OXJ7dRWOKnX5rY+uCDfIv+0cvHLvVJUdg6zFys2zcW8qulOoXFrtojqrqqCG8cQEEvHMLkyMDFk+MDqT0ovaCaXJtFc46Vo3j9IlqxaQpQvOGFMI85CrwiKCL2bTh5lgdxirB1Hl+1AvCrHz+xeQ5wix95ZuS//oGZvb6eHZCKGX/wU+ef3ppg/fRLE7/y/b1LUPXjYN6in/3gsV/+vl2odOBLtVisXJ9fK9ccDtVnW25sZlDcNJ6YHJkZGzTNXfZYMTPsFaq9C2eJgtaqUcXS9kQh0fZqr8dM6wMEsQ+vQF7lUPe+eazVyjLE7/7EuZ/5wMQmq0vZhPHf/tDMb3zi9MOk/u8puaT52Z+5+J98+Pgm32I8a336k2d+9eMntvvL/6DdopdOZz/7Mxf+6cemH96KYeZq055bKa8Wa67vd+UDUtcLHWMYMAwxkkuenxwbG8hsKxVhS/g1VK+hcRehTUQiiqKi6I1ob3jTlicoNrTAISiX7WWEmYaHlcfXbxVgGvQr33/iJ18Y/+NvrX39duVewanaMpswRtLm2bHkxy4Ofs+FwUx89+t/7C5xS/zSR6d+7Lmxz10pfPFaab7k5Ot+OibGstaFidTHnxr6ricGdla0AHt2i77+T997O2/fXnPmSs5i2V2uuIWGX276VUe6PvuKTUFJS6TjxtRg7PRI4unJ9HefH9itdUAAzMiXGjfv5StVm8IOfR1+9E0ybwCAFWdT1vmZsbNTI+nkbvuD2KPGLRS/jsY9KBVoTGuih+hNd+ZNa3swRATOK+mQs8DuKrEEHfR/z/3Yrumr0RwRak33q2/f/r//9s27y0Xu0bOr21DqjG0nwrnjo5/44KUPvudULr2ruYHKRf0Wlv8aq59HcwlQKmpwo6IZX6BfoutCVXT1rSLzQcsLxMYx/Ukc/wTiI7t5qY+Qx3omqHlssV3v9lLhjRsLi4WKVIo2OIKiIjDAxiYTon0mSIBlGrlkYiCZ9DxZtz3X83fjd7+CV0XlHSz/FfJfhrMCYpAQJARRawIYFIoOWp9ucFyJaDLYNisECRDDL6H0FirvwH+kjdF2kcd9Jqh5DGHmctN5687S1bsrjusTibbcZLTlCQYmVysRL+ziTISgCIL0sVisfvnK7Vsr+YFsaiyXnB4dHB9Kx2I7eqxYsbTJWeHKFSq+ivKbcJahZFQ3oU0HiSI3GnVbfAAoEs3WhQME5aH6DlZyEDEMXIa5WT2lg4lWK81jR1g5r+nUbDfsyRfFr0eRAZHXZ4Pn6mN2IAAABshJREFUilobAs+Rr3i5VK28aydnrVQyPpZLXT4x8eFnTs8cHzQemHzTOhEUpAO/Ab9EjXuovE2
lN9G8Da/KoUEXTekoUM4H+fJ7m3eBeVVG6XWQAdlE7knEhkHWIaoio9VK8ziSTcRPjQ+NDqTWqnXpy1YxcyZWbAiyDGEagiyDiMgQZBmGKYQwyBDCNMkgYRjCEkIYMIQwhBBCGKZZ86QjuYeDhRnKA0uwBCRYgn0oD7IeBn+6BbhraNxD4y7cPFRQir/VwzTo0xqYfw/6bh32FrW9I4ZXQPEV9orUmEXmPCemKTYEIwER32FG9yNEq5XmcSQZj12cmXjuybLtq4btCYIQQsAgGAlLpRIcM4RpCcswTIMsYVqmsKItlilipmmZhmUIwyDTMGKmIQwyDErHYkMpn2QViuHVoRpQEpCQNvwGpA1lQ9pQTagGvArsNTjLkBVIG8qFtKGcwKfOGxxnvP4Hb3i/ec+bjQcBKxApeAUqfQuNO5yY4tQpSk4jPobEGOLHYA2zsA5sk1S9Jqh57Ai6FruevL1UeOf2ctP1Y5YZM4WBuIl4Lulmk1WTgpZWitgXpAQksRQkBSRYEinBkiAJPkEZkESK2SflJ01OxiDYg1chrxJYUqw8Uh6zT8qDcsEelK9kU8gmlANSkfAEbZ/XvU7dsqEiyy2Yu3aYQy1ta5tAru/iMBSLwobPRkIZaWFmYaZhDSBzASMfRvYCjN3OdtwltG2leWxQkgN1YI+VaylnMlaLHyv7vi3gm4IgDXgiE2tmE0WDHCgF9pV0BUmlXFIK7AUvVj7Yh3LXG4sqD+wp6QrlkfIASSroAh/4uHi99mjLI8YKIEUqEKl2nxiidz1NCdXnfc8xHUdgQHKQQQgKDD23ECwccn0WZFJiAsY20rMeJVqtNEcRZrAH2VwXFLcMe5lkjWWd/Iby65C1lGpabhWyyapJ0gZLJX2DPUNIgiIwGIIVwILArFpZxIqlCJcPOazeqVgxC7AgKCmZmQSIKXAGBfWIBYcmFAPMJMIRgpiYmKKQTkFh2XXRq4665DBAQUVerM4xG/ufCtogWK2zRLH5wWUpMMhZ5co7GH4R1jCM3U7P3g20WmmOIuyhdhPlb6NxH7IGv8JeFX419G0rxewDUkAlohkcA8w+SEExFDGI1mWDmKGYCVAGcRCpKYJnHGK9MAMIgEFEIprVBV6jVnwEK8WC1/P9iNdHoD3yIBAUQcwbXDWCIASRatlqDGZihohmjSq4KBKBy1xQIE4UxF8p5kjNSAhmUOgfM5gUsUvOCpoLSJ/RaqXRPCqki/otXv1bavz/7d1Bi+NGEMXx/6uWbIY5bCDk+3+yfIINSTbZ4BlLXZVDy6MccggsAffyfvgwGHns06O6VV36mbzndkO9kmitSNHG42DG4+ClQjqmwFQcyzaUxSg+HjtAgiJRneWPpEygIoiPAmsMdT8PyJzdCuOfC4VUlSOz6pyhfq7czk7QSsZmk5RZkaSoTCAa9CiVHg+OyOPWYZVKalQeX581ErCOciwqcyRXqBWMQpK81far8v05J2E5rex7lHf6F90/s/1OSVFVISp7IipGoiShqLE8g8d6rYp2jK4qOMqTo8QRMaoTLcqmCLQogghirVLTwvFmA6EFBTQIIooSUaO5UyGJUcJJj+6B0XkaUERQodG/LlHjzziO4AgKtQvrD9VeIMZHo6py0/6F/pUsVOQeNdZ6XSoyq7ro1Kb9Rn+j3tTfyRu10d/Z/iK/deD9/8RpZd+p5DiKomXMLtASRVCoBWgc+B3XKoLjpApCpTjP2SiQhFQpNeKV5YVYYSUauhILsaI11FKXWNaiKa5oIVZigQUFaihQiEBx5EsFI6pihJrG+GQUUnv8AKGgRMQ4RPOPyxpxVVxHDXUUeyT5Rm30joq+Qyd36NSuSuVedVe/sf9J/8r2x/G6/0JcdflEPGksPOnPMvsmCpYXrj9SO1ppF9TQKpaMRXFF7SNliAWtUqNd0Fo0tBIrbYUVLeOCkIgL6yfa62NDKohH6MARH62Jj4VmO94fM98/9sOPVvk4y6RjsksgkY9zimN
DXvqXVdl/aYk6J5yOerDIMaFv3Kns5M64uTnuaW6/0d+5/kS8PuEyEPdbmdksnr3X3sxscFqZ2RycVmY2B6eVmc3BaWVmc3BamdkcnFZmNgenlZnNwWllZnNwWpnZHJxWZjYHp5WZzcFpZWZzcFqZ2RycVmY2B6eVmc3BaWVmc3BamdkcnFZmNgenlZnNwWllZnNwWpnZHJxWZjYHp5WZzcFpZWZzcFqZ2RycVmY2B6eVmc3hb9Wn2ttgT2PeAAAAAElFTkSuQmCC" 7 | } 8 | }, 9 | "cell_type": "markdown", 10 | "id": "09cbf33a-b659-4455-aecb-659b6649eb1a", 11 | "metadata": { 12 | "jp-MarkdownHeadingCollapsed": true, 13 | "tags": [] 14 | }, 15 | "source": [ 16 | "# Applying Vectorization to speed up Pandas\n", 17 | "\n", 18 | "![image.png](attachment:4830f075-ff79-4f43-bc5a-38e0d17d65bc.png)\n", 19 | "\n", 20 | "Like many other libraries - Pandas has built in access to Numpy. And there are many ways to accomplish any given task in Pandas. \n", 21 | "\n", 22 | "# Learning Objectives:\n", 23 | "- Apply Numpy methods to dramatically speed up certain common Pandas bottlenecks\n", 24 | "- Apply WHERE or SELECT in Numpy powered by oneAPI\n", 25 | "- Avoid **iterrows** using Numpy techniques\n", 26 | "- Achieve better performacne by converting numerical columns to numpy arrays\n", 27 | "\n", 28 | "#### Please also see \n", 29 | "\n", 30 | "In the **near future**, we will be adding an addendum to this learning path [**Intel® Distribution of Modin*Intel® Distribution of Modin**](https://www.intel.com/content/www/us/en/developer/tools/oneapi/distribution-of-modin.html#gs.x7j0o9https://www.intel.com/content/www/us/en/developer/tools/oneapi/distribution-of-modin.html#gs.x7j0o9) to scale your pandas workflows by changing a single line of code.\n", 31 | "\n", 32 | "But its is also important to know how to speed up Pandas natively via its dependence on Numpy. **Pandas is powered by oneAPI via Numpy!**\n", 33 | "\n", 34 | "When the opportunity arises it often highly profitable to leverage the Numpy way of solving a Pandas apply() performance issue. 
Because many dataframes are large, it is often better to find a way to apply Numpy instead.\n", 35 | "\n", 36 | "While not yet a part of this course, the Intel oneAPI AI Analytics Toolkit has a component called Modin*, a drop-in replacement for Pandas that can dramatically speed up Pandas operations. Modin can be used, for example, for problems larger than can fit in your laptop's memory, and it can distribute computations across a cluster of nodes. Our aim is to include Modin as a component of this training in the future.\n", 37 | "\n", 38 | "There are a number of excellent references on speeding up Numpy, or more specifically Pandas using Numpy, and I encourage you to review these resources.\n", 39 | "\n", 40 | "### References:\n", 41 | "\n", 42 | "- [Nathan Cheever Video: **1000x faster data manipulation: vectorizing with Pandas and Numpy**](https://www.youtube.com/watch?v=nxWginnBklU&t=237s). His advice is prescient!\n", 43 | "\n", 44 | "- [Jake VanderPlas Video: **Losing your Loops Fast Numerical Computing with NumPy**](https://www.youtube.com/watch?v=EEUXKG97YRw).\n", 45 | "\n", 46 | "- [Jake VanderPlas Book: **Python Data Science Handbook**](https://jakevdp.github.io/PythonDataScienceHandbook/).\n", 47 | "\n", 48 | "- [Selvaratnam Lavinan article (towardsdatascience.com): **Understanding the need for optimization when using Pandas**](\n", 49 | "https://towardsdatascience.com/understanding-the-need-for-optimization-when-using-pandas-8ce23b83330c)\n" 50 | ] 51 | }, 52 | { 53 | "cell_type": "markdown", 54 | "id": "983ecc5c-6b26-4904-82f9-75804c78077b", 55 | "metadata": { 56 | "tags": [] 57 | }, 58 | "source": [ 59 | "# Pandas Apply with custom user function\n", 60 | "\n", 61 | "Let's prepare a dataset and time how long it takes to apply a user function to all rows.\n", 62 | "\n", 63 | "We will look at different timings for applying our custom function to all rows of the dataset. 
Notice that this function has no conditional logic; we are just applying the same set of instructions to every data element in each iteration of the loop.\n" 64 | ] 65 | }, 66 | { 67 | "cell_type": "code", 68 | "execution_count": null, 69 | "id": "e12d2a93-212f-465e-94a5-1f86e585fe56", 70 | "metadata": {}, 71 | "outputs": [], 72 | "source": [ 73 | "import pandas as pd \n", 74 | "import numpy as np\n", 75 | "import time\n", 76 | "\n", 77 | "BIG = 100_000\n", 78 | "df = pd.DataFrame(np.random.randint(0, 11, size=(BIG, 5)), columns=('a','b','c','d','e'))" 79 | ] 80 | }, 81 | { 82 | "cell_type": "markdown", 83 | "id": "7846aaa8-b2f1-470e-8960-28fa53f0ea14", 84 | "metadata": {}, 85 | "source": [ 86 | "We all get busy, and what starts off as a small project to prove a point grows into production code.\n", 87 | "\n", 88 | "Many times, our shortcuts worked well enough for the toy data sizes we proved our point with, but with *real* data our shortcuts fall way behind the performance curve.\n", 89 | "\n", 90 | "Below are simplified versions of performance sappers I have written or have seen others write.\n", 91 | "\n", 92 | "The idea is that you want to call a function on one column and have the result be assigned to a different column.\n", 93 | "\n", 94 | "Naively, you use a loop. 
You get bad performance calling this log function." 95 | ] 96 | }, 97 | { 98 | "cell_type": "code", 99 | "execution_count": null, 100 | "id": "e9040671-c285-4420-851b-fbe8a9339f12", 101 | "metadata": {}, 102 | "outputs": [], 103 | "source": [ 104 | "def my_function(x):\n", 105 | "    return np.log(1+x)" 106 | ] 107 | }, 108 | { 109 | "cell_type": "markdown", 110 | "id": "2edc5140-3ff6-4572-8333-44e480ce77ca", 111 | "metadata": {}, 112 | "source": [ 113 | "### First approach - iterate over df using range" 114 | ] 115 | }, 116 | { 117 | "cell_type": "code", 118 | "execution_count": null, 119 | "id": "7a74ca8b-d61c-4b14-9156-30ec6f899c9b", 120 | "metadata": {}, 121 | "outputs": [], 122 | "source": [ 123 | "%%time\n", 124 | "# naive loop method using pandas iloc\n", 125 | "timing = {}\n", 126 | "t1 = time.time()\n", 127 | "\n", 128 | "for i in range(0,BIG):\n", 129 | "    df.iloc[i,2] = my_function(df.iloc[i,0])\n", 130 | " \n", 131 | "t2 = time.time()\n", 132 | "timing['iloc'] = t2 - t1\n", 133 | "df.head()" 134 | ] 135 | }, 136 | { 137 | "cell_type": "markdown", 138 | "id": "7f7c6666-bb90-4bbd-a17b-4388564cd5b9", 139 | "metadata": {}, 140 | "source": [ 141 | "### Next, let's say we get advice from a web search \n", 142 | "\n", 143 | "The search turned up advice on using the Pandas iterrows function" 144 | ] 145 | }, 146 | { 147 | "cell_type": "code", 148 | "execution_count": null, 149 | "id": "b557d764-cc12-4e53-9922-8de0647aae53", 150 | "metadata": {}, 151 | "outputs": [], 152 | "source": [ 153 | "%%time\n", 154 | "# naive loop method using pandas iterrows\n", 155 | "import numpy as np\n", 156 | "import time\n", 157 | "\n", 158 | "# each iteration of the loop requires an interpretation of the instructions being used and this decoding takes time\n", 159 | "t1 = time.time()\n", 160 | "\n", 161 | "for index, row in df.iterrows():\n", 162 | "    df.at[index, 'c'] = my_function(row['a'])  # write through df: row is a copy, so assigning to row would not change df\n", 163 | " \n", 164 | "t2 = time.time()\n", 165 | "baseTime = t2-t1\n", 166 | "timing['iterrow'] = t2 - t1\n", 167 | "df.head()" 
168 | ] 169 | }, 170 | { 171 | "cell_type": "markdown", 172 | "id": "c5b98e02-6a95-42ad-9ce3-52cdf6002181", 173 | "metadata": {}, 174 | "source": [ 175 | "You realize that pandas iloc is sometimes a little faster than loc, if you are willing to do a little numerical indexing." 176 | ] 177 | }, 178 | { 179 | "cell_type": "markdown", 180 | "id": "1bc00d95-ab8c-47b6-b86d-76ac78cb6685", 181 | "metadata": {}, 182 | "source": [ 183 | "You may have seen various tips online, perhaps a YouTube video, showing how much faster pandas \"at\" is for accessing a single value." 184 | ] 185 | }, 186 | { 187 | "cell_type": "code", 188 | "execution_count": null, 189 | "id": "b3777fae-674c-423f-a19f-d39de8b1fee1", 190 | "metadata": {}, 191 | "outputs": [], 192 | "source": [ 193 | "%%time\n", 194 | "# naive loop method using pandas \"at\"\n", 195 | "t1 = time.time()\n", 196 | "\n", 197 | "for i in range(0,BIG):\n", 198 | "    df.at[i,'c']=my_function(df.at[i,'a'])\n", 199 | " \n", 200 | "t2 = time.time()\n", 201 | "fastest_time = t2-t1\n", 202 | "Speedup = baseTime / fastest_time\n", 203 | "print(\"Speed up: {:4.0f} X\".format(Speedup))\n", 204 | "timing['df.at'] = t2 - t1\n", 205 | "df.head()" 206 | ] 207 | }, 208 | { 209 | "cell_type": "markdown", 210 | "id": "5bcd5b1c-9f51-42b4-bc8e-2349fb72fc36", 211 | "metadata": {}, 212 | "source": [ 213 | "**D'oh!** \n", 214 | "\n", 215 | "**Pandas \"Apply\"!**\n", 216 | "\n", 217 | "Why didn't I use it from the start?\n", 218 | "\n", 219 | "I think I used it in the past, but I was flying through my code and forgot about this old friend.\n", 220 | "\n", 221 | "With apply, we hand Pandas the function and the whole column in one call, so the per-element loop runs inside Pandas instead of in our own Python loop with repeated indexing. The function is still called once per element, so this is not yet true vectorization, but it removes most of the indexing overhead." 222 | ] 223 | }, 224 | { 225 | "cell_type": "code", 226 | "execution_count": null, 227 | "id": "ffb67366-c490-4cad-b98e-c4a7a2e478b2", 228 | "metadata": {}, 229 | "outputs": [], 230 | "source": [ 231 | "%%time\n", 232 | "# 
pandas apply method (the per-element loop runs inside pandas)\n", 233 | "t1 = time.time()\n", 234 | "\n", 235 | "df['c'] = df['a'].apply(lambda x : my_function(x))\n", 236 | "\n", 237 | "t2 = time.time()\n", 238 | "fastest_time = t2-t1\n", 239 | "Speedup = baseTime / fastest_time\n", 240 | "timing['pandas apply'] = t2 - t1\n", 241 | "\n", 242 | "print(\"Speed up: {:4.0f} X\".format(Speedup))" 243 | ] 244 | }, 245 | { 246 | "cell_type": "markdown", 247 | "id": "cf46be4c-f880-4d09-a7f0-f96ef975573a", 248 | "metadata": {}, 249 | "source": [ 250 | "### Plot the timing results of these methods" 251 | ] 252 | }, 253 | { 254 | "cell_type": "code", 255 | "execution_count": null, 256 | "id": "c9558829-646f-4d31-9479-6f73ea8f2b13", 257 | "metadata": {}, 258 | "outputs": [], 259 | "source": [ 260 | "%matplotlib inline\n", 261 | "import matplotlib.pyplot as plt\n", 262 | "plt.figure(figsize=(10,6))\n", 263 | "plt.title(\"Plot of various methods of applying log(1+x) to a dataframe\")\n", 264 | "plt.ylabel(\"Time in seconds\",fontsize=12)\n", 265 | "plt.xlabel(\"Various types of operations\",fontsize=14)\n", 266 | "plt.grid(True)\n", 267 | "plt.yscale('log')\n", 268 | "plt.xticks(rotation=-60)\n", 269 | "plt.bar(x = range(len(timing)), height=list(timing.values()), align='center', tick_label=list(timing.keys()))\n", 270 | "short = min(list(timing.values()))\n", 271 | "long = max(list(timing.values()))\n", 272 | "print('Speedup : {:4.0f} X'.format(long/short))" 273 | ] 274 | }, 275 | { 276 | "cell_type": "markdown", 277 | "id": "beee35d1-24e0-40aa-a105-d647d008b674", 278 | "metadata": {}, 279 | "source": [ 280 | "### WOW! \n", 281 | "\n", 282 | "Sped up a couple hundred times!\n", 283 | "\n", 284 | "**Pandas.apply** is my new **best friend!**" 285 | ] 286 | }, 287 | { 288 | "cell_type": "markdown", 289 | "id": "116f8b67-4110-49b3-a499-4fe584a8282c", 290 | "metadata": {}, 291 | "source": [ 292 | "# Alternative to Pandas Apply for Conditional Logic\n", 293 | "\n", 294 | "Hmmm... 
But then we run into **conditional logic** in our function. Hmmm. **Is Pandas Apply** still our **best friend?**\n", 295 | "\n", 296 | "Below we create a randomly generated array of values to be included in a Pandas dataframe with a large number of rows." 297 | ] 298 | }, 299 | { 300 | "cell_type": "code", 301 | "execution_count": null, 302 | "id": "bd0c7d23-cbac-4ca3-9915-430f4f63d1dd", 303 | "metadata": {}, 304 | "outputs": [], 305 | "source": [ 306 | "import pandas as pd \n", 307 | "import numpy as np\n", 308 | "timing = {}\n", 309 | "BIG = 4_000_000\n", 310 | "df = pd.DataFrame(np.random.randint(0, 11, size=(BIG, 5)), columns=('a','b','c','d','e'))" 311 | ] 312 | }, 313 | { 314 | "cell_type": "markdown", 315 | "id": "53dfb590-9b15-4601-b707-dfa0d81f1f78", 316 | "metadata": {}, 317 | "source": [ 318 | "First of all, Pandas is built on Numpy. We can demonstrate this by looking under the hood: the values within a DataFrame column are stored as a numpy.ndarray." 319 | ] 320 | }, 321 | { 322 | "cell_type": "code", 323 | "execution_count": null, 324 | "id": "cf990049-b43d-4730-9606-57d8ba35c5fa", 325 | "metadata": {}, 326 | "outputs": [], 327 | "source": [ 328 | "type(df['a'].values)" 329 | ] 330 | }, 331 | { 332 | "cell_type": "markdown", 333 | "id": "116c6b41-339e-4394-ab2b-8950cc331005", 334 | "metadata": {}, 335 | "source": [ 336 | "So how do we best use the Numpy that is built in?\n", 337 | "\n", 338 | "Below we have a function with a lot of conditional logic that we want to apply to the entire dataframe.\n" 339 | ] 340 | }, 341 | { 342 | "cell_type": "code", 343 | "execution_count": null, 344 | "id": "707f9458-6c7f-416f-913c-1f6af093ddf4", 345 | "metadata": {}, 346 | "outputs": [], 347 | "source": [ 348 | "def func(a,b,c,d,e):\n", 349 | "    if e == 10:\n", 350 | "        return c*d\n", 351 | "    elif (e < 10) and (e>=5):\n", 352 | "        return c+d\n", 353 | "    elif e < 5:\n", 354 | "        return a+b" 355 | ] 356 | }, 357 | { 358 | "cell_type": "markdown", 359 | 
"id": "240d6321-101e-458e-93a9-8906dfa1290d", 360 | "metadata": {}, 361 | "source": [ 362 | "Applying this function to the Dataframe, each row has to be evaluated for the condition and this makes the execution time slow because the conditional logic hinders can vectorization\n", 363 | "\n", 364 | "### Naive Apply lambda on function with condition rows" 365 | ] 366 | }, 367 | { 368 | "cell_type": "code", 369 | "execution_count": null, 370 | "id": "2c535a47-b8b1-487d-92d7-6dd84b24a4dd", 371 | "metadata": {}, 372 | "outputs": [], 373 | "source": [ 374 | "import time\n", 375 | "t1 = time.time()\n", 376 | "df['new'] = df.apply(lambda x: func(x['a'], x['b'], x['c'], x['d'], x['e']), axis=1)\n", 377 | "t2 = time.time()\n", 378 | "print(\"time : {:5.2f}\".format(t2-t1))\n", 379 | "df.head()\n", 380 | "timing['Pandas Apply'] = t2 - t1\n", 381 | "baseTime = t2-t1" 382 | ] 383 | }, 384 | { 385 | "cell_type": "markdown", 386 | "id": "f803c290-c8ed-447c-b742-19831ef00c6e", 387 | "metadata": {}, 388 | "source": [ 389 | "This feels slow!\n", 390 | "\n", 391 | "Can we do better?\n", 392 | "\n", 393 | "### Use vectorization !\n", 394 | "\n", 395 | "Here we will make use of Vectorization to create index masks that control the application of values to a given column - we operate on entire columns at a time this way. 
" 396 | ] 397 | }, 398 | { 399 | "cell_type": "code", 400 | "execution_count": null, 401 | "id": "fec58d77-e965-47b9-9658-a5beddc80e7d", 402 | "metadata": {}, 403 | "outputs": [], 404 | "source": [ 405 | "t1 = time.time()\n", 406 | "df['new'] = df['c'] * df['d'] #default case e = =10\n", 407 | "mask = (df['e'] < 10) & (df['e'] >= 5)\n", 408 | "df.loc[mask,'new'] = df['c'] + df['d']\n", 409 | "mask = df['e'] < 5\n", 410 | "df.loc[mask,'new'] = df['a'] + df['b']\n", 411 | "t2 = time.time()\n", 412 | "print(\"time :\", t2-t1)\n", 413 | "df.head()\n", 414 | "fastest_time = t2-t1\n", 415 | "timing['Mask'] = t2 - t1\n", 416 | "Speedup = baseTime / fastest_time\n", 417 | "print(\"Speed up: {:4.0f} X\".format(Speedup))" 418 | ] 419 | }, 420 | { 421 | "cell_type": "code", 422 | "execution_count": null, 423 | "id": "9705d38e-afbe-4fe3-81e1-a50701ddc93c", 424 | "metadata": {}, 425 | "outputs": [], 426 | "source": [ 427 | "%matplotlib inline\n", 428 | "import matplotlib.pyplot as plt\n", 429 | "plt.figure(figsize=(10,6))\n", 430 | "plt.title(\"Plot of various method of applying conditional (a,b,c,d,e) logic to a dataframe\")\n", 431 | "plt.ylabel(\"Time in seconds\",fontsize=12)\n", 432 | "plt.xlabel(\"Various types of operations\",fontsize=14)\n", 433 | "plt.grid(True)\n", 434 | "plt.bar(x = range(len(timing)), height=timing.values(), align='center', tick_label=list(timing.keys()))\n", 435 | "short = min(list(timing.values()))\n", 436 | "long = max(list(timing.values()))\n", 437 | "print('Speedup : {:4.0f} X'.format(long/short))" 438 | ] 439 | }, 440 | { 441 | "cell_type": "markdown", 442 | "id": "a0e92bdd-96a8-4dc6-90f7-a70fa194f131", 443 | "metadata": {}, 444 | "source": [ 445 | "Well that feels much better!\n", 446 | "\n", 447 | "over 100X speedup on DevCLoud (your milage may vary)\n", 448 | "\n", 449 | "But the code looks complicated. 
The masking trick on Numpy arrays is effective but a little hard to read and debug." 450 | ] 451 | }, 452 | { 453 | "cell_type": "code", 454 | "execution_count": null, 455 | "id": "aa0a0353-7837-45cd-8cb1-94bc70d70304", 456 | "metadata": {}, 457 | "outputs": [], 458 | "source": [ 459 | "import pandas as pd \n", 460 | "BIG = 1_000_000\n", 461 | "timing = {}\n", 462 | "df = pd.DataFrame(np.random.randint(0, 11, size=(BIG, 5)), columns=('a','b','c','d','e'))" 463 | ] 464 | }, 465 | { 466 | "cell_type": "markdown", 467 | "id": "91e447f8-e694-4cfd-a84c-7fd3db8eba40", 468 | "metadata": {}, 469 | "source": [ 470 | "### With Conditional Logic and an Expensive Function!" 471 | ] 472 | }, 473 | { 474 | "cell_type": "code", 475 | "execution_count": null, 476 | "id": "2f5c8833-6dcb-4943-956f-b46b3feb1e26", 477 | "metadata": {}, 478 | "outputs": [], 479 | "source": [ 480 | "def my_function(x):\n", 481 | "    return np.log(1+x)\n", 482 | "\n", 483 | "def func(a,b,c,d,e):\n", 484 | "    if e == 10:\n", 485 | "        return c*d\n", 486 | "    elif (e < 10) and (e>=7):\n", 487 | "        return my_function(c+d)\n", 488 | "    elif e < 7:\n", 489 | "        return my_function(a+b+100)" 490 | ] 491 | }, 492 | { 493 | "cell_type": "markdown", 494 | "id": "706d57bb-4737-4004-b242-7ee3dd5ed66e", 495 | "metadata": {}, 496 | "source": [ 497 | "We confidently use our old Pandas \"Apply\" trick!" 
498 | ] 499 | }, 500 | { 501 | "cell_type": "code", 502 | "execution_count": null, 503 | "id": "ded31f3c-83e4-4eae-8bf9-ae615802bc2e", 504 | "metadata": {}, 505 | "outputs": [], 506 | "source": [ 507 | "%%time\n", 508 | "# row-wise pandas apply (axis=1) with conditional logic\n", 509 | "import numpy as np\n", 510 | "\n", 511 | "# apply with axis=1 calls func once per row in Python, so every row pays the interpreter overhead\n", 512 | " \n", 513 | "t1 = time.time()\n", 514 | "\n", 515 | "df['new'] = df.apply(lambda x: func(x['a'], x['b'], x['c'], x['d'], x['e']), axis=1)\n", 516 | "\n", 517 | "t2 = time.time()\n", 518 | "print(\"time : {:5.2f}\".format(t2-t1))\n", 519 | "df.head()\n", 520 | "baseTime = t2-t1\n", 521 | "timing['Pandas Apply'] = t2 - t1\n", 522 | "df.head()" 523 | ] 524 | }, 525 | { 526 | "cell_type": "markdown", 527 | "id": "25108261-fe15-46a0-88ab-bc1cef4f62f7", 528 | "metadata": {}, 529 | "source": [ 530 | "Hmmmm, I thought it would be faster - it vectorizes, right?\n", 531 | "\n", 532 | "Oh - conditional logic can hamper vectorization. \n", 533 | "\n", 534 | "Can I do something about it?\n", 535 | "\n", 536 | "Maybe you once read about a trick called masking. We will use the conditional logic to build a boolean index, or mask, for our dataframe and use a different mask for each condition." 
537 | ] 538 | }, 539 | { 540 | "cell_type": "code", 541 | "execution_count": null, 542 | "id": "2c2900b2-de2c-47a4-a4c7-86f6cfd1e1f3", 543 | "metadata": {}, 544 | "outputs": [], 545 | "source": [ 546 | "# masked approach\n", 547 | "t1 = time.time()\n", 548 | "df['new'] = df['c'] * df['d'] # default case: e == 10\n", 549 | "mask = (df['e'] < 10) & (df['e'] >= 7)\n", 550 | "df.loc[mask,'new'] = (df['c'] + df['d']).apply(lambda x : my_function(x))\n", 551 | "mask = df['e'] < 7\n", 552 | "df.loc[mask,'new'] = (df['a'] + df['b']).apply(lambda x : my_function(x + 100))\n", 553 | "t2 = time.time()\n", 554 | "print(\"time :\", t2-t1)\n", 555 | "fastest_time = t2-t1\n", 556 | "Speedup = baseTime / fastest_time\n", 557 | "print(\"Speed up: {:4.0f} X\".format(Speedup))\n", 558 | "timing['unrolled with masks on df'] = t2 - t1\n", 559 | "df.head()" 560 | ] 561 | }, 562 | { 563 | "cell_type": "markdown", 564 | "id": "9b681ab0-7cc5-4fd8-9101-17f854b9bd00", 565 | "metadata": {}, 566 | "source": [ 567 | "WOW! Masking to the rescue!\n", 568 | "\n", 569 | "Still - I wonder if I could do better?\n", 570 | "\n", 571 | "I watched a cool video on YouTube by a guy who introduced me to the **Numpy \"Select\" clause**. He had so many great tips, but off the top of my head the one I remember is the \"Select\" trick.\n", 572 | "\n", 573 | "If you want to get serious about speeding up your Python, check these two references out!\n", 574 | "\n", 575 | "Seriously - look up Nathan Cheever's talk [1000x faster data manipulation: vectorizing with Pandas and Numpy](https://www.youtube.com/watch?v=nxWginnBklU&t=237s). His advice is prescient!\n", 576 | "\n", 577 | "While you are at it - look up Jake VanderPlas's talk [Losing your Loops Fast Numerical Computing with NumPy](https://www.youtube.com/watch?v=EEUXKG97YRw). I also recommend that you buy his book [Python Data Science Handbook](https://jakevdp.github.io/PythonDataScienceHandbook/). 
\n", 578 | "\n", 579 | "\n", 580 | "Let's try the **Numpy \"Select\" clause** \n", 581 | "\n", 582 | "Notice that it cleans the code up a lot!\n", 583 | "\n", 584 | "1. You create a list of boolean conditions.\n", 585 | "\n", 586 | "2. You create a matching list of choices, each containing the operation you wish to apply for the corresponding condition.\n", 587 | "\n", 588 | "3. You call np.select(condlist, choicelist, default=0)\n", 589 | "\n", 590 | "**This cell will error - fix the error**\n", 591 | "\n", 592 | "- Hint:\n", 593 | "\n", 594 | "```python\n", 595 | "condition = [ (df['e'] < 10) & (df['e'] >= 7),\n", 596 | " ( df['e'] < 7)]\n", 597 | "choice = [ (df['c'] + df['d']).apply(lambda x : my_function(x) ), \n", 598 | " (df['a'] + df['b']).apply(lambda x : my_function(x + 100) ) ]\n", 599 | "default = (df['c'] * df['d'])\n", 600 | "result = np.select(condition, choice, default = default )\n", 601 | "```\n", 602 | "\n" 603 | ] 604 | }, 605 | { 606 | "cell_type": "code", 607 | "execution_count": null, 608 | "id": "09c5478c-585e-4a25-b614-edb9d482997a", 609 | "metadata": {}, 610 | "outputs": [], 611 | "source": [ 612 | "# np.select(condlist, choicelist, default=0)\n", 613 | "t1 = time.time()\n", 614 | "################### add code here ###########\n", 615 | "condition = [ (df['e'] < 10) & (df['e'] >= 7),\n", 616 | " ( df['e'] < 7)]\n", 617 | "choice = [ (df['c'] + df['d']).apply(lambda x : my_function(x) ), \n", 618 | " (df['a'] + df['b']).apply(lambda x : my_function(x + 100) ) ]\n", 619 | "default = (df['c'] * df['d'])\n", 620 | "result = np.select(condition, choice, default = default )\n", 621 | "#############################################\n", 622 | "df['new'] = result # store the selected values; calling np.select a second time would double the measured time\n", 623 | "t2 = time.time()\n", 624 | "print(\"time :\", t2-t1)\n", 625 | "timing['Numpy Select on Pandas df'] = t2 - t1\n", 626 | "df.head()" 627 | ] 628 | }, 629 | { 630 | "cell_type": "markdown", 631 | "id": "6669f4b6-16b3-463e-a620-5dad75ad73c0", 632 | "metadata": {}, 633 | "source": [ 634 | "Not bad. 
\n", 635 | "\n", 636 | "But I am using \"numpy.select\" and applying it to Pandas dataframes.\n", 637 | "\n", 638 | "Could we speed it up more if we drop Pandas and go completely with Numpy?" 639 | ] 640 | }, 641 | { 642 | "cell_type": "code", 643 | "execution_count": null, 644 | "id": "a13cc3bd-8ac5-4e6f-a1e9-effe60a52a67", 645 | "metadata": {}, 646 | "outputs": [], 647 | "source": [ 648 | "# Convert Pandas to numpy entirely\n", 649 | "t1 = time.time()\n", 650 | "npArr = df.to_numpy() # convert to numpy\n", 651 | "idx = {} # initialize a dictionary mapping column name to column position\n", 652 | "for index, value in enumerate(df.columns):\n", 653 | "    idx[value] = index\n", 654 | "df.loc[:,'new'] = npArr[:,idx['c']] * npArr[:,idx['d']] # default case: e == 10\n", 655 | "mask = (npArr[:,idx['e']] < 10) & (npArr[:,idx['e']] >= 7)\n", 656 | "df.loc[mask,'new'] = my_function(npArr[mask,idx['c']] + npArr[mask,idx['d']])\n", 657 | "mask = (npArr[:,idx['e']] < 7)\n", 658 | "df.loc[mask,'new'] = my_function(npArr[mask,idx['a']] + npArr[mask,idx['b']] + 100)\n", 659 | "t2 = time.time()\n", 660 | "print(\"time :\", t2-t1)\n", 661 | "df.head()\n", 662 | "fastest_time = t2-t1\n", 663 | "Speedup = baseTime / fastest_time\n", 664 | "print(\"Speed up: {:4.0f} X\".format(Speedup))\n", 665 | "timing['unrolled with masks on Numpy array'] = t2 - t1\n", 666 | "df.head()" 667 | ] 668 | }, 669 | { 670 | "cell_type": "markdown", 671 | "id": "6e7da9e2-73fc-4057-a682-6a959fc30016", 672 | "metadata": {}, 673 | "source": [ 674 | "WOW!! 
Now we are talking - something over 60X speedup!\n", 675 | "\n", 676 | "Code looks a little messy though.\n", 677 | "\n", 678 | "How about if we try the Numpy.Select trick again?\n", 679 | "\n", 680 | "This cell will error - fix the error\n", 681 | "\n", 682 | "- Hint\n", 683 | "\n", 684 | "```python\n", 685 | "condition = [ (npArr[:,idx['e']] < 10) & (npArr[:,idx['e']] >= 7),\n", 686 | " (npArr[:,idx['e']] < 7)]\n", 687 | "\n", 688 | "choice = [(my_function(npArr[:,idx['c']] + npArr[:,idx['d']] )), \n", 689 | " (my_function(npArr[:,idx['a']] + npArr[:,idx['b']] + 100))]\n", 690 | "\n", 691 | "tmp = np.select(condition, choice, default= (npArr[:,idx['c']] * npArr[:,idx['d']]) )\n", 692 | "```" 693 | ] 694 | }, 695 | { 696 | "cell_type": "code", 697 | "execution_count": null, 698 | "id": "30b66fa3-70a2-4a1b-b9a1-04dc36518c05", 699 | "metadata": {}, 700 | "outputs": [], 701 | "source": [ 702 | "# np.select(condlist, choicelist, default=0)\n", 703 | "# Convert Pandas to numpy entirely\n", 704 | "t1 = time.time()\n", 705 | "npArr = df.to_numpy() # convert to numpy\n", 706 | "\n", 707 | "condition = [ (npArr[:,idx['e']] < 10) & (npArr[:,idx['e']] >= 7),\n", 708 | " (npArr[:,idx['e']] < 7)]\n", 709 | "\n", 710 | "choice = [(my_function(npArr[:,idx['c']] + npArr[:,idx['d']] )), \n", 711 | " (my_function(npArr[:,idx['a']] + npArr[:,idx['b']] + 100))]\n", 712 | "\n", 713 | "tmp = np.select(condition, choice, default= (npArr[:,idx['c']] * npArr[:,idx['d']]) )\n", 714 | "\n", 715 | "df.loc[:,'new'] = tmp\n", 716 | "t2 = time.time()\n", 717 | "\n", 718 | "print(\"time :\", t2-t1)\n", 719 | "\n", 720 | "fastest_time = t2-t1\n", 721 | "Speedup = baseTime / fastest_time\n", 722 | "print(\"Speed up: {:4.0f} X\".format(Speedup))\n", 723 | "timing['Numpy Select Pure'] = t2 - t1\n", 724 | "df.head()" 725 | ] 726 | }, 727 | { 728 | "cell_type": "markdown", 729 | "id": "12a0dde3-7dd2-473f-91a5-9c59327de198", 730 | "metadata": {}, 731 | "source": [ 732 | "### Plot the results" 733 | ] 
734 | }, 735 | { 736 | "cell_type": "code", 737 | "execution_count": null, 738 | "id": "a276f7b1-5411-412c-8a8d-78fe4bf68623", 739 | "metadata": {}, 740 | "outputs": [], 741 | "source": [ 742 | "%matplotlib inline\n", 743 | "import matplotlib.pyplot as plt\n", 744 | "plt.figure(figsize=(10,6))\n", 745 | "plt.title(\"Plot of various methods of applying conditional (a,b,c,d,e) logic to a dataframe\")\n", 746 | "plt.ylabel(\"Time in seconds (log scale)\",fontsize=12)\n", 747 | "plt.xlabel(\"Various types of operations\",fontsize=14)\n", 748 | "plt.grid(True)\n", 749 | "plt.yscale('log')\n", 750 | "plt.xticks(rotation=-60)\n", 751 | "plt.bar(x = range(len(timing)), height=list(timing.values()), align='center', tick_label=list(timing.keys()))\n", 752 | "short = min(list(timing.values()))\n", 753 | "long = max(list(timing.values()))\n", 754 | "print('Speedup : {:4.0f} X'.format(long/short))" 755 | ] 756 | }, 757 | { 758 | "cell_type": "markdown", 759 | "id": "560e2948-d43e-48e7-a104-faf85a00cc5f", 760 | "metadata": {}, 761 | "source": [ 762 | "WOW!!! Hundreds of times faster than Pandas Apply AND the code is cleaner!\n", 763 | "\n", 764 | "SHIP IT!" 
765 | ] 766 | }, 767 | { 768 | "cell_type": "code", 769 | "execution_count": null, 770 | "id": "3dae03aa-0b27-4b78-8ddd-c86018c9994f", 771 | "metadata": {}, 772 | "outputs": [], 773 | "source": [ 774 | "print(\"Done\")" 775 | ] 776 | } 777 | ], 778 | "metadata": { 779 | "kernelspec": { 780 | "display_name": "pytorch-gpu", 781 | "language": "python", 782 | "name": "pytorch-gpu" 783 | }, 784 | "language_info": { 785 | "codemirror_mode": { 786 | "name": "ipython", 787 | "version": 3 788 | }, 789 | "file_extension": ".py", 790 | "mimetype": "text/x-python", 791 | "name": "python", 792 | "nbconvert_exporter": "python", 793 | "pygments_lexer": "ipython3", 794 | "version": "3.9.16" 795 | } 796 | }, 797 | "nbformat": 4, 798 | "nbformat_minor": 5 799 | } 800 | -------------------------------------------------------------------------------- /08_Compare_CrossEntropy_Softmax_KL_Mean_Std.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "id": "5859fd3b", 6 | "metadata": {}, 7 | "source": [ 8 | "# Learning Objectives\n", 9 | "\n", 10 | "\n", 11 | "Inefficient Python loops, especially large trip count loops that perform low level operations, negatively impact numeric computation performance.\n", 12 | "\n", 13 | "Replacing such loops has the following benefits, which are related to this module's learning objectives.\n", 14 | "\n", 15 | "| PyTorch UFUNCs | NumPy UFUNCs | Description |\n", 16 | "| --- | --- | --- |\n", 17 | "| torch.sigmoid(x)|1 / (1 + np.exp(-x))| **Sigmoid Function** |\n", 18 | "| torch.nn.Softmax() | np.exp(x)/np.exp(x).sum()| **Softmax function** |\n", 19 | "| torch.nn.CrossEntropyLoss() | -np.sum(p * np.log(q)) | **Cross Entropy Loss** |\n", 20 | "| torch.nn.KLDivLoss() | np.sum(p * np.log(p/q)) | **Kullback-Leibler Divergence (KL)** |\n", 21 | "\n", 22 | "Code will be:\n", 23 | " \n", 24 | "- More **readable**\n", 25 | "- More **maintainable** for the 
developer\n", 26 | "- Maintained by 3rd parties who invest THEIR time perfecting the algorithm\n", 27 | "- **Faster** on existing hardware\n", 28 | "- Faster on future enhancements to hardware platforms and 3rd party library developments\n", 29 | "\n", 30 | "### Learning Objectives\n", 31 | "At the end of this module you will be able to:\n", 32 | "- Apply NumPy vectorized libraries to replace inefficient loopy code\n", 33 | "- Describe the benefits of using NumPy as an alternative to \"roll your own\" code\n", 34 | "\n", 35 | "**To run the lab**: These steps can be run on a laptop - NOT REQUIRED for DevCloud!\n", 36 | "**Laptop Requirements:**\n", 37 | "\n", 38 | "```bash\n", 39 | "conda config --add channels intel\n", 40 | "conda install numpy\n", 41 | "conda install scipy\n", 42 | "conda install pandas\n", 43 | "```\n", 44 | "\n", 45 | "## Python loops are bad for performance\n", 46 | "\n", 47 | "**Python is great!** It's a great language for AI. There are many, many advantages in using Python, especially for data science.\n", 48 | "\n", 49 | "- Easy to program (don’t worry about data types and fussy syntax, at least relative to C/C++ and other languages)\n", 50 | "- FAST for developing code!\n", 51 | "- Leverages a huge array of libraries to conquer any domain\n", 52 | "- Lots of quick answers to common issues on Stack Exchange\n", 53 | "\n", 54 | "**Python, however, is slow for massively repeating small tasks** - such as found in loops! 
\n", 55 | "- Python loops are SLOW\n", 56 | "- Slow, that is, compared to C, C++, Fortran and other typed languages\n", 57 | "- Python is forced to look up every occurrence and type of variable in a loop to determine what operations it can perform on that data type\n", 58 | "- It cannot usually take advantage of advances in hardware in terms of vector width increases, multiple cores, new instructions from a new HW instruction set, new AI accelerators, effective cache memory layout, and more\n", 59 | "\n", 60 | "**BUT: Python has library remedies to these ills!**\n", 61 | "\n", 62 | "Importing key libraries shifts the burden of computation to highly efficient code.\n", 63 | "\n", 64 | "**NumPy**, for example, through its focus on efficient elementwise operations, gives indirect access to the efficiencies afforded in \"C\".\n", 65 | "\n", 66 | "Libraries included in oneAPI, and NumPy, SciPy and Scikit-learn powered by Intel(r) oneAPI, give access to modern hardware advancements: better cache and memory usage, low level vector instructions, and more.\n", 67 | "\n", 68 | "By leveraging packages such as these powered by oneAPI AND keeping libraries up to date, more capability is added to your underlying frameworks, so that moving code, especially in a cloud world, can give you ready access to hardware acceleration, in many cases without having to modify this vectorized code.\n", 69 | "The underlying routines are largely written in C (some via Cython).\n", 70 | "\n", 71 | "NumPy arrays are densely packed arrays of homogeneous type. \n", 72 | "\n", 73 | "Python lists, by contrast, are arrays of pointers to objects, even when all of them are of the same type. So, with NumPy arrays you get the benefit of not having to check data types, and you also get locality of reference. Also, many NumPy operations are implemented in C, avoiding the general cost of loops in Python, pointer indirection and per-element dynamic type checking. 
The speed boost depends on which operations you’re performing.\n", 74 | "\n", 75 | "Goal of this module: **Search and destroy (replace) loops**\n", 76 | "\n", 77 | "- Avoid loops if you can - find an alternative if possible. \n", 78 | "- Sometimes it cannot be done - true data dependencies may limit our options. But many, many times there are alternatives.\n", 79 | "\n", 80 | "**The problem**\n", 81 | "\n", 82 | "- Loops isolate your code from hardware and software advances that update frequently.\n", 83 | "- They prevent you from effectively using key underlying resources - it is a waste.\n", 84 | "- They consume your time!\n", 85 | "- They can waste energy!\n", 86 | "\n", 87 | "\n", 88 | "## References:\n", 89 | "\n", 90 | "Video: Losing your Loops Fast Numerical Computing with NumPy by Jake VanderPlas.\n", 91 | "\n", 92 | "Book: Python Data Science Handbook by Jake VanderPlas.\n", 93 | "\n", 94 | "Book: Elegant SciPy: The Art of Scientific Python by Juan Nunez-Iglesias, Stéfan van der Walt, Harriet Dashnow.\n", 95 | "\n", 96 | "Article: The Ultimate NumPy Tutorial for Data Science Beginners by Aniruddha, April 28, 2020, at www.analyticsvidhya.com.\n", 97 | "\n", 98 | "Academic Lecture pdf: Vectorization by Aaron Birkland, Cornell CAC.\n", 99 | "\n" 100 | ] 101 | }, 102 | { 103 | "cell_type": "markdown", 104 | "id": "de622d6f", 105 | "metadata": {}, 106 | "source": [ 107 | "# Exercise: Compute Mean & Std of array\n", 108 | "\n", 109 | "Below is an example of a loop-based way to compute the mean and standard deviation for a list or vector of values (S is assumed to start at 0).\n", 110 | "\n", 111 | "```python\n", 112 | "for i in range (len(a)):\n", 113 | "    S += a[i]\n", 114 | "mean = S/len(a)\n", 115 | "std = 0\n", 116 | "for i in range (len(a)):\n", 117 | "    d = a[i] - mean\n", 118 | "    std += d*d\n", 119 | "std = np.sqrt(std/len(a))\n", 120 | "print(\"mean\", mean)\n", 121 | "print(\"std\", std)\n", 122 | "```\n", 123 | "\n", 124 | "In the following exercise, replace this code with a more 
readable and maintainable NumPy vectorized variant as follows:\n", 125 | "\n", 126 | "```python\n", 127 | "print(a.mean())\n", 128 | "print(a.std())\n", 129 | "```\n" 130 | ] 131 | }, 132 | { 133 | "cell_type": "code", 134 | "execution_count": null, 135 | "id": "c7b83ebd", 136 | "metadata": {}, 137 | "outputs": [], 138 | "source": [ 139 | "import numpy as np\n", 140 | "import torch\n", 141 | "import time\n", 142 | "import math\n", 143 | "\n", 144 | "rng = np.random.default_rng(2021)\n", 145 | "# np.random.default_rng is the recommended way to generate random numbers\n", 146 | "# see blog \"Stop using numpy.random.seed()\" for reasoning\n", 147 | "# https://towardsdatascience.com/stop-using-numpy-random-seed-581a9972805f\n", 148 | "\n", 149 | "a = rng.random((10_000_000,))\n", 150 | "a_torch = torch.from_numpy(a)\n", 151 | "t1 = time.time()\n", 152 | "timing = {}\n", 153 | "S = 0\n", 154 | "\n", 155 | "################################ code to replace in next cell ##############\n", 156 | "for i in range (len(a)):\n", 157 | "    S += a[i]\n", 158 | "mean = S/len(a)\n", 159 | "std = 0\n", 160 | "for i in range (len(a)):\n", 161 | "    d = a[i] - mean\n", 162 | "    std += d*d\n", 163 | "std = np.sqrt(std/len(a))\n", 164 | "print(\"mean\", mean)\n", 165 | "print(\"std\", std)\n", 166 | "############################################################################\n", 167 | "timing['loop'] = time.time() - t1\n", 168 | "\n", 169 | "\n", 170 | "print(timing)" 171 | ] 172 | }, 173 | { 174 | "cell_type": "markdown", 175 | "id": "c92aa67c-6e5b-4156-9dac-b4d6b945f1c1", 176 | "metadata": {}, 177 | "source": [ 178 | "# Show Intel oneMKL-DNN under the hood" 179 | ] 180 | }, 181 | { 182 | "cell_type": "code", 183 | "execution_count": null, 184 | "id": "8168c553-86f2-4341-864b-bc146b29f859", 185 | "metadata": {}, 186 | "outputs": [], 187 | "source": [ 188 | "print(torch.__config__.parallel_info())" 189 | ] 190 | }, 191 | { 192 | "cell_type": "markdown", 193 | "id": "b12d8d2e", 194 | 
"metadata": {}, 195 | "source": [ 196 | "## Exercise: use NumPy mean and std\n", 197 | "This cell will error - fix the error\n", 198 | "\n", 199 | "Hint:\n", 200 | "\n", 201 | "```python\n", 202 | "print(a.mean())\n", 203 | "print(a.std())\n", 204 | "```" 205 | ] 206 | }, 207 | { 208 | "cell_type": "code", 209 | "execution_count": null, 210 | "id": "d329b1f3", 211 | "metadata": {}, 212 | "outputs": [], 213 | "source": [ 214 | "t1 = time.time()\n", 215 | "##### insert NumPy code here ###############\n", 216 | "#print(np.xxx())\n", 217 | "print(a.mean())\n", 218 | "print(a.std()) \n", 219 | "############################################\n", 220 | "\n", 221 | "timing['numpy'] = time.time() - t1\n", 222 | "print(timing)\n", 223 | "print(f\"NumPy Acceleration {timing['loop']/timing['numpy']:4.1f} X\")\n", 224 | "\n", 225 | "t1 = time.time()\n", 226 | "##### insert PyTorch code here #############\n", 227 | "#print(torch.xxx())\n", 228 | "print(a_torch.mean())\n", 229 | "print(a_torch.std()) \n", 230 | "############################################\n", 231 | "\n", 232 | "timing['pytorch'] = time.time() - t1  # separate key so the NumPy timing is not overwritten\n", 233 | "print(timing)\n", 234 | "print(f\"PyTorch Acceleration {timing['loop']/timing['pytorch']:4.1f} X\")" 235 | ] 236 | }, 237 | { 238 | "cell_type": "markdown", 239 | "id": "e2294b5a-db88-4c20-b5ee-32edc80bab04", 240 | "metadata": {}, 241 | "source": [ 242 | "```python\n", 243 | "rng = np.random.default_rng(2021)\n", 244 | "a = rng.random((10_000_000,))\n", 245 | "a = rng.random((10_000_000,))\n", 246 | "a_torch = torch.from_numpy(a)\n", 247 | "```" 248 | ] 249 | }, 250 | { 251 | "cell_type": "code", 252 | "execution_count": null, 253 | "id": "9c0f1a4c-31cf-49ad-a43d-be99d2557178", 254 | "metadata": {}, 255 | "outputs": [], 256 | "source": [ 257 | "t1 = time.time()\n", 258 | "##### insert NumPy code here ###############\n", 259 | "#print(np.xxx())\n", 260 | "a.mean()\n", 261 | "a.std()\n", 262 | "############################################\n", 263 | "\n", 264 |
"timing['numpy'] = time.time() - t1\n", 265 | "print(f\"NumPy Acceleration {timing['loop']/timing['numpy']:4.1f} X\")\n", 266 | "\n", 267 | "t1 = time.time()\n", 268 | "##### insert PyTorch code here #############\n", 269 | "#print(torch.xxx())\n", 270 | "a_torch.mean()\n", 271 | "a_torch.std() \n", 272 | "############################################\n", 273 | "\n", 274 | "timing['pytorch'] = time.time() - t1  # separate key so the NumPy timing is not overwritten\n", 275 | "print(f\"PyTorch Acceleration {timing['loop']/timing['pytorch']:4.1f} X\")" 276 | ] 277 | }, 278 | { 279 | "cell_type": "code", 280 | "execution_count": null, 281 | "id": "db1f9617", 282 | "metadata": {}, 283 | "outputs": [], 284 | "source": [ 285 | "%matplotlib inline\n", 286 | "import matplotlib.pyplot as plt\n", 287 | "plt.figure(figsize=(10,6))\n", 288 | "plt.title(\"Measure acceleration of looping versus NumPy Mean & STD [Lower is better]\",fontsize=12)\n", 289 | "plt.ylabel(\"Time in seconds\",fontsize=12)\n", 290 | "plt.yscale('log')\n", 291 | "plt.xlabel(\"Various types of operations\",fontsize=14)\n", 292 | "plt.grid(True)\n", 293 | "plt.bar(x = list(timing.keys()), height= list(timing.values()), align='center',tick_label=list(timing.keys()))" 294 | ] 295 | }, 296 | { 297 | "cell_type": "markdown", 298 | "id": "9e25d7aa", 299 | "metadata": {}, 300 | "source": [ 301 | "# Cross Entropy\n", 302 | "\n", 303 | "Cross-entropy calculations are done all the time in machine learning. Consider the loopy, stream-of-consciousness variant of computing cross entropy shown in the next cell.
Try your hand at removing the loop and using a NumPy ufunc, aggregation, or other NumPy construct to make this code more readable and faster.\n", 304 | "\n", 305 | "### Logistic Regression\n", 306 | "\n", 307 | "probability of y = 1\n", 308 | "\n", 309 | "# $ \\hat{y} = q_{(y=1)} = g(w,x) = \\frac{1}{1 + e^{-wx}} $ \n", 310 | "\n", 311 | "probability of y = 0\n", 312 | "\n", 313 | "\n", 314 | "# $ q_{(y=0)} = 1-\\hat{y}$\n", 315 | "\n", 316 | "w = weights\n", 317 | "\n", 318 | "x = input vector\n", 319 | "\n", 320 | "\n", 321 | "## cross-entropy to get a measure of dissimilarity between p and q\n", 322 | "\n", 323 | "# $ H(p,q) = - \\sum_i p_i \\log(q_i) = -y \\log \\hat{y} - (1-y) \\log(1-\\hat{y}) $\n" 324 | ] 325 | }, 326 | { 327 | "cell_type": "markdown", 328 | "id": "10f483f9", 329 | "metadata": {}, 330 | "source": [ 331 | "# Slow python method" 332 | ] 333 | }, 334 | { 335 | "cell_type": "code", 336 | "execution_count": null, 337 | "id": "ed0f384d", 338 | "metadata": { 339 | "scrolled": true 340 | }, 341 | "outputs": [], 342 | "source": [ 343 | "import matplotlib.pyplot as plt\n", 344 | "import math\n", 345 | "from math import log2\n", 346 | "import numpy as np\n", 347 | "import time\n", 348 | "# Sigmoid function\n", 349 | "\n", 350 | "timing = {}\n", 351 | "\n", 352 | "######################### This is the targeted function for this exercise #####\n", 353 | "def sigmoid_slow(z):\n", 354 | "    sig = [1.0/(1.0+math.exp(-1*xi)) for xi in z]\n", 355 | "    return sig\n", 356 | "###############################################################################\n", 357 | "\n", 358 | "\n", 359 | "# yHat represents the predicted value / probability value calculated as output of hypothesis / sigmoid function\n", 360 | " \n", 361 | "# y represents label\n", 362 | "\n", 363 | "# cross_entropy list comprehension method\n", 364 | "\n", 365 | "def cross_entropy_loss_slow(yHat, y):\n", 366 | "    if y == 1:\n", 367 | "        return [-1*math.log(yi) for yi in yHat]\n", 368 | "    else:\n", 369 | "        
return [-1*math.log(1.0 - yi) for yi in yHat]\n", 370 | " \n", 371 | "x = [xi/100_000. for xi in range(-1_000_000, 1_000_000)] # num between -10 and 10 step .00001\n", 372 | "p = x\n", 373 | "q = []\n", 374 | "start = time.time()\n", 375 | "sig_x = sigmoid_slow(x)\n", 376 | "cost_1 = cross_entropy_loss_slow(sig_x, 1)\n", 377 | "cost_0 = cross_entropy_loss_slow(sig_x, 0)\n", 378 | "timing['list comprehension'] = time.time() - start\n", 379 | "print(f\"time elapsed: {timing['list comprehension']:5.3f} sec\")\n", 380 | "fig, ax = plt.subplots(figsize=(8,6))\n", 381 | "plt.plot(sig_x, cost_1, label='$ -\log(\hat{y}) $ if y=1')\n", 382 | "plt.plot(sig_x, cost_0, label='$ -\log(1 - \hat{y}) $ if y=0')\n", 383 | "plt.xlabel('sigmoid(x)')\n", 384 | "plt.ylabel('cross-entropy loss')\n", 385 | "plt.legend(loc='best')\n", 386 | "plt.tight_layout()\n", 387 | "plt.show()" 388 | ] 389 | }, 390 | { 391 | "cell_type": "markdown", 392 | "id": "f79978c9", 393 | "metadata": {}, 394 | "source": [ 395 | "## Hint:\n", 396 | "\n", 397 | "```python\n", 398 | "def sigmoid(x):\n", 399 | "    return 1.0 / (1.0 + np.exp(-x))\n", 400 | "```" 401 | ] 402 | }, 403 | { 404 | "cell_type": "code", 405 | "execution_count": null, 406 | "id": "2f934884", 407 | "metadata": {}, 408 | "outputs": [], 409 | "source": [ 410 | "import matplotlib.pyplot as plt\n", 411 | "import numpy as np\n", 412 | "import torch\n", 413 | "# Sigmoid function NumPy\n", 414 | "def sigmoid(z):\n", 415 | "##### insert improved NumPy code here\n", 416 | "    return 1.0 / (1.0 + np.exp(-z))\n", 417 | "#####################################\n", 418 | "\n", 419 | "\n", 420 | "# yHat represents the predicted value / probability value calculated as output of hypothesis / sigmoid function\n", 421 | " \n", 422 | "# y represents label\n", 423 | "\n", 424 | "# cross_entropy NumPy method\n", 425 | "\n", 426 | "def cross_entropy_loss(yHat, y):\n", 427 | "    if y == 1:\n", 428 | "        return -np.log(yHat)\n", 429 | "    else:\n", 430 | "        return -np.log(1 - yHat)\n", 431 | " 
\n", 432 | "x = np.arange(-10, 10, 0.00001)\n", 433 | "start = time.time()\n", 434 | "sig_x = sigmoid(x)\n", 435 | "cost_1 = cross_entropy_loss(sig_x, 1)\n", 436 | "cost_0 = cross_entropy_loss(sig_x, 0)\n", 437 | "\n", 438 | "timing['numpy'] = time.time() - start\n", 439 | "print(f\"time elapsed: {timing['numpy']:5.3f} sec\")\n", 440 | "\n", 441 | "ratio = timing['list comprehension']/timing['numpy']\n", 442 | "print(f'Acceleration: {ratio:5.1f}X')\n", 443 | "\n", 444 | "fig, ax = plt.subplots(figsize=(8,6))\n", 445 | "plt.plot(sig_x, cost_1, label='$ -\log(\hat{y}) $ if y=1')\n", 446 | "plt.plot(sig_x, cost_0, label='$ -\log(1 - \hat{y}) $ if y=0')\n", 447 | "plt.xlabel('sigmoid(x)')\n", 448 | "plt.ylabel('cross-entropy loss')\n", 449 | "plt.legend(loc='best')\n", 450 | "plt.tight_layout()\n", 451 | "plt.show()" 452 | ] 453 | }, 454 | { 455 | "cell_type": "code", 456 | "execution_count": null, 457 | "id": "c56cea94-5fae-4f09-9865-3dcfbffe33f0", 458 | "metadata": {}, 459 | "outputs": [], 460 | "source": [ 461 | "%matplotlib inline\n", 462 | "import matplotlib.pyplot as plt\n", 463 | "plt.figure(figsize=(10,6))\n", 464 | "plt.title(\"Measure acceleration of looping versus NumPy Cross Entropy Loss [Lower is better]\",fontsize=12)\n", 465 | "plt.ylabel(\"Time in seconds\",fontsize=12)\n", 466 | "plt.yscale('log')\n", 467 | "plt.xlabel(\"Various types of operations\",fontsize=14)\n", 468 | "plt.grid(True)\n", 469 | "plt.bar(x = list(timing.keys()), height= list(timing.values()), align='center',tick_label=list(timing.keys()))" 470 | ] 471 | }, 472 | { 473 | "cell_type": "markdown", 474 | "id": "092fee78-fa99-42f9-a7eb-8e92ba264c4f", 475 | "metadata": {}, 476 | "source": [ 477 | "# Implement a PyTorch CrossEntropyLoss\n", 478 | "\n", 479 | "The cell below will error - replace the xxx with valid code!"
480 | ] 481 | }, 482 | { 483 | "cell_type": "code", 484 | "execution_count": null, 485 | "id": "9db87ab7-4b47-4288-a9e7-01ffaeed0bdd", 486 | "metadata": {}, 487 | "outputs": [], 488 | "source": [ 489 | "import matplotlib.pyplot as plt\n", 490 | "import numpy as np\n", 491 | "import torch\n", 492 | "\n", 493 | "timing = {}\n", 494 | "\n", 495 | "x = np.arange(-10, 10, 0.00001)\n", 496 | "start = time.time()  # start the clock after building x, as in the PyTorch case below\n", 497 | "numpy_sig_x = sigmoid(x)\n", 498 | "timing['sigmoidNumPy'] = time.time() - start\n", 499 | "print(numpy_sig_x)\n", 500 | "del x\n", 501 | "\n", 502 | "x = torch.arange(-10, 10, 0.00001)\n", 503 | "start = time.time()\n", 504 | "torch_sig_x = torch.sigmoid(x)\n", 505 | "timing['sigmoidPyTorch'] = time.time() - start\n", 506 | "print(torch_sig_x.cpu().detach().numpy())\n", 507 | "\n", 508 | "ratio = timing['sigmoidNumPy']/timing['sigmoidPyTorch']\n", 509 | "print(f\"\\nAcceleration PyTorch to NumPy: {ratio:4.1f} X\")\n" 510 | ] 511 | }, 512 | { 513 | "cell_type": "code", 514 | "execution_count": null, 515 | "id": "e8065bc7-fabd-492f-becc-923d09ff0629", 516 | "metadata": {}, 517 | "outputs": [], 518 | "source": [] 519 | }, 520 | { 521 | "cell_type": "code", 522 | "execution_count": null, 523 | "id": "b0a8db45", 524 | "metadata": {}, 525 | "outputs": [], 526 | "source": [ 527 | "%matplotlib inline\n", 528 | "import matplotlib.pyplot as plt\n", 529 | "plt.figure(figsize=(10,6))\n", 530 | "plt.title(\"Measure acceleration of NumPy versus PyTorch sigmoid [Lower is better]\",fontsize=12)\n", 531 | "plt.ylabel(\"Time in seconds\",fontsize=12)\n", 532 | "plt.yscale('log')\n", 533 | "plt.xlabel(\"Various types of operations\",fontsize=14)\n", 534 | "plt.grid(True)\n", 535 | "plt.bar(x = list(timing.keys()), height= list(timing.values()), align='center',tick_label=list(timing.keys()))" 536 | ] 537 | }, 538 | { 539 | "cell_type": "markdown", 540 | "id": "e08e9e72", 541 | "metadata": {}, 542 | "source": [ 543 | "# Softmax Loop\n", 544 | "\n", 545 |
"Another algorithm used all the time in machine learning is softmax.\n", 546 | "\n", 547 | "The softmax function, or normalized exponential function, converts a vector of K real numbers into a probability distribution of K possible outcomes.\n", 548 | "\n", 549 | "### Below is the slower Python for-loop method" 550 | ] 551 | }, 552 | { 553 | "cell_type": "code", 554 | "execution_count": null, 555 | "id": "5a00b0c7", 556 | "metadata": {}, 557 | "outputs": [], 558 | "source": [ 559 | "import numpy as np\n", 560 | "import torch\n", 561 | "import time\n", 562 | "import math\n", 563 | "from sys import getsizeof\n", 564 | "\n", 565 | "timing = {}\n", 566 | "np.random.seed(seed=42)\n", 567 | "BIG = 10_000_000\n", 568 | "b = list(np.random.rand(BIG))\n", 569 | "def softmax_slow(x):\n", 570 | "    denominator = 0.0\n", 571 | "    for xi in x:\n", 572 | "        denominator += math.exp(xi)\n", 573 | "    return [math.exp(xi)/denominator for xi in x]\n", 574 | "start = time.time()\n", 575 | "softmax_slow(b)\n", 576 | "timing['softmax_loop'] = time.time() - start\n", 577 | "print(f\"time elapsed: {timing['softmax_loop']:5.3f} sec\")\n", 578 | "#print(f'memory: {getsizeof(b):,}')\n" 579 | ] 580 | }, 581 | { 582 | "cell_type": "markdown", 583 | "id": "8bbe6945", 584 | "metadata": {}, 585 | "source": [ 586 | "# Softmax NumPy & PyTorch\n", 587 | "\n", 588 | "More readable, maintainable, and faster NumPy method:\n", 589 | "\n", 590 | "```python\n", 591 | "def softmax(x):\n", 592 | "    return(np.exp(x)/np.exp(x).sum()) # one line of code, no loop indices \n", 593 | "```" 594 | ] 595 | }, 596 | { 597 | "cell_type": "code", 598 | "execution_count": null, 599 | "id": "894fd136", 600 | "metadata": {}, 601 | "outputs": [], 602 | "source": [ 603 | "np.random.seed(seed=42)\n", 604 | "b = np.random.rand(BIG)\n", 605 | "b_torch = torch.from_numpy(b)\n", 606 | "\n", 607 | "def softmax_numpy(x):\n", 608 | "    ########### insert solution here\n", 609 | "    return(np.exp(x)/np.exp(x).sum()) # one line of
code, no loop indices \n", 610 | " ################################\n", 611 | " \n", 612 | "def softmax_torch(x):\n", 613 | " #doIt = torch.nn.Softmax(dim=0)\n", 614 | " #return doIt(x)\n", 615 | " return(torch.exp(x)/torch.exp(x).sum())\n", 616 | "\n", 617 | "\n", 618 | "start = time.time()\n", 619 | "softmax_numpy(b)\n", 620 | "timing['softmax_numpy'] = time.time() - start\n", 621 | "print(f\"time elapsed: {timing['softmax_numpy']:5.3f} sec\")\n", 622 | "ratio = timing['softmax_loop'] / timing['softmax_numpy'] \n", 623 | "print(f'NumPy Acceleration: {ratio:5.4g}X')\n", 624 | "print(softmax_numpy(b))\n", 625 | "\n", 626 | " \n", 627 | "start = time.time()\n", 628 | "softmax_torch(b_torch)\n", 629 | "timing['softmax_pytorch'] = time.time() - start\n", 630 | "print(f\"time elapsed: {timing['softmax_pytorch']:5.3f} sec\")\n", 631 | "ratio = timing['softmax_loop'] / timing['softmax_pytorch'] \n", 632 | "print(f'PyTorch Acceleration: {ratio:5.4g}X')\n", 633 | "print(softmax_torch(b_torch))\n" 634 | ] 635 | }, 636 | { 637 | "cell_type": "code", 638 | "execution_count": null, 639 | "id": "4ea116cf-975f-43c8-81d0-f7a5055874ff", 640 | "metadata": {}, 641 | "outputs": [], 642 | "source": [] 643 | }, 644 | { 645 | "cell_type": "code", 646 | "execution_count": null, 647 | "id": "c36f9746", 648 | "metadata": {}, 649 | "outputs": [], 650 | "source": [ 651 | "%matplotlib inline\n", 652 | "import matplotlib.pyplot as plt\n", 653 | "plt.figure(figsize=(10,6))\n", 654 | "plt.title(\"Measure acceleration of looping versus Numpy Softmax [Lower is better]\",fontsize=12)\n", 655 | "plt.ylabel(\"Time in seconds\",fontsize=12)\n", 656 | "plt.yscale('log')\n", 657 | "plt.xlabel(\"Various types of operations\",fontsize=14)\n", 658 | "plt.grid(True)\n", 659 | "plt.bar(x = list(timing.keys()), height= list(timing.values()), align='center',tick_label=list(timing.keys()))" 660 | ] 661 | }, 662 | { 663 | "attachments": { 664 | "image.png": { 665 | "image/png": 
"(base64 PNG data omitted)" 666 | } 667 | }, 668 | "cell_type": "markdown", 669 | "id": "716d4d86", 670 | "metadata": {},
671 | "source": [ 672 | "# Kullback-Leibler Divergence (KL)\n", 673 | "\n", 674 | "The KL divergence score describes how much one probability distribution differs from another.\n", 675 | "\n", 676 | "It is fairly widely used in the data mining literature and is a key ingredient in variational autoencoders. The concept originated in probability theory and information theory.\n", 677 | "\n", 678 | "A KL of zero means the two distributions are identical. For discrete distributions, $ KL(P \\| Q) = \\sum_i p_i \\log(p_i/q_i) $, which is what both code variants below compute.\n", 679 | "\n", 680 | "![image.png](attachment:image.png)\n", 681 | "\n", 682 | "- Which is easier to read and maintain?\n", 683 | "\n", 684 | "- Which is faster?\n", 685 | "\n", 686 | "### Naive Loop\n", 687 | "\n", 688 | "```python\n", 689 | "\n", 690 | "def kl_divergence(p, q):\n", 691 | "    s = []\n", 692 | "    for i in range(len(p)):\n", 693 | "        s.append(p[i] * log(p[i]/q[i]))\n", 694 | "    return sum(s)\n", 695 | "```\n", 696 | "\n", 697 | "### NumPy vector\n", 698 | "\n", 699 | "```python\n", 700 | "def kl_divergence(p, q):\n", 701 | "    return np.sum(p * np.log(p/q))\n", 702 | "```" 703 | ] 704 | }, 705 | { 706 | "cell_type": "code", 707 | "execution_count": null, 708 | "id": "6a0460e3", 709 | "metadata": { 710 | "scrolled": true 711 | }, 712 | "outputs": [], 713 | "source": [ 714 | "def kl_divergence_slow(p, q):\n", 715 | "    s = []\n", 716 | "    for i in range(len(p)):\n", 717 | "        s.append(p[i] * log(p[i]/q[i]))\n", 718 | "    return sum(s)" 719 | ] 720 | }, 721 | { 722 | "cell_type": "markdown", 723 | "id": "279bb501", 724 | "metadata": {}, 725 | "source": [ 726 | "# Compute, Print Naive KL Loop, Naive Softmax" 727 | ] 728 | }, 729 | { 730 | "cell_type": "code", 731 | "execution_count": null, 732 | "id": "e96c78d5", 733 | "metadata": {}, 734 | "outputs": [], 735 | "source": [ 736 | "from math import log\n", 737 | "timing = {}\n", 738 | "start = time.time()\n", 739 | "np.random.seed(seed=42)\n", 740 | "BIG = 10_000_000\n", 741 | "p = softmax_slow(np.random.rand(BIG))\n", 742 | "q =
softmax_slow(np.random.rand(BIG))\n", 743 | "# calculate (P || Q)\n", 744 | "kl_pq = kl_divergence_slow(p, q)\n", 745 | "print('KL(P || Q): %.3f nats' % kl_pq)\n", 746 | "# calculate (Q || P)\n", 747 | "kl_qp = kl_divergence_slow(q, p)\n", 748 | "print('KL(Q || P): %.3f nats' % kl_qp)\n", 749 | "timing['naiveLoop'] = time.time() - start\n", 750 | "\n", 751 | "# plot of distributions\n", 752 | "from matplotlib import pyplot\n", 753 | "\n", 754 | "print(f'timing Naive KL w Softmax Loops: {timing[\"naiveLoop\"]:.2f} sec')\n", 755 | "print('P=%.3f Q=%.3f' % (sum(p), sum(q)))\n", 756 | "del p\n", 757 | "del q" 758 | ] 759 | }, 760 | { 761 | "cell_type": "markdown", 762 | "id": "4805e510", 763 | "metadata": {}, 764 | "source": [ 765 | "# SciPy equivalent is Relative Entropy: rel_entr" 766 | ] 767 | }, 768 | { 769 | "cell_type": "code", 770 | "execution_count": null, 771 | "id": "7e10f2d9", 772 | "metadata": {}, 773 | "outputs": [], 774 | "source": [ 775 | "# example of calculating the kl divergence (relative entropy) with scipy\n", 776 | "from scipy.special import rel_entr\n", 777 | "# define distributions\n", 778 | "start = time.time()\n", 779 | "np.random.seed(seed=42)\n", 780 | "p = softmax_numpy(np.random.rand(BIG))\n", 781 | "q = softmax_numpy(np.random.rand(BIG))\n", 782 | "# calculate (P || Q)\n", 783 | "\n", 784 | "###### SciPy equivalent #################\n", 785 | "kl_pq = rel_entr(p, q)  # elementwise p * log(p/q)\n", 786 | "#########################################\n", 787 | "\n", 788 | "print('KL(P || Q): %.3f nats' % np.sum(kl_pq))  # np.sum avoids a Python-level loop over the array\n", 789 | "# calculate (Q || P)\n", 790 | "\n", 791 | "###### SciPy equivalent #################\n", 792 | "kl_qp = rel_entr(q, p)\n", 793 | "#########################################\n", 794 | "\n", 795 | "print('KL(Q || P): %.3f nats' % np.sum(kl_qp))\n", 796 | "timing['SciPy'] = time.time() - start\n", 797 | "print(f'timing SciPy: {timing[\"SciPy\"]:.2f} sec')\n" 798 | ] 799 | }, 800 | { 801 | "cell_type": "markdown", 802 | "id": "e100dace", 803 | "metadata":
{}, 804 | "source": [ 805 | "# Compute Naive KL Loop, NumPy Softmax" 806 | ] 807 | }, 808 | { 809 | "cell_type": "code", 810 | "execution_count": null, 811 | "id": "c081108d", 812 | "metadata": {}, 813 | "outputs": [], 814 | "source": [ 815 | "start = time.time()\n", 816 | "np.random.seed(seed=42)\n", 817 | "p = softmax_numpy(np.random.rand(BIG))\n", 818 | "q = softmax_numpy(np.random.rand(BIG))\n", 819 | "# calculate (P || Q)\n", 820 | "kl_pq = kl_divergence_slow(p, q)\n", 821 | "print('KL(P || Q): %.3f nats' % kl_pq)\n", 822 | "# calculate (Q || P)\n", 823 | "kl_qp = kl_divergence_slow(q, p)\n", 824 | "print('KL(Q || P): %.3f nats' % kl_qp)\n", 825 | "timing['naiveLoopFastSoftMax'] = time.time() - start\n", 826 | "\n", 827 | "# plot of distributions\n", 828 | "from matplotlib import pyplot\n", 829 | "# define distributions\n", 830 | "print(f'timing NaiveLoopNumPySoftmax: {timing[\"naiveLoopFastSoftMax\"]:.2f} sec')\n", 831 | "print('P=%.3f Q=%.3f' % (sum(p), sum(q)))\n", 832 | "del p,q" 833 | ] 834 | }, 835 | { 836 | "cell_type": "markdown", 837 | "id": "c8ec8ad3", 838 | "metadata": {}, 839 | "source": [ 840 | "# Compute NumPy KL Loop, NumPy Softmax" 841 | ] 842 | }, 843 | { 844 | "cell_type": "code", 845 | "execution_count": null, 846 | "id": "a6d817e7", 847 | "metadata": {}, 848 | "outputs": [], 849 | "source": [ 850 | "def kl_divergence(p, q):\n", 851 | " return np.sum(p * np.log(p/q))" 852 | ] 853 | }, 854 | { 855 | "cell_type": "code", 856 | "execution_count": null, 857 | "id": "e86c7e4e", 858 | "metadata": {}, 859 | "outputs": [], 860 | "source": [ 861 | "# define distributions\n", 862 | "start = time.time()\n", 863 | "np.random.seed(seed=42)\n", 864 | "p = softmax_numpy(np.random.rand(BIG))\n", 865 | "q = softmax_numpy(np.random.rand(BIG))\n", 866 | "# calculate (P || Q)\n", 867 | "kl_pq = kl_divergence(p, q)\n", 868 | "print('KL(P || Q): %.3f nats' % np.sum(kl_pq))\n", 869 | "# calculate (Q || P)\n", 870 | "kl_qp = kl_divergence(q, p)\n", 871 | 
"print('KL(Q || P): %.3f nats' % np.sum(kl_qp))\n", 872 | "timing['NumPyFastAll'] = time.time() - start\n", 873 | "print(f'timing NumPyFastAll: {timing[\"NumPyFastAll\"]:.2f} sec')\n" 874 | ] 875 | }, 876 | { 877 | "cell_type": "markdown", 878 | "id": "1c004863-8a6b-4972-857a-1e815c52c75f", 879 | "metadata": {}, 880 | "source": [ 881 | "# Compute PyTorch KL, NumPy Softmax\n", 882 | "\n", 883 | "Something is wrong with the PyTorch code below.\n", 884 | "\n", 885 | "See if you can get it to match the NumPy result." 886 | ] 887 | }, 888 | { 889 | "cell_type": "code", 890 | "execution_count": null, 891 | "id": "b38659e3-fe22-4655-9863-568c63c9312b", 892 | "metadata": {}, 893 | "outputs": [], 894 | "source": [ 895 | "import torch.nn.functional as F\n", 896 | "start = time.time()\n", 897 | "\n", 898 | "kl_loss = torch.nn.KLDivLoss(reduction=\"batchmean\")\n", 899 | "# input should be a distribution in the log space\n", 900 | "input = torch.from_numpy(p)\n", 901 | "target = torch.from_numpy(q)\n", 902 | "# Sample a batch of distributions.
Usually this would come from the dataset\n", 903 | "output = kl_loss(input, target)\n", 904 | "print(output)\n", 905 | "timing['pytorch_NOT_SAME_RESULT'] = time.time() - start\n", 906 | "\n", 907 | "kl_loss = torch.nn.KLDivLoss(reduction=\"batchmean\", log_target=True)\n", 908 | "log_target = F.log_softmax(input, dim=0)\n", 909 | "output = kl_loss(input, log_target)\n", 910 | "print(output)" 911 | ] 912 | }, 913 | { 914 | "cell_type": "code", 915 | "execution_count": null, 916 | "id": "a9c5d5d5-615f-456e-90c2-582a215aba44", 917 | "metadata": {}, 918 | "outputs": [], 919 | "source": [ 920 | "plt.figure(figsize=(10,6))\n", 921 | "plt.title(\"Time taken to process\" ,fontsize=12)\n", 922 | "plt.ylabel(\"Time in seconds\",fontsize=12)\n", 923 | "plt.yscale('log')\n", 924 | "plt.xlabel(\"Various types of operations\",fontsize=14)\n", 925 | "plt.grid(True)\n", 926 | "plt.xticks(rotation=-60)\n", 927 | "plt.bar(x = list(timing.keys()), height= list(timing.values()), align='center',tick_label=list(timing.keys()))\n", 928 | "print('Acceleration : {:5.0f} X'.format(timing['naiveLoop']/(timing['pytorch_NOT_SAME_RESULT'])))" 929 | ] 930 | }, 931 | { 932 | "cell_type": "code", 933 | "execution_count": null, 934 | "id": "1fd6081e-8569-4389-9cca-78df26610067", 935 | "metadata": {}, 936 | "outputs": [], 937 | "source": [ 938 | "import torch\n", 939 | "print (torch.__version__)\n", 940 | "\n", 941 | "m = torch.nn.Softmax (dim=1)\n", 942 | "in_tensor_before_softmax = torch.Tensor ([0.1, 0.2, 0.4, 0.3])\n", 943 | "in_tensor = m (in_tensor_before_softmax.view (-1,4))\n", 944 | "out_tensor_before_softmax = torch.Tensor ([0.7, 0.1, 0.1, 0.1])\n", 945 | "out_tensor = m (out_tensor_before_softmax.view (-1,4))\n", 946 | "\n", 947 | "import torch.nn.functional as F\n", 948 | "kl_loss = torch.nn.KLDivLoss (reduction = 'batchmean')\n", 949 | "loss = torch.nn.CrossEntropyLoss()\n", 950 | "kl_output = kl_loss (input = F.log_softmax (in_tensor_before_softmax, dim=-1), target = out_tensor)\n", 
951 | "cross_ent = loss (input = in_tensor_before_softmax.view (-1,4), target = out_tensor)\n", 952 | "ent = loss (input=out_tensor_before_softmax.view (-1,4), target = out_tensor)\n", 953 | "kl_output_using_ce = cross_ent - ent\n", 954 | "print (kl_output, kl_output_using_ce)" 955 | ] 956 | }, 957 | { 958 | "cell_type": "markdown", 959 | "id": "0164a13f-035b-4ed3-8c0e-68fd65e070c9", 960 | "metadata": {}, 961 | "source": [ 962 | "# Experiment to get the same KL output as NumPy\n", 963 | "\n", 964 | "Experiment not working yet" 965 | ] 966 | }, 967 | { 968 | "cell_type": "code", 969 | "execution_count": null, 970 | "id": "6ce993ff-96bf-4d94-95bb-e5e4f861d869", 971 | "metadata": {}, 972 | "outputs": [], 973 | "source": [ 974 | "kl_loss = torch.nn.KLDivLoss (reduction = 'none')\n", 975 | "loss = torch.nn.CrossEntropyLoss (reduction = 'none')\n", 976 | "kl_output_unr = kl_loss (input = F.log_softmax (in_tensor_before_softmax, dim=-1), target = out_tensor)\n", 977 | "cross_ent_unr = loss (input = in_tensor_before_softmax.view (-1,4), target = out_tensor)\n", 978 | "ent_unr = loss (input=out_tensor_before_softmax.view (-1,4), target = out_tensor)\n", 979 | "kl_output_using_ce_unr = cross_ent_unr - ent_unr\n", 980 | "print (kl_output_unr)\n", 981 | "\n", 982 | "print (kl_output_using_ce_unr)\n", 983 | "\n", 984 | "print (kl_output_unr.mean(), kl_output_unr.sum(), kl_output_using_ce_unr)\n" 985 | ] 986 | }, 987 | { 988 | "cell_type": "markdown", 989 | "id": "488819a9-3830-49f2-8f6e-c8a09196fecc", 990 | "metadata": {}, 991 | "source": [ 992 | "# Notices and Disclaimers\n", 993 | "\n", 994 | "Intel technologies may require enabled hardware, software or service activation.\n", 995 | "No product or component can be absolutely secure. \n", 996 | "\n", 997 | "Your costs and results may vary. \n", 998 | "\n", 999 | "© Intel Corporation. Intel, the Intel logo, and other Intel marks are trademarks of Intel Corporation or its subsidiaries.
Other names and brands may be claimed as the property of others. \n" 1000 | ] 1001 | }, 1002 | { 1003 | "cell_type": "code", 1004 | "execution_count": null, 1005 | "id": "a30d52ef", 1006 | "metadata": {}, 1007 | "outputs": [], 1008 | "source": [] 1009 | } 1010 | ], 1011 | "metadata": { 1012 | "kernelspec": { 1013 | "display_name": "pytorch-gpu", 1014 | "language": "python", 1015 | "name": "pytorch-gpu" 1016 | }, 1017 | "language_info": { 1018 | "codemirror_mode": { 1019 | "name": "ipython", 1020 | "version": 3 1021 | }, 1022 | "file_extension": ".py", 1023 | "mimetype": "text/x-python", 1024 | "name": "python", 1025 | "nbconvert_exporter": "python", 1026 | "pygments_lexer": "ipython3", 1027 | "version": "3.9.16" 1028 | } 1029 | }, 1030 | "nbformat": 4, 1031 | "nbformat_minor": 5 1032 | } 1033 | -------------------------------------------------------------------------------- /Assets/Broadcasting.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/IntelSoftware/Python-Loop-Replacement-with-NumPy-and-PyTorch/bcf315374f8209fbe74f233124916ebb063e6b03/Assets/Broadcasting.png -------------------------------------------------------------------------------- /Assets/ConditionalLogic.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/IntelSoftware/Python-Loop-Replacement-with-NumPy-and-PyTorch/bcf315374f8209fbe74f233124916ebb063e6b03/Assets/ConditionalLogic.png -------------------------------------------------------------------------------- /Assets/IDCGetStarted.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/IntelSoftware/Python-Loop-Replacement-with-NumPy-and-PyTorch/bcf315374f8209fbe74f233124916ebb063e6b03/Assets/IDCGetStarted.png -------------------------------------------------------------------------------- /Assets/IDCLandingPage.png: 
-------------------------------------------------------------------------------- https://raw.githubusercontent.com/IntelSoftware/Python-Loop-Replacement-with-NumPy-and-PyTorch/bcf315374f8209fbe74f233124916ebb063e6b03/Assets/IDCLandingPage.png -------------------------------------------------------------------------------- /Assets/IDCLauncher.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/IntelSoftware/Python-Loop-Replacement-with-NumPy-and-PyTorch/bcf315374f8209fbe74f233124916ebb063e6b03/Assets/IDCLauncher.png -------------------------------------------------------------------------------- /Assets/IDCTrainingPage.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/IntelSoftware/Python-Loop-Replacement-with-NumPy-and-PyTorch/bcf315374f8209fbe74f233124916ebb063e6b03/Assets/IDCTrainingPage.png -------------------------------------------------------------------------------- /Assets/NumpyAxis0.PNG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/IntelSoftware/Python-Loop-Replacement-with-NumPy-and-PyTorch/bcf315374f8209fbe74f233124916ebb063e6b03/Assets/NumpyAxis0.PNG -------------------------------------------------------------------------------- /Assets/NumpyAxis1.PNG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/IntelSoftware/Python-Loop-Replacement-with-NumPy-and-PyTorch/bcf315374f8209fbe74f233124916ebb063e6b03/Assets/NumpyAxis1.PNG -------------------------------------------------------------------------------- /Assets/PairwiseSimple.PNG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/IntelSoftware/Python-Loop-Replacement-with-NumPy-and-PyTorch/bcf315374f8209fbe74f233124916ebb063e6b03/Assets/PairwiseSimple.PNG 
-------------------------------------------------------------------------------- /Assets/PairwiseStocks.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/IntelSoftware/Python-Loop-Replacement-with-NumPy-and-PyTorch/bcf315374f8209fbe74f233124916ebb063e6b03/Assets/PairwiseStocks.jpg -------------------------------------------------------------------------------- /Assets/SimpleLogic.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/IntelSoftware/Python-Loop-Replacement-with-NumPy-and-PyTorch/bcf315374f8209fbe74f233124916ebb063e6b03/Assets/SimpleLogic.png -------------------------------------------------------------------------------- /Assets/SlowWadeWater.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/IntelSoftware/Python-Loop-Replacement-with-NumPy-and-PyTorch/bcf315374f8209fbe74f233124916ebb063e6b03/Assets/SlowWadeWater.png -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Python-Loop-Replacement-with-NumPy-and-PyTorch 2 | Python Loop Replacement with NumPy and PyTorch - Fancy Slicing, UFuncs and equivalents, Aggregations, Sorting and more... 3 | 4 | # Archived 5 | 6 | This repo will no longer be maintained as of Oct 1, 2024, and I will archive it on GitHub. 
7 | 8 | Feel free to use the code for your experimentation, but no further development is expected. 9 | 10 | # Intel Developer Cloud 11 | 12 | - Register and log in for free access to Cloud.Intel.com 13 | 14 | ![IDCGetStarted.png](Assets/IDCGetStarted.png) 15 | 16 | - Click Training on the Intel Developer Landing Page 17 | 18 | ![IDCLandingPage.png](Assets/IDCLandingPage.png) 19 | 20 | - Click the Training icon, then Launch JupyterLab 21 | 22 | ![IDCTrainingPage.png](Assets/IDCTrainingPage.png) 23 | 24 | - Launch a Terminal session 25 | 26 | ![IDCLauncher.png](Assets/IDCLauncher.png) 27 | 28 | 29 | # Preparation (from Terminal on Intel Developer Cloud) 30 | - cd ~ 31 | - mkdir NP 32 | - cd NP 33 | - git clone https://github.com/IntelSoftware/Python-Loop-Replacement-with-NumPy-and-PyTorch.git 34 | - cd Python-Loop-Replacement-with-NumPy-and-PyTorch 35 | 36 | -------------------------------------------------------------------------------- /build/lib.linux-x86_64-cpython-39/cython_Exact.cpython-39-x86_64-linux-gnu.so: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/IntelSoftware/Python-Loop-Replacement-with-NumPy-and-PyTorch/bcf315374f8209fbe74f233124916ebb063e6b03/build/lib.linux-x86_64-cpython-39/cython_Exact.cpython-39-x86_64-linux-gnu.so -------------------------------------------------------------------------------- /build/lib.linux-x86_64-cpython-39/cython_NewtonRecipSqrt.cpython-39-x86_64-linux-gnu.so: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/IntelSoftware/Python-Loop-Replacement-with-NumPy-and-PyTorch/bcf315374f8209fbe74f233124916ebb063e6b03/build/lib.linux-x86_64-cpython-39/cython_NewtonRecipSqrt.cpython-39-x86_64-linux-gnu.so -------------------------------------------------------------------------------- /build/temp.linux-x86_64-cpython-39/cython_Exact.o: 
-------------------------------------------------------------------------------- https://raw.githubusercontent.com/IntelSoftware/Python-Loop-Replacement-with-NumPy-and-PyTorch/bcf315374f8209fbe74f233124916ebb063e6b03/build/temp.linux-x86_64-cpython-39/cython_Exact.o -------------------------------------------------------------------------------- /build/temp.linux-x86_64-cpython-39/cython_NewtonRecipSqrt.o: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/IntelSoftware/Python-Loop-Replacement-with-NumPy-and-PyTorch/bcf315374f8209fbe74f233124916ebb063e6b03/build/temp.linux-x86_64-cpython-39/cython_NewtonRecipSqrt.o -------------------------------------------------------------------------------- /cython_Exact.cpython-39-x86_64-linux-gnu.so: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/IntelSoftware/Python-Loop-Replacement-with-NumPy-and-PyTorch/bcf315374f8209fbe74f233124916ebb063e6b03/cython_Exact.cpython-39-x86_64-linux-gnu.so -------------------------------------------------------------------------------- /cython_Exact.pyx: -------------------------------------------------------------------------------- 1 | def ctypes_Exsqrt(x): # operates on single value at a time 2 | y = x**(-.5) 3 | return y -------------------------------------------------------------------------------- /cython_NewtonRecipSqrt.cpython-39-x86_64-linux-gnu.so: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/IntelSoftware/Python-Loop-Replacement-with-NumPy-and-PyTorch/bcf315374f8209fbe74f233124916ebb063e6b03/cython_NewtonRecipSqrt.cpython-39-x86_64-linux-gnu.so -------------------------------------------------------------------------------- /cython_NewtonRecipSqrt.pyx: -------------------------------------------------------------------------------- 1 | def ctypes_nbsqrt(number): # operates on single 
value at a time 2 | from ctypes import c_float, c_int32, cast, byref, POINTER 3 | threehalfs = 1.5 4 | x2 = number * 0.5 5 | y = c_float(number) 6 | 7 | # reinterpret the float's bits as a 32-bit integer 8 | i = cast(byref(y), POINTER(c_int32)).contents.value 9 | # magic-constant initial guess for 1/sqrt(number) 10 | i = c_int32(0x5f3759df - (i >> 1)) 11 | # reinterpret the integer's bits back as a float 12 | y = cast(byref(i), POINTER(c_float)).contents.value 13 | 14 | # one Newton-Raphson refinement step 15 | y = y * (threehalfs - (x2 * y * y)) 16 | return y -------------------------------------------------------------------------------- /setup.py: -------------------------------------------------------------------------------- 1 | from setuptools import setup 2 | from Cython.Build import cythonize 3 | 4 | # build both extension modules with a single setup() call 5 | setup(ext_modules=cythonize(['cython_NewtonRecipSqrt.pyx', 'cython_Exact.pyx'])) --------------------------------------------------------------------------------
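The two Cython sources above (`ctypes_nbsqrt` and `ctypes_Exsqrt`) each operate on a single value at a time. In keeping with the loop-replacement theme of this repo, the same Newton reciprocal square root can plausibly be written as a whole-array operation. The following is a minimal sketch, not code from the repository: it assumes NumPy is installed and that inputs are positive, finite float32 values, and the name `newton_recip_sqrt` is hypothetical.

```python
import numpy as np


def newton_recip_sqrt(x):
    """Approximate 1/sqrt(x) element-wise for positive float32 inputs.

    Hypothetical helper for illustration; it mirrors the steps of the
    single-value ctypes routine, applied to a whole array at once.
    """
    x = np.atleast_1d(np.asarray(x, dtype=np.float32))
    half_x = np.float32(0.5) * x
    # Reinterpret the float32 bit patterns as int32 (no numeric conversion).
    i = x.view(np.int32)
    # Magic-constant initial guess, computed on the integer bit patterns.
    i = np.int32(0x5F3759DF) - (i >> 1)
    # Reinterpret the integer bit patterns back as float32.
    y = i.view(np.float32)
    # One Newton-Raphson refinement step.
    return y * (np.float32(1.5) - half_x * y * y)


if __name__ == "__main__":
    x = np.linspace(0.1, 100.0, 100_000, dtype=np.float32)
    exact = x ** -0.5
    approx = newton_recip_sqrt(x)
    rel_err = np.max(np.abs(approx - exact) / exact)
    print(f"max relative error: {rel_err:.2e}")
```

With a single refinement step the result is only an approximation, so it suits cases where a small relative error is acceptable; the exact `x ** -0.5` expression is the reference to compare against.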