├── LICENSE.md ├── README.md ├── automata.png ├── automata.py ├── bad-dither.py ├── basic-manipulation.py ├── data.txt ├── diffusion.png ├── diffusion.py ├── dithered.png ├── geometry.png ├── geometry.py ├── imshow.png ├── imshow.py ├── input-output.py ├── kitten-dithered.jpg ├── kitten-quantized.jpg ├── kitten.jpg ├── kmeans.py ├── moving-average.py ├── nan-arithmetics.py ├── original.png ├── perceptron.mp4 ├── perceptron.py ├── random-walk.py ├── reorder.py ├── repeat.py ├── strides.py └── tools.py /LICENSE.md: -------------------------------------------------------------------------------- 1 | MIT License 2 | 3 | Copyright (c) 2018 Nicolas P. Rougier 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE. 
22 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Advanced NumPy 2 | 3 | A 3h00 course on advanced numpy techniques 4 | [Nicolas P. Rougier](http://www.labri.fr/perso/nrougier), 5 | [G-Node summer school](https://python.g-node.org/), 6 | Camerino, Italy, 2018 7 | 8 | 9 | > NumPy is a library for the Python programming language, adding support for 10 | > large, multi-dimensional arrays and matrices, along with a large collection 11 | > of high-level mathematical functions to operate on these arrays. 12 | > 13 | > – Wikipedia 14 | 15 | 16 | **Quicklinks**: 17 | [Numpy website](https://www.numpy.org) – 18 | [Numpy GitHub](https://github.com/numpy/numpy) – 19 | [Numpy documentation](https://www.numpy.org/devdocs/reference/) – 20 | [ASPP archives](https://python.g-node.org/wiki/archives) – 21 | [100 Numpy Exercises](https://github.com/rougier/numpy-100) – 22 | [From Python to Numpy](http://www.labri.fr/perso/nrougier/from-python-to-numpy/) 23 | 24 | #### Table of Contents 25 | 26 | * [Introduction](#--introduction) 27 | * [Warmup](#--warmup) 28 | * [Advanced exercises](#--advanced-exercises) 29 | * [References](#--references) 30 | 31 | --- 32 | 33 | ## ❶ – Introduction 34 | 35 | NumPy is all about vectorization. If you are familiar with Python, this is the 36 | main difficulty you'll face because you'll need to change your way of thinking 37 | and your new friends (among others) are named "vectors", "arrays", "views" or 38 | "ufuncs". Let's take a very simple example: random walk. 39 | 40 | One obvious way to write a random walk in Python is: 41 | 42 | ```Python 43 | def random_walk_slow(n): 44 | position = 0 45 | walk = [position] 46 | for i in range(n): 47 | position += 2*random.randint(0, 1)-1 48 | walk.append(position) 49 | return walk 50 | walk = random_walk_slow(1000) 51 | ``` 52 | 53 | 54 | It works, but it is slow. 
We can do better using the `itertools` Python module, which offers a set of functions for creating iterators for efficient looping. If we observe that a random walk is an accumulation of steps, we can rewrite the function by first generating all the steps and then accumulating them without any explicit loop:

```Python
import random

def random_walk_faster(n=1000):
    from itertools import accumulate
    # random.choices is only available from Python 3.6
    steps = random.choices([-1,+1], k=n)
    return [0]+list(accumulate(steps))
walk = random_walk_faster(1000)
```

It is better, but it is still slow. A more efficient implementation, taking full advantage of NumPy, can be written as:

```Python
import numpy as np

def random_walk_fastest(n=1000):
    steps = np.random.choice([-1,+1], n)
    return np.cumsum(steps)
walk = random_walk_fastest(1000)
```

Now, it is amazingly fast!

```Pycon
>>> timeit("random_walk_slow(1000)", globals())
Timing 'random_walk_slow(1000)'
1.58 ms ± 0.0228 ms per loop (mean ± std. dev. of 7 runs, 1000 loops each)

>>> timeit("random_walk_faster(1000)", globals())
Timing 'random_walk_faster(1000)'
281 us ± 3.15 us per loop (mean ± std. dev. of 7 runs, 10000 loops each)

>>> timeit("random_walk_fastest(1000)", globals())
Timing 'random_walk_fastest(1000)'
27.6 us ± 3.45 us per loop (mean ± std. dev. of 7 runs, 1000 loops each)
```

**Warning**: You may have noticed (or not) that `random_walk_fastest` works but is not reproducible at all, which is pretty annoying in Science. If you want to know why, you can have a look at the article [Re-run, Repeat, Reproduce, Reuse, Replicate: Transforming Code into Scientific Contributions](https://www.frontiersin.org/articles/10.3389/fninf.2017.00069/full) (that I wrote with [Fabien Benureau](https://github.com/benureau)).
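One possible way to make the walk reproducible is to seed the random generator explicitly. The sketch below assumes NumPy 1.17 or later (for `np.random.default_rng`); the function name `random_walk_seeded` and its `seed` parameter are illustrative and not part of the course sources:

```Python
import numpy as np

def random_walk_seeded(n=1000, seed=None):
    # A dedicated generator keeps the result independent of global state
    rng = np.random.default_rng(seed)
    steps = rng.choice([-1, +1], size=n)
    return np.cumsum(steps)

# Two calls with the same seed give exactly the same walk
walk_a = random_walk_seeded(1000, seed=42)
walk_b = random_walk_seeded(1000, seed=42)
print(np.array_equal(walk_a, walk_b))  # True
```

Passing the seed as an argument (rather than calling a global seeding function once) makes it easier to record which seed produced which result.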
One last point before heading to the course: I would like to warn you about a potential problem you may encounter once you have become familiar enough with NumPy. It is a very powerful library and you can do wonders with it but, most of the time, this comes at the price of readability. If you don't comment your code at the time of writing, you won't be able to tell what a function is doing after a few weeks (or possibly days). For example, can you tell what the two functions below are doing?

```Python
def function_1(seq, sub):
    return [i for i in range(len(seq) - len(sub) + 1) if seq[i:i+len(sub)] == sub]

def function_2(seq, sub):
    target = np.dot(sub, sub)
    candidates = np.where(np.correlate(seq, sub, mode='valid') == target)[0]
    check = candidates[:, np.newaxis] + np.arange(len(sub))
    mask = np.all((np.take(seq, check) == sub), axis=-1)
    return candidates[mask]
```

As you may have guessed, the second function is the vectorized-optimized-faster-NumPy version of the first function and it runs 10x faster than the pure Python version. But it is hardly readable.

## ❷ – Warmup

You're supposed to be already familiar with NumPy. If not, you should read the [NumPy chapter](http://www.scipy-lectures.org/intro/numpy/index.html) from the [SciPy Lecture Notes](http://www.scipy-lectures.org/). Before heading to the more advanced stuff, let's do some warmup exercises (that should pose no problem). If you choke on the first exercise, you should have a look at the [Anatomy of an array](https://www.labri.fr/perso/nrougier/from-python-to-numpy/#anatomy-of-an-array) and also check the [Quick references](https://www.labri.fr/perso/nrougier/from-python-to-numpy/#quick-references).
### Useful tools

Before heading to the exercises, we'll write a `sysinfo` and an `info` function that will help us debug our code.

The `sysinfo` function displays some information related to your scientific environment:

```Pycon
>>> import tools
>>> tools.sysinfo()
Date:       08/25/18
Python:     3.7.0
Numpy:      1.14.5
Scipy:      1.1.0
Matplotlib: 2.2.2
```

While the `info` function displays a lot of information for a specific array:

```Pycon
>>> import tools
>>> Z = np.arange(9).reshape(3,3)
>>> tools.info(Z)
------------------------------
Interface (item)
  shape:       (3,3)
  dtype:       int64
  length:      3
  size:        9
  endianess:   native (little)
  order:       ☑ C  ☐ Fortran

Memory (byte)
  item size:   8
  array size:  72
  strides:     (24, 8)

Properties
  own data:    ☑ Yes  ☐ No
  writeable:   ☑ Yes  ☐ No
  contiguous:  ☑ Yes  ☐ No
  aligned:     ☑ Yes  ☐ No
------------------------------
```

Try to code these two functions. You can then compare your implementation with [mine](tools.py).
**NOTE**: We don't care so much about the formatting, so do not lose time trying to copy it exactly.

The [tools.py](tools.py) script comes with two other functions that might be useful. The `timeit` function lets you precisely time some code (e.g. to measure which version is the fastest). It is pretty similar to the `%timeit` magic function from IPython:

```Pycon
>>> import tools
>>> tools.timeit("Z=np.random.uniform(0,1,1000000)", globals())
Measuring time for 'Z=np.random.uniform(0,1,1000000)'
11.4 ms ± 0.198 ms per loop (mean ± std. dev. of 7 runs, 100 loops each)
```

And the `imshow` function is able to display a one-dimensional or two-dimensional array in the console. It won't replace matplotlib but it can come in handy for some (small) arrays (you'll need a 256-color terminal):

![](imshow.png)

### Basic manipulation

Let's start with some basic operations:

* Create a vector with values ranging from 10 to 49
* Create a null vector of size 100, except for the fifth value, which is 1
* Reverse a vector (first element becomes last)
* Create a 3x3 matrix with values ranging from 0 to 8
* Create a 3x3 identity matrix
* Create a 2d array with 1 on the border and 0 inside
* Given a 1D array, negate all elements which are between 3 and 8, in place

For a more complete list, you can have a look at the [100 Numpy Exercises](https://github.com/rougier/numpy-100).
Solution (click to expand)

223 | 224 | Sources: [basic-manipulation.py](basic-manipulation.py) 225 | 226 | ```Python 227 | import numpy as np 228 | 229 | # Create a vector with values ranging from 10 to 49 230 | Z = np.arange(10,50) 231 | 232 | # Create a null vector of size 100 but the fifth value which is 1 233 | Z = np.zeros(100) 234 | Z[4] = 1 235 | 236 | # Reverse a vector (first element becomes last) 237 | Z = np.arange(50)[::-1] 238 | 239 | # Create a 3x3 matrix with values ranging from 0 to 8 240 | Z = np.arange(9).reshape(3,3) 241 | 242 | # Create a 3x3 identity matrix 243 | Z = np.eye(3) 244 | 245 | # Create a 2d array with 1 on the border and 0 inside 246 | Z = np.ones((10,10)) 247 | Z[1:-1,1:-1] = 0 248 | 249 | # Given a 1D array, negate all elements which are between 3 and 8, in place 250 | Z = np.arange(11) 251 | Z[(3 < Z) & (Z <= 8)] *= -1 252 | ``` 253 |

### NaN arithmetics

Just a reminder on NaN arithmetics:

What is the result of each of the following expressions?
**→ Hints**: [What Every Computer Scientist Should Know About Floating-Point Arithmetic, D. Goldberg, 1991](https://docs.oracle.com/cd/E19957-01/806-3568/ncg_goldberg.html)

```Python
print(0 * np.nan)
print(np.nan == np.nan)
print(np.inf > np.nan)
print(np.nan - np.nan)
print(0.3 == 3 * 0.1)
```
Solution (click to expand)

275 | 276 | Sources [nan-arithmetics.py](nan-arithmetics.py) 277 | 278 | ```Python 279 | import numpy as np 280 | 281 | # Result is NaN 282 | print(0 * np.nan) 283 | 284 | # Result is False 285 | print(np.nan == np.nan) 286 | 287 | # Result is False 288 | print(np.inf > np.nan) 289 | 290 | # Result is NaN 291 | print(np.nan - np.nan) 292 | 293 | # Result is False !!! 294 | print(0.3 == 3 * 0.1) 295 | print("0.1 really is {:0.56f}".format(0.1)) 296 | ``` 297 | 298 |

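Since NaN never compares equal to anything and floating-point results are rarely exact, tests on such values usually go through dedicated functions. The following short sketch illustrates this with `np.isnan`, `np.nansum` and `np.isclose`; the array `values` is made up for the example:

```Python
import numpy as np

values = np.array([1.0, np.nan, 3.0])

# NaN is not equal to itself, so use np.isnan to locate it
print(np.isnan(values))          # [False  True False]

# Regular reductions propagate NaN, NaN-aware variants skip it
print(np.sum(values))            # nan
print(np.nansum(values))         # 4.0

# Compare floating-point results with a tolerance, not with ==
print(0.3 == 3 * 0.1)            # False
print(np.isclose(0.3, 3 * 0.1))  # True
```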
### Computing strides

Given an array Z, how would you compute its strides manually?
**→ Hints**:
[itemsize](https://docs.scipy.org/doc/numpy/reference/generated/numpy.ndarray.itemsize.html) –
[shape](https://docs.scipy.org/doc/numpy/reference/generated/numpy.ndarray.shape.html) –
[ndim](https://docs.scipy.org/doc/numpy/reference/generated/numpy.ndarray.ndim.html)

```Python
import numpy as np
Z = np.arange(24).reshape(2,3,4)
print(Z.strides)
```
Solution (click to expand)

Sources: [strides.py](strides.py)

```Python
import numpy as np

def strides(Z):
    strides = [Z.itemsize]

    # Fortran ordered array
    if np.isfortran(Z):
        for i in range(0, Z.ndim-1):
            strides.append(strides[-1] * Z.shape[i])
        return tuple(strides)
    # C ordered array
    else:
        for i in range(Z.ndim-1, 0, -1):
            strides.append(strides[-1] * Z.shape[i])
        return tuple(strides[::-1])

# This works
Z = np.arange(24).reshape((2,3,4), order="C")
print(Z.strides, " – ", strides(Z))

Z = np.arange(24).reshape((2,3,4), order="F")
print(Z.strides, " – ", strides(Z))

# This does not work
Z = Z[::2]
print(Z.strides, " – ", strides(Z))
```

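The last case fails because slicing with a step returns a non-contiguous view: its strides are inherited from the parent array and cannot be deduced from `shape` and `itemsize` alone. A possible variant, sketched below, checks the contiguity flags first and gives up otherwise. The name `strides_if_contiguous` is hypothetical, and the sketch assumes arrays with at least one dimension:

```Python
import numpy as np

def strides_if_contiguous(Z):
    """Return the strides of Z computed from its shape, or None if Z
    is neither C- nor Fortran-contiguous."""
    if Z.flags["C_CONTIGUOUS"]:
        # Last axis varies fastest: cumulative product from the right
        steps = np.cumprod((1,) + Z.shape[:0:-1])[::-1]
    elif Z.flags["F_CONTIGUOUS"]:
        # First axis varies fastest: cumulative product from the left
        steps = np.cumprod((1,) + Z.shape[:-1])
    else:
        return None
    return tuple(int(s) * Z.itemsize for s in steps)

A = np.arange(24, dtype=np.int64).reshape((2, 3, 4), order="C")
B = np.arange(24, dtype=np.int64).reshape((2, 3, 4), order="F")
V = np.arange(24, dtype=np.int64).reshape(4, 6)[::2]   # non-contiguous view

print(A.strides, strides_if_contiguous(A))  # (96, 32, 8) (96, 32, 8)
print(B.strides, strides_if_contiguous(B))  # (8, 16, 48) (8, 16, 48)
print(V.strides, strides_if_contiguous(V))  # (96, 8) None
```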
351 | 352 | ### Repeat and repeat 353 | 354 | Can you tell the difference? 355 | **→ Hints**: 356 | [tile](https://docs.scipy.org/doc/numpy/reference/generated/numpy.tile.html) – 357 | [as_strided](https://docs.scipy.org/doc/numpy/reference/generated/numpy.lib.stride_tricks.as_strided.html) 358 | 359 | 360 | ```Python 361 | import numpy as np 362 | from numpy.lib.stride_tricks import as_strided 363 | 364 | Z = np.random.randint(0,10,5) 365 | Z1 = np.tile(Z, (3,1)) 366 | Z2 = as_strided(Z, shape=(3,)+Z.shape, strides=(0,)+Z.strides) 367 | ``` 368 | 369 |
Solution (click to expand)

370 | 371 | Sources [repeat.py](repeat.py) 372 | 373 | ```Python 374 | import numpy as np 375 | from numpy.lib.stride_tricks import as_strided 376 | 377 | Z = np.zeros(5) 378 | Z1 = np.tile(Z,(3,1)) 379 | Z2 = as_strided(Z, shape=(3,)+Z.shape, strides=(0,)+Z.strides) 380 | 381 | # Real repeat: three times the memory 382 | Z1[0,0] = 1 383 | print(Z1) 384 | 385 | # Fake repeat: less memory but not totally equivalent 386 | Z2[0,0] = 1 387 | print(Z2) 388 | ``` 389 | 390 |

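To make the difference visible without writing into the arrays, one can inspect memory sharing and strides directly. The sketch below uses `np.broadcast_to`, which (as far as the repeat itself is concerned) builds the same zero-stride view as the `as_strided` call above but returns it read-only, so the accidental write shown in the solution is refused:

```Python
import numpy as np

Z = np.zeros(5)
Z1 = np.tile(Z, (3, 1))            # real copy
Z2 = np.broadcast_to(Z, (3, 5))    # zero-stride view, read-only

print(np.shares_memory(Z, Z1))     # False
print(np.shares_memory(Z, Z2))     # True
print(Z1.strides, Z2.strides)      # (40, 8) (0, 8)
print(Z2.flags.writeable)          # False
```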
### Reordering things

Let's consider the following list:

```Python
L = [  0,   0,   0,   0,   0,   0,   3, 233,
       0,   0,   0,   0,   0,   0,   3, 237,
       0,   0,   0,   0,   0,   0,   3, 235,
       0,   0,   0,   0,   0,   0,   3, 239,
       0,   0,   0,   0,   0,   0,   3, 234,
       0,   0,   0,   0,   0,   0,   3, 238,
       0,   0,   0,   0,   0,   0,   3, 236,
       0,   0,   0,   0,   0,   0,   3, 240]
```

This is actually the byte dump of a 2x2x2 Fortran-ordered array of 64-bit integers, using big-endian encoding.

How would you access the element at [1,0,0] with NumPy (simple)?
Solution (click to expand)

414 | 415 | ```Python 416 | 417 | import struct 418 | import numpy as np 419 | 420 | # Generation of the array 421 | # Z = range(1001, 1009) 422 | # L = np.reshape(Z, (2,2,2), order="F").ravel().astype(">i8").view(np.ubyte) 423 | 424 | L = [ 0, 0, 0, 0, 0, 0, 3, 233, 425 | 0, 0, 0, 0, 0, 0, 3, 237, 426 | 0, 0, 0, 0, 0, 0, 3, 235, 427 | 0, 0, 0, 0, 0, 0, 3, 239, 428 | 0, 0, 0, 0, 0, 0, 3, 234, 429 | 0, 0, 0, 0, 0, 0, 3, 238, 430 | 0, 0, 0, 0, 0, 0, 3, 236, 431 | 0, 0, 0, 0, 0, 0, 3, 240] 432 | 433 | # Automatic (numpy) 434 | Z = np.reshape(np.array(L, dtype=np.ubyte).view(dtype=">i8"), (2,2,2), order="F") 435 | print(Z[1,0,0]) 436 | ``` 437 |


438 | 439 | 440 | How would you access element at [1,0,0] without NumPy (harder)? 441 | **→ Hints**: Use your brain! 442 | 443 | 444 |
Solution (click to expand)

445 | 446 | Sources [reorder.py](reorder.py) 447 | 448 | ```Python 449 | 450 | import struct 451 | import numpy as np 452 | 453 | # Generation of the array 454 | # Z = range(1001, 1009) 455 | # L = np.reshape(Z, (2,2,2), order="F").ravel().astype(">i8").view(np.ubyte) 456 | 457 | L = [ 0, 0, 0, 0, 0, 0, 3, 233, 458 | 0, 0, 0, 0, 0, 0, 3, 237, 459 | 0, 0, 0, 0, 0, 0, 3, 235, 460 | 0, 0, 0, 0, 0, 0, 3, 239, 461 | 0, 0, 0, 0, 0, 0, 3, 234, 462 | 0, 0, 0, 0, 0, 0, 3, 238, 463 | 0, 0, 0, 0, 0, 0, 3, 236, 464 | 0, 0, 0, 0, 0, 0, 3, 240] 465 | 466 | # Automatic (numpy) 467 | Z = np.reshape(np.array(L, dtype=np.ubyte).view(dtype=">i8"), (2,2,2), order="F") 468 | print(Z[1,0,0]) 469 | 470 | # Manual (brain) 471 | shape = (2,2,2) 472 | itemsize = 8 473 | # We can probably do better 474 | strides = itemsize, itemsize*shape[0], itemsize*shape[0]*shape[1] 475 | index = (1,0,0) 476 | start = sum(i*s for i,s in zip(index,strides)) 477 | end = start+itemsize 478 | value = struct.unpack(">Q", bytes(L[start:end]))[0] 479 | print(value) 480 | ``` 481 | 482 |

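The manual approach can also be sketched with the built-in `int.from_bytes` instead of `struct`. This is only one possible variant: the helper name `element` and its default arguments are illustrative, and it assumes a Fortran-ordered dump of big-endian integers as described above:

```Python
L = [0, 0, 0, 0, 0, 0, 3, 233,
     0, 0, 0, 0, 0, 0, 3, 237,
     0, 0, 0, 0, 0, 0, 3, 235,
     0, 0, 0, 0, 0, 0, 3, 239,
     0, 0, 0, 0, 0, 0, 3, 234,
     0, 0, 0, 0, 0, 0, 3, 238,
     0, 0, 0, 0, 0, 0, 3, 236,
     0, 0, 0, 0, 0, 0, 3, 240]

def element(L, index, shape=(2, 2, 2), itemsize=8):
    # Fortran order: the first axis has the smallest stride
    strides, step = [], itemsize
    for n in shape:
        strides.append(step)
        step *= n
    start = sum(i * s for i, s in zip(index, strides))
    chunk = bytes(L[start:start + itemsize])
    return int.from_bytes(chunk, byteorder="big", signed=True)

print(element(L, (1, 0, 0)))  # 1005
```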
### Heat equation

> The diffusion equation (a.k.a. the heat equation) reads `∂u/∂t = α∂²u/∂x²` where
> u(x,t) is the unknown function to be solved, x is a coordinate in space, and t
> is time. The coefficient α is the diffusion coefficient and determines how fast
> u changes in time. The discrete (time (n) and space (i)) version of the equation
> can be rewritten as `u(i,n+1) = u(i,n) + F(u(i-1,n) - 2u(i,n) + u(i+1,n))`.
>
> – [Finite difference methods for diffusion processes](http://hplgit.github.io/num-methods-for-PDEs/doc/pub/diffu/sphinx/._main_diffu000.html), Hans Petter Langtangen

The goal here is to compute the discrete equation over a finite domain using `as_strided` to produce a sliding-window view of a 1D array. This view can then be used to compute `U` at the next iteration. Use the following initial conditions (with Z instead of U):

```Python
Z = np.random.uniform(0.00, 0.05, (50,100))
Z[0,5::10] = 1
```

Try to obtain this picture (where time goes from top to bottom):

![](diffusion.png)

The code to display the figure from an array Z is:

```Python
import matplotlib.pyplot as plt

plt.figure(figsize=(6,3))
plt.subplot(1,1,1,frameon=False)
plt.imshow(Z, vmin=0, vmax=1)
plt.xticks([]), plt.yticks([])
plt.tight_layout()
plt.show()
```

**Hint**: You will need to write a `sliding_window(Z, size=3)` function that returns a strided view of Z.
Solution (click to expand)

529 | 530 | Sources [diffusion.py](diffusion.py) 531 | 532 | ```Python 533 | import numpy as np 534 | import matplotlib.pyplot as plt 535 | from numpy.lib.stride_tricks import as_strided 536 | 537 | 538 | def sliding_window(Z, size=2): 539 | n, s = Z.shape[0], Z.strides[0] 540 | return as_strided(Z, shape=(n-size+1, size), strides=(s, s)) 541 | 542 | 543 | # Initial conditions: 544 | # Domain size is 100 and we'll iterate over 50 time steps 545 | Z = np.zeros((50,100)) 546 | Z[0,5::10] = 1.5 547 | 548 | # Actual iteration 549 | F = 0.05 550 | for i in range(1, len(Z)): 551 | Z[i,1:-1] = Z[i-1,1:-1] + F*(sliding_window(Z[i-1], 3)*[+1,-2,+1]).sum(axis=1) 552 | 553 | # Display 554 | plt.figure(figsize=(6,3)) 555 | plt.subplot(1,1,1,frameon=False) 556 | plt.imshow(Z, vmin=0, vmax=1) 557 | plt.xticks([]), plt.yticks([]) 558 | plt.tight_layout() 559 | plt.savefig("diffusion.png") 560 | plt.show() 561 | ``` 562 | 563 |

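For readers on a recent NumPy, the same sliding-window view can also be obtained without computing strides by hand. The sketch below assumes NumPy 1.20 or later, where `sliding_window_view` is available in `numpy.lib.stride_tricks`, and checks the result against the same update written with plain slices. It is an alternative to the solution above, not the course's own code:

```Python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

F = 0.05
kernel = np.array([+1, -2, +1])

# Update using a (read-only) sliding-window view of the previous row
Z = np.zeros((50, 100))
Z[0, 5::10] = 1.5
for i in range(1, len(Z)):
    windows = sliding_window_view(Z[i-1], 3)        # shape (98, 3)
    Z[i, 1:-1] = Z[i-1, 1:-1] + F * (windows * kernel).sum(axis=1)

# Same update written with shifted slices, for comparison
U = np.zeros((50, 100))
U[0, 5::10] = 1.5
for i in range(1, len(U)):
    U[i, 1:-1] = U[i-1, 1:-1] + F * (U[i-1, :-2] - 2*U[i-1, 1:-1] + U[i-1, 2:])

print(np.allclose(Z, U))  # True
```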
564 | 565 | 566 | ### Rule 30 567 | 568 | With only a slight modification of the previous exercise, we can compute a 569 | one-dimensional [cellular automata](https://en.wikipedia.org/wiki/Cellular_automaton) and more specifically the [Rule 30](https://en.wikipedia.org/wiki/Rule_30) that 570 | exhibits intriguing patterns as shown below: 571 | 572 | ![](automata.png) 573 | 574 | To start with, here is how to convert the rule in a useful form: 575 | 576 | ```Python 577 | rule = 30 578 | R = np.array([int(v) for v in '{0:08b}'.format(rule)])[::-1] 579 | ``` 580 | 581 | and we consider this initial state: 582 | 583 | ```Python 584 | Z = np.zeros((250,501), dtype=int) 585 | Z[0,250] = 1 586 | ``` 587 | 588 | Try to obtain the same figure. Display code is: 589 | 590 | ```Python 591 | plt.figure(figsize=(6,3)) 592 | plt.subplot(1,1,1,frameon=False) 593 | plt.imshow(Z, vmin=0, vmax=1, cmap=plt.cm.gray_r) 594 | plt.xticks([]), plt.yticks([]) 595 | plt.tight_layout() 596 | plt.savefig("automata.png") 597 | plt.show() 598 | ``` 599 | 600 | 601 |
Solution (click to expand)

602 | 603 | Sources [automata.py](automata.py) 604 | 605 | ```Python 606 | import numpy as np 607 | import matplotlib.pyplot as plt 608 | from numpy.lib.stride_tricks import as_strided 609 | 610 | def sliding_window(Z, size=2): 611 | n, s = Z.shape[0], Z.strides[0] 612 | return as_strided(Z, shape=(n-size+1, size), strides=(s, s)) 613 | 614 | # Rule 30 (see https://en.wikipedia.org/wiki/Rule_30) 615 | # 0x000: 0, 0x001: 1, 0x010: 1, 0x011: 1 616 | # 0x100: 1, 0x101: 0, 0x110: 0, 0x111: 0 617 | rule = 30 618 | R = np.array([int(v) for v in '{0:08b}'.format(rule)])[::-1] 619 | 620 | # Initial state 621 | Z = np.zeros((250,501), dtype=int) 622 | Z[0,250] = 1 623 | 624 | # Computing some iterations 625 | for i in range(1, len(Z)): 626 | N = sliding_window(Z[i-1],3) * [1,2,4] 627 | Z[i,1:-1] = R[N.sum(axis=1)] 628 | 629 | # Display 630 | plt.figure(figsize=(6,3)) 631 | plt.subplot(1,1,1,frameon=False) 632 | plt.imshow(Z, vmin=0, vmax=1, cmap=plt.cm.gray_r) 633 | plt.xticks([]), plt.yticks([]) 634 | plt.tight_layout() 635 | plt.savefig("automata.png") 636 | plt.show() 637 | ``` 638 | 639 |

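To see how the lookup table `R` is used, it may help to run a single step on a tiny row. In the sketch below, each 3-cell neighbourhood is encoded as an integer between 0 and 7 using the same weights `[1,2,4]` as the solution, and that integer indexes `R`. The helper `step` is illustrative, and it uses plain slices instead of the strided view:

```Python
import numpy as np

rule = 30
R = np.array([int(v) for v in '{0:08b}'.format(rule)])[::-1]
print(R)  # [0 1 1 1 1 0 0 0]

def step(row):
    # Encode (left, center, right) with weights (1, 2, 4), as in the solution
    codes = row[:-2] * 1 + row[1:-1] * 2 + row[2:] * 4
    new = np.zeros_like(row)
    new[1:-1] = R[codes]      # border cells stay at 0
    return new

row = np.array([0, 0, 1, 0, 0])
print(step(row))  # [0 1 1 1 0]
```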
### Input / Output

→ Exercise written by [Stefan van der Walt](http://mentat.za.net/).

Place the following data in a text file, data.txt:

```
% rank         lemma (10 letters max)      frequency       dispersion
21             they                        1865844         0.96
42             her                         969591          0.91
49             as                          829018          0.95
7              to                          6332195         0.98
63             take                        670745          0.97
14             you                         3085642         0.92
35             go                          1151045         0.93
56             think                       772787          0.91
28             not                         1638883         0.98
```

Now, design a suitable structured data type, then load the data from the text file using [np.loadtxt](https://docs.scipy.org/doc/numpy/reference/generated/numpy.loadtxt.html) (look at the documentation to see how to handle the '%' comment character).

Here's a skeleton to start with:

```Python
import numpy as np

# Construct the data-type
# For example:
# dtype = np.dtype([('x', float), ('y', int), ('z', np.uint8)])

dt = np.dtype(...)     # Modify this line to give the correct answer
data = np.loadtxt(...) # Load data with loadtxt
```

Examine the data you got:
* Extract words only
* Extract the 3rd row
* Print all words with rank < 30

Sort the data according to frequency (see [np.sort](https://docs.scipy.org/doc/numpy/reference/routines.sort.html)).

Save the result to a compressed numpy data file (e.g. "sorted.npz") using [np.savez](https://docs.scipy.org/doc/numpy/reference/generated/numpy.savez.html) and load it back with `out = np.load("sorted.npz")`. Do you get back what you put in? Why?
Solution (click to expand)

Source: [input-output.py](input-output.py)

```Python
import numpy as np

# Create our own dtype (the lemma column holds at most 10 letters)
dtype = np.dtype([('rank', 'i8'),
                  ('lemma', 'S10'),
                  ('frequency', 'i8'),
                  ('dispersion', 'f8')])

# Load file using our own dtype
data = np.loadtxt('data.txt', comments='%', dtype=dtype)

# Extract words only
print(data["lemma"])

# Extract the 3rd row
print(data[2])

# Print all words with rank < 30
print(data["lemma"][data["rank"] < 30])

# Sort the data according to frequency.
sorted_data = np.sort(data, order="frequency")
print(sorted_data)

# Save unsorted and sorted array
np.savez("sorted.npz", data=data, sorted=sorted_data)

# Load saved array
out = np.load("sorted.npz")
print(out["sorted"])
```


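Regarding the last question of the exercise: `np.load` on a `.npz` archive does not return an array but a dictionary-like object from which each saved array is read by name. The sketch below illustrates this with an in-memory buffer instead of a file on disk; the three rows are a subset of the data above and the variable names are only illustrative:

```Python
import io
import numpy as np

dtype = np.dtype([("rank", "i8"), ("lemma", "S10"),
                  ("frequency", "i8"), ("dispersion", "f8")])
data = np.array([(21, b"they", 1865844, 0.96),
                 (42, b"her",   969591, 0.91),
                 ( 7, b"to",   6332195, 0.98)], dtype=dtype)
by_frequency = np.sort(data, order="frequency")

# Save both arrays under explicit names, then load the archive back
buffer = io.BytesIO()
np.savez(buffer, data=data, by_frequency=by_frequency)
buffer.seek(0)
out = np.load(buffer)

print(type(out).__name__)   # NpzFile, not ndarray
print(out.files)            # ['data', 'by_frequency']
print(np.array_equal(out["by_frequency"], by_frequency))  # True
```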
726 | 727 | 728 | ## ❸ – Advanced exercises 729 | 730 | ### Geometry 731 | 732 | We consider a collection of 2d squares that are each defined by four points, a scaling factor, a translation and a rotation angle. We want to obtain the following figure: 733 | 734 | ![](geometry.png) 735 | 736 | made of 25 squares, scaled by 0.1, translated by (1,0) and with increasing 737 | rotation angles. The order of operation is `scale`, `translate` and 738 | `rotate`. What would be the best structure `S` to hold all these information at 739 | once? 740 | **→ Hints**: [structured arrays](https://docs.scipy.org/doc/numpy/user/basics.rec.html) 741 | 742 |
Solution (click to expand)

```Python
# Scalar fields are declared without a shape
dtype = [("points",    float, (4, 2)),
         ("scale",     float),
         ("translate", float, 2),
         ("rotate",    float)]
S = np.zeros(25, dtype = dtype)
```


753 | 754 | We now need to initialize our array. For the four points describing a square, 755 | you can use the following points: [(-1,-1), (-1,+1), (+1,+1), (+1,-1)] 756 | 757 |
Solution (click to expand)

758 | 759 | ```Python 760 | S["points"] = [(-1,-1), (-1,+1), (+1,+1), (+1,-1)] 761 | S["translate"] = (1,0) 762 | S["scale"] = 0.1 763 | S["rotate"] = np.linspace(0, 2*np.pi, len(S), endpoint=False) 764 | ``` 765 |


Now, we need to write a function that applies all these transformations and writes the results into a new array:

```Python
P = np.zeros((len(S), 4, 2))
# Your code here (to populate P)
...
```

You can start by writing a translate, scale and rotate function first.

> Rotation reminder. Considering a point (x,y) and a rotation angle a,
> the rotated coordinates (x',y') are:
>
> x' = x.cos(a) - y.sin(a) and y' = x.sin(a) + y.cos(a)

The display code is:

```Python
import matplotlib.pyplot as plt

fig = plt.figure(figsize=(6,6))
ax = plt.subplot(1,1,1, frameon=False)
for i in range(len(P)):
    X = np.r_[P[i,:,0], P[i,0,0]]
    Y = np.r_[P[i,:,1], P[i,0,1]]
    plt.plot(X, Y, color="black")
plt.xticks([]), plt.yticks([])
plt.tight_layout()
plt.show()
```
Solution (click to expand)

Source: [geometry.py](geometry.py)

```Python
import numpy as np
import matplotlib.pyplot as plt

# Scalar fields are declared without a shape
dtype = [("points",    float, (4, 2)),
         ("scale",     float),
         ("translate", float, 2),
         ("rotate",    float)]
S = np.zeros(25, dtype = dtype)
S["points"] = [(-1,-1), (-1,+1), (+1,+1), (+1,-1)]
S["translate"] = (1,0)
S["scale"] = 0.1
S["rotate"] = np.linspace(0, 2*np.pi, len(S), endpoint=False)

P = np.zeros((len(S), 4, 2))
for i in range(len(S)):
    for j in range(4):
        x = S[i]["points"][j,0]
        y = S[i]["points"][j,1]
        tx, ty = S[i]["translate"]
        scale  = S[i]["scale"]
        theta  = S[i]["rotate"]
        # Scale, then translate, then rotate
        x = tx + x*scale
        y = ty + y*scale
        x_ = x*np.cos(theta) - y*np.sin(theta)
        y_ = x*np.sin(theta) + y*np.cos(theta)
        P[i,j] = x_, y_

fig = plt.figure(figsize=(6,6))
ax = plt.subplot(1,1,1, frameon=False)
for i in range(len(P)):
    X = np.r_[P[i,:,0], P[i,0,0]]
    Y = np.r_[P[i,:,1], P[i,0,1]]
    plt.plot(X, Y, color="black")
plt.xticks([]), plt.yticks([])
plt.tight_layout()
plt.savefig("geometry.png")
plt.show()
```


849 | 850 | The proposed solution has two loops. Can you imagine a way to do it without loop ? 851 | **→ Hints**: [einsum](https://docs.scipy.org/doc/numpy/reference/generated/numpy.einsum.html) 852 | 853 |
Solution (click to expand)

854 | 855 | Have a look at [Multiple individual 2d rotation at once](https://stackoverflow.com/questions/40822983/multiple-individual-2d-rotation-at-once) on stack overflow. I did not implement it, feel free to issue a PR with the solution. 856 | 857 |

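As a starting point, here is one possible loop-free sketch based on broadcasting and `np.einsum`. It is not taken from the course sources: the intermediate names `Q` and `Rm` are illustrative, and the scalar fields are declared without a shape:

```Python
import numpy as np

dtype = [("points",    float, (4, 2)),
         ("scale",     float),
         ("translate", float, 2),
         ("rotate",    float)]
S = np.zeros(25, dtype=dtype)
S["points"] = [(-1,-1), (-1,+1), (+1,+1), (+1,-1)]
S["translate"] = (1,0)
S["scale"] = 0.1
S["rotate"] = np.linspace(0, 2*np.pi, len(S), endpoint=False)

# Scale then translate: broadcast (25,1,1) and (25,1,2) against (25,4,2)
Q = S["points"] * S["scale"][:, None, None] + S["translate"][:, None, :]

# One 2x2 rotation matrix per square, shape (25,2,2)
c, s = np.cos(S["rotate"]), np.sin(S["rotate"])
Rm = np.empty((len(S), 2, 2))
Rm[:, 0, 0], Rm[:, 0, 1] = c, -s
Rm[:, 1, 0], Rm[:, 1, 1] = s, c

# P[n,k,i] = sum_j Rm[n,i,j] * Q[n,k,j]: rotate every point of every square
P = np.einsum("nij,nkj->nki", Rm, Q)
print(P.shape)  # (25, 4, 2)
```

The resulting `P` can be passed to the display code above unchanged.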
### Image quantization

> In computer graphics, color quantization or color image quantization is
> quantization applied to color spaces; it is a process that reduces the number
> of distinct colors used in an image, usually with the intention that the new
> image should be as visually similar as possible to the original image.
>
> – Wikipedia

In this exercise, we want to perform color quantization: given an arbitrary image, we would like to reduce the number of colors without altering the perception of the image too much. We thus need to find the most representative colors.

The first (naive) idea that may come to mind is to count the number of times a specific color is used and to use the most frequent colors for quantization. Unfortunately, this does not work very well, as illustrated below:

![](kitten.jpg)
![](kitten-dithered.jpg)

The reason is that some colors and their slight variations might be over-represented in the original image and will thus appear among the most frequent colors. This is the reason why the kitten ended up mostly green and the flower totally disappeared.

To check by yourself, you'll write the corresponding script and look at the result:

1. Load an image (using [imageio](http://imageio.github.io/).[imread](https://imageio.readthedocs.io/en/latest/userapi.html#imageio.imread))
2. Find the number of unique colors and their frequency (counts)
3. Pick the n=16 most frequent colors
4. Replace colors in the original image with the closest color (found previously)
5. Save the result (using [imageio](http://imageio.github.io/).[imsave](https://imageio.readthedocs.io/en/latest/userapi.html#imageio.imsave))
Solution (click to expand)

Sources: [bad-dither.py](bad-dither.py)

```Python
import imageio
import numpy as np
import scipy.spatial

# Number of final colors we want
n = 16

# Original Image
I = imageio.imread("kitten.jpg")
shape = I.shape

# Flattened image
I = I.reshape(shape[0]*shape[1], shape[2])

# Find the unique colors and their frequency (=counts)
colors, counts = np.unique(I, axis=0, return_counts=True)

# Get the n most frequent colors
# (counts is aligned with colors, so the order indexes colors, not I)
order = np.argsort(counts)[::-1]
C = colors[order][:n]

# Compute distance to most frequent colors
D = scipy.spatial.distance.cdist(I, C, 'sqeuclidean')

# Replace colors with closest one
Z = (C[D.argmin(axis=1)]).reshape(shape)

# Save result
imageio.imsave("kitten-dithered.jpg", Z)
```


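The counting step can be checked on a handful of pixels before running it on a full image. The following small sketch uses a made-up `pixels` array and shows that `np.unique` with `axis=0` returns the distinct colors together with counts aligned to them, so the most frequent colors have to be picked from `colors`:

```Python
import numpy as np

# Six RGB pixels: 3 red, 2 green, 1 blue
pixels = np.array([[255, 0, 0], [0, 255, 0], [255, 0, 0],
                   [0, 0, 255], [255, 0, 0], [0, 255, 0]], dtype=np.uint8)

colors, counts = np.unique(pixels, axis=0, return_counts=True)
print(counts)            # [1 2 3]  (blue, green, red in sorted order)

# Keep the 2 most frequent colors
order = np.argsort(counts)[::-1]
top = colors[order][:2]
print(top.tolist())      # [[255, 0, 0], [0, 255, 0]]
```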
We thus need a different method. This method is called [k-means clustering](https://en.wikipedia.org/wiki/K-means_clustering), which partitions the data into n clusters whose centroids serve as prototypes for their clusters.

![](kitten.jpg)
![](kitten-quantized.jpg)

The algorithm is quite simple. We start with n random points (centroids) and we compute, for each point in our data, which centroid is the closest. The points sharing the same closest centroid constitute a cluster. For each cluster, we compute its new centroid (mean point) and we repeat the process for a given number of steps. In this exercise, you'll have to write such a k-means function and use it to quantize the image.
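To fix ideas, here is a single iteration of the assignment and update steps on four 2D points, written with plain NumPy. It is only a toy sketch with hand-picked starting centroids (all names and values are made up for the example), not the full function you are asked to write:

```Python
import numpy as np

data = np.array([[0., 0.], [0., 1.], [10., 10.], [10., 11.]])
centroids = np.array([[0., 0.], [10., 10.]])

# Squared distance from every point to every centroid, shape (4, 2)
diff = data[:, np.newaxis, :] - centroids[np.newaxis, :, :]
sqdists = (diff**2).sum(axis=-1)

# Assignment step: index of the closest centroid for each point
clusters = sqdists.argmin(axis=1)
print(clusters)               # [0 0 1 1]

# Update step: each centroid moves to the mean of its cluster
new_centroids = np.array([data[clusters == k].mean(axis=0)
                          for k in range(len(centroids))])
print(new_centroids.tolist())  # [[0.0, 0.5], [10.0, 10.5]]
```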
Solution (click to expand)

950 | 951 | Sources: [kmeans.py](kmeans.py) 952 | 953 | ```Python 954 | # Code by Gareth Rees, posted on stack overflow 955 | # https://codereview.stackexchange.com/questions/61598/k-mean-with-numpy 956 | 957 | import numpy as np 958 | import scipy.spatial 959 | 960 | def cluster_centroids(data, clusters, k=None): 961 | if k is None: 962 | k = np.max(clusters) + 1 963 | result = np.empty(shape=(k,) + data.shape[1:]) 964 | for i in range(k): 965 | np.mean(data[clusters == i], axis=0, out=result[i]) 966 | return result 967 | 968 | 969 | def kmeans(data, k=None, centroids=None, steps=20): 970 | if centroids is not None and k is not None: 971 | assert(k == len(centroids)) 972 | elif centroids is not None: 973 | k = len(centroids) 974 | elif k is not None: 975 | # Forgy initialization method: choose k data points randomly. 976 | centroids = data[np.random.choice(np.arange(len(data)), k, False)] 977 | else: 978 | raise RuntimeError("Need a value for k or centroids.") 979 | 980 | for _ in range(max(steps, 1)): 981 | # Squared distances between each point and each centroid. 982 | sqdists = scipy.spatial.distance.cdist(centroids, data, 'sqeuclidean') 983 | 984 | # Index of the closest centroid to each data point. 
985 | clusters = np.argmin(sqdists, axis=0) 986 | 987 | new_centroids = cluster_centroids(data, clusters, k) 988 | if np.array_equal(new_centroids, centroids): 989 | break 990 | 991 | centroids = new_centroids 992 | return centroids, clusters 993 | 994 | 995 | if __name__ == '__main__': 996 | import imageio 997 | 998 | # Number of final colors we want 999 | n = 16 1000 | 1001 | # Original Image 1002 | I = imageio.imread("kitten.jpg") 1003 | shape = I.shape 1004 | 1005 | # Flattened image 1006 | D = I.reshape(shape[0]*shape[1], shape[2]) 1007 | 1008 | # Search for 16 centroids in D (using 20 iterations) 1009 | centroids, clusters = kmeans(D, k=n, steps=20) 1010 | 1011 | # Create quantized image 1012 | I = (centroids[clusters]).reshape(shape) 1013 | I = np.round(I).astype(np.uint8) 1014 | 1015 | # Save result 1016 | imageio.imsave("kitten-quantized.jpg", I) 1017 | ``` 1018 | 1019 |
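A quick way to sanity-check a k-means implementation is to run it on data whose clusters are obvious. The sketch below is not the solution above: it is a NumPy-only variant (`kmeans_sketch` is a hypothetical name) that takes the initial centroids explicitly so that the result is reproducible, and it assumes that an empty cluster should simply keep its previous centroid.

```Python
import numpy as np

def kmeans_sketch(data, centroids, steps=20):
    """Illustrative k-means with explicit initial centroids (hypothetical helper)."""
    data = np.asarray(data, dtype=float)
    centroids = np.asarray(centroids, dtype=float)
    for _ in range(max(steps, 1)):
        # Squared distance between each point and each centroid: shape (N, k)
        sqdists = ((data[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=-1)
        # Index of the closest centroid for each point
        clusters = sqdists.argmin(axis=1)
        # Mean of each cluster; an empty cluster keeps its previous centroid
        new_centroids = np.array([
            data[clusters == i].mean(axis=0) if np.any(clusters == i) else centroids[i]
            for i in range(len(centroids))])
        if np.array_equal(new_centroids, centroids):
            break
        centroids = new_centroids
    return centroids, clusters

# Two well-separated groups of 2D points
data = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
                 [10.0, 10.0], [10.0, 11.0], [11.0, 10.0]])
centroids, clusters = kmeans_sketch(data, centroids=data[[0, 3]])
print(centroids)
print(clusters)
```

With one initial centroid in each group, the assignment is stable after the first step and each centroid ends up at the mean of its group.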


1020 | 1021 | 1022 | 1023 | 1024 | ### Neural networks 1025 | 1026 | In this exercise, we'll implement one of the simplest feed-forward neural 1027 | networks, a.k.a. the [Perceptron](https://en.wikipedia.org/wiki/Perceptron). We'll use it to discriminate between two classes 1028 | (points in two dimensions, see [desired output](perceptron.mp4)): 1029 | 1030 | ```Python 1031 | samples = np.zeros(100, dtype=[('input', float, 2), 1032 | ('output', float, 1)]) 1033 | 1034 | P = np.random.uniform(0.05,0.95,(len(samples),2)) 1035 | samples["input"] = P 1036 | stars = np.where(P[:,0]+P[:,1] < 1) 1037 | discs = np.where(P[:,0]+P[:,1] > 1) 1038 | samples["output"][stars] = +1 1039 | samples["output"][discs] = 0 1040 | ``` 1041 | 1042 | Your goal is to fill in the following class in order to train the 1043 | network. You'll need: 1044 | 1045 | * a one-dimensional array to store the input 1046 | * a one-dimensional array to store the output 1047 | * a two-dimensional array to store the weights 1048 | * a threshold function (for example `lambda x: x > 0`) 1049 | 1050 | The `propagate_forward` method is supposed to compute the output of the network 1051 | while the `propagate_backward` method is supposed to modify the weights according to 1052 | the actual error. 1053 | 1054 | ```Python 1055 | class Perceptron: 1056 | def __init__(self, n, m): 1057 | "Initialization of the perceptron with given sizes" 1058 | ... 1059 | 1060 | def reset(self): 1061 | "Reset weights" 1062 | ... 1063 | 1064 | def propagate_forward(self, data): 1065 | "Propagate data from input layer to output layer" 1066 | ... 1067 | 1068 | def propagate_backward(self, target, lrate=0.1): 1069 | "Back propagate error related to target using lrate" 1070 | ... 1071 | ``` 1072 | 1073 |
Solution (click to expand)

1074 | 1075 | Sources: [perceptron.py](perceptron.py) 1076 | 1077 | ```Python 1078 | class Perceptron: 1079 | ''' Perceptron class. ''' 1080 | 1081 | def __init__(self, n, m): 1082 | "Initialization of the perceptron with given sizes" 1083 | 1084 | self.input = np.ones(n+1) 1085 | self.output = np.ones(m) 1086 | self.weights= np.zeros((m,n+1)) 1087 | self.reset() 1088 | 1089 | def reset(self): 1090 | "Reset weights" 1091 | 1092 | self.weights[...] = np.random.uniform(-.5, .5, self.weights.shape) 1093 | 1094 | def propagate_forward(self, data): 1095 | "Propagate data from input layer to output layer" 1096 | 1097 | # Set input layer (but not bias) 1098 | self.input[1:] = data 1099 | self.output[...] = f(np.dot(self.weights,self.input)) 1100 | 1101 | # Return output 1102 | return self.output 1103 | 1104 | def propagate_backward(self, target, lrate=0.1): 1105 | "Back propagate error related to target using lrate" 1106 | 1107 | error = np.atleast_2d(target-self.output) 1108 | input = np.atleast_2d(self.input) 1109 | self.weights += lrate*np.dot(error.T,input) 1110 | 1111 | # Return error 1112 | return (error**2).sum() 1113 | ``` 1114 | 1115 |
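The weight update in `propagate_backward` may be easier to read on concrete numbers. The sketch below uses hypothetical values for a perceptron with two inputs plus a bias and a single output; it only illustrates that `np.dot(error.T, input)` is the outer product of the error and the input, and it is not part of the solution.

```Python
import numpy as np

# Hypothetical values for a perceptron with n=2 inputs (+ bias) and m=1 output
inputs = np.array([1.0, 0.2, 0.7])             # bias term first, then the two inputs
weights = np.zeros((1, 3))
output = (weights @ inputs > 0).astype(float)  # threshold activation
target = np.array([1.0])
lrate = 0.1

error = np.atleast_2d(target - output)         # shape (1, m)
x = np.atleast_2d(inputs)                      # shape (1, n+1)
delta = lrate * np.dot(error.T, x)             # shape (m, n+1), same as weights

# The same update written as an outer product
assert np.allclose(delta, lrate * np.outer(target - output, inputs))
weights += delta
print(weights)
```

Each weight moves in proportion to the error on its output unit and to the input it is connected to, so a correctly classified sample (zero error) leaves the weights unchanged.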


1116 | 1117 | To train the network for 1000 iterations, we can do: 1118 | 1119 | ```Python 1120 | network = Perceptron(2, 1) 1121 | lrate = 0.1 1122 | for i in range(1000): 1123 | lrate *= 0.999 1124 | n = np.random.randint(samples.size) 1125 | network.propagate_forward( samples['input'][n] ) 1126 | error = network.propagate_backward( samples['output'][n], lrate ) 1127 | ``` 1128 | 1129 | For other types of neural networks, you can have a look at https://github.com/rougier/neural-networks/. 1130 | 1131 | 1132 | 1133 | 1134 | ## ❹ – References 1135 | 1136 | ### Book & tutorials 1137 | 1138 | This is a curated list of resources among the plethora of books & tutorials 1139 | that exist online. Make no mistake, it is strongly biased. 1140 | 1141 | * [From Python to Numpy](http://www.labri.fr/perso/nrougier/from-python-to-numpy/), 1142 | Nicolas P. Rougier, 2017 1143 | * [100 Numpy Exercises](https://github.com/rougier/numpy-100), 1144 | Nicolas P. Rougier, 2017 1145 | * [SciPy Lecture Notes](http://www.scipy-lectures.org/), 1146 | Gaël Varoquaux, Emmanuelle Gouillart, Olav Vahtras et al., 2016 1147 | * [Elegant SciPy: The Art of Scientific Python](https://github.com/elegant-scipy/elegant-scipy), 1148 | Juan Nunez-Iglesias, Stéfan van der Walt, Harriet Dashnow, 2016 1149 | * [Numpy Medkit](http://mentat.za.net/numpy/numpy_advanced_slides), 1150 | Stéfan van der Walt, 2008 1151 | 1152 | ### Archives 1153 | 1154 | You can access all ASPP archives from https://python.g-node.org/wiki/archives 1155 | 1156 | * **2017** (Nikiti, Greece, Juan Nunez-Iglesias): 1157 | [exercises](https://github.com/jni/aspp2017-numpy) – [solutions](https://github.com/jni/aspp2017-numpy-solutions) 1158 | * **2016** (Reading, United Kingdom, Stéfan van der Walt): 1159 | [exercises](https://github.com/ASPP/2016_numpy) 1160 | * **2015** (Munich, Germany, Juan Nunez-Iglesias): 1161 | [exercises](https://github.com/jni/aspp2015/tree/delivered) – [solutions](https://github.com/jni/aspp2015/tree/solved-in-class) 1162 | * **2014** 
(Split, Croatia, Stéfan van der Walt): 1163 | [notebooks](https://python.g-node.org/python-summerschool-2014/_media/numpy_advanced.tar.bz2) 1164 | * **2013** (Zürich, Switzerland, Stéfan van der Walt): 1165 | [slides](https://python.g-node.org/python-summerschool-2013/_media/advanced_numpy/slides/index.html) – [exercises](https://python.g-node.org/python-summerschool-2013/_media/advanced_numpy/problems.html) – [dropbox](https://www.dropbox.com/sh/4esl1ii7cac5xfa/O-CSFKKYvS/assp2013/numpy_problems) 1166 | * **2012** (Kiel, Germany, Stéfan van der Walt): 1167 | [slides](https://python.g-node.org/python-summerschool-2012/_media/wiki/numpy/numpy_kiel2012.pdf) – [exercises](https://python.g-node.org/python-summerschool-2012/_media/wiki/numpy/problems.html) 1168 | * **2011** (St Andrews, United Kingdom, Pauli Virtanen): 1169 | [slides](https://python.g-node.org/python-summerschool-2011/_media/materials/numpy/numpy-slides.pdf) – [exercises](https://python.g-node.org/python-summerschool-2011/_media/materials/numpy/numpy-exercises.zip) – [solutions](https://python.g-node.org/python-summerschool-2011/_media/materials/numpy/numpy-solutions.zip) 1170 | * **2010** (Trento, Italy, Stéfan van der Walt): 1171 | [slides](https://python.g-node.org/python-autumnschool-2010/_media/materials/advanced_numpy/numpy_trento2010.pdf) – [exercises](https://python.g-node.org/python-autumnschool-2010/_media/materials/advanced_numpy/problems.html) – [solutions 1](https://python.g-node.org/python-autumnschool-2010/_media/materials/advanced_numpy/array_interface/solution.py) – [solutions 2](https://python.g-node.org/python-autumnschool-2010/_media/materials/advanced_numpy/structured_arrays/load_txt_solution.py) 1172 | * **2010** (Warsaw, Poland, Bartosz Teleńczuk): 1173 | [slides](https://python.g-node.org/python-winterschool-2010/_media/scientific_python.pdf) – [exercises](https://python.g-node.org/python-winterschool-2010/_media/python_tools_for_science.pdf) 1174 | * **2009** (Berlin, Germany, 
Jens Kremkow): 1175 | [slides](https://python.g-node.org/python-summerschool-2009/_media/numpy_scipy_matplotlib_pynn_neurotools.pdf) – [examples](https://python.g-node.org/python-summerschool-2009/_media/examples_numpy.py) – [exercises](https://python.g-node.org/python-summerschool-2009/_media/exercises_day2_numpy.py) 1176 | -------------------------------------------------------------------------------- /automata.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ASPP/ASPP-2018-numpy/430c43d4023b20ee581557ce874dbcacede3f56f/automata.png -------------------------------------------------------------------------------- /automata.py: -------------------------------------------------------------------------------- 1 | # ----------------------------------------------------------------------------- 2 | # Copyright (C) 2018 Nicolas P. Rougier 3 | # Distributed under the terms of the BSD License. 4 | # ----------------------------------------------------------------------------- 5 | import numpy as np 6 | import matplotlib.pyplot as plt 7 | from numpy.lib.stride_tricks import as_strided 8 | 9 | def sliding_window(Z, size=2): 10 | n, s = Z.shape[0], Z.strides[0] 11 | return as_strided(Z, shape=(n-size+1, size), strides=(s, s)) 12 | 13 | # Rule 30 (see https://en.wikipedia.org/wiki/Rule_30) 14 | # 0x000: 0, 0x001: 1, 0x010: 1, 0x011: 1 15 | # 0x100: 1, 0x101: 0, 0x110: 0, 0x111: 0 16 | rule = 30 17 | R = np.array([int(v) for v in '{0:08b}'.format(rule)])[::-1] 18 | 19 | # Initial state 20 | Z = np.zeros((250,501), dtype=int) 21 | Z[0,250] = 1 22 | 23 | # Computing some iterations 24 | for i in range(1, len(Z)): 25 | N = sliding_window(Z[i-1],3) * [1,2,4] 26 | Z[i,1:-1] = R[N.sum(axis=1)] 27 | 28 | # Display 29 | plt.figure(figsize=(6,3)) 30 | plt.subplot(1,1,1,frameon=False) 31 | plt.imshow(Z, vmin=0, vmax=1, cmap=plt.cm.gray_r) 32 | plt.xticks([]), plt.yticks([]) 33 | plt.tight_layout() 34 | 
plt.savefig("automata.png") 35 | plt.show() 36 | -------------------------------------------------------------------------------- /bad-dither.py: -------------------------------------------------------------------------------- 1 | # ----------------------------------------------------------------------------- 2 | # Copyright (C) 2018 Nicolas P. Rougier 3 | # Distributed under the terms of the BSD License. 4 | # ----------------------------------------------------------------------------- 5 | import imageio 6 | import numpy as np 7 | import scipy.spatial 8 | 9 | # Number of final colors we want 10 | n = 16 11 | 12 | # Original Image 13 | I = imageio.imread("kitten.jpg") 14 | shape = I.shape 15 | 16 | # Flattened image 17 | I = I.reshape(shape[0]*shape[1], shape[2]) 18 | 19 | # Find the unique colors and their frequency (=counts) 20 | colors, counts = np.unique(I, axis=0, return_counts=True) 21 | 22 | # Get the n most frequent colors (indices refer to colors, not to I) 23 | order = np.argsort(counts)[::-1] 24 | C = colors[order][:n] 25 | 26 | # Compute distance to most frequent colors 27 | D = scipy.spatial.distance.cdist(I, C, 'sqeuclidean') 28 | 29 | # Replace colors with closest one 30 | Z = (C[D.argmin(axis=1)]).reshape(shape) 31 | 32 | # Save result 33 | imageio.imsave("kitten-dithered.jpg", Z) 34 | -------------------------------------------------------------------------------- /basic-manipulation.py: -------------------------------------------------------------------------------- 1 | # ----------------------------------------------------------------------------- 2 | # Copyright (C) 2018 Nicolas P. Rougier 3 | # Distributed under the terms of the BSD License. 
4 | # ----------------------------------------------------------------------------- 5 | import numpy as np 6 | 7 | # Create a vector with values ranging from 10 to 49 8 | Z = np.arange(10,50) 9 | 10 | # Create a null vector of size 100 but the fifth value which is 1 11 | Z = np.zeros(100) 12 | Z[4] = 1 13 | 14 | # Reverse a vector (first element becomes last) 15 | Z = np.arange(50)[::-1] 16 | 17 | # Create a 3x3 matrix with values ranging from 0 to 8 18 | Z = np.arange(9).reshape(3,3) 19 | 20 | # Create a 3x3 identity matrix 21 | Z = np.eye(3) 22 | 23 | # Create a 2d array with 1 on the border and 0 inside 24 | Z = np.ones((10,10)) 25 | Z[1:-1,1:-1] = 0 26 | 27 | # Given a 1D array, negate all elements which are between 3 and 8, in place 28 | Z = np.arange(11) 29 | Z[(3 < Z) & (Z <= 8)] *= -1 30 | -------------------------------------------------------------------------------- /data.txt: -------------------------------------------------------------------------------- 1 | % rank lemma (8 letters max) frequency dispersion 2 | 21 they 1865844 0.96 3 | 42 her 969591 0.91 4 | 49 as 829018 0.95 5 | 7 to 6332195 0.98 6 | 63 take 670745 0.97 7 | 14 you 3085642 0.92 8 | 35 go 1151045 0.93 9 | 56 think 772787 0.91 10 | 28 not 1638883 0.98 11 | -------------------------------------------------------------------------------- /diffusion.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ASPP/ASPP-2018-numpy/430c43d4023b20ee581557ce874dbcacede3f56f/diffusion.png -------------------------------------------------------------------------------- /diffusion.py: -------------------------------------------------------------------------------- 1 | # ----------------------------------------------------------------------------- 2 | # Copyright (C) 2018 Nicolas P. Rougier 3 | # Distributed under the terms of the BSD License. 
4 | # ----------------------------------------------------------------------------- 5 | import numpy as np 6 | import matplotlib.pyplot as plt 7 | 8 | from numpy.lib.stride_tricks import as_strided 9 | 10 | def sliding_window(Z, size=2): 11 | n, s = Z.shape[0], Z.strides[0] 12 | return as_strided(Z, shape=(n-size+1, size), strides=(s, s)) 13 | 14 | 15 | # Initial conditions: 16 | # Domain size is 100 and we'll iterate over 50 time steps 17 | Z = np.zeros((50,100)) 18 | Z[0,5::10] = 1.5 19 | 20 | # Actual iteration 21 | F = 0.05 22 | for i in range(1, len(Z)): 23 | Z[i,1:-1] = Z[i-1,1:-1] + F*(sliding_window(Z[i-1], 3)*[+1,-2,+1]).sum(axis=1) 24 | 25 | # Display 26 | plt.figure(figsize=(6,3)) 27 | plt.subplot(1,1,1,frameon=False) 28 | plt.imshow(Z, vmin=0, vmax=1) 29 | plt.xticks([]), plt.yticks([]) 30 | plt.tight_layout() 31 | plt.savefig("diffusion.png") 32 | plt.show() 33 | -------------------------------------------------------------------------------- /dithered.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ASPP/ASPP-2018-numpy/430c43d4023b20ee581557ce874dbcacede3f56f/dithered.png -------------------------------------------------------------------------------- /geometry.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ASPP/ASPP-2018-numpy/430c43d4023b20ee581557ce874dbcacede3f56f/geometry.png -------------------------------------------------------------------------------- /geometry.py: -------------------------------------------------------------------------------- 1 | # ----------------------------------------------------------------------------- 2 | # Copyright (C) 2018 Nicolas P. Rougier 3 | # Distributed under the terms of the BSD License. 
4 | # ----------------------------------------------------------------------------- 5 | import numpy as np 6 | import matplotlib.pyplot as plt 7 | 8 | dtype = [("points", float, (4, 2)), 9 | ("scale", float, 1), 10 | ("translate", float, 2), 11 | ("rotate", float, 1)] 12 | S = np.zeros(25, dtype = dtype) 13 | S["points"] = [(-1,-1), (-1,+1), (+1,+1), (+1,-1)] 14 | S["translate"] = (1,0) 15 | S["scale"] = 0.1 16 | S["rotate"] = np.linspace(0, 2*np.pi, len(S), endpoint=False) 17 | 18 | P = np.zeros((len(S), 4, 2)) 19 | for i in range(len(S)): 20 | for j in range(4): 21 | x = S[i]["points"][j,0] 22 | y = S[i]["points"][j,1] 23 | tx, ty = S[i]["translate"] 24 | scale = S[i]["scale"] 25 | theta = S[i]["rotate"] 26 | x = tx + x*scale 27 | y = ty + y*scale 28 | x_ = x*np.cos(theta) - y*np.sin(theta) 29 | y_ = x*np.sin(theta) + y*np.cos(theta) 30 | P[i,j] = x_, y_ 31 | 32 | fig = plt.figure(figsize=(6,6)) 33 | ax = plt.subplot(1,1,1, frameon=False) 34 | for i in range(len(P)): 35 | X = np.r_[P[i,:,0], P[i,0,0]] 36 | Y = np.r_[P[i,:,1], P[i,0,1]] 37 | plt.plot(X, Y, color="black") 38 | plt.xticks([]), plt.yticks([]) 39 | plt.tight_layout() 40 | plt.savefig("geometry.png") 41 | plt.show() 42 | -------------------------------------------------------------------------------- /imshow.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ASPP/ASPP-2018-numpy/430c43d4023b20ee581557ce874dbcacede3f56f/imshow.png -------------------------------------------------------------------------------- /imshow.py: -------------------------------------------------------------------------------- 1 | # Terminal visualization of 2D numpy arrays 2 | # Copyright (c) 2009 Nicolas P. 
Rougier 3 | # 4 | # This program is free software: you can redistribute it and/or modify it under 5 | # the terms of the GNU General Public License as published by the Free Software 6 | # Foundation, either version 3 of the License, or (at your option) any later 7 | # version. 8 | # 9 | # This program is distributed in the hope that it will be useful, but WITHOUT 10 | # ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS 11 | # FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. 12 | # 13 | # You should have received a copy of the GNU General Public License along with 14 | # this program. If not, see . 15 | # ------------------------------------------------------------------------------ 16 | """ Terminal visualization of 2D numpy arrays 17 | Using extended color capability of terminal (256 colors), the imshow function 18 | renders a 2D numpy array within terminal. 19 | """ 20 | import sys 21 | import numpy as np 22 | from matplotlib.cm import viridis 23 | 24 | 25 | def imshow (Z, vmin=None, vmax=None, cmap=viridis, show_cmap=True): 26 | ''' Show a 2D numpy array using terminal colors ''' 27 | 28 | Z = np.atleast_2d(Z) 29 | 30 | if len(Z.shape) != 2: 31 | print("Cannot display non 2D array") 32 | return 33 | 34 | vmin = vmin or Z.min() 35 | vmax = vmax or Z.max() 36 | 37 | # Build initialization string that setup terminal colors 38 | init = '' 39 | for i in range(240): 40 | v = i/240 41 | r,g,b,a = cmap(v) 42 | init += "\x1b]4;%d;rgb:%02x/%02x/%02x\x1b\\" % (16+i, int(r*255),int(g*255),int(b*255)) 43 | 44 | # Build array data string 45 | data = '' 46 | for i in range(Z.shape[0]): 47 | for j in range(Z.shape[1]): 48 | c = 16 + int( ((Z[Z.shape[0]-i-1,j]-vmin) / (vmax-vmin))*239) 49 | if (c < 16): 50 | c=16 51 | elif (c > 255): 52 | c=255 53 | data += "\x1b[48;5;%dm " % c 54 | u = vmax - (i/float(max(Z.shape[0]-1,1))) * ((vmax-vmin)) 55 | if show_cmap: 56 | data += "\x1b[0m " 57 | data += "\x1b[48;5;%dm " % (16 + 
(1-i/float(Z.shape[0]))*239) 58 | data += "\x1b[0m %+.2f" % u 59 | data += "\n" 60 | 61 | sys.stdout.write(init+'\n') 62 | sys.stdout.write(data+'\n') 63 | 64 | 65 | if __name__ == '__main__': 66 | def func3(x,y): 67 | return (1- x/2 + x**5 + y**3)*np.exp(-x**2-y**2) 68 | dx, dy = .2, .2 69 | x = np.arange(-3.0, 3.0, dx) 70 | y = np.arange(-3.0, 3.0, dy) 71 | X,Y = np.meshgrid(x, y) 72 | Z = np.array (func3(X, Y)) 73 | imshow (Z) 74 | -------------------------------------------------------------------------------- /input-output.py: -------------------------------------------------------------------------------- 1 | # ----------------------------------------------------------------------------- 2 | # Copyright (C) 2018 Nicolas P. Rougier 3 | # Distributed under the terms of the BSD License. 4 | # ----------------------------------------------------------------------------- 5 | import numpy as np 6 | 7 | # Create our own dtype 8 | dtype = np.dtype([('rank', 'i8'), 9 | ('lemma', 'S8'), 10 | ('frequency', 'i8'), 11 | ('dispersion', 'f8')]) 12 | 13 | # Load file using our own dtype 14 | data = np.loadtxt('data.txt', comments='%', dtype=dtype) 15 | 16 | # Extract words only 17 | print(data["lemma"]) 18 | 19 | # Extract the 3rd row 20 | print(data[2]) 21 | 22 | # Print all words with rank < 30 23 | print(data[data["rank"] < 30]) 24 | 25 | # Sort the data according to frequency 26 | sorted = np.sort(data, order="frequency") 27 | print(sorted) 28 | 29 | # Save unsorted and sorted array 30 | np.savez("sorted.npz", data=data, sorted=sorted) 31 | 32 | # Load saved array 33 | out = np.load("sorted.npz") 34 | print(out["sorted"]) 35 | -------------------------------------------------------------------------------- /kitten-dithered.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ASPP/ASPP-2018-numpy/430c43d4023b20ee581557ce874dbcacede3f56f/kitten-dithered.jpg 
-------------------------------------------------------------------------------- /kitten-quantized.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ASPP/ASPP-2018-numpy/430c43d4023b20ee581557ce874dbcacede3f56f/kitten-quantized.jpg -------------------------------------------------------------------------------- /kitten.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ASPP/ASPP-2018-numpy/430c43d4023b20ee581557ce874dbcacede3f56f/kitten.jpg -------------------------------------------------------------------------------- /kmeans.py: -------------------------------------------------------------------------------- 1 | # ----------------------------------------------------------------------------- 2 | # Copyright (C) 2018 Nicolas P. Rougier 3 | # Distributed under the terms of the BSD License. 4 | # ----------------------------------------------------------------------------- 5 | # Code by Gareth Rees, posted on stack overflow 6 | # https://codereview.stackexchange.com/questions/61598/k-mean-with-numpy 7 | 8 | import numpy as np 9 | import scipy.spatial 10 | 11 | 12 | def cluster_centroids(data, clusters, k=None): 13 | """Return centroids of clusters in data. 14 | 15 | data is an array of observations with shape (A, B, ...). 16 | 17 | clusters is an array of integers of shape (A,) giving the index 18 | (from 0 to k-1) of the cluster to which each observation belongs. 19 | The clusters must all be non-empty. 20 | 21 | k is the number of clusters. If omitted, it is deduced from the 22 | values in the clusters array. 23 | 24 | The result is an array of shape (k, B, ...) containing the 25 | centroid of each cluster. 26 | 27 | >>> data = np.array([[12, 10, 87], 28 | ... [ 2, 12, 33], 29 | ... [68, 31, 32], 30 | ... [88, 13, 66], 31 | ... [79, 40, 89], 32 | ... 
[ 1, 77, 12]]) 33 | >>> cluster_centroids(data, np.array([1, 1, 2, 2, 0, 1])) 34 | array([[ 79., 40., 89.], 35 | [ 5., 33., 44.], 36 | [ 78., 22., 49.]]) 37 | 38 | """ 39 | if k is None: 40 | k = np.max(clusters) + 1 41 | result = np.empty(shape=(k,) + data.shape[1:]) 42 | for i in range(k): 43 | np.mean(data[clusters == i], axis=0, out=result[i]) 44 | return result 45 | 46 | 47 | def kmeans(data, k=None, centroids=None, steps=20): 48 | """Divide the observations in data into clusters using the k-means 49 | algorithm, and return an array of integers assigning each data 50 | point to one of the clusters. 51 | 52 | centroids, if supplied, must be an array giving the initial 53 | position of the centroids of each cluster. 54 | 55 | If centroids is omitted, the number k gives the number of clusters 56 | and the initial positions of the centroids are selected randomly 57 | from the data. 58 | 59 | The k-means algorithm adjusts the centroids iteratively for the 60 | given number of steps, or until no further progress can be made. 61 | 62 | >>> data = np.array([[12, 10, 87], 63 | ... [ 2, 12, 33], 64 | ... [68, 31, 32], 65 | ... [88, 13, 66], 66 | ... [79, 40, 89], 67 | ... [ 1, 77, 12]]) 68 | >>> np.random.seed(73) 69 | >>> kmeans(data, k=3) 70 | (array([[79., 40., 89.], 71 | [ 5., 33., 44.], 72 | [78., 22., 49.]]), array([1, 1, 2, 2, 0, 1])) 73 | 74 | """ 75 | if centroids is not None and k is not None: 76 | assert(k == len(centroids)) 77 | elif centroids is not None: 78 | k = len(centroids) 79 | elif k is not None: 80 | # Forgy initialization method: choose k data points randomly. 81 | centroids = data[np.random.choice(np.arange(len(data)), k, False)] 82 | else: 83 | raise RuntimeError("Need a value for k or centroids.") 84 | 85 | for _ in range(max(steps, 1)): 86 | # Squared distances between each point and each centroid. 87 | sqdists = scipy.spatial.distance.cdist(centroids, data, 'sqeuclidean') 88 | 89 | # Index of the closest centroid to each data point. 
90 | clusters = np.argmin(sqdists, axis=0) 91 | 92 | new_centroids = cluster_centroids(data, clusters, k) 93 | if np.array_equal(new_centroids, centroids): 94 | break 95 | 96 | centroids = new_centroids 97 | 98 | return centroids, clusters 99 | 100 | 101 | 102 | if __name__ == '__main__': 103 | import imageio 104 | 105 | # Number of final colors we want 106 | n = 16 107 | 108 | # Original Image 109 | I = imageio.imread("kitten.jpg") 110 | shape = I.shape 111 | 112 | # Flattened image 113 | D = I.reshape(shape[0]*shape[1], shape[2]) 114 | 115 | # Search for 16 centroids in D (using 20 iterations) 116 | centroids, clusters = kmeans(D, k=n, steps=20) 117 | 118 | # Create quantized image 119 | I = (centroids[clusters]).reshape(shape) 120 | I = np.round(I).astype(np.uint8) 121 | 122 | # Save result 123 | imageio.imsave("kitten-quantized.jpg", I) 124 | -------------------------------------------------------------------------------- /moving-average.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | import matplotlib.pyplot as plt 3 | 4 | -------------------------------------------------------------------------------- /nan-arithmetics.py: -------------------------------------------------------------------------------- 1 | # ----------------------------------------------------------------------------- 2 | # Copyright (C) 2018 Nicolas P. Rougier 3 | # Distributed under the terms of the BSD License. 4 | # ----------------------------------------------------------------------------- 5 | import numpy as np 6 | 7 | # Result is NaN 8 | print(0 * np.nan) 9 | 10 | # Result is False 11 | print(np.nan == np.nan) 12 | 13 | # Result is False 14 | print(np.inf > np.nan) 15 | 16 | # Result is NaN 17 | print(np.nan - np.nan) 18 | 19 | # Result is False !!! 
20 | print(0.3 == 3 * 0.1) 21 | print("0.1 really is {:0.56f}".format(0.1)) 22 | -------------------------------------------------------------------------------- /original.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ASPP/ASPP-2018-numpy/430c43d4023b20ee581557ce874dbcacede3f56f/original.png -------------------------------------------------------------------------------- /perceptron.mp4: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ASPP/ASPP-2018-numpy/430c43d4023b20ee581557ce874dbcacede3f56f/perceptron.mp4 -------------------------------------------------------------------------------- /perceptron.py: -------------------------------------------------------------------------------- 1 | # ----------------------------------------------------------------------------- 2 | # Copyright (C) 2018 Nicolas P. Rougier 3 | # Distributed under the terms of the BSD License. 4 | # ----------------------------------------------------------------------------- 5 | import numpy as np 6 | 7 | def f(x): 8 | return x > 0 9 | 10 | class Perceptron: 11 | ''' Perceptron class. ''' 12 | 13 | def __init__(self, n, m): 14 | ''' Initialization of the perceptron with given sizes. ''' 15 | 16 | self.input = np.ones(n+1) 17 | self.output = np.ones(m) 18 | self.weights= np.zeros((m,n+1)) 19 | self.reset() 20 | 21 | def reset(self): 22 | ''' Reset weights ''' 23 | 24 | self.weights[...] = np.random.uniform(-.5, .5, self.weights.shape) 25 | 26 | def propagate_forward(self, data): 27 | ''' Propagate data from input layer to output layer. ''' 28 | 29 | # Set input layer (but not bias) 30 | self.input[1:] = data 31 | self.output[...] = f(np.dot(self.weights,self.input)) 32 | 33 | # Return output 34 | return self.output 35 | 36 | def propagate_backward(self, target, lrate=0.1): 37 | ''' Back propagate error related to target using lrate. 
''' 38 | 39 | error = np.atleast_2d(target-self.output) 40 | input = np.atleast_2d(self.input) 41 | self.weights += lrate*np.dot(error.T,input) 42 | 43 | # Return error 44 | return (error**2).sum() 45 | 46 | 47 | # ----------------------------------------------------------------------------- 48 | if __name__ == '__main__': 49 | import numpy as np 50 | import matplotlib.pyplot as plt 51 | import matplotlib.animation as animation 52 | 53 | np.random.seed(123) 54 | 55 | samples = np.zeros(100, dtype=[('input', float, 2), 56 | ('output', float, 1)]) 57 | 58 | P = np.random.uniform(0.05,0.95,(len(samples),2)) 59 | samples["input"] = P 60 | stars = np.where(P[:,0]+P[:,1] < 1) 61 | discs = np.where(P[:,0]+P[:,1] > 1) 62 | samples["output"][stars] = +1 63 | samples["output"][discs] = 0 64 | 65 | 66 | network = Perceptron(2,1) 67 | network.reset() 68 | lrate = 0.05 69 | 70 | fig = plt.figure(figsize=(6,6)) 71 | ax = plt.subplot(1,1,1, aspect=1, frameon=False) 72 | ax.scatter(P[stars,0], P[stars,1], color="red", marker="*", s=50, alpha=.5) 73 | ax.scatter(P[discs,0], P[discs,1], color="blue", s=25, alpha=.5) 74 | line, = ax.plot([], [], color="black", linewidth=2) 75 | ax.set_xlim(0,1) 76 | ax.set_xticks([]) 77 | ax.set_ylim(0,1) 78 | ax.set_yticks([]) 79 | plt.tight_layout() 80 | 81 | def animate(i): 82 | global lrate 83 | error = 0 84 | 85 | count = 0 86 | lrate *= 0.99 87 | while error == 0 and count < 10: 88 | n = np.random.randint(samples.size) 89 | network.propagate_forward( samples['input'][n] ) 90 | error = network.propagate_backward( samples['output'][n], lrate ) 91 | count += 1 92 | 93 | c,a,b = network.weights[0] 94 | x0 = -2 95 | x1 = +2 96 | if a != 0: 97 | y0 = (-c -b*x0)/a 98 | y1 = (-c -b*x1)/a 99 | else: 100 | y0 = 0 101 | y1 = 1 102 | 103 | line.set_xdata([x0,x1]) 104 | line.set_ydata([y0,y1]) 105 | 106 | return line, 107 | 108 | anim = animation.FuncAnimation(fig, animate, np.arange(1, 300)) 109 | #Writer = animation.writers['ffmpeg'] 110 | #writer = 
Writer(fps=30, 111 | # metadata=dict(artist='Nicolas P. Rougier'), bitrate=1800) 112 | # anim.save('perceptron.mp4', writer=writer) 113 | plt.show() 114 | -------------------------------------------------------------------------------- /random-walk.py: -------------------------------------------------------------------------------- 1 | # ----------------------------------------------------------------------------- 2 | # Copyright (C) 2018 Nicolas P. Rougier 3 | # Distributed under the terms of the BSD License. 4 | # ----------------------------------------------------------------------------- 5 | import random 6 | import numpy as np 7 | from tools import timeit 8 | 9 | def random_walk_slow(n): 10 | position = 0 11 | walk = [position] 12 | for i in range(n): 13 | position += 2*random.randint(0, 1)-1 14 | walk.append(position) 15 | return walk 16 | 17 | 18 | def random_walk_faster(n=1000): 19 | from itertools import accumulate 20 | # Only available from Python 3.6 21 | steps = random.choices([-1,+1], k=n) 22 | return [0]+list(accumulate(steps)) 23 | 24 | def random_walk_fastest(n=1000): 25 | steps = np.random.choice([-1,+1], n) 26 | return np.cumsum(steps) 27 | 28 | 29 | if __name__ == '__main__': 30 | 31 | timeit("random_walk_slow(1000)", globals()) 32 | timeit("random_walk_faster(1000)", globals()) 33 | timeit("random_walk_fastest(1000)", globals()) 34 | -------------------------------------------------------------------------------- /reorder.py: -------------------------------------------------------------------------------- 1 | # ----------------------------------------------------------------------------- 2 | # Copyright (C) 2018 Nicolas P. Rougier 3 | # Distributed under the terms of the BSD License. 
4 | # -----------------------------------------------------------------------------
5 | import struct
6 | import numpy as np
7 |
8 | # Generation of the array
9 | # Z = range(1001, 1009)
10 | # L = np.reshape(Z, (2,2,2), order="F").ravel().astype(">i8").view(np.ubyte)
11 |
12 | L = [ 0, 0, 0, 0, 0, 0, 3, 233,
13 |       0, 0, 0, 0, 0, 0, 3, 237,
14 |       0, 0, 0, 0, 0, 0, 3, 235,
15 |       0, 0, 0, 0, 0, 0, 3, 239,
16 |       0, 0, 0, 0, 0, 0, 3, 234,
17 |       0, 0, 0, 0, 0, 0, 3, 238,
18 |       0, 0, 0, 0, 0, 0, 3, 236,
19 |       0, 0, 0, 0, 0, 0, 3, 240]
20 |
21 | # Automatic (numpy)
22 | Z = np.reshape(np.array(L, dtype=np.ubyte).view(dtype=">i8"), (2,2,2), order="F")
23 | print(Z[1,0,0])
24 |
25 | # Manual (brain)
26 | shape = (2,2,2)
27 | itemsize = 8
28 | # We can probably do better
29 | strides = itemsize, itemsize*shape[0], itemsize*shape[0]*shape[1]
30 | index = (1,0,0)
31 | start = sum(i*s for i,s in zip(index,strides))
32 | end = start+itemsize
33 | value = struct.unpack(">Q", bytes(L[start:end]))[0]
34 | print(value)
35 |
--------------------------------------------------------------------------------
/repeat.py:
--------------------------------------------------------------------------------
1 | # -----------------------------------------------------------------------------
2 | # Copyright (C) 2018 Nicolas P. Rougier
3 | # Distributed under the terms of the BSD License.
4 | # -----------------------------------------------------------------------------
5 | import numpy as np
6 | from numpy.lib.stride_tricks import as_strided
7 |
8 | Z = np.zeros(5)
9 | Z1 = np.tile(Z,(3,1))
10 | Z2 = as_strided(Z, shape=(3,)+Z.shape, strides=(0,)+Z.strides)
11 |
12 | # Real repeat (three times the memory)
13 | Z1[0,0] = 1
14 | print(Z1)
15 |
16 | # Fake repeat (but less memory)
17 | Z2[0,0] = 1
18 | print(Z2)
19 |
--------------------------------------------------------------------------------
/strides.py:
--------------------------------------------------------------------------------
1 | # -----------------------------------------------------------------------------
2 | # Copyright (C) 2018 Nicolas P. Rougier
3 | # Distributed under the terms of the BSD License.
4 | # -----------------------------------------------------------------------------
5 | import numpy as np
6 |
7 | def strides(Z):
8 |     strides = [Z.itemsize]
9 |
10 |     # Fortran ordered array
11 |     if np.isfortran(Z):
12 |         for i in range(0, Z.ndim-1):
13 |             strides.append(strides[-1] * Z.shape[i])
14 |         return tuple(strides)
15 |     # C ordered array
16 |     else:
17 |         for i in range(Z.ndim-1, 0, -1):
18 |             strides.append(strides[-1] * Z.shape[i])
19 |         return tuple(strides[::-1])
20 |
21 | # This works
22 | Z = np.arange(24).reshape((2,3,4), order="C")
23 | print(Z.strides, " – ", strides(Z))
24 |
25 | Z = np.arange(24).reshape((2,3,4), order="F")
26 | print(Z.strides, " – ", strides(Z))
27 |
28 | # This does not work
29 | # Z = Z[::2]
30 | # print(Z.strides, " – ", strides(Z))
31 |
32 |
--------------------------------------------------------------------------------
/tools.py:
--------------------------------------------------------------------------------
1 | # -----------------------------------------------------------------------------
2 | # Copyright (C) 2018 Nicolas P. Rougier
3 | # Distributed under the terms of the BSD License.
4 | # -----------------------------------------------------------------------------
5 | from imshow import imshow
6 |
7 | def sysinfo():
8 |     import sys
9 |     import time
10 |     import numpy as np
11 |     import scipy as sp
12 |     import matplotlib
13 |
14 |     print("Date: %s" % (time.strftime("%D")))
15 |     version = sys.version_info
16 |     major, minor, micro = version.major, version.minor, version.micro
17 |     print("Python: %d.%d.%d" % (major, minor, micro))
18 |     print("Numpy: ", np.__version__)
19 |     print("Scipy: ", sp.__version__)
20 |     print("Matplotlib:", matplotlib.__version__)
21 |
22 |
23 | def timeit(stmt, globals=globals()):
24 |     import numpy as np
25 |     import timeit as _timeit
26 |
27 |     print("Timing '{0}'".format(stmt))
28 |
29 |     # Rough estimate of the time per call, averaged over 10 runs
30 |     trial = _timeit.timeit(stmt, globals=globals, number=10)/10
31 |
32 |     # Maximum duration
33 |     duration = 5.0
34 |
35 |     # Number of repeats
36 |     repeat = 7
37 |
38 |     # Compute rounded number of trials
39 |     number = max(1,int(10**np.ceil(np.log((duration/repeat)/trial)/np.log(10))))
40 |
41 |     # Report the mean and standard deviation over all repeats
42 |     times = _timeit.repeat(stmt, globals=globals, number=number, repeat=repeat)
43 |     times = np.array(times)/number
44 |     mean = np.mean(times)
45 |     std = np.std(times)
46 |
47 |     # Display results
48 |     units = {"s": 1, "ms": 1e-3, "us": 1e-6, "ns": 1e-9}
49 |     for key,value in units.items():
50 |         unit, factor = key, 1/value
51 |         if mean > value: break
52 |     mean *= factor
53 |     std *= factor
54 |
55 |     print("%.3g %s ± %.3g %s per loop (mean ± std. dev.
of %d runs, %d loops each)" %
56 |           (mean, unit, std, unit, repeat, number))
57 |
58 |
59 | def info(Z):
60 |     import sys
61 |     import numpy as np
62 |     endianness = {'=': 'native (%s)' % sys.byteorder,
63 |                   '<': 'little',
64 |                   '>': 'big',
65 |                   '|': 'not applicable'}
66 |
67 |     print("------------------------------")
68 |     print("Interface (item)")
69 |     print(" shape: ", Z.shape)
70 |     print(" dtype: ", Z.dtype)
71 |     print(" length: ", len(Z))
72 |     print(" size: ", Z.size)
73 |     print(" endianness: ", endianness[Z.dtype.byteorder])
74 |     if np.isfortran(Z):
75 |         print(" order: ☐ C ☑ Fortran")
76 |     else:
77 |         print(" order: ☑ C ☐ Fortran")
78 |     print("")
79 |     print("Memory (byte)")
80 |     print(" item size: ", Z.itemsize)
81 |     print(" array size: ", Z.size*Z.itemsize)
82 |     print(" strides: ", Z.strides)
83 |     print("")
84 |     print("Properties")
85 |     if Z.flags["OWNDATA"]:
86 |         print(" own data: ☑ Yes ☐ No")
87 |     else:
88 |         print(" own data: ☐ Yes ☑ No")
89 |     if Z.flags["WRITEABLE"]:
90 |         print(" writeable: ☑ Yes ☐ No")
91 |     else:
92 |         print(" writeable: ☐ Yes ☑ No")
93 |     if np.isfortran(Z) and Z.flags["F_CONTIGUOUS"]:
94 |         print(" contiguous: ☑ Yes ☐ No")
95 |     elif not np.isfortran(Z) and Z.flags["C_CONTIGUOUS"]:
96 |         print(" contiguous: ☑ Yes ☐ No")
97 |     else:
98 |         print(" contiguous: ☐ Yes ☑ No")
99 |     if Z.flags["ALIGNED"]:
100 |         print(" aligned: ☑ Yes ☐ No")
101 |     else:
102 |         print(" aligned: ☐ Yes ☑ No")
103 |     print("------------------------------")
104 |     print()
105 |
106 |
107 | if __name__ == '__main__':
108 |     import numpy as np
109 |
110 |     sysinfo()
111 |
112 |     Z = np.arange(9).reshape(3,3)
113 |     info(Z)
114 |
115 |     timeit("Z=np.random.uniform(0,1,1000000)", globals())
116 |
117 |
118 |
--------------------------------------------------------------------------------
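A note on `strides.py`: the file remarks that its hand-written `strides()` helper "does not work" on a sliced array. The sketch below is one way to see why, and is not part of the original repository. The helper name `contiguous_strides`, the `int64` dtype and the `Z[:, ::2]` slice are assumptions chosen for illustration. The idea is that strides derived from shape alone are only valid for contiguous memory, whereas a sliced view keeps the strides of its parent buffer.

```python
import numpy as np


def contiguous_strides(shape, itemsize):
    """Byte strides a C-contiguous array of this shape would have.

    Hypothetical helper: it rebuilds strides from the shape only,
    like the C-order branch of strides() in strides.py.
    """
    strides = [itemsize]
    for n in reversed(shape[1:]):
        strides.append(strides[-1] * n)
    return tuple(reversed(strides))


# Fixed dtype so that the item size (8 bytes) does not depend on the platform
Z = np.arange(24, dtype=np.int64).reshape(2, 3, 4)

# A view that skips every other row along axis 1; no data is copied
V = Z[:, ::2]

shape_based_Z = contiguous_strides(Z.shape, Z.itemsize)
shape_based_V = contiguous_strides(V.shape, V.itemsize)

print("Z:", Z.strides, "vs shape-based", shape_based_Z)  # identical
print("V:", V.strides, "vs shape-based", shape_based_V)  # different
print("V is C-contiguous:", V.flags["C_CONTIGUOUS"])
```

For the view, the stride along axis 1 is twice that of `Z` and the stride along axis 0 is unchanged, so the shape alone cannot recover it. Reading `Z.strides` directly is the reliable option for non-contiguous arrays.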