├── LICENSE.md
├── README.md
├── automata.png
├── automata.py
├── bad-dither.py
├── basic-manipulation.py
├── data.txt
├── diffusion.png
├── diffusion.py
├── dithered.png
├── geometry.png
├── geometry.py
├── imshow.png
├── imshow.py
├── input-output.py
├── kitten-dithered.jpg
├── kitten-quantized.jpg
├── kitten.jpg
├── kmeans.py
├── moving-average.py
├── nan-arithmetics.py
├── original.png
├── perceptron.mp4
├── perceptron.py
├── random-walk.py
├── reorder.py
├── repeat.py
├── strides.py
└── tools.py


/LICENSE.md:
--------------------------------------------------------------------------------
 1 | MIT License
 2 | 
 3 | Copyright (c) 2018 Nicolas P. Rougier
 4 | 
 5 | Permission is hereby granted, free of charge, to any person obtaining a copy
 6 | of this software and associated documentation files (the "Software"), to deal
 7 | in the Software without restriction, including without limitation the rights
 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
 9 | copies of the Software, and to permit persons to whom the Software is
10 | furnished to do so, subject to the following conditions:
11 | 
12 | The above copyright notice and this permission notice shall be included in all
13 | copies or substantial portions of the Software.
14 | 
15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21 | SOFTWARE.
22 | 


--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
   1 | # Advanced NumPy
   2 | 
   3 | A 3h00 course on advanced numpy techniques  
   4 | [Nicolas P. Rougier](http://www.labri.fr/perso/nrougier),
   5 | [G-Node summer school](https://python.g-node.org/),
   6 | Camerino, Italy, 2018
   7 | 
   8 | 
   9 | > NumPy is a library for the Python programming language, adding support for
  10 | > large, multi-dimensional arrays and matrices, along with a large collection
  11 | > of high-level mathematical functions to operate on these arrays.
  12 | >
  13 | > – Wikipedia
  14 | 
  15 | 
  16 | **Quicklinks**:
  17 |   [Numpy website](https://www.numpy.org) –
  18 |   [Numpy GitHub](https://github.com/numpy/numpy) –
  19 |   [Numpy documentation](https://www.numpy.org/devdocs/reference/) –
  20 |   [ASPP archives](https://python.g-node.org/wiki/archives) –
  21 |   [100 Numpy Exercises](https://github.com/rougier/numpy-100) –
  22 |   [From Python to Numpy](http://www.labri.fr/perso/nrougier/from-python-to-numpy/)
  23 | 
  24 | #### Table of Contents
  25 | 
  26 | * [Introduction](#--introduction)
  27 | * [Warmup](#--warmup)
  28 | * [Advanced exercises](#--advanced-exercises)
  29 | * [References](#--references)
  30 | 
  31 | ---
  32 | 
  33 | ## ❶ – Introduction
  34 | 
  35 | NumPy is all about vectorization. If you are familiar with Python, this is the
  36 | main difficulty you'll face because you'll need to change your way of thinking
  37 | and your new friends (among others) are named "vectors", "arrays", "views" or
  38 | "ufuncs". Let's take a very simple example: random walk.
  39 | 
  40 | One obvious way to write a random walk in Python is:
  41 | 
  42 | ```Python
  43 | def random_walk_slow(n):
  44 |     position = 0
  45 |     walk = [position]
  46 |     for i in range(n):
  47 |         position += 2*random.randint(0, 1)-1
  48 |         walk.append(position)
  49 |     return walk
  50 | walk = random_walk_slow(1000)
  51 | ```
  52 | 
  53 | 
  54 | It works, but it is slow. We can do better using the itertools Python module
  55 | that offers a set of functions for creating iterators for efficient looping. If
  56 | we observe that a random walk is an accumulation of steps, we can rewrite the
  57 | function by first generating all the steps and then accumulating them without
  58 | any explicit loop:
  59 | 
  60 | ```Python
  61 | def random_walk_faster(n=1000):
  62 |     from itertools import accumulate
  63 |     # Only available from Python 3.6
  64 |     steps = random.choices([-1,+1], k=n)
  65 |     return [0]+list(accumulate(steps))
  66 | walk = random_walk_faster(1000)
  67 | ```
  68 | 
  69 | It is better but still, it is slow. A more efficient implementation, taking
  70 | full advantage of NumPy, can be written as:
  71 | 
  72 | ```Python
  73 | def random_walk_fastest(n=1000):
  74 |     steps = np.random.choice([-1,+1], n)
  75 |     return np.cumsum(steps)
  76 | walk = random_walk_fastest(1000)
  77 | ```
  78 | 
  79 | Now, it is amazingly fast!
  80 | 
  81 | ```Pycon
  82 | >>> timeit("random_walk_slow(1000)", globals())
  83 | Timing 'random_walk_slow(1000)'
  84 | 1.58 ms ± 0.0228 ms per loop (mean ± std. dev. of 7 runs, 1000 loops each)
  85 | 
  86 | >>> timeit("random_walk_faster(1000)", globals())
  87 | Timing 'random_walk_faster(1000)'
  88 | 281 us ± 3.15 us per loop (mean ± std. dev. of 7 runs, 10000 loops each)
  89 | 
  90 | >>> timeit("random_walk_fastest(1000)", globals())
  91 | Timing 'random_walk_fastest(1000)'
  92 | 27.6 us ± 3.45 us per loop (mean ± std. dev. of 7 runs, 1000 loops each)
  93 | ```
  94 | 
  95 | **Warning**: You may have noticed (or not) that the `random_walk_*` functions above
  96 | work but are not reproducible at all, which is pretty annoying in science. If you
  97 | want to know why, you can have a look at the article [Re-run, Repeat,
  98 | Reproduce, Reuse, Replicate: Transforming Code into Scientific
  99 | Contributions](https://www.frontiersin.org/articles/10.3389/fninf.2017.00069/full)
 100 | (that I wrote with [Fabien Benureau](https://github.com/benureau)).
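
One common way to make such a walk reproducible is to seed the random generator explicitly. The sketch below is not part of the original course: `random_walk_seeded` and its `seed` parameter are hypothetical names, and it assumes NumPy 1.17 or later for `np.random.default_rng`.

```Python
import numpy as np

def random_walk_seeded(n=1000, seed=None):
    # A dedicated generator: the same seed yields the same sequence of steps
    rng = np.random.default_rng(seed)
    steps = rng.choice([-1, +1], n)
    return np.cumsum(steps)

walk_a = random_walk_seeded(1000, seed=42)
walk_b = random_walk_seeded(1000, seed=42)
print(np.array_equal(walk_a, walk_b))
```

Passing the generator state (or at least the seed) around explicitly keeps the randomness under your control instead of relying on hidden global state.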
 101 | 
 102 | 
 103 | 
 104 | One last point before heading to the course: I would like to warn you about a
 105 | potential problem you may encounter once you have become familiar enough
 106 | with NumPy. It is a very powerful library and you can make wonders with it but,
 107 | most of the time, this comes at the price of readability. If you don't comment
 108 | your code at the time of writing, you won't be able to tell what a function is
 109 | doing after a few weeks (or possibly days). For example, can you tell what the
 110 | two functions below are doing?
 111 | 
 112 | ```Python
 113 | def function_1(seq, sub):
 114 |     return [i for i in range(len(seq) - len(sub) + 1) if seq[i:i+len(sub)] == sub]
 115 | 
 116 | def function_2(seq, sub):
 117 |     target = np.dot(sub, sub)
 118 |     candidates = np.where(np.correlate(seq, sub, mode='valid') == target)[0]
 119 |     check = candidates[:, np.newaxis] + np.arange(len(sub))
 120 |     mask = np.all((np.take(seq, check) == sub), axis=-1)
 121 |     return candidates[mask]
 122 | ```
 123 | 
 124 | As you may have guessed, the second function is the
 125 | vectorized-optimized-faster-NumPy version of the first function and it runs 10x
 126 | faster than the pure Python version. But it is hardly readable.
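
To see what the two functions compute, it helps to run them on a small input. The sketch below restates them under the hypothetical names `find_python` and `find_numpy`, with added comments; the sample `seq` and `sub` are made up for illustration. Both return the start index of every occurrence of `sub` in `seq`.

```Python
import numpy as np

def find_python(seq, sub):
    # Compare every window of len(sub) items against sub
    n = len(seq) - len(sub) + 1
    return [i for i in range(n) if seq[i:i+len(sub)] == sub]

def find_numpy(seq, sub):
    # A window can only match if its dot product with sub equals sub.sub
    target = np.dot(sub, sub)
    candidates = np.where(np.correlate(seq, sub, mode='valid') == target)[0]
    # The dot product test is necessary but not sufficient,
    # so confirm each candidate element by element
    check = candidates[:, np.newaxis] + np.arange(len(sub))
    mask = np.all(np.take(seq, check) == sub, axis=-1)
    return candidates[mask]

seq = [1, 2, 3, 1, 2, 4, 1, 2, 3]
sub = [1, 2, 3]
print(find_python(seq, sub))
print(find_numpy(seq, sub).tolist())
```

A few comments like these are usually enough to keep the vectorized version understandable a few weeks later.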
 127 | 
 128 | 
 129 | 
 130 | ## ❷ – Warmup
 131 | 
 132 | You're supposed to be already familiar with NumPy. If not, you should read the
 133 | [NumPy chapter](http://www.scipy-lectures.org/intro/numpy/index.html) from the [SciPy Lecture Notes](http://www.scipy-lectures.org/). Before heading to the more advanced
 134 | stuff, let's do some warmup exercises (that should pose no problem). If you
 135 | choke on the first exercise, you should try to have a look at the [Anatomy of an array](https://www.labri.fr/perso/nrougier/from-python-to-numpy/#anatomy-of-an-array) and check also the [Quick references](https://www.labri.fr/perso/nrougier/from-python-to-numpy/#quick-references).
 136 | 
 137 | ### Useful tools
 138 | 
 139 | Before heading to the exercises, we'll write `sysinfo` and `info` functions that will help us debug our code.
 140 | 
 141 | The `sysinfo` function displays some information related to your scientific
 142 | environment:
 143 | 
 144 | ```Pycon
 145 | >>> import tools
 146 | >>> tools.sysinfo()
 147 | Date:       08/25/18
 148 | Python:     3.7.0
 149 | Numpy:      1.14.5
 150 | Scipy:      1.1.0
 151 | Matplotlib: 2.2.2
 152 | ```
 153 | 
 154 | The `info` function displays detailed information about a specific array:
 155 | 
 156 | ```Pycon
 157 | >>> import tools
 158 | >>> Z = np.arange(9).reshape(3,3)
 159 | >>> tools.info(Z)
 160 | ------------------------------
 161 | Interface (item)
 162 |   shape:       (3,3)
 163 |   dtype:       int64
 164 |   length:      3
 165 |   size:        9
 166 |   endianess:   native (little)
 167 |   order:       ☑ C  ☐ Fortran
 168 | 
 169 | Memory (byte)
 170 |   item size:   8
 171 |   array size:  72
 172 |   strides:     (24, 8)
 173 | 
 174 | Properties
 175 |   own data:    ☑ Yes  ☐ No
 176 |   writeable:   ☑ Yes  ☐ No
 177 |   contiguous:  ☑ Yes  ☐ No
 178 |   aligned:     ☑ Yes  ☐ No
 179 | ------------------------------
 180 | ```
 181 | 
 182 | 
 183 | Try to code these two functions. You can then compare your implementation with
 184 | [mine](tools.py).  
 185 | **NOTE**: We don't care so much about the formatting, do not lose time trying
 186 | to copy it exactly.
 187 | 
 188 | 
 189 | The [tools.py](tools.py) script comes with two other functions that might be
 190 | useful. The `timeit` function lets you time some code precisely (e.g. to
 191 | measure which of several variants is the fastest). It is pretty similar to the `%timeit` magic
 192 | function from IPython:
 193 | 
 194 | ```Pycon
 195 | >>> import tools
 196 | >>> tools.timeit("Z=np.random.uniform(0,1,1000000)", globals())
 197 | Measuring time for 'Z=np.random.uniform(0,1,1000000)'
 198 | 11.4 ms ± 0.198 ms per loop (mean ± std. dev. of 7 runs, 100 loops each)
 199 | ```
 200 | 
 201 | The `imshow` function displays a one-dimensional or
 202 | two-dimensional array in the console. It won't replace matplotlib but it can
 203 | come in handy for some (small) arrays (you'll need a 256-color terminal):
 204 | 
 205 | ![](imshow.png)
 206 | 
 207 | 
 208 | ### Basic manipulation
 209 | 
 210 | Let's start with some basic operations:
 211 | 
 212 | • Create a vector with values ranging from 10 to 49  
 213 | • Create a null vector of size 100 except for the fifth value, which is 1  
 214 | • Reverse a vector (first element becomes last)  
 215 | • Create a 3x3 matrix with values ranging from 0 to 8  
 216 | • Create a 3x3 identity matrix  
 217 | • Create a 2d array with 1 on the border and 0 inside   
 218 | • Given a 1D array, negate all elements which are between 3 and 8, in place  
 219 | 
 220 | For a more complete list, you can have a look at the [100 Numpy Exercises](https://github.com/rougier/numpy-100).
 221 | 
 222 | <details><summary><b>Solution</b> (click to expand)</summary><p>
 223 | 
 224 | Sources: [basic-manipulation.py](basic-manipulation.py)
 225 | 
 226 | ```Python
 227 | import numpy as np
 228 | 
 229 | # Create a vector with values ranging from 10 to 49
 230 | Z = np.arange(10,50)
 231 | 
 232 | # Create a null vector of size 100 except for the fifth value, which is 1
 233 | Z = np.zeros(100)
 234 | Z[4] = 1
 235 | 
 236 | # Reverse a vector (first element becomes last)
 237 | Z = np.arange(50)[::-1]
 238 | 
 239 | # Create a 3x3 matrix with values ranging from 0 to 8
 240 | Z = np.arange(9).reshape(3,3)
 241 | 
 242 | # Create a 3x3 identity matrix
 243 | Z = np.eye(3)
 244 | 
 245 | # Create a 2d array with 1 on the border and 0 inside
 246 | Z = np.ones((10,10))
 247 | Z[1:-1,1:-1] = 0
 248 | 
 249 | # Given a 1D array, negate all elements which are between 3 and 8, in place
 250 | Z = np.arange(11)
 251 | Z[(3 < Z) & (Z <= 8)] *= -1
 252 | ```
 253 | </p></details>
 254 | 
 255 | 
 256 | 
 257 | ### NaN arithmetics
 258 | 
 259 | Just a reminder on NaN arithmetics:
 260 | 
 261 | What is the result of each of the following expressions?  
 262 | **→ Hints**: [What Every Computer Scientist Should Know About Floating-Point Arithmetic, D. Goldberg, 1991](https://docs.oracle.com/cd/E19957-01/806-3568/ncg_goldberg.html)  
 263 | 
 264 | 
 265 | ```Python
 266 | print(0 * np.nan)
 267 | print(np.nan == np.nan)
 268 | print(np.inf > np.nan)
 269 | print(np.nan - np.nan)
 270 | print(0.3 == 3 * 0.1)
 271 | ```
 272 | 
 273 | 
 274 | <details><summary><b>Solution</b> (click to expand)</summary><p>
 275 | 
 276 | Sources [nan-arithmetics.py](nan-arithmetics.py)
 277 | 
 278 | ```Python
 279 | import numpy as np
 280 | 
 281 | # Result is NaN
 282 | print(0 * np.nan)
 283 | 
 284 | # Result is False
 285 | print(np.nan == np.nan)
 286 | 
 287 | # Result is False
 288 | print(np.inf > np.nan)
 289 | 
 290 | # Result is NaN
 291 | print(np.nan - np.nan)
 292 | 
 293 | # Result is False !!!
 294 | print(0.3 == 3 * 0.1)
 295 | print("0.1 really is {:0.56f}".format(0.1))
 296 | ```
 297 | 
 298 | </p></details>
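
Since `==` is unreliable both for NaN and for rounded floating-point values, NumPy provides dedicated helpers. The short sketch below is an addition to the exercise and uses only standard NumPy calls (`np.isnan`, `np.isclose`, `np.nansum`); the array `Z` is made up for illustration.

```Python
import numpy as np

Z = np.array([0.0, np.nan, 1.0])

# Element-wise NaN test (Z == np.nan would be False everywhere)
print(np.isnan(Z))

# Floating-point comparison with a tolerance instead of strict equality
print(np.isclose(0.3, 3 * 0.1))

# Sum that ignores NaN values instead of propagating them
print(np.nansum(Z))
```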
 299 | 
 300 | 
 301 | 
 302 | ### Computing strides
 303 | 
 304 | Given an array Z, how would you compute its strides manually?  
 305 | **→ Hints**:
 306 |  [itemsize](https://docs.scipy.org/doc/numpy/reference/generated/numpy.ndarray.itemsize.html) –
 307 |  [shape](https://docs.scipy.org/doc/numpy/reference/generated/numpy.ndarray.shape.html) –
 308 |  [ndim](https://docs.scipy.org/doc/numpy/reference/generated/numpy.ndarray.ndim.html)  
 309 | 
 310 | 
 311 | ```Python
 312 | import numpy as np
 313 | Z = np.arange(24).reshape(2,3,4)
 314 | print(Z.strides)
 315 | ```
 316 | 
 317 | <details><summary><b>Solution</b> (click to expand)</summary><p>
 318 | 
 319 | Sources [strides.py](strides.py)
 320 | 
 321 | ```Python
 322 | import numpy as np
 323 | 
 324 | def strides(Z):
 325 |     strides = [Z.itemsize]
 326 |     
 327 |     # Fortran ordered array
 328 |     if np.isfortran(Z):
 329 |         for i in range(0, Z.ndim-1):
 330 |             strides.append(strides[-1] * Z.shape[i])
 331 |         return tuple(strides)
 332 |     # C ordered array
 333 |     else:
 334 |         for i in range(Z.ndim-1, 0, -1):
 335 |             strides.append(strides[-1] * Z.shape[i])
 336 |         return tuple(strides[::-1])
 337 | 
 338 | # This works
 339 | Z = np.arange(24).reshape((2,3,4), order="C")
 340 | print(Z.strides, " – ", strides(Z))
 341 | 
 342 | Z = np.arange(24).reshape((2,3,4), order="F")
 343 | print(Z.strides, " – ", strides(Z))
 344 | 
 345 | # This does not work: a sliced view inherits strides from its base array
 346 | Z = Z[::2]
 347 | print(Z.strides, " – ", strides(Z))
 348 | ```
 349 | 
 350 | </p></details>
 351 | 
 352 | ### Repeat and repeat
 353 | 
 354 | Can you tell the difference?  
 355 | **→ Hints**:
 356 |   [tile](https://docs.scipy.org/doc/numpy/reference/generated/numpy.tile.html) –
 357 |   [as_strided](https://docs.scipy.org/doc/numpy/reference/generated/numpy.lib.stride_tricks.as_strided.html)  
 358 | 
 359 | 
 360 | ```Python
 361 | import numpy as np
 362 | from numpy.lib.stride_tricks import as_strided
 363 | 
 364 | Z = np.random.randint(0,10,5)
 365 | Z1 = np.tile(Z, (3,1))
 366 | Z2 = as_strided(Z, shape=(3,)+Z.shape, strides=(0,)+Z.strides)
 367 | ```
 368 | 
 369 | <details><summary><b>Solution</b> (click to expand)</summary><p>
 370 | 
 371 | Sources [repeat.py](repeat.py)
 372 | 
 373 | ```Python
 374 | import numpy as np
 375 | from numpy.lib.stride_tricks import as_strided
 376 | 
 377 | Z = np.zeros(5)
 378 | Z1 = np.tile(Z,(3,1))
 379 | Z2 = as_strided(Z, shape=(3,)+Z.shape, strides=(0,)+Z.strides)
 380 | 
 381 | # Real repeat: three times the memory
 382 | Z1[0,0] = 1
 383 | print(Z1)
 384 | 
 385 | # Fake repeat: less memory but not totally equivalent
 386 | Z2[0,0] = 1
 387 | print(Z2)
 388 | ```
 389 | 
 390 | </p></details>
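
The difference can also be checked directly rather than by observing side effects. The sketch below is an addition, not part of the original solution; it uses `np.shares_memory` to test whether two arrays overlap in memory, and `np.broadcast_to` as a read-only alternative to the zero-stride trick.

```Python
import numpy as np
from numpy.lib.stride_tricks import as_strided

Z = np.zeros(5)
Z1 = np.tile(Z, (3, 1))
Z2 = as_strided(Z, shape=(3,) + Z.shape, strides=(0,) + Z.strides)
Z3 = np.broadcast_to(Z, (3,) + Z.shape)

# tile allocates a new buffer, the strided view reuses the buffer of Z
print(np.shares_memory(Z, Z1))
print(np.shares_memory(Z, Z2))

# broadcast_to also uses a zero stride on the first axis, but is read-only
print(Z3.strides, Z3.flags.writeable)
```

Because the three rows of `Z2` are the same memory, writing to one row changes all of them (and `Z` itself), which is why the two repeats are not equivalent.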
 391 | 
 392 | 
 393 | ### Reordering things
 394 | 
 395 | Let's consider the following list:
 396 | 
 397 | ```Python
 398 | L = [  0,   0,   0,   0,   0,   0,   3, 233,
 399 |        0,   0,   0,   0,   0,   0,   3, 237,
 400 |        0,   0,   0,   0,   0,   0,   3, 235,
 401 |        0,   0,   0,   0,   0,   0,   3, 239,
 402 |        0,   0,   0,   0,   0,   0,   3, 234,
 403 |        0,   0,   0,   0,   0,   0,   3, 238,
 404 |        0,   0,   0,   0,   0,   0,   3, 236,
 405 |        0,   0,   0,   0,   0,   0,   3, 240]
 406 | ```
 407 | 
 408 | This is actually the byte dump of a 2x2x2 Fortran-ordered array of 64-bit
 409 | integers using big-endian encoding.
 410 | 
 411 | How would you access element at [1,0,0] with NumPy (simple)?  
 412 | 
 413 | <details><summary><b>Solution</b> (click to expand)</summary><p>
 414 | 
 415 | ```Python
 416 | 
 417 | import struct
 418 | import numpy as np
 419 | 
 420 | # Generation of the array
 421 | # Z = range(1001, 1009)
 422 | # L = np.reshape(Z, (2,2,2), order="F").ravel().astype(">i8").view(np.ubyte)
 423 | 
 424 | L = [  0,   0,   0,   0,   0,   0,   3, 233,
 425 |        0,   0,   0,   0,   0,   0,   3, 237,
 426 |        0,   0,   0,   0,   0,   0,   3, 235,
 427 |        0,   0,   0,   0,   0,   0,   3, 239,
 428 |        0,   0,   0,   0,   0,   0,   3, 234,
 429 |        0,   0,   0,   0,   0,   0,   3, 238,
 430 |        0,   0,   0,   0,   0,   0,   3, 236,
 431 |        0,   0,   0,   0,   0,   0,   3, 240]
 432 | 
 433 | # Automatic (numpy)
 434 | Z = np.reshape(np.array(L, dtype=np.ubyte).view(dtype=">i8"), (2,2,2), order="F")
 435 | print(Z[1,0,0])
 436 | ```
 437 | </p></details><br/>
 438 | 
 439 | 
 440 | How would you access element at [1,0,0] without NumPy (harder)?  
 441 |   **→ Hints**: Use your brain!
 442 | 
 443 | 
 444 | <details><summary><b>Solution</b> (click to expand)</summary><p>
 445 | 
 446 | Sources [reorder.py](reorder.py)
 447 | 
 448 | ```Python
 449 | 
 450 | import struct
 451 | import numpy as np
 452 | 
 453 | # Generation of the array
 454 | # Z = range(1001, 1009)
 455 | # L = np.reshape(Z, (2,2,2), order="F").ravel().astype(">i8").view(np.ubyte)
 456 | 
 457 | L = [  0,   0,   0,   0,   0,   0,   3, 233,
 458 |        0,   0,   0,   0,   0,   0,   3, 237,
 459 |        0,   0,   0,   0,   0,   0,   3, 235,
 460 |        0,   0,   0,   0,   0,   0,   3, 239,
 461 |        0,   0,   0,   0,   0,   0,   3, 234,
 462 |        0,   0,   0,   0,   0,   0,   3, 238,
 463 |        0,   0,   0,   0,   0,   0,   3, 236,
 464 |        0,   0,   0,   0,   0,   0,   3, 240]
 465 | 
 466 | # Automatic (numpy)
 467 | Z = np.reshape(np.array(L, dtype=np.ubyte).view(dtype=">i8"), (2,2,2), order="F")
 468 | print(Z[1,0,0])
 469 | 
 470 | # Manual (brain)
 471 | shape = (2,2,2)
 472 | itemsize = 8
 473 | # We can probably do better
 474 | strides = itemsize, itemsize*shape[0], itemsize*shape[0]*shape[1]
 475 | index = (1,0,0)
 476 | start = sum(i*s for i,s in zip(index,strides))
 477 | end = start+itemsize
 478 | value = struct.unpack(">q", bytes(L[start:end]))[0]  # ">q": big-endian signed 64-bit
 479 | print(value)
 480 | ```
 481 | 
 482 | </p></details>
 483 | 
 484 | 
 485 | ### Heat equation
 486 | 
 487 |  
 488 | > The diffusion equation (a.k.a the heat equation) reads `∂u/∂t = α∂²u/∂x²` where
 489 | > u(x,t) is the unknown function to be solved, x is a coordinate in space, and t
 490 | > is time. The coefficient α is the diffusion coefficient and determines how fast
 491 | > u changes in time. The discrete (time (n) and space (i)) version of the equation
 492 | > can be rewritten as `u(i,n+1) = u(i,n) + F(u(i-1,n) - 2u(i,n) + u(i+1,n))`.
 493 | >
 494 | > – [Finite difference methods for diffusion processes](http://hplgit.github.io/num-methods-for-PDEs/doc/pub/diffu/sphinx/._main_diffu000.html), Hans Petter Langtangen
 495 | 
 496 | The goal here is to compute the discrete equation over a finite domain using
 497 | `as_strided` to produce a sliding-window view of a 1D array. This view can
 498 | then be used to compute `U` at the next iteration. Use the following initial
 499 | conditions (with `Z` in place of `U`):
 500 | 
 501 | 
 502 | ```Python
 503 | Z = np.random.uniform(0.00, 0.05, (50,100))
 504 | Z[0,5::10] = 1
 505 | ```
 506 | 
 507 | Try to obtain this picture (where time goes from top to bottom):
 508 | 
 509 | ![](diffusion.png)
 510 | 
 511 | 
 512 | The code to display the figure from an array Z is:
 513 | 
 514 | ```Python
 515 | import matplotlib.pyplot as plt
 516 | 
 517 | plt.figure(figsize=(6,3))
 518 | plt.subplot(1,1,1,frameon=False)
 519 | plt.imshow(Z, vmin=0, vmax=1)
 520 | plt.xticks([]), plt.yticks([])
 521 | plt.tight_layout()
 522 | plt.show()
 523 | ```
 524 | 
 525 | **Hint**: You will need to write a `sliding_window(Z, size=3)` function that returns
 526 | a strided view of Z.
 527 | 
 528 | <details><summary><b>Solution</b> (click to expand)</summary><p>
 529 | 
 530 | Sources [diffusion.py](diffusion.py)
 531 | 
 532 | ```Python
 533 | import numpy as np
 534 | import matplotlib.pyplot as plt
 535 | from numpy.lib.stride_tricks import as_strided
 536 | 
 537 | 
 538 | def sliding_window(Z, size=2):
 539 |     n, s = Z.shape[0], Z.strides[0]
 540 |     return as_strided(Z, shape=(n-size+1, size), strides=(s, s))
 541 | 
 542 | 
 543 | # Initial conditions:
 544 | # Domain size is 100 and we'll iterate over 50 time steps
 545 | Z = np.zeros((50,100))
 546 | Z[0,5::10] = 1.5
 547 | 
 548 | # Actual iteration
 549 | F = 0.05
 550 | for i in range(1, len(Z)):
 551 |     Z[i,1:-1] = Z[i-1,1:-1] + F*(sliding_window(Z[i-1], 3)*[+1,-2,+1]).sum(axis=1)
 552 | 
 553 | # Display
 554 | plt.figure(figsize=(6,3))
 555 | plt.subplot(1,1,1,frameon=False)
 556 | plt.imshow(Z, vmin=0, vmax=1)
 557 | plt.xticks([]), plt.yticks([])
 558 | plt.tight_layout()
 559 | plt.savefig("diffusion.png")
 560 | plt.show()
 561 | ```
 562 | 
 563 | </p></details>
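
Recent NumPy versions ship a ready-made equivalent of the hand-written window function. The sketch below is an addition and assumes NumPy 1.20 or later, where `numpy.lib.stride_tricks.sliding_window_view` is available; it checks that both approaches give the same windows and applies one diffusion step to a made-up 1D array.

```Python
import numpy as np
from numpy.lib.stride_tricks import as_strided, sliding_window_view

def sliding_window(Z, size=2):
    n, s = Z.shape[0], Z.strides[0]
    return as_strided(Z, shape=(n-size+1, size), strides=(s, s))

Z = np.arange(10.0)
W1 = sliding_window(Z, 3)        # manual strided view
W2 = sliding_window_view(Z, 3)   # library version (read-only view)
print(W1.shape, W2.shape)
print(np.array_equal(W1, W2))

# One explicit diffusion step on the interior points
F = 0.05
step = Z[1:-1] + F * (W2 * [+1, -2, +1]).sum(axis=1)
```

Since `Z` is linear here, its discrete second derivative is zero and `step` equals `Z[1:-1]`, which makes a handy sanity check.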
 564 | 
 565 | 
 566 | ### Rule 30
 567 | 
 568 | With only a slight modification of the previous exercise, we can compute a
 569 | one-dimensional [cellular automaton](https://en.wikipedia.org/wiki/Cellular_automaton), and more specifically [Rule 30](https://en.wikipedia.org/wiki/Rule_30), which
 570 | exhibits intriguing patterns as shown below:
 571 | 
 572 | ![](automata.png)
 573 | 
 574 | To start with, here is how to convert the rule into a usable form:
 575 | 
 576 | ```Python
 577 | rule = 30 
 578 | R = np.array([int(v) for v in '{0:08b}'.format(rule)])[::-1]
 579 | ```
 580 | 
 581 | and we consider this initial state:
 582 | 
 583 | ```Python
 584 | Z = np.zeros((250,501), dtype=int)
 585 | Z[0,250] = 1
 586 | ```
 587 | 
 588 | Try to obtain the same figure. Display code is:
 589 | 
 590 | ```Python
 591 | plt.figure(figsize=(6,3))
 592 | plt.subplot(1,1,1,frameon=False)
 593 | plt.imshow(Z, vmin=0, vmax=1, cmap=plt.cm.gray_r)
 594 | plt.xticks([]), plt.yticks([])
 595 | plt.tight_layout()
 596 | plt.savefig("automata.png")
 597 | plt.show()
 598 | ```
 599 | 
 600 | 
 601 | <details><summary><b>Solution</b> (click to expand)</summary><p>
 602 | 
 603 | Sources [automata.py](automata.py)
 604 | 
 605 | ```Python
 606 | import numpy as np
 607 | import matplotlib.pyplot as plt
 608 | from numpy.lib.stride_tricks import as_strided
 609 | 
 610 | def sliding_window(Z, size=2):
 611 |     n, s = Z.shape[0], Z.strides[0]
 612 |     return as_strided(Z, shape=(n-size+1, size), strides=(s, s))
 613 | 
 614 | # Rule 30  (see https://en.wikipedia.org/wiki/Rule_30)
 615 | # 0b000: 0, 0b001: 1, 0b010: 1, 0b011: 1
 616 | # 0b100: 1, 0b101: 0, 0b110: 0, 0b111: 0
 617 | rule = 30 
 618 | R = np.array([int(v) for v in '{0:08b}'.format(rule)])[::-1]
 619 | 
 620 | # Initial state
 621 | Z = np.zeros((250,501), dtype=int)
 622 | Z[0,250] = 1
 623 | 
 624 | # Computing some iterations
 625 | for i in range(1, len(Z)):
 626 |     N = sliding_window(Z[i-1],3) * [4,2,1]  # left cell is the most significant bit
 627 |     Z[i,1:-1] = R[N.sum(axis=1)]
 628 | 
 629 | # Display
 630 | plt.figure(figsize=(6,3))
 631 | plt.subplot(1,1,1,frameon=False)
 632 | plt.imshow(Z, vmin=0, vmax=1, cmap=plt.cm.gray_r)
 633 | plt.xticks([]), plt.yticks([])
 634 | plt.tight_layout()
 635 | plt.savefig("automata.png")
 636 | plt.show()
 637 | ```
 638 | 
 639 | </p></details>
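
As a sanity check on the rule table, Rule 30 can also be written as `left XOR (center OR right)`. The sketch below is an addition to the exercise: it compares that formula with the lookup table `R` for all eight neighborhoods, assuming the neighborhood is encoded as the 3-bit number `4*left + 2*center + right` (the convention used on the Wikipedia page).

```Python
import numpy as np

rule = 30
R = np.array([int(v) for v in '{0:08b}'.format(rule)])[::-1]

matches = []
for index in range(8):
    # Decode the 3-bit neighborhood: left is the most significant bit
    left, center, right = (index >> 2) & 1, (index >> 1) & 1, index & 1
    matches.append(R[index] == (left ^ (center | right)))
print(all(matches))
```

Weighting the window with `[1,2,4]` instead makes the right cell the most significant bit, which produces the mirror image of the rule.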
 640 | 
 641 | 
 642 | ### Input / Output
 643 | 
 644 | → Exercise written by [Stefan van der Walt](http://mentat.za.net/).  
 645 | 
 646 | Place the following data in a text file, data.txt:
 647 | 
 648 | ```
 649 | % rank         lemma (10 letters max)      frequency       dispersion
 650 | 21             they                        1865844         0.96
 651 | 42             her                         969591          0.91
 652 | 49             as                          829018          0.95
 653 | 7              to                          6332195         0.98
 654 | 63             take                        670745          0.97
 655 | 14             you                         3085642         0.92
 656 | 35             go                          1151045         0.93
 657 | 56             think                       772787          0.91
 658 | 28             not                         1638883         0.98
 659 | ```
 660 | 
 661 | Now, design a suitable structured data type, then load the data from the text
 662 | file using [np.loadtxt](https://docs.scipy.org/doc/numpy/reference/generated/numpy.loadtxt.html) (look at the documentation to see how to handle the '%' comment character).
 663 | 
 664 | Here's a skeleton to start with:
 665 | 
 666 | ```Python
 667 | import numpy as np
 668 | 
 669 | # Construct the data-type
 670 | # For example:
 671 | # dtype = np.dtype([('x', float), ('y', int), ('z', np.uint8)])
 672 | 
 673 | dt = np.dtype(...)  # Modify this line to give the correct answer
 674 | data = np.loadtxt(...)  # Load data with loadtxt
 675 | ```
 676 | 
 677 | Examine the data you got:
 678 |  * Extract words only
 679 |  * Extract the 3rd row
 680 |  * Print all words with rank < 30
 681 | 
 682 | Sort the data according to frequency (see
 683 | [np.sort](https://docs.scipy.org/doc/numpy/reference/routines.sort.html)).
 684 | 
 685 | 
 686 | Save the result to a NumPy archive file (e.g. "sorted.npz") using [np.savez](https://docs.scipy.org/doc/numpy/reference/generated/numpy.savez.html) and load it back with `out = np.load("sorted.npz")`. Do you get back what you put in? Why?
 687 | 
 688 | <details><summary><b>Solution</b> (click to expand)</summary><p>
 689 | 
 690 | Source: [input-output.py](input-output.py)
 691 | 
 692 | ```Python
 693 | import numpy as np
 694 | 
 695 | # Create our own dtype
 696 | dtype = np.dtype([('rank',       'i8'),
 697 |                   ('lemma',      'S10'),
 698 |                   ('frequency',  'i8'),
 699 |                   ('dispersion', 'f8')])
 700 | 
 701 | # Load file using our own dtype
 702 | data = np.loadtxt('data.txt', comments='%', dtype=dtype)
 703 | 
 704 | # Extract words only
 705 | print(data["lemma"])
 706 | 
 707 | # Extract the 3rd row
 708 | print(data[2])
 709 | 
 710 | # Print all words with rank < 30
 711 | print(data["lemma"][data["rank"] < 30])
 712 | 
 713 | # Sort the data according to frequency.
 714 | sorted = np.sort(data, order="frequency")
 715 | print(sorted)
 716 | 
 717 | # Save unsorted and sorted array
 718 | np.savez("sorted.npz", data=data, sorted=sorted)
 719 | 
 720 | # Load saved array
 721 | out = np.load("sorted.npz")
 722 | print(out["sorted"])
 723 | ```
 724 | 
 725 | </p></details><br/>
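
On the last question: `np.load` on a `.npz` file does not return an array but an archive object that reads each stored array on access. The sketch below is an addition; it uses a made-up two-row structured array and a temporary directory, so the file path is an assumption and not the `sorted.npz` of the exercise.

```Python
import os
import tempfile
import numpy as np

dtype = np.dtype([('rank', 'i8'), ('lemma', 'S10')])
data = np.array([(21, b'they'), (7, b'to')], dtype=dtype)

path = os.path.join(tempfile.mkdtemp(), "sorted.npz")
np.savez(path, data=data)

with np.load(path) as out:
    print(type(out).__name__)   # an archive object, not an ndarray
    print(out.files)            # names of the stored arrays
    restored = out["data"]      # the array is read from disk here

print((restored == data).all(), restored.dtype == data.dtype)
```

The arrays themselves come back with the same structured dtype and values, but you have to index the archive by the keyword names used in `np.savez`.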
 726 | 
 727 | 
 728 | ## ❸ – Advanced exercises
 729 | 
 730 | ### Geometry
 731 | 
 732 | We consider a collection of 2d squares that are each defined by four points, a scaling factor, a translation and a rotation angle. We want to obtain the following figure:
 733 | 
 734 | ![](geometry.png)
 735 | 
 736 | made of 25 squares, scaled by 0.1, translated by (1,0) and with increasing
 737 | rotation angles. The order of operations is `scale`, `translate` and
 738 | `rotate`. What would be the best structure `S` to hold all this information at
 739 | once?  
 740 | **→  Hints**: [structured arrays](https://docs.scipy.org/doc/numpy/user/basics.rec.html)
 741 | 
 742 | <details><summary><b>Solution</b> (click to expand)</summary><p>
 743 | 
 744 | ```Python
 745 | dtype = [("points",    float, (4, 2)),
 746 |          ("scale",     float),
 747 |          ("translate", float, 2),
 748 |          ("rotate",    float)]
 749 | S = np.zeros(25, dtype = dtype)
 750 | ```
 751 | 
 752 | </p></details><br/>
 753 | 
 754 | We now need to initialize our array. For the four points describing a square,
 755 | you can use the following points: [(-1,-1), (-1,+1), (+1,+1), (+1,-1)]
 756 | 
 757 | <details><summary><b>Solution</b> (click to expand)</summary><p>
 758 | 
 759 | ```Python
 760 | S["points"] = [(-1,-1), (-1,+1), (+1,+1), (+1,-1)]
 761 | S["translate"] = (1,0)
 762 | S["scale"] = 0.1
 763 | S["rotate"] = np.linspace(0, 2*np.pi, len(S), endpoint=False)
 764 | ```
 765 | </p></details><br/>
 766 | 
 767 | 
 768 | Now, we need to write a function that applies all these transformations and writes
 769 | the results into a new array:
 770 | 
 771 | ```Python
 772 | 
 773 | P = np.zeros((len(S), 4, 2))
 774 | # Your code here (to populate P)
 775 | ...
 776 | ```
 777 | 
 778 | You can start by writing `translate`, `scale` and `rotate` functions first.
 779 | 
 780 | > Rotation reminder. Considering a point (x,y) and a rotation angle a,
 781 | > the rotated coordinates (x',y') are:
 782 | >
 783 | > x' = x.cos(a) - y.sin(a) and y' = x.sin(a) + y.cos(a)
 784 | 
 785 | 
 786 | The display code is:
 787 | 
 788 | ```Python
 789 | import matplotlib.pyplot as plt
 790 | 
 791 | fig = plt.figure(figsize=(6,6))
 792 | ax = plt.subplot(1,1,1, frameon=False)
 793 | for i in range(len(P)):
 794 |     X = np.r_[P[i,:,0], P[i,0,0]]
 795 |     Y = np.r_[P[i,:,1], P[i,0,1]]
 796 |     plt.plot(X, Y, color="black")
 797 | plt.xticks([]), plt.yticks([])
 798 | plt.tight_layout()
 799 | plt.show()
 800 | ```
 801 | 
 802 | 
 803 | 
 804 | <details><summary><b>Solution</b> (click to expand)</summary><p>
 805 | 
 806 | Source: [geometry.py](geometry.py)
 807 | 
 808 | ```Python
 809 | import numpy as np
 810 | import matplotlib.pyplot as plt
 811 | 
 812 | dtype = [("points",    float, (4, 2)),
 813 |          ("scale",     float),
 814 |          ("translate", float, 2),
 815 |          ("rotate",    float)]
 816 | S = np.zeros(25, dtype = dtype)
 817 | S["points"] = [(-1,-1), (-1,+1), (+1,+1), (+1,-1)]
 818 | S["translate"] = (1,0)
 819 | S["scale"] = 0.1
 820 | S["rotate"] = np.linspace(0, 2*np.pi, len(S), endpoint=False)
 821 | 
 822 | P = np.zeros((len(S), 4, 2))
 823 | for i in range(len(S)):
 824 |     for j in range(4):
 825 |         x = S[i]["points"][j,0]
 826 |         y = S[i]["points"][j,1]
 827 |         tx, ty = S[i]["translate"]
 828 |         scale  = S[i]["scale"]
 829 |         theta  = S[i]["rotate"]
 830 |         x = tx + x*scale
 831 |         y = ty + y*scale
 832 |         x_ = x*np.cos(theta) - y*np.sin(theta)
 833 |         y_ = x*np.sin(theta) + y*np.cos(theta)
 834 |         P[i,j] = x_, y_
 835 | 
 836 | fig = plt.figure(figsize=(6,6))
 837 | ax = plt.subplot(1,1,1, frameon=False)
 838 | for i in range(len(P)):
 839 |     X = np.r_[P[i,:,0], P[i,0,0]]
 840 |     Y = np.r_[P[i,:,1], P[i,0,1]]
 841 |     plt.plot(X, Y, color="black")
 842 | plt.xticks([]), plt.yticks([])
 843 | plt.tight_layout()
 844 | plt.savefig("geometry.png")
 845 | plt.show()
 846 | ```
 847 | 
 848 | </p></details><br/>
 849 | 
 850 | The proposed solution has two loops. Can you imagine a way to do it without any loop?  
 851 | **→ Hints**: [einsum](https://docs.scipy.org/doc/numpy/reference/generated/numpy.einsum.html)
 852 | 
 853 | <details><summary><b>Solution</b> (click to expand)</summary><p>
 854 | 
 855 | Have a look at [Multiple individual 2d rotation at once](https://stackoverflow.com/questions/40822983/multiple-individual-2d-rotation-at-once) on Stack Overflow. I did not implement it, feel free to issue a PR with the solution.
 856 | 
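One possible loop-free sketch, not taken from the course material: it assumes the same transformation order as the loop version (scale, translate, then rotate) and builds one 2×2 rotation matrix per shape before applying them all at once with `einsum`. The names `Q` and `R` are illustrative, and the `scale` and `rotate` fields are declared without the explicit shape `1` used above.

```python
import numpy as np

# Same data as in the loop version (scalar fields declared without shape 1)
dtype = [("points",    float, (4, 2)),
         ("scale",     float),
         ("translate", float, 2),
         ("rotate",    float)]
S = np.zeros(25, dtype=dtype)
S["points"] = [(-1, -1), (-1, +1), (+1, +1), (+1, -1)]
S["translate"] = (1, 0)
S["scale"] = 0.1
S["rotate"] = np.linspace(0, 2*np.pi, len(S), endpoint=False)

# Scale then translate every point of every shape: result has shape (n, 4, 2)
Q = (S["points"] * S["scale"][:, np.newaxis, np.newaxis]
     + S["translate"][:, np.newaxis, :])

# One rotation matrix per shape: R has shape (n, 2, 2)
c, s = np.cos(S["rotate"]), np.sin(S["rotate"])
R = np.empty((len(S), 2, 2))
R[:, 0, 0], R[:, 0, 1] = c, -s
R[:, 1, 0], R[:, 1, 1] = s, c

# P[i, j, a] = sum over b of R[i, a, b] * Q[i, j, b]
P = np.einsum("iab,ijb->ija", R, Q)
```

`P` has the same shape `(25, 4, 2)` as in the loop version, so the plotting code can be reused unchanged.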
 857 | </p></details>
 858 | 
 859 | ### Image quantization
 860 | 
 861 | > In computer graphics, color quantization or color image quantization is
 862 | > quantization applied to color spaces; it is a process that reduces the number
 863 | > of distinct colors used in an image, usually with the intention that the new
 864 | > image should be as visually similar as possible to the original image.
 865 | >
 866 | > – Wikipedia
 867 | 
 868 | In this exercise, we want to produce color quantization, that is, considering a
 869 | random image, we would like to reduce the number of colors without altering too
 870 | much the perception of the image. We thus need to find the most representative
 871 | colors.
 872 | 
 873 | The first (naive) idea that may come to mind is to count the number of times a
 874 | specific color is used and to use the most frequent colors for quantization.
 875 | Unfortunately, this does not work very well as illustrated below:
 876 | 
 877 | ![](kitten.jpg)
 878 | ![](kitten-dithered.jpg)
 879 | 
 880 | The reason is that some colors and their slight variations might be
 881 | over-represented in the original image and will thus appear among the most
 882 | frequent colors. This is the reason why the kitten ended up mostly green and
 883 | the flower totally disappeared.
 884 | 
 885 | To check for yourself, write the corresponding script and look at the
 886 | result:
 887 | 
 888 | 1. Load an image (using [imageio](http://imageio.github.io/).[imread](https://imageio.readthedocs.io/en/latest/userapi.html#imageio.imread))
 889 | 2. Find the number of unique colors and their frequency (counts)
 890 | 3. Pick the n=16 most frequent colors
 891 | 4. Replace colors in the original image with the closest color (found previously)
 892 | 5. Save the result (using [imageio](http://imageio.github.io/).[imsave](https://imageio.readthedocs.io/en/latest/userapi.html#imageio.imsave))
 893 | 
 894 | 
 895 | <details><summary><b>Solution</b> (click to expand)</summary><p>
 896 | 
 897 | Sources: [bad-dither.py](bad-dither.py)
 898 | 
 899 | ```Python
 900 | import imageio
 901 | import numpy as np
 902 | import scipy.spatial
 903 | 
 904 | # Number of final colors we want
 905 | n = 16
 906 | 
 907 | # Original Image
 908 | I = imageio.imread("kitten.jpg")
 909 | shape = I.shape
 910 | 
 911 | # Flattened image
 912 | I = I.reshape(shape[0]*shape[1], shape[2])
 913 | 
 914 | # Find the unique colors and their frequency (=counts)
 915 | colors, counts = np.unique(I, axis=0, return_counts=True)
 916 | 
 917 | # Get the n most frequent colors
 918 | order = np.argsort(counts)[::-1]
 919 | C = colors[order][:n]
 920 | 
 921 | # Compute distance to most frequent colors
 922 | D = scipy.spatial.distance.cdist(I, C, 'sqeuclidean')
 923 | 
 924 | # Replace colors with closest one
 925 | Z = (C[D.argmin(axis=1)]).reshape(shape)
 926 | 
 927 | # Save result
 928 | imageio.imsave("kitten-dithered.jpg", Z)
 929 | ```
 930 | 
 931 | </p></details><br/>
 932 | 
 933 | 
 934 | We thus need a different method: [k-means
 935 | clustering](https://en.wikipedia.org/wiki/K-means_clustering), which
 936 | partitions the data into n clusters whose centroids serve as prototypes for
 937 | their clusters.
 938 | 
 939 | ![](kitten.jpg)
 940 | ![](kitten-quantized.jpg)
 941 | 
 942 | The algorithm is quite simple. We start with n random points (centroids) and,
 943 | for each point in our data, we find the closest centroid. The points sharing
 944 | the same closest centroid form a cluster. For each cluster, we compute its new
 945 | centroid (mean point) and repeat the process for a given number of steps. In
 946 | this exercise, you'll have to write such a k-means function and use it to
 947 | quantize the image.
 948 | 
 949 | <details><summary><b>Solution</b> (click to expand)</summary><p>
 950 | 
 951 | Sources: [kmeans.py](kmeans.py)
 952 | 
 953 | ```Python
 954 | # Code by Gareth Rees, posted on Code Review Stack Exchange
 955 | # https://codereview.stackexchange.com/questions/61598/k-mean-with-numpy
 956 | 
 957 | import numpy as np
 958 | import scipy.spatial
 959 | 
 960 | def cluster_centroids(data, clusters, k=None):
 961 |     if k is None:
 962 |         k = np.max(clusters) + 1
 963 |     result = np.empty(shape=(k,) + data.shape[1:])
 964 |     for i in range(k):
 965 |         np.mean(data[clusters == i], axis=0, out=result[i])
 966 |     return result
 967 | 
 968 | 
 969 | def kmeans(data, k=None, centroids=None, steps=20):
 970 |     if centroids is not None and k is not None:
 971 |         assert(k == len(centroids))
 972 |     elif centroids is not None:
 973 |         k = len(centroids)
 974 |     elif k is not None:
 975 |         # Forgy initialization method: choose k data points randomly.
 976 |         centroids = data[np.random.choice(np.arange(len(data)), k, False)]
 977 |     else:
 978 |         raise RuntimeError("Need a value for k or centroids.")
 979 | 
 980 |     for _ in range(max(steps, 1)):
 981 |         # Squared distances between each point and each centroid.
 982 |         sqdists = scipy.spatial.distance.cdist(centroids, data, 'sqeuclidean')
 983 | 
 984 |         # Index of the closest centroid to each data point.
 985 |         clusters = np.argmin(sqdists, axis=0)
 986 | 
 987 |         new_centroids = cluster_centroids(data, clusters, k)
 988 |         if np.array_equal(new_centroids, centroids):
 989 |             break
 990 | 
 991 |         centroids = new_centroids
 992 |     return centroids, clusters
 993 | 
 994 | 
 995 | if __name__ == '__main__':
 996 |     import imageio
 997 | 
 998 |     # Number of final colors we want
 999 |     n = 16
1000 | 
1001 |     # Original Image
1002 |     I = imageio.imread("kitten.jpg")
1003 |     shape = I.shape
1004 | 
1005 |     # Flattened image
1006 |     D = I.reshape(shape[0]*shape[1], shape[2])
1007 |     
1008 |     # Search for 16 centroids in D (using 20 iterations)
1009 |     centroids, clusters = kmeans(D, k=n, steps=20)
1010 | 
1011 |     # Create quantized image
1012 |     I = (centroids[clusters]).reshape(shape)
1013 |     I = np.round(I).astype(np.uint8)
1014 | 
1015 |     # Save result
1016 |     imageio.imsave("kitten-quantized.jpg", I)
1017 | ```
1018 | 
1019 | </p></details><br/>
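As a quick sanity check of the algorithm described above, here is a minimal sketch of a SciPy-free variant that computes the distances with broadcasting and runs on toy data. The function name `kmeans_broadcast`, the `seed` parameter, and the choice to keep the previous centroid when a cluster becomes empty are assumptions made for this illustration; they are not part of `kmeans.py`.

```python
import numpy as np

def kmeans_broadcast(data, k, steps=20, seed=0):
    """Hypothetical k-means variant using broadcasting instead of cdist."""
    rng = np.random.default_rng(seed)
    data = np.asarray(data, dtype=float)
    # Forgy initialization: pick k distinct data points as initial centroids
    centroids = data[rng.choice(len(data), size=k, replace=False)]
    for _ in range(max(steps, 1)):
        # Squared distances between each point and each centroid: shape (N, k)
        diff = data[:, np.newaxis, :] - centroids[np.newaxis, :, :]
        sqdists = (diff**2).sum(axis=2)
        # Index of the closest centroid for each data point
        clusters = sqdists.argmin(axis=1)
        new_centroids = centroids.copy()
        for i in range(k):
            members = data[clusters == i]
            if len(members):          # keep the old centroid if cluster is empty
                new_centroids[i] = members.mean(axis=0)
        if np.array_equal(new_centroids, centroids):
            break
        centroids = new_centroids
    return centroids, clusters

# Toy data: two well-separated blobs of 50 points each
rng = np.random.default_rng(1)
blobs = np.concatenate([rng.normal(0.0, 0.1, (50, 2)),
                        rng.normal(5.0, 0.1, (50, 2))])
centroids, clusters = kmeans_broadcast(blobs, k=2)
```

The intermediate `(N, k, d)` array can get large for a full image, which is one reason to prefer `scipy.spatial.distance.cdist` there.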
1020 | 
1021 | 
1022 | 
1023 | 
1024 | ### Neural networks
1025 | 
1026 | In this exercise, we'll implement one of the simplest feed-forward neural
1027 | networks, a.k.a. the [Perceptron](https://en.wikipedia.org/wiki/Perceptron). We'll use it to discriminate between two classes
1028 | (points in two dimensions, see [desired output](perceptron.mp4)):
1029 | 
1030 | ```Python
1031 | samples = np.zeros(100, dtype=[('input',  float, 2),
1032 |                                ('output', float, 1)])
1033 |                                
1034 | P = np.random.uniform(0.05,0.95,(len(samples),2))
1035 | samples["input"] = P
1036 | stars = np.where(P[:,0]+P[:,1] < 1)
1037 | discs = np.where(P[:,0]+P[:,1] > 1)
1038 | samples["output"][stars] = +1
1039 | samples["output"][discs] = 0
1040 | ```
1041 | 
1042 | Your goal is to complete the following class in order to train the
1043 | network. You'll need:
1044 | 
1045 | * a one-dimensional array to store the input
1046 | * a one-dimensional array to store the output
1047 | * a two-dimensional array to store the weights
1048 | * a threshold function (for example `lambda x: x > 0`)
1049 | 
1050 | The `propagate_forward` method is supposed to compute the output of the network
1051 | while the `propagate_backward` is supposed to modify the weights according to
1052 | the actual error.
1053 | 
1054 | ```Python
1055 | class Perceptron:
1056 |     def __init__(self, n, m):
1057 |         "Initialization of the perceptron with given sizes"
1058 |         ...
1059 | 
1060 |     def reset(self):
1061 |         "Reset weights"
1062 |         ...
1063 | 
1064 |     def propagate_forward(self, data):
1065 |         "Propagate data from input layer to output layer"
1066 |         ...
1067 |         
1068 |     def propagate_backward(self, target, lrate=0.1):
1069 |         "Back propagate error related to target using lrate"
1070 |         ... 
1071 | ```
1072 | 
1073 | <details><summary><b>Solution</b> (click to expand)</summary><p>
1074 | 
1075 | Sources: [perceptron.py](perceptron.py)
1076 | 
1077 | ```Python
1078 | class Perceptron:
1079 |     ''' Perceptron class. '''
1080 | 
1081 |     def __init__(self, n, m):
1082 |         "Initialization of the perceptron with given sizes"
1083 | 
1084 |         self.input  = np.ones(n+1)
1085 |         self.output = np.ones(m)
1086 |         self.weights= np.zeros((m,n+1))
1087 |         self.reset()
1088 | 
1089 |     def reset(self):
1090 |         "Reset weights"
1091 | 
1092 |         self.weights[...] = np.random.uniform(-.5, .5, self.weights.shape)
1093 | 
1094 |     def propagate_forward(self, data):
1095 |         "Propagate data from input layer to output layer"
1096 | 
1097 |         # Set input layer (but not bias); f is the threshold function, e.g. lambda x: x > 0
1098 |         self.input[1:]  = data
1099 |         self.output[...] = f(np.dot(self.weights,self.input))
1100 | 
1101 |         # Return output
1102 |         return self.output
1103 | 
1104 |     def propagate_backward(self, target, lrate=0.1):
1105 |         "Back propagate error related to target using lrate"
1106 | 
1107 |         error = np.atleast_2d(target-self.output)
1108 |         input = np.atleast_2d(self.input)
1109 |         self.weights += lrate*np.dot(error.T,input)
1110 | 
1111 |         # Return error
1112 |         return (error**2).sum()
1113 | ```
1114 | 
1115 | </p></details><br/>
1116 | 
1117 | To train the network for 1000 iterations, we can do:
1118 | 
1119 | ```Python
1120 | network = Perceptron(2, 1)
1121 | lrate = 0.1
1122 | for i in range(1000):
1123 |     lrate *= 0.999
1124 |     n = np.random.randint(samples.size)
1125 |     network.propagate_forward( samples['input'][n] )
1126 |     error = network.propagate_backward( samples['output'][n], lrate )
1127 | ```
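To check that training actually separates the two classes, the same learning rule can be sketched without the class, using plain arrays, and the fraction of correctly classified samples measured afterwards. This is only an illustration: the names `X`, `T`, `W` and `accuracy`, the fixed seed, and the bias stored as a leading column of ones are assumptions, not part of `perceptron.py`.

```python
import numpy as np

rng = np.random.default_rng(0)

# Same kind of samples as above: points in the unit square, two classes
P = rng.uniform(0.05, 0.95, (100, 2))
T = (P[:, 0] + P[:, 1] < 1).astype(float)   # 1 for "stars", 0 for "discs"

# Prepend a constant 1 to each input so that W[0] acts as the bias
X = np.c_[np.ones(len(P)), P]
W = rng.uniform(-0.5, 0.5, 3)

lrate = 0.1
for i in range(1000):
    lrate *= 0.999
    n = rng.integers(len(X))
    # Forward pass with the threshold function x > 0
    y = float(X[n] @ W > 0)
    # Perceptron rule: move the weights along the input, scaled by the error
    W += lrate * (T[n] - y) * X[n]

# Fraction of samples classified correctly by the trained weights
accuracy = np.mean((X @ W > 0) == (T == 1))
print("accuracy: %.2f" % accuracy)
```

Since the two classes are linearly separable, the accuracy should end up well above chance, although the exact value depends on the random draw.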
1128 | 
1129 | For other types of neural networks, you can have a look at https://github.com/rougier/neural-networks/.
1130 | 
1131 | 
1132 | 
1133 | 
1134 | ## ❹ – References
1135 | 
1136 | ### Books & tutorials
1137 | 
1138 | This is a curated list of resources among the plethora of books & tutorials
1139 | that exist online. Make no mistake, it is strongly biased.
1140 | 
1141 | * [From Python to Numpy](http://www.labri.fr/perso/nrougier/from-python-to-numpy/),
1142 |   Nicolas P. Rougier, 2017
1143 | * [100 Numpy Exercises](https://github.com/rougier/numpy-100),
1144 |   Nicolas P. Rougier, 2017
1145 | * [SciPy Lecture Notes](http://www.scipy-lectures.org/),
1146 |   Gaël Varoquaux, Emmanuelle Gouillart, Olav Vahtras et al., 2016
1147 | * [Elegant SciPy: The Art of Scientific Python](https://github.com/elegant-scipy/elegant-scipy),
1148 |   Juan Nunez-Iglesias, Stéfan van der Walt, Harriet Dashnow, 2016
1149 | * [Numpy Medkit](http://mentat.za.net/numpy/numpy_advanced_slides),
1150 |   Stéfan van der Walt, 2008
1151 | 
1152 | ### Archives
1153 | 
1154 | You can access all ASPP archives from https://python.g-node.org/wiki/archives
1155 | 
1156 | * **2017** (Nikiti, Greece, Juan Nunez-Iglesias):
1157 |   [exercises](https://github.com/jni/aspp2017-numpy) –  [solutions](https://github.com/jni/aspp2017-numpy-solutions)
1158 | * **2016** (Reading, United Kingdom, Stéfan van der Walt):
1159 |   [exercises](https://github.com/ASPP/2016_numpy)
1160 | * **2015** (Munich, Germany, Juan Nunez-Iglesias):
1161 |   [exercises](https://github.com/jni/aspp2015/tree/delivered) – [solutions](https://github.com/jni/aspp2015/tree/solved-in-class)
1162 | * **2014** (Split, Croatia, Stéfan van der Walt):
1163 |   [notebooks](https://python.g-node.org/python-summerschool-2014/_media/numpy_advanced.tar.bz2)
1164 | * **2013** (Zürich, Switzerland, Stéfan van der Walt):
1165 |   [slides](https://python.g-node.org/python-summerschool-2013/_media/advanced_numpy/slides/index.html) – [exercises](https://python.g-node.org/python-summerschool-2013/_media/advanced_numpy/problems.html) – [dropbox](https://www.dropbox.com/sh/4esl1ii7cac5xfa/O-CSFKKYvS/assp2013/numpy_problems)
1166 | * **2012** (Kiel, Germany, Stéfan van der Walt):
1167 |   [slides](https://python.g-node.org/python-summerschool-2012/_media/wiki/numpy/numpy_kiel2012.pdf) – [exercises](https://python.g-node.org/python-summerschool-2012/_media/wiki/numpy/problems.html)
1168 | * **2011** (St Andrews, United Kingdom, Pauli Virtanen):
1169 |   [slides](https://python.g-node.org/python-summerschool-2011/_media/materials/numpy/numpy-slides.pdf) – [exercises](https://python.g-node.org/python-summerschool-2011/_media/materials/numpy/numpy-exercises.zip) – [solutions](https://python.g-node.org/python-summerschool-2011/_media/materials/numpy/numpy-solutions.zip)
1170 | * **2010** (Trento, Italy, Stéfan van der Walt):
1171 |   [slides](https://python.g-node.org/python-autumnschool-2010/_media/materials/advanced_numpy/numpy_trento2010.pdf) – [exercises](https://python.g-node.org/python-autumnschool-2010/_media/materials/advanced_numpy/problems.html) – [solutions 1](https://python.g-node.org/python-autumnschool-2010/_media/materials/advanced_numpy/array_interface/solution.py) – [solutions 2](https://python.g-node.org/python-autumnschool-2010/_media/materials/advanced_numpy/structured_arrays/load_txt_solution.py)
1172 | * **2010** (Warsaw, Poland, Bartosz Teleńczuk):
1173 |   [slides](https://python.g-node.org/python-winterschool-2010/_media/scientific_python.pdf) – [exercises](https://python.g-node.org/python-winterschool-2010/_media/python_tools_for_science.pdf)
1174 | * **2009** (Berlin, Germany, Jens Kremkow):
1175 |   [slides](https://python.g-node.org/python-summerschool-2009/_media/numpy_scipy_matplotlib_pynn_neurotools.pdf) – [examples](https://python.g-node.org/python-summerschool-2009/_media/examples_numpy.py) – [exercises](https://python.g-node.org/python-summerschool-2009/_media/exercises_day2_numpy.py)
1176 | 


--------------------------------------------------------------------------------
/automata.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ASPP/ASPP-2018-numpy/430c43d4023b20ee581557ce874dbcacede3f56f/automata.png


--------------------------------------------------------------------------------
/automata.py:
--------------------------------------------------------------------------------
 1 | # -----------------------------------------------------------------------------
 2 | # Copyright (C) 2018  Nicolas P. Rougier
 3 | # Distributed under the terms of the BSD License.
 4 | # -----------------------------------------------------------------------------
 5 | import numpy as np
 6 | import matplotlib.pyplot as plt
 7 | from numpy.lib.stride_tricks import as_strided
 8 | 
 9 | def sliding_window(Z, size=2):
10 |     n, s = Z.shape[0], Z.strides[0]
11 |     return as_strided(Z, shape=(n-size+1, size), strides=(s, s))
12 | 
13 | # Rule 30  (see https://en.wikipedia.org/wiki/Rule_30)
14 | # 0b000: 0, 0b001: 1, 0b010: 1, 0b011: 1
15 | # 0b100: 1, 0b101: 0, 0b110: 0, 0b111: 0
16 | rule = 30 
17 | R = np.array([int(v) for v in '{0:08b}'.format(rule)])[::-1]
18 | 
19 | # Initial state
20 | Z = np.zeros((250,501), dtype=int)
21 | Z[0,250] = 1
22 | 
23 | # Computing some iterations
24 | for i in range(1, len(Z)):
25 |     N = sliding_window(Z[i-1],3) * [4,2,1]
26 |     Z[i,1:-1] = R[N.sum(axis=1)]
27 | 
28 | # Display
29 | plt.figure(figsize=(6,3))
30 | plt.subplot(1,1,1,frameon=False)
31 | plt.imshow(Z, vmin=0, vmax=1, cmap=plt.cm.gray_r)
32 | plt.xticks([]), plt.yticks([])
33 | plt.tight_layout()
34 | plt.savefig("automata.png")
35 | plt.show()
36 | 


--------------------------------------------------------------------------------
/bad-dither.py:
--------------------------------------------------------------------------------
 1 | # -----------------------------------------------------------------------------
 2 | # Copyright (C) 2018  Nicolas P. Rougier
 3 | # Distributed under the terms of the BSD License.
 4 | # -----------------------------------------------------------------------------
 5 | import imageio
 6 | import numpy as np
 7 | import scipy.spatial
 8 | 
 9 | # Number of final colors we want
10 | n = 16
11 | 
12 | # Original Image
13 | I = imageio.imread("kitten.jpg")
14 | shape = I.shape
15 | 
16 | # Flattened image
17 | I = I.reshape(shape[0]*shape[1], shape[2])
18 | 
19 | # Find the unique colors and their frequency (=counts)
20 | colors, counts = np.unique(I, axis=0, return_counts=True)
21 | 
22 | # Get the n most frequent colors
23 | order = np.argsort(counts)[::-1]
24 | C = colors[order][:n]
25 | 
26 | # Compute distance to most frequent colors
27 | D = scipy.spatial.distance.cdist(I, C, 'sqeuclidean')
28 | 
29 | # Replace colors with closest one
30 | Z = (C[D.argmin(axis=1)]).reshape(shape)
31 | 
32 | # Save result
33 | imageio.imsave("kitten-dithered.jpg", Z)
34 | 


--------------------------------------------------------------------------------
/basic-manipulation.py:
--------------------------------------------------------------------------------
 1 | # -----------------------------------------------------------------------------
 2 | # Copyright (C) 2018  Nicolas P. Rougier
 3 | # Distributed under the terms of the BSD License.
 4 | # -----------------------------------------------------------------------------
 5 | import numpy as np
 6 | 
 7 | # Create a vector with values ranging from 10 to 49
 8 | Z = np.arange(10,50)
 9 | 
10 | # Create a null vector of size 100, except for the fifth value, which is 1
11 | Z = np.zeros(100)
12 | Z[4] = 1
13 | 
14 | # Reverse a vector (first element becomes last)
15 | Z = np.arange(50)[::-1]
16 | 
17 | # Create a 3x3 matrix with values ranging from 0 to 8
18 | Z = np.arange(9).reshape(3,3)
19 | 
20 | # Create a 3x3 identity matrix
21 | Z = np.eye(3)
22 | 
23 | # Create a 2d array with 1 on the border and 0 inside
24 | Z = np.ones((10,10))
25 | Z[1:-1,1:-1] = 0
26 | 
27 | # Given a 1D array, negate in place all elements greater than 3 and at most 8
28 | Z = np.arange(11)
29 | Z[(3 < Z) & (Z <= 8)] *= -1
30 | 


--------------------------------------------------------------------------------
/data.txt:
--------------------------------------------------------------------------------
 1 | % rank         lemma (8 letters max)      frequency       dispersion
 2 | 21             they                        1865844         0.96
 3 | 42             her                         969591          0.91
 4 | 49             as                          829018          0.95
 5 | 7              to                          6332195         0.98
 6 | 63             take                        670745          0.97
 7 | 14             you                         3085642         0.92
 8 | 35             go                          1151045         0.93
 9 | 56             think                       772787          0.91
10 | 28             not                         1638883         0.98
11 | 


--------------------------------------------------------------------------------
/diffusion.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ASPP/ASPP-2018-numpy/430c43d4023b20ee581557ce874dbcacede3f56f/diffusion.png


--------------------------------------------------------------------------------
/diffusion.py:
--------------------------------------------------------------------------------
 1 | # -----------------------------------------------------------------------------
 2 | # Copyright (C) 2018  Nicolas P. Rougier
 3 | # Distributed under the terms of the BSD License.
 4 | # -----------------------------------------------------------------------------
 5 | import numpy as np
 6 | import matplotlib.pyplot as plt
 7 | 
 8 | from numpy.lib.stride_tricks import as_strided
 9 | 
10 | def sliding_window(Z, size=2):
11 |     n, s = Z.shape[0], Z.strides[0]
12 |     return as_strided(Z, shape=(n-size+1, size), strides=(s, s))
13 | 
14 | 
15 | # Initial conditions:
16 | # Domain size is 100 and we'll iterate over 50 time steps
17 | Z = np.zeros((50,100))
18 | Z[0,5::10] = 1.5
19 | 
20 | # Actual iteration
21 | F = 0.05
22 | for i in range(1, len(Z)):
23 |     Z[i,1:-1] = Z[i-1,1:-1] + F*(sliding_window(Z[i-1], 3)*[+1,-2,+1]).sum(axis=1)
24 | 
25 | # Display
26 | plt.figure(figsize=(6,3))
27 | plt.subplot(1,1,1,frameon=False)
28 | plt.imshow(Z, vmin=0, vmax=1)
29 | plt.xticks([]), plt.yticks([])
30 | plt.tight_layout()
31 | plt.savefig("diffusion.png")
32 | plt.show()
33 | 


--------------------------------------------------------------------------------
/dithered.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ASPP/ASPP-2018-numpy/430c43d4023b20ee581557ce874dbcacede3f56f/dithered.png


--------------------------------------------------------------------------------
/geometry.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ASPP/ASPP-2018-numpy/430c43d4023b20ee581557ce874dbcacede3f56f/geometry.png


--------------------------------------------------------------------------------
/geometry.py:
--------------------------------------------------------------------------------
 1 | # -----------------------------------------------------------------------------
 2 | # Copyright (C) 2018  Nicolas P. Rougier
 3 | # Distributed under the terms of the BSD License.
 4 | # -----------------------------------------------------------------------------
 5 | import numpy as np
 6 | import matplotlib.pyplot as plt
 7 | 
 8 | dtype = [("points",    float, (4, 2)),
 9 |          ("scale",     float, 1),
10 |          ("translate", float, 2),
11 |          ("rotate",    float, 1)]
12 | S = np.zeros(25, dtype = dtype)
13 | S["points"] = [(-1,-1), (-1,+1), (+1,+1), (+1,-1)]
14 | S["translate"] = (1,0)
15 | S["scale"] = 0.1
16 | S["rotate"] = np.linspace(0, 2*np.pi, len(S), endpoint=False)
17 | 
18 | P = np.zeros((len(S), 4, 2))
19 | for i in range(len(S)):
20 |     for j in range(4):
21 |         x = S[i]["points"][j,0]
22 |         y = S[i]["points"][j,1]
23 |         tx, ty = S[i]["translate"]
24 |         scale  = S[i]["scale"]
25 |         theta  = S[i]["rotate"]
26 |         x = tx + x*scale
27 |         y = ty + y*scale
28 |         x_ = x*np.cos(theta) - y*np.sin(theta)
29 |         y_ = x*np.sin(theta) + y*np.cos(theta)
30 |         P[i,j] = x_, y_
31 | 
32 | fig = plt.figure(figsize=(6,6))
33 | ax = plt.subplot(1,1,1, frameon=False)
34 | for i in range(len(P)):
35 |     X = np.r_[P[i,:,0], P[i,0,0]]
36 |     Y = np.r_[P[i,:,1], P[i,0,1]]
37 |     plt.plot(X, Y, color="black")
38 | plt.xticks([]), plt.yticks([])
39 | plt.tight_layout()
40 | plt.savefig("geometry.png")
41 | plt.show()
42 | 


--------------------------------------------------------------------------------
/imshow.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ASPP/ASPP-2018-numpy/430c43d4023b20ee581557ce874dbcacede3f56f/imshow.png


--------------------------------------------------------------------------------
/imshow.py:
--------------------------------------------------------------------------------
 1 | # Terminal visualization of 2D numpy arrays
 2 | # Copyright (c) 2009  Nicolas P. Rougier
 3 | #
 4 | # This program is free software: you can redistribute it and/or modify it under
 5 | # the terms of the GNU General Public License as published by the Free Software
 6 | # Foundation, either version 3 of the License, or (at your option) any later
 7 | # version.
 8 | #
 9 | # This program is distributed in the hope that it will be useful, but WITHOUT
10 | # ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
11 | # FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
12 | #
13 | # You should have received a copy of the GNU General Public License along with
14 | # this program.  If not, see <http://www.gnu.org/licenses/>.
15 | # ------------------------------------------------------------------------------
16 | """ Terminal visualization of 2D numpy arrays
17 |     Using extended color capability of terminal (256 colors), the imshow function
18 |     renders a 2D numpy array within terminal.
19 | """
20 | import sys
21 | import numpy as np
22 | from matplotlib.cm import viridis
23 | 
24 | 
25 | def imshow (Z, vmin=None, vmax=None, cmap=viridis, show_cmap=True):
26 |     ''' Show a 2D numpy array using terminal colors '''
27 | 
28 |     Z = np.atleast_2d(Z)
29 |     
30 |     if len(Z.shape) != 2:
31 |         print("Cannot display non 2D array")
32 |         return
33 | 
34 |     vmin = Z.min() if vmin is None else vmin
35 |     vmax = Z.max() if vmax is None else vmax
36 | 
37 |     # Build initialization string that setup terminal colors
38 |     init = ''
39 |     for i in range(240):
40 |         v = i/240 
41 |         r,g,b,a = cmap(v)
42 |         init += "\x1b]4;%d;rgb:%02x/%02x/%02x\x1b\\" % (16+i, int(r*255),int(g*255),int(b*255))
43 | 
44 |     # Build array data string
45 |     data = ''
46 |     for i in range(Z.shape[0]):
47 |         for j in range(Z.shape[1]):
48 |             c = 16 + int( ((Z[Z.shape[0]-i-1,j]-vmin) / (vmax-vmin))*239)
49 |             if (c < 16):
50 |                 c=16
51 |             elif (c > 255):
52 |                 c=255
53 |             data += "\x1b[48;5;%dm  " % c
54 |             u = vmax - (i/float(max(Z.shape[0]-1,1))) * ((vmax-vmin))
55 |         if show_cmap:
56 |             data += "\x1b[0m  "
57 |             data += "\x1b[48;5;%dm  " % (16 + (1-i/float(Z.shape[0]))*239)
58 |             data += "\x1b[0m %+.2f" % u
59 |         data += "\n"
60 | 
61 |     sys.stdout.write(init+'\n')
62 |     sys.stdout.write(data+'\n')
63 | 
64 | 
65 | if __name__ == '__main__':
66 |     def func3(x,y):
67 |         return (1- x/2 + x**5 + y**3)*np.exp(-x**2-y**2)
68 |     dx, dy = .2, .2
69 |     x = np.arange(-3.0, 3.0, dx)
70 |     y = np.arange(-3.0, 3.0, dy)
71 |     X,Y = np.meshgrid(x, y)
72 |     Z = np.array (func3(X, Y))
73 |     imshow (Z)
74 | 


--------------------------------------------------------------------------------
/input-output.py:
--------------------------------------------------------------------------------
 1 | # -----------------------------------------------------------------------------
 2 | # Copyright (C) 2018  Nicolas P. Rougier
 3 | # Distributed under the terms of the BSD License.
 4 | # -----------------------------------------------------------------------------
 5 | import numpy as np
 6 | 
 7 | # Create our own dtype
 8 | dtype = np.dtype([('rank',       'i8'),
 9 |                   ('lemma',      'S8'),
10 |                   ('frequency',  'i8'),
11 |                   ('dispersion', 'f8')])
12 | 
13 | # Load file using our own dtype
14 | data = np.loadtxt('data.txt', comments='%', dtype=dtype)
15 | 
16 | # Extract words only
17 | print(data["lemma"])
18 | 
19 | # Extract the 3rd row
20 | print(data[2])
21 | 
22 | # Print all rows with rank < 30
23 | print(data[data["rank"] < 30])
24 | 
25 | # Sort the data according to frequency 
26 | sorted = np.sort(data, order="frequency")
27 | print(sorted)
28 | 
29 | # Save unsorted and sorted array
30 | np.savez("sorted.npz", data=data, sorted=sorted)
31 | 
32 | # Load saved array
33 | out = np.load("sorted.npz")
34 | print(out["sorted"])
35 | 


--------------------------------------------------------------------------------
/kitten-dithered.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ASPP/ASPP-2018-numpy/430c43d4023b20ee581557ce874dbcacede3f56f/kitten-dithered.jpg


--------------------------------------------------------------------------------
/kitten-quantized.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ASPP/ASPP-2018-numpy/430c43d4023b20ee581557ce874dbcacede3f56f/kitten-quantized.jpg


--------------------------------------------------------------------------------
/kitten.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ASPP/ASPP-2018-numpy/430c43d4023b20ee581557ce874dbcacede3f56f/kitten.jpg


--------------------------------------------------------------------------------
/kmeans.py:
--------------------------------------------------------------------------------
  1 | # -----------------------------------------------------------------------------
  2 | # Copyright (C) 2018  Nicolas P. Rougier
  3 | # Distributed under the terms of the BSD License.
  4 | # -----------------------------------------------------------------------------
  5 | # Code by Gareth Rees, posted on Code Review Stack Exchange
  6 | # https://codereview.stackexchange.com/questions/61598/k-mean-with-numpy
  7 | 
  8 | import numpy as np
  9 | import scipy.spatial
 10 | 
 11 | 
 12 | def cluster_centroids(data, clusters, k=None):
 13 |     """Return centroids of clusters in data.
 14 | 
 15 |     data is an array of observations with shape (A, B, ...).
 16 | 
 17 |     clusters is an array of integers of shape (A,) giving the index
 18 |     (from 0 to k-1) of the cluster to which each observation belongs.
 19 |     The clusters must all be non-empty.
 20 | 
 21 |     k is the number of clusters. If omitted, it is deduced from the
 22 |     values in the clusters array.
 23 | 
 24 |     The result is an array of shape (k, B, ...) containing the
 25 |     centroid of each cluster.
 26 | 
 27 |     >>> data = np.array([[12, 10, 87],
 28 |     ...                  [ 2, 12, 33],
 29 |     ...                  [68, 31, 32],
 30 |     ...                  [88, 13, 66],
 31 |     ...                  [79, 40, 89],
 32 |     ...                  [ 1, 77, 12]])
 33 |     >>> cluster_centroids(data, np.array([1, 1, 2, 2, 0, 1]))
 34 |     array([[ 79.,  40.,  89.],
 35 |            [  5.,  33.,  44.],
 36 |            [ 78.,  22.,  49.]])
 37 | 
 38 |     """
 39 |     if k is None:
 40 |         k = np.max(clusters) + 1
 41 |     result = np.empty(shape=(k,) + data.shape[1:])
 42 |     for i in range(k):
 43 |         np.mean(data[clusters == i], axis=0, out=result[i])
 44 |     return result
 45 | 
 46 | 
 47 | def kmeans(data, k=None, centroids=None, steps=20):
 48 |     """Divide the observations in data into clusters using the k-means
 49 |     algorithm, and return the final centroids together with an array of
 50 |     integers assigning each data point to one of the clusters.
 51 | 
 52 |     centroids, if supplied, must be an array giving the initial
 53 |     position of the centroids of each cluster.
 54 | 
 55 |     If centroids is omitted, the number k gives the number of clusters
 56 |     and the initial positions of the centroids are selected randomly
 57 |     from the data.
 58 | 
 59 |     The k-means algorithm adjusts the centroids iteratively for the
 60 |     given number of steps, or until no further progress can be made.
 61 | 
 62 |     >>> data = np.array([[12, 10, 87],
 63 |     ...                  [ 2, 12, 33],
 64 |     ...                  [68, 31, 32],
 65 |     ...                  [88, 13, 66],
 66 |     ...                  [79, 40, 89],
 67 |     ...                  [ 1, 77, 12]])
 68 |     >>> np.random.seed(73)
 69 |     >>> kmeans(data, k=3)
 70 |     (array([[79., 40., 89.],
 71 |             [ 5., 33., 44.],
 72 |             [78., 22., 49.]]),    array([1, 1, 2, 2, 0, 1]))
 73 | 
 74 |     """
 75 |     if centroids is not None and k is not None:
 76 |         assert(k == len(centroids))
 77 |     elif centroids is not None:
 78 |         k = len(centroids)
 79 |     elif k is not None:
 80 |         # Forgy initialization method: choose k data points randomly.
 81 |         centroids = data[np.random.choice(np.arange(len(data)), k, False)]
 82 |     else:
 83 |         raise RuntimeError("Need a value for k or centroids.")
 84 | 
 85 |     for _ in range(max(steps, 1)):
 86 |         # Squared distances between each point and each centroid.
 87 |         sqdists = scipy.spatial.distance.cdist(centroids, data, 'sqeuclidean')
 88 | 
 89 |         # Index of the closest centroid to each data point.
 90 |         clusters = np.argmin(sqdists, axis=0)
 91 | 
 92 |         new_centroids = cluster_centroids(data, clusters, k)
 93 |         if np.array_equal(new_centroids, centroids):
 94 |             break
 95 | 
 96 |         centroids = new_centroids
 97 | 
 98 |     return centroids, clusters
 99 | 
100 | 
101 | 
102 | if __name__ == '__main__':
103 |     import imageio
104 | 
105 |     # Number of final colors we want
106 |     n = 16
107 | 
108 |     # Original Image
109 |     I = imageio.imread("kitten.jpg")
110 |     shape = I.shape
111 | 
112 |     # Flattened image
113 |     D = I.reshape(shape[0]*shape[1], shape[2])
114 |     
115 |     # Search for 16 centroids in D (using 20 iterations)
116 |     centroids, clusters = kmeans(D, k=n, steps=20)
117 | 
118 |     # Create quantized image
119 |     I = (centroids[clusters]).reshape(shape)
120 |     I = np.round(I).astype(np.uint8)
121 | 
122 |     # Save result
123 |     imageio.imsave("kitten-quantized.jpg", I)
124 | 
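
The per-cluster loop in `cluster_centroids` can also be written without a Python loop. The following is a minimal sketch and not part of the original file: the name `cluster_centroids_vectorized` is hypothetical, and it assumes every cluster index in `range(k)` has at least one member (an empty cluster would produce NaN rows).

```python
import numpy as np


def cluster_centroids_vectorized(data, clusters, k=None):
    """Hypothetical loop-free variant of cluster_centroids (illustration only)."""
    data = np.asarray(data, dtype=float)
    clusters = np.asarray(clusters)
    if k is None:
        k = clusters.max() + 1
    # Sum the points of each cluster; np.add.at accumulates repeated indices.
    sums = np.zeros((k,) + data.shape[1:])
    np.add.at(sums, clusters, data)
    # Number of points per cluster, reshaped so it broadcasts against sums.
    counts = np.bincount(clusters, minlength=k)
    return sums / counts.reshape((k,) + (1,) * (data.ndim - 1))


data = np.array([[12, 10, 87],
                 [ 2, 12, 33],
                 [68, 31, 32],
                 [88, 13, 66],
                 [79, 40, 89],
                 [ 1, 77, 12]])
centroids = cluster_centroids_vectorized(data, np.array([1, 1, 2, 2, 0, 1]))
print(centroids)
```

On the docstring data this should reproduce the centroids listed in the `cluster_centroids` example.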


--------------------------------------------------------------------------------
/moving-average.py:
--------------------------------------------------------------------------------
1 | import numpy as np
2 | import matplotlib.pyplot as plt
3 | 
4 | 


--------------------------------------------------------------------------------
/nan-arithmetics.py:
--------------------------------------------------------------------------------
 1 | # -----------------------------------------------------------------------------
 2 | # Copyright (C) 2018  Nicolas P. Rougier
 3 | # Distributed under the terms of the BSD License.
 4 | # -----------------------------------------------------------------------------
 5 | import numpy as np
 6 | 
 7 | # Result is NaN
 8 | print(0 * np.nan)
 9 | 
10 | # Result is False
11 | print(np.nan == np.nan)
12 | 
13 | # Result is False
14 | print(np.inf > np.nan)
15 | 
16 | # Result is NaN
17 | print(np.nan - np.nan)
18 | 
19 | # Result is False !!!
20 | print(0.3 == 3 * 0.1)
21 | print("0.1 really is {:0.56f}".format(0.1))
22 | 
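
The comparisons above show why `==` is unreliable with NaN and with rounded floats. A minimal sketch of the usual alternatives follows; the variable names are illustrative and not part of the original script.

```python
import math
import numpy as np

# NaN never compares equal to anything, so test for it explicitly.
is_nan = np.isnan(np.nan - np.nan)
print(is_nan)

# Compare floats with a tolerance instead of ==.
close_np = np.isclose(0.3, 3 * 0.1)
close_math = math.isclose(0.3, 3 * 0.1)
print(close_np, close_math)

# NaN-aware reductions ignore NaN entries instead of propagating them.
values = np.array([1.0, np.nan, 3.0])
plain_mean = np.mean(values)      # NaN propagates through the plain mean
nan_mean = np.nanmean(values)     # mean of the two finite values
print(plain_mean, nan_mean)
```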


--------------------------------------------------------------------------------
/original.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ASPP/ASPP-2018-numpy/430c43d4023b20ee581557ce874dbcacede3f56f/original.png


--------------------------------------------------------------------------------
/perceptron.mp4:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ASPP/ASPP-2018-numpy/430c43d4023b20ee581557ce874dbcacede3f56f/perceptron.mp4


--------------------------------------------------------------------------------
/perceptron.py:
--------------------------------------------------------------------------------
  1 | # -----------------------------------------------------------------------------
  2 | # Copyright (C) 2018  Nicolas P. Rougier
  3 | # Distributed under the terms of the BSD License.
  4 | # -----------------------------------------------------------------------------
  5 | import numpy as np
  6 | 
  7 | def f(x):
  8 |     return x > 0
  9 | 
 10 | class Perceptron:
 11 |     ''' Perceptron class. '''
 12 | 
 13 |     def __init__(self, n, m):
 14 |         ''' Initialization of the perceptron with given sizes.  '''
 15 | 
 16 |         self.input  = np.ones(n+1)
 17 |         self.output = np.ones(m)
 18 |         self.weights= np.zeros((m,n+1))
 19 |         self.reset()
 20 | 
 21 |     def reset(self):
 22 |         ''' Reset weights '''
 23 | 
 24 |         self.weights[...] = np.random.uniform(-.5, .5, self.weights.shape)
 25 | 
 26 |     def propagate_forward(self, data):
 27 |         ''' Propagate data from input layer to output layer. '''
 28 | 
 29 |         # Set input layer (but not bias)
 30 |         self.input[1:]  = data
 31 |         self.output[...] = f(np.dot(self.weights,self.input))
 32 | 
 33 |         # Return output
 34 |         return self.output
 35 | 
 36 |     def propagate_backward(self, target, lrate=0.1):
 37 |         ''' Back propagate error related to target using lrate. '''
 38 | 
 39 |         error = np.atleast_2d(target-self.output)
 40 |         input = np.atleast_2d(self.input)
 41 |         self.weights += lrate*np.dot(error.T,input)
 42 | 
 43 |         # Return error
 44 |         return (error**2).sum()
 45 | 
 46 | 
 47 | # -----------------------------------------------------------------------------
 48 | if __name__ == '__main__':
 49 |     import numpy as np
 50 |     import matplotlib.pyplot as plt
 51 |     import matplotlib.animation as animation
 52 | 
 53 |     np.random.seed(123)
 54 |     
 55 |     samples = np.zeros(100, dtype=[('input',  float, 2),
 56 |                                    ('output', float)])
 57 | 
 58 |     P = np.random.uniform(0.05,0.95,(len(samples),2))
 59 |     samples["input"] = P
 60 |     stars = np.where(P[:,0]+P[:,1] < 1)
 61 |     discs = np.where(P[:,0]+P[:,1] > 1)
 62 |     samples["output"][stars] = +1
 63 |     samples["output"][discs] = 0
 64 | 
 65 | 
 66 |     network = Perceptron(2,1)
 67 |     network.reset()
 68 |     lrate = 0.05
 69 | 
 70 |     fig = plt.figure(figsize=(6,6))
 71 |     ax = plt.subplot(1,1,1, aspect=1, frameon=False)
 72 |     ax.scatter(P[stars,0], P[stars,1], color="red", marker="*", s=50, alpha=.5)
 73 |     ax.scatter(P[discs,0], P[discs,1], color="blue", s=25, alpha=.5)
 74 |     line, = ax.plot([], [], color="black", linewidth=2)
 75 |     ax.set_xlim(0,1)
 76 |     ax.set_xticks([])
 77 |     ax.set_ylim(0,1)
 78 |     ax.set_yticks([])
 79 |     plt.tight_layout()
 80 | 
 81 |     def animate(i):
 82 |         global lrate
 83 |         error = 0
 84 | 
 85 |         count = 0
 86 |         lrate *= 0.99
 87 |         while error == 0 and count < 10:
 88 |             n = np.random.randint(samples.size)
 89 |             network.propagate_forward( samples['input'][n] )
 90 |             error = network.propagate_backward( samples['output'][n], lrate )
 91 |             count += 1
 92 | 
 93 |         c,a,b = network.weights[0]
 94 |         x0 = -2
 95 |         x1 = +2
 96 |         if b != 0:  # decision boundary: c + a*x + b*y = 0
 97 |             y0 = (-c -a*x0)/b
 98 |             y1 = (-c -a*x1)/b
 99 |         else:
100 |             y0 = 0
101 |             y1 = 1
102 |             
103 |         line.set_xdata([x0,x1])
104 |         line.set_ydata([y0,y1])
105 |         
106 |         return line,
107 | 
108 |     anim = animation.FuncAnimation(fig, animate, np.arange(1, 300))
109 |     #Writer = animation.writers['ffmpeg']
110 |     #writer = Writer(fps=30,
111 |     #                metadata=dict(artist='Nicolas P. Rougier'), bitrate=1800)
112 |     # anim.save('perceptron.mp4', writer=writer)
113 |     plt.show()
114 | 


--------------------------------------------------------------------------------
/random-walk.py:
--------------------------------------------------------------------------------
 1 | # -----------------------------------------------------------------------------
 2 | # Copyright (C) 2018  Nicolas P. Rougier
 3 | # Distributed under the terms of the BSD License.
 4 | # -----------------------------------------------------------------------------
 5 | import random
 6 | import numpy as np
 7 | from tools import timeit
 8 | 
 9 | def random_walk_slow(n):
10 |     position = 0
11 |     walk = [position]
12 |     for i in range(n):
13 |         position += 2*random.randint(0, 1)-1
14 |         walk.append(position)
15 |     return walk
16 | 
17 | 
18 | def random_walk_faster(n=1000):
19 |     from itertools import accumulate
20 |     # Only available from Python 3.6
21 |     steps = random.choices([-1,+1], k=n)
22 |     return [0]+list(accumulate(steps))
23 | 
24 | def random_walk_fastest(n=1000):
25 |     steps = np.random.choice([-1,+1], n)
26 |     return np.cumsum(steps)
27 | 
28 | 
29 | if __name__ == '__main__':
30 | 
31 |     timeit("random_walk_slow(1000)", globals())
32 |     timeit("random_walk_faster(1000)", globals())
33 |     timeit("random_walk_fastest(1000)", globals())
34 | 
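
Note that `random_walk_fastest` returns `n` positions without the starting 0, whereas the two other versions return `n+1` values. A minimal sketch of a NumPy variant that keeps the starting position is given below; the name `random_walk_numpy` and the `seed` parameter are assumptions made for illustration, and it uses the `np.random.default_rng` generator rather than the legacy `np.random.choice`.

```python
import numpy as np


def random_walk_numpy(n=1000, seed=None):
    """Hypothetical variant: n random steps of -1/+1, starting at 0."""
    rng = np.random.default_rng(seed)
    steps = rng.choice([-1, +1], size=n)
    # Position 0 is the start; positions 1..n are the cumulated steps.
    walk = np.zeros(n + 1, dtype=steps.dtype)
    walk[1:] = np.cumsum(steps)
    return walk


walk = random_walk_numpy(1000, seed=0)
print(len(walk), walk[0])
```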


--------------------------------------------------------------------------------
/reorder.py:
--------------------------------------------------------------------------------
 1 | # -----------------------------------------------------------------------------
 2 | # Copyright (C) 2018  Nicolas P. Rougier
 3 | # Distributed under the terms of the BSD License.
 4 | # -----------------------------------------------------------------------------
 5 | import struct
 6 | import numpy as np
 7 | 
 8 | # Generation of the array
 9 | # Z = range(1001, 1009)
10 | # L = np.reshape(Z, (2,2,2), order="F").ravel().astype(">i8").view(np.ubyte)
11 | 
12 | L = [  0,   0,   0,   0,   0,   0,   3, 233,
13 |        0,   0,   0,   0,   0,   0,   3, 237,
14 |        0,   0,   0,   0,   0,   0,   3, 235,
15 |        0,   0,   0,   0,   0,   0,   3, 239,
16 |        0,   0,   0,   0,   0,   0,   3, 234,
17 |        0,   0,   0,   0,   0,   0,   3, 238,
18 |        0,   0,   0,   0,   0,   0,   3, 236,
19 |        0,   0,   0,   0,   0,   0,   3, 240]
20 | 
21 | # Automatic (numpy)
22 | Z = np.reshape(np.array(L, dtype=np.ubyte).view(dtype=">i8"), (2,2,2), order="F")
23 | print(Z[1,0,0])
24 | 
25 | # Manual (brain)
26 | shape = (2,2,2)
27 | itemsize = 8
28 | # We can probably do better
29 | strides = itemsize, itemsize*shape[0], itemsize*shape[0]*shape[1]
30 | index = (1,0,0)
31 | start = sum(i*s for i,s in zip(index,strides))
32 | end = start+itemsize
33 | value = struct.unpack(">q", bytes(L[start:end]))[0]
34 | print(value)
35 | 
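
The manual offset computation can be cross-checked by handing the same strides to NumPy directly. This is a sketch under the assumption that the byte list is regenerated as described in the comment at the top of the script; the names `raw` and `Z_manual` are illustrative.

```python
import numpy as np

# Rebuild the raw bytes as described in the "Generation of the array" comment.
Z0 = np.arange(1001, 1009)
raw = np.reshape(Z0, (2, 2, 2), order="F").ravel().astype(">i8").tobytes()

# Fortran-order strides for shape (2,2,2) with 8-byte items.
itemsize = 8
strides = (itemsize, itemsize * 2, itemsize * 2 * 2)

# Let NumPy interpret the buffer with exactly these strides.
Z_manual = np.ndarray(shape=(2, 2, 2), dtype=">i8", buffer=raw, strides=strides)
print(Z_manual[1, 0, 0])
```

The printed value should match both the "automatic" and the "manual" result of the script.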


--------------------------------------------------------------------------------
/repeat.py:
--------------------------------------------------------------------------------
 1 | # -----------------------------------------------------------------------------
 2 | # Copyright (C) 2018  Nicolas P. Rougier
 3 | # Distributed under the terms of the BSD License.
 4 | # -----------------------------------------------------------------------------
 5 | import numpy as np
 6 | from numpy.lib.stride_tricks import as_strided
 7 | 
 8 | Z = np.zeros(5)
 9 | Z1 = np.tile(Z,(3,1))
10 | Z2 = as_strided(Z, shape=(3,)+Z.shape, strides=(0,)+Z.strides)
11 | 
12 | # Real repeat (three times the memory)
13 | Z1[0,0] = 1
14 | print(Z1)
15 | 
16 | # Fake repeat (but less memory)
17 | Z2[0,0] = 1
18 | print(Z2)
19 | 
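
Because the `as_strided` view has a zero stride along its first axis, the write to `Z2[0,0]` shows up in every row and also modifies `Z`. When only a read-only repeat is needed, `np.broadcast_to` gives the same zero-stride layout without the risk of accidental writes. A minimal sketch (variable names are illustrative):

```python
import numpy as np

Z = np.zeros(5)

# Read-only view with a zero stride along the new axis: no data is copied.
B = np.broadcast_to(Z, (3,) + Z.shape)
print(B.strides)
print(B.flags.writeable)
print(np.shares_memory(B, Z))

# Writing to the base array is reflected in every "row" of the view.
Z[0] = 1
print(B[:, 0])
```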


--------------------------------------------------------------------------------
/strides.py:
--------------------------------------------------------------------------------
 1 | # -----------------------------------------------------------------------------
 2 | # Copyright (C) 2018  Nicolas P. Rougier
 3 | # Distributed under the terms of the BSD License.
 4 | # -----------------------------------------------------------------------------
 5 | import numpy as np
 6 | 
 7 | def strides(Z):
 8 |     strides = [Z.itemsize]
 9 |     
10 |     # Fortran ordered array
11 |     if np.isfortran(Z):
12 |         for i in range(0, Z.ndim-1):
13 |             strides.append(strides[-1] * Z.shape[i])
14 |         return tuple(strides)
15 |     # C ordered array
16 |     else:
17 |         for i in range(Z.ndim-1, 0, -1):
18 |             strides.append(strides[-1] * Z.shape[i])
19 |         return tuple(strides[::-1])
20 | 
21 | # This works
22 | Z = np.arange(24).reshape((2,3,4), order="C")
23 | print(Z.strides, " – ", strides(Z))
24 | 
25 | Z = np.arange(24).reshape((2,3,4), order="F")
26 | print(Z.strides, " – ", strides(Z))
27 | 
28 | # This does not work
29 | # Z = Z[::2]
30 | # print(Z.strides, " – ", strides(Z))
31 | 
32 | 
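
The commented-out case fails because a sliced view keeps the memory layout of its base array, so its strides cannot be derived from its own shape and item size alone. The sketch below illustrates this with a hypothetical helper, `contiguous_strides`, which only handles C order; the dtype is fixed to `int64` so the numbers are the same on every platform.

```python
import numpy as np


def contiguous_strides(shape, itemsize):
    """Strides a freshly allocated C-ordered array of this shape would have."""
    strides = [itemsize]
    for n in reversed(shape[1:]):
        strides.append(strides[-1] * n)
    return tuple(reversed(strides))


Z = np.arange(24, dtype=np.int64).reshape(4, 6)
V = Z[::2, ::3]   # view that skips rows and columns of Z

# For the contiguous base array both agree.
print(contiguous_strides(Z.shape, Z.itemsize), Z.strides)
# For the view they differ: V still steps through Z's memory.
print(contiguous_strides(V.shape, V.itemsize), V.strides)
```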


--------------------------------------------------------------------------------
/tools.py:
--------------------------------------------------------------------------------
  1 | # -----------------------------------------------------------------------------
  2 | # Copyright (C) 2018  Nicolas P. Rougier
  3 | # Distributed under the terms of the BSD License.
  4 | # -----------------------------------------------------------------------------
  5 | from imshow import imshow
  6 | 
  7 | def sysinfo():
  8 |     import sys
  9 |     import time
 10 |     import numpy as np
 11 |     import scipy as sp
 12 |     import matplotlib
 13 | 
 14 |     print("Date:       %s" % (time.strftime("%D")))
 15 |     version = sys.version_info
 16 |     major, minor, micro = version.major, version.minor, version.micro
 17 |     print("Python:     %d.%d.%d" % (major, minor, micro))
 18 |     print("Numpy:     ", np.__version__)
 19 |     print("Scipy:     ", sp.__version__)
 20 |     print("Matplotlib:", matplotlib.__version__)
 21 | 
 22 | 
 23 | def timeit(stmt, globals=globals()):
 24 |     import numpy as np
 25 |     import timeit as _timeit
 26 | 
 27 |     print("Timing '{0}'".format(stmt))
 28 |         
 29 |     # Rough estimate of the time per call, averaged over 10 runs
 30 |     trial = _timeit.timeit(stmt, globals=globals, number=10)/10
 31 |     
 32 |     # Maximum duration
 33 |     duration = 5.0
 34 |     
 35 |     # Number of repeats
 36 |     repeat = 7
 37 |     
 38 |     # Compute rounded number of trials
 39 |     number = max(1,int(10**np.ceil(np.log((duration/repeat)/trial)/np.log(10))))
 40 |     
 41 |     # Report mean and standard deviation over all repeats
 42 |     times = _timeit.repeat(stmt, globals=globals, number=number, repeat=repeat)
 43 |     times = np.array(times)/number
 44 |     mean = np.mean(times)
 45 |     std = np.std(times)
 46 | 
 47 |     # Display results
 48 |     units = {"s":  1, "ms": 1e-3, "us": 1e-6, "ns": 1e-9}
 49 |     for key,value in units.items():
 50 |         unit, factor = key, 1/value
 51 |         if mean > value: break
 52 |     mean *= factor
 53 |     std *= factor
 54 | 
 55 |     print("%.3g %s ± %.3g %s per loop (mean ± std. dev. of %d runs, %d loops each)" %
 56 |           (mean, unit, std, unit, repeat, number))
 57 | 
 58 |     
 59 | def info(Z):
 60 |     import sys
 61 |     import numpy as np
 62 |     endianness = {'=': 'native (%s)' % sys.byteorder,
 63 |                  '<': 'little',
 64 |                  '>': 'big',
 65 |                  '|': 'not applicable'}
 66 | 
 67 |     print("------------------------------")
 68 |     print("Interface (item)")
 69 |     print("  shape:      ", Z.shape)
 70 |     print("  dtype:      ", Z.dtype)
 71 |     print("  length:     ", len(Z))
 72 |     print("  size:       ", Z.size)
 73 |     print("  endianness: ", endianness[Z.dtype.byteorder])
 74 |     if np.isfortran(Z):
 75 |         print("  order:       ☐ C  ☑ Fortran")
 76 |     else:
 77 |         print("  order:       ☑ C  ☐ Fortran")
 78 |     print("")
 79 |     print("Memory (byte)")
 80 |     print("  item size:  ", Z.itemsize)
 81 |     print("  array size: ", Z.size*Z.itemsize)
 82 |     print("  strides:    ", Z.strides)
 83 |     print("")
 84 |     print("Properties")
 85 |     if Z.flags["OWNDATA"]:
 86 |         print("  own data:    ☑ Yes  ☐ No")
 87 |     else:
 88 |         print("  own data:    ☐ Yes  ☑ No")
 89 |     if Z.flags["WRITEABLE"]:
 90 |         print("  writeable:   ☑ Yes  ☐ No")
 91 |     else:
 92 |         print("  writeable:   ☐ Yes  ☑ No")
 93 |     if np.isfortran(Z) and Z.flags["F_CONTIGUOUS"]:
 94 |         print("  contiguous:  ☑ Yes  ☐ No")
 95 |     elif not np.isfortran(Z) and Z.flags["C_CONTIGUOUS"]:
 96 |         print("  contiguous:  ☑ Yes  ☐ No")
 97 |     else:
 98 |         print("  contiguous:  ☐ Yes  ☑ No")
 99 |     if Z.flags["ALIGNED"]:
100 |         print("  aligned:     ☑ Yes  ☐ No")
101 |     else:
102 |         print("  aligned:     ☐ Yes  ☑ No")
103 |     print("------------------------------")
104 |     print()
105 | 
106 | 
107 | if __name__ == '__main__':
108 |     import numpy as np
109 |     
110 |     sysinfo()
111 | 
112 |     Z = np.arange(9).reshape(3,3)
113 |     info(Z)
114 | 
115 |     timeit("Z=np.random.uniform(0,1,1000000)", globals())
116 | 
117 |     
118 | 


--------------------------------------------------------------------------------