├── LICENSE.md
├── README.md
├── automata.png
├── automata.py
├── bad-dither.py
├── basic-manipulation.py
├── data.txt
├── diffusion.png
├── diffusion.py
├── dithered.png
├── geometry.png
├── geometry.py
├── imshow.png
├── imshow.py
├── input-output.py
├── kitten-dithered.jpg
├── kitten-quantized.jpg
├── kitten.jpg
├── kmeans.py
├── moving-average.py
├── nan-arithmetics.py
├── original.png
├── perceptron.mp4
├── perceptron.py
├── random-walk.py
├── reorder.py
├── repeat.py
├── strides.py
└── tools.py
/LICENSE.md:
--------------------------------------------------------------------------------
1 | MIT License
2 |
3 | Copyright (c) 2018 Nicolas P. Rougier
4 |
5 | Permission is hereby granted, free of charge, to any person obtaining a copy
6 | of this software and associated documentation files (the "Software"), to deal
7 | in the Software without restriction, including without limitation the rights
8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9 | copies of the Software, and to permit persons to whom the Software is
10 | furnished to do so, subject to the following conditions:
11 |
12 | The above copyright notice and this permission notice shall be included in all
13 | copies or substantial portions of the Software.
14 |
15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21 | SOFTWARE.
22 |
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # Advanced NumPy
2 |
3 | A 3h00 course on advanced numpy techniques
4 | [Nicolas P. Rougier](http://www.labri.fr/perso/nrougier),
5 | [G-Node summer school](https://python.g-node.org/),
6 | Camerino, Italy, 2018
7 |
8 |
9 | > NumPy is a library for the Python programming language, adding support for
10 | > large, multi-dimensional arrays and matrices, along with a large collection
11 | > of high-level mathematical functions to operate on these arrays.
12 | >
13 | > – Wikipedia
14 |
15 |
16 | **Quicklinks**:
17 | [Numpy website](https://www.numpy.org) –
18 | [Numpy GitHub](https://github.com/numpy/numpy) –
19 | [Numpy documentation](https://www.numpy.org/devdocs/reference/) –
20 | [ASPP archives](https://python.g-node.org/wiki/archives) –
21 | [100 Numpy Exercises](https://github.com/rougier/numpy-100) –
22 | [From Python to Numpy](http://www.labri.fr/perso/nrougier/from-python-to-numpy/)
23 |
24 | #### Table of Contents
25 |
26 | * [Introduction](#--introduction)
27 | * [Warmup](#--warmup)
28 | * [Advanced exercises](#--advanced-exercises)
29 | * [References](#--references)
30 |
31 | ---
32 |
33 | ## ❶ – Introduction
34 |
35 | NumPy is all about vectorization. If you are familiar with Python, this is the
36 | main difficulty you'll face because you'll need to change your way of thinking
37 | and your new friends (among others) are named "vectors", "arrays", "views" or
38 | "ufuncs". Let's take a very simple example: random walk.
39 |
40 | One obvious way to write a random walk in Python is:
41 |
42 | ```Python
43 | def random_walk_slow(n):
44 |     position = 0
45 |     walk = [position]
46 |     for i in range(n):
47 |         position += 2*random.randint(0, 1)-1
48 |         walk.append(position)
49 |     return walk
50 | walk = random_walk_slow(1000)
51 | ```
52 |
53 |
54 | It works, but it is slow. We can do better using the itertools Python module,
55 | which offers a set of functions for creating iterators for efficient looping. If
56 | we observe that a random walk is an accumulation of steps, we can rewrite the
57 | function by first generating all the steps and then accumulating them without
58 | any explicit loop:
59 |
60 | ```Python
61 | def random_walk_faster(n=1000):
62 |     from itertools import accumulate
63 |     # random.choices is only available from Python 3.6
64 |     steps = random.choices([-1,+1], k=n)
65 |     return [0]+list(accumulate(steps))
66 | walk = random_walk_faster(1000)
67 | ```
68 |
69 | It is better but still, it is slow. A more efficient implementation, taking
70 | full advantage of NumPy, can be written as:
71 |
72 | ```Python
73 | def random_walk_fastest(n=1000):
74 |     steps = np.random.choice([-1,+1], n)
75 |     return np.cumsum(steps)
76 | walk = random_walk_fastest(1000)
77 | ```
78 |
79 | Now it is amazingly fast, more than 50 times faster than the first version:
80 |
81 | ```Pycon
82 | >>> timeit("random_walk_slow(1000)", globals())
83 | Timing 'random_walk_slow(1000)'
84 | 1.58 ms ± 0.0228 ms per loop (mean ± std. dev. of 7 runs, 1000 loops each)
85 |
86 | >>> timeit("random_walk_faster(1000)", globals())
87 | Timing 'random_walk_faster(1000)'
88 | 281 us ± 3.15 us per loop (mean ± std. dev. of 7 runs, 10000 loops each)
89 |
90 | >>> timeit("random_walk_fastest(1000)", globals())
91 | Timing 'random_walk_fastest(1000)'
92 | 27.6 us ± 3.45 us per loop (mean ± std. dev. of 7 runs, 1000 loops each)
93 | ```
94 |
95 | **Warning**: You may have noticed (or not) that `random_walk_fastest` works
96 | but is not reproducible at all, which is pretty annoying in science. If you
97 | want to know why, you can have a look at the article [Re-run, Repeat,
98 | Reproduce, Reuse, Replicate: Transforming Code into Scientific
99 | Contributions](https://www.frontiersin.org/articles/10.3389/fninf.2017.00069/full)
100 | (that I wrote with [Fabien Benureau](https://github.com/benureau)).
101 |
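One common ingredient of reproducibility is an explicit seed. The following is a minimal sketch, assuming NumPy 1.17 or later for `np.random.default_rng`; the function name `random_walk_seeded` and its `seed` parameter are illustrative and not part of the course material:

```Python
import numpy as np

def random_walk_seeded(n=1000, seed=None):
    # Hypothetical variant of random_walk_fastest: a dedicated generator
    # created from an explicit seed makes the sequence of steps repeatable.
    rng = np.random.default_rng(seed)
    steps = rng.choice([-1, +1], n)
    return np.cumsum(steps)

walk_a = random_walk_seeded(1000, seed=42)
walk_b = random_walk_seeded(1000, seed=42)

# Same seed, same walk
print(np.array_equal(walk_a, walk_b))
```

Passing the seed as an argument (rather than seeding a global state once) keeps each call independent of whatever random numbers were drawn before it.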
102 |
103 |
104 | One last point before heading to the course: I would like to warn you about a
105 | potential problem you may encounter once you have become familiar enough
106 | with NumPy. It is a very powerful library and you can do wonders with it but,
107 | most of the time, this comes at the price of readability. If you don't comment
108 | your code at the time of writing, you won't be able to tell what a function is
109 | doing after a few weeks (or possibly days). For example, can you tell what the
110 | two functions below are doing?
111 |
112 | ```Python
113 | def function_1(seq, sub):
114 |     return [i for i in range(len(seq) - len(sub) + 1) if seq[i:i+len(sub)] == sub]
115 |
116 | def function_2(seq, sub):
117 |     target = np.dot(sub, sub)
118 |     candidates = np.where(np.correlate(seq, sub, mode='valid') == target)[0]
119 |     check = candidates[:, np.newaxis] + np.arange(len(sub))
120 |     mask = np.all((np.take(seq, check) == sub), axis=-1)
121 |     return candidates[mask]
122 | ```
123 |
124 | As you may have guessed, the second function is the
125 | vectorized-optimized-faster-NumPy version of the first function and it runs 10x
126 | faster than the pure Python version. But it is hardly readable.
127 |
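As a reading aid, here is a commented sketch of the same two ideas applied to a small example. The names `find_python` and `find_numpy` and the sample data are illustrative; the logic follows the two functions above, assuming their purpose is to find every position where `sub` occurs in `seq`:

```Python
import numpy as np

def find_python(seq, sub):
    # Pure Python: test every possible start position.
    n = len(sub)
    return [i for i in range(len(seq) - n + 1) if seq[i:i+n] == sub]

def find_numpy(seq, sub):
    # The correlation equals dot(sub, sub) wherever sub occurs. This is a
    # necessary condition only, so the candidates are verified afterwards.
    target = np.dot(sub, sub)
    candidates = np.where(np.correlate(seq, sub, mode='valid') == target)[0]
    # One row of indices per candidate window
    check = candidates[:, np.newaxis] + np.arange(len(sub))
    mask = np.all(np.take(seq, check) == sub, axis=-1)
    return candidates[mask]

seq = [1, 2, 3, 1, 2, 3, 4, 2, 3]
sub = [2, 3]
print(find_python(seq, sub))          # [1, 4, 7]
print(find_numpy(seq, sub).tolist())  # [1, 4, 7]
```

A few comments like these are usually enough to make the vectorized version understandable a few weeks later.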
128 |
129 |
130 | ## ❷ – Warmup
131 |
132 | You're supposed to be already familiar with NumPy. If not, you should read the
133 | [NumPy chapter](http://www.scipy-lectures.org/intro/numpy/index.html) from the [SciPy Lecture Notes](http://www.scipy-lectures.org/). Before heading to the more advanced
134 | stuff, let's do some warmup exercises (that should pose no problem). If you
135 | choke on the first exercise, you should try to have a look at the [Anatomy of an array](https://www.labri.fr/perso/nrougier/from-python-to-numpy/#anatomy-of-an-array) and check also the [Quick references](https://www.labri.fr/perso/nrougier/from-python-to-numpy/#quick-references).
136 |
137 | ### Useful tools
138 |
139 | Before heading to the exercises, we'll write a `sysinfo` and an `info` function that will help us debug our code.
140 |
141 | The `sysinfo` function displays some information related to your scientific
142 | environment:
143 |
144 | ```Pycon
145 | >>> import tools
146 | >>> tools.sysinfo()
147 | Date: 08/25/18
148 | Python: 3.7.0
149 | Numpy: 1.14.5
150 | Scipy: 1.1.0
151 | Matplotlib: 2.2.2
152 | ```
153 |
154 | While the `info` function displays a lot of information for a specific array:
155 |
156 | ```Pycon
157 | >>> import tools
158 | >>> Z = np.arange(9).reshape(3,3)
159 | >>> tools.info(Z)
160 | ------------------------------
161 | Interface (item)
162 | shape: (3,3)
163 | dtype: int64
164 | length: 3
165 | size: 9
166 | endianess: native (little)
167 | order: ☑ C ☐ Fortran
168 |
169 | Memory (byte)
170 | item size: 8
171 | array size: 72
172 | strides: (24, 8)
173 |
174 | Properties
175 | own data: ☑ Yes ☐ No
176 | writeable: ☑ Yes ☐ No
177 | contiguous: ☑ Yes ☐ No
178 | aligned: ☑ Yes ☐ No
179 | ------------------------------
180 | ```
181 |
182 |
183 | Try to code these two functions. You can then compare your implementation with
184 | [mine](tools.py).
185 | **NOTE**: We don't care so much about the formatting, do not lose time trying
186 | to copy it exactly.
187 |
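If you are unsure where to find the values shown above, the sketch below lists the array attributes that seem to correspond to most of them. It is only a starting point under that assumption, not the implementation found in tools.py, and the explicit `int64` dtype is chosen here so that the sizes are the same on every platform:

```Python
import numpy as np

Z = np.arange(9, dtype=np.int64).reshape(3, 3)

# Interface (counted in items)
print("shape:     ", Z.shape)
print("dtype:     ", Z.dtype)
print("length:    ", len(Z))
print("size:      ", Z.size)

# Memory (counted in bytes)
print("item size: ", Z.itemsize)
print("array size:", Z.nbytes)
print("strides:   ", Z.strides)

# Properties, all available through Z.flags
print("own data:  ", Z.flags.owndata)
print("writeable: ", Z.flags.writeable)
print("contiguous:", Z.flags.c_contiguous or Z.flags.f_contiguous)
print("aligned:   ", Z.flags.aligned)
```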
188 |
189 | The [tools.py](tools.py) script comes with two other functions that might be
190 | useful. The `timeit` function lets you time a piece of code precisely (e.g. to
191 | find out which of several implementations is the fastest). It is pretty similar
192 | to the `%timeit` magic function from IPython:
193 |
194 | ```Pycon
195 | >>> import tools
196 | >>> tools.timeit("Z=np.random.uniform(0,1,1000000)", globals())
197 | Measuring time for 'Z=np.random.uniform(0,1,1000000)'
198 | 11.4 ms ± 0.198 ms per loop (mean ± std. dev. of 7 runs, 100 loops each)
199 | ```
200 |
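If you want something similar without tools.py, the standard library `timeit` module can be wrapped in a few lines. This is a minimal sketch; the helper name `time_statement` and its default `repeat` and `number` values are assumptions of this example, not the behaviour of `tools.timeit`:

```Python
import timeit
import numpy as np

def time_statement(stmt, namespace, repeat=3, number=10):
    # Hypothetical helper: run `stmt` `number` times, `repeat` times over,
    # and report the mean and standard deviation of the time per loop.
    times = timeit.repeat(stmt, globals=namespace, repeat=repeat, number=number)
    per_loop = np.array(times) / number
    return per_loop.mean(), per_loop.std()

mean, std = time_statement("np.random.uniform(0, 1, 1000)", globals())
print("{:.3g} s ± {:.3g} s per loop".format(mean, std))
```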
201 | The `imshow` function displays a one-dimensional or two-dimensional array in
202 | the console. It won't replace matplotlib but it can come in handy for some
203 | (small) arrays (you'll need a 256-color terminal):
204 |
205 | 
206 |
207 |
208 | ### Basic manipulation
209 |
210 | Let's start with some basic operations:
211 |
212 | • Create a vector with values ranging from 10 to 49
213 | • Create a null vector of size 100, except for the fifth value, which is 1
214 | • Reverse a vector (first element becomes last)
215 | • Create a 3x3 matrix with values ranging from 0 to 8
216 | • Create a 3x3 identity matrix
217 | • Create a 2d array with 1 on the border and 0 inside
218 | • Given a 1D array, negate all elements which are between 3 and 8, in place
219 |
220 | For a more complete list, you can have a look at the [100 Numpy Exercises](https://github.com/rougier/numpy-100).
221 |
222 | Solution (click to expand)
223 |
224 | Sources: [basic-manipulation.py](basic-manipulation.py)
225 |
226 | ```Python
227 | import numpy as np
228 |
229 | # Create a vector with values ranging from 10 to 49
230 | Z = np.arange(10,50)
231 |
232 | # Create a null vector of size 100 but the fifth value which is 1
233 | Z = np.zeros(100)
234 | Z[4] = 1
235 |
236 | # Reverse a vector (first element becomes last)
237 | Z = np.arange(50)[::-1]
238 |
239 | # Create a 3x3 matrix with values ranging from 0 to 8
240 | Z = np.arange(9).reshape(3,3)
241 |
242 | # Create a 3x3 identity matrix
243 | Z = np.eye(3)
244 |
245 | # Create a 2d array with 1 on the border and 0 inside
246 | Z = np.ones((10,10))
247 | Z[1:-1,1:-1] = 0
248 |
249 | # Given a 1D array, negate all elements which are between 3 and 8, in place
250 | Z = np.arange(11)
251 | Z[(3 < Z) & (Z <= 8)] *= -1
252 | ```
253 |
254 |
255 |
256 |
257 | ### NaN arithmetics
258 |
259 | Just a reminder on NaN arithmetics:
260 |
261 | What is the result of each of the following expressions?
262 | **→ Hints**: [What Every Computer Scientist Should Know About Floating-Point Arithmetic, D. Goldberg, 1991](https://docs.oracle.com/cd/E19957-01/806-3568/ncg_goldberg.html)
263 |
264 |
265 | ```Python
266 | print(0 * np.nan)
267 | print(np.nan == np.nan)
268 | print(np.inf > np.nan)
269 | print(np.nan - np.nan)
270 | print(0.3 == 3 * 0.1)
271 | ```
272 |
273 |
274 | Solution (click to expand)
275 |
276 | Sources [nan-arithmetics.py](nan-arithmetics.py)
277 |
278 | ```Python
279 | import numpy as np
280 |
281 | # Result is NaN
282 | print(0 * np.nan)
283 |
284 | # Result is False
285 | print(np.nan == np.nan)
286 |
287 | # Result is False
288 | print(np.inf > np.nan)
289 |
290 | # Result is NaN
291 | print(np.nan - np.nan)
292 |
293 | # Result is False !!!
294 | print(0.3 == 3 * 0.1)
295 | print("0.1 really is {:0.56f}".format(0.1))
296 | ```
297 |
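Because NaN never compares equal to anything and floating-point results carry rounding errors, direct `==` tests are rarely what you want. A short sketch of the usual NumPy alternatives (the sample array is illustrative):

```Python
import numpy as np

Z = np.array([1.0, np.nan, 3.0])

# NaN is not equal to itself, so use np.isnan to locate missing values.
print(np.isnan(Z))               # [False  True False]

# Floating-point results are better compared with a tolerance.
print(np.isclose(0.3, 3 * 0.1))  # True

# NaN-aware reductions ignore NaN entries instead of propagating them.
print(np.sum(Z), np.nansum(Z))   # nan 4.0
```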
298 |
299 |
300 |
301 |
302 | ### Computing strides
303 |
304 | Consider an array Z. How would you compute its strides manually?
305 | **→ Hints**:
306 | [itemsize](https://docs.scipy.org/doc/numpy/reference/generated/numpy.ndarray.itemsize.html) –
307 | [shape](https://docs.scipy.org/doc/numpy/reference/generated/numpy.ndarray.shape.html) –
308 | [ndim](https://docs.scipy.org/doc/numpy/reference/generated/numpy.ndarray.ndim.html)
309 |
310 |
311 | ```Python
312 | import numpy as np
313 | Z = np.arange(24).reshape(2,3,4)
314 | print(Z.strides)
315 | ```
316 |
317 | Solution (click to expand)
318 |
319 | Sources [strides.py](strides.py)
320 |
321 | ```Python
322 | import numpy as np
323 |
324 | def strides(Z):
325 |     strides = [Z.itemsize]
326 |
327 |     # Fortran ordered array
328 |     if np.isfortran(Z):
329 |         for i in range(0, Z.ndim-1):
330 |             strides.append(strides[-1] * Z.shape[i])
331 |         return tuple(strides)
332 |     # C ordered array
333 |     else:
334 |         for i in range(Z.ndim-1, 0, -1):
335 |             strides.append(strides[-1] * Z.shape[i])
336 |         return tuple(strides[::-1])
337 |
338 | # This works
339 | Z = np.arange(24).reshape((2,3,4), order="C")
340 | print(Z.strides, " – ", strides(Z))
341 |
342 | Z = np.arange(24).reshape((2,3,4), order="F")
343 | print(Z.strides, " – ", strides(Z))
344 |
345 | # This does not work (the sliced view is no longer contiguous)
346 | Z = Z[::2]
347 | print(Z.strides, " – ", strides(Z))
348 | ```
349 |
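The last case fails because a sliced view keeps pointing into its parent's memory, so its strides cannot be derived from its own shape and item size alone. A small sketch of this, using an explicit `int64` dtype so the byte counts are the same on every platform:

```Python
import numpy as np

Z = np.arange(24, dtype=np.int64).reshape(2, 3, 4)
V = Z[:, ::2]   # keep every other row of each 3x4 block

# The view skips elements of Z, so its strides reflect the parent's layout.
print(Z.strides)              # (96, 32, 8)
print(V.shape, V.strides)     # (2, 2, 4) (96, 64, 8)
print(V.flags.c_contiguous)   # False

# A contiguous copy has strides that follow from shape and itemsize again.
C = np.ascontiguousarray(V)
print(C.strides)              # (64, 32, 8)
```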
350 |
351 |
352 | ### Repeat and repeat
353 |
354 | Can you tell the difference?
355 | **→ Hints**:
356 | [tile](https://docs.scipy.org/doc/numpy/reference/generated/numpy.tile.html) –
357 | [as_strided](https://docs.scipy.org/doc/numpy/reference/generated/numpy.lib.stride_tricks.as_strided.html)
358 |
359 |
360 | ```Python
361 | import numpy as np
362 | from numpy.lib.stride_tricks import as_strided
363 |
364 | Z = np.random.randint(0,10,5)
365 | Z1 = np.tile(Z, (3,1))
366 | Z2 = as_strided(Z, shape=(3,)+Z.shape, strides=(0,)+Z.strides)
367 | ```
368 |
369 | Solution (click to expand)
370 |
371 | Sources [repeat.py](repeat.py)
372 |
373 | ```Python
374 | import numpy as np
375 | from numpy.lib.stride_tricks import as_strided
376 |
377 | Z = np.zeros(5)
378 | Z1 = np.tile(Z,(3,1))
379 | Z2 = as_strided(Z, shape=(3,)+Z.shape, strides=(0,)+Z.strides)
380 |
381 | # Real repeat: three times the memory
382 | Z1[0,0] = 1
383 | print(Z1)
384 |
385 | # Fake repeat: less memory but not totally equivalent
386 | Z2[0,0] = 1
387 | print(Z2)
388 | ```
389 |
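The difference can also be checked directly with `np.shares_memory`. The sketch below additionally shows `np.broadcast_to`, which is not used in the exercise but builds the same zero-stride view and returns it read-only, which avoids accidental writes:

```Python
import numpy as np
from numpy.lib.stride_tricks import as_strided

Z = np.zeros(5)
Z1 = np.tile(Z, (3, 1))
Z2 = as_strided(Z, shape=(3,) + Z.shape, strides=(0,) + Z.strides)
Z3 = np.broadcast_to(Z, (3,) + Z.shape)

print(np.shares_memory(Z, Z1))   # False: tile copies the data
print(np.shares_memory(Z, Z2))   # True: the strided view aliases Z
print(Z3.strides)                # (0, 8): zero stride along the repeated axis
print(Z3.flags.writeable)        # False

# Writing through the zero-stride view changes every "row" and Z itself.
Z2[0, 0] = 1
print(Z2[:, 0], Z[0])            # [1. 1. 1.] 1.0
```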
390 |
391 |
392 |
393 | ### Reordering things
394 |
395 | Let's consider the following list:
396 |
397 | ```Python
398 | L = [ 0, 0, 0, 0, 0, 0, 3, 233,
399 | 0, 0, 0, 0, 0, 0, 3, 237,
400 | 0, 0, 0, 0, 0, 0, 3, 235,
401 | 0, 0, 0, 0, 0, 0, 3, 239,
402 | 0, 0, 0, 0, 0, 0, 3, 234,
403 | 0, 0, 0, 0, 0, 0, 3, 238,
404 | 0, 0, 0, 0, 0, 0, 3, 236,
405 | 0, 0, 0, 0, 0, 0, 3, 240]
406 | ```
407 |
408 | This is actually the byte dump of a 2x2x2 Fortran-ordered array of 64-bit
409 | integers using big-endian encoding.
410 |
411 | How would you access the element at [1,0,0] with NumPy (simple)?
412 |
413 | Solution (click to expand)
414 |
415 | ```Python
416 |
417 | import struct
418 | import numpy as np
419 |
420 | # Generation of the array
421 | # Z = range(1001, 1009)
422 | # L = np.reshape(Z, (2,2,2), order="F").ravel().astype(">i8").view(np.ubyte)
423 |
424 | L = [ 0, 0, 0, 0, 0, 0, 3, 233,
425 | 0, 0, 0, 0, 0, 0, 3, 237,
426 | 0, 0, 0, 0, 0, 0, 3, 235,
427 | 0, 0, 0, 0, 0, 0, 3, 239,
428 | 0, 0, 0, 0, 0, 0, 3, 234,
429 | 0, 0, 0, 0, 0, 0, 3, 238,
430 | 0, 0, 0, 0, 0, 0, 3, 236,
431 | 0, 0, 0, 0, 0, 0, 3, 240]
432 |
433 | # Automatic (numpy)
434 | Z = np.reshape(np.array(L, dtype=np.ubyte).view(dtype=">i8"), (2,2,2), order="F")
435 | print(Z[1,0,0])
436 | ```
437 |
438 |
439 |
440 | How would you access the element at [1,0,0] without NumPy (harder)?
441 | **→ Hints**: Use your brain!
442 |
443 |
444 | Solution (click to expand)
445 |
446 | Sources [reorder.py](reorder.py)
447 |
448 | ```Python
449 |
450 | import struct
451 | import numpy as np
452 |
453 | # Generation of the array
454 | # Z = range(1001, 1009)
455 | # L = np.reshape(Z, (2,2,2), order="F").ravel().astype(">i8").view(np.ubyte)
456 |
457 | L = [ 0, 0, 0, 0, 0, 0, 3, 233,
458 | 0, 0, 0, 0, 0, 0, 3, 237,
459 | 0, 0, 0, 0, 0, 0, 3, 235,
460 | 0, 0, 0, 0, 0, 0, 3, 239,
461 | 0, 0, 0, 0, 0, 0, 3, 234,
462 | 0, 0, 0, 0, 0, 0, 3, 238,
463 | 0, 0, 0, 0, 0, 0, 3, 236,
464 | 0, 0, 0, 0, 0, 0, 3, 240]
465 |
466 | # Automatic (numpy)
467 | Z = np.reshape(np.array(L, dtype=np.ubyte).view(dtype=">i8"), (2,2,2), order="F")
468 | print(Z[1,0,0])
469 |
470 | # Manual (brain)
471 | shape = (2,2,2)
472 | itemsize = 8
473 | # We can probably do better
474 | strides = itemsize, itemsize*shape[0], itemsize*shape[0]*shape[1]
475 | index = (1,0,0)
476 | start = sum(i*s for i,s in zip(index,strides))
477 | end = start+itemsize
478 | value = struct.unpack(">Q", bytes(L[start:end]))[0]
479 | print(value)
480 | ```
481 |
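The manual approach generalizes to any shape and index. The following sketch wraps it in two helpers; the names `fortran_strides` and `read_item` are hypothetical, and the signed format `">q"` is chosen here to match the `>i8` dtype used above:

```Python
import struct

def fortran_strides(shape, itemsize):
    # In Fortran order the first axis varies fastest: each stride is the
    # previous one multiplied by the previous dimension.
    strides = [itemsize]
    for n in shape[:-1]:
        strides.append(strides[-1] * n)
    return tuple(strides)

def read_item(raw, index, shape, itemsize=8, fmt=">q"):
    # Byte offset of the item, then decode one big-endian 64-bit integer.
    strides = fortran_strides(shape, itemsize)
    start = sum(i * s for i, s in zip(index, strides))
    return struct.unpack(fmt, bytes(raw[start:start + itemsize]))[0]

L = [0, 0, 0, 0, 0, 0, 3, 233,
     0, 0, 0, 0, 0, 0, 3, 237,
     0, 0, 0, 0, 0, 0, 3, 235,
     0, 0, 0, 0, 0, 0, 3, 239,
     0, 0, 0, 0, 0, 0, 3, 234,
     0, 0, 0, 0, 0, 0, 3, 238,
     0, 0, 0, 0, 0, 0, 3, 236,
     0, 0, 0, 0, 0, 0, 3, 240]

print(fortran_strides((2, 2, 2), 8))       # (8, 16, 32)
print(read_item(L, (1, 0, 0), (2, 2, 2)))  # 1005
```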
482 |
483 |
484 |
485 | ### Heat equation
486 |
487 |
488 | > The diffusion equation (a.k.a the heat equation) reads `∂u/∂t = α∂²u/∂x²` where
489 | > u(x,t) is the unknown function to be solved, x is a coordinate in space, and t
490 | > is time. The coefficient α is the diffusion coefficient and determines how fast
491 | > u changes in time. The discrete (time(n) and space (i)) version of the equation
492 | > can be rewritten as `u(i,n+1) = u(i,n) + F(u(i-1,n) - 2u(i,n) + u(i+1,n))`.
493 | >
494 | > – [Finite difference methods for diffusion processes](http://hplgit.github.io/num-methods-for-PDEs/doc/pub/diffu/sphinx/._main_diffu000.html), Hans Petter Langtangen
495 |
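To make the discrete equation concrete, here is one explicit time step written with plain slicing (not the `as_strided` approach this exercise asks for). The grid size, the initial spike and the value `F = 0.05` are illustrative assumptions:

```Python
import numpy as np

F = 0.05
u = np.zeros(11)
u[5] = 1.0

# One time step on the interior points:
# u(i,n+1) = u(i,n) + F*(u(i-1,n) - 2u(i,n) + u(i+1,n))
u_next = u.copy()
u_next[1:-1] = u[1:-1] + F * (u[:-2] - 2 * u[1:-1] + u[2:])

print(u_next[4:7])   # [0.05 0.9  0.05]
```

The spike loses a fraction `2F` of its value and each neighbour gains `F`, so the total quantity is conserved away from the borders.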
496 | The goal here is to compute the discrete equation over a finite domain using
497 | `as_strided` to produce a sliding-window view of a 1D array. This view can
498 | then be used to compute `U` at the next iteration. Use the following initial
499 | conditions (with Z instead of U):
500 |
501 |
502 | ```Python
503 | Z = np.random.uniform(0.00, 0.05, (50,100))
504 | Z[0,5::10] = 1
505 | ```
506 |
507 | Try to obtain this picture (where time goes from top to bottom):
508 |
509 | 
510 |
511 |
512 | The code to display the figure from an array Z is:
513 |
514 | ```Python
515 | import matplotlib.pyplot as plt
516 |
517 | plt.figure(figsize=(6,3))
518 | plt.subplot(1,1,1,frameon=False)
519 | plt.imshow(Z, vmin=0, vmax=1)
520 | plt.xticks([]), plt.yticks([])
521 | plt.tight_layout()
522 | plt.show()
523 | ```
524 |
525 | **Hint**: You will need to write a `sliding_window(Z, size=3)` function that returns
526 | a strided view of Z.
527 |
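To check your own `sliding_window` function, you can compare it with `sliding_window_view`, which is assumed here to be available (it was added in NumPy 1.20, so it did not exist when this course was written). It returns a read-only strided view:

```Python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

Z = np.arange(6)
W = sliding_window_view(Z, 3)

print(W.shape)                  # (4, 3)
print(W[0], W[-1])              # [0 1 2] [3 4 5]
print(np.shares_memory(Z, W))   # True: no data is copied
```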
528 | Solution (click to expand)
529 |
530 | Sources [diffusion.py](diffusion.py)
531 |
532 | ```Python
533 | import numpy as np
534 | import matplotlib.pyplot as plt
535 | from numpy.lib.stride_tricks import as_strided
536 |
537 |
538 | def sliding_window(Z, size=2):
539 |     n, s = Z.shape[0], Z.strides[0]
540 |     return as_strided(Z, shape=(n-size+1, size), strides=(s, s))
541 |
542 |
543 | # Initial conditions:
544 | # Domain size is 100 and we'll iterate over 50 time steps
545 | Z = np.zeros((50,100))
546 | Z[0,5::10] = 1.5
547 |
548 | # Actual iteration
549 | F = 0.05
550 | for i in range(1, len(Z)):
551 |     Z[i,1:-1] = Z[i-1,1:-1] + F*(sliding_window(Z[i-1], 3)*[+1,-2,+1]).sum(axis=1)
552 |
553 | # Display
554 | plt.figure(figsize=(6,3))
555 | plt.subplot(1,1,1,frameon=False)
556 | plt.imshow(Z, vmin=0, vmax=1)
557 | plt.xticks([]), plt.yticks([])
558 | plt.tight_layout()
559 | plt.savefig("diffusion.png")
560 | plt.show()
561 | ```
562 |
563 |
564 |
565 |
566 | ### Rule 30
567 |
568 | With only a slight modification of the previous exercise, we can compute a
569 | one-dimensional [cellular automaton](https://en.wikipedia.org/wiki/Cellular_automaton), and more specifically [Rule 30](https://en.wikipedia.org/wiki/Rule_30), which
570 | exhibits intriguing patterns as shown below:
571 |
572 | 
573 |
574 | To start with, here is how to convert the rule into a usable form:
575 |
576 | ```Python
577 | rule = 30
578 | R = np.array([int(v) for v in '{0:08b}'.format(rule)])[::-1]
579 | ```
580 |
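The array `R` acts as a lookup table: a neighbourhood of three cells is read as a 3-bit number, and that number indexes `R`. In the sketch below the left cell is taken as the most significant bit, as in the Wikipedia table; this weighting is an assumption of the example, and using the opposite weights yields the mirror image of the pattern:

```Python
import numpy as np

rule = 30
R = np.array([int(v) for v in '{0:08b}'.format(rule)])[::-1]
print(R)          # [0 1 1 1 1 0 0 0]

# A few (left, center, right) neighbourhoods and their 3-bit index
neighbourhoods = np.array([[0, 0, 0], [0, 0, 1], [0, 1, 0], [1, 1, 1]])
index = (neighbourhoods * [4, 2, 1]).sum(axis=1)
print(index)      # [0 1 2 7]

# Next state of the center cell for each neighbourhood
print(R[index])   # [0 1 1 0]
```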
581 | and we consider this initial state:
582 |
583 | ```Python
584 | Z = np.zeros((250,501), dtype=int)
585 | Z[0,250] = 1
586 | ```
587 |
588 | Try to obtain the same figure. Display code is:
589 |
590 | ```Python
591 | plt.figure(figsize=(6,3))
592 | plt.subplot(1,1,1,frameon=False)
593 | plt.imshow(Z, vmin=0, vmax=1, cmap=plt.cm.gray_r)
594 | plt.xticks([]), plt.yticks([])
595 | plt.tight_layout()
596 | plt.savefig("automata.png")
597 | plt.show()
598 | ```
599 |
600 |
601 | Solution (click to expand)
602 |
603 | Sources [automata.py](automata.py)
604 |
605 | ```Python
606 | import numpy as np
607 | import matplotlib.pyplot as plt
608 | from numpy.lib.stride_tricks import as_strided
609 |
610 | def sliding_window(Z, size=2):
611 |     n, s = Z.shape[0], Z.strides[0]
612 |     return as_strided(Z, shape=(n-size+1, size), strides=(s, s))
613 |
614 | # Rule 30 (see https://en.wikipedia.org/wiki/Rule_30)
615 | # 0b000: 0, 0b001: 1, 0b010: 1, 0b011: 1
616 | # 0b100: 1, 0b101: 0, 0b110: 0, 0b111: 0
617 | rule = 30
618 | R = np.array([int(v) for v in '{0:08b}'.format(rule)])[::-1]
619 |
620 | # Initial state
621 | Z = np.zeros((250,501), dtype=int)
622 | Z[0,250] = 1
623 |
624 | # Computing some iterations
625 | for i in range(1, len(Z)):
626 |     N = sliding_window(Z[i-1],3) * [1,2,4]
627 |     Z[i,1:-1] = R[N.sum(axis=1)]
628 |
629 | # Display
630 | plt.figure(figsize=(6,3))
631 | plt.subplot(1,1,1,frameon=False)
632 | plt.imshow(Z, vmin=0, vmax=1, cmap=plt.cm.gray_r)
633 | plt.xticks([]), plt.yticks([])
634 | plt.tight_layout()
635 | plt.savefig("automata.png")
636 | plt.show()
637 | ```
638 |
639 |
640 |
641 |
642 | ### Input / Output
643 |
644 | → Exercise written by [Stefan van der Walt](http://mentat.za.net/).
645 |
646 | Place the following data in a text file, data.txt:
647 |
648 | ```
649 | % rank lemma (10 letters max) frequency dispersion
650 | 21 they 1865844 0.96
651 | 42 her 969591 0.91
652 | 49 as 829018 0.95
653 | 7 to 6332195 0.98
654 | 63 take 670745 0.97
655 | 14 you 3085642 0.92
656 | 35 go 1151045 0.93
657 | 56 think 772787 0.91
658 | 28 not 1638883 0.98
659 | ```
660 |
661 | Now, design a suitable structured data type, then load the data from the text
662 | file using [np.loadtxt](https://docs.scipy.org/doc/numpy/reference/generated/numpy.loadtxt.html) (look at the documentation to see how to handle the '%' comment character).
663 |
664 | Here's a skeleton to start with:
665 |
666 | ```Python
667 | import numpy as np
668 |
669 | # Construct the data-type
670 | # For example:
671 | # dtype = np.dtype([('x', float), ('y', int), ('z', np.uint8)])
672 |
673 | dt = np.dtype(...) # Modify this line to give the correct answer
674 | data = np.loadtxt(...) # Load data with loadtxt
675 | ```
676 |
677 | Examine the data you got:
678 | * Extract words only
679 | * Extract the 3rd row
680 | * Print all words with rank < 30
681 |
682 | Sort the data according to frequency (see
683 | [np.sort](https://docs.scipy.org/doc/numpy/reference/routines.sort.html)).
684 |
685 |
686 | Save the result to a NumPy archive file (e.g. "sorted.npz") using [np.savez](https://docs.scipy.org/doc/numpy/reference/generated/numpy.savez.html) (or `np.savez_compressed` if you want the archive to be compressed) and load it back with `out = np.load("sorted.npz")`. Do you get back what you put in? Why?
687 |
688 | Solution (click to expand)
689 |
690 | Source: [input-output.py](input-output.py)
691 |
692 | ```Python
693 | import numpy as np
694 |
695 | # Create our own dtype
696 | dtype = np.dtype([('rank', 'i8'),
697 | ('lemma', 'S8'),
698 | ('frequency', 'i8'),
699 | ('dispersion', 'f8')])
700 |
701 | # Load file using our own dtype
702 | data = np.loadtxt('data.txt', comments='%', dtype=dtype)
703 |
704 | # Extract words only
705 | print(data["lemma"])
706 |
707 | # Extract the 3rd row
708 | print(data[2])
709 |
710 | # Print all words with rank < 30
711 | print(data["lemma"][data["rank"] < 30])
712 |
713 | # Sort the data according to frequency.
714 | sorted = np.sort(data, order="frequency")
715 | print(sorted)
716 |
717 | # Save unsorted and sorted array
718 | np.savez("sorted.npz", data=data, sorted=sorted)
719 |
720 | # Load saved array
721 | out = np.load("sorted.npz")
722 | print(out["sorted"])
723 | ```
724 |
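As for the last question, the sketch below reproduces the round trip on a reduced data set using in-memory buffers, so that it runs without data.txt. The `'U10'` field type (unicode strings of up to 10 characters) is a choice of this example, not the one used in input-output.py:

```Python
import io
import numpy as np

text = """% rank lemma frequency dispersion
21 they 1865844 0.96
7 to 6332195 0.98
14 you 3085642 0.92
"""
dt = np.dtype([('rank', 'i8'), ('lemma', 'U10'),
               ('frequency', 'i8'), ('dispersion', 'f8')])
data = np.loadtxt(io.StringIO(text), comments='%', dtype=dt)

# Words only, for rows with rank < 20
print(data['lemma'][data['rank'] < 20])

# np.savez stores several named arrays in one archive, so np.load returns
# a dictionary-like NpzFile rather than a single array.
buffer = io.BytesIO()
np.savez(buffer, data=data)
buffer.seek(0)
out = np.load(buffer)
print(type(out).__name__, out.files)
print(out['data'].tolist() == data.tolist())
```

So you do not get an array back directly: each saved array has to be fetched by the name it was given in the `np.savez` call.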
725 |
726 |
727 |
728 | ## ❸ – Advanced exercises
729 |
730 | ### Geometry
731 |
732 | We consider a collection of 2d squares that are each defined by four points, a scaling factor, a translation and a rotation angle. We want to obtain the following figure:
733 |
734 | 
735 |
736 | made of 25 squares, scaled by 0.1, translated by (1,0) and with increasing
737 | rotation angles. The order of operations is `scale`, `translate` and
738 | `rotate`. What would be the best structure `S` to hold all this information at
739 | once?
740 | **→ Hints**: [structured arrays](https://docs.scipy.org/doc/numpy/user/basics.rec.html)
741 |
742 | Solution (click to expand)
743 |
744 | ```Python
745 | dtype = [("points",    float, (4, 2)),
746 |          ("scale",     float),
747 |          ("translate", float, 2),
748 |          ("rotate",    float)]
749 | S = np.zeros(25, dtype = dtype)
750 | ```
751 |
752 |
753 |
754 | We now need to initialize our array. For the four points describing a square,
755 | you can use the following points: [(-1,-1), (-1,+1), (+1,+1), (+1,-1)]
756 |
757 | Solution (click to expand)
758 |
759 | ```Python
760 | S["points"] = [(-1,-1), (-1,+1), (+1,+1), (+1,-1)]
761 | S["translate"] = (1,0)
762 | S["scale"] = 0.1
763 | S["rotate"] = np.linspace(0, 2*np.pi, len(S), endpoint=False)
764 | ```
765 |
766 |
767 |
768 | Now, we need to write a function that applies all these transformations and writes
769 | the results in a new array:
770 |
771 | ```Python
772 |
773 | P = np.zeros((len(S), 4, 2))
774 | # Your code here (to populate P)
775 | ...
776 | ```
777 |
778 | You can start by writing a translate, scale and rotate function first.
779 |
780 | > Rotation reminder. Considering a point (x,y) and a rotation angle a,
781 | > the rotated coordinates (x',y') are:
782 | >
783 | > x' = x.cos(a) - y.sin(a) and y' = x.sin(a) + y.cos(a)
784 |
785 |
786 | The display code is:
787 |
788 | ```Python
789 | import matplotlib.pyplot as plt
790 |
791 | fig = plt.figure(figsize=(6,6))
792 | ax = plt.subplot(1,1,1, frameon=False)
793 | for i in range(len(P)):
794 |     X = np.r_[P[i,:,0], P[i,0,0]]
795 |     Y = np.r_[P[i,:,1], P[i,0,1]]
796 |     plt.plot(X, Y, color="black")
797 | plt.xticks([]), plt.yticks([])
798 | plt.tight_layout()
799 | plt.show()
800 | ```
801 |
802 |
803 |
804 | Solution (click to expand)
805 |
806 | Source: [geometry.py](geometry.py)
807 |
808 | ```Python
809 | import numpy as np
810 | import matplotlib.pyplot as plt
811 |
812 | dtype = [("points",    float, (4, 2)),
813 |          ("scale",     float),
814 |          ("translate", float, 2),
815 |          ("rotate",    float)]
816 | S = np.zeros(25, dtype = dtype)
817 | S["points"] = [(-1,-1), (-1,+1), (+1,+1), (+1,-1)]
818 | S["translate"] = (1,0)
819 | S["scale"] = 0.1
820 | S["rotate"] = np.linspace(0, 2*np.pi, len(S), endpoint=False)
821 |
822 | P = np.zeros((len(S), 4, 2))
823 | for i in range(len(S)):
824 |     for j in range(4):
825 |         x = S[i]["points"][j,0]
826 |         y = S[i]["points"][j,1]
827 |         tx, ty = S[i]["translate"]
828 |         scale  = S[i]["scale"]
829 |         theta  = S[i]["rotate"]
830 |         x = tx + x*scale
831 |         y = ty + y*scale
832 |         x_ = x*np.cos(theta) - y*np.sin(theta)
833 |         y_ = x*np.sin(theta) + y*np.cos(theta)
834 |         P[i,j] = x_, y_
835 |
836 | fig = plt.figure(figsize=(6,6))
837 | ax = plt.subplot(1,1,1, frameon=False)
838 | for i in range(len(P)):
839 |     X = np.r_[P[i,:,0], P[i,0,0]]
840 |     Y = np.r_[P[i,:,1], P[i,0,1]]
841 |     plt.plot(X, Y, color="black")
842 | plt.xticks([]), plt.yticks([])
843 | plt.tight_layout()
844 | plt.savefig("geometry.png")
845 | plt.show()
846 | ```
847 |
848 |
849 |
850 | The proposed solution has two loops. Can you think of a way to do it without any loop?
851 | **→ Hints**: [einsum](https://docs.scipy.org/doc/numpy/reference/generated/numpy.einsum.html)
852 |
853 | Solution (click to expand)
854 |
855 | Have a look at [Multiple individual 2d rotation at once](https://stackoverflow.com/questions/40822983/multiple-individual-2d-rotation-at-once) on stack overflow. I did not implement it, feel free to issue a PR with the solution.
856 |
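For reference, here is one possible loop-free sketch based on broadcasting and `np.einsum`. It is not the author's solution, only an illustration under the same conventions as above (scale, then translate, then rotate); the intermediate names `Q` and `R` are introduced for this example:

```Python
import numpy as np

dtype = [("points",    float, (4, 2)),
         ("scale",     float),
         ("translate", float, (2,)),
         ("rotate",    float)]
S = np.zeros(25, dtype=dtype)
S["points"] = [(-1, -1), (-1, +1), (+1, +1), (+1, -1)]
S["translate"] = (1, 0)
S["scale"] = 0.1
S["rotate"] = np.linspace(0, 2*np.pi, len(S), endpoint=False)

# Scale then translate, broadcasting per-square values over the 4 points
Q = S["points"] * S["scale"][:, None, None] + S["translate"][:, None, :]

# One 2x2 rotation matrix per square, shape (25, 2, 2)
c, s = np.cos(S["rotate"]), np.sin(S["rotate"])
R = np.stack([np.stack([c, -s], axis=-1),
              np.stack([s,  c], axis=-1)], axis=-2)

# P[n, p, i] = sum over j of R[n, i, j] * Q[n, p, j]
P = np.einsum("nij,npj->npi", R, Q)
print(P.shape)   # (25, 4, 2)
```

The resulting `P` can be passed unchanged to the display code above.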
857 |
858 |
859 | ### Image quantization
860 |
861 | > In computer graphics, color quantization or color image quantization is
862 | > quantization applied to color spaces; it is a process that reduces the number
863 | > of distinct colors used in an image, usually with the intention that the new
864 | > image should be as visually similar as possible to the original image.
865 | >
866 | > – Wikipedia
867 |
868 | In this exercise, we want to perform color quantization: given an arbitrary
869 | image, we would like to reduce the number of colors without altering the
870 | perception of the image too much. We thus need to find the most representative
871 | colors.
872 |
873 | The first (naive) idea that may come to mind is to count the number of times a
874 | specific color is used and to use the most frequent colors for quantization.
875 | Unfortunately, this does not work very well as illustrated below:
876 |
877 | 
878 | 
879 |
880 | The reason is that some colors and their slight variations might be
881 | over-represented in the original image and will thus appear among the most
882 | frequent colors. This is the reason why the kitten ended up mostly green and
883 | the flower totally disappeared.
884 |
885 | To check for yourself, write the corresponding script and look at the
886 | result:
887 |
888 | 1. Load an image (using [imageio](http://imageio.github.io/).[imread](https://imageio.readthedocs.io/en/latest/userapi.html#imageio.imread))
889 | 2. Find the number of unique colors and their frequency (counts)
890 | 3. Pick the n=16 most frequent colors
891 | 4. Replace colors in the original image with the closest color (found previously)
892 | 5. Save the result (using [imageio](http://imageio.github.io/).[imsave](https://imageio.readthedocs.io/en/latest/userapi.html#imageio.imsave))
893 |
894 |
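Steps 2 and 3 can be tried on a tiny synthetic image before moving to a real one. The 2x3 array below is made-up data for illustration only:

```Python
import numpy as np

# A tiny synthetic 2x3 RGB "image" (hypothetical data, not kitten.jpg)
I = np.array([[[255, 0, 0], [255, 0, 0], [0, 0, 255]],
              [[255, 0, 0], [0, 255, 0], [0, 0, 255]]], dtype=np.uint8)
pixels = I.reshape(-1, 3)

# Unique colors (one per row) and how often each one occurs
colors, counts = np.unique(pixels, axis=0, return_counts=True)

# Most frequent colors first
order = np.argsort(counts)[::-1]
print(colors[order][:2])   # red, then blue
print(counts[order][:2])   # [3 2]
```

Note that the sorted indices refer to rows of `colors`, not to pixels of the flattened image.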
895 | Solution (click to expand)
896 |
897 | Sources: [bad-dither.py](bad-dither.py)
898 |
899 | ```Python
900 | import imageio
901 | import numpy as np
902 | import scipy.spatial
903 |
904 | # Number of final colors we want
905 | n = 16
906 |
907 | # Original Image
908 | I = imageio.imread("kitten.jpg")
909 | shape = I.shape
910 |
911 | # Flattened image
912 | I = I.reshape(shape[0]*shape[1], shape[2])
913 |
914 | # Find the unique colors and their frequency (=counts)
915 | colors, counts = np.unique(I, axis=0, return_counts=True)
916 |
917 | # Get the n most frequent colors
918 | sorted = np.argsort(counts)[::-1]
919 | C = colors[sorted][:n]
920 |
921 | # Compute distance to most frequent colors
922 | D = scipy.spatial.distance.cdist(I, C, 'sqeuclidean')
923 |
924 | # Replace colors with closest one
925 | Z = (C[D.argmin(axis=1)]).reshape(shape)
926 |
927 | # Save result
928 | imageio.imsave("kitten-dithered.jpg", Z)
929 | ```
930 |
931 |
932 |
933 |
934 | We thus need a different method. One such method is [k-means
935 | clustering](https://en.wikipedia.org/wiki/K-means_clustering), which
936 | partitions the data into n clusters whose centroids may serve as prototypes
937 | for the clusters.
938 |
939 | 
940 | 
941 |
942 | The algorithm is quite simple. We start with n random points (centroids) and,
943 | for each point in our data, we compute which centroid is the closest. Points
944 | sharing the same closest centroid form a cluster. For each cluster, we compute
945 | its new centroid (mean point) and we repeat the process for a given number of
946 | steps. In this exercise, you'll have to write such a k-means function and use
947 | it to quantize the image.
948 |
949 | Solution (click to expand)
950 |
951 | Sources: [kmeans.py](kmeans.py)
952 |
953 | ```Python
954 | # Code by Gareth Rees, posted on Code Review Stack Exchange
955 | # https://codereview.stackexchange.com/questions/61598/k-mean-with-numpy
956 |
957 | import numpy as np
958 | import scipy.spatial
959 |
960 | def cluster_centroids(data, clusters, k=None):
961 |     if k is None:
962 |         k = np.max(clusters) + 1
963 |     result = np.empty(shape=(k,) + data.shape[1:])
964 |     for i in range(k):
965 |         np.mean(data[clusters == i], axis=0, out=result[i])
966 |     return result
967 |
968 |
969 | def kmeans(data, k=None, centroids=None, steps=20):
970 |     if centroids is not None and k is not None:
971 |         assert(k == len(centroids))
972 |     elif centroids is not None:
973 |         k = len(centroids)
974 |     elif k is not None:
975 |         # Forgy initialization method: choose k data points randomly.
976 |         centroids = data[np.random.choice(np.arange(len(data)), k, False)]
977 |     else:
978 |         raise RuntimeError("Need a value for k or centroids.")
979 |
980 |     for _ in range(max(steps, 1)):
981 |         # Squared distances between each point and each centroid.
982 |         sqdists = scipy.spatial.distance.cdist(centroids, data, 'sqeuclidean')
983 |
984 |         # Index of the closest centroid to each data point.
985 |         clusters = np.argmin(sqdists, axis=0)
986 |
987 |         new_centroids = cluster_centroids(data, clusters, k)
988 |         if np.array_equal(new_centroids, centroids):
989 |             break
990 |
991 |         centroids = new_centroids
992 |     return centroids, clusters
993 |
994 |
995 | if __name__ == '__main__':
996 |     import imageio
997 |
998 |     # Number of final colors we want
999 |     n = 16
1000 |
1001 |     # Original Image
1002 |     I = imageio.imread("kitten.jpg")
1003 |     shape = I.shape
1004 |
1005 |     # Flattened image
1006 |     D = I.reshape(shape[0]*shape[1], shape[2])
1007 |
1008 |     # Search for 16 centroids in D (using 20 iterations)
1009 |     centroids, clusters = kmeans(D, k=n, steps=20)
1010 |
1011 |     # Create quantized image
1012 |     I = (centroids[clusters]).reshape(shape)
1013 |     I = np.round(I).astype(np.uint8)
1014 |
1015 |     # Save result
1016 |     imageio.imsave("kitten-quantized.jpg", I)
1017 | ```
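The assignment and update steps of the solution above can also be checked by hand on a tiny data set. The following is only a sketch: it swaps `cdist` for plain NumPy broadcasting, and the helper name `sqdists_broadcast` and the toy points are assumptions made for illustration, not part of `kmeans.py`.

```python
import numpy as np

def sqdists_broadcast(centroids, data):
    """Squared Euclidean distances with shape (k, n), using broadcasting."""
    centroids = np.asarray(centroids, dtype=float)
    data = np.asarray(data, dtype=float)
    # (k, 1, d) - (1, n, d) -> (k, n, d), then sum over the last axis
    diff = centroids[:, np.newaxis, :] - data[np.newaxis, :, :]
    return (diff ** 2).sum(axis=-1)

# Hypothetical toy data: two well-separated groups of 2D points
data = np.array([[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]])
centroids = np.array([[0.0, 0.0], [10.0, 10.0]])

# Assignment step: index of the closest centroid for each point
sqdists = sqdists_broadcast(centroids, data)
clusters = np.argmin(sqdists, axis=0)

# Update step: each centroid moves to the mean of its cluster
new_centroids = np.array([data[clusters == i].mean(axis=0)
                          for i in range(len(centroids))])
print(clusters)
print(new_centroids)
```

Broadcasting builds a temporary array of shape `(k, n, d)`, so for a full image `cdist` is likely the more memory-friendly choice. The broadcasting version is mainly useful for seeing what the distance matrix contains.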
1018 |
1019 |
1020 |
1021 |
1022 |
1023 |
1024 | ### Neural networks
1025 |
1026 | In this exercise, we'll implement one of the simplest feed-forward neural
1027 | networks, a.k.a. the [Perceptron](https://en.wikipedia.org/wiki/Perceptron). We'll use it to discriminate between two classes
1028 | (points in two dimensions, see [desired output](perceptron.mp4)):
1029 |
1030 | ```Python
1031 | samples = np.zeros(100, dtype=[('input', float, 2),
1032 | ('output', float, 1)])
1033 |
1034 | P = np.random.uniform(0.05,0.95,(len(samples),2))
1035 | samples["input"] = P
1036 | stars = np.where(P[:,0]+P[:,1] < 1)
1037 | discs = np.where(P[:,0]+P[:,1] > 1)
1038 | samples["output"][stars] = +1
1039 | samples["output"][discs] = 0
1040 | ```
1041 |
1042 | Your goal is to populate the following class in order to train the
1043 | network. You'll need:
1044 |
1045 | * a one-dimensional array to store the input
1046 | * a one-dimensional array to store the output
1047 | * a two-dimensional array to store the weights
1048 | * a threshold function `f` (for example `f = lambda x: x > 0`)
1049 |
1050 | The `propagate_forward` method computes the output of the network,
1051 | while the `propagate_backward` method updates the weights according to
1052 | the error between the target and the actual output.
1053 |
1054 | ```Python
1055 | class Perceptron:
1056 |     def __init__(self, n, m):
1057 |         "Initialization of the perceptron with given sizes"
1058 |         ...
1059 |
1060 |     def reset(self):
1061 |         "Reset weights"
1062 |         ...
1063 |
1064 |     def propagate_forward(self, data):
1065 |         "Propagate data from input layer to output layer"
1066 |         ...
1067 |
1068 |     def propagate_backward(self, target, lrate=0.1):
1069 |         "Back propagate error related to target using lrate"
1070 |         ...
1071 | ```
1072 |
1073 | Solution (click to expand)
1074 |
1075 | Sources: [perceptron.py](perceptron.py)
1076 |
1077 | ```Python
1078 | class Perceptron:
1079 |     ''' Perceptron class. '''
1080 |
1081 |     def __init__(self, n, m):
1082 |         "Initialization of the perceptron with given sizes"
1083 |
1084 |         self.input = np.ones(n+1)
1085 |         self.output = np.ones(m)
1086 |         self.weights= np.zeros((m,n+1))
1087 |         self.reset()
1088 |
1089 |     def reset(self):
1090 |         "Reset weights"
1091 |
1092 |         self.weights[...] = np.random.uniform(-.5, .5, self.weights.shape)
1093 |
1094 |     def propagate_forward(self, data):
1095 |         "Propagate data from input layer to output layer"
1096 |
1097 |         # Set input layer (but not bias)
1098 |         self.input[1:] = data
1099 |         self.output[...] = f(np.dot(self.weights,self.input))
1100 |
1101 |         # Return output
1102 |         return self.output
1103 |
1104 |     def propagate_backward(self, target, lrate=0.1):
1105 |         "Back propagate error related to target using lrate"
1106 |
1107 |         error = np.atleast_2d(target-self.output)
1108 |         input = np.atleast_2d(self.input)
1109 |         self.weights += lrate*np.dot(error.T,input)
1110 |
1111 |         # Return error
1112 |         return (error**2).sum()
1113 | ```
1114 |
1115 |
1116 |
1117 | To train the network for 1000 iterations, we can do:
1118 |
1119 | ```Python
1120 | network = Perceptron(2, 1)
1121 | lrate = 0.1
1122 | for i in range(1000):
1123 |     lrate *= 0.999
1124 |     n = np.random.randint(samples.size)
1125 |     network.propagate_forward(samples['input'][n])
1126 |     error = network.propagate_backward(samples['output'][n], lrate)
1127 | ```
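To see whether training worked, one option is to count how many samples end up on the correct side of the learned line. The sketch below is self-contained and does not use the `Perceptron` class: it inlines the same threshold and update rule on flat arrays. The seed, the variable names (`X`, `T`, `W`) and the accuracy measure are assumptions added for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, only for reproducibility

# Same kind of data as in the exercise: class 1 lies below the line x + y = 1
P = rng.uniform(0.05, 0.95, (100, 2))
T = (P[:, 0] + P[:, 1] < 1).astype(float)

# Inputs with a leading bias term, and one weight per input (single output)
X = np.c_[np.ones(len(P)), P]
W = rng.uniform(-0.5, 0.5, 3)

lrate = 0.1
for i in range(1000):
    lrate *= 0.999
    n = rng.integers(len(X))
    output = float(np.dot(W, X[n]) > 0)   # forward pass with threshold
    W += lrate * (T[n] - output) * X[n]   # perceptron learning rule

# Fraction of samples classified correctly by the trained weights
predictions = (X @ W > 0).astype(float)
accuracy = (predictions == T).mean()
print("accuracy: {:.2f}".format(accuracy))
```

Because the two classes are linearly separable, the accuracy should approach 1 as training proceeds, although the exact value depends on the seed and on the number of iterations.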
1128 |
1129 | For other types of neural networks, have a look at https://github.com/rougier/neural-networks/.
1130 |
1131 |
1132 |
1133 |
1134 | ## ❹ – References
1135 |
1136 | ### Books & tutorials
1137 |
1138 | This is a curated list of resources among the plethora of books & tutorials
1139 | that exist online. Make no mistake, it is strongly biased.
1140 |
1141 | * [From Python to Numpy](http://www.labri.fr/perso/nrougier/from-python-to-numpy/),
1142 | Nicolas P. Rougier, 2017
1143 | * [100 Numpy Exercises](https://github.com/rougier/numpy-100),
1144 | Nicolas P. Rougier, 2017
1145 | * [SciPy Lecture Notes](http://www.scipy-lectures.org/),
1146 | Gaël Varoquaux, Emmanuelle Gouillart, Olav Vahtras et al., 2016
1147 | * [Elegant SciPy: The Art of Scientific Python](https://github.com/elegant-scipy/elegant-scipy),
1148 | Juan Nunez-Iglesias, Stéfan van der Walt, Harriet Dashnow, 2016
1149 | * [Numpy Medkit](http://mentat.za.net/numpy/numpy_advanced_slides),
1150 | Stéfan van der Walt, 2008
1151 |
1152 | ### Archives
1153 |
1154 | You can access all ASPP archives from https://python.g-node.org/wiki/archives
1155 |
1156 | * **2017** (Nikiti, Greece, Juan Nunez-Iglesias):
1157 | [exercises](https://github.com/jni/aspp2017-numpy) – [solutions](https://github.com/jni/aspp2017-numpy-solutions)
1158 | * **2016** (Reading, United Kingdom, Stéfan van der Walt):
1159 | [exercises](https://github.com/ASPP/2016_numpy)
1160 | * **2015** (Munich, Germany, Juan Nunez-Iglesias):
1161 | [exercises](https://github.com/jni/aspp2015/tree/delivered) – [solutions](https://github.com/jni/aspp2015/tree/solved-in-class)
1162 | * **2014** (Split, Croatia, Stéfan van der Walt):
1163 | [notebooks](https://python.g-node.org/python-summerschool-2014/_media/numpy_advanced.tar.bz2)
1164 | * **2013** (Zürich, Switzerland, Stéfan van der Walt):
1165 | [slides](https://python.g-node.org/python-summerschool-2013/_media/advanced_numpy/slides/index.html) – [exercises](https://python.g-node.org/python-summerschool-2013/_media/advanced_numpy/problems.html) – [dropbox](https://www.dropbox.com/sh/4esl1ii7cac5xfa/O-CSFKKYvS/assp2013/numpy_problems)
1166 | * **2012** (Kiel, Germany, Stéfan van der Walt):
1167 | [slides](https://python.g-node.org/python-summerschool-2012/_media/wiki/numpy/numpy_kiel2012.pdf) – [exercises](https://python.g-node.org/python-summerschool-2012/_media/wiki/numpy/problems.html)
1168 | * **2011** (St Andrews, United Kingdom, Pauli Virtanen):
1169 | [slides](https://python.g-node.org/python-summerschool-2011/_media/materials/numpy/numpy-slides.pdf) – [exercises](https://python.g-node.org/python-summerschool-2011/_media/materials/numpy/numpy-exercises.zip) – [solutions](https://python.g-node.org/python-summerschool-2011/_media/materials/numpy/numpy-solutions.zip)
1170 | * **2010** (Trento, Italy, Stéfan van der Walt):
1171 | [slides](https://python.g-node.org/python-autumnschool-2010/_media/materials/advanced_numpy/numpy_trento2010.pdf) – [exercises](https://python.g-node.org/python-autumnschool-2010/_media/materials/advanced_numpy/problems.html) – [solutions 1](https://python.g-node.org/python-autumnschool-2010/_media/materials/advanced_numpy/array_interface/solution.py) – [solutions 2](https://python.g-node.org/python-autumnschool-2010/_media/materials/advanced_numpy/structured_arrays/load_txt_solution.py)
1172 | * **2010** (Warsaw, Poland, Bartosz Teleńczuk):
1173 | [slides](https://python.g-node.org/python-winterschool-2010/_media/scientific_python.pdf) – [exercises](https://python.g-node.org/python-winterschool-2010/_media/python_tools_for_science.pdf)
1174 | * **2009** (Berlin, Germany, Jens Kremkow):
1175 | [slides](https://python.g-node.org/python-summerschool-2009/_media/numpy_scipy_matplotlib_pynn_neurotools.pdf) – [examples](https://python.g-node.org/python-summerschool-2009/_media/examples_numpy.py) – [exercises](https://python.g-node.org/python-summerschool-2009/_media/exercises_day2_numpy.py)
1176 |
--------------------------------------------------------------------------------
/automata.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ASPP/ASPP-2018-numpy/430c43d4023b20ee581557ce874dbcacede3f56f/automata.png
--------------------------------------------------------------------------------
/automata.py:
--------------------------------------------------------------------------------
1 | # -----------------------------------------------------------------------------
2 | # Copyright (C) 2018 Nicolas P. Rougier
3 | # Distributed under the terms of the BSD License.
4 | # -----------------------------------------------------------------------------
5 | import numpy as np
6 | import matplotlib.pyplot as plt
7 | from numpy.lib.stride_tricks import as_strided
8 |
9 | def sliding_window(Z, size=2):
10 |     n, s = Z.shape[0], Z.strides[0]
11 |     return as_strided(Z, shape=(n-size+1, size), strides=(s, s))
12 |
13 | # Rule 30 (see https://en.wikipedia.org/wiki/Rule_30)
14 | # 0b000: 0, 0b001: 1, 0b010: 1, 0b011: 1
15 | # 0b100: 1, 0b101: 0, 0b110: 0, 0b111: 0
16 | rule = 30
17 | R = np.array([int(v) for v in '{0:08b}'.format(rule)])[::-1]
18 |
19 | # Initial state
20 | Z = np.zeros((250,501), dtype=int)
21 | Z[0,250] = 1
22 |
23 | # Computing some iterations (the left cell is the most significant bit)
24 | for i in range(1, len(Z)):
25 |     N = sliding_window(Z[i-1],3) * [4,2,1]
26 |     Z[i,1:-1] = R[N.sum(axis=1)]
27 |
28 | # Display
29 | plt.figure(figsize=(6,3))
30 | plt.subplot(1,1,1,frameon=False)
31 | plt.imshow(Z, vmin=0, vmax=1, cmap=plt.cm.gray_r)
32 | plt.xticks([]), plt.yticks([])
33 | plt.tight_layout()
34 | plt.savefig("automata.png")
35 | plt.show()
36 |
--------------------------------------------------------------------------------
/bad-dither.py:
--------------------------------------------------------------------------------
1 | # -----------------------------------------------------------------------------
2 | # Copyright (C) 2018 Nicolas P. Rougier
3 | # Distributed under the terms of the BSD License.
4 | # -----------------------------------------------------------------------------
5 | import imageio
6 | import numpy as np
7 | import scipy.spatial
8 |
9 | # Number of final colors we want
10 | n = 16
11 |
12 | # Original Image
13 | I = imageio.imread("kitten.jpg")
14 | shape = I.shape
15 |
16 | # Flattened image
17 | I = I.reshape(shape[0]*shape[1], shape[2])
18 |
19 | # Find the unique colors and their frequency (=counts)
20 | colors, counts = np.unique(I, axis=0, return_counts=True)
21 |
22 | # Get the n most frequent colors
23 | order = np.argsort(counts)[::-1]
24 | C = colors[order][:n]
25 |
26 | # Compute distance to most frequent colors
27 | D = scipy.spatial.distance.cdist(I, C, 'sqeuclidean')
28 |
29 | # Replace colors with closest one
30 | Z = (C[D.argmin(axis=1)]).reshape(shape)
31 |
32 | # Save result
33 | imageio.imsave("kitten-dithered.jpg", Z)
34 |
--------------------------------------------------------------------------------
/basic-manipulation.py:
--------------------------------------------------------------------------------
1 | # -----------------------------------------------------------------------------
2 | # Copyright (C) 2018 Nicolas P. Rougier
3 | # Distributed under the terms of the BSD License.
4 | # -----------------------------------------------------------------------------
5 | import numpy as np
6 |
7 | # Create a vector with values ranging from 10 to 49
8 | Z = np.arange(10,50)
9 |
10 | # Create a null vector of size 100 but the fifth value which is 1
11 | Z = np.zeros(100)
12 | Z[4] = 1
13 |
14 | # Reverse a vector (first element becomes last)
15 | Z = np.arange(50)[::-1]
16 |
17 | # Create a 3x3 matrix with values ranging from 0 to 8
18 | Z = np.arange(9).reshape(3,3)
19 |
20 | # Create a 3x3 identity matrix
21 | Z = np.eye(3)
22 |
23 | # Create a 2d array with 1 on the border and 0 inside
24 | Z = np.ones((10,10))
25 | Z[1:-1,1:-1] = 0
26 |
27 | # Given a 1D array, negate all elements which are between 3 and 8, in place
28 | Z = np.arange(11)
29 | Z[(3 < Z) & (Z <= 8)] *= -1
30 |
--------------------------------------------------------------------------------
/data.txt:
--------------------------------------------------------------------------------
1 | % rank lemma (8 letters max) frequency dispersion
2 | 21 they 1865844 0.96
3 | 42 her 969591 0.91
4 | 49 as 829018 0.95
5 | 7 to 6332195 0.98
6 | 63 take 670745 0.97
7 | 14 you 3085642 0.92
8 | 35 go 1151045 0.93
9 | 56 think 772787 0.91
10 | 28 not 1638883 0.98
11 |
--------------------------------------------------------------------------------
/diffusion.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ASPP/ASPP-2018-numpy/430c43d4023b20ee581557ce874dbcacede3f56f/diffusion.png
--------------------------------------------------------------------------------
/diffusion.py:
--------------------------------------------------------------------------------
1 | # -----------------------------------------------------------------------------
2 | # Copyright (C) 2018 Nicolas P. Rougier
3 | # Distributed under the terms of the BSD License.
4 | # -----------------------------------------------------------------------------
5 | import numpy as np
6 | import matplotlib.pyplot as plt
7 |
8 | from numpy.lib.stride_tricks import as_strided
9 |
10 | def sliding_window(Z, size=2):
11 |     n, s = Z.shape[0], Z.strides[0]
12 |     return as_strided(Z, shape=(n-size+1, size), strides=(s, s))
13 |
14 |
15 | # Initial conditions:
16 | # Domain size is 100 and we'll iterate over 50 time steps
17 | Z = np.zeros((50,100))
18 | Z[0,5::10] = 1.5
19 |
20 | # Actual iteration
21 | F = 0.05
22 | for i in range(1, len(Z)):
23 |     Z[i,1:-1] = Z[i-1,1:-1] + F*(sliding_window(Z[i-1], 3)*[+1,-2,+1]).sum(axis=1)
24 |
25 | # Display
26 | plt.figure(figsize=(6,3))
27 | plt.subplot(1,1,1,frameon=False)
28 | plt.imshow(Z, vmin=0, vmax=1)
29 | plt.xticks([]), plt.yticks([])
30 | plt.tight_layout()
31 | plt.savefig("diffusion.png")
32 | plt.show()
33 |
--------------------------------------------------------------------------------
/dithered.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ASPP/ASPP-2018-numpy/430c43d4023b20ee581557ce874dbcacede3f56f/dithered.png
--------------------------------------------------------------------------------
/geometry.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ASPP/ASPP-2018-numpy/430c43d4023b20ee581557ce874dbcacede3f56f/geometry.png
--------------------------------------------------------------------------------
/geometry.py:
--------------------------------------------------------------------------------
1 | # -----------------------------------------------------------------------------
2 | # Copyright (C) 2018 Nicolas P. Rougier
3 | # Distributed under the terms of the BSD License.
4 | # -----------------------------------------------------------------------------
5 | import numpy as np
6 | import matplotlib.pyplot as plt
7 |
8 | dtype = [("points", float, (4, 2)),
9 | ("scale", float, 1),
10 | ("translate", float, 2),
11 | ("rotate", float, 1)]
12 | S = np.zeros(25, dtype = dtype)
13 | S["points"] = [(-1,-1), (-1,+1), (+1,+1), (+1,-1)]
14 | S["translate"] = (1,0)
15 | S["scale"] = 0.1
16 | S["rotate"] = np.linspace(0, 2*np.pi, len(S), endpoint=False)
17 |
18 | P = np.zeros((len(S), 4, 2))
19 | for i in range(len(S)):
20 |     for j in range(4):
21 |         x = S[i]["points"][j,0]
22 |         y = S[i]["points"][j,1]
23 |         tx, ty = S[i]["translate"]
24 |         scale = S[i]["scale"]
25 |         theta = S[i]["rotate"]
26 |         x = tx + x*scale
27 |         y = ty + y*scale
28 |         x_ = x*np.cos(theta) - y*np.sin(theta)
29 |         y_ = x*np.sin(theta) + y*np.cos(theta)
30 |         P[i,j] = x_, y_
31 |
32 | fig = plt.figure(figsize=(6,6))
33 | ax = plt.subplot(1,1,1, frameon=False)
34 | for i in range(len(P)):
35 |     X = np.r_[P[i,:,0], P[i,0,0]]
36 |     Y = np.r_[P[i,:,1], P[i,0,1]]
37 |     plt.plot(X, Y, color="black")
38 | plt.xticks([]), plt.yticks([])
39 | plt.tight_layout()
40 | plt.savefig("geometry.png")
41 | plt.show()
42 |
--------------------------------------------------------------------------------
/imshow.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ASPP/ASPP-2018-numpy/430c43d4023b20ee581557ce874dbcacede3f56f/imshow.png
--------------------------------------------------------------------------------
/imshow.py:
--------------------------------------------------------------------------------
1 | # Terminal visualization of 2D numpy arrays
2 | # Copyright (c) 2009 Nicolas P. Rougier
3 | #
4 | # This program is free software: you can redistribute it and/or modify it under
5 | # the terms of the GNU General Public License as published by the Free Software
6 | # Foundation, either version 3 of the License, or (at your option) any later
7 | # version.
8 | #
9 | # This program is distributed in the hope that it will be useful, but WITHOUT
10 | # ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
11 | # FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
12 | #
13 | # You should have received a copy of the GNU General Public License along with
14 | # this program. If not, see <http://www.gnu.org/licenses/>.
15 | # ------------------------------------------------------------------------------
16 | """ Terminal visualization of 2D numpy arrays
17 | Using extended color capability of terminal (256 colors), the imshow function
18 | renders a 2D numpy array within terminal.
19 | """
20 | import sys
21 | import numpy as np
22 | from matplotlib.cm import viridis
23 |
24 |
25 | def imshow (Z, vmin=None, vmax=None, cmap=viridis, show_cmap=True):
26 |     ''' Show a 2D numpy array using terminal colors '''
27 |
28 |     Z = np.atleast_2d(Z)
29 |
30 |     if len(Z.shape) != 2:
31 |         print("Cannot display non 2D array")
32 |         return
33 |
34 |     vmin = Z.min() if vmin is None else vmin
35 |     vmax = Z.max() if vmax is None else vmax
36 |
37 |     # Build initialization string that sets up terminal colors
38 |     init = ''
39 |     for i in range(240):
40 |         v = i/240
41 |         r,g,b,a = cmap(v)
42 |         init += "\x1b]4;%d;rgb:%02x/%02x/%02x\x1b\\" % (16+i, int(r*255),int(g*255),int(b*255))
43 |
44 |     # Build array data string
45 |     data = ''
46 |     for i in range(Z.shape[0]):
47 |         for j in range(Z.shape[1]):
48 |             c = 16 + int( ((Z[Z.shape[0]-i-1,j]-vmin) / (vmax-vmin))*239)
49 |             if (c < 16):
50 |                 c=16
51 |             elif (c > 255):
52 |                 c=255
53 |             data += "\x1b[48;5;%dm " % c
54 |         u = vmax - (i/float(max(Z.shape[0]-1,1))) * ((vmax-vmin))
55 |         if show_cmap:
56 |             data += "\x1b[0m "
57 |             data += "\x1b[48;5;%dm " % (16 + (1-i/float(Z.shape[0]))*239)
58 |             data += "\x1b[0m %+.2f" % u
59 |         data += "\n"
60 |
61 |     sys.stdout.write(init+'\n')
62 |     sys.stdout.write(data+'\n')
63 |
64 |
65 | if __name__ == '__main__':
66 |     def func3(x,y):
67 |         return (1- x/2 + x**5 + y**3)*np.exp(-x**2-y**2)
68 |     dx, dy = .2, .2
69 |     x = np.arange(-3.0, 3.0, dx)
70 |     y = np.arange(-3.0, 3.0, dy)
71 |     X,Y = np.meshgrid(x, y)
72 |     Z = np.array (func3(X, Y))
73 |     imshow (Z)
74 |
--------------------------------------------------------------------------------
/input-output.py:
--------------------------------------------------------------------------------
1 | # -----------------------------------------------------------------------------
2 | # Copyright (C) 2018 Nicolas P. Rougier
3 | # Distributed under the terms of the BSD License.
4 | # -----------------------------------------------------------------------------
5 | import numpy as np
6 |
7 | # Create our own dtype
8 | dtype = np.dtype([('rank', 'i8'),
9 | ('lemma', 'S8'),
10 | ('frequency', 'i8'),
11 | ('dispersion', 'f8')])
12 |
13 | # Load file using our own dtype
14 | data = np.loadtxt('data.txt', comments='%', dtype=dtype)
15 |
16 | # Extract words only
17 | print(data["lemma"])
18 |
19 | # Extract the 3rd row
20 | print(data[2])
21 |
22 | # Print all words with rank < 30
23 | print(data[data["rank"] < 30])
24 |
25 | # Sort the data according to frequency
26 | sorted = np.sort(data, order="frequency")
27 | print(sorted)
28 |
29 | # Save unsorted and sorted array
30 | np.savez("sorted.npz", data=data, sorted=sorted)
31 |
32 | # Load saved array
33 | out = np.load("sorted.npz")
34 | print(out["sorted"])
35 |
--------------------------------------------------------------------------------
/kitten-dithered.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ASPP/ASPP-2018-numpy/430c43d4023b20ee581557ce874dbcacede3f56f/kitten-dithered.jpg
--------------------------------------------------------------------------------
/kitten-quantized.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ASPP/ASPP-2018-numpy/430c43d4023b20ee581557ce874dbcacede3f56f/kitten-quantized.jpg
--------------------------------------------------------------------------------
/kitten.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ASPP/ASPP-2018-numpy/430c43d4023b20ee581557ce874dbcacede3f56f/kitten.jpg
--------------------------------------------------------------------------------
/kmeans.py:
--------------------------------------------------------------------------------
1 | # -----------------------------------------------------------------------------
2 | # Copyright (C) 2018 Nicolas P. Rougier
3 | # Distributed under the terms of the BSD License.
4 | # -----------------------------------------------------------------------------
5 | # Code by Gareth Rees, posted on Code Review Stack Exchange
6 | # https://codereview.stackexchange.com/questions/61598/k-mean-with-numpy
7 |
8 | import numpy as np
9 | import scipy.spatial
10 |
11 |
12 | def cluster_centroids(data, clusters, k=None):
13 |     """Return centroids of clusters in data.
14 |
15 |     data is an array of observations with shape (A, B, ...).
16 |
17 |     clusters is an array of integers of shape (A,) giving the index
18 |     (from 0 to k-1) of the cluster to which each observation belongs.
19 |     The clusters must all be non-empty.
20 |
21 |     k is the number of clusters. If omitted, it is deduced from the
22 |     values in the clusters array.
23 |
24 |     The result is an array of shape (k, B, ...) containing the
25 |     centroid of each cluster.
26 |
27 |     >>> data = np.array([[12, 10, 87],
28 |     ... [ 2, 12, 33],
29 |     ... [68, 31, 32],
30 |     ... [88, 13, 66],
31 |     ... [79, 40, 89],
32 |     ... [ 1, 77, 12]])
33 |     >>> cluster_centroids(data, np.array([1, 1, 2, 2, 0, 1]))
34 |     array([[ 79., 40., 89.],
35 |     [ 5., 33., 44.],
36 |     [ 78., 22., 49.]])
37 |
38 |     """
39 |     if k is None:
40 |         k = np.max(clusters) + 1
41 |     result = np.empty(shape=(k,) + data.shape[1:])
42 |     for i in range(k):
43 |         np.mean(data[clusters == i], axis=0, out=result[i])
44 |     return result
45 |
46 |
47 | def kmeans(data, k=None, centroids=None, steps=20):
48 |     """Divide the observations in data into clusters using the k-means
49 |     algorithm, and return the centroids together with an array of
50 |     integers assigning each data point to one of the clusters.
51 |
52 |     centroids, if supplied, must be an array giving the initial
53 |     position of the centroids of each cluster.
54 |
55 |     If centroids is omitted, the number k gives the number of clusters
56 |     and the initial positions of the centroids are selected randomly
57 |     from the data.
58 |
59 |     The k-means algorithm adjusts the centroids iteratively for the
60 |     given number of steps, or until no further progress can be made.
61 |
62 |     >>> data = np.array([[12, 10, 87],
63 |     ... [ 2, 12, 33],
64 |     ... [68, 31, 32],
65 |     ... [88, 13, 66],
66 |     ... [79, 40, 89],
67 |     ... [ 1, 77, 12]])
68 |     >>> np.random.seed(73)
69 |     >>> kmeans(data, k=3)
70 |     (array([[79., 40., 89.],
71 |     [ 5., 33., 44.],
72 |     [78., 22., 49.]]), array([1, 1, 2, 2, 0, 1]))
73 |
74 |     """
75 |     if centroids is not None and k is not None:
76 |         assert(k == len(centroids))
77 |     elif centroids is not None:
78 |         k = len(centroids)
79 |     elif k is not None:
80 |         # Forgy initialization method: choose k data points randomly.
81 |         centroids = data[np.random.choice(np.arange(len(data)), k, False)]
82 |     else:
83 |         raise RuntimeError("Need a value for k or centroids.")
84 |
85 |     for _ in range(max(steps, 1)):
86 |         # Squared distances between each point and each centroid.
87 |         sqdists = scipy.spatial.distance.cdist(centroids, data, 'sqeuclidean')
88 |
89 |         # Index of the closest centroid to each data point.
90 |         clusters = np.argmin(sqdists, axis=0)
91 |
92 |         new_centroids = cluster_centroids(data, clusters, k)
93 |         if np.array_equal(new_centroids, centroids):
94 |             break
95 |
96 |         centroids = new_centroids
97 |
98 |     return centroids, clusters
99 |
100 |
101 |
102 | if __name__ == '__main__':
103 |     import imageio
104 |
105 |     # Number of final colors we want
106 |     n = 16
107 |
108 |     # Original Image
109 |     I = imageio.imread("kitten.jpg")
110 |     shape = I.shape
111 |
112 |     # Flattened image
113 |     D = I.reshape(shape[0]*shape[1], shape[2])
114 |
115 |     # Search for 16 centroids in D (using 20 iterations)
116 |     centroids, clusters = kmeans(D, k=n, steps=20)
117 |
118 |     # Create quantized image
119 |     I = (centroids[clusters]).reshape(shape)
120 |     I = np.round(I).astype(np.uint8)
121 |
122 |     # Save result
123 |     imageio.imsave("kitten-quantized.jpg", I)
124 |
--------------------------------------------------------------------------------
/moving-average.py:
--------------------------------------------------------------------------------
1 | import numpy as np
2 | import matplotlib.pyplot as plt
3 |
4 |
--------------------------------------------------------------------------------
/nan-arithmetics.py:
--------------------------------------------------------------------------------
1 | # -----------------------------------------------------------------------------
2 | # Copyright (C) 2018 Nicolas P. Rougier
3 | # Distributed under the terms of the BSD License.
4 | # -----------------------------------------------------------------------------
5 | import numpy as np
6 |
7 | # Result is NaN
8 | print(0 * np.nan)
9 |
10 | # Result is False
11 | print(np.nan == np.nan)
12 |
13 | # Result is False
14 | print(np.inf > np.nan)
15 |
16 | # Result is NaN
17 | print(np.nan - np.nan)
18 |
19 | # Result is False !!!
20 | print(0.3 == 3 * 0.1)
21 | print("0.1 really is {:0.56f}".format(0.1))
22 |
--------------------------------------------------------------------------------
/original.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ASPP/ASPP-2018-numpy/430c43d4023b20ee581557ce874dbcacede3f56f/original.png
--------------------------------------------------------------------------------
/perceptron.mp4:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ASPP/ASPP-2018-numpy/430c43d4023b20ee581557ce874dbcacede3f56f/perceptron.mp4
--------------------------------------------------------------------------------
/perceptron.py:
--------------------------------------------------------------------------------
1 | # -----------------------------------------------------------------------------
2 | # Copyright (C) 2018 Nicolas P. Rougier
3 | # Distributed under the terms of the BSD License.
4 | # -----------------------------------------------------------------------------
5 | import numpy as np
6 |
7 | def f(x):
8 |     return x > 0
9 |
10 | class Perceptron:
11 |     ''' Perceptron class. '''
12 |
13 |     def __init__(self, n, m):
14 |         ''' Initialization of the perceptron with given sizes. '''
15 |
16 |         self.input = np.ones(n+1)
17 |         self.output = np.ones(m)
18 |         self.weights= np.zeros((m,n+1))
19 |         self.reset()
20 |
21 |     def reset(self):
22 |         ''' Reset weights '''
23 |
24 |         self.weights[...] = np.random.uniform(-.5, .5, self.weights.shape)
25 |
26 |     def propagate_forward(self, data):
27 |         ''' Propagate data from input layer to output layer. '''
28 |
29 |         # Set input layer (but not bias)
30 |         self.input[1:] = data
31 |         self.output[...] = f(np.dot(self.weights,self.input))
32 |
33 |         # Return output
34 |         return self.output
35 |
36 |     def propagate_backward(self, target, lrate=0.1):
37 |         ''' Back propagate error related to target using lrate. '''
38 |
39 |         error = np.atleast_2d(target-self.output)
40 |         input = np.atleast_2d(self.input)
41 |         self.weights += lrate*np.dot(error.T,input)
42 |
43 |         # Return error
44 |         return (error**2).sum()
45 |
46 |
47 | # -----------------------------------------------------------------------------
48 | if __name__ == '__main__':
49 |     import numpy as np
50 |     import matplotlib.pyplot as plt
51 |     import matplotlib.animation as animation
52 |
53 |     np.random.seed(123)
54 |
55 |     samples = np.zeros(100, dtype=[('input', float, 2),
56 |                                    ('output', float, 1)])
57 |
58 |     P = np.random.uniform(0.05,0.95,(len(samples),2))
59 |     samples["input"] = P
60 |     stars = np.where(P[:,0]+P[:,1] < 1)
61 |     discs = np.where(P[:,0]+P[:,1] > 1)
62 |     samples["output"][stars] = +1
63 |     samples["output"][discs] = 0
64 |
65 |
66 |     network = Perceptron(2,1)
67 |     network.reset()
68 |     lrate = 0.05
69 |
70 |     fig = plt.figure(figsize=(6,6))
71 |     ax = plt.subplot(1,1,1, aspect=1, frameon=False)
72 |     ax.scatter(P[stars,0], P[stars,1], color="red", marker="*", s=50, alpha=.5)
73 |     ax.scatter(P[discs,0], P[discs,1], color="blue", s=25, alpha=.5)
74 |     line, = ax.plot([], [], color="black", linewidth=2)
75 |     ax.set_xlim(0,1)
76 |     ax.set_xticks([])
77 |     ax.set_ylim(0,1)
78 |     ax.set_yticks([])
79 |     plt.tight_layout()
80 |
81 |     def animate(i):
82 |         global lrate
83 |         error = 0
84 |
85 |         count = 0
86 |         lrate *= 0.99
87 |         while error == 0 and count < 10:
88 |             n = np.random.randint(samples.size)
89 |             network.propagate_forward( samples['input'][n] )
90 |             error = network.propagate_backward( samples['output'][n], lrate )
91 |             count += 1
92 |
93 |         # Decision boundary: c + a*x + b*y = 0 (c is the bias weight)
94 |         c,a,b = network.weights[0]
95 |         x0, x1 = -2, +2
96 |         if b != 0:
97 |             y0 = (-c -a*x0)/b
98 |             y1 = (-c -a*x1)/b
99 |         else:
100 |             y0 = 0
101 |             y1 = 1
102 |
103 |         line.set_xdata([x0,x1])
104 |         line.set_ydata([y0,y1])
105 |
106 |         return line,
107 |
108 |     anim = animation.FuncAnimation(fig, animate, np.arange(1, 300))
109 |     #Writer = animation.writers['ffmpeg']
110 |     #writer = Writer(fps=30,
111 |     #                metadata=dict(artist='Nicolas P. Rougier'), bitrate=1800)
112 |     # anim.save('perceptron.mp4', writer=writer)
113 |     plt.show()
114 |
--------------------------------------------------------------------------------
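The sample set built in perceptron.py can also be labelled with a single boolean mask instead of two `np.where` calls. The sketch below is one possible variant, not the author's code: the `rng` generator and the `below` mask are names introduced here for illustration, and the random draws differ from those produced by `np.random.seed(123)`.

```python
import numpy as np

# Assumption: a local Generator replaces the global np.random.seed(123) call.
rng = np.random.default_rng(123)

# One record per sample: a 2-D input point and a scalar target.
samples = np.zeros(100, dtype=[("input", float, (2,)), ("output", float)])
samples["input"] = rng.uniform(0.05, 0.95, (len(samples), 2))

# Points below the line x + y = 1 get target 1 ("stars"), all others 0 ("discs").
below = samples["input"].sum(axis=1) < 1
samples["output"] = below.astype(float)

print(int(below.sum()), "stars,", int((~below).sum()), "discs")
```

The mask can be reused for plotting, for example `samples["input"][below]` selects the star points.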
/random-walk.py:
--------------------------------------------------------------------------------
1 | # -----------------------------------------------------------------------------
2 | # Copyright (C) 2018 Nicolas P. Rougier
3 | # Distributed under the terms of the BSD License.
4 | # -----------------------------------------------------------------------------
5 | import random
6 | import numpy as np
7 | from tools import timeit
8 |
9 | def random_walk_slow(n):
10 |     position = 0
11 |     walk = [position]
12 |     for i in range(n):
13 |         position += 2*random.randint(0, 1)-1
14 |         walk.append(position)
15 |     return walk
16 |
17 |
18 | def random_walk_faster(n=1000):
19 |     from itertools import accumulate
20 |     # Only available from Python 3.6
21 |     steps = random.choices([-1,+1], k=n)
22 |     return [0]+list(accumulate(steps))
23 |
24 | def random_walk_fastest(n=1000):
25 |     steps = np.random.choice([-1,+1], n)
26 |     return np.cumsum(steps)
27 |
28 |
29 | if __name__ == '__main__':
30 |
31 |     timeit("random_walk_slow(1000)", globals())
32 |     timeit("random_walk_faster(1000)", globals())
33 |     timeit("random_walk_fastest(1000)", globals())
34 |
--------------------------------------------------------------------------------
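In random-walk.py the two pure-Python walks start with the position 0, while the NumPy version returns only the cumulative sums. A minimal sketch of a NumPy walk that also includes the starting position is given below. The function name `random_walk_numpy` and the `seed` parameter are hypothetical and not part of the repository.

```python
import numpy as np

def random_walk_numpy(n=1000, seed=None):
    """Return a walk of n steps as an array of n+1 positions, starting at 0."""
    rng = np.random.default_rng(seed)
    steps = rng.choice([-1, 1], size=n)
    # Prepend the starting position so the result matches the list-based walks.
    return np.concatenate(([0], np.cumsum(steps)))

walk = random_walk_numpy(1000, seed=0)
print(len(walk), walk[0])
```

The extra concatenation copies the array once, which is usually negligible compared to generating the steps.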
/reorder.py:
--------------------------------------------------------------------------------
1 | # -----------------------------------------------------------------------------
2 | # Copyright (C) 2018 Nicolas P. Rougier
3 | # Distributed under the terms of the BSD License.
4 | # -----------------------------------------------------------------------------
5 | import struct
6 | import numpy as np
7 |
8 | # Generation of the array
9 | # Z = range(1001, 1009)
10 | # L = np.reshape(Z, (2,2,2), order="F").ravel().astype(">i8").view(np.ubyte)
11 |
12 | L = [ 0, 0, 0, 0, 0, 0, 3, 233,
13 | 0, 0, 0, 0, 0, 0, 3, 237,
14 | 0, 0, 0, 0, 0, 0, 3, 235,
15 | 0, 0, 0, 0, 0, 0, 3, 239,
16 | 0, 0, 0, 0, 0, 0, 3, 234,
17 | 0, 0, 0, 0, 0, 0, 3, 238,
18 | 0, 0, 0, 0, 0, 0, 3, 236,
19 | 0, 0, 0, 0, 0, 0, 3, 240]
20 |
21 | # Automatic (numpy)
22 | Z = np.reshape(np.array(L, dtype=np.ubyte).view(dtype=">i8"), (2,2,2), order="F")
23 | print(Z[1,0,0])
24 |
25 | # Manual (brain)
26 | shape = (2,2,2)
27 | itemsize = 8
28 | # We can probably do better
29 | strides = itemsize, itemsize*shape[0], itemsize*shape[0]*shape[1]
30 | index = (1,0,0)
31 | start = sum(i*s for i,s in zip(index,strides))
32 | end = start+itemsize
33 | value = struct.unpack(">q", bytes(L[start:end]))[0]
34 | print(value)
35 |
--------------------------------------------------------------------------------
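The manual lookup in reorder.py hard-codes the strides for a (2,2,2) array. The sketch below generalises the same idea to any shape, assuming a Fortran-ordered buffer of big-endian signed 64-bit integers. `fortran_strides` and `read_item` are hypothetical helper names, and the buffer is built here from a fresh array rather than from the list `L` in the file.

```python
import struct
import numpy as np

def fortran_strides(shape, itemsize):
    """Byte strides of a Fortran-ordered array: the first axis varies fastest."""
    strides, step = [], itemsize
    for dim in shape:
        strides.append(step)
        step *= dim
    return tuple(strides)

def read_item(buffer, shape, index, fmt=">q"):
    """Read one item from a raw Fortran-ordered buffer without NumPy."""
    strides = fortran_strides(shape, struct.calcsize(fmt))
    offset = sum(i * s for i, s in zip(index, strides))
    return struct.unpack_from(fmt, buffer, offset)[0]

Z = np.arange(1001, 1009).astype(">i8").reshape((2, 2, 2), order="F")
buffer = Z.tobytes(order="F")
print(read_item(buffer, Z.shape, (1, 0, 0)), Z[1, 0, 0])
```

Both printed values should agree, since the byte offset is computed from the same strides NumPy uses for a Fortran-contiguous array.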
/repeat.py:
--------------------------------------------------------------------------------
1 | # -----------------------------------------------------------------------------
2 | # Copyright (C) 2018 Nicolas P. Rougier
3 | # Distributed under the terms of the BSD License.
4 | # -----------------------------------------------------------------------------
5 | import numpy as np
6 | from numpy.lib.stride_tricks import as_strided
7 |
8 | Z = np.zeros(5)
9 | Z1 = np.tile(Z,(3,1))
10 | Z2 = as_strided(Z, shape=(3,)+Z.shape, strides=(0,)+Z.strides)
11 |
12 | # Real repeat (three times the memory)
13 | Z1[0,0] = 1
14 | print(Z1)
15 |
16 | # Fake repeat (zero stride: no extra memory, but every row aliases Z)
17 | Z2[0,0] = 1
18 | print(Z2)
19 |
--------------------------------------------------------------------------------
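The difference between the two repeats in repeat.py can be checked directly. The sketch below compares `np.tile`, the zero-stride `as_strided` view, and `np.broadcast_to`, which is offered here as a read-only alternative and is not used in the original file. The variable names are chosen for illustration.

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

Z = np.zeros(5)
tiled = np.tile(Z, (3, 1))                      # real copy, 3x the memory
strided = as_strided(Z, shape=(3,) + Z.shape,
                     strides=(0,) + Z.strides)  # view, rows share Z's data
broadcast = np.broadcast_to(Z, (3,) + Z.shape)  # same layout, but read-only

# Writing through the zero-stride view changes Z itself,
# so the new value shows up in every row of the view.
strided[0, 0] = 1

print(Z)
print(strided[:, 0])
print(np.shares_memory(Z, strided), np.shares_memory(Z, tiled))
```

Because a write through `strided` touches memory shared by all rows, `np.broadcast_to` is often the safer choice when the repeated array only needs to be read.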
/strides.py:
--------------------------------------------------------------------------------
1 | # -----------------------------------------------------------------------------
2 | # Copyright (C) 2018 Nicolas P. Rougier
3 | # Distributed under the terms of the BSD License.
4 | # -----------------------------------------------------------------------------
5 | import numpy as np
6 |
7 | def strides(Z):
8 |     strides = [Z.itemsize]
9 |
10 |     # Fortran ordered array
11 |     if np.isfortran(Z):
12 |         for i in range(0, Z.ndim-1):
13 |             strides.append(strides[-1] * Z.shape[i])
14 |         return tuple(strides)
15 |     # C ordered array
16 |     else:
17 |         for i in range(Z.ndim-1, 0, -1):
18 |             strides.append(strides[-1] * Z.shape[i])
19 |         return tuple(strides[::-1])
20 |
21 | # This works
22 | Z = np.arange(24).reshape((2,3,4), order="C")
23 | print(Z.strides, " – ", strides(Z))
24 |
25 | Z = np.arange(24).reshape((2,3,4), order="F")
26 | print(Z.strides, " – ", strides(Z))
27 |
28 | # This does not work (the sliced view is no longer contiguous)
29 | # Z = Z[::2]
30 | # print(Z.strides, " – ", strides(Z))
31 |
32 |
--------------------------------------------------------------------------------
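The `strides` function in strides.py derives strides from the shape alone, which only holds for contiguous arrays. The sketch below makes that assumption explicit by comparing the shape-derived strides with the real ones. `contiguous_strides` and `has_contiguous_strides` are hypothetical names introduced for this illustration.

```python
import numpy as np

def contiguous_strides(shape, itemsize, order="C"):
    """Strides a contiguous array of this shape would have (in bytes)."""
    dims = shape if order == "F" else shape[::-1]
    strides, step = [], itemsize
    for dim in dims:
        strides.append(step)
        step *= dim
    return tuple(strides) if order == "F" else tuple(strides[::-1])

def has_contiguous_strides(Z):
    """True when the strides of Z can be recovered from its shape alone."""
    order = "F" if np.isfortran(Z) else "C"
    return Z.strides == contiguous_strides(Z.shape, Z.itemsize, order)

C = np.arange(24).reshape((2, 3, 4), order="C")
F = np.arange(24).reshape((2, 3, 4), order="F")
view = C[:, ::2]   # skips every other row, so memory is no longer contiguous

for name, array in (("C", C), ("F", F), ("view", view)):
    print(name, array.strides, has_contiguous_strides(array))
```

For the sliced view the check fails, because slicing changes the strides while the shape-based formula still assumes tightly packed data.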
/tools.py:
--------------------------------------------------------------------------------
1 | # -----------------------------------------------------------------------------
2 | # Copyright (C) 2018 Nicolas P. Rougier
3 | # Distributed under the terms of the BSD License.
4 | # -----------------------------------------------------------------------------
5 | from imshow import imshow
6 |
7 | def sysinfo():
8 |     import sys
9 |     import time
10 |     import numpy as np
11 |     import scipy as sp
12 |     import matplotlib
13 |
14 |     print("Date: %s" % (time.strftime("%D")))
15 |     version = sys.version_info
16 |     major, minor, micro = version.major, version.minor, version.micro
17 |     print("Python: %d.%d.%d" % (major, minor, micro))
18 |     print("Numpy: ", np.__version__)
19 |     print("Scipy: ", sp.__version__)
20 |     print("Matplotlib:", matplotlib.__version__)
21 |
22 |
23 | def timeit(stmt, globals=globals()):
24 |     import numpy as np
25 |     import timeit as _timeit
26 |
27 |     print("Timing '{0}'".format(stmt))
28 |
29 |     # Rough estimate of the time per call, averaged over 10 runs
30 |     trial = _timeit.timeit(stmt, globals=globals, number=10)/10
31 |
32 |     # Target total duration (seconds)
33 |     duration = 5.0
34 |
35 |     # Number of repeats
36 |     repeat = 7
37 |
38 |     # Compute number of trials per repeat, rounded up to a power of 10
39 |     number = max(1,int(10**np.ceil(np.log((duration/repeat)/trial)/np.log(10))))
40 |
41 |     # Report mean and standard deviation over all repeats
42 |     times = _timeit.repeat(stmt, globals=globals, number=number, repeat=repeat)
43 |     times = np.array(times)/number
44 |     mean = np.mean(times)
45 |     std = np.std(times)
46 |
47 |     # Display results
48 |     units = {"s": 1, "ms": 1e-3, "us": 1e-6, "ns": 1e-9}
49 |     for key,value in units.items():
50 |         unit, factor = key, 1/value
51 |         if mean > value: break
52 |     mean *= factor
53 |     std *= factor
54 |
55 |     print("%.3g %s ± %.3g %s per loop (mean ± std. dev. of %d runs, %d loops each)" %
56 |           (mean, unit, std, unit, repeat, number))
57 |
58 |
59 | def info(Z):
60 |     import sys
61 |     import numpy as np
62 |     endianness = {'=': 'native (%s)' % sys.byteorder,
63 |                   '<': 'little',
64 |                   '>': 'big',
65 |                   '|': 'not applicable'}
66 |
67 |     print("------------------------------")
68 |     print("Interface (item)")
69 |     print(" shape: ", Z.shape)
70 |     print(" dtype: ", Z.dtype)
71 |     print(" length: ", len(Z))
72 |     print(" size: ", Z.size)
73 |     print(" endianness: ", endianness[Z.dtype.byteorder])
74 |     if np.isfortran(Z):
75 |         print(" order: ☐ C ☑ Fortran")
76 |     else:
77 |         print(" order: ☑ C ☐ Fortran")
78 |     print("")
79 |     print("Memory (byte)")
80 |     print(" item size: ", Z.itemsize)
81 |     print(" array size: ", Z.size*Z.itemsize)
82 |     print(" strides: ", Z.strides)
83 |     print("")
84 |     print("Properties")
85 |     if Z.flags["OWNDATA"]:
86 |         print(" own data: ☑ Yes ☐ No")
87 |     else:
88 |         print(" own data: ☐ Yes ☑ No")
89 |     if Z.flags["WRITEABLE"]:
90 |         print(" writeable: ☑ Yes ☐ No")
91 |     else:
92 |         print(" writeable: ☐ Yes ☑ No")
93 |     if np.isfortran(Z) and Z.flags["F_CONTIGUOUS"]:
94 |         print(" contiguous: ☑ Yes ☐ No")
95 |     elif not np.isfortran(Z) and Z.flags["C_CONTIGUOUS"]:
96 |         print(" contiguous: ☑ Yes ☐ No")
97 |     else:
98 |         print(" contiguous: ☐ Yes ☑ No")
99 |     if Z.flags["ALIGNED"]:
100 |         print(" aligned: ☑ Yes ☐ No")
101 |     else:
102 |         print(" aligned: ☐ Yes ☑ No")
103 |     print("------------------------------")
104 |     print()
105 |
106 |
107 | if __name__ == '__main__':
108 |     import numpy as np
109 |
110 |     sysinfo()
111 |
112 |     Z = np.arange(9).reshape(3,3)
113 |     info(Z)
114 |
115 |     timeit("Z=np.random.uniform(0,1,1000000)", globals())
116 |
117 |
118 |
--------------------------------------------------------------------------------
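The unit selection at the end of `timeit` in tools.py can be isolated into a small helper. The sketch below is one way to write it. `format_duration` is a hypothetical name, and the use of `>=` rather than `>` at the unit boundary is a choice made here, not taken from the original.

```python
def format_duration(seconds):
    """Format a duration with the largest unit that keeps the value >= 1."""
    units = (("s", 1.0), ("ms", 1e-3), ("us", 1e-6), ("ns", 1e-9))
    for unit, scale in units:
        if seconds >= scale:
            break
    # If no unit matched, the loop leaves the smallest one (ns) selected.
    return "%.3g %s" % (seconds / scale, unit)

for value in (2.5, 0.0032, 1.5e-6, 5e-10):
    print(format_duration(value))
```

Using a tuple of pairs rather than a dict keeps the unit order explicit, so the helper does not depend on dict insertion order.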