├── .gitignore
├── 01-tensor_tutorial.ipynb
├── 02-space_stretching.ipynb
├── 03-autograd_tutorial.ipynb
├── 04-spiral_classification.ipynb
├── 05-convnet.ipynb
├── 06-autoencoder.ipynb
├── 07-VAE.ipynb
├── 08-1-classify_seq_data.ipynb
├── 08-2-echo_data.ipynb
├── 08-3-temporal_order_classification_experiments.ipynb
├── 08-4-echo_experiments.ipynb
├── README.md
├── conda-envt.yml
├── img
│   └── train.gif
├── plot_conf.py
├── raw
│   ├── keras-regularisation.ipynb
│   └── keras-sequences
│       ├── 0_1_classify_seq_data.ipynb
│       ├── 0_2_echo_data.ipynb
│       ├── 1_1_temporal_order_classification_experiments.ipynb
│       ├── 1_2_echo_experiments.ipynb
│       └── sequential_tasks.py
├── sequential_tasks.py
└── slides
    ├── 01 - ML and spiral classification.pdf
    ├── 02 - CNN.pdf
    ├── 03 - Generative models.pdf
    └── 04 - RNN.pdf

/.gitignore:
--------------------------------------------------------------------------------
1 | # Remove [I]Python caching
2 | __pycache__
3 | .ipynb_checkpoints
4 | 
5 | # Remove macOS clutter
6 | .DS_Store
7 | 
8 | # Remove Vim temp files
9 | *sw*
10 | 
--------------------------------------------------------------------------------
/01-tensor_tutorial.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "markdown",
5 | "metadata": {},
6 | "source": [
7 | "\n",
8 | "# What is PyTorch?\n",
9 | "\n",
10 | "It’s a Python-based scientific computing package targeted at two audiences:\n",
11 | "\n",
12 | "- A tensor library that uses the power of GPUs\n",
13 | "- A deep learning research platform that provides maximum flexibility and speed\n",
14 | "\n",
15 | "## Import the library"
16 | ]
17 | },
18 | {
19 | "cell_type": "code",
20 | "execution_count": null,
21 | "metadata": {},
22 | "outputs": [],
23 | "source": [
24 | "import torch  # <Ctrl> / <Shift> + <Enter>"
25 | ]
26 | },
27 | {
28 | "cell_type": "markdown",
29 | "metadata": {},
30 | "source": [
31 | "## Getting help in Jupyter"
32 | ]
33 | },
34 | {
35 | "cell_type": "code",
36 | "execution_count": null,
37 | "metadata": {},
38 | "outputs": [],
39 | "source": [
40 | "torch.sq  # <Tab>"
41 | ]
42 | },
43 | {
44 | "cell_type": "code",
45 | "execution_count": null,
46 | "metadata": {},
47 | "outputs": [],
48 | "source": [
49 | "# What about all `*Tensor`s?\n",
50 | "torch.*Tensor?"
51 | ]
52 | },
53 | {
54 | "cell_type": "code",
55 | "execution_count": null,
56 | "metadata": {},
57 | "outputs": [],
58 | "source": [
59 | "torch.nn.Module()  # <Shift> + <Tab>"
60 | ]
61 | },
62 | {
63 | "cell_type": "code",
64 | "execution_count": null,
65 | "metadata": {},
66 | "outputs": [],
67 | "source": [
68 | "# Annotate your functions / classes!\n",
69 | "torch.nn.Module?"
70 | ]
71 | },
72 | {
73 | "cell_type": "code",
74 | "execution_count": null,
75 | "metadata": {},
76 | "outputs": [],
77 | "source": [
78 | "torch.nn.Module??"
79 | ]
80 | },
81 | {
82 | "cell_type": "markdown",
83 | "metadata": {},
84 | "source": [
85 | "## Dropping to Bash: magic!"
86 | ]
87 | },
88 | {
89 | "cell_type": "code",
90 | "execution_count": null,
91 | "metadata": {
92 | "scrolled": true
93 | },
94 | "outputs": [],
95 | "source": [
96 | "! 
ls -lh" 97 | ] 98 | }, 99 | { 100 | "cell_type": "code", 101 | "execution_count": null, 102 | "metadata": {}, 103 | "outputs": [], 104 | "source": [ 105 | "%%bash\n", 106 | "for f in $(ls *.*); do\n", 107 | " echo $(wc -l $f)\n", 108 | "done" 109 | ] 110 | }, 111 | { 112 | "cell_type": "code", 113 | "execution_count": null, 114 | "metadata": {}, 115 | "outputs": [], 116 | "source": [ 117 | "# Help?\n", 118 | "%%bash?" 119 | ] 120 | }, 121 | { 122 | "cell_type": "code", 123 | "execution_count": null, 124 | "metadata": {}, 125 | "outputs": [], 126 | "source": [ 127 | "# Getting some general help\n", 128 | "%magic" 129 | ] 130 | }, 131 | { 132 | "cell_type": "markdown", 133 | "metadata": {}, 134 | "source": [ 135 | "## Python native data types\n", 136 | "\n", 137 | "Python has many native datatypes. Here are the important ones:\n", 138 | "\n", 139 | " - **Booleans** are either `True` or `False`.\n", 140 | " - **Numbers** can be integers (1 and 2), floats (1.1 and 1.2), fractions (1/2 and 2/3), or even complex numbers.\n", 141 | " - **Strings** are sequences of Unicode characters, e.g. an html document.\n", 142 | " - **Lists** are ordered sequences of values.\n", 143 | " - **Tuples** are ordered, immutable sequences of values.\n", 144 | " - **Sets** are unordered bags of values.\n", 145 | " - **Dictionaries** are unordered bags of key-value pairs.\n", 146 | " \n", 147 | "See [here](http://www.diveintopython3.net/native-datatypes.html) for a complete overview.\n", 148 | "\n", 149 | "### More resources\n", 150 | "\n", 151 | " 1. Brief Python introduction [here](https://learnxinyminutes.com/docs/python3/).\n", 152 | " 2. Full Python tutorial [here](https://docs.python.org/3/tutorial/).\n", 153 | " 3. A Whirlwind Tour of Python [here](https://github.com/jakevdp/WhirlwindTourOfPython).\n", 154 | " 4. Python Data Science Handbook [here](https://github.com/jakevdp/PythonDataScienceHandbook)." 155 | ] 156 | }, 157 | { 158 | "cell_type": "markdown", 159 | "metadata": {}, 160 | "source": [ 161 | "## Torch!" 
162 | ] 163 | }, 164 | { 165 | "cell_type": "code", 166 | "execution_count": null, 167 | "metadata": {}, 168 | "outputs": [], 169 | "source": [ 170 | "t = torch.Tensor(2, 3, 4)\n", 171 | "type(t)" 172 | ] 173 | }, 174 | { 175 | "cell_type": "code", 176 | "execution_count": null, 177 | "metadata": {}, 178 | "outputs": [], 179 | "source": [ 180 | "t.size()" 181 | ] 182 | }, 183 | { 184 | "cell_type": "code", 185 | "execution_count": null, 186 | "metadata": {}, 187 | "outputs": [], 188 | "source": [ 189 | "# t.size() is a classic tuple =>\n", 190 | "print('t size:', ' \\u00D7 '.join(map(str, t.size())))" 191 | ] 192 | }, 193 | { 194 | "cell_type": "code", 195 | "execution_count": null, 196 | "metadata": {}, 197 | "outputs": [], 198 | "source": [ 199 | "print(f'point in a {t.numel()} dimensional space')\n", 200 | "print(f'organised in {t.dim()} sub-dimensions')" 201 | ] 202 | }, 203 | { 204 | "cell_type": "code", 205 | "execution_count": null, 206 | "metadata": {}, 207 | "outputs": [], 208 | "source": [ 209 | "t" 210 | ] 211 | }, 212 | { 213 | "cell_type": "code", 214 | "execution_count": null, 215 | "metadata": {}, 216 | "outputs": [], 217 | "source": [ 218 | "# Mind the underscore!\n", 219 | "t.random_(10)" 220 | ] 221 | }, 222 | { 223 | "cell_type": "code", 224 | "execution_count": null, 225 | "metadata": {}, 226 | "outputs": [], 227 | "source": [ 228 | "t" 229 | ] 230 | }, 231 | { 232 | "cell_type": "code", 233 | "execution_count": null, 234 | "metadata": {}, 235 | "outputs": [], 236 | "source": [ 237 | "r = torch.Tensor(t)\n", 238 | "r.resize_(3, 8)\n", 239 | "r" 240 | ] 241 | }, 242 | { 243 | "cell_type": "code", 244 | "execution_count": null, 245 | "metadata": {}, 246 | "outputs": [], 247 | "source": [ 248 | "r.zero_()" 249 | ] 250 | }, 251 | { 252 | "cell_type": "code", 253 | "execution_count": null, 254 | "metadata": {}, 255 | "outputs": [], 256 | "source": [ 257 | "t" 258 | ] 259 | }, 260 | { 261 | "cell_type": "code", 262 | "execution_count": null, 263 | "metadata": {}, 264 | "outputs": [], 265 | "source": [ 266 | "# This *is* important, sigh...\n", 267 | "s = r.clone()" 268 | ] 269 | }, 270 | { 271 | "cell_type": "code", 272 | "execution_count": null, 273 | "metadata": {}, 274 | "outputs": [], 275 | "source": [ 276 | "s.fill_(1)\n", 277 | "s" 278 | ] 279 | }, 280 | { 281 | "cell_type": "code", 282 | "execution_count": null, 283 | "metadata": {}, 284 | "outputs": [], 285 | "source": [ 286 | "r" 287 | ] 288 | }, 289 | { 290 | "cell_type": "markdown", 291 | "metadata": {}, 292 | "source": [ 293 | "## Vectors (1D Tensors)" 294 | ] 295 | }, 296 | { 297 | "cell_type": "code", 298 | "execution_count": null, 299 | "metadata": {}, 300 | "outputs": [], 301 | "source": [ 302 | "v = torch.Tensor([1, 2, 3, 4]); v" 303 | ] 304 | }, 305 | { 306 | "cell_type": "code", 307 | "execution_count": null, 308 | "metadata": {}, 309 | "outputs": [], 310 | "source": [ 311 | "print(f'dim: {v.dim()}, size: {v.size()[0]}')" 312 | ] 313 | }, 314 | { 315 | "cell_type": "code", 316 | "execution_count": null, 317 | "metadata": {}, 318 | "outputs": [], 319 | "source": [ 320 | "w = torch.Tensor([1, 0, 2, 0]); w" 321 | ] 322 | }, 323 | { 324 | "cell_type": "code", 325 | "execution_count": null, 326 | "metadata": {}, 327 | "outputs": [], 328 | "source": [ 329 | "# Element-wise multiplication\n", 330 | "v * w" 331 | ] 332 | }, 333 | { 334 | "cell_type": "code", 335 | "execution_count": null, 336 | "metadata": {}, 337 | "outputs": [], 338 | "source": [ 339 | "# Scalar product: 1*1 + 2*0 + 3*2 + 4*0\n", 340 | "v @ w" 341 
| ] 342 | }, 343 | { 344 | "cell_type": "code", 345 | "execution_count": null, 346 | "metadata": {}, 347 | "outputs": [], 348 | "source": [ 349 | "x = torch.Tensor(5).random_(10); x" 350 | ] 351 | }, 352 | { 353 | "cell_type": "code", 354 | "execution_count": null, 355 | "metadata": {}, 356 | "outputs": [], 357 | "source": [ 358 | "print(f'first: {x[0]}, last: {x[-1]}')" 359 | ] 360 | }, 361 | { 362 | "cell_type": "code", 363 | "execution_count": null, 364 | "metadata": {}, 365 | "outputs": [], 366 | "source": [ 367 | "# Extract sub-Tensor [from:to)\n", 368 | "x[1:2 + 1]" 369 | ] 370 | }, 371 | { 372 | "cell_type": "code", 373 | "execution_count": null, 374 | "metadata": {}, 375 | "outputs": [], 376 | "source": [ 377 | "v" 378 | ] 379 | }, 380 | { 381 | "cell_type": "code", 382 | "execution_count": null, 383 | "metadata": {}, 384 | "outputs": [], 385 | "source": [ 386 | "v = torch.arange(1, 4 + 1); v" 387 | ] 388 | }, 389 | { 390 | "cell_type": "code", 391 | "execution_count": null, 392 | "metadata": {}, 393 | "outputs": [], 394 | "source": [ 395 | "print(v.pow(2), v)" 396 | ] 397 | }, 398 | { 399 | "cell_type": "code", 400 | "execution_count": null, 401 | "metadata": {}, 402 | "outputs": [], 403 | "source": [ 404 | "print(v.pow_(2), v)" 405 | ] 406 | }, 407 | { 408 | "cell_type": "markdown", 409 | "metadata": {}, 410 | "source": [ 411 | "## Matrices (2D Tensors)" 412 | ] 413 | }, 414 | { 415 | "cell_type": "code", 416 | "execution_count": null, 417 | "metadata": {}, 418 | "outputs": [], 419 | "source": [ 420 | "m = torch.Tensor([[2, 5, 3, 7],\n", 421 | " [4, 2, 1, 9]]); m" 422 | ] 423 | }, 424 | { 425 | "cell_type": "code", 426 | "execution_count": null, 427 | "metadata": {}, 428 | "outputs": [], 429 | "source": [ 430 | "m.dim()" 431 | ] 432 | }, 433 | { 434 | "cell_type": "code", 435 | "execution_count": null, 436 | "metadata": {}, 437 | "outputs": [], 438 | "source": [ 439 | "print(m.size(0), m.size(1), m.size(), sep=' -- ')" 440 | ] 441 | }, 442 | { 443 | "cell_type": "code", 444 | "execution_count": null, 445 | "metadata": {}, 446 | "outputs": [], 447 | "source": [ 448 | "m.numel()" 449 | ] 450 | }, 451 | { 452 | "cell_type": "code", 453 | "execution_count": null, 454 | "metadata": {}, 455 | "outputs": [], 456 | "source": [ 457 | "m[0][2]" 458 | ] 459 | }, 460 | { 461 | "cell_type": "code", 462 | "execution_count": null, 463 | "metadata": {}, 464 | "outputs": [], 465 | "source": [ 466 | "m[0, 2]" 467 | ] 468 | }, 469 | { 470 | "cell_type": "code", 471 | "execution_count": null, 472 | "metadata": {}, 473 | "outputs": [], 474 | "source": [ 475 | "m[:, 1]" 476 | ] 477 | }, 478 | { 479 | "cell_type": "code", 480 | "execution_count": null, 481 | "metadata": {}, 482 | "outputs": [], 483 | "source": [ 484 | "m[:, [1]]" 485 | ] 486 | }, 487 | { 488 | "cell_type": "code", 489 | "execution_count": null, 490 | "metadata": {}, 491 | "outputs": [], 492 | "source": [ 493 | "m[[0], :]" 494 | ] 495 | }, 496 | { 497 | "cell_type": "code", 498 | "execution_count": null, 499 | "metadata": {}, 500 | "outputs": [], 501 | "source": [ 502 | "m[0, :]" 503 | ] 504 | }, 505 | { 506 | "cell_type": "code", 507 | "execution_count": null, 508 | "metadata": {}, 509 | "outputs": [], 510 | "source": [ 511 | "v = torch.arange(1, 4 + 1); v" 512 | ] 513 | }, 514 | { 515 | "cell_type": "code", 516 | "execution_count": null, 517 | "metadata": {}, 518 | "outputs": [], 519 | "source": [ 520 | "m @ v" 521 | ] 522 | }, 523 | { 524 | "cell_type": "code", 525 | "execution_count": null, 526 | "metadata": {}, 527 | "outputs": [], 
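# ---- Added note (editor's sketch) ---------------------------------------------
# A hedged summary of the indexing cells above, for the 2 x 4 matrix m:
#     m[0][2] and m[0, 2]  ->  the same scalar entry (the latter is idiomatic)
#     m[:, 1]              ->  1-D tensor of size 2 (integer index drops a dim)
#     m[:, [1]]            ->  2 x 1 matrix (list index keeps the dim)
#     m[[0], :]            ->  1 x 4 matrix, whereas m[0, :] is 1-D of size 4
# --------------------------------------------------------------------------------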
528 | "source": [ 529 | "m[[0], :] @ v" 530 | ] 531 | }, 532 | { 533 | "cell_type": "code", 534 | "execution_count": null, 535 | "metadata": {}, 536 | "outputs": [], 537 | "source": [ 538 | "m[[1], :] @ v" 539 | ] 540 | }, 541 | { 542 | "cell_type": "code", 543 | "execution_count": null, 544 | "metadata": {}, 545 | "outputs": [], 546 | "source": [ 547 | "m + torch.rand(2, 4)" 548 | ] 549 | }, 550 | { 551 | "cell_type": "code", 552 | "execution_count": null, 553 | "metadata": {}, 554 | "outputs": [], 555 | "source": [ 556 | "m - torch.rand(2, 4)" 557 | ] 558 | }, 559 | { 560 | "cell_type": "code", 561 | "execution_count": null, 562 | "metadata": {}, 563 | "outputs": [], 564 | "source": [ 565 | "m * torch.rand(2, 4)" 566 | ] 567 | }, 568 | { 569 | "cell_type": "code", 570 | "execution_count": null, 571 | "metadata": {}, 572 | "outputs": [], 573 | "source": [ 574 | "m / torch.rand(2, 4)" 575 | ] 576 | }, 577 | { 578 | "cell_type": "code", 579 | "execution_count": null, 580 | "metadata": {}, 581 | "outputs": [], 582 | "source": [ 583 | "m.t()" 584 | ] 585 | }, 586 | { 587 | "cell_type": "code", 588 | "execution_count": null, 589 | "metadata": {}, 590 | "outputs": [], 591 | "source": [ 592 | "# Same as\n", 593 | "m.transpose(0, 1)" 594 | ] 595 | }, 596 | { 597 | "cell_type": "markdown", 598 | "metadata": {}, 599 | "source": [ 600 | "## Constructors" 601 | ] 602 | }, 603 | { 604 | "cell_type": "code", 605 | "execution_count": null, 606 | "metadata": {}, 607 | "outputs": [], 608 | "source": [ 609 | "torch.arange(3, 8 + 1)" 610 | ] 611 | }, 612 | { 613 | "cell_type": "code", 614 | "execution_count": null, 615 | "metadata": {}, 616 | "outputs": [], 617 | "source": [ 618 | "torch.arange(5.7, -3, -2.1)" 619 | ] 620 | }, 621 | { 622 | "cell_type": "code", 623 | "execution_count": null, 624 | "metadata": {}, 625 | "outputs": [], 626 | "source": [ 627 | "torch.linspace(3, 8, 20).view(1, -1)" 628 | ] 629 | }, 630 | { 631 | "cell_type": "code", 632 | "execution_count": null, 633 | "metadata": {}, 634 | "outputs": [], 635 | "source": [ 636 | "torch.zeros(3, 5)" 637 | ] 638 | }, 639 | { 640 | "cell_type": "code", 641 | "execution_count": null, 642 | "metadata": {}, 643 | "outputs": [], 644 | "source": [ 645 | "torch.ones(3, 2, 5)" 646 | ] 647 | }, 648 | { 649 | "cell_type": "code", 650 | "execution_count": null, 651 | "metadata": {}, 652 | "outputs": [], 653 | "source": [ 654 | "torch.eye(3)" 655 | ] 656 | }, 657 | { 658 | "cell_type": "code", 659 | "execution_count": null, 660 | "metadata": {}, 661 | "outputs": [], 662 | "source": [ 663 | "# Pretty plotting config\n", 664 | "%run plot_conf.py" 665 | ] 666 | }, 667 | { 668 | "cell_type": "code", 669 | "execution_count": null, 670 | "metadata": {}, 671 | "outputs": [], 672 | "source": [ 673 | "plt_style()" 674 | ] 675 | }, 676 | { 677 | "cell_type": "code", 678 | "execution_count": null, 679 | "metadata": {}, 680 | "outputs": [], 681 | "source": [ 682 | "# Numpy bridge!\n", 683 | "plt.hist(torch.randn(1000).numpy(), 100);" 684 | ] 685 | }, 686 | { 687 | "cell_type": "code", 688 | "execution_count": null, 689 | "metadata": {}, 690 | "outputs": [], 691 | "source": [ 692 | "plt.hist(torch.randn(10**6).numpy(), 100); # how much does this chart weight?\n", 693 | "# use rasterized=True for SVG/EPS/PDF!" 
694 | ]
695 | },
696 | {
697 | "cell_type": "code",
698 | "execution_count": null,
699 | "metadata": {},
700 | "outputs": [],
701 | "source": [
702 | "plt.hist(torch.rand(10**6).numpy(), 100);"
703 | ]
704 | },
705 | {
706 | "cell_type": "markdown",
707 | "metadata": {},
708 | "source": [
709 | "## Casting"
710 | ]
711 | },
712 | {
713 | "cell_type": "code",
714 | "execution_count": null,
715 | "metadata": {},
716 | "outputs": [],
717 | "source": [
718 | "torch.*Tensor?"
719 | ]
720 | },
721 | {
722 | "cell_type": "code",
723 | "execution_count": null,
724 | "metadata": {},
725 | "outputs": [],
726 | "source": [
727 | "m"
728 | ]
729 | },
730 | {
731 | "cell_type": "code",
732 | "execution_count": null,
733 | "metadata": {},
734 | "outputs": [],
735 | "source": [
736 | "m.double()"
737 | ]
738 | },
739 | {
740 | "cell_type": "code",
741 | "execution_count": null,
742 | "metadata": {},
743 | "outputs": [],
744 | "source": [
745 | "m.byte()"
746 | ]
747 | },
748 | {
749 | "cell_type": "code",
750 | "execution_count": null,
751 | "metadata": {},
752 | "outputs": [],
753 | "source": [
754 | "device = torch.device(\"cuda:0\" if torch.cuda.is_available() else \"cpu\")\n",
755 | "m.to(device)"
756 | ]
757 | },
758 | {
759 | "cell_type": "code",
760 | "execution_count": null,
761 | "metadata": {},
762 | "outputs": [],
763 | "source": [
764 | "m_np = m.numpy(); m_np"
765 | ]
766 | },
767 | {
768 | "cell_type": "code",
769 | "execution_count": null,
770 | "metadata": {},
771 | "outputs": [],
772 | "source": [
773 | "m_np[0, 0] = -1; m_np"
774 | ]
775 | },
776 | {
777 | "cell_type": "code",
778 | "execution_count": null,
779 | "metadata": {},
780 | "outputs": [],
781 | "source": [
782 | "m"
783 | ]
784 | },
785 | {
786 | "cell_type": "code",
787 | "execution_count": null,
788 | "metadata": {},
789 | "outputs": [],
790 | "source": [
791 | "n_np = np.arange(5)\n",
792 | "n = torch.from_numpy(n_np)\n",
793 | "print(n_np, n)"
794 | ]
795 | },
796 | {
797 | "cell_type": "code",
798 | "execution_count": null,
799 | "metadata": {},
800 | "outputs": [],
801 | "source": [
802 | "n.mul_(2)\n",
803 | "n_np"
804 | ]
805 | },
806 | {
807 | "cell_type": "markdown",
808 | "metadata": {},
809 | "source": [
810 | "## More fun"
811 | ]
812 | },
813 | {
814 | "cell_type": "code",
815 | "execution_count": null,
816 | "metadata": {},
817 | "outputs": [],
818 | "source": [
819 | "a = torch.Tensor([[1, 2, 3, 4]])\n",
820 | "b = torch.Tensor([[5, 6, 7, 8]])\n",
821 | "print(a, b)"
822 | ]
823 | },
824 | {
825 | "cell_type": "code",
826 | "execution_count": null,
827 | "metadata": {},
828 | "outputs": [],
829 | "source": [
830 | "torch.cat((a, b), 0)"
831 | ]
832 | },
833 | {
834 | "cell_type": "code",
835 | "execution_count": null,
836 | "metadata": {},
837 | "outputs": [],
838 | "source": [
839 | "torch.cat((a, b), 1)"
840 | ]
841 | },
842 | {
843 | "cell_type": "markdown",
844 | "metadata": {},
845 | "source": [
846 | "## Much more\n",
847 | "\n",
848 | "There's definitely much more, but these were the basics of having fun with `Tensor`s.\n",
849 | "\n",
850 | "The full *Torch* API should be read at least once.\n",
851 | "Hence, go [here](http://pytorch.org/docs/0.3.0/torch.html).\n",
852 | "You'll find 100+ `Tensor` operations described, including transposing, indexing, slicing, mathematical operations, linear algebra, and random number generation."
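# ---- Added example (editor's sketch) ---------------------------------------------
# A hedged taste of the operation families mentioned above; shapes and values
# are arbitrary, chosen only for illustration.
a = torch.randn(3, 4)
a.t()                      # transposing
a[1, 2:4]                  # indexing and slicing
(a - a.mean()) / a.std()   # mathematical operations
torch.eye(3) @ a           # linear algebra: matrix product
torch.manual_seed(0)       # random numbers, reproducibly
torch.randn(2, 2)
# ------------------------------------------------------------------------------------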
853 | ] 854 | } 855 | ], 856 | "metadata": { 857 | "kernelspec": { 858 | "display_name": "Python 3", 859 | "language": "python", 860 | "name": "python3" 861 | }, 862 | "language_info": { 863 | "codemirror_mode": { 864 | "name": "ipython", 865 | "version": 3 866 | }, 867 | "file_extension": ".py", 868 | "mimetype": "text/x-python", 869 | "name": "python", 870 | "nbconvert_exporter": "python", 871 | "pygments_lexer": "ipython3", 872 | "version": "3.6.5" 873 | } 874 | }, 875 | "nbformat": 4, 876 | "nbformat_minor": 1 877 | } 878 | -------------------------------------------------------------------------------- /02-space_stretching.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "code", 5 | "execution_count": null, 6 | "metadata": {}, 7 | "outputs": [], 8 | "source": [ 9 | "# Pretty plotting config\n", 10 | "%run plot_conf.py" 11 | ] 12 | }, 13 | { 14 | "cell_type": "code", 15 | "execution_count": null, 16 | "metadata": {}, 17 | "outputs": [], 18 | "source": [ 19 | "# Set style (need to be in a new cell)\n", 20 | "plt_style()" 21 | ] 22 | }, 23 | { 24 | "cell_type": "code", 25 | "execution_count": null, 26 | "metadata": {}, 27 | "outputs": [], 28 | "source": [ 29 | "import torch\n", 30 | "import torch.nn as nn\n", 31 | "import matplotlib.pyplot as plt\n", 32 | "\n", 33 | "\n", 34 | "# utility function\n", 35 | "def show_scatterplot(X, norm=True, title=''):\n", 36 | " X = X.numpy()\n", 37 | " plt.figure()\n", 38 | " plt.axis('equal')\n", 39 | " plt.scatter(X[:, 0], X[:, 1], c=colors, s=p_size)\n", 40 | " if norm:\n", 41 | " plt.xlim(-6, 6)\n", 42 | " plt.ylim(-6, 6)\n", 43 | " plt.grid(True)\n", 44 | " plt.title(title)\n", 45 | "\n", 46 | "\n", 47 | "# generate some points in 2-D space\n", 48 | "n_points = 1000\n", 49 | "p_size = 30\n", 50 | "X = torch.randn(n_points, 2) \n", 51 | "colors = X[:, 0].numpy() \n", 52 | "\n", 53 | "show_scatterplot(X, norm=True, title='X')" 54 | ] 55 | }, 56 | { 57 | "cell_type": "markdown", 58 | "metadata": {}, 59 | "source": [ 60 | "# Visualizing Linear Transformations\n", 61 | "\n", 62 | "* Generate a random matrix $W$\n", 63 | "\n", 64 | "$\n", 65 | "\\begin{equation}\n", 66 | " W = U\n", 67 | " \\left[ {\\begin{array}{cc}\n", 68 | " s_1 & 0 \\\\\n", 69 | " 0 & s_2 \\\\\n", 70 | " \\end{array} } \\right]\n", 71 | " V^\\top\n", 72 | "\\end{equation}\n", 73 | "$\n", 74 | "* Compute $y = Wx$\n", 75 | "* Larger singular values stretch the points\n", 76 | "* Smaller singular values push them together\n", 77 | "* $U, V$ rotate/reflect" 78 | ] 79 | }, 80 | { 81 | "cell_type": "code", 82 | "execution_count": null, 83 | "metadata": { 84 | "scrolled": false 85 | }, 86 | "outputs": [], 87 | "source": [ 88 | "show_scatterplot(X, norm=True, title='X')\n", 89 | "\n", 90 | "for i in range(10):\n", 91 | " # create a random matrix\n", 92 | " W = torch.randn(2, 2)\n", 93 | " # transform points\n", 94 | " Y = torch.mm(X, W)\n", 95 | " # compute singular values\n", 96 | " U,S,V = torch.svd(W)\n", 97 | " # plot\n", 98 | " show_scatterplot(Y, norm=True, title='y = Wx, singular values : [{:.3f}, {:.3f}]'.format(S[0], S[1]))" 99 | ] 100 | }, 101 | { 102 | "cell_type": "markdown", 103 | "metadata": {}, 104 | "source": [ 105 | "# Linear transformation with PyTorch" 106 | ] 107 | }, 108 | { 109 | "cell_type": "code", 110 | "execution_count": null, 111 | "metadata": {}, 112 | "outputs": [], 113 | "source": [ 114 | "device = torch.device(\"cuda:0\" if torch.cuda.is_available() else \"cpu\")" 115 | ] 116 | 
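# ---- Added note (editor's sketch) ---------------------------------------------
# `model.to(device)` in the next cell moves only the model's parameters; on a
# CUDA machine the input has to be moved as well, or PyTorch raises a device
# mismatch error. A minimal hedged pattern:
#     model = model.to(device)
#     Y = model(X.to(device)).data.cpu()   # back to CPU for numpy/matplotlib
# --------------------------------------------------------------------------------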
}, 117 | { 118 | "cell_type": "code", 119 | "execution_count": null, 120 | "metadata": {}, 121 | "outputs": [], 122 | "source": [ 123 | "model = nn.Sequential(\n", 124 | " nn.Linear(2, 2, bias=False)\n", 125 | ")\n", 126 | "model.to(device)\n", 127 | "Y = model(X).data\n", 128 | "show_scatterplot(Y)" 129 | ] 130 | }, 131 | { 132 | "cell_type": "markdown", 133 | "metadata": {}, 134 | "source": [ 135 | "# Non-linear Transform: Map Points to a Square\n", 136 | "\n", 137 | "* Linear transforms can rotate, reflect, stretch and compress, but cannot curve\n", 138 | "* We need non-linearities for this\n", 139 | "* Can (approximately) map points to a square by first stretching out by a factor $s$, then squashing with a tanh function\n", 140 | "\n", 141 | "$\n", 142 | " f(x)= \\tanh \\left(\n", 143 | " \\left[ {\\begin{array}{cc}\n", 144 | " s & 0 \\\\\n", 145 | " 0 & s \\\\\n", 146 | " \\end{array} } \\right] \n", 147 | " x\n", 148 | " \\right)\n", 149 | "$" 150 | ] 151 | }, 152 | { 153 | "cell_type": "code", 154 | "execution_count": null, 155 | "metadata": {}, 156 | "outputs": [], 157 | "source": [ 158 | "z = torch.linspace(-10, 10, 101)\n", 159 | "s = torch.tanh(z)\n", 160 | "plt.plot(z.numpy(), s.numpy())\n", 161 | "plt.title('tanh() non linearity')" 162 | ] 163 | }, 164 | { 165 | "cell_type": "code", 166 | "execution_count": null, 167 | "metadata": { 168 | "scrolled": false 169 | }, 170 | "outputs": [], 171 | "source": [ 172 | "show_scatterplot(X, title='X')\n", 173 | "plt.axis('square')\n", 174 | "\n", 175 | "model = nn.Sequential(\n", 176 | " nn.Linear(2, 2, bias=False),\n", 177 | " nn.Tanh()\n", 178 | " )\n", 179 | "\n", 180 | "model.to(device)\n", 181 | "\n", 182 | "for s in range(1, 10):\n", 183 | " W = s * torch.eye(2)\n", 184 | " model[0].weight.data.copy_(W)\n", 185 | " Y = model(X).data\n", 186 | " show_scatterplot(Y, False, title='f(x), s={}'.format(s))\n", 187 | " plt.axis('square')\n", 188 | " plt.axis([-1.2, 1.2, -1.2, 1.2])" 189 | ] 190 | }, 191 | { 192 | "cell_type": "markdown", 193 | "metadata": { 194 | "collapsed": true 195 | }, 196 | "source": [ 197 | "# Visualize Functions Represented by Random Neural Networks" 198 | ] 199 | }, 200 | { 201 | "cell_type": "code", 202 | "execution_count": null, 203 | "metadata": { 204 | "scrolled": false 205 | }, 206 | "outputs": [], 207 | "source": [ 208 | "show_scatterplot(X, title='x')\n", 209 | "n_hidden = 5\n", 210 | "\n", 211 | "for i in range(5):\n", 212 | " # create 1-layer neural networks with random weights\n", 213 | " model_1layer = nn.Sequential(\n", 214 | " nn.Linear(2, n_hidden, bias=True), \n", 215 | " nn.ReLU(), \n", 216 | " nn.Linear(n_hidden, 2, bias=True)\n", 217 | " )\n", 218 | " Y = model_1layer(X).data\n", 219 | " show_scatterplot(Y, False, title='f(x)')" 220 | ] 221 | }, 222 | { 223 | "cell_type": "code", 224 | "execution_count": null, 225 | "metadata": { 226 | "scrolled": false 227 | }, 228 | "outputs": [], 229 | "source": [ 230 | "# deeper network with random weights\n", 231 | "show_scatterplot(X, title='x')\n", 232 | "n_hidden = 1000\n", 233 | "\n", 234 | "for i in range(5):\n", 235 | " model_2layer = nn.Sequential(\n", 236 | " nn.Linear(2, n_hidden, bias=True), \n", 237 | " nn.ReLU(), \n", 238 | " nn.Linear(n_hidden, n_hidden, bias=True), \n", 239 | " nn.ReLU(), \n", 240 | " nn.Linear(n_hidden, n_hidden, bias=True), \n", 241 | " nn.ReLU(), \n", 242 | " nn.Linear(n_hidden, n_hidden, bias=True), \n", 243 | " nn.ReLU(), \n", 244 | " nn.Linear(n_hidden, 2, bias=True)\n", 245 | " )\n", 246 | " Y = model_2layer(X).data\n", 
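"    # each iteration above builds a new model_2layer with fresh random\n",
"    # weights, so every figure below shows a different random function\n",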
247 | " show_scatterplot(Y, False, title='f(x)')\n", 248 | "\n", 249 | "\n" 250 | ] 251 | }, 252 | { 253 | "cell_type": "code", 254 | "execution_count": null, 255 | "metadata": {}, 256 | "outputs": [], 257 | "source": [] 258 | } 259 | ], 260 | "metadata": { 261 | "kernelspec": { 262 | "display_name": "Python 3", 263 | "language": "python", 264 | "name": "python3" 265 | }, 266 | "language_info": { 267 | "codemirror_mode": { 268 | "name": "ipython", 269 | "version": 3 270 | }, 271 | "file_extension": ".py", 272 | "mimetype": "text/x-python", 273 | "name": "python", 274 | "nbconvert_exporter": "python", 275 | "pygments_lexer": "ipython3", 276 | "version": "3.6.5" 277 | } 278 | }, 279 | "nbformat": 4, 280 | "nbformat_minor": 2 281 | } 282 | -------------------------------------------------------------------------------- /03-autograd_tutorial.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "# Autograd: automatic differentiation\n", 8 | "\n", 9 | "The ``autograd`` package provides automatic differentiation for all operations\n", 10 | "on Tensors. It is a define-by-run framework, which means that your backprop is\n", 11 | "defined by how your code is run, and that every single iteration can be\n", 12 | "different." 13 | ] 14 | }, 15 | { 16 | "cell_type": "code", 17 | "execution_count": null, 18 | "metadata": {}, 19 | "outputs": [], 20 | "source": [ 21 | "import torch" 22 | ] 23 | }, 24 | { 25 | "cell_type": "markdown", 26 | "metadata": {}, 27 | "source": [ 28 | "Create a tensor:" 29 | ] 30 | }, 31 | { 32 | "cell_type": "code", 33 | "execution_count": null, 34 | "metadata": {}, 35 | "outputs": [], 36 | "source": [ 37 | "x = torch.tensor([[1, 2], [3, 4]], requires_grad=True, dtype=torch.float32)\n", 38 | "print(x)" 39 | ] 40 | }, 41 | { 42 | "cell_type": "markdown", 43 | "metadata": {}, 44 | "source": [ 45 | "Do an operation on the tensor:" 46 | ] 47 | }, 48 | { 49 | "cell_type": "code", 50 | "execution_count": null, 51 | "metadata": {}, 52 | "outputs": [], 53 | "source": [ 54 | "y = x - 2\n", 55 | "print(y)" 56 | ] 57 | }, 58 | { 59 | "cell_type": "markdown", 60 | "metadata": {}, 61 | "source": [ 62 | "``y`` was created as a result of an operation, so it has a ``grad_fn``.\n", 63 | "\n" 64 | ] 65 | }, 66 | { 67 | "cell_type": "code", 68 | "execution_count": null, 69 | "metadata": {}, 70 | "outputs": [], 71 | "source": [ 72 | "print(y.grad_fn)" 73 | ] 74 | }, 75 | { 76 | "cell_type": "code", 77 | "execution_count": null, 78 | "metadata": {}, 79 | "outputs": [], 80 | "source": [ 81 | "print(x.grad_fn)" 82 | ] 83 | }, 84 | { 85 | "cell_type": "code", 86 | "execution_count": null, 87 | "metadata": {}, 88 | "outputs": [], 89 | "source": [ 90 | "y.grad_fn" 91 | ] 92 | }, 93 | { 94 | "cell_type": "code", 95 | "execution_count": null, 96 | "metadata": {}, 97 | "outputs": [], 98 | "source": [ 99 | "y.grad_fn.next_functions[0][0]" 100 | ] 101 | }, 102 | { 103 | "cell_type": "code", 104 | "execution_count": null, 105 | "metadata": {}, 106 | "outputs": [], 107 | "source": [ 108 | "y.grad_fn.next_functions[0][0].variable" 109 | ] 110 | }, 111 | { 112 | "cell_type": "markdown", 113 | "metadata": {}, 114 | "source": [ 115 | "Do more operations on `y`" 116 | ] 117 | }, 118 | { 119 | "cell_type": "code", 120 | "execution_count": null, 121 | "metadata": {}, 122 | "outputs": [], 123 | "source": [ 124 | "z = y * y * 3\n", 125 | "out = z.mean()\n", 126 | "\n", 127 | "print(z, out)" 
128 | ]
129 | },
130 | {
131 | "cell_type": "markdown",
132 | "metadata": {},
133 | "source": [
134 | "## Gradients\n",
135 | "\n",
136 | "Let's backprop now. `out.backward()` is equivalent to doing `out.backward(torch.tensor([1.0]))`."
137 | ]
138 | },
139 | {
140 | "cell_type": "code",
141 | "execution_count": null,
142 | "metadata": {},
143 | "outputs": [],
144 | "source": [
145 | "out.backward()"
146 | ]
147 | },
148 | {
149 | "cell_type": "markdown",
150 | "metadata": {},
151 | "source": [
152 | "Print the gradients d(out)/dx:\n",
153 | "\n",
154 | "\n"
155 | ]
156 | },
157 | {
158 | "cell_type": "code",
159 | "execution_count": null,
160 | "metadata": {},
161 | "outputs": [],
162 | "source": [
163 | "print(x.grad)"
164 | ]
165 | },
166 | {
167 | "cell_type": "markdown",
168 | "metadata": {},
169 | "source": [
170 | "You can do many crazy things with autograd!\n",
171 | "> With Great *Flexibility* Comes Great Responsibility"
172 | ]
173 | },
174 | {
175 | "cell_type": "code",
176 | "execution_count": null,
177 | "metadata": {},
178 | "outputs": [],
179 | "source": [
180 | "# Dynamic graphs!\n",
181 | "x = torch.randn(3, requires_grad=True)\n",
182 | "\n",
183 | "y = x * 2\n",
184 | "while y.data.norm() < 1000:\n",
185 | "    y = y * 2\n",
186 | "\n",
187 | "print(y)"
188 | ]
189 | },
190 | {
191 | "cell_type": "code",
192 | "execution_count": null,
193 | "metadata": {},
194 | "outputs": [],
195 | "source": [
196 | "gradients = torch.FloatTensor([0.1, 1.0, 0.0001])\n",
197 | "y.backward(gradients)\n",
198 | "\n",
199 | "print(x.grad)"
200 | ]
201 | },
202 | {
203 | "cell_type": "markdown",
204 | "metadata": {},
205 | "source": [
206 | "## Inference"
207 | ]
208 | },
209 | {
210 | "cell_type": "code",
211 | "execution_count": null,
212 | "metadata": {},
213 | "outputs": [],
214 | "source": [
215 | "n = 3"
216 | ]
217 | },
218 | {
219 | "cell_type": "code",
220 | "execution_count": null,
221 | "metadata": {},
222 | "outputs": [],
223 | "source": [
224 | "x = torch.arange(1, n + 1, requires_grad=True)\n",
225 | "w = torch.ones(n, requires_grad=True)\n",
226 | "z = w @ x\n",
227 | "z.backward()\n",
228 | "print(x.grad, w.grad, sep='\\n')"
229 | ]
230 | },
231 | {
232 | "cell_type": "code",
233 | "execution_count": null,
234 | "metadata": {},
235 | "outputs": [],
236 | "source": [
237 | "x = torch.arange(1, n + 1)\n",
238 | "w = torch.ones(n, requires_grad=True)\n",
239 | "z = w @ x\n",
240 | "z.backward()\n",
241 | "print(x.grad, w.grad, sep='\\n')"
242 | ]
243 | },
244 | {
245 | "cell_type": "code",
246 | "execution_count": null,
247 | "metadata": {},
248 | "outputs": [],
249 | "source": [
250 | "with torch.no_grad():\n",
251 | "    x = torch.arange(1, n + 1)\n",
252 | "    w = torch.ones(n, requires_grad=True)\n",
253 | "    z = w @ x\n",
254 | "    z.backward()\n",
255 | "    print(x.grad, w.grad, sep='\\n')"
256 | ]
257 | },
258 | {
259 | "cell_type": "markdown",
260 | "metadata": {},
261 | "source": [
262 | "## More stuff\n",
263 | "\n",
264 | "Documentation of the automatic differentiation package is at\n",
265 | "http://pytorch.org/docs/autograd\n",
266 | "\n"
267 | ]
268 | }
269 | ],
270 | "metadata": {
271 | "kernelspec": {
272 | "display_name": "Python 3",
273 | "language": "python",
274 | "name": "python3"
275 | },
276 | "language_info": {
277 | "codemirror_mode": {
278 | "name": "ipython",
279 | "version": 3
280 | },
281 | "file_extension": ".py",
282 | "mimetype": "text/x-python",
283 | "name": "python",
284 | "nbconvert_exporter": "python",
285 | "pygments_lexer": "ipython3",
286 | "version": "3.6.5"
287 | } 288 | }, 289 | "nbformat": 4, 290 | "nbformat_minor": 1 291 | } 292 | -------------------------------------------------------------------------------- /04-spiral_classification.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "### Feed Forward Networks" 8 | ] 9 | }, 10 | { 11 | "cell_type": "markdown", 12 | "metadata": {}, 13 | "source": [ 14 | "
Creating differentiable computation graphs for classification tasks.
\n", 15 | "" 16 | ] 17 | }, 18 | { 19 | "cell_type": "markdown", 20 | "metadata": {}, 21 | "source": [ 22 | "### Create the data" 23 | ] 24 | }, 25 | { 26 | "cell_type": "code", 27 | "execution_count": null, 28 | "metadata": {}, 29 | "outputs": [], 30 | "source": [ 31 | "import random\n", 32 | "import torch\n", 33 | "from torch import nn, optim\n", 34 | "import torch.nn.functional as F\n", 35 | "import math\n", 36 | "import os" 37 | ] 38 | }, 39 | { 40 | "cell_type": "code", 41 | "execution_count": null, 42 | "metadata": {}, 43 | "outputs": [], 44 | "source": [ 45 | "%run plot_conf.py" 46 | ] 47 | }, 48 | { 49 | "cell_type": "code", 50 | "execution_count": null, 51 | "metadata": {}, 52 | "outputs": [], 53 | "source": [ 54 | "plt_style()" 55 | ] 56 | }, 57 | { 58 | "cell_type": "code", 59 | "execution_count": null, 60 | "metadata": {}, 61 | "outputs": [], 62 | "source": [ 63 | "from IPython import display" 64 | ] 65 | }, 66 | { 67 | "cell_type": "code", 68 | "execution_count": null, 69 | "metadata": {}, 70 | "outputs": [], 71 | "source": [ 72 | "seed=12345\n", 73 | "random.seed(seed)\n", 74 | "torch.manual_seed(seed)\n", 75 | "N = 1000 # num_samples_per_class\n", 76 | "D = 2 # dimensions\n", 77 | "C = 3 # num_classes\n", 78 | "H = 100 # num_hidden_units" 79 | ] 80 | }, 81 | { 82 | "cell_type": "code", 83 | "execution_count": null, 84 | "metadata": {}, 85 | "outputs": [], 86 | "source": [ 87 | "X = torch.zeros(N * C, D)\n", 88 | "y = torch.zeros(N * C)\n", 89 | "\n", 90 | "for i in range(C):\n", 91 | " index = 0\n", 92 | " r = torch.linspace(0, 1, N)\n", 93 | " t = torch.linspace(\n", 94 | " i * 2 * math.pi / C,\n", 95 | " (i + 2) * 2 * math.pi / C,\n", 96 | " N\n", 97 | " ) + torch.randn(N) * 0.1\n", 98 | " \n", 99 | " for ix in range(N * i, N * (i + 1)):\n", 100 | " X[ix] = r[index] * torch.FloatTensor((\n", 101 | " math.sin(t[index]), math.cos(t[index])\n", 102 | " ))\n", 103 | " y[ix] = i\n", 104 | " index += 1\n", 105 | "\n", 106 | "print(\"SHAPES:\")\n", 107 | "print(\"-------------------\")\n", 108 | "print(\"X:\", tuple(X.size()))\n", 109 | "print(\"y:\", tuple(y.size()))" 110 | ] 111 | }, 112 | { 113 | "cell_type": "code", 114 | "execution_count": null, 115 | "metadata": {}, 116 | "outputs": [], 117 | "source": [ 118 | "def plot_data(X, y, d=.0, auto=False):\n", 119 | " \"\"\"\n", 120 | " Plot the data.\n", 121 | " \"\"\"\n", 122 | " plt.clf()\n", 123 | " plt.scatter(X[:, 0], X[:, 1], c=y, s=20, cmap=plt.cm.Spectral)\n", 124 | " plt.axis('square')\n", 125 | " plt.axis((-1.1, 1.1, -1.1, 1.1))\n", 126 | " if auto is True: plt.axis('equal')\n", 127 | "# plt.savefig('spiral{:.2f}.png'.format(d))" 128 | ] 129 | }, 130 | { 131 | "cell_type": "code", 132 | "execution_count": null, 133 | "metadata": {}, 134 | "outputs": [], 135 | "source": [ 136 | "# Create the data\n", 137 | "plot_data(X.numpy(), y.numpy())" 138 | ] 139 | }, 140 | { 141 | "cell_type": "code", 142 | "execution_count": null, 143 | "metadata": {}, 144 | "outputs": [], 145 | "source": [ 146 | "def plot_model(X, y, model, e=.0, auto=False):\n", 147 | " \"\"\"\n", 148 | " Plot the model from torch weights.\n", 149 | " \"\"\"\n", 150 | " \n", 151 | " X = X.numpy()\n", 152 | " y = y.numpy(),\n", 153 | " w1 = torch.transpose(model.fc1.weight.data, 0, 1).numpy()\n", 154 | " b1 = model.fc1.bias.data.numpy()\n", 155 | " w2 = torch.transpose(model.fc2.weight.data, 0, 1).numpy()\n", 156 | " b2 = model.fc2.bias.data.numpy()\n", 157 | " \n", 158 | " h = 0.01\n", 159 | "\n", 160 | " x_min, x_max = (-1.1, 1.1)\n", 161 | " y_min, y_max = 
(-1.1, 1.1)\n", 162 | " \n", 163 | " if auto is True:\n", 164 | " x_min, x_max = X[:, 0].min() - 1, X[:, 0].max() + 1\n", 165 | " y_min, y_max = X[:, 1].min() - 1, X[:, 1].max() + 1\n", 166 | " xx, yy = np.meshgrid(np.arange(x_min, x_max, h),\n", 167 | " np.arange(y_min, y_max, h))\n", 168 | " Z = np.dot(np.maximum(0, np.dot(np.c_[xx.ravel(), yy.ravel()], w1) + b1), w2) + b2\n", 169 | " Z = np.argmax(Z, axis=1)\n", 170 | " Z = Z.reshape(xx.shape)\n", 171 | " fig = plt.figure()\n", 172 | " plt.contourf(xx, yy, Z, cmap=plt.cm.Spectral, alpha=0.3)\n", 173 | " plt.scatter(X[:, 0], X[:, 1], c=y[0], s=40, cmap=plt.cm.Spectral)\n", 174 | " plt.axis((-1.1, 1.1, -1.1, 1.1))\n", 175 | " plt.axis('square')\n", 176 | " if auto is True:\n", 177 | " plt.axis((xx.min(), xx.max(), yy.min(), yy.max()))\n", 178 | " \n", 179 | "# plt.savefig('train{:03.2f}.png'.format(e))" 180 | ] 181 | }, 182 | { 183 | "cell_type": "markdown", 184 | "metadata": {}, 185 | "source": [ 186 | "### Linear model" 187 | ] 188 | }, 189 | { 190 | "cell_type": "code", 191 | "execution_count": null, 192 | "metadata": {}, 193 | "outputs": [], 194 | "source": [ 195 | "learning_rate = 1e-3\n", 196 | "lambda_l2 = 1e-5" 197 | ] 198 | }, 199 | { 200 | "cell_type": "code", 201 | "execution_count": null, 202 | "metadata": {}, 203 | "outputs": [], 204 | "source": [ 205 | "# Linear model\n", 206 | "class linear_model(nn.Module):\n", 207 | " \"\"\"\n", 208 | " Linear model.\n", 209 | " \"\"\"\n", 210 | " def __init__(self, D_in, H, D_out):\n", 211 | " \"\"\"\n", 212 | " Initialize weights.\n", 213 | " \"\"\"\n", 214 | " super(linear_model, self).__init__()\n", 215 | " self.fc1 = nn.Linear(D_in, H)\n", 216 | " self.fc2 = nn.Linear(H, D_out)\n", 217 | "\n", 218 | " def forward(self, x):\n", 219 | " \"\"\"\n", 220 | " Forward pass.\n", 221 | " \"\"\"\n", 222 | " z = self.fc1(x)\n", 223 | " z = self.fc2(z)\n", 224 | " return z" 225 | ] 226 | }, 227 | { 228 | "cell_type": "code", 229 | "execution_count": null, 230 | "metadata": {}, 231 | "outputs": [], 232 | "source": [ 233 | "device = torch.device(\"cuda:0\" if torch.cuda.is_available() else \"cpu\")" 234 | ] 235 | }, 236 | { 237 | "cell_type": "code", 238 | "execution_count": null, 239 | "metadata": {}, 240 | "outputs": [], 241 | "source": [ 242 | "# nn package to create our linear model\n", 243 | "# each Linear module has a weight and bias\n", 244 | "model = linear_model(D, H, C)\n", 245 | "model.to(device) #Convert to CUDA\n", 246 | "\n", 247 | "# nn package also has different loss functions.\n", 248 | "# we use cross entropy loss for our classification task\n", 249 | "criterion = torch.nn.CrossEntropyLoss()\n", 250 | "\n", 251 | "# we use the optim package to apply\n", 252 | "# stochastic gradient descent for our parameter updates\n", 253 | "optimizer = torch.optim.SGD(model.parameters(), lr=learning_rate, weight_decay=lambda_l2) # built-in L2\n", 254 | "\n", 255 | "# We convert our inputs and targets to Variables\n", 256 | "# so we can use automatic differentiation but we \n", 257 | "# use require_grad=False b/c we don't want the gradients\n", 258 | "# to alter these values.\n", 259 | "input_X = torch.tensor(X, requires_grad=False, dtype=torch.float32)\n", 260 | "y_true = torch.tensor(y, requires_grad=False, dtype=torch.long)\n", 261 | "\n", 262 | "# Training\n", 263 | "for t in range(1000):\n", 264 | " \n", 265 | " # Feed forward to get the logits\n", 266 | " y_pred = model(input_X)\n", 267 | " \n", 268 | " # Compute the loss and accuracy\n", 269 | " loss = criterion(y_pred, y_true)\n", 270 | 
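"    # torch.max along dim 1 returns (values, indices); the indices are\n",
"    # the arg-max over the class scores, i.e. the predicted labels\n",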
" score, predicted = torch.max(y_pred, 1)\n", 271 | " acc = (y_true == predicted).sum().float() / len(y_true)\n", 272 | " print(\"[EPOCH]: %i, [LOSS]: %.6f, [ACCURACY]: %.3f\" % (t, loss.item(), acc))\n", 273 | " display.clear_output(wait=True)\n", 274 | " \n", 275 | " # zero the gradients before running\n", 276 | " # the backward pass.\n", 277 | " optimizer.zero_grad()\n", 278 | " \n", 279 | " # Backward pass to compute the gradient\n", 280 | " # of loss w.r.t our learnable params. \n", 281 | " loss.backward()\n", 282 | " \n", 283 | " # Update params\n", 284 | " optimizer.step()" 285 | ] 286 | }, 287 | { 288 | "cell_type": "code", 289 | "execution_count": null, 290 | "metadata": {}, 291 | "outputs": [], 292 | "source": [ 293 | "# Plot trained model\n", 294 | "print(model)" 295 | ] 296 | }, 297 | { 298 | "cell_type": "code", 299 | "execution_count": null, 300 | "metadata": {}, 301 | "outputs": [], 302 | "source": [ 303 | "plot_model(X, y, model)" 304 | ] 305 | }, 306 | { 307 | "cell_type": "markdown", 308 | "metadata": {}, 309 | "source": [ 310 | "### Two-layered network" 311 | ] 312 | }, 313 | { 314 | "cell_type": "code", 315 | "execution_count": null, 316 | "metadata": {}, 317 | "outputs": [], 318 | "source": [ 319 | "learning_rate = 1e-3\n", 320 | "lambda_l2 = 1e-5" 321 | ] 322 | }, 323 | { 324 | "cell_type": "code", 325 | "execution_count": null, 326 | "metadata": {}, 327 | "outputs": [], 328 | "source": [ 329 | "# NN model\n", 330 | "class two_layer_network(nn.Module):\n", 331 | " \"\"\"\n", 332 | " NN model.\n", 333 | " \"\"\"\n", 334 | " def __init__(self, D_in, H, D_out):\n", 335 | " \"\"\"\n", 336 | " Initialize weights.\n", 337 | " \"\"\"\n", 338 | " super(two_layer_network, self).__init__()\n", 339 | " self.fc1 = nn.Linear(D_in, H)\n", 340 | " self.fc2 = nn.Linear(H, D_out)\n", 341 | "\n", 342 | " def forward(self, x):\n", 343 | " \"\"\"\n", 344 | " Forward pass.\n", 345 | " \"\"\"\n", 346 | " z = F.relu(self.fc1(x))\n", 347 | " z = self.fc2(z)\n", 348 | " return z" 349 | ] 350 | }, 351 | { 352 | "cell_type": "code", 353 | "execution_count": null, 354 | "metadata": { 355 | "scrolled": true 356 | }, 357 | "outputs": [], 358 | "source": [ 359 | "# nn package to create our linear model\n", 360 | "# each Linear module has a weight and bias\n", 361 | "model = two_layer_network(D, H, C)\n", 362 | "model.to(device)\n", 363 | "\n", 364 | "# nn package also has different loss functions.\n", 365 | "# we use cross entropy loss for our classification task\n", 366 | "criterion = torch.nn.CrossEntropyLoss()\n", 367 | "\n", 368 | "# we use the optim package to apply\n", 369 | "# ADAM for our parameter updates\n", 370 | "optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate, weight_decay=lambda_l2) # built-in L2\n", 371 | "\n", 372 | "# We convert our inputs and targest to Variables\n", 373 | "# so we can use automatic differentiation but we \n", 374 | "# use require_grad=False b/c we don't want the gradients\n", 375 | "# to alter these values.\n", 376 | "input_X = torch.tensor(X, requires_grad=False, dtype=torch.float32)\n", 377 | "y_true = torch.tensor(y, requires_grad=False, dtype=torch.long)\n", 378 | "\n", 379 | "# e = 1. 
380 | "\n",
381 | "# Training\n",
382 | "for t in range(1000):\n",
383 | "    \n",
384 | "    # Feed forward to get the logits\n",
385 | "    y_pred = model(input_X)\n",
386 | "    \n",
387 | "    # Compute the loss and accuracy\n",
388 | "    loss = criterion(y_pred, y_true)\n",
389 | "    score, predicted = torch.max(y_pred, 1)\n",
390 | "    acc = (y_true == predicted).sum().float() / len(y_true)\n",
391 | "    print(\"[EPOCH]: %i, [LOSS]: %.6f, [ACCURACY]: %.3f\" % (t, loss.item(), acc))\n",
392 | "    display.clear_output(wait=True)\n",
393 | "    \n",
394 | "    # zero the gradients before running\n",
395 | "    # the backward pass.\n",
396 | "    optimizer.zero_grad()\n",
397 | "    \n",
398 | "    # Backward pass to compute the gradient\n",
399 | "    # of loss w.r.t our learnable params. \n",
400 | "    loss.backward()\n",
401 | "    \n",
402 | "    # Update params\n",
403 | "    optimizer.step()\n",
404 | "    \n",
405 | "# # Plot some progress\n",
406 | "# if t % math.ceil(e) == 0:\n",
407 | "#     plot_model(X, y, model, e)\n",
408 | "#     e *= 1.5\n",
409 | "\n",
410 | "#! convert -delay 20 -crop 500x475+330+50 +repage $(gls -1v train*) train.gif"
411 | ]
412 | },
413 | {
414 | "cell_type": "code",
415 | "execution_count": null,
416 | "metadata": {},
417 | "outputs": [],
418 | "source": [
419 | "# Plot trained model\n",
420 | "print(model)\n",
421 | "plot_model(X, y, model)"
422 | ]
423 | }
424 | ],
425 | "metadata": {
426 | "kernelspec": {
427 | "display_name": "Codas ML",
428 | "language": "python",
429 | "name": "codasml"
430 | },
431 | "language_info": {
432 | "codemirror_mode": {
433 | "name": "ipython",
434 | "version": 3
435 | },
436 | "file_extension": ".py",
437 | "mimetype": "text/x-python",
438 | "name": "python",
439 | "nbconvert_exporter": "python",
440 | "pygments_lexer": "ipython3",
441 | "version": "3.6.5"
442 | }
443 | },
444 | "nbformat": 4,
445 | "nbformat_minor": 2
446 | }
447 | 
--------------------------------------------------------------------------------
/05-convnet.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "markdown",
5 | "metadata": {},
6 | "source": [
7 | "# Outline\n",
8 | "\n",
9 | "* Today we will show how to train a ConvNet using PyTorch\n",
10 | "* We will also illustrate how the ConvNet makes use of specific assumptions"
11 | ]
12 | },
13 | {
14 | "cell_type": "markdown",
15 | "metadata": {},
16 | "source": [
17 | "# To perform well, we need to incorporate some prior knowledge about the problem\n",
18 | "\n",
19 | "* Assumptions help us when they are true\n",
20 | "* They hurt us when they are not\n",
21 | "* We want to make just the right amount of assumptions, not more than that\n",
22 | "\n",
23 | "## In Deep Learning\n",
24 | "\n",
25 | "* Many layers: compositionality\n",
26 | "* Convolutions: locality + stationarity of images\n",
27 | "* Pooling: invariance of object class to translations"
28 | ]
29 | },
30 | {
31 | "cell_type": "code",
32 | "execution_count": null,
33 | "metadata": {},
34 | "outputs": [],
35 | "source": [
36 | "%run plot_conf.py"
37 | ]
38 | },
39 | {
40 | "cell_type": "code",
41 | "execution_count": null,
42 | "metadata": {},
43 | "outputs": [],
44 | "source": [
45 | "plt_style()"
46 | ]
47 | },
48 | {
49 | "cell_type": "code",
50 | "execution_count": null,
51 | "metadata": {},
52 | "outputs": [],
53 | "source": [
54 | "import torch\n",
55 | "import torch.nn as nn\n",
56 | "import torch.nn.functional as F\n",
57 | "import torch.optim as optim\n",
58 | "from torchvision import
datasets, transforms\n", 59 | "import matplotlib.pyplot as plt\n", 60 | "import numpy\n", 61 | "\n", 62 | "# function to count number of parameters\n", 63 | "def get_n_params(model):\n", 64 | " np=0\n", 65 | " for p in list(model.parameters()):\n", 66 | " np += p.nelement()\n", 67 | " return np" 68 | ] 69 | }, 70 | { 71 | "cell_type": "markdown", 72 | "metadata": {}, 73 | "source": [ 74 | "# Load the Dataset (MNIST)\n", 75 | "\n", 76 | "\n", 77 | "We can use some PyTorch DataLoader utilities for this. This will download, shuffle, normalize data and arrange it in batches." 78 | ] 79 | }, 80 | { 81 | "cell_type": "code", 82 | "execution_count": null, 83 | "metadata": { 84 | "scrolled": false 85 | }, 86 | "outputs": [], 87 | "source": [ 88 | "input_size = 28*28 # images are 28x28 pixels\n", 89 | "output_size = 10 # there are 10 classes\n", 90 | "\n", 91 | "train_loader = torch.utils.data.DataLoader(\n", 92 | " datasets.MNIST('../data', train=True, download=True,\n", 93 | " transform=transforms.Compose([\n", 94 | " transforms.ToTensor(),\n", 95 | " transforms.Normalize((0.1307,), (0.3081,))\n", 96 | " ])),\n", 97 | " batch_size=64, shuffle=True)\n", 98 | "test_loader = torch.utils.data.DataLoader(\n", 99 | " datasets.MNIST('../data', train=False, transform=transforms.Compose([\n", 100 | " transforms.ToTensor(),\n", 101 | " transforms.Normalize((0.1307,), (0.3081,))\n", 102 | " ])),\n", 103 | " batch_size=1000, shuffle=True)" 104 | ] 105 | }, 106 | { 107 | "cell_type": "code", 108 | "execution_count": null, 109 | "metadata": {}, 110 | "outputs": [], 111 | "source": [ 112 | "# show some images\n", 113 | "plt.figure()\n", 114 | "for i in range(10):\n", 115 | " plt.subplot(2, 5, i + 1)\n", 116 | " image, _ = train_loader.dataset.__getitem__(i)\n", 117 | " plt.imshow(image.squeeze().numpy())" 118 | ] 119 | }, 120 | { 121 | "cell_type": "markdown", 122 | "metadata": {}, 123 | "source": [ 124 | "# Create the model classes" 125 | ] 126 | }, 127 | { 128 | "cell_type": "code", 129 | "execution_count": null, 130 | "metadata": {}, 131 | "outputs": [], 132 | "source": [ 133 | "class FC2Layer(nn.Module):\n", 134 | " def __init__(self, input_size, n_hidden, output_size):\n", 135 | " super(FC2Layer, self).__init__()\n", 136 | " self.input_size = input_size\n", 137 | " self.network = nn.Sequential(\n", 138 | " nn.Linear(input_size, n_hidden), \n", 139 | " nn.ReLU(), \n", 140 | " nn.Linear(n_hidden, n_hidden), \n", 141 | " nn.ReLU(), \n", 142 | " nn.Linear(n_hidden, output_size), \n", 143 | " nn.LogSoftmax(dim=1)\n", 144 | " )\n", 145 | "\n", 146 | " def forward(self, x):\n", 147 | " x = x.view(-1, self.input_size)\n", 148 | " return self.network(x)\n", 149 | " \n", 150 | " \n", 151 | "class CNN(nn.Module):\n", 152 | " def __init__(self, input_size, n_feature, output_size):\n", 153 | " super(CNN, self).__init__()\n", 154 | " self.n_feature = n_feature\n", 155 | " self.conv1 = nn.Conv2d(in_channels=1, out_channels=n_feature, kernel_size=5)\n", 156 | " self.conv2 = nn.Conv2d(n_feature, n_feature, kernel_size=5)\n", 157 | " self.fc1 = nn.Linear(n_feature*4*4, 50)\n", 158 | " self.fc2 = nn.Linear(50, 10)\n", 159 | " \n", 160 | "\n", 161 | "\n", 162 | " def forward(self, x, verbose=False):\n", 163 | " x = self.conv1(x)\n", 164 | " x = F.relu(x)\n", 165 | " x = F.max_pool2d(x, kernel_size=2)\n", 166 | " x = self.conv2(x)\n", 167 | " x = F.relu(x)\n", 168 | " x = F.max_pool2d(x, kernel_size=2)\n", 169 | " x = x.view(-1, self.n_feature*4*4)\n", 170 | " x = self.fc1(x)\n", 171 | " x = F.relu(x)\n", 172 | " x = 
self.fc2(x)\n",
173 | "        x = F.log_softmax(x, dim=1)\n",
174 | "        return x\n",
175 | "    \n"
176 | ]
177 | },
178 | {
179 | "cell_type": "markdown",
180 | "metadata": {},
181 | "source": [
182 | "## Running on a GPU: device string\n",
183 | "\n",
184 | "Switching between CPU and GPU in PyTorch is controlled via a device string, which will seamlessly determine whether a GPU is available, falling back to the CPU if not:"
185 | ]
186 | },
187 | {
188 | "cell_type": "code",
189 | "execution_count": null,
190 | "metadata": {},
191 | "outputs": [],
192 | "source": [
193 | "device = torch.device(\"cuda:0\" if torch.cuda.is_available() else \"cpu\")"
194 | ]
195 | },
196 | {
197 | "cell_type": "code",
198 | "execution_count": null,
199 | "metadata": {},
200 | "outputs": [],
201 | "source": [
202 | "accuracy_list = []\n",
203 | "\n",
204 | "def train(epoch, model, perm=torch.arange(0, 784).long()):\n",
205 | "    model.train()\n",
206 | "    for batch_idx, (data, target) in enumerate(train_loader):\n",
207 | "        \n",
208 | "        # permute pixels\n",
209 | "        data = data.view(-1, 28*28)\n",
210 | "        data = data[:, perm]\n",
211 | "        data = data.view(-1, 1, 28, 28)\n",
212 | "        \n",
213 | "        optimizer.zero_grad()\n",
214 | "        output = model(data)\n",
215 | "        loss = F.nll_loss(output, target)\n",
216 | "        loss.backward()\n",
217 | "        optimizer.step()\n",
218 | "        if batch_idx % 100 == 0:\n",
219 | "            print('Train Epoch: {} [{}/{} ({:.0f}%)]\\tLoss: {:.6f}'.format(\n",
220 | "                epoch, batch_idx * len(data), len(train_loader.dataset),\n",
221 | "                100. * batch_idx / len(train_loader), loss.item()))\n",
222 | "    \n",
223 | "def test(model, perm=torch.arange(0, 784).long()):\n",
224 | "    model.eval()\n",
225 | "    test_loss = 0\n",
226 | "    correct = 0\n",
227 | "    for data, target in test_loader:\n",
228 | "        # permute pixels\n",
229 | "        data = data.view(-1, 28*28)\n",
230 | "        data = data[:, perm]\n",
231 | "        data = data.view(-1, 1, 28, 28)\n",
232 | "        output = model(data)\n",
233 | "        test_loss += F.nll_loss(output, target, size_average=False).item() # sum up batch loss \n",
234 | "        pred = output.data.max(1, keepdim=True)[1] # get the index of the max log-probability \n",
235 | "        correct += pred.eq(target.data.view_as(pred)).cpu().sum().item()\n",
236 | "\n",
237 | "    test_loss /= len(test_loader.dataset)\n",
238 | "    accuracy = 100. 
* correct / len(test_loader.dataset)\n", 239 | " accuracy_list.append(accuracy)\n", 240 | " print('\\nTest set: Average loss: {:.4f}, Accuracy: {}/{} ({:.0f}%)\\n'.format(\n", 241 | " test_loss, correct, len(test_loader.dataset),\n", 242 | " accuracy))" 243 | ] 244 | }, 245 | { 246 | "cell_type": "markdown", 247 | "metadata": {}, 248 | "source": [ 249 | "# Train a small fully-connected network" 250 | ] 251 | }, 252 | { 253 | "cell_type": "code", 254 | "execution_count": null, 255 | "metadata": {}, 256 | "outputs": [], 257 | "source": [ 258 | "n_hidden = 8 # number of hidden units\n", 259 | "\n", 260 | "model = FC2Layer(input_size, n_hidden, output_size)\n", 261 | "optimizer = optim.SGD(model.parameters(), lr=0.01, momentum=0.5)\n", 262 | "print('Number of parameters: {}'.format(get_n_params(model)))\n", 263 | "\n", 264 | "for epoch in range(0, 1):\n", 265 | " train(epoch, model)\n", 266 | " test(model)" 267 | ] 268 | }, 269 | { 270 | "cell_type": "markdown", 271 | "metadata": {}, 272 | "source": [ 273 | "# Train a ConvNet with the same number of parameters" 274 | ] 275 | }, 276 | { 277 | "cell_type": "code", 278 | "execution_count": null, 279 | "metadata": {}, 280 | "outputs": [], 281 | "source": [ 282 | "# Training settings \n", 283 | "n_features = 6 # number of feature maps\n", 284 | "\n", 285 | "model = CNN(input_size, n_features, output_size)\n", 286 | "optimizer = optim.SGD(model.parameters(), lr=0.01, momentum=0.5)\n", 287 | "print('Number of parameters: {}'.format(get_n_params(model)))\n", 288 | "\n", 289 | "for epoch in range(0, 1):\n", 290 | " train(epoch, model)\n", 291 | " test(model)" 292 | ] 293 | }, 294 | { 295 | "cell_type": "markdown", 296 | "metadata": {}, 297 | "source": [ 298 | "# The ConvNet performs better with the same number of parameters, thanks to its use of prior knowledge about images\n", 299 | "\n", 300 | "* Use of convolution: Locality and stationarity in images\n", 301 | "* Pooling: builds in some translation invariance\n", 302 | "\n", 303 | "# What happens if the assumptions are no longer true?\n" 304 | ] 305 | }, 306 | { 307 | "cell_type": "code", 308 | "execution_count": null, 309 | "metadata": { 310 | "scrolled": false 311 | }, 312 | "outputs": [], 313 | "source": [ 314 | "perm = torch.randperm(784)\n", 315 | "plt.figure()\n", 316 | "for i in range(10):\n", 317 | " image, _ = train_loader.dataset.__getitem__(i)\n", 318 | " # permute pixels\n", 319 | " image_perm = image.view(-1, 28*28).clone()\n", 320 | " image_perm = image_perm[:, perm]\n", 321 | " image_perm = image_perm.view(-1, 1, 28, 28)\n", 322 | " plt.subplot(4, 5, i + 1)\n", 323 | " plt.imshow(image.squeeze().numpy())\n", 324 | " plt.axis('off')\n", 325 | " plt.subplot(4, 5, i + 11)\n", 326 | " plt.imshow(image_perm.squeeze().numpy())\n", 327 | " plt.axis('off')" 328 | ] 329 | }, 330 | { 331 | "cell_type": "markdown", 332 | "metadata": {}, 333 | "source": [ 334 | "# ConvNet with permuted pixels" 335 | ] 336 | }, 337 | { 338 | "cell_type": "code", 339 | "execution_count": null, 340 | "metadata": {}, 341 | "outputs": [], 342 | "source": [ 343 | "# Training settings \n", 344 | "n_features = 6 # number of feature maps\n", 345 | "\n", 346 | "model = CNN(input_size, n_features, output_size)\n", 347 | "optimizer = optim.SGD(model.parameters(), lr=0.01, momentum=0.5)\n", 348 | "print('Number of parameters: {}'.format(get_n_params(model)))\n", 349 | "\n", 350 | "for epoch in range(0, 1):\n", 351 | " train(epoch, model, perm)\n", 352 | " test(model, perm)" 353 | ] 354 | }, 355 | { 356 | "cell_type": 
"markdown", 357 | "metadata": {}, 358 | "source": [ 359 | "# Fully-Connected with Permuted Pixels" 360 | ] 361 | }, 362 | { 363 | "cell_type": "code", 364 | "execution_count": null, 365 | "metadata": {}, 366 | "outputs": [], 367 | "source": [ 368 | "n_hidden = 8 # number of hidden units\n", 369 | "\n", 370 | "model = FC2Layer(input_size, n_hidden, output_size)\n", 371 | "optimizer = optim.SGD(model.parameters(), lr=0.01, momentum=0.5)\n", 372 | "print('Number of parameters: {}'.format(get_n_params(model)))\n", 373 | "\n", 374 | "for epoch in range(0, 1):\n", 375 | " train(epoch, model, perm)\n", 376 | " test(model, perm)" 377 | ] 378 | }, 379 | { 380 | "cell_type": "markdown", 381 | "metadata": {}, 382 | "source": [ 383 | "# The ConvNet's performance drops when we permute the pixels, but the Fully-Connected Network's performance stays the same\n", 384 | "\n", 385 | "* ConvNet makes the assumption that pixels lie on a grid and are stationary/local\n", 386 | "* It loses performance when this assumption is wrong\n", 387 | "* The fully-connected network does not make this assumption\n", 388 | "* It does less well when it is true, since it doesn't take advantage of this prior knowledge\n", 389 | "* But it doesn't suffer when the assumption is wrong" 390 | ] 391 | }, 392 | { 393 | "cell_type": "code", 394 | "execution_count": null, 395 | "metadata": {}, 396 | "outputs": [], 397 | "source": [ 398 | "plt.bar(('NN image', 'CNN image',\n", 399 | " 'CNN scrambled', 'NN scrambled'),\n", 400 | " accuracy_list, width=0.4)\n", 401 | "plt.ylim((min(accuracy_list)-5, 96))\n", 402 | "plt.ylabel('Accuracy [%]')\n", 403 | "for tick in plt.gca().xaxis.get_major_ticks():\n", 404 | " tick.label.set_fontsize(20)\n", 405 | "plt.title('Performance comparison');" 406 | ] 407 | }, 408 | { 409 | "cell_type": "code", 410 | "execution_count": null, 411 | "metadata": {}, 412 | "outputs": [], 413 | "source": [] 414 | } 415 | ], 416 | "metadata": { 417 | "kernelspec": { 418 | "display_name": "Codas ML", 419 | "language": "python", 420 | "name": "codasml" 421 | }, 422 | "language_info": { 423 | "codemirror_mode": { 424 | "name": "ipython", 425 | "version": 3 426 | }, 427 | "file_extension": ".py", 428 | "mimetype": "text/x-python", 429 | "name": "python", 430 | "nbconvert_exporter": "python", 431 | "pygments_lexer": "ipython3", 432 | "version": "3.6.5" 433 | } 434 | }, 435 | "nbformat": 4, 436 | "nbformat_minor": 2 437 | } 438 | -------------------------------------------------------------------------------- /06-autoencoder.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "code", 5 | "execution_count": null, 6 | "metadata": {}, 7 | "outputs": [], 8 | "source": [ 9 | "# Import some libraries\n", 10 | "\n", 11 | "import torch\n", 12 | "import torchvision\n", 13 | "from torch import nn\n", 14 | "from torch.utils.data import DataLoader\n", 15 | "from torchvision import transforms\n", 16 | "from torchvision.datasets import MNIST\n", 17 | "from matplotlib import pyplot as plt" 18 | ] 19 | }, 20 | { 21 | "cell_type": "code", 22 | "execution_count": null, 23 | "metadata": {}, 24 | "outputs": [], 25 | "source": [ 26 | "# Convert vector to image\n", 27 | "\n", 28 | "def to_img(x):\n", 29 | " x = 0.5 * (x + 1)\n", 30 | " x = x.view(x.size(0), 28, 28)\n", 31 | " return x" 32 | ] 33 | }, 34 | { 35 | "cell_type": "code", 36 | "execution_count": null, 37 | "metadata": {}, 38 | "outputs": [], 39 | "source": [ 40 | "# Displaying routine\n", 41 | "\n", 42 | 
"def display_images(in_, out, n=1):\n", 43 | " for N in range(n):\n", 44 | " if in_ is not None:\n", 45 | " in_pic = to_img(in_.cpu().data)\n", 46 | " plt.figure(figsize=(18, 6))\n", 47 | " for i in range(4):\n", 48 | " plt.subplot(1,4,i+1)\n", 49 | " plt.imshow(in_pic[i+4*N])\n", 50 | " plt.axis('off')\n", 51 | " out_pic = to_img(out.cpu().data)\n", 52 | " plt.figure(figsize=(18, 6))\n", 53 | " for i in range(4):\n", 54 | " plt.subplot(1,4,i+1)\n", 55 | " plt.imshow(out_pic[i+4*N])\n", 56 | " plt.axis('off')" 57 | ] 58 | }, 59 | { 60 | "cell_type": "code", 61 | "execution_count": null, 62 | "metadata": {}, 63 | "outputs": [], 64 | "source": [ 65 | "# Define data loading step\n", 66 | "\n", 67 | "batch_size = 256\n", 68 | "\n", 69 | "img_transform = transforms.Compose([\n", 70 | " transforms.ToTensor(),\n", 71 | " transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))\n", 72 | "])\n", 73 | "\n", 74 | "dataset = MNIST('./data', transform=img_transform, download=True)\n", 75 | "dataloader = DataLoader(dataset, batch_size=batch_size, shuffle=True)" 76 | ] 77 | }, 78 | { 79 | "cell_type": "code", 80 | "execution_count": null, 81 | "metadata": {}, 82 | "outputs": [], 83 | "source": [ 84 | "device = torch.device(\"cuda:0\" if torch.cuda.is_available() else \"cpu\")" 85 | ] 86 | }, 87 | { 88 | "cell_type": "code", 89 | "execution_count": null, 90 | "metadata": {}, 91 | "outputs": [], 92 | "source": [ 93 | "# Define model architecture and reconstruction loss\n", 94 | "\n", 95 | "# n = 28 x 28 = 784\n", 96 | "d = 30 # for standard AE (under-complete hidden layer)\n", 97 | "# d = 500 # for denoising AE (over-complete hidden layer)\n", 98 | "\n", 99 | "class Autoencoder(nn.Module):\n", 100 | " def __init__(self):\n", 101 | " super().__init__()\n", 102 | " self.encoder = nn.Sequential(\n", 103 | " nn.Linear(28 * 28, d),\n", 104 | " nn.Tanh(),\n", 105 | " )\n", 106 | " self.decoder = nn.Sequential(\n", 107 | " nn.Linear(d, 28 * 28),\n", 108 | " nn.Tanh(),\n", 109 | " )\n", 110 | "\n", 111 | " def forward(self, x):\n", 112 | " x = self.encoder(x)\n", 113 | " x = self.decoder(x)\n", 114 | " return x\n", 115 | " \n", 116 | "model = Autoencoder().to(device)\n", 117 | "criterion = nn.MSELoss()" 118 | ] 119 | }, 120 | { 121 | "cell_type": "code", 122 | "execution_count": null, 123 | "metadata": {}, 124 | "outputs": [], 125 | "source": [ 126 | "# Configure the optimiser\n", 127 | "\n", 128 | "learning_rate = 1e-3\n", 129 | "\n", 130 | "optimizer = torch.optim.Adam(\n", 131 | " model.parameters(),\n", 132 | " lr=learning_rate,\n", 133 | ")" 134 | ] 135 | }, 136 | { 137 | "cell_type": "markdown", 138 | "metadata": {}, 139 | "source": [ 140 | "*Comment* or *un-comment out* a few lines of code to seamlessly switch between *standard AE* and *denoising one*.\n", 141 | "\n", 142 | "Don't forget to **(1)** change the size of the hidden layer accordingly, **(2)** re-generate the model, and **(3)** re-pass the parameters to the optimiser." 
143 | ] 144 | }, 145 | { 146 | "cell_type": "code", 147 | "execution_count": null, 148 | "metadata": { 149 | "scrolled": false 150 | }, 151 | "outputs": [], 152 | "source": [ 153 | "# Train standard or denoising autoencoder (AE)\n", 154 | "\n", 155 | "num_epochs = 1\n", 156 | "# do = nn.Dropout()  # comment out for standard AE\n", 157 | "for epoch in range(num_epochs):\n", 158 | "    for data in dataloader:\n", 159 | "        img, _ = data\n", 160 | "        img.requires_grad_()\n", 161 | "        img = img.view(img.size(0), -1)\n", 162 | "#         img_bad = do(img).to(device)  # comment out for standard AE\n", 163 | "        # ===================forward=====================\n", 164 | "        output = model(img)  # feed img (for std AE) or img_bad (for denoising AE)\n", 165 | "        loss = criterion(output, img.data)\n", 166 | "        # ===================backward====================\n", 167 | "        optimizer.zero_grad()\n", 168 | "        loss.backward()\n", 169 | "        optimizer.step()\n", 170 | "    # ===================log========================\n", 171 | "    print(f'epoch [{epoch + 1}/{num_epochs}], loss:{loss.item():.4f}')\n", 172 | "    display_images(None, output)  # pass (None, output) for std AE, (img_bad, output) for denoising AE" 173 | ] 174 | }, 175 | { 176 | "cell_type": "code", 177 | "execution_count": null, 178 | "metadata": {}, 179 | "outputs": [], 180 | "source": [ 181 | "# Visualise a few kernels of the encoder\n", 182 | "\n", 183 | "display_images(None, model.encoder[0].weight, 5)" 184 | ] 185 | }, 186 | { 187 | "cell_type": "code", 188 | "execution_count": null, 189 | "metadata": {}, 190 | "outputs": [], 191 | "source": [ 192 | "! conda install -y --name codas-ml opencv" 193 | ] 194 | }, 195 | { 196 | "cell_type": "code", 197 | "execution_count": null, 198 | "metadata": {}, 199 | "outputs": [], 200 | "source": [ 201 | "# Let's compare the autoencoder inpainting capabilities vs.
OpenCV\n", 202 | "\n", 203 | "from cv2 import inpaint, INPAINT_NS, INPAINT_TELEA" 204 | ] 205 | }, 206 | { 207 | "cell_type": "code", 208 | "execution_count": null, 209 | "metadata": {}, 210 | "outputs": [], 211 | "source": [ 212 | "# Inpaint with Telea and Navier-Stokes methods\n", 213 | "\n", 214 | "dst_TELEA = list()\n", 215 | "dst_NS = list()\n", 216 | "\n", 217 | "for i in range(3, 7):\n", 218 | " corrupted_img = ((img_bad.data.cpu()[i].view(28, 28) / 4 + 0.5) * 255).byte().numpy()\n", 219 | " mask = 2 - img_bad.grad_fn.noise.cpu()[i].view(28, 28).byte().numpy()\n", 220 | " dst_TELEA.append(inpaint(corrupted_img, mask, 3, INPAINT_TELEA))\n", 221 | " dst_NS.append(inpaint(corrupted_img, mask, 3, INPAINT_NS))\n", 222 | "\n", 223 | "tns_TELEA = [torch.from_numpy(d) for d in dst_TELEA]\n", 224 | "tns_NS = [torch.from_numpy(d) for d in dst_NS]\n", 225 | "\n", 226 | "TELEA = torch.stack(tns_TELEA).float()\n", 227 | "NS = torch.stack(tns_NS).float()" 228 | ] 229 | }, 230 | { 231 | "cell_type": "code", 232 | "execution_count": null, 233 | "metadata": {}, 234 | "outputs": [], 235 | "source": [ 236 | "# Compare the results: [noise], [img + noise], [img], [AE, Telea, Navier-Stokes] inpainting\n", 237 | "\n", 238 | "with torch.no_grad():\n", 239 | " display_images(img_bad.grad_fn.noise[3:7], img_bad[3:7])\n", 240 | " display_images(img[3:7], output[3:7])\n", 241 | " display_images(TELEA, NS)" 242 | ] 243 | }, 244 | { 245 | "cell_type": "code", 246 | "execution_count": null, 247 | "metadata": {}, 248 | "outputs": [], 249 | "source": [] 250 | } 251 | ], 252 | "metadata": { 253 | "kernelspec": { 254 | "display_name": "Python 3", 255 | "language": "python", 256 | "name": "python3" 257 | }, 258 | "language_info": { 259 | "codemirror_mode": { 260 | "name": "ipython", 261 | "version": 3 262 | }, 263 | "file_extension": ".py", 264 | "mimetype": "text/x-python", 265 | "name": "python", 266 | "nbconvert_exporter": "python", 267 | "pygments_lexer": "ipython3", 268 | "version": "3.6.6" 269 | } 270 | }, 271 | "nbformat": 4, 272 | "nbformat_minor": 2 273 | } 274 | -------------------------------------------------------------------------------- /07-VAE.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "code", 5 | "execution_count": null, 6 | "metadata": {}, 7 | "outputs": [], 8 | "source": [ 9 | "import torch\n", 10 | "import torchvision\n", 11 | "from torch import nn\n", 12 | "from torch.utils.data import DataLoader\n", 13 | "from torchvision import transforms\n", 14 | "from torchvision.datasets import MNIST\n", 15 | "from matplotlib import pyplot as plt" 16 | ] 17 | }, 18 | { 19 | "cell_type": "code", 20 | "execution_count": null, 21 | "metadata": {}, 22 | "outputs": [], 23 | "source": [ 24 | "# Displaying routine\n", 25 | "\n", 26 | "def display_images(in_, out, n=1, label=None, count=False):\n", 27 | " for N in range(n):\n", 28 | " if in_ is not None:\n", 29 | " in_pic = in_.data.cpu().view(-1, 28, 28)\n", 30 | " plt.figure(figsize=(18, 4))\n", 31 | " plt.suptitle(label + ' – real test data / reconstructions', color='w', fontsize=16)\n", 32 | " for i in range(4):\n", 33 | " plt.subplot(1,4,i+1)\n", 34 | " plt.imshow(in_pic[i+4*N])\n", 35 | " plt.axis('off')\n", 36 | " out_pic = out.data.cpu().view(-1, 28, 28)\n", 37 | " plt.figure(figsize=(18, 6))\n", 38 | " for i in range(4):\n", 39 | " plt.subplot(1,4,i+1)\n", 40 | " plt.imshow(out_pic[i+4*N])\n", 41 | " plt.axis('off')\n", 42 | " if count: plt.title(str(4 * N + i), 
color='w')" 43 | ] 44 | }, 45 | { 46 | "cell_type": "code", 47 | "execution_count": null, 48 | "metadata": {}, 49 | "outputs": [], 50 | "source": [ 51 | "# Set random seeds\n", 52 | "\n", 53 | "torch.manual_seed(1)\n", 54 | "torch.cuda.manual_seed(1)" 55 | ] 56 | }, 57 | { 58 | "cell_type": "code", 59 | "execution_count": null, 60 | "metadata": {}, 61 | "outputs": [], 62 | "source": [ 63 | "# Define data loading step\n", 64 | "\n", 65 | "batch_size = 256\n", 66 | "\n", 67 | "kwargs = {'num_workers': 1, 'pin_memory': True}\n", 68 | "train_loader = torch.utils.data.DataLoader(\n", 69 | " MNIST('./data', train=True, download=True,\n", 70 | " transform=transforms.ToTensor()),\n", 71 | " batch_size=batch_size, shuffle=True, **kwargs)\n", 72 | "test_loader = torch.utils.data.DataLoader(\n", 73 | " MNIST('./data', train=False, transform=transforms.ToTensor()),\n", 74 | " batch_size=batch_size, shuffle=True, **kwargs)" 75 | ] 76 | }, 77 | { 78 | "cell_type": "code", 79 | "execution_count": null, 80 | "metadata": {}, 81 | "outputs": [], 82 | "source": [ 83 | "device = torch.device(\"cuda:0\" if torch.cuda.is_available() else \"cpu\")" 84 | ] 85 | }, 86 | { 87 | "cell_type": "code", 88 | "execution_count": null, 89 | "metadata": {}, 90 | "outputs": [], 91 | "source": [ 92 | "d = 20\n", 93 | "\n", 94 | "class VAE(nn.Module):\n", 95 | " def __init__(self):\n", 96 | " super().__init__()\n", 97 | "\n", 98 | " self.encoder = nn.Sequential(\n", 99 | " nn.Linear(784, d ** 2),\n", 100 | " nn.ReLU(),\n", 101 | " nn.Linear(d ** 2, d * 2)\n", 102 | " )\n", 103 | "\n", 104 | " self.decoder = nn.Sequential(\n", 105 | " nn.Linear(d, d ** 2),\n", 106 | " nn.ReLU(),\n", 107 | " nn.Linear(d ** 2, 784),\n", 108 | " nn.Sigmoid(),\n", 109 | " )\n", 110 | "\n", 111 | " def reparameterize(self, mu, logvar):\n", 112 | " if self.training:\n", 113 | " std = logvar.mul(0.5).exp_()\n", 114 | " eps = std.data.new(std.size()).normal_()\n", 115 | " return eps.mul(std).add_(mu)\n", 116 | " else:\n", 117 | " return mu\n", 118 | "\n", 119 | " def forward(self, x):\n", 120 | " mu_logvar = self.encoder(x.view(-1, 784)).view(-1, 2, d)\n", 121 | " mu = mu_logvar[:, 0, :]\n", 122 | " logvar = mu_logvar[:, 1, :]\n", 123 | " z = self.reparameterize(mu, logvar)\n", 124 | " return self.decoder(z), mu, logvar\n", 125 | "\n", 126 | "model = VAE().to(device)" 127 | ] 128 | }, 129 | { 130 | "cell_type": "code", 131 | "execution_count": null, 132 | "metadata": {}, 133 | "outputs": [], 134 | "source": [ 135 | "learning_rate = 1e-3\n", 136 | "\n", 137 | "optimizer = torch.optim.Adam(\n", 138 | " model.parameters(),\n", 139 | " lr=learning_rate,\n", 140 | ")" 141 | ] 142 | }, 143 | { 144 | "cell_type": "code", 145 | "execution_count": null, 146 | "metadata": {}, 147 | "outputs": [], 148 | "source": [ 149 | "# Reconstruction + KL divergence losses summed over all elements and batch\n", 150 | "\n", 151 | "def loss_function(recon_x, x, mu, logvar):\n", 152 | " BCE = nn.functional.binary_cross_entropy(\n", 153 | " recon_x, x.view(-1, 784), size_average=False\n", 154 | " )\n", 155 | " KLD = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())\n", 156 | "\n", 157 | " return BCE + KLD" 158 | ] 159 | }, 160 | { 161 | "cell_type": "code", 162 | "execution_count": null, 163 | "metadata": { 164 | "scrolled": false 165 | }, 166 | "outputs": [], 167 | "source": [ 168 | "# Training and testing the VAE\n", 169 | "\n", 170 | "epochs = 10\n", 171 | "for epoch in range(1, epochs + 1):\n", 172 | " # Training\n", 173 | " model.train()\n", 174 | " train_loss = 0\n", 
175 | " for data, _ in train_loader:\n", 176 | " data = data.to(device)\n", 177 | " # ===================forward=====================\n", 178 | " recon_batch, mu, logvar = model(data)\n", 179 | " loss = loss_function(recon_batch, data, mu, logvar)\n", 180 | " train_loss += loss.item()\n", 181 | " # ===================backward====================\n", 182 | " optimizer.zero_grad()\n", 183 | " loss.backward()\n", 184 | " optimizer.step()\n", 185 | " # ===================log========================\n", 186 | " print(f'====> Epoch: {epoch} Average loss: {train_loss / len(train_loader.dataset):.4f}')\n", 187 | " \n", 188 | " # Testing\n", 189 | " \n", 190 | " with torch.no_grad():\n", 191 | " model.eval()\n", 192 | " test_loss = 0\n", 193 | " for data, _ in test_loader:\n", 194 | " data = data.to(device)\n", 195 | " # ===================forward=====================\n", 196 | " recon_batch, mu, logvar = model(data)\n", 197 | " test_loss += loss_function(recon_batch, data, mu, logvar).item()\n", 198 | " # ===================log========================\n", 199 | " test_loss /= len(test_loader.dataset)\n", 200 | " print(f'====> Test set loss: {test_loss:.4f}')\n", 201 | " display_images(data, recon_batch, 1, f'Epoch {epoch}')" 202 | ] 203 | }, 204 | { 205 | "cell_type": "code", 206 | "execution_count": null, 207 | "metadata": {}, 208 | "outputs": [], 209 | "source": [ 210 | "# Generating a few samples\n", 211 | "\n", 212 | "N = 16\n", 213 | "sample = torch.randn((N, 20), requires_grad=False).to(device)\n", 214 | "sample = model.decoder(sample)\n", 215 | "display_images(None, sample, N // 4, count=True)" 216 | ] 217 | }, 218 | { 219 | "cell_type": "code", 220 | "execution_count": null, 221 | "metadata": {}, 222 | "outputs": [], 223 | "source": [ 224 | "# Display last test batch\n", 225 | "\n", 226 | "display_images(None, data, 4, count=True)" 227 | ] 228 | }, 229 | { 230 | "cell_type": "code", 231 | "execution_count": null, 232 | "metadata": {}, 233 | "outputs": [], 234 | "source": [ 235 | "# Choose starting and ending point for the interpolation -> shows original and reconstructed\n", 236 | "\n", 237 | "A, B = 5, 14\n", 238 | "sample = model.decoder(torch.stack((mu[A].data, mu[B].data), 0))\n", 239 | "display_images(None, torch.stack(((\n", 240 | " data[A].data.view(-1),\n", 241 | " data[B].data.view(-1),\n", 242 | " sample.data[0],\n", 243 | " sample.data[1]\n", 244 | ")), 0))" 245 | ] 246 | }, 247 | { 248 | "cell_type": "code", 249 | "execution_count": null, 250 | "metadata": {}, 251 | "outputs": [], 252 | "source": [ 253 | "# Perform an interpolation between input A and B, in N steps\n", 254 | "\n", 255 | "N = 16\n", 256 | "code = torch.Tensor(N, 20).to(device)\n", 257 | "for i in range(N):\n", 258 | " code[i] = i / N * mu[B].data + (1 - i / N) * mu[A].data\n", 259 | "code = torch.tensor(code, requires_grad=True)\n", 260 | "sample = model.decoder(code)\n", 261 | "display_images(None, sample, N // 4, count=True)" 262 | ] 263 | }, 264 | { 265 | "cell_type": "code", 266 | "execution_count": null, 267 | "metadata": {}, 268 | "outputs": [], 269 | "source": [] 270 | } 271 | ], 272 | "metadata": { 273 | "kernelspec": { 274 | "display_name": "Python 3", 275 | "language": "python", 276 | "name": "python3" 277 | }, 278 | "language_info": { 279 | "codemirror_mode": { 280 | "name": "ipython", 281 | "version": 3 282 | }, 283 | "file_extension": ".py", 284 | "mimetype": "text/x-python", 285 | "name": "python", 286 | "nbconvert_exporter": "python", 287 | "pygments_lexer": "ipython3", 288 | "version": "3.6.5" 
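The element-wise interpolation loop above can also be written in vectorised form; a sketch assuming the `model`, `mu`, `A`, `B`, and `device` objects from the preceding cells (note that, unlike the `i / N` weights in the loop, `torch.linspace` includes the endpoint `mu[B]`):

```python
# Vectorised linear interpolation between two latent codes.
N = 16
w = torch.linspace(0, 1, steps=N).unsqueeze(1).to(device)  # (N, 1) weights
code = (1 - w) * mu[A].data + w * mu[B].data               # (N, 20) codes
sample = model.decoder(code)
display_images(None, sample, N // 4, count=True)
```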
289 | } 290 | }, 291 | "nbformat": 4, 292 | "nbformat_minor": 2 293 | } 294 | -------------------------------------------------------------------------------- /08-1-classify_seq_data.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": { 6 | "collapsed": true 7 | }, 8 | "source": [ 9 | "An example of many-to-one (sequence classification):\n", 10 | "\n", 11 | "\n", 12 | "Original experiment from Hochreiter & Schmidhuber (1997):\n", 13 | "\n", 14 | "    The goal is to classify sequences. Elements and targets are represented locally\n", 15 | "    (input vectors with only one non-zero bit). The sequence starts with a B, ends\n", 16 | "    with an E (the \"trigger symbol\") and otherwise consists of randomly chosen symbols\n", 17 | "    from the set {a, b, c, d} except for two elements at positions t1 and t2 that are\n", 18 | "    either X or Y. The sequence length is randomly chosen between 100 and 110, t1 is\n", 19 | "    randomly chosen between 10 and 20, and t2 is randomly chosen between 50 and 60.\n", 20 | "    There are 4 sequence classes Q, R, S, U which depend on the temporal order of X and Y.\n", 21 | "    The rules are:\n", 22 | "    X, X -> Q,\n", 23 | "    X, Y -> R,\n", 24 | "    Y, X -> S,\n", 25 | "    Y, Y -> U. " 26 | ] 27 | }, 28 | { 29 | "cell_type": "code", 30 | "execution_count": null, 31 | "metadata": {}, 32 | "outputs": [], 33 | "source": [ 34 | "from sequential_tasks import TemporalOrderExp6aSequence\n", 35 | "\n", 36 | "# generate data\n", 37 | "dg = TemporalOrderExp6aSequence.get_predefined_generator(\n", 38 | "    TemporalOrderExp6aSequence.DifficultyLevel.EASY)\n" 39 | ] 40 | }, 41 | { 42 | "cell_type": "code", 43 | "execution_count": null, 44 | "metadata": {}, 45 | "outputs": [], 46 | "source": [ 47 | "# Raw sequences and their classes:\n", 48 | "for n in range(5):\n", 49 | "    x, y = dg.generate_pair()\n", 50 | "    print('{} ----> {}'.format(x, y))\n" 51 | ] 52 | }, 53 | { 54 | "cell_type": "code", 55 | "execution_count": null, 56 | "metadata": {}, 57 | "outputs": [], 58 | "source": [ 59 | "# Encoding our data into RNN-friendly data format\n", 60 | "\n", 61 | "# Single data pair example:\n", 62 | "x, y = dg.generate_pair()\n", 63 | "print('{} ----> {}'.format(x, y))\n", 64 | "    \n", 65 | "enc_x = dg.encode_x(x)\n", 66 | "enc_y = dg.encode_y(y)\n", 67 | "\n", 68 | "print('Encoded input sequence:')\n", 69 | "print(enc_x)\n", 70 | "print('Encoded output sequence:')\n", 71 | "print(enc_y)\n" 72 | ] 73 | }, 74 | { 75 | "cell_type": "code", 76 | "execution_count": null, 77 | "metadata": {}, 78 | "outputs": [], 79 | "source": [ 80 | "# let's take a batch of training pairs\n", 81 | "batch_x, batch_y = dg[0]\n", 82 | "\n", 83 | "# batch_x has the shape (batch_size, max_seq_length, num_symbols)\n", 84 | "print('Batch_x shape = ', batch_x.shape)\n", 85 | "\n", 86 | "# batch_y has the shape (batch_size, num_classes)\n", 87 | "print('Batch_y shape = ', batch_y.shape)\n", 88 | "\n", 89 | "# inputs are zero-padded (added zero prefix)\n", 90 | "# to obtain sequences of equal length\n", 91 | "print(batch_x[0])\n", 92 | "\n" 93 | ] 94 | }, 95 | { 96 | "cell_type": "code", 97 | "execution_count": null, 98 | "metadata": {}, 99 | "outputs": [], 100 | "source": [] 101 | }, 102 | { 103 | "cell_type": "code", 104 | "execution_count": null, 105 | "metadata": {}, 106 | "outputs": [], 107 | "source": [] 108 | } 109 | ], 110 | "metadata": { 111 | "kernelspec": { 112 | "display_name": "Python 3", 113 | "language": "python", 114 |
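The class rules quoted above reduce to a lookup on the ordered pair of special symbols; a toy illustration (a hypothetical helper for intuition only, not part of `sequential_tasks`):

```python
# Toy illustration: the class depends only on the temporal order of the
# two special symbols (X or Y) appearing between B and E.
RULES = {('X', 'X'): 'Q', ('X', 'Y'): 'R', ('Y', 'X'): 'S', ('Y', 'Y'): 'U'}

def classify(seq):
    specials = [s for s in seq if s in 'XY']  # the symbols at t1 and t2
    return RULES[tuple(specials)]

print(classify('BbXcXcbE'))  # -> Q
```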
"name": "python3" 115 | }, 116 | "language_info": { 117 | "codemirror_mode": { 118 | "name": "ipython", 119 | "version": 3 120 | }, 121 | "file_extension": ".py", 122 | "mimetype": "text/x-python", 123 | "name": "python", 124 | "nbconvert_exporter": "python", 125 | "pygments_lexer": "ipython3", 126 | "version": "3.6.5" 127 | } 128 | }, 129 | "nbformat": 4, 130 | "nbformat_minor": 1 131 | } 132 | -------------------------------------------------------------------------------- /08-2-echo_data.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": { 6 | "collapsed": true 7 | }, 8 | "source": [ 9 | "Echoing signal n steps is an example of synchronized many-to-many task:" 10 | ] 11 | }, 12 | { 13 | "cell_type": "code", 14 | "execution_count": null, 15 | "metadata": {}, 16 | "outputs": [], 17 | "source": [ 18 | "from sequential_tasks import EchoData\n", 19 | "\n", 20 | "batch_size = 5\n", 21 | "echo_step = 3\n", 22 | "series_length = 20000\n", 23 | "truncated_length = 10\n", 24 | "\n", 25 | "data_gen = EchoData(\n", 26 | " echo_step=echo_step,\n", 27 | " batch_size=batch_size,\n", 28 | " series_length=series_length,\n", 29 | " truncated_length=truncated_length)" 30 | ] 31 | }, 32 | { 33 | "cell_type": "code", 34 | "execution_count": null, 35 | "metadata": {}, 36 | "outputs": [], 37 | "source": [ 38 | "# Let's print first 20 timesteps of the first sequences to see the echo data:\n", 39 | "print('(1st sequence) x = ', data_gen.raw_x[0, :20], '... ')\n", 40 | "print('(1st sequence) y = ', data_gen.raw_y[0, :20], '... ')" 41 | ] 42 | }, 43 | { 44 | "cell_type": "code", 45 | "execution_count": null, 46 | "metadata": {}, 47 | "outputs": [], 48 | "source": [ 49 | "# batch_size different sequences are created:\n", 50 | "print('bax = ')\n", 51 | "print(data_gen.raw_x[:, :20])\n", 52 | "print('y = ')\n", 53 | "print(data_gen.raw_y[:, :20])\n", 54 | "\n", 55 | "print('raw_x shape:', data_gen.raw_x.shape) # shape = (batch_size, sequence_length)\n", 56 | "print('raw_y shape:', data_gen.raw_y.shape) # shape = (batch_size, sequence_length)\n" 57 | ] 58 | }, 59 | { 60 | "cell_type": "code", 61 | "execution_count": null, 62 | "metadata": {}, 63 | "outputs": [], 64 | "source": [ 65 | "# In order to use RNNs data is organized into tensors of size:\n", 66 | "# [batch_size, truncated_sequence_length, feature_dim\n", 67 | "\n", 68 | "i_batch = 0\n", 69 | "print('batch x shape:', data_gen.x_batches[i_batch].shape)\n", 70 | "print('batch y shape:', data_gen.y_batches[i_batch].shape)\n" 71 | ] 72 | }, 73 | { 74 | "cell_type": "code", 75 | "execution_count": null, 76 | "metadata": {}, 77 | "outputs": [], 78 | "source": [ 79 | "\n", 80 | "print(data_gen.x_batches[i_batch])\n", 81 | "print(data_gen.y_batches[i_batch])\n" 82 | ] 83 | }, 84 | { 85 | "cell_type": "markdown", 86 | "metadata": {}, 87 | "source": [ 88 | " " 89 | ] 90 | }, 91 | { 92 | "cell_type": "code", 93 | "execution_count": null, 94 | "metadata": {}, 95 | "outputs": [], 96 | "source": [] 97 | } 98 | ], 99 | "metadata": { 100 | "kernelspec": { 101 | "display_name": "Python 3", 102 | "language": "python", 103 | "name": "python3" 104 | }, 105 | "language_info": { 106 | "codemirror_mode": { 107 | "name": "ipython", 108 | "version": 3 109 | }, 110 | "file_extension": ".py", 111 | "mimetype": "text/x-python", 112 | "name": "python", 113 | "nbconvert_exporter": "python", 114 | "pygments_lexer": "ipython3", 115 | "version": "3.6.5" 116 | } 117 | }, 118 | 
"nbformat": 4, 119 | "nbformat_minor": 1 120 | } 121 | -------------------------------------------------------------------------------- /08-3-temporal_order_classification_experiments.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "code", 5 | "execution_count": null, 6 | "metadata": {}, 7 | "outputs": [], 8 | "source": [ 9 | "from sequential_tasks import TemporalOrderExp6aSequence\n", 10 | "\n", 11 | "import torch\n", 12 | "import torch.nn as nn\n", 13 | "import torch.nn.functional as F\n", 14 | "import torch.optim as optim\n", 15 | "\n", 16 | "torch.manual_seed(1)" 17 | ] 18 | }, 19 | { 20 | "cell_type": "markdown", 21 | "metadata": {}, 22 | "source": [ 23 | "# Specify experiment settings and prepare the data" 24 | ] 25 | }, 26 | { 27 | "cell_type": "code", 28 | "execution_count": null, 29 | "metadata": {}, 30 | "outputs": [], 31 | "source": [ 32 | "# experiments settings\n", 33 | "settings = {\n", 34 | " \"difficulty\": TemporalOrderExp6aSequence.DifficultyLevel.EASY,\n", 35 | " \"batch_size\": 32,\n", 36 | " \"h_units\": 4,\n", 37 | " \"max_epochs\": 10\n", 38 | "}" 39 | ] 40 | }, 41 | { 42 | "cell_type": "code", 43 | "execution_count": null, 44 | "metadata": {}, 45 | "outputs": [], 46 | "source": [ 47 | "#training data\n", 48 | "train_data_gen = TemporalOrderExp6aSequence.get_predefined_generator(\n", 49 | " settings['difficulty'],\n", 50 | " settings['batch_size'])\n", 51 | "train_size = len(train_data_gen)\n", 52 | "\n", 53 | "# testing data\n", 54 | "test_data_gen = TemporalOrderExp6aSequence.get_predefined_generator(\n", 55 | " settings['difficulty'],\n", 56 | " settings['batch_size'])\n", 57 | "test_size = len(test_data_gen) " 58 | ] 59 | }, 60 | { 61 | "cell_type": "markdown", 62 | "metadata": {}, 63 | "source": [ 64 | "# Define neural network" 65 | ] 66 | }, 67 | { 68 | "cell_type": "code", 69 | "execution_count": null, 70 | "metadata": {}, 71 | "outputs": [], 72 | "source": [ 73 | "class SimpleRNN(nn.Module):\n", 74 | "\n", 75 | " def __init__(self, input_size, rnn_hidden_size, output_size):\n", 76 | "\n", 77 | " super(SimpleRNN, self).__init__()\n", 78 | " self.rnn = torch.nn.RNN(input_size, rnn_hidden_size, num_layers=1, nonlinearity='relu', batch_first=True)\n", 79 | " self.linear = torch.nn.Linear(rnn_hidden_size, output_size) \n", 80 | "\n", 81 | " def forward(self, x):\n", 82 | " x, _ = self.rnn(x)\n", 83 | " x = self.linear(x)\n", 84 | " return F.log_softmax(x, dim=1)" 85 | ] 86 | }, 87 | { 88 | "cell_type": "code", 89 | "execution_count": null, 90 | "metadata": {}, 91 | "outputs": [], 92 | "source": [ 93 | "device = torch.device(\"cuda:0\" if torch.cuda.is_available() else \"cpu\")" 94 | ] 95 | }, 96 | { 97 | "cell_type": "markdown", 98 | "metadata": {}, 99 | "source": [ 100 | "# Define training loop" 101 | ] 102 | }, 103 | { 104 | "cell_type": "code", 105 | "execution_count": null, 106 | "metadata": {}, 107 | "outputs": [], 108 | "source": [ 109 | "def train():\n", 110 | " model.train()\n", 111 | " correct = 0\n", 112 | " for batch_idx in range(train_size):\n", 113 | " data, target = train_data_gen[batch_idx]\n", 114 | " data, target = torch.from_numpy(data).float().to(device), torch.from_numpy(target).long().to(device)\n", 115 | " optimizer.zero_grad()\n", 116 | " y_pred = model(data)\n", 117 | " loss = criterion(y_pred, target)\n", 118 | " loss.backward()\n", 119 | " optimizer.step()\n", 120 | " \n", 121 | " pred = y_pred.max(1, keepdim=True)[1]\n", 122 | " correct += 
pred.eq(target.view_as(pred)).sum().item()\n", 123 | " return correct, loss " 124 | ] 125 | }, 126 | { 127 | "cell_type": "code", 128 | "execution_count": null, 129 | "metadata": {}, 130 | "outputs": [], 131 | "source": [ 132 | "def test():\n", 133 | " model.eval() \n", 134 | " correct = 0\n", 135 | " with torch.no_grad():\n", 136 | " for batch_idx in range(test_size):\n", 137 | " data, target = test_data_gen[batch_idx]\n", 138 | " data, target = torch.from_numpy(data).float().to(device), torch.from_numpy(target).long().to(device)\n", 139 | " y_pred = model(data)\n", 140 | " pred = y_pred.max(1, keepdim=True)[1]\n", 141 | " correct += pred.eq(target.view_as(pred)).sum().item()\n", 142 | " return correct" 143 | ] 144 | }, 145 | { 146 | "cell_type": "markdown", 147 | "metadata": {}, 148 | "source": [ 149 | "# Initialize the Model and Optimizer" 150 | ] 151 | }, 152 | { 153 | "cell_type": "code", 154 | "execution_count": null, 155 | "metadata": {}, 156 | "outputs": [], 157 | "source": [ 158 | "model = SimpleRNN(train_data_gen.n_symbols, settings['h_units'], train_data_gen.n_classes)\n", 159 | "\n", 160 | "criterion = torch.nn.CrossEntropyLoss()\n", 161 | "optimizer = torch.optim.RMSprop(model.parameters(), lr=0.001)" 162 | ] 163 | }, 164 | { 165 | "cell_type": "markdown", 166 | "metadata": {}, 167 | "source": [ 168 | "# Train the model" 169 | ] 170 | }, 171 | { 172 | "cell_type": "code", 173 | "execution_count": null, 174 | "metadata": {}, 175 | "outputs": [], 176 | "source": [ 177 | "#train for max_epochs epochs\n", 178 | "epochs = settings['max_epochs']\n", 179 | "epoch = 0\n", 180 | "while epoch < epochs:\n", 181 | " correct, loss = train()\n", 182 | "\n", 183 | " epoch += 1\n", 184 | " train_accuracy = float(correct) / train_size\n", 185 | " print('Train Epoch: {}/{}, loss: {:.4f}, accuracy {:2.2f}'.format(epoch, epochs, loss.item(), train_accuracy))\n", 186 | "\n", 187 | "#test \n", 188 | "correct = test()\n", 189 | "test_accuracy = float(correct) / test_size\n", 190 | "print('\\nTest accuracy: {}'.format(test_accuracy))" 191 | ] 192 | }, 193 | { 194 | "cell_type": "code", 195 | "execution_count": null, 196 | "metadata": {}, 197 | "outputs": [], 198 | "source": [ 199 | "print('acc = {:.2f}%.'.format(test_accuracy))" 200 | ] 201 | }, 202 | { 203 | "cell_type": "code", 204 | "execution_count": null, 205 | "metadata": {}, 206 | "outputs": [], 207 | "source": [] 208 | }, 209 | { 210 | "cell_type": "code", 211 | "execution_count": null, 212 | "metadata": {}, 213 | "outputs": [], 214 | "source": [] 215 | } 216 | ], 217 | "metadata": { 218 | "kernelspec": { 219 | "display_name": "Codas ML", 220 | "language": "python", 221 | "name": "codasml" 222 | }, 223 | "language_info": { 224 | "codemirror_mode": { 225 | "name": "ipython", 226 | "version": 3 227 | }, 228 | "file_extension": ".py", 229 | "mimetype": "text/x-python", 230 | "name": "python", 231 | "nbconvert_exporter": "python", 232 | "pygments_lexer": "ipython3", 233 | "version": "3.6.5" 234 | } 235 | }, 236 | "nbformat": 4, 237 | "nbformat_minor": 1 238 | } 239 | -------------------------------------------------------------------------------- /08-4-echo_experiments.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "code", 5 | "execution_count": null, 6 | "metadata": {}, 7 | "outputs": [], 8 | "source": [ 9 | "from sequential_tasks import EchoData\n", 10 | "import numpy as np" 11 | ] 12 | }, 13 | { 14 | "cell_type": "code", 15 | "execution_count": null, 16 | 
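One thing to watch in the loops above: `train_size = len(train_data_gen)` counts *batches*, not individual sequences, so the printed "accuracy" is really the average number of correct predictions per batch and can exceed 1. To report a fraction you would normalise by the number of sequences instead; a hypothetical correction, assuming each batch holds `settings['batch_size']` sequences:

```python
# Normalise by the number of sequences seen, not the number of batches.
train_accuracy = correct / (train_size * settings['batch_size'])
```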
"metadata": {}, 17 | "outputs": [], 18 | "source": [ 19 | "import torch\n", 20 | "import torch.nn as nn\n", 21 | "import torch.nn.functional as F\n", 22 | "import torch.optim as optim\n", 23 | "\n", 24 | "torch.manual_seed(1)" 25 | ] 26 | }, 27 | { 28 | "cell_type": "markdown", 29 | "metadata": {}, 30 | "source": [ 31 | "# Specify experiment settings and prepare the data" 32 | ] 33 | }, 34 | { 35 | "cell_type": "code", 36 | "execution_count": null, 37 | "metadata": {}, 38 | "outputs": [], 39 | "source": [ 40 | "# experiments settings\n", 41 | "settings = {\n", 42 | " \"series_length\": 20000,\n", 43 | " \"echo_step\": 3,\n", 44 | " \"truncated_length\": 20,\n", 45 | " \"batch_size\": 5,\n", 46 | " \"h_units\": 4,\n", 47 | " \"max_epochs\": 5\n", 48 | "}" 49 | ] 50 | }, 51 | { 52 | "cell_type": "code", 53 | "execution_count": null, 54 | "metadata": {}, 55 | "outputs": [], 56 | "source": [ 57 | "#training data\n", 58 | "train_data_gen = EchoData(\n", 59 | " series_length=settings['series_length'],\n", 60 | " truncated_length=settings['truncated_length'],\n", 61 | " echo_step=settings['echo_step'],\n", 62 | " batch_size=settings['batch_size'])\n", 63 | "train_size = len(train_data_gen)\n", 64 | "\n", 65 | "#testing \n", 66 | "test_data_gen = EchoData(\n", 67 | " series_length=settings['series_length'],\n", 68 | " truncated_length=settings['truncated_length'],\n", 69 | " echo_step=settings['echo_step'],\n", 70 | " batch_size=settings['batch_size'])\n", 71 | "test_size = len(test_data_gen) " 72 | ] 73 | }, 74 | { 75 | "cell_type": "markdown", 76 | "metadata": {}, 77 | "source": [ 78 | "# Define the model" 79 | ] 80 | }, 81 | { 82 | "cell_type": "code", 83 | "execution_count": null, 84 | "metadata": {}, 85 | "outputs": [], 86 | "source": [ 87 | "class SimpleRNN(nn.Module):\n", 88 | "\n", 89 | " def __init__(self, input_size, rnn_hidden_size, output_size):\n", 90 | "\n", 91 | " super(SimpleRNN, self).__init__()\n", 92 | " self.rnn_hidden_size = rnn_hidden_size\n", 93 | " self.rnn = torch.nn.RNN(input_size, self.rnn_hidden_size, num_layers=1, nonlinearity='relu', batch_first=True)\n", 94 | " self.linear = torch.nn.Linear(rnn_hidden_size, 1)\n", 95 | "\n", 96 | " def forward(self, x, hidden):\n", 97 | " x, hidden = self.rnn(x, hidden) \n", 98 | " x = self.linear(x)\n", 99 | " return nn.Sigmoid()(x), hidden\n", 100 | "\n", 101 | " def init_hidden(self, batch_size):\n", 102 | " weight = next(self.parameters()).data\n", 103 | " return weight.new(1, batch_size, self.rnn_hidden_size).zero_()" 104 | ] 105 | }, 106 | { 107 | "cell_type": "code", 108 | "execution_count": null, 109 | "metadata": {}, 110 | "outputs": [], 111 | "source": [ 112 | "device = torch.device(\"cuda:0\" if torch.cuda.is_available() else \"cpu\")" 113 | ] 114 | }, 115 | { 116 | "cell_type": "markdown", 117 | "metadata": {}, 118 | "source": [ 119 | "## Define training and test loops" 120 | ] 121 | }, 122 | { 123 | "cell_type": "code", 124 | "execution_count": null, 125 | "metadata": {}, 126 | "outputs": [], 127 | "source": [ 128 | "def train(hidden):\n", 129 | " model.train()\n", 130 | " \n", 131 | " correct = 0\n", 132 | " for batch_idx in range(train_size):\n", 133 | " data, target = train_data_gen[batch_idx]\n", 134 | " data, target = torch.from_numpy(data).float().to(device), torch.from_numpy(target).float().to(device)\n", 135 | " optimizer.zero_grad()\n", 136 | " y_pred, hidden = model(data, hidden)\n", 137 | " loss = criterion(y_pred, target)\n", 138 | " loss.backward(retain_graph=True)\n", 139 | " optimizer.step()\n", 140 | " 
\n", 141 | " pred = (y_pred > 0.5).float()\n", 142 | " correct += (pred == target).sum().item()\n", 143 | " \n", 144 | " return correct, loss, hidden " 145 | ] 146 | }, 147 | { 148 | "cell_type": "code", 149 | "execution_count": null, 150 | "metadata": {}, 151 | "outputs": [], 152 | "source": [ 153 | "def test(hidden):\n", 154 | " model.eval() \n", 155 | " correct = 0\n", 156 | " with torch.no_grad():\n", 157 | " for batch_idx in range(test_size):\n", 158 | " data, target = test_data_gen[batch_idx]\n", 159 | " data, target = torch.from_numpy(data).float().to(device), torch.from_numpy(target).float().to(device)\n", 160 | " y_pred, hidden = model(data, hidden)\n", 161 | " \n", 162 | " pred = (y_pred > 0.5).float()\n", 163 | " correct += (pred == target).sum().item()\n", 164 | "\n", 165 | " return correct" 166 | ] 167 | }, 168 | { 169 | "cell_type": "markdown", 170 | "metadata": {}, 171 | "source": [ 172 | "# Initialize the Model and Optimizer" 173 | ] 174 | }, 175 | { 176 | "cell_type": "code", 177 | "execution_count": null, 178 | "metadata": {}, 179 | "outputs": [], 180 | "source": [ 181 | "feature_dim = 1 #since we have a scalar series\n", 182 | "model = SimpleRNN(1, settings['h_units'], 1) \n", 183 | "model.to(device)\n", 184 | "hidden = model.init_hidden(train_data_gen.batch_size) #initialize hidden states for RNN \n", 185 | " \n", 186 | "criterion = torch.nn.BCEWithLogitsLoss()\n", 187 | "optimizer = torch.optim.RMSprop(model.parameters(), lr=0.001)" 188 | ] 189 | }, 190 | { 191 | "cell_type": "code", 192 | "execution_count": null, 193 | "metadata": {}, 194 | "outputs": [], 195 | "source": [ 196 | "epochs = settings['max_epochs']\n", 197 | "epoch = 0\n", 198 | "\n", 199 | "while epoch < epochs:\n", 200 | " correct, loss, hidden = train(hidden)\n", 201 | " epoch += 1\n", 202 | " train_accuracy = float(correct) / train_size\n", 203 | " print('Train Epoch: {}/{}, loss: {:.4f}, accuracy {:2.2f}'.format(epoch, epochs, loss.item(), train_accuracy))\n", 204 | "\n", 205 | "#test \n", 206 | "correct = test(hidden)\n", 207 | "test_accuracy = float(correct) / test_size\n", 208 | "print('\\nTest accuracy: {}'.format(test_accuracy))" 209 | ] 210 | }, 211 | { 212 | "cell_type": "code", 213 | "execution_count": null, 214 | "metadata": {}, 215 | "outputs": [], 216 | "source": [] 217 | }, 218 | { 219 | "cell_type": "code", 220 | "execution_count": null, 221 | "metadata": {}, 222 | "outputs": [], 223 | "source": [] 224 | } 225 | ], 226 | "metadata": { 227 | "kernelspec": { 228 | "display_name": "Codas ML", 229 | "language": "python", 230 | "name": "codasml" 231 | }, 232 | "language_info": { 233 | "codemirror_mode": { 234 | "name": "ipython", 235 | "version": 3 236 | }, 237 | "file_extension": ".py", 238 | "mimetype": "text/x-python", 239 | "name": "python", 240 | "nbconvert_exporter": "python", 241 | "pygments_lexer": "ipython3", 242 | "version": "3.6.5" 243 | } 244 | }, 245 | "nbformat": 4, 246 | "nbformat_minor": 1 247 | } 248 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # PyTorch-Deep-Learning-Minicourse 2 | Minicourse in Deep Learning with PyTorch 3 | 4 | These lessons, developed during the course of several years while I've been teaching at Purdue and NYU, are here proposed for the Computational and Data Science for High Energy Physics ([CoDaS-HEP](http://codas-hep.org/)) summer school at Princeton University. 
5 | I'll upload the videos and link to them as soon as they are made available to me. 6 | I'm also planning to record them in a quieter environment and at a slower pace, add them to my YouTube channel, and make them available [here](https://github.com/Atcold/pytorch-Video-Tutorials). 7 | 8 | ## Table of contents 9 | `T`: theory 10 | `P`: practice 11 | 12 | 1. `T` Learning paradigms: supervised-, unsupervised-, and reinforcement-learning 13 | 2. `P` Getting started with the tools: Jupyter notebook, PyTorch tensors and autodifferentiation 14 | 3. `T+P` Neural net's forward and backward propagation for classification 15 | 4. `T+P` Convolutional neural nets improve performance by exploiting data nature 16 | 5. `T+P` Unsupervised learning: vanilla and variational autoencoders, generative adversarial nets 17 | 6. `T+P` Recurrent nets natively support sequential data 18 | 19 | ## Sessions 20 | 1. Time slot 1 (1h30min + 45 min = 2h15min) on Tuesday afternoon (1, 2, 3) 21 | 2. Time slot 2 (1h30min + 45 min = 2h15min) on Wednesday afternoon (4) 22 | 3. Extra section (45min) on Thursday afternoon (5) 23 | 4. Extra section (1h30min) on Friday morning (6) 24 | 25 | ## Notebooks visualisation 26 | *Jupyter Notebooks* are used throughout these lectures for interactive data exploration and visualisation. 27 | 28 | I use dark styles for both *GitHub* and *Jupyter Notebook*. 29 | You better do the same, or they will look ugly. 30 | To see the content appropriately, install the following: 31 | 32 | - [*Jupyter Notebook* dark theme](https://userstyles.org/styles/153443/jupyter-notebook-dark); 33 | - [*GitHub* dark theme](https://userstyles.org/styles/37035/github-dark) and comment out the `invert #fff to #181818` code block. 34 | 35 | ## Media coverage 36 | - Princeton Research Computing [article](https://researchcomputing.princeton.edu/news/princetons-codas-hep-summer-school-young-physicists-gain-edge-computational-skills) 37 | - Princeton University main page [article](https://www.princeton.edu/news/2018/07/27/princeton-summer-program-graduate-student-physicists-gain-computational-skills) 38 | 39 | ## Keeping in touch 40 | Feel free to follow me on [Twitter](https://twitter.com/AlfredoCanziani) and subscribe to my [YouTube channel](https://www.youtube.com/user/Atcold/) to have the latest free educational material. 41 | 42 | # Getting started 43 | To be able to follow the workshop exercises, you are going to need a laptop with Miniconda (a minimal version of Anaconda) and several Python packages installed. 44 | The following instructions work as-is for Mac and Ubuntu Linux users; Windows users need to install and work in the Git Bash terminal. 45 | 46 | ## Download and install Miniconda 47 | Please go to the [Anaconda website](https://conda.io/miniconda.html). 48 | Download and install *the latest* Miniconda version for *Python* 3.6 for your operating system. 49 | 50 | ```bash 51 | wget <miniconda-installer-url> 52 | sh <miniconda-installer-file> 53 | ``` 54 | 55 | After that, type: 56 | 57 | ```bash 58 | conda --help 59 | ``` 60 | 61 | and read the manual.
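Once the `codas-ml` environment is created and activated (see the next sections), you can sanity-check the installation from Python; a quick check along these lines:

```python
import torch
import torchvision

print(torch.__version__)          # conda-envt.yml pins pytorch=0.4.0
print(torch.cuda.is_available())  # True only if a usable GPU is present
```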
62 | 63 | ## Check out the git repository with the exercises 64 | Once Miniconda is ready, check out the course repository and proceed with setting up the environment: 65 | 66 | ```bash 67 | git clone https://github.com/Atcold/PyTorch-Deep-Learning-Minicourse 68 | ``` 69 | 70 | If you do not have git and do not wish to install it, just download the repository as a zip file and unpack it: 71 | 72 | ```bash 73 | wget https://github.com/Atcold/PyTorch-Deep-Learning-Minicourse/archive/master.zip 74 | #For Mac users: 75 | #curl -O https://github.com/Atcold/PyTorch-Deep-Learning-Minicourse/archive/master.zip 76 | unzip master.zip 77 | ``` 78 | 79 | ## Create an isolated Miniconda environment 80 | Change into the course folder, then type: 81 | 82 | ```bash 83 | #cd PyTorch-Deep-Learning-Minicourse 84 | conda env create -f conda-envt.yml 85 | source activate codas-ml 86 | ``` 87 | 88 | ## Enable the Anaconda kernel in Jupyter 89 | To make the newly created Miniconda environment visible in Jupyter, install `ipykernel`: 90 | 91 | ```bash 92 | python -m ipykernel install --user --name codas-ml --display-name "Codas ML" 93 | ``` 94 | 95 | ## Start Jupyter Notebook 96 | If you are working in a JupyterLab container, double-click on the "Files" tab in the upper right corner. 97 | Locate the first notebook and double-click to open it. 98 | Do not attempt to start `jupyter` from the terminal window. 99 | 100 | If working on a laptop, start it from the terminal as usual: 101 | 102 | ```bash 103 | jupyter notebook 104 | ``` 105 | -------------------------------------------------------------------------------- /conda-envt.yml: -------------------------------------------------------------------------------- 1 | name: codas-ml 2 | channels: 3 | - conda-forge 4 | - pytorch 5 | dependencies: 6 | - python=3.6 7 | - pytorch=0.4.0 8 | - torchvision 9 | - tensorflow=1.8.0 10 | - matplotlib 11 | - jupyter 12 | -------------------------------------------------------------------------------- /img/train.gif: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/SudalaiRajkumar/PyTorch-Deep-Learning-Minicourse/d2b0970935ec19cb4526ceb4ec028b94a3958c25/img/train.gif -------------------------------------------------------------------------------- /plot_conf.py: -------------------------------------------------------------------------------- 1 | # matplotlib and stuff 2 | import matplotlib.pyplot as plt 3 | import numpy as np 4 | 5 | 6 | def plt_style(c='k'): 7 | """ 8 | Set plotting style for bright (``c = 'w'``) or dark (``c = 'k'``) backgrounds 9 | 10 | :param c: colour, can be set to ``'w'`` or ``'k'`` (which is the default) 11 | :type c: str 12 | """ 13 | import matplotlib as mpl 14 | from matplotlib import rc 15 | 16 | # Reset previous configuration 17 | mpl.rcParams.update(mpl.rcParamsDefault) 18 | # %matplotlib inline # not from script 19 | get_ipython().run_line_magic('matplotlib', 'inline') 20 | 21 | # configuration for bright background 22 | if c == 'w': 23 | plt.style.use('bmh') 24 | 25 | # configurations for dark background 26 | if c == 'k': 27 | plt.style.use(['dark_background', 'bmh']) 28 | 29 | # remove background colour, set figure size 30 | rc('figure', figsize=(16, 8), max_open_warning=False) 31 | rc('axes', facecolor='none') 32 | 33 | 34 | def plt_interactive(c='k'): 35 | from matplotlib import rc 36 | import matplotlib as mpl 37 | mpl.rcParams.update(mpl.rcParamsDefault) 38 | get_ipython().run_line_magic('matplotlib', 'notebook') 39 | plt.rc('figure', figsize=(9.5, 4.75),
facecolor=c) 40 | # configuration for bright background 41 | if c == 'w': 42 | plt.style.use('bmh') 43 | 44 | # configurations for dark background 45 | if c == 'k': 46 | plt.style.use(['dark_background', 'bmh']) 47 | rc('axes', facecolor='none') 48 | 49 | plt_style() 50 | -------------------------------------------------------------------------------- /raw/keras-regularisation.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "# Regularisation in NNs" 8 | ] 9 | }, 10 | { 11 | "cell_type": "markdown", 12 | "metadata": {}, 13 | "source": [ 14 | "## 1. Set up the environment" 15 | ] 16 | }, 17 | { 18 | "cell_type": "code", 19 | "execution_count": null, 20 | "metadata": {}, 21 | "outputs": [], 22 | "source": [ 23 | "# Import statements\n", 24 | "from tensorflow import keras as kr\n", 25 | "import tensorflow as tf\n", 26 | "import numpy as np\n", 27 | "import matplotlib.pyplot as plt" 28 | ] 29 | }, 30 | { 31 | "cell_type": "code", 32 | "execution_count": null, 33 | "metadata": {}, 34 | "outputs": [], 35 | "source": [ 36 | "# Set my plotting style\n", 37 | "plt.style.use(('dark_background', 'bmh'))\n", 38 | "plt.rc('axes', facecolor='none')\n", 39 | "plt.rc('figure', figsize=(16, 4))" 40 | ] 41 | }, 42 | { 43 | "cell_type": "code", 44 | "execution_count": null, 45 | "metadata": {}, 46 | "outputs": [], 47 | "source": [ 48 | "# Set random seed for reproducibility\n", 49 | "np.random.seed(0)\n", 50 | "tf.set_random_seed(0)" 51 | ] 52 | }, 53 | { 54 | "cell_type": "code", 55 | "execution_count": null, 56 | "metadata": {}, 57 | "outputs": [], 58 | "source": [ 59 | "# Shortcuts\n", 60 | "imdb = kr.datasets.imdb\n", 61 | "Tokeniser = kr.preprocessing.text.Tokenizer\n", 62 | "models = kr.models\n", 63 | "layers = kr.layers\n", 64 | "regularisers = kr.regularizers\n", 65 | "constraints = kr.constraints\n", 66 | "EarlyStopping = kr.callbacks.EarlyStopping\n", 67 | "ModelCheckpoint = kr.callbacks.ModelCheckpoint" 68 | ] 69 | }, 70 | { 71 | "cell_type": "markdown", 72 | "metadata": {}, 73 | "source": [ 74 | "## 2. 
Loading the data set" 75 | ] 76 | }, 77 | { 78 | "cell_type": "code", 79 | "execution_count": null, 80 | "metadata": {}, 81 | "outputs": [], 82 | "source": [ 83 | "# Set the number of features we want\n", 84 | "features_nb = 1000\n", 85 | "\n", 86 | "# Load data and target vector from movie review data\n", 87 | "(train_data, train_target), (test_data, test_target) = imdb.load_data(num_words=features_nb)\n", 88 | "\n", 89 | "# Convert movie review data to a one-hot encoded feature matrix\n", 90 | "tokeniser = Tokeniser(num_words=features_nb)\n", 91 | "train_features = tokeniser.sequences_to_matrix(train_data, mode='binary')\n", 92 | "test_features = tokeniser.sequences_to_matrix(test_data, mode='binary')" 93 | ] 94 | }, 95 | { 96 | "cell_type": "markdown", 97 | "metadata": {}, 98 | "source": [ 99 | "### 2.1 Exploring the data set" 100 | ] 101 | }, 102 | { 103 | "cell_type": "code", 104 | "execution_count": null, 105 | "metadata": {}, 106 | "outputs": [], 107 | "source": [ 108 | "# Check data set sizes\n", 109 | "print('train_data.shape:', train_data.shape)\n", 110 | "print('train_target.shape:', train_target.shape)\n", 111 | "print('test_data.shape:', test_data.shape)\n", 112 | "print('test_target.shape:', test_target.shape)" 113 | ] 114 | }, 115 | { 116 | "cell_type": "code", 117 | "execution_count": null, 118 | "metadata": {}, 119 | "outputs": [], 120 | "source": [ 121 | "# Check format of first training sample\n", 122 | "print('type(train_data[0]):', type(train_data[0]))\n", 123 | "print('type(train_target[0]):', type(train_target[0]))" 124 | ] 125 | }, 126 | { 127 | "cell_type": "code", 128 | "execution_count": null, 129 | "metadata": {}, 130 | "outputs": [], 131 | "source": [ 132 | "# Check size of first 10 training samples and corresponding target\n", 133 | "print('Reviews length:', [len(sample) for sample in train_data[:10]])\n", 134 | "print('Review sentiment (bad/good):', train_target[:10])" 135 | ] 136 | }, 137 | { 138 | "cell_type": "code", 139 | "execution_count": null, 140 | "metadata": {}, 141 | "outputs": [], 142 | "source": [ 143 | "# Show first review - machine format\n", 144 | "print(train_data[0])" 145 | ] 146 | }, 147 | { 148 | "cell_type": "code", 149 | "execution_count": null, 150 | "metadata": {}, 151 | "outputs": [], 152 | "source": [ 153 | "# Data set text visualisation helper function\n", 154 | "def show_text(sample):\n", 155 | "    word_to_id = imdb.get_word_index()\n", 156 | "    word_to_id = {k:(v+3) for k,v in word_to_id.items()}\n", 157 | "    word_to_id[\"<PAD>\"] = 0\n", 158 | "    word_to_id[\"<START>\"] = 1\n", 159 | "    word_to_id[\"<UNK>\"] = 2\n", 160 | "\n", 161 | "    id_to_word = {value:key for key,value in word_to_id.items()}\n", 162 | "    print(' '.join(id_to_word[id_] for id_ in sample))" 163 | ] 164 | }, 165 | { 166 | "cell_type": "code", 167 | "execution_count": null, 168 | "metadata": {}, 169 | "outputs": [], 170 | "source": [ 171 | "# Show first review - human format\n", 172 | "show_text(train_data[0])" 173 | ] 174 | }, 175 | { 176 | "cell_type": "code", 177 | "execution_count": null, 178 | "metadata": {}, 179 | "outputs": [], 180 | "source": [ 181 | "# Show first review - neural net format\n", 182 | "print(train_features[0])" 183 | ] 184 | }, 185 | { 186 | "cell_type": "code", 187 | "execution_count": null, 188 | "metadata": {}, 189 | "outputs": [], 190 | "source": [ 191 | "# Show first review - neural net format - explanation\n", 192 | "print(train_features[0] * np.arange(len(train_features[0])))" 193 | ] 194 | }, 195 | { 196 | "cell_type": "markdown", 197 | "metadata": {},
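The multi-hot feature matrix produced by `sequences_to_matrix` above can be reproduced by hand, which makes the `'binary'` mode explicit; a small sketch assuming `train_data` and `features_nb` from the cells above:

```python
import numpy as np

# Equivalent of Tokenizer.sequences_to_matrix(..., mode='binary'):
# row i gets a 1 in column j iff word id j occurs anywhere in review i.
features = np.zeros((len(train_data), features_nb))
for i, review in enumerate(train_data):
    features[i, review] = 1.0
```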
198 | "source": [ 199 | "## 3. Exploring regularisation of NN\n", 200 | "\n", 201 | "Play with the code, especially the one marked `# toggle`. \n", 202 | "Start from `# toggle 0`, and then, one at the time, `# toggle 1` to `5`." 203 | ] 204 | }, 205 | { 206 | "cell_type": "code", 207 | "execution_count": null, 208 | "metadata": {}, 209 | "outputs": [], 210 | "source": [ 211 | "# Start neural network\n", 212 | "network = models.Sequential()\n", 213 | "\n", 214 | "# Add a Dropout layer\n", 215 | "# network.add(layers.Dropout(0.2)) # toggle 4\n", 216 | "\n", 217 | "# Add fully connected layer with a ReLU activation function and L2 regularization\n", 218 | "network.add(layers.Dense(\n", 219 | " units=16, \n", 220 | " activation='relu', \n", 221 | "# kernel_regularizer=regularisers.l2(0.005), # toggle 1\n", 222 | "# kernel_regularizer=regularisers.l1(0.001), # toggle 2\n", 223 | "# kernel_constraint=constraints.max_norm(1), # toggle 3\n", 224 | " input_shape=(features_nb,)\n", 225 | "))\n", 226 | "\n", 227 | "# Add fully connected layer with a ReLU activation function and L2 regularization\n", 228 | "network.add(layers.Dense(\n", 229 | " units=16, \n", 230 | "# kernel_regularizer=regularisers.l2(0.005), # toggle 1\n", 231 | "# kernel_constraint=constraints.max_norm(1), # toggle 3\n", 232 | " activation='relu'\n", 233 | "))\n", 234 | "\n", 235 | "# Add a Dropout layer\n", 236 | "# network.add(layers.Dropout(0.5)) # toggle 4\n", 237 | "\n", 238 | "# Add fully connected layer with a sigmoid activation function\n", 239 | "network.add(layers.Dense(units=1, activation='sigmoid')) # Compile neural network\n", 240 | "\n", 241 | "# Compile network\n", 242 | "network.compile(\n", 243 | " loss='binary_crossentropy', # Cross-entropy\n", 244 | " optimizer='rmsprop', # Root Mean Square Propagation\n", 245 | " metrics=['accuracy'] # Accuracy performance metric\n", 246 | ")" 247 | ] 248 | }, 249 | { 250 | "cell_type": "code", 251 | "execution_count": null, 252 | "metadata": {}, 253 | "outputs": [], 254 | "source": [ 255 | "# Train neural network\n", 256 | "history = network.fit(\n", 257 | " train_features, # Features\n", 258 | " train_target, # Target vector\n", 259 | " epochs=25, # Number of epochs\n", 260 | " verbose=0, # No output\n", 261 | " batch_size=100, # Number of observations per batch\n", 262 | " validation_data=(test_features, test_target), # Data for evaluation\n", 263 | "# callbacks=[ # toggle 5\n", 264 | "# EarlyStopping(monitor='val_loss', patience=2), # toggle 5\n", 265 | "# ModelCheckpoint(filepath='best_model.h5', monitor='val_loss', save_best_only=True) # toggle 5\n", 266 | "# ], # toggle 5\n", 267 | ")" 268 | ] 269 | }, 270 | { 271 | "cell_type": "code", 272 | "execution_count": null, 273 | "metadata": {}, 274 | "outputs": [], 275 | "source": [ 276 | "# ! 
ls # toggle 5" 277 | ] 278 | }, 279 | { 280 | "cell_type": "code", 281 | "execution_count": null, 282 | "metadata": {}, 283 | "outputs": [], 284 | "source": [ 285 | "# Get training and test accuracy histories\n", 286 | "train_loss = history.history['loss']\n", 287 | "test_loss = history.history['val_loss']\n", 288 | "\n", 289 | "# Create count of the number of epochs\n", 290 | "epoch = range(1, len(train_loss) + 1)\n", 291 | "\n", 292 | "# Visualize accuracy history\n", 293 | "plt.figure()\n", 294 | "\n", 295 | "plt.plot(epoch, train_loss)\n", 296 | "plt.plot(epoch, test_loss)\n", 297 | "# plt.plot(no_reg['epoch'], no_reg['train_loss']) # toggle 0\n", 298 | "# plt.plot(no_reg['epoch'], no_reg['test_loss']) # toggle 0\n", 299 | "\n", 300 | "plt.legend(['Train loss', 'Test loss', 'Train no-reg', 'Test no-reg'])\n", 301 | "plt.xlabel('Epoch')\n", 302 | "plt.ylabel('Loss score')\n", 303 | "\n", 304 | "# Get training and test accuracy histories\n", 305 | "train_accuracy = history.history['acc']\n", 306 | "test_accuracy = history.history['val_acc']\n", 307 | "\n", 308 | "# Visualize accuracy history\n", 309 | "plt.figure()\n", 310 | "\n", 311 | "plt.plot(epoch, train_accuracy)\n", 312 | "plt.plot(epoch, test_accuracy)\n", 313 | "# plt.plot(no_reg['epoch'], no_reg['train_accuracy']) # toggle 0\n", 314 | "# plt.plot(no_reg['epoch'], no_reg['test_accuracy']) # toggle 0\n", 315 | "\n", 316 | "plt.legend(['Train accuracy', 'Test accuracy', 'Train no-reg', 'Test no-reg'])\n", 317 | "plt.xlabel('Epoch')\n", 318 | "plt.ylabel('Accuracy Score')\n", 319 | "\n", 320 | "no_reg = { # toggle 0\n", 321 | " 'epoch': epoch, # toggle 0\n", 322 | " 'train_loss': train_loss, # toggle 0\n", 323 | " 'test_loss': test_loss, # toggle 0\n", 324 | " 'train_accuracy': train_accuracy, # toggle 0\n", 325 | " 'test_accuracy': test_accuracy, # toggle 0\n", 326 | "}" 327 | ] 328 | }, 329 | { 330 | "cell_type": "code", 331 | "execution_count": null, 332 | "metadata": {}, 333 | "outputs": [], 334 | "source": [ 335 | "# Backup weights\n", 336 | "weights = network.layers[0].get_weights()[0] # toggle 0\n", 337 | "# weights_L1 = network.layers[0].get_weights()[0] # toggle 1\n", 338 | "# weights_L2 = network.layers[0].get_weights()[0] # toggle 2\n", 339 | "# weights_max = network.layers[0].get_weights()[0] # toggle 3" 340 | ] 341 | }, 342 | { 343 | "cell_type": "markdown", 344 | "metadata": {}, 345 | "source": [ 346 | "After you got to toggle `# toggle 3`, execute the following code." 
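For reference, the `l1`/`l2` kernel regularisers toggled above add a weight penalty to the loss being minimised, which is what shrinks the weight histograms plotted next; with the values used in the toggles,

$$
\mathcal{L}_{\mathrm{L1}} = \mathcal{L}_{\mathrm{BCE}} + 0.001 \sum_i |w_i|,
\qquad
\mathcal{L}_{\mathrm{L2}} = \mathcal{L}_{\mathrm{BCE}} + 0.005 \sum_i w_i^2,
$$

while `max_norm(1)` instead rescales each unit's incoming weight vector after every update so that its Euclidean norm never exceeds 1.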
347 | ] 348 | }, 349 | { 350 | "cell_type": "code", 351 | "execution_count": null, 352 | "metadata": {}, 353 | "outputs": [], 354 | "source": [ 355 | "# Show weight distribution\n", 356 | "plt.hist((\n", 357 | " weights.reshape(-1),\n", 358 | " weights_L1.reshape(-1),\n", 359 | " weights_L2.reshape(-1),\n", 360 | " weights_max.reshape(-1),\n", 361 | "), 49, range=(-.5, .5), label=(\n", 362 | " 'No-reg',\n", 363 | " 'L1',\n", 364 | " 'L2',\n", 365 | " 'Max',\n", 366 | "))\n", 367 | "plt.legend();" 368 | ] 369 | } 370 | ], 371 | "metadata": { 372 | "kernelspec": { 373 | "display_name": "Python 3", 374 | "language": "python", 375 | "name": "python3" 376 | }, 377 | "language_info": { 378 | "codemirror_mode": { 379 | "name": "ipython", 380 | "version": 3 381 | }, 382 | "file_extension": ".py", 383 | "mimetype": "text/x-python", 384 | "name": "python", 385 | "nbconvert_exporter": "python", 386 | "pygments_lexer": "ipython3", 387 | "version": "3.6.5" 388 | } 389 | }, 390 | "nbformat": 4, 391 | "nbformat_minor": 2 392 | } 393 | -------------------------------------------------------------------------------- /raw/keras-sequences/0_1_classify_seq_data.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": { 6 | "collapsed": true 7 | }, 8 | "source": [ 9 | "An example of many-to-one (sequence classification):\n", 10 | "\n", 11 | "\n", 12 | "Original experiment from Hochreiter&Schmidhuber(1997):\n", 13 | "\n", 14 | " The goal is to classify sequences. Elements and targets are represented locally\n", 15 | " (input vectors with only one non-zero bit). The sequence starts with an E, ends\n", 16 | " with a B (the \"trigger symbol\") and otherwise consists of randomly chosen symbols\n", 17 | " from the set {a, b, c, d} except for two elements at positions t1 and t2 that are\n", 18 | " either X or Y . The sequence length is randomly chosen between 100 and 110, t1 is\n", 19 | " randomly chosen between 10 and 20, and t2 is randomly chosen between 50 and 60.\n", 20 | " There are 4 sequence classes Q, R, S, U which depend on the temporal order of X and Y.\n", 21 | " The rules are:\n", 22 | " X, X -> Q,\n", 23 | " X, Y -> R,\n", 24 | " Y , X -> S,\n", 25 | " Y , Y -> U. " 26 | ] 27 | }, 28 | { 29 | "cell_type": "code", 30 | "execution_count": 1, 31 | "metadata": {}, 32 | "outputs": [], 33 | "source": [ 34 | "from sequential_tasks import TemporalOrderExp6aSequence\n", 35 | "\n", 36 | "# data generator\n", 37 | "dg = TemporalOrderExp6aSequence.get_predefined_generator(\n", 38 | " TemporalOrderExp6aSequence.DifficultyLevel.EASY)\n" 39 | ] 40 | }, 41 | { 42 | "cell_type": "code", 43 | "execution_count": 2, 44 | "metadata": {}, 45 | "outputs": [ 46 | { 47 | "name": "stdout", 48 | "output_type": "stream", 49 | "text": [ 50 | "BbXcXcbE ----> Q\n", 51 | "BYdadYE ----> U\n", 52 | "BXddbXcE ----> Q\n", 53 | "BXacYdE ----> R\n", 54 | "BbYbXdbE ----> S\n" 55 | ] 56 | } 57 | ], 58 | "source": [ 59 | "# Raw sequences and their classes:\n", 60 | "for n in range(5):\n", 61 | " x, y = dg.generate_pair()\n", 62 | " print('{} ----> {}'.format(x, y))\n" 63 | ] 64 | }, 65 | { 66 | "cell_type": "code", 67 | "execution_count": 3, 68 | "metadata": {}, 69 | "outputs": [ 70 | { 71 | "name": "stdout", 72 | "output_type": "stream", 73 | "text": [ 74 | "BXbcYaE ----> R\n", 75 | "Encoded input sequence:\n", 76 | "[[0. 0. 0. 0. 0. 0. 1. 0.]\n", 77 | " [1. 0. 0. 0. 0. 0. 0. 0.]\n", 78 | " [0. 0. 0. 1. 0. 0. 0. 0.]\n", 79 | " [0. 0. 0. 0. 1. 
0. 0. 0.]\n", 80 | " [0. 1. 0. 0. 0. 0. 0. 0.]\n", 81 | " [0. 0. 1. 0. 0. 0. 0. 0.]\n", 82 | " [0. 0. 0. 0. 0. 0. 0. 1.]]\n", 83 | "Encoded output sequence:\n", 84 | "[0. 1. 0. 0.]\n" 85 | ] 86 | } 87 | ], 88 | "source": [ 89 | "# Encoding our data into RNN-friendly data format\n", 90 | "\n", 91 | "# Single data pair example:\n", 92 | "x, y = dg.generate_pair()\n", 93 | "print('{} ----> {}'.format(x, y))\n", 94 | " \n", 95 | "enc_x = dg.encode_x(x)\n", 96 | "enc_y = dg.encode_y(y)\n", 97 | "\n", 98 | "print('Encoded input sequence:')\n", 99 | "print(enc_x)\n", 100 | "print('Encoded output sequence:')\n", 101 | "print(enc_y)\n" 102 | ] 103 | }, 104 | { 105 | "cell_type": "code", 106 | "execution_count": 4, 107 | "metadata": {}, 108 | "outputs": [ 109 | { 110 | "name": "stdout", 111 | "output_type": "stream", 112 | "text": [ 113 | "Batch_x shape = (32, 9, 8)\n", 114 | "Batch_y shape = (32, 4)\n", 115 | "[[0 0 0 0 0 0 0 0]\n", 116 | " [0 0 0 0 0 0 1 0]\n", 117 | " [1 0 0 0 0 0 0 0]\n", 118 | " [0 0 0 0 1 0 0 0]\n", 119 | " [0 0 0 1 0 0 0 0]\n", 120 | " [0 1 0 0 0 0 0 0]\n", 121 | " [0 0 1 0 0 0 0 0]\n", 122 | " [0 0 0 0 1 0 0 0]\n", 123 | " [0 0 0 0 0 0 0 1]]\n" 124 | ] 125 | } 126 | ], 127 | "source": [ 128 | "# let's generate a batch of training pairs\n", 129 | "batch_x, batch_y = dg[0]\n", 130 | "\n", 131 | "# batch_x has the shape (batch_size, max_seq_length, num_symbols)\n", 132 | "print('Batch_x shape = ', batch_x.shape)\n", 133 | "\n", 134 | "# batch_y has the shape (batch_size, num_classes)\n", 135 | "print('Batch_y shape = ', batch_y.shape)\n", 136 | "\n", 137 | "# inputs are zero-padded (added zero prefix)\n", 138 | "# to obtain sequences of equal length\n", 139 | "print(batch_x[0])\n", 140 | "\n" 141 | ] 142 | } 143 | ], 144 | "metadata": { 145 | "kernelspec": { 146 | "display_name": "Python 3", 147 | "language": "python", 148 | "name": "python3" 149 | }, 150 | "language_info": { 151 | "codemirror_mode": { 152 | "name": "ipython", 153 | "version": 3 154 | }, 155 | "file_extension": ".py", 156 | "mimetype": "text/x-python", 157 | "name": "python", 158 | "nbconvert_exporter": "python", 159 | "pygments_lexer": "ipython3", 160 | "version": "3.6.3" 161 | } 162 | }, 163 | "nbformat": 4, 164 | "nbformat_minor": 1 165 | } 166 | -------------------------------------------------------------------------------- /raw/keras-sequences/0_2_echo_data.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": { 6 | "collapsed": true 7 | }, 8 | "source": [ 9 | "Echoing signal n steps is an example of synchronized many-to-many task:" 10 | ] 11 | }, 12 | { 13 | "cell_type": "code", 14 | "execution_count": 1, 15 | "metadata": {}, 16 | "outputs": [], 17 | "source": [ 18 | "from sequential_tasks import EchoData\n", 19 | "\n", 20 | "batch_size = 5\n", 21 | "echo_step = 3\n", 22 | "series_length = 20000\n", 23 | "truncated_length = 10\n", 24 | "\n", 25 | "data_gen = EchoData(\n", 26 | " echo_step=echo_step,\n", 27 | " batch_size=batch_size,\n", 28 | " series_length=series_length,\n", 29 | " truncated_length=truncated_length)\n", 30 | "\n" 31 | ] 32 | }, 33 | { 34 | "cell_type": "code", 35 | "execution_count": 2, 36 | "metadata": {}, 37 | "outputs": [ 38 | { 39 | "name": "stdout", 40 | "output_type": "stream", 41 | "text": [ 42 | "(1st sequence) x = [0 1 0 1 0 1 1 0 0 0 1 0 0 1 0 1 0 0 1 0] ... \n", 43 | "(1st sequence) y = [0 0 0 0 1 0 1 0 1 1 0 0 0 1 0 0 1 0 1 0] ... 
\n" 44 | ] 45 | } 46 | ], 47 | "source": [ 48 | "# Let's print first 20 timesteps of the first sequences to see the echo data:\n", 49 | "print('(1st sequence) x = ', data_gen.raw_x[0, :20], '... ')\n", 50 | "print('(1st sequence) y = ', data_gen.raw_y[0, :20], '... ')\n" 51 | ] 52 | }, 53 | { 54 | "cell_type": "code", 55 | "execution_count": 3, 56 | "metadata": {}, 57 | "outputs": [ 58 | { 59 | "name": "stdout", 60 | "output_type": "stream", 61 | "text": [ 62 | "bax = \n", 63 | "[[0 1 0 1 0 1 1 0 0 0 1 0 0 1 0 1 0 0 1 0]\n", 64 | " [1 0 1 1 1 0 1 1 0 1 1 0 0 1 1 1 0 1 0 1]\n", 65 | " [1 0 1 0 0 1 1 1 1 0 1 1 0 0 1 0 1 0 0 1]\n", 66 | " [1 1 1 1 1 1 1 1 0 0 0 1 0 1 0 1 1 1 0 0]\n", 67 | " [1 0 0 0 0 0 1 1 1 1 1 0 0 1 1 0 1 1 0 0]]\n", 68 | "y = \n", 69 | "[[0 0 0 0 1 0 1 0 1 1 0 0 0 1 0 0 1 0 1 0]\n", 70 | " [0 0 0 1 0 1 1 1 0 1 1 0 1 1 0 0 1 1 1 0]\n", 71 | " [0 0 0 1 0 1 0 0 1 1 1 1 0 1 1 0 0 1 0 1]\n", 72 | " [0 0 0 1 1 1 1 1 1 1 1 0 0 0 1 0 1 0 1 1]\n", 73 | " [0 0 0 1 0 0 0 0 0 1 1 1 1 1 0 0 1 1 0 1]]\n", 74 | "raw_x shape: (5, 20000)\n", 75 | "raw_y shape: (5, 20000)\n" 76 | ] 77 | } 78 | ], 79 | "source": [ 80 | "# batch_size different sequences are created:\n", 81 | "print('bax = ')\n", 82 | "print(data_gen.raw_x[:, :20])\n", 83 | "print('y = ')\n", 84 | "print(data_gen.raw_y[:, :20])\n", 85 | "\n", 86 | "print('raw_x shape:', data_gen.raw_x.shape) # shape = (batch_size, series_length)\n", 87 | "print('raw_y shape:', data_gen.raw_y.shape) # shape = (batch_size, series_length)\n" 88 | ] 89 | }, 90 | { 91 | "cell_type": "code", 92 | "execution_count": 4, 93 | "metadata": {}, 94 | "outputs": [ 95 | { 96 | "name": "stdout", 97 | "output_type": "stream", 98 | "text": [ 99 | "batch x shape: (5, 10, 1)\n", 100 | "batch y shape: (5, 10, 1)\n" 101 | ] 102 | } 103 | ], 104 | "source": [ 105 | "# In order to use RNNs data organized into batches of size:\n", 106 | "# [batch_size, truncated_sequence_length, feature_dim\n", 107 | "\n", 108 | "i_batch = 0\n", 109 | "print('batch x shape:', data_gen.x_batches[i_batch].shape)\n", 110 | "print('batch y shape:', data_gen.y_batches[i_batch].shape)\n" 111 | ] 112 | }, 113 | { 114 | "cell_type": "code", 115 | "execution_count": 5, 116 | "metadata": {}, 117 | "outputs": [ 118 | { 119 | "name": "stdout", 120 | "output_type": "stream", 121 | "text": [ 122 | "[[[0]\n", 123 | " [1]\n", 124 | " [0]\n", 125 | " [1]\n", 126 | " [0]\n", 127 | " [1]\n", 128 | " [1]\n", 129 | " [0]\n", 130 | " [0]\n", 131 | " [0]]\n", 132 | "\n", 133 | " [[1]\n", 134 | " [0]\n", 135 | " [1]\n", 136 | " [1]\n", 137 | " [1]\n", 138 | " [0]\n", 139 | " [1]\n", 140 | " [1]\n", 141 | " [0]\n", 142 | " [1]]\n", 143 | "\n", 144 | " [[1]\n", 145 | " [0]\n", 146 | " [1]\n", 147 | " [0]\n", 148 | " [0]\n", 149 | " [1]\n", 150 | " [1]\n", 151 | " [1]\n", 152 | " [1]\n", 153 | " [0]]\n", 154 | "\n", 155 | " [[1]\n", 156 | " [1]\n", 157 | " [1]\n", 158 | " [1]\n", 159 | " [1]\n", 160 | " [1]\n", 161 | " [1]\n", 162 | " [1]\n", 163 | " [0]\n", 164 | " [0]]\n", 165 | "\n", 166 | " [[1]\n", 167 | " [0]\n", 168 | " [0]\n", 169 | " [0]\n", 170 | " [0]\n", 171 | " [0]\n", 172 | " [1]\n", 173 | " [1]\n", 174 | " [1]\n", 175 | " [1]]]\n", 176 | "[[[0]\n", 177 | " [0]\n", 178 | " [0]\n", 179 | " [0]\n", 180 | " [1]\n", 181 | " [0]\n", 182 | " [1]\n", 183 | " [0]\n", 184 | " [1]\n", 185 | " [1]]\n", 186 | "\n", 187 | " [[0]\n", 188 | " [0]\n", 189 | " [0]\n", 190 | " [1]\n", 191 | " [0]\n", 192 | " [1]\n", 193 | " [1]\n", 194 | " [1]\n", 195 | " [0]\n", 196 | " [1]]\n", 197 | "\n", 198 | " [[0]\n", 199 | 
" [0]\n", 200 | " [0]\n", 201 | " [1]\n", 202 | " [0]\n", 203 | " [1]\n", 204 | " [0]\n", 205 | " [0]\n", 206 | " [1]\n", 207 | " [1]]\n", 208 | "\n", 209 | " [[0]\n", 210 | " [0]\n", 211 | " [0]\n", 212 | " [1]\n", 213 | " [1]\n", 214 | " [1]\n", 215 | " [1]\n", 216 | " [1]\n", 217 | " [1]\n", 218 | " [1]]\n", 219 | "\n", 220 | " [[0]\n", 221 | " [0]\n", 222 | " [0]\n", 223 | " [1]\n", 224 | " [0]\n", 225 | " [0]\n", 226 | " [0]\n", 227 | " [0]\n", 228 | " [0]\n", 229 | " [1]]]\n" 230 | ] 231 | } 232 | ], 233 | "source": [ 234 | "\n", 235 | "print(data_gen.x_batches[i_batch])\n", 236 | "print(data_gen.y_batches[i_batch])\n" 237 | ] 238 | }, 239 | { 240 | "cell_type": "markdown", 241 | "metadata": {}, 242 | "source": [ 243 | " " 244 | ] 245 | } 246 | ], 247 | "metadata": { 248 | "kernelspec": { 249 | "display_name": "Python 3", 250 | "language": "python", 251 | "name": "python3" 252 | }, 253 | "language_info": { 254 | "codemirror_mode": { 255 | "name": "ipython", 256 | "version": 3 257 | }, 258 | "file_extension": ".py", 259 | "mimetype": "text/x-python", 260 | "name": "python", 261 | "nbconvert_exporter": "python", 262 | "pygments_lexer": "ipython3", 263 | "version": "3.6.3" 264 | } 265 | }, 266 | "nbformat": 4, 267 | "nbformat_minor": 1 268 | } 269 | -------------------------------------------------------------------------------- /raw/keras-sequences/1_1_temporal_order_classification_experiments.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "code", 5 | "execution_count": 1, 6 | "metadata": {}, 7 | "outputs": [], 8 | "source": [ 9 | "from sequential_tasks import TemporalOrderExp6aSequence\n", 10 | "from tensorflow.python.keras.models import Sequential\n", 11 | "from tensorflow.python.keras.layers import SimpleRNN, Dense\n" 12 | ] 13 | }, 14 | { 15 | "cell_type": "code", 16 | "execution_count": 2, 17 | "metadata": {}, 18 | "outputs": [], 19 | "source": [ 20 | "def exp6a_experiment(settings):\n", 21 | " train_data_gen = TemporalOrderExp6aSequence.get_predefined_generator(\n", 22 | " settings['difficulty'],\n", 23 | " settings['batch_size'])\n", 24 | "\n", 25 | " model = Sequential([\n", 26 | " SimpleRNN(\n", 27 | " units=settings['h_units'],\n", 28 | " input_shape=(train_data_gen.length_range[1],\n", 29 | " train_data_gen.n_symbols)),\n", 30 | " Dense(units=train_data_gen.n_classes, activation='softmax')\n", 31 | " ])\n", 32 | "\n", 33 | " model.compile(optimizer='rmsprop', loss='categorical_crossentropy', metrics=['accuracy'])\n", 34 | " model.fit_generator(\n", 35 | " generator=train_data_gen,\n", 36 | " epochs=settings['max_epochs'],\n", 37 | " verbose=2)\n", 38 | "\n", 39 | " # testing\n", 40 | " test_data_gen = TemporalOrderExp6aSequence.get_predefined_generator(\n", 41 | " settings['difficulty'],\n", 42 | " settings['batch_size'])\n", 43 | "\n", 44 | " eval_metrics = model.evaluate_generator(test_data_gen)\n", 45 | " test_accuracy = eval_metrics[1]\n", 46 | " \n", 47 | " return test_accuracy\n" 48 | ] 49 | }, 50 | { 51 | "cell_type": "code", 52 | "execution_count": 3, 53 | "metadata": {}, 54 | "outputs": [ 55 | { 56 | "name": "stdout", 57 | "output_type": "stream", 58 | "text": [ 59 | "Epoch 1/30\n", 60 | " - 0s - loss: 1.3959 - acc: 0.2591\n", 61 | "Epoch 2/30\n", 62 | " - 0s - loss: 1.3335 - acc: 0.3347\n", 63 | "Epoch 3/30\n", 64 | " - 0s - loss: 1.3186 - acc: 0.3498\n", 65 | "Epoch 4/30\n", 66 | " - 0s - loss: 1.2981 - acc: 0.3659\n", 67 | "Epoch 5/30\n", 68 | " - 0s - loss: 1.2400 - acc: 
0.4355\n", 69 | "Epoch 6/30\n", 70 | " - 0s - loss: 1.2026 - acc: 0.4899\n", 71 | "Epoch 7/30\n", 72 | " - 0s - loss: 1.1653 - acc: 0.5101\n", 73 | "Epoch 8/30\n", 74 | " - 0s - loss: 1.1257 - acc: 0.5232\n", 75 | "Epoch 9/30\n", 76 | " - 0s - loss: 1.0727 - acc: 0.5817\n", 77 | "Epoch 10/30\n", 78 | " - 0s - loss: 1.0358 - acc: 0.6431\n", 79 | "Epoch 11/30\n", 80 | " - 0s - loss: 0.9817 - acc: 0.6643\n", 81 | "Epoch 12/30\n", 82 | " - 0s - loss: 0.9390 - acc: 0.6825\n", 83 | "Epoch 13/30\n", 84 | " - 0s - loss: 0.8946 - acc: 0.7016\n", 85 | "Epoch 14/30\n", 86 | " - 0s - loss: 0.8541 - acc: 0.7097\n", 87 | "Epoch 15/30\n", 88 | " - 0s - loss: 0.8061 - acc: 0.7681\n", 89 | "Epoch 16/30\n", 90 | " - 0s - loss: 0.7683 - acc: 0.7621\n", 91 | "Epoch 17/30\n", 92 | " - 0s - loss: 0.7158 - acc: 0.7944\n", 93 | "Epoch 18/30\n", 94 | " - 0s - loss: 0.6629 - acc: 0.8730\n", 95 | "Epoch 19/30\n", 96 | " - 0s - loss: 0.6269 - acc: 0.9153\n", 97 | "Epoch 20/30\n", 98 | " - 0s - loss: 0.5782 - acc: 0.9345\n", 99 | "Epoch 21/30\n", 100 | " - 0s - loss: 0.5307 - acc: 0.9556\n", 101 | "Epoch 22/30\n", 102 | " - 0s - loss: 0.4938 - acc: 0.9768\n", 103 | "Epoch 23/30\n", 104 | " - 0s - loss: 0.4314 - acc: 0.9859\n", 105 | "Epoch 24/30\n", 106 | " - 0s - loss: 0.4023 - acc: 0.9960\n", 107 | "Epoch 25/30\n", 108 | " - 0s - loss: 0.3608 - acc: 0.9980\n", 109 | "Epoch 26/30\n", 110 | " - 0s - loss: 0.3202 - acc: 1.0000\n", 111 | "Epoch 27/30\n", 112 | " - 0s - loss: 0.2959 - acc: 1.0000\n", 113 | "Epoch 28/30\n", 114 | " - 0s - loss: 0.2637 - acc: 1.0000\n", 115 | "Epoch 29/30\n", 116 | " - 0s - loss: 0.2297 - acc: 1.0000\n", 117 | "Epoch 30/30\n", 118 | " - 0s - loss: 0.2187 - acc: 1.0000\n", 119 | "acc = 1.00%.\n" 120 | ] 121 | } 122 | ], 123 | "source": [ 124 | "# experiments settings\n", 125 | "params = {\n", 126 | " \"difficulty\": TemporalOrderExp6aSequence.DifficultyLevel.EASY,\n", 127 | " \"batch_size\": 32,\n", 128 | " \"h_units\": 4,\n", 129 | " \"max_epochs\": 30\n", 130 | "}\n", 131 | "\n", 132 | "acc = exp6a_experiment(params)\n", 133 | "print('acc = {:.2f}%.'.format(acc))\n" 134 | ] 135 | } 136 | ], 137 | "metadata": { 138 | "kernelspec": { 139 | "display_name": "Python 3", 140 | "language": "python", 141 | "name": "python3" 142 | }, 143 | "language_info": { 144 | "codemirror_mode": { 145 | "name": "ipython", 146 | "version": 3 147 | }, 148 | "file_extension": ".py", 149 | "mimetype": "text/x-python", 150 | "name": "python", 151 | "nbconvert_exporter": "python", 152 | "pygments_lexer": "ipython3", 153 | "version": "3.6.3" 154 | } 155 | }, 156 | "nbformat": 4, 157 | "nbformat_minor": 1 158 | } 159 | -------------------------------------------------------------------------------- /raw/keras-sequences/1_2_echo_experiments.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "code", 5 | "execution_count": 1, 6 | "metadata": {}, 7 | "outputs": [], 8 | "source": [ 9 | "from sequential_tasks import EchoData\n", 10 | "from tensorflow.python.keras.models import Sequential\n", 11 | "from tensorflow.python.keras.layers import SimpleRNN, Dense, TimeDistributed\n", 12 | "import numpy as np\n" 13 | ] 14 | }, 15 | { 16 | "cell_type": "code", 17 | "execution_count": 2, 18 | "metadata": {}, 19 | "outputs": [], 20 | "source": [ 21 | "def echo_experiment(settings):\n", 22 | " train_data_gen = EchoData(\n", 23 | " series_length=settings['series_length'],\n", 24 | " truncated_length=settings['truncated_length'],\n", 25 | " 
echo_step=settings['echo_step'],\n", 26 | "        batch_size=settings['batch_size'])\n", 27 | "\n", 28 | "    model = Sequential([\n", 29 | "        SimpleRNN(\n", 30 | "            units=settings['h_units'],\n", 31 | "            batch_input_shape=(settings['batch_size'], settings['truncated_length'], 1),\n", 32 | "            return_sequences=True,\n", 33 | "            stateful=True),\n", 34 | "        TimeDistributed(\n", 35 | "            Dense(units=1, activation='sigmoid'))\n", 36 | "    ])\n", 37 | "\n", 38 | "    model.compile(optimizer='rmsprop', loss='binary_crossentropy', metrics=['accuracy'])\n", 39 | "    model.fit_generator(\n", 40 | "        generator=train_data_gen,\n", 41 | "        epochs=settings['max_epochs'],\n", 42 | "        verbose=2,\n", 43 | "        shuffle=False)\n", 44 | "\n", 45 | "    # testing\n", 46 | "    test_data_gen = EchoData(\n", 47 | "        series_length=settings['series_length'],\n", 48 | "        truncated_length=settings['truncated_length'],\n", 49 | "        echo_step=settings['echo_step'],\n", 50 | "        batch_size=settings['batch_size'])\n", 51 | "\n", 52 | "    # we could do evaluation like this:\n", 53 | "    # eval_metrics = model.evaluate_generator(test_data_gen)\n", 54 | "    \n", 55 | "    # but let's gather statistics from each batch...\n", 56 | "    batch_accuracies = []\n", 57 | "    for b in range(test_data_gen.n_batches):\n", 58 | "        x_test, y_test = test_data_gen[b]\n", 59 | "        batch_metrics = model.evaluate(\n", 60 | "            x=x_test,\n", 61 | "            y=y_test,\n", 62 | "            batch_size=settings['batch_size'],  # the function argument, not the global 'params'\n", 63 | "            verbose=0)\n", 64 | "        batch_accuracies.append(100. * batch_metrics[1])\n", 65 | "    # ... and let's skip the first batch (when RNN is not warmed up)\n", 66 | "    test_accuracy = np.mean(batch_accuracies[1:])\n", 67 | "\n", 68 | "    return test_accuracy\n" 69 | ] 70 | }, 71 | { 72 | "cell_type": "code", 73 | "execution_count": 3, 74 | "metadata": {}, 75 | "outputs": [ 76 | { 77 | "name": "stdout", 78 | "output_type": "stream", 79 | "text": [ 80 | "Epoch 1/4\n", 81 | " - 7s - loss: 0.6507 - acc: 0.5941\n", 82 | "Epoch 2/4\n", 83 | " - 6s - loss: 0.3372 - acc: 0.8691\n", 84 | "Epoch 3/4\n", 85 | " - 6s - loss: 0.1511 - acc: 0.9244\n", 86 | "Epoch 4/4\n", 87 | " - 6s - loss: 0.1228 - acc: 0.9256\n", 88 | "acc = 100.00%.\n" 89 | ] 90 | } 91 | ], 92 | "source": [ 93 | "# experiment settings\n", 94 | "params = {\n", 95 | "    \"series_length\": 20000,\n", 96 | "    \"echo_step\": 3,\n", 97 | "    \"truncated_length\": 20,\n", 98 | "    \"batch_size\": 5,\n", 99 | "    \"h_units\": 4,\n", 100 | "    \"max_epochs\": 4\n", 101 | "}\n", 102 | "\n", 103 | "acc = echo_experiment(params)\n", 104 | "print('acc = {:.2f}%.'.format(acc))\n" 105 | ] 106 | } 107 | ], 108 | "metadata": { 109 | "kernelspec": { 110 | "display_name": "Python 3", 111 | "language": "python", 112 | "name": "python3" 113 | }, 114 | "language_info": { 115 | "codemirror_mode": { 116 | "name": "ipython", 117 | "version": 3 118 | }, 119 | "file_extension": ".py", 120 | "mimetype": "text/x-python", 121 | "name": "python", 122 | "nbconvert_exporter": "python", 123 | "pygments_lexer": "ipython3", 124 | "version": "3.6.3" 125 | } 126 | }, 127 | "nbformat": 4, 128 | "nbformat_minor": 1 129 | } 130 | -------------------------------------------------------------------------------- /raw/keras-sequences/sequential_tasks.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | from tensorflow.python.keras.utils import Sequence, to_categorical 3 | from tensorflow.python.keras.preprocessing.sequence import pad_sequences 4 | 5 | 6 | class EchoData(Sequence): 7 | 8 |     def __init__(self, series_length=40000, batch_size=32, 
9 | echo_step=3, truncated_length=10, seed=None): 10 | 11 | self.series_length = series_length 12 | self.truncated_length = truncated_length 13 | self.n_batches = series_length//truncated_length 14 | 15 | self.echo_step = echo_step 16 | self.batch_size = batch_size 17 | if seed is not None: 18 | np.random.seed(seed) 19 | self.raw_x = None 20 | self.raw_y = None 21 | self.x_batches = [] 22 | self.y_batches = [] 23 | self.generate_new_series() 24 | self.prepare_batches() 25 | 26 | def __getitem__(self, index): 27 | if index == 0: 28 | self.generate_new_series() 29 | self.prepare_batches() 30 | return self.x_batches[index], self.y_batches[index] 31 | 32 | def __len__(self): 33 | return self.n_batches 34 | 35 | def generate_new_series(self): 36 | x = np.random.choice( 37 | 2, 38 | size=(self.batch_size, self.series_length), 39 | p=[0.5, 0.5]) 40 | y = np.roll(x, self.echo_step, axis=1) 41 | y[:, 0:self.echo_step] = 0 42 | self.raw_x = x 43 | self.raw_y = y 44 | 45 | def prepare_batches(self): 46 | x = np.expand_dims(self.raw_x, axis=-1) 47 | y = np.expand_dims(self.raw_y, axis=-1) 48 | self.x_batches = np.split(x, self.n_batches, axis=1) 49 | self.y_batches = np.split(y, self.n_batches, axis=1) 50 | 51 | 52 | class TemporalOrderExp6aSequence(Sequence): 53 | """ 54 | From Hochreiter&Schmidhuber(1997): 55 | 56 | The goal is to classify sequences. Elements and targets are represented locally 57 | (input vectors with only one non-zero bit). The sequence starts with an E, ends 58 | with a B (the "trigger symbol") and otherwise consists of randomly chosen symbols 59 | from the set {a, b, c, d} except for two elements at positions t1 and t2 that are 60 | either X or Y . The sequence length is randomly chosen between 100 and 110, t1 is 61 | randomly chosen between 10 and 20, and t2 is randomly chosen between 50 and 60. 62 | There are 4 sequence classes Q, R, S, U which depend on the temporal order of X and Y. 63 | The rules are: 64 | X, X -> Q, 65 | X, Y -> R, 66 | Y , X -> S, 67 | Y , Y -> U. 
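        A worked example from the generator below: in the sequence BbXcXcbE the
        two relevant symbols are X and X, so the class is Q. Note that in this
        implementation the start symbol is 'B' and the end symbol is 'E', i.e.
        swapped with respect to the quoted description above.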
68 | 69 | """ 70 | 71 | def __init__(self, length_range=(100, 111), t1_range=(10, 21), t2_range=(50, 61), 72 | batch_size=32, seed=None): 73 | 74 | self.classes = ['Q', 'R', 'S', 'U'] 75 | self.n_classes = len(self.classes) 76 | 77 | self.relevant_symbols = ['X', 'Y'] 78 | self.distraction_symbols = ['a', 'b', 'c', 'd'] 79 | self.start_symbol = 'B' 80 | self.end_symbol = 'E' 81 | 82 | self.length_range = length_range 83 | self.t1_range = t1_range 84 | self.t2_range = t2_range 85 | self.batch_size = batch_size 86 | 87 | if seed is not None: 88 | np.random.seed(seed) 89 | 90 | all_symbols = self.relevant_symbols + self.distraction_symbols + \ 91 | [self.start_symbol] + [self.end_symbol] 92 | self.n_symbols = len(all_symbols) 93 | self.s_to_idx = {s: n for n, s in enumerate(all_symbols)} 94 | self.idx_to_s = {n: s for n, s in enumerate(all_symbols)} 95 | 96 | self.c_to_idx = {c: n for n, c in enumerate(self.classes)} 97 | self.idx_to_c = {n: c for n, c in enumerate(self.classes)} 98 | 99 | def generate_pair(self): 100 | length = np.random.randint(self.length_range[0], self.length_range[1]) 101 | t1 = np.random.randint(self.t1_range[0], self.t1_range[1]) 102 | t2 = np.random.randint(self.t2_range[0], self.t2_range[1]) 103 | 104 | x = np.random.choice(self.distraction_symbols, length) 105 | x[0] = self.start_symbol 106 | x[-1] = self.end_symbol 107 | 108 | y = np.random.choice(self.classes) 109 | if y == 'Q': 110 | x[t1], x[t2] = self.relevant_symbols[0], self.relevant_symbols[0] 111 | elif y == 'R': 112 | x[t1], x[t2] = self.relevant_symbols[0], self.relevant_symbols[1] 113 | elif y == 'S': 114 | x[t1], x[t2] = self.relevant_symbols[1], self.relevant_symbols[0] 115 | else: 116 | x[t1], x[t2] = self.relevant_symbols[1], self.relevant_symbols[1] 117 | 118 | return ''.join(x), y 119 | 120 | # encoding/decoding single instance version 121 | 122 | def encode_x(self, x): 123 | idx_x = [self.s_to_idx[s] for s in x] 124 | return to_categorical(idx_x, num_classes=self.n_symbols) 125 | 126 | def encode_y(self, y): 127 | idx_y = self.c_to_idx[y] 128 | return to_categorical(idx_y, num_classes=self.n_classes) 129 | 130 | def decode_x(self, x): 131 | x = x[np.sum(x, axis=1) > 0] # remove padding 132 | return ''.join([self.idx_to_s[pos] for pos in np.argmax(x, axis=1)]) 133 | 134 | def decode_y(self, y): 135 | return self.idx_to_c[np.argmax(y)] 136 | 137 | # encoding/decoding batch versions 138 | 139 | def encode_x_batch(self, x_batch): 140 | return pad_sequences([self.encode_x(x) for x in x_batch], 141 | maxlen=self.length_range[1]) 142 | 143 | def encode_y_batch(self, y_batch): 144 | return np.array([self.encode_y(y) for y in y_batch]) 145 | 146 | def decode_x_batch(self, x_batch): 147 | return [self.decode_x(x) for x in x_batch] 148 | 149 | def decode_y_batch(self, y_batch): 150 | return [self.idx_to_c[pos] for pos in np.argmax(y_batch, axis=1)] 151 | 152 | def __len__(self): 153 | """ Let's assume 1000 sequences as the size of data. """ 154 | return int(1000. 
/ self.batch_size) 155 | 156 | def __getitem__(self, index): 157 | batch_x, batch_y = [], [] 158 | for _ in range(self.batch_size): 159 | x, y = self.generate_pair() 160 | batch_x.append(x) 161 | batch_y.append(y) 162 | return self.encode_x_batch(batch_x), self.encode_y_batch(batch_y) 163 | 164 | class DifficultyLevel: 165 | """ On HARD, settings are identical to the original settings from the '97 paper.""" 166 | EASY, NORMAL, MODERATE, HARD, NIGHTMARE = range(5) 167 | 168 | @staticmethod 169 | def get_predefined_generator(difficulty_level, batch_size=32, seed=8382): 170 | EASY = TemporalOrderExp6aSequence.DifficultyLevel.EASY 171 | NORMAL = TemporalOrderExp6aSequence.DifficultyLevel.NORMAL 172 | MODERATE = TemporalOrderExp6aSequence.DifficultyLevel.MODERATE 173 | HARD = TemporalOrderExp6aSequence.DifficultyLevel.HARD 174 | 175 | if difficulty_level == EASY: 176 | length_range = (7, 9) 177 | t1_range = (1, 3) 178 | t2_range = (4, 6) 179 | elif difficulty_level == NORMAL: 180 | length_range = (30, 41) 181 | t1_range = (2, 6) 182 | t2_range = (20, 28) 183 | elif difficulty_level == MODERATE: 184 | length_range = (60, 81) 185 | t1_range = (10, 21) 186 | t2_range = (45, 55) 187 | elif difficulty_level == HARD: 188 | length_range = (100, 111) 189 | t1_range = (10, 21) 190 | t2_range = (50, 61) 191 | else: 192 | length_range = (300, 501) 193 | t1_range = (10, 81) 194 | t2_range = (250, 291) 195 | return TemporalOrderExp6aSequence(length_range, t1_range, t2_range, 196 | batch_size, seed) 197 | -------------------------------------------------------------------------------- /sequential_tasks.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | from tensorflow.python.keras.utils import Sequence, to_categorical 3 | from tensorflow.python.keras.preprocessing.sequence import pad_sequences 4 | 5 | 6 | class EchoData(Sequence): 7 | 8 | def __init__(self, series_length=40000, batch_size=32, 9 | echo_step=3, truncated_length=10, seed=None): 10 | 11 | self.series_length = series_length 12 | self.truncated_length = truncated_length 13 | self.n_batches = series_length//truncated_length 14 | 15 | self.echo_step = echo_step 16 | self.batch_size = batch_size 17 | if seed is not None: 18 | np.random.seed(seed) 19 | self.raw_x = None 20 | self.raw_y = None 21 | self.x_batches = [] 22 | self.y_batches = [] 23 | self.generate_new_series() 24 | self.prepare_batches() 25 | 26 | def __getitem__(self, index): 27 | if index == 0: 28 | self.generate_new_series() 29 | self.prepare_batches() 30 | return self.x_batches[index], self.y_batches[index] 31 | 32 | def __len__(self): 33 | return self.n_batches 34 | 35 | def generate_new_series(self): 36 | x = np.random.choice( 37 | 2, 38 | size=(self.batch_size, self.series_length), 39 | p=[0.5, 0.5]) 40 | y = np.roll(x, self.echo_step, axis=1) 41 | y[:, 0:self.echo_step] = 0 42 | self.raw_x = x 43 | self.raw_y = y 44 | 45 | def prepare_batches(self): 46 | x = np.expand_dims(self.raw_x, axis=-1) 47 | y = np.expand_dims(self.raw_y, axis=-1) 48 | self.x_batches = np.split(x, self.n_batches, axis=1) 49 | self.y_batches = np.split(y, self.n_batches, axis=1) 50 | 51 | 52 | class TemporalOrderExp6aSequence(Sequence): 53 | """ 54 | From Hochreiter&Schmidhuber(1997): 55 | 56 | The goal is to classify sequences. Elements and targets are represented locally 57 | (input vectors with only one non-zero bit). 
The sequence starts with an E, ends 58 | with a B (the "trigger symbol") and otherwise consists of randomly chosen symbols 59 | from the set {a, b, c, d} except for two elements at positions t1 and t2 that are 60 | either X or Y . The sequence length is randomly chosen between 100 and 110, t1 is 61 | randomly chosen between 10 and 20, and t2 is randomly chosen between 50 and 60. 62 | There are 4 sequence classes Q, R, S, U which depend on the temporal order of X and Y. 63 | The rules are: 64 | X, X -> Q, 65 | X, Y -> R, 66 | Y , X -> S, 67 | Y , Y -> U. 68 | 69 | """ 70 | 71 | def __init__(self, length_range=(100, 111), t1_range=(10, 21), t2_range=(50, 61), 72 | batch_size=32, seed=None): 73 | 74 | self.classes = ['Q', 'R', 'S', 'U'] 75 | self.n_classes = len(self.classes) 76 | 77 | self.relevant_symbols = ['X', 'Y'] 78 | self.distraction_symbols = ['a', 'b', 'c', 'd'] 79 | self.start_symbol = 'B' 80 | self.end_symbol = 'E' 81 | 82 | self.length_range = length_range 83 | self.t1_range = t1_range 84 | self.t2_range = t2_range 85 | self.batch_size = batch_size 86 | 87 | if seed is not None: 88 | np.random.seed(seed) 89 | 90 | all_symbols = self.relevant_symbols + self.distraction_symbols + \ 91 | [self.start_symbol] + [self.end_symbol] 92 | self.n_symbols = len(all_symbols) 93 | self.s_to_idx = {s: n for n, s in enumerate(all_symbols)} 94 | self.idx_to_s = {n: s for n, s in enumerate(all_symbols)} 95 | 96 | self.c_to_idx = {c: n for n, c in enumerate(self.classes)} 97 | self.idx_to_c = {n: c for n, c in enumerate(self.classes)} 98 | 99 | def generate_pair(self): 100 | length = np.random.randint(self.length_range[0], self.length_range[1]) 101 | t1 = np.random.randint(self.t1_range[0], self.t1_range[1]) 102 | t2 = np.random.randint(self.t2_range[0], self.t2_range[1]) 103 | 104 | x = np.random.choice(self.distraction_symbols, length) 105 | x[0] = self.start_symbol 106 | x[-1] = self.end_symbol 107 | 108 | y = np.random.choice(self.classes) 109 | if y == 'Q': 110 | x[t1], x[t2] = self.relevant_symbols[0], self.relevant_symbols[0] 111 | elif y == 'R': 112 | x[t1], x[t2] = self.relevant_symbols[0], self.relevant_symbols[1] 113 | elif y == 'S': 114 | x[t1], x[t2] = self.relevant_symbols[1], self.relevant_symbols[0] 115 | else: 116 | x[t1], x[t2] = self.relevant_symbols[1], self.relevant_symbols[1] 117 | 118 | return ''.join(x), y 119 | 120 | # encoding/decoding single instance version 121 | 122 | def encode_x(self, x): 123 | idx_x = [self.s_to_idx[s] for s in x] 124 | return to_categorical(idx_x, num_classes=self.n_symbols) 125 | 126 | def encode_y(self, y): 127 | idx_y = self.c_to_idx[y] 128 | return to_categorical(idx_y, num_classes=self.n_classes) 129 | 130 | def decode_x(self, x): 131 | x = x[np.sum(x, axis=1) > 0] # remove padding 132 | return ''.join([self.idx_to_s[pos] for pos in np.argmax(x, axis=1)]) 133 | 134 | def decode_y(self, y): 135 | return self.idx_to_c[np.argmax(y)] 136 | 137 | # encoding/decoding batch versions 138 | 139 | def encode_x_batch(self, x_batch): 140 | return pad_sequences([self.encode_x(x) for x in x_batch], 141 | maxlen=self.length_range[1]) 142 | 143 | def encode_y_batch(self, y_batch): 144 | return np.array([self.encode_y(y) for y in y_batch]) 145 | 146 | def decode_x_batch(self, x_batch): 147 | return [self.decode_x(x) for x in x_batch] 148 | 149 | def decode_y_batch(self, y_batch): 150 | return [self.idx_to_c[pos] for pos in np.argmax(y_batch, axis=1)] 151 | 152 | def __len__(self): 153 | """ Let's assume 1000 sequences as the size of data. 
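        With the default batch_size of 32 this yields int(1000 / 32) = 31 batches per epoch.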
""" 154 | return int(1000. / self.batch_size) 155 | 156 | def __getitem__(self, index): 157 | batch_x, batch_y = [], [] 158 | for _ in range(self.batch_size): 159 | x, y = self.generate_pair() 160 | batch_x.append(x) 161 | batch_y.append(y) 162 | return self.encode_x_batch(batch_x), self.encode_y_batch(batch_y) 163 | 164 | class DifficultyLevel: 165 | """ On HARD, settings are identical to the original settings from the '97 paper.""" 166 | EASY, NORMAL, MODERATE, HARD, NIGHTMARE = range(5) 167 | 168 | @staticmethod 169 | def get_predefined_generator(difficulty_level, batch_size=32, seed=8382): 170 | EASY = TemporalOrderExp6aSequence.DifficultyLevel.EASY 171 | NORMAL = TemporalOrderExp6aSequence.DifficultyLevel.NORMAL 172 | MODERATE = TemporalOrderExp6aSequence.DifficultyLevel.MODERATE 173 | HARD = TemporalOrderExp6aSequence.DifficultyLevel.HARD 174 | 175 | if difficulty_level == EASY: 176 | length_range = (7, 9) 177 | t1_range = (1, 3) 178 | t2_range = (4, 6) 179 | elif difficulty_level == NORMAL: 180 | length_range = (30, 41) 181 | t1_range = (2, 6) 182 | t2_range = (20, 28) 183 | elif difficulty_level == MODERATE: 184 | length_range = (60, 81) 185 | t1_range = (10, 21) 186 | t2_range = (45, 55) 187 | elif difficulty_level == HARD: 188 | length_range = (100, 111) 189 | t1_range = (10, 21) 190 | t2_range = (50, 61) 191 | else: 192 | length_range = (300, 501) 193 | t1_range = (10, 81) 194 | t2_range = (250, 291) 195 | return TemporalOrderExp6aSequence(length_range, t1_range, t2_range, 196 | batch_size, seed) 197 | -------------------------------------------------------------------------------- /slides/01 - ML and spiral classification.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/SudalaiRajkumar/PyTorch-Deep-Learning-Minicourse/d2b0970935ec19cb4526ceb4ec028b94a3958c25/slides/01 - ML and spiral classification.pdf -------------------------------------------------------------------------------- /slides/02 - CNN.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/SudalaiRajkumar/PyTorch-Deep-Learning-Minicourse/d2b0970935ec19cb4526ceb4ec028b94a3958c25/slides/02 - CNN.pdf -------------------------------------------------------------------------------- /slides/03 - Generative models.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/SudalaiRajkumar/PyTorch-Deep-Learning-Minicourse/d2b0970935ec19cb4526ceb4ec028b94a3958c25/slides/03 - Generative models.pdf -------------------------------------------------------------------------------- /slides/04 - RNN.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/SudalaiRajkumar/PyTorch-Deep-Learning-Minicourse/d2b0970935ec19cb4526ceb4ec028b94a3958c25/slides/04 - RNN.pdf --------------------------------------------------------------------------------