├── .gitignore
├── 01-tensor_tutorial.ipynb
├── 02-space_stretching.ipynb
├── 03-autograd_tutorial.ipynb
├── 04-spiral_classification.ipynb
├── 05-convnet.ipynb
├── 06-autoencoder.ipynb
├── 07-VAE.ipynb
├── 08-1-classify_seq_data.ipynb
├── 08-2-echo_data.ipynb
├── 08-3-temporal_order_classification_experiments.ipynb
├── 08-4-echo_experiments.ipynb
├── README.md
├── conda-envt.yml
├── img
│   └── train.gif
├── plot_conf.py
├── raw
│   ├── keras-regularisation.ipynb
│   └── keras-sequences
│       ├── 0_1_classify_seq_data.ipynb
│       ├── 0_2_echo_data.ipynb
│       ├── 1_1_temporal_order_classification_experiments.ipynb
│       ├── 1_2_echo_experiments.ipynb
│       └── sequential_tasks.py
├── sequential_tasks.py
└── slides
    ├── 01 - ML and spiral classification.pdf
    ├── 02 - CNN.pdf
    ├── 03 - Generative models.pdf
    └── 04 - RNN.pdf

/.gitignore:
--------------------------------------------------------------------------------
1 | # Remove [I]Python caching
2 | __pycache__
3 | .ipynb_checkpoints
4 | 
5 | # Remove macOS clutter
6 | .DS_Store
7 | 
8 | # Remove Vim temp files
9 | *sw*
10 | 
--------------------------------------------------------------------------------
/01-tensor_tutorial.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "markdown",
5 | "metadata": {},
6 | "source": [
7 | "\n",
8 | "# What is PyTorch?\n",
9 | "\n",
10 | "It’s a Python-based scientific computing package targeted at two audiences:\n",
11 | "\n",
12 | "- A tensor library that uses the power of GPUs\n",
13 | "- A deep learning research platform that provides maximum flexibility and speed\n",
14 | "\n",
15 | "## Import the library"
16 | ]
17 | },
18 | {
19 | "cell_type": "code",
20 | "execution_count": null,
21 | "metadata": {},
22 | "outputs": [],
23 | "source": [
24 | "import torch  # <Ctrl> / <Shift> + <Enter>"
25 | ]
26 | },
27 | {
28 | "cell_type": "markdown",
29 | "metadata": {},
30 | "source": [
31 | "## Getting help in Jupyter"
32 | ]
33 | },
34 | {
35 | "cell_type": "code",
36 | "execution_count": null,
37 | "metadata": {},
38 | "outputs": [],
39 | "source": [
40 | "torch.sq  # <Tab>"
41 | ]
42 | },
43 | {
44 | "cell_type": "code",
45 | "execution_count": null,
46 | "metadata": {},
47 | "outputs": [],
48 | "source": [
49 | "# What about all `*Tensor`s?\n",
50 | "torch.*Tensor?"
51 | ]
52 | },
53 | {
54 | "cell_type": "code",
55 | "execution_count": null,
56 | "metadata": {},
57 | "outputs": [],
58 | "source": [
59 | "torch.nn.Module()  # <Shift> + <Tab>"
60 | ]
61 | },
62 | {
63 | "cell_type": "code",
64 | "execution_count": null,
65 | "metadata": {},
66 | "outputs": [],
67 | "source": [
68 | "# Annotate your functions / classes!\n",
69 | "torch.nn.Module?"
70 | ]
71 | },
72 | {
73 | "cell_type": "code",
74 | "execution_count": null,
75 | "metadata": {},
76 | "outputs": [],
77 | "source": [
78 | "torch.nn.Module??"
79 | ]
80 | },
81 | {
82 | "cell_type": "markdown",
83 | "metadata": {},
84 | "source": [
85 | "## Dropping to Bash: magic!"
86 | ]
87 | },
88 | {
89 | "cell_type": "code",
90 | "execution_count": null,
91 | "metadata": {
92 | "scrolled": true
93 | },
94 | "outputs": [],
95 | "source": [
96 | "! 
ls -lh" 97 | ] 98 | }, 99 | { 100 | "cell_type": "code", 101 | "execution_count": null, 102 | "metadata": {}, 103 | "outputs": [], 104 | "source": [ 105 | "%%bash\n", 106 | "for f in $(ls *.*); do\n", 107 | " echo $(wc -l $f)\n", 108 | "done" 109 | ] 110 | }, 111 | { 112 | "cell_type": "code", 113 | "execution_count": null, 114 | "metadata": {}, 115 | "outputs": [], 116 | "source": [ 117 | "# Help?\n", 118 | "%%bash?" 119 | ] 120 | }, 121 | { 122 | "cell_type": "code", 123 | "execution_count": null, 124 | "metadata": {}, 125 | "outputs": [], 126 | "source": [ 127 | "# Getting some general help\n", 128 | "%magic" 129 | ] 130 | }, 131 | { 132 | "cell_type": "markdown", 133 | "metadata": {}, 134 | "source": [ 135 | "## Python native data types\n", 136 | "\n", 137 | "Python has many native datatypes. Here are the important ones:\n", 138 | "\n", 139 | " - **Booleans** are either `True` or `False`.\n", 140 | " - **Numbers** can be integers (1 and 2), floats (1.1 and 1.2), fractions (1/2 and 2/3), or even complex numbers.\n", 141 | " - **Strings** are sequences of Unicode characters, e.g. an html document.\n", 142 | " - **Lists** are ordered sequences of values.\n", 143 | " - **Tuples** are ordered, immutable sequences of values.\n", 144 | " - **Sets** are unordered bags of values.\n", 145 | " - **Dictionaries** are unordered bags of key-value pairs.\n", 146 | " \n", 147 | "See [here](http://www.diveintopython3.net/native-datatypes.html) for a complete overview.\n", 148 | "\n", 149 | "### More resources\n", 150 | "\n", 151 | " 1. Brief Python introduction [here](https://learnxinyminutes.com/docs/python3/).\n", 152 | " 2. Full Python tutorial [here](https://docs.python.org/3/tutorial/).\n", 153 | " 3. A Whirlwind Tour of Python [here](https://github.com/jakevdp/WhirlwindTourOfPython).\n", 154 | " 4. Python Data Science Handbook [here](https://github.com/jakevdp/PythonDataScienceHandbook)." 155 | ] 156 | }, 157 | { 158 | "cell_type": "markdown", 159 | "metadata": {}, 160 | "source": [ 161 | "## Torch!" 
162 | ] 163 | }, 164 | { 165 | "cell_type": "code", 166 | "execution_count": null, 167 | "metadata": {}, 168 | "outputs": [], 169 | "source": [ 170 | "t = torch.Tensor(2, 3, 4)\n", 171 | "type(t)" 172 | ] 173 | }, 174 | { 175 | "cell_type": "code", 176 | "execution_count": null, 177 | "metadata": {}, 178 | "outputs": [], 179 | "source": [ 180 | "t.size()" 181 | ] 182 | }, 183 | { 184 | "cell_type": "code", 185 | "execution_count": null, 186 | "metadata": {}, 187 | "outputs": [], 188 | "source": [ 189 | "# t.size() is a classic tuple =>\n", 190 | "print('t size:', ' \\u00D7 '.join(map(str, t.size())))" 191 | ] 192 | }, 193 | { 194 | "cell_type": "code", 195 | "execution_count": null, 196 | "metadata": {}, 197 | "outputs": [], 198 | "source": [ 199 | "print(f'point in a {t.numel()} dimensional space')\n", 200 | "print(f'organised in {t.dim()} sub-dimensions')" 201 | ] 202 | }, 203 | { 204 | "cell_type": "code", 205 | "execution_count": null, 206 | "metadata": {}, 207 | "outputs": [], 208 | "source": [ 209 | "t" 210 | ] 211 | }, 212 | { 213 | "cell_type": "code", 214 | "execution_count": null, 215 | "metadata": {}, 216 | "outputs": [], 217 | "source": [ 218 | "# Mind the underscore!\n", 219 | "t.random_(10)" 220 | ] 221 | }, 222 | { 223 | "cell_type": "code", 224 | "execution_count": null, 225 | "metadata": {}, 226 | "outputs": [], 227 | "source": [ 228 | "t" 229 | ] 230 | }, 231 | { 232 | "cell_type": "code", 233 | "execution_count": null, 234 | "metadata": {}, 235 | "outputs": [], 236 | "source": [ 237 | "r = torch.Tensor(t)\n", 238 | "r.resize_(3, 8)\n", 239 | "r" 240 | ] 241 | }, 242 | { 243 | "cell_type": "code", 244 | "execution_count": null, 245 | "metadata": {}, 246 | "outputs": [], 247 | "source": [ 248 | "r.zero_()" 249 | ] 250 | }, 251 | { 252 | "cell_type": "code", 253 | "execution_count": null, 254 | "metadata": {}, 255 | "outputs": [], 256 | "source": [ 257 | "t" 258 | ] 259 | }, 260 | { 261 | "cell_type": "code", 262 | "execution_count": null, 263 | "metadata": {}, 264 | "outputs": [], 265 | "source": [ 266 | "# This *is* important, sigh...\n", 267 | "s = r.clone()" 268 | ] 269 | }, 270 | { 271 | "cell_type": "code", 272 | "execution_count": null, 273 | "metadata": {}, 274 | "outputs": [], 275 | "source": [ 276 | "s.fill_(1)\n", 277 | "s" 278 | ] 279 | }, 280 | { 281 | "cell_type": "code", 282 | "execution_count": null, 283 | "metadata": {}, 284 | "outputs": [], 285 | "source": [ 286 | "r" 287 | ] 288 | }, 289 | { 290 | "cell_type": "markdown", 291 | "metadata": {}, 292 | "source": [ 293 | "## Vectors (1D Tensors)" 294 | ] 295 | }, 296 | { 297 | "cell_type": "code", 298 | "execution_count": null, 299 | "metadata": {}, 300 | "outputs": [], 301 | "source": [ 302 | "v = torch.Tensor([1, 2, 3, 4]); v" 303 | ] 304 | }, 305 | { 306 | "cell_type": "code", 307 | "execution_count": null, 308 | "metadata": {}, 309 | "outputs": [], 310 | "source": [ 311 | "print(f'dim: {v.dim()}, size: {v.size()[0]}')" 312 | ] 313 | }, 314 | { 315 | "cell_type": "code", 316 | "execution_count": null, 317 | "metadata": {}, 318 | "outputs": [], 319 | "source": [ 320 | "w = torch.Tensor([1, 0, 2, 0]); w" 321 | ] 322 | }, 323 | { 324 | "cell_type": "code", 325 | "execution_count": null, 326 | "metadata": {}, 327 | "outputs": [], 328 | "source": [ 329 | "# Element-wise multiplication\n", 330 | "v * w" 331 | ] 332 | }, 333 | { 334 | "cell_type": "code", 335 | "execution_count": null, 336 | "metadata": {}, 337 | "outputs": [], 338 | "source": [ 339 | "# Scalar product: 1*1 + 2*0 + 3*2 + 4*0\n", 340 | "v @ w" 341 
| ] 342 | }, 343 | { 344 | "cell_type": "code", 345 | "execution_count": null, 346 | "metadata": {}, 347 | "outputs": [], 348 | "source": [ 349 | "x = torch.Tensor(5).random_(10); x" 350 | ] 351 | }, 352 | { 353 | "cell_type": "code", 354 | "execution_count": null, 355 | "metadata": {}, 356 | "outputs": [], 357 | "source": [ 358 | "print(f'first: {x[0]}, last: {x[-1]}')" 359 | ] 360 | }, 361 | { 362 | "cell_type": "code", 363 | "execution_count": null, 364 | "metadata": {}, 365 | "outputs": [], 366 | "source": [ 367 | "# Extract sub-Tensor [from:to)\n", 368 | "x[1:2 + 1]" 369 | ] 370 | }, 371 | { 372 | "cell_type": "code", 373 | "execution_count": null, 374 | "metadata": {}, 375 | "outputs": [], 376 | "source": [ 377 | "v" 378 | ] 379 | }, 380 | { 381 | "cell_type": "code", 382 | "execution_count": null, 383 | "metadata": {}, 384 | "outputs": [], 385 | "source": [ 386 | "v = torch.arange(1, 4 + 1); v" 387 | ] 388 | }, 389 | { 390 | "cell_type": "code", 391 | "execution_count": null, 392 | "metadata": {}, 393 | "outputs": [], 394 | "source": [ 395 | "print(v.pow(2), v)" 396 | ] 397 | }, 398 | { 399 | "cell_type": "code", 400 | "execution_count": null, 401 | "metadata": {}, 402 | "outputs": [], 403 | "source": [ 404 | "print(v.pow_(2), v)" 405 | ] 406 | }, 407 | { 408 | "cell_type": "markdown", 409 | "metadata": {}, 410 | "source": [ 411 | "## Matrices (2D Tensors)" 412 | ] 413 | }, 414 | { 415 | "cell_type": "code", 416 | "execution_count": null, 417 | "metadata": {}, 418 | "outputs": [], 419 | "source": [ 420 | "m = torch.Tensor([[2, 5, 3, 7],\n", 421 | " [4, 2, 1, 9]]); m" 422 | ] 423 | }, 424 | { 425 | "cell_type": "code", 426 | "execution_count": null, 427 | "metadata": {}, 428 | "outputs": [], 429 | "source": [ 430 | "m.dim()" 431 | ] 432 | }, 433 | { 434 | "cell_type": "code", 435 | "execution_count": null, 436 | "metadata": {}, 437 | "outputs": [], 438 | "source": [ 439 | "print(m.size(0), m.size(1), m.size(), sep=' -- ')" 440 | ] 441 | }, 442 | { 443 | "cell_type": "code", 444 | "execution_count": null, 445 | "metadata": {}, 446 | "outputs": [], 447 | "source": [ 448 | "m.numel()" 449 | ] 450 | }, 451 | { 452 | "cell_type": "code", 453 | "execution_count": null, 454 | "metadata": {}, 455 | "outputs": [], 456 | "source": [ 457 | "m[0][2]" 458 | ] 459 | }, 460 | { 461 | "cell_type": "code", 462 | "execution_count": null, 463 | "metadata": {}, 464 | "outputs": [], 465 | "source": [ 466 | "m[0, 2]" 467 | ] 468 | }, 469 | { 470 | "cell_type": "code", 471 | "execution_count": null, 472 | "metadata": {}, 473 | "outputs": [], 474 | "source": [ 475 | "m[:, 1]" 476 | ] 477 | }, 478 | { 479 | "cell_type": "code", 480 | "execution_count": null, 481 | "metadata": {}, 482 | "outputs": [], 483 | "source": [ 484 | "m[:, [1]]" 485 | ] 486 | }, 487 | { 488 | "cell_type": "code", 489 | "execution_count": null, 490 | "metadata": {}, 491 | "outputs": [], 492 | "source": [ 493 | "m[[0], :]" 494 | ] 495 | }, 496 | { 497 | "cell_type": "code", 498 | "execution_count": null, 499 | "metadata": {}, 500 | "outputs": [], 501 | "source": [ 502 | "m[0, :]" 503 | ] 504 | }, 505 | { 506 | "cell_type": "code", 507 | "execution_count": null, 508 | "metadata": {}, 509 | "outputs": [], 510 | "source": [ 511 | "v = torch.arange(1, 4 + 1); v" 512 | ] 513 | }, 514 | { 515 | "cell_type": "code", 516 | "execution_count": null, 517 | "metadata": {}, 518 | "outputs": [], 519 | "source": [ 520 | "m @ v" 521 | ] 522 | }, 523 | { 524 | "cell_type": "code", 525 | "execution_count": null, 526 | "metadata": {}, 527 | "outputs": [], 
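# ---- Added note (editor's sketch) ---------------------------------------------
# A hedged summary of the indexing cells above, for the 2 x 4 matrix m:
#     m[0][2] and m[0, 2]  ->  the same scalar entry (the latter is idiomatic)
#     m[:, 1]              ->  1-D tensor of size 2 (integer index drops a dim)
#     m[:, [1]]            ->  2 x 1 matrix (list index keeps the dim)
#     m[[0], :]            ->  1 x 4 matrix, whereas m[0, :] is 1-D of size 4
# --------------------------------------------------------------------------------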
528 | "source": [ 529 | "m[[0], :] @ v" 530 | ] 531 | }, 532 | { 533 | "cell_type": "code", 534 | "execution_count": null, 535 | "metadata": {}, 536 | "outputs": [], 537 | "source": [ 538 | "m[[1], :] @ v" 539 | ] 540 | }, 541 | { 542 | "cell_type": "code", 543 | "execution_count": null, 544 | "metadata": {}, 545 | "outputs": [], 546 | "source": [ 547 | "m + torch.rand(2, 4)" 548 | ] 549 | }, 550 | { 551 | "cell_type": "code", 552 | "execution_count": null, 553 | "metadata": {}, 554 | "outputs": [], 555 | "source": [ 556 | "m - torch.rand(2, 4)" 557 | ] 558 | }, 559 | { 560 | "cell_type": "code", 561 | "execution_count": null, 562 | "metadata": {}, 563 | "outputs": [], 564 | "source": [ 565 | "m * torch.rand(2, 4)" 566 | ] 567 | }, 568 | { 569 | "cell_type": "code", 570 | "execution_count": null, 571 | "metadata": {}, 572 | "outputs": [], 573 | "source": [ 574 | "m / torch.rand(2, 4)" 575 | ] 576 | }, 577 | { 578 | "cell_type": "code", 579 | "execution_count": null, 580 | "metadata": {}, 581 | "outputs": [], 582 | "source": [ 583 | "m.t()" 584 | ] 585 | }, 586 | { 587 | "cell_type": "code", 588 | "execution_count": null, 589 | "metadata": {}, 590 | "outputs": [], 591 | "source": [ 592 | "# Same as\n", 593 | "m.transpose(0, 1)" 594 | ] 595 | }, 596 | { 597 | "cell_type": "markdown", 598 | "metadata": {}, 599 | "source": [ 600 | "## Constructors" 601 | ] 602 | }, 603 | { 604 | "cell_type": "code", 605 | "execution_count": null, 606 | "metadata": {}, 607 | "outputs": [], 608 | "source": [ 609 | "torch.arange(3, 8 + 1)" 610 | ] 611 | }, 612 | { 613 | "cell_type": "code", 614 | "execution_count": null, 615 | "metadata": {}, 616 | "outputs": [], 617 | "source": [ 618 | "torch.arange(5.7, -3, -2.1)" 619 | ] 620 | }, 621 | { 622 | "cell_type": "code", 623 | "execution_count": null, 624 | "metadata": {}, 625 | "outputs": [], 626 | "source": [ 627 | "torch.linspace(3, 8, 20).view(1, -1)" 628 | ] 629 | }, 630 | { 631 | "cell_type": "code", 632 | "execution_count": null, 633 | "metadata": {}, 634 | "outputs": [], 635 | "source": [ 636 | "torch.zeros(3, 5)" 637 | ] 638 | }, 639 | { 640 | "cell_type": "code", 641 | "execution_count": null, 642 | "metadata": {}, 643 | "outputs": [], 644 | "source": [ 645 | "torch.ones(3, 2, 5)" 646 | ] 647 | }, 648 | { 649 | "cell_type": "code", 650 | "execution_count": null, 651 | "metadata": {}, 652 | "outputs": [], 653 | "source": [ 654 | "torch.eye(3)" 655 | ] 656 | }, 657 | { 658 | "cell_type": "code", 659 | "execution_count": null, 660 | "metadata": {}, 661 | "outputs": [], 662 | "source": [ 663 | "# Pretty plotting config\n", 664 | "%run plot_conf.py" 665 | ] 666 | }, 667 | { 668 | "cell_type": "code", 669 | "execution_count": null, 670 | "metadata": {}, 671 | "outputs": [], 672 | "source": [ 673 | "plt_style()" 674 | ] 675 | }, 676 | { 677 | "cell_type": "code", 678 | "execution_count": null, 679 | "metadata": {}, 680 | "outputs": [], 681 | "source": [ 682 | "# Numpy bridge!\n", 683 | "plt.hist(torch.randn(1000).numpy(), 100);" 684 | ] 685 | }, 686 | { 687 | "cell_type": "code", 688 | "execution_count": null, 689 | "metadata": {}, 690 | "outputs": [], 691 | "source": [ 692 | "plt.hist(torch.randn(10**6).numpy(), 100); # how much does this chart weight?\n", 693 | "# use rasterized=True for SVG/EPS/PDF!" 
694 | ]
695 | },
696 | {
697 | "cell_type": "code",
698 | "execution_count": null,
699 | "metadata": {},
700 | "outputs": [],
701 | "source": [
702 | "plt.hist(torch.rand(10**6).numpy(), 100);"
703 | ]
704 | },
705 | {
706 | "cell_type": "markdown",
707 | "metadata": {},
708 | "source": [
709 | "## Casting"
710 | ]
711 | },
712 | {
713 | "cell_type": "code",
714 | "execution_count": null,
715 | "metadata": {},
716 | "outputs": [],
717 | "source": [
718 | "torch.*Tensor?"
719 | ]
720 | },
721 | {
722 | "cell_type": "code",
723 | "execution_count": null,
724 | "metadata": {},
725 | "outputs": [],
726 | "source": [
727 | "m"
728 | ]
729 | },
730 | {
731 | "cell_type": "code",
732 | "execution_count": null,
733 | "metadata": {},
734 | "outputs": [],
735 | "source": [
736 | "m.double()"
737 | ]
738 | },
739 | {
740 | "cell_type": "code",
741 | "execution_count": null,
742 | "metadata": {},
743 | "outputs": [],
744 | "source": [
745 | "m.byte()"
746 | ]
747 | },
748 | {
749 | "cell_type": "code",
750 | "execution_count": null,
751 | "metadata": {},
752 | "outputs": [],
753 | "source": [
754 | "device = torch.device(\"cuda:0\" if torch.cuda.is_available() else \"cpu\")\n",
755 | "m.to(device)"
756 | ]
757 | },
758 | {
759 | "cell_type": "code",
760 | "execution_count": null,
761 | "metadata": {},
762 | "outputs": [],
763 | "source": [
764 | "m_np = m.numpy(); m_np"
765 | ]
766 | },
767 | {
768 | "cell_type": "code",
769 | "execution_count": null,
770 | "metadata": {},
771 | "outputs": [],
772 | "source": [
773 | "m_np[0, 0] = -1; m_np"
774 | ]
775 | },
776 | {
777 | "cell_type": "code",
778 | "execution_count": null,
779 | "metadata": {},
780 | "outputs": [],
781 | "source": [
782 | "m"
783 | ]
784 | },
785 | {
786 | "cell_type": "code",
787 | "execution_count": null,
788 | "metadata": {},
789 | "outputs": [],
790 | "source": [
791 | "n_np = np.arange(5)\n",
792 | "n = torch.from_numpy(n_np)\n",
793 | "print(n_np, n)"
794 | ]
795 | },
796 | {
797 | "cell_type": "code",
798 | "execution_count": null,
799 | "metadata": {},
800 | "outputs": [],
801 | "source": [
802 | "n.mul_(2)\n",
803 | "n_np"
804 | ]
805 | },
806 | {
807 | "cell_type": "markdown",
808 | "metadata": {},
809 | "source": [
810 | "## More fun"
811 | ]
812 | },
813 | {
814 | "cell_type": "code",
815 | "execution_count": null,
816 | "metadata": {},
817 | "outputs": [],
818 | "source": [
819 | "a = torch.Tensor([[1, 2, 3, 4]])\n",
820 | "b = torch.Tensor([[5, 6, 7, 8]])\n",
821 | "print(a, b)"
822 | ]
823 | },
824 | {
825 | "cell_type": "code",
826 | "execution_count": null,
827 | "metadata": {},
828 | "outputs": [],
829 | "source": [
830 | "torch.cat((a, b), 0)"
831 | ]
832 | },
833 | {
834 | "cell_type": "code",
835 | "execution_count": null,
836 | "metadata": {},
837 | "outputs": [],
838 | "source": [
839 | "torch.cat((a, b), 1)"
840 | ]
841 | },
842 | {
843 | "cell_type": "markdown",
844 | "metadata": {},
845 | "source": [
846 | "## Much more\n",
847 | "\n",
848 | "There's definitely much more, but these were the basics of having fun with `Tensor`s.\n",
849 | "\n",
850 | "The full *Torch* API should be read at least once.\n",
851 | "Hence, go [here](http://pytorch.org/docs/0.3.0/torch.html).\n",
852 | "You'll find 100+ `Tensor` operations described, including transposing, indexing, slicing, mathematical operations, linear algebra, and random number generation."
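# ---- Added example (editor's sketch) ---------------------------------------------
# A hedged taste of the operation families mentioned above; shapes and values
# are arbitrary, chosen only for illustration.
a = torch.randn(3, 4)
a.t()                      # transposing
a[1, 2:4]                  # indexing and slicing
(a - a.mean()) / a.std()   # mathematical operations
torch.eye(3) @ a           # linear algebra: matrix product
torch.manual_seed(0)       # random numbers, reproducibly
torch.randn(2, 2)
# ------------------------------------------------------------------------------------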
853 | ] 854 | } 855 | ], 856 | "metadata": { 857 | "kernelspec": { 858 | "display_name": "Python 3", 859 | "language": "python", 860 | "name": "python3" 861 | }, 862 | "language_info": { 863 | "codemirror_mode": { 864 | "name": "ipython", 865 | "version": 3 866 | }, 867 | "file_extension": ".py", 868 | "mimetype": "text/x-python", 869 | "name": "python", 870 | "nbconvert_exporter": "python", 871 | "pygments_lexer": "ipython3", 872 | "version": "3.6.5" 873 | } 874 | }, 875 | "nbformat": 4, 876 | "nbformat_minor": 1 877 | } 878 | -------------------------------------------------------------------------------- /02-space_stretching.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "code", 5 | "execution_count": null, 6 | "metadata": {}, 7 | "outputs": [], 8 | "source": [ 9 | "# Pretty plotting config\n", 10 | "%run plot_conf.py" 11 | ] 12 | }, 13 | { 14 | "cell_type": "code", 15 | "execution_count": null, 16 | "metadata": {}, 17 | "outputs": [], 18 | "source": [ 19 | "# Set style (need to be in a new cell)\n", 20 | "plt_style()" 21 | ] 22 | }, 23 | { 24 | "cell_type": "code", 25 | "execution_count": null, 26 | "metadata": {}, 27 | "outputs": [], 28 | "source": [ 29 | "import torch\n", 30 | "import torch.nn as nn\n", 31 | "import matplotlib.pyplot as plt\n", 32 | "\n", 33 | "\n", 34 | "# utility function\n", 35 | "def show_scatterplot(X, norm=True, title=''):\n", 36 | " X = X.numpy()\n", 37 | " plt.figure()\n", 38 | " plt.axis('equal')\n", 39 | " plt.scatter(X[:, 0], X[:, 1], c=colors, s=p_size)\n", 40 | " if norm:\n", 41 | " plt.xlim(-6, 6)\n", 42 | " plt.ylim(-6, 6)\n", 43 | " plt.grid(True)\n", 44 | " plt.title(title)\n", 45 | "\n", 46 | "\n", 47 | "# generate some points in 2-D space\n", 48 | "n_points = 1000\n", 49 | "p_size = 30\n", 50 | "X = torch.randn(n_points, 2) \n", 51 | "colors = X[:, 0].numpy() \n", 52 | "\n", 53 | "show_scatterplot(X, norm=True, title='X')" 54 | ] 55 | }, 56 | { 57 | "cell_type": "markdown", 58 | "metadata": {}, 59 | "source": [ 60 | "# Visualizing Linear Transformations\n", 61 | "\n", 62 | "* Generate a random matrix $W$\n", 63 | "\n", 64 | "$\n", 65 | "\\begin{equation}\n", 66 | " W = U\n", 67 | " \\left[ {\\begin{array}{cc}\n", 68 | " s_1 & 0 \\\\\n", 69 | " 0 & s_2 \\\\\n", 70 | " \\end{array} } \\right]\n", 71 | " V^\\top\n", 72 | "\\end{equation}\n", 73 | "$\n", 74 | "* Compute $y = Wx$\n", 75 | "* Larger singular values stretch the points\n", 76 | "* Smaller singular values push them together\n", 77 | "* $U, V$ rotate/reflect" 78 | ] 79 | }, 80 | { 81 | "cell_type": "code", 82 | "execution_count": null, 83 | "metadata": { 84 | "scrolled": false 85 | }, 86 | "outputs": [], 87 | "source": [ 88 | "show_scatterplot(X, norm=True, title='X')\n", 89 | "\n", 90 | "for i in range(10):\n", 91 | " # create a random matrix\n", 92 | " W = torch.randn(2, 2)\n", 93 | " # transform points\n", 94 | " Y = torch.mm(X, W)\n", 95 | " # compute singular values\n", 96 | " U,S,V = torch.svd(W)\n", 97 | " # plot\n", 98 | " show_scatterplot(Y, norm=True, title='y = Wx, singular values : [{:.3f}, {:.3f}]'.format(S[0], S[1]))" 99 | ] 100 | }, 101 | { 102 | "cell_type": "markdown", 103 | "metadata": {}, 104 | "source": [ 105 | "# Linear transformation with PyTorch" 106 | ] 107 | }, 108 | { 109 | "cell_type": "code", 110 | "execution_count": null, 111 | "metadata": {}, 112 | "outputs": [], 113 | "source": [ 114 | "device = torch.device(\"cuda:0\" if torch.cuda.is_available() else \"cpu\")" 115 | ] 116 | 
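# ---- Added note (editor's sketch) ---------------------------------------------
# `model.to(device)` in the next cell moves only the model's parameters; on a
# CUDA machine the input has to be moved as well, or PyTorch raises a device
# mismatch error. A minimal hedged pattern:
#     model = model.to(device)
#     Y = model(X.to(device)).data.cpu()   # back to CPU for numpy/matplotlib
# --------------------------------------------------------------------------------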
}, 117 | { 118 | "cell_type": "code", 119 | "execution_count": null, 120 | "metadata": {}, 121 | "outputs": [], 122 | "source": [ 123 | "model = nn.Sequential(\n", 124 | " nn.Linear(2, 2, bias=False)\n", 125 | ")\n", 126 | "model.to(device)\n", 127 | "Y = model(X).data\n", 128 | "show_scatterplot(Y)" 129 | ] 130 | }, 131 | { 132 | "cell_type": "markdown", 133 | "metadata": {}, 134 | "source": [ 135 | "# Non-linear Transform: Map Points to a Square\n", 136 | "\n", 137 | "* Linear transforms can rotate, reflect, stretch and compress, but cannot curve\n", 138 | "* We need non-linearities for this\n", 139 | "* Can (approximately) map points to a square by first stretching out by a factor $s$, then squashing with a tanh function\n", 140 | "\n", 141 | "$\n", 142 | " f(x)= \\tanh \\left(\n", 143 | " \\left[ {\\begin{array}{cc}\n", 144 | " s & 0 \\\\\n", 145 | " 0 & s \\\\\n", 146 | " \\end{array} } \\right] \n", 147 | " x\n", 148 | " \\right)\n", 149 | "$" 150 | ] 151 | }, 152 | { 153 | "cell_type": "code", 154 | "execution_count": null, 155 | "metadata": {}, 156 | "outputs": [], 157 | "source": [ 158 | "z = torch.linspace(-10, 10, 101)\n", 159 | "s = torch.tanh(z)\n", 160 | "plt.plot(z.numpy(), s.numpy())\n", 161 | "plt.title('tanh() non linearity')" 162 | ] 163 | }, 164 | { 165 | "cell_type": "code", 166 | "execution_count": null, 167 | "metadata": { 168 | "scrolled": false 169 | }, 170 | "outputs": [], 171 | "source": [ 172 | "show_scatterplot(X, title='X')\n", 173 | "plt.axis('square')\n", 174 | "\n", 175 | "model = nn.Sequential(\n", 176 | " nn.Linear(2, 2, bias=False),\n", 177 | " nn.Tanh()\n", 178 | " )\n", 179 | "\n", 180 | "model.to(device)\n", 181 | "\n", 182 | "for s in range(1, 10):\n", 183 | " W = s * torch.eye(2)\n", 184 | " model[0].weight.data.copy_(W)\n", 185 | " Y = model(X).data\n", 186 | " show_scatterplot(Y, False, title='f(x), s={}'.format(s))\n", 187 | " plt.axis('square')\n", 188 | " plt.axis([-1.2, 1.2, -1.2, 1.2])" 189 | ] 190 | }, 191 | { 192 | "cell_type": "markdown", 193 | "metadata": { 194 | "collapsed": true 195 | }, 196 | "source": [ 197 | "# Visualize Functions Represented by Random Neural Networks" 198 | ] 199 | }, 200 | { 201 | "cell_type": "code", 202 | "execution_count": null, 203 | "metadata": { 204 | "scrolled": false 205 | }, 206 | "outputs": [], 207 | "source": [ 208 | "show_scatterplot(X, title='x')\n", 209 | "n_hidden = 5\n", 210 | "\n", 211 | "for i in range(5):\n", 212 | " # create 1-layer neural networks with random weights\n", 213 | " model_1layer = nn.Sequential(\n", 214 | " nn.Linear(2, n_hidden, bias=True), \n", 215 | " nn.ReLU(), \n", 216 | " nn.Linear(n_hidden, 2, bias=True)\n", 217 | " )\n", 218 | " Y = model_1layer(X).data\n", 219 | " show_scatterplot(Y, False, title='f(x)')" 220 | ] 221 | }, 222 | { 223 | "cell_type": "code", 224 | "execution_count": null, 225 | "metadata": { 226 | "scrolled": false 227 | }, 228 | "outputs": [], 229 | "source": [ 230 | "# deeper network with random weights\n", 231 | "show_scatterplot(X, title='x')\n", 232 | "n_hidden = 1000\n", 233 | "\n", 234 | "for i in range(5):\n", 235 | " model_2layer = nn.Sequential(\n", 236 | " nn.Linear(2, n_hidden, bias=True), \n", 237 | " nn.ReLU(), \n", 238 | " nn.Linear(n_hidden, n_hidden, bias=True), \n", 239 | " nn.ReLU(), \n", 240 | " nn.Linear(n_hidden, n_hidden, bias=True), \n", 241 | " nn.ReLU(), \n", 242 | " nn.Linear(n_hidden, n_hidden, bias=True), \n", 243 | " nn.ReLU(), \n", 244 | " nn.Linear(n_hidden, 2, bias=True)\n", 245 | " )\n", 246 | " Y = model_2layer(X).data\n", 
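"    # each iteration above builds a new model_2layer with fresh random\n",
"    # weights, so every figure below shows a different random function\n",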
247 | " show_scatterplot(Y, False, title='f(x)')\n", 248 | "\n", 249 | "\n" 250 | ] 251 | }, 252 | { 253 | "cell_type": "code", 254 | "execution_count": null, 255 | "metadata": {}, 256 | "outputs": [], 257 | "source": [] 258 | } 259 | ], 260 | "metadata": { 261 | "kernelspec": { 262 | "display_name": "Python 3", 263 | "language": "python", 264 | "name": "python3" 265 | }, 266 | "language_info": { 267 | "codemirror_mode": { 268 | "name": "ipython", 269 | "version": 3 270 | }, 271 | "file_extension": ".py", 272 | "mimetype": "text/x-python", 273 | "name": "python", 274 | "nbconvert_exporter": "python", 275 | "pygments_lexer": "ipython3", 276 | "version": "3.6.5" 277 | } 278 | }, 279 | "nbformat": 4, 280 | "nbformat_minor": 2 281 | } 282 | -------------------------------------------------------------------------------- /03-autograd_tutorial.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "# Autograd: automatic differentiation\n", 8 | "\n", 9 | "The ``autograd`` package provides automatic differentiation for all operations\n", 10 | "on Tensors. It is a define-by-run framework, which means that your backprop is\n", 11 | "defined by how your code is run, and that every single iteration can be\n", 12 | "different." 13 | ] 14 | }, 15 | { 16 | "cell_type": "code", 17 | "execution_count": null, 18 | "metadata": {}, 19 | "outputs": [], 20 | "source": [ 21 | "import torch" 22 | ] 23 | }, 24 | { 25 | "cell_type": "markdown", 26 | "metadata": {}, 27 | "source": [ 28 | "Create a tensor:" 29 | ] 30 | }, 31 | { 32 | "cell_type": "code", 33 | "execution_count": null, 34 | "metadata": {}, 35 | "outputs": [], 36 | "source": [ 37 | "x = torch.tensor([[1, 2], [3, 4]], requires_grad=True, dtype=torch.float32)\n", 38 | "print(x)" 39 | ] 40 | }, 41 | { 42 | "cell_type": "markdown", 43 | "metadata": {}, 44 | "source": [ 45 | "Do an operation on the tensor:" 46 | ] 47 | }, 48 | { 49 | "cell_type": "code", 50 | "execution_count": null, 51 | "metadata": {}, 52 | "outputs": [], 53 | "source": [ 54 | "y = x - 2\n", 55 | "print(y)" 56 | ] 57 | }, 58 | { 59 | "cell_type": "markdown", 60 | "metadata": {}, 61 | "source": [ 62 | "``y`` was created as a result of an operation, so it has a ``grad_fn``.\n", 63 | "\n" 64 | ] 65 | }, 66 | { 67 | "cell_type": "code", 68 | "execution_count": null, 69 | "metadata": {}, 70 | "outputs": [], 71 | "source": [ 72 | "print(y.grad_fn)" 73 | ] 74 | }, 75 | { 76 | "cell_type": "code", 77 | "execution_count": null, 78 | "metadata": {}, 79 | "outputs": [], 80 | "source": [ 81 | "print(x.grad_fn)" 82 | ] 83 | }, 84 | { 85 | "cell_type": "code", 86 | "execution_count": null, 87 | "metadata": {}, 88 | "outputs": [], 89 | "source": [ 90 | "y.grad_fn" 91 | ] 92 | }, 93 | { 94 | "cell_type": "code", 95 | "execution_count": null, 96 | "metadata": {}, 97 | "outputs": [], 98 | "source": [ 99 | "y.grad_fn.next_functions[0][0]" 100 | ] 101 | }, 102 | { 103 | "cell_type": "code", 104 | "execution_count": null, 105 | "metadata": {}, 106 | "outputs": [], 107 | "source": [ 108 | "y.grad_fn.next_functions[0][0].variable" 109 | ] 110 | }, 111 | { 112 | "cell_type": "markdown", 113 | "metadata": {}, 114 | "source": [ 115 | "Do more operations on `y`" 116 | ] 117 | }, 118 | { 119 | "cell_type": "code", 120 | "execution_count": null, 121 | "metadata": {}, 122 | "outputs": [], 123 | "source": [ 124 | "z = y * y * 3\n", 125 | "out = z.mean()\n", 126 | "\n", 127 | "print(z, out)" 
128 | ]
129 | },
130 | {
131 | "cell_type": "markdown",
132 | "metadata": {},
133 | "source": [
134 | "## Gradients\n",
135 | "\n",
136 | "Let's backprop now. `out.backward()` is equivalent to doing `out.backward(torch.tensor([1.0]))`."
137 | ]
138 | },
139 | {
140 | "cell_type": "code",
141 | "execution_count": null,
142 | "metadata": {},
143 | "outputs": [],
144 | "source": [
145 | "out.backward()"
146 | ]
147 | },
148 | {
149 | "cell_type": "markdown",
150 | "metadata": {},
151 | "source": [
152 | "Print the gradients d(out)/dx:\n",
153 | "\n",
154 | "\n"
155 | ]
156 | },
157 | {
158 | "cell_type": "code",
159 | "execution_count": null,
160 | "metadata": {},
161 | "outputs": [],
162 | "source": [
163 | "print(x.grad)"
164 | ]
165 | },
166 | {
167 | "cell_type": "markdown",
168 | "metadata": {},
169 | "source": [
170 | "You can do many crazy things with autograd!\n",
171 | "> With Great *Flexibility* Comes Great Responsibility"
172 | ]
173 | },
174 | {
175 | "cell_type": "code",
176 | "execution_count": null,
177 | "metadata": {},
178 | "outputs": [],
179 | "source": [
180 | "# Dynamic graphs!\n",
181 | "x = torch.randn(3, requires_grad=True)\n",
182 | "\n",
183 | "y = x * 2\n",
184 | "while y.data.norm() < 1000:\n",
185 | "    y = y * 2\n",
186 | "\n",
187 | "print(y)"
188 | ]
189 | },
190 | {
191 | "cell_type": "code",
192 | "execution_count": null,
193 | "metadata": {},
194 | "outputs": [],
195 | "source": [
196 | "gradients = torch.FloatTensor([0.1, 1.0, 0.0001])\n",
197 | "y.backward(gradients)\n",
198 | "\n",
199 | "print(x.grad)"
200 | ]
201 | },
202 | {
203 | "cell_type": "markdown",
204 | "metadata": {},
205 | "source": [
206 | "## Inference"
207 | ]
208 | },
209 | {
210 | "cell_type": "code",
211 | "execution_count": null,
212 | "metadata": {},
213 | "outputs": [],
214 | "source": [
215 | "n = 3"
216 | ]
217 | },
218 | {
219 | "cell_type": "code",
220 | "execution_count": null,
221 | "metadata": {},
222 | "outputs": [],
223 | "source": [
224 | "x = torch.arange(1, n + 1, requires_grad=True)\n",
225 | "w = torch.ones(n, requires_grad=True)\n",
226 | "z = w @ x\n",
227 | "z.backward()\n",
228 | "print(x.grad, w.grad, sep='\\n')"
229 | ]
230 | },
231 | {
232 | "cell_type": "code",
233 | "execution_count": null,
234 | "metadata": {},
235 | "outputs": [],
236 | "source": [
237 | "x = torch.arange(1, n + 1)\n",
238 | "w = torch.ones(n, requires_grad=True)\n",
239 | "z = w @ x\n",
240 | "z.backward()\n",
241 | "print(x.grad, w.grad, sep='\\n')"
242 | ]
243 | },
244 | {
245 | "cell_type": "code",
246 | "execution_count": null,
247 | "metadata": {},
248 | "outputs": [],
249 | "source": [
250 | "with torch.no_grad():\n",
251 | "    x = torch.arange(1, n + 1)\n",
252 | "    w = torch.ones(n, requires_grad=True)\n",
253 | "    z = w @ x\n",
254 | "    z.backward()\n",
255 | "    print(x.grad, w.grad, sep='\\n')"
256 | ]
257 | },
258 | {
259 | "cell_type": "markdown",
260 | "metadata": {},
261 | "source": [
262 | "## More stuff\n",
263 | "\n",
264 | "Documentation of the automatic differentiation package is at\n",
265 | "http://pytorch.org/docs/autograd\n",
266 | "\n"
267 | ]
268 | }
269 | ],
270 | "metadata": {
271 | "kernelspec": {
272 | "display_name": "Python 3",
273 | "language": "python",
274 | "name": "python3"
275 | },
276 | "language_info": {
277 | "codemirror_mode": {
278 | "name": "ipython",
279 | "version": 3
280 | },
281 | "file_extension": ".py",
282 | "mimetype": "text/x-python",
283 | "name": "python",
284 | "nbconvert_exporter": "python",
285 | "pygments_lexer": "ipython3",
286 | "version": "3.6.5"
287 | } 288 | }, 289 | "nbformat": 4, 290 | "nbformat_minor": 1 291 | } 292 | -------------------------------------------------------------------------------- /04-spiral_classification.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "### Feed Forward Networks" 8 | ] 9 | }, 10 | { 11 | "cell_type": "markdown", 12 | "metadata": {}, 13 | "source": [ 14 | "
Creating differentiable computation graphs for classification tasks.
\n", 15 | "" 16 | ] 17 | }, 18 | { 19 | "cell_type": "markdown", 20 | "metadata": {}, 21 | "source": [ 22 | "### Create the data" 23 | ] 24 | }, 25 | { 26 | "cell_type": "code", 27 | "execution_count": null, 28 | "metadata": {}, 29 | "outputs": [], 30 | "source": [ 31 | "import random\n", 32 | "import torch\n", 33 | "from torch import nn, optim\n", 34 | "import torch.nn.functional as F\n", 35 | "import math\n", 36 | "import os" 37 | ] 38 | }, 39 | { 40 | "cell_type": "code", 41 | "execution_count": null, 42 | "metadata": {}, 43 | "outputs": [], 44 | "source": [ 45 | "%run plot_conf.py" 46 | ] 47 | }, 48 | { 49 | "cell_type": "code", 50 | "execution_count": null, 51 | "metadata": {}, 52 | "outputs": [], 53 | "source": [ 54 | "plt_style()" 55 | ] 56 | }, 57 | { 58 | "cell_type": "code", 59 | "execution_count": null, 60 | "metadata": {}, 61 | "outputs": [], 62 | "source": [ 63 | "from IPython import display" 64 | ] 65 | }, 66 | { 67 | "cell_type": "code", 68 | "execution_count": null, 69 | "metadata": {}, 70 | "outputs": [], 71 | "source": [ 72 | "seed=12345\n", 73 | "random.seed(seed)\n", 74 | "torch.manual_seed(seed)\n", 75 | "N = 1000 # num_samples_per_class\n", 76 | "D = 2 # dimensions\n", 77 | "C = 3 # num_classes\n", 78 | "H = 100 # num_hidden_units" 79 | ] 80 | }, 81 | { 82 | "cell_type": "code", 83 | "execution_count": null, 84 | "metadata": {}, 85 | "outputs": [], 86 | "source": [ 87 | "X = torch.zeros(N * C, D)\n", 88 | "y = torch.zeros(N * C)\n", 89 | "\n", 90 | "for i in range(C):\n", 91 | " index = 0\n", 92 | " r = torch.linspace(0, 1, N)\n", 93 | " t = torch.linspace(\n", 94 | " i * 2 * math.pi / C,\n", 95 | " (i + 2) * 2 * math.pi / C,\n", 96 | " N\n", 97 | " ) + torch.randn(N) * 0.1\n", 98 | " \n", 99 | " for ix in range(N * i, N * (i + 1)):\n", 100 | " X[ix] = r[index] * torch.FloatTensor((\n", 101 | " math.sin(t[index]), math.cos(t[index])\n", 102 | " ))\n", 103 | " y[ix] = i\n", 104 | " index += 1\n", 105 | "\n", 106 | "print(\"SHAPES:\")\n", 107 | "print(\"-------------------\")\n", 108 | "print(\"X:\", tuple(X.size()))\n", 109 | "print(\"y:\", tuple(y.size()))" 110 | ] 111 | }, 112 | { 113 | "cell_type": "code", 114 | "execution_count": null, 115 | "metadata": {}, 116 | "outputs": [], 117 | "source": [ 118 | "def plot_data(X, y, d=.0, auto=False):\n", 119 | " \"\"\"\n", 120 | " Plot the data.\n", 121 | " \"\"\"\n", 122 | " plt.clf()\n", 123 | " plt.scatter(X[:, 0], X[:, 1], c=y, s=20, cmap=plt.cm.Spectral)\n", 124 | " plt.axis('square')\n", 125 | " plt.axis((-1.1, 1.1, -1.1, 1.1))\n", 126 | " if auto is True: plt.axis('equal')\n", 127 | "# plt.savefig('spiral{:.2f}.png'.format(d))" 128 | ] 129 | }, 130 | { 131 | "cell_type": "code", 132 | "execution_count": null, 133 | "metadata": {}, 134 | "outputs": [], 135 | "source": [ 136 | "# Create the data\n", 137 | "plot_data(X.numpy(), y.numpy())" 138 | ] 139 | }, 140 | { 141 | "cell_type": "code", 142 | "execution_count": null, 143 | "metadata": {}, 144 | "outputs": [], 145 | "source": [ 146 | "def plot_model(X, y, model, e=.0, auto=False):\n", 147 | " \"\"\"\n", 148 | " Plot the model from torch weights.\n", 149 | " \"\"\"\n", 150 | " \n", 151 | " X = X.numpy()\n", 152 | " y = y.numpy(),\n", 153 | " w1 = torch.transpose(model.fc1.weight.data, 0, 1).numpy()\n", 154 | " b1 = model.fc1.bias.data.numpy()\n", 155 | " w2 = torch.transpose(model.fc2.weight.data, 0, 1).numpy()\n", 156 | " b2 = model.fc2.bias.data.numpy()\n", 157 | " \n", 158 | " h = 0.01\n", 159 | "\n", 160 | " x_min, x_max = (-1.1, 1.1)\n", 161 | " y_min, y_max = 
(-1.1, 1.1)\n", 162 | " \n", 163 | " if auto is True:\n", 164 | " x_min, x_max = X[:, 0].min() - 1, X[:, 0].max() + 1\n", 165 | " y_min, y_max = X[:, 1].min() - 1, X[:, 1].max() + 1\n", 166 | " xx, yy = np.meshgrid(np.arange(x_min, x_max, h),\n", 167 | " np.arange(y_min, y_max, h))\n", 168 | " Z = np.dot(np.maximum(0, np.dot(np.c_[xx.ravel(), yy.ravel()], w1) + b1), w2) + b2\n", 169 | " Z = np.argmax(Z, axis=1)\n", 170 | " Z = Z.reshape(xx.shape)\n", 171 | " fig = plt.figure()\n", 172 | " plt.contourf(xx, yy, Z, cmap=plt.cm.Spectral, alpha=0.3)\n", 173 | " plt.scatter(X[:, 0], X[:, 1], c=y[0], s=40, cmap=plt.cm.Spectral)\n", 174 | " plt.axis((-1.1, 1.1, -1.1, 1.1))\n", 175 | " plt.axis('square')\n", 176 | " if auto is True:\n", 177 | " plt.axis((xx.min(), xx.max(), yy.min(), yy.max()))\n", 178 | " \n", 179 | "# plt.savefig('train{:03.2f}.png'.format(e))" 180 | ] 181 | }, 182 | { 183 | "cell_type": "markdown", 184 | "metadata": {}, 185 | "source": [ 186 | "### Linear model" 187 | ] 188 | }, 189 | { 190 | "cell_type": "code", 191 | "execution_count": null, 192 | "metadata": {}, 193 | "outputs": [], 194 | "source": [ 195 | "learning_rate = 1e-3\n", 196 | "lambda_l2 = 1e-5" 197 | ] 198 | }, 199 | { 200 | "cell_type": "code", 201 | "execution_count": null, 202 | "metadata": {}, 203 | "outputs": [], 204 | "source": [ 205 | "# Linear model\n", 206 | "class linear_model(nn.Module):\n", 207 | " \"\"\"\n", 208 | " Linear model.\n", 209 | " \"\"\"\n", 210 | " def __init__(self, D_in, H, D_out):\n", 211 | " \"\"\"\n", 212 | " Initialize weights.\n", 213 | " \"\"\"\n", 214 | " super(linear_model, self).__init__()\n", 215 | " self.fc1 = nn.Linear(D_in, H)\n", 216 | " self.fc2 = nn.Linear(H, D_out)\n", 217 | "\n", 218 | " def forward(self, x):\n", 219 | " \"\"\"\n", 220 | " Forward pass.\n", 221 | " \"\"\"\n", 222 | " z = self.fc1(x)\n", 223 | " z = self.fc2(z)\n", 224 | " return z" 225 | ] 226 | }, 227 | { 228 | "cell_type": "code", 229 | "execution_count": null, 230 | "metadata": {}, 231 | "outputs": [], 232 | "source": [ 233 | "device = torch.device(\"cuda:0\" if torch.cuda.is_available() else \"cpu\")" 234 | ] 235 | }, 236 | { 237 | "cell_type": "code", 238 | "execution_count": null, 239 | "metadata": {}, 240 | "outputs": [], 241 | "source": [ 242 | "# nn package to create our linear model\n", 243 | "# each Linear module has a weight and bias\n", 244 | "model = linear_model(D, H, C)\n", 245 | "model.to(device) #Convert to CUDA\n", 246 | "\n", 247 | "# nn package also has different loss functions.\n", 248 | "# we use cross entropy loss for our classification task\n", 249 | "criterion = torch.nn.CrossEntropyLoss()\n", 250 | "\n", 251 | "# we use the optim package to apply\n", 252 | "# stochastic gradient descent for our parameter updates\n", 253 | "optimizer = torch.optim.SGD(model.parameters(), lr=learning_rate, weight_decay=lambda_l2) # built-in L2\n", 254 | "\n", 255 | "# We convert our inputs and targets to Variables\n", 256 | "# so we can use automatic differentiation but we \n", 257 | "# use require_grad=False b/c we don't want the gradients\n", 258 | "# to alter these values.\n", 259 | "input_X = torch.tensor(X, requires_grad=False, dtype=torch.float32)\n", 260 | "y_true = torch.tensor(y, requires_grad=False, dtype=torch.long)\n", 261 | "\n", 262 | "# Training\n", 263 | "for t in range(1000):\n", 264 | " \n", 265 | " # Feed forward to get the logits\n", 266 | " y_pred = model(input_X)\n", 267 | " \n", 268 | " # Compute the loss and accuracy\n", 269 | " loss = criterion(y_pred, y_true)\n", 270 | 
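"    # torch.max along dim 1 returns (values, indices); the indices are\n",
"    # the arg-max over the class scores, i.e. the predicted labels\n",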
" score, predicted = torch.max(y_pred, 1)\n", 271 | " acc = (y_true == predicted).sum().float() / len(y_true)\n", 272 | " print(\"[EPOCH]: %i, [LOSS]: %.6f, [ACCURACY]: %.3f\" % (t, loss.item(), acc))\n", 273 | " display.clear_output(wait=True)\n", 274 | " \n", 275 | " # zero the gradients before running\n", 276 | " # the backward pass.\n", 277 | " optimizer.zero_grad()\n", 278 | " \n", 279 | " # Backward pass to compute the gradient\n", 280 | " # of loss w.r.t our learnable params. \n", 281 | " loss.backward()\n", 282 | " \n", 283 | " # Update params\n", 284 | " optimizer.step()" 285 | ] 286 | }, 287 | { 288 | "cell_type": "code", 289 | "execution_count": null, 290 | "metadata": {}, 291 | "outputs": [], 292 | "source": [ 293 | "# Plot trained model\n", 294 | "print(model)" 295 | ] 296 | }, 297 | { 298 | "cell_type": "code", 299 | "execution_count": null, 300 | "metadata": {}, 301 | "outputs": [], 302 | "source": [ 303 | "plot_model(X, y, model)" 304 | ] 305 | }, 306 | { 307 | "cell_type": "markdown", 308 | "metadata": {}, 309 | "source": [ 310 | "### Two-layered network" 311 | ] 312 | }, 313 | { 314 | "cell_type": "code", 315 | "execution_count": null, 316 | "metadata": {}, 317 | "outputs": [], 318 | "source": [ 319 | "learning_rate = 1e-3\n", 320 | "lambda_l2 = 1e-5" 321 | ] 322 | }, 323 | { 324 | "cell_type": "code", 325 | "execution_count": null, 326 | "metadata": {}, 327 | "outputs": [], 328 | "source": [ 329 | "# NN model\n", 330 | "class two_layer_network(nn.Module):\n", 331 | " \"\"\"\n", 332 | " NN model.\n", 333 | " \"\"\"\n", 334 | " def __init__(self, D_in, H, D_out):\n", 335 | " \"\"\"\n", 336 | " Initialize weights.\n", 337 | " \"\"\"\n", 338 | " super(two_layer_network, self).__init__()\n", 339 | " self.fc1 = nn.Linear(D_in, H)\n", 340 | " self.fc2 = nn.Linear(H, D_out)\n", 341 | "\n", 342 | " def forward(self, x):\n", 343 | " \"\"\"\n", 344 | " Forward pass.\n", 345 | " \"\"\"\n", 346 | " z = F.relu(self.fc1(x))\n", 347 | " z = self.fc2(z)\n", 348 | " return z" 349 | ] 350 | }, 351 | { 352 | "cell_type": "code", 353 | "execution_count": null, 354 | "metadata": { 355 | "scrolled": true 356 | }, 357 | "outputs": [], 358 | "source": [ 359 | "# nn package to create our linear model\n", 360 | "# each Linear module has a weight and bias\n", 361 | "model = two_layer_network(D, H, C)\n", 362 | "model.to(device)\n", 363 | "\n", 364 | "# nn package also has different loss functions.\n", 365 | "# we use cross entropy loss for our classification task\n", 366 | "criterion = torch.nn.CrossEntropyLoss()\n", 367 | "\n", 368 | "# we use the optim package to apply\n", 369 | "# ADAM for our parameter updates\n", 370 | "optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate, weight_decay=lambda_l2) # built-in L2\n", 371 | "\n", 372 | "# We convert our inputs and targest to Variables\n", 373 | "# so we can use automatic differentiation but we \n", 374 | "# use require_grad=False b/c we don't want the gradients\n", 375 | "# to alter these values.\n", 376 | "input_X = torch.tensor(X, requires_grad=False, dtype=torch.float32)\n", 377 | "y_true = torch.tensor(y, requires_grad=False, dtype=torch.long)\n", 378 | "\n", 379 | "# e = 1. 
380 | "\n",
381 | "# Training\n",
382 | "for t in range(1000):\n",
383 | "    \n",
384 | "    # Feed forward to get the logits\n",
385 | "    y_pred = model(input_X)\n",
386 | "    \n",
387 | "    # Compute the loss and accuracy\n",
388 | "    loss = criterion(y_pred, y_true)\n",
389 | "    score, predicted = torch.max(y_pred, 1)\n",
390 | "    acc = (y_true == predicted).sum().float() / len(y_true)\n",
391 | "    print(\"[EPOCH]: %i, [LOSS]: %.6f, [ACCURACY]: %.3f\" % (t, loss.item(), acc))\n",
392 | "    display.clear_output(wait=True)\n",
393 | "    \n",
394 | "    # zero the gradients before running\n",
395 | "    # the backward pass.\n",
396 | "    optimizer.zero_grad()\n",
397 | "    \n",
398 | "    # Backward pass to compute the gradient\n",
399 | "    # of loss w.r.t our learnable params. \n",
400 | "    loss.backward()\n",
401 | "    \n",
402 | "    # Update params\n",
403 | "    optimizer.step()\n",
404 | "    \n",
405 | "# # Plot some progress\n",
406 | "# if t % math.ceil(e) == 0:\n",
407 | "#     plot_model(X, y, model, e)\n",
408 | "#     e *= 1.5\n",
409 | "\n",
410 | "#! convert -delay 20 -crop 500x475+330+50 +repage $(gls -1v train*) train.gif"
411 | ]
412 | },
413 | {
414 | "cell_type": "code",
415 | "execution_count": null,
416 | "metadata": {},
417 | "outputs": [],
418 | "source": [
419 | "# Plot trained model\n",
420 | "print(model)\n",
421 | "plot_model(X, y, model)"
422 | ]
423 | }
424 | ],
425 | "metadata": {
426 | "kernelspec": {
427 | "display_name": "Codas ML",
428 | "language": "python",
429 | "name": "codasml"
430 | },
431 | "language_info": {
432 | "codemirror_mode": {
433 | "name": "ipython",
434 | "version": 3
435 | },
436 | "file_extension": ".py",
437 | "mimetype": "text/x-python",
438 | "name": "python",
439 | "nbconvert_exporter": "python",
440 | "pygments_lexer": "ipython3",
441 | "version": "3.6.5"
442 | }
443 | },
444 | "nbformat": 4,
445 | "nbformat_minor": 2
446 | }
447 | 
--------------------------------------------------------------------------------
/05-convnet.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "markdown",
5 | "metadata": {},
6 | "source": [
7 | "# Outline\n",
8 | "\n",
9 | "* Today we will show how to train a ConvNet using PyTorch\n",
10 | "* We will also illustrate how the ConvNet makes use of specific assumptions"
11 | ]
12 | },
13 | {
14 | "cell_type": "markdown",
15 | "metadata": {},
16 | "source": [
17 | "# To perform well, we need to incorporate some prior knowledge about the problem\n",
18 | "\n",
19 | "* Assumptions help us when they are true\n",
20 | "* They hurt us when they are not\n",
21 | "* We want to make just the right amount of assumptions, not more than that\n",
22 | "\n",
23 | "## In Deep Learning\n",
24 | "\n",
25 | "* Many layers: compositionality\n",
26 | "* Convolutions: locality + stationarity of images\n",
27 | "* Pooling: invariance of object class to translations"
28 | ]
29 | },
30 | {
31 | "cell_type": "code",
32 | "execution_count": null,
33 | "metadata": {},
34 | "outputs": [],
35 | "source": [
36 | "%run plot_conf.py"
37 | ]
38 | },
39 | {
40 | "cell_type": "code",
41 | "execution_count": null,
42 | "metadata": {},
43 | "outputs": [],
44 | "source": [
45 | "plt_style()"
46 | ]
47 | },
48 | {
49 | "cell_type": "code",
50 | "execution_count": null,
51 | "metadata": {},
52 | "outputs": [],
53 | "source": [
54 | "import torch\n",
55 | "import torch.nn as nn\n",
56 | "import torch.nn.functional as F\n",
57 | "import torch.optim as optim\n",
58 | "from torchvision import
datasets, transforms\n", 59 | "import matplotlib.pyplot as plt\n", 60 | "import numpy\n", 61 | "\n", 62 | "# function to count number of parameters\n", 63 | "def get_n_params(model):\n", 64 | " np=0\n", 65 | " for p in list(model.parameters()):\n", 66 | " np += p.nelement()\n", 67 | " return np" 68 | ] 69 | }, 70 | { 71 | "cell_type": "markdown", 72 | "metadata": {}, 73 | "source": [ 74 | "# Load the Dataset (MNIST)\n", 75 | "\n", 76 | "\n", 77 | "We can use some PyTorch DataLoader utilities for this. This will download, shuffle, normalize data and arrange it in batches." 78 | ] 79 | }, 80 | { 81 | "cell_type": "code", 82 | "execution_count": null, 83 | "metadata": { 84 | "scrolled": false 85 | }, 86 | "outputs": [], 87 | "source": [ 88 | "input_size = 28*28 # images are 28x28 pixels\n", 89 | "output_size = 10 # there are 10 classes\n", 90 | "\n", 91 | "train_loader = torch.utils.data.DataLoader(\n", 92 | " datasets.MNIST('../data', train=True, download=True,\n", 93 | " transform=transforms.Compose([\n", 94 | " transforms.ToTensor(),\n", 95 | " transforms.Normalize((0.1307,), (0.3081,))\n", 96 | " ])),\n", 97 | " batch_size=64, shuffle=True)\n", 98 | "test_loader = torch.utils.data.DataLoader(\n", 99 | " datasets.MNIST('../data', train=False, transform=transforms.Compose([\n", 100 | " transforms.ToTensor(),\n", 101 | " transforms.Normalize((0.1307,), (0.3081,))\n", 102 | " ])),\n", 103 | " batch_size=1000, shuffle=True)" 104 | ] 105 | }, 106 | { 107 | "cell_type": "code", 108 | "execution_count": null, 109 | "metadata": {}, 110 | "outputs": [], 111 | "source": [ 112 | "# show some images\n", 113 | "plt.figure()\n", 114 | "for i in range(10):\n", 115 | " plt.subplot(2, 5, i + 1)\n", 116 | " image, _ = train_loader.dataset.__getitem__(i)\n", 117 | " plt.imshow(image.squeeze().numpy())" 118 | ] 119 | }, 120 | { 121 | "cell_type": "markdown", 122 | "metadata": {}, 123 | "source": [ 124 | "# Create the model classes" 125 | ] 126 | }, 127 | { 128 | "cell_type": "code", 129 | "execution_count": null, 130 | "metadata": {}, 131 | "outputs": [], 132 | "source": [ 133 | "class FC2Layer(nn.Module):\n", 134 | " def __init__(self, input_size, n_hidden, output_size):\n", 135 | " super(FC2Layer, self).__init__()\n", 136 | " self.input_size = input_size\n", 137 | " self.network = nn.Sequential(\n", 138 | " nn.Linear(input_size, n_hidden), \n", 139 | " nn.ReLU(), \n", 140 | " nn.Linear(n_hidden, n_hidden), \n", 141 | " nn.ReLU(), \n", 142 | " nn.Linear(n_hidden, output_size), \n", 143 | " nn.LogSoftmax(dim=1)\n", 144 | " )\n", 145 | "\n", 146 | " def forward(self, x):\n", 147 | " x = x.view(-1, self.input_size)\n", 148 | " return self.network(x)\n", 149 | " \n", 150 | " \n", 151 | "class CNN(nn.Module):\n", 152 | " def __init__(self, input_size, n_feature, output_size):\n", 153 | " super(CNN, self).__init__()\n", 154 | " self.n_feature = n_feature\n", 155 | " self.conv1 = nn.Conv2d(in_channels=1, out_channels=n_feature, kernel_size=5)\n", 156 | " self.conv2 = nn.Conv2d(n_feature, n_feature, kernel_size=5)\n", 157 | " self.fc1 = nn.Linear(n_feature*4*4, 50)\n", 158 | " self.fc2 = nn.Linear(50, 10)\n", 159 | " \n", 160 | "\n", 161 | "\n", 162 | " def forward(self, x, verbose=False):\n", 163 | " x = self.conv1(x)\n", 164 | " x = F.relu(x)\n", 165 | " x = F.max_pool2d(x, kernel_size=2)\n", 166 | " x = self.conv2(x)\n", 167 | " x = F.relu(x)\n", 168 | " x = F.max_pool2d(x, kernel_size=2)\n", 169 | " x = x.view(-1, self.n_feature*4*4)\n", 170 | " x = self.fc1(x)\n", 171 | " x = F.relu(x)\n", 172 | " x = 
self.fc2(x)\n",
173 | "        x = F.log_softmax(x, dim=1)\n",
174 | "        return x\n",
175 | "    \n"
176 | ]
177 | },
178 | {
179 | "cell_type": "markdown",
180 | "metadata": {},
181 | "source": [
182 | "## Running on a GPU: device string\n",
183 | "\n",
184 | "Switching between CPU and GPU in PyTorch is controlled via a device string, which will seamlessly determine whether a GPU is available, falling back to the CPU if not:"
185 | ]
186 | },
187 | {
188 | "cell_type": "code",
189 | "execution_count": null,
190 | "metadata": {},
191 | "outputs": [],
192 | "source": [
193 | "device = torch.device(\"cuda:0\" if torch.cuda.is_available() else \"cpu\")"
194 | ]
195 | },
196 | {
197 | "cell_type": "code",
198 | "execution_count": null,
199 | "metadata": {},
200 | "outputs": [],
201 | "source": [
202 | "accuracy_list = []\n",
203 | "\n",
204 | "def train(epoch, model, perm=torch.arange(0, 784).long()):\n",
205 | "    model.train()\n",
206 | "    for batch_idx, (data, target) in enumerate(train_loader):\n",
207 | "        \n",
208 | "        # permute pixels\n",
209 | "        data = data.view(-1, 28*28)\n",
210 | "        data = data[:, perm]\n",
211 | "        data = data.view(-1, 1, 28, 28)\n",
212 | "        \n",
213 | "        optimizer.zero_grad()\n",
214 | "        output = model(data)\n",
215 | "        loss = F.nll_loss(output, target)\n",
216 | "        loss.backward()\n",
217 | "        optimizer.step()\n",
218 | "        if batch_idx % 100 == 0:\n",
219 | "            print('Train Epoch: {} [{}/{} ({:.0f}%)]\\tLoss: {:.6f}'.format(\n",
220 | "                epoch, batch_idx * len(data), len(train_loader.dataset),\n",
221 | "                100. * batch_idx / len(train_loader), loss.item()))\n",
222 | "    \n",
223 | "def test(model, perm=torch.arange(0, 784).long()):\n",
224 | "    model.eval()\n",
225 | "    test_loss = 0\n",
226 | "    correct = 0\n",
227 | "    for data, target in test_loader:\n",
228 | "        # permute pixels\n",
229 | "        data = data.view(-1, 28*28)\n",
230 | "        data = data[:, perm]\n",
231 | "        data = data.view(-1, 1, 28, 28)\n",
232 | "        output = model(data)\n",
233 | "        test_loss += F.nll_loss(output, target, size_average=False).item() # sum up batch loss \n",
234 | "        pred = output.data.max(1, keepdim=True)[1] # get the index of the max log-probability \n",
235 | "        correct += pred.eq(target.data.view_as(pred)).cpu().sum().item()\n",
236 | "\n",
237 | "    test_loss /= len(test_loader.dataset)\n",
238 | "    accuracy = 100. 
* correct / len(test_loader.dataset)\n", 239 | " accuracy_list.append(accuracy)\n", 240 | " print('\\nTest set: Average loss: {:.4f}, Accuracy: {}/{} ({:.0f}%)\\n'.format(\n", 241 | " test_loss, correct, len(test_loader.dataset),\n", 242 | " accuracy))" 243 | ] 244 | }, 245 | { 246 | "cell_type": "markdown", 247 | "metadata": {}, 248 | "source": [ 249 | "# Train a small fully-connected network" 250 | ] 251 | }, 252 | { 253 | "cell_type": "code", 254 | "execution_count": null, 255 | "metadata": {}, 256 | "outputs": [], 257 | "source": [ 258 | "n_hidden = 8 # number of hidden units\n", 259 | "\n", 260 | "model = FC2Layer(input_size, n_hidden, output_size)\n", 261 | "optimizer = optim.SGD(model.parameters(), lr=0.01, momentum=0.5)\n", 262 | "print('Number of parameters: {}'.format(get_n_params(model)))\n", 263 | "\n", 264 | "for epoch in range(0, 1):\n", 265 | " train(epoch, model)\n", 266 | " test(model)" 267 | ] 268 | }, 269 | { 270 | "cell_type": "markdown", 271 | "metadata": {}, 272 | "source": [ 273 | "# Train a ConvNet with the same number of parameters" 274 | ] 275 | }, 276 | { 277 | "cell_type": "code", 278 | "execution_count": null, 279 | "metadata": {}, 280 | "outputs": [], 281 | "source": [ 282 | "# Training settings \n", 283 | "n_features = 6 # number of feature maps\n", 284 | "\n", 285 | "model = CNN(input_size, n_features, output_size)\n", 286 | "optimizer = optim.SGD(model.parameters(), lr=0.01, momentum=0.5)\n", 287 | "print('Number of parameters: {}'.format(get_n_params(model)))\n", 288 | "\n", 289 | "for epoch in range(0, 1):\n", 290 | " train(epoch, model)\n", 291 | " test(model)" 292 | ] 293 | }, 294 | { 295 | "cell_type": "markdown", 296 | "metadata": {}, 297 | "source": [ 298 | "# The ConvNet performs better with the same number of parameters, thanks to its use of prior knowledge about images\n", 299 | "\n", 300 | "* Use of convolution: Locality and stationarity in images\n", 301 | "* Pooling: builds in some translation invariance\n", 302 | "\n", 303 | "# What happens if the assumptions are no longer true?\n" 304 | ] 305 | }, 306 | { 307 | "cell_type": "code", 308 | "execution_count": null, 309 | "metadata": { 310 | "scrolled": false 311 | }, 312 | "outputs": [], 313 | "source": [ 314 | "perm = torch.randperm(784)\n", 315 | "plt.figure()\n", 316 | "for i in range(10):\n", 317 | " image, _ = train_loader.dataset.__getitem__(i)\n", 318 | " # permute pixels\n", 319 | " image_perm = image.view(-1, 28*28).clone()\n", 320 | " image_perm = image_perm[:, perm]\n", 321 | " image_perm = image_perm.view(-1, 1, 28, 28)\n", 322 | " plt.subplot(4, 5, i + 1)\n", 323 | " plt.imshow(image.squeeze().numpy())\n", 324 | " plt.axis('off')\n", 325 | " plt.subplot(4, 5, i + 11)\n", 326 | " plt.imshow(image_perm.squeeze().numpy())\n", 327 | " plt.axis('off')" 328 | ] 329 | }, 330 | { 331 | "cell_type": "markdown", 332 | "metadata": {}, 333 | "source": [ 334 | "# ConvNet with permuted pixels" 335 | ] 336 | }, 337 | { 338 | "cell_type": "code", 339 | "execution_count": null, 340 | "metadata": {}, 341 | "outputs": [], 342 | "source": [ 343 | "# Training settings \n", 344 | "n_features = 6 # number of feature maps\n", 345 | "\n", 346 | "model = CNN(input_size, n_features, output_size)\n", 347 | "optimizer = optim.SGD(model.parameters(), lr=0.01, momentum=0.5)\n", 348 | "print('Number of parameters: {}'.format(get_n_params(model)))\n", 349 | "\n", 350 | "for epoch in range(0, 1):\n", 351 | " train(epoch, model, perm)\n", 352 | " test(model, perm)" 353 | ] 354 | }, 355 | { 356 | "cell_type": 
"markdown", 357 | "metadata": {}, 358 | "source": [ 359 | "# Fully-Connected with Permuted Pixels" 360 | ] 361 | }, 362 | { 363 | "cell_type": "code", 364 | "execution_count": null, 365 | "metadata": {}, 366 | "outputs": [], 367 | "source": [ 368 | "n_hidden = 8 # number of hidden units\n", 369 | "\n", 370 | "model = FC2Layer(input_size, n_hidden, output_size)\n", 371 | "optimizer = optim.SGD(model.parameters(), lr=0.01, momentum=0.5)\n", 372 | "print('Number of parameters: {}'.format(get_n_params(model)))\n", 373 | "\n", 374 | "for epoch in range(0, 1):\n", 375 | " train(epoch, model, perm)\n", 376 | " test(model, perm)" 377 | ] 378 | }, 379 | { 380 | "cell_type": "markdown", 381 | "metadata": {}, 382 | "source": [ 383 | "# The ConvNet's performance drops when we permute the pixels, but the Fully-Connected Network's performance stays the same\n", 384 | "\n", 385 | "* ConvNet makes the assumption that pixels lie on a grid and are stationary/local\n", 386 | "* It loses performance when this assumption is wrong\n", 387 | "* The fully-connected network does not make this assumption\n", 388 | "* It does less well when it is true, since it doesn't take advantage of this prior knowledge\n", 389 | "* But it doesn't suffer when the assumption is wrong" 390 | ] 391 | }, 392 | { 393 | "cell_type": "code", 394 | "execution_count": null, 395 | "metadata": {}, 396 | "outputs": [], 397 | "source": [ 398 | "plt.bar(('NN image', 'CNN image',\n", 399 | " 'CNN scrambled', 'NN scrambled'),\n", 400 | " accuracy_list, width=0.4)\n", 401 | "plt.ylim((min(accuracy_list)-5, 96))\n", 402 | "plt.ylabel('Accuracy [%]')\n", 403 | "for tick in plt.gca().xaxis.get_major_ticks():\n", 404 | " tick.label.set_fontsize(20)\n", 405 | "plt.title('Performance comparison');" 406 | ] 407 | }, 408 | { 409 | "cell_type": "code", 410 | "execution_count": null, 411 | "metadata": {}, 412 | "outputs": [], 413 | "source": [] 414 | } 415 | ], 416 | "metadata": { 417 | "kernelspec": { 418 | "display_name": "Codas ML", 419 | "language": "python", 420 | "name": "codasml" 421 | }, 422 | "language_info": { 423 | "codemirror_mode": { 424 | "name": "ipython", 425 | "version": 3 426 | }, 427 | "file_extension": ".py", 428 | "mimetype": "text/x-python", 429 | "name": "python", 430 | "nbconvert_exporter": "python", 431 | "pygments_lexer": "ipython3", 432 | "version": "3.6.5" 433 | } 434 | }, 435 | "nbformat": 4, 436 | "nbformat_minor": 2 437 | } 438 | -------------------------------------------------------------------------------- /06-autoencoder.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "code", 5 | "execution_count": null, 6 | "metadata": {}, 7 | "outputs": [], 8 | "source": [ 9 | "# Import some libraries\n", 10 | "\n", 11 | "import torch\n", 12 | "import torchvision\n", 13 | "from torch import nn\n", 14 | "from torch.utils.data import DataLoader\n", 15 | "from torchvision import transforms\n", 16 | "from torchvision.datasets import MNIST\n", 17 | "from matplotlib import pyplot as plt" 18 | ] 19 | }, 20 | { 21 | "cell_type": "code", 22 | "execution_count": null, 23 | "metadata": {}, 24 | "outputs": [], 25 | "source": [ 26 | "# Convert vector to image\n", 27 | "\n", 28 | "def to_img(x):\n", 29 | " x = 0.5 * (x + 1)\n", 30 | " x = x.view(x.size(0), 28, 28)\n", 31 | " return x" 32 | ] 33 | }, 34 | { 35 | "cell_type": "code", 36 | "execution_count": null, 37 | "metadata": {}, 38 | "outputs": [], 39 | "source": [ 40 | "# Displaying routine\n", 41 | "\n", 42 | 
"def display_images(in_, out, n=1):\n", 43 | " for N in range(n):\n", 44 | " if in_ is not None:\n", 45 | " in_pic = to_img(in_.cpu().data)\n", 46 | " plt.figure(figsize=(18, 6))\n", 47 | " for i in range(4):\n", 48 | " plt.subplot(1,4,i+1)\n", 49 | " plt.imshow(in_pic[i+4*N])\n", 50 | " plt.axis('off')\n", 51 | " out_pic = to_img(out.cpu().data)\n", 52 | " plt.figure(figsize=(18, 6))\n", 53 | " for i in range(4):\n", 54 | " plt.subplot(1,4,i+1)\n", 55 | " plt.imshow(out_pic[i+4*N])\n", 56 | " plt.axis('off')" 57 | ] 58 | }, 59 | { 60 | "cell_type": "code", 61 | "execution_count": null, 62 | "metadata": {}, 63 | "outputs": [], 64 | "source": [ 65 | "# Define data loading step\n", 66 | "\n", 67 | "batch_size = 256\n", 68 | "\n", 69 | "img_transform = transforms.Compose([\n", 70 | " transforms.ToTensor(),\n", 71 | " transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))\n", 72 | "])\n", 73 | "\n", 74 | "dataset = MNIST('./data', transform=img_transform, download=True)\n", 75 | "dataloader = DataLoader(dataset, batch_size=batch_size, shuffle=True)" 76 | ] 77 | }, 78 | { 79 | "cell_type": "code", 80 | "execution_count": null, 81 | "metadata": {}, 82 | "outputs": [], 83 | "source": [ 84 | "device = torch.device(\"cuda:0\" if torch.cuda.is_available() else \"cpu\")" 85 | ] 86 | }, 87 | { 88 | "cell_type": "code", 89 | "execution_count": null, 90 | "metadata": {}, 91 | "outputs": [], 92 | "source": [ 93 | "# Define model architecture and reconstruction loss\n", 94 | "\n", 95 | "# n = 28 x 28 = 784\n", 96 | "d = 30 # for standard AE (under-complete hidden layer)\n", 97 | "# d = 500 # for denoising AE (over-complete hidden layer)\n", 98 | "\n", 99 | "class Autoencoder(nn.Module):\n", 100 | " def __init__(self):\n", 101 | " super().__init__()\n", 102 | " self.encoder = nn.Sequential(\n", 103 | " nn.Linear(28 * 28, d),\n", 104 | " nn.Tanh(),\n", 105 | " )\n", 106 | " self.decoder = nn.Sequential(\n", 107 | " nn.Linear(d, 28 * 28),\n", 108 | " nn.Tanh(),\n", 109 | " )\n", 110 | "\n", 111 | " def forward(self, x):\n", 112 | " x = self.encoder(x)\n", 113 | " x = self.decoder(x)\n", 114 | " return x\n", 115 | " \n", 116 | "model = Autoencoder().to(device)\n", 117 | "criterion = nn.MSELoss()" 118 | ] 119 | }, 120 | { 121 | "cell_type": "code", 122 | "execution_count": null, 123 | "metadata": {}, 124 | "outputs": [], 125 | "source": [ 126 | "# Configure the optimiser\n", 127 | "\n", 128 | "learning_rate = 1e-3\n", 129 | "\n", 130 | "optimizer = torch.optim.Adam(\n", 131 | " model.parameters(),\n", 132 | " lr=learning_rate,\n", 133 | ")" 134 | ] 135 | }, 136 | { 137 | "cell_type": "markdown", 138 | "metadata": {}, 139 | "source": [ 140 | "*Comment* or *un-comment out* a few lines of code to seamlessly switch between *standard AE* and *denoising one*.\n", 141 | "\n", 142 | "Don't forget to **(1)** change the size of the hidden layer accordingly, **(2)** re-generate the model, and **(3)** re-pass the parameters to the optimiser." 
143 | ] 144 | }, 145 | { 146 | "cell_type": "code", 147 | "execution_count": null, 148 | "metadata": { 149 | "scrolled": false 150 | }, 151 | "outputs": [], 152 | "source": [ 153 | "# Train standard or denoising autoencoder (AE)\n", 154 | "\n", 155 | "num_epochs = 1\n", 156 | "# do = nn.Dropout()  # comment out for standard AE\n", 157 | "for epoch in range(num_epochs):\n", 158 | "    for data in dataloader:\n", 159 | "        img, _ = data\n", 160 | "        img.requires_grad_()\n", 161 | "        img = img.view(img.size(0), -1)\n", 162 | "#         img_bad = do(img).to(device)  # comment out for standard AE\n", 163 | "        # ===================forward=====================\n", 164 | "        output = model(img)  # feed img (for std AE) or img_bad (for denoising AE)\n", 165 | "        loss = criterion(output, img.data)\n", 166 | "        # ===================backward====================\n", 167 | "        optimizer.zero_grad()\n", 168 | "        loss.backward()\n", 169 | "        optimizer.step()\n", 170 | "    # ===================log========================\n", 171 | "    print(f'epoch [{epoch + 1}/{num_epochs}], loss:{loss.item():.4f}')\n", 172 | "    display_images(None, output)  # pass (None, output) for std AE, (img_bad, output) for denoising AE" 173 | ] 174 | }, 175 | { 176 | "cell_type": "code", 177 | "execution_count": null, 178 | "metadata": {}, 179 | "outputs": [], 180 | "source": [ 181 | "# Visualise a few kernels of the encoder\n", 182 | "\n", 183 | "display_images(None, model.encoder[0].weight, 5)" 184 | ] 185 | }, 186 | { 187 | "cell_type": "code", 188 | "execution_count": null, 189 | "metadata": {}, 190 | "outputs": [], 191 | "source": [ 192 | "! conda install -y --name codas-ml opencv" 193 | ] 194 | }, 195 | { 196 | "cell_type": "code", 197 | "execution_count": null, 198 | "metadata": {}, 199 | "outputs": [], 200 | "source": [ 201 | "# Let's compare the autoencoder inpainting capabilities vs.
OpenCV\n", 202 | "\n", 203 | "from cv2 import inpaint, INPAINT_NS, INPAINT_TELEA" 204 | ] 205 | }, 206 | { 207 | "cell_type": "code", 208 | "execution_count": null, 209 | "metadata": {}, 210 | "outputs": [], 211 | "source": [ 212 | "# Inpaint with Telea and Navier-Stokes methods\n", 213 | "\n", 214 | "dst_TELEA = list()\n", 215 | "dst_NS = list()\n", 216 | "\n", 217 | "for i in range(3, 7):\n", 218 | " corrupted_img = ((img_bad.data.cpu()[i].view(28, 28) / 4 + 0.5) * 255).byte().numpy()\n", 219 | " mask = 2 - img_bad.grad_fn.noise.cpu()[i].view(28, 28).byte().numpy()\n", 220 | " dst_TELEA.append(inpaint(corrupted_img, mask, 3, INPAINT_TELEA))\n", 221 | " dst_NS.append(inpaint(corrupted_img, mask, 3, INPAINT_NS))\n", 222 | "\n", 223 | "tns_TELEA = [torch.from_numpy(d) for d in dst_TELEA]\n", 224 | "tns_NS = [torch.from_numpy(d) for d in dst_NS]\n", 225 | "\n", 226 | "TELEA = torch.stack(tns_TELEA).float()\n", 227 | "NS = torch.stack(tns_NS).float()" 228 | ] 229 | }, 230 | { 231 | "cell_type": "code", 232 | "execution_count": null, 233 | "metadata": {}, 234 | "outputs": [], 235 | "source": [ 236 | "# Compare the results: [noise], [img + noise], [img], [AE, Telea, Navier-Stokes] inpainting\n", 237 | "\n", 238 | "with torch.no_grad():\n", 239 | " display_images(img_bad.grad_fn.noise[3:7], img_bad[3:7])\n", 240 | " display_images(img[3:7], output[3:7])\n", 241 | " display_images(TELEA, NS)" 242 | ] 243 | }, 244 | { 245 | "cell_type": "code", 246 | "execution_count": null, 247 | "metadata": {}, 248 | "outputs": [], 249 | "source": [] 250 | } 251 | ], 252 | "metadata": { 253 | "kernelspec": { 254 | "display_name": "Python 3", 255 | "language": "python", 256 | "name": "python3" 257 | }, 258 | "language_info": { 259 | "codemirror_mode": { 260 | "name": "ipython", 261 | "version": 3 262 | }, 263 | "file_extension": ".py", 264 | "mimetype": "text/x-python", 265 | "name": "python", 266 | "nbconvert_exporter": "python", 267 | "pygments_lexer": "ipython3", 268 | "version": "3.6.6" 269 | } 270 | }, 271 | "nbformat": 4, 272 | "nbformat_minor": 2 273 | } 274 | -------------------------------------------------------------------------------- /07-VAE.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "code", 5 | "execution_count": null, 6 | "metadata": {}, 7 | "outputs": [], 8 | "source": [ 9 | "import torch\n", 10 | "import torchvision\n", 11 | "from torch import nn\n", 12 | "from torch.utils.data import DataLoader\n", 13 | "from torchvision import transforms\n", 14 | "from torchvision.datasets import MNIST\n", 15 | "from matplotlib import pyplot as plt" 16 | ] 17 | }, 18 | { 19 | "cell_type": "code", 20 | "execution_count": null, 21 | "metadata": {}, 22 | "outputs": [], 23 | "source": [ 24 | "# Displaying routine\n", 25 | "\n", 26 | "def display_images(in_, out, n=1, label=None, count=False):\n", 27 | " for N in range(n):\n", 28 | " if in_ is not None:\n", 29 | " in_pic = in_.data.cpu().view(-1, 28, 28)\n", 30 | " plt.figure(figsize=(18, 4))\n", 31 | " plt.suptitle(label + ' – real test data / reconstructions', color='w', fontsize=16)\n", 32 | " for i in range(4):\n", 33 | " plt.subplot(1,4,i+1)\n", 34 | " plt.imshow(in_pic[i+4*N])\n", 35 | " plt.axis('off')\n", 36 | " out_pic = out.data.cpu().view(-1, 28, 28)\n", 37 | " plt.figure(figsize=(18, 6))\n", 38 | " for i in range(4):\n", 39 | " plt.subplot(1,4,i+1)\n", 40 | " plt.imshow(out_pic[i+4*N])\n", 41 | " plt.axis('off')\n", 42 | " if count: plt.title(str(4 * N + i), 
color='w')" 43 | ] 44 | }, 45 | { 46 | "cell_type": "code", 47 | "execution_count": null, 48 | "metadata": {}, 49 | "outputs": [], 50 | "source": [ 51 | "# Set random seeds\n", 52 | "\n", 53 | "torch.manual_seed(1)\n", 54 | "torch.cuda.manual_seed(1)" 55 | ] 56 | }, 57 | { 58 | "cell_type": "code", 59 | "execution_count": null, 60 | "metadata": {}, 61 | "outputs": [], 62 | "source": [ 63 | "# Define data loading step\n", 64 | "\n", 65 | "batch_size = 256\n", 66 | "\n", 67 | "kwargs = {'num_workers': 1, 'pin_memory': True}\n", 68 | "train_loader = torch.utils.data.DataLoader(\n", 69 | " MNIST('./data', train=True, download=True,\n", 70 | " transform=transforms.ToTensor()),\n", 71 | " batch_size=batch_size, shuffle=True, **kwargs)\n", 72 | "test_loader = torch.utils.data.DataLoader(\n", 73 | " MNIST('./data', train=False, transform=transforms.ToTensor()),\n", 74 | " batch_size=batch_size, shuffle=True, **kwargs)" 75 | ] 76 | }, 77 | { 78 | "cell_type": "code", 79 | "execution_count": null, 80 | "metadata": {}, 81 | "outputs": [], 82 | "source": [ 83 | "device = torch.device(\"cuda:0\" if torch.cuda.is_available() else \"cpu\")" 84 | ] 85 | }, 86 | { 87 | "cell_type": "code", 88 | "execution_count": null, 89 | "metadata": {}, 90 | "outputs": [], 91 | "source": [ 92 | "d = 20\n", 93 | "\n", 94 | "class VAE(nn.Module):\n", 95 | " def __init__(self):\n", 96 | " super().__init__()\n", 97 | "\n", 98 | " self.encoder = nn.Sequential(\n", 99 | " nn.Linear(784, d ** 2),\n", 100 | " nn.ReLU(),\n", 101 | " nn.Linear(d ** 2, d * 2)\n", 102 | " )\n", 103 | "\n", 104 | " self.decoder = nn.Sequential(\n", 105 | " nn.Linear(d, d ** 2),\n", 106 | " nn.ReLU(),\n", 107 | " nn.Linear(d ** 2, 784),\n", 108 | " nn.Sigmoid(),\n", 109 | " )\n", 110 | "\n", 111 | " def reparameterize(self, mu, logvar):\n", 112 | " if self.training:\n", 113 | " std = logvar.mul(0.5).exp_()\n", 114 | " eps = std.data.new(std.size()).normal_()\n", 115 | " return eps.mul(std).add_(mu)\n", 116 | " else:\n", 117 | " return mu\n", 118 | "\n", 119 | " def forward(self, x):\n", 120 | " mu_logvar = self.encoder(x.view(-1, 784)).view(-1, 2, d)\n", 121 | " mu = mu_logvar[:, 0, :]\n", 122 | " logvar = mu_logvar[:, 1, :]\n", 123 | " z = self.reparameterize(mu, logvar)\n", 124 | " return self.decoder(z), mu, logvar\n", 125 | "\n", 126 | "model = VAE().to(device)" 127 | ] 128 | }, 129 | { 130 | "cell_type": "code", 131 | "execution_count": null, 132 | "metadata": {}, 133 | "outputs": [], 134 | "source": [ 135 | "learning_rate = 1e-3\n", 136 | "\n", 137 | "optimizer = torch.optim.Adam(\n", 138 | " model.parameters(),\n", 139 | " lr=learning_rate,\n", 140 | ")" 141 | ] 142 | }, 143 | { 144 | "cell_type": "code", 145 | "execution_count": null, 146 | "metadata": {}, 147 | "outputs": [], 148 | "source": [ 149 | "# Reconstruction + KL divergence losses summed over all elements and batch\n", 150 | "\n", 151 | "def loss_function(recon_x, x, mu, logvar):\n", 152 | " BCE = nn.functional.binary_cross_entropy(\n", 153 | " recon_x, x.view(-1, 784), size_average=False\n", 154 | " )\n", 155 | " KLD = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())\n", 156 | "\n", 157 | " return BCE + KLD" 158 | ] 159 | }, 160 | { 161 | "cell_type": "code", 162 | "execution_count": null, 163 | "metadata": { 164 | "scrolled": false 165 | }, 166 | "outputs": [], 167 | "source": [ 168 | "# Training and testing the VAE\n", 169 | "\n", 170 | "epochs = 10\n", 171 | "for epoch in range(1, epochs + 1):\n", 172 | " # Training\n", 173 | " model.train()\n", 174 | " train_loss = 0\n", 
175 | " for data, _ in train_loader:\n", 176 | " data = data.to(device)\n", 177 | " # ===================forward=====================\n", 178 | " recon_batch, mu, logvar = model(data)\n", 179 | " loss = loss_function(recon_batch, data, mu, logvar)\n", 180 | " train_loss += loss.item()\n", 181 | " # ===================backward====================\n", 182 | " optimizer.zero_grad()\n", 183 | " loss.backward()\n", 184 | " optimizer.step()\n", 185 | " # ===================log========================\n", 186 | " print(f'====> Epoch: {epoch} Average loss: {train_loss / len(train_loader.dataset):.4f}')\n", 187 | " \n", 188 | " # Testing\n", 189 | " \n", 190 | " with torch.no_grad():\n", 191 | " model.eval()\n", 192 | " test_loss = 0\n", 193 | " for data, _ in test_loader:\n", 194 | " data = data.to(device)\n", 195 | " # ===================forward=====================\n", 196 | " recon_batch, mu, logvar = model(data)\n", 197 | " test_loss += loss_function(recon_batch, data, mu, logvar).item()\n", 198 | " # ===================log========================\n", 199 | " test_loss /= len(test_loader.dataset)\n", 200 | " print(f'====> Test set loss: {test_loss:.4f}')\n", 201 | " display_images(data, recon_batch, 1, f'Epoch {epoch}')" 202 | ] 203 | }, 204 | { 205 | "cell_type": "code", 206 | "execution_count": null, 207 | "metadata": {}, 208 | "outputs": [], 209 | "source": [ 210 | "# Generating a few samples\n", 211 | "\n", 212 | "N = 16\n", 213 | "sample = torch.randn((N, 20), requires_grad=False).to(device)\n", 214 | "sample = model.decoder(sample)\n", 215 | "display_images(None, sample, N // 4, count=True)" 216 | ] 217 | }, 218 | { 219 | "cell_type": "code", 220 | "execution_count": null, 221 | "metadata": {}, 222 | "outputs": [], 223 | "source": [ 224 | "# Display last test batch\n", 225 | "\n", 226 | "display_images(None, data, 4, count=True)" 227 | ] 228 | }, 229 | { 230 | "cell_type": "code", 231 | "execution_count": null, 232 | "metadata": {}, 233 | "outputs": [], 234 | "source": [ 235 | "# Choose starting and ending point for the interpolation -> shows original and reconstructed\n", 236 | "\n", 237 | "A, B = 5, 14\n", 238 | "sample = model.decoder(torch.stack((mu[A].data, mu[B].data), 0))\n", 239 | "display_images(None, torch.stack(((\n", 240 | " data[A].data.view(-1),\n", 241 | " data[B].data.view(-1),\n", 242 | " sample.data[0],\n", 243 | " sample.data[1]\n", 244 | ")), 0))" 245 | ] 246 | }, 247 | { 248 | "cell_type": "code", 249 | "execution_count": null, 250 | "metadata": {}, 251 | "outputs": [], 252 | "source": [ 253 | "# Perform an interpolation between input A and B, in N steps\n", 254 | "\n", 255 | "N = 16\n", 256 | "code = torch.Tensor(N, 20).to(device)\n", 257 | "for i in range(N):\n", 258 | " code[i] = i / N * mu[B].data + (1 - i / N) * mu[A].data\n", 259 | "code = torch.tensor(code, requires_grad=True)\n", 260 | "sample = model.decoder(code)\n", 261 | "display_images(None, sample, N // 4, count=True)" 262 | ] 263 | }, 264 | { 265 | "cell_type": "code", 266 | "execution_count": null, 267 | "metadata": {}, 268 | "outputs": [], 269 | "source": [] 270 | } 271 | ], 272 | "metadata": { 273 | "kernelspec": { 274 | "display_name": "Python 3", 275 | "language": "python", 276 | "name": "python3" 277 | }, 278 | "language_info": { 279 | "codemirror_mode": { 280 | "name": "ipython", 281 | "version": 3 282 | }, 283 | "file_extension": ".py", 284 | "mimetype": "text/x-python", 285 | "name": "python", 286 | "nbconvert_exporter": "python", 287 | "pygments_lexer": "ipython3", 288 | "version": "3.6.5" 
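The element-wise interpolation loop above can also be written in vectorised form; a sketch assuming the `model`, `mu`, `A`, `B`, and `device` objects from the preceding cells (note that, unlike the `i / N` weights in the loop, `torch.linspace` includes the endpoint `mu[B]`):

```python
# Vectorised linear interpolation between two latent codes.
N = 16
w = torch.linspace(0, 1, steps=N).unsqueeze(1).to(device)  # (N, 1) weights
code = (1 - w) * mu[A].data + w * mu[B].data               # (N, 20) codes
sample = model.decoder(code)
display_images(None, sample, N // 4, count=True)
```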
289 | } 290 | }, 291 | "nbformat": 4, 292 | "nbformat_minor": 2 293 | } 294 | -------------------------------------------------------------------------------- /08-1-classify_seq_data.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": { 6 | "collapsed": true 7 | }, 8 | "source": [ 9 | "An example of many-to-one (sequence classification):\n", 10 | "\n", 11 | "\n", 12 | "Original experiment from Hochreiter & Schmidhuber (1997):\n", 13 | "\n", 14 | "    The goal is to classify sequences. Elements and targets are represented locally\n", 15 | "    (input vectors with only one non-zero bit). The sequence starts with a B, ends\n", 16 | "    with an E (the \"trigger symbol\") and otherwise consists of randomly chosen symbols\n", 17 | "    from the set {a, b, c, d} except for two elements at positions t1 and t2 that are\n", 18 | "    either X or Y. The sequence length is randomly chosen between 100 and 110, t1 is\n", 19 | "    randomly chosen between 10 and 20, and t2 is randomly chosen between 50 and 60.\n", 20 | "    There are 4 sequence classes Q, R, S, U which depend on the temporal order of X and Y.\n", 21 | "    The rules are:\n", 22 | "    X, X -> Q,\n", 23 | "    X, Y -> R,\n", 24 | "    Y, X -> S,\n", 25 | "    Y, Y -> U. " 26 | ] 27 | }, 28 | { 29 | "cell_type": "code", 30 | "execution_count": null, 31 | "metadata": {}, 32 | "outputs": [], 33 | "source": [ 34 | "from sequential_tasks import TemporalOrderExp6aSequence\n", 35 | "\n", 36 | "# generate data\n", 37 | "dg = TemporalOrderExp6aSequence.get_predefined_generator(\n", 38 | "    TemporalOrderExp6aSequence.DifficultyLevel.EASY)\n" 39 | ] 40 | }, 41 | { 42 | "cell_type": "code", 43 | "execution_count": null, 44 | "metadata": {}, 45 | "outputs": [], 46 | "source": [ 47 | "# Raw sequences and their classes:\n", 48 | "for n in range(5):\n", 49 | "    x, y = dg.generate_pair()\n", 50 | "    print('{} ----> {}'.format(x, y))\n" 51 | ] 52 | }, 53 | { 54 | "cell_type": "code", 55 | "execution_count": null, 56 | "metadata": {}, 57 | "outputs": [], 58 | "source": [ 59 | "# Encoding our data into RNN-friendly data format\n", 60 | "\n", 61 | "# Single data pair example:\n", 62 | "x, y = dg.generate_pair()\n", 63 | "print('{} ----> {}'.format(x, y))\n", 64 | "    \n", 65 | "enc_x = dg.encode_x(x)\n", 66 | "enc_y = dg.encode_y(y)\n", 67 | "\n", 68 | "print('Encoded input sequence:')\n", 69 | "print(enc_x)\n", 70 | "print('Encoded output sequence:')\n", 71 | "print(enc_y)\n" 72 | ] 73 | }, 74 | { 75 | "cell_type": "code", 76 | "execution_count": null, 77 | "metadata": {}, 78 | "outputs": [], 79 | "source": [ 80 | "# let's take a batch of training pairs\n", 81 | "batch_x, batch_y = dg[0]\n", 82 | "\n", 83 | "# batch_x has the shape (batch_size, max_seq_length, num_symbols)\n", 84 | "print('Batch_x shape = ', batch_x.shape)\n", 85 | "\n", 86 | "# batch_y has the shape (batch_size, num_classes)\n", 87 | "print('Batch_y shape = ', batch_y.shape)\n", 88 | "\n", 89 | "# inputs are zero-padded (added zero prefix)\n", 90 | "# to obtain sequences of equal length\n", 91 | "print(batch_x[0])\n", 92 | "\n" 93 | ] 94 | }, 95 | { 96 | "cell_type": "code", 97 | "execution_count": null, 98 | "metadata": {}, 99 | "outputs": [], 100 | "source": [] 101 | }, 102 | { 103 | "cell_type": "code", 104 | "execution_count": null, 105 | "metadata": {}, 106 | "outputs": [], 107 | "source": [] 108 | } 109 | ], 110 | "metadata": { 111 | "kernelspec": { 112 | "display_name": "Python 3", 113 | "language": "python", 114 |
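The class rules quoted above reduce to a lookup on the ordered pair of special symbols; a toy illustration (a hypothetical helper for intuition only, not part of `sequential_tasks`):

```python
# Toy illustration: the class depends only on the temporal order of the
# two special symbols (X or Y) appearing between B and E.
RULES = {('X', 'X'): 'Q', ('X', 'Y'): 'R', ('Y', 'X'): 'S', ('Y', 'Y'): 'U'}

def classify(seq):
    specials = [s for s in seq if s in 'XY']  # the symbols at t1 and t2
    return RULES[tuple(specials)]

print(classify('BbXcXcbE'))  # -> Q
```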
"name": "python3" 115 | }, 116 | "language_info": { 117 | "codemirror_mode": { 118 | "name": "ipython", 119 | "version": 3 120 | }, 121 | "file_extension": ".py", 122 | "mimetype": "text/x-python", 123 | "name": "python", 124 | "nbconvert_exporter": "python", 125 | "pygments_lexer": "ipython3", 126 | "version": "3.6.5" 127 | } 128 | }, 129 | "nbformat": 4, 130 | "nbformat_minor": 1 131 | } 132 | -------------------------------------------------------------------------------- /08-2-echo_data.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": { 6 | "collapsed": true 7 | }, 8 | "source": [ 9 | "Echoing signal n steps is an example of synchronized many-to-many task:" 10 | ] 11 | }, 12 | { 13 | "cell_type": "code", 14 | "execution_count": null, 15 | "metadata": {}, 16 | "outputs": [], 17 | "source": [ 18 | "from sequential_tasks import EchoData\n", 19 | "\n", 20 | "batch_size = 5\n", 21 | "echo_step = 3\n", 22 | "series_length = 20000\n", 23 | "truncated_length = 10\n", 24 | "\n", 25 | "data_gen = EchoData(\n", 26 | " echo_step=echo_step,\n", 27 | " batch_size=batch_size,\n", 28 | " series_length=series_length,\n", 29 | " truncated_length=truncated_length)" 30 | ] 31 | }, 32 | { 33 | "cell_type": "code", 34 | "execution_count": null, 35 | "metadata": {}, 36 | "outputs": [], 37 | "source": [ 38 | "# Let's print first 20 timesteps of the first sequences to see the echo data:\n", 39 | "print('(1st sequence) x = ', data_gen.raw_x[0, :20], '... ')\n", 40 | "print('(1st sequence) y = ', data_gen.raw_y[0, :20], '... ')" 41 | ] 42 | }, 43 | { 44 | "cell_type": "code", 45 | "execution_count": null, 46 | "metadata": {}, 47 | "outputs": [], 48 | "source": [ 49 | "# batch_size different sequences are created:\n", 50 | "print('bax = ')\n", 51 | "print(data_gen.raw_x[:, :20])\n", 52 | "print('y = ')\n", 53 | "print(data_gen.raw_y[:, :20])\n", 54 | "\n", 55 | "print('raw_x shape:', data_gen.raw_x.shape) # shape = (batch_size, sequence_length)\n", 56 | "print('raw_y shape:', data_gen.raw_y.shape) # shape = (batch_size, sequence_length)\n" 57 | ] 58 | }, 59 | { 60 | "cell_type": "code", 61 | "execution_count": null, 62 | "metadata": {}, 63 | "outputs": [], 64 | "source": [ 65 | "# In order to use RNNs data is organized into tensors of size:\n", 66 | "# [batch_size, truncated_sequence_length, feature_dim\n", 67 | "\n", 68 | "i_batch = 0\n", 69 | "print('batch x shape:', data_gen.x_batches[i_batch].shape)\n", 70 | "print('batch y shape:', data_gen.y_batches[i_batch].shape)\n" 71 | ] 72 | }, 73 | { 74 | "cell_type": "code", 75 | "execution_count": null, 76 | "metadata": {}, 77 | "outputs": [], 78 | "source": [ 79 | "\n", 80 | "print(data_gen.x_batches[i_batch])\n", 81 | "print(data_gen.y_batches[i_batch])\n" 82 | ] 83 | }, 84 | { 85 | "cell_type": "markdown", 86 | "metadata": {}, 87 | "source": [ 88 | " " 89 | ] 90 | }, 91 | { 92 | "cell_type": "code", 93 | "execution_count": null, 94 | "metadata": {}, 95 | "outputs": [], 96 | "source": [] 97 | } 98 | ], 99 | "metadata": { 100 | "kernelspec": { 101 | "display_name": "Python 3", 102 | "language": "python", 103 | "name": "python3" 104 | }, 105 | "language_info": { 106 | "codemirror_mode": { 107 | "name": "ipython", 108 | "version": 3 109 | }, 110 | "file_extension": ".py", 111 | "mimetype": "text/x-python", 112 | "name": "python", 113 | "nbconvert_exporter": "python", 114 | "pygments_lexer": "ipython3", 115 | "version": "3.6.5" 116 | } 117 | }, 118 | 
"nbformat": 4, 119 | "nbformat_minor": 1 120 | } 121 | -------------------------------------------------------------------------------- /08-3-temporal_order_classification_experiments.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "code", 5 | "execution_count": null, 6 | "metadata": {}, 7 | "outputs": [], 8 | "source": [ 9 | "from sequential_tasks import TemporalOrderExp6aSequence\n", 10 | "\n", 11 | "import torch\n", 12 | "import torch.nn as nn\n", 13 | "import torch.nn.functional as F\n", 14 | "import torch.optim as optim\n", 15 | "\n", 16 | "torch.manual_seed(1)" 17 | ] 18 | }, 19 | { 20 | "cell_type": "markdown", 21 | "metadata": {}, 22 | "source": [ 23 | "# Specify experiment settings and prepare the data" 24 | ] 25 | }, 26 | { 27 | "cell_type": "code", 28 | "execution_count": null, 29 | "metadata": {}, 30 | "outputs": [], 31 | "source": [ 32 | "# experiments settings\n", 33 | "settings = {\n", 34 | " \"difficulty\": TemporalOrderExp6aSequence.DifficultyLevel.EASY,\n", 35 | " \"batch_size\": 32,\n", 36 | " \"h_units\": 4,\n", 37 | " \"max_epochs\": 10\n", 38 | "}" 39 | ] 40 | }, 41 | { 42 | "cell_type": "code", 43 | "execution_count": null, 44 | "metadata": {}, 45 | "outputs": [], 46 | "source": [ 47 | "#training data\n", 48 | "train_data_gen = TemporalOrderExp6aSequence.get_predefined_generator(\n", 49 | " settings['difficulty'],\n", 50 | " settings['batch_size'])\n", 51 | "train_size = len(train_data_gen)\n", 52 | "\n", 53 | "# testing data\n", 54 | "test_data_gen = TemporalOrderExp6aSequence.get_predefined_generator(\n", 55 | " settings['difficulty'],\n", 56 | " settings['batch_size'])\n", 57 | "test_size = len(test_data_gen) " 58 | ] 59 | }, 60 | { 61 | "cell_type": "markdown", 62 | "metadata": {}, 63 | "source": [ 64 | "# Define neural network" 65 | ] 66 | }, 67 | { 68 | "cell_type": "code", 69 | "execution_count": null, 70 | "metadata": {}, 71 | "outputs": [], 72 | "source": [ 73 | "class SimpleRNN(nn.Module):\n", 74 | "\n", 75 | " def __init__(self, input_size, rnn_hidden_size, output_size):\n", 76 | "\n", 77 | " super(SimpleRNN, self).__init__()\n", 78 | " self.rnn = torch.nn.RNN(input_size, rnn_hidden_size, num_layers=1, nonlinearity='relu', batch_first=True)\n", 79 | " self.linear = torch.nn.Linear(rnn_hidden_size, output_size) \n", 80 | "\n", 81 | " def forward(self, x):\n", 82 | " x, _ = self.rnn(x)\n", 83 | " x = self.linear(x)\n", 84 | " return F.log_softmax(x, dim=1)" 85 | ] 86 | }, 87 | { 88 | "cell_type": "code", 89 | "execution_count": null, 90 | "metadata": {}, 91 | "outputs": [], 92 | "source": [ 93 | "device = torch.device(\"cuda:0\" if torch.cuda.is_available() else \"cpu\")" 94 | ] 95 | }, 96 | { 97 | "cell_type": "markdown", 98 | "metadata": {}, 99 | "source": [ 100 | "# Define training loop" 101 | ] 102 | }, 103 | { 104 | "cell_type": "code", 105 | "execution_count": null, 106 | "metadata": {}, 107 | "outputs": [], 108 | "source": [ 109 | "def train():\n", 110 | " model.train()\n", 111 | " correct = 0\n", 112 | " for batch_idx in range(train_size):\n", 113 | " data, target = train_data_gen[batch_idx]\n", 114 | " data, target = torch.from_numpy(data).float().to(device), torch.from_numpy(target).long().to(device)\n", 115 | " optimizer.zero_grad()\n", 116 | " y_pred = model(data)\n", 117 | " loss = criterion(y_pred, target)\n", 118 | " loss.backward()\n", 119 | " optimizer.step()\n", 120 | " \n", 121 | " pred = y_pred.max(1, keepdim=True)[1]\n", 122 | " correct += 
pred.eq(target.view_as(pred)).sum().item()\n", 123 | " return correct, loss " 124 | ] 125 | }, 126 | { 127 | "cell_type": "code", 128 | "execution_count": null, 129 | "metadata": {}, 130 | "outputs": [], 131 | "source": [ 132 | "def test():\n", 133 | " model.eval() \n", 134 | " correct = 0\n", 135 | " with torch.no_grad():\n", 136 | " for batch_idx in range(test_size):\n", 137 | " data, target = test_data_gen[batch_idx]\n", 138 | " data, target = torch.from_numpy(data).float().to(device), torch.from_numpy(target).long().to(device)\n", 139 | " y_pred = model(data)\n", 140 | " pred = y_pred.max(1, keepdim=True)[1]\n", 141 | " correct += pred.eq(target.view_as(pred)).sum().item()\n", 142 | " return correct" 143 | ] 144 | }, 145 | { 146 | "cell_type": "markdown", 147 | "metadata": {}, 148 | "source": [ 149 | "# Initialize the Model and Optimizer" 150 | ] 151 | }, 152 | { 153 | "cell_type": "code", 154 | "execution_count": null, 155 | "metadata": {}, 156 | "outputs": [], 157 | "source": [ 158 | "model = SimpleRNN(train_data_gen.n_symbols, settings['h_units'], train_data_gen.n_classes)\n", 159 | "\n", 160 | "criterion = torch.nn.CrossEntropyLoss()\n", 161 | "optimizer = torch.optim.RMSprop(model.parameters(), lr=0.001)" 162 | ] 163 | }, 164 | { 165 | "cell_type": "markdown", 166 | "metadata": {}, 167 | "source": [ 168 | "# Train the model" 169 | ] 170 | }, 171 | { 172 | "cell_type": "code", 173 | "execution_count": null, 174 | "metadata": {}, 175 | "outputs": [], 176 | "source": [ 177 | "#train for max_epochs epochs\n", 178 | "epochs = settings['max_epochs']\n", 179 | "epoch = 0\n", 180 | "while epoch < epochs:\n", 181 | " correct, loss = train()\n", 182 | "\n", 183 | " epoch += 1\n", 184 | " train_accuracy = float(correct) / train_size\n", 185 | " print('Train Epoch: {}/{}, loss: {:.4f}, accuracy {:2.2f}'.format(epoch, epochs, loss.item(), train_accuracy))\n", 186 | "\n", 187 | "#test \n", 188 | "correct = test()\n", 189 | "test_accuracy = float(correct) / test_size\n", 190 | "print('\\nTest accuracy: {}'.format(test_accuracy))" 191 | ] 192 | }, 193 | { 194 | "cell_type": "code", 195 | "execution_count": null, 196 | "metadata": {}, 197 | "outputs": [], 198 | "source": [ 199 | "print('acc = {:.2f}%.'.format(test_accuracy))" 200 | ] 201 | }, 202 | { 203 | "cell_type": "code", 204 | "execution_count": null, 205 | "metadata": {}, 206 | "outputs": [], 207 | "source": [] 208 | }, 209 | { 210 | "cell_type": "code", 211 | "execution_count": null, 212 | "metadata": {}, 213 | "outputs": [], 214 | "source": [] 215 | } 216 | ], 217 | "metadata": { 218 | "kernelspec": { 219 | "display_name": "Codas ML", 220 | "language": "python", 221 | "name": "codasml" 222 | }, 223 | "language_info": { 224 | "codemirror_mode": { 225 | "name": "ipython", 226 | "version": 3 227 | }, 228 | "file_extension": ".py", 229 | "mimetype": "text/x-python", 230 | "name": "python", 231 | "nbconvert_exporter": "python", 232 | "pygments_lexer": "ipython3", 233 | "version": "3.6.5" 234 | } 235 | }, 236 | "nbformat": 4, 237 | "nbformat_minor": 1 238 | } 239 | -------------------------------------------------------------------------------- /08-4-echo_experiments.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "code", 5 | "execution_count": null, 6 | "metadata": {}, 7 | "outputs": [], 8 | "source": [ 9 | "from sequential_tasks import EchoData\n", 10 | "import numpy as np" 11 | ] 12 | }, 13 | { 14 | "cell_type": "code", 15 | "execution_count": null, 16 | 
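One thing to watch in the loops above: `train_size = len(train_data_gen)` counts *batches*, not individual sequences, so the printed "accuracy" is really the average number of correct predictions per batch and can exceed 1. To report a fraction you would normalise by the number of sequences instead; a hypothetical correction, assuming each batch holds `settings['batch_size']` sequences:

```python
# Normalise by the number of sequences seen, not the number of batches.
train_accuracy = correct / (train_size * settings['batch_size'])
```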
"metadata": {}, 17 | "outputs": [], 18 | "source": [ 19 | "import torch\n", 20 | "import torch.nn as nn\n", 21 | "import torch.nn.functional as F\n", 22 | "import torch.optim as optim\n", 23 | "\n", 24 | "torch.manual_seed(1)" 25 | ] 26 | }, 27 | { 28 | "cell_type": "markdown", 29 | "metadata": {}, 30 | "source": [ 31 | "# Specify experiment settings and prepare the data" 32 | ] 33 | }, 34 | { 35 | "cell_type": "code", 36 | "execution_count": null, 37 | "metadata": {}, 38 | "outputs": [], 39 | "source": [ 40 | "# experiments settings\n", 41 | "settings = {\n", 42 | " \"series_length\": 20000,\n", 43 | " \"echo_step\": 3,\n", 44 | " \"truncated_length\": 20,\n", 45 | " \"batch_size\": 5,\n", 46 | " \"h_units\": 4,\n", 47 | " \"max_epochs\": 5\n", 48 | "}" 49 | ] 50 | }, 51 | { 52 | "cell_type": "code", 53 | "execution_count": null, 54 | "metadata": {}, 55 | "outputs": [], 56 | "source": [ 57 | "#training data\n", 58 | "train_data_gen = EchoData(\n", 59 | " series_length=settings['series_length'],\n", 60 | " truncated_length=settings['truncated_length'],\n", 61 | " echo_step=settings['echo_step'],\n", 62 | " batch_size=settings['batch_size'])\n", 63 | "train_size = len(train_data_gen)\n", 64 | "\n", 65 | "#testing \n", 66 | "test_data_gen = EchoData(\n", 67 | " series_length=settings['series_length'],\n", 68 | " truncated_length=settings['truncated_length'],\n", 69 | " echo_step=settings['echo_step'],\n", 70 | " batch_size=settings['batch_size'])\n", 71 | "test_size = len(test_data_gen) " 72 | ] 73 | }, 74 | { 75 | "cell_type": "markdown", 76 | "metadata": {}, 77 | "source": [ 78 | "# Define the model" 79 | ] 80 | }, 81 | { 82 | "cell_type": "code", 83 | "execution_count": null, 84 | "metadata": {}, 85 | "outputs": [], 86 | "source": [ 87 | "class SimpleRNN(nn.Module):\n", 88 | "\n", 89 | " def __init__(self, input_size, rnn_hidden_size, output_size):\n", 90 | "\n", 91 | " super(SimpleRNN, self).__init__()\n", 92 | " self.rnn_hidden_size = rnn_hidden_size\n", 93 | " self.rnn = torch.nn.RNN(input_size, self.rnn_hidden_size, num_layers=1, nonlinearity='relu', batch_first=True)\n", 94 | " self.linear = torch.nn.Linear(rnn_hidden_size, 1)\n", 95 | "\n", 96 | " def forward(self, x, hidden):\n", 97 | " x, hidden = self.rnn(x, hidden) \n", 98 | " x = self.linear(x)\n", 99 | " return nn.Sigmoid()(x), hidden\n", 100 | "\n", 101 | " def init_hidden(self, batch_size):\n", 102 | " weight = next(self.parameters()).data\n", 103 | " return weight.new(1, batch_size, self.rnn_hidden_size).zero_()" 104 | ] 105 | }, 106 | { 107 | "cell_type": "code", 108 | "execution_count": null, 109 | "metadata": {}, 110 | "outputs": [], 111 | "source": [ 112 | "device = torch.device(\"cuda:0\" if torch.cuda.is_available() else \"cpu\")" 113 | ] 114 | }, 115 | { 116 | "cell_type": "markdown", 117 | "metadata": {}, 118 | "source": [ 119 | "## Define training and test loops" 120 | ] 121 | }, 122 | { 123 | "cell_type": "code", 124 | "execution_count": null, 125 | "metadata": {}, 126 | "outputs": [], 127 | "source": [ 128 | "def train(hidden):\n", 129 | " model.train()\n", 130 | " \n", 131 | " correct = 0\n", 132 | " for batch_idx in range(train_size):\n", 133 | " data, target = train_data_gen[batch_idx]\n", 134 | " data, target = torch.from_numpy(data).float().to(device), torch.from_numpy(target).float().to(device)\n", 135 | " optimizer.zero_grad()\n", 136 | " y_pred, hidden = model(data, hidden)\n", 137 | " loss = criterion(y_pred, target)\n", 138 | " loss.backward(retain_graph=True)\n", 139 | " optimizer.step()\n", 140 | " 
\n", 141 | " pred = (y_pred > 0.5).float()\n", 142 | " correct += (pred == target).sum().item()\n", 143 | " \n", 144 | " return correct, loss, hidden " 145 | ] 146 | }, 147 | { 148 | "cell_type": "code", 149 | "execution_count": null, 150 | "metadata": {}, 151 | "outputs": [], 152 | "source": [ 153 | "def test(hidden):\n", 154 | " model.eval() \n", 155 | " correct = 0\n", 156 | " with torch.no_grad():\n", 157 | " for batch_idx in range(test_size):\n", 158 | " data, target = test_data_gen[batch_idx]\n", 159 | " data, target = torch.from_numpy(data).float().to(device), torch.from_numpy(target).float().to(device)\n", 160 | " y_pred, hidden = model(data, hidden)\n", 161 | " \n", 162 | " pred = (y_pred > 0.5).float()\n", 163 | " correct += (pred == target).sum().item()\n", 164 | "\n", 165 | " return correct" 166 | ] 167 | }, 168 | { 169 | "cell_type": "markdown", 170 | "metadata": {}, 171 | "source": [ 172 | "# Initialize the Model and Optimizer" 173 | ] 174 | }, 175 | { 176 | "cell_type": "code", 177 | "execution_count": null, 178 | "metadata": {}, 179 | "outputs": [], 180 | "source": [ 181 | "feature_dim = 1 #since we have a scalar series\n", 182 | "model = SimpleRNN(1, settings['h_units'], 1) \n", 183 | "model.to(device)\n", 184 | "hidden = model.init_hidden(train_data_gen.batch_size) #initialize hidden states for RNN \n", 185 | " \n", 186 | "criterion = torch.nn.BCEWithLogitsLoss()\n", 187 | "optimizer = torch.optim.RMSprop(model.parameters(), lr=0.001)" 188 | ] 189 | }, 190 | { 191 | "cell_type": "code", 192 | "execution_count": null, 193 | "metadata": {}, 194 | "outputs": [], 195 | "source": [ 196 | "epochs = settings['max_epochs']\n", 197 | "epoch = 0\n", 198 | "\n", 199 | "while epoch < epochs:\n", 200 | " correct, loss, hidden = train(hidden)\n", 201 | " epoch += 1\n", 202 | " train_accuracy = float(correct) / train_size\n", 203 | " print('Train Epoch: {}/{}, loss: {:.4f}, accuracy {:2.2f}'.format(epoch, epochs, loss.item(), train_accuracy))\n", 204 | "\n", 205 | "#test \n", 206 | "correct = test(hidden)\n", 207 | "test_accuracy = float(correct) / test_size\n", 208 | "print('\\nTest accuracy: {}'.format(test_accuracy))" 209 | ] 210 | }, 211 | { 212 | "cell_type": "code", 213 | "execution_count": null, 214 | "metadata": {}, 215 | "outputs": [], 216 | "source": [] 217 | }, 218 | { 219 | "cell_type": "code", 220 | "execution_count": null, 221 | "metadata": {}, 222 | "outputs": [], 223 | "source": [] 224 | } 225 | ], 226 | "metadata": { 227 | "kernelspec": { 228 | "display_name": "Codas ML", 229 | "language": "python", 230 | "name": "codasml" 231 | }, 232 | "language_info": { 233 | "codemirror_mode": { 234 | "name": "ipython", 235 | "version": 3 236 | }, 237 | "file_extension": ".py", 238 | "mimetype": "text/x-python", 239 | "name": "python", 240 | "nbconvert_exporter": "python", 241 | "pygments_lexer": "ipython3", 242 | "version": "3.6.5" 243 | } 244 | }, 245 | "nbformat": 4, 246 | "nbformat_minor": 1 247 | } 248 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # PyTorch-Deep-Learning-Minicourse 2 | Minicourse in Deep Learning with PyTorch 3 | 4 | These lessons, developed during the course of several years while I've been teaching at Purdue and NYU, are here proposed for the Computational and Data Science for High Energy Physics ([CoDaS-HEP](http://codas-hep.org/)) summer school at Princeton University. 
5 | I'll upload the videos and link to them as soon as they are made available to me. 6 | I'm also planning to record them in a quieter environment and at a slower pace, add them to my YouTube channel, and make them available [here](https://github.com/Atcold/pytorch-Video-Tutorials). 7 | 8 | ## Table of contents 9 | `T`: theory 10 | `P`: practice 11 | 12 | 1. `T` Learning paradigms: supervised-, unsupervised-, and reinforcement-learning 13 | 2. `P` Getting started with the tools: Jupyter notebook, PyTorch tensors and autodifferentiation 14 | 3. `T+P` Neural net's forward and backward propagation for classification 15 | 4. `T+P` Convolutional neural nets improve performance by exploiting data nature 16 | 5. `T+P` Unsupervised learning: vanilla and variational autoencoders, generative adversarial nets 17 | 6. `T+P` Recurrent nets natively support sequential data 18 | 19 | ## Sessions 20 | 1. Time slot 1 (1h30min + 45 min = 2h15min) on Tuesday afternoon (1, 2, 3) 21 | 2. Time slot 2 (1h30min + 45 min = 2h15min) on Wednesday afternoon (4) 22 | 3. Extra section (45min) on Thursday afternoon (5) 23 | 4. Extra section (1h30min) on Friday morning (6) 24 | 25 | ## Notebooks visualisation 26 | *Jupyter Notebooks* are used throughout these lectures for interactive data exploration and visualisation. 27 | 28 | I use dark styles for both *GitHub* and *Jupyter Notebook*. 29 | You better do the same, or they will look ugly. 30 | To see the content appropriately, install the following: 31 | 32 | - [*Jupyter Notebook* dark theme](https://userstyles.org/styles/153443/jupyter-notebook-dark); 33 | - [*GitHub* dark theme](https://userstyles.org/styles/37035/github-dark) and comment out the `invert #fff to #181818` code block. 34 | 35 | ## Media coverage 36 | - Princeton Research Computing [article](https://researchcomputing.princeton.edu/news/princetons-codas-hep-summer-school-young-physicists-gain-edge-computational-skills) 37 | - Princeton University main page [article](https://www.princeton.edu/news/2018/07/27/princeton-summer-program-graduate-student-physicists-gain-computational-skills) 38 | 39 | ## Keeping in touch 40 | Feel free to follow me on [Twitter](https://twitter.com/AlfredoCanziani) and subscribe to my [YouTube channel](https://www.youtube.com/user/Atcold/) to have the latest free educational material. 41 | 42 | # Getting started 43 | To be able to follow the workshop exercises, you are going to need a laptop with Miniconda (a minimal version of Anaconda) and several Python packages installed. 44 | The following instructions work as-is for Mac and Ubuntu Linux users; Windows users need to install and work in the Git Bash terminal. 45 | 46 | ## Download and install Miniconda 47 | Please go to the [Anaconda website](https://conda.io/miniconda.html). 48 | Download and install *the latest* Miniconda version for *Python* 3.6 for your operating system. 49 | 50 | ```bash 51 | wget <miniconda-installer-url> 52 | sh <miniconda-installer-file> 53 | ``` 54 | 55 | After that, type: 56 | 57 | ```bash 58 | conda --help 59 | ``` 60 | 61 | and read the manual.
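Once the `codas-ml` environment is created and activated (see the next sections), you can sanity-check the installation from Python; a quick check along these lines:

```python
import torch
import torchvision

print(torch.__version__)          # conda-envt.yml pins pytorch=0.4.0
print(torch.cuda.is_available())  # True only if a usable GPU is present
```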
62 | 63 | ## Check out the git repository with the exercises 64 | Once Miniconda is ready, check out the course repository and proceed with setting up the environment: 65 | 66 | ```bash 67 | git clone https://github.com/Atcold/PyTorch-Deep-Learning-Minicourse 68 | ``` 69 | 70 | If you do not have git and do not wish to install it, just download the repository as a zip file and unpack it: 71 | 72 | ```bash 73 | wget https://github.com/Atcold/PyTorch-Deep-Learning-Minicourse/archive/master.zip 74 | #For Mac users: 75 | #curl -O https://github.com/Atcold/PyTorch-Deep-Learning-Minicourse/archive/master.zip 76 | unzip master.zip 77 | ``` 78 | 79 | ## Create an isolated Miniconda environment 80 | Change into the course folder, then type: 81 | 82 | ```bash 83 | #cd PyTorch-Deep-Learning-Minicourse 84 | conda env create -f conda-envt.yml 85 | source activate codas-ml 86 | ``` 87 | 88 | ## Enable the Anaconda kernel in Jupyter 89 | To make the newly created Miniconda environment visible in Jupyter, install `ipykernel`: 90 | 91 | ```bash 92 | python -m ipykernel install --user --name codas-ml --display-name "Codas ML" 93 | ``` 94 | 95 | ## Start Jupyter Notebook 96 | If you are working in a JupyterLab container, double-click on the "Files" tab in the upper right corner. 97 | Locate the first notebook and double-click to open it. 98 | Do not attempt to start `jupyter` from the terminal window. 99 | 100 | If working on a laptop, start it from the terminal as usual: 101 | 102 | ```bash 103 | jupyter notebook 104 | ``` 105 | -------------------------------------------------------------------------------- /conda-envt.yml: -------------------------------------------------------------------------------- 1 | name: codas-ml 2 | channels: 3 | - conda-forge 4 | - pytorch 5 | dependencies: 6 | - python=3.6 7 | - pytorch=0.4.0 8 | - torchvision 9 | - tensorflow=1.8.0 10 | - matplotlib 11 | - jupyter 12 | -------------------------------------------------------------------------------- /img/train.gif: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/SudalaiRajkumar/PyTorch-Deep-Learning-Minicourse/d2b0970935ec19cb4526ceb4ec028b94a3958c25/img/train.gif -------------------------------------------------------------------------------- /plot_conf.py: -------------------------------------------------------------------------------- 1 | # matplotlib and stuff 2 | import matplotlib.pyplot as plt 3 | import numpy as np 4 | 5 | 6 | def plt_style(c='k'): 7 | """ 8 | Set plotting style for bright (``c = 'w'``) or dark (``c = 'k'``) backgrounds 9 | 10 | :param c: colour, can be set to ``'w'`` or ``'k'`` (which is the default) 11 | :type c: str 12 | """ 13 | import matplotlib as mpl 14 | from matplotlib import rc 15 | 16 | # Reset previous configuration 17 | mpl.rcParams.update(mpl.rcParamsDefault) 18 | # %matplotlib inline # not from script 19 | get_ipython().run_line_magic('matplotlib', 'inline') 20 | 21 | # configuration for bright background 22 | if c == 'w': 23 | plt.style.use('bmh') 24 | 25 | # configurations for dark background 26 | if c == 'k': 27 | plt.style.use(['dark_background', 'bmh']) 28 | 29 | # remove background colour, set figure size 30 | rc('figure', figsize=(16, 8), max_open_warning=False) 31 | rc('axes', facecolor='none') 32 | 33 | 34 | def plt_interactive(c='k'): 35 | from matplotlib import rc 36 | import matplotlib as mpl 37 | mpl.rcParams.update(mpl.rcParamsDefault) 38 | get_ipython().run_line_magic('matplotlib', 'notebook') 39 | plt.rc('figure', figsize=(9.5, 4.75),
facecolor=c) 40 | # configuration for bright background 41 | if c == 'w': 42 | plt.style.use('bmh') 43 | 44 | # configurations for dark background 45 | if c == 'k': 46 | plt.style.use(['dark_background', 'bmh']) 47 | rc('axes', facecolor='none') 48 | 49 | plt_style() 50 | -------------------------------------------------------------------------------- /raw/keras-regularisation.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "# Regularisation in NNs" 8 | ] 9 | }, 10 | { 11 | "cell_type": "markdown", 12 | "metadata": {}, 13 | "source": [ 14 | "## 1. Set up the environment" 15 | ] 16 | }, 17 | { 18 | "cell_type": "code", 19 | "execution_count": null, 20 | "metadata": {}, 21 | "outputs": [], 22 | "source": [ 23 | "# Import statements\n", 24 | "from tensorflow import keras as kr\n", 25 | "import tensorflow as tf\n", 26 | "import numpy as np\n", 27 | "import matplotlib.pyplot as plt" 28 | ] 29 | }, 30 | { 31 | "cell_type": "code", 32 | "execution_count": null, 33 | "metadata": {}, 34 | "outputs": [], 35 | "source": [ 36 | "# Set my plotting style\n", 37 | "plt.style.use(('dark_background', 'bmh'))\n", 38 | "plt.rc('axes', facecolor='none')\n", 39 | "plt.rc('figure', figsize=(16, 4))" 40 | ] 41 | }, 42 | { 43 | "cell_type": "code", 44 | "execution_count": null, 45 | "metadata": {}, 46 | "outputs": [], 47 | "source": [ 48 | "# Set random seed for reproducibility\n", 49 | "np.random.seed(0)\n", 50 | "tf.set_random_seed(0)" 51 | ] 52 | }, 53 | { 54 | "cell_type": "code", 55 | "execution_count": null, 56 | "metadata": {}, 57 | "outputs": [], 58 | "source": [ 59 | "# Shortcuts\n", 60 | "imdb = kr.datasets.imdb\n", 61 | "Tokeniser = kr.preprocessing.text.Tokenizer\n", 62 | "models = kr.models\n", 63 | "layers = kr.layers\n", 64 | "regularisers = kr.regularizers\n", 65 | "constraints = kr.constraints\n", 66 | "EarlyStopping = kr.callbacks.EarlyStopping\n", 67 | "ModelCheckpoint = kr.callbacks.ModelCheckpoint" 68 | ] 69 | }, 70 | { 71 | "cell_type": "markdown", 72 | "metadata": {}, 73 | "source": [ 74 | "## 2. 
Loading the data set" 75 | ] 76 | }, 77 | { 78 | "cell_type": "code", 79 | "execution_count": null, 80 | "metadata": {}, 81 | "outputs": [], 82 | "source": [ 83 | "# Set the number of features we want\n", 84 | "features_nb = 1000\n", 85 | "\n", 86 | "# Load data and target vector from movie review data\n", 87 | "(train_data, train_target), (test_data, test_target) = imdb.load_data(num_words=features_nb)\n", 88 | "\n", 89 | "# Convert movie review data to a one-hot encoded feature matrix\n", 90 | "tokeniser = Tokeniser(num_words=features_nb)\n", 91 | "train_features = tokeniser.sequences_to_matrix(train_data, mode='binary')\n", 92 | "test_features = tokeniser.sequences_to_matrix(test_data, mode='binary')" 93 | ] 94 | }, 95 | { 96 | "cell_type": "markdown", 97 | "metadata": {}, 98 | "source": [ 99 | "### 2.1 Exploring the data set" 100 | ] 101 | }, 102 | { 103 | "cell_type": "code", 104 | "execution_count": null, 105 | "metadata": {}, 106 | "outputs": [], 107 | "source": [ 108 | "# Check data set sizes\n", 109 | "print('train_data.shape:', train_data.shape)\n", 110 | "print('train_target.shape:', train_target.shape)\n", 111 | "print('test_data.shape:', test_data.shape)\n", 112 | "print('test_target.shape:', test_target.shape)" 113 | ] 114 | }, 115 | { 116 | "cell_type": "code", 117 | "execution_count": null, 118 | "metadata": {}, 119 | "outputs": [], 120 | "source": [ 121 | "# Check format of first training sample\n", 122 | "print('type(train_data[0]):', type(train_data[0]))\n", 123 | "print('type(train_target[0]):', type(train_target[0]))" 124 | ] 125 | }, 126 | { 127 | "cell_type": "code", 128 | "execution_count": null, 129 | "metadata": {}, 130 | "outputs": [], 131 | "source": [ 132 | "# Check size of first 10 training samples and corresponding target\n", 133 | "print('Reviews length:', [len(sample) for sample in train_data[:10]])\n", 134 | "print('Review sentiment (bad/good):', train_target[:10])" 135 | ] 136 | }, 137 | { 138 | "cell_type": "code", 139 | "execution_count": null, 140 | "metadata": {}, 141 | "outputs": [], 142 | "source": [ 143 | "# Show first review - machine format\n", 144 | "print(train_data[0])" 145 | ] 146 | }, 147 | { 148 | "cell_type": "code", 149 | "execution_count": null, 150 | "metadata": {}, 151 | "outputs": [], 152 | "source": [ 153 | "# Data set text visualisation helper function\n", 154 | "def show_text(sample):\n", 155 | "    word_to_id = imdb.get_word_index()\n", 156 | "    word_to_id = {k:(v+3) for k,v in word_to_id.items()}\n", 157 | "    word_to_id[\"<PAD>\"] = 0\n", 158 | "    word_to_id[\"<START>\"] = 1\n", 159 | "    word_to_id[\"<UNK>\"] = 2\n", 160 | "\n", 161 | "    id_to_word = {value:key for key,value in word_to_id.items()}\n", 162 | "    print(' '.join(id_to_word[id_] for id_ in sample))" 163 | ] 164 | }, 165 | { 166 | "cell_type": "code", 167 | "execution_count": null, 168 | "metadata": {}, 169 | "outputs": [], 170 | "source": [ 171 | "# Show first review - human format\n", 172 | "show_text(train_data[0])" 173 | ] 174 | }, 175 | { 176 | "cell_type": "code", 177 | "execution_count": null, 178 | "metadata": {}, 179 | "outputs": [], 180 | "source": [ 181 | "# Show first review - neural net format\n", 182 | "print(train_features[0])" 183 | ] 184 | }, 185 | { 186 | "cell_type": "code", 187 | "execution_count": null, 188 | "metadata": {}, 189 | "outputs": [], 190 | "source": [ 191 | "# Show first review - neural net format - explanation\n", 192 | "print(train_features[0] * np.arange(len(train_features[0])))" 193 | ] 194 | }, 195 | { 196 | "cell_type": "markdown", 197 | "metadata": {},
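The multi-hot feature matrix produced by `sequences_to_matrix` above can be reproduced by hand, which makes the `'binary'` mode explicit; a small sketch assuming `train_data` and `features_nb` from the cells above:

```python
import numpy as np

# Equivalent of Tokenizer.sequences_to_matrix(..., mode='binary'):
# row i gets a 1 in column j iff word id j occurs anywhere in review i.
features = np.zeros((len(train_data), features_nb))
for i, review in enumerate(train_data):
    features[i, review] = 1.0
```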
198 | "source": [ 199 | "## 3. Exploring regularisation of NN\n", 200 | "\n", 201 | "Play with the code, especially the one marked `# toggle`. \n", 202 | "Start from `# toggle 0`, and then, one at the time, `# toggle 1` to `5`." 203 | ] 204 | }, 205 | { 206 | "cell_type": "code", 207 | "execution_count": null, 208 | "metadata": {}, 209 | "outputs": [], 210 | "source": [ 211 | "# Start neural network\n", 212 | "network = models.Sequential()\n", 213 | "\n", 214 | "# Add a Dropout layer\n", 215 | "# network.add(layers.Dropout(0.2)) # toggle 4\n", 216 | "\n", 217 | "# Add fully connected layer with a ReLU activation function and L2 regularization\n", 218 | "network.add(layers.Dense(\n", 219 | " units=16, \n", 220 | " activation='relu', \n", 221 | "# kernel_regularizer=regularisers.l2(0.005), # toggle 1\n", 222 | "# kernel_regularizer=regularisers.l1(0.001), # toggle 2\n", 223 | "# kernel_constraint=constraints.max_norm(1), # toggle 3\n", 224 | " input_shape=(features_nb,)\n", 225 | "))\n", 226 | "\n", 227 | "# Add fully connected layer with a ReLU activation function and L2 regularization\n", 228 | "network.add(layers.Dense(\n", 229 | " units=16, \n", 230 | "# kernel_regularizer=regularisers.l2(0.005), # toggle 1\n", 231 | "# kernel_constraint=constraints.max_norm(1), # toggle 3\n", 232 | " activation='relu'\n", 233 | "))\n", 234 | "\n", 235 | "# Add a Dropout layer\n", 236 | "# network.add(layers.Dropout(0.5)) # toggle 4\n", 237 | "\n", 238 | "# Add fully connected layer with a sigmoid activation function\n", 239 | "network.add(layers.Dense(units=1, activation='sigmoid')) # Compile neural network\n", 240 | "\n", 241 | "# Compile network\n", 242 | "network.compile(\n", 243 | " loss='binary_crossentropy', # Cross-entropy\n", 244 | " optimizer='rmsprop', # Root Mean Square Propagation\n", 245 | " metrics=['accuracy'] # Accuracy performance metric\n", 246 | ")" 247 | ] 248 | }, 249 | { 250 | "cell_type": "code", 251 | "execution_count": null, 252 | "metadata": {}, 253 | "outputs": [], 254 | "source": [ 255 | "# Train neural network\n", 256 | "history = network.fit(\n", 257 | " train_features, # Features\n", 258 | " train_target, # Target vector\n", 259 | " epochs=25, # Number of epochs\n", 260 | " verbose=0, # No output\n", 261 | " batch_size=100, # Number of observations per batch\n", 262 | " validation_data=(test_features, test_target), # Data for evaluation\n", 263 | "# callbacks=[ # toggle 5\n", 264 | "# EarlyStopping(monitor='val_loss', patience=2), # toggle 5\n", 265 | "# ModelCheckpoint(filepath='best_model.h5', monitor='val_loss', save_best_only=True) # toggle 5\n", 266 | "# ], # toggle 5\n", 267 | ")" 268 | ] 269 | }, 270 | { 271 | "cell_type": "code", 272 | "execution_count": null, 273 | "metadata": {}, 274 | "outputs": [], 275 | "source": [ 276 | "# ! 
ls # toggle 5" 277 | ] 278 | }, 279 | { 280 | "cell_type": "code", 281 | "execution_count": null, 282 | "metadata": {}, 283 | "outputs": [], 284 | "source": [ 285 | "# Get training and test accuracy histories\n", 286 | "train_loss = history.history['loss']\n", 287 | "test_loss = history.history['val_loss']\n", 288 | "\n", 289 | "# Create count of the number of epochs\n", 290 | "epoch = range(1, len(train_loss) + 1)\n", 291 | "\n", 292 | "# Visualize accuracy history\n", 293 | "plt.figure()\n", 294 | "\n", 295 | "plt.plot(epoch, train_loss)\n", 296 | "plt.plot(epoch, test_loss)\n", 297 | "# plt.plot(no_reg['epoch'], no_reg['train_loss']) # toggle 0\n", 298 | "# plt.plot(no_reg['epoch'], no_reg['test_loss']) # toggle 0\n", 299 | "\n", 300 | "plt.legend(['Train loss', 'Test loss', 'Train no-reg', 'Test no-reg'])\n", 301 | "plt.xlabel('Epoch')\n", 302 | "plt.ylabel('Loss score')\n", 303 | "\n", 304 | "# Get training and test accuracy histories\n", 305 | "train_accuracy = history.history['acc']\n", 306 | "test_accuracy = history.history['val_acc']\n", 307 | "\n", 308 | "# Visualize accuracy history\n", 309 | "plt.figure()\n", 310 | "\n", 311 | "plt.plot(epoch, train_accuracy)\n", 312 | "plt.plot(epoch, test_accuracy)\n", 313 | "# plt.plot(no_reg['epoch'], no_reg['train_accuracy']) # toggle 0\n", 314 | "# plt.plot(no_reg['epoch'], no_reg['test_accuracy']) # toggle 0\n", 315 | "\n", 316 | "plt.legend(['Train accuracy', 'Test accuracy', 'Train no-reg', 'Test no-reg'])\n", 317 | "plt.xlabel('Epoch')\n", 318 | "plt.ylabel('Accuracy Score')\n", 319 | "\n", 320 | "no_reg = { # toggle 0\n", 321 | " 'epoch': epoch, # toggle 0\n", 322 | " 'train_loss': train_loss, # toggle 0\n", 323 | " 'test_loss': test_loss, # toggle 0\n", 324 | " 'train_accuracy': train_accuracy, # toggle 0\n", 325 | " 'test_accuracy': test_accuracy, # toggle 0\n", 326 | "}" 327 | ] 328 | }, 329 | { 330 | "cell_type": "code", 331 | "execution_count": null, 332 | "metadata": {}, 333 | "outputs": [], 334 | "source": [ 335 | "# Backup weights\n", 336 | "weights = network.layers[0].get_weights()[0] # toggle 0\n", 337 | "# weights_L1 = network.layers[0].get_weights()[0] # toggle 1\n", 338 | "# weights_L2 = network.layers[0].get_weights()[0] # toggle 2\n", 339 | "# weights_max = network.layers[0].get_weights()[0] # toggle 3" 340 | ] 341 | }, 342 | { 343 | "cell_type": "markdown", 344 | "metadata": {}, 345 | "source": [ 346 | "After you got to toggle `# toggle 3`, execute the following code." 
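For reference, the `l1`/`l2` kernel regularisers toggled above add a weight penalty to the loss being minimised, which is what shrinks the weight histograms plotted next; with the values used in the toggles,

$$
\mathcal{L}_{\mathrm{L1}} = \mathcal{L}_{\mathrm{BCE}} + 0.001 \sum_i |w_i|,
\qquad
\mathcal{L}_{\mathrm{L2}} = \mathcal{L}_{\mathrm{BCE}} + 0.005 \sum_i w_i^2,
$$

while `max_norm(1)` instead rescales each unit's incoming weight vector after every update so that its Euclidean norm never exceeds 1.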
347 | ] 348 | }, 349 | { 350 | "cell_type": "code", 351 | "execution_count": null, 352 | "metadata": {}, 353 | "outputs": [], 354 | "source": [ 355 | "# Show weight distribution\n", 356 | "plt.hist((\n", 357 | " weights.reshape(-1),\n", 358 | " weights_L1.reshape(-1),\n", 359 | " weights_L2.reshape(-1),\n", 360 | " weights_max.reshape(-1),\n", 361 | "), 49, range=(-.5, .5), label=(\n", 362 | " 'No-reg',\n", 363 | " 'L1',\n", 364 | " 'L2',\n", 365 | " 'Max',\n", 366 | "))\n", 367 | "plt.legend();" 368 | ] 369 | } 370 | ], 371 | "metadata": { 372 | "kernelspec": { 373 | "display_name": "Python 3", 374 | "language": "python", 375 | "name": "python3" 376 | }, 377 | "language_info": { 378 | "codemirror_mode": { 379 | "name": "ipython", 380 | "version": 3 381 | }, 382 | "file_extension": ".py", 383 | "mimetype": "text/x-python", 384 | "name": "python", 385 | "nbconvert_exporter": "python", 386 | "pygments_lexer": "ipython3", 387 | "version": "3.6.5" 388 | } 389 | }, 390 | "nbformat": 4, 391 | "nbformat_minor": 2 392 | } 393 | -------------------------------------------------------------------------------- /raw/keras-sequences/0_1_classify_seq_data.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": { 6 | "collapsed": true 7 | }, 8 | "source": [ 9 | "An example of many-to-one (sequence classification):\n", 10 | "\n", 11 | "\n", 12 | "Original experiment from Hochreiter&Schmidhuber(1997):\n", 13 | "\n", 14 | " The goal is to classify sequences. Elements and targets are represented locally\n", 15 | " (input vectors with only one non-zero bit). The sequence starts with an E, ends\n", 16 | " with a B (the \"trigger symbol\") and otherwise consists of randomly chosen symbols\n", 17 | " from the set {a, b, c, d} except for two elements at positions t1 and t2 that are\n", 18 | " either X or Y . The sequence length is randomly chosen between 100 and 110, t1 is\n", 19 | " randomly chosen between 10 and 20, and t2 is randomly chosen between 50 and 60.\n", 20 | " There are 4 sequence classes Q, R, S, U which depend on the temporal order of X and Y.\n", 21 | " The rules are:\n", 22 | " X, X -> Q,\n", 23 | " X, Y -> R,\n", 24 | " Y , X -> S,\n", 25 | " Y , Y -> U. " 26 | ] 27 | }, 28 | { 29 | "cell_type": "code", 30 | "execution_count": 1, 31 | "metadata": {}, 32 | "outputs": [], 33 | "source": [ 34 | "from sequential_tasks import TemporalOrderExp6aSequence\n", 35 | "\n", 36 | "# data generator\n", 37 | "dg = TemporalOrderExp6aSequence.get_predefined_generator(\n", 38 | " TemporalOrderExp6aSequence.DifficultyLevel.EASY)\n" 39 | ] 40 | }, 41 | { 42 | "cell_type": "code", 43 | "execution_count": 2, 44 | "metadata": {}, 45 | "outputs": [ 46 | { 47 | "name": "stdout", 48 | "output_type": "stream", 49 | "text": [ 50 | "BbXcXcbE ----> Q\n", 51 | "BYdadYE ----> U\n", 52 | "BXddbXcE ----> Q\n", 53 | "BXacYdE ----> R\n", 54 | "BbYbXdbE ----> S\n" 55 | ] 56 | } 57 | ], 58 | "source": [ 59 | "# Raw sequences and their classes:\n", 60 | "for n in range(5):\n", 61 | " x, y = dg.generate_pair()\n", 62 | " print('{} ----> {}'.format(x, y))\n" 63 | ] 64 | }, 65 | { 66 | "cell_type": "code", 67 | "execution_count": 3, 68 | "metadata": {}, 69 | "outputs": [ 70 | { 71 | "name": "stdout", 72 | "output_type": "stream", 73 | "text": [ 74 | "BXbcYaE ----> R\n", 75 | "Encoded input sequence:\n", 76 | "[[0. 0. 0. 0. 0. 0. 1. 0.]\n", 77 | " [1. 0. 0. 0. 0. 0. 0. 0.]\n", 78 | " [0. 0. 0. 1. 0. 0. 0. 0.]\n", 79 | " [0. 0. 0. 0. 1. 
0. 0. 0.]\n", 80 | " [0. 1. 0. 0. 0. 0. 0. 0.]\n", 81 | " [0. 0. 1. 0. 0. 0. 0. 0.]\n", 82 | " [0. 0. 0. 0. 0. 0. 0. 1.]]\n", 83 | "Encoded output sequence:\n", 84 | "[0. 1. 0. 0.]\n" 85 | ] 86 | } 87 | ], 88 | "source": [ 89 | "# Encoding our data into RNN-friendly data format\n", 90 | "\n", 91 | "# Single data pair example:\n", 92 | "x, y = dg.generate_pair()\n", 93 | "print('{} ----> {}'.format(x, y))\n", 94 | " \n", 95 | "enc_x = dg.encode_x(x)\n", 96 | "enc_y = dg.encode_y(y)\n", 97 | "\n", 98 | "print('Encoded input sequence:')\n", 99 | "print(enc_x)\n", 100 | "print('Encoded output sequence:')\n", 101 | "print(enc_y)\n" 102 | ] 103 | }, 104 | { 105 | "cell_type": "code", 106 | "execution_count": 4, 107 | "metadata": {}, 108 | "outputs": [ 109 | { 110 | "name": "stdout", 111 | "output_type": "stream", 112 | "text": [ 113 | "Batch_x shape = (32, 9, 8)\n", 114 | "Batch_y shape = (32, 4)\n", 115 | "[[0 0 0 0 0 0 0 0]\n", 116 | " [0 0 0 0 0 0 1 0]\n", 117 | " [1 0 0 0 0 0 0 0]\n", 118 | " [0 0 0 0 1 0 0 0]\n", 119 | " [0 0 0 1 0 0 0 0]\n", 120 | " [0 1 0 0 0 0 0 0]\n", 121 | " [0 0 1 0 0 0 0 0]\n", 122 | " [0 0 0 0 1 0 0 0]\n", 123 | " [0 0 0 0 0 0 0 1]]\n" 124 | ] 125 | } 126 | ], 127 | "source": [ 128 | "# let's generate a batch of training pairs\n", 129 | "batch_x, batch_y = dg[0]\n", 130 | "\n", 131 | "# batch_x has the shape (batch_size, max_seq_length, num_symbols)\n", 132 | "print('Batch_x shape = ', batch_x.shape)\n", 133 | "\n", 134 | "# batch_y has the shape (batch_size, num_classes)\n", 135 | "print('Batch_y shape = ', batch_y.shape)\n", 136 | "\n", 137 | "# inputs are zero-padded (added zero prefix)\n", 138 | "# to obtain sequences of equal length\n", 139 | "print(batch_x[0])\n", 140 | "\n" 141 | ] 142 | } 143 | ], 144 | "metadata": { 145 | "kernelspec": { 146 | "display_name": "Python 3", 147 | "language": "python", 148 | "name": "python3" 149 | }, 150 | "language_info": { 151 | "codemirror_mode": { 152 | "name": "ipython", 153 | "version": 3 154 | }, 155 | "file_extension": ".py", 156 | "mimetype": "text/x-python", 157 | "name": "python", 158 | "nbconvert_exporter": "python", 159 | "pygments_lexer": "ipython3", 160 | "version": "3.6.3" 161 | } 162 | }, 163 | "nbformat": 4, 164 | "nbformat_minor": 1 165 | } 166 | -------------------------------------------------------------------------------- /raw/keras-sequences/0_2_echo_data.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": { 6 | "collapsed": true 7 | }, 8 | "source": [ 9 | "Echoing signal n steps is an example of synchronized many-to-many task:" 10 | ] 11 | }, 12 | { 13 | "cell_type": "code", 14 | "execution_count": 1, 15 | "metadata": {}, 16 | "outputs": [], 17 | "source": [ 18 | "from sequential_tasks import EchoData\n", 19 | "\n", 20 | "batch_size = 5\n", 21 | "echo_step = 3\n", 22 | "series_length = 20000\n", 23 | "truncated_length = 10\n", 24 | "\n", 25 | "data_gen = EchoData(\n", 26 | " echo_step=echo_step,\n", 27 | " batch_size=batch_size,\n", 28 | " series_length=series_length,\n", 29 | " truncated_length=truncated_length)\n", 30 | "\n" 31 | ] 32 | }, 33 | { 34 | "cell_type": "code", 35 | "execution_count": 2, 36 | "metadata": {}, 37 | "outputs": [ 38 | { 39 | "name": "stdout", 40 | "output_type": "stream", 41 | "text": [ 42 | "(1st sequence) x = [0 1 0 1 0 1 1 0 0 0 1 0 0 1 0 1 0 0 1 0] ... \n", 43 | "(1st sequence) y = [0 0 0 0 1 0 1 0 1 1 0 0 0 1 0 0 1 0 1 0] ... 
\n" 44 | ] 45 | } 46 | ], 47 | "source": [ 48 | "# Let's print first 20 timesteps of the first sequences to see the echo data:\n", 49 | "print('(1st sequence) x = ', data_gen.raw_x[0, :20], '... ')\n", 50 | "print('(1st sequence) y = ', data_gen.raw_y[0, :20], '... ')\n" 51 | ] 52 | }, 53 | { 54 | "cell_type": "code", 55 | "execution_count": 3, 56 | "metadata": {}, 57 | "outputs": [ 58 | { 59 | "name": "stdout", 60 | "output_type": "stream", 61 | "text": [ 62 | "bax = \n", 63 | "[[0 1 0 1 0 1 1 0 0 0 1 0 0 1 0 1 0 0 1 0]\n", 64 | " [1 0 1 1 1 0 1 1 0 1 1 0 0 1 1 1 0 1 0 1]\n", 65 | " [1 0 1 0 0 1 1 1 1 0 1 1 0 0 1 0 1 0 0 1]\n", 66 | " [1 1 1 1 1 1 1 1 0 0 0 1 0 1 0 1 1 1 0 0]\n", 67 | " [1 0 0 0 0 0 1 1 1 1 1 0 0 1 1 0 1 1 0 0]]\n", 68 | "y = \n", 69 | "[[0 0 0 0 1 0 1 0 1 1 0 0 0 1 0 0 1 0 1 0]\n", 70 | " [0 0 0 1 0 1 1 1 0 1 1 0 1 1 0 0 1 1 1 0]\n", 71 | " [0 0 0 1 0 1 0 0 1 1 1 1 0 1 1 0 0 1 0 1]\n", 72 | " [0 0 0 1 1 1 1 1 1 1 1 0 0 0 1 0 1 0 1 1]\n", 73 | " [0 0 0 1 0 0 0 0 0 1 1 1 1 1 0 0 1 1 0 1]]\n", 74 | "raw_x shape: (5, 20000)\n", 75 | "raw_y shape: (5, 20000)\n" 76 | ] 77 | } 78 | ], 79 | "source": [ 80 | "# batch_size different sequences are created:\n", 81 | "print('bax = ')\n", 82 | "print(data_gen.raw_x[:, :20])\n", 83 | "print('y = ')\n", 84 | "print(data_gen.raw_y[:, :20])\n", 85 | "\n", 86 | "print('raw_x shape:', data_gen.raw_x.shape) # shape = (batch_size, series_length)\n", 87 | "print('raw_y shape:', data_gen.raw_y.shape) # shape = (batch_size, series_length)\n" 88 | ] 89 | }, 90 | { 91 | "cell_type": "code", 92 | "execution_count": 4, 93 | "metadata": {}, 94 | "outputs": [ 95 | { 96 | "name": "stdout", 97 | "output_type": "stream", 98 | "text": [ 99 | "batch x shape: (5, 10, 1)\n", 100 | "batch y shape: (5, 10, 1)\n" 101 | ] 102 | } 103 | ], 104 | "source": [ 105 | "# In order to use RNNs data organized into batches of size:\n", 106 | "# [batch_size, truncated_sequence_length, feature_dim\n", 107 | "\n", 108 | "i_batch = 0\n", 109 | "print('batch x shape:', data_gen.x_batches[i_batch].shape)\n", 110 | "print('batch y shape:', data_gen.y_batches[i_batch].shape)\n" 111 | ] 112 | }, 113 | { 114 | "cell_type": "code", 115 | "execution_count": 5, 116 | "metadata": {}, 117 | "outputs": [ 118 | { 119 | "name": "stdout", 120 | "output_type": "stream", 121 | "text": [ 122 | "[[[0]\n", 123 | " [1]\n", 124 | " [0]\n", 125 | " [1]\n", 126 | " [0]\n", 127 | " [1]\n", 128 | " [1]\n", 129 | " [0]\n", 130 | " [0]\n", 131 | " [0]]\n", 132 | "\n", 133 | " [[1]\n", 134 | " [0]\n", 135 | " [1]\n", 136 | " [1]\n", 137 | " [1]\n", 138 | " [0]\n", 139 | " [1]\n", 140 | " [1]\n", 141 | " [0]\n", 142 | " [1]]\n", 143 | "\n", 144 | " [[1]\n", 145 | " [0]\n", 146 | " [1]\n", 147 | " [0]\n", 148 | " [0]\n", 149 | " [1]\n", 150 | " [1]\n", 151 | " [1]\n", 152 | " [1]\n", 153 | " [0]]\n", 154 | "\n", 155 | " [[1]\n", 156 | " [1]\n", 157 | " [1]\n", 158 | " [1]\n", 159 | " [1]\n", 160 | " [1]\n", 161 | " [1]\n", 162 | " [1]\n", 163 | " [0]\n", 164 | " [0]]\n", 165 | "\n", 166 | " [[1]\n", 167 | " [0]\n", 168 | " [0]\n", 169 | " [0]\n", 170 | " [0]\n", 171 | " [0]\n", 172 | " [1]\n", 173 | " [1]\n", 174 | " [1]\n", 175 | " [1]]]\n", 176 | "[[[0]\n", 177 | " [0]\n", 178 | " [0]\n", 179 | " [0]\n", 180 | " [1]\n", 181 | " [0]\n", 182 | " [1]\n", 183 | " [0]\n", 184 | " [1]\n", 185 | " [1]]\n", 186 | "\n", 187 | " [[0]\n", 188 | " [0]\n", 189 | " [0]\n", 190 | " [1]\n", 191 | " [0]\n", 192 | " [1]\n", 193 | " [1]\n", 194 | " [1]\n", 195 | " [0]\n", 196 | " [1]]\n", 197 | "\n", 198 | " [[0]\n", 199 | 
" [0]\n", 200 | " [0]\n", 201 | " [1]\n", 202 | " [0]\n", 203 | " [1]\n", 204 | " [0]\n", 205 | " [0]\n", 206 | " [1]\n", 207 | " [1]]\n", 208 | "\n", 209 | " [[0]\n", 210 | " [0]\n", 211 | " [0]\n", 212 | " [1]\n", 213 | " [1]\n", 214 | " [1]\n", 215 | " [1]\n", 216 | " [1]\n", 217 | " [1]\n", 218 | " [1]]\n", 219 | "\n", 220 | " [[0]\n", 221 | " [0]\n", 222 | " [0]\n", 223 | " [1]\n", 224 | " [0]\n", 225 | " [0]\n", 226 | " [0]\n", 227 | " [0]\n", 228 | " [0]\n", 229 | " [1]]]\n" 230 | ] 231 | } 232 | ], 233 | "source": [ 234 | "\n", 235 | "print(data_gen.x_batches[i_batch])\n", 236 | "print(data_gen.y_batches[i_batch])\n" 237 | ] 238 | }, 239 | { 240 | "cell_type": "markdown", 241 | "metadata": {}, 242 | "source": [ 243 | " " 244 | ] 245 | } 246 | ], 247 | "metadata": { 248 | "kernelspec": { 249 | "display_name": "Python 3", 250 | "language": "python", 251 | "name": "python3" 252 | }, 253 | "language_info": { 254 | "codemirror_mode": { 255 | "name": "ipython", 256 | "version": 3 257 | }, 258 | "file_extension": ".py", 259 | "mimetype": "text/x-python", 260 | "name": "python", 261 | "nbconvert_exporter": "python", 262 | "pygments_lexer": "ipython3", 263 | "version": "3.6.3" 264 | } 265 | }, 266 | "nbformat": 4, 267 | "nbformat_minor": 1 268 | } 269 | -------------------------------------------------------------------------------- /raw/keras-sequences/1_1_temporal_order_classification_experiments.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "code", 5 | "execution_count": 1, 6 | "metadata": {}, 7 | "outputs": [], 8 | "source": [ 9 | "from sequential_tasks import TemporalOrderExp6aSequence\n", 10 | "from tensorflow.python.keras.models import Sequential\n", 11 | "from tensorflow.python.keras.layers import SimpleRNN, Dense\n" 12 | ] 13 | }, 14 | { 15 | "cell_type": "code", 16 | "execution_count": 2, 17 | "metadata": {}, 18 | "outputs": [], 19 | "source": [ 20 | "def exp6a_experiment(settings):\n", 21 | " train_data_gen = TemporalOrderExp6aSequence.get_predefined_generator(\n", 22 | " settings['difficulty'],\n", 23 | " settings['batch_size'])\n", 24 | "\n", 25 | " model = Sequential([\n", 26 | " SimpleRNN(\n", 27 | " units=settings['h_units'],\n", 28 | " input_shape=(train_data_gen.length_range[1],\n", 29 | " train_data_gen.n_symbols)),\n", 30 | " Dense(units=train_data_gen.n_classes, activation='softmax')\n", 31 | " ])\n", 32 | "\n", 33 | " model.compile(optimizer='rmsprop', loss='categorical_crossentropy', metrics=['accuracy'])\n", 34 | " model.fit_generator(\n", 35 | " generator=train_data_gen,\n", 36 | " epochs=settings['max_epochs'],\n", 37 | " verbose=2)\n", 38 | "\n", 39 | " # testing\n", 40 | " test_data_gen = TemporalOrderExp6aSequence.get_predefined_generator(\n", 41 | " settings['difficulty'],\n", 42 | " settings['batch_size'])\n", 43 | "\n", 44 | " eval_metrics = model.evaluate_generator(test_data_gen)\n", 45 | " test_accuracy = eval_metrics[1]\n", 46 | " \n", 47 | " return test_accuracy\n" 48 | ] 49 | }, 50 | { 51 | "cell_type": "code", 52 | "execution_count": 3, 53 | "metadata": {}, 54 | "outputs": [ 55 | { 56 | "name": "stdout", 57 | "output_type": "stream", 58 | "text": [ 59 | "Epoch 1/30\n", 60 | " - 0s - loss: 1.3959 - acc: 0.2591\n", 61 | "Epoch 2/30\n", 62 | " - 0s - loss: 1.3335 - acc: 0.3347\n", 63 | "Epoch 3/30\n", 64 | " - 0s - loss: 1.3186 - acc: 0.3498\n", 65 | "Epoch 4/30\n", 66 | " - 0s - loss: 1.2981 - acc: 0.3659\n", 67 | "Epoch 5/30\n", 68 | " - 0s - loss: 1.2400 - acc: 
0.4355\n", 69 | "Epoch 6/30\n", 70 | " - 0s - loss: 1.2026 - acc: 0.4899\n", 71 | "Epoch 7/30\n", 72 | " - 0s - loss: 1.1653 - acc: 0.5101\n", 73 | "Epoch 8/30\n", 74 | " - 0s - loss: 1.1257 - acc: 0.5232\n", 75 | "Epoch 9/30\n", 76 | " - 0s - loss: 1.0727 - acc: 0.5817\n", 77 | "Epoch 10/30\n", 78 | " - 0s - loss: 1.0358 - acc: 0.6431\n", 79 | "Epoch 11/30\n", 80 | " - 0s - loss: 0.9817 - acc: 0.6643\n", 81 | "Epoch 12/30\n", 82 | " - 0s - loss: 0.9390 - acc: 0.6825\n", 83 | "Epoch 13/30\n", 84 | " - 0s - loss: 0.8946 - acc: 0.7016\n", 85 | "Epoch 14/30\n", 86 | " - 0s - loss: 0.8541 - acc: 0.7097\n", 87 | "Epoch 15/30\n", 88 | " - 0s - loss: 0.8061 - acc: 0.7681\n", 89 | "Epoch 16/30\n", 90 | " - 0s - loss: 0.7683 - acc: 0.7621\n", 91 | "Epoch 17/30\n", 92 | " - 0s - loss: 0.7158 - acc: 0.7944\n", 93 | "Epoch 18/30\n", 94 | " - 0s - loss: 0.6629 - acc: 0.8730\n", 95 | "Epoch 19/30\n", 96 | " - 0s - loss: 0.6269 - acc: 0.9153\n", 97 | "Epoch 20/30\n", 98 | " - 0s - loss: 0.5782 - acc: 0.9345\n", 99 | "Epoch 21/30\n", 100 | " - 0s - loss: 0.5307 - acc: 0.9556\n", 101 | "Epoch 22/30\n", 102 | " - 0s - loss: 0.4938 - acc: 0.9768\n", 103 | "Epoch 23/30\n", 104 | " - 0s - loss: 0.4314 - acc: 0.9859\n", 105 | "Epoch 24/30\n", 106 | " - 0s - loss: 0.4023 - acc: 0.9960\n", 107 | "Epoch 25/30\n", 108 | " - 0s - loss: 0.3608 - acc: 0.9980\n", 109 | "Epoch 26/30\n", 110 | " - 0s - loss: 0.3202 - acc: 1.0000\n", 111 | "Epoch 27/30\n", 112 | " - 0s - loss: 0.2959 - acc: 1.0000\n", 113 | "Epoch 28/30\n", 114 | " - 0s - loss: 0.2637 - acc: 1.0000\n", 115 | "Epoch 29/30\n", 116 | " - 0s - loss: 0.2297 - acc: 1.0000\n", 117 | "Epoch 30/30\n", 118 | " - 0s - loss: 0.2187 - acc: 1.0000\n", 119 | "acc = 1.00%.\n" 120 | ] 121 | } 122 | ], 123 | "source": [ 124 | "# experiments settings\n", 125 | "params = {\n", 126 | " \"difficulty\": TemporalOrderExp6aSequence.DifficultyLevel.EASY,\n", 127 | " \"batch_size\": 32,\n", 128 | " \"h_units\": 4,\n", 129 | " \"max_epochs\": 30\n", 130 | "}\n", 131 | "\n", 132 | "acc = exp6a_experiment(params)\n", 133 | "print('acc = {:.2f}%.'.format(acc))\n" 134 | ] 135 | } 136 | ], 137 | "metadata": { 138 | "kernelspec": { 139 | "display_name": "Python 3", 140 | "language": "python", 141 | "name": "python3" 142 | }, 143 | "language_info": { 144 | "codemirror_mode": { 145 | "name": "ipython", 146 | "version": 3 147 | }, 148 | "file_extension": ".py", 149 | "mimetype": "text/x-python", 150 | "name": "python", 151 | "nbconvert_exporter": "python", 152 | "pygments_lexer": "ipython3", 153 | "version": "3.6.3" 154 | } 155 | }, 156 | "nbformat": 4, 157 | "nbformat_minor": 1 158 | } 159 | -------------------------------------------------------------------------------- /raw/keras-sequences/1_2_echo_experiments.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "code", 5 | "execution_count": 1, 6 | "metadata": {}, 7 | "outputs": [], 8 | "source": [ 9 | "from sequential_tasks import EchoData\n", 10 | "from tensorflow.python.keras.models import Sequential\n", 11 | "from tensorflow.python.keras.layers import SimpleRNN, Dense, TimeDistributed\n", 12 | "import numpy as np\n" 13 | ] 14 | }, 15 | { 16 | "cell_type": "code", 17 | "execution_count": 2, 18 | "metadata": {}, 19 | "outputs": [], 20 | "source": [ 21 | "def echo_experiment(settings):\n", 22 | " train_data_gen = EchoData(\n", 23 | " series_length=settings['series_length'],\n", 24 | " truncated_length=settings['truncated_length'],\n", 25 | " 
echo_step=settings['echo_step'],\n", 26 | "        batch_size=settings['batch_size'])\n", 27 | "\n", 28 | "    model = Sequential([\n", 29 | "        SimpleRNN(\n", 30 | "            units=settings['h_units'],\n", 31 | "            batch_input_shape=(settings['batch_size'], settings['truncated_length'], 1),\n", 32 | "            return_sequences=True,\n", 33 | "            stateful=True),\n", 34 | "        TimeDistributed(\n", 35 | "            Dense(units=1, activation='sigmoid'))\n", 36 | "    ])\n", 37 | "\n", 38 | "    model.compile(optimizer='rmsprop', loss='binary_crossentropy', metrics=['accuracy'])\n", 39 | "    model.fit_generator(\n", 40 | "        generator=train_data_gen,\n", 41 | "        epochs=settings['max_epochs'],\n", 42 | "        verbose=2,\n", 43 | "        shuffle=False)\n", 44 | "\n", 45 | "    # testing\n", 46 | "    test_data_gen = EchoData(\n", 47 | "        series_length=settings['series_length'],\n", 48 | "        truncated_length=settings['truncated_length'],\n", 49 | "        echo_step=settings['echo_step'],\n", 50 | "        batch_size=settings['batch_size'])\n", 51 | "\n", 52 | "    # we could do evaluation like this:\n", 53 | "    # eval_metrics = model.evaluate_generator(test_data_gen)\n", 54 | "    \n", 55 | "    # but let's gather statistics from each batch...\n", 56 | "    batch_accuracies = []\n", 57 | "    for b in range(test_data_gen.n_batches):\n", 58 | "        x_test, y_test = test_data_gen[b]\n", 59 | "        batch_metrics = model.evaluate(\n", 60 | "            x=x_test,\n", 61 | "            y=y_test,\n", 62 | "            batch_size=settings['batch_size'],  # the function argument, not the global 'params'\n", 63 | "            verbose=0)\n", 64 | "        batch_accuracies.append(100. * batch_metrics[1])\n", 65 | "    # ... and let's skip the first batch (when RNN is not warmed up)\n", 66 | "    test_accuracy = np.mean(batch_accuracies[1:])\n", 67 | "\n", 68 | "    return test_accuracy\n" 69 | ] 70 | }, 71 | { 72 | "cell_type": "code", 73 | "execution_count": 3, 74 | "metadata": {}, 75 | "outputs": [ 76 | { 77 | "name": "stdout", 78 | "output_type": "stream", 79 | "text": [ 80 | "Epoch 1/4\n", 81 | " - 7s - loss: 0.6507 - acc: 0.5941\n", 82 | "Epoch 2/4\n", 83 | " - 6s - loss: 0.3372 - acc: 0.8691\n", 84 | "Epoch 3/4\n", 85 | " - 6s - loss: 0.1511 - acc: 0.9244\n", 86 | "Epoch 4/4\n", 87 | " - 6s - loss: 0.1228 - acc: 0.9256\n", 88 | "acc = 100.00%.\n" 89 | ] 90 | } 91 | ], 92 | "source": [ 93 | "# experiment settings\n", 94 | "params = {\n", 95 | "    \"series_length\": 20000,\n", 96 | "    \"echo_step\": 3,\n", 97 | "    \"truncated_length\": 20,\n", 98 | "    \"batch_size\": 5,\n", 99 | "    \"h_units\": 4,\n", 100 | "    \"max_epochs\": 4\n", 101 | "}\n", 102 | "\n", 103 | "acc = echo_experiment(params)\n", 104 | "print('acc = {:.2f}%.'.format(acc))\n" 105 | ] 106 | } 107 | ], 108 | "metadata": { 109 | "kernelspec": { 110 | "display_name": "Python 3", 111 | "language": "python", 112 | "name": "python3" 113 | }, 114 | "language_info": { 115 | "codemirror_mode": { 116 | "name": "ipython", 117 | "version": 3 118 | }, 119 | "file_extension": ".py", 120 | "mimetype": "text/x-python", 121 | "name": "python", 122 | "nbconvert_exporter": "python", 123 | "pygments_lexer": "ipython3", 124 | "version": "3.6.3" 125 | } 126 | }, 127 | "nbformat": 4, 128 | "nbformat_minor": 1 129 | } 130 | -------------------------------------------------------------------------------- /raw/keras-sequences/sequential_tasks.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | from tensorflow.python.keras.utils import Sequence, to_categorical 3 | from tensorflow.python.keras.preprocessing.sequence import pad_sequences 4 | 5 | 6 | class EchoData(Sequence): 7 | 8 |     def __init__(self, series_length=40000, batch_size=32, 
9 | echo_step=3, truncated_length=10, seed=None): 10 | 11 | self.series_length = series_length 12 | self.truncated_length = truncated_length 13 | self.n_batches = series_length//truncated_length 14 | 15 | self.echo_step = echo_step 16 | self.batch_size = batch_size 17 | if seed is not None: 18 | np.random.seed(seed) 19 | self.raw_x = None 20 | self.raw_y = None 21 | self.x_batches = [] 22 | self.y_batches = [] 23 | self.generate_new_series() 24 | self.prepare_batches() 25 | 26 | def __getitem__(self, index): 27 | if index == 0: 28 | self.generate_new_series() 29 | self.prepare_batches() 30 | return self.x_batches[index], self.y_batches[index] 31 | 32 | def __len__(self): 33 | return self.n_batches 34 | 35 | def generate_new_series(self): 36 | x = np.random.choice( 37 | 2, 38 | size=(self.batch_size, self.series_length), 39 | p=[0.5, 0.5]) 40 | y = np.roll(x, self.echo_step, axis=1) 41 | y[:, 0:self.echo_step] = 0 42 | self.raw_x = x 43 | self.raw_y = y 44 | 45 | def prepare_batches(self): 46 | x = np.expand_dims(self.raw_x, axis=-1) 47 | y = np.expand_dims(self.raw_y, axis=-1) 48 | self.x_batches = np.split(x, self.n_batches, axis=1) 49 | self.y_batches = np.split(y, self.n_batches, axis=1) 50 | 51 | 52 | class TemporalOrderExp6aSequence(Sequence): 53 | """ 54 | From Hochreiter&Schmidhuber(1997): 55 | 56 | The goal is to classify sequences. Elements and targets are represented locally 57 | (input vectors with only one non-zero bit). The sequence starts with an E, ends 58 | with a B (the "trigger symbol") and otherwise consists of randomly chosen symbols 59 | from the set {a, b, c, d} except for two elements at positions t1 and t2 that are 60 | either X or Y . The sequence length is randomly chosen between 100 and 110, t1 is 61 | randomly chosen between 10 and 20, and t2 is randomly chosen between 50 and 60. 62 | There are 4 sequence classes Q, R, S, U which depend on the temporal order of X and Y. 63 | The rules are: 64 | X, X -> Q, 65 | X, Y -> R, 66 | Y , X -> S, 67 | Y , Y -> U. 
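        A worked example from the generator below: in the sequence BbXcXcbE the
        two relevant symbols are X and X, so the class is Q. Note that in this
        implementation the start symbol is 'B' and the end symbol is 'E', i.e.
        swapped with respect to the quoted description above.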
68 | 69 | """ 70 | 71 | def __init__(self, length_range=(100, 111), t1_range=(10, 21), t2_range=(50, 61), 72 | batch_size=32, seed=None): 73 | 74 | self.classes = ['Q', 'R', 'S', 'U'] 75 | self.n_classes = len(self.classes) 76 | 77 | self.relevant_symbols = ['X', 'Y'] 78 | self.distraction_symbols = ['a', 'b', 'c', 'd'] 79 | self.start_symbol = 'B' 80 | self.end_symbol = 'E' 81 | 82 | self.length_range = length_range 83 | self.t1_range = t1_range 84 | self.t2_range = t2_range 85 | self.batch_size = batch_size 86 | 87 | if seed is not None: 88 | np.random.seed(seed) 89 | 90 | all_symbols = self.relevant_symbols + self.distraction_symbols + \ 91 | [self.start_symbol] + [self.end_symbol] 92 | self.n_symbols = len(all_symbols) 93 | self.s_to_idx = {s: n for n, s in enumerate(all_symbols)} 94 | self.idx_to_s = {n: s for n, s in enumerate(all_symbols)} 95 | 96 | self.c_to_idx = {c: n for n, c in enumerate(self.classes)} 97 | self.idx_to_c = {n: c for n, c in enumerate(self.classes)} 98 | 99 | def generate_pair(self): 100 | length = np.random.randint(self.length_range[0], self.length_range[1]) 101 | t1 = np.random.randint(self.t1_range[0], self.t1_range[1]) 102 | t2 = np.random.randint(self.t2_range[0], self.t2_range[1]) 103 | 104 | x = np.random.choice(self.distraction_symbols, length) 105 | x[0] = self.start_symbol 106 | x[-1] = self.end_symbol 107 | 108 | y = np.random.choice(self.classes) 109 | if y == 'Q': 110 | x[t1], x[t2] = self.relevant_symbols[0], self.relevant_symbols[0] 111 | elif y == 'R': 112 | x[t1], x[t2] = self.relevant_symbols[0], self.relevant_symbols[1] 113 | elif y == 'S': 114 | x[t1], x[t2] = self.relevant_symbols[1], self.relevant_symbols[0] 115 | else: 116 | x[t1], x[t2] = self.relevant_symbols[1], self.relevant_symbols[1] 117 | 118 | return ''.join(x), y 119 | 120 | # encoding/decoding single instance version 121 | 122 | def encode_x(self, x): 123 | idx_x = [self.s_to_idx[s] for s in x] 124 | return to_categorical(idx_x, num_classes=self.n_symbols) 125 | 126 | def encode_y(self, y): 127 | idx_y = self.c_to_idx[y] 128 | return to_categorical(idx_y, num_classes=self.n_classes) 129 | 130 | def decode_x(self, x): 131 | x = x[np.sum(x, axis=1) > 0] # remove padding 132 | return ''.join([self.idx_to_s[pos] for pos in np.argmax(x, axis=1)]) 133 | 134 | def decode_y(self, y): 135 | return self.idx_to_c[np.argmax(y)] 136 | 137 | # encoding/decoding batch versions 138 | 139 | def encode_x_batch(self, x_batch): 140 | return pad_sequences([self.encode_x(x) for x in x_batch], 141 | maxlen=self.length_range[1]) 142 | 143 | def encode_y_batch(self, y_batch): 144 | return np.array([self.encode_y(y) for y in y_batch]) 145 | 146 | def decode_x_batch(self, x_batch): 147 | return [self.decode_x(x) for x in x_batch] 148 | 149 | def decode_y_batch(self, y_batch): 150 | return [self.idx_to_c[pos] for pos in np.argmax(y_batch, axis=1)] 151 | 152 | def __len__(self): 153 | """ Let's assume 1000 sequences as the size of data. """ 154 | return int(1000. 
/ self.batch_size) 155 | 156 | def __getitem__(self, index): 157 | batch_x, batch_y = [], [] 158 | for _ in range(self.batch_size): 159 | x, y = self.generate_pair() 160 | batch_x.append(x) 161 | batch_y.append(y) 162 | return self.encode_x_batch(batch_x), self.encode_y_batch(batch_y) 163 | 164 | class DifficultyLevel: 165 | """ On HARD, settings are identical to the original settings from the '97 paper.""" 166 | EASY, NORMAL, MODERATE, HARD, NIGHTMARE = range(5) 167 | 168 | @staticmethod 169 | def get_predefined_generator(difficulty_level, batch_size=32, seed=8382): 170 | EASY = TemporalOrderExp6aSequence.DifficultyLevel.EASY 171 | NORMAL = TemporalOrderExp6aSequence.DifficultyLevel.NORMAL 172 | MODERATE = TemporalOrderExp6aSequence.DifficultyLevel.MODERATE 173 | HARD = TemporalOrderExp6aSequence.DifficultyLevel.HARD 174 | 175 | if difficulty_level == EASY: 176 | length_range = (7, 9) 177 | t1_range = (1, 3) 178 | t2_range = (4, 6) 179 | elif difficulty_level == NORMAL: 180 | length_range = (30, 41) 181 | t1_range = (2, 6) 182 | t2_range = (20, 28) 183 | elif difficulty_level == MODERATE: 184 | length_range = (60, 81) 185 | t1_range = (10, 21) 186 | t2_range = (45, 55) 187 | elif difficulty_level == HARD: 188 | length_range = (100, 111) 189 | t1_range = (10, 21) 190 | t2_range = (50, 61) 191 | else: 192 | length_range = (300, 501) 193 | t1_range = (10, 81) 194 | t2_range = (250, 291) 195 | return TemporalOrderExp6aSequence(length_range, t1_range, t2_range, 196 | batch_size, seed) 197 | -------------------------------------------------------------------------------- /sequential_tasks.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | from tensorflow.python.keras.utils import Sequence, to_categorical 3 | from tensorflow.python.keras.preprocessing.sequence import pad_sequences 4 | 5 | 6 | class EchoData(Sequence): 7 | 8 | def __init__(self, series_length=40000, batch_size=32, 9 | echo_step=3, truncated_length=10, seed=None): 10 | 11 | self.series_length = series_length 12 | self.truncated_length = truncated_length 13 | self.n_batches = series_length//truncated_length 14 | 15 | self.echo_step = echo_step 16 | self.batch_size = batch_size 17 | if seed is not None: 18 | np.random.seed(seed) 19 | self.raw_x = None 20 | self.raw_y = None 21 | self.x_batches = [] 22 | self.y_batches = [] 23 | self.generate_new_series() 24 | self.prepare_batches() 25 | 26 | def __getitem__(self, index): 27 | if index == 0: 28 | self.generate_new_series() 29 | self.prepare_batches() 30 | return self.x_batches[index], self.y_batches[index] 31 | 32 | def __len__(self): 33 | return self.n_batches 34 | 35 | def generate_new_series(self): 36 | x = np.random.choice( 37 | 2, 38 | size=(self.batch_size, self.series_length), 39 | p=[0.5, 0.5]) 40 | y = np.roll(x, self.echo_step, axis=1) 41 | y[:, 0:self.echo_step] = 0 42 | self.raw_x = x 43 | self.raw_y = y 44 | 45 | def prepare_batches(self): 46 | x = np.expand_dims(self.raw_x, axis=-1) 47 | y = np.expand_dims(self.raw_y, axis=-1) 48 | self.x_batches = np.split(x, self.n_batches, axis=1) 49 | self.y_batches = np.split(y, self.n_batches, axis=1) 50 | 51 | 52 | class TemporalOrderExp6aSequence(Sequence): 53 | """ 54 | From Hochreiter&Schmidhuber(1997): 55 | 56 | The goal is to classify sequences. Elements and targets are represented locally 57 | (input vectors with only one non-zero bit). 
The sequence starts with an E, ends 58 | with a B (the "trigger symbol") and otherwise consists of randomly chosen symbols 59 | from the set {a, b, c, d} except for two elements at positions t1 and t2 that are 60 | either X or Y . The sequence length is randomly chosen between 100 and 110, t1 is 61 | randomly chosen between 10 and 20, and t2 is randomly chosen between 50 and 60. 62 | There are 4 sequence classes Q, R, S, U which depend on the temporal order of X and Y. 63 | The rules are: 64 | X, X -> Q, 65 | X, Y -> R, 66 | Y , X -> S, 67 | Y , Y -> U. 68 | 69 | """ 70 | 71 | def __init__(self, length_range=(100, 111), t1_range=(10, 21), t2_range=(50, 61), 72 | batch_size=32, seed=None): 73 | 74 | self.classes = ['Q', 'R', 'S', 'U'] 75 | self.n_classes = len(self.classes) 76 | 77 | self.relevant_symbols = ['X', 'Y'] 78 | self.distraction_symbols = ['a', 'b', 'c', 'd'] 79 | self.start_symbol = 'B' 80 | self.end_symbol = 'E' 81 | 82 | self.length_range = length_range 83 | self.t1_range = t1_range 84 | self.t2_range = t2_range 85 | self.batch_size = batch_size 86 | 87 | if seed is not None: 88 | np.random.seed(seed) 89 | 90 | all_symbols = self.relevant_symbols + self.distraction_symbols + \ 91 | [self.start_symbol] + [self.end_symbol] 92 | self.n_symbols = len(all_symbols) 93 | self.s_to_idx = {s: n for n, s in enumerate(all_symbols)} 94 | self.idx_to_s = {n: s for n, s in enumerate(all_symbols)} 95 | 96 | self.c_to_idx = {c: n for n, c in enumerate(self.classes)} 97 | self.idx_to_c = {n: c for n, c in enumerate(self.classes)} 98 | 99 | def generate_pair(self): 100 | length = np.random.randint(self.length_range[0], self.length_range[1]) 101 | t1 = np.random.randint(self.t1_range[0], self.t1_range[1]) 102 | t2 = np.random.randint(self.t2_range[0], self.t2_range[1]) 103 | 104 | x = np.random.choice(self.distraction_symbols, length) 105 | x[0] = self.start_symbol 106 | x[-1] = self.end_symbol 107 | 108 | y = np.random.choice(self.classes) 109 | if y == 'Q': 110 | x[t1], x[t2] = self.relevant_symbols[0], self.relevant_symbols[0] 111 | elif y == 'R': 112 | x[t1], x[t2] = self.relevant_symbols[0], self.relevant_symbols[1] 113 | elif y == 'S': 114 | x[t1], x[t2] = self.relevant_symbols[1], self.relevant_symbols[0] 115 | else: 116 | x[t1], x[t2] = self.relevant_symbols[1], self.relevant_symbols[1] 117 | 118 | return ''.join(x), y 119 | 120 | # encoding/decoding single instance version 121 | 122 | def encode_x(self, x): 123 | idx_x = [self.s_to_idx[s] for s in x] 124 | return to_categorical(idx_x, num_classes=self.n_symbols) 125 | 126 | def encode_y(self, y): 127 | idx_y = self.c_to_idx[y] 128 | return to_categorical(idx_y, num_classes=self.n_classes) 129 | 130 | def decode_x(self, x): 131 | x = x[np.sum(x, axis=1) > 0] # remove padding 132 | return ''.join([self.idx_to_s[pos] for pos in np.argmax(x, axis=1)]) 133 | 134 | def decode_y(self, y): 135 | return self.idx_to_c[np.argmax(y)] 136 | 137 | # encoding/decoding batch versions 138 | 139 | def encode_x_batch(self, x_batch): 140 | return pad_sequences([self.encode_x(x) for x in x_batch], 141 | maxlen=self.length_range[1]) 142 | 143 | def encode_y_batch(self, y_batch): 144 | return np.array([self.encode_y(y) for y in y_batch]) 145 | 146 | def decode_x_batch(self, x_batch): 147 | return [self.decode_x(x) for x in x_batch] 148 | 149 | def decode_y_batch(self, y_batch): 150 | return [self.idx_to_c[pos] for pos in np.argmax(y_batch, axis=1)] 151 | 152 | def __len__(self): 153 | """ Let's assume 1000 sequences as the size of data. 
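        With the default batch_size of 32 this yields int(1000 / 32) = 31 batches per epoch.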
""" 154 | return int(1000. / self.batch_size) 155 | 156 | def __getitem__(self, index): 157 | batch_x, batch_y = [], [] 158 | for _ in range(self.batch_size): 159 | x, y = self.generate_pair() 160 | batch_x.append(x) 161 | batch_y.append(y) 162 | return self.encode_x_batch(batch_x), self.encode_y_batch(batch_y) 163 | 164 | class DifficultyLevel: 165 | """ On HARD, settings are identical to the original settings from the '97 paper.""" 166 | EASY, NORMAL, MODERATE, HARD, NIGHTMARE = range(5) 167 | 168 | @staticmethod 169 | def get_predefined_generator(difficulty_level, batch_size=32, seed=8382): 170 | EASY = TemporalOrderExp6aSequence.DifficultyLevel.EASY 171 | NORMAL = TemporalOrderExp6aSequence.DifficultyLevel.NORMAL 172 | MODERATE = TemporalOrderExp6aSequence.DifficultyLevel.MODERATE 173 | HARD = TemporalOrderExp6aSequence.DifficultyLevel.HARD 174 | 175 | if difficulty_level == EASY: 176 | length_range = (7, 9) 177 | t1_range = (1, 3) 178 | t2_range = (4, 6) 179 | elif difficulty_level == NORMAL: 180 | length_range = (30, 41) 181 | t1_range = (2, 6) 182 | t2_range = (20, 28) 183 | elif difficulty_level == MODERATE: 184 | length_range = (60, 81) 185 | t1_range = (10, 21) 186 | t2_range = (45, 55) 187 | elif difficulty_level == HARD: 188 | length_range = (100, 111) 189 | t1_range = (10, 21) 190 | t2_range = (50, 61) 191 | else: 192 | length_range = (300, 501) 193 | t1_range = (10, 81) 194 | t2_range = (250, 291) 195 | return TemporalOrderExp6aSequence(length_range, t1_range, t2_range, 196 | batch_size, seed) 197 | -------------------------------------------------------------------------------- /slides/01 - ML and spiral classification.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/SudalaiRajkumar/PyTorch-Deep-Learning-Minicourse/d2b0970935ec19cb4526ceb4ec028b94a3958c25/slides/01 - ML and spiral classification.pdf -------------------------------------------------------------------------------- /slides/02 - CNN.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/SudalaiRajkumar/PyTorch-Deep-Learning-Minicourse/d2b0970935ec19cb4526ceb4ec028b94a3958c25/slides/02 - CNN.pdf -------------------------------------------------------------------------------- /slides/03 - Generative models.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/SudalaiRajkumar/PyTorch-Deep-Learning-Minicourse/d2b0970935ec19cb4526ceb4ec028b94a3958c25/slides/03 - Generative models.pdf -------------------------------------------------------------------------------- /slides/04 - RNN.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/SudalaiRajkumar/PyTorch-Deep-Learning-Minicourse/d2b0970935ec19cb4526ceb4ec028b94a3958c25/slides/04 - RNN.pdf --------------------------------------------------------------------------------