├── .gitignore ├── imgs ├── img1.png ├── img2.png └── cifar10.png ├── server.py ├── README.md ├── 3-omt ├── README.md └── pytorch-fgu-omt.ipynb ├── 2-Intermediate ├── 2.2-Pretrained-ResNet-Imagenet.ipynb ├── 2.4-Finetuning-Hymenoptera.ipynb ├── 2.3-TransferLearning-MNIST.ipynb ├── 2.5-CharRNN.ipynb └── 2.1-Convolutional-Neural-Networks.ipynb └── 1-Basics ├── 1.4-FizzBuzz.ipynb └── 1.2-Linear-Regression.ipynb /.gitignore: -------------------------------------------------------------------------------- 1 | *.ipynb_checkpoints 2 | *.html 3 | *.pdf 4 | .DS_Store 5 | *.pth 6 | data/ 7 | -------------------------------------------------------------------------------- /imgs/img1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/leriomaggio/PyCon9-Pytorch-from-Ground-Up/master/imgs/img1.png -------------------------------------------------------------------------------- /imgs/img2.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/leriomaggio/PyCon9-Pytorch-from-Ground-Up/master/imgs/img2.png -------------------------------------------------------------------------------- /imgs/cifar10.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/leriomaggio/PyCon9-Pytorch-from-Ground-Up/master/imgs/cifar10.png -------------------------------------------------------------------------------- /server.py: -------------------------------------------------------------------------------- 1 | #!/bin/env python 2 | 3 | import SocketServer 4 | import BaseHTTPServer 5 | import SimpleHTTPServer 6 | 7 | class ThreadingSimpleServer(SocketServer.ThreadingMixIn, 8 | BaseHTTPServer.HTTPServer): 9 | pass 10 | 11 | import sys 12 | 13 | if sys.argv[1:]: 14 | port = int(sys.argv[1]) 15 | else: 16 | port = 8000 17 | 18 | server = ThreadingSimpleServer(('', port), 
SimpleHTTPServer.SimpleHTTPRequestHandler) 19 | try: 20 | while 1: 21 | sys.stdout.flush() 22 | server.handle_request() 23 | except KeyboardInterrupt: 24 | print "Finished" -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # PyCon9: PyTorch from Ground Up 2 | 3 | ## Instructions: 4 | 5 | ### Part 0 - Introduction Presentation 6 | 7 | #### How to launch 8 | 9 | $ cd 0-pytorch-fgu 10 | $ source launch_me.sh 11 | 12 | ### Part 1-2 - Hands On 13 | 14 | #### How to launch 15 | 16 | - move to the root folder 17 | - run: 18 | $ jupyter notebook 19 | 20 | 21 | ### Notes 22 | 23 | we make use of [rise](https://github.com/damianavila/RISE) to make a presentation out of a notebook, so you should install it 24 | 25 | $ conda install -c damianavila82 rise 26 | 27 | ### TODO 28 | 29 | - @lantiga: 30 | - check and fix part-0 presentation 31 | - add notes on the notebooks for white board interventions 32 | - eventually add notes for other fixes 33 | - check 3-omt part (pytorch future roadmap) and remove it 34 | - @dnlcrl: 35 | - write 1.5-MNIST-intro.ipynb: 36 | - introduction of mnist dataset 37 | - approaching mnist with linear and/or logistic regression 38 | - write 2.1-Convolutional-Neural-Networks.ipynb: 39 | - introducing convolutional layers 40 | - approaching mnist with a simple convnet 41 | - fix remaining notebooks -------------------------------------------------------------------------------- /3-omt/README.md: -------------------------------------------------------------------------------- 1 | # pytorch from ground up 2 | This is the repository for **PyTorch From Ground Up** tutorial and [**Pycon 9**](https://pycon.it) talk slides. It features Jupyter Notebook Slides html file, and training notebooks 3 | 4 | ## Knowledge Prerequisites 5 | This tutorial assumes familiarity with Python and Numpy. 
6 | 7 | ## Tutorial Prerequisites 8 | Python 3 is required to run this tutorial. You will also need some libraries from the SciPy ecosystem (NumPy, Matplotlib, Pandas), Jupyter Notebook support, and PyTorch 0.3.0 or newer. 9 | 10 | The simplest way to maintain Python with all these libraries, as well as many others, is to install [Anaconda](https://www.anaconda.com/download). You can find PyTorch installation instructions on the [PyTorch page](http://pytorch.org). 11 | 12 | CUDA availability is NOT required, but still, life is short -- use a GPU! 13 | 14 | ## How to Use 15 | Our tutorial repository has no submodules. To download the tutorial, use 16 | 17 | ``` 18 | git clone TODO 19 | ``` 20 | 21 | To run the Jupyter Notebook Slides as in the Jupyter Day Atlanta 2018 talk, you can use the following command: 22 | 23 | ``` 24 | jupyter nbconvert pytorch-fgu.ipynb --to slides --post serve 25 | ``` 26 | 27 | ## Table of Contents 28 | TODO -------------------------------------------------------------------------------- /3-omt/pytorch-fgu-omt.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": { 6 | "slideshow": { 7 | "slide_type": "slide" 8 | } 9 | }, 10 | "source": [ 11 | "#
PyTorch From the Ground Up - One More Thing\n", 12 | "\n", 13 | "\n", 14 | "\n", 15 | "\n", 16 | "\n", 17 | "
[L. Antiga](http://twitter.com/lantiga), [D. Ciriello](http://twitter.com/dnlcrl) and [A. Paszke](http://twitter.com/apaszke)\n", 18 | "\n", 19 | "
[PyCon Nove (2018)](https://pycon.it/)." 20 | ] 21 | }, 22 | { 23 | "cell_type": "markdown", 24 | "metadata": { 25 | "slideshow": { 26 | "slide_type": "slide" 27 | } 28 | }, 29 | "source": [ 30 | "# Forward and Backward Function Hooks\n", 31 | "\n", 32 | "- how about inspecting / modifying the output and grad_output of a layer?\n", 33 | "- \"We introduce hooks for this purpose.\" \n", 34 | "- You can register a function on a Module or a Variable\n" 35 | ] 36 | }, 37 | { 38 | "cell_type": "code", 39 | "execution_count": null, 40 | "metadata": { 41 | "collapsed": true, 42 | "slideshow": { 43 | "slide_type": "slide" 44 | } 45 | }, 46 | "outputs": [], 47 | "source": [ 48 | "def printnorm(self, input, output):\n", 49 | " # input is a tuple of packed inputs\n", 50 | " # output is a Variable. output.data is the Tensor we are interested\n", 51 | " print('Inside ' + self.__class__.__name__ + ' forward')\n", 52 | " print('')\n", 53 | " print('input: ', type(input))\n", 54 | " print('input[0]: ', type(input[0]))\n", 55 | " print('output: ', type(output))\n", 56 | " print('')\n", 57 | " print('input size:', input[0].size())\n", 58 | " print('output size:', output.data.size())\n", 59 | " print('output norm:', output.data.norm())\n", 60 | "\n", 61 | "\n", 62 | "net.conv2.register_forward_hook(printnorm)\n", 63 | "\n", 64 | "out = net(input)" 65 | ] 66 | }, 67 | { 68 | "cell_type": "code", 69 | "execution_count": null, 70 | "metadata": { 71 | "collapsed": true, 72 | "slideshow": { 73 | "slide_type": "slide" 74 | } 75 | }, 76 | "outputs": [], 77 | "source": [ 78 | "def printgradnorm(self, grad_input, grad_output):\n", 79 | " print('Inside ' + self.__class__.__name__ + ' backward')\n", 80 | " print('Inside class:' + self.__class__.__name__)\n", 81 | " print('')\n", 82 | " print('grad_input: ', type(grad_input))\n", 83 | " print('grad_input[0]: ', type(grad_input[0]))\n", 84 | " print('grad_output: ', type(grad_output))\n", 85 | " print('grad_output[0]: ', type(grad_output[0]))\n", 
86 | " print('')\n", 87 | " print('grad_input size:', grad_input[0].size())\n", 88 | " print('grad_output size:', grad_output[0].size())\n", 89 | " print('grad_input norm:', grad_input[0].data.norm())\n", 90 | "\n", 91 | "\n", 92 | "net.conv2.register_backward_hook(printgradnorm)\n", 93 | "\n", 94 | "out = net(input)\n", 95 | "err = loss_fn(out, target)\n", 96 | "err.backward()" 97 | ] 98 | }, 99 | { 100 | "cell_type": "markdown", 101 | "metadata": { 102 | "slideshow": { 103 | "slide_type": "slide" 104 | } 105 | }, 106 | "source": [ 107 | "# Nightly Builds!\n", 108 | "\n", 109 | "- recently added\n", 110 | "- install the last **unstable** master version of PyTorch\n", 111 | "- use all just implemented layers" 112 | ] 113 | }, 114 | { 115 | "cell_type": "markdown", 116 | "metadata": { 117 | "slideshow": { 118 | "slide_type": "slide" 119 | } 120 | }, 121 | "source": [ 122 | "# Thank You!" 123 | ] 124 | } 125 | ], 126 | "metadata": { 127 | "celltoolbar": "Slideshow", 128 | "kernelspec": { 129 | "display_name": "Python 3", 130 | "language": "python", 131 | "name": "python3" 132 | }, 133 | "language_info": { 134 | "codemirror_mode": { 135 | "name": "ipython", 136 | "version": 3 137 | }, 138 | "file_extension": ".py", 139 | "mimetype": "text/x-python", 140 | "name": "python", 141 | "nbconvert_exporter": "python", 142 | "pygments_lexer": "ipython3", 143 | "version": "3.6.3" 144 | } 145 | }, 146 | "nbformat": 4, 147 | "nbformat_minor": 2 148 | } 149 | -------------------------------------------------------------------------------- /2-Intermediate/2.2-Pretrained-ResNet-Imagenet.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": { 6 | "slideshow": { 7 | "slide_type": "slide" 8 | } 9 | }, 10 | "source": [ 11 | "# Classifying images with a pretrained model" 12 | ] 13 | }, 14 | { 15 | "cell_type": "markdown", 16 | "metadata": { 17 | "slideshow": { 18 | "slide_type": 
"notes" 19 | } 20 | }, 21 | "source": [ 22 | "In this section we will learn how to use a pretrained model, to perform predictions without changing its parameters, specifically we will use a pretrained residual network model having 152 residual blocks.\n", 23 | "\n", 24 | "Residual models are a generation of convnets proposed in 2016 which obtained the best results for the ILSRVC (originally regarding classifcation and localization on ImageNet dataset) competition for that year, since then a lot of its variations are getting proposed, you can see a full list of available pretrained models on the pytorch documentation website, along with the performace obtained by the model on the ImageNet test set.\n", 25 | "\n", 26 | "The power of those residual model is given by the residual paths, which consists of a residual block's input value replicated and concatenated to its output, in a way that the model keeps a sort of track of what the orinal data contains. This fact permits the model to leaern better using the same number of parameters. 
More information about residual models can be found here:\n", 27 | "\n", 28 | "https://arxiv.org/abs/1512.03385" 29 | ] 30 | }, 31 | { 32 | "cell_type": "markdown", 33 | "metadata": { 34 | "slideshow": { 35 | "slide_type": "notes" 36 | } 37 | }, 38 | "source": [ 39 | "Let's import the packages we need" 40 | ] 41 | }, 42 | { 43 | "cell_type": "code", 44 | "execution_count": null, 45 | "metadata": { 46 | "slideshow": { 47 | "slide_type": "slide" 48 | } 49 | }, 50 | "outputs": [], 51 | "source": [ 52 | "import torch\n", 53 | "from torch.utils import data\n", 54 | "\n", 55 | "import numpy as np\n", 56 | "\n", 57 | "from torchvision.datasets.mnist import FashionMNIST\n", 58 | "from torchvision.models.resnet import resnet152\n", 59 | "from torchvision import transforms, utils\n", 60 | "\n", 61 | "from torch import nn\n", 62 | "from torch import optim\n", 63 | "from torch.autograd import Variable\n", 64 | "from torch.nn import functional as F\n", 65 | "\n", 66 | "from PIL import Image\n", 67 | "\n", 68 | "import matplotlib.pyplot as plt\n", 69 | "%matplotlib inline" 70 | ] 71 | }, 72 | { 73 | "cell_type": "markdown", 74 | "metadata": { 75 | "slideshow": { 76 | "slide_type": "notes" 77 | } 78 | }, 79 | "source": [ 80 | "To get a pretrained ResNet, it is sufficient to pass **True** to the **pretrained** parameter. It's VERY important to call **.eval()** on our model before using it for prediction, because otherwise we will obtain strange results. That's because some layers inside our model (such as batch normalization) behave differently in training and evaluation mode. Thanks to how the nn.Module class works, calling **.eval()** propagates the call to all of the model's submodules." 
81 | ] 82 | }, 83 | { 84 | "cell_type": "code", 85 | "execution_count": null, 86 | "metadata": { 87 | "slideshow": { 88 | "slide_type": "slide" 89 | } 90 | }, 91 | "outputs": [], 92 | "source": [ 93 | "model = resnet152(pretrained=True).eval()" 94 | ] 95 | }, 96 | { 97 | "cell_type": "markdown", 98 | "metadata": { 99 | "slideshow": { 100 | "slide_type": "notes" 101 | } 102 | }, 103 | "source": [ 104 | "This model has been trained on ImageNet dataset, which consists of high-res images for 1000 classes, so if we want to predict an image with the resnet model we need to map the class having the highest log probability, to the label name ('cat', 'dog', etc), so we just have to download the following json file and load it.\n", 105 | "\n", 106 | "json imagenet classes, https://s3.amazonaws.com/deep-learning-models/image-models/imagenet_class_index.json" 107 | ] 108 | }, 109 | { 110 | "cell_type": "code", 111 | "execution_count": null, 112 | "metadata": { 113 | "slideshow": { 114 | "slide_type": "slide" 115 | } 116 | }, 117 | "outputs": [], 118 | "source": [ 119 | "import json\n", 120 | "class_idx = json.load(open(\"../data/imagenet_class_index.json\"))" 121 | ] 122 | }, 123 | { 124 | "cell_type": "markdown", 125 | "metadata": { 126 | "slideshow": { 127 | "slide_type": "notes" 128 | } 129 | }, 130 | "source": [ 131 | "Let's take an example image" 132 | ] 133 | }, 134 | { 135 | "cell_type": "code", 136 | "execution_count": null, 137 | "metadata": { 138 | "slideshow": { 139 | "slide_type": "slide" 140 | } 141 | }, 142 | "outputs": [], 143 | "source": [ 144 | "img = Image.open('../data/imgs/img.jpg')\n", 145 | "img = img.convert('RGB')" 146 | ] 147 | }, 148 | { 149 | "cell_type": "code", 150 | "execution_count": null, 151 | "metadata": { 152 | "slideshow": { 153 | "slide_type": "slide" 154 | } 155 | }, 156 | "outputs": [], 157 | "source": [ 158 | "plt.imshow(img)" 159 | ] 160 | }, 161 | { 162 | "cell_type": "markdown", 163 | "metadata": { 164 | "slideshow": { 165 | 
"slide_type": "notes" 166 | } 167 | }, 168 | "source": [ 169 | "As you can see this is a picture of a dog of labrador breed, in order to pass the image to the model and get its prediction, we should apply the same preprocessing applied to the training image used for training the model with ImageNet images. Firstly, we need to apply the same normalization process applied to images form the training set, which means we need to subtract the pixels mean and divide by the pixels std, as eplained in the PyTorch documentaiton\n", 170 | "\n", 171 | "http://pytorch.org/docs/master/torchvision/models.html\n", 172 | "\n", 173 | "\n", 174 | "All pre-trained models expect input images normalized in the same way, i.e. mini-batches of 3-channel RGB images of shape (3 x H x W), where H and W are expected to be at least 224. The images have to be loaded in to a range of [0, 1] and then normalized using **mean = [0.485, 0.456, 0.406]** and **std = [0.229, 0.224, 0.225]**. \n", 175 | "\n", 176 | "On the same page you can also see other model implementation and their accuracies on ImageNet testset.\n", 177 | "\n", 178 | "Given that we should pass 224 images to the model we add a resize + crop transofrmation before creating the tensors and normalizing the image.\n" 179 | ] 180 | }, 181 | { 182 | "cell_type": "code", 183 | "execution_count": null, 184 | "metadata": { 185 | "slideshow": { 186 | "slide_type": "slide" 187 | } 188 | }, 189 | "outputs": [], 190 | "source": [ 191 | "def preproc(x):\n", 192 | " x = img\n", 193 | " t = transforms.Compose([\n", 194 | " transforms.Resize(256),\n", 195 | " transforms.CenterCrop(224),\n", 196 | " transforms.ToTensor(),\n", 197 | " transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])\n", 198 | " ])\n", 199 | " x = t(x)\n", 200 | " x = torch.Tensor(x).unsqueeze(0)\n", 201 | " x = Variable(x)\n", 202 | " return x\n" 203 | ] 204 | }, 205 | { 206 | "cell_type": "markdown", 207 | "metadata": { 208 | "slideshow": { 209 | 
"slide_type": "notes" 210 | } 211 | }, 212 | "source": [ 213 | "ok, now we have our preprocessing funciton ready and we can pass the image to the model in order to get its preditction for this image, so we pass the preprocessed image to the model, getting the model's log probabilities for the image belonging to each class (1000 classes), than we find the label having the highest probability, map its value to the class name and print it. We could actually have a master class too (dog, labrador) but this is a more simple map which just returns a sigle value." 214 | ] 215 | }, 216 | { 217 | "cell_type": "code", 218 | "execution_count": null, 219 | "metadata": { 220 | "slideshow": { 221 | "slide_type": "slide" 222 | } 223 | }, 224 | "outputs": [], 225 | "source": [ 226 | "output = model(preproc(img))\n", 227 | "m, argm = output.data.squeeze().max(0)\n", 228 | "class_id = argm[0]\n", 229 | "print(output)\n", 230 | "class_idx[str(class_id)]" 231 | ] 232 | }, 233 | { 234 | "cell_type": "markdown", 235 | "metadata": { 236 | "slideshow": { 237 | "slide_type": "notes" 238 | } 239 | }, 240 | "source": [ 241 | "Let's try with another image:" 242 | ] 243 | }, 244 | { 245 | "cell_type": "code", 246 | "execution_count": null, 247 | "metadata": { 248 | "slideshow": { 249 | "slide_type": "slide" 250 | } 251 | }, 252 | "outputs": [], 253 | "source": [ 254 | "img = Image.open('../data/imgs/cat.jpg')\n", 255 | "img = img.convert('RGB')" 256 | ] 257 | }, 258 | { 259 | "cell_type": "code", 260 | "execution_count": null, 261 | "metadata": { 262 | "slideshow": { 263 | "slide_type": "slide" 264 | } 265 | }, 266 | "outputs": [], 267 | "source": [ 268 | "plt.imshow(img)" 269 | ] 270 | }, 271 | { 272 | "cell_type": "code", 273 | "execution_count": null, 274 | "metadata": { 275 | "slideshow": { 276 | "slide_type": "slide" 277 | } 278 | }, 279 | "outputs": [], 280 | "source": [ 281 | "output = model(preproc(img))\n", 282 | "m, argm = output.data.squeeze().max(0)\n", 283 | "class_id = 
argm[0]\n", 284 | "print(output)\n", 285 | "class_idx[str(class_id)]" 286 | ] 287 | }, 288 | { 289 | "cell_type": "markdown", 290 | "metadata": { 291 | "slideshow": { 292 | "slide_type": "notes" 293 | } 294 | }, 295 | "source": [ 296 | "\"A tabby is any domestic cat\" [wikipedia]\n", 297 | "If we had the master class it would also give us the 'cat' label" 298 | ] 299 | } 300 | ], 301 | "metadata": { 302 | "celltoolbar": "Slideshow", 303 | "kernelspec": { 304 | "display_name": "Python 3", 305 | "language": "python", 306 | "name": "python3" 307 | }, 308 | "language_info": { 309 | "codemirror_mode": { 310 | "name": "ipython", 311 | "version": 3 312 | }, 313 | "file_extension": ".py", 314 | "mimetype": "text/x-python", 315 | "name": "python", 316 | "nbconvert_exporter": "python", 317 | "pygments_lexer": "ipython3", 318 | "version": "3.6.3" 319 | } 320 | }, 321 | "nbformat": 4, 322 | "nbformat_minor": 2 323 | } 324 | -------------------------------------------------------------------------------- /2-Intermediate/2.4-Finetuning-Hymenoptera.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": { 6 | "slideshow": { 7 | "slide_type": "slide" 8 | } 9 | }, 10 | "source": [ 11 | "# FineTuning a Pretrained Model to Distinguish Between Bees and Ants" 12 | ] 13 | }, 14 | { 15 | "cell_type": "markdown", 16 | "metadata": { 17 | "slideshow": { 18 | "slide_type": "notes" 19 | } 20 | }, 21 | "source": [ 22 | "In this last but one section we will see how to fine-tune a pretrained model (for real this time).\n", 23 | "To do this we will use a simpler dataset, composed of images belonging to just 2 classes, bees and ants.\n", 24 | "\n", 25 | "Q: What does FineTuning actually consist of?\n", 26 | "\n", 27 | "Finetuning means that we download and use a pretrained model as in the previous section, but instead of just training the newly added last layer, we should 
re-train *all* the model's parameters, but using a very low learning rate. In this way we won't change the parameters much, but we'll let our model slowly adapt them to the new task, hence the term FineTuning.\n", 28 | "\n", 29 | "Let's add our usual imports" 30 | ] 31 | }, 32 | { 33 | "cell_type": "code", 34 | "execution_count": null, 35 | "metadata": { 36 | "slideshow": { 37 | "slide_type": "slide" 38 | } 39 | }, 40 | "outputs": [], 41 | "source": [ 42 | "import torch\n", 43 | "from torch.utils import data\n", 44 | "\n", 45 | "import numpy as np\n", 46 | "from tqdm import tqdm\n", 47 | "\n", 48 | "\n", 49 | "from torchvision.datasets.mnist import FashionMNIST\n", 50 | "from torchvision.models.resnet import resnet18\n", 51 | "from torchvision import transforms, utils, datasets\n", 52 | "\n", 53 | "from torch import nn\n", 54 | "from torch import optim\n", 55 | "from torch.autograd import Variable\n", 56 | "from torch.nn import functional as F\n", 57 | "\n", 58 | "import os\n", 59 | "from PIL import Image\n", 60 | "\n", 61 | "import matplotlib.pyplot as plt\n", 62 | "%matplotlib inline" 63 | ] 64 | }, 65 | { 66 | "cell_type": "markdown", 67 | "metadata": { 68 | "slideshow": { 69 | "slide_type": "notes" 70 | } 71 | }, 72 | "source": [ 73 | "Let's create our dataset objects" 74 | ] 75 | }, 76 | { 77 | "cell_type": "code", 78 | "execution_count": null, 79 | "metadata": { 80 | "slideshow": { 81 | "slide_type": "slide" 82 | } 83 | }, 84 | "outputs": [], 85 | "source": [ 86 | "mean = [0.485, 0.456, 0.406]\n", 87 | "std = [0.229, 0.224, 0.225]\n", 88 | "\n", 89 | "transform_tr = transforms.Compose([\n", 90 | " transforms.RandomResizedCrop(224),\n", 91 | " transforms.RandomHorizontalFlip(),\n", 92 | " transforms.ToTensor(),\n", 93 | " transforms.Normalize(mean, std)\n", 94 | "])\n", 95 | "\n", 96 | "transform_val = transforms.Compose([\n", 97 | " transforms.Resize(256),\n", 98 | " transforms.CenterCrop(256),\n", 99 | " transforms.ToTensor(),\n", 100 | " 
transforms.Normalize(mean, std)\n", 101 | "])\n", 102 | "\n", 103 | "data_dir = '../data/hymenoptera_data'\n" 104 | ] 105 | }, 106 | { 107 | "cell_type": "code", 108 | "execution_count": null, 109 | "metadata": { 110 | "slideshow": { 111 | "slide_type": "slide" 112 | } 113 | }, 114 | "outputs": [], 115 | "source": [ 116 | "image_datasets_tr = datasets.ImageFolder(\n", 117 | " os.path.join(data_dir, 'train'), \n", 118 | " transform_tr)\n", 119 | "image_datasets_val = datasets.ImageFolder(\n", 120 | " os.path.join(data_dir, 'val'), \n", 121 | " transform_val)\n", 122 | "\n", 123 | "dataloader_tr = data.DataLoader(image_datasets_tr, shuffle=True, batch_size=4, num_workers=4)\n", 124 | "dataloader_val = data.DataLoader(image_datasets_val, shuffle=False, batch_size=4, num_workers=4)\n", 125 | "\n", 126 | "dataset_sizes = {'train': len(image_datasets_tr), 'val': len(image_datasets_val)}\n", 127 | "\n", 128 | "class_names = image_datasets_tr.classes" 129 | ] 130 | }, 131 | { 132 | "cell_type": "markdown", 133 | "metadata": { 134 | "slideshow": { 135 | "slide_type": "notes" 136 | } 137 | }, 138 | "source": [ 139 | "Let's show some samples from the dataset along with their labels. For this purpose we will create a simple function" 140 | ] 141 | }, 142 | { 143 | "cell_type": "code", 144 | "execution_count": null, 145 | "metadata": { 146 | "slideshow": { 147 | "slide_type": "slide" 148 | } 149 | }, 150 | "outputs": [], 151 | "source": [ 152 | "def imshow(inp, title=None):\n", 153 | " '''imshow for Tensor'''\n", 154 | " inp = inp.numpy().transpose((1, 2, 0))\n", 155 | " inp = np.array(std) * inp + np.array(mean)\n", 156 | " inp = np.clip(inp, 0, 1)\n", 157 | " plt.figure(figsize=[10,10])\n", 158 | " plt.imshow(inp)\n", 159 | " if title:\n", 160 | " plt.title(title)\n", 161 | " plt.pause(0.001)" 162 | ] 163 | }, 164 | { 165 | "cell_type": "code", 166 | "execution_count": null, 167 | "metadata": { 168 | "slideshow": { 169 | "slide_type": "slide" 170 | } 171 | }, 172 | "outputs": [], 173 | 
"source": [ 174 | "inputs, classes = next(iter(dataloader_tr))\n", 175 | "out = utils.make_grid(inputs)\n", 176 | "imshow(out, title=[class_names[x] for x in classes])" 177 | ] 178 | }, 179 | { 180 | "cell_type": "markdown", 181 | "metadata": { 182 | "slideshow": { 183 | "slide_type": "notes" 184 | } 185 | }, 186 | "source": [ 187 | "Bees, and also Ants.\n", 188 | "\n", 189 | "Let's get our pretrained model, replace the pooling layer and the last fc layer, we will also initialize the newly created fc layer weight data using a normal function having mu=0 and var=0.001" 190 | ] 191 | }, 192 | { 193 | "cell_type": "code", 194 | "execution_count": null, 195 | "metadata": { 196 | "slideshow": { 197 | "slide_type": "slide" 198 | } 199 | }, 200 | "outputs": [], 201 | "source": [ 202 | "model = resnet18(pretrained=True)\n", 203 | "# replace the avgpool (it's a 7x7 pooling expecting Cx7x7 tensors)\n", 204 | "model.avgpool = nn.AdaptiveAvgPool2d((1,1))\n", 205 | "# replace the last layer (it has 1000 out features and we need new weights)\n", 206 | "model.fc = nn.Linear(model.fc.in_features, 2)\n", 207 | "model.fc.weight.data.normal_(0.0, 0.001)" 208 | ] 209 | }, 210 | { 211 | "cell_type": "markdown", 212 | "metadata": { 213 | "slideshow": { 214 | "slide_type": "notes" 215 | } 216 | }, 217 | "source": [ 218 | "we use a CrossEntropyLoss and SGD algorithm for classification as usual for classification problems. 
This time we will also make use of a *Learning Rate Scheduler*, which lets us update the learning rate following a certain rule. In this case we will use the simple StepLR scheduler, which accepts a **step_size** parameter, the number of scheduler steps (typically epochs) between each LR update, and a **gamma** parameter, which gets multiplied with our LR to give the new LR.\n", 219 | "\n", 220 | "In brief, StepLR does:\n", 221 | "\n", 222 | " every step_size steps:\n", 223 | " LR *= gamma" 224 | ] 225 | }, 226 | { 227 | "cell_type": "code", 228 | "execution_count": null, 229 | "metadata": { 230 | "slideshow": { 231 | "slide_type": "slide" 232 | } 233 | }, 234 | "outputs": [], 235 | "source": [ 236 | "loss = nn.CrossEntropyLoss()\n", 237 | "optimizer = optim.SGD(model.parameters(), lr=0.0001, momentum=0.9)\n", 238 | "scheduler = optim.lr_scheduler.StepLR(\n", 239 | " optimizer, \n", 240 | " step_size=7, \n", 241 | " gamma=0.1)" 242 | ] 243 | }, 244 | { 245 | "cell_type": "markdown", 246 | "metadata": { 247 | "slideshow": { 248 | "slide_type": "notes" 249 | } 250 | }, 251 | "source": [ 252 | "Let's get our training started. As we don't want to wait and want to go drink some beer, we will train the model for just 1 epoch and then evaluate it on the validation dataset" 253 | ] 254 | }, 255 | { 256 | "cell_type": "code", 257 | "execution_count": null, 258 | "metadata": { 259 | "slideshow": { 260 | "slide_type": "slide" 261 | } 262 | }, 263 | "outputs": [], 264 | "source": [ 265 | "for epoch in range(1):\n", 266 | "\n", 267 | " for i, (x, y) in enumerate(dataloader_tr):\n", 268 | " x, y = Variable(x), Variable(y)\n", 269 | " l = loss(model(x), y)\n", 270 | "\n", 271 | " optimizer.zero_grad()\n", 272 | " l.backward()\n", 273 | " optimizer.step()\n", 274 | " \n", 275 | " print('Epoch: {}, batch_idx: {}/{}, loss: {}'.format(\n", 276 | " epoch, i, len(dataloader_tr)-1, l.data.numpy()[0]))" 277 | ] 278 | }, 279 | { 280 | "cell_type": "markdown", 281 | 
"metadata": { 282 | "slideshow": { 283 | "slide_type": "notes" 284 | } 285 | }, 286 | "source": [ 287 | "Q: Why this time the pretrained model is training faster??\n", 288 | "\n", 289 | "because the dataset is much more small, in terms of number of samples, even if the images are much larger, we roughly have almost a 10% of its number of samples, thus the faster epochs" 290 | ] 291 | }, 292 | { 293 | "cell_type": "markdown", 294 | "metadata": { 295 | "slideshow": { 296 | "slide_type": "notes" 297 | } 298 | }, 299 | "source": [ 300 | "Let's evaluate the model on the validation dataset as usual" 301 | ] 302 | }, 303 | { 304 | "cell_type": "code", 305 | "execution_count": null, 306 | "metadata": { 307 | "slideshow": { 308 | "slide_type": "slide" 309 | } 310 | }, 311 | "outputs": [], 312 | "source": [ 313 | "model.eval()\n", 314 | "preds = []\n", 315 | "ys = []\n", 316 | "for x, y in dataloader_val:\n", 317 | " x, y = Variable(x), Variable(y)\n", 318 | " preds.extend(model(x).max(1)[1].data.tolist())\n", 319 | " ys.extend(y.data)\n", 320 | "\n", 321 | "corrects = (np.array(preds) == np.array(ys))\n", 322 | "print('Accuracy: {}'.format(corrects.mean()))" 323 | ] 324 | }, 325 | { 326 | "cell_type": "markdown", 327 | "metadata": { 328 | "slideshow": { 329 | "slide_type": "notes" 330 | } 331 | }, 332 | "source": [ 333 | "that's actually an awesome result with just one epoch of training, IMHO. 
But let's count how many images the model classifies wrongly" 334 | ] 335 | }, 336 | { 337 | "cell_type": "code", 338 | "execution_count": null, 339 | "metadata": { 340 | "slideshow": { 341 | "slide_type": "slide" 342 | } 343 | }, 344 | "outputs": [], 345 | "source": [ 346 | "print('correct predictions: {}'.format(\n", 347 | " np.sum(np.array(preds) == np.array(ys))))\n", 348 | "\n", 349 | "print('wrong predictions: {}'.format(\n", 350 | " np.sum(np.array(preds) != np.array(ys))))" 351 | ] 352 | } 353 | ], 354 | "metadata": { 355 | "celltoolbar": "Slideshow", 356 | "kernelspec": { 357 | "display_name": "Python 3", 358 | "language": "python", 359 | "name": "python3" 360 | }, 361 | "language_info": { 362 | "codemirror_mode": { 363 | "name": "ipython", 364 | "version": 3 365 | }, 366 | "file_extension": ".py", 367 | "mimetype": "text/x-python", 368 | "name": "python", 369 | "nbconvert_exporter": "python", 370 | "pygments_lexer": "ipython3", 371 | "version": "3.6.3" 372 | } 373 | }, 374 | "nbformat": 4, 375 | "nbformat_minor": 2 376 | } 377 | -------------------------------------------------------------------------------- /1-Basics/1.4-FizzBuzz.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": { 6 | "slideshow": { 7 | "slide_type": "slide" 8 | } 9 | }, 10 | "source": [ 11 | "# Fizz Buzz with PyTorch" 12 | ] 13 | }, 14 | { 15 | "cell_type": "markdown", 16 | "metadata": { 17 | "slideshow": { 18 | "slide_type": "notes" 19 | } 20 | }, 21 | "source": [ 22 | "This is a very simple task often chosen by an interviewer to get an idea of a candidate's ability to write simple functions. 
We will solve it with a very simple feed forward neural network composed of a total of 3 weighted layers (2 hidden and 1 output layer).\n", 23 | "\n", 24 | "This task usually consists of writing a function that takes an integer and returns the string 'fizz' if the number is divisible by (is a multiple of) 3, 'buzz' if the number is divisible by 5, 'fizzbuzz' if the number is divisible by 3*5=15, and the number itself otherwise.\n", 25 | "\n", 26 | "We'll approach this task by first converting the decimal integer numbers to binary inputs, so our model will have `num_bits` values per sample and will output 4 values, corresponding to the possible classes for each sample (fizz, buzz, fizzbuzz, x).\n", 27 | "\n", 28 | "So we will start by writing the **fizz_buzz_encode** function and two other convenience functions for encoding/decoding binary and fizz buzz, obviously after importing the usual modules and defining the number of digits (bits) used to represent the numbers. We will set it to 12.\n", 29 | "\n", 30 | "source:\n", 31 | "http://joelgrus.com/2016/05/23/fizz-buzz-in-tensorflow/" 32 | ] 33 | }, 34 | { 35 | "cell_type": "code", 36 | "execution_count": null, 37 | "metadata": { 38 | "slideshow": { 39 | "slide_type": "slide" 40 | } 41 | }, 42 | "outputs": [], 43 | "source": [ 44 | "import torch\n", 45 | "from torch.utils import data\n", 46 | "\n", 47 | "import torch.nn as nn\n", 48 | "\n", 49 | "# functional module, functional implementations for \n", 50 | "# unparameterized neural network modules\n", 51 | "import torch.nn.functional as F\n", 52 | "\n", 53 | "from torch.autograd import Variable\n", 54 | "import torch.optim as optim\n", 55 | "\n", 56 | "import numpy as np\n", 57 | "\n", 58 | "NUM_DIGITS = 12" 59 | ] 60 | }, 61 | { 62 | "cell_type": "markdown", 63 | "metadata": { 64 | "slideshow": { 65 | "slide_type": "notes" 66 | } 67 | }, 68 | "source": [ 69 | "Let's write our solution and convenience functions" 70 | ] 71 | }, 72 | { 73 | "cell_type": "code", 
74 | "execution_count": null, 75 | "metadata": { 76 | "slideshow": { 77 | "slide_type": "slide" 78 | } 79 | }, 80 | "outputs": [], 81 | "source": [ 82 | "# Represent each input by an array of its binary digits.\n", 83 | "def binary_encode(i, num_digits):\n", 84 | " return np.array([i >> d & 1 for d in range(num_digits)])\n", 85 | "\n", 86 | "# Encode the desired output as a class index: 0 = number, 1 = \"fizz\", 2 = \"buzz\", 3 = \"fizzbuzz\"\n", 87 | "def fizz_buzz_encode(i):\n", 88 | " if i % 15 == 0: return 3\n", 89 | " elif i % 5 == 0: return 2\n", 90 | " elif i % 3 == 0: return 1\n", 91 | " else: return 0\n", 92 | "\n", 93 | "# Turn a predicted class index back into a printable label\n", 94 | "def fizz_buzz_decode(i, prediction):\n", 95 | " return [str(i), \"fizz\", \"buzz\", \"fizzbuzz\"][prediction]" 96 | ] 97 | }, 98 | { 99 | "cell_type": "markdown", 100 | "metadata": { 101 | "slideshow": { 102 | "slide_type": "notes" 103 | } 104 | }, 105 | "source": [ 106 | "Now I will show you a way to build a custom dataset class that subclasses data.Dataset. \n", 107 | "\n", 108 | "That's a simple way to create a sort of singleton object for our data, so that if and when we instantiate the dataset object multiple times, the data doesn't get created/loaded multiple times. To do this, we create an empty dictionary **DATA_CACHE** at global scope (it will be created when we import the module). Then, when we instantiate the dataset object and its init method gets called, we first check whether **DATA_CACHE** already contains the data and fill it only if it is empty. Each dataset object then stores just the list of indices belonging to its split (train or val) and reads the samples from the shared cache. This way several dataset objects can share even a gazillion samples without copying the data and thus using more memory."
109 | ] 110 | }, 111 | { 112 | "cell_type": "code", 113 | "execution_count": null, 114 | "metadata": { 115 | "slideshow": { 116 | "slide_type": "slide" 117 | } 118 | }, 119 | "outputs": [], 120 | "source": [ 121 | "DATA_CACHE = {} \n", 122 | "\n", 123 | "def fill_cache(num_bits):\n", 124 | " DATA_CACHE.update({\n", 125 | " 'X': [binary_encode(i, num_bits) for i in range(2 ** num_bits)],\n", 126 | " 'y': [fizz_buzz_encode(i) for i in range(2 ** num_bits)]\n", 127 | " })\n", 128 | "\n", 129 | "class FizzbuzzDataset(data.Dataset):\n", 130 | " def __init__(self, num_bits=NUM_DIGITS, mode='train'):\n", 131 | " super(FizzbuzzDataset, self).__init__()\n", 132 | " \n", 133 | " if not DATA_CACHE:\n", 134 | " fill_cache(num_bits)\n", 135 | " \n", 136 | " start, end = (0, 100) if mode == 'val' else (100, len(DATA_CACHE['y']))\n", 137 | " self.idxs = list(range(start, end))\n", 138 | " \n", 139 | " def __len__(self):\n", 140 | " return len(self.idxs)\n", 141 | " \n", 142 | " def __getitem__(self, idx):\n", 143 | " x = DATA_CACHE['X'][self.idxs[idx]]\n", 144 | " x = x.astype(np.float32)\n", 145 | " y = DATA_CACHE['y'][self.idxs[idx]]\n", 146 | " return x, y" 147 | ] 148 | }, 149 | { 150 | "cell_type": "markdown", 151 | "metadata": { 152 | "slideshow": { 153 | "slide_type": "notes" 154 | } 155 | }, 156 | "source": [ 157 | "As said above, we create a simple model composed of a total of 3 feed forward layers, where the first one takes `input_dim` (here `NUM_DIGITS`) inputs for each sample and each hidden layer is followed by an activation function. We say our model must have 50 \"neural units\" for each one of the hidden layers and, given that we have 4 classes, an output dimension of 4."
158 | ] 159 | }, 160 | { 161 | "cell_type": "code", 162 | "execution_count": null, 163 | "metadata": { 164 | "slideshow": { 165 | "slide_type": "slide" 166 | } 167 | }, 168 | "outputs": [], 169 | "source": [ 170 | "# http://pytorch.org/docs/master/nn.html#torch.nn.LeakyReLU\n", 171 | "class FizzbuzzModel(nn.Module):\n", 172 | " def __init__(self, h_dim=50, input_dim=NUM_DIGITS, num_classes=4):\n", 173 | " super(FizzbuzzModel, self).__init__()\n", 174 | " self.linear1 = nn.Sequential(\n", 175 | " nn.Linear(input_dim, h_dim),\n", 176 | " nn.LeakyReLU()\n", 177 | " )\n", 178 | " self.linear2 = nn.Sequential(\n", 179 | " nn.Linear(h_dim, h_dim),\n", 180 | " nn.LeakyReLU()\n", 181 | " )\n", 182 | " self.classifier = nn.Linear(h_dim, num_classes)\n", 183 | " \n", 184 | " def forward(self, x):\n", 185 | " x = self.linear1(x)\n", 186 | " x = self.linear2(x)\n", 187 | " x = self.classifier(x)\n", 188 | " return x " 189 | ] 190 | }, 191 | { 192 | "cell_type": "code", 193 | "execution_count": null, 194 | "metadata": { 195 | "slideshow": { 196 | "slide_type": "slide" 197 | } 198 | }, 199 | "outputs": [], 200 | "source": [ 201 | "# SAME AS PREVIOUS BUT USING F (nn.functional)\n", 202 | "class FizzbuzzModel(nn.Module):\n", 203 | " def __init__(self, h_dim=50, input_dim=NUM_DIGITS, num_classes=4):\n", 204 | " super(FizzbuzzModel, self).__init__()\n", 205 | " self.linear1 = nn.Linear(input_dim, h_dim)\n", 206 | " self.linear2 = nn.Linear(h_dim, h_dim)\n", 207 | " self.classifier = nn.Linear(h_dim, num_classes)\n", 208 | " \n", 209 | " def forward(self, x):\n", 210 | " x = F.leaky_relu(self.linear1(x))\n", 211 | " x = F.leaky_relu(self.linear2(x))\n", 212 | " x = self.classifier(x)\n", 213 | " return x " 214 | ] 215 | }, 216 | { 217 | "cell_type": "markdown", 218 | "metadata": { 219 | "slideshow": { 220 | "slide_type": "notes" 221 | } 222 | }, 223 | "source": [ 224 | "let's instantiate our dataset objects, their corresponding dataloaders, our FizzbuzzModel, the usual SGD optimizing 
algorithm and a cross entropy loss" 225 | ] 226 | }, 227 | { 228 | "cell_type": "code", 229 | "execution_count": null, 230 | "metadata": { 231 | "slideshow": { 232 | "slide_type": "slide" 233 | } 234 | }, 235 | "outputs": [], 236 | "source": [ 237 | "dataset_tr = FizzbuzzDataset(mode='train')\n", 238 | "dataset_val = FizzbuzzDataset(mode='val')\n", 239 | "dataloader_tr = data.DataLoader(dataset_tr, batch_size=128, shuffle=True)\n", 240 | "dataloader_val = data.DataLoader(dataset_val, batch_size=128, shuffle=False)\n", 241 | "\n", 242 | "model = FizzbuzzModel()\n", 243 | "optimizer = optim.SGD(model.parameters(), lr=.05, momentum=0.9)\n", 244 | "loss = nn.CrossEntropyLoss()\n" 245 | ] 246 | }, 247 | { 248 | "cell_type": "markdown", 249 | "metadata": { 250 | "slideshow": { 251 | "slide_type": "notes" 252 | } 253 | }, 254 | "source": [ 255 | "We then train our model for 500 epochs." 256 | ] 257 | }, 258 | { 259 | "cell_type": "code", 260 | "execution_count": null, 261 | "metadata": { 262 | "slideshow": { 263 | "slide_type": "slide" 264 | } 265 | }, 266 | "outputs": [], 267 | "source": [ 268 | "for epoch in range(500):\n", 269 | " # train loop\n", 270 | " for x, y in dataloader_tr:\n", 271 | " x, y = Variable(x), Variable(y)\n", 272 | " l = loss(model(x), y)\n", 273 | " \n", 274 | " optimizer.zero_grad()\n", 275 | " l.backward()\n", 276 | " optimizer.step()\n", 277 | " if not epoch % 100:\n", 278 | " print('Epoch: {}, loss: {}'.format(epoch, l.data.numpy()[0]))\n", 279 | " " 280 | ] 281 | }, 282 | { 283 | "cell_type": "markdown", 284 | "metadata": { 285 | "slideshow": { 286 | "slide_type": "notes" 287 | } 288 | }, 289 | "source": [ 290 | "Finally we run the model on our validation set, which contains the numbers from 0 to 99. We then decode the model's predictions with fizz_buzz_decode and print the resulting fizz buzz labels."
291 | ] 292 | }, 293 | { 294 | "cell_type": "code", 295 | "execution_count": null, 296 | "metadata": { 297 | "slideshow": { 298 | "slide_type": "slide" 299 | } 300 | }, 301 | "outputs": [], 302 | "source": [ 303 | "preds = []\n", 304 | "ys = []\n", 305 | "model.eval()\n", 306 | "for x, y in dataloader_val:\n", 307 | " x = Variable(x)\n", 308 | " preds.extend(model(x).max(1)[1].data.tolist())\n", 309 | " ys.extend(y)\n", 310 | " \n", 311 | "correct = np.array(preds) == np.array(ys)\n", 312 | "predictions = zip(range(0, 100), preds)\n", 313 | "\n", 314 | "print('Accuracy: ', correct.mean(), ', Errors: ', np.logical_not(correct).sum())\n", 315 | "print ([fizz_buzz_decode(i, x) for (i, x) in predictions])" 316 | ] 317 | } 318 | ], 319 | "metadata": { 320 | "celltoolbar": "Slideshow", 321 | "kernelspec": { 322 | "display_name": "Python 3", 323 | "language": "python", 324 | "name": "python3" 325 | }, 326 | "language_info": { 327 | "codemirror_mode": { 328 | "name": "ipython", 329 | "version": 3 330 | }, 331 | "file_extension": ".py", 332 | "mimetype": "text/x-python", 333 | "name": "python", 334 | "nbconvert_exporter": "python", 335 | "pygments_lexer": "ipython3", 336 | "version": "3.6.3" 337 | } 338 | }, 339 | "nbformat": 4, 340 | "nbformat_minor": 2 341 | } 342 | -------------------------------------------------------------------------------- /2-Intermediate/2.3-TransferLearning-MNIST.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": { 6 | "slideshow": { 7 | "slide_type": "slide" 8 | } 9 | }, 10 | "source": [ 11 | "# Transfer Learning and Training on (FASHION) MNIST" 12 | ] 13 | }, 14 | { 15 | "cell_type": "markdown", 16 | "metadata": { 17 | "slideshow": { 18 | "slide_type": "notes" 19 | } 20 | }, 21 | "source": [ 22 | "Now that we saw how to load and use a pretrained model for our aims, lets look at how to apply transfer learning in order to use a pretrained 
model as a feature extractor. Since the pretrained models available in torchvision are all trained on ImageNet, our model's last layer provides 1000 output values, one unnormalized score (logit) per ImageNet class, so we have to get rid of that layer and use a new layer with the needed number of outputs. The MNIST dataset contains digits from 0 to 9, so we need 10 output values, and we will therefore replace the pretrained model's last layer with a new layer having 10 output features." 23 | ] 24 | }, 25 | { 26 | "cell_type": "markdown", 27 | "metadata": { 28 | "slideshow": { 29 | "slide_type": "notes" 30 | } 31 | }, 32 | "source": [ 33 | "As usual, we will import all required packages, torch, numpy, etc" 34 | ] 35 | }, 36 | { 37 | "cell_type": "code", 38 | "execution_count": null, 39 | "metadata": { 40 | "slideshow": { 41 | "slide_type": "slide" 42 | } 43 | }, 44 | "outputs": [], 45 | "source": [ 46 | "import torch\n", 47 | "import torchvision\n", 48 | "from torch.utils import data\n", 49 | "\n", 50 | "import numpy as np\n", 51 | "\n", 52 | "from torch import nn\n", 53 | "from torch import optim\n", 54 | "from torch.autograd import Variable\n", 55 | "from torch.nn import functional as F\n", 56 | "\n" 57 | ] 58 | }, 59 | { 60 | "cell_type": "markdown", 61 | "metadata": { 62 | "slideshow": { 63 | "slide_type": "notes" 64 | } 65 | }, 66 | "source": [ 67 | "We should obviously also import the dataset and model classes from the torchvision package" 68 | ] 69 | }, 70 | { 71 | "cell_type": "code", 72 | "execution_count": null, 73 | "metadata": { 74 | "slideshow": { 75 | "slide_type": "slide" 76 | } 77 | }, 78 | "outputs": [], 79 | "source": [ 80 | "from torchvision.datasets.mnist import FashionMNIST\n", 81 | "from torchvision.models.resnet import resnet18\n", 82 | "from torchvision import transforms, utils\n", 83 | "\n", 84 | "from PIL import Image\n", 85 | "\n", 86 | "import matplotlib.pyplot as plt\n", 87 | "%matplotlib inline" 
88 | ] 89 | }, 90 | { 91 | "cell_type": "markdown", 92 | "metadata": { 93 | "slideshow": { 94 | "slide_type": "notes" 95 | } 96 | }, 97 | "source": [ 98 | "Instead of using the boring MNIST, we will use a less boring variant of it. It is called Fashion MNIST and consists of the same number of images, with the same dimensions and the same 10 classes labelled from 0 to 9, but representing types of clothing instead of digits.\n", 99 | "\n", 100 | "The images are 28x28 grayscale. We normalize them with mean 0.1307 and std 0.3081 (these are the pixel statistics usually quoted for the original MNIST, reused here). We can also use a data augmentation function from torchvision.transforms, which lets us \"increase\" the training dataset size by applying transformations that don't change the image label. In this case we can use, for example, a random horizontal flip, which unsurprisingly flips images horizontally with probability p=0.5.\n", 101 | "\n", 102 | "After creating the dataset objects we can finally pass them to the DataLoader init method to get our iterable dataloaders."
103 | ] 104 | }, 105 | { 106 | "cell_type": "code", 107 | "execution_count": null, 108 | "metadata": { 109 | "slideshow": { 110 | "slide_type": "slide" 111 | } 112 | }, 113 | "outputs": [], 114 | "source": [ 115 | "transfs_tr = transforms.Compose([\n", 116 | " transforms.RandomHorizontalFlip(),\n", 117 | " transforms.ToTensor(),\n", 118 | " transforms.Normalize([0.1307], [0.3081])\n", 119 | "])\n", 120 | "\n", 121 | "transfs_val = transforms.Compose([\n", 122 | " transforms.ToTensor(),\n", 123 | " transforms.Normalize([0.1307], [0.3081])\n", 124 | "])\n", 125 | "\n", 126 | "dset_tr = FashionMNIST(root='../data/fmnist', train=True, download=True, transform=transfs_tr)\n", 127 | "dset_val = FashionMNIST(root='../data/fmnist', train=False, download=True, transform=transfs_val)\n", 128 | "dataloader_tr = data.DataLoader(dset_tr, batch_size=64)\n", 129 | "dataloader_val = data.DataLoader(dset_val, batch_size=64)\n", 130 | "\n" 131 | ] 132 | }, 133 | { 134 | "cell_type": "markdown", 135 | "metadata": { 136 | "slideshow": { 137 | "slide_type": "notes" 138 | } 139 | }, 140 | "source": [ 141 | "we can actually show some of the images from the dataset using the torchvision's utils package, containing a couple of functions useful for visualization of tensor objects containing images." 
142 | ] 143 | }, 144 | { 145 | "cell_type": "code", 146 | "execution_count": null, 147 | "metadata": { 148 | "slideshow": { 149 | "slide_type": "slide" 150 | } 151 | }, 152 | "outputs": [], 153 | "source": [ 154 | "dset_temp = FashionMNIST(root='../data/fmnist', train=True, download=True, transform=transforms.ToTensor())\n", 155 | "dataloader_temp = data.DataLoader(dset_temp, batch_size=64)\n", 156 | "batch_img, batch_label = next(iter(dataloader_temp))\n", 157 | "\n", 158 | "grid = utils.make_grid(batch_img)\n", 159 | "plt.figure(figsize=(10,10))\n", 160 | "plt.imshow(grid.numpy().transpose(1,2,0))" 161 | ] 162 | }, 163 | { 164 | "cell_type": "markdown", 165 | "metadata": { 166 | "slideshow": { 167 | "slide_type": "notes" 168 | } 169 | }, 170 | "source": [ 171 | "As you can see we really have a fashion dataset, certainly less boring than the \"always the same\" MNIST dataset" 172 | ] 173 | }, 174 | { 175 | "cell_type": "markdown", 176 | "metadata": { 177 | "slideshow": { 178 | "slide_type": "notes" 179 | } 180 | }, 181 | "source": [ 182 | "We can now get our model object. As seen in previous sections we will use a resnet model from the torchvision package, but given that we are on CPU and we still need to train a model, we will use a resnet18 (the smallest) instead of resnet152 (the largest). So we create the [pretrained] model and apply some edits to it. \n", 183 | "\n", 184 | "For example, if I'm not wrong, the model was developed to work with 224x224 images and it makes use of an average pooling layer in order to average the features of each channel before entering the last, classifier layer. Since our images have a different size, we can simply replace that layer with an AdaptiveAvgPool2d, a great PyTorch module, which computes the pooling window by itself in order to obtain the required output dimension. 
Since the standard resnet wants one value for each channel in the last but one layer (before the classifier), we will replace the resnet's avg pool with a nn.AdaptiveAvgPool2d((1,1)).\n", 185 | "\n", 186 | "Another change we need to apply to our pretrained model is due to the fact that it has been trained on RGB images, which have 3 channels per image, while our images are grayscale, so in addition to adding a dimension for the color channel, we also need to adapt the first layer of the model. Since it contains 3-channel filters, to make them work for our aims we simply sum those parameters over the input channel dimension, and given that **sum(1)** actually removes that dimension, we re-add it with the **unsqueeze** method.\n", 187 | "\n", 188 | "Finally, as already explained above, we will replace the last (linear/affine/fully connected) layer having 1000 output features, with a much simpler layer having 10 output features. \n", 189 | "\n", 190 | "Last but not least, in this form of transfer learning we don't update the already trained parameters. We put the pretrained layers in eval mode (so that layers such as batch normalization behave as they do at inference time), and we pass to our optimizer not the entire model's parameters but only the parameters of our last fully connected layer, so only those get updated."
191 | ] 192 | }, 193 | { 194 | "cell_type": "code", 195 | "execution_count": null, 196 | "metadata": { 197 | "slideshow": { 198 | "slide_type": "slide" 199 | } 200 | }, 201 | "outputs": [], 202 | "source": [ 203 | "model = resnet18(pretrained=True)\n", 204 | "# replace the avgpool (it's a 7x7 pooling expecting Cx7x7 tensors so it would raise an error)\n", 205 | "model.avgpool = nn.AdaptiveAvgPool2d((1,1))\n", 206 | "# sum over input channels and re-add the channel dimension\n", 207 | "model.conv1.weight.data = model.conv1.weight.data.sum(1).unsqueeze(1)\n", 208 | "# replace the last layer (it has 1000 out features and we need new weights)\n", 209 | "model.fc = nn.Linear(model.fc.in_features, 10)\n", 210 | "model.eval()\n", 211 | "model.fc.train()" 212 | ] 213 | }, 214 | { 215 | "cell_type": "markdown", 216 | "metadata": { 217 | "slideshow": { 218 | "slide_type": "notes" 219 | } 220 | }, 221 | "source": [ 222 | "Ok, our pretrained model is ready to give us features from the new data. Let's define our loss function (cross entropy, as usual for classification problems) and our optimizer (optimization algorithm), SGD. We will use a learning rate of 0.02 and a momentum of 0.5" 223 | ] 224 | }, 225 | { 226 | "cell_type": "markdown", 227 | "metadata": { 228 | "slideshow": { 229 | "slide_type": "notes" 230 | } 231 | }, 232 | "source": [ 233 | "That's how you perform transfer learning. Our problem now is that even if the model is small compared to other models, our dataset contains a lot of samples, so training it on CPU would take too much time. We will therefore train a much much simpler model from scratch to end this part of the tutorial. Just remember that if you want to actually use the pretrained model and train only the last fully connected layer, you just need to pass **model.fc.parameters()** to the optimizer, instead of passing all of the model's parameters."
234 | ] 235 | }, 236 | { 237 | "cell_type": "markdown", 238 | "metadata": { 239 | "slideshow": { 240 | "slide_type": "notes" 241 | } 242 | }, 243 | "source": [ 244 | "So we can now also see how to build a convolutional network, which slides filters of window size **k** over the input and stores the filter output value at each position. For more information about convolution layers you can first look at the awesome gif below, then go straight to the PyTorch documentation website." 245 | ] 246 | }, 247 | { 248 | "cell_type": "markdown", 249 | "metadata": { 250 | "slideshow": { 251 | "slide_type": "notes" 252 | } 253 | }, 254 | "source": [ 255 | "source: https://github.com/vdumoulin/conv_arithmetic/blob/master/gif/no_padding_no_strides.gif\n", 256 | "\n", 257 | "PyTorch convolutions documentation: http://pytorch.org/docs/stable/nn.html?highlight=convolution#convolution-layers" 258 | ] 259 | }, 260 | { 261 | "cell_type": "markdown", 262 | "metadata": { 263 | "slideshow": { 264 | "slide_type": "notes" 265 | } 266 | }, 267 | "source": [ 268 | "we can now define our loss function and optimizer objects" 269 | ] 270 | }, 271 | { 272 | "cell_type": "code", 273 | "execution_count": null, 274 | "metadata": { 275 | "slideshow": { 276 | "slide_type": "slide" 277 | } 278 | }, 279 | "outputs": [], 280 | "source": [ 281 | "loss = nn.CrossEntropyLoss()\n", 282 | "optimizer = optim.SGD(model.fc.parameters(), lr=0.02, momentum=0.5)" 283 | ] 284 | }, 285 | { 286 | "cell_type": "markdown", 287 | "metadata": { 288 | "slideshow": { 289 | "slide_type": "notes" 290 | } 291 | }, 292 | "source": [ 293 | "We can now start our training. Let's run the model for roughly the first 100 batches of a single epoch (a full run would take too long on CPU) and see how it performs later" 294 | ] 295 | }, 296 | { 297 | "cell_type": "code", 298 | "execution_count": null, 299 | "metadata": { 300 | "slideshow": { 301 | "slide_type": "slide" 302 | } 303 | }, 304 | "outputs": [], 305 | "source": [ 306 | "# how many batches will the dataloader iterate through?\n", 307 | "\n", 308 | 
"len(dataloader_tr)" 309 | ] 310 | }, 311 | { 312 | "cell_type": "code", 313 | "execution_count": null, 314 | "metadata": { 315 | "slideshow": { 316 | "slide_type": "slide" 317 | } 318 | }, 319 | "outputs": [], 320 | "source": [ 321 | "for epoch in range(1):\n", 322 | " for i, (x, y) in enumerate(dataloader_tr):\n", 323 | " x, y = Variable(x), Variable(y)\n", 324 | " l = loss(model(x), y)\n", 325 | "\n", 326 | " optimizer.zero_grad()\n", 327 | " l.backward()\n", 328 | " optimizer.step()\n", 329 | " if i % 10 == 0:\n", 330 | " print('Epoch: {}, iter:{}, loss: {}'.format(epoch, i, l.data.numpy()[0]))\n", 331 | " if i > 100:\n", 332 | " break" 333 | ] 334 | }, 335 | { 336 | "cell_type": "markdown", 337 | "metadata": { 338 | "slideshow": { 339 | "slide_type": "notes" 340 | } 341 | }, 342 | "source": [ 343 | "Ok, finally we can evaluate our model accuracy performace on the validation dataset (not doing it on the training set because the model already saw those samples many times)" 344 | ] 345 | }, 346 | { 347 | "cell_type": "code", 348 | "execution_count": null, 349 | "metadata": { 350 | "slideshow": { 351 | "slide_type": "slide" 352 | } 353 | }, 354 | "outputs": [], 355 | "source": [ 356 | "model.eval()\n", 357 | "preds = []\n", 358 | "ys = []\n", 359 | "for i, (x, y) in enumerate(dataloader_val):\n", 360 | " x, y = Variable(x), Variable(y)\n", 361 | " preds.extend(model(x).max(1)[1].data.tolist())\n", 362 | " ys.extend(y.data)\n", 363 | " if not i % 30:\n", 364 | " print(i)\n", 365 | "\n", 366 | "corrects = (np.array(preds) == np.array(ys))\n", 367 | "print('Accuracy: {}'.format(corrects.mean()))\n" 368 | ] 369 | }, 370 | { 371 | "cell_type": "markdown", 372 | "metadata": { 373 | "slideshow": { 374 | "slide_type": "notes" 375 | } 376 | }, 377 | "source": [ 378 | "And we can show the confusion matrix, which will show how many samples the model classifies correctly." 
379 | ] 380 | }, 381 | { 382 | "cell_type": "code", 383 | "execution_count": null, 384 | "metadata": { 385 | "slideshow": { 386 | "slide_type": "slide" 387 | } 388 | }, 389 | "outputs": [], 390 | "source": [ 391 | "from sklearn.metrics import confusion_matrix\n", 392 | "\n", 393 | "plt.matshow(confusion_matrix(np.array(ys), np.array(preds)))\n" 394 | ] 395 | }, 396 | { 397 | "cell_type": "code", 398 | "execution_count": null, 399 | "metadata": {}, 400 | "outputs": [], 401 | "source": [] 402 | } 403 | ], 404 | "metadata": { 405 | "celltoolbar": "Slideshow", 406 | "kernelspec": { 407 | "display_name": "Python 3", 408 | "language": "python", 409 | "name": "python3" 410 | }, 411 | "language_info": { 412 | "codemirror_mode": { 413 | "name": "ipython", 414 | "version": 3 415 | }, 416 | "file_extension": ".py", 417 | "mimetype": "text/x-python", 418 | "name": "python", 419 | "nbconvert_exporter": "python", 420 | "pygments_lexer": "ipython3", 421 | "version": "3.6.3" 422 | } 423 | }, 424 | "nbformat": 4, 425 | "nbformat_minor": 2 426 | } 427 | -------------------------------------------------------------------------------- /2-Intermediate/2.5-CharRNN.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": { 6 | "slideshow": { 7 | "slide_type": "slide" 8 | } 9 | }, 10 | "source": [ 11 | "# Simple CharRNN with PyTorch" 12 | ] 13 | }, 14 | { 15 | "cell_type": "markdown", 16 | "metadata": { 17 | "slideshow": { 18 | "slide_type": "notes" 19 | } 20 | }, 21 | "source": [ 22 | "We are now at the last section of this training. As I guess you may want to see something other than image classification tasks, in this section we will approach an NLP (Natural Language Processing) task, that is, learning to generate names in different languages, just by looking at, guess what?, names! 
The news is that we will make use of recurrent neural networks, but that's not as exciting as the results we will get." 23 | ] 24 | }, 25 | { 26 | "cell_type": "markdown", 27 | "metadata": { 28 | "slideshow": { 29 | "slide_type": "notes" 30 | } 31 | }, 32 | "source": [ 33 | "Given that I'm not an NLP expert, this part comes straight from the PyTorch documentation. Let's call the Jupyter magic command to show plots" 34 | ] 35 | }, 36 | { 37 | "cell_type": "code", 38 | "execution_count": null, 39 | "metadata": { 40 | "slideshow": { 41 | "slide_type": "slide" 42 | } 43 | }, 44 | "outputs": [], 45 | "source": [ 46 | "%matplotlib inline" 47 | ] 48 | }, 49 | { 50 | "cell_type": "markdown", 51 | "metadata": { 52 | "slideshow": { 53 | "slide_type": "notes" 54 | } 55 | }, 56 | "source": [ 57 | "\n", 58 | "\n", 59 | "Download the data from https://download.pytorch.org/tutorial/data.zip and extract it so that the name files are found under ``../data/names/``, the path the code below reads from.\n", 60 | "\n", 61 | "In short, there are a bunch of plain text files ``data/names/[Language].txt`` with a name per line. 
We split lines into an array, convert Unicode to ASCII, and end up with a dictionary ``{language: [names ...]}``.\n", 62 | "\n", 63 | "\n" 64 | ] 65 | }, 66 | { 67 | "cell_type": "code", 68 | "execution_count": null, 69 | "metadata": { 70 | "slideshow": { 71 | "slide_type": "slide" 72 | } 73 | }, 74 | "outputs": [], 75 | "source": [ 76 | "from __future__ import unicode_literals, print_function, division\n", 77 | "from io import open\n", 78 | "import glob\n", 79 | "import unicodedata\n", 80 | "import string\n", 81 | "\n", 82 | "all_letters = string.ascii_letters + \" .,;'-\"\n", 83 | "n_letters = len(all_letters) + 1 # Plus EOS marker\n" 84 | ] 85 | }, 86 | { 87 | "cell_type": "code", 88 | "execution_count": null, 89 | "metadata": { 90 | "slideshow": { 91 | "slide_type": "slide" 92 | } 93 | }, 94 | "outputs": [], 95 | "source": [ 96 | "def findFiles(path): return glob.glob(path)\n", 97 | "\n", 98 | "# Turn a Unicode string to plain ASCII, thanks to \n", 99 | "# http://stackoverflow.com/a/518232/2809427\n", 100 | "def unicodeToAscii(s):\n", 101 | " return ''.join(\n", 102 | " c for c in unicodedata.normalize('NFD', s)\n", 103 | " if unicodedata.category(c) != 'Mn'\n", 104 | " and c in all_letters\n", 105 | " )\n", 106 | "\n", 107 | "# Read a file and split into lines\n", 108 | "def readLines(filename):\n", 109 | " lines = open(filename, encoding='utf-8').read().strip().split('\\n')\n", 110 | " return [unicodeToAscii(line) for line in lines]\n" 111 | ] 112 | }, 113 | { 114 | "cell_type": "code", 115 | "execution_count": null, 116 | "metadata": { 117 | "slideshow": { 118 | "slide_type": "slide" 119 | } 120 | }, 121 | "outputs": [], 122 | "source": [ 123 | "# Build the category_lines dictionary, a list of lines per category\n", 124 | "category_lines = {}\n", 125 | "all_categories = []\n", 126 | "for filename in findFiles('../data/names/*.txt'):\n", 127 | " category = filename.split('/')[-1].split('.')[0]\n", 128 | " all_categories.append(category)\n", 129 | " lines = 
readLines(filename)\n", 130 | " category_lines[category] = lines\n", 131 | "\n", 132 | "n_categories = len(all_categories)\n", 133 | "\n", 134 | "print('# categories:', n_categories, all_categories)\n", 135 | "print(unicodeToAscii(\"O'Néàl\"))" 136 | ] 137 | }, 138 | { 139 | "cell_type": "markdown", 140 | "metadata": { 141 | "slideshow": { 142 | "slide_type": "slide" 143 | } 144 | }, 145 | "source": [ 146 | "Creating the Network\n", 147 | "====================\n", 148 | "\n", 149 | "We will interpret the output as the probability of the next letter. When\n", 150 | "sampling, the most likely output letter is used as the next input\n", 151 | "letter.\n", 152 | "\n", 153 | "this simple scheme shows how we will implement our recurrent model\n", 154 | "\n", 155 | "https://i.imgur.com/jzVrf7f.png\n", 156 | "\n", 157 | "\n", 158 | "\n" 159 | ] 160 | }, 161 | { 162 | "cell_type": "code", 163 | "execution_count": null, 164 | "metadata": { 165 | "slideshow": { 166 | "slide_type": "slide" 167 | } 168 | }, 169 | "outputs": [], 170 | "source": [ 171 | "import torch\n", 172 | "import torch.nn as nn\n", 173 | "from torch.autograd import Variable\n", 174 | "\n", 175 | "class RNN(nn.Module):\n", 176 | " def __init__(self, input_size, hidden_size, output_size):\n", 177 | " super(RNN, self).__init__()\n", 178 | " self.hidden_size = hidden_size\n", 179 | " \n", 180 | " out_f = n_categories + input_size + hidden_size\n", 181 | " self.i2h = nn.Linear(out_f, hidden_size)\n", 182 | " self.i2o = nn.Linear(out_f, output_size)\n", 183 | " self.o2o = nn.Linear(hidden_size + output_size, output_size)\n", 184 | " self.dropout = nn.Dropout(0.1)\n", 185 | " self.softmax = nn.LogSoftmax(dim=1)\n", 186 | "\n", 187 | " def forward(self, category, input, hidden):\n", 188 | " input_combined = torch.cat((category, input, hidden), 1)\n", 189 | " hidden = self.i2h(input_combined)\n", 190 | " output = self.i2o(input_combined)\n", 191 | " output_combined = torch.cat((hidden, output), 1)\n", 192 | " output 
= self.o2o(output_combined)\n", 193 | " output = self.dropout(output)\n", 194 | " output = self.softmax(output)\n", 195 | " return output, hidden\n", 196 | "\n", 197 | " def initHidden(self):\n", 198 | " return Variable(torch.zeros(1, self.hidden_size))" 199 | ] 200 | }, 201 | { 202 | "cell_type": "markdown", 203 | "metadata": { 204 | "slideshow": { 205 | "slide_type": "slide" 206 | } 207 | }, 208 | "source": [ 209 | "Training\n", 210 | "=========\n", 211 | "Preparing for Training\n", 212 | "----------------------\n", 213 | "\n", 214 | "First of all, helper functions to get random pairs of (category, line):\n", 215 | "\n", 216 | "\n" 217 | ] 218 | }, 219 | { 220 | "cell_type": "code", 221 | "execution_count": null, 222 | "metadata": { 223 | "slideshow": { 224 | "slide_type": "slide" 225 | } 226 | }, 227 | "outputs": [], 228 | "source": [ 229 | "import random\n", 230 | "\n", 231 | "# Random item from a list\n", 232 | "def randomChoice(l):\n", 233 | " return l[random.randint(0, len(l) - 1)]\n", 234 | "\n", 235 | "# Get a random category and random line from that category\n", 236 | "def randomTrainingPair():\n", 237 | " category = randomChoice(all_categories)\n", 238 | " line = randomChoice(category_lines[category])\n", 239 | " return category, line" 240 | ] 241 | }, 242 | { 243 | "cell_type": "markdown", 244 | "metadata": { 245 | "slideshow": { 246 | "slide_type": "notes" 247 | } 248 | }, 249 | "source": [ 250 | "For each timestep (that is, for each letter in a training word) the\n", 251 | "inputs of the network will be ``(category, current letter, hidden state)`` and the outputs will be ``(next letter, next hidden state)``. So for each training set, we'll need the category, a set of input letters, and a set of output/target letters.\n", 252 | "\n", 253 | "Since we are predicting the next letter from the current letter for each timestep, the letter pairs are groups of consecutive letters from the line - e.g. 
for ``\"ABCD\"`` we would create (\"A\", \"B\"), (\"B\", \"C\"),\n", 254 | "(\"C\", \"D\"), (\"D\", \"EOS\").\n", 255 | "\n", 256 | "https://i.imgur.com/JH58tXY.png\n", 257 | "\n", 258 | "The category tensor is a [one-hot tensor](https://en.wikipedia.org/wiki/One-hot) of size\n", 259 | "``<1 x n_categories>``. When training we feed it to the network at every timestep - this is a design choice, it could have been included as part of initial hidden state or some other strategy.\n", 260 | "\n", 261 | "\n" 262 | ] 263 | }, 264 | { 265 | "cell_type": "code", 266 | "execution_count": null, 267 | "metadata": { 268 | "slideshow": { 269 | "slide_type": "slide" 270 | } 271 | }, 272 | "outputs": [], 273 | "source": [ 274 | "# One-hot vector for category\n", 275 | "def categoryTensor(category):\n", 276 | " li = all_categories.index(category)\n", 277 | " tensor = torch.zeros(1, n_categories)\n", 278 | " tensor[0][li] = 1\n", 279 | " return tensor\n", 280 | "\n", 281 | "# One-hot matrix of first to last letters (not including EOS) for input\n", 282 | "def inputTensor(line):\n", 283 | " tensor = torch.zeros(len(line), 1, n_letters)\n", 284 | " for li in range(len(line)):\n", 285 | " letter = line[li]\n", 286 | " tensor[li][0][all_letters.find(letter)] = 1\n", 287 | " return tensor\n", 288 | "\n", 289 | "# LongTensor of second letter to end (EOS) for target\n", 290 | "def targetTensor(line):\n", 291 | " letter_indexes = [all_letters.find(line[li]) for li in range(\n", 292 | " 1, len(line))]\n", 293 | " letter_indexes.append(n_letters - 1) # EOS\n", 294 | " return torch.LongTensor(letter_indexes)" 295 | ] 296 | }, 297 | { 298 | "cell_type": "markdown", 299 | "metadata": { 300 | "slideshow": { 301 | "slide_type": "notes" 302 | } 303 | }, 304 | "source": [ 305 | "For convenience during training we'll make a ``randomTrainingExample``\n", 306 | "function that fetches a random (category, line) pair and turns them into\n", 307 | "the required (category, input, target) tensors.\n", 308 | 
"\n", 309 | "\n" 310 | ] 311 | }, 312 | { 313 | "cell_type": "code", 314 | "execution_count": null, 315 | "metadata": { 316 | "slideshow": { 317 | "slide_type": "slide" 318 | } 319 | }, 320 | "outputs": [], 321 | "source": [ 322 | "# Make category, input, and target tensors from a random category, line pair\n", 323 | "def randomTrainingExample():\n", 324 | " category, line = randomTrainingPair()\n", 325 | " category_tensor = Variable(categoryTensor(category))\n", 326 | " input_line_tensor = Variable(inputTensor(line))\n", 327 | " target_line_tensor = Variable(targetTensor(line))\n", 328 | " return category_tensor, input_line_tensor, target_line_tensor" 329 | ] 330 | }, 331 | { 332 | "cell_type": "markdown", 333 | "metadata": { 334 | "slideshow": { 335 | "slide_type": "notes" 336 | } 337 | }, 338 | "source": [ 339 | "Training the Network\n", 340 | "--------------------\n", 341 | "\n", 342 | "In contrast to classification, where only the last output is used, we\n", 343 | "are making a prediction at every step, so we are calculating loss at\n", 344 | "every step.\n", 345 | "\n", 346 | "The magic of autograd allows you to simply sum these losses at each step\n", 347 | "and call backward at the end.\n", 348 | "\n", 349 | "\n" 350 | ] 351 | }, 352 | { 353 | "cell_type": "code", 354 | "execution_count": null, 355 | "metadata": { 356 | "slideshow": { 357 | "slide_type": "slide" 358 | } 359 | }, 360 | "outputs": [], 361 | "source": [ 362 | "criterion = nn.NLLLoss()\n", 363 | "\n", 364 | "learning_rate = 0.0005\n", 365 | "\n", 366 | "def train(category_tensor, input_line_tensor, target_line_tensor):\n", 367 | " hidden = rnn.initHidden()\n", 368 | "\n", 369 | " rnn.zero_grad()\n", 370 | "\n", 371 | " loss = 0\n", 372 | "\n", 373 | " for i in range(input_line_tensor.size()[0]):\n", 374 | " output, hidden = rnn(category_tensor, input_line_tensor[i], hidden)\n", 375 | " loss += criterion(output, target_line_tensor[i])\n", 376 | "\n", 377 | " loss.backward()\n", 378 | "\n", 379 | 
" for p in rnn.parameters():\n", 380 | " p.data.add_(-learning_rate, p.grad.data)\n", 381 | "\n", 382 | " return output, loss.data[0] / input_line_tensor.size()[0]" 383 | ] 384 | }, 385 | { 386 | "cell_type": "markdown", 387 | "metadata": { 388 | "slideshow": { 389 | "slide_type": "notes" 390 | } 391 | }, 392 | "source": [ 393 | "To keep track of how long training takes I am adding a\n", 394 | "``timeSince(timestamp)`` function which returns a human readable string:\n", 395 | "\n", 396 | "\n" 397 | ] 398 | }, 399 | { 400 | "cell_type": "code", 401 | "execution_count": null, 402 | "metadata": { 403 | "slideshow": { 404 | "slide_type": "slide" 405 | } 406 | }, 407 | "outputs": [], 408 | "source": [ 409 | "import time\n", 410 | "import math\n", 411 | "\n", 412 | "def timeSince(since):\n", 413 | " now = time.time()\n", 414 | " s = now - since\n", 415 | " m = math.floor(s / 60)\n", 416 | " s -= m * 60\n", 417 | " return '%dm %ds' % (m, s)" 418 | ] 419 | }, 420 | { 421 | "cell_type": "markdown", 422 | "metadata": { 423 | "slideshow": { 424 | "slide_type": "notes" 425 | } 426 | }, 427 | "source": [ 428 | "Training is business as usual - call train a bunch of times and wait a\n", 429 | "few minutes, printing the current time and loss every ``print_every``\n", 430 | "examples, and keeping store of an average loss per ``plot_every`` examples\n", 431 | "in ``all_losses`` for plotting later.\n", 432 | "\n", 433 | "\n" 434 | ] 435 | }, 436 | { 437 | "cell_type": "code", 438 | "execution_count": null, 439 | "metadata": { 440 | "slideshow": { 441 | "slide_type": "slide" 442 | } 443 | }, 444 | "outputs": [], 445 | "source": [ 446 | "rnn = RNN(n_letters, 128, n_letters)\n", 447 | "\n", 448 | "n_iters = 100000\n", 449 | "print_every = 5000\n", 450 | "plot_every = 500\n", 451 | "all_losses = []\n", 452 | "total_loss = 0 # Reset every plot_every iters\n", 453 | "\n", 454 | "start = time.time()\n", 455 | "\n", 456 | "for iter in range(1, n_iters + 1):\n", 457 | " output, loss = 
train(*randomTrainingExample())\n", 458 | " total_loss += loss\n", 459 | "\n", 460 | " if iter % print_every == 0:\n", 461 | " print('%s (%d %d%%) %.4f' % (\n", 462 | " timeSince(start), iter, iter / n_iters * 100, loss))\n", 463 | "\n", 464 | " if iter % plot_every == 0:\n", 465 | " all_losses.append(total_loss / plot_every)\n", 466 | " total_loss = 0" 467 | ] 468 | }, 469 | { 470 | "cell_type": "markdown", 471 | "metadata": { 472 | "slideshow": { 473 | "slide_type": "notes" 474 | } 475 | }, 476 | "source": [ 477 | "Plotting the Losses\n", 478 | "-------------------\n", 479 | "\n", 480 | "Plotting the historical loss from all\\_losses shows the network\n", 481 | "learning:\n", 482 | "\n", 483 | "\n" 484 | ] 485 | }, 486 | { 487 | "cell_type": "code", 488 | "execution_count": null, 489 | "metadata": { 490 | "slideshow": { 491 | "slide_type": "slide" 492 | } 493 | }, 494 | "outputs": [], 495 | "source": [ 496 | "import matplotlib.pyplot as plt\n", 497 | "import matplotlib.ticker as ticker\n", 498 | "\n", 499 | "plt.figure()\n", 500 | "plt.plot(all_losses)" 501 | ] 502 | }, 503 | { 504 | "cell_type": "markdown", 505 | "metadata": { 506 | "slideshow": { 507 | "slide_type": "notes" 508 | } 509 | }, 510 | "source": [ 511 | "Sampling the Network\n", 512 | "====================\n", 513 | "\n", 514 | "To sample we give the network a letter and ask what the next one is,\n", 515 | "feed that in as the next letter, and repeat until the EOS token.\n", 516 | "\n", 517 | "- Create tensors for input category, starting letter, and empty hidden\n", 518 | " state\n", 519 | "- Create a string ``output_name`` with the starting letter\n", 520 | "- Up to a maximum output length,\n", 521 | "\n", 522 | " - Feed the current letter to the network\n", 523 | " - Get the next letter from highest output, and next hidden state\n", 524 | " - If the letter is EOS, stop here\n", 525 | " - If a regular letter, add to ``output_name`` and continue\n", 526 | "\n", 527 | "- Return the final name\n", 528 
| "\n", 529 | ".. Note::\n", 530 | " Rather than having to give it a starting letter, another\n", 531 | " strategy would have been to include a \"start of string\" token in\n", 532 | " training and have the network choose its own starting letter.\n", 533 | "\n", 534 | "\n" 535 | ] 536 | }, 537 | { 538 | "cell_type": "code", 539 | "execution_count": null, 540 | "metadata": { 541 | "slideshow": { 542 | "slide_type": "slide" 543 | } 544 | }, 545 | "outputs": [], 546 | "source": [ 547 | "max_length = 20\n", 548 | "\n", 549 | "# Sample from a category and starting letter\n", 550 | "def sample(category, start_letter='A'):\n", 551 | " category_tensor = Variable(categoryTensor(category))\n", 552 | " input = Variable(inputTensor(start_letter))\n", 553 | " hidden = rnn.initHidden()\n", 554 | "\n", 555 | " output_name = start_letter\n", 556 | "\n", 557 | " for i in range(max_length):\n", 558 | " output, hidden = rnn(category_tensor, input[0], hidden)\n", 559 | " topv, topi = output.data.topk(1)\n", 560 | " topi = topi[0][0]\n", 561 | " if topi == n_letters - 1:\n", 562 | " break\n", 563 | " else:\n", 564 | " letter = all_letters[topi]\n", 565 | " output_name += letter\n", 566 | " input = Variable(inputTensor(letter))\n", 567 | "\n", 568 | " return output_name\n" 569 | ] 570 | }, 571 | { 572 | "cell_type": "code", 573 | "execution_count": null, 574 | "metadata": { 575 | "slideshow": { 576 | "slide_type": "slide" 577 | } 578 | }, 579 | "outputs": [], 580 | "source": [ 581 | "# Get multiple samples from one category \n", 582 | "# and multiple starting letters\n", 583 | "def samples(category, start_letters='ABC'):\n", 584 | " for start_letter in start_letters:\n", 585 | " print(sample(category, start_letter))\n", 586 | "\n", 587 | "samples('Russian', 'RUS')\n", 588 | "print('----')\n", 589 | "samples('German', 'GER')\n", 590 | "print('----')\n", 591 | "samples('Spanish', 'SPA')\n", 592 | "print('----')\n", 593 | "samples('Chinese', 'CHI')\n", 594 | "print('----')\n", 595 | 
"samples('Italian', 'ITA')\n", 596 | "print('----')\n", 597 | "samples('Italian', 'BCDFG')" 598 | ] 599 | }, 600 | { 601 | "cell_type": "markdown", 602 | "metadata": { 603 | "slideshow": { 604 | "slide_type": "notes" 605 | } 606 | }, 607 | "source": [ 608 | "The generated Italian names are not very convincing, but you get the idea. With LSTMs we would likely get much better results, but training the model on CPU would take considerably longer." 609 | ] 610 | }, 611 | { 612 | "cell_type": "markdown", 613 | "metadata": { 614 | "slideshow": { 615 | "slide_type": "notes" 616 | } 617 | }, 618 | "source": [ 619 | "Exercises\n", 620 | "=========\n", 621 | "\n", 622 | "- Try with a different dataset of category -> line, for example:\n", 623 | "\n", 624 | "  - Fictional series -> Character name\n", 625 | "  - Part of speech -> Word\n", 626 | "  - Country -> City\n", 627 | "\n", 628 | "- Use a \"start of sentence\" token so that sampling can be done without\n", 629 | "  choosing a start letter\n", 630 | "- Get better results with a bigger and/or better shaped network\n", 631 | "\n", 632 | "  - Try the nn.LSTM and nn.GRU layers\n", 633 | "  - Combine multiple of these RNNs as a higher level network\n", 634 | "\n", 635 | "\n" 636 | ] 637 | } 638 | ], 639 | "metadata": { 640 | "celltoolbar": "Slideshow", 641 | "kernelspec": { 642 | "display_name": "Python 3", 643 | "language": "python", 644 | "name": "python3" 645 | }, 646 | "language_info": { 647 | "codemirror_mode": { 648 | "name": "ipython", 649 | "version": 3 650 | }, 651 | "file_extension": ".py", 652 | "mimetype": "text/x-python", 653 | "name": "python", 654 | "nbconvert_exporter": "python", 655 | "pygments_lexer": "ipython3", 656 | "version": "3.6.3" 657 | } 658 | }, 659 | "nbformat": 4, 660 | "nbformat_minor": 1 661 | } 662 | -------------------------------------------------------------------------------- /1-Basics/1.2-Linear-Regression.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | 
"cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": { 6 | "slideshow": { 7 | "slide_type": "slide" 8 | } 9 | }, 10 | "source": [ 11 | "# 1.2 Linear Regression with PyTorch" 12 | ] 13 | }, 14 | { 15 | "cell_type": "markdown", 16 | "metadata": { 17 | "slideshow": { 18 | "slide_type": "notes" 19 | } 20 | }, 21 | "source": [ 22 | "We will now see how to implement a very simple task with PyTorch: performing a linear regression on two small sets of values (x and y).\n", 23 | "\n", 24 | "The task is: given a set of *independent* x values, learn to estimate the relationship (beta) with the corresponding *dependent* y values.\n", 25 | "\n", 26 | "More info: https://en.wikipedia.org/wiki/Regression_analysis\n", 27 | "\n", 28 | "First we import our usual packages." 29 | ] 30 | }, 31 | { 32 | "cell_type": "code", 33 | "execution_count": 1, 34 | "metadata": { 35 | "slideshow": { 36 | "slide_type": "slide" 37 | } 38 | }, 39 | "outputs": [], 40 | "source": [ 41 | "import torch\n", 42 | "import torch.nn as nn\n", 43 | "import numpy as np\n", 44 | "import matplotlib.pyplot as plt\n", 45 | "from torch.autograd import Variable\n", 46 | "\n", 47 | "%matplotlib inline" 48 | ] 49 | }, 50 | { 51 | "cell_type": "markdown", 52 | "metadata": { 53 | "slideshow": { 54 | "slide_type": "notes" 55 | } 56 | }, 57 | "source": [ 58 | "Let's create our data points. You can copy the same numbers or change them a bit; doing so does not affect our aim, which is to show how the process works." 59 | ] 60 | }, 61 | { 62 | "cell_type": "code", 63 | "execution_count": 2, 64 | "metadata": { 65 | "slideshow": { 66 | "slide_type": "slide" 67 | } 68 | }, 69 | "outputs": [], 70 | "source": [ 71 | "# Toy Dataset \n", 72 | "x_train = np.array([[3.3], [4.4], [5.5], [6.71], [6.93], [4.168], \n", 73 | " [9.779], [6.182], [7.59], [2.167], [7.042], \n", 74 | " [10.791], [5.313], [7.997], [3.1]], dtype=np.float32)\n", 75 | "\n", 76 | "y_train = np.array([[1.7], [2.76], [2.09], [3.19], [1.694], [1.573], \n", 77 | " 
[3.366], [2.596], [2.53], [1.221], [2.827], \n", 78 | " [3.465], [1.65], [2.904], [1.3]], dtype=np.float32)" 79 | ] 80 | }, 81 | { 82 | "cell_type": "code", 83 | "execution_count": 3, 84 | "metadata": { 85 | "slideshow": { 86 | "slide_type": "slide" 87 | } 88 | }, 89 | "outputs": [ 90 | { 91 | "data": { 92 | "image/png": "\n", 93 | "text/plain": [ 94 | "
" 95 | ] 96 | }, 97 | "metadata": {}, 98 | "output_type": "display_data" 99 | } 100 | ], 101 | "source": [ 102 | "plt.plot(x_train, y_train, 'ro', label='Original data')\n", 103 | "plt.legend()\n", 104 | "plt.show()" 105 | ] 106 | }, 107 | { 108 | "cell_type": "markdown", 109 | "metadata": { 110 | "slideshow": { 111 | "slide_type": "notes" 112 | } 113 | }, 114 | "source": [ 115 | "Now we create a really simple model by subclassing **nn.Module** and implementing its **forward** method, which gets called every time the instantiated object is called (as in **x = model(x)**, which triggers the dunder method **\\__call__**, which in turn returns the result of the forward method). \n", 116 | "\n", 117 | "Our model for the linear regression will consist of just a single linear layer, also known as an affine layer or fully connected layer, which applies a linear transformation to the incoming data: `y = Wx + b`. \n", 118 | "\n", 119 | "So we initialize the layer object and call it with **x** as argument in the **forward** method.\n", 120 | "\n", 121 | "Our Linear layer has a single input feature and a single output feature. PyTorch modules always assume that data is sent through them in batches, i.e. with shape `(batch_size, n_features)`, which is why the arrays have an additional dimension; at each step we will send the whole x_train as a single batch." 
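, "\n",
"\n",
"A minimal sketch of the batching convention described above, assuming a recent PyTorch release in which tensors can be passed to modules directly (the names `layer` and `batch` are illustrative and not part of the tutorial):\n",
"\n",
"```python\n",
"import torch\n",
"import torch.nn as nn\n",
"\n",
"# One input feature and one output feature, as in this notebook\n",
"layer = nn.Linear(1, 1)\n",
"\n",
"# 15 samples with 1 feature each: shape (batch_size, n_features)\n",
"batch = torch.randn(15, 1)\n",
"\n",
"out = layer(batch)\n",
"print(out.shape)  # torch.Size([15, 1])\n",
"```\n",
"\n",
"Each row of the batch is transformed independently, so the output keeps the batch dimension."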
122 | ] 123 | }, 124 | { 125 | "cell_type": "code", 126 | "execution_count": 4, 127 | "metadata": { 128 | "slideshow": { 129 | "slide_type": "slide" 130 | } 131 | }, 132 | "outputs": [], 133 | "source": [ 134 | "# Linear Regression Model\n", 135 | "class LinearRegression(nn.Module):\n", 136 | "    def __init__(self, input_size, output_size):\n", 137 | "        super(LinearRegression, self).__init__()\n", 138 | "        self.linear = nn.Linear(input_size, output_size)\n", 139 | "\n", 140 | "    def forward(self, x):\n", 141 | "        out = self.linear(x)\n", 142 | "        return out" 143 | ] 144 | }, 145 | { 146 | "cell_type": "markdown", 147 | "metadata": { 148 | "slideshow": { 149 | "slide_type": "notes" 150 | } 151 | }, 152 | "source": [ 153 | "We can now instantiate our model object, loss function and optimization algorithm.\n", 154 | "\n", 155 | "We use an MSELoss (which stands for Mean Squared Error Loss), which computes the mean squared error between two inputs: the model's output and the actual target (the ground truth, i.e. what the correct output should be).\n", 156 | "\n", 157 | "As the optimization algorithm we use stochastic gradient descent (SGD), which computes the error and its derivative with respect to each of the model's parameters (the gradients). The algorithm then updates each parameter by applying `w' = w - lr * dl/dw`." 
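, "\n",
"\n",
"As a rough illustration of the update rule, here is a single hand-written gradient descent step on a toy one-parameter problem. This is only a sketch: it assumes a recent PyTorch release (`requires_grad=True`, `torch.no_grad()`), and the variable names and the toy loss are made up for the example.\n",
"\n",
"```python\n",
"import torch\n",
"\n",
"w = torch.tensor([1.0], requires_grad=True)  # the single parameter\n",
"lr = 0.1                                     # learning rate\n",
"\n",
"loss = ((2.0 * w - 4.0) ** 2).mean()         # toy loss l(w) = (2w - 4)^2\n",
"loss.backward()                              # fills w.grad with dl/dw = 4 * (2w - 4) = -8\n",
"\n",
"with torch.no_grad():                        # the update itself must not be tracked by autograd\n",
"    w -= lr * w.grad                         # w' = w - lr * dl/dw = 1 - 0.1 * (-8) = 1.8\n",
"```\n",
"\n",
"`optimizer.step()` performs this same kind of update for every parameter of the model."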
158 | ] 159 | }, 160 | { 161 | "cell_type": "code", 162 | "execution_count": 5, 163 | "metadata": { 164 | "slideshow": { 165 | "slide_type": "slide" 166 | } 167 | }, 168 | "outputs": [], 169 | "source": [ 170 | "# single \"neural unit\" layer\n", 171 | "model = LinearRegression(1, 1)\n", 172 | "# same as:\n", 173 | "# model = nn.Linear(1, 1)\n", 174 | "\n", 175 | "# Loss and Optimizer\n", 176 | "criterion = nn.MSELoss()\n", 177 | "optimizer = torch.optim.SGD(model.parameters(), lr=0.001)" 178 | ] 179 | }, 180 | { 181 | "cell_type": "markdown", 182 | "metadata": { 183 | "slideshow": { 184 | "slide_type": "notes" 185 | } 186 | }, 187 | "source": [ 188 | "To train our simple model, we loop through the desired number of epochs, performing one optimization step per iteration of the for loop. Remember that the model sees every data point and its respective y in a single forward pass, so the loss (and the gradients) are averaged over the whole dataset at each step." 189 | ] 190 | }, 191 | { 192 | "cell_type": "code", 193 | "execution_count": 6, 194 | "metadata": { 195 | "slideshow": { 196 | "slide_type": "slide" 197 | } 198 | }, 199 | "outputs": [ 200 | { 201 | "name": "stdout", 202 | "output_type": "stream", 203 | "text": [ 204 | "Epoch [5/60], Loss: 22.0868\n", 205 | "Epoch [10/60], Loss: 9.0634\n", 206 | "Epoch [15/60], Loss: 3.7874\n", 207 | "Epoch [20/60], Loss: 1.6500\n", 208 | "Epoch [25/60], Loss: 0.7840\n", 209 | "Epoch [30/60], Loss: 0.4332\n", 210 | "Epoch [35/60], Loss: 0.2910\n", 211 | "Epoch [40/60], Loss: 0.2334\n", 212 | "Epoch [45/60], Loss: 0.2100\n", 213 | "Epoch [50/60], Loss: 0.2005\n", 214 | "Epoch [55/60], Loss: 0.1966\n", 215 | "Epoch [60/60], Loss: 0.1949\n" 216 | ] 217 | } 218 | ], 219 | "source": [ 220 | "# Train the Model \n", 221 | "for epoch in range(60):\n", 222 | " # Convert numpy array to torch Variable\n", 223 | " inputs = Variable(torch.from_numpy(x_train))\n", 224 | " targets = Variable(torch.from_numpy(y_train))\n", 225 | "\n", 226 | " # 
Forward + Backward + Optimize\n", 227 | " optimizer.zero_grad() \n", 228 | " outputs = model(inputs)\n", 229 | " loss = criterion(outputs, targets)\n", 230 | " loss.backward()\n", 231 | " optimizer.step()\n", 232 | " \n", 233 | " if (epoch+1) % 5 == 0:\n", 234 | " print ('Epoch [%d/60], Loss: %.4f' \n", 235 | " %(epoch+1, loss.data[0]))\n", 236 | " " 237 | ] 238 | }, 239 | { 240 | "cell_type": "markdown", 241 | "metadata": { 242 | "slideshow": { 243 | "slide_type": "notes" 244 | } 245 | }, 246 | "source": [ 247 | "We run the SGD algorithm for 60 epochs, let's see what the model has learnt by plotting the regression line (remember y = Wx + b from above?) and the ground truth points" 248 | ] 249 | }, 250 | { 251 | "cell_type": "code", 252 | "execution_count": 7, 253 | "metadata": { 254 | "slideshow": { 255 | "slide_type": "slide" 256 | } 257 | }, 258 | "outputs": [ 259 | { 260 | "data": { 261 | "image/png": "\n", 262 | "text/plain": [ 263 | "
" 264 | ] 265 | }, 266 | "metadata": {}, 267 | "output_type": "display_data" 268 | } 269 | ], 270 | "source": [ 271 | "# Plot the graph\n", 272 | "model.eval()\n", 273 | "predicted = model(Variable(torch.from_numpy(x_train))).data.numpy()\n", 274 | "plt.plot(x_train, y_train, 'ro', label='Original data')\n", 275 | "plt.plot(x_train, predicted, label='Fitted line')\n", 276 | "plt.legend()\n", 277 | "plt.show()" 278 | ] 279 | }, 280 | { 281 | "cell_type": "markdown", 282 | "metadata": { 283 | "slideshow": { 284 | "slide_type": "notes" 285 | } 286 | }, 287 | "source": [ 288 | "We are done with our task, and we can export the model parameters to a file so that we can load them later when needed." 289 | ] 290 | }, 291 | { 292 | "cell_type": "code", 293 | "execution_count": 8, 294 | "metadata": { 295 | "slideshow": { 296 | "slide_type": "slide" 297 | } 298 | }, 299 | "outputs": [], 300 | "source": [ 301 | "# Save the Model\n", 302 | "torch.save(model.state_dict(), 'reg-model.pth')" 303 | ] 304 | } 305 | ], 306 | "metadata": { 307 | "celltoolbar": "Slideshow", 308 | "kernelspec": { 309 | "display_name": "Python 3", 310 | "language": "python", 311 | "name": "python3" 312 | }, 313 | "language_info": { 314 | "codemirror_mode": { 315 | "name": "ipython", 316 | "version": 3 317 | }, 318 | "file_extension": ".py", 319 | "mimetype": "text/x-python", 320 | "name": "python", 321 | "nbconvert_exporter": "python", 322 | "pygments_lexer": "ipython3", 323 | "version": "3.6.3" 324 | } 325 | }, 326 | "nbformat": 4, 327 | "nbformat_minor": 2 328 | } 329 | -------------------------------------------------------------------------------- /2-Intermediate/2.1-Convolutional-Neural-Networks.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "# Conv Network and Fashion MNIST" 8 | ] 9 | }, 10 | { 11 | "cell_type": "code", 12 | "execution_count": 2, 13 | 
"metadata": { 14 | "slideshow": { 15 | "slide_type": "slide" 16 | } 17 | }, 18 | "outputs": [], 19 | "source": [ 20 | "import torch\n", 21 | "import torchvision\n", 22 | "from torch.utils import data\n", 23 | "\n", 24 | "import numpy as np\n", 25 | "\n", 26 | "from torch import nn\n", 27 | "from torch import optim\n", 28 | "from torch.autograd import Variable\n", 29 | "from torch.nn import functional as F" 30 | ] 31 | }, 32 | { 33 | "cell_type": "code", 34 | "execution_count": 3, 35 | "metadata": { 36 | "slideshow": { 37 | "slide_type": "slide" 38 | } 39 | }, 40 | "outputs": [], 41 | "source": [ 42 | "from torchvision.datasets.mnist import FashionMNIST\n", 43 | "from torchvision.models.resnet import resnet18\n", 44 | "from torchvision import transforms, utils\n", 45 | "\n", 46 | "from PIL import Image\n", 47 | "\n", 48 | "import matplotlib.pyplot as plt\n", 49 | "%matplotlib inline" 50 | ] 51 | }, 52 | { 53 | "cell_type": "code", 54 | "execution_count": 5, 55 | "metadata": { 56 | "slideshow": { 57 | "slide_type": "slide" 58 | } 59 | }, 60 | "outputs": [], 61 | "source": [ 62 | "transfs_tr = transforms.Compose([\n", 63 | " \n", 64 | " transforms.ToTensor(),\n", 65 | " transforms.Normalize([0.1307], [0.3081])\n", 66 | "])\n", 67 | "\n", 68 | "transfs_val = transforms.Compose([\n", 69 | " transforms.ToTensor(),\n", 70 | " transforms.Normalize([0.1307], [0.3081])\n", 71 | "])\n", 72 | "\n", 73 | "dset_tr = FashionMNIST(root='../data/fmnist', train=True, download=True, transform=transfs_tr)\n", 74 | "dset_val = FashionMNIST(root='../data/fmnist', train=False, download=True, transform=transfs_val)\n", 75 | "dataloader_tr = data.DataLoader(dset_tr, batch_size=64)\n", 76 | "dataloader_val = data.DataLoader(dset_val, batch_size=64)" 77 | ] 78 | }, 79 | { 80 | "cell_type": "code", 81 | "execution_count": 6, 82 | "metadata": { 83 | "slideshow": { 84 | "slide_type": "slide" 85 | } 86 | }, 87 | "outputs": [ 88 | { 89 | "data": { 90 | "text/plain": [ 91 | "" 92 | ] 93 | }, 94 | 
"execution_count": 6, 95 | "metadata": {}, 96 | "output_type": "execute_result" 97 | }, 98 | { 99 | "data": { 100 | "image/png": "\n", 101 | "text/plain": [ 102 | "
" 103 | ] 104 | }, 105 | "metadata": {}, 106 | "output_type": "display_data" 107 | } 108 | ], 109 | "source": [ 110 | "dset_temp = FashionMNIST(root='../data/fmnist', train=True, download=True, transform=transforms.ToTensor())\n", 111 | "dataloader_temp = data.DataLoader(dset_temp, batch_size=64)\n", 112 | "batch_img = next(iter(dataloader_temp))[0]\n", 113 | "\n", 114 | "grid = utils.make_grid(batch_img)\n", 115 | "\n", 116 | "plt.figure(figsize=(10,10))\n", 117 | "plt.imshow(grid.numpy().transpose(1,2,0))" 118 | ] 119 | }, 120 | { 121 | "cell_type": "markdown", 122 | "metadata": { 123 | "slideshow": { 124 | "slide_type": "slide" 125 | } 126 | }, 127 | "source": [ 128 | "### Labels\n", 129 | "Each training and test example is assigned to one of the following labels:\n", 130 | "\n", 131 | "| Label | Description |\n", 132 | "| --- | --- |\n", 133 | "| 0 | T-shirt/top |\n", 134 | "| 1 | Trouser |\n", 135 | "| 2 | Pullover |\n", 136 | "| 3 | Dress |\n", 137 | "| 4 | Coat |\n", 138 | "| 5 | Sandal |\n", 139 | "| 6 | Shirt |\n", 140 | "| 7 | Sneaker |\n", 141 | "| 8 | Bag |\n", 142 | "| 9 | Ankle boot |\n", 143 | "\n", 144 | "https://github.com/zalandoresearch/fashion-mnist/blob/master/README.md" 145 | ] 146 | }, 147 | { 148 | "cell_type": "markdown", 149 | "metadata": { 150 | "slideshow": { 151 | "slide_type": "slide" 152 | } 153 | }, 154 | "source": [ 155 | "\n", 156 | " \n", 157 | " \n", 158 | " \n", 159 | " \n", 160 | " \n", 161 | " \n", 162 | " \n", 163 | " \n", 164 | " \n", 165 | " \n", 166 | " \n", 167 | " \n", 168 | " \n", 169 | " \n", 170 | " \n", 171 | " \n", 172 | " \n", 173 | " \n", 174 | " \n", 175 | " \n", 176 | " \n", 177 | " \n", 178 | " \n", 179 | " \n", 180 | "
Convolution arithmetic animations (images not preserved). First row: no padding, no strides; arbitrary padding, no strides; half padding, no strides; full padding, no strides. Second row: no padding, strides; padding, strides; padding, strides (odd).
\n", 181 | "\n" 182 | ] 183 | }, 184 | { 185 | "cell_type": "code", 186 | "execution_count": 8, 187 | "metadata": { 188 | "slideshow": { 189 | "slide_type": "slide" 190 | } 191 | }, 192 | "outputs": [], 193 | "source": [ 194 | "class Net(nn.Module):\n", 195 | " def __init__(self):\n", 196 | " super(Net, self).__init__()\n", 197 | " self.conv1 = nn.Conv2d(1, 10, kernel_size=5)\n", 198 | " self.conv2 = nn.Conv2d(10, 20, kernel_size=5)\n", 199 | " self.conv2_drop = nn.Dropout2d()\n", 200 | " self.fc1 = nn.Linear(320, 50)\n", 201 | " self.fc2 = nn.Linear(50, 10)\n", 202 | "\n", 203 | " def forward(self, x):\n", 204 | " x = F.relu(F.max_pool2d(self.conv1(x), 2))\n", 205 | " x = F.relu(F.max_pool2d(self.conv2_drop(self.conv2(x)), 2))\n", 206 | " x = x.view(-1, 320)\n", 207 | " x = F.relu(self.fc1(x))\n", 208 | " x = F.dropout(x, training=self.training)\n", 209 | " x = self.fc2(x)\n", 210 | " return x\n", 211 | " \n", 212 | "model = Net()" 213 | ] 214 | }, 215 | { 216 | "cell_type": "code", 217 | "execution_count": 9, 218 | "metadata": { 219 | "slideshow": { 220 | "slide_type": "slide" 221 | } 222 | }, 223 | "outputs": [], 224 | "source": [ 225 | "loss = nn.CrossEntropyLoss()\n", 226 | "optimizer = optim.SGD(model.parameters(), lr=0.01, momentum=0.9)" 227 | ] 228 | }, 229 | { 230 | "cell_type": "code", 231 | "execution_count": 10, 232 | "metadata": { 233 | "slideshow": { 234 | "slide_type": "slide" 235 | } 236 | }, 237 | "outputs": [ 238 | { 239 | "name": "stdout", 240 | "output_type": "stream", 241 | "text": [ 242 | "938\n", 243 | "Epoch: 0, loss: 0.7718479633331299\n", 244 | "Epoch: 1, loss: 0.45067089796066284\n" 245 | ] 246 | }, 247 | { 248 | "ename": "KeyboardInterrupt", 249 | "evalue": "", 250 | "output_type": "error", 251 | "traceback": [ 252 | "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", 253 | "\u001b[0;31mKeyboardInterrupt\u001b[0m Traceback (most recent call last)", 254 | "\u001b[0;32m\u001b[0m in 
\u001b[0;36m\u001b[0;34m()\u001b[0m\n\u001b[1;32m 6\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 7\u001b[0m \u001b[0moptimizer\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mzero_grad\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m----> 8\u001b[0;31m \u001b[0ml\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mbackward\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 9\u001b[0m \u001b[0moptimizer\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mstep\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 10\u001b[0m \u001b[0mprint\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m'Epoch: {}, loss: {}'\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mformat\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mepoch\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0ml\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mdata\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mnumpy\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;36m0\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", 255 | "\u001b[0;32m~/miniconda3/lib/python3.6/site-packages/torch/autograd/variable.py\u001b[0m in \u001b[0;36mbackward\u001b[0;34m(self, gradient, retain_graph, create_graph, retain_variables)\u001b[0m\n\u001b[1;32m 165\u001b[0m \u001b[0mVariable\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 166\u001b[0m \"\"\"\n\u001b[0;32m--> 167\u001b[0;31m \u001b[0mtorch\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mautograd\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mbackward\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mself\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mgradient\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mretain_graph\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mcreate_graph\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mretain_variables\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 168\u001b[0m 
\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 169\u001b[0m \u001b[0;32mdef\u001b[0m \u001b[0mregister_hook\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mself\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mhook\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", 256 | "\u001b[0;32m~/miniconda3/lib/python3.6/site-packages/torch/autograd/__init__.py\u001b[0m in \u001b[0;36mbackward\u001b[0;34m(variables, grad_variables, retain_graph, create_graph, retain_variables)\u001b[0m\n\u001b[1;32m 97\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 98\u001b[0m Variable._execution_engine.run_backward(\n\u001b[0;32m---> 99\u001b[0;31m variables, grad_variables, retain_graph)\n\u001b[0m\u001b[1;32m 100\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 101\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n", 257 | "\u001b[0;31mKeyboardInterrupt\u001b[0m: " 258 | ] 259 | } 260 | ], 261 | "source": [ 262 | "print(len(dataloader_tr))\n", 263 | "for epoch in range(5):\n", 264 | " for i, (x, y) in enumerate(dataloader_tr):\n", 265 | " x, y = Variable(x), Variable(y)\n", 266 | " l = loss(model(x), y)\n", 267 | "\n", 268 | " optimizer.zero_grad()\n", 269 | " l.backward()\n", 270 | " optimizer.step()\n", 271 | " print('Epoch: {}, loss: {}'.format(epoch, l.data.numpy()[0]))" 272 | ] 273 | }, 274 | { 275 | "cell_type": "code", 276 | "execution_count": 11, 277 | "metadata": { 278 | "slideshow": { 279 | "slide_type": "slide" 280 | } 281 | }, 282 | "outputs": [ 283 | { 284 | "name": "stdout", 285 | "output_type": "stream", 286 | "text": [ 287 | "157\n", 288 | "0\n", 289 | "30\n", 290 | "60\n", 291 | "90\n", 292 | "120\n", 293 | "150\n", 294 | "Accuracy: 0.8392\n" 295 | ] 296 | } 297 | ], 298 | "source": [ 299 | "model.eval()\n", 300 | "preds = []\n", 301 | "ys = []\n", 302 | "print(len(dataloader_val))\n", 303 | "for i, (x, y) in enumerate(dataloader_val):\n", 304 | " x, y = Variable(x), Variable(y)\n", 305 | " preds.extend(model(x).max(1)[1].data.tolist())\n", 306 | " 
ys.extend(y.data)\n", 307 | " if not i % 30:\n", 308 | " print(i)\n", 309 | "\n", 310 | "corrects = (np.array(preds) == np.array(ys))\n", 311 | "print('Accuracy: {}'.format(corrects.mean()))\n" 312 | ] 313 | }, 314 | { 315 | "cell_type": "code", 316 | "execution_count": 12, 317 | "metadata": { 318 | "slideshow": { 319 | "slide_type": "slide" 320 | } 321 | }, 322 | "outputs": [ 323 | { 324 | "data": { 325 | "text/plain": [ 326 | "" 327 | ] 328 | }, 329 | "execution_count": 12, 330 | "metadata": {}, 331 | "output_type": "execute_result" 332 | }, 333 | { 334 | "data": { 335 | "image/png": "iVBORw0KGgoAAAANSUhEUgAAAP4AAAECCAYAAADesWqHAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDIuMi4yLCBodHRwOi8vbWF0cGxvdGxpYi5vcmcvhp/UCwAAC5dJREFUeJzt3U2MVfUZx/Hfb15gGNQgtonI0IKJ0RpjhY6tSmITMLFqq5suNJGkbmbTKhoTo02Nu66M0UVjMsGaphJdIAtrjNr4smhtqSPQIo5NDFreI61FKAZhmKeLuTQW6dwz9Pzvmcvz/SQmMF6fPBnm67lzOfc/jggByKWn6QUAdB7hAwkRPpAQ4QMJET6QEOEDCTUWvu3v2f6r7Q9sP9jUHlXZXmL7DdvjtrfbXtv0TlXY7rW9xfaLTe9She0FtjfYfr/1ub626Z3asX1f62viXdvP2h5oeqd2Ggnfdq+kX0i6SdLlku6wfXkTu8zAhKT7I+Ibkq6R9OMu2FmS1koab3qJGXhC0ssRcZmkb2qW7257saR7JA1HxBWSeiXd3uxW7TV1xf+2pA8iYkdEHJP0nKTbGtqlkojYFxGbW78+rKkvyMXNbjU920OSbpG0ruldqrB9nqTrJT0lSRFxLCIONrtVJX2S5tnukzQoaW/D+7TVVPiLJe36wu93a5ZH9EW2l0paLmlTs5u09bikByRNNr1IRRdLOiDp6da3J+tsz296qelExB5Jj0raKWmfpE8j4tVmt2qvqfB9mo91xb3Dts+R9LykeyPiUNP7/C+2vy/p44h4p+ldZqBP0gpJT0bEcklHJM3q139sn6+pZ6vLJF0kab7tO5vdqr2mwt8tackXfj+kLnh6ZLtfU9Gvj4iNTe/TxkpJt9r+SFPfSq2y/UyzK7W1W9LuiDj5TGqDpv5HMJvdIOnDiDgQEcclbZR0XcM7tdVU+G9LusT2MttzNPViyAsN7VKJbWvqe8/xiHis6X3aiYiHImIoIpZq6vP7ekTM6itRROyXtMv2pa0PrZb0XoMrVbFT0jW2B1tfI6s1y1+QlKaeWnVcREzY/omkVzT1KugvI2J7E7vMwEpJayRts7219bGfRsRLDe50Nrpb0vrWBWGHpLsa3mdaEbHJ9gZJmzX1Nz9bJI02u1V75m25QD7cuQckRPhAQoQPJET4QEKEDyTUePi2R5reYSa6bV+JnTuh2/ZtPHxJXfUJU/ftK7FzJ3TVvrMhfAAdVuQGngULe2PRULWbAg9+ckILFvZWeuyebQXfqOXTvW/oy47HUfXP4JwF91SbeybiRLU33R3X5+rX3GJ7VHXigup/fhNHj6hvoPrje/9x5ExWqs1s+Rwf1REdi8/bftEVuWV30VCfnv7NotrnPr
zs6tpnnuT+OUXm9swrdxjLiUOz9s2Bp/XPH5Q7TOf8X/2h2Oxusileq/Q4nuoDCRE+kBDhAwkRPpAQ4QMJVQq/287ABzC9tuF36Rn4AKZR5YrfdWfgA5helfC7+gx8AF9WJfxKZ+DbHrE9Znvs4Ccn/v/NABRTJfxKZ+BHxGhEDEfEcNV77wE0o0r4XXcGPoDptX2TTpeegQ9gGpXendf6oRH84AjgLMGde0BChA8kRPhAQoQPJET4QEJFDts8zwvjO15d+9xX9m5t/6AzdONFV5UZXPEQzzPCTzrGKTbFazoUn7T9ouOKDyRE+EBChA8kRPhAQoQPJET4QEKEDyRE+EBChA8kRPhAQoQPJET4QEKEDyRE+EBChA8kRPhAQoQPJET4QEKEDyRE+EBChA8kRPhAQpV+aOZM2VbPwEDtc4sdgS3Jry8uMjdW7y0yV5L6Fl1YZO7Evv1F5vZceVmRuZI0+Zf3yww+S49H54oPJET4QEKEDyRE+EBChA8kRPhAQoQPJNQ2fNtLbL9he9z2dttrO7EYgHKq3MAzIen+iNhs+1xJ79j+bUS8V3g3AIW0veJHxL6I2Nz69WFJ45LK3OYGoCNm9D2+7aWSlkvaVGIZAJ1R+V592+dIel7SvRFx6DT/fkTSiCQNeH5tCwKoX6Urvu1+TUW/PiI2nu4xETEaEcMRMTxHc+vcEUDNqryqb0lPSRqPiMfKrwSgtCpX/JWS1khaZXtr65+bC+8FoKC23+NHxO8kFXxTMoBO4849ICHCBxIifCAhwgcSInwgoSKn7MqWenvrHzu33I1BcUOZk2V/vqPc3c0Pr7ix2OwiPtjZ9AZo4YoPJET4QEKEDyRE+EBChA8kRPhAQoQPJET4QEKEDyRE+EBChA8kRPhAQoQPJET4QEKEDyRE+EBChA8kRPhAQoQPJET4QEKEDyRE+EBCRY7XjslJTX72WYnRxUysWlFk7s+uHCwyV5LueHtbkbnrLxsqMvfEVZcUmStJfuvPxWafjbjiAwkRPpAQ4QMJET6QEOEDCRE+kBDhAwlVDt92r+0ttl8suRCA8mZyxV8rabzUIgA6p1L4tock3SJpXdl1AHRC1Sv+45IekDRZcBcAHdI2fNvfl/RxRLzT5nEjtsdsjx3X57UtCKB+Va74KyXdavsjSc9JWmX7mVMfFBGjETEcEcP9mlvzmgDq1Db8iHgoIoYiYqmk2yW9HhF3Ft8MQDH8PT6Q0Izejx8Rb0p6s8gmADqGKz6QEOEDCRE+kBDhAwkRPpBQkVN2JUkRxUaX0PfatDcmnrGS9ziXOg3317t+X2TumiVFxk7p6S0zNwr+Cdr1z6yYHVd8ICHCBxIifCAhwgcSInwgIcIHEiJ8ICHCBxIifCAhwgcSInwgIcIHEiJ8ICHCBxIifCAhwgcSInwgIcIHEiJ8ICHCBxIifCChIqfsuqdHPeecW/vcycOHa595Ut/SrxWZO7n/4yJzJcnz5hWZu2bJyiJz9997XZG5knTh428Vmeu+ggdRT0wUm90OV3wgIcIHEiJ8ICHCBxIifCAhwgcSInwgoUrh215ge4Pt922P27629GIAyql6d8ITkl6OiB/aniNpsOBOAAprG77t8yRdL+lHkhQRxyQdK7sWgJKqPNW/WNIBSU/b3mJ7ne35hfcCUFCV8PskrZD0ZEQsl3RE0oOnPsj2iO0x22PH4mjNawKoU5Xwd0vaHRGbWr/foKn/EfyXiBiNiOGIGJ7jgTp3BFCztuFHxH5Ju2xf2vrQaknvFd0KQFFVX9W/W9L61iv6OyTdVW4lAKVVCj8itkoaLrwLgA7hzj0gIcIHEiJ8ICHCBxIifCAhwgcSKnJ2cExOFj0Ku4SJj3Y2vcLMHS1za3TPufUfjS6VOwJbkl7as7nI3JuHvlVkriT1XrCw9pk+2FvpcVzxgYQIH0iI8IGECB9IiPCBhAgfSIjwgYQIH0iI8IGECB9IiPCBhAgfSIjwgYQIH0iI8IGECB
9IiPCBhAgfSIjwgYQIH0iI8IGEipyyK1vun1P72Dh+rPaZ/2GXmRtRZq5UbOdSJySX+Jo46ebFK4rMfWHPn4rMlaRbF19d+8yIE5UexxUfSIjwgYQIH0iI8IGECB9IiPCBhAgfSKhS+Lbvs73d9ru2n7U9UHoxAOW0Dd/2Ykn3SBqOiCsk9Uq6vfRiAMqp+lS/T9I8232SBiXtLbcSgNLahh8ReyQ9KmmnpH2SPo2IV0svBqCcKk/1z5d0m6Rlki6SNN/2nad53IjtMdtjx+No/ZsCqE2Vp/o3SPowIg5ExHFJGyVdd+qDImI0IoYjYrif1/6AWa1K+DslXWN70LYlrZY0XnYtACVV+R5/k6QNkjZL2tb6b0YL7wWgoErvx4+IRyQ9UngXAB3CnXtAQoQPJET4QEKEDyRE+EBChA8kVOZ47YiyR2GXUPIY7FIK7dwzf36RuZNHjhSZK0k9g4NF5pY4Avuk53f/sfaZ373pX5UexxUfSIjwgYQIH0iI8IGECB9IiPCBhAgfSIjwgYQIH0iI8IGECB9IiPCBhAgfSIjwgYQIH0iI8IGECB9IiPCBhAgfSIjwgYQIH0jIUeCkVtsHJP2t4sO/IunvtS9RTrftK7FzJ8yWfb8eEV9t96Ai4c+E7bGIGG50iRnotn0ldu6EbtuXp/pAQoQPJDQbwh9teoEZ6rZ9JXbuhK7at/Hv8QF03my44gPoMMIHEiJ8ICHCBxIifCChfwNG8ppeSzgZ/QAAAABJRU5ErkJggg==\n", 336 | "text/plain": [ 337 | "
" 338 | ] 339 | }, 340 | "metadata": {}, 341 | "output_type": "display_data" 342 | } 343 | ], 344 | "source": [ 345 | "from sklearn.metrics import confusion_matrix\n", 346 | "\n", 347 | "plt.matshow(confusion_matrix(np.array(preds), np.array(ys)))\n", 348 | "\n" 349 | ] 350 | }, 351 | { 352 | "cell_type": "markdown", 353 | "metadata": { 354 | "slideshow": { 355 | "slide_type": "slide" 356 | } 357 | }, 358 | "source": [ 359 | "Shirts get confused with T-shirts." 360 | ] 361 | } 362 | ], 363 | "metadata": { 364 | "celltoolbar": "Slideshow", 365 | "kernelspec": { 366 | "display_name": "Python 3", 367 | "language": "python", 368 | "name": "python3" 369 | }, 370 | "language_info": { 371 | "codemirror_mode": { 372 | "name": "ipython", 373 | "version": 3 374 | }, 375 | "file_extension": ".py", 376 | "mimetype": "text/x-python", 377 | "name": "python", 378 | "nbconvert_exporter": "python", 379 | "pygments_lexer": "ipython3", 380 | "version": "3.6.3" 381 | } 382 | }, 383 | "nbformat": 4, 384 | "nbformat_minor": 2 385 | } 386 | --------------------------------------------------------------------------------