├── .gitattributes
├── README.md
├── assignments
    └── README.md
├── notebooks
    ├── CEC.ipynb
    ├── Convolution.ipynb
    ├── Dropout.ipynb
    ├── Linear.ipynb
    ├── MNIST_GAN.ipynb
    ├── Max-Pool.ipynb
    ├── NN.ipynb
    ├── ReLU.ipynb
    ├── Transposed Convolution.ipynb
    ├── WeightInit.ipynb
    ├── cifar10
    │   ├── te_data.bin
    │   └── te_labels.bin
    └── mnist
    │   ├── test_32x32.t7
    │   └── train_32x32.t7
└── projects
    ├── glyphs_sample.png
    └── readme.md


/.gitattributes:
--------------------------------------------------------------------------------
1 | *.t7 filter=lfs diff=lfs merge=lfs -text
2 | *.pdf filter=lfs diff=lfs merge=lfs -text
3 |


--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | 2 |

Computer Vision (CS 763) - Spring 2018

3 | 4 |

Course Information

5 | 12 | 13 |

Please note that CS663 is a hard prerequisite for this course.

14 | 15 |

Topics to be covered (tentative)

16 | 36 | 37 |

Learning materials and textbooks

38 | 45 | 46 |

Grading Policy

47 | 56 | 57 |

Other Policies

58 | 63 | 64 |

Course Projects

65 | 66 | [02/02/2018] Course projects have now been finalized. 67 | 68 | Go to this link for the finalized list. 69 | 70 |

Assignments

71 | 79 | 80 |

Lecture Schedule:

Date | Topics | Slides | iTorch Notebooks | Extra Reading
4th Jan. 2018
  • Introduction to computer vision, applications and course overview
    Slides -- 96 | --
    5th Jan. 2018 103 | Camera Geometry 104 |
      105 |
    • Homogeneous coordinates and projective geometry 106 |
    • Vanishing points, ideal line, point line duality in P2 107 |
    • Important 2D and 3D transformations using homogeneous coordinates 108 |
    • Introduction to the pin-hole camera model 109 |
    Slides -- 112 | Homogeneous Representations of Points, Lines and Planes
    12th Jan. 2018
      118 |
    • Modeling the pinhole camera analytically, intrinsic and extrinsic parameters 119 |
    • World, camera, image plane and sensor plane coordinate systems and transformations between them 120 |
    • Linear and non-linear (lens distortion) errors 121 |
    • Homography, planar world and pure rotation of the camera 122 |
    Slides -- 125 | --
    13th Jan. 2018
      132 |
    • Iterative solutions for dealing with non-linear (lens distortion) errors 133 |
    • Normalized, ideal, Euclidean, affine and general camera models 134 |
    • Orthographic and weak-perspective camera models 135 |
    • Cross ratios and their applications 136 |
    • Camera calibration using DLT (known 3D control points) 137 |
    Slides -- 140 | Resource on SVD, how/why it can be used to solve eq. systems of type Ax=0, |x|=1
    18th Jan. 2018
      147 |
    • Zhang's camera calibration method, mention of a few DL based calibration methods 148 |
    149 | Image Alignment 150 |
      151 |
    • Image alignment: problem statement, physically and digitally corresponding points 152 |
    • Motion models and degrees of freedom; non-rigid/deformable/non-parametric image alignment 153 |
    • Control point based image alignment using least squares - derivation for pseudo-inverse 154 |
    • Introduction to the SIFT algorithm 155 |
    • Forward and reverse image warping - bilinear and nearest-neighbor interpolation 156 |
    • Mention of DL based image patch descriptors 157 |
    159 | Slides(1)
    160 | Slides(2) 161 |
    -- --
    19th Jan. 2018
      168 |
    • Image alignment using image similarity measures: mean squared error, normalized cross-correlation 169 |
    • Concept of field of view in image alignment using image similarity measures 170 |
    • Monomodal and multimodal image alignment 171 |
    • Concept of joint histograms and behaviour of joint histograms in multi-modal image alignment 172 |
    • Concept of entropy and joint entropy, algorithm for multimodal registration by minimizing joint entropy 173 |
    • Aspects of image registration: 2D/3D, motion model, monomodal or multimodal 174 |
    • Application scenarios for image alignment: template matching, video stabilization, panorama generation, face recognition, 3D to 2D alignment 175 |
    176 | Robust Methods in Computer Vision 177 |
      178 |
    • Least squares problems and their relation to the Gaussian distribution on the noise 179 |
    • Examples of outliers in computer vision 180 |
    • Explanation of why the Gaussian distribution is unsuited to handling outliers 181 |
    • Introduction to the Laplacian distribution 182 |
    • The importance of heavy-tailed distributions in robust statistics 183 |
    • RANSAC (random sample consensus) algorithm 184 |
    186 | Slides(1)
    187 | Slides(2) 188 |
    -- --
    25th Jan. 2018 196 | Recognizing images, objects, scenes (Prof. Suyash P. Awate) 197 |
      198 |
    • Texture modeling and classification 199 |
    • Image classification, challenges 200 |
    • Bag of words model, dictionary learning 201 |
    • Defining image similarity, pyramid match kernel (PMK) 202 |
    204 | Slides
    205 |
    -- --
    1st Feb. 2018 212 | Recognizing images, objects, scenes (Prof. Suyash P. Awate) 213 |
      214 |
    • Pyramid match kernel (PMK) 215 |
    • Kernel coding, local coding, vector quantization, sparse coding, LcLC 216 |
    218 | Slides
    219 |
    -- --
    2nd Feb. 2018 226 | Robust Methods in Computer Vision 227 |
      228 |
    • RANSAC: time complexity and expected no. of iterations 229 |
    • Using RANSAC for Homography estimation 230 |
    • Introduction to the Laplacian distribution 231 |
    • Mean versus median: L2 fit versus L1 fit 232 |
    • LMeds: Least Median of Squares 233 |
    234 | Deep Learning for Computer Vision 235 |
      236 |
    • History, introduction 237 |
    • Data driven paradigm 238 |
    • K-NN on CIFAR 10 239 |
    • Hyperparameters, choice of loss function, cross-validation 240 |
    241 |
    243 | Slides(1)
    244 | Slides(2)
    245 |
    KNN Matrix calculus reminder 248 |
    8th Feb. 2018 253 |
      254 |
    • Softmax classifier, cross-entropy loss function, regularization 255 |
    • Optimization: vanilla gradient descent, stochastic gradient descent 256 |
    • Vanilla momentum, Nesterov momentum, AdaGrad, RMSProp, ADAM 257 |
    • Second-order optimization methods, their issues with deep learning 258 |
    • Good learning rate, learning rate decay 259 |
    261 | Slides
    262 |
    Gradient Check ADAM, 265 | Nesterov
    266 | DL optimization algorithms overview 267 |
    9th Feb. 2018 272 |
      273 |
    • Feed forward, back-propagation 274 |
    • Fully connected layer 275 |
    • Activation functions: sigmoid, tanh, ReLU, LeakyReLU, ELU, etc. 276 |
    278 | Slides
    279 |
    Linear Layer, 281 | ReLU 282 | --
    15th Feb. 2018 288 |
      289 |
    • Convolutions: transposed, dilated, fully-connected as convolution, sliding window as convolution 290 |
    • Max-pooling, Dropout 291 |
    293 | Slides
    294 |
    MaxPool, 296 | Convolution, 297 | Transposed convolution, Dropout 298 | Convolution arithmetic for deep 300 | learning
    16th Feb. 2018 305 |
      306 |
    • SoftMax, Cross Entropy 307 |
    • Data Augmentation, hyperparameter selection 308 |
    • Weight initialization 309 |
    311 | Slides
    312 |
    Cross Entropy, 314 | Weight Initialization 315 | --
    22nd Feb. 2018 320 |
      321 |
    • ConvNet applications 322 |
    • ConvNet case studies 323 |
    325 | Slides
    326 |
    328 | -- 329 | --
    23rd Feb. 2018 335 |
      336 |
    • RNNs, LSTMs 337 |
    • Visualizing and understanding ConvNets 338 |
    340 | Slides
    341 |
    -- 343 | --
    8th March 2018 349 |
      350 |
    • Visualizing and understanding ConvNets 351 |
    • Images that maximize ConvNet class scores, reconstructing images from ConvNet codes 352 |
    • Deep Dream, Neural Art, Adversarial Examples 353 |
    • Dimensionality reduction: siamese and triplet networks 354 |
    356 | Slides
    357 |
    -- 359 | --
    9th March 2018 366 |
      367 |
    • Other vision tasks: semantic segmentation, object localization, object detection, instance segmentation 368 |
    • R-CNN, Mask R-CNN 369 |
    • Autoencoders 370 |
    • Generative modeling: VAEs, GANs 371 |
    • Case studies: pix2pix, CycleGAN, UNIT 372 |
    374 | Slides
    375 | Slides
    376 |
    MNIST Vanilla GAN 378 | --
    15th March 2018 384 |
      385 |
    • Deep Reinforcement Learning (Prof. Shivaram Kalyanakrishnan) 386 |
    388 | Slides 389 | -- --
    16th March 2018 396 | Structure from Motion 397 |
      398 |
    • Motion as a cue to inference of 3D structure from images 399 |
    • Motion factorization algorithm by Tomasi and Kanade for inference of (sparse) 3D structure of a fixed object being observed by a moving orthographic camera (or a rigidly moving object, being observed by a fixed orthographic camera) 400 |
    • Aspects of the above algorithm: Rank theorem, metric constraints for inference of motion parameters and 3D structure 401 |
    403 | Slides
    404 |
    -- --
    22nd March 2018 411 | Kanade-Lucas-Tomasi Feature Tracking (KLT) 412 |
      413 |
    • Tracking feature-points from a template by estimating motion parameters. 414 |
    • Finding good features to track. 415 |
    417 | Slides
    418 |
    -- Lucas-Kanade 20 Years On: A Unifying Framework
    23rd March 2018 425 | Geometric Stereo 426 |
      427 |
    • Orientation parameters for the camera pair and relative orientation. 428 |
    • Coplanarity constraint for corresponding points 429 |
    • Derivation and key properties of the Fundamental matrix 430 |
    432 | Slides
    433 |
    -- --
    5th April 2018 440 |
      441 |
    • Introduction to epipolar geometry 442 |
    • Essential matrix 443 |
    • Popular parameterizations for the relative orientation 444 |
    • Generating the normalized stereo case from arbitrary views 445 |
    447 | Slides
    448 |
    -- --
    6th April 2018 455 |
      456 |
    • Direct Solutions for Computing Fundamental and Essential Matrix 457 |
    • 8-point algorithm 458 |
    • Triangulation 459 |
    461 | Slides(1)
    462 | Slides(2)
    463 |
    -- --
    12th April 2018 470 |
      471 |
    • Absolute Orientation 472 |
    • Multi-View Geometry and Bundle Adjustment 473 |
    475 | Slides
    476 |
    -- --
    19th April 2018 483 |
      484 |
    • Shape from Shading: Introduction 485 |
    • Reflectance Models 486 |
    488 | Slides
    489 |
    -- --
    20th April 2018 496 |
      497 |
    • Photometric Stereo 498 |
    500 | Slides
    501 |
    -- --
    507 | 508 | 509 | -------------------------------------------------------------------------------- /assignments/README.md: -------------------------------------------------------------------------------- 1 |

    Computer Vision (CS 763) - Spring 2018 Assignment Information

    2 |

    Updates

    3 |
      4 | 5 |
    1. [12-Jan-18] Assignment 1 on Camera Geometry has been released. Due date: January 26, 2018. 6 |
    2. [27-Jan-18] Assignment 2 on Image Alignment and Robust Methods has been released. Due date: February 4, 2018. 7 |
    3. [9-Feb-18] Assignment 3 on Neural Networks has been released. Due date: February 21, 2018. Corresponding kaggle competition link 8 |
    4. [06-Mar-18] Assignment 4 on RNNs has been released. The due date for submission is Monday, March 19, 2018. Corresponding kaggle competition link 9 |
    5. [24-March-18] Assignment 5 on Tracking has been released. Due date: April 2, 2018. Download the necessary files from here 10 |
    6. [11-April-18] Assignment 6 on Multiview Geometry has been released. Due date: April 19, 2018. 11 |
    12 | -------------------------------------------------------------------------------- /notebooks/CEC.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": { 6 | "deletable": true, 7 | "editable": true 8 | }, 9 | "source": [ 10 | "# Cross-Entropy Criterion Layer\n", 11 | "In this notebook, we will look into the forward and backward passes of the ```nn.CrossEntropyCriterion``` layer. We will also see how to compute the gradient of the loss with respect to the output of the network $\\frac{\\partial L}{\\partial O}$." 12 | ] 13 | }, 14 | { 15 | "cell_type": "markdown", 16 | "metadata": { 17 | "deletable": true, 18 | "editable": true 19 | }, 20 | "source": [ 21 | "#### Input\n", 22 | "We explicitly initialize the output values as below:" 23 | ] 24 | }, 25 | { 26 | "cell_type": "code", 27 | "execution_count": 1, 28 | "metadata": { 29 | "collapsed": false, 30 | "deletable": true, 31 | "editable": true 32 | }, 33 | "outputs": [ 34 | { 35 | "data": { 36 | "text/plain": [ 37 | " 3.2000\n", 38 | " 5.1000\n", 39 | "-1.7000\n", 40 | "[torch.DoubleTensor of size 3]\n", 41 | "\n" 42 | ] 43 | }, 44 | "execution_count": 1, 45 | "metadata": {}, 46 | "output_type": "execute_result" 47 | } 48 | ], 49 | "source": [ 50 | "o = torch.Tensor({3.2, 5.1, -1.7}) \n", 51 | "print(o)" 52 | ] 53 | }, 54 | { 55 | "cell_type": "markdown", 56 | "metadata": { 57 | "deletable": true, 58 | "editable": true 59 | }, 60 | "source": [ 61 | "#### Target\n", 62 | "Assume that the target should have been the $1^{st}$ class" 63 | ] 64 | }, 65 | { 66 | "cell_type": "code", 67 | "execution_count": 2, 68 | "metadata": { 69 | "collapsed": false, 70 | "deletable": true, 71 | "editable": true 72 | }, 73 | "outputs": [ 74 | { 75 | "data": { 76 | "text/plain": [ 77 | " 1\n", 78 | "[torch.DoubleTensor of size 1]\n", 79 | "\n" 80 | ] 81 | }, 82 | "execution_count": 2, 83 | "metadata": {}, 84 | "output_type": 
"execute_result" 85 | } 86 | ], 87 | "source": [ 88 | "t = torch.Tensor({1})\n", 89 | "print(t)" 90 | ] 91 | }, 92 | { 93 | "cell_type": "markdown", 94 | "metadata": { 95 | "deletable": true, 96 | "editable": true 97 | }, 98 | "source": [ 99 | "#### Calculate the Loss\n", 100 | "We verify that the loss is equal to the value that we manually calculated in class." 101 | ] 102 | }, 103 | { 104 | "cell_type": "code", 105 | "execution_count": 3, 106 | "metadata": { 107 | "collapsed": false, 108 | "deletable": true, 109 | "editable": true 110 | }, 111 | "outputs": [ 112 | { 113 | "data": { 114 | "text/plain": [ 115 | "2.0403551528002\t\n" 116 | ] 117 | }, 118 | "execution_count": 3, 119 | "metadata": {}, 120 | "output_type": "execute_result" 121 | } 122 | ], 123 | "source": [ 124 | "require 'nn';\n", 125 | "cec = nn.CrossEntropyCriterion()\n", 126 | "err = cec:forward(o, t)\n", 127 | "print(err)" 128 | ] 129 | }, 130 | { 131 | "cell_type": "markdown", 132 | "metadata": { 133 | "deletable": true, 134 | "editable": true 135 | }, 136 | "source": [ 137 | "#### Gradient of Loss with respect to Output\n", 138 | "Now, let us look into the gradient of the loss with respect to the output of the network $\\frac{\\partial L}{\\partial O}$. We know that the loss is equal to:\n", 139 | "\"Cross-Entropy\n", 140 | "As we saw in the class, the error is equal to $\\hat{o}-[1, 0, 0]^{T}$, where $\\hat{o}=SoftMax(o)$. So, let us now use torch to calculate the gradient of the loss with respect to the output and then also do the same manually." 
141 | ] 142 | }, 143 | { 144 | "cell_type": "code", 145 | "execution_count": 4, 146 | "metadata": { 147 | "collapsed": false, 148 | "deletable": true, 149 | "editable": true 150 | }, 151 | "outputs": [ 152 | { 153 | "data": { 154 | "text/plain": [ 155 | " 0.1300\n", 156 | " 0.8690\n", 157 | " 0.0010\n", 158 | "[torch.DoubleTensor of size 3]\n", 159 | "\n" 160 | ] 161 | }, 162 | "execution_count": 4, 163 | "metadata": {}, 164 | "output_type": "execute_result" 165 | } 166 | ], 167 | "source": [ 168 | "ohat = nn.SoftMax():forward(o)\n", 169 | "print(ohat)" 170 | ] 171 | }, 172 | { 173 | "cell_type": "code", 174 | "execution_count": 5, 175 | "metadata": { 176 | "collapsed": false, 177 | "deletable": true, 178 | "editable": true 179 | }, 180 | "outputs": [ 181 | { 182 | "data": { 183 | "text/plain": [ 184 | "-0.8700\n", 185 | " 0.8690\n", 186 | " 0.0010\n", 187 | "[torch.DoubleTensor of size 3]\n", 188 | "\n" 189 | ] 190 | }, 191 | "execution_count": 5, 192 | "metadata": {}, 193 | "output_type": "execute_result" 194 | } 195 | ], 196 | "source": [ 197 | "dl_do = cec:backward(o, t)\n", 198 | "print(dl_do)" 199 | ] 200 | }, 201 | { 202 | "cell_type": "code", 203 | "execution_count": 6, 204 | "metadata": { 205 | "collapsed": false, 206 | "deletable": true, 207 | "editable": true 208 | }, 209 | "outputs": [ 210 | { 211 | "data": { 212 | "text/plain": [ 213 | "-0.8700\n", 214 | " 0.8690\n", 215 | " 0.0010\n", 216 | "[torch.DoubleTensor of size 3]\n", 217 | "\n" 218 | ] 219 | }, 220 | "execution_count": 6, 221 | "metadata": {}, 222 | "output_type": "execute_result" 223 | } 224 | ], 225 | "source": [ 226 | "target = torch.Tensor({1, 0, 0})\n", 227 | "dl_do_manual = ohat - target\n", 228 | "print(dl_do_manual)" 229 | ] 230 | }, 231 | { 232 | "cell_type": "markdown", 233 | "metadata": { 234 | "deletable": true, 235 | "editable": true 236 | }, 237 | "source": [ 238 | "Note how ```dl_do``` and ```dl_do_manual``` are exactly the same." 
239 | ] 240 | } 241 | ], 242 | "metadata": { 243 | "kernelspec": { 244 | "display_name": "iTorch", 245 | "language": "lua", 246 | "name": "itorch" 247 | }, 248 | "language_info": { 249 | "name": "lua", 250 | "version": "5.1" 251 | } 252 | }, 253 | "nbformat": 4, 254 | "nbformat_minor": 2 255 | } 256 | -------------------------------------------------------------------------------- /notebooks/Convolution.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "# Convolution Layer\n", 8 | "In this notebook, we will look into the forward and backward passes of the ```nn.SpatialConvolution``` layer. We will also see how to compute the gradient of the loss with respect to the weights $\\frac{\\partial L}{\\partial W}$ for this layer. I leave the gradient of the loss with respect to the input $\\frac{\\partial L}{\\partial I}$ as an exercise." 9 | ] 10 | }, 11 | { 12 | "cell_type": "markdown", 13 | "metadata": {}, 14 | "source": [ 15 | "#### Input\n", 16 | "Similar to the example we used in the class, let us have the input $n$ of size $1 \\times 4$." 17 | ] 18 | }, 19 | { 20 | "cell_type": "code", 21 | "execution_count": 1, 22 | "metadata": {}, 23 | "outputs": [ 24 | { 25 | "data": { 26 | "text/plain": [ 27 | "(1,.,.) 
= \n", 28 | " 0.9741 0.6880 0.2478 0.5842\n", 29 | "[torch.DoubleTensor of size 1x1x4]\n", 30 | "\n" 31 | ] 32 | }, 33 | "execution_count": 1, 34 | "metadata": {}, 35 | "output_type": "execute_result" 36 | } 37 | ], 38 | "source": [ 39 | "require 'nn';\n", 40 | "n = torch.rand(4):reshape(1,1,4)\n", 41 | "print(n)" 42 | ] 43 | }, 44 | { 45 | "cell_type": "markdown", 46 | "metadata": {}, 47 | "source": [ 48 | "#### Output using Convolution Block\n", 49 | "Also similar to the example we used in the class, let us create a convolution block with weights $w$ of size $1 \\times 3$ and obtain the output $m = Convolution(n, w)$ of size $1\\times2$." 50 | ] 51 | }, 52 | { 53 | "cell_type": "code", 54 | "execution_count": 2, 55 | "metadata": {}, 56 | "outputs": [ 57 | { 58 | "data": { 59 | "text/plain": [ 60 | "(1,.,.) = \n", 61 | " 0.1151 -0.0816\n", 62 | "[torch.DoubleTensor of size 1x1x2]\n", 63 | "\n" 64 | ] 65 | }, 66 | "execution_count": 2, 67 | "metadata": {}, 68 | "output_type": "execute_result" 69 | } 70 | ], 71 | "source": [ 72 | "conv = nn.SpatialConvolutionMM(1,1,3,1)\n", 73 | "conv.bias:fill(0)\n", 74 | "m = conv:forward(n)\n", 75 | "print(m)" 76 | ] 77 | }, 78 | { 79 | "cell_type": "markdown", 80 | "metadata": {}, 81 | "source": [ 82 | "#### Doing backward and calculating the gradient of the loss with respect to the weights\n", 83 | "Let us set the gradient of the loss with respect to the input of the next layer (flowing in through the next layer) $\\frac{\\partial L}{\\partial I^{l+1}}$ to be random numbers. After the ```backward()``` call, let us print the $\\frac{\\partial L}{\\partial W}$ as calculated by torch." 
84 | ] 85 | }, 86 | { 87 | "cell_type": "code", 88 | "execution_count": 3, 89 | "metadata": {}, 90 | "outputs": [ 91 | { 92 | "data": { 93 | "text/plain": [ 94 | " 0.3516 0.1675 0.2284\n", 95 | "[torch.DoubleTensor of size 1x3]\n", 96 | "\n" 97 | ] 98 | }, 99 | "execution_count": 3, 100 | "metadata": {}, 101 | "output_type": "execute_result" 102 | } 103 | ], 104 | "source": [ 105 | "nextgrad=torch.rand(2):reshape(1,1,2)\n", 106 | "conv:backward(n, nextgrad)\n", 107 | "print(conv.gradWeight)" 108 | ] 109 | }, 110 | { 111 | "cell_type": "markdown", 112 | "metadata": {}, 113 | "source": [ 114 | "#### Expression for calculating the gradient of the loss with respect to the weights\n", 115 | "Again, as we learnt in class, $\\frac{\\partial L}{\\partial W} = Convolution(I, \\frac{\\partial L}{\\partial I^{l+1}})$. We verify that this is indeed exactly equal to the $\\frac{\\partial L}{\\partial W}$ computed by torch. " 116 | ] 117 | }, 118 | { 119 | "cell_type": "code", 120 | "execution_count": 4, 121 | "metadata": {}, 122 | "outputs": [ 123 | { 124 | "data": { 125 | "text/plain": [ 126 | "(1,.,.) = \n", 127 | " 0.3516 0.1675 0.2284\n", 128 | "[torch.DoubleTensor of size 1x1x3]\n", 129 | "\n" 130 | ] 131 | }, 132 | "execution_count": 4, 133 | "metadata": {}, 134 | "output_type": "execute_result" 135 | } 136 | ], 137 | "source": [ 138 | "convback = nn.SpatialConvolutionMM(1,1,2,1)\n", 139 | "convback.bias:fill(0)\n", 140 | "convback.weight:copy(nextgrad:reshape(1,2))\n", 141 | "gradWeight = convback:forward(n)\n", 142 | "print(gradWeight)" 143 | ] 144 | }, 145 | { 146 | "cell_type": "markdown", 147 | "metadata": {}, 148 | "source": [ 149 | "I leave the calculation of $\\frac{\\partial L}{\\partial I}$ as an exercise. 
Remember: $\\frac{\\partial L}{\\partial I} = Convolution(W^{padded}, \\frac{\\partial L}{\\partial I^{l+1}}^{Flip})$" 150 | ] 151 | } 152 | ], 153 | "metadata": { 154 | "kernelspec": { 155 | "display_name": "iTorch", 156 | "language": "lua", 157 | "name": "itorch" 158 | }, 159 | "language_info": { 160 | "name": "lua", 161 | "version": "5.1" 162 | } 163 | }, 164 | "nbformat": 4, 165 | "nbformat_minor": 2 166 | } 167 | -------------------------------------------------------------------------------- /notebooks/Dropout.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": { 6 | "deletable": true, 7 | "editable": true 8 | }, 9 | "source": [ 10 | "# Dropout Layer\n", 11 | "In this notebook, we will look into the dropout layer.\n", 12 | "Consider a standard 3-layer network as shown below in (a):\n", 13 | "\n", 14 | "During training with dropout, as shown below in the code snippet, we randomly set some neurons to zero in the forward pass. The network with dropout on can be seen in (b)." 
15 | ] 16 | }, 17 | { 18 | "cell_type": "code", 19 | "execution_count": 20, 20 | "metadata": { 21 | "collapsed": false, 22 | "deletable": true, 23 | "editable": true 24 | }, 25 | "outputs": [ 26 | { 27 | "data": { 28 | "text/plain": [ 29 | "Hidden Layer 1 output before dropout:\t\n", 30 | " 0.2164\n", 31 | " 0.0931\n", 32 | "-0.3029\n", 33 | "-0.6296\n", 34 | " 0.3302\n", 35 | "[torch.DoubleTensor of size 5]\n", 36 | "\n", 37 | "Dropout Mask:\t\n", 38 | " 1\n", 39 | " 1\n", 40 | " 1\n", 41 | " 1\n", 42 | " 0\n", 43 | "[torch.DoubleTensor of size 5]\n", 44 | "\n", 45 | "Hidden Layer 1 output after dropout\t\n", 46 | " 0.2164\n", 47 | " 0.0931\n", 48 | "-0.3029\n", 49 | "-0.6296\n", 50 | " 0.0000\n", 51 | "[torch.DoubleTensor of size 5]\n", 52 | "\n", 53 | "-----------------------------------\t\n", 54 | "Hidden Layer 2 output before dropout\t\n", 55 | " 0.1721\n", 56 | " 0.0472\n", 57 | " 0.1428\n", 58 | " 0.1557\n", 59 | " 0.0723\n", 60 | "[torch.DoubleTensor of size 5]\n", 61 | "\n", 62 | "Dropout Mask:\t\n", 63 | " 1\n", 64 | " 1\n", 65 | " 0\n", 66 | " 0\n", 67 | " 1\n", 68 | "[torch.DoubleTensor of size 5]\n", 69 | "\n", 70 | "Hidden Layer 2 output after dropout\t\n", 71 | " 0.1721\n", 72 | " 0.0472\n", 73 | " 0.0000\n", 74 | " 0.0000\n", 75 | " 0.0723\n", 76 | "[torch.DoubleTensor of size 5]\n", 77 | "\n", 78 | "-----------------------------------\t\n", 79 | "Hidden Layer 3 output before dropout\t\n", 80 | "-0.3645\n", 81 | "-0.1621\n", 82 | "-0.0378\n", 83 | "-0.0005\n", 84 | "-0.1822\n", 85 | "[torch.DoubleTensor of size 5]\n", 86 | "\n", 87 | "Dropout Mask:\t\n", 88 | " 0\n", 89 | " 0\n", 90 | " 0\n", 91 | " 1\n", 92 | " 1\n", 93 | "[torch.DoubleTensor of size 5]\n", 94 | "\n", 95 | "Hidden Layer 2 output after dropout\t\n", 96 | "-0.0000\n", 97 | "-0.0000\n", 98 | "-0.0000\n", 99 | "-0.0005\n", 100 | "-0.1822\n", 101 | "[torch.DoubleTensor of size 5]\n", 102 | "\n", 103 | "-----------------------------------\t\n" 104 | ] 105 | }, 106 | "execution_count": 
20, 107 | "metadata": {}, 108 | "output_type": "execute_result" 109 | } 110 | ], 111 | "source": [ 112 | "require 'nn';\n", 113 | "p = 0.5\n", 114 | "x = torch.rand(5)\n", 115 | "L1 = nn.Linear(5, 5)\n", 116 | "L2 = nn.Linear(5, 5)\n", 117 | "L3 = nn.Linear(5, 5)\n", 118 | "L4 = nn.Linear(5, 1)\n", 119 | "\n", 120 | "\n", 121 | "H1 = L1:forward(x)\n", 122 | "print('Hidden Layer 1 output before dropout:')\n", 123 | "print(H1)\n", 124 | "U1 = torch.rand(H1:size(1)):gt(p):double()\n", 125 | "print('Dropout Mask:')\n", 126 | "print(U1)\n", 127 | "H1 = H1:cmul(U1)\n", 128 | "print('Hidden Layer 1 output after dropout')\n", 129 | "print(H1)\n", 130 | "print('-----------------------------------')\n", 131 | "\n", 132 | "H2 = L2:forward(H1)\n", 133 | "print('Hidden Layer 2 output before dropout')\n", 134 | "print(H2)\n", 135 | "U2 = torch.rand(H2:size(1)):gt(p):double()\n", 136 | "print('Dropout Mask:')\n", 137 | "print(U2)\n", 138 | "H2 = H2:cmul(U2)\n", 139 | "print('Hidden Layer 2 output after dropout')\n", 140 | "print(H2)\n", 141 | "print('-----------------------------------')\n", 142 | "\n", 143 | "H3 = L3:forward(H2)\n", 144 | "print('Hidden Layer 3 output before dropout')\n", 145 | "print(H3)\n", 146 | "U3 = torch.rand(H3:size(1)):gt(p):double()\n", 147 | "print('Dropout Mask:')\n", 148 | "print(U3)\n", 149 | "H3 = H3:cmul(U3)\n", 150 | "print('Hidden Layer 2 output after dropout')\n", 151 | "print(H3)\n", 152 | "print('-----------------------------------')\n", 153 | "\n", 154 | "out = L4:forward(H3)" 155 | ] 156 | }, 157 | { 158 | "cell_type": "markdown", 159 | "metadata": { 160 | "deletable": true, 161 | "editable": true 162 | }, 163 | "source": [ 164 | "However, we must compensate for the dropout, so that the expected magnitude of the activations is the same in both the training and the test phase. 
This can be done by scaling the activations down by the keep probability ($1-p$, which equals $p$ here since $p = 0.5$) during the forward pass at test time." 
165 | ] 166 | }, 167 | { 168 | "cell_type": "code", 169 | "execution_count": 21, 170 | "metadata": { 171 | "collapsed": true, 172 | "deletable": true, 173 | "editable": true 174 | }, 175 | "outputs": [], 176 | "source": [ 177 | "require 'nn';\n", 178 | "p = 0.5\n", 179 | "x = torch.rand(5)\n", 180 | "L1 = nn.Linear(5, 5)\n", 181 | "L2 = nn.Linear(5, 5)\n", 182 | "L3 = nn.Linear(5, 5)\n", 183 | "L4 = nn.Linear(5, 1)\n", 184 | "\n", 185 | "function forward_train(x)\n", 186 | " H1 = L1:forward(x)\n", 187 | " U1 = torch.rand(H1:size(1)):gt(p):double()\n", 188 | " H1 = H1:cmul(U1)\n", 189 | "\n", 190 | " H2 = L2:forward(H1)\n", 191 | " U2 = torch.rand(H2:size(1)):gt(p):double()\n", 192 | " H2 = H2:cmul(U2)\n", 193 | "\n", 194 | " H3 = L3:forward(H2)\n", 195 | " U3 = torch.rand(H3:size(1)):gt(p):double()\n", 196 | " H3 = H3:cmul(U3)\n", 197 | "\n", 198 | " out = L4:forward(H3)\n", 199 | " return out\n", 200 | "end\n", 201 | "\n", 202 | "function forward_test(x)\n", 203 | " H1 = L1:forward(x) * (1 - p)\n", 204 | " H2 = L2:forward(H1) * (1 - p)\n", 205 | " H3 = L3:forward(H2) * (1 - p)\n", 206 | " out = L4:forward(H3)\n", 207 | " return out\n", 208 | "end" 209 | ] 210 | }, 211 | { 212 | "cell_type": "markdown", 213 | "metadata": { 214 | "deletable": true, 215 | "editable": true 216 | }, 217 | "source": [ 218 | "Alternatively, we can scale the activations up at training time, and thus the test time code remains untouched." 
219 | ] 220 | }, 221 | { 222 | "cell_type": "code", 223 | "execution_count": null, 224 | "metadata": { 225 | "collapsed": true, 226 | "deletable": true, 227 | "editable": true 228 | }, 229 | "outputs": [], 230 | "source": [ 231 | "function forward_train(x)\n", 232 | " H1 = L1:forward(x)\n", 233 | " U1 = torch.rand(H1:size(1)):gt(p):double()\n", 234 | " H1 = H1:cmul(U1) / (1 - p)\n", 235 | "\n", 236 | " H2 = L2:forward(H1)\n", 237 | " U2 = torch.rand(H2:size(1)):gt(p):double()\n", 238 | " H2 = H2:cmul(U2) / (1 - p)\n", 239 | "\n", 240 | " H3 = L3:forward(H2)\n", 241 | " U3 = torch.rand(H3:size(1)):gt(p):double()\n", 242 | " H3 = H3:cmul(U3) / (1 - p)\n", 243 | "\n", 244 | " out = L4:forward(H3)\n", 245 | " return out\n", 246 | "end\n", 247 | "\n", 248 | "function forward_test(x)\n", 249 | " H1 = L1:forward(x) \n", 250 | " H2 = L2:forward(H1) \n", 251 | " H3 = L3:forward(H2) \n", 252 | " out = L4:forward(H3)\n", 253 | " return out\n", 254 | "end" 255 | ] 256 | } 257 | ], 258 | "metadata": { 259 | "kernelspec": { 260 | "display_name": "iTorch", 261 | "language": "lua", 262 | "name": "itorch" 263 | }, 264 | "language_info": { 265 | "name": "lua", 266 | "version": "5.1" 267 | } 268 | }, 269 | "nbformat": 4, 270 | "nbformat_minor": 2 271 | } 272 | -------------------------------------------------------------------------------- /notebooks/Linear.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "# Fully-Connected (Linear) Layer\n", 8 | "In this notebook, we will look into the forward and backward passes of the ```nn.Linear``` layer. We will also manually derive the expressions for the gradient of the loss with respect to the input $\\frac{\\partial L}{\\partial I}$ of this layer and also the gradient of the loss with respect to the weights $\\frac{\\partial L}{\\partial W}$." 
9 | ] 10 | }, 11 | { 12 | "cell_type": "code", 13 | "execution_count": 1, 14 | "metadata": {}, 15 | "outputs": [], 16 | "source": [ 17 | "require 'nn';\n", 18 | "n = torch.rand(5)\n", 19 | "lin = nn.Linear(5,4)\n", 20 | "m = lin:forward(n)" 21 | ] 22 | }, 23 | { 24 | "cell_type": "markdown", 25 | "metadata": {}, 26 | "source": [ 27 | "#### Input" 28 | ] 29 | }, 30 | { 31 | "cell_type": "code", 32 | "execution_count": 2, 33 | "metadata": {}, 34 | "outputs": [ 35 | { 36 | "data": { 37 | "text/plain": [ 38 | " 0.2647\n", 39 | " 0.7477\n", 40 | " 0.5382\n", 41 | " 0.1914\n", 42 | " 0.0834\n", 43 | "[torch.DoubleTensor of size 5]\n", 44 | "\n" 45 | ] 46 | }, 47 | "execution_count": 2, 48 | "metadata": {}, 49 | "output_type": "execute_result" 50 | } 51 | ], 52 | "source": [ 53 | "n" 54 | ] 55 | }, 56 | { 57 | "cell_type": "markdown", 58 | "metadata": {}, 59 | "source": [ 60 | "#### Output" 61 | ] 62 | }, 63 | { 64 | "cell_type": "code", 65 | "execution_count": 3, 66 | "metadata": {}, 67 | "outputs": [ 68 | { 69 | "data": { 70 | "text/plain": [ 71 | " 0.1242\n", 72 | " 0.2382\n", 73 | "-0.6489\n", 74 | "-0.2896\n", 75 | "[torch.DoubleTensor of size 4]\n", 76 | "\n" 77 | ] 78 | }, 79 | "execution_count": 3, 80 | "metadata": {}, 81 | "output_type": "execute_result" 82 | } 83 | ], 84 | "source": [ 85 | "m" 86 | ] 87 | }, 88 | { 89 | "cell_type": "markdown", 90 | "metadata": {}, 91 | "source": [ 92 | "#### Output (manually computed)" 93 | ] 94 | }, 95 | { 96 | "cell_type": "code", 97 | "execution_count": 4, 98 | "metadata": {}, 99 | "outputs": [ 100 | { 101 | "data": { 102 | "text/plain": [ 103 | " 0.1242\n", 104 | " 0.2382\n", 105 | "-0.6489\n", 106 | "-0.2896\n", 107 | "[torch.DoubleTensor of size 4]\n", 108 | "\n" 109 | ] 110 | }, 111 | "execution_count": 4, 112 | "metadata": {}, 113 | "output_type": "execute_result" 114 | } 115 | ], 116 | "source": [ 117 | "m_ = lin.weight*n + lin.bias\n", 118 | "print(m_)" 119 | ] 120 | }, 121 | { 122 | "cell_type": "markdown", 123 | 
"metadata": {}, 124 | "source": [ 125 | "#### Set the gradient of the loss with respect to the input of the next layer $\\frac{\\partial L}{\\partial I^{l+1}}$ to random values, and backpropagate the gradient from the next layer through this linear layer." 126 | ] 127 | }, 128 | { 129 | "cell_type": "code", 130 | "execution_count": 5, 131 | "metadata": { 132 | "collapsed": true 133 | }, 134 | "outputs": [], 135 | "source": [ 136 | "nextgrad = torch.rand(4)\n", 137 | "lin:backward(n, nextgrad)" 138 | ] 139 | }, 140 | { 141 | "cell_type": "markdown", 142 | "metadata": {}, 143 | "source": [ 144 | "#### Gradient of the loss with respect to the input of this layer $\\frac{\\partial L}{\\partial I}$" 145 | ] 146 | }, 147 | { 148 | "cell_type": "code", 149 | "execution_count": 6, 150 | "metadata": {}, 151 | "outputs": [ 152 | { 153 | "data": { 154 | "text/plain": [ 155 | "-0.1486\n", 156 | "-0.3457\n", 157 | "-0.0236\n", 158 | " 0.2292\n", 159 | " 0.3097\n", 160 | "[torch.DoubleTensor of size 5]\n", 161 | "\n" 162 | ] 163 | }, 164 | "execution_count": 6, 165 | "metadata": {}, 166 | "output_type": "execute_result" 167 | } 168 | ], 169 | "source": [ 170 | "lin.gradInput" 171 | ] 172 | }, 173 | { 174 | "cell_type": "markdown", 175 | "metadata": {}, 176 | "source": [ 177 | "#### Relation for calculating the gradient of the loss with respect to the input: $\\frac{\\partial L}{\\partial I^{l}} = \\frac{\\partial L}{\\partial I^{l+1}} \\times \\frac{\\partial O^{l}}{\\partial I^{l}}$. 
Note that the Jacobian $\\frac{\\partial O^{l}}{\\partial I^{l}} = W^{l}$" 178 | ] 179 | }, 180 | { 181 | "cell_type": "code", 182 | "execution_count": 7, 183 | "metadata": {}, 184 | "outputs": [ 185 | { 186 | "data": { 187 | "text/plain": [ 188 | "-0.1486 -0.3457 -0.0236 0.2292 0.3097\n", 189 | "[torch.DoubleTensor of size 1x5]\n", 190 | "\n" 191 | ] 192 | }, 193 | "execution_count": 7, 194 | "metadata": {}, 195 | "output_type": "execute_result" 196 | } 197 | ], 198 | "source": [ 199 | "nextgrad:reshape(1,4)*lin.weight" 200 | ] 201 | }, 202 | { 203 | "cell_type": "markdown", 204 | "metadata": {}, 205 | "source": [ 206 | "#### This layer's gradient of the loss with respect to the weights: $\\frac{\\partial L}{\\partial W^{l}}$" 207 | ] 208 | }, 209 | { 210 | "cell_type": "code", 211 | "execution_count": 8, 212 | "metadata": {}, 213 | "outputs": [ 214 | { 215 | "data": { 216 | "text/plain": [ 217 | " 0.2135 0.6033 0.4343 0.1544 0.0673\n", 218 | " 0.0556 0.1572 0.1132 0.0402 0.0175\n", 219 | " 0.0850 0.2402 0.1729 0.0615 0.0268\n", 220 | " 0.1717 0.4850 0.3491 0.1242 0.0541\n", 221 | "[torch.DoubleTensor of size 4x5]\n", 222 | "\n" 223 | ] 224 | }, 225 | "execution_count": 8, 226 | "metadata": {}, 227 | "output_type": "execute_result" 228 | } 229 | ], 230 | "source": [ 231 | "lin.gradWeight" 232 | ] 233 | }, 234 | { 235 | "cell_type": "markdown", 236 | "metadata": {}, 237 | "source": [ 238 | "#### Relation for calculating the gradient of the loss with respect to the weights of this layer: $\\frac{\\partial L}{\\partial W^{l}} = \\frac{\\partial L}{\\partial O} \\frac{\\partial O}{\\partial W^{l}}$.
    \n", 239 | "Let us first calculate $\\frac{\\partial O}{\\partial W^{l}}$, which is a Jacobian of size $4\\times20$. " 240 | ] 241 | }, 242 | { 243 | "cell_type": "code", 244 | "execution_count": 9, 245 | "metadata": {}, 246 | "outputs": [], 247 | "source": [ 248 | "dodw = torch.zeros(4,20)\n", 249 | "st = 1\n", 250 | "for i = 1, 4 do\n", 251 | " for j = 1, 5 do\n", 252 | " dodw[i][st]=n[j]\n", 253 | " st = st + 1\n", 254 | " end\n", 255 | "end" 256 | ] 257 | }, 258 | { 259 | "cell_type": "markdown", 260 | "metadata": {}, 261 | "source": [ 262 | "#### Finally, we can now calculate $\\frac{\\partial L}{\\partial W^{l}} = \\frac{\\partial L}{\\partial O} \\frac{\\partial O}{\\partial W^{l}}$" 263 | ] 264 | }, 265 | { 266 | "cell_type": "code", 267 | "execution_count": 10, 268 | "metadata": {}, 269 | "outputs": [ 270 | { 271 | "data": { 272 | "text/plain": [ 273 | " 0.2135 0.6033 0.4343 0.1544 0.0673\n", 274 | " 0.0556 0.1572 0.1132 0.0402 0.0175\n", 275 | " 0.0850 0.2402 0.1729 0.0615 0.0268\n", 276 | " 0.1717 0.4850 0.3491 0.1242 0.0541\n", 277 | "[torch.DoubleTensor of size 4x5]\n", 278 | "\n" 279 | ] 280 | }, 281 | "execution_count": 10, 282 | "metadata": {}, 283 | "output_type": "execute_result" 284 | } 285 | ], 286 | "source": [ 287 | "(nextgrad:reshape(1,4) * dodw):reshape(4,5)" 288 | ] 289 | }, 290 | { 291 | "cell_type": "markdown", 292 | "metadata": {}, 293 | "source": [ 294 | "#### This layer's gradient of the loss with respect to the bias: $\\frac{\\partial L}{\\partial b^{l}}$" 295 | ] 296 | }, 297 | { 298 | "cell_type": "code", 299 | "execution_count": 11, 300 | "metadata": {}, 301 | "outputs": [ 302 | { 303 | "data": { 304 | "text/plain": [ 305 | " 0.8068\n", 306 | " 0.2102\n", 307 | " 0.3212\n", 308 | " 0.6487\n", 309 | "[torch.DoubleTensor of size 4]\n", 310 | "\n" 311 | ] 312 | }, 313 | "execution_count": 11, 314 | "metadata": {}, 315 | "output_type": "execute_result" 316 | } 317 | ], 318 | "source": [ 319 | "lin.gradBias" 320 | ] 321 | }, 322 | { 323 
| "cell_type": "markdown", 324 | "metadata": {}, 325 | "source": [ 326 | "#### Relation for calculating this layer's gradient of the loss with respect to the bias: $\\frac{\\partial L}{\\partial b^{l}} = \\frac{\\partial L}{\\partial I^{l+1}}$" 327 | ] 328 | }, 329 | { 330 | "cell_type": "code", 331 | "execution_count": 12, 332 | "metadata": {}, 333 | "outputs": [ 334 | { 335 | "data": { 336 | "text/plain": [ 337 | " 0.8068\n", 338 | " 0.2102\n", 339 | " 0.3212\n", 340 | " 0.6487\n", 341 | "[torch.DoubleTensor of size 4x1]\n", 342 | "\n" 343 | ] 344 | }, 345 | "execution_count": 12, 346 | "metadata": {}, 347 | "output_type": "execute_result" 348 | } 349 | ], 350 | "source": [ 351 | "nextgrad:reshape(1,4):t()" 352 | ] 353 | } 354 | ], 355 | "metadata": { 356 | "kernelspec": { 357 | "display_name": "iTorch", 358 | "language": "lua", 359 | "name": "itorch" 360 | }, 361 | "language_info": { 362 | "name": "lua", 363 | "version": "5.1" 364 | } 365 | }, 366 | "nbformat": 4, 367 | "nbformat_minor": 2 368 | } 369 | -------------------------------------------------------------------------------- /notebooks/Max-Pool.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "code", 5 | "execution_count": 1, 6 | "metadata": { 7 | "collapsed": false 8 | }, 9 | "outputs": [], 10 | "source": [ 11 | "require 'nn';\n", 12 | "n = torch.Tensor({{{0.2692, 0.4190, 0.2095, 0.9163},\n", 13 | " {0.2778, 0.9199, 0.5555, 0.1638},\n", 14 | " {0.6936, 0.2328, 0.0553, 0.1798},\n", 15 | " {0.3611, 0.3225, 0.9032, 0.5106}}})" 16 | ] 17 | }, 18 | { 19 | "cell_type": "code", 20 | "execution_count": 2, 21 | "metadata": { 22 | "collapsed": false 23 | }, 24 | "outputs": [ 25 | { 26 | "data": { 27 | "text/plain": [ 28 | "(1,.,.) 
= \n", 29 | " 0.2692 0.4190 0.2095 0.9163\n", 30 | " 0.2778 0.9199 0.5555 0.1638\n", 31 | " 0.6936 0.2328 0.0553 0.1798\n", 32 | " 0.3611 0.3225 0.9032 0.5106\n", 33 | "[torch.DoubleTensor of size 1x4x4]\n", 34 | "\n" 35 | ] 36 | }, 37 | "execution_count": 2, 38 | "metadata": {}, 39 | "output_type": "execute_result" 40 | } 41 | ], 42 | "source": [ 43 | "print(n)" 44 | ] 45 | }, 46 | { 47 | "cell_type": "code", 48 | "execution_count": 3, 49 | "metadata": { 50 | "collapsed": false 51 | }, 52 | "outputs": [ 53 | { 54 | "data": { 55 | "text/plain": [ 56 | "(1,.,.) = \n", 57 | " 0.9199 0.9163\n", 58 | " 0.6936 0.9032\n", 59 | "[torch.DoubleTensor of size 1x2x2]\n", 60 | "\n" 61 | ] 62 | }, 63 | "execution_count": 3, 64 | "metadata": {}, 65 | "output_type": "execute_result" 66 | } 67 | ], 68 | "source": [ 69 | "pool = nn.SpatialMaxPooling(2, 2)\n", 70 | "m = pool:forward(n)\n", 71 | "print(m)" 72 | ] 73 | }, 74 | { 75 | "cell_type": "code", 76 | "execution_count": 4, 77 | "metadata": { 78 | "collapsed": false 79 | }, 80 | "outputs": [ 81 | { 82 | "data": { 83 | "text/plain": [ 84 | "(1,.,.) 
= \n", 85 | " 0 0 0 1\n", 86 | " 0 1 0 0\n", 87 | " 1 0 0 0\n", 88 | " 0 0 1 0\n", 89 | "[torch.DoubleTensor of size 1x4x4]\n", 90 | "\n" 91 | ] 92 | }, 93 | "execution_count": 4, 94 | "metadata": {}, 95 | "output_type": "execute_result" 96 | } 97 | ], 98 | "source": [ 99 | "nextgrad = torch.ones(1,2,2)\n", 100 | "pool:backward(n, nextgrad)\n", 101 | "print(pool.gradInput)" 102 | ] 103 | } 104 | ], 105 | "metadata": { 106 | "kernelspec": { 107 | "display_name": "iTorch", 108 | "language": "lua", 109 | "name": "itorch" 110 | }, 111 | "language_info": { 112 | "name": "lua", 113 | "version": "5.1" 114 | } 115 | }, 116 | "nbformat": 4, 117 | "nbformat_minor": 2 118 | } 119 | -------------------------------------------------------------------------------- /notebooks/NN.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "## Nearest Neighbor Classifier for CIFAR-10" 8 | ] 9 | }, 10 | { 11 | "cell_type": "code", 12 | "execution_count": 10, 13 | "metadata": {}, 14 | "outputs": [], 15 | "source": [ 16 | "-- load test images\n", 17 | "te_x = torch.load('cifar10/te_data.bin')\n", 18 | "-- load test labels \n", 19 | "te_y = torch.load('cifar10/te_labels.bin')\n", 20 | "\n", 21 | "-- assume the training set is the same \n", 22 | "-- as the test set\n", 23 | "tr_x = te_x\n", 24 | "tr_y = te_y" 25 | ] 26 | }, 27 | { 28 | "cell_type": "code", 29 | "execution_count": 11, 30 | "metadata": {}, 31 | "outputs": [ 32 | { 33 | "data": { 34 | "text/plain": [ 35 | " 10000\n", 36 | " 3\n", 37 | " 32\n", 38 | " 32\n", 39 | "[torch.LongStorage of size 4]\n", 40 | "\n", 41 | " 10000\n", 42 | "[torch.LongStorage of size 1]\n", 43 | "\n" 44 | ] 45 | }, 46 | "execution_count": 11, 47 | "metadata": {}, 48 | "output_type": "execute_result" 49 | } 50 | ], 51 | "source": [ 52 | "print(te_x:size())\n", 53 | "print(te_y:size())" 54 | ] 55 | }, 56 | { 57 | "cell_type": 
"code", 58 | "execution_count": 12, 59 | "metadata": {}, 60 | "outputs": [ 61 | { 62 | "data": { 63 | "image/png": "", 64 | "text/plain": [ 65 | "Console does not support images" 66 | ] 67 | }, 68 | "metadata": { 69 | "image/png": { 70 | "height": 204, 71 | "width": 204 72 | } 73 | }, 74 | "output_type": "display_data" 75 | }, 76 | { 77 | "data": { 78 | "text/plain": [ 79 | " 3\n", 80 | " 8\n", 81 | " 8\n", 82 | " 0\n", 83 | " 6\n", 84 | " 6\n", 85 | " 1\n", 86 | " 6\n", 87 | " 3\n", 88 | " 1\n", 89 | " 0\n", 90 | " 9\n", 91 | " 5\n", 92 | " 7\n", 93 | " 9\n", 94 | " 8\n", 95 | " 5\n", 96 | " 7\n", 97 | " 8\n", 98 | " 6\n", 99 | " 7\n", 100 | " 0\n", 101 | " 4\n", 102 | " 9\n", 103 | " 5\n", 104 | " 2\n", 105 | " 4\n", 106 | " 0\n", 107 | " 9\n", 108 | " 6\n", 109 | " 6\n", 110 | " 5\n", 111 | " 4\n", 112 | " 5\n", 113 | " 9\n", 114 | " 2\n", 115 | "[torch.ByteTensor of size 36]\n", 116 | "\n" 117 | ] 118 | }, 119 | "execution_count": 12, 120 | "metadata": {}, 121 | "output_type": "execute_result" 122 | } 123 | ], 124 | "source": [ 125 | "-- display the first 36 training set images\n", 126 | "require 'image';\n", 127 | "itorch.image(te_x[{{1,36},{},{},{}}])\n", 128 | "print(te_y[{{1,36}}])" 129 | ] 130 | }, 131 | { 132 | "cell_type": "code", 133 | "execution_count": 13, 134 | "metadata": {}, 135 | "outputs": [], 136 | "source": [ 137 | "local NN = torch.class(\"NN\")\n", 138 | "\n", 139 | "function NN:__init()\n", 140 | "end\n", 141 | "\n", 142 | "function NN:train(X, y)\n", 143 | " -- X is 2D of size N x D = 32x32x3, so each row is an example\n", 144 | " -- Y is 1D of size N \n", 145 | " self.tr_x = X\n", 146 | " self.tr_y = y\n", 147 | "end\n", 148 | "\n", 149 | "function NN:predict(x)\n", 150 | " -- x is of size D = 32x32x3 for which we want to predict the label\n", 151 | " -- returns the predicted label for the input x\n", 152 | " local min_idx = nil\n", 153 | " local min_dist = 1e10\n", 154 | " for i=1, self.tr_x:size(1) do\n", 155 | " local dist = 
(self.tr_x[i] - x):float():abs():sum()\n", 156 | " if (dist < min_dist) then\n", 157 | " min_dist = dist\n", 158 | " min_idx = i\n", 159 | " end\n", 160 | " end\n", 161 | " return self.tr_y[min_idx]\n", 162 | "end" 163 | ] 164 | }, 165 | { 166 | "cell_type": "code", 167 | "execution_count": 14, 168 | "metadata": {}, 169 | "outputs": [ 170 | { 171 | "data": { 172 | "image/png": "iVBORw0KGgoAAAANSUhEUgAAACAAAAAgCAIAAAD8GO2jAAAIyElEQVRIiU2W2Y8lV5HGI+JsmXkz7721upaualcvtts2eMEDbVv2aDyIF9CMNPPAC4/8ObzwB/CAAEsgkAakkTAjjWRbePBCt+wWNky3201vtdfdb2aeJYKHKhrHW0hH54vzi++TDu4+ONRGIyIAAgAAGGtB0DdNjAERSCEAICKRorMjZ4V41ovIaYuEj0qRQkXaB5+YEVEAFKCyZnjwEJjzsmesQWQAAIGYIqGQNjGKSEIAQCIAAGCRUxkCkCSPZABAgWgijUgAgAoJcffWjR//8Afb2+e/873vL2xsgxAhsILB7u5f/vDuysa5K1dfV9YKCMLZpCLCzI/eRESIpDWdKmljNJEGECLUFn//m1++9bv/WVtefO7V15bObWptgSUBFEXx5+sfvv3f//XdPHvhtTcAT2khyFk9IvwlGSAEUkoZTdZpY5VCvPtgNyq6c//+X/96N/pIyqBRyqjdO7cO9x7c+Pj6H997t51PEQBBQPgf9BG/vBJmFmEW0S4z1moiRYgphbK3lCkzTnA8GDBzZo2ft8f3br73qzevffDRxIfJeNQ2ddbpAiCAAIAIAog8okSPZAARtDFKa3W6LST9xn/85zvvvD0YDG7837sfrK2Gpr752aftwT0ZPTy/4JrO+s6lJ11WRp8IgFFAgOWR0um+5fQxhIoU6RQ9sIAipTWm+Fi3s97vDBa6o/t333rzZ4TYhDnHuFK4Z7e3t59/6aWvfwNNFkIgPHXAqX3Orj4dXM7QMUbU8zpYq3A6Pfrs2qfvvzd88LmTdOn85cVML5u4tVj2O31tLOk8CaXB5O4f3s6ramn7IiBxYgAQZpFHZj0ldBodQoU4Gs14Ovzk1z/97S/ePDw5LjevnPj2iRx6FDMFy/0uacMxAKFnVftERq/sXNq6+i/nvvq1YmGl9QHOXCScBACIEAmIyBhDRLh3888f/vxHb/3m19fv7b1yeXsv3+zz5FKHOK8YFCSvOVVOZ8YAolLKKSWpHbXJbj115Vv/vvnUsyExsHzJP4lIaaWQUBKr1ze6v3jzJ29/vnt5c/Pp1d7clLce7oNWH+9N372196f90cAHpYyzxintrMuLIu+UEuO93d1G2a0nnm18jCGklJg5xNj61ofQ+NC03sek//e3b93Yn3ar4p/Orzqn6qOj+/NY1ewsWWcYpE5wPG0qBZzbadNUMfbLMs87T29d3Lp6tex1ypTOyCBIOg0HMAsSIil94/admvnblx7PNDYhaT9dzfUXJ/Vza/2rW0UTRQB6mXaOFIoATOpGBMrMVkpVWhtthSIgaEABBhIQRSKCgERCqIyky2uLL55bncQ2ROVTXC/UkU97NfQtbfQ7vSIvLBZOd1zWKUpnLXAEFmo9t3WxtOi6i5ykickHDjGGyD5FnzgkDlHUepb981OblbUH03lkSQAVpb5Jh006mEfg0HPKKZQUUwpWQ5lZY7Nx4LvD+ubd3c/+cnPoYWllKc8dMguc5hlPY65Jqdd3VnYWOgn1yWyeRLwwcduluGh5HuSgjqNmzuwNRGCeRL4zDTf2Jjd2j28dHtwfDG7dfXj9+oe
zerq2fZ4Fgve+beumjT4gcAqt3lnOax9TmtetVwZGMWgIJleLOV3R8tkw7s/JGFeV+UmkWw9ORpPZcpVdWOqa3sIcbG6wkDbc+vijd1ZXn/gKtBElhBSV0p2imM3GuqN1IsWSQkpEYdLEHKMiIKKO0StWGjK7s7A3H/owJz9fqfKtpY7tdoOqVqoihxaYq7zonNxf553uxjaJF06Ikji+//+faJ8igFYkWulupk4C7tUxSWiZQfFg3PgwDmZhkrjn8Nz6WuH0DLMkmQK8vX/kfVroVTI4uH148sX+0Stff2a565vaa62ssS8utTqmoI1LKVoFxriMprdn8VAaS4KolLNlbjKYJi/1jI6sWnbLnFSo/bwZ//7atU6eba6tOI1KqWu37wzr+88/vzYazYvKZi4rnVXfvLhOwkGAGZDowclo2MSJ55MgjQ+vvvzcy69dnbV8eDKqfWzm8xATizAIC8QkrsgA0PuQOVdWnfXt9XGqxWUHw/H+cHJ3OFP/evGxJGkeOTE2KUxCACaFwgBEalSnO3vDTpk/88zOzoWthV5pJCClpLSPQUBERBMiJEPy5OPLr7y045vxQlGITzG1k7rRMXgvnEQSKmbIta0dTwUxhX43+8YLF156+vyF7eVZGx/uHfZffGy137157+B3739+4+Zx0zYck9PYcXqx23nh8mNPrBTb/ceJ0K8UrHXdeq21qr0IYB1kKrTredQwxGQBLq9X//baU+urvcls2i1Qb3YK1+kV3a9ctPOm3d0//vxeo1F63c7FrdU3Xrny6tcuRB+ccaApcMMiSwXph63MI44azww+sA6xByFRZEk7G6u5yxGd0anxTWwTaEBOhdGvfvXJC1tbu4Mm+PnqUnl+bbVbdSC2rK1WVhFgDJkt0+RIGwTi5JI3ClsDnJgVEOrCqn7PjefDXuVC03BsR9PJyXR2eeN8VXVLZZ6uVp67ZMFA206jF2cqBuOMKFMFfxIhK4repBlphMAcBDEBhJQEwBARwcZ6p4H60y9upmac68xmThm7f3RkgTeXV6wtrCsjh+gnCTkAu9FuZcu6HlQLG4oUAos0ZXdZj5vQBIkMiAJImkhRKq1Z3exnVWFdxmTybl8Zt9xhMtaIMKBvZyF45WyQxCJ14L3x3lJZVLYYzw67veXIdTuVyIGEGUA0kQiygNFkje738qJb6aJwnb4uFzEryZhMu9XeYqdTirHlwqqtuqJV0V8C7eoYvM2+mMyjLusItW9Mt3c8PJ7XM80MgSGK0N//M07T0uZy4+xgPMuLqk0QBTObN7GuQwwgCRKlBgDnwTuJxjkn3fl42u9VI8RS55N6FqBpLfjAukmiSAEIATCgISwLtba13On3h7MRS4ixHcwmw7YRCSxeKUgMYd4SYRRg7UBrbSXwZDocaqtBSDh4PysXquFg9jfmlD3cDl2puQAAAABJRU5ErkJggg==", 173 | "text/plain": [ 174 | "Console does not support images" 175 | ] 176 | }, 177 | "metadata": { 178 | "image/png": { 179 | "height": 32, 180 | "width": 32 181 | } 182 | }, 183 | "output_type": "display_data" 184 | }, 185 | { 186 | "data": { 187 | "text/plain": [ 188 | "7\t\n" 189 | ] 190 | }, 191 | "execution_count": 14, 192 | "metadata": {}, 193 | "output_type": "execute_result" 194 | }, 195 | { 196 | "data": { 197 | "image/png": 
"iVBORw0KGgoAAAANSUhEUgAAACAAAAAgCAIAAAD8GO2jAAAImElEQVRIiWVW2W+cVxU/5977bbN6Fs/YiVPHzUqTyk2B2KYppa3pohRaoRYXCFAFkCAVLQLxBKIPFfwbCFAfeECqxFvUItq0JTSEJlWXhCReEq8T22N7Zr7vu+vhYaZOop6n891P9yy/+zsLRrkaIjjnEJGIgAgQEQAJALoqADgAcM4RgAUCIHSmr1io1wZmZ+eCMGi3WkQOGBDRZ7d4V2cA1LXeFcA7BAAACKB3E4iACMkBuclHHjZG7h4ZvntkxBgHyAGQCAEQAImo52DbECIyxrp50Pb/z6R3gMAZILlnnv7m8Scf1yodHz+qtSFCBE7EEBkAI8Lt7Nm2RkTOuTvM3uaIIXOOrLFayWPHvvLrX/3yZmPlxRd/VijmGo0VzgQQA0Ig1lOghzC7ZQ3oMzx6H47IESFjUslO3OYcg8A/Mjr6ym9/k8+GgwP155791uzM9ObWFjJ+e7oAuK2JLiaMMUeu55PQEVqGyJxzCsDu2Tv85S/df+/hQ8Vc7r6Dh+45uO+fZ95cWb7xr3feaiwtSWMJBZLD211QL3RxmzskIAaIAAwcAJEzpWJuaurZp48/NbRj0Mg4F2U8YlcufeyMymbC1cay7wkiIiDEHgR3CoouVLcSQ+ecLeYzxx4Yy+Wy4+NjU1NTaUfOzVzjCHk/lEYR0uF77yGHjZXG1Zn5s//9REqDgJ8334NoO35gSFaVy4Vf/PynD04crVT7a9WqjJP2ZpsBlouFQi6vrC5Ws0h6YX4p9L29I7srpeLiUoMxTtAjz+0cvPXI3QcPAv/kj374vRPPR5kgCgNrKQqzIvCRM0POkBOBCKLAGJXG7TRuV8rFSqkPnO1ybps822YZAAAhACKiVvKp40++9NKpKAor1f58oegFETDOPWGck9okMgUGSqatra04TgrF0r79BycmJgLfJ+c+Vzw9BwzIA/KcMbt37fjJj39QrWSBTF9f1fcj52ySthmQJ0IiLvyAMabiVCUmm6s0NtW16yvj4+OHDn3BkWNsu/iBMWAMAB3rtgcAArCTjz78xfuPyDSVaYrIkiTR2lhrlEyBAIGHYeSsk6mU0swvrvzxz6/96S+vSanGxsY8z7sjeuyxs9sqnCPVX6t8e+q5bC7X2morpRHAGIMIRGScjZOE+4IJYYkcYKvdOf3GG2+++Y+FhYU0TQ8cOFCv140x21Vwi0VAROS4oBMnvjsxMWa0klJnMjljLCImSZKmCROif3DAWhvL1BNCeCEw/tDXHsmVB5RWg4ODnPPh4eHZuXlEdM71QAEAAB6ERW3U5ORXX331d32FvJKSM8aZSBOptU6SxPNE/0DdOtpqbeXyGT/wo8BHZH3l+t4DB2u12s6dOwuFwtVr0x99/Ml21W73IGadGRysvfzySwP9A1orQODcs5aUUnGcSKX6yuUokzXGdocFACAX+UIxm8uEQTBQr5dLpVp/bXzsaLncR2C74RP1cGKM0TPPfGN8fFybFMgZY7qMlEorYyq1eqZQTFMtlUTOnAOwJFOltFEyRmeyUSYThL4QE+NHH3vsYSCDSAgABN0ezvbtGfn+ie+Evs+QHFljrLVOa9NutaNMptpfjZNkY3MzTZMkiVMpnXVKSi6EJzzf95RS3YDL5fLJky8cPnTIGtMdiV3K8pMnX3h+6lnnNENnlAJCleq1m+tJmtw1POyI/v3++42VhlRKax2Foe8Jz/M8z/d8P0n10tJyNpcLwpBx3Dk0VCqV337rjDEGsTfK2OOPP8o5Q7RARsZtFSftzfb66nq9XvN9/5NPPz19+vTc3JwxRmttnfOCMMzmiHHtwAuDgZ07uC8IgQB835ucnHziiSeU1r0JDyDuGz0MNmVgndOddgspXL/ZLOQKtVpto9l858yZ5eUlwT3f9yuVShLHRGgdAvIklcLz64N9AMA57y4G/dXqqVOn3n33vRvzi57nERH/w+9
fMVZyJJ0my0tLrbZcbqzt279feN57Z8++d/ZsEEQbzY1mc7NWqwGyKAqCINjY2Lhy5Uomk7XWATLrLOcCkVmC/mo1k8me/+B8nMSITABzBMo5rjrt2ZnpLcnv2r03XynOzsydO39BSiMEAvGtrc5HH12+e2S30TIMvMXFpTRN9u4/wIUnpfJ9n3PfWiSE1ZtruUz2+JNff+2vf3MOBQA5Y4BcpxMvLi5pnn9oZKTdbn344cX19dUoCmUqPS8Mw7C5tn5ZSXIjmUyQJEmpVCIHcZwwxvL5AhBqbQj4pcv/+/vrr48fPVItlpZX1wUAWaOdY6123NxoVQb7kWBuZvrSpx9nAj+MwvX1ZppoBgH3uEySubkbe/aM7NgxhIjnzv3HEvieVx8YGN417Ps+Y4GUstlsdtaazBEDEOAACVQqO7HUFoIgUmm6vLio0qSQywJANgq01GQ1F4IQNjc2rl+f59wLw+DcufOp1KOjo7XawNWrV9ud9j33jfb3Vyvlylqzubm1xRgTYCwSaKnarTiXLwdBlHTiVnM98EUY+q1WC8j6QjhrLAJjzBi7enMtn8vn83lr3fT0zPpas9G4mctGFy5enG+sFHL5IBPNrCx0jOQ8EEYbra3WTmk7tGtXEEWra+urzU0uBGOcIwdChuicRSJnDDlQSk1Pz5RKZaVNlMneWJifvT734LEHhODvvv22UapSLFye2SBERGLGgVKUpg5Q3DW0KxMF1xcW11oxiFBZ4CLi3AdnjVIcQXDGOUNkSqlGYyVJYmBACNqaJE2PjI6aOEYlG0uL165MC0QkyxgXlkAbWywWa/Was3Z2bra7AjvrjNEAtzbX7vqNCNZaKaU1ViALPT/0/EajsdJYjTKZdhxfunw57sTYGziMIxdMeMW+vjAMhRDFYnHn0NDa2mpnqyWE56xljAkhtNbGWuQ+WeuMUWnqgPmeHwUhWRd34vMfXGDgWu0YuWCcd5fy/wPNSQ64gew2xgAAAABJRU5ErkJggg==", 198 | "text/plain": [ 199 | "Console does not support images" 200 | ] 201 | }, 202 | "metadata": { 203 | "image/png": { 204 | "height": 32, 205 | "width": 32 206 | } 207 | }, 208 | "output_type": "display_data" 209 | }, 210 | { 211 | "data": { 212 | "text/plain": [ 213 | "7\t\n" 214 | ] 215 | }, 216 | "execution_count": 14, 217 | "metadata": {}, 218 | "output_type": "execute_result" 219 | }, 220 | { 221 | "data": { 222 | "image/png": 
"iVBORw0KGgoAAAANSUhEUgAAACAAAAAgCAIAAAD8GO2jAAAJw0lEQVRIiQXBSY+kZQEA4Hf99q22Xqr37ukZemYYJ7MwQQWUAAIqwYiJHryYaPw33j15I1EPJOqBGFyC4KBE1oHuWXqqp7fqquqq+upb393ngVeTAAfeWcqFoQtN31d5FPinMxbNt27eeubwydnR0ZHrkfWNeaDr0PMno7yq6orXjhcOxymCwLet8zyzbFtkuQEKWWR5aYkQAiAwkJJWFAFCUwqmWXlwViJZrnTxjBUwd/v908l0FIZBWRQE2sQjdhje2L6GMd3ff3TQ62lRC6UoxYggh9IgiIiNDUUAAKUUwkgqRoJGMk1nlZaG6PlWrEQ4mdYEWoEbWbalDVtcWC6zutVo53waNdvbV5+eTfNLl68AyffuffH+v97PpPJdq+mFDtfQxpmoyrKUUoaBjzXA60udMInPs6KoOCJkvrPg2bZjWUG72e22dDVznNh2W7pOgS5Wd77x4ms/9F3n688/XWwlG6vra5vbZ8f9s3t7Sgo3jhFAeZmaKoN5LZXJq5JgKAMbLQZOlc5MWYoyU1rPspknGpQgbJHhdOiEyPeh5werl65YXnThwlrifOujf/ytd3AMLG+5nTjL8yfZJBv0A8tL83HokFazMWDc8X1CDeCzqSlml+aTZjMRkOw9OZ5lmQtlVRZjxpHhk9GDteUrNQRBZ6kseZWdac1m2fhx70Gt4FIUISxaiauZ0WVpY4sG0fL6hcn+/fXVNbzWmSvTrK64ZzlIapsS37OS0F5dbK5sbZ+NUjXsP7u11gjiGTfr155LwjgbHQ+Oj5IgjKL2owcHajoLPCdeXgijJkuzEpms5L6C5XSsMcFXdy7WVeEBElqOMtry/NW15dVua/3C2tzaBYKsTd964dpTTIPT4Tnxk1qI0yc9G2PfD7e2n7p58/b4pD8eTxB1jg9OWFGmQLDhJMwrZtH7p0N8cTUiyNgCxdIkSEet1vVv3uqsdqJ25/hsIrmwQX1wvF9QtLTS/fi/d52osXVpp73QTea7Qau1srG2sr62+/DR7hd7DqKM68rSXcee9+NqedEgB39zs4MMzCqdZiyJGidpaiexFcasEukk3d3dPTjs9c/Pi0qMx9O8ZBvbl++88HLQXvDiZiVBmhUL3c6d5587Ho0++vwzJbXjeEnSctudKZRqMsYvXFz2XRcS+uVwdJ+x4/H56Hz6xZf7J4PB1as7Z/1+WTJqe5PJtMzzyTQ3kNy4fQcCk42H2XSs6iywTafdSFoxomh9Y2ljcaUSykp8JCpaF/jZ7U2W537kRaGXULSztZDL8j9fP4ra8fx8++DRI1bwdFYPxxOt1cbG5uUrV+KAOpAVkwEUeWhpH8jpsM+LFGv29d6n7UZzMp4c9fZcwQLHxgs21Ep4AQ2a4cFoeu948HA4EwbPNZutJHjppe9IqXYf9C7feub7P3pjd293Nk1vXF2lhp0dD2b5DACheJ6l5/l0NDzpeVifDKdHuw9XfGuSzmwCUdiMm92ulTSYQS6xVuKWxw2Vst8/SYt0brHTmW8tdpKfvfm9t958befq5U++/Pr3b7+DBGj4dj0eHNy/Nxkf5eWwP36CA9N0ncOvHo8GU2Ws/nCWlRl+68VXF5cuKBp/9fhAGmkDzDiHNu4m8VZ3bmW5++Dhw4OD3vJiK47d6zeuSyHf++s/y6pY35jbWFv0HEIJ6B8fnh72tldXTu/tf/jx/YLauVb75/mUQ/wchmgy48PR8f2HeFa3GWxDuhhF637ocH58dnQ6GQ9n+XSWh6F1dWfr4uZWVut3//4uwOW3X7izub0luDzZ3T/631eHH3/FHwzOT3JtiKWhKDU2mMRHA6OHDYheRp7Uhs8qhQDQWpRVOdKcAtz08pI/6R0lkUmnk7nl67/89a/WlqJ0sF9Ox6g7HzeS4tOe/vcTxrUN6E0rFsCYWjAKNdDEk1hBCCFMlRgCUevKwRaWCigJpER
ZFs+5W3P0ledvXFlfDbwFAABU2Y/feFWrDLuWTb3hwaPD/90zaQ0JGasy9oCPKZQqAgZJQDIbVEZWUtRCAIAi26MaAKkLJZEB7DwVTf+l526//sq18SyfW/CL8VE2emx3Gkkztpxosj/8029+NzkdhQQbwDEBaZ0xQW1sUYhdhEla5ghhl5CGQx2gtTEAQa40MERzWAP/xvNvbD29k4pz7k7/8+l7hDMu0pq1gVpVk+Gff/vH++9/HFDLJiQEUBtTKF0rU3CuISIQkg62kQFUQIi0oVoToyBC2sw5dgb1xu07z3735b5U8ws7UVx+fvdf48PzaZHysnQP8w/efq/32X7sUAchB2KsjNSSYGQjiAGU2kiuyKWfvylrprmEUBjIhdIlE2GctDtznNjO8lYmmIBYQMeiwY3bz/+l94fevZOmAl989lfVLxqWZwQ3nDMgCXWMIQgAyxgsJRQCIURWf3FTa4MRxhD5TmLbbYgsalNp6ONHj8fDUewBq9ZVPksJajXDQVqMxtkPfvoT+vorkAFZa6GY5AIpTQBUtVRcMsaqulZKKcHJ3Q/fEVprBCjxL6zd2txsxkmiIBsenRze/2R9fY4V++VI8lMrD2HeDOfaDXBz+8waMZ07HrFDWxCDCIYQIgQxChCmFCEHYwKxrTG5vrStETIQGoySJGhajszK8flh75O7vhq7QhgpsTCnfcmb4fkpZ+nw2qWloh6yfCyUrg1WUCujtTGY4FooY4ABgGCMMYXIJtRGUiqEMaKG16Ph6W5VgsGgZ9Sw3XFrlnq+32q4/VHBJXewWl1pLbY9BCyEViAABgIJDYDQaA0hBJIbrRBE2mhtgME2qcup0gZCCIEBZlbpXElcT0eBR7QxRtA8tfb2BpMZbG/NNWLTjJDmlZRSG40gIpQAjDHGQkshBIYAAiC1xBhDYxyLEi204BIAgDECSinFtIK+izEGSimMrMkk/eCDu+2FrVGo17trvB5LVtQ1Y6x2bMf1PIQQhFApxTmHCFFCtdbGGABAXTPCC2MAgQAIARDFkACAtAFISK2BRpS0O96dZy5aXtf3SF0rh/iM5VobQqk2KstSx3G1UrZtU4INgAAAx3GUUkpKJSWxiSOVVkohBIFGlm0DAIqyVFJqpjGmok6311pRY84KO/uPn2AjF1pLCIk8HyPMATCuHWKMATDGAIyR49gYYyEEpVRrTTCBBgCIEERQKmWU8DyfYkgIreqKM4YQDFyXlyeQwJ1Lm6P+mVGl65I4WoBQEoKgRghhhJHgXBvDhRBCRFGEEFJKkSB0hZA1YxghS0OtVV2kEEKghGKVqGsppYNNXVbTrGRVGrm2RSAhhmIgFajyEgGDEXIcBxlt2S6mFBhDCNFaQwj/Dzam07BzUN3GAAAAAElFTkSuQmCC", 223 | "text/plain": [ 224 | "Console does not support images" 225 | ] 226 | }, 227 | "metadata": { 228 | "image/png": { 229 | "height": 32, 230 | "width": 32 231 | } 232 | }, 233 | "output_type": "display_data" 234 | }, 235 | { 236 | "data": { 237 | "text/plain": [ 238 | "7\t\n" 239 | ] 240 | }, 241 | "execution_count": 14, 242 | "metadata": {}, 243 | "output_type": "execute_result" 244 | }, 245 | { 246 | "data": { 247 | "image/png": 
"iVBORw0KGgoAAAANSUhEUgAAACAAAAAgCAIAAAD8GO2jAAAI20lEQVRIiT1WSW+k1RU997731ffVXOUq2213225DDxi6oSEdRBgDQQLCgiBFEWGdRZQFfyab/IAoq0iRUBRCAoghEKbuVuMeTA94bNtl12C7hm9892ZRTd7+vaNz3r3nHPrjn74J8sXazGTY3TGSpS7tHe7Eg3bB5OfOPpkLqhlbEKAADAACAKioKgAikCgEgDLUggFmqAcAEFGxvc3bzWPNkI/ImN1O++bVa4uzs+cef8wYEg+ZC40TglWQEGP8OqCigAKkIIUCCvWgFgpxBLUggqogs2qG937YOPPYY9s7vTQZ+rmjSn2hmC9lLlWJhFICSAkAweDH4zCmQAAYAFiFAQMwgwCSMVUVm/NkbzjY2NqoBPlzj5zicw+osIsPCCCi8aOkAjCB8KNYBCj0PiElUQCssMB9RFUCwKp2Z38jifoPL840auU4TiAwALkERAAcjGVmsKikTonoRwjV/9NRMKAyBiAoCZRBqoCqRdR/8clzeTfsbrTyxXLO95kAyPiuIb630+p0ezPHTjTqdTAURqFACijAqhh/N8MCRgFRgowBrCKzj8/PFjzd3lxLwkFzcmayOcFwChXnjOXusP/PT94PysXN7u7C1NzSwrTxck5hIFBRMqJjORQwBKuAijNQp76BjTW0C7PVg+Go5JvS5ETOcs6NAFUVVWHYO6sr4tH8g6dTGd3YXk2Hw0fPzBoDEVUCEbNC7qtlGFagDCcq/cOeY9tNQhvG/TRJ6hO1wGOjQoigSupS0a3NwfBgcHRwuHLpysxsLSjUr26uptnwwuk5LwcRIfJICc4xM8gJUnWiqsYzR2Hvxt39s4tzNsniUiUI8jmNQqIxWYE4zZyVMF/KN6YL1QCT9fJW76izf+9aFoVR+NOzc8VyLnOJOkAdlEGsSqrChL2B3t7Ye2AqOFG1NudxNQiycOghYwWN10jEUna8mU/r5axYqlN0Y7O3fP27cs6LRri52Y/6wyeXFstlDwRVCJExbIgykc2+3lw/nC56jyzMttC0aRjDjDxSZmIicQIoMZEjKLJ+uHrj1rbJ2uGoUjQTpWKWRLXJY+3BaPle68REpVnzPcNs7GCYxonz88F+lJ2p56anK1db0i8WbSUwRMKG4VR5PHPCMARS1b3drTjsqHIjyM3NTG2urZULdRr0cNijqrnXSVodqdUndjq9/sHw5HS5cGy+SqhM5L/rxDv7B4mENm+tE5Hxvsh4A3g8FgIpTpTLtYkiSb5aHHa3q7XCg4vTQSHXHyIODyUchiK7LsuTW5wrl2rNu61BjoK94kwram+sLV+7fMOqChQqSoCC7luLAGAibU40L/ysebR5q72+ViiUEpTWO1k1C9jWuVY7/sCJ3cNDvx+fnypGKHy/mwZFP8rXu514eXn59uUrp6eqljmnkmLsPMRj/yIGgZG5Y6XS8dNL34661fjImzw53A8rJvJNcu6J56Yee/7GjZVqe+WhZkqU+347C225Mtnca7X/++kXt69ffub05K9ffdmSsaxQCBGIyTADTASAHBHiyO32z1x4tvT0q1mkg2EroLRcKZqZi19eWv/s3Y9+sVjOTzWuddGmgl8tbbba3136auvm17964uwvX3ra5H0r5IGYkSoLERMZZkvMCkvIkEThznq+cKbROFWtFzq99cG9VmN6/tLm9t//+ufFwB07PvefWwef3x6ePLNw7+bN5SuXMOy+9eITL1w87xk/yWATJc+3lGX3TZ+YmIiNISNgyqFIodtvDysHCmxvdA9a+6vbw08u3wnIPvXIQ+9+dPe9r7YylrX17496q3Nl8/YbLyydmkvjNEtJiWx0cBj6uUreeKQy9noiQ0zEBLDJEYPSQXv5i1WqYqLSgb+xub2ztxeUqn95b/nfX1/moh/46N3dPz8//frPn/OLjW9X2mlKmbjMqbWUDfpxL/MahcCzhgiGDZEZZy+xIfU
pl9U09qK9aJjt946OWp2Ddm93p33r1h2v6FsvbviYf/j4mcXTkfJO59BoZtnlffHh2YlGtYT4cBiFYZwrVxggYiZDRHo/nVidGOZmkIhsh2nyt6vXr1xfyWWoTgRxnJ6slX//5ktLp+ZTqDXiOSKXpJmIIxVnRcgjO1kuw0E0UbYEZQMmlvslQhXsWY8la6f2sx9a19e2cnlDUZqGg4snGn/4zRtLZ045ZoWmWRpHkUs1S5NM0jhLLTFUiQRkmQFDDB4n91gmUogzXhInW+uHK7bcr06Wj9WGW1HVmtefOvfasxfPzi2IwomqiArIWjY5k0slZXa+JWaIwoDJEHnM1jM+ExMZAEpg8GgUbvTSdlAuVBsmPUxHzo6Onnvi/Du/fc3z83EIk2MDOCEiIhCzIWIidi6zBAaEQATLZFPROI7L+QJUQAxIH3aN6/HSUnLQWf7w028vf/f0sWDuwvOnZ07kYLJR5hsvE8XYuFUNyDExMxsGxEKMigOghgg8SAXiCn7GzOTZgwyr0hgUq62t7scf/Ku7cunNCw+cO//gXie8cOpkmjhicQ6qcMSqjlQZUCI2jAwGYlVVREghkhF4e7fbKOVtKRcK7qy3O95U1Cj9cO37rz/+5Oq3n8z6XGs8s7oxmq8HXgAlj0hSl4qIkjduGCo6Tuk0SbIotg7OqTAAdWmWWMSBlx9k2EFw5WB7Y6sruc3rX30240XPPtLY3W5/efWmH+S8xRNyh8MonJ5slmsV45OqUyUVhULVASQCz1jrPCPCgEAdEw9Hgx8i9vOTcXnKNqPlD/8R7XcvzpfefuVFCvhwOJqqlEdhmCbZ+lav6NmYouhAGhN14zkVVYFzYhhQj8Qo87hQGlXPkDnoh9t7R1tJ1s3XV3faH7//zeHG6uuPNt9566WF483ZavPh2YWJfOX07FypWv7g6nKS8Nzx2SDv9ft9dQrAqYuyKE5cliFKM2FjJYMKAYAxd7fbW1o5Mb+0ttr6/KMPBxu3f/fyT1559oLHNo2cI4FiGCYrd3dWNtY06hty1mm1WD4cDJy4+8WSuTeMUCz3RmHF8P8Al3UbtSsgHaMAAAAASUVORK5CYII=", 248 | "text/plain": [ 249 | "Console does not support images" 250 | ] 251 | }, 252 | "metadata": { 253 | "image/png": { 254 | "height": 32, 255 | "width": 32 256 | } 257 | }, 258 | "output_type": "display_data" 259 | }, 260 | { 261 | "data": { 262 | "text/plain": [ 263 | "2\t\n" 264 | ] 265 | }, 266 | "execution_count": 14, 267 | "metadata": {}, 268 | "output_type": "execute_result" 269 | }, 270 | { 271 | "data": { 272 | "image/png": 
"iVBORw0KGgoAAAANSUhEUgAAACAAAAAgCAIAAAD8GO2jAAAHY0lEQVRIiTVWS3MmSZF0j8isqu8hqaVpdfc8bIZhdgaMNwYYZhgHjN/Hnf/AkQtw4TIHbG0Nw3iv7bKwA9MPtVpqqdX6vqrKzHAOpY5zZkSGh4d78qc/++U4zmUuoQDhNDczdzejMefkngREU0gAIkIKLCEIUCyhACQQSokpmdPMlVRrMnifJdGY3Yeuz8lSMjczMwFVoRZLOgCUImKupZRaaqshhaCAACmkEqgFRppZImLoHIBCAt0tOzqnGz0ZSQEmCqa7QI149epq//r1arNeDUNtahJgighRijcNkmZpu+5z6lprpZVpqiE01RoAJIkkgFAAhCRpKvNunBu4PTzcbNbmKSJoJkBSba3W1oAAnci5S6uuozGZm5NkRACKqCVIGsi5zBGikSSgMtcW0fedm9UIQzXzNxOBmedsCQCR3bvkaaozaVJIMNLctaAlCVFbG8fSQkKANAMFd3fCnRBahADSAJAkYXf8MDdzY5rmQhIgiZAiIhqgACmgtiihFgGJFEAnAQEgYEYBkKAmojUtZXLOq02fc3IildJy9j53NNZWb+c616YIiQHEwk0BgIFshAFErU1QMk/uZmgRtTYBJCVEoxk2m27IOVXJhbkVNoZCNHNWYC5NoBkJRLTWmhMyMNgUTUpStqjRDJSihWjUm2ntd/shm4PJUmpAa40gaEvb5kog7Q6QaGxGQmYLwjCD+wKtgWZMiSRRSpEEksA8z621lJOZGcnaMNdAgJSB2SCAkBEpO7MRSIk5eU7ZfVnUhVoUMM9lHPctQoGqEhHzPLsxSZAgIELRpFAooAV6LbxxpxuMdKO502gkCAEC1KK2mGudo5VQhKLaiDo5HC2FuFDGjH1nNaLMmGsrrQELbeQGdyY3V8pSdTOjmdsy0kBtdW5RCmtYbQrJIIkEU62RshlhhIAGBjkFahMkYCkjN/bJc1O1SnLJD1JgCLVqrppa1FAIAozMRqen0hoMy7rMRWPFrkQLhkwREKi72qqYWtCMRhqtGcAAWkhCgzdYkEEuClNJN0+3s7wGiCqVQMDCEgyhaK0BcklgMwMI0szMbZnx3co5BSDetKMISQqHSUyzZYCLNOhuoykiwdz7ZZ9btBAMMgpvaMPlPEhwsQZFi1brkh8QjcbkagJDgqDFMEiQgMkK5jGP4yqCFJPTk9dQi9k4pRSp64aNZSMAFaGRzCkJdzJMWhJNZACSIrTIOKAQAP/s00/vf/qrL1xcHr5/uv3Rx92j4/rXs+nP/7q4Hf/H0pNH72++8o2Dt99dH5+uDg671RaBulyFIJBMkRLNDYDkd7YIAEPaffWDi+8OLX3h3f5ym7ar7cfoT0b7IKUfP7z629n2F8+OX+TXT+7h6ZNp1e3WW3vr/tEXv9wdnQZ0p0tQAmEkAC2wAABSSpouPzz53de/e3P1gw9fnLUyxdGJ7r9rvvap8cmn6/M/rPafd/OwcvYB7S8u99Nox4/i6CTuPI0SUkrJrGPEXMZoJXUrul08/f/P/vLrf/z2rw9P9uNlHS/UqP4gbQ8TOzs82nzE1PVDN8/z65vGHXIeNG1OHlwHJ4ktwGgCmNJ8c23JH//tj88///zq5fXHX/v2NI3//af/+vtffrt7Pdbgo2xH5s9qbMz75LcHm4OTzSfd9MOXr+/tSjt/HBGxXa0PaQ8exXzb2hylmRMkNaff/PwnTHp1eTbdjlOJ24s/3L6aXu1jbl3adF1KdV+up3mqtI5t7dpqmq/PX47nV9cnhw+6j95j19V5ri/+gavH8NOoBSFYgrS/uU4v/vn7gDab9cnKNcTcnnJguZ7v9SsrhVMJI1Y4pTktlflw1477+SHHB2qz5mC1oUdDvanznx6f9S8/+/t/5s3R6vAU0T9//pRf+WTohzw2OXncZ+/ZpHR2u0mpR/St9CmtV5tay1yqhvUW/DDmFf3mspbaXb592hToV2r
1davP4vpphJFjU5c3u2lKrZmmWCdej+X8pmz67EC50LOp/LO0sWkYbHuQdrt5vy85vf6yZ2635d7mz7ZvMW5f3eZWL9Iead1v7MMvfeOdo+2qO5xKycNgvkp1its9hiEPXDGR5p6S/8c7X3r7o/X55bOz89W6u3d0dHNzs9vvH56efvWTj7BeP3161q6uok716nr16vZZLQ8/OD64/9bxe58w97nLB0ZPbmT6zre/tV4fdCkHXWYNBtj//t/n/3r+ogKbVbe/3Z2P0zSN3TAY+fTs+YvEUqpFTe7dO28/xlna75Lh+Phoux1ySimZG0kZW3r34+8HME1Tmcrtbrfbj+M0FuSr6/NSJrU2jWMpNeVswNXLl1cvX3Zdvnd0eHCwXW1XDx/c/+b3vrXq+q5LfZc8uRkhkTREghLcuuTu1vVdP3SHtdS51reO6/vvhcLcPKV+6FZ9P3TdsBq6rkvZcrKuy+6uWP4YisXAGEZkUzate0eLdHNz64saCma+Gtbdvb7Pue9y7lLX5XT3rOXDt/g3FK1FK6USMsCh3tU7+45OOGPa7V48ueyH/t+4TvZOVMnuRAAAAABJRU5ErkJggg==", 273 | "text/plain": [ 274 | "Console does not support images" 275 | ] 276 | }, 277 | "metadata": { 278 | "image/png": { 279 | "height": 32, 280 | "width": 32 281 | } 282 | }, 283 | "output_type": "display_data" 284 | }, 285 | { 286 | "data": { 287 | "text/plain": [ 288 | "9\t\n" 289 | ] 290 | }, 291 | "execution_count": 14, 292 | "metadata": {}, 293 | "output_type": "execute_result" 294 | } 295 | ], 296 | "source": [ 297 | "classifier = NN.new()\n", 298 | "classifier:train(tr_x, tr_y)\n", 299 | "\n", 300 | "itorch.image(te_x[100])\n", 301 | "print (classifier:predict(te_x[100]))\n", 302 | "\n", 303 | "itorch.image(te_x[110])\n", 304 | "print (classifier:predict(te_x[110]))\n", 305 | "\n", 306 | "itorch.image(te_x[120])\n", 307 | "print (classifier:predict(te_x[120]))\n", 308 | "\n", 309 | "itorch.image(te_x[130])\n", 310 | "print (classifier:predict(te_x[130]))\n", 311 | "\n", 312 | "itorch.image(te_x[140])\n", 313 | "print (classifier:predict(te_x[140]))" 314 | ] 315 | }, 316 | { 317 | "cell_type": "code", 318 | "execution_count": 15, 319 | "metadata": { 320 | "collapsed": true 321 | }, 322 | "outputs": [], 323 | "source": [ 324 | "require 'math'\n", 325 | "function evaluate()\n", 326 | " errors = 0\n", 327 | " for i = 1,10 do\n", 328 | " xi = tr_x[i] \n", 329 | " ti = tr_y[i] \n", 330 | " op = classifier:predict(xi)\n", 331 | " if ti ~= op then\n", 332 | " errors = 
errors + 1\n", 333 | " end\n", 334 | " end\n", 335 | " return errors\n", 336 | "end" 337 | ] 338 | }, 339 | { 340 | "cell_type": "code", 341 | "execution_count": 16, 342 | "metadata": {}, 343 | "outputs": [ 344 | { 345 | "data": { 346 | "text/plain": [ 347 | "0\t\n" 348 | ] 349 | }, 350 | "execution_count": 16, 351 | "metadata": {}, 352 | "output_type": "execute_result" 353 | } 354 | ], 355 | "source": [ 356 | "print(evaluate())" 357 | ] 358 | } 359 | ], 360 | "metadata": { 361 | "kernelspec": { 362 | "display_name": "iTorch", 363 | "language": "lua", 364 | "name": "itorch" 365 | }, 366 | "language_info": { 367 | "name": "lua", 368 | "version": "5.1" 369 | } 370 | }, 371 | "nbformat": 4, 372 | "nbformat_minor": 2 373 | } 374 | -------------------------------------------------------------------------------- /notebooks/ReLU.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "# Rectified Linear (ReLU) Layer\n", 8 | "In this notebook, we will look into the forward and backward passes of the ```nn.ReLU``` layer. We will also see how to compute the gradient of the loss with respect to the input $\\frac{\\partial L}{\\partial I}$ for this layer." 9 | ] 10 | }, 11 | { 12 | "cell_type": "markdown", 13 | "metadata": {}, 14 | "source": [ 15 | "#### Input\n", 16 | "```torch.rand``` gives us random numbers uniformly in the range $[0, 1]$. We subtract 0.5
    to bring it to the range $[-0.5, 0.5]$" 17 | ] 18 | }, 19 | { 20 | "cell_type": "code", 21 | "execution_count": 1, 22 | "metadata": {}, 23 | "outputs": [ 24 | { 25 | "data": { 26 | "text/plain": [ 27 | "-0.0044\n", 28 | "-0.1521\n", 29 | " 0.4794\n", 30 | "-0.1014\n", 31 | " 0.4201\n", 32 | "[torch.DoubleTensor of size 5]\n", 33 | "\n" 34 | ] 35 | }, 36 | "execution_count": 1, 37 | "metadata": {}, 38 | "output_type": "execute_result" 39 | } 40 | ], 41 | "source": [ 42 | "require 'nn';\n", 43 | "n = torch.rand(5) - 0.5 \n", 44 | "print(n)" 45 | ] 46 | }, 47 | { 48 | "cell_type": "markdown", 49 | "metadata": {}, 50 | "source": [ 51 | "#### Output" 52 | ] 53 | }, 54 | { 55 | "cell_type": "code", 56 | "execution_count": 2, 57 | "metadata": {}, 58 | "outputs": [ 59 | { 60 | "data": { 61 | "text/plain": [ 62 | " 0.0000\n", 63 | " 0.0000\n", 64 | " 0.4794\n", 65 | " 0.0000\n", 66 | " 0.4201\n", 67 | "[torch.DoubleTensor of size 5]\n", 68 | "\n" 69 | ] 70 | }, 71 | "execution_count": 2, 72 | "metadata": {}, 73 | "output_type": "execute_result" 74 | } 75 | ], 76 | "source": [ 77 | "relu = nn.ReLU()\n", 78 | "m = relu:forward(n)\n", 79 | "print(m)" 80 | ] 81 | }, 82 | { 83 | "cell_type": "markdown", 84 | "metadata": {}, 85 | "source": [ 86 | "So for simplicity, we start by setting the gradient of the loss with respect to the input of next layer (flowing in through the next layer) $\\frac{\\partial L}{\\partial I^{l+1}}$ to be all ones.
    Next, we see that the gradient of the loss with respect to the input of this layer $\\frac{\\partial L}{\\partial I^{l}}$ is one where $n > 0$ and zero otherwise." 87 | ] 88 | }, 89 | { 90 | "cell_type": "code", 91 | "execution_count": 3, 92 | "metadata": {}, 93 | "outputs": [ 94 | { 95 | "data": { 96 | "text/plain": [ 97 | " 0\n", 98 | " 0\n", 99 | " 1\n", 100 | " 0\n", 101 | " 1\n", 102 | "[torch.DoubleTensor of size 5]\n", 103 | "\n" 104 | ] 105 | }, 106 | "execution_count": 3, 107 | "metadata": {}, 108 | "output_type": "execute_result" 109 | } 110 | ], 111 | "source": [ 112 | "nextgrad=torch.ones(5)\n", 113 | "relu:backward(n, nextgrad)\n", 114 | "print(relu.gradInput)" 115 | ] 116 | }, 117 | { 118 | "cell_type": "code", 119 | "execution_count": 4, 120 | "metadata": {}, 121 | "outputs": [ 122 | { 123 | "data": { 124 | "text/plain": [ 125 | " 1\n", 126 | " 1\n", 127 | " 1\n", 128 | " 1\n", 129 | " 1\n", 130 | "[torch.DoubleTensor of size 5]\n", 131 | "\n" 132 | ] 133 | }, 134 | "execution_count": 4, 135 | "metadata": {}, 136 | "output_type": "execute_result" 137 | } 138 | ], 139 | "source": [ 140 | "print(nextgrad)" 141 | ] 142 | } 143 | ], 144 | "metadata": { 145 | "kernelspec": { 146 | "display_name": "iTorch", 147 | "language": "lua", 148 | "name": "itorch" 149 | }, 150 | "language_info": { 151 | "name": "lua", 152 | "version": "5.1" 153 | } 154 | }, 155 | "nbformat": 4, 156 | "nbformat_minor": 2 157 | } 158 | -------------------------------------------------------------------------------- /notebooks/Transposed Convolution.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "## Transposed Convolution\n", 8 | "Transposed convolutions, also called fractionally strided convolutions or deconvolutions, work by swapping the forward and backward passes of a convolution. 
One way to put it is to note that the kernel defines a convolution, but whether it’s a direct convolution or a transposed convolution is determined by how the forward and backward passes are computed [1]. \n", 9 | "\n", 10 | "For instance, although the kernel $\\mathbf{w}$ defines a convolution whose forward and backward passes are computed by multiplying with $\\mathbf{C}$ and $\\mathbf{C}^T$ respectively, it also defines a transposed convolution whose forward and backward passes are computed by multiplying with $\\mathbf{C}^T$ and $(\\mathbf{C}^T)^T = \\mathbf{C}$ respectively.\n", 11 | "\n", 12 | "It is defined as SpatialFullConvolution in Torch\n", 13 | "\n", 14 | "[1]: Vincent Dumoulin, Francesco Visin - [A guide to convolution arithmetic for deep learning](https://arxiv.org/abs/1603.07285 \"A guide to convolution arithmetic for deep learning\") " 15 | ] 16 | }, 17 | { 18 | "cell_type": "markdown", 19 | "metadata": {}, 20 | "source": [ 21 | "### SpatialFullConvolution\n", 22 | "`module = nn.SpatialFullConvolution(nInputPlane, nOutputPlane, kW, kH, [dW], [dH], [padW], [padH], [adjW], [adjH])`\n", 23 | "\n", 24 | "Other frameworks call this operation \"In-network Upsampling\", \"Fractionally-strided convolution\", \"Backwards Convolution,\" \"Deconvolution\", or \"Upconvolution.\"\n", 25 | "The parameters are the following:\n", 26 | "* `nInputPlane:` The number of expected input planes in the image given into `forward()`.\n", 27 | "* `nOutputPlane:` The number of output planes the convolution layer will produce.\n", 28 | "* `kW:` The kernel width of the convolution\n", 29 | "* `kH:` The kernel height of the convolution\n", 30 | "* `dW:` The step of the convolution in the width dimension. Default is 1.\n", 31 | "* `dH:` The step of the convolution in the height dimension. Default is 1.\n", 32 | "* `padW:` Additional zeros added to the input plane data on both sides of width axis. Default is 0. 
(kW-1)/2 is often used here.\n", 33 | "* `padH:` Additional zeros added to the input plane data on both sides of height axis. Default is 0. (kH-1)/2 is often used here.\n", 34 | "* `adjW:` Extra width to add to the output image. Default is 0. Cannot be greater than dW-1.\n", 35 | "* `adjH:` Extra height to add to the output image. Default is 0. Cannot be greater than dH-1.\n" 36 | ] 37 | }, 38 | { 39 | "cell_type": "code", 40 | "execution_count": 1, 41 | "metadata": { 42 | "collapsed": true 43 | }, 44 | "outputs": [], 45 | "source": [ 46 | "require 'torch'\n", 47 | "require 'nn'" 48 | ] 49 | }, 50 | { 51 | "cell_type": "markdown", 52 | "metadata": {}, 53 | "source": [ 54 | "### (Normal) Convolution Forward & Backward Pass\n", 55 | "\n", 56 | "Layer with kernel size $3 \\times 3$ and stride of $2$, $2$" 57 | ] 58 | }, 59 | { 60 | "cell_type": "code", 61 | "execution_count": 2, 62 | "metadata": {}, 63 | "outputs": [ 64 | { 65 | "data": { 66 | "text/plain": [ 67 | " 1.0160 0.4442 0.8234 1.7105 1.9389 1.8970 0.4721 1.4563 0.4023\n", 68 | "[torch.DoubleTensor of size 1x9]\n", 69 | "\n" 70 | ] 71 | }, 72 | "execution_count": 2, 73 | "metadata": {}, 74 | "output_type": "execute_result" 75 | } 76 | ], 77 | "source": [ 78 | "conv = nn.SpatialConvolutionMM(1,1,3,3,2,2)\n", 79 | "conv.weight:uniform(0,2)\n", 80 | "conv.bias:fill(0)\n", 81 | "print(conv.weight)" 82 | ] 83 | }, 84 | { 85 | "cell_type": "markdown", 86 | "metadata": {}, 87 | "source": [ 88 | "#### Input\n", 89 | "Image of channel $1$ and size $5 \\times 5$, $r$ is the inputGradient" 90 | ] 91 | }, 92 | { 93 | "cell_type": "code", 94 | "execution_count": 3, 95 | "metadata": { 96 | "collapsed": true 97 | }, 98 | "outputs": [], 99 | "source": [ 100 | "imgC = torch.Tensor(1,1,5,5)\n", 101 | "imgC:uniform(0,5)" 102 | ] 103 | }, 104 | { 105 | "cell_type": "code", 106 | "execution_count": 4, 107 | "metadata": {}, 108 | "outputs": [ 109 | { 110 | "data": { 111 | "text/plain": [ 112 | "(1,1,.,.) 
= \n", 113 | " 28.0467 21.1382\n", 114 | " 23.1202 30.5335\n", 115 | "[torch.DoubleTensor of size 1x1x2x2]\n", 116 | "\n" 117 | ] 118 | }, 119 | "execution_count": 4, 120 | "metadata": {}, 121 | "output_type": "execute_result" 122 | } 123 | ], 124 | "source": [ 125 | "r=conv:forward(imgC)\n", 126 | "print(r)" 127 | ] 128 | }, 129 | { 130 | "cell_type": "code", 131 | "execution_count": 5, 132 | "metadata": {}, 133 | "outputs": [ 134 | { 135 | "data": { 136 | "text/plain": [ 137 | "(1,1,.,.) = \n", 138 | " 28.4956 12.4596 44.5710 9.3905 17.4058\n", 139 | " 47.9750 54.3784 89.3627 40.9838 40.0994\n", 140 | " 36.7311 51.1151 71.3240 44.3476 33.6471\n", 141 | " 39.5481 44.8267 96.0881 59.1999 57.9224\n", 142 | " 10.9150 33.6697 23.7172 44.4655 12.2851\n", 143 | "[torch.DoubleTensor of size 1x1x5x5]\n", 144 | "\n" 145 | ] 146 | }, 147 | "execution_count": 5, 148 | "metadata": {}, 149 | "output_type": "execute_result" 150 | } 151 | ], 152 | "source": [ 153 | "conv:backward(imgC,r)" 154 | ] 155 | }, 156 | { 157 | "cell_type": "markdown", 158 | "metadata": {}, 159 | "source": [ 160 | "### Transpose Convolution Forward Pass\n", 161 | "\n", 162 | "Layer with kernel size $3 \\times 3$ and stride of $2$, $2$" 163 | ] 164 | }, 165 | { 166 | "cell_type": "code", 167 | "execution_count": 6, 168 | "metadata": { 169 | "collapsed": true 170 | }, 171 | "outputs": [], 172 | "source": [ 173 | "trancon = nn.SpatialFullConvolution(1, 1, 3, 3, 2, 2)\n", 174 | "trancon.bias:fill(0)" 175 | ] 176 | }, 177 | { 178 | "cell_type": "code", 179 | "execution_count": 7, 180 | "metadata": {}, 181 | "outputs": [ 182 | { 183 | "data": { 184 | "text/plain": [ 185 | "(1,1,.,.) 
= \n", 186 | " 1.0160 0.4442 0.8234\n", 187 | " 1.7105 1.9389 1.8970\n", 188 | " 0.4721 1.4563 0.4023\n", 189 | "[torch.DoubleTensor of size 1x1x3x3]\n", 190 | "\n" 191 | ] 192 | }, 193 | "execution_count": 7, 194 | "metadata": {}, 195 | "output_type": "execute_result" 196 | } 197 | ], 198 | "source": [ 199 | "trancon.weight:copy(conv.weight)\n", 200 | "trancon.weight:reshape(1,1,3,3)\n", 201 | "print(trancon.weight)" 202 | ] 203 | }, 204 | { 205 | "cell_type": "markdown", 206 | "metadata": {}, 207 | "source": [ 208 | "#### Input\n", 209 | "Image of channel $1$ and size $2 \\times 2$ " 210 | ] 211 | }, 212 | { 213 | "cell_type": "code", 214 | "execution_count": 8, 215 | "metadata": {}, 216 | "outputs": [ 217 | { 218 | "data": { 219 | "text/plain": [ 220 | "(1,1,.,.) = \n", 221 | " 28.0467 21.1382\n", 222 | " 23.1202 30.5335\n", 223 | "[torch.DoubleTensor of size 1x1x2x2]\n", 224 | "\n" 225 | ] 226 | }, 227 | "execution_count": 8, 228 | "metadata": {}, 229 | "output_type": "execute_result" 230 | } 231 | ], 232 | "source": [ 233 | "print(r)" 234 | ] 235 | }, 236 | { 237 | "cell_type": "markdown", 238 | "metadata": {}, 239 | "source": [ 240 | "#### Forward Pass - Note that this is exactly the same as the backward pass of the (normal) convolution!" 241 | ] 242 | }, 243 | { 244 | "cell_type": "code", 245 | "execution_count": 9, 246 | "metadata": {}, 247 | "outputs": [ 248 | { 249 | "data": { 250 | "text/plain": [ 251 | "(1,1,.,.) 
= \n", 252 | " 28.4956 12.4596 44.5710 9.3905 17.4058\n", 253 | " 47.9750 54.3784 89.3627 40.9838 40.0994\n", 254 | " 36.7311 51.1151 71.3240 44.3476 33.6471\n", 255 | " 39.5481 44.8267 96.0881 59.1999 57.9224\n", 256 | " 10.9150 33.6697 23.7172 44.4655 12.2851\n", 257 | "[torch.DoubleTensor of size 1x1x5x5]\n", 258 | "\n" 259 | ] 260 | }, 261 | "execution_count": 9, 262 | "metadata": {}, 263 | "output_type": "execute_result" 264 | } 265 | ], 266 | "source": [ 267 | "trancon:forward(r)" 268 | ] 269 | } 270 | ], 271 | "metadata": { 272 | "kernelspec": { 273 | "display_name": "iTorch", 274 | "language": "lua", 275 | "name": "itorch" 276 | }, 277 | "language_info": { 278 | "name": "lua", 279 | "version": "5.1" 280 | } 281 | }, 282 | "nbformat": 4, 283 | "nbformat_minor": 2 284 | } 285 | -------------------------------------------------------------------------------- /notebooks/cifar10/te_data.bin: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/cs763/Spring2018/817df53a8def4e05f2ef2b217b0efd7b479c662b/notebooks/cifar10/te_data.bin -------------------------------------------------------------------------------- /notebooks/cifar10/te_labels.bin: -------------------------------------------------------------------------------- 1 | V 1torch.ByteTensor'V 1torch.ByteStoragenotebooks/mnist/test_32x32.t7: -------------------------------------------------------------------------------- 1 | version https://git-lfs.github.com/spec/v1 2 | oid sha256:33b5f4f249d9113ddf5918ac0af21dff73312a68d9ca06e6260e82770cdf0994 3 | size 10250210 4 | -------------------------------------------------------------------------------- /notebooks/mnist/train_32x32.t7: -------------------------------------------------------------------------------- 1 | version https://git-lfs.github.com/spec/v1 2 | oid sha256:391bf4c91ce96deb828e7c1b4c8255b310aa997ce202c52609e89705e5bb87fa 3 | size 61500210 4 | 
-------------------------------------------------------------------------------- /projects/glyphs_sample.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/cs763/Spring2018/817df53a8def4e05f2ef2b217b0efd7b479c662b/projects/glyphs_sample.png -------------------------------------------------------------------------------- /projects/readme.md: -------------------------------------------------------------------------------- 1 |

    Computer Vision (CS 763) - Spring 2018 Project Information

    2 | You can choose a project from the list below, or propose your own project and get it approved by the TAs. 3 |

    Timeline

    4 | 9 | 10 |

    Project Proposal Form

    11 | Project Proposal Form 12 | 13 |

    Some Proposed Projects

    14 |
      15 |
    1. MASK R-CNN for Object Instance Segmentation
      16 | Meet Pragnesh Shah, Mehta Nihar Nikhil, Parth Ashit Kothari
      17 | 18 | Mask R-CNN efficiently detects objects in an image while simultaneously generating a high-quality segmentation mask for each instance. Instance segmentation involves detecting all objects in the image as well as precisely segmenting each instance. The related research paper was published by Facebook AI Research. Through this project, we aim to implement this model in PyTorch and perform experiments on the PASCAL VOC dataset. We shall also attempt to modify the model to improve its accuracy on the vision tasks described in the paper. Another incentive for this project is that no working code is yet available in PyTorch or TensorFlow, which are the popular frameworks today, so an implementation would largely benefit the deep learning community. 19 | 20 |
    2. Object Detection
      21 | Kratika Gupta, Pratik Kalshetty, Naman Rastogi
      22 | 23 | The aim of this project is to implement a system capable of detecting and localising objects in an image. The goal is to achieve accurate object detection at high speed. The ideas in YOLO (You Only Look Once: Unified, Real-Time Object Detection) will guide the implementation. For the purpose of this project, the publicly available PASCAL VOC dataset will be used. It consists of $10$k annotated images covering $20$ object classes, with $25$k object annotations. 24 | 25 |
    3. Plane Extraction on Surfaces 26 |
      Tejesh Raut, Deep Modh, Chaitanya Rajesh
      27 | 28 | In this project, we aim to track a section of a plane in the image so that a user can place a 3D object on the surface, for augmented reality applications. We would first recognise whether the texture to be detected is present in the image. If it is present, we can then find its position and orientation in the image and finally apply the transformation to a predefined, hard-coded 3D model in order to place the object on the surface plane. The output will be an image with the 3D model placed on it. 29 | 30 |
    4. Activity Recognition 31 |
      Naman Jain, Sahil Shah, Uddeshya
      32 | 33 | In our project, we plan to work on detecting human activities from videos. There exists a good amount of literature and data in this area, yet, there is room for improvement. Our initial endeavours would be towards exploring 3D CNNs, as is proposed in the literature, to approach the problem. We also intend to incorporate Connectionist Temporal Classification (CTC) losses in our temporal architectures for improved detection. 34 | 35 |
    5. Glyph based AR application 36 |
      Sachin Goyal, Mohit Madan, Michelle Barnette
      37 | 38 | We wish to perform glyph-based AR using the following steps: 1) Glyph-boundary recognition using edge detection. 2) Estimating the distortion (orientation and scaling) in 3D space. 3) Placing any image on the glyph in 3D space (applying rigid transformations). 4) Identifying the glyph pattern and placing a specific image on the glyph. 5) (If time permits) placing a 3D object on the glyph. 39 | 40 |

    6. Vehicle registration plate detection and recognition 41 |
      Chintan Tundia, Shabana K M, Sukanya Bhattacharjee
      42 | 43 | In this project, we aim to build an automated pipeline to perform license plate detection from images of vehicles and perform recognition of characters in the number plate. This would involve first localizing the number plate in the image followed by detecting the numbers. For the current project, we plan to stick to Indian number plates only. 44 | 45 |
    7. Video Prediction using GANs 46 |
      Alok Kumar Bishoyi
      47 | 48 | Adversarial networks have recently been used to predict future video frames. I plan to implement ‘Deep Multi-Scale Video Prediction beyond Mean Square Error’ for Chainer users. I also plan to test and train on sample games like Pong, Breakout, etc. 49 | 50 |
    8. Label satellite image chips from Amazon rainforest with atmospheric conditions and various classes of land cover
      51 | Kalpesh Dusane, Arijit Mukherjee, Vaishali Shakya
      52 | 53 | It is crucial to identify the terrain of the Amazon rainforest to help the government track regions of deforestation or human infiltration of the forest, which in turn helps minimize the devastating effects of climate change. Thanks to recent advancements in technology, we can capture satellite images, but manually monitoring and labeling these images would be a tedious and unreliable task. To solve this problem, we present a model that assigns multiple labels to each image, describing various climate conditions and landmarks. We seek to explore different deep convolutional neural network architectures for the image labeling task and to learn how changing various factors, such as hyperparameters, affects the performance of our neural network model. 54 | 55 |
    9. Style Transfer on Car Exteriors
      56 | Uday Kusupati, Jayanth Shankar, Manoj Kilaru
      57 | Applying the style of another image (a painting, a texture, etc.) to an image of a car, similar to Prisma but for car exteriors. 58 | 59 | The idea is to identify the body of the car using segmentation of the car parts. Next, we will build upon an efficient solution proposed by Leon A. Gatys, in which feed-forward convolutional neural networks are trained by defining and optimizing perceptual loss functions. The intent is to also use cues such as depth, neighborhood pixels, and planarity to transfer style better. 60 | 61 |
    10. Implementing Re3: Real-Time Recurrent Regression Networks for Visual Tracking of Generic Objects
      62 | Rishab Garg, Abhishek Kumar
      63 | 64 | Implementing Re3: Real-Time Recurrent Regression Networks for Visual Tracking of Generic Objects 65 | 66 | A robust object tracking algorithm that combines a CNN and an LSTM. This optimized tracker is capable of tracking objects at 150 FPS while attaining impressive results on various benchmarks. It also handles temporary occlusion better than other comparable trackers. Using an LSTM gives the real-time deep object tracker the ability to incorporate temporal information into its model. All this is achieved with a single forward pass. 67 | 68 |
    11. Face detection using CNN
      69 | Khursheed Ali, Ayush Goyal and Anshul Gupta 70 |
      71 | 72 | Face detection has vast applications in areas ranging from surveillance, 73 | security, and crowd size estimation to social networking. The challenge 74 | lies in creating a model that is agnostic to lighting conditions, 75 | pose, accessories, and occlusion. We aim to create a pipeline that takes 76 | an image as input and draws a bounding box around the face of each 77 | person in the image. Further, if the size of a face is above a certain 78 | threshold, we will detect the expression of that person. 79 | 80 |
    12. Image Captioning
      81 | Yugal Sachdev
      82 | 83 | The goal of this project is to generate natural language descriptions of images and their regions. It will use a combination of a CNN over image regions, a bidirectional RNN over sentences, and a structured loss function that aligns the two modalities, text and images, through a multimodal embedding. I will be implementing 'Deep Visual-Semantic Alignments for Generating Image Descriptions' on the Flickr8K, Flickr30K and MSCOCO datasets using TensorFlow and Keras. 84 |
    85 | --------------------------------------------------------------------------------