├── README.md
├── data
│   └── timemachine.txt
├── notebooks-1
│   ├── 1-ndarray.ipynb
│   ├── 2-autograd.ipynb
│   ├── 3-linear-regression-scratch.ipynb
│   ├── 4-linear-regression-gluon.ipynb
│   ├── 5-fashion-mnist.ipynb
│   ├── 6-softmax-regression-scratch.ipynb
│   ├── 7-softmax-regression-gluon.ipynb
│   ├── 8-mlp-scratch.ipynb
│   └── 9-mlp-gluon.ipynb
├── notebooks-2
│   ├── 1-use-gpu.ipynb
│   ├── 2-conv-layer.ipynb
│   ├── 3-pooling.ipynb
│   ├── 4-lenet.ipynb
│   ├── 5-alexnet.ipynb
│   ├── 6-vgg.ipynb
│   ├── 7-googlenet.ipynb
│   └── 8-resnet.ipynb
├── notebooks-3
│   ├── 1-hybridize.ipynb
│   ├── 2-multiple-gpus.ipynb
│   ├── 3-multiple-gpus-gluon.ipynb
│   └── 4-fine-tuning.ipynb
├── notebooks-4
│   ├── 1-text-preprocessing.ipynb
│   ├── 2-rnn-scratch.ipynb
│   ├── 3-rnn-gluon.ipynb
│   ├── 4-gru.ipynb
│   └── 5-lstm.ipynb
└── run_ipynb.sh

/README.md:
--------------------------------------------------------------------------------
1 | # d2l-1day-notebooks
2 |
3 |
4 | Notebooks for a 1-day crash course that aims to teach deep learning in a single day. This repo contains the notebooks, with only simplified code blocks. The text is also summarized into slides that will be uploaded later.
5 |
6 | Check [the wiki page](https://github.com/mli/1day-notebooks/wiki) for instructions to set up the running environment.
7 |
8 | ## Part 1: Deep Learning Basics
9 |
10 | | title | ipynb | slides |
11 | | ------------------------------ | ---- | ---- |
12 | | Data Manipulation with Ndarray | [github](https://github.com/mli/d2l-1day-notebooks/blob/master/notebooks-1/1-ndarray.ipynb) | [nbviewer](https://nbviewer.jupyter.org/format/slides/github/mli/d2l-1day-notebooks/blob/master/notebooks-1/1-ndarray.ipynb#/) |
13 | | Automatic Differentiation | [github](https://github.com/mli/d2l-1day-notebooks/blob/master/notebooks-1/2-autograd.ipynb) | [nbviewer](https://nbviewer.jupyter.org/format/slides/github/mli/d2l-1day-notebooks/blob/master/notebooks-1/2-autograd.ipynb#/) |
14 | | Linear Regression Implementation from Scratch | [github](https://github.com/mli/d2l-1day-notebooks/blob/master/notebooks-1/3-linear-regression-scratch.ipynb) | [nbviewer](https://nbviewer.jupyter.org/format/slides/github/mli/d2l-1day-notebooks/blob/master/notebooks-1/3-linear-regression-scratch.ipynb#/) |
15 | | Concise Implementation of Linear Regression | [github](https://github.com/mli/d2l-1day-notebooks/blob/master/notebooks-1/4-linear-regression-gluon.ipynb) | [nbviewer](https://nbviewer.jupyter.org/format/slides/github/mli/d2l-1day-notebooks/blob/master/notebooks-1/4-linear-regression-gluon.ipynb#/) |
16 | | Image Classification Data (Fashion-MNIST) | [github](https://github.com/mli/d2l-1day-notebooks/blob/master/notebooks-1/5-fashion-mnist.ipynb) | [nbviewer](https://nbviewer.jupyter.org/format/slides/github/mli/d2l-1day-notebooks/blob/master/notebooks-1/5-fashion-mnist.ipynb#/) |
17 | | Implementation of Softmax Regression from Scratch | [github](https://github.com/mli/d2l-1day-notebooks/blob/master/notebooks-1/6-softmax-regression-scratch.ipynb) | [nbviewer](https://nbviewer.jupyter.org/format/slides/github/mli/d2l-1day-notebooks/blob/master/notebooks-1/6-softmax-regression-scratch.ipynb#/) |
18 | | Concise Implementation of Softmax Regression | [github](https://github.com/mli/d2l-1day-notebooks/blob/master/notebooks-1/7-softmax-regression-gluon.ipynb) | [nbviewer](https://nbviewer.jupyter.org/format/slides/github/mli/d2l-1day-notebooks/blob/master/notebooks-1/7-softmax-regression-gluon.ipynb#/) |
19 | | Implementation of Multilayer Perceptron from Scratch | [github](https://github.com/mli/d2l-1day-notebooks/blob/master/notebooks-1/8-mlp-scratch.ipynb) | [nbviewer](https://nbviewer.jupyter.org/format/slides/github/mli/d2l-1day-notebooks/blob/master/notebooks-1/8-mlp-scratch.ipynb#/) |
20 | | Concise Implementation of Multilayer Perceptron | [github](https://github.com/mli/d2l-1day-notebooks/blob/master/notebooks-1/9-mlp-gluon.ipynb) | [nbviewer](https://nbviewer.jupyter.org/format/slides/github/mli/d2l-1day-notebooks/blob/master/notebooks-1/9-mlp-gluon.ipynb#/) |
21 |
22 | ## Part 2: Convolutional Neural Networks
23 |
24 | | title | ipynb | slides |
25 | | -------------------------------------------- | ------------------------------------------------------------ | ------------------------------------------------------------ |
26 | | GPUs | [github](https://github.com/mli/d2l-1day-notebooks/blob/master/notebooks-2/1-use-gpu.ipynb) | [nbviewer](https://nbviewer.jupyter.org/format/slides/github/mli/d2l-1day-notebooks/blob/master/notebooks-2/1-use-gpu.ipynb#/) |
27 | | Convolutions | [github](https://github.com/mli/d2l-1day-notebooks/blob/master/notebooks-2/2-conv-layer.ipynb) | [nbviewer](https://nbviewer.jupyter.org/format/slides/github/mli/d2l-1day-notebooks/blob/master/notebooks-2/2-conv-layer.ipynb#/) |
28 | | Pooling | [github](https://github.com/mli/d2l-1day-notebooks/blob/master/notebooks-2/3-pooling.ipynb) | [nbviewer](https://nbviewer.jupyter.org/format/slides/github/mli/d2l-1day-notebooks/blob/master/notebooks-2/3-pooling.ipynb#/) |
29 | | Convolutional Neural Networks (LeNet) | [github](https://github.com/mli/d2l-1day-notebooks/blob/master/notebooks-2/4-lenet.ipynb) | [nbviewer](https://nbviewer.jupyter.org/format/slides/github/mli/d2l-1day-notebooks/blob/master/notebooks-2/4-lenet.ipynb#/) |
30 | | Deep Convolutional Neural Networks (AlexNet) | [github](https://github.com/mli/d2l-1day-notebooks/blob/master/notebooks-2/5-alexnet.ipynb) | [nbviewer](https://nbviewer.jupyter.org/format/slides/github/mli/d2l-1day-notebooks/blob/master/notebooks-2/5-alexnet.ipynb#/) |
31 | | Networks Using Blocks (VGG) | [github](https://github.com/mli/d2l-1day-notebooks/blob/master/notebooks-2/6-vgg.ipynb) | [nbviewer](https://nbviewer.jupyter.org/format/slides/github/mli/d2l-1day-notebooks/blob/master/notebooks-2/6-vgg.ipynb#/) |
32 | | Inception Networks (GoogLeNet) | [github](https://github.com/mli/d2l-1day-notebooks/blob/master/notebooks-2/7-googlenet.ipynb) | [nbviewer](https://nbviewer.jupyter.org/format/slides/github/mli/d2l-1day-notebooks/blob/master/notebooks-2/7-googlenet.ipynb#/) |
33 | | Residual Networks (ResNet) | [github](https://github.com/mli/d2l-1day-notebooks/blob/master/notebooks-2/8-resnet.ipynb) | [nbviewer](https://nbviewer.jupyter.org/format/slides/github/mli/d2l-1day-notebooks/blob/master/notebooks-2/8-resnet.ipynb#/) |
34 |
35 | ## Part 3: Performance
36 |
37 | | title | ipynb | slides |
38 | | ------------------------------------------------- | ------------------------------------------------------------ | ------------------------------------------------------------ |
39 | | A Hybrid of Imperative and Symbolic Programming | [github](https://github.com/mli/d2l-1day-notebooks/blob/master/notebooks-3/1-hybridize.ipynb) | [nbviewer](https://nbviewer.jupyter.org/format/slides/github/mli/d2l-1day-notebooks/blob/master/notebooks-3/1-hybridize.ipynb#/) |
40 | | Multi-GPU Computation Implementation from Scratch | [github](https://github.com/mli/d2l-1day-notebooks/blob/master/notebooks-3/2-multiple-gpus.ipynb) | [nbviewer](https://nbviewer.jupyter.org/format/slides/github/mli/d2l-1day-notebooks/blob/master/notebooks-3/2-multiple-gpus.ipynb#/) |
41 | | Concise Implementation of Multi-GPU Computation | [github](https://github.com/mli/d2l-1day-notebooks/blob/master/notebooks-3/3-multiple-gpus-gluon.ipynb) | [nbviewer](https://nbviewer.jupyter.org/format/slides/github/mli/d2l-1day-notebooks/blob/master/notebooks-3/3-multiple-gpus-gluon.ipynb#/) |
42 | | Fine Tuning | [github](https://github.com/mli/d2l-1day-notebooks/blob/master/notebooks-3/4-fine-tuning.ipynb) | [nbviewer](https://nbviewer.jupyter.org/format/slides/github/mli/d2l-1day-notebooks/blob/master/notebooks-3/4-fine-tuning.ipynb#/) |
43 |
44 | ## Part 4: Recurrent Neural Networks
45 |
46 | | title | ipynb | slides |
47 | | -------------------------------------------------------- | ------------------------------------------------------------ | ------------------------------------------------------------ |
48 | | Text Preprocessing | [github](https://github.com/mli/d2l-1day-notebooks/blob/master/notebooks-4/1-text-preprocessing.ipynb) | [nbviewer](https://nbviewer.jupyter.org/format/slides/github/mli/d2l-1day-notebooks/blob/master/notebooks-4/1-text-preprocessing.ipynb#/) |
49 | | Implementation of Recurrent Neural Networks from Scratch | [github](https://github.com/mli/d2l-1day-notebooks/blob/master/notebooks-4/2-rnn-scratch.ipynb) | [nbviewer](https://nbviewer.jupyter.org/format/slides/github/mli/d2l-1day-notebooks/blob/master/notebooks-4/2-rnn-scratch.ipynb#/) |
50 | | Concise Implementation of Recurrent Neural Networks | [github](https://github.com/mli/d2l-1day-notebooks/blob/master/notebooks-4/3-rnn-gluon.ipynb) | [nbviewer](https://nbviewer.jupyter.org/format/slides/github/mli/d2l-1day-notebooks/blob/master/notebooks-4/3-rnn-gluon.ipynb#/) |
51 | | Gated Recurrent Units (GRU) | [github](https://github.com/mli/d2l-1day-notebooks/blob/master/notebooks-4/4-gru.ipynb) | [nbviewer](https://nbviewer.jupyter.org/format/slides/github/mli/d2l-1day-notebooks/blob/master/notebooks-4/4-gru.ipynb#/) |
52 | | Long Short-Term Memory (LSTM) | [github](https://github.com/mli/d2l-1day-notebooks/blob/master/notebooks-4/5-lstm.ipynb) | [nbviewer](https://nbviewer.jupyter.org/format/slides/github/mli/d2l-1day-notebooks/blob/master/notebooks-4/5-lstm.ipynb#/) |
53 |
54 |
55 |
--------------------------------------------------------------------------------
/notebooks-1/1-ndarray.ipynb:
--------------------------------------------------------------------------------
1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": { 6 | "slideshow": { 7 | "slide_type": "slide" 8 | } 9 | }, 10 | "source": [ 11 | "# Data Manipulation with Ndarray\n", 12 | "\n", 13 | "Import the `np` (NumPy-like) and `npx` (NumPy extensions) modules from MXNet." 14 | ] 15 | }, 16 | { 17 | "cell_type": "code", 18 | "execution_count": 1, 19 | "metadata": { 20 | "ExecuteTime": { 21 | "end_time": "2019-07-03T21:48:38.205412Z", 22 | "start_time": "2019-07-03T21:48:36.544703Z" 23 | }, 24 | "attributes": { 25 | "classes": [], 26 | "id": "", 27 | "n": "1" 28 | } 29 | }, 30 | "outputs": [], 31 | "source": [ 32 | "from mxnet import np, npx\n", 33 | "# Invoke the experimental numpy-compatible feature in MXNet \n", 34 | "npx.set_np() " 35 | ] 36 | }, 37 | { 38 | "cell_type": "markdown", 39 | "metadata": { 40 | "slideshow": { 41 | "slide_type": "slide" 42 | } 43 | }, 44 | "source": [ 45 | "Create a vector and query its attributes" 46 | ] 47 | }, 48 | { 49 | "cell_type": "code", 50 | "execution_count": 2, 51 | "metadata": { 52 | "ExecuteTime": { 53 | "end_time": "2019-07-03T21:48:38.232852Z", 54 | "start_time": "2019-07-03T21:48:38.208019Z" 55 | }, 56 | "attributes": { 57 | "classes": [], 58 | "id": "", 59 | "n": "2" 60 | } 61 | }, 62 | "outputs": [ 63 | { 64 | "data": { 65 | "text/plain": [ 66 | "array([ 0., 1., 2., 3., 4., 5., 6., 7., 8., 9., 10., 11.])" 67 | ] 68 | }, 
69 | "execution_count": 2, 70 | "metadata": {}, 71 | "output_type": "execute_result" 72 | } 73 | ], 74 | "source": [ 75 | "x = np.arange(12)\n", 76 | "x" 77 | ] 78 | }, 79 | { 80 | "cell_type": "code", 81 | "execution_count": 3, 82 | "metadata": { 83 | "ExecuteTime": { 84 | "end_time": "2019-07-03T21:48:38.239446Z", 85 | "start_time": "2019-07-03T21:48:38.235098Z" 86 | }, 87 | "attributes": { 88 | "classes": [], 89 | "id": "", 90 | "n": "8" 91 | } 92 | }, 93 | "outputs": [ 94 | { 95 | "data": { 96 | "text/plain": [ 97 | "(12,)" 98 | ] 99 | }, 100 | "execution_count": 3, 101 | "metadata": {}, 102 | "output_type": "execute_result" 103 | } 104 | ], 105 | "source": [ 106 | "x.shape" 107 | ] 108 | }, 109 | { 110 | "cell_type": "code", 111 | "execution_count": 4, 112 | "metadata": { 113 | "ExecuteTime": { 114 | "end_time": "2019-07-03T21:48:38.246552Z", 115 | "start_time": "2019-07-03T21:48:38.242380Z" 116 | }, 117 | "attributes": { 118 | "classes": [], 119 | "id": "", 120 | "n": "9" 121 | } 122 | }, 123 | "outputs": [ 124 | { 125 | "data": { 126 | "text/plain": [ 127 | "12" 128 | ] 129 | }, 130 | "execution_count": 4, 131 | "metadata": {}, 132 | "output_type": "execute_result" 133 | } 134 | ], 135 | "source": [ 136 | "x.size" 137 | ] 138 | }, 139 | { 140 | "cell_type": "markdown", 141 | "metadata": { 142 | "slideshow": { 143 | "slide_type": "slide" 144 | } 145 | }, 146 | "source": [ 147 | "More ways to construct arrays" 148 | ] 149 | }, 150 | { 151 | "cell_type": "code", 152 | "execution_count": 5, 153 | "metadata": { 154 | "ExecuteTime": { 155 | "end_time": "2019-07-03T21:48:38.254509Z", 156 | "start_time": "2019-07-03T21:48:38.248757Z" 157 | }, 158 | "attributes": { 159 | "classes": [], 160 | "id": "", 161 | "n": "4" 162 | } 163 | }, 164 | "outputs": [ 165 | { 166 | "data": { 167 | "text/plain": [ 168 | "array([[0., 0., 0., 0.],\n", 169 | " [0., 0., 0., 0.],\n", 170 | " [0., 0., 0., 0.]])" 171 | ] 172 | }, 173 | "execution_count": 5, 174 | "metadata": {}, 175 | 
"output_type": "execute_result" 176 | } 177 | ], 178 | "source": [ 179 | "np.zeros((3, 4))" 180 | ] 181 | }, 182 | { 183 | "cell_type": "code", 184 | "execution_count": 6, 185 | "metadata": { 186 | "ExecuteTime": { 187 | "end_time": "2019-07-03T21:48:38.263742Z", 188 | "start_time": "2019-07-03T21:48:38.256770Z" 189 | }, 190 | "attributes": { 191 | "classes": [], 192 | "id": "", 193 | "n": "6" 194 | } 195 | }, 196 | "outputs": [ 197 | { 198 | "data": { 199 | "text/plain": [ 200 | "array([[2., 1., 4., 3.],\n", 201 | " [1., 2., 3., 4.],\n", 202 | " [4., 3., 2., 1.]])" 203 | ] 204 | }, 205 | "execution_count": 6, 206 | "metadata": {}, 207 | "output_type": "execute_result" 208 | } 209 | ], 210 | "source": [ 211 | "np.array([[2, 1, 4, 3], [1, 2, 3, 4], [4, 3, 2, 1]])" 212 | ] 213 | }, 214 | { 215 | "cell_type": "code", 216 | "execution_count": 7, 217 | "metadata": { 218 | "ExecuteTime": { 219 | "end_time": "2019-07-03T21:48:38.283115Z", 220 | "start_time": "2019-07-03T21:48:38.266029Z" 221 | }, 222 | "attributes": { 223 | "classes": [], 224 | "id": "", 225 | "n": "7" 226 | } 227 | }, 228 | "outputs": [ 229 | { 230 | "data": { 231 | "text/plain": [ 232 | "array([[ 2.2122064 , 0.7740038 , 1.0434405 , 1.1839255 ],\n", 233 | " [ 1.8917114 , -1.2347414 , -1.771029 , -0.45138445],\n", 234 | " [ 0.57938355, -1.856082 , -1.9768796 , -0.20801921]])" 235 | ] 236 | }, 237 | "execution_count": 7, 238 | "metadata": {}, 239 | "output_type": "execute_result" 240 | } 241 | ], 242 | "source": [ 243 | "np.random.normal(0, 1, size=(3, 4))" 244 | ] 245 | }, 246 | { 247 | "cell_type": "markdown", 248 | "metadata": { 249 | "slideshow": { 250 | "slide_type": "slide" 251 | } 252 | }, 253 | "source": [ 254 | "Element-wise operators" 255 | ] 256 | }, 257 | { 258 | "cell_type": "code", 259 | "execution_count": 8, 260 | "metadata": { 261 | "ExecuteTime": { 262 | "end_time": "2019-07-03T21:48:38.298150Z", 263 | "start_time": "2019-07-03T21:48:38.286183Z" 264 | } 265 | }, 266 | "outputs": [ 267 | 
{ 268 | "name": "stdout", 269 | "output_type": "stream", 270 | "text": [ 271 | "x = [1. 2. 4. 8.]\n", 272 | "x + y [ 3. 4. 6. 10.]\n", 273 | "x - y [-1. 0. 2. 6.]\n", 274 | "x * y [ 2. 4. 8. 16.]\n", 275 | "x ** y [ 1. 4. 16. 64.]\n", 276 | "x / y [0.5 1. 2. 4. ]\n" 277 | ] 278 | } 279 | ], 280 | "source": [ 281 | "x = np.array([1, 2, 4, 8])\n", 282 | "y = np.ones_like(x) * 2\n", 283 | "print('x =', x)\n", 284 | "print('x + y', x + y)\n", 285 | "print('x - y', x - y)\n", 286 | "print('x * y', x * y)\n", 287 | "print('x ** y', x ** y)\n", 288 | "print('x / y', x / y)" 289 | ] 290 | }, 291 | { 292 | "cell_type": "markdown", 293 | "metadata": { 294 | "slideshow": { 295 | "slide_type": "slide" 296 | } 297 | }, 298 | "source": [ 299 | "Matrix multiplication." 300 | ] 301 | }, 302 | { 303 | "cell_type": "code", 304 | "execution_count": 9, 305 | "metadata": { 306 | "ExecuteTime": { 307 | "end_time": "2019-07-03T21:48:38.320311Z", 308 | "start_time": "2019-07-03T21:48:38.300609Z" 309 | }, 310 | "attributes": { 311 | "classes": [], 312 | "id": "", 313 | "n": "13" 314 | } 315 | }, 316 | "outputs": [ 317 | { 318 | "data": { 319 | "text/plain": [ 320 | "array([[ 18., 20., 10.],\n", 321 | " [ 58., 60., 50.],\n", 322 | " [ 98., 100., 90.]])" 323 | ] 324 | }, 325 | "execution_count": 9, 326 | "metadata": {}, 327 | "output_type": "execute_result" 328 | } 329 | ], 330 | "source": [ 331 | "x = np.arange(12).reshape((3,4))\n", 332 | "y = np.array([[2, 1, 4, 3], [1, 2, 3, 4], [4, 3, 2, 1]])\n", 333 | "np.dot(x, y.T)" 334 | ] 335 | }, 336 | { 337 | "cell_type": "markdown", 338 | "metadata": { 339 | "slideshow": { 340 | "slide_type": "slide" 341 | } 342 | }, 343 | "source": [ 344 | "Concatenate arrays along a particular axis. 
" 345 | ] 346 | }, 347 | { 348 | "cell_type": "code", 349 | "execution_count": 10, 350 | "metadata": { 351 | "ExecuteTime": { 352 | "end_time": "2019-07-03T21:48:38.353948Z", 353 | "start_time": "2019-07-03T21:48:38.322469Z" 354 | } 355 | }, 356 | "outputs": [ 357 | { 358 | "data": { 359 | "text/plain": [ 360 | "(array([[ 0., 1., 2., 3.],\n", 361 | " [ 4., 5., 6., 7.],\n", 362 | " [ 8., 9., 10., 11.],\n", 363 | " [ 2., 1., 4., 3.],\n", 364 | " [ 1., 2., 3., 4.],\n", 365 | " [ 4., 3., 2., 1.]]),\n", 366 | " array([[ 0., 1., 2., 3., 2., 1., 4., 3.],\n", 367 | " [ 4., 5., 6., 7., 1., 2., 3., 4.],\n", 368 | " [ 8., 9., 10., 11., 4., 3., 2., 1.]]))" 369 | ] 370 | }, 371 | "execution_count": 10, 372 | "metadata": {}, 373 | "output_type": "execute_result" 374 | } 375 | ], 376 | "source": [ 377 | "np.concatenate([x, y], axis=0), np.concatenate([x, y], axis=1)" 378 | ] 379 | }, 380 | { 381 | "cell_type": "markdown", 382 | "metadata": { 383 | "slideshow": { 384 | "slide_type": "slide" 385 | } 386 | }, 387 | "source": [ 388 | "Broadcast Mechanism" 389 | ] 390 | }, 391 | { 392 | "cell_type": "code", 393 | "execution_count": 11, 394 | "metadata": { 395 | "ExecuteTime": { 396 | "end_time": "2019-07-03T21:48:38.382407Z", 397 | "start_time": "2019-07-03T21:48:38.355839Z" 398 | }, 399 | "attributes": { 400 | "classes": [], 401 | "id": "", 402 | "n": "14" 403 | } 404 | }, 405 | "outputs": [ 406 | { 407 | "name": "stdout", 408 | "output_type": "stream", 409 | "text": [ 410 | "a:\n", 411 | " [[0.]\n", 412 | " [1.]\n", 413 | " [2.]]\n", 414 | "b:\n", 415 | " [[0. 
1.]]\n" 416 | ] 417 | }, 418 | { 419 | "data": { 420 | "text/plain": [ 421 | "array([[0., 1.],\n", 422 | " [1., 2.],\n", 423 | " [2., 3.]])" 424 | ] 425 | }, 426 | "execution_count": 11, 427 | "metadata": {}, 428 | "output_type": "execute_result" 429 | } 430 | ], 431 | "source": [ 432 | "a = np.arange(3).reshape((3, 1))\n", 433 | "b = np.arange(2).reshape((1, 2))\n", 434 | "print('a:\\n', a)\n", 435 | "print('b:\\n', b)\n", 436 | "a + b" 437 | ] 438 | }, 439 | { 440 | "cell_type": "markdown", 441 | "metadata": { 442 | "slideshow": { 443 | "slide_type": "slide" 444 | } 445 | }, 446 | "source": [ 447 | "Indexing and Slicing\n" 448 | ] 449 | }, 450 | { 451 | "cell_type": "code", 452 | "execution_count": 12, 453 | "metadata": { 454 | "ExecuteTime": { 455 | "end_time": "2019-07-03T21:48:38.400208Z", 456 | "start_time": "2019-07-03T21:48:38.384624Z" 457 | }, 458 | "attributes": { 459 | "classes": [], 460 | "id": "", 461 | "n": "19" 462 | } 463 | }, 464 | "outputs": [ 465 | { 466 | "name": "stdout", 467 | "output_type": "stream", 468 | "text": [ 469 | "x[-1] =\n", 470 | " [ 8. 9. 10. 11.]\n", 471 | "x[1:3] =\n", 472 | " [[ 4. 5. 6. 7.]\n", 473 | " [ 8. 9. 10. 11.]]\n", 474 | "x[1:3, 2:4] =\n", 475 | " [[ 6. 7.]\n", 476 | " [10. 
11.]]\n", 477 | "x[1,2] = 6.0\n" 478 | ] 479 | } 480 | ], 481 | "source": [ 482 | "print('x[-1] =\\n', x[-1])\n", 483 | "print('x[1:3] =\\n', x[1:3])\n", 484 | "print('x[1:3, 2:4] =\\n', x[1:3, 2:4])\n", 485 | "print('x[1,2] =', x[1,2])" 486 | ] 487 | }, 488 | { 489 | "cell_type": "markdown", 490 | "metadata": { 491 | "slideshow": { 492 | "slide_type": "slide" 493 | } 494 | }, 495 | "source": [ 496 | "Converting between `mxnet.numpy.ndarray` and `numpy.ndarray`" 497 | ] 498 | }, 499 | { 500 | "cell_type": "code", 501 | "execution_count": 13, 502 | "metadata": { 503 | "ExecuteTime": { 504 | "end_time": "2019-07-03T21:48:38.406593Z", 505 | "start_time": "2019-07-03T21:48:38.402022Z" 506 | }, 507 | "attributes": { 508 | "classes": [], 509 | "id": "", 510 | "n": "22" 511 | } 512 | }, 513 | "outputs": [ 514 | { 515 | "name": "stdout", 516 | "output_type": "stream", 517 | "text": [ 518 | "<class 'numpy.ndarray'>\n", 519 | "<class 'mxnet.numpy.ndarray'>\n" 520 | ] 521 | } 522 | ], 523 | "source": [ 524 | "a = x.asnumpy()\n", 525 | "print(type(a))\n", 526 | "b = np.array(a)\n", 527 | "print(type(b))" 528 | ] 529 | } 530 | ], 531 | "metadata": { 532 | "celltoolbar": "Slideshow", 533 | "kernelspec": { 534 | "display_name": "Python 3", 535 | "language": "python", 536 | "name": "python3" 537 | }, 538 | "language_info": { 539 | "codemirror_mode": { 540 | "name": "ipython", 541 | "version": 3 542 | }, 543 | "file_extension": ".py", 544 | "mimetype": "text/x-python", 545 | "name": "python", 546 | "nbconvert_exporter": "python", 547 | "pygments_lexer": "ipython3", 548 | "version": "3.7.1" 549 | }, 550 | "toc": { 551 | "base_numbering": 1, 552 | "nav_menu": {}, 553 | "number_sections": true, 554 | "sideBar": true, 555 | "skip_h1_title": false, 556 | "title_cell": "Table of Contents", 557 | "title_sidebar": "Contents", 558 | "toc_cell": false, 559 | "toc_position": {}, 560 | "toc_section_display": true, 561 | "toc_window_display": false 562 | } 563 | }, 564 | "nbformat": 4, 565 | "nbformat_minor": 2 566 | } 567 | 
-------------------------------------------------------------------------------- /notebooks-1/2-autograd.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": { 6 | "slideshow": { 7 | "slide_type": "slide" 8 | } 9 | }, 10 | "source": [ 11 | "# Automatic Differentiation" 12 | ] 13 | }, 14 | { 15 | "cell_type": "code", 16 | "execution_count": 1, 17 | "metadata": { 18 | "ExecuteTime": { 19 | "end_time": "2019-07-03T21:50:54.398998Z", 20 | "start_time": "2019-07-03T21:50:53.307986Z" 21 | }, 22 | "attributes": { 23 | "classes": [], 24 | "id": "", 25 | "n": "1" 26 | } 27 | }, 28 | "outputs": [ 29 | { 30 | "data": { 31 | "text/plain": [ 32 | "array([0., 1., 2., 3.])" 33 | ] 34 | }, 35 | "execution_count": 1, 36 | "metadata": {}, 37 | "output_type": "execute_result" 38 | } 39 | ], 40 | "source": [ 41 | "from mxnet import autograd, np, npx \n", 42 | "npx.set_np()\n", 43 | "\n", 44 | "x = np.arange(4)\n", 45 | "x" 46 | ] 47 | }, 48 | { 49 | "cell_type": "markdown", 50 | "metadata": { 51 | "slideshow": { 52 | "slide_type": "-" 53 | } 54 | }, 55 | "source": [ 56 | "Allocate space to store the gradient with respect to ``x``." 57 | ] 58 | }, 59 | { 60 | "cell_type": "code", 61 | "execution_count": 2, 62 | "metadata": { 63 | "ExecuteTime": { 64 | "end_time": "2019-07-03T21:50:54.404851Z", 65 | "start_time": "2019-07-03T21:50:54.401388Z" 66 | }, 67 | "attributes": { 68 | "classes": [], 69 | "id": "", 70 | "n": "3" 71 | } 72 | }, 73 | "outputs": [], 74 | "source": [ 75 | "x.attach_grad()" 76 | ] 77 | }, 78 | { 79 | "cell_type": "markdown", 80 | "metadata": { 81 | "slideshow": { 82 | "slide_type": "slide" 83 | } 84 | }, 85 | "source": [ 86 | "Record the computation within the `record` scope." 
87 | ] 88 | }, 89 | { 90 | "cell_type": "code", 91 | "execution_count": 3, 92 | "metadata": { 93 | "ExecuteTime": { 94 | "end_time": "2019-07-03T21:50:54.416246Z", 95 | "start_time": "2019-07-03T21:50:54.406690Z" 96 | }, 97 | "attributes": { 98 | "classes": [], 99 | "id": "", 100 | "n": "4" 101 | }, 102 | "scrolled": true 103 | }, 104 | "outputs": [ 105 | { 106 | "data": { 107 | "text/plain": [ 108 | "array(28.)" 109 | ] 110 | }, 111 | "execution_count": 3, 112 | "metadata": {}, 113 | "output_type": "execute_result" 114 | } 115 | ], 116 | "source": [ 117 | "with autograd.record():\n", 118 | " y = 2.0 * np.dot(x, x)\n", 119 | "y" 120 | ] 121 | }, 122 | { 123 | "cell_type": "markdown", 124 | "metadata": { 125 | "slideshow": { 126 | "slide_type": "-" 127 | } 128 | }, 129 | "source": [ 130 | "The gradient of the function $y = 2\\mathbf{x}^{\\top}\\mathbf{x}$ with respect to $\\mathbf{x}$ should be $4\\mathbf{x}$. " 131 | ] 132 | }, 133 | { 134 | "cell_type": "code", 135 | "execution_count": 4, 136 | "metadata": { 137 | "ExecuteTime": { 138 | "end_time": "2019-07-03T21:50:54.435955Z", 139 | "start_time": "2019-07-03T21:50:54.418269Z" 140 | }, 141 | "attributes": { 142 | "classes": [], 143 | "id": "", 144 | "n": "5" 145 | } 146 | }, 147 | "outputs": [ 148 | { 149 | "data": { 150 | "text/plain": [ 151 | "array([0., 0., 0., 0.])" 152 | ] 153 | }, 154 | "execution_count": 4, 155 | "metadata": {}, 156 | "output_type": "execute_result" 157 | } 158 | ], 159 | "source": [ 160 | "y.backward()\n", 161 | "x.grad - 4 * x" 162 | ] 163 | } 164 | ], 165 | "metadata": { 166 | "celltoolbar": "Slideshow", 167 | "kernelspec": { 168 | "display_name": "Python 3", 169 | "language": "python", 170 | "name": "python3" 171 | }, 172 | "language_info": { 173 | "codemirror_mode": { 174 | "name": "ipython", 175 | "version": 3 176 | }, 177 | "file_extension": ".py", 178 | "mimetype": "text/x-python", 179 | "name": "python", 180 | "nbconvert_exporter": "python", 181 | "pygments_lexer": "ipython3", 182 
| "version": "3.7.1" 183 | }, 184 | "toc": { 185 | "base_numbering": 1, 186 | "nav_menu": {}, 187 | "number_sections": true, 188 | "sideBar": true, 189 | "skip_h1_title": false, 190 | "title_cell": "Table of Contents", 191 | "title_sidebar": "Contents", 192 | "toc_cell": false, 193 | "toc_position": {}, 194 | "toc_section_display": true, 195 | "toc_window_display": false 196 | } 197 | }, 198 | "nbformat": 4, 199 | "nbformat_minor": 2 200 | } 201 | -------------------------------------------------------------------------------- /notebooks-1/4-linear-regression-gluon.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": { 6 | "slideshow": { 7 | "slide_type": "slide" 8 | } 9 | }, 10 | "source": [ 11 | "# Concise Implementation of Linear Regression" 12 | ] 13 | }, 14 | { 15 | "cell_type": "code", 16 | "execution_count": 1, 17 | "metadata": { 18 | "ExecuteTime": { 19 | "end_time": "2019-07-03T21:59:54.049026Z", 20 | "start_time": "2019-07-03T21:59:52.500967Z" 21 | }, 22 | "attributes": { 23 | "classes": [], 24 | "id": "", 25 | "n": "2" 26 | } 27 | }, 28 | "outputs": [], 29 | "source": [ 30 | "import d2l\n", 31 | "from mxnet import autograd, np, npx, gluon\n", 32 | "npx.set_np()\n", 33 | "\n", 34 | "true_w = np.array([2, -3.4])\n", 35 | "true_b = 4.2\n", 36 | "features, labels = d2l.synthetic_data(true_w, true_b, 1000)" 37 | ] 38 | }, 39 | { 40 | "cell_type": "markdown", 41 | "metadata": { 42 | "slideshow": { 43 | "slide_type": "slide" 44 | } 45 | }, 46 | "source": [ 47 | "Reading Data" 48 | ] 49 | }, 50 | { 51 | "cell_type": "code", 52 | "execution_count": 2, 53 | "metadata": { 54 | "ExecuteTime": { 55 | "end_time": "2019-07-03T21:59:54.120474Z", 56 | "start_time": "2019-07-03T21:59:54.051605Z" 57 | }, 58 | "attributes": { 59 | "classes": [], 60 | "id": "", 61 | "n": "3" 62 | }, 63 | "scrolled": true 64 | }, 65 | "outputs": [ 66 | { 67 | "name": "stdout", 68 | 
"output_type": "stream", 69 | "text": [ 70 | "X =\n", 71 | "[[ 0.4015098 1.4096868 ]\n", 72 | " [ 0.65820086 -1.4260322 ]\n", 73 | " [ 0.00153129 -0.14330608]\n", 74 | " [-0.843129 0.6070013 ]\n", 75 | " [ 1.5080738 -0.27229312]\n", 76 | " [-0.01436996 0.50522786]\n", 77 | " [-0.2513225 -0.7733599 ]\n", 78 | " [-0.4892422 0.82852226]\n", 79 | " [ 0.19469471 0.26424283]\n", 80 | " [ 0.8269238 1.0562588 ]]y =\n", 81 | "[ 0.21267602 10.392891 4.7093544 0.45944032 8.140177 2.475876\n", 82 | " 6.332503 0.4059622 3.69898 2.2603266 ]\n" 83 | ] 84 | } 85 | ], 86 | "source": [ 87 | "def load_array(data_arrays, batch_size, is_train=True):\n", 88 | " dataset = gluon.data.ArrayDataset(*data_arrays)\n", 89 | " return gluon.data.DataLoader(dataset, batch_size, shuffle=is_train)\n", 90 | " \n", 91 | "batch_size = 10\n", 92 | "data_iter = load_array((features, labels), batch_size)\n", 93 | "for X, y in data_iter:\n", 94 | " print('X =\\n%sy =\\n%s' %(X, y))\n", 95 | " break" 96 | ] 97 | }, 98 | { 99 | "cell_type": "markdown", 100 | "metadata": { 101 | "slideshow": { 102 | "slide_type": "slide" 103 | } 104 | }, 105 | "source": [ 106 | "Define the Model and initialize Model Parameters" 107 | ] 108 | }, 109 | { 110 | "cell_type": "code", 111 | "execution_count": 3, 112 | "metadata": { 113 | "ExecuteTime": { 114 | "end_time": "2019-07-03T21:59:54.129952Z", 115 | "start_time": "2019-07-03T21:59:54.124685Z" 116 | }, 117 | "attributes": { 118 | "classes": [], 119 | "id": "", 120 | "n": "5" 121 | }, 122 | "slideshow": { 123 | "slide_type": "-" 124 | } 125 | }, 126 | "outputs": [], 127 | "source": [ 128 | "from mxnet.gluon import nn\n", 129 | "from mxnet import init\n", 130 | "\n", 131 | "net = nn.Sequential()\n", 132 | "net.add(nn.Dense(1))\n", 133 | "net.initialize(init.Normal(sigma=0.01))" 134 | ] 135 | }, 136 | { 137 | "cell_type": "markdown", 138 | "metadata": { 139 | "slideshow": { 140 | "slide_type": "slide" 141 | } 142 | }, 143 | "source": [ 144 | "Define the loss function and 
optimization algorithm" 145 | ] 146 | }, 147 | { 148 | "cell_type": "code", 149 | "execution_count": 4, 150 | "metadata": { 151 | "ExecuteTime": { 152 | "end_time": "2019-07-03T21:59:54.135619Z", 153 | "start_time": "2019-07-03T21:59:54.132150Z" 154 | }, 155 | "attributes": { 156 | "classes": [], 157 | "id": "", 158 | "n": "8" 159 | } 160 | }, 161 | "outputs": [], 162 | "source": [ 163 | "from mxnet import gluon\n", 164 | "\n", 165 | "loss = gluon.loss.L2Loss() \n", 166 | "trainer = gluon.Trainer(net.collect_params(),\n", 167 | " 'sgd', {'learning_rate': 0.03})" 168 | ] 169 | }, 170 | { 171 | "cell_type": "markdown", 172 | "metadata": { 173 | "slideshow": { 174 | "slide_type": "slide" 175 | } 176 | }, 177 | "source": [ 178 | "Training" 179 | ] 180 | }, 181 | { 182 | "cell_type": "code", 183 | "execution_count": 5, 184 | "metadata": { 185 | "ExecuteTime": { 186 | "end_time": "2019-07-03T21:59:56.549677Z", 187 | "start_time": "2019-07-03T21:59:54.137373Z" 188 | }, 189 | "attributes": { 190 | "classes": [], 191 | "id": "", 192 | "n": "10" 193 | } 194 | }, 195 | "outputs": [ 196 | { 197 | "name": "stdout", 198 | "output_type": "stream", 199 | "text": [ 200 | "epoch 1, loss: 0.040749\n", 201 | "epoch 2, loss: 0.000152\n", 202 | "epoch 3, loss: 0.000051\n", 203 | "Error in estimating w [[ 0.00024056 -0.00077081]]\n", 204 | "Error in estimating b [0.00041628]\n" 205 | ] 206 | } 207 | ], 208 | "source": [ 209 | "for epoch in range(1, 4):\n", 210 | " for X, y in data_iter:\n", 211 | " with autograd.record():\n", 212 | " l = loss(net(X), y)\n", 213 | " l.backward()\n", 214 | " trainer.step(batch_size)\n", 215 | " l = loss(net(features), labels)\n", 216 | " print('epoch %d, loss: %f' % (epoch, l.mean()))\n", 217 | " \n", 218 | "w = net[0].weight.data()\n", 219 | "print('Error in estimating w', true_w.reshape(w.shape) - w)\n", 220 | "b = net[0].bias.data()\n", 221 | "print('Error in estimating b', true_b - b) " 222 | ] 223 | } 224 | ], 225 | "metadata": { 226 | "celltoolbar": 
"Slideshow", 227 | "kernelspec": { 228 | "display_name": "Python 3", 229 | "language": "python", 230 | "name": "python3" 231 | }, 232 | "language_info": { 233 | "codemirror_mode": { 234 | "name": "ipython", 235 | "version": 3 236 | }, 237 | "file_extension": ".py", 238 | "mimetype": "text/x-python", 239 | "name": "python", 240 | "nbconvert_exporter": "python", 241 | "pygments_lexer": "ipython3", 242 | "version": "3.7.1" 243 | }, 244 | "toc": { 245 | "base_numbering": 1, 246 | "nav_menu": {}, 247 | "number_sections": true, 248 | "sideBar": true, 249 | "skip_h1_title": false, 250 | "title_cell": "Table of Contents", 251 | "title_sidebar": "Contents", 252 | "toc_cell": false, 253 | "toc_position": {}, 254 | "toc_section_display": true, 255 | "toc_window_display": false 256 | } 257 | }, 258 | "nbformat": 4, 259 | "nbformat_minor": 2 260 | } 261 | -------------------------------------------------------------------------------- /notebooks-1/7-softmax-regression-gluon.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": { 6 | "slideshow": { 7 | "slide_type": "slide" 8 | } 9 | }, 10 | "source": [ 11 | "# Concise Implementation of Softmax Regression\n" 12 | ] 13 | }, 14 | { 15 | "cell_type": "code", 16 | "execution_count": 1, 17 | "metadata": { 18 | "ExecuteTime": { 19 | "end_time": "2019-07-03T22:07:06.023766Z", 20 | "start_time": "2019-07-03T22:07:03.435628Z" 21 | }, 22 | "attributes": { 23 | "classes": [], 24 | "id": "", 25 | "n": "1" 26 | } 27 | }, 28 | "outputs": [], 29 | "source": [ 30 | "import d2l\n", 31 | "from mxnet import gluon, init, npx\n", 32 | "from mxnet.gluon import nn\n", 33 | "npx.set_np()\n", 34 | "\n", 35 | "train_iter, test_iter = d2l.load_data_fashion_mnist(batch_size=256)" 36 | ] 37 | }, 38 | { 39 | "cell_type": "markdown", 40 | "metadata": { 41 | "slideshow": { 42 | "slide_type": "slide" 43 | } 44 | }, 45 | "source": [ 46 | "Model and 
initialization" 47 | ] 48 | }, 49 | { 50 | "cell_type": "code", 51 | "execution_count": 2, 52 | "metadata": { 53 | "ExecuteTime": { 54 | "end_time": "2019-07-03T22:07:06.034492Z", 55 | "start_time": "2019-07-03T22:07:06.027213Z" 56 | }, 57 | "attributes": { 58 | "classes": [], 59 | "id": "", 60 | "n": "3" 61 | } 62 | }, 63 | "outputs": [], 64 | "source": [ 65 | "net = nn.Sequential()\n", 66 | "net.add(nn.Dense(10))\n", 67 | "net.initialize(init.Normal(sigma=0.01))" 68 | ] 69 | }, 70 | { 71 | "cell_type": "markdown", 72 | "metadata": { 73 | "slideshow": { 74 | "slide_type": "slide" 75 | } 76 | }, 77 | "source": [ 78 | "Loss function, optimization algorithm and training" 79 | ] 80 | }, 81 | { 82 | "cell_type": "code", 83 | "execution_count": 3, 84 | "metadata": { 85 | "ExecuteTime": { 86 | "end_time": "2019-07-03T22:07:38.001921Z", 87 | "start_time": "2019-07-03T22:07:06.036621Z" 88 | }, 89 | "attributes": { 90 | "classes": [], 91 | "id": "", 92 | "n": "5" 93 | } 94 | }, 95 | "outputs": [ 96 | { 97 | "data": { 98 | "image/svg+xml": [ 99 | "\n", 100 | "\n", 102 | "\n", 103 | "\n", 104 | " \n", 105 | " \n", 108 | " \n", 109 | " \n", 110 | " \n", 111 | " \n", 117 | " \n", 118 | " \n", 119 | " \n", 120 | " \n", 126 | " \n", 127 | " \n", 128 | " \n", 129 | " \n", 130 | " \n", 133 | " \n", 134 | " \n", 135 | " \n", 136 | " \n", 139 | " \n", 140 | " \n", 141 | " \n", 142 | " \n", 143 | " \n", 144 | " \n", 145 | " \n", 146 | " \n", 147 | " \n", 171 | " \n", 172 | " \n", 173 | " \n", 174 | " \n", 175 | " \n", 176 | " \n", 177 | " \n", 178 | " \n", 179 | " \n", 182 | " \n", 183 | " \n", 184 | " \n", 185 | " \n", 186 | " \n", 187 | " \n", 188 | " \n", 189 | " \n", 190 | " \n", 191 | " \n", 208 | " \n", 209 | " \n", 210 | " \n", 211 | " \n", 212 | " \n", 213 | " \n", 214 | " \n", 215 | " \n", 216 | " \n", 219 | " \n", 220 | " \n", 221 | " \n", 222 | " \n", 223 | " \n", 224 | " \n", 225 | " \n", 226 | " \n", 227 | " \n", 228 | " \n", 258 | " \n", 259 | " \n", 260 | " \n", 261 | " 
\n", 262 | " \n", 263 | " \n", 264 | " \n", 265 | " \n", 266 | " \n", 269 | " \n", 270 | " \n", 271 | " \n", 272 | " \n", 273 | " \n", 274 | " \n", 275 | " \n", 276 | " \n", 277 | " \n", 278 | " \n", 317 | " \n", 318 | " \n", 319 | " \n", 320 | " \n", 321 | " \n", 322 | " \n", 323 | " \n", 324 | " \n", 325 | " \n", 328 | " \n", 329 | " \n", 330 | " \n", 331 | " \n", 332 | " \n", 333 | " \n", 334 | " \n", 335 | " \n", 336 | " \n", 337 | " \n", 350 | " \n", 371 | " \n", 372 | " \n", 373 | " \n", 374 | " \n", 375 | " \n", 376 | " \n", 377 | " \n", 378 | " \n", 379 | " \n", 380 | " \n", 381 | " \n", 405 | " \n", 431 | " \n", 452 | " \n", 473 | " \n", 492 | " \n", 493 | " \n", 494 | " \n", 495 | " \n", 496 | " \n", 497 | " \n", 498 | " \n", 499 | " \n", 500 | " \n", 501 | " \n", 502 | " \n", 503 | " \n", 504 | " \n", 505 | " \n", 508 | " \n", 509 | " \n", 510 | " \n", 511 | " \n", 514 | " \n", 515 | " \n", 516 | " \n", 517 | " \n", 518 | " \n", 519 | " \n", 520 | " \n", 521 | " \n", 522 | " \n", 528 | " \n", 529 | " \n", 530 | " \n", 531 | " \n", 532 | " \n", 533 | " \n", 534 | " \n", 535 | " \n", 536 | " \n", 537 | " \n", 538 | " \n", 541 | " \n", 542 | " \n", 543 | " \n", 544 | " \n", 545 | " \n", 546 | " \n", 547 | " \n", 548 | " \n", 549 | " \n", 550 | " \n", 551 | " \n", 552 | " \n", 553 | " \n", 554 | " \n", 555 | " \n", 556 | " \n", 557 | " \n", 558 | " \n", 561 | " \n", 562 | " \n", 563 | " \n", 564 | " \n", 565 | " \n", 566 | " \n", 567 | " \n", 568 | " \n", 569 | " \n", 570 | " \n", 571 | " \n", 572 | " \n", 573 | " \n", 574 | " \n", 575 | " \n", 576 | " \n", 577 | " \n", 578 | " \n", 589 | " \n", 590 | " \n", 591 | " \n", 602 | " \n", 603 | " \n", 604 | " \n", 615 | " \n", 616 | " \n", 617 | " \n", 620 | " \n", 621 | " \n", 622 | " \n", 625 | " \n", 626 | " \n", 627 | " \n", 630 | " \n", 631 | " \n", 632 | " \n", 635 | " \n", 636 | " \n", 637 | " \n", 638 | " \n", 649 | " \n", 650 | " \n", 651 | " \n", 654 | " \n", 655 | " \n", 656 | " \n", 657 | " \n", 658 | 
" \n", 659 | " \n", 679 | " \n", 695 | " \n", 727 | " \n", 738 | " \n", 757 | " \n", 758 | " \n", 764 | " \n", 795 | " \n", 796 | " \n", 797 | " \n", 798 | " \n", 799 | " \n", 800 | " \n", 801 | " \n", 802 | " \n", 803 | " \n", 804 | " \n", 805 | " \n", 806 | " \n", 807 | " \n", 808 | " \n", 809 | " \n", 810 | " \n", 813 | " \n", 814 | " \n", 815 | " \n", 816 | " \n", 817 | " \n", 818 | " \n", 819 | " \n", 820 | " \n", 821 | " \n", 822 | " \n", 823 | " \n", 824 | " \n", 825 | " \n", 826 | " \n", 827 | " \n", 828 | " \n", 829 | " \n", 830 | " \n", 833 | " \n", 834 | " \n", 835 | " \n", 836 | " \n", 837 | " \n", 838 | " \n", 839 | " \n", 840 | " \n", 841 | " \n", 842 | " \n", 843 | " \n", 844 | " \n", 845 | " \n", 846 | " \n", 847 | " \n", 848 | " \n", 849 | " \n", 850 | " \n", 851 | " \n", 852 | " \n", 853 | " \n", 854 | " \n", 855 | " \n", 856 | "\n" 857 | ], 858 | "text/plain": [ 859 | "
" 860 | ] 861 | }, 862 | "metadata": { 863 | "needs_background": "light" 864 | }, 865 | "output_type": "display_data" 866 | } 867 | ], 868 | "source": [ 869 | "loss = gluon.loss.SoftmaxCrossEntropyLoss()\n", 870 | "trainer = gluon.Trainer(net.collect_params(), \n", 871 | " 'sgd', {'learning_rate': 0.1})\n", 872 | "d2l.train_ch3(net, train_iter, test_iter, loss, 10, trainer)" 873 | ] 874 | } 875 | ], 876 | "metadata": { 877 | "celltoolbar": "Slideshow", 878 | "kernelspec": { 879 | "display_name": "Python 3", 880 | "language": "python", 881 | "name": "python3" 882 | }, 883 | "language_info": { 884 | "codemirror_mode": { 885 | "name": "ipython", 886 | "version": 3 887 | }, 888 | "file_extension": ".py", 889 | "mimetype": "text/x-python", 890 | "name": "python", 891 | "nbconvert_exporter": "python", 892 | "pygments_lexer": "ipython3", 893 | "version": "3.7.1" 894 | }, 895 | "toc": { 896 | "base_numbering": 1, 897 | "nav_menu": {}, 898 | "number_sections": true, 899 | "sideBar": true, 900 | "skip_h1_title": false, 901 | "title_cell": "Table of Contents", 902 | "title_sidebar": "Contents", 903 | "toc_cell": false, 904 | "toc_position": {}, 905 | "toc_section_display": true, 906 | "toc_window_display": false 907 | } 908 | }, 909 | "nbformat": 4, 910 | "nbformat_minor": 2 911 | } 912 | -------------------------------------------------------------------------------- /notebooks-1/9-mlp-gluon.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": { 6 | "slideshow": { 7 | "slide_type": "slide" 8 | } 9 | }, 10 | "source": [ 11 | "# Concise Implementation of Multilayer Perceptron" 12 | ] 13 | }, 14 | { 15 | "cell_type": "code", 16 | "execution_count": 1, 17 | "metadata": { 18 | "ExecuteTime": { 19 | "end_time": "2019-07-02T21:21:36.954926Z", 20 | "start_time": "2019-07-02T21:21:33.324711Z" 21 | } 22 | }, 23 | "outputs": [], 24 | "source": [ 25 | "import d2l\n", 26 | "from 
mxnet import gluon, npx, init\n", 27 | "from mxnet.gluon import nn\n", 28 | "npx.set_np()\n", 29 | "\n", 30 | "train_iter, test_iter = d2l.load_data_fashion_mnist(batch_size=256)" 31 | ] 32 | }, 33 | { 34 | "cell_type": "markdown", 35 | "metadata": { 36 | "slideshow": { 37 | "slide_type": "slide" 38 | } 39 | }, 40 | "source": [ 41 | "The model" 42 | ] 43 | }, 44 | { 45 | "cell_type": "code", 46 | "execution_count": 2, 47 | "metadata": { 48 | "ExecuteTime": { 49 | "end_time": "2019-07-02T21:21:36.964731Z", 50 | "start_time": "2019-07-02T21:21:36.957821Z" 51 | }, 52 | "attributes": { 53 | "classes": [], 54 | "id": "", 55 | "n": "5" 56 | } 57 | }, 58 | "outputs": [], 59 | "source": [ 60 | "net = nn.Sequential()\n", 61 | "net.add(nn.Dense(256, activation='relu'),\n", 62 | " nn.Dense(10))\n", 63 | "net.initialize(init.Normal(sigma=0.01))" 64 | ] 65 | }, 66 | { 67 | "cell_type": "markdown", 68 | "metadata": { 69 | "slideshow": { 70 | "slide_type": "slide" 71 | } 72 | }, 73 | "source": [ 74 | "Training" 75 | ] 76 | }, 77 | { 78 | "cell_type": "code", 79 | "execution_count": 3, 80 | "metadata": { 81 | "ExecuteTime": { 82 | "end_time": "2019-07-02T21:21:59.272107Z", 83 | "start_time": "2019-07-02T21:21:36.966596Z" 84 | }, 85 | "attributes": { 86 | "classes": [], 87 | "id": "", 88 | "n": "6" 89 | } 90 | }, 91 | "outputs": [ 92 | { 93 | "data": { 94 | "image/svg+xml": [ 95 | "\n", 96 | "\n", 98 | "\n", 99 | "\n", 100 | " \n", 101 | " \n", 104 | " \n", 105 | " \n", 106 | " \n", 107 | " \n", 113 | " \n", 114 | " \n", 115 | " \n", 116 | " \n", 122 | " \n", 123 | " \n", 124 | " \n", 125 | " \n", 126 | " \n", 129 | " \n", 130 | " \n", 131 | " \n", 132 | " \n", 135 | " \n", 136 | " \n", 137 | " \n", 138 | " \n", 139 | " \n", 140 | " \n", 141 | " \n", 142 | " \n", 143 | " \n", 167 | " \n", 168 | " \n", 169 | " \n", 170 | " \n", 171 | " \n", 172 | " \n", 173 | " \n", 174 | " \n", 175 | " \n", 178 | " \n", 179 | " \n", 180 | " \n", 181 | " \n", 182 | " \n", 183 | " \n", 184 | " \n", 
185 | " \n", 186 | " \n", 187 | " \n", 204 | " \n", 205 | " \n", 206 | " \n", 207 | " \n", 208 | " \n", 209 | " \n", 210 | " \n", 211 | " \n", 212 | " \n", 215 | " \n", 216 | " \n", 217 | " \n", 218 | " \n", 219 | " \n", 220 | " \n", 221 | " \n", 222 | " \n", 223 | " \n", 224 | " \n", 254 | " \n", 255 | " \n", 256 | " \n", 257 | " \n", 258 | " \n", 259 | " \n", 260 | " \n", 261 | " \n", 262 | " \n", 265 | " \n", 266 | " \n", 267 | " \n", 268 | " \n", 269 | " \n", 270 | " \n", 271 | " \n", 272 | " \n", 273 | " \n", 274 | " \n", 313 | " \n", 314 | " \n", 315 | " \n", 316 | " \n", 317 | " \n", 318 | " \n", 319 | " \n", 320 | " \n", 321 | " \n", 324 | " \n", 325 | " \n", 326 | " \n", 327 | " \n", 328 | " \n", 329 | " \n", 330 | " \n", 331 | " \n", 332 | " \n", 333 | " \n", 346 | " \n", 367 | " \n", 368 | " \n", 369 | " \n", 370 | " \n", 371 | " \n", 372 | " \n", 373 | " \n", 374 | " \n", 375 | " \n", 376 | " \n", 377 | " \n", 401 | " \n", 427 | " \n", 448 | " \n", 469 | " \n", 488 | " \n", 489 | " \n", 490 | " \n", 491 | " \n", 492 | " \n", 493 | " \n", 494 | " \n", 495 | " \n", 496 | " \n", 497 | " \n", 498 | " \n", 499 | " \n", 500 | " \n", 501 | " \n", 504 | " \n", 505 | " \n", 506 | " \n", 507 | " \n", 510 | " \n", 511 | " \n", 512 | " \n", 513 | " \n", 514 | " \n", 515 | " \n", 516 | " \n", 517 | " \n", 518 | " \n", 524 | " \n", 525 | " \n", 526 | " \n", 527 | " \n", 528 | " \n", 529 | " \n", 530 | " \n", 531 | " \n", 532 | " \n", 533 | " \n", 534 | " \n", 537 | " \n", 538 | " \n", 539 | " \n", 540 | " \n", 541 | " \n", 542 | " \n", 543 | " \n", 544 | " \n", 545 | " \n", 546 | " \n", 547 | " \n", 548 | " \n", 549 | " \n", 550 | " \n", 551 | " \n", 552 | " \n", 553 | " \n", 554 | " \n", 557 | " \n", 558 | " \n", 559 | " \n", 560 | " \n", 561 | " \n", 562 | " \n", 563 | " \n", 564 | " \n", 565 | " \n", 566 | " \n", 567 | " \n", 568 | " \n", 569 | " \n", 570 | " \n", 571 | " \n", 572 | " \n", 573 | " \n", 574 | " \n", 585 | " \n", 586 | " \n", 587 | " \n", 598 | " 
\n", 599 | " \n", 600 | " \n", 611 | " \n", 612 | " \n", 613 | " \n", 616 | " \n", 617 | " \n", 618 | " \n", 621 | " \n", 622 | " \n", 623 | " \n", 626 | " \n", 627 | " \n", 628 | " \n", 631 | " \n", 632 | " \n", 633 | " \n", 634 | " \n", 645 | " \n", 646 | " \n", 647 | " \n", 650 | " \n", 651 | " \n", 652 | " \n", 653 | " \n", 654 | " \n", 655 | " \n", 675 | " \n", 691 | " \n", 723 | " \n", 734 | " \n", 753 | " \n", 754 | " \n", 760 | " \n", 791 | " \n", 792 | " \n", 793 | " \n", 794 | " \n", 795 | " \n", 796 | " \n", 797 | " \n", 798 | " \n", 799 | " \n", 800 | " \n", 801 | " \n", 802 | " \n", 803 | " \n", 804 | " \n", 805 | " \n", 806 | " \n", 809 | " \n", 810 | " \n", 811 | " \n", 812 | " \n", 813 | " \n", 814 | " \n", 815 | " \n", 816 | " \n", 817 | " \n", 818 | " \n", 819 | " \n", 820 | " \n", 821 | " \n", 822 | " \n", 823 | " \n", 824 | " \n", 825 | " \n", 826 | " \n", 829 | " \n", 830 | " \n", 831 | " \n", 832 | " \n", 833 | " \n", 834 | " \n", 835 | " \n", 836 | " \n", 837 | " \n", 838 | " \n", 839 | " \n", 840 | " \n", 841 | " \n", 842 | " \n", 843 | " \n", 844 | " \n", 845 | " \n", 846 | " \n", 847 | " \n", 848 | " \n", 849 | " \n", 850 | " \n", 851 | " \n", 852 | "\n" 853 | ], 854 | "text/plain": [ 855 | "
" 856 | ] 857 | }, 858 | "metadata": { 859 | "needs_background": "light" 860 | }, 861 | "output_type": "display_data" 862 | } 863 | ], 864 | "source": [ 865 | "loss = gluon.loss.SoftmaxCrossEntropyLoss()\n", 866 | "trainer = gluon.Trainer(net.collect_params(), \n", 867 | " 'sgd', {'learning_rate': 0.5})\n", 868 | "d2l.train_ch3(net, train_iter, test_iter, loss, 10, trainer)" 869 | ] 870 | } 871 | ], 872 | "metadata": { 873 | "celltoolbar": "Slideshow", 874 | "kernelspec": { 875 | "display_name": "Python 3", 876 | "language": "python", 877 | "name": "python3" 878 | }, 879 | "language_info": { 880 | "codemirror_mode": { 881 | "name": "ipython", 882 | "version": 3 883 | }, 884 | "file_extension": ".py", 885 | "mimetype": "text/x-python", 886 | "name": "python", 887 | "nbconvert_exporter": "python", 888 | "pygments_lexer": "ipython3", 889 | "version": "3.7.1" 890 | }, 891 | "toc": { 892 | "base_numbering": 1, 893 | "nav_menu": {}, 894 | "number_sections": true, 895 | "sideBar": true, 896 | "skip_h1_title": false, 897 | "title_cell": "Table of Contents", 898 | "title_sidebar": "Contents", 899 | "toc_cell": false, 900 | "toc_position": {}, 901 | "toc_section_display": true, 902 | "toc_window_display": false 903 | } 904 | }, 905 | "nbformat": 4, 906 | "nbformat_minor": 2 907 | } 908 | -------------------------------------------------------------------------------- /notebooks-2/1-use-gpu.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": { 6 | "slideshow": { 7 | "slide_type": "slide" 8 | } 9 | }, 10 | "source": [ 11 | "# GPUs\n", 12 | "\n", 13 | "Check your CUDA driver and device. 
" 14 | ] 15 | }, 16 | { 17 | "cell_type": "code", 18 | "execution_count": 1, 19 | "metadata": { 20 | "ExecuteTime": { 21 | "end_time": "2019-07-03T22:10:58.775829Z", 22 | "start_time": "2019-07-03T22:10:58.421457Z" 23 | }, 24 | "attributes": { 25 | "classes": [], 26 | "id": "", 27 | "n": "1" 28 | }, 29 | "scrolled": true 30 | }, 31 | "outputs": [ 32 | { 33 | "name": "stdout", 34 | "output_type": "stream", 35 | "text": [ 36 | "Wed Jul 3 22:10:58 2019 \n", 37 | "+-----------------------------------------------------------------------------+\n", 38 | "| NVIDIA-SMI 418.67 Driver Version: 418.67 CUDA Version: 10.1 |\n", 39 | "|-------------------------------+----------------------+----------------------+\n", 40 | "| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |\n", 41 | "| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |\n", 42 | "|===============================+======================+======================|\n", 43 | "| 0 Tesla V100-SXM2... Off | 00000000:00:1B.0 Off | 0 |\n", 44 | "| N/A 70C P0 228W / 300W | 7684MiB / 16130MiB | 78% Default |\n", 45 | "+-------------------------------+----------------------+----------------------+\n", 46 | "| 1 Tesla V100-SXM2... Off | 00000000:00:1C.0 Off | 0 |\n", 47 | "| N/A 44C P0 38W / 300W | 11MiB / 16130MiB | 0% Default |\n", 48 | "+-------------------------------+----------------------+----------------------+\n", 49 | "| 2 Tesla V100-SXM2... Off | 00000000:00:1D.0 Off | 0 |\n", 50 | "| N/A 43C P0 59W / 300W | 978MiB / 16130MiB | 14% Default |\n", 51 | "+-------------------------------+----------------------+----------------------+\n", 52 | "| 3 Tesla V100-SXM2... 
Off | 00000000:00:1E.0 Off | 0 |\n", 53 | "| N/A 40C P0 40W / 300W | 11MiB / 16130MiB | 0% Default |\n", 54 | "+-------------------------------+----------------------+----------------------+\n", 55 | " \n", 56 | "+-----------------------------------------------------------------------------+\n", 57 | "| Processes: GPU Memory |\n", 58 | "| GPU PID Type Process name Usage |\n", 59 | "|=============================================================================|\n", 60 | "| 0 118587 C ...iconda3/envs/d2l-en-numpy2-0/bin/python 7673MiB |\n", 61 | "| 2 119109 C ...iconda3/envs/d2l-en-numpy2-1/bin/python 967MiB |\n", 62 | "+-----------------------------------------------------------------------------+\n" 63 | ] 64 | } 65 | ], 66 | "source": [ 67 | "!nvidia-smi" 68 | ] 69 | }, 70 | { 71 | "cell_type": "markdown", 72 | "metadata": { 73 | "slideshow": { 74 | "slide_type": "slide" 75 | } 76 | }, 77 | "source": [ 78 | "Number of available GPUs" 79 | ] 80 | }, 81 | { 82 | "cell_type": "code", 83 | "execution_count": 2, 84 | "metadata": { 85 | "ExecuteTime": { 86 | "end_time": "2019-07-03T22:11:00.147240Z", 87 | "start_time": "2019-07-03T22:10:58.778752Z" 88 | } 89 | }, 90 | "outputs": [ 91 | { 92 | "data": { 93 | "text/plain": [ 94 | "2" 95 | ] 96 | }, 97 | "execution_count": 2, 98 | "metadata": {}, 99 | "output_type": "execute_result" 100 | } 101 | ], 102 | "source": [ 103 | "from mxnet import np, npx\n", 104 | "from mxnet.gluon import nn\n", 105 | "npx.set_np()\n", 106 | "\n", 107 | "npx.num_gpus()" 108 | ] 109 | }, 110 | { 111 | "cell_type": "markdown", 112 | "metadata": { 113 | "slideshow": { 114 | "slide_type": "slide" 115 | } 116 | }, 117 | "source": [ 118 | "Computation devices" 119 | ] 120 | }, 121 | { 122 | "cell_type": "code", 123 | "execution_count": 3, 124 | "metadata": { 125 | "ExecuteTime": { 126 | "end_time": "2019-07-03T22:11:00.157196Z", 127 | "start_time": "2019-07-03T22:11:00.149532Z" 128 | } 129 | }, 130 | "outputs": [ 131 | { 132 | "name": "stdout", 133 | 
"output_type": "stream", 134 | "text": [ 135 | "cpu(0) gpu(0) gpu(1)\n" 136 | ] 137 | }, 138 | { 139 | "data": { 140 | "text/plain": [ 141 | "(gpu(0), cpu(0), [gpu(0), gpu(1)])" 142 | ] 143 | }, 144 | "execution_count": 3, 145 | "metadata": {}, 146 | "output_type": "execute_result" 147 | } 148 | ], 149 | "source": [ 150 | "print(npx.cpu(), npx.gpu(), npx.gpu(1))\n", 151 | "\n", 152 | "def try_gpu(i=0):\n", 153 | " return npx.gpu(i) if npx.num_gpus() >= i + 1 else npx.cpu()\n", 154 | "\n", 155 | "def try_all_gpus():\n", 156 | " ctxes = [npx.gpu(i) for i in range(npx.num_gpus())]\n", 157 | " return ctxes if ctxes else [npx.cpu()]\n", 158 | "\n", 159 | "try_gpu(), try_gpu(3), try_all_gpus()" 160 | ] 161 | }, 162 | { 163 | "cell_type": "markdown", 164 | "metadata": { 165 | "slideshow": { 166 | "slide_type": "slide" 167 | } 168 | }, 169 | "source": [ 170 | "Create ndarrays on the 1st GPU" 171 | ] 172 | }, 173 | { 174 | "cell_type": "code", 175 | "execution_count": 4, 176 | "metadata": { 177 | "ExecuteTime": { 178 | "end_time": "2019-07-03T22:11:04.523547Z", 179 | "start_time": "2019-07-03T22:11:00.159618Z" 180 | }, 181 | "attributes": { 182 | "classes": [], 183 | "id": "", 184 | "n": "5" 185 | } 186 | }, 187 | "outputs": [ 188 | { 189 | "name": "stdout", 190 | "output_type": "stream", 191 | "text": [ 192 | "gpu(0)\n" 193 | ] 194 | }, 195 | { 196 | "data": { 197 | "text/plain": [ 198 | "array([[1., 1., 1.],\n", 199 | " [1., 1., 1.]], ctx=gpu(0))" 200 | ] 201 | }, 202 | "execution_count": 4, 203 | "metadata": {}, 204 | "output_type": "execute_result" 205 | } 206 | ], 207 | "source": [ 208 | "x = np.ones((2, 3), ctx=try_gpu())\n", 209 | "print(x.context)\n", 210 | "x" 211 | ] 212 | }, 213 | { 214 | "cell_type": "markdown", 215 | "metadata": { 216 | "slideshow": { 217 | "slide_type": "slide" 218 | } 219 | }, 220 | "source": [ 221 | "Create on the 2nd GPU" 222 | ] 223 | }, 224 | { 225 | "cell_type": "code", 226 | "execution_count": 5, 227 | "metadata": { 228 | "ExecuteTime": 
{ 229 | "end_time": "2019-07-03T22:11:08.769323Z", 230 | "start_time": "2019-07-03T22:11:04.525346Z" 231 | } 232 | }, 233 | "outputs": [ 234 | { 235 | "data": { 236 | "text/plain": [ 237 | "array([[0.59119 , 0.313164 , 0.76352036],\n", 238 | " [0.9731786 , 0.35454726, 0.11677533]], ctx=gpu(1))" 239 | ] 240 | }, 241 | "execution_count": 5, 242 | "metadata": {}, 243 | "output_type": "execute_result" 244 | } 245 | ], 246 | "source": [ 247 | "y = np.random.uniform(size=(2, 3), ctx=try_gpu(1))\n", 248 | "y" 249 | ] 250 | }, 251 | { 252 | "cell_type": "markdown", 253 | "metadata": { 254 | "slideshow": { 255 | "slide_type": "slide" 256 | } 257 | }, 258 | "source": [ 259 | "Copying between devices" 260 | ] 261 | }, 262 | { 263 | "cell_type": "code", 264 | "execution_count": 6, 265 | "metadata": { 266 | "ExecuteTime": { 267 | "end_time": "2019-07-03T22:11:08.779134Z", 268 | "start_time": "2019-07-03T22:11:08.770982Z" 269 | }, 270 | "attributes": { 271 | "classes": [], 272 | "id": "", 273 | "n": "7" 274 | } 275 | }, 276 | "outputs": [ 277 | { 278 | "name": "stdout", 279 | "output_type": "stream", 280 | "text": [ 281 | "[[1. 1. 1.]\n", 282 | " [1. 1. 1.]] @gpu(0)\n", 283 | "[[1. 1. 1.]\n", 284 | " [1. 1. 1.]] @gpu(1)\n" 285 | ] 286 | } 287 | ], 288 | "source": [ 289 | "z = x.copyto(try_gpu(1))\n", 290 | "print(x)\n", 291 | "print(z)" 292 | ] 293 | }, 294 | { 295 | "cell_type": "markdown", 296 | "metadata": { 297 | "slideshow": { 298 | "slide_type": "slide" 299 | } 300 | }, 301 | "source": [ 302 | "All inputs of an operator must be on the same device; the computation then runs on that device." 
303 | ] 304 | }, 305 | { 306 | "cell_type": "code", 307 | "execution_count": 7, 308 | "metadata": { 309 | "ExecuteTime": { 310 | "end_time": "2019-07-03T22:11:08.786457Z", 311 | "start_time": "2019-07-03T22:11:08.781557Z" 312 | } 313 | }, 314 | "outputs": [ 315 | { 316 | "data": { 317 | "text/plain": [ 318 | "array([[1.59119 , 1.313164 , 1.7635204],\n", 319 | " [1.9731786, 1.3545473, 1.1167753]], ctx=gpu(1))" 320 | ] 321 | }, 322 | "execution_count": 7, 323 | "metadata": {}, 324 | "output_type": "execute_result" 325 | } 326 | ], 327 | "source": [ 328 | "y + z" 329 | ] 330 | }, 331 | { 332 | "cell_type": "markdown", 333 | "metadata": { 334 | "slideshow": { 335 | "slide_type": "slide" 336 | } 337 | }, 338 | "source": [ 339 | "Initialize parameters on the first GPU." 340 | ] 341 | }, 342 | { 343 | "cell_type": "code", 344 | "execution_count": 8, 345 | "metadata": { 346 | "ExecuteTime": { 347 | "end_time": "2019-07-03T22:11:08.795260Z", 348 | "start_time": "2019-07-03T22:11:08.789855Z" 349 | }, 350 | "attributes": { 351 | "classes": [], 352 | "id": "", 353 | "n": "12" 354 | } 355 | }, 356 | "outputs": [], 357 | "source": [ 358 | "net = nn.Sequential()\n", 359 | "net.add(nn.Dense(1))\n", 360 | "net.initialize(ctx=try_gpu())" 361 | ] 362 | }, 363 | { 364 | "cell_type": "markdown", 365 | "metadata": { 366 | "slideshow": { 367 | "slide_type": "slide" 368 | } 369 | }, 370 | "source": [ 371 | "When the input is an ndarray on the GPU, Gluon will calculate the result on the same GPU." 
372 | ] 373 | }, 374 | { 375 | "cell_type": "code", 376 | "execution_count": 9, 377 | "metadata": { 378 | "ExecuteTime": { 379 | "end_time": "2019-07-03T22:11:08.818131Z", 380 | "start_time": "2019-07-03T22:11:08.797675Z" 381 | }, 382 | "attributes": { 383 | "classes": [], 384 | "id": "", 385 | "n": "13" 386 | } 387 | }, 388 | "outputs": [ 389 | { 390 | "data": { 391 | "text/plain": [ 392 | "array([[0.04995865],\n", 393 | " [0.04995865]], ctx=gpu(0))" 394 | ] 395 | }, 396 | "execution_count": 9, 397 | "metadata": {}, 398 | "output_type": "execute_result" 399 | } 400 | ], 401 | "source": [ 402 | "net(x)" 403 | ] 404 | }, 405 | { 406 | "cell_type": "markdown", 407 | "metadata": { 408 | "slideshow": { 409 | "slide_type": "slide" 410 | } 411 | }, 412 | "source": [ 413 | "Let us confirm that the model parameters are stored on the same GPU." 414 | ] 415 | }, 416 | { 417 | "cell_type": "code", 418 | "execution_count": 10, 419 | "metadata": { 420 | "ExecuteTime": { 421 | "end_time": "2019-07-03T22:11:08.825572Z", 422 | "start_time": "2019-07-03T22:11:08.820386Z" 423 | }, 424 | "attributes": { 425 | "classes": [], 426 | "id": "", 427 | "n": "14" 428 | } 429 | }, 430 | "outputs": [ 431 | { 432 | "data": { 433 | "text/plain": [ 434 | "array([[0.0068339 , 0.01299825, 0.0301265 ]], ctx=gpu(0))" 435 | ] 436 | }, 437 | "execution_count": 10, 438 | "metadata": {}, 439 | "output_type": "execute_result" 440 | } 441 | ], 442 | "source": [ 443 | "net[0].weight.data()" 444 | ] 445 | } 446 | ], 447 | "metadata": { 448 | "celltoolbar": "Slideshow", 449 | "kernelspec": { 450 | "display_name": "Python 3", 451 | "language": "python", 452 | "name": "python3" 453 | }, 454 | "language_info": { 455 | "codemirror_mode": { 456 | "name": "ipython", 457 | "version": 3 458 | }, 459 | "file_extension": ".py", 460 | "mimetype": "text/x-python", 461 | "name": "python", 462 | "nbconvert_exporter": "python", 463 | "pygments_lexer": "ipython3", 464 | "version": "3.7.1" 465 | }, 466 | "toc": { 467 | 
"base_numbering": 1, 468 | "nav_menu": {}, 469 | "number_sections": true, 470 | "sideBar": true, 471 | "skip_h1_title": false, 472 | "title_cell": "Table of Contents", 473 | "title_sidebar": "Contents", 474 | "toc_cell": false, 475 | "toc_position": {}, 476 | "toc_section_display": true, 477 | "toc_window_display": false 478 | } 479 | }, 480 | "nbformat": 4, 481 | "nbformat_minor": 2 482 | } 483 | -------------------------------------------------------------------------------- /notebooks-2/2-conv-layer.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": { 6 | "slideshow": { 7 | "slide_type": "slide" 8 | } 9 | }, 10 | "source": [ 11 | "# Convolutions" 12 | ] 13 | }, 14 | { 15 | "cell_type": "code", 16 | "execution_count": 1, 17 | "metadata": { 18 | "ExecuteTime": { 19 | "end_time": "2019-07-03T22:12:43.185492Z", 20 | "start_time": "2019-07-03T22:12:41.569269Z" 21 | } 22 | }, 23 | "outputs": [], 24 | "source": [ 25 | "from mxnet import autograd, np, npx\n", 26 | "from mxnet.gluon import nn\n", 27 | "npx.set_np()" 28 | ] 29 | }, 30 | { 31 | "cell_type": "markdown", 32 | "metadata": { 33 | "slideshow": { 34 | "slide_type": "slide" 35 | } 36 | }, 37 | "source": [ 38 | "The cross-correlation operator." 
39 | ] 40 | }, 41 | { 42 | "cell_type": "code", 43 | "execution_count": 2, 44 | "metadata": { 45 | "ExecuteTime": { 46 | "end_time": "2019-07-03T22:12:43.312247Z", 47 | "start_time": "2019-07-03T22:12:43.188173Z" 48 | } 49 | }, 50 | "outputs": [ 51 | { 52 | "data": { 53 | "text/plain": [ 54 | "array([[19., 25.],\n", 55 | " [37., 43.]])" 56 | ] 57 | }, 58 | "execution_count": 2, 59 | "metadata": {}, 60 | "output_type": "execute_result" 61 | } 62 | ], 63 | "source": [ 64 | "def corr2d(X, K):\n", 65 | " h, w = K.shape\n", 66 | " Y = np.zeros((X.shape[0] - h + 1, X.shape[1] - w + 1))\n", 67 | " for i in range(Y.shape[0]):\n", 68 | " for j in range(Y.shape[1]):\n", 69 | " Y[i, j] = (X[i: i + h, j: j + w] * K).sum()\n", 70 | " return Y\n", 71 | "\n", 72 | "X = np.array([[0, 1, 2], [3, 4, 5], [6, 7, 8]])\n", 73 | "K = np.array([[0, 1], [2, 3]])\n", 74 | "corr2d(X, K)" 75 | ] 76 | }, 77 | { 78 | "cell_type": "markdown", 79 | "metadata": { 80 | "slideshow": { 81 | "slide_type": "slide" 82 | } 83 | }, 84 | "source": [ 85 | "Convolutional layers" 86 | ] 87 | }, 88 | { 89 | "cell_type": "code", 90 | "execution_count": 3, 91 | "metadata": { 92 | "ExecuteTime": { 93 | "end_time": "2019-07-03T22:12:43.320777Z", 94 | "start_time": "2019-07-03T22:12:43.314285Z" 95 | }, 96 | "attributes": { 97 | "classes": [], 98 | "id": "", 99 | "n": "70" 100 | } 101 | }, 102 | "outputs": [], 103 | "source": [ 104 | "class Conv2D(nn.Block):\n", 105 | " def __init__(self, kernel_size, **kwargs):\n", 106 | " super(Conv2D, self).__init__(**kwargs)\n", 107 | " self.weight = self.params.get('weight', shape=kernel_size)\n", 108 | " self.bias = self.params.get('bias', shape=(1,))\n", 109 | "\n", 110 | " def forward(self, x):\n", 111 | " return corr2d(x, self.weight.data()) + self.bias.data()" 112 | ] 113 | }, 114 | { 115 | "cell_type": "markdown", 116 | "metadata": { 117 | "slideshow": { 118 | "slide_type": "slide" 119 | } 120 | }, 121 | "source": [ 122 | "Padding" 123 | ] 124 | }, 125 | { 126 | 
"cell_type": "code", 127 | "execution_count": 4, 128 | "metadata": { 129 | "ExecuteTime": { 130 | "end_time": "2019-07-03T22:12:43.343747Z", 131 | "start_time": "2019-07-03T22:12:43.325742Z" 132 | } 133 | }, 134 | "outputs": [ 135 | { 136 | "data": { 137 | "text/plain": [ 138 | "(8, 8)" 139 | ] 140 | }, 141 | "execution_count": 4, 142 | "metadata": {}, 143 | "output_type": "execute_result" 144 | } 145 | ], 146 | "source": [ 147 | "# A convenience function to test Gluon convolution layers. \n", 148 | "def comp_conv2d(conv2d, X):\n", 149 | " conv2d.initialize()\n", 150 | " # Add batch and channel dimensions.\n", 151 | " X = X.reshape((1, 1) + X.shape)\n", 152 | " Y = conv2d(X)\n", 153 | " # Exclude the first two dimensions.\n", 154 | " return Y.reshape(Y.shape[2:])\n", 155 | "\n", 156 | "conv2d = nn.Conv2D(1, kernel_size=3, padding=1)\n", 157 | "X = np.random.uniform(size=(8, 8))\n", 158 | "comp_conv2d(conv2d, X).shape" 159 | ] 160 | }, 161 | { 162 | "cell_type": "markdown", 163 | "metadata": { 164 | "slideshow": { 165 | "slide_type": "slide" 166 | } 167 | }, 168 | "source": [ 169 | "Stride" 170 | ] 171 | }, 172 | { 173 | "cell_type": "code", 174 | "execution_count": 5, 175 | "metadata": { 176 | "ExecuteTime": { 177 | "end_time": "2019-07-03T22:12:43.364745Z", 178 | "start_time": "2019-07-03T22:12:43.345529Z" 179 | } 180 | }, 181 | "outputs": [ 182 | { 183 | "data": { 184 | "text/plain": [ 185 | "(4, 4)" 186 | ] 187 | }, 188 | "execution_count": 5, 189 | "metadata": {}, 190 | "output_type": "execute_result" 191 | } 192 | ], 193 | "source": [ 194 | "conv2d = nn.Conv2D(1, kernel_size=3, padding=1, strides=2)\n", 195 | "comp_conv2d(conv2d, X).shape" 196 | ] 197 | }, 198 | { 199 | "cell_type": "markdown", 200 | "metadata": { 201 | "slideshow": { 202 | "slide_type": "slide" 203 | } 204 | }, 205 | "source": [ 206 | "A slightly more complicated example" 207 | ] 208 | }, 209 | { 210 | "cell_type": "code", 211 | "execution_count": 6, 212 | "metadata": { 213 | "ExecuteTime": { 
214 | "end_time": "2019-07-03T22:12:43.382376Z", 215 | "start_time": "2019-07-03T22:12:43.368194Z" 216 | } 217 | }, 218 | "outputs": [ 219 | { 220 | "data": { 221 | "text/plain": [ 222 | "(2, 2)" 223 | ] 224 | }, 225 | "execution_count": 6, 226 | "metadata": {}, 227 | "output_type": "execute_result" 228 | } 229 | ], 230 | "source": [ 231 | "conv2d = nn.Conv2D(1, kernel_size=(3, 5), padding=(0, 1), strides=(3, 4))\n", 232 | "comp_conv2d(conv2d, X).shape" 233 | ] 234 | }, 235 | { 236 | "cell_type": "markdown", 237 | "metadata": { 238 | "slideshow": { 239 | "slide_type": "slide" 240 | } 241 | }, 242 | "source": [ 243 | "Multiple input channels" 244 | ] 245 | }, 246 | { 247 | "cell_type": "code", 248 | "execution_count": 7, 249 | "metadata": { 250 | "ExecuteTime": { 251 | "end_time": "2019-07-03T22:12:43.535573Z", 252 | "start_time": "2019-07-03T22:12:43.387749Z" 253 | } 254 | }, 255 | "outputs": [ 256 | { 257 | "data": { 258 | "text/plain": [ 259 | "array([[ 56., 72.],\n", 260 | " [104., 120.]])" 261 | ] 262 | }, 263 | "execution_count": 7, 264 | "metadata": {}, 265 | "output_type": "execute_result" 266 | } 267 | ], 268 | "source": [ 269 | "def corr2d_multi_in(X, K):\n", 270 | " return sum(corr2d(x, k) for x, k in zip(X, K))\n", 271 | "\n", 272 | "X = np.array([[[0, 1, 2], [3, 4, 5], [6, 7, 8]],\n", 273 | " [[1, 2, 3], [4, 5, 6], [7, 8, 9]]])\n", 274 | "K = np.array([[[0, 1], [2, 3]], [[1, 2], [3, 4]]])\n", 275 | "\n", 276 | "corr2d_multi_in(X, K)" 277 | ] 278 | }, 279 | { 280 | "cell_type": "markdown", 281 | "metadata": { 282 | "slideshow": { 283 | "slide_type": "slide" 284 | } 285 | }, 286 | "source": [ 287 | "Multiple output channels" 288 | ] 289 | }, 290 | { 291 | "cell_type": "code", 292 | "execution_count": 8, 293 | "metadata": { 294 | "ExecuteTime": { 295 | "end_time": "2019-07-03T22:12:43.576040Z", 296 | "start_time": "2019-07-03T22:12:43.538210Z" 297 | }, 298 | "scrolled": true 299 | }, 300 | "outputs": [ 301 | { 302 | "data": { 303 | "text/plain": [ 304 | 
"((3, 2, 2, 2), (3, 2, 2))" 305 | ] 306 | }, 307 | "execution_count": 8, 308 | "metadata": {}, 309 | "output_type": "execute_result" 310 | } 311 | ], 312 | "source": [ 313 | "def corr2d_multi_in_out(X, K):\n", 314 | " return np.stack([corr2d_multi_in(X, k) for k in K])\n", 315 | "\n", 316 | "K = np.stack((K, K + 1, K + 2))\n", 317 | "K.shape, corr2d_multi_in_out(X, K).shape" 318 | ] 319 | } 320 | ], 321 | "metadata": { 322 | "celltoolbar": "Slideshow", 323 | "kernelspec": { 324 | "display_name": "Python 3", 325 | "language": "python", 326 | "name": "python3" 327 | }, 328 | "language_info": { 329 | "codemirror_mode": { 330 | "name": "ipython", 331 | "version": 3 332 | }, 333 | "file_extension": ".py", 334 | "mimetype": "text/x-python", 335 | "name": "python", 336 | "nbconvert_exporter": "python", 337 | "pygments_lexer": "ipython3", 338 | "version": "3.7.1" 339 | }, 340 | "toc": { 341 | "base_numbering": 1, 342 | "nav_menu": {}, 343 | "number_sections": true, 344 | "sideBar": true, 345 | "skip_h1_title": false, 346 | "title_cell": "Table of Contents", 347 | "title_sidebar": "Contents", 348 | "toc_cell": false, 349 | "toc_position": {}, 350 | "toc_section_display": true, 351 | "toc_window_display": false 352 | } 353 | }, 354 | "nbformat": 4, 355 | "nbformat_minor": 2 356 | } 357 | -------------------------------------------------------------------------------- /notebooks-2/3-pooling.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": { 6 | "slideshow": { 7 | "slide_type": "slide" 8 | } 9 | }, 10 | "source": [ 11 | "# Pooling" 12 | ] 13 | }, 14 | { 15 | "cell_type": "code", 16 | "execution_count": 1, 17 | "metadata": { 18 | "ExecuteTime": { 19 | "end_time": "2019-07-03T22:14:08.041117Z", 20 | "start_time": "2019-07-03T22:14:06.519244Z" 21 | }, 22 | "attributes": { 23 | "classes": [], 24 | "id": "", 25 | "n": "3" 26 | } 27 | }, 28 | "outputs": [], 29 | "source": [ 
30 | "from mxnet import np, npx\n", 31 | "from mxnet.gluon import nn\n", 32 | "npx.set_np()" 33 | ] 34 | }, 35 | { 36 | "cell_type": "markdown", 37 | "metadata": { 38 | "slideshow": { 39 | "slide_type": "slide" 40 | } 41 | }, 42 | "source": [ 43 | "Implement 2-d pooling" 44 | ] 45 | }, 46 | { 47 | "cell_type": "code", 48 | "execution_count": 2, 49 | "metadata": { 50 | "ExecuteTime": { 51 | "end_time": "2019-07-03T22:14:08.174881Z", 52 | "start_time": "2019-07-03T22:14:08.043310Z" 53 | }, 54 | "attributes": { 55 | "classes": [], 56 | "id": "", 57 | "n": "4" 58 | } 59 | }, 60 | "outputs": [ 61 | { 62 | "data": { 63 | "text/plain": [ 64 | "array([[4., 5.],\n", 65 | " [7., 8.]])" 66 | ] 67 | }, 68 | "execution_count": 2, 69 | "metadata": {}, 70 | "output_type": "execute_result" 71 | } 72 | ], 73 | "source": [ 74 | "def pool2d(X, pool_size, mode='max'):\n", 75 | " p_h, p_w = pool_size\n", 76 | " Y = np.zeros((X.shape[0] - p_h + 1, X.shape[1] - p_w + 1))\n", 77 | " for i in range(Y.shape[0]):\n", 78 | " for j in range(Y.shape[1]):\n", 79 | " if mode == 'max':\n", 80 | " Y[i, j] = np.max(X[i: i + p_h, j: j + p_w])\n", 81 | " elif mode == 'avg':\n", 82 | " Y[i, j] = X[i: i + p_h, j: j + p_w].mean()\n", 83 | " return Y\n", 84 | "\n", 85 | "X = np.array([[0, 1, 2], [3, 4, 5], [6, 7, 8]])\n", 86 | "pool2d(X, (2, 2))" 87 | ] 88 | }, 89 | { 90 | "cell_type": "markdown", 91 | "metadata": { 92 | "slideshow": { 93 | "slide_type": "slide" 94 | } 95 | }, 96 | "source": [ 97 | "Padding and Stride" 98 | ] 99 | }, 100 | { 101 | "cell_type": "code", 102 | "execution_count": 3, 103 | "metadata": { 104 | "ExecuteTime": { 105 | "end_time": "2019-07-03T22:14:08.188017Z", 106 | "start_time": "2019-07-03T22:14:08.176923Z" 107 | }, 108 | "attributes": { 109 | "classes": [], 110 | "id": "", 111 | "n": "15" 112 | } 113 | }, 114 | "outputs": [ 115 | { 116 | "name": "stdout", 117 | "output_type": "stream", 118 | "text": [ 119 | "[[[[ 0. 1. 2. 3.]\n", 120 | " [ 4. 5. 6. 7.]\n", 121 | " [ 8. 9. 10. 
11.]\n", 122 | " [12. 13. 14. 15.]]]]\n" 123 | ] 124 | }, 125 | { 126 | "data": { 127 | "text/plain": [ 128 | "array([[[[10.]]]])" 129 | ] 130 | }, 131 | "execution_count": 3, 132 | "metadata": {}, 133 | "output_type": "execute_result" 134 | } 135 | ], 136 | "source": [ 137 | "X = np.arange(16).reshape((1, 1, 4, 4))\n", 138 | "print(X)\n", 139 | "\n", 140 | "pool2d = nn.MaxPool2D(3)\n", 141 | "pool2d(X)" 142 | ] 143 | }, 144 | { 145 | "cell_type": "markdown", 146 | "metadata": { 147 | "slideshow": { 148 | "slide_type": "slide" 149 | } 150 | }, 151 | "source": [ 152 | "Specify the padding and stride" 153 | ] 154 | }, 155 | { 156 | "cell_type": "code", 157 | "execution_count": 4, 158 | "metadata": { 159 | "ExecuteTime": { 160 | "end_time": "2019-07-03T22:14:08.195734Z", 161 | "start_time": "2019-07-03T22:14:08.189887Z" 162 | }, 163 | "attributes": { 164 | "classes": [], 165 | "id": "", 166 | "n": "7" 167 | } 168 | }, 169 | "outputs": [ 170 | { 171 | "name": "stdout", 172 | "output_type": "stream", 173 | "text": [ 174 | "[[[[ 0. 1. 2. 3.]\n", 175 | " [ 4. 5. 6. 7.]\n", 176 | " [ 8. 9. 10. 11.]\n", 177 | " [12. 13. 14. 
15.]]]]\n" 178 | ] 179 | }, 180 | { 181 | "data": { 182 | "text/plain": [ 183 | "array([[[[ 5., 7.],\n", 184 | " [13., 15.]]]])" 185 | ] 186 | }, 187 | "execution_count": 4, 188 | "metadata": {}, 189 | "output_type": "execute_result" 190 | } 191 | ], 192 | "source": [ 193 | "print(X)\n", 194 | "pool2d = nn.MaxPool2D(3, padding=1, strides=2)\n", 195 | "pool2d(X)" 196 | ] 197 | }, 198 | { 199 | "cell_type": "markdown", 200 | "metadata": { 201 | "slideshow": { 202 | "slide_type": "slide" 203 | } 204 | }, 205 | "source": [ 206 | "Multiple channels" 207 | ] 208 | }, 209 | { 210 | "cell_type": "code", 211 | "execution_count": 5, 212 | "metadata": { 213 | "ExecuteTime": { 214 | "end_time": "2019-07-03T22:14:08.217621Z", 215 | "start_time": "2019-07-03T22:14:08.197280Z" 216 | }, 217 | "attributes": { 218 | "classes": [], 219 | "id": "", 220 | "n": "9" 221 | } 222 | }, 223 | "outputs": [ 224 | { 225 | "name": "stdout", 226 | "output_type": "stream", 227 | "text": [ 228 | "[[[[ 0. 1. 2. 3.]\n", 229 | " [ 4. 5. 6. 7.]\n", 230 | " [ 8. 9. 10. 11.]\n", 231 | " [12. 13. 14. 15.]]\n", 232 | "\n", 233 | " [[ 1. 2. 3. 4.]\n", 234 | " [ 5. 6. 7. 8.]\n", 235 | " [ 9. 10. 11. 12.]\n", 236 | " [13. 14. 15. 
16.]]]]\n" 237 | ] 238 | }, 239 | { 240 | "data": { 241 | "text/plain": [ 242 | "array([[[[ 5., 7.],\n", 243 | " [13., 15.]],\n", 244 | "\n", 245 | " [[ 6., 8.],\n", 246 | " [14., 16.]]]])" 247 | ] 248 | }, 249 | "execution_count": 5, 250 | "metadata": {}, 251 | "output_type": "execute_result" 252 | } 253 | ], 254 | "source": [ 255 | "X = np.concatenate((X, X + 1), axis=1)\n", 256 | "print(X)\n", 257 | "pool2d = nn.MaxPool2D(3, padding=1, strides=2)\n", 258 | "pool2d(X)" 259 | ] 260 | } 261 | ], 262 | "metadata": { 263 | "celltoolbar": "Slideshow", 264 | "kernelspec": { 265 | "display_name": "Python 3", 266 | "language": "python", 267 | "name": "python3" 268 | }, 269 | "language_info": { 270 | "codemirror_mode": { 271 | "name": "ipython", 272 | "version": 3 273 | }, 274 | "file_extension": ".py", 275 | "mimetype": "text/x-python", 276 | "name": "python", 277 | "nbconvert_exporter": "python", 278 | "pygments_lexer": "ipython3", 279 | "version": "3.7.1" 280 | }, 281 | "toc": { 282 | "base_numbering": 1, 283 | "nav_menu": {}, 284 | "number_sections": true, 285 | "sideBar": true, 286 | "skip_h1_title": false, 287 | "title_cell": "Table of Contents", 288 | "title_sidebar": "Contents", 289 | "toc_cell": false, 290 | "toc_position": {}, 291 | "toc_section_display": true, 292 | "toc_window_display": false 293 | } 294 | }, 295 | "nbformat": 4, 296 | "nbformat_minor": 2 297 | } 298 | -------------------------------------------------------------------------------- /notebooks-3/1-hybridize.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": { 6 | "slideshow": { 7 | "slide_type": "slide" 8 | } 9 | }, 10 | "source": [ 11 | "# A Hybrid of Imperative and Symbolic Programming" 12 | ] 13 | }, 14 | { 15 | "cell_type": "code", 16 | "execution_count": 1, 17 | "metadata": { 18 | "ExecuteTime": { 19 | "end_time": "2019-07-03T22:42:42.256618Z", 20 | "start_time": 
"2019-07-03T22:42:39.319502Z" 21 | } 22 | }, 23 | "outputs": [ 24 | { 25 | "data": { 26 | "text/plain": [ 27 | "6" 28 | ] 29 | }, 30 | "execution_count": 1, 31 | "metadata": {}, 32 | "output_type": "execute_result" 33 | } 34 | ], 35 | "source": [ 36 | "import d2l\n", 37 | "from mxnet import np, npx, sym\n", 38 | "from mxnet.gluon import nn\n", 39 | "npx.set_np()\n", 40 | "\n", 41 | "def add(a, b):\n", 42 | " return a + b\n", 43 | "def fancy_func(a, b, c):\n", 44 | " e = add(a, b)\n", 45 | " return add(c, e)\n", 46 | "fancy_func(1, 2, 3)" 47 | ] 48 | }, 49 | { 50 | "cell_type": "markdown", 51 | "metadata": { 52 | "slideshow": { 53 | "slide_type": "slide" 54 | } 55 | }, 56 | "source": [ 57 | "Symbolic programming" 58 | ] 59 | }, 60 | { 61 | "cell_type": "code", 62 | "execution_count": 2, 63 | "metadata": { 64 | "ExecuteTime": { 65 | "end_time": "2019-07-03T22:42:42.263210Z", 66 | "start_time": "2019-07-03T22:42:42.258563Z" 67 | }, 68 | "scrolled": true 69 | }, 70 | "outputs": [ 71 | { 72 | "name": "stdout", 73 | "output_type": "stream", 74 | "text": [ 75 | "6\n" 76 | ] 77 | } 78 | ], 79 | "source": [ 80 | "def add_str():\n", 81 | " return '''def add(a, b):\n", 82 | " return a + b\n", 83 | "'''\n", 84 | "def fancy_func_str():\n", 85 | " return '''def fancy_func(a, b, c):\n", 86 | " e = add(a, b)\n", 87 | " return add(c, e)\n", 88 | "'''\n", 89 | "def evoke_str():\n", 90 | " return add_str() + fancy_func_str() + '''\n", 91 | "print(fancy_func(1, 2, 3))\n", 92 | "'''\n", 93 | "prog = evoke_str()\n", 94 | "y = compile(prog, '', 'exec')\n", 95 | "exec(y)" 96 | ] 97 | }, 98 | { 99 | "cell_type": "markdown", 100 | "metadata": { 101 | "slideshow": { 102 | "slide_type": "slide" 103 | } 104 | }, 105 | "source": [ 106 | "Construct with the ``HybridSequential`` class" 107 | ] 108 | }, 109 | { 110 | "cell_type": "code", 111 | "execution_count": 3, 112 | "metadata": { 113 | "ExecuteTime": { 114 | "end_time": "2019-07-03T22:42:42.287669Z", 115 | "start_time": 
"2019-07-03T22:42:42.264929Z" 116 | } 117 | }, 118 | "outputs": [ 119 | { 120 | "data": { 121 | "text/plain": [ 122 | "array([[0.08827581, 0.00505182]])" 123 | ] 124 | }, 125 | "execution_count": 3, 126 | "metadata": {}, 127 | "output_type": "execute_result" 128 | } 129 | ], 130 | "source": [ 131 | "def get_net():\n", 132 | " net = nn.HybridSequential()\n", 133 | " net.add(nn.Dense(256, activation='relu'),\n", 134 | " nn.Dense(128, activation='relu'),\n", 135 | " nn.Dense(2))\n", 136 | " net.initialize()\n", 137 | " return net\n", 138 | "\n", 139 | "x = np.random.normal(size=(1, 512))\n", 140 | "net = get_net()\n", 141 | "net(x)" 142 | ] 143 | }, 144 | { 145 | "cell_type": "markdown", 146 | "metadata": { 147 | "slideshow": { 148 | "slide_type": "slide" 149 | } 150 | }, 151 | "source": [ 152 | "Compile and optimize the workload" 153 | ] 154 | }, 155 | { 156 | "cell_type": "code", 157 | "execution_count": 4, 158 | "metadata": { 159 | "ExecuteTime": { 160 | "end_time": "2019-07-03T22:42:42.295336Z", 161 | "start_time": "2019-07-03T22:42:42.289426Z" 162 | } 163 | }, 164 | "outputs": [ 165 | { 166 | "data": { 167 | "text/plain": [ 168 | "array([[0.08827581, 0.00505182]])" 169 | ] 170 | }, 171 | "execution_count": 4, 172 | "metadata": {}, 173 | "output_type": "execute_result" 174 | } 175 | ], 176 | "source": [ 177 | "net.hybridize()\n", 178 | "net(x)" 179 | ] 180 | }, 181 | { 182 | "cell_type": "markdown", 183 | "metadata": { 184 | "slideshow": { 185 | "slide_type": "slide" 186 | } 187 | }, 188 | "source": [ 189 | "Benchmark" 190 | ] 191 | }, 192 | { 193 | "cell_type": "code", 194 | "execution_count": 5, 195 | "metadata": { 196 | "ExecuteTime": { 197 | "end_time": "2019-07-03T22:42:43.129608Z", 198 | "start_time": "2019-07-03T22:42:42.296695Z" 199 | } 200 | }, 201 | "outputs": [ 202 | { 203 | "name": "stdout", 204 | "output_type": "stream", 205 | "text": [ 206 | "before hybridizing: 0.5593 sec\n", 207 | "after hybridizing: 0.2651 sec\n" 208 | ] 209 | } 210 | ], 211 | 
"source": [ 212 | "def benchmark(net, x):\n", 213 | " timer = d2l.Timer()\n", 214 | " for i in range(1000):\n", 215 | " _ = net(x)\n", 216 | " npx.waitall()\n", 217 | " return timer.stop()\n", 218 | "\n", 219 | "net = get_net()\n", 220 | "print('before hybridizing: %.4f sec' % (benchmark(net, x)))\n", 221 | "net.hybridize()\n", 222 | "print('after hybridizing: %.4f sec' % (benchmark(net, x)))" 223 | ] 224 | }, 225 | { 226 | "cell_type": "markdown", 227 | "metadata": { 228 | "slideshow": { 229 | "slide_type": "slide" 230 | } 231 | }, 232 | "source": [ 233 | "Export the program to other languages" 234 | ] 235 | }, 236 | { 237 | "cell_type": "code", 238 | "execution_count": 6, 239 | "metadata": { 240 | "ExecuteTime": { 241 | "end_time": "2019-07-03T22:42:43.364016Z", 242 | "start_time": "2019-07-03T22:42:43.131219Z" 243 | } 244 | }, 245 | "outputs": [ 246 | { 247 | "name": "stdout", 248 | "output_type": "stream", 249 | "text": [ 250 | "my_mlp-0000.params my_mlp-symbol.json\n", 251 | "{\n", 252 | " \"nodes\": [\n", 253 | " {\n", 254 | " \"op\": \"null\", \n", 255 | " \"name\": \"data\", \n", 256 | " \"inputs\": []\n", 257 | " }, \n", 258 | " {\n", 259 | " \"op\": \"null\", \n", 260 | " \"name\": \"dense3_weight\", \n", 261 | " \"attrs\": {\n", 262 | " \"__dtype__\": \"0\", \n", 263 | " \"__lr_mult__\": \"1.0\", \n", 264 | " \"__shape__\": \"(256, -1)\", \n", 265 | " \"__storage_type__\": \"0\", \n", 266 | " \"__wd_mult__\": \"1.0\"\n", 267 | " }, \n", 268 | " \"inputs\": []\n", 269 | " }, \n", 270 | " {\n" 271 | ] 272 | } 273 | ], 274 | "source": [ 275 | "net.export('my_mlp')\n", 276 | "!ls my_mlp*\n", 277 | "!head -n20 my_mlp-symbol.json" 278 | ] 279 | } 280 | ], 281 | "metadata": { 282 | "celltoolbar": "Slideshow", 283 | "kernelspec": { 284 | "display_name": "Python 3", 285 | "language": "python", 286 | "name": "python3" 287 | }, 288 | "language_info": { 289 | "codemirror_mode": { 290 | "name": "ipython", 291 | "version": 3 292 | }, 293 | "file_extension": ".py", 294 
| "mimetype": "text/x-python", 295 | "name": "python", 296 | "nbconvert_exporter": "python", 297 | "pygments_lexer": "ipython3", 298 | "version": "3.7.1" 299 | }, 300 | "toc": { 301 | "base_numbering": 1, 302 | "nav_menu": {}, 303 | "number_sections": true, 304 | "sideBar": true, 305 | "skip_h1_title": false, 306 | "title_cell": "Table of Contents", 307 | "title_sidebar": "Contents", 308 | "toc_cell": false, 309 | "toc_position": {}, 310 | "toc_section_display": true, 311 | "toc_window_display": false 312 | } 313 | }, 314 | "nbformat": 4, 315 | "nbformat_minor": 2 316 | } 317 | -------------------------------------------------------------------------------- /notebooks-4/1-text-preprocessing.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": { 6 | "slideshow": { 7 | "slide_type": "slide" 8 | } 9 | }, 10 | "source": [ 11 | "# Text Preprocessing" 12 | ] 13 | }, 14 | { 15 | "cell_type": "code", 16 | "execution_count": 1, 17 | "metadata": { 18 | "ExecuteTime": { 19 | "end_time": "2019-07-03T22:57:47.080061Z", 20 | "start_time": "2019-07-03T22:57:46.025675Z" 21 | } 22 | }, 23 | "outputs": [], 24 | "source": [ 25 | "import collections\n", 26 | "import re\n", 27 | "import random\n", 28 | "from mxnet import np, npx\n", 29 | "npx.set_np()" 30 | ] 31 | }, 32 | { 33 | "cell_type": "markdown", 34 | "metadata": { 35 | "slideshow": { 36 | "slide_type": "slide" 37 | } 38 | }, 39 | "source": [ 40 | "Read \"Time Machine\" by H. G. 
Wells as our training dataset" 41 | ] 42 | }, 43 | { 44 | "cell_type": "code", 45 | "execution_count": 2, 46 | "metadata": { 47 | "ExecuteTime": { 48 | "end_time": "2019-07-03T22:57:47.109854Z", 49 | "start_time": "2019-07-03T22:57:47.081916Z" 50 | } 51 | }, 52 | "outputs": [ 53 | { 54 | "data": { 55 | "text/plain": [ 56 | "'# sentences 3221'" 57 | ] 58 | }, 59 | "execution_count": 2, 60 | "metadata": {}, 61 | "output_type": "execute_result" 62 | } 63 | ], 64 | "source": [ 65 | "def read_time_machine():\n", 66 | " with open('../data/timemachine.txt', 'r') as f:\n", 67 | " lines = f.readlines()\n", 68 | " return [re.sub('[^A-Za-z]+', ' ', line.strip().lower()) \n", 69 | " for line in lines]\n", 70 | "\n", 71 | "lines = read_time_machine()\n", 72 | "'# sentences %d' % len(lines)" 73 | ] 74 | }, 75 | { 76 | "cell_type": "markdown", 77 | "metadata": { 78 | "slideshow": { 79 | "slide_type": "slide" 80 | } 81 | }, 82 | "source": [ 83 | "Split each sentence into a list of tokens" 84 | ] 85 | }, 86 | { 87 | "cell_type": "code", 88 | "execution_count": 3, 89 | "metadata": { 90 | "ExecuteTime": { 91 | "end_time": "2019-07-03T22:57:47.119856Z", 92 | "start_time": "2019-07-03T22:57:47.111528Z" 93 | } 94 | }, 95 | "outputs": [ 96 | { 97 | "data": { 98 | "text/plain": [ 99 | "[['the', 'time', 'machine', 'by', 'h', 'g', 'wells', ''], ['']]" 100 | ] 101 | }, 102 | "execution_count": 3, 103 | "metadata": {}, 104 | "output_type": "execute_result" 105 | } 106 | ], 107 | "source": [ 108 | "def tokenize(lines, token='word'):\n", 109 | " if token == 'word':\n", 110 | " return [line.split(' ') for line in lines]\n", 111 | " elif token == 'char':\n", 112 | " return [list(line) for line in lines]\n", 113 | " else:\n", 114 | " raise ValueError('unknown token type: ' + token)\n", 115 | "\n", 116 | "tokens = tokenize(lines)\n", 117 | "tokens[0:2]" 118 | ] 119 | }, 120 | { 121 | "cell_type": "markdown", 122 | "metadata": { 123 | "slideshow": { 124 | "slide_type": "slide" 125 | } 126 | }, 127 |
"source": [ 128 | "Build a vocabulary to map string tokens into numerical indices" 129 | ] 130 | }, 131 | { 132 | "cell_type": "code", 133 | "execution_count": 4, 134 | "metadata": { 135 | "ExecuteTime": { 136 | "end_time": "2019-07-03T22:57:47.130515Z", 137 | "start_time": "2019-07-03T22:57:47.121839Z" 138 | }, 139 | "attributes": { 140 | "classes": [], 141 | "id": "", 142 | "n": "9" 143 | } 144 | }, 145 | "outputs": [], 146 | "source": [ 147 | "class Vocab(object):\n", 148 | " def __init__(self, tokens, min_freq=0):\n", 149 | " # Sort according to frequencies\n", 150 | " counter = collections.Counter([tk for line in tokens for tk in line])\n", 151 | " self.token_freqs = sorted(counter.items(), key=lambda x: x[0])\n", 152 | " self.token_freqs.sort(key=lambda x: x[1], reverse=True)\n", 153 | " self.unk, uniq_tokens = 0, ['']\n", 154 | " uniq_tokens += [token for token, freq in self.token_freqs \n", 155 | " if freq >= min_freq and token not in uniq_tokens]\n", 156 | " self.idx_to_token, self.token_to_idx = [], dict()\n", 157 | " for token in uniq_tokens:\n", 158 | " self.idx_to_token.append(token)\n", 159 | " self.token_to_idx[token] = len(self.idx_to_token) - 1\n", 160 | " def __len__(self):\n", 161 | " return len(self.idx_to_token)\n", 162 | " def __getitem__(self, tokens):\n", 163 | " if not isinstance(tokens, (list, tuple)):\n", 164 | " return self.token_to_idx.get(tokens, self.unk)\n", 165 | " return [self.__getitem__(token) for token in tokens]\n", 166 | " def to_tokens(self, indices):\n", 167 | " if not isinstance(indices, (list, tuple)):\n", 168 | " return self.idx_to_token[indices]\n", 169 | " return [self.idx_to_token[index] for index in indices]" 170 | ] 171 | }, 172 | { 173 | "cell_type": "markdown", 174 | "metadata": { 175 | "slideshow": { 176 | "slide_type": "slide" 177 | } 178 | }, 179 | "source": [ 180 | "Print the map between a few tokens to indices" 181 | ] 182 | }, 183 | { 184 | "cell_type": "code", 185 | "execution_count": 5, 186 | "metadata": { 
187 | "ExecuteTime": { 188 | "end_time": "2019-07-03T22:57:47.147807Z", 189 | "start_time": "2019-07-03T22:57:47.131982Z" 190 | }, 191 | "attributes": { 192 | "classes": [], 193 | "id": "", 194 | "n": "23" 195 | } 196 | }, 197 | "outputs": [ 198 | { 199 | "name": "stdout", 200 | "output_type": "stream", 201 | "text": [ 202 | "[('<unk>', 0), ('the', 1), ('', 2), ('i', 3), ('and', 4), ('of', 5), ('a', 6), ('to', 7), ('was', 8), ('in', 9)]\n" 203 | ] 204 | } 205 | ], 206 | "source": [ 207 | "vocab = Vocab(tokens)\n", 208 | "print(list(vocab.token_to_idx.items())[0:10])" 209 | ] 210 | }, 211 | { 212 | "cell_type": "markdown", 213 | "metadata": { 214 | "slideshow": { 215 | "slide_type": "slide" 216 | } 217 | }, 218 | "source": [ 219 | "Now we can convert each sentence into a list of numerical indices" 220 | ] 221 | }, 222 | { 223 | "cell_type": "code", 224 | "execution_count": 6, 225 | "metadata": { 226 | "ExecuteTime": { 227 | "end_time": "2019-07-03T22:57:47.152854Z", 228 | "start_time": "2019-07-03T22:57:47.149161Z" 229 | }, 230 | "attributes": { 231 | "classes": [], 232 | "id": "", 233 | "n": "25" 234 | } 235 | }, 236 | "outputs": [ 237 | { 238 | "name": "stdout", 239 | "output_type": "stream", 240 | "text": [ 241 | "words: ['the', 'time', 'traveller', 'for', 'so', 'it', 'will', 'be', 'convenient', 'to', 'speak', 'of', 'him', '']\n", 242 | "indices: [1, 20, 72, 17, 38, 12, 120, 43, 706, 7, 660, 5, 112, 2]\n", 243 | "words: ['was', 'expounding', 'a', 'recondite', 'matter', 'to', 'us', 'his', 'grey', 'eyes', 'shone', 'and']\n", 244 | "indices: [8, 1654, 6, 3864, 634, 7, 131, 26, 344, 127, 484, 4]\n" 245 | ] 246 | } 247 | ], 248 | "source": [ 249 | "for i in range(8, 10):\n", 250 | " print('words:', tokens[i]) \n", 251 | " print('indices:', vocab[tokens[i]])" 252 | ] 253 | }, 254 | { 255 | "cell_type": "markdown", 256 | "metadata": { 257 | "slideshow": { 258 | "slide_type": "slide" 259 | } 260 | }, 261 | "source": [ 262 | "Next, load the data into mini-batches" 263 | ] 264 | },
265 | { 266 | "cell_type": "code", 267 | "execution_count": 7, 268 | "metadata": { 269 | "ExecuteTime": { 270 | "end_time": "2019-07-03T22:57:47.159683Z", 271 | "start_time": "2019-07-03T22:57:47.154168Z" 272 | } 273 | }, 274 | "outputs": [], 275 | "source": [ 276 | "def seq_data_iter_consecutive(corpus, batch_size, num_steps):\n", 277 | " # Offset for the iterator over the data for uniform starts\n", 278 | " offset = random.randint(0, num_steps)\n", 279 | " # Slice out data - ignore num_steps and just wrap around\n", 280 | " num_indices = ((len(corpus) - offset - 1) // batch_size) * batch_size\n", 281 | " Xs = np.array(corpus[offset:offset+num_indices])\n", 282 | " Ys = np.array(corpus[offset+1:offset+1+num_indices])\n", 283 | " Xs, Ys = Xs.reshape((batch_size, -1)), Ys.reshape((batch_size, -1))\n", 284 | " num_batches = Xs.shape[1] // num_steps\n", 285 | " for i in range(0, num_batches * num_steps, num_steps):\n", 286 | " X = Xs[:,i:(i+num_steps)]\n", 287 | " Y = Ys[:,i:(i+num_steps)]\n", 288 | " yield X, Y" 289 | ] 290 | }, 291 | { 292 | "cell_type": "markdown", 293 | "metadata": { 294 | "slideshow": { 295 | "slide_type": "slide" 296 | } 297 | }, 298 | "source": [ 299 | "Test on a toy example" 300 | ] 301 | }, 302 | { 303 | "cell_type": "code", 304 | "execution_count": 8, 305 | "metadata": { 306 | "ExecuteTime": { 307 | "end_time": "2019-07-03T22:57:47.176345Z", 308 | "start_time": "2019-07-03T22:57:47.161464Z" 309 | }, 310 | "scrolled": true 311 | }, 312 | "outputs": [ 313 | { 314 | "name": "stdout", 315 | "output_type": "stream", 316 | "text": [ 317 | "X =\n", 318 | "[[ 6. 7. 8. 9. 10. 11.]\n", 319 | " [17. 18. 19. 20. 21. 22.]]\n", 320 | "Y =\n", 321 | "[[ 7. 8. 9. 10. 11. 12.]\n", 322 | " [18. 19. 20. 21. 22. 
23.]]\n" 323 | ] 324 | } 325 | ], 326 | "source": [ 327 | "my_seq = list(range(30))\n", 328 | "for X, Y in seq_data_iter_consecutive(my_seq, batch_size=2, num_steps=6):\n", 329 | " print('X =\\n%s\\nY =\\n%s' %(X, Y))" 330 | ] 331 | } 332 | ], 333 | "metadata": { 334 | "celltoolbar": "Slideshow", 335 | "kernelspec": { 336 | "display_name": "Python 3", 337 | "language": "python", 338 | "name": "python3" 339 | }, 340 | "language_info": { 341 | "codemirror_mode": { 342 | "name": "ipython", 343 | "version": 3 344 | }, 345 | "file_extension": ".py", 346 | "mimetype": "text/x-python", 347 | "name": "python", 348 | "nbconvert_exporter": "python", 349 | "pygments_lexer": "ipython3", 350 | "version": "3.7.1" 351 | }, 352 | "toc": { 353 | "base_numbering": 1, 354 | "nav_menu": {}, 355 | "number_sections": true, 356 | "sideBar": true, 357 | "skip_h1_title": false, 358 | "title_cell": "Table of Contents", 359 | "title_sidebar": "Contents", 360 | "toc_cell": false, 361 | "toc_position": {}, 362 | "toc_section_display": true, 363 | "toc_window_display": false 364 | } 365 | }, 366 | "nbformat": 4, 367 | "nbformat_minor": 2 368 | } 369 | -------------------------------------------------------------------------------- /notebooks-4/3-rnn-gluon.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": { 6 | "slideshow": { 7 | "slide_type": "slide" 8 | } 9 | }, 10 | "source": [ 11 | "# Concise Implementation of Recurrent Neural Networks" 12 | ] 13 | }, 14 | { 15 | "cell_type": "code", 16 | "execution_count": 1, 17 | "metadata": { 18 | "ExecuteTime": { 19 | "end_time": "2019-07-03T23:01:16.582933Z", 20 | "start_time": "2019-07-03T23:01:13.502104Z" 21 | }, 22 | "attributes": { 23 | "classes": [], 24 | "id": "", 25 | "n": "1" 26 | } 27 | }, 28 | "outputs": [], 29 | "source": [ 30 | "import d2l\n", 31 | "import math\n", 32 | "from mxnet import gluon, init, np, npx\n", 33 | "from 
mxnet.gluon import nn, rnn\n", 34 | "npx.set_np()\n", 35 | "\n", 36 | "batch_size, num_steps = 32, 35\n", 37 | "train_iter, vocab = d2l.load_data_time_machine(batch_size, num_steps)" 38 | ] 39 | }, 40 | { 41 | "cell_type": "markdown", 42 | "metadata": { 43 | "slideshow": { 44 | "slide_type": "slide" 45 | } 46 | }, 47 | "source": [ 48 | "Creating an RNN layer with 256 hidden units." 49 | ] 50 | }, 51 | { 52 | "cell_type": "code", 53 | "execution_count": 2, 54 | "metadata": { 55 | "ExecuteTime": { 56 | "end_time": "2019-07-03T23:01:16.591409Z", 57 | "start_time": "2019-07-03T23:01:16.585714Z" 58 | }, 59 | "attributes": { 60 | "classes": [], 61 | "id": "", 62 | "n": "26" 63 | } 64 | }, 65 | "outputs": [], 66 | "source": [ 67 | "rnn_layer = rnn.RNN(256)\n", 68 | "rnn_layer.initialize()" 69 | ] 70 | }, 71 | { 72 | "cell_type": "markdown", 73 | "metadata": { 74 | "slideshow": { 75 | "slide_type": "slide" 76 | } 77 | }, 78 | "source": [ 79 | "Initializing the hidden state." 80 | ] 81 | }, 82 | { 83 | "cell_type": "code", 84 | "execution_count": 3, 85 | "metadata": { 86 | "ExecuteTime": { 87 | "end_time": "2019-07-03T23:01:16.599071Z", 88 | "start_time": "2019-07-03T23:01:16.593543Z" 89 | }, 90 | "attributes": { 91 | "classes": [], 92 | "id": "", 93 | "n": "37" 94 | } 95 | }, 96 | "outputs": [ 97 | { 98 | "data": { 99 | "text/plain": [ 100 | "(1, (1, 1, 256))" 101 | ] 102 | }, 103 | "execution_count": 3, 104 | "metadata": {}, 105 | "output_type": "execute_result" 106 | } 107 | ], 108 | "source": [ 109 | "state = rnn_layer.begin_state(batch_size=1)\n", 110 | "len(state), state[0].shape" 111 | ] 112 | }, 113 | { 114 | "cell_type": "markdown", 115 | "metadata": { 116 | "slideshow": { 117 | "slide_type": "slide" 118 | } 119 | }, 120 | "source": [ 121 | "Defining a class to wrap the RNN layers" 122 | ] 123 | }, 124 | { 125 | "cell_type": "code", 126 | "execution_count": 4, 127 | "metadata": { 128 | "ExecuteTime": { 129 | "end_time": "2019-07-03T23:01:16.611592Z", 130 |
"start_time": "2019-07-03T23:01:16.601094Z" 131 | }, 132 | "attributes": { 133 | "classes": [], 134 | "id": "", 135 | "n": "39" 136 | } 137 | }, 138 | "outputs": [], 139 | "source": [ 140 | "class RNNModel(nn.Block):\n", 141 | " def __init__(self, rnn_layer, vocab_size, **kwargs):\n", 142 | " super(RNNModel, self).__init__(**kwargs)\n", 143 | " self.rnn = rnn_layer\n", 144 | " self.vocab_size = vocab_size\n", 145 | " self.dense = nn.Dense(vocab_size)\n", 146 | "\n", 147 | " def forward(self, inputs, state):\n", 148 | " X = npx.one_hot(inputs.T, self.vocab_size)\n", 149 | " Y, state = self.rnn(X, state)\n", 150 | " # The fully connected layer will first change the shape of Y to\n", 151 | " # (num_steps * batch_size, num_hiddens)\n", 152 | " # Its output shape is (num_steps * batch_size, vocab_size)\n", 153 | " output = self.dense(Y.reshape((-1, Y.shape[-1])))\n", 154 | " return output, state\n", 155 | "\n", 156 | " def begin_state(self, *args, **kwargs):\n", 157 | " return self.rnn.begin_state(*args, **kwargs)" 158 | ] 159 | }, 160 | { 161 | "cell_type": "markdown", 162 | "metadata": { 163 | "slideshow": { 164 | "slide_type": "slide" 165 | } 166 | }, 167 | "source": [ 168 | "Training" 169 | ] 170 | }, 171 | { 172 | "cell_type": "code", 173 | "execution_count": 5, 174 | "metadata": { 175 | "ExecuteTime": { 176 | "end_time": "2019-07-03T23:02:13.528517Z", 177 | "start_time": "2019-07-03T23:01:16.616773Z" 178 | }, 179 | "attributes": { 180 | "classes": [], 181 | "id": "", 182 | "n": "42" 183 | }, 184 | "scrolled": true 185 | }, 186 | "outputs": [ 187 | { 188 | "name": "stdout", 189 | "output_type": "stream", 190 | "text": [ 191 | "Perplexity 1.2, 158013 tokens/sec on gpu(0)\n", 192 | "time traveller you can show black is white by argument said fil\n", 193 | "traveller after the pauserequired for the little go the geo\n" 194 | ] 195 | }, 196 | { 197 | "data": { 198 | "image/svg+xml": [ 199 | "\n", 200 | "\n", 202 | "\n", 203 | "\n", 204 | " \n", 205 | " \n", 208 | " 
\n", 209 | " \n", 210 | " \n", 211 | " \n", 217 | " \n", 218 | " \n", 219 | " \n", 220 | " \n", 226 | " \n", 227 | " \n", 228 | " \n", 229 | " \n", 230 | " \n", 233 | " \n", 234 | " \n", 235 | " \n", 236 | " \n", 239 | " \n", 240 | " \n", 241 | " \n", 242 | " \n", 243 | " \n", 244 | " \n", 245 | " \n", 246 | " \n", 247 | " \n", 268 | " \n", 269 | " \n", 270 | " \n", 271 | " \n", 272 | " \n", 273 | " \n", 274 | " \n", 275 | " \n", 276 | " \n", 279 | " \n", 280 | " \n", 281 | " \n", 282 | " \n", 283 | " \n", 284 | " \n", 285 | " \n", 286 | " \n", 287 | " \n", 288 | " \n", 301 | " \n", 302 | " \n", 303 | " \n", 304 | " \n", 305 | " \n", 306 | " \n", 307 | " \n", 308 | " \n", 309 | " \n", 310 | " \n", 311 | " \n", 314 | " \n", 315 | " \n", 316 | " \n", 317 | " \n", 318 | " \n", 319 | " \n", 320 | " \n", 321 | " \n", 322 | " \n", 323 | " \n", 347 | " \n", 348 | " \n", 349 | " \n", 350 | " \n", 351 | " \n", 352 | " \n", 353 | " \n", 354 | " \n", 355 | " \n", 356 | " \n", 357 | " \n", 360 | " \n", 361 | " \n", 362 | " \n", 363 | " \n", 364 | " \n", 365 | " \n", 366 | " \n", 367 | " \n", 368 | " \n", 369 | " \n", 401 | " \n", 402 | " \n", 403 | " \n", 404 | " \n", 405 | " \n", 406 | " \n", 407 | " \n", 408 | " \n", 409 | " \n", 410 | " \n", 411 | " \n", 414 | " \n", 415 | " \n", 416 | " \n", 417 | " \n", 418 | " \n", 419 | " \n", 420 | " \n", 421 | " \n", 422 | " \n", 423 | " \n", 440 | " \n", 441 | " \n", 442 | " \n", 443 | " \n", 444 | " \n", 445 | " \n", 446 | " \n", 447 | " \n", 448 | " \n", 449 | " \n", 450 | " \n", 453 | " \n", 454 | " \n", 455 | " \n", 456 | " \n", 457 | " \n", 458 | " \n", 459 | " \n", 460 | " \n", 461 | " \n", 462 | " \n", 486 | " \n", 487 | " \n", 488 | " \n", 489 | " \n", 490 | " \n", 491 | " \n", 492 | " \n", 493 | " \n", 494 | " \n", 495 | " \n", 496 | " \n", 497 | " \n", 521 | " \n", 547 | " \n", 568 | " \n", 589 | " \n", 608 | " \n", 609 | " \n", 610 | " \n", 611 | " \n", 612 | " \n", 613 | " \n", 614 | " \n", 615 | " \n", 616 | " \n", 617 | 
" \n", 618 | " \n", 619 | " \n", 620 | " \n", 621 | " \n", 624 | " \n", 625 | " \n", 626 | " \n", 627 | " \n", 630 | " \n", 631 | " \n", 632 | " \n", 633 | " \n", 634 | " \n", 635 | " \n", 636 | " \n", 637 | " \n", 638 | " \n", 639 | " \n", 640 | " \n", 641 | " \n", 642 | " \n", 643 | " \n", 644 | " \n", 647 | " \n", 648 | " \n", 649 | " \n", 650 | " \n", 651 | " \n", 652 | " \n", 653 | " \n", 654 | " \n", 655 | " \n", 656 | " \n", 657 | " \n", 658 | " \n", 659 | " \n", 660 | " \n", 661 | " \n", 662 | " \n", 665 | " \n", 666 | " \n", 667 | " \n", 668 | " \n", 669 | " \n", 670 | " \n", 671 | " \n", 672 | " \n", 673 | " \n", 674 | " \n", 675 | " \n", 676 | " \n", 677 | " \n", 678 | " \n", 679 | " \n", 680 | " \n", 681 | " \n", 684 | " \n", 685 | " \n", 686 | " \n", 687 | " \n", 688 | " \n", 689 | " \n", 690 | " \n", 691 | " \n", 692 | " \n", 693 | " \n", 694 | " \n", 695 | " \n", 696 | " \n", 697 | " \n", 698 | " \n", 699 | " \n", 700 | " \n", 703 | " \n", 704 | " \n", 705 | " \n", 706 | " \n", 707 | " \n", 708 | " \n", 709 | " \n", 710 | " \n", 711 | " \n", 712 | " \n", 713 | " \n", 714 | " \n", 715 | " \n", 716 | " \n", 717 | " \n", 718 | " \n", 719 | " \n", 722 | " \n", 723 | " \n", 724 | " \n", 725 | " \n", 726 | " \n", 727 | " \n", 728 | " \n", 729 | " \n", 730 | " \n", 731 | " \n", 732 | " \n", 733 | " \n", 734 | " \n", 735 | " \n", 736 | " \n", 737 | " \n", 738 | " \n", 739 | " \n", 755 | " \n", 761 | " \n", 775 | " \n", 786 | " \n", 806 | " \n", 822 | " \n", 823 | " \n", 824 | " \n", 825 | " \n", 826 | " \n", 827 | " \n", 828 | " \n", 829 | " \n", 830 | " \n", 831 | " \n", 832 | " \n", 833 | " \n", 834 | " \n", 835 | " \n", 836 | " \n", 837 | " \n", 838 | " \n", 889 | " \n", 890 | " \n", 891 | " \n", 894 | " \n", 895 | " \n", 896 | " \n", 899 | " \n", 900 | " \n", 901 | " \n", 904 | " \n", 905 | " \n", 906 | " \n", 909 | " \n", 910 | " \n", 911 | " \n", 912 | " \n", 923 | " \n", 924 | " \n", 925 | " \n", 928 | " \n", 929 | " \n", 930 | " \n", 931 | " \n", 932 
| " \n", 933 | " \n", 965 | " \n", 984 | " \n", 985 | " \n", 986 | " \n", 987 | " \n", 988 | " \n", 989 | " \n", 990 | " \n", 991 | " \n", 992 | " \n", 993 | " \n", 994 | " \n", 995 | " \n", 996 | " \n", 997 | " \n", 998 | " \n", 999 | " \n", 1000 | " \n", 1001 | "\n" 1002 | ], 1003 | "text/plain": [ 1004 | "
" 1005 | ] 1006 | }, 1007 | "metadata": { 1008 | "needs_background": "light" 1009 | }, 1010 | "output_type": "display_data" 1011 | } 1012 | ], 1013 | "source": [ 1014 | "num_epochs, lr, ctx = 500, 1, d2l.try_gpu()\n", 1015 | "model = RNNModel(rnn_layer, len(vocab))\n", 1016 | "model.initialize(force_reinit=True, ctx=ctx)\n", 1017 | "d2l.train_ch8(model, train_iter, vocab, lr, num_epochs, ctx)" 1018 | ] 1019 | } 1020 | ], 1021 | "metadata": { 1022 | "celltoolbar": "Slideshow", 1023 | "kernelspec": { 1024 | "display_name": "Python 3", 1025 | "language": "python", 1026 | "name": "python3" 1027 | }, 1028 | "language_info": { 1029 | "codemirror_mode": { 1030 | "name": "ipython", 1031 | "version": 3 1032 | }, 1033 | "file_extension": ".py", 1034 | "mimetype": "text/x-python", 1035 | "name": "python", 1036 | "nbconvert_exporter": "python", 1037 | "pygments_lexer": "ipython3", 1038 | "version": "3.7.1" 1039 | }, 1040 | "toc": { 1041 | "base_numbering": 1, 1042 | "nav_menu": {}, 1043 | "number_sections": true, 1044 | "sideBar": true, 1045 | "skip_h1_title": false, 1046 | "title_cell": "Table of Contents", 1047 | "title_sidebar": "Contents", 1048 | "toc_cell": false, 1049 | "toc_position": {}, 1050 | "toc_section_display": true, 1051 | "toc_window_display": false 1052 | } 1053 | }, 1054 | "nbformat": 4, 1055 | "nbformat_minor": 2 1056 | } 1057 | -------------------------------------------------------------------------------- /run_ipynb.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | set -e 3 | 4 | if [ $# -eq 0 ]; then 5 | echo "Usage: bash $0 NOTEBOOKS" 6 | echo "E.g., bash run_notebooks.sh */*.ipynb" 7 | echo "Execute all the notebooks and save outputs (assuming with Python 3)." 
8 | exit 1 9 | fi 10 | 11 | echo "Start to evaluate $*" 12 | 13 | for f in "$@"; do 14 | echo "=== Executing $f" 15 | jupyter nbconvert --execute --ExecutePreprocessor.kernel_name=python3 --to notebook --ExecutePreprocessor.timeout=1200 --inplace "$f" 16 | done 17 | --------------------------------------------------------------------------------
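For readers who would rather drive the same loop from Python, here is a minimal sketch that mirrors `run_ipynb.sh` above. It is not part of the repository: the names `nbconvert_command` and `run_notebooks` and the `dry_run` parameter are assumptions made for illustration. The `jupyter nbconvert` flags are copied from the script.

```python
import subprocess


def nbconvert_command(path, kernel="python3", timeout=1200):
    """Build the argument list that run_ipynb.sh passes to jupyter nbconvert.

    Hypothetical helper; the flags are the ones used in the shell script.
    """
    return [
        "jupyter", "nbconvert", "--execute",
        "--ExecutePreprocessor.kernel_name=%s" % kernel,
        "--to", "notebook",
        "--ExecutePreprocessor.timeout=%d" % timeout,
        "--inplace", path,
    ]


def run_notebooks(paths, dry_run=False):
    """Execute each notebook in place and return the commands that were built.

    With dry_run=True nothing is executed, so the arguments can be
    inspected on a machine without Jupyter installed.
    """
    commands = []
    for path in paths:
        cmd = nbconvert_command(path)
        commands.append(cmd)
        print("=== Executing %s" % path)
        if not dry_run:
            # check=True stops at the first failing notebook, like `set -e`.
            subprocess.run(cmd, check=True)
    return commands
```

For example, `run_notebooks(["notebooks-4/2-rnn-scratch.ipynb"], dry_run=True)` prints the progress line and returns the command without running it. Passing the arguments as a list rather than a shell string means a notebook path containing spaces reaches `jupyter` as a single argument.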