├── .gitignore
├── 01_the_machine_learning_landscape.ipynb
├── 02_end_to_end_machine_learning_project.ipynb
├── 03_classification.ipynb
├── 04_training_linear_models.ipynb
├── 05_support_vector_machines.ipynb
├── 06_decision_trees.ipynb
├── 07_ensemble_learning_and_random_forests.ipynb
├── 08_dimensionality_reduction.ipynb
├── 09_up_and_running_with_tensorflow.ipynb
├── 10_introduction_to_artificial_neural_networks.ipynb
├── 11_deep_learning.ipynb
├── 12_distributed_tensorflow.ipynb
├── 13_convolutional_neural_networks.ipynb
├── 14_recurrent_neural_networks.ipynb
├── 15_autoencoders.ipynb
├── 16_reinforcement_learning.ipynb
├── INSTALL.md
├── LICENSE
├── README.md
├── apt.txt
├── book_equations.ipynb
├── datasets
│   ├── housing
│   │   ├── README.md
│   │   ├── housing.csv
│   │   └── housing.tgz
│   ├── inception
│   │   └── imagenet_class_names.txt
│   └── lifesat
│       ├── README.md
│       ├── gdp_per_capita.csv
│       └── oecd_bli_2015.csv
├── docker
│   ├── .env
│   ├── Dockerfile
│   ├── Makefile
│   ├── README.md
│   ├── bashrc.bash
│   ├── bin
│   │   ├── nbclean_checkpoints
│   │   ├── nbdiff_checkpoint
│   │   ├── rm_empty_subdirs
│   │   └── tensorboard
│   ├── docker-compose.yml
│   └── jupyter_notebook_config.py
├── environment.yml
├── extra_autodiff.ipynb
├── extra_capsnets-cn.ipynb
├── extra_capsnets.ipynb
├── extra_gradient_descent_comparison.ipynb
├── extra_tensorflow_reproducibility.ipynb
├── future_encoders.py
├── images
│   ├── ann
│   │   └── README
│   ├── autoencoders
│   │   └── README
│   ├── classification
│   │   └── README
│   ├── cnn
│   │   ├── README
│   │   └── test_image.png
│   ├── decision_trees
│   │   └── README
│   ├── deep
│   │   └── README
│   ├── distributed
│   │   └── README
│   ├── end_to_end_project
│   │   ├── README
│   │   └── california.png
│   ├── ensembles
│   │   └── README
│   ├── fundamentals
│   │   └── README
│   ├── rl
│   │   └── README
│   ├── rnn
│   │   └── README
│   ├── svm
│   │   └── README
│   ├── tensorflow
│   │   └── README
│   ├── training_linear_models
│   │   └── README
│   └── unsupervised_learning
│       ├── README
│       └── ladybug.png
├── index.ipynb
├── math_differential_calculus.ipynb
├── math_linear_algebra.ipynb
├── ml-project-checklist.md
├── requirements.txt
├── tools_matplotlib.ipynb
├── tools_numpy.ipynb
└── tools_pandas.ipynb
/.gitignore:
--------------------------------------------------------------------------------
1 | *.bak
2 | *.bak.*
3 | *.ckpt
4 | *.old
5 | *.pyc
6 | .DS_Store
7 | .ipynb_checkpoints
8 | .vscode/
9 | checkpoint
10 | logs/*
11 | tf_logs/*
12 | images/**/*.png
13 | images/**/*.dot
14 | my_*
15 | person.proto
16 | person.desc
17 | person_pb2.py
18 | datasets/flowers
19 | datasets/lifesat/lifesat.csv
20 | datasets/spam
21 | datasets/titanic
22 | datasets/words
23 |
24 |
--------------------------------------------------------------------------------
/12_distributed_tensorflow.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "markdown",
5 | "metadata": {},
6 | "source": [
7 | "**Chapter 12 – Distributed TensorFlow**"
8 | ]
9 | },
10 | {
11 | "cell_type": "markdown",
12 | "metadata": {},
13 | "source": [
14 | "_This notebook contains all the sample code and solutions to the exercises in chapter 12._\n",
15 | "\n",
16 | "
"
21 | ]
22 | },
23 | {
24 | "cell_type": "markdown",
25 | "metadata": {},
26 | "source": [
27 | "**Warning**: this is the code for the 1st edition of the book. Please visit https://github.com/ageron/handson-ml2 for the 2nd edition code, with up-to-date notebooks using the latest library versions. In particular, the 1st edition is based on TensorFlow 1, while the 2nd edition uses TensorFlow 2, which is much simpler to use."
28 | ]
29 | },
30 | {
31 | "cell_type": "markdown",
32 | "metadata": {},
33 | "source": [
34 | "# Setup"
35 | ]
36 | },
37 | {
38 | "cell_type": "markdown",
39 | "metadata": {},
40 | "source": [
41 | "First, let's make sure this notebook works well in both python 2 and 3, import a few common modules, ensure MatplotLib plots figures inline and prepare a function to save the figures:"
42 | ]
43 | },
44 | {
45 | "cell_type": "code",
46 | "execution_count": 1,
47 | "metadata": {},
48 | "outputs": [],
49 | "source": [
50 | "# To support both python 2 and python 3\n",
51 | "from __future__ import division, print_function, unicode_literals\n",
52 | "\n",
53 | "# Common imports\n",
54 | "import numpy as np\n",
55 | "import os\n",
56 | "\n",
57 | "try:\n",
58 | " # %tensorflow_version only exists in Colab.\n",
59 | " %tensorflow_version 1.x\n",
60 | "except Exception:\n",
61 | " pass\n",
62 | "\n",
63 | "# to make this notebook's output stable across runs\n",
64 | "def reset_graph(seed=42):\n",
65 | " tf.reset_default_graph()\n",
66 | " tf.set_random_seed(seed)\n",
67 | " np.random.seed(seed)\n",
68 | "\n",
69 | "# To plot pretty figures\n",
70 | "%matplotlib inline\n",
71 | "import matplotlib\n",
72 | "import matplotlib.pyplot as plt\n",
73 | "plt.rcParams['axes.labelsize'] = 14\n",
74 | "plt.rcParams['xtick.labelsize'] = 12\n",
75 | "plt.rcParams['ytick.labelsize'] = 12\n",
76 | "\n",
77 | "# Where to save the figures\n",
78 | "PROJECT_ROOT_DIR = \".\"\n",
79 | "CHAPTER_ID = \"distributed\"\n",
80 | "IMAGES_PATH = os.path.join(PROJECT_ROOT_DIR, \"images\", CHAPTER_ID)\n",
81 | "os.makedirs(IMAGES_PATH, exist_ok=True)\n",
82 | "\n",
83 | "def save_fig(fig_id, tight_layout=True, fig_extension=\"png\", resolution=300):\n",
84 | " path = os.path.join(IMAGES_PATH, fig_id + \".\" + fig_extension)\n",
85 | " print(\"Saving figure\", fig_id)\n",
86 | " if tight_layout:\n",
87 | " plt.tight_layout()\n",
88 | " plt.savefig(path, format=fig_extension, dpi=resolution)"
89 | ]
90 | },
91 | {
92 | "cell_type": "markdown",
93 | "metadata": {},
94 | "source": [
95 | "# Local server"
96 | ]
97 | },
98 | {
99 | "cell_type": "code",
100 | "execution_count": 2,
101 | "metadata": {},
102 | "outputs": [],
103 | "source": [
104 | "import tensorflow as tf"
105 | ]
106 | },
107 | {
108 | "cell_type": "code",
109 | "execution_count": 3,
110 | "metadata": {},
111 | "outputs": [],
112 | "source": [
113 | "c = tf.constant(\"Hello distributed TensorFlow!\")\n",
114 | "server = tf.train.Server.create_local_server()"
115 | ]
116 | },
117 | {
118 | "cell_type": "code",
119 | "execution_count": 4,
120 | "metadata": {},
121 | "outputs": [
122 | {
123 | "name": "stdout",
124 | "output_type": "stream",
125 | "text": [
126 | "b'Hello distributed TensorFlow!'\n"
127 | ]
128 | }
129 | ],
130 | "source": [
131 | "with tf.Session(server.target) as sess:\n",
132 | " print(sess.run(c))"
133 | ]
134 | },
135 | {
136 | "cell_type": "markdown",
137 | "metadata": {},
138 | "source": [
139 | "# Cluster"
140 | ]
141 | },
142 | {
143 | "cell_type": "code",
144 | "execution_count": 5,
145 | "metadata": {},
146 | "outputs": [],
147 | "source": [
148 | "cluster_spec = tf.train.ClusterSpec({\n",
149 | " \"ps\": [\n",
150 | " \"127.0.0.1:2221\", # /job:ps/task:0\n",
151 | " \"127.0.0.1:2222\", # /job:ps/task:1\n",
152 | " ],\n",
153 | " \"worker\": [\n",
154 | " \"127.0.0.1:2223\", # /job:worker/task:0\n",
155 | " \"127.0.0.1:2224\", # /job:worker/task:1\n",
156 | " \"127.0.0.1:2225\", # /job:worker/task:2\n",
157 | " ]})"
158 | ]
159 | },
160 | {
161 | "cell_type": "code",
162 | "execution_count": 6,
163 | "metadata": {},
164 | "outputs": [],
165 | "source": [
166 | "task_ps0 = tf.train.Server(cluster_spec, job_name=\"ps\", task_index=0)\n",
167 | "task_ps1 = tf.train.Server(cluster_spec, job_name=\"ps\", task_index=1)\n",
168 | "task_worker0 = tf.train.Server(cluster_spec, job_name=\"worker\", task_index=0)\n",
169 | "task_worker1 = tf.train.Server(cluster_spec, job_name=\"worker\", task_index=1)\n",
170 | "task_worker2 = tf.train.Server(cluster_spec, job_name=\"worker\", task_index=2)"
171 | ]
172 | },
173 | {
174 | "cell_type": "markdown",
175 | "metadata": {},
176 | "source": [
177 | "# Pinning operations across devices and servers"
178 | ]
179 | },
180 | {
181 | "cell_type": "code",
182 | "execution_count": 7,
183 | "metadata": {},
184 | "outputs": [],
185 | "source": [
186 | "reset_graph()\n",
187 | "\n",
188 | "with tf.device(\"/job:ps\"):\n",
189 | " a = tf.Variable(1.0, name=\"a\")\n",
190 | "\n",
191 | "with tf.device(\"/job:worker\"):\n",
192 | " b = a + 2\n",
193 | "\n",
194 | "with tf.device(\"/job:worker/task:1\"):\n",
195 | " c = a + b"
196 | ]
197 | },
198 | {
199 | "cell_type": "code",
200 | "execution_count": 8,
201 | "metadata": {},
202 | "outputs": [
203 | {
204 | "name": "stdout",
205 | "output_type": "stream",
206 | "text": [
207 | "4.0\n"
208 | ]
209 | }
210 | ],
211 | "source": [
212 | "with tf.Session(\"grpc://127.0.0.1:2221\") as sess:\n",
213 | " sess.run(a.initializer)\n",
214 | " print(c.eval())"
215 | ]
216 | },
217 | {
218 | "cell_type": "code",
219 | "execution_count": 9,
220 | "metadata": {},
221 | "outputs": [],
222 | "source": [
223 | "reset_graph()\n",
224 | "\n",
225 | "with tf.device(tf.train.replica_device_setter(\n",
226 | " ps_tasks=2,\n",
227 | " ps_device=\"/job:ps\",\n",
228 | " worker_device=\"/job:worker\")):\n",
229 | " v1 = tf.Variable(1.0, name=\"v1\") # pinned to /job:ps/task:0 (defaults to /cpu:0)\n",
230 | " v2 = tf.Variable(2.0, name=\"v2\") # pinned to /job:ps/task:1 (defaults to /cpu:0)\n",
231 | " v3 = tf.Variable(3.0, name=\"v3\") # pinned to /job:ps/task:0 (defaults to /cpu:0)\n",
232 | " s = v1 + v2 # pinned to /job:worker (defaults to task:0/cpu:0)\n",
233 | " with tf.device(\"/task:1\"):\n",
234 | " p1 = 2 * s # pinned to /job:worker/task:1 (defaults to /cpu:0)\n",
235 | " with tf.device(\"/cpu:0\"):\n",
236 | " p2 = 3 * s # pinned to /job:worker/task:1/cpu:0\n",
237 | "\n",
238 | "config = tf.ConfigProto()\n",
239 | "config.log_device_placement = True\n",
240 | "\n",
241 | "with tf.Session(\"grpc://127.0.0.1:2221\", config=config) as sess:\n",
242 | " v1.initializer.run()"
243 | ]
244 | },
245 | {
246 | "cell_type": "markdown",
247 | "metadata": {},
248 | "source": [
249 | "# Readers – the old way"
250 | ]
251 | },
252 | {
253 | "cell_type": "code",
254 | "execution_count": 10,
255 | "metadata": {},
256 | "outputs": [],
257 | "source": [
258 | "reset_graph()"
259 | ]
260 | },
261 | {
262 | "cell_type": "code",
263 | "execution_count": 11,
264 | "metadata": {},
265 | "outputs": [
266 | {
267 | "name": "stdout",
268 | "output_type": "stream",
269 | "text": [
270 | "[1.0, 6, 44]\n"
271 | ]
272 | }
273 | ],
274 | "source": [
275 | "default1 = tf.constant([5.])\n",
276 | "default2 = tf.constant([6])\n",
277 | "default3 = tf.constant([7])\n",
278 | "dec = tf.decode_csv(tf.constant(\"1.,,44\"),\n",
279 | " record_defaults=[default1, default2, default3])\n",
280 | "with tf.Session() as sess:\n",
281 | " print(sess.run(dec))"
282 | ]
283 | },
284 | {
285 | "cell_type": "code",
286 | "execution_count": 12,
287 | "metadata": {},
288 | "outputs": [
289 | {
290 | "name": "stdout",
291 | "output_type": "stream",
292 | "text": [
293 | "No more files to read\n",
294 | "[array([[ 4., 5.],\n",
295 | " [ 1., -1.]], dtype=float32), array([1, 0], dtype=int32)]\n",
296 | "[array([[7., 8.]], dtype=float32), array([0], dtype=int32)]\n",
297 | "No more training instances\n"
298 | ]
299 | }
300 | ],
301 | "source": [
302 | "reset_graph()\n",
303 | "\n",
304 | "test_csv = open(\"my_test.csv\", \"w\")\n",
305 | "test_csv.write(\"x1, x2 , target\\n\")\n",
306 | "test_csv.write(\"1.,, 0\\n\")\n",
307 | "test_csv.write(\"4., 5. , 1\\n\")\n",
308 | "test_csv.write(\"7., 8. , 0\\n\")\n",
309 | "test_csv.close()\n",
310 | "\n",
311 | "filename_queue = tf.FIFOQueue(capacity=10, dtypes=[tf.string], shapes=[()])\n",
312 | "filename = tf.placeholder(tf.string)\n",
313 | "enqueue_filename = filename_queue.enqueue([filename])\n",
314 | "close_filename_queue = filename_queue.close()\n",
315 | "\n",
316 | "reader = tf.TextLineReader(skip_header_lines=1)\n",
317 | "key, value = reader.read(filename_queue)\n",
318 | "\n",
319 | "x1, x2, target = tf.decode_csv(value, record_defaults=[[-1.], [-1.], [-1]])\n",
320 | "features = tf.stack([x1, x2])\n",
321 | "\n",
322 | "instance_queue = tf.RandomShuffleQueue(\n",
323 | " capacity=10, min_after_dequeue=2,\n",
324 | " dtypes=[tf.float32, tf.int32], shapes=[[2],[]],\n",
325 | " name=\"instance_q\", shared_name=\"shared_instance_q\")\n",
326 | "enqueue_instance = instance_queue.enqueue([features, target])\n",
327 | "close_instance_queue = instance_queue.close()\n",
328 | "\n",
329 | "minibatch_instances, minibatch_targets = instance_queue.dequeue_up_to(2)\n",
330 | "\n",
331 | "with tf.Session() as sess:\n",
332 | " sess.run(enqueue_filename, feed_dict={filename: \"my_test.csv\"})\n",
333 | " sess.run(close_filename_queue)\n",
334 | " try:\n",
335 | " while True:\n",
336 | " sess.run(enqueue_instance)\n",
337 | " except tf.errors.OutOfRangeError as ex:\n",
338 | " print(\"No more files to read\")\n",
339 | " sess.run(close_instance_queue)\n",
340 | " try:\n",
341 | " while True:\n",
342 | " print(sess.run([minibatch_instances, minibatch_targets]))\n",
343 | " except tf.errors.OutOfRangeError as ex:\n",
344 | " print(\"No more training instances\")"
345 | ]
346 | },
347 | {
348 | "cell_type": "code",
349 | "execution_count": 13,
350 | "metadata": {},
351 | "outputs": [],
352 | "source": [
353 | "#coord = tf.train.Coordinator()\n",
354 | "#threads = tf.train.start_queue_runners(coord=coord)\n",
355 | "#filename_queue = tf.train.string_input_producer([\"test.csv\"])\n",
356 | "#coord.request_stop()\n",
357 | "#coord.join(threads)"
358 | ]
359 | },
360 | {
361 | "cell_type": "markdown",
362 | "metadata": {},
363 | "source": [
364 | "# Queue runners and coordinators"
365 | ]
366 | },
367 | {
368 | "cell_type": "code",
369 | "execution_count": 14,
370 | "metadata": {},
371 | "outputs": [
372 | {
373 | "name": "stdout",
374 | "output_type": "stream",
375 | "text": [
376 | "[array([[ 7., 8.],\n",
377 | " [ 1., -1.]], dtype=float32), array([0, 0], dtype=int32)]\n",
378 | "[array([[4., 5.]], dtype=float32), array([1], dtype=int32)]\n",
379 | "No more training instances\n"
380 | ]
381 | }
382 | ],
383 | "source": [
384 | "reset_graph()\n",
385 | "\n",
386 | "filename_queue = tf.FIFOQueue(capacity=10, dtypes=[tf.string], shapes=[()])\n",
387 | "filename = tf.placeholder(tf.string)\n",
388 | "enqueue_filename = filename_queue.enqueue([filename])\n",
389 | "close_filename_queue = filename_queue.close()\n",
390 | "\n",
391 | "reader = tf.TextLineReader(skip_header_lines=1)\n",
392 | "key, value = reader.read(filename_queue)\n",
393 | "\n",
394 | "x1, x2, target = tf.decode_csv(value, record_defaults=[[-1.], [-1.], [-1]])\n",
395 | "features = tf.stack([x1, x2])\n",
396 | "\n",
397 | "instance_queue = tf.RandomShuffleQueue(\n",
398 | " capacity=10, min_after_dequeue=2,\n",
399 | " dtypes=[tf.float32, tf.int32], shapes=[[2],[]],\n",
400 | " name=\"instance_q\", shared_name=\"shared_instance_q\")\n",
401 | "enqueue_instance = instance_queue.enqueue([features, target])\n",
402 | "close_instance_queue = instance_queue.close()\n",
403 | "\n",
404 | "minibatch_instances, minibatch_targets = instance_queue.dequeue_up_to(2)\n",
405 | "\n",
406 | "n_threads = 5\n",
407 | "queue_runner = tf.train.QueueRunner(instance_queue, [enqueue_instance] * n_threads)\n",
408 | "coord = tf.train.Coordinator()\n",
409 | "\n",
410 | "with tf.Session() as sess:\n",
411 | " sess.run(enqueue_filename, feed_dict={filename: \"my_test.csv\"})\n",
412 | " sess.run(close_filename_queue)\n",
413 | " enqueue_threads = queue_runner.create_threads(sess, coord=coord, start=True)\n",
414 | " try:\n",
415 | " while True:\n",
416 | " print(sess.run([minibatch_instances, minibatch_targets]))\n",
417 | " except tf.errors.OutOfRangeError as ex:\n",
418 | " print(\"No more training instances\")"
419 | ]
420 | },
421 | {
422 | "cell_type": "code",
423 | "execution_count": 15,
424 | "metadata": {},
425 | "outputs": [
426 | {
427 | "name": "stdout",
428 | "output_type": "stream",
429 | "text": [
430 | "[array([[ 4., 5.],\n",
431 | " [ 1., -1.]], dtype=float32), array([1, 0], dtype=int32)]\n",
432 | "[array([[7., 8.]], dtype=float32), array([0], dtype=int32)]\n",
433 | "No more training instances\n"
434 | ]
435 | }
436 | ],
437 | "source": [
438 | "reset_graph()\n",
439 | "\n",
440 | "def read_and_push_instance(filename_queue, instance_queue):\n",
441 | " reader = tf.TextLineReader(skip_header_lines=1)\n",
442 | " key, value = reader.read(filename_queue)\n",
443 | " x1, x2, target = tf.decode_csv(value, record_defaults=[[-1.], [-1.], [-1]])\n",
444 | " features = tf.stack([x1, x2])\n",
445 | " enqueue_instance = instance_queue.enqueue([features, target])\n",
446 | " return enqueue_instance\n",
447 | "\n",
448 | "filename_queue = tf.FIFOQueue(capacity=10, dtypes=[tf.string], shapes=[()])\n",
449 | "filename = tf.placeholder(tf.string)\n",
450 | "enqueue_filename = filename_queue.enqueue([filename])\n",
451 | "close_filename_queue = filename_queue.close()\n",
452 | "\n",
453 | "instance_queue = tf.RandomShuffleQueue(\n",
454 | " capacity=10, min_after_dequeue=2,\n",
455 | " dtypes=[tf.float32, tf.int32], shapes=[[2],[]],\n",
456 | " name=\"instance_q\", shared_name=\"shared_instance_q\")\n",
457 | "\n",
458 | "minibatch_instances, minibatch_targets = instance_queue.dequeue_up_to(2)\n",
459 | "\n",
460 | "read_and_enqueue_ops = [read_and_push_instance(filename_queue, instance_queue) for i in range(5)]\n",
461 | "queue_runner = tf.train.QueueRunner(instance_queue, read_and_enqueue_ops)\n",
462 | "\n",
463 | "with tf.Session() as sess:\n",
464 | " sess.run(enqueue_filename, feed_dict={filename: \"my_test.csv\"})\n",
465 | " sess.run(close_filename_queue)\n",
466 | " coord = tf.train.Coordinator()\n",
467 | " enqueue_threads = queue_runner.create_threads(sess, coord=coord, start=True)\n",
468 | " try:\n",
469 | " while True:\n",
470 | " print(sess.run([minibatch_instances, minibatch_targets]))\n",
471 | " except tf.errors.OutOfRangeError as ex:\n",
472 | " print(\"No more training instances\")\n",
473 | "\n"
474 | ]
475 | },
476 | {
477 | "cell_type": "markdown",
478 | "metadata": {},
479 | "source": [
480 | "# Setting a timeout"
481 | ]
482 | },
483 | {
484 | "cell_type": "code",
485 | "execution_count": 16,
486 | "metadata": {},
487 | "outputs": [
488 | {
489 | "name": "stdout",
490 | "output_type": "stream",
491 | "text": [
492 | "2.0\n",
493 | "6.0\n",
494 | "3.0\n",
495 | "4.0\n",
496 | "Timed out while dequeuing\n"
497 | ]
498 | }
499 | ],
500 | "source": [
501 | "reset_graph()\n",
502 | "\n",
503 | "q = tf.FIFOQueue(capacity=10, dtypes=[tf.float32], shapes=[()])\n",
504 | "v = tf.placeholder(tf.float32)\n",
505 | "enqueue = q.enqueue([v])\n",
506 | "dequeue = q.dequeue()\n",
507 | "output = dequeue + 1\n",
508 | "\n",
509 | "config = tf.ConfigProto()\n",
510 | "config.operation_timeout_in_ms = 1000\n",
511 | "\n",
512 | "with tf.Session(config=config) as sess:\n",
513 | " sess.run(enqueue, feed_dict={v: 1.0})\n",
514 | " sess.run(enqueue, feed_dict={v: 2.0})\n",
515 | " sess.run(enqueue, feed_dict={v: 3.0})\n",
516 | " print(sess.run(output))\n",
517 | " print(sess.run(output, feed_dict={dequeue: 5}))\n",
518 | " print(sess.run(output))\n",
519 | " print(sess.run(output))\n",
520 | " try:\n",
521 | " print(sess.run(output))\n",
522 | " except tf.errors.DeadlineExceededError as ex:\n",
523 | " print(\"Timed out while dequeuing\")\n"
524 | ]
525 | },
526 | {
527 | "cell_type": "markdown",
528 | "metadata": {},
529 | "source": [
530 | "# Data API"
531 | ]
532 | },
533 | {
534 | "cell_type": "markdown",
535 | "metadata": {},
536 | "source": [
537 | "The Data API, introduced in TensorFlow 1.4, makes reading data efficiently much easier."
538 | ]
539 | },
540 | {
541 | "cell_type": "code",
542 | "execution_count": 17,
543 | "metadata": {},
544 | "outputs": [],
545 | "source": [
546 | "tf.reset_default_graph()"
547 | ]
548 | },
549 | {
550 | "cell_type": "markdown",
551 | "metadata": {},
552 | "source": [
553 | "Let's start with a simple dataset composed of three times the integers 0 to 9, in batches of 7:"
554 | ]
555 | },
556 | {
557 | "cell_type": "code",
558 | "execution_count": 18,
559 | "metadata": {},
560 | "outputs": [],
561 | "source": [
562 | "dataset = tf.data.Dataset.from_tensor_slices(np.arange(10))\n",
563 | "dataset = dataset.repeat(3).batch(7)"
564 | ]
565 | },
566 | {
567 | "cell_type": "markdown",
568 | "metadata": {},
569 | "source": [
570 | "The first line creates a dataset containing the integers 0 through 9. The second line creates a new dataset based on the first one, repeating its elements three times and creating batches of 7 elements. As you can see, we start with a source dataset, then we chain calls to various methods to apply transformations to the data."
571 | ]
572 | },
573 | {
574 | "cell_type": "markdown",
575 | "metadata": {},
576 | "source": [
577 | "Next, we create a one-shot-iterator to go through this dataset just once, and we call its `get_next()` method to get a tensor that represents the next element."
578 | ]
579 | },
580 | {
581 | "cell_type": "code",
582 | "execution_count": 19,
583 | "metadata": {},
584 | "outputs": [],
585 | "source": [
586 | "iterator = dataset.make_one_shot_iterator()\n",
587 | "next_element = iterator.get_next()"
588 | ]
589 | },
590 | {
591 | "cell_type": "markdown",
592 | "metadata": {},
593 | "source": [
594 | "Let's repeatedly evaluate `next_element` to go through the dataset. When there are not more elements, we get an `OutOfRangeError`:"
595 | ]
596 | },
597 | {
598 | "cell_type": "code",
599 | "execution_count": 20,
600 | "metadata": {},
601 | "outputs": [
602 | {
603 | "name": "stdout",
604 | "output_type": "stream",
605 | "text": [
606 | "[0 1 2 3 4 5 6]\n",
607 | "[7 8 9 0 1 2 3]\n",
608 | "[4 5 6 7 8 9 0]\n",
609 | "[1 2 3 4 5 6 7]\n",
610 | "[8 9]\n",
611 | "Done\n"
612 | ]
613 | }
614 | ],
615 | "source": [
616 | "with tf.Session() as sess:\n",
617 | " try:\n",
618 | " while True:\n",
619 | " print(next_element.eval())\n",
620 | " except tf.errors.OutOfRangeError:\n",
621 | " print(\"Done\")"
622 | ]
623 | },
624 | {
625 | "cell_type": "markdown",
626 | "metadata": {},
627 | "source": [
628 | "Great! It worked fine."
629 | ]
630 | },
631 | {
632 | "cell_type": "markdown",
633 | "metadata": {},
634 | "source": [
635 | "Note that, as always, a tensor is only evaluated once each time we run the graph (`sess.run()`): so even if we evaluate multiple tensors that all depend on `next_element`, it is only evaluated once. This is true as well if we ask for `next_element` to be evaluated twice in just one run:"
636 | ]
637 | },
638 | {
639 | "cell_type": "code",
640 | "execution_count": 21,
641 | "metadata": {},
642 | "outputs": [
643 | {
644 | "name": "stdout",
645 | "output_type": "stream",
646 | "text": [
647 | "[array([0, 1, 2, 3, 4, 5, 6]), array([0, 1, 2, 3, 4, 5, 6])]\n",
648 | "[array([7, 8, 9, 0, 1, 2, 3]), array([7, 8, 9, 0, 1, 2, 3])]\n",
649 | "[array([4, 5, 6, 7, 8, 9, 0]), array([4, 5, 6, 7, 8, 9, 0])]\n",
650 | "[array([1, 2, 3, 4, 5, 6, 7]), array([1, 2, 3, 4, 5, 6, 7])]\n",
651 | "[array([8, 9]), array([8, 9])]\n",
652 | "Done\n"
653 | ]
654 | }
655 | ],
656 | "source": [
657 | "with tf.Session() as sess:\n",
658 | " try:\n",
659 | " while True:\n",
660 | " print(sess.run([next_element, next_element]))\n",
661 | " except tf.errors.OutOfRangeError:\n",
662 | " print(\"Done\")"
663 | ]
664 | },
665 | {
666 | "cell_type": "markdown",
667 | "metadata": {},
668 | "source": [
669 | "The `interleave()` method is powerful but a bit tricky to grasp at first. The easiest way to understand it is to look at an example:"
670 | ]
671 | },
672 | {
673 | "cell_type": "code",
674 | "execution_count": 22,
675 | "metadata": {},
676 | "outputs": [],
677 | "source": [
678 | "tf.reset_default_graph()"
679 | ]
680 | },
681 | {
682 | "cell_type": "code",
683 | "execution_count": 23,
684 | "metadata": {},
685 | "outputs": [],
686 | "source": [
687 | "dataset = tf.data.Dataset.from_tensor_slices(np.arange(10))\n",
688 | "dataset = dataset.repeat(3).batch(7)\n",
689 | "dataset = dataset.interleave(\n",
690 | " lambda v: tf.data.Dataset.from_tensor_slices(v),\n",
691 | " cycle_length=3,\n",
692 | " block_length=2)\n",
693 | "iterator = dataset.make_one_shot_iterator()\n",
694 | "next_element = iterator.get_next()"
695 | ]
696 | },
697 | {
698 | "cell_type": "code",
699 | "execution_count": 24,
700 | "metadata": {},
701 | "outputs": [
702 | {
703 | "name": "stdout",
704 | "output_type": "stream",
705 | "text": [
706 | "0,1,7,8,4,5,2,3,9,0,6,7,4,5,1,2,8,9,6,3,0,1,2,8,9,3,4,5,6,7,Done\n"
707 | ]
708 | }
709 | ],
710 | "source": [
711 | "with tf.Session() as sess:\n",
712 | " try:\n",
713 | " while True:\n",
714 | " print(next_element.eval(), end=\",\")\n",
715 | " except tf.errors.OutOfRangeError:\n",
716 | " print(\"Done\")"
717 | ]
718 | },
719 | {
720 | "cell_type": "markdown",
721 | "metadata": {},
722 | "source": [
723 | "Because `cycle_length=3`, the new dataset starts by pulling 3 elements from the previous dataset: that's `[0,1,2,3,4,5,6]`, `[7,8,9,0,1,2,3]` and `[4,5,6,7,8,9,0]`. Then it calls the lambda function we gave it to create one dataset for each of the elements. Since we use `Dataset.from_tensor_slices()`, each dataset is going to return its elements one by one. Next, it pulls two items (since `block_length=2`) from each of these three datasets, and it iterates until all three datasets are out of items: 0,1 (from 1st), 7,8 (from 2nd), 4,5 (from 3rd), 2,3 (from 1st), 9,0 (from 2nd), and so on until 8,9 (from 3rd), 6 (from 1st), 3 (from 2nd), 0 (from 3rd). Next it tries to pull the next 3 elements from the original dataset, but there are just two left: `[1,2,3,4,5,6,7]` and `[8,9]`. Again, it creates datasets from these elements, and it pulls two items from each until both datasets are out of items: 1,2 (from 1st), 8,9 (from 2nd), 3,4 (from 1st), 5,6 (from 1st), 7 (from 1st). Notice that there's no interleaving at the end since the arrays do not have the same length."
724 | ]
725 | },
726 | {
727 | "cell_type": "markdown",
728 | "metadata": {},
729 | "source": [
730 | "# Readers – the new way"
731 | ]
732 | },
733 | {
734 | "cell_type": "markdown",
735 | "metadata": {},
736 | "source": [
737 | "Instead of using a source dataset based on `from_tensor_slices()` or `from_tensor()`, we can use a reader dataset. It handles most of the complexity for us (e.g., threads):"
738 | ]
739 | },
740 | {
741 | "cell_type": "code",
742 | "execution_count": 25,
743 | "metadata": {},
744 | "outputs": [],
745 | "source": [
746 | "tf.reset_default_graph()"
747 | ]
748 | },
749 | {
750 | "cell_type": "code",
751 | "execution_count": 26,
752 | "metadata": {},
753 | "outputs": [],
754 | "source": [
755 | "filenames = [\"my_test.csv\"]"
756 | ]
757 | },
758 | {
759 | "cell_type": "code",
760 | "execution_count": 27,
761 | "metadata": {},
762 | "outputs": [],
763 | "source": [
764 | "dataset = tf.data.TextLineDataset(filenames)"
765 | ]
766 | },
767 | {
768 | "cell_type": "markdown",
769 | "metadata": {},
770 | "source": [
771 | "We still need to tell it how to decode each line:"
772 | ]
773 | },
774 | {
775 | "cell_type": "code",
776 | "execution_count": 28,
777 | "metadata": {},
778 | "outputs": [],
779 | "source": [
780 | "def decode_csv_line(line):\n",
781 | " x1, x2, y = tf.decode_csv(\n",
782 | " line, record_defaults=[[-1.], [-1.], [-1.]])\n",
783 | " X = tf.stack([x1, x2])\n",
784 | " return X, y"
785 | ]
786 | },
787 | {
788 | "cell_type": "markdown",
789 | "metadata": {},
790 | "source": [
791 | "Next, we can apply this decoding function to each element in the dataset using `map()`:"
792 | ]
793 | },
794 | {
795 | "cell_type": "code",
796 | "execution_count": 29,
797 | "metadata": {},
798 | "outputs": [],
799 | "source": [
800 | "dataset = dataset.skip(1).map(decode_csv_line)"
801 | ]
802 | },
803 | {
804 | "cell_type": "markdown",
805 | "metadata": {},
806 | "source": [
807 | "Finally, let's create a one-shot iterator:"
808 | ]
809 | },
810 | {
811 | "cell_type": "code",
812 | "execution_count": 30,
813 | "metadata": {},
814 | "outputs": [],
815 | "source": [
816 | "it = dataset.make_one_shot_iterator()\n",
817 | "X, y = it.get_next()"
818 | ]
819 | },
820 | {
821 | "cell_type": "code",
822 | "execution_count": 31,
823 | "metadata": {},
824 | "outputs": [
825 | {
826 | "name": "stdout",
827 | "output_type": "stream",
828 | "text": [
829 | "[ 1. -1.] 0.0\n",
830 | "[4. 5.] 1.0\n",
831 | "[7. 8.] 0.0\n",
832 | "Done\n"
833 | ]
834 | }
835 | ],
836 | "source": [
837 | "with tf.Session() as sess:\n",
838 | " try:\n",
839 | " while True:\n",
840 | " X_val, y_val = sess.run([X, y])\n",
841 | " print(X_val, y_val)\n",
842 | " except tf.errors.OutOfRangeError as ex:\n",
843 | " print(\"Done\")\n"
844 | ]
845 | },
846 | {
847 | "cell_type": "markdown",
848 | "metadata": {
849 | "collapsed": true
850 | },
851 | "source": [
852 | "# Exercise solutions"
853 | ]
854 | },
855 | {
856 | "cell_type": "markdown",
857 | "metadata": {},
858 | "source": [
859 | "**Coming soon**"
860 | ]
861 | },
862 | {
863 | "cell_type": "code",
864 | "execution_count": null,
865 | "metadata": {},
866 | "outputs": [],
867 | "source": []
868 | }
869 | ],
870 | "metadata": {
871 | "kernelspec": {
872 | "display_name": "Python 3",
873 | "language": "python",
874 | "name": "python3"
875 | },
876 | "language_info": {
877 | "codemirror_mode": {
878 | "name": "ipython",
879 | "version": 3
880 | },
881 | "file_extension": ".py",
882 | "mimetype": "text/x-python",
883 | "name": "python",
884 | "nbconvert_exporter": "python",
885 | "pygments_lexer": "ipython3",
886 | "version": "3.7.10"
887 | },
888 | "nav_menu": {},
889 | "toc": {
890 | "navigate_menu": true,
891 | "number_sections": true,
892 | "sideBar": true,
893 | "threshold": 6,
894 | "toc_cell": false,
895 | "toc_section_display": "block",
896 | "toc_window_display": false
897 | }
898 | },
899 | "nbformat": 4,
900 | "nbformat_minor": 1
901 | }
902 |
--------------------------------------------------------------------------------
/INSTALL.md:
--------------------------------------------------------------------------------
1 | # Installation
2 |
3 | ## Download this repository
4 | To install this repository and run the Jupyter notebooks on your machine, you will first need git, which you may already have. Open a terminal and type `git` to check. If you do not have git, you can download it from [git-scm.com](https://git-scm.com/).
5 |
6 | Next, clone this repository by opening a terminal and typing the following commands (do not type the first `$` on each line, it's just a convention to show that this is a terminal prompt, not something else like Python code):
7 |
8 | $ cd $HOME # or any other development directory you prefer
9 | $ git clone https://github.com/ageron/handson-ml.git
10 | $ cd handson-ml
11 |
12 | If you do not want to install git, you can instead download [master.zip](https://github.com/ageron/handson-ml/archive/master.zip), unzip it, rename the resulting directory to `handson-ml` and move it to your development directory.
13 |
14 | ## Install Anaconda
15 | Next, you will need Python 3 and a bunch of Python libraries. The simplest way to install these is to [download and install Anaconda](https://www.anaconda.com/distribution/), which is a great cross-platform Python distribution for scientific computing. It comes bundled with many scientific libraries, including NumPy, Pandas, Matplotlib, Scikit-Learn and much more, so it's quite a large installation. If you prefer a lighter weight Anaconda distribution, you can [install Miniconda](https://docs.conda.io/en/latest/miniconda.html), which contains the bare minimum to run the `conda` packaging tool. You should install the latest version of Anaconda (or Miniconda) available.
16 |
17 | During the installation on MacOSX and Linux, you will be asked whether to initialize Anaconda by running `conda init`: you should accept, as it will update your shell startup script to ensure that `conda` is available whenever you open a terminal. After the installation, you must close your terminal and open a new one for the changes to take effect.
18 |
19 | During the installation on Windows, you will be asked whether you want the installer to update the `PATH` environment variable. This is not recommended as it may interfere with other software. Instead, after the installation you should open the Start Menu and launch an Anaconda Shell whenever you want to use Anaconda.
20 |
21 | Once Anaconda (or Miniconda) is installed, run the following command to update the `conda` packaging tool to the latest version:
22 |
23 | $ conda update -n base -c defaults conda
24 |
25 | > **Note**: if you don't like Anaconda for some reason, then you can install Python 3 and use pip to install the required libraries manually (this is not recommended, unless you really know what you are doing). I recommend using Python 3.7, since some libs don't support Python 3.8 or 3.9 yet.
26 |
27 |
28 | ## Install the GPU Driver and Libraries
29 | If you have a TensorFlow-compatible GPU card (NVidia card with Compute Capability ≥ 3.5), and you want TensorFlow to use it, then you should download the latest driver for your card from [nvidia.com](https://www.nvidia.com/Download/index.aspx?lang=en-us) and install it. You will also need NVidia's CUDA and cuDNN libraries, but the good news is that they will be installed automatically when you install the tensorflow-gpu package from Anaconda. However, if you don't use Anaconda, you will have to install them manually. If you hit any roadblock, see TensorFlow's [GPU installation instructions](https://tensorflow.org/install/gpu) for more details.
30 |
31 | ## Create the `tf1` Environment
32 | Next, make sure you're in the `handson-ml` directory and run the following command. It will create a new `conda` environment containing every library you will need to run all the notebooks (by default, the environment will be named `tf1`, but you can choose another name using the `-n` option):
33 |
34 | $ conda env create -f environment.yml
35 |
36 | Next, activate the new environment:
37 |
38 | $ conda activate tf1
39 |
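Optionally, you can check whether TensorFlow sees your GPU. Here is a minimal sanity check (assuming the `tf1` environment is active; on a CPU-only machine it simply prints `False`):

    $ python -c "import tensorflow as tf; print(tf.test.is_gpu_available())"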
40 |
41 | ## Start Jupyter
42 | You're almost there! You just need to register the `tf1` conda environment with Jupyter. The notebooks in this project default to the kernel named `python3`, so it's best to register this environment under the name `python3` (if you prefer to use another name, you will have to select it in the "Kernel > Change kernel..." menu in Jupyter every time you open a notebook):
43 |
44 | $ python3 -m ipykernel install --user --name=python3
45 |
46 | And that's it! You can now start Jupyter like this:
47 |
48 | $ jupyter notebook
49 |
50 | This should open up your browser, and you should see Jupyter's tree view, with the contents of the current directory. If your browser does not open automatically, visit [localhost:8888](http://localhost:8888/tree). Click on `index.ipynb` to get started.
51 |
52 | Congrats! You are ready to learn Machine Learning, hands on!
53 |
54 | When you're done with Jupyter, you can close it by typing Ctrl-C in the Terminal window where you started it. Every time you want to work on this project, you will need to open a Terminal, and run:
55 |
56 | $ cd $HOME # or whatever development directory you chose earlier
57 | $ cd handson-ml
58 | $ conda activate tf1
59 | $ jupyter notebook
60 |
61 | ## Update This Project and its Libraries
62 | I regularly update the notebooks to fix issues and add support for new libraries. So make sure you update this project regularly.
63 |
64 | For this, open a terminal, and run:
65 |
66 | $ cd $HOME # or whatever development directory you chose earlier
67 | $ cd handson-ml # go to this project's directory
68 | $ git pull
69 |
70 | If you get an error, it's probably because you modified a notebook. In this case, before running `git pull` you will first need to commit your changes. I recommend doing this in your own branch, or else you may get conflicts:
71 |
72 | $ git checkout -b my_branch # you can use another branch name if you want
73 | $ git add -u
74 | $ git commit -m "describe your changes here"
75 | $ git checkout master
76 | $ git pull
77 |
78 | Next, let's update the libraries. First, let's update `conda` itself:
79 |
80 | $ conda update -c defaults -n base conda
81 |
82 | Then we'll delete this project's `tf1` environment:
83 |
84 | $ conda activate base
85 | $ conda env remove -n tf1
86 |
87 | And recreate the environment:
88 |
89 | $ conda env create -f environment.yml
90 |
91 | Lastly, we reactivate the environment and start Jupyter:
92 |
93 | $ conda activate tf1
94 | $ jupyter notebook
95 |
--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
1 | Apache License
2 | Version 2.0, January 2004
3 | http://www.apache.org/licenses/
4 |
5 | TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
6 |
7 | 1. Definitions.
8 |
9 | "License" shall mean the terms and conditions for use, reproduction,
10 | and distribution as defined by Sections 1 through 9 of this document.
11 |
12 | "Licensor" shall mean the copyright owner or entity authorized by
13 | the copyright owner that is granting the License.
14 |
15 | "Legal Entity" shall mean the union of the acting entity and all
16 | other entities that control, are controlled by, or are under common
17 | control with that entity. For the purposes of this definition,
18 | "control" means (i) the power, direct or indirect, to cause the
19 | direction or management of such entity, whether by contract or
20 | otherwise, or (ii) ownership of fifty percent (50%) or more of the
21 | outstanding shares, or (iii) beneficial ownership of such entity.
22 |
23 | "You" (or "Your") shall mean an individual or Legal Entity
24 | exercising permissions granted by this License.
25 |
26 | "Source" form shall mean the preferred form for making modifications,
27 | including but not limited to software source code, documentation
28 | source, and configuration files.
29 |
30 | "Object" form shall mean any form resulting from mechanical
31 | transformation or translation of a Source form, including but
32 | not limited to compiled object code, generated documentation,
33 | and conversions to other media types.
34 |
35 | "Work" shall mean the work of authorship, whether in Source or
36 | Object form, made available under the License, as indicated by a
37 | copyright notice that is included in or attached to the work
38 | (an example is provided in the Appendix below).
39 |
40 | "Derivative Works" shall mean any work, whether in Source or Object
41 | form, that is based on (or derived from) the Work and for which the
42 | editorial revisions, annotations, elaborations, or other modifications
43 | represent, as a whole, an original work of authorship. For the purposes
44 | of this License, Derivative Works shall not include works that remain
45 | separable from, or merely link (or bind by name) to the interfaces of,
46 | the Work and Derivative Works thereof.
47 |
48 | "Contribution" shall mean any work of authorship, including
49 | the original version of the Work and any modifications or additions
50 | to that Work or Derivative Works thereof, that is intentionally
51 | submitted to Licensor for inclusion in the Work by the copyright owner
52 | or by an individual or Legal Entity authorized to submit on behalf of
53 | the copyright owner. For the purposes of this definition, "submitted"
54 | means any form of electronic, verbal, or written communication sent
55 | to the Licensor or its representatives, including but not limited to
56 | communication on electronic mailing lists, source code control systems,
57 | and issue tracking systems that are managed by, or on behalf of, the
58 | Licensor for the purpose of discussing and improving the Work, but
59 | excluding communication that is conspicuously marked or otherwise
60 | designated in writing by the copyright owner as "Not a Contribution."
61 |
62 | "Contributor" shall mean Licensor and any individual or Legal Entity
63 | on behalf of whom a Contribution has been received by Licensor and
64 | subsequently incorporated within the Work.
65 |
66 | 2. Grant of Copyright License. Subject to the terms and conditions of
67 | this License, each Contributor hereby grants to You a perpetual,
68 | worldwide, non-exclusive, no-charge, royalty-free, irrevocable
69 | copyright license to reproduce, prepare Derivative Works of,
70 | publicly display, publicly perform, sublicense, and distribute the
71 | Work and such Derivative Works in Source or Object form.
72 |
73 | 3. Grant of Patent License. Subject to the terms and conditions of
74 | this License, each Contributor hereby grants to You a perpetual,
75 | worldwide, non-exclusive, no-charge, royalty-free, irrevocable
76 | (except as stated in this section) patent license to make, have made,
77 | use, offer to sell, sell, import, and otherwise transfer the Work,
78 | where such license applies only to those patent claims licensable
79 | by such Contributor that are necessarily infringed by their
80 | Contribution(s) alone or by combination of their Contribution(s)
81 | with the Work to which such Contribution(s) was submitted. If You
82 | institute patent litigation against any entity (including a
83 | cross-claim or counterclaim in a lawsuit) alleging that the Work
84 | or a Contribution incorporated within the Work constitutes direct
85 | or contributory patent infringement, then any patent licenses
86 | granted to You under this License for that Work shall terminate
87 | as of the date such litigation is filed.
88 |
89 | 4. Redistribution. You may reproduce and distribute copies of the
90 | Work or Derivative Works thereof in any medium, with or without
91 | modifications, and in Source or Object form, provided that You
92 | meet the following conditions:
93 |
94 | (a) You must give any other recipients of the Work or
95 | Derivative Works a copy of this License; and
96 |
97 | (b) You must cause any modified files to carry prominent notices
98 | stating that You changed the files; and
99 |
100 | (c) You must retain, in the Source form of any Derivative Works
101 | that You distribute, all copyright, patent, trademark, and
102 | attribution notices from the Source form of the Work,
103 | excluding those notices that do not pertain to any part of
104 | the Derivative Works; and
105 |
106 | (d) If the Work includes a "NOTICE" text file as part of its
107 | distribution, then any Derivative Works that You distribute must
108 | include a readable copy of the attribution notices contained
109 | within such NOTICE file, excluding those notices that do not
110 | pertain to any part of the Derivative Works, in at least one
111 | of the following places: within a NOTICE text file distributed
112 | as part of the Derivative Works; within the Source form or
113 | documentation, if provided along with the Derivative Works; or,
114 | within a display generated by the Derivative Works, if and
115 | wherever such third-party notices normally appear. The contents
116 | of the NOTICE file are for informational purposes only and
117 | do not modify the License. You may add Your own attribution
118 | notices within Derivative Works that You distribute, alongside
119 | or as an addendum to the NOTICE text from the Work, provided
120 | that such additional attribution notices cannot be construed
121 | as modifying the License.
122 |
123 | You may add Your own copyright statement to Your modifications and
124 | may provide additional or different license terms and conditions
125 | for use, reproduction, or distribution of Your modifications, or
126 | for any such Derivative Works as a whole, provided Your use,
127 | reproduction, and distribution of the Work otherwise complies with
128 | the conditions stated in this License.
129 |
130 | 5. Submission of Contributions. Unless You explicitly state otherwise,
131 | any Contribution intentionally submitted for inclusion in the Work
132 | by You to the Licensor shall be under the terms and conditions of
133 | this License, without any additional terms or conditions.
134 | Notwithstanding the above, nothing herein shall supersede or modify
135 | the terms of any separate license agreement you may have executed
136 | with Licensor regarding such Contributions.
137 |
138 | 6. Trademarks. This License does not grant permission to use the trade
139 | names, trademarks, service marks, or product names of the Licensor,
140 | except as required for reasonable and customary use in describing the
141 | origin of the Work and reproducing the content of the NOTICE file.
142 |
143 | 7. Disclaimer of Warranty. Unless required by applicable law or
144 | agreed to in writing, Licensor provides the Work (and each
145 | Contributor provides its Contributions) on an "AS IS" BASIS,
146 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
147 | implied, including, without limitation, any warranties or conditions
148 | of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
149 | PARTICULAR PURPOSE. You are solely responsible for determining the
150 | appropriateness of using or redistributing the Work and assume any
151 | risks associated with Your exercise of permissions under this License.
152 |
153 | 8. Limitation of Liability. In no event and under no legal theory,
154 | whether in tort (including negligence), contract, or otherwise,
155 | unless required by applicable law (such as deliberate and grossly
156 | negligent acts) or agreed to in writing, shall any Contributor be
157 | liable to You for damages, including any direct, indirect, special,
158 | incidental, or consequential damages of any character arising as a
159 | result of this License or out of the use or inability to use the
160 | Work (including but not limited to damages for loss of goodwill,
161 | work stoppage, computer failure or malfunction, or any and all
162 | other commercial damages or losses), even if such Contributor
163 | has been advised of the possibility of such damages.
164 |
165 | 9. Accepting Warranty or Additional Liability. While redistributing
166 | the Work or Derivative Works thereof, You may choose to offer,
167 | and charge a fee for, acceptance of support, warranty, indemnity,
168 | or other liability obligations and/or rights consistent with this
169 | License. However, in accepting such obligations, You may act only
170 | on Your own behalf and on Your sole responsibility, not on behalf
171 | of any other Contributor, and only if You agree to indemnify,
172 | defend, and hold each Contributor harmless for any liability
173 | incurred by, or claims asserted against, such Contributor by reason
174 | of your accepting any such warranty or additional liability.
175 |
176 | END OF TERMS AND CONDITIONS
177 |
178 |
179 |
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | Machine Learning Notebooks
2 | ==========================
3 |
4 | # ⚠ THE THIRD EDITION OF MY BOOK IS NOW AVAILABLE.
5 |
6 | This project is for the first edition, which is now outdated.
7 |
8 |
9 |
10 | This project aims at teaching you the fundamentals of Machine Learning in
11 | python. It contains the example code and solutions to the exercises in my O'Reilly book [Hands-on Machine Learning with Scikit-Learn and TensorFlow](https://learning.oreilly.com/library/view/hands-on-machine-learning/9781491962282/):
12 |
13 | [book cover](https://learning.oreilly.com/library/view/hands-on-machine-learning/9781491962282/)
14 |
15 |
16 | ## Quick Start
17 |
18 | ### Want to play with these notebooks online without having to install anything?
19 | Use any of the following services.
20 |
21 | **WARNING**: Please be aware that these services provide temporary environments: anything you do will be deleted after a while, so make sure you download any data you care about.
22 |
23 | * **Recommended**: open this repository in [Colaboratory](https://colab.research.google.com/github/ageron/handson-ml/blob/master/):
24 |
25 |
26 | * Or open it in [Binder](https://mybinder.org/v2/gh/ageron/handson-ml/master):
27 |
28 |
29 | * _Note_: Most of the time, Binder starts up quickly and works great, but when handson-ml is updated, Binder creates a new environment from scratch, and this can take quite some time.
30 |
31 | * Or open it in [Deepnote](https://beta.deepnote.com/launch?template=data-science&url=https%3A//github.com/ageron/handson-ml/blob/master/index.ipynb):
32 |
33 |
34 | ### Just want to quickly look at some notebooks, without executing any code?
35 |
36 | Browse this repository using [jupyter.org's notebook viewer](https://nbviewer.jupyter.org/github/ageron/handson-ml/blob/master/index.ipynb):
37 |
38 |
39 | _Note_: [github.com's notebook viewer](index.ipynb) also works but it is slower and the math equations are not always displayed correctly.
40 |
41 | ### Want to run this project using a Docker image?
42 | Read the [Docker instructions](https://github.com/ageron/handson-ml/tree/master/docker).
43 |
44 | ### Want to install this project on your own machine?
45 |
46 | Start by installing [Anaconda](https://www.anaconda.com/distribution/) (or [Miniconda](https://docs.conda.io/en/latest/miniconda.html)), [git](https://git-scm.com/downloads), and if you have a TensorFlow-compatible GPU, install the [GPU driver](https://www.nvidia.com/Download/index.aspx), as well as the appropriate version of CUDA and cuDNN (see TensorFlow's documentation for more details).
47 |
48 | Next, clone this project by opening a terminal and typing the following commands (do not type the first `$` signs on each line, they just indicate that these are terminal commands):
49 |
50 | $ git clone https://github.com/ageron/handson-ml.git
51 | $ cd handson-ml
52 |
53 | Next, run the following commands:
54 |
55 | $ conda env create -f environment.yml
56 | $ conda activate tf1
57 | $ python -m ipykernel install --user --name=python3
58 |
59 | Finally, start Jupyter:
60 |
61 | $ jupyter notebook
62 |
63 | If you need further instructions, read the [detailed installation instructions](INSTALL.md).
64 |
65 | # FAQ
66 |
67 | **Which Python version should I use?**
68 |
69 | I recommend Python 3.7. If you follow the installation instructions above, that's the version you will get. Most code will work with other versions of Python 3, but some libraries do not support Python 3.8 or 3.9 yet, which is why I recommend Python 3.7.
70 |
71 | **I'm getting an error when I call `load_housing_data()`**
72 |
73 | Make sure you call `fetch_housing_data()` *before* you call `load_housing_data()`. If you're getting an HTTP error, make sure you're running the exact same code as in the notebook (copy/paste it if needed). If the problem persists, please check your network configuration.
74 |
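For example, in the chapter 2 notebook the two calls should happen in this order (a minimal sketch using the notebook's own helper functions):

    fetch_housing_data()            # downloads and extracts datasets/housing/housing.tgz
    housing = load_housing_data()   # then reads datasets/housing/housing.csv into a DataFrame
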
75 | **I'm getting an SSL error on MacOSX**
76 |
77 | You probably need to install the SSL certificates (see this [StackOverflow question](https://stackoverflow.com/questions/27835619/urllib-and-ssl-certificate-verify-failed-error)). If you downloaded Python from the official website, then run `/Applications/Python\ 3.7/Install\ Certificates.command` in a terminal (change `3.7` to whatever version you installed). If you installed Python using MacPorts, run `sudo port install curl-ca-bundle` in a terminal.
78 |
79 | **I've installed this project locally. How do I update it to the latest version?**
80 |
81 | See [INSTALL.md](INSTALL.md)
82 |
83 | **How do I update my Python libraries to the latest versions, when using Anaconda?**
84 |
85 | See [INSTALL.md](INSTALL.md)
86 |
87 | ## Contributors
88 | I would like to thank everyone [who contributed to this project](https://github.com/ageron/handson-ml/graphs/contributors), either by providing useful feedback, filing issues or submitting Pull Requests. Special thanks go to Haesun Park and Ian Beauregard who reviewed every notebook and submitted many PRs, including help on some of the exercise solutions. Thanks as well to Steven Bunkley and Ziembla who created the `docker` directory, and to github user SuperYorio who helped on some exercise solutions.
89 |
90 |
91 |
--------------------------------------------------------------------------------
/apt.txt:
--------------------------------------------------------------------------------
1 | build-essential
2 | cmake
3 | ffmpeg
4 | git
5 | libboost-all-dev
6 | libjpeg-dev
7 | libpq-dev
8 | libsdl2-dev
9 | sudo
10 | swig
11 | unzip
12 | xorg-dev
13 | xvfb
14 | zip
15 | zlib1g-dev
16 |
--------------------------------------------------------------------------------
/datasets/housing/README.md:
--------------------------------------------------------------------------------
1 | # California Housing
2 |
3 | ## Source
4 | This dataset is a modified version of the California Housing dataset available from [Luís Torgo's page](http://www.dcc.fc.up.pt/~ltorgo/Regression/cal_housing.html) (University of Porto). Luís Torgo obtained it from the StatLib repository (which is closed now). The dataset may also be downloaded from StatLib mirrors.
5 |
6 | This dataset appeared in a 1997 paper titled *Sparse Spatial Autoregressions* by Pace, R. Kelley and Ronald Barry, published in the *Statistics and Probability Letters* journal. They built it using the 1990 California census data. It contains one row per census block group. A block group is the smallest geographical unit for which the U.S. Census Bureau publishes sample data (a block group typically has a population of 600 to 3,000 people).
7 |
8 | ## Tweaks
9 | The dataset in this directory is almost identical to the original, with two differences:
10 |
11 | * 207 values were randomly removed from the `total_bedrooms` column, so we can discuss what to do with missing data.
12 | * An additional categorical attribute called `ocean_proximity` was added, indicating (very roughly) whether each block group is near the ocean, near the Bay area, inland or on an island. This allows discussing what to do with categorical data.
13 |
14 | Note that the block groups are called "districts" in the Jupyter notebooks, simply because in some contexts the name "block group" was confusing.
15 |
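To check both tweaks yourself, here is a minimal sketch (assuming `pandas` is installed and the repository root is the current working directory):

    >>> import pandas as pd
    >>> housing = pd.read_csv("datasets/housing/housing.csv")
    >>> housing["total_bedrooms"].isnull().sum()   # the 207 removed values show up as NaN
    207
    >>> housing["ocean_proximity"].nunique()       # the added categorical attribute
    5
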
16 | ## Data description
17 |
18 | >>> housing.info()
19 |
20 | RangeIndex: 20640 entries, 0 to 20639
21 | Data columns (total 10 columns):
22 | longitude 20640 non-null float64
23 | latitude 20640 non-null float64
24 | housing_median_age 20640 non-null float64
25 | total_rooms 20640 non-null float64
26 | total_bedrooms 20433 non-null float64
27 | population 20640 non-null float64
28 | households 20640 non-null float64
29 | median_income 20640 non-null float64
30 | median_house_value 20640 non-null float64
31 | ocean_proximity 20640 non-null object
32 | dtypes: float64(9), object(1)
33 | memory usage: 1.6+ MB
34 |
35 | >>> housing["ocean_proximity"].value_counts()
36 | <1H OCEAN 9136
37 | INLAND 6551
38 | NEAR OCEAN 2658
39 | NEAR BAY 2290
40 | ISLAND 5
41 | Name: ocean_proximity, dtype: int64
42 |
43 | >>> housing.describe()
44 | longitude latitude housing_median_age total_rooms \
45 | count 16513.000000 16513.000000 16513.000000 16513.000000
46 | mean -119.575972 35.639693 28.652335 2622.347605
47 | std 2.002048 2.138279 12.576306 2138.559393
48 | min -124.350000 32.540000 1.000000 6.000000
49 | 25% -121.800000 33.940000 18.000000 1442.000000
50 | 50% -118.510000 34.260000 29.000000 2119.000000
51 | 75% -118.010000 37.720000 37.000000 3141.000000
52 | max -114.310000 41.950000 52.000000 39320.000000
53 |
54 | total_bedrooms population households median_income
55 | count 16355.000000 16513.000000 16513.000000 16513.000000
56 | mean 534.885112 1419.525465 496.975050 3.875651
57 | std 412.716467 1115.715084 375.737945 1.905088
58 | min 2.000000 3.000000 2.000000 0.499900
59 | 25% 295.000000 784.000000 278.000000 2.566800
60 | 50% 433.000000 1164.000000 408.000000 3.541400
61 | 75% 644.000000 1718.000000 602.000000 4.745000
62 | max 6210.000000 35682.000000 5358.000000 15.000100
63 |
64 |
--------------------------------------------------------------------------------
/datasets/housing/housing.tgz:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ageron/handson-ml/ac1310a3cc1567ecfb4b798715c804627076775f/datasets/housing/housing.tgz
--------------------------------------------------------------------------------
/datasets/inception/imagenet_class_names.txt:
--------------------------------------------------------------------------------
1 | n01440764 tench, Tinca tinca
2 | n01443537 goldfish, Carassius auratus
3 | n01484850 great white shark, white shark, man-eater, man-eating shark, Carcharodon carcharias
4 | n01491361 tiger shark, Galeocerdo cuvieri
5 | n01494475 hammerhead, hammerhead shark
6 | n01496331 electric ray, crampfish, numbfish, torpedo
7 | n01498041 stingray
8 | n01514668 cock
9 | n01514859 hen
10 | n01518878 ostrich, Struthio camelus
11 | n01530575 brambling, Fringilla montifringilla
12 | n01531178 goldfinch, Carduelis carduelis
13 | n01532829 house finch, linnet, Carpodacus mexicanus
14 | n01534433 junco, snowbird
15 | n01537544 indigo bunting, indigo finch, indigo bird, Passerina cyanea
16 | n01558993 robin, American robin, Turdus migratorius
17 | n01560419 bulbul
18 | n01580077 jay
19 | n01582220 magpie
20 | n01592084 chickadee
21 | n01601694 water ouzel, dipper
22 | n01608432 kite
23 | n01614925 bald eagle, American eagle, Haliaeetus leucocephalus
24 | n01616318 vulture
25 | n01622779 great grey owl, great gray owl, Strix nebulosa
26 | n01629819 European fire salamander, Salamandra salamandra
27 | n01630670 common newt, Triturus vulgaris
28 | n01631663 eft
29 | n01632458 spotted salamander, Ambystoma maculatum
30 | n01632777 axolotl, mud puppy, Ambystoma mexicanum
31 | n01641577 bullfrog, Rana catesbeiana
32 | n01644373 tree frog, tree-frog
33 | n01644900 tailed frog, bell toad, ribbed toad, tailed toad, Ascaphus trui
34 | n01664065 loggerhead, loggerhead turtle, Caretta caretta
35 | n01665541 leatherback turtle, leatherback, leathery turtle, Dermochelys coriacea
36 | n01667114 mud turtle
37 | n01667778 terrapin
38 | n01669191 box turtle, box tortoise
39 | n01675722 banded gecko
40 | n01677366 common iguana, iguana, Iguana iguana
41 | n01682714 American chameleon, anole, Anolis carolinensis
42 | n01685808 whiptail, whiptail lizard
43 | n01687978 agama
44 | n01688243 frilled lizard, Chlamydosaurus kingi
45 | n01689811 alligator lizard
46 | n01692333 Gila monster, Heloderma suspectum
47 | n01693334 green lizard, Lacerta viridis
48 | n01694178 African chameleon, Chamaeleo chamaeleon
49 | n01695060 Komodo dragon, Komodo lizard, dragon lizard, giant lizard, Varanus komodoensis
50 | n01697457 African crocodile, Nile crocodile, Crocodylus niloticus
51 | n01698640 American alligator, Alligator mississipiensis
52 | n01704323 triceratops
53 | n01728572 thunder snake, worm snake, Carphophis amoenus
54 | n01728920 ringneck snake, ring-necked snake, ring snake
55 | n01729322 hognose snake, puff adder, sand viper
56 | n01729977 green snake, grass snake
57 | n01734418 king snake, kingsnake
58 | n01735189 garter snake, grass snake
59 | n01737021 water snake
60 | n01739381 vine snake
61 | n01740131 night snake, Hypsiglena torquata
62 | n01742172 boa constrictor, Constrictor constrictor
63 | n01744401 rock python, rock snake, Python sebae
64 | n01748264 Indian cobra, Naja naja
65 | n01749939 green mamba
66 | n01751748 sea snake
67 | n01753488 horned viper, cerastes, sand viper, horned asp, Cerastes cornutus
68 | n01755581 diamondback, diamondback rattlesnake, Crotalus adamanteus
69 | n01756291 sidewinder, horned rattlesnake, Crotalus cerastes
70 | n01768244 trilobite
71 | n01770081 harvestman, daddy longlegs, Phalangium opilio
72 | n01770393 scorpion
73 | n01773157 black and gold garden spider, Argiope aurantia
74 | n01773549 barn spider, Araneus cavaticus
75 | n01773797 garden spider, Aranea diademata
76 | n01774384 black widow, Latrodectus mactans
77 | n01774750 tarantula
78 | n01775062 wolf spider, hunting spider
79 | n01776313 tick
80 | n01784675 centipede
81 | n01795545 black grouse
82 | n01796340 ptarmigan
83 | n01797886 ruffed grouse, partridge, Bonasa umbellus
84 | n01798484 prairie chicken, prairie grouse, prairie fowl
85 | n01806143 peacock
86 | n01806567 quail
87 | n01807496 partridge
88 | n01817953 African grey, African gray, Psittacus erithacus
89 | n01818515 macaw
90 | n01819313 sulphur-crested cockatoo, Kakatoe galerita, Cacatua galerita
91 | n01820546 lorikeet
92 | n01824575 coucal
93 | n01828970 bee eater
94 | n01829413 hornbill
95 | n01833805 hummingbird
96 | n01843065 jacamar
97 | n01843383 toucan
98 | n01847000 drake
99 | n01855032 red-breasted merganser, Mergus serrator
100 | n01855672 goose
101 | n01860187 black swan, Cygnus atratus
102 | n01871265 tusker
103 | n01872401 echidna, spiny anteater, anteater
104 | n01873310 platypus, duckbill, duckbilled platypus, duck-billed platypus, Ornithorhynchus anatinus
105 | n01877812 wallaby, brush kangaroo
106 | n01882714 koala, koala bear, kangaroo bear, native bear, Phascolarctos cinereus
107 | n01883070 wombat
108 | n01910747 jellyfish
109 | n01914609 sea anemone, anemone
110 | n01917289 brain coral
111 | n01924916 flatworm, platyhelminth
112 | n01930112 nematode, nematode worm, roundworm
113 | n01943899 conch
114 | n01944390 snail
115 | n01945685 slug
116 | n01950731 sea slug, nudibranch
117 | n01955084 chiton, coat-of-mail shell, sea cradle, polyplacophore
118 | n01968897 chambered nautilus, pearly nautilus, nautilus
119 | n01978287 Dungeness crab, Cancer magister
120 | n01978455 rock crab, Cancer irroratus
121 | n01980166 fiddler crab
122 | n01981276 king crab, Alaska crab, Alaskan king crab, Alaska king crab, Paralithodes camtschatica
123 | n01983481 American lobster, Northern lobster, Maine lobster, Homarus americanus
124 | n01984695 spiny lobster, langouste, rock lobster, crawfish, crayfish, sea crawfish
125 | n01985128 crayfish, crawfish, crawdad, crawdaddy
126 | n01986214 hermit crab
127 | n01990800 isopod
128 | n02002556 white stork, Ciconia ciconia
129 | n02002724 black stork, Ciconia nigra
130 | n02006656 spoonbill
131 | n02007558 flamingo
132 | n02009229 little blue heron, Egretta caerulea
133 | n02009912 American egret, great white heron, Egretta albus
134 | n02011460 bittern
135 | n02012849 crane
136 | n02013706 limpkin, Aramus pictus
137 | n02017213 European gallinule, Porphyrio porphyrio
138 | n02018207 American coot, marsh hen, mud hen, water hen, Fulica americana
139 | n02018795 bustard
140 | n02025239 ruddy turnstone, Arenaria interpres
141 | n02027492 red-backed sandpiper, dunlin, Erolia alpina
142 | n02028035 redshank, Tringa totanus
143 | n02033041 dowitcher
144 | n02037110 oystercatcher, oyster catcher
145 | n02051845 pelican
146 | n02056570 king penguin, Aptenodytes patagonica
147 | n02058221 albatross, mollymawk
148 | n02066245 grey whale, gray whale, devilfish, Eschrichtius gibbosus, Eschrichtius robustus
149 | n02071294 killer whale, killer, orca, grampus, sea wolf, Orcinus orca
150 | n02074367 dugong, Dugong dugon
151 | n02077923 sea lion
152 | n02085620 Chihuahua
153 | n02085782 Japanese spaniel
154 | n02085936 Maltese dog, Maltese terrier, Maltese
155 | n02086079 Pekinese, Pekingese, Peke
156 | n02086240 Shih-Tzu
157 | n02086646 Blenheim spaniel
158 | n02086910 papillon
159 | n02087046 toy terrier
160 | n02087394 Rhodesian ridgeback
161 | n02088094 Afghan hound, Afghan
162 | n02088238 basset, basset hound
163 | n02088364 beagle
164 | n02088466 bloodhound, sleuthhound
165 | n02088632 bluetick
166 | n02089078 black-and-tan coonhound
167 | n02089867 Walker hound, Walker foxhound
168 | n02089973 English foxhound
169 | n02090379 redbone
170 | n02090622 borzoi, Russian wolfhound
171 | n02090721 Irish wolfhound
172 | n02091032 Italian greyhound
173 | n02091134 whippet
174 | n02091244 Ibizan hound, Ibizan Podenco
175 | n02091467 Norwegian elkhound, elkhound
176 | n02091635 otterhound, otter hound
177 | n02091831 Saluki, gazelle hound
178 | n02092002 Scottish deerhound, deerhound
179 | n02092339 Weimaraner
180 | n02093256 Staffordshire bullterrier, Staffordshire bull terrier
181 | n02093428 American Staffordshire terrier, Staffordshire terrier, American pit bull terrier, pit bull terrier
182 | n02093647 Bedlington terrier
183 | n02093754 Border terrier
184 | n02093859 Kerry blue terrier
185 | n02093991 Irish terrier
186 | n02094114 Norfolk terrier
187 | n02094258 Norwich terrier
188 | n02094433 Yorkshire terrier
189 | n02095314 wire-haired fox terrier
190 | n02095570 Lakeland terrier
191 | n02095889 Sealyham terrier, Sealyham
192 | n02096051 Airedale, Airedale terrier
193 | n02096177 cairn, cairn terrier
194 | n02096294 Australian terrier
195 | n02096437 Dandie Dinmont, Dandie Dinmont terrier
196 | n02096585 Boston bull, Boston terrier
197 | n02097047 miniature schnauzer
198 | n02097130 giant schnauzer
199 | n02097209 standard schnauzer
200 | n02097298 Scotch terrier, Scottish terrier, Scottie
201 | n02097474 Tibetan terrier, chrysanthemum dog
202 | n02097658 silky terrier, Sydney silky
203 | n02098105 soft-coated wheaten terrier
204 | n02098286 West Highland white terrier
205 | n02098413 Lhasa, Lhasa apso
206 | n02099267 flat-coated retriever
207 | n02099429 curly-coated retriever
208 | n02099601 golden retriever
209 | n02099712 Labrador retriever
210 | n02099849 Chesapeake Bay retriever
211 | n02100236 German short-haired pointer
212 | n02100583 vizsla, Hungarian pointer
213 | n02100735 English setter
214 | n02100877 Irish setter, red setter
215 | n02101006 Gordon setter
216 | n02101388 Brittany spaniel
217 | n02101556 clumber, clumber spaniel
218 | n02102040 English springer, English springer spaniel
219 | n02102177 Welsh springer spaniel
220 | n02102318 cocker spaniel, English cocker spaniel, cocker
221 | n02102480 Sussex spaniel
222 | n02102973 Irish water spaniel
223 | n02104029 kuvasz
224 | n02104365 schipperke
225 | n02105056 groenendael
226 | n02105162 malinois
227 | n02105251 briard
228 | n02105412 kelpie
229 | n02105505 komondor
230 | n02105641 Old English sheepdog, bobtail
231 | n02105855 Shetland sheepdog, Shetland sheep dog, Shetland
232 | n02106030 collie
233 | n02106166 Border collie
234 | n02106382 Bouvier des Flandres, Bouviers des Flandres
235 | n02106550 Rottweiler
236 | n02106662 German shepherd, German shepherd dog, German police dog, alsatian
237 | n02107142 Doberman, Doberman pinscher
238 | n02107312 miniature pinscher
239 | n02107574 Greater Swiss Mountain dog
240 | n02107683 Bernese mountain dog
241 | n02107908 Appenzeller
242 | n02108000 EntleBucher
243 | n02108089 boxer
244 | n02108422 bull mastiff
245 | n02108551 Tibetan mastiff
246 | n02108915 French bulldog
247 | n02109047 Great Dane
248 | n02109525 Saint Bernard, St Bernard
249 | n02109961 Eskimo dog, husky
250 | n02110063 malamute, malemute, Alaskan malamute
251 | n02110185 Siberian husky
252 | n02110341 dalmatian, coach dog, carriage dog
253 | n02110627 affenpinscher, monkey pinscher, monkey dog
254 | n02110806 basenji
255 | n02110958 pug, pug-dog
256 | n02111129 Leonberg
257 | n02111277 Newfoundland, Newfoundland dog
258 | n02111500 Great Pyrenees
259 | n02111889 Samoyed, Samoyede
260 | n02112018 Pomeranian
261 | n02112137 chow, chow chow
262 | n02112350 keeshond
263 | n02112706 Brabancon griffon
264 | n02113023 Pembroke, Pembroke Welsh corgi
265 | n02113186 Cardigan, Cardigan Welsh corgi
266 | n02113624 toy poodle
267 | n02113712 miniature poodle
268 | n02113799 standard poodle
269 | n02113978 Mexican hairless
270 | n02114367 timber wolf, grey wolf, gray wolf, Canis lupus
271 | n02114548 white wolf, Arctic wolf, Canis lupus tundrarum
272 | n02114712 red wolf, maned wolf, Canis rufus, Canis niger
273 | n02114855 coyote, prairie wolf, brush wolf, Canis latrans
274 | n02115641 dingo, warrigal, warragal, Canis dingo
275 | n02115913 dhole, Cuon alpinus
276 | n02116738 African hunting dog, hyena dog, Cape hunting dog, Lycaon pictus
277 | n02117135 hyena, hyaena
278 | n02119022 red fox, Vulpes vulpes
279 | n02119789 kit fox, Vulpes macrotis
280 | n02120079 Arctic fox, white fox, Alopex lagopus
281 | n02120505 grey fox, gray fox, Urocyon cinereoargenteus
282 | n02123045 tabby, tabby cat
283 | n02123159 tiger cat
284 | n02123394 Persian cat
285 | n02123597 Siamese cat, Siamese
286 | n02124075 Egyptian cat
287 | n02125311 cougar, puma, catamount, mountain lion, painter, panther, Felis concolor
288 | n02127052 lynx, catamount
289 | n02128385 leopard, Panthera pardus
290 | n02128757 snow leopard, ounce, Panthera uncia
291 | n02128925 jaguar, panther, Panthera onca, Felis onca
292 | n02129165 lion, king of beasts, Panthera leo
293 | n02129604 tiger, Panthera tigris
294 | n02130308 cheetah, chetah, Acinonyx jubatus
295 | n02132136 brown bear, bruin, Ursus arctos
296 | n02133161 American black bear, black bear, Ursus americanus, Euarctos americanus
297 | n02134084 ice bear, polar bear, Ursus Maritimus, Thalarctos maritimus
298 | n02134418 sloth bear, Melursus ursinus, Ursus ursinus
299 | n02137549 mongoose
300 | n02138441 meerkat, mierkat
301 | n02165105 tiger beetle
302 | n02165456 ladybug, ladybeetle, lady beetle, ladybird, ladybird beetle
303 | n02167151 ground beetle, carabid beetle
304 | n02168699 long-horned beetle, longicorn, longicorn beetle
305 | n02169497 leaf beetle, chrysomelid
306 | n02172182 dung beetle
307 | n02174001 rhinoceros beetle
308 | n02177972 weevil
309 | n02190166 fly
310 | n02206856 bee
311 | n02219486 ant, emmet, pismire
312 | n02226429 grasshopper, hopper
313 | n02229544 cricket
314 | n02231487 walking stick, walkingstick, stick insect
315 | n02233338 cockroach, roach
316 | n02236044 mantis, mantid
317 | n02256656 cicada, cicala
318 | n02259212 leafhopper
319 | n02264363 lacewing, lacewing fly
320 | n02268443 dragonfly, darning needle, devil's darning needle, sewing needle, snake feeder, snake doctor, mosquito hawk, skeeter hawk
321 | n02268853 damselfly
322 | n02276258 admiral
323 | n02277742 ringlet, ringlet butterfly
324 | n02279972 monarch, monarch butterfly, milkweed butterfly, Danaus plexippus
325 | n02280649 cabbage butterfly
326 | n02281406 sulphur butterfly, sulfur butterfly
327 | n02281787 lycaenid, lycaenid butterfly
328 | n02317335 starfish, sea star
329 | n02319095 sea urchin
330 | n02321529 sea cucumber, holothurian
331 | n02325366 wood rabbit, cottontail, cottontail rabbit
332 | n02326432 hare
333 | n02328150 Angora, Angora rabbit
334 | n02342885 hamster
335 | n02346627 porcupine, hedgehog
336 | n02356798 fox squirrel, eastern fox squirrel, Sciurus niger
337 | n02361337 marmot
338 | n02363005 beaver
339 | n02364673 guinea pig, Cavia cobaya
340 | n02389026 sorrel
341 | n02391049 zebra
342 | n02395406 hog, pig, grunter, squealer, Sus scrofa
343 | n02396427 wild boar, boar, Sus scrofa
344 | n02397096 warthog
345 | n02398521 hippopotamus, hippo, river horse, Hippopotamus amphibius
346 | n02403003 ox
347 | n02408429 water buffalo, water ox, Asiatic buffalo, Bubalus bubalis
348 | n02410509 bison
349 | n02412080 ram, tup
350 | n02415577 bighorn, bighorn sheep, cimarron, Rocky Mountain bighorn, Rocky Mountain sheep, Ovis canadensis
351 | n02417914 ibex, Capra ibex
352 | n02422106 hartebeest
353 | n02422699 impala, Aepyceros melampus
354 | n02423022 gazelle
355 | n02437312 Arabian camel, dromedary, Camelus dromedarius
356 | n02437616 llama
357 | n02441942 weasel
358 | n02442845 mink
359 | n02443114 polecat, fitch, foulmart, foumart, Mustela putorius
360 | n02443484 black-footed ferret, ferret, Mustela nigripes
361 | n02444819 otter
362 | n02445715 skunk, polecat, wood pussy
363 | n02447366 badger
364 | n02454379 armadillo
365 | n02457408 three-toed sloth, ai, Bradypus tridactylus
366 | n02480495 orangutan, orang, orangutang, Pongo pygmaeus
367 | n02480855 gorilla, Gorilla gorilla
368 | n02481823 chimpanzee, chimp, Pan troglodytes
369 | n02483362 gibbon, Hylobates lar
370 | n02483708 siamang, Hylobates syndactylus, Symphalangus syndactylus
371 | n02484975 guenon, guenon monkey
372 | n02486261 patas, hussar monkey, Erythrocebus patas
373 | n02486410 baboon
374 | n02487347 macaque
375 | n02488291 langur
376 | n02488702 colobus, colobus monkey
377 | n02489166 proboscis monkey, Nasalis larvatus
378 | n02490219 marmoset
379 | n02492035 capuchin, ringtail, Cebus capucinus
380 | n02492660 howler monkey, howler
381 | n02493509 titi, titi monkey
382 | n02493793 spider monkey, Ateles geoffroyi
383 | n02494079 squirrel monkey, Saimiri sciureus
384 | n02497673 Madagascar cat, ring-tailed lemur, Lemur catta
385 | n02500267 indri, indris, Indri indri, Indri brevicaudatus
386 | n02504013 Indian elephant, Elephas maximus
387 | n02504458 African elephant, Loxodonta africana
388 | n02509815 lesser panda, red panda, panda, bear cat, cat bear, Ailurus fulgens
389 | n02510455 giant panda, panda, panda bear, coon bear, Ailuropoda melanoleuca
390 | n02514041 barracouta, snoek
391 | n02526121 eel
392 | n02536864 coho, cohoe, coho salmon, blue jack, silver salmon, Oncorhynchus kisutch
393 | n02606052 rock beauty, Holocanthus tricolor
394 | n02607072 anemone fish
395 | n02640242 sturgeon
396 | n02641379 gar, garfish, garpike, billfish, Lepisosteus osseus
397 | n02643566 lionfish
398 | n02655020 puffer, pufferfish, blowfish, globefish
399 | n02666196 abacus
400 | n02667093 abaya
401 | n02669723 academic gown, academic robe, judge's robe
402 | n02672831 accordion, piano accordion, squeeze box
403 | n02676566 acoustic guitar
404 | n02687172 aircraft carrier, carrier, flattop, attack aircraft carrier
405 | n02690373 airliner
406 | n02692877 airship, dirigible
407 | n02699494 altar
408 | n02701002 ambulance
409 | n02704792 amphibian, amphibious vehicle
410 | n02708093 analog clock
411 | n02727426 apiary, bee house
412 | n02730930 apron
413 | n02747177 ashcan, trash can, garbage can, wastebin, ash bin, ash-bin, ashbin, dustbin, trash barrel, trash bin
414 | n02749479 assault rifle, assault gun
415 | n02769748 backpack, back pack, knapsack, packsack, rucksack, haversack
416 | n02776631 bakery, bakeshop, bakehouse
417 | n02777292 balance beam, beam
418 | n02782093 balloon
419 | n02783161 ballpoint, ballpoint pen, ballpen, Biro
420 | n02786058 Band Aid
421 | n02787622 banjo
422 | n02788148 bannister, banister, balustrade, balusters, handrail
423 | n02790996 barbell
424 | n02791124 barber chair
425 | n02791270 barbershop
426 | n02793495 barn
427 | n02794156 barometer
428 | n02795169 barrel, cask
429 | n02797295 barrow, garden cart, lawn cart, wheelbarrow
430 | n02799071 baseball
431 | n02802426 basketball
432 | n02804414 bassinet
433 | n02804610 bassoon
434 | n02807133 bathing cap, swimming cap
435 | n02808304 bath towel
436 | n02808440 bathtub, bathing tub, bath, tub
437 | n02814533 beach wagon, station wagon, wagon, estate car, beach waggon, station waggon, waggon
438 | n02814860 beacon, lighthouse, beacon light, pharos
439 | n02815834 beaker
440 | n02817516 bearskin, busby, shako
441 | n02823428 beer bottle
442 | n02823750 beer glass
443 | n02825657 bell cote, bell cot
444 | n02834397 bib
445 | n02835271 bicycle-built-for-two, tandem bicycle, tandem
446 | n02837789 bikini, two-piece
447 | n02840245 binder, ring-binder
448 | n02841315 binoculars, field glasses, opera glasses
449 | n02843684 birdhouse
450 | n02859443 boathouse
451 | n02860847 bobsled, bobsleigh, bob
452 | n02865351 bolo tie, bolo, bola tie, bola
453 | n02869837 bonnet, poke bonnet
454 | n02870880 bookcase
455 | n02871525 bookshop, bookstore, bookstall
456 | n02877765 bottlecap
457 | n02879718 bow
458 | n02883205 bow tie, bow-tie, bowtie
459 | n02892201 brass, memorial tablet, plaque
460 | n02892767 brassiere, bra, bandeau
461 | n02894605 breakwater, groin, groyne, mole, bulwark, seawall, jetty
462 | n02895154 breastplate, aegis, egis
463 | n02906734 broom
464 | n02909870 bucket, pail
465 | n02910353 buckle
466 | n02916936 bulletproof vest
467 | n02917067 bullet train, bullet
468 | n02927161 butcher shop, meat market
469 | n02930766 cab, hack, taxi, taxicab
470 | n02939185 caldron, cauldron
471 | n02948072 candle, taper, wax light
472 | n02950826 cannon
473 | n02951358 canoe
474 | n02951585 can opener, tin opener
475 | n02963159 cardigan
476 | n02965783 car mirror
477 | n02966193 carousel, carrousel, merry-go-round, roundabout, whirligig
478 | n02966687 carpenter's kit, tool kit
479 | n02971356 carton
480 | n02974003 car wheel
481 | n02977058 cash machine, cash dispenser, automated teller machine, automatic teller machine, automated teller, automatic teller, ATM
482 | n02978881 cassette
483 | n02979186 cassette player
484 | n02980441 castle
485 | n02981792 catamaran
486 | n02988304 CD player
487 | n02992211 cello, violoncello
488 | n02992529 cellular telephone, cellular phone, cellphone, cell, mobile phone
489 | n02999410 chain
490 | n03000134 chainlink fence
491 | n03000247 chain mail, ring mail, mail, chain armor, chain armour, ring armor, ring armour
492 | n03000684 chain saw, chainsaw
493 | n03014705 chest
494 | n03016953 chiffonier, commode
495 | n03017168 chime, bell, gong
496 | n03018349 china cabinet, china closet
497 | n03026506 Christmas stocking
498 | n03028079 church, church building
499 | n03032252 cinema, movie theater, movie theatre, movie house, picture palace
500 | n03041632 cleaver, meat cleaver, chopper
501 | n03042490 cliff dwelling
502 | n03045698 cloak
503 | n03047690 clog, geta, patten, sabot
504 | n03062245 cocktail shaker
505 | n03063599 coffee mug
506 | n03063689 coffeepot
507 | n03065424 coil, spiral, volute, whorl, helix
508 | n03075370 combination lock
509 | n03085013 computer keyboard, keypad
510 | n03089624 confectionery, confectionary, candy store
511 | n03095699 container ship, containership, container vessel
512 | n03100240 convertible
513 | n03109150 corkscrew, bottle screw
514 | n03110669 cornet, horn, trumpet, trump
515 | n03124043 cowboy boot
516 | n03124170 cowboy hat, ten-gallon hat
517 | n03125729 cradle
518 | n03126707 crane
519 | n03127747 crash helmet
520 | n03127925 crate
521 | n03131574 crib, cot
522 | n03133878 Crock Pot
523 | n03134739 croquet ball
524 | n03141823 crutch
525 | n03146219 cuirass
526 | n03160309 dam, dike, dyke
527 | n03179701 desk
528 | n03180011 desktop computer
529 | n03187595 dial telephone, dial phone
530 | n03188531 diaper, nappy, napkin
531 | n03196217 digital clock
532 | n03197337 digital watch
533 | n03201208 dining table, board
534 | n03207743 dishrag, dishcloth
535 | n03207941 dishwasher, dish washer, dishwashing machine
536 | n03208938 disk brake, disc brake
537 | n03216828 dock, dockage, docking facility
538 | n03218198 dogsled, dog sled, dog sleigh
539 | n03220513 dome
540 | n03223299 doormat, welcome mat
541 | n03240683 drilling platform, offshore rig
542 | n03249569 drum, membranophone, tympan
543 | n03250847 drumstick
544 | n03255030 dumbbell
545 | n03259280 Dutch oven
546 | n03271574 electric fan, blower
547 | n03272010 electric guitar
548 | n03272562 electric locomotive
549 | n03290653 entertainment center
550 | n03291819 envelope
551 | n03297495 espresso maker
552 | n03314780 face powder
553 | n03325584 feather boa, boa
554 | n03337140 file, file cabinet, filing cabinet
555 | n03344393 fireboat
556 | n03345487 fire engine, fire truck
557 | n03347037 fire screen, fireguard
558 | n03355925 flagpole, flagstaff
559 | n03372029 flute, transverse flute
560 | n03376595 folding chair
561 | n03379051 football helmet
562 | n03384352 forklift
563 | n03388043 fountain
564 | n03388183 fountain pen
565 | n03388549 four-poster
566 | n03393912 freight car
567 | n03394916 French horn, horn
568 | n03400231 frying pan, frypan, skillet
569 | n03404251 fur coat
570 | n03417042 garbage truck, dustcart
571 | n03424325 gasmask, respirator, gas helmet
572 | n03425413 gas pump, gasoline pump, petrol pump, island dispenser
573 | n03443371 goblet
574 | n03444034 go-kart
575 | n03445777 golf ball
576 | n03445924 golfcart, golf cart
577 | n03447447 gondola
578 | n03447721 gong, tam-tam
579 | n03450230 gown
580 | n03452741 grand piano, grand
581 | n03457902 greenhouse, nursery, glasshouse
582 | n03459775 grille, radiator grille
583 | n03461385 grocery store, grocery, food market, market
584 | n03467068 guillotine
585 | n03476684 hair slide
586 | n03476991 hair spray
587 | n03478589 half track
588 | n03481172 hammer
589 | n03482405 hamper
590 | n03483316 hand blower, blow dryer, blow drier, hair dryer, hair drier
591 | n03485407 hand-held computer, hand-held microcomputer
592 | n03485794 handkerchief, hankie, hanky, hankey
593 | n03492542 hard disc, hard disk, fixed disk
594 | n03494278 harmonica, mouth organ, harp, mouth harp
595 | n03495258 harp
596 | n03496892 harvester, reaper
597 | n03498962 hatchet
598 | n03527444 holster
599 | n03529860 home theater, home theatre
600 | n03530642 honeycomb
601 | n03532672 hook, claw
602 | n03534580 hoopskirt, crinoline
603 | n03535780 horizontal bar, high bar
604 | n03538406 horse cart, horse-cart
605 | n03544143 hourglass
606 | n03584254 iPod
607 | n03584829 iron, smoothing iron
608 | n03590841 jack-o'-lantern
609 | n03594734 jean, blue jean, denim
610 | n03594945 jeep, landrover
611 | n03595614 jersey, T-shirt, tee shirt
612 | n03598930 jigsaw puzzle
613 | n03599486 jinrikisha, ricksha, rickshaw
614 | n03602883 joystick
615 | n03617480 kimono
616 | n03623198 knee pad
617 | n03627232 knot
618 | n03630383 lab coat, laboratory coat
619 | n03633091 ladle
620 | n03637318 lampshade, lamp shade
621 | n03642806 laptop, laptop computer
622 | n03649909 lawn mower, mower
623 | n03657121 lens cap, lens cover
624 | n03658185 letter opener, paper knife, paperknife
625 | n03661043 library
626 | n03662601 lifeboat
627 | n03666591 lighter, light, igniter, ignitor
628 | n03670208 limousine, limo
629 | n03673027 liner, ocean liner
630 | n03676483 lipstick, lip rouge
631 | n03680355 Loafer
632 | n03690938 lotion
633 | n03691459 loudspeaker, speaker, speaker unit, loudspeaker system, speaker system
634 | n03692522 loupe, jeweler's loupe
635 | n03697007 lumbermill, sawmill
636 | n03706229 magnetic compass
637 | n03709823 mailbag, postbag
638 | n03710193 mailbox, letter box
639 | n03710637 maillot
640 | n03710721 maillot, tank suit
641 | n03717622 manhole cover
642 | n03720891 maraca
643 | n03721384 marimba, xylophone
644 | n03724870 mask
645 | n03729826 matchstick
646 | n03733131 maypole
647 | n03733281 maze, labyrinth
648 | n03733805 measuring cup
649 | n03742115 medicine chest, medicine cabinet
650 | n03743016 megalith, megalithic structure
651 | n03759954 microphone, mike
652 | n03761084 microwave, microwave oven
653 | n03763968 military uniform
654 | n03764736 milk can
655 | n03769881 minibus
656 | n03770439 miniskirt, mini
657 | n03770679 minivan
658 | n03773504 missile
659 | n03775071 mitten
660 | n03775546 mixing bowl
661 | n03776460 mobile home, manufactured home
662 | n03777568 Model T
663 | n03777754 modem
664 | n03781244 monastery
665 | n03782006 monitor
666 | n03785016 moped
667 | n03786901 mortar
668 | n03787032 mortarboard
669 | n03788195 mosque
670 | n03788365 mosquito net
671 | n03791053 motor scooter, scooter
672 | n03792782 mountain bike, all-terrain bike, off-roader
673 | n03792972 mountain tent
674 | n03793489 mouse, computer mouse
675 | n03794056 mousetrap
676 | n03796401 moving van
677 | n03803284 muzzle
678 | n03804744 nail
679 | n03814639 neck brace
680 | n03814906 necklace
681 | n03825788 nipple
682 | n03832673 notebook, notebook computer
683 | n03837869 obelisk
684 | n03838899 oboe, hautboy, hautbois
685 | n03840681 ocarina, sweet potato
686 | n03841143 odometer, hodometer, mileometer, milometer
687 | n03843555 oil filter
688 | n03854065 organ, pipe organ
689 | n03857828 oscilloscope, scope, cathode-ray oscilloscope, CRO
690 | n03866082 overskirt
691 | n03868242 oxcart
692 | n03868863 oxygen mask
693 | n03871628 packet
694 | n03873416 paddle, boat paddle
695 | n03874293 paddlewheel, paddle wheel
696 | n03874599 padlock
697 | n03876231 paintbrush
698 | n03877472 pajama, pyjama, pj's, jammies
699 | n03877845 palace
700 | n03884397 panpipe, pandean pipe, syrinx
701 | n03887697 paper towel
702 | n03888257 parachute, chute
703 | n03888605 parallel bars, bars
704 | n03891251 park bench
705 | n03891332 parking meter
706 | n03895866 passenger car, coach, carriage
707 | n03899768 patio, terrace
708 | n03902125 pay-phone, pay-station
709 | n03903868 pedestal, plinth, footstall
710 | n03908618 pencil box, pencil case
711 | n03908714 pencil sharpener
712 | n03916031 perfume, essence
713 | n03920288 Petri dish
714 | n03924679 photocopier
715 | n03929660 pick, plectrum, plectron
716 | n03929855 pickelhaube
717 | n03930313 picket fence, paling
718 | n03930630 pickup, pickup truck
719 | n03933933 pier
720 | n03935335 piggy bank, penny bank
721 | n03937543 pill bottle
722 | n03938244 pillow
723 | n03942813 ping-pong ball
724 | n03944341 pinwheel
725 | n03947888 pirate, pirate ship
726 | n03950228 pitcher, ewer
727 | n03954731 plane, carpenter's plane, woodworking plane
728 | n03956157 planetarium
729 | n03958227 plastic bag
730 | n03961711 plate rack
731 | n03967562 plow, plough
732 | n03970156 plunger, plumber's helper
733 | n03976467 Polaroid camera, Polaroid Land camera
734 | n03976657 pole
735 | n03977966 police van, police wagon, paddy wagon, patrol wagon, wagon, black Maria
736 | n03980874 poncho
737 | n03982430 pool table, billiard table, snooker table
738 | n03983396 pop bottle, soda bottle
739 | n03991062 pot, flowerpot
740 | n03992509 potter's wheel
741 | n03995372 power drill
742 | n03998194 prayer rug, prayer mat
743 | n04004767 printer
744 | n04005630 prison, prison house
745 | n04008634 projectile, missile
746 | n04009552 projector
747 | n04019541 puck, hockey puck
748 | n04023962 punching bag, punch bag, punching ball, punchball
749 | n04026417 purse
750 | n04033901 quill, quill pen
751 | n04033995 quilt, comforter, comfort, puff
752 | n04037443 racer, race car, racing car
753 | n04039381 racket, racquet
754 | n04040759 radiator
755 | n04041544 radio, wireless
756 | n04044716 radio telescope, radio reflector
757 | n04049303 rain barrel
758 | n04065272 recreational vehicle, RV, R.V.
759 | n04067472 reel
760 | n04069434 reflex camera
761 | n04070727 refrigerator, icebox
762 | n04074963 remote control, remote
763 | n04081281 restaurant, eating house, eating place, eatery
764 | n04086273 revolver, six-gun, six-shooter
765 | n04090263 rifle
766 | n04099969 rocking chair, rocker
767 | n04111531 rotisserie
768 | n04116512 rubber eraser, rubber, pencil eraser
769 | n04118538 rugby ball
770 | n04118776 rule, ruler
771 | n04120489 running shoe
772 | n04125021 safe
773 | n04127249 safety pin
774 | n04131690 saltshaker, salt shaker
775 | n04133789 sandal
776 | n04136333 sarong
777 | n04141076 sax, saxophone
778 | n04141327 scabbard
779 | n04141975 scale, weighing machine
780 | n04146614 school bus
781 | n04147183 schooner
782 | n04149813 scoreboard
783 | n04152593 screen, CRT screen
784 | n04153751 screw
785 | n04154565 screwdriver
786 | n04162706 seat belt, seatbelt
787 | n04179913 sewing machine
788 | n04192698 shield, buckler
789 | n04200800 shoe shop, shoe-shop, shoe store
790 | n04201297 shoji
791 | n04204238 shopping basket
792 | n04204347 shopping cart
793 | n04208210 shovel
794 | n04209133 shower cap
795 | n04209239 shower curtain
796 | n04228054 ski
797 | n04229816 ski mask
798 | n04235860 sleeping bag
799 | n04238763 slide rule, slipstick
800 | n04239074 sliding door
801 | n04243546 slot, one-armed bandit
802 | n04251144 snorkel
803 | n04252077 snowmobile
804 | n04252225 snowplow, snowplough
805 | n04254120 soap dispenser
806 | n04254680 soccer ball
807 | n04254777 sock
808 | n04258138 solar dish, solar collector, solar furnace
809 | n04259630 sombrero
810 | n04263257 soup bowl
811 | n04264628 space bar
812 | n04265275 space heater
813 | n04266014 space shuttle
814 | n04270147 spatula
815 | n04273569 speedboat
816 | n04275548 spider web, spider's web
817 | n04277352 spindle
818 | n04285008 sports car, sport car
819 | n04286575 spotlight, spot
820 | n04296562 stage
821 | n04310018 steam locomotive
822 | n04311004 steel arch bridge
823 | n04311174 steel drum
824 | n04317175 stethoscope
825 | n04325704 stole
826 | n04326547 stone wall
827 | n04328186 stopwatch, stop watch
828 | n04330267 stove
829 | n04332243 strainer
830 | n04335435 streetcar, tram, tramcar, trolley, trolley car
831 | n04336792 stretcher
832 | n04344873 studio couch, day bed
833 | n04346328 stupa, tope
834 | n04347754 submarine, pigboat, sub, U-boat
835 | n04350905 suit, suit of clothes
836 | n04355338 sundial
837 | n04355933 sunglass
838 | n04356056 sunglasses, dark glasses, shades
839 | n04357314 sunscreen, sunblock, sun blocker
840 | n04366367 suspension bridge
841 | n04367480 swab, swob, mop
842 | n04370456 sweatshirt
843 | n04371430 swimming trunks, bathing trunks
844 | n04371774 swing
845 | n04372370 switch, electric switch, electrical switch
846 | n04376876 syringe
847 | n04380533 table lamp
848 | n04389033 tank, army tank, armored combat vehicle, armoured combat vehicle
849 | n04392985 tape player
850 | n04398044 teapot
851 | n04399382 teddy, teddy bear
852 | n04404412 television, television system
853 | n04409515 tennis ball
854 | n04417672 thatch, thatched roof
855 | n04418357 theater curtain, theatre curtain
856 | n04423845 thimble
857 | n04428191 thresher, thrasher, threshing machine
858 | n04429376 throne
859 | n04435653 tile roof
860 | n04442312 toaster
861 | n04443257 tobacco shop, tobacconist shop, tobacconist
862 | n04447861 toilet seat
863 | n04456115 torch
864 | n04458633 totem pole
865 | n04461696 tow truck, tow car, wrecker
866 | n04462240 toyshop
867 | n04465501 tractor
868 | n04467665 trailer truck, tractor trailer, trucking rig, rig, articulated lorry, semi
869 | n04476259 tray
870 | n04479046 trench coat
871 | n04482393 tricycle, trike, velocipede
872 | n04483307 trimaran
873 | n04485082 tripod
874 | n04486054 triumphal arch
875 | n04487081 trolleybus, trolley coach, trackless trolley
876 | n04487394 trombone
877 | n04493381 tub, vat
878 | n04501370 turnstile
879 | n04505470 typewriter keyboard
880 | n04507155 umbrella
881 | n04509417 unicycle, monocycle
882 | n04515003 upright, upright piano
883 | n04517823 vacuum, vacuum cleaner
884 | n04522168 vase
885 | n04523525 vault
886 | n04525038 velvet
887 | n04525305 vending machine
888 | n04532106 vestment
889 | n04532670 viaduct
890 | n04536866 violin, fiddle
891 | n04540053 volleyball
892 | n04542943 waffle iron
893 | n04548280 wall clock
894 | n04548362 wallet, billfold, notecase, pocketbook
895 | n04550184 wardrobe, closet, press
896 | n04552348 warplane, military plane
897 | n04553703 washbasin, handbasin, washbowl, lavabo, wash-hand basin
898 | n04554684 washer, automatic washer, washing machine
899 | n04557648 water bottle
900 | n04560804 water jug
901 | n04562935 water tower
902 | n04579145 whiskey jug
903 | n04579432 whistle
904 | n04584207 wig
905 | n04589890 window screen
906 | n04590129 window shade
907 | n04591157 Windsor tie
908 | n04591713 wine bottle
909 | n04592741 wing
910 | n04596742 wok
911 | n04597913 wooden spoon
912 | n04599235 wool, woolen, woollen
913 | n04604644 worm fence, snake fence, snake-rail fence, Virginia fence
914 | n04606251 wreck
915 | n04612504 yawl
916 | n04613696 yurt
917 | n06359193 web site, website, internet site, site
918 | n06596364 comic book
919 | n06785654 crossword puzzle, crossword
920 | n06794110 street sign
921 | n06874185 traffic light, traffic signal, stoplight
922 | n07248320 book jacket, dust cover, dust jacket, dust wrapper
923 | n07565083 menu
924 | n07579787 plate
925 | n07583066 guacamole
926 | n07584110 consomme
927 | n07590611 hot pot, hotpot
928 | n07613480 trifle
929 | n07614500 ice cream, icecream
930 | n07615774 ice lolly, lolly, lollipop, popsicle
931 | n07684084 French loaf
932 | n07693725 bagel, beigel
933 | n07695742 pretzel
934 | n07697313 cheeseburger
935 | n07697537 hotdog, hot dog, red hot
936 | n07711569 mashed potato
937 | n07714571 head cabbage
938 | n07714990 broccoli
939 | n07715103 cauliflower
940 | n07716358 zucchini, courgette
941 | n07716906 spaghetti squash
942 | n07717410 acorn squash
943 | n07717556 butternut squash
944 | n07718472 cucumber, cuke
945 | n07718747 artichoke, globe artichoke
946 | n07720875 bell pepper
947 | n07730033 cardoon
948 | n07734744 mushroom
949 | n07742313 Granny Smith
950 | n07745940 strawberry
951 | n07747607 orange
952 | n07749582 lemon
953 | n07753113 fig
954 | n07753275 pineapple, ananas
955 | n07753592 banana
956 | n07754684 jackfruit, jak, jack
957 | n07760859 custard apple
958 | n07768694 pomegranate
959 | n07802026 hay
960 | n07831146 carbonara
961 | n07836838 chocolate sauce, chocolate syrup
962 | n07860988 dough
963 | n07871810 meat loaf, meatloaf
964 | n07873807 pizza, pizza pie
965 | n07875152 potpie
966 | n07880968 burrito
967 | n07892512 red wine
968 | n07920052 espresso
969 | n07930864 cup
970 | n07932039 eggnog
971 | n09193705 alp
972 | n09229709 bubble
973 | n09246464 cliff, drop, drop-off
974 | n09256479 coral reef
975 | n09288635 geyser
976 | n09332890 lakeside, lakeshore
977 | n09399592 promontory, headland, head, foreland
978 | n09421951 sandbar, sand bar
979 | n09428293 seashore, coast, seacoast, sea-coast
980 | n09468604 valley, vale
981 | n09472597 volcano
982 | n09835506 ballplayer, baseball player
983 | n10148035 groom, bridegroom
984 | n10565667 scuba diver
985 | n11879895 rapeseed
986 | n11939491 daisy
987 | n12057211 yellow lady's slipper, yellow lady-slipper, Cypripedium calceolus, Cypripedium parviflorum
988 | n12144580 corn
989 | n12267677 acorn
990 | n12620546 hip, rose hip, rosehip
991 | n12768682 buckeye, horse chestnut, conker
992 | n12985857 coral fungus
993 | n12998815 agaric
994 | n13037406 gyromitra
995 | n13040303 stinkhorn, carrion fungus
996 | n13044778 earthstar
997 | n13052670 hen-of-the-woods, hen of the woods, Polyporus frondosus, Grifola frondosa
998 | n13054560 bolete
999 | n13133613 ear, spike, capitulum
1000 | n15075141 toilet tissue, toilet paper, bathroom tissue
--------------------------------------------------------------------------------
/datasets/lifesat/README.md:
--------------------------------------------------------------------------------
1 | # Life satisfaction and GDP per capita
2 | ## Life satisfaction
3 | ### Source
4 | This dataset was obtained from the OECD's website at: http://stats.oecd.org/index.aspx?DataSetCode=BLI
5 |
6 | ### Data description
7 |
8 | Int64Index: 3292 entries, 0 to 3291
9 | Data columns (total 17 columns):
10 | "LOCATION" 3292 non-null object
11 | Country 3292 non-null object
12 | INDICATOR 3292 non-null object
13 | Indicator 3292 non-null object
14 | MEASURE 3292 non-null object
15 | Measure 3292 non-null object
16 | INEQUALITY 3292 non-null object
17 | Inequality 3292 non-null object
18 | Unit Code 3292 non-null object
19 | Unit 3292 non-null object
20 | PowerCode Code 3292 non-null int64
21 | PowerCode 3292 non-null object
22 | Reference Period Code 0 non-null float64
23 | Reference Period 0 non-null float64
24 | Value 3292 non-null float64
25 | Flag Codes 1120 non-null object
26 | Flags 1120 non-null object
27 | dtypes: float64(3), int64(1), object(13)
28 | memory usage: 462.9+ KB
29 |
30 | ### Example usage with Python and pandas
31 |
32 | >>> life_sat = pd.read_csv("oecd_bli_2015.csv", thousands=',')
33 |
34 | >>> life_sat_total = life_sat[life_sat["INEQUALITY"]=="TOT"]
35 |
36 | >>> life_sat_total = life_sat_total.pivot(index="Country", columns="Indicator", values="Value")
37 |
38 | >>> life_sat_total.info()
39 |
40 | Index: 37 entries, Australia to United States
41 | Data columns (total 24 columns):
42 | Air pollution 37 non-null float64
43 | Assault rate 37 non-null float64
44 | Consultation on rule-making 37 non-null float64
45 | Dwellings without basic facilities 37 non-null float64
46 | Educational attainment 37 non-null float64
47 | Employees working very long hours 37 non-null float64
48 | Employment rate 37 non-null float64
49 | Homicide rate 37 non-null float64
50 | Household net adjusted disposable income 37 non-null float64
51 | Household net financial wealth 37 non-null float64
52 | Housing expenditure 37 non-null float64
53 | Job security 37 non-null float64
54 | Life expectancy 37 non-null float64
55 | Life satisfaction 37 non-null float64
56 | Long-term unemployment rate 37 non-null float64
57 | Personal earnings 37 non-null float64
58 | Quality of support network 37 non-null float64
59 | Rooms per person 37 non-null float64
60 | Self-reported health 37 non-null float64
61 | Student skills 37 non-null float64
62 | Time devoted to leisure and personal care 37 non-null float64
63 | Voter turnout 37 non-null float64
64 | Water quality 37 non-null float64
65 | Years in education 37 non-null float64
66 | dtypes: float64(24)
67 | memory usage: 7.2+ KB
68 |
69 | ## GDP per capita
70 | ### Source
71 | Dataset obtained from the IMF's website at: http://goo.gl/j1MSKe
72 |
73 | ### Data description
74 |
75 | Int64Index: 190 entries, 0 to 189
76 | Data columns (total 7 columns):
77 | Country 190 non-null object
78 | Subject Descriptor 189 non-null object
79 | Units 189 non-null object
80 | Scale 189 non-null object
81 | Country/Series-specific Notes 188 non-null object
82 | 2015 187 non-null float64
83 | Estimates Start After 188 non-null float64
84 | dtypes: float64(2), object(5)
85 | memory usage: 11.9+ KB
86 |
87 | ### Example usage with Python and pandas
88 |
89 | >>> gdp_per_capita = pd.read_csv(
90 | ... datapath+"gdp_per_capita.csv", thousands=',', delimiter='\t',
91 | ... encoding='latin1', na_values="n/a", index_col="Country")
92 | ...
93 | >>> gdp_per_capita.rename(columns={"2015": "GDP per capita"}, inplace=True)
94 |
95 |
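As a final illustration (a sketch only; it reuses the `life_sat_total` and `gdp_per_capita` variables prepared above, and the variable name below is just illustrative), the two tables can be joined on their shared country index:

>>> full_country_stats = pd.merge(left=life_sat_total, right=gdp_per_capita,
...                               left_index=True, right_index=True)
>>> full_country_stats[["GDP per capita", "Life satisfaction"]].head()
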
--------------------------------------------------------------------------------
/datasets/lifesat/gdp_per_capita.csv:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ageron/handson-ml/ac1310a3cc1567ecfb4b798715c804627076775f/datasets/lifesat/gdp_per_capita.csv
--------------------------------------------------------------------------------
/docker/.env:
--------------------------------------------------------------------------------
1 | COMPOSE_PROJECT_NAME=handson-ml
2 |
--------------------------------------------------------------------------------
/docker/Dockerfile:
--------------------------------------------------------------------------------
1 | FROM continuumio/miniconda3:latest
2 |
3 | RUN apt-get update && apt-get install -y \
4 | build-essential \
5 | cmake \
6 | ffmpeg \
7 | git \
8 | libboost-all-dev \
9 | libjpeg-dev \
10 | libpq-dev \
11 | libsdl2-dev swig \
12 | sudo \
13 | unzip \
14 | xorg-dev \
15 | xvfb \
16 | zip \
17 | zlib1g-dev \
18 | && apt clean \
19 | && rm -rf /var/lib/apt/lists/*
20 |
21 | COPY environment.yml /tmp/
22 | RUN echo ' - pyvirtualdisplay' >> /tmp/environment.yml \
23 | && conda env create -f /tmp/environment.yml \
24 | && conda clean -afy \
25 | && find /opt/conda/ -follow -type f -name '*.a' -delete \
26 | && find /opt/conda/ -follow -type f -name '*.pyc' -delete \
27 | && find /opt/conda/ -follow -type f -name '*.js.map' -delete \
28 | && rm /tmp/environment.yml
29 |
30 | ARG username
31 | ARG userid
32 |
33 | ARG home=/home/${username}
34 | ARG workdir=${home}/handson-ml
35 |
36 | RUN adduser ${username} --uid ${userid} --gecos '' --disabled-password \
37 | && echo "${username} ALL=(root) NOPASSWD:ALL" > /etc/sudoers.d/${username} \
38 | && chmod 0440 /etc/sudoers.d/${username}
39 |
40 | WORKDIR ${workdir}
41 | RUN chown ${username}:${username} ${workdir}
42 |
43 | USER ${username}
44 | WORKDIR ${workdir}
45 |
46 | ENV PATH /opt/conda/envs/tf1/bin:$PATH
47 |
48 | # The config below enables diffing notebooks with nbdiff (and nbdiff support
49 | # in git diff command) after connecting to the container by "make exec" (or
50 | # "docker-compose exec handson-ml bash")
51 | # You may also try running:
52 | # nbdiff NOTEBOOK_NAME.ipynb
53 | # to get nbdiff between checkpointed version and current version of the
54 | # given notebook.
55 |
56 | RUN git-nbdiffdriver config --enable --global
57 |
58 | # INFO: Optionally uncomment any (one) of the following RUN commands below to ignore either
59 | # metadata or details in nbdiff within git diff
60 | #RUN git config --global diff.jupyternotebook.command 'git-nbdiffdriver diff --ignore-metadata'
61 | RUN git config --global diff.jupyternotebook.command 'git-nbdiffdriver diff --ignore-details'
62 |
63 |
64 | COPY docker/bashrc.bash /tmp/
65 | RUN cat /tmp/bashrc.bash >> ${home}/.bashrc \
66 | && echo "export PATH=\"${workdir}/docker/bin:$PATH\"" >> ${home}/.bashrc \
67 | && sudo rm /tmp/bashrc.bash
68 |
69 |
70 | # INFO: Uncomment lines below to enable automatic save of python-only and html-only
71 | # exports alongside the notebook
72 | #COPY docker/jupyter_notebook_config.py /tmp/
73 | #RUN cat /tmp/jupyter_notebook_config.py >> ${home}/.jupyter/jupyter_notebook_config.py
74 | #RUN sudo rm /tmp/jupyter_notebook_config.py
75 |
76 |
77 | # INFO: Uncomment the RUN command below to disable git diff paging
78 | #RUN git config --global core.pager ''
79 |
80 |
81 | # INFO: Uncomment the RUN command below for easy and constant notebook URL (just localhost:8888)
82 | # That will switch Jupyter to using empty password instead of a token.
83 | # To avoid making a security hole you SHOULD in fact not only uncomment but
84 | # regenerate the hash for your own non-empty password and replace the hash below.
85 | # You can compute a password hash in any notebook, just run the code:
86 | # from notebook.auth import passwd
87 | # passwd()
88 | # and take the hash from the output
89 | #RUN mkdir -p ${home}/.jupyter && \
90 | # echo 'c.NotebookApp.password = u"sha1:c6bbcba2d04b:f969e403db876dcfbe26f47affe41909bd53392e"' \
91 | # >> ${home}/.jupyter/jupyter_notebook_config.py
92 |
--------------------------------------------------------------------------------
/docker/Makefile:
--------------------------------------------------------------------------------
1 |
2 | help:
3 | cat Makefile
4 | run:
5 | docker-compose up
6 | exec:
7 | docker-compose exec handson-ml bash
8 | build: stop .FORCE
9 | docker-compose build
10 | rebuild: stop .FORCE
11 | docker-compose build --no-cache
12 | stop:
13 | docker stop handson-ml || true; docker rm handson-ml || true;
14 | .FORCE:
15 |
--------------------------------------------------------------------------------
/docker/README.md:
--------------------------------------------------------------------------------
1 |
2 | # Hands-on Machine Learning in Docker
3 |
4 | This is the Docker configuration which allows you to run and tweak the book's notebooks without installing any dependencies on your machine! OK, any except `docker` and `docker-compose`. And optionally `make`. And a few more things if you want GPU support (see below for details).
5 |
6 | ## Prerequisites
7 |
8 | Follow the instructions on [Install Docker](https://docs.docker.com/engine/installation/) and [Install Docker Compose](https://docs.docker.com/compose/install/) for your environment if you haven't got `docker` and `docker-compose` already.
9 |
10 | Some general knowledge about `docker` infrastructure might be useful (that's an interesting topic on its own) but is not strictly *required* to just run the notebooks.
11 |
12 | ## Usage
13 |
14 | ### Prepare the image (once)
15 |
16 | The first option is to pull the image from Docker Hub (this will download about 1.9 GB of compressed data):
17 |
18 | ```bash
19 | $ docker pull ageron/handson-ml
20 | ```
21 |
22 | **Note**: this is the CPU-only image. For GPU support, read the GPU section below.
23 |
24 | Alternatively, you can build the image yourself. This will be slower, but it ensures the image is up to date with the latest libraries, except for TensorFlow, which stays at version 1.15 (rather than the latest TensorFlow 2.x). For this, assuming you already downloaded this project into the directory `/path/to/project/handson-ml`:
25 |
26 | ```bash
27 | $ cd /path/to/project/handson-ml/docker
28 | $ docker-compose build
29 | ```
30 |
31 | This will take quite a while, but is only required once.
32 |
33 | Once the process finishes, you will have an `ageron/handson-ml:latest` image, which will be the base for your experiments. You can confirm that by running the following command:
34 |
35 | ```bash
36 | $ docker images
37 | REPOSITORY TAG IMAGE ID CREATED SIZE
38 | ageron/handson-ml latest 4fcbafcd715b 3 minutes ago 4.85GB
39 | ```
40 |
41 | ### Run the notebooks
42 |
43 | Still assuming you already downloaded this project into the directory `/path/to/project/handson-ml`, run the following commands to start the Jupyter server inside the container, which is named `handson-ml`:
44 |
45 | ```bash
46 | $ cd /path/to/project/handson-ml/docker
47 | $ docker-compose up
48 | ```
49 |
50 | Next, just point your browser to the URL printed on the screen (or go to http://localhost:8888 if you enabled password authentication inside the `jupyter_notebook_config.py` file before building the image) and you're ready to play with the book's code!
51 |
52 | The server runs in the directory containing the notebooks, and the changes you make from the browser will be persisted there.
53 |
54 | You can close the server just by pressing `Ctrl-C` in the terminal window.
55 |
56 | ### Using `make` (optional)
57 |
58 | If you have `make` installed on your computer, you can use it as a thin layer over the `docker-compose` commands. For example, executing `make rebuild` actually runs `docker-compose build --no-cache`, which rebuilds the image without using the cache. This ensures that your image is built on the latest version of its `continuumio/miniconda3` base image.
59 |
60 | If you don't have `make` (and you don't want to install it), just examine the contents of `Makefile` to see which `docker-compose` commands you can run instead.
61 |
62 | ### Run additional commands in the container
63 |
64 | Run `make exec` (or `docker-compose exec handson-ml bash`) while the server is running to open an additional `bash` shell inside the `handson-ml` container. You are now inside the environment prepared within the image.
65 |
66 | One useful thing to do there is to start TensorBoard (for example with the simple `tb` command; see the bashrc file).
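
For instance, a minimal sketch (it assumes port 6006 is published on the host, as in the `docker run` command shown in the GPU section below):

```bash
# inside the container shell opened by `make exec`
$ tb    # alias for `tensorboard --logdir=tf_logs`, defined in docker/bashrc.bash
```

Then open http://localhost:6006 in your browser.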
67 |
68 | Another is comparing versions of the notebooks using the `nbdiff` command if you don't have `nbdime` installed locally (it is **way** better than plain `diff` for notebooks). See [Tools for diffing and merging of Jupyter notebooks](https://github.com/jupyter/nbdime) for more details.
69 |
70 | You can see the changes you made relative to the version tracked in git using `git diff`, which is integrated with `nbdiff`.
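
For example (the notebook name here is just an illustration):

```bash
# notebook-aware diff of your local, uncommitted changes
$ git diff 02_end_to_end_machine_learning_project.ipynb
```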
71 |
72 | You may also try the `nbd NOTEBOOK_NAME.ipynb` command (custom; see the bashrc file) to compare one of your notebooks with its checkpointed version.
73 | To be precise, the output will tell you *what modifications should be re-played on the **manually saved** version of the notebook (located in the `.ipynb_checkpoints` subdirectory) to update it to the **current**, i.e. **auto-saved**, version (given as the command's argument and located in the working directory)*.
74 |
75 | ## GPU Support on Linux (experimental)
76 |
77 | ### Prerequisites
78 |
79 | If you're running on Linux, and you have a TensorFlow-compatible GPU card (NVidia card with Compute Capability ≥ 3.5) that you would like TensorFlow to use inside the Docker container, then you should download and install the latest driver for your card from [nvidia.com](https://www.nvidia.com/Download/index.aspx?lang=en-us). You will also need to install [NVidia Docker support](https://github.com/NVIDIA/nvidia-docker): if you are using Docker 19.03 or above, you must install the `nvidia-container-toolkit` package, and for earlier versions, you must install `nvidia-docker2`.
80 |
81 | Next, edit the `docker-compose.yml` file:
82 |
83 | ```bash
84 | $ cd /path/to/project/handson-ml/docker
85 | $ edit docker-compose.yml # use your favorite editor
86 | ```
87 |
88 | * Replace `dockerfile: ./docker/Dockerfile` with `dockerfile: ./docker/Dockerfile.gpu`
89 | * Replace `image: ageron/handson-ml:latest` with `image: ageron/handson-ml:latest-gpu`
90 | * If you want to use `docker-compose`, you will need version 1.28 or above for GPU support, and you must uncomment the whole `deploy` section in `docker-compose.yml`.
91 |
92 | ### Prepare the image (once)
93 |
94 | If you want to pull the prebuilt image from Docker Hub (this will download over 3.5 GB of compressed data):
95 |
96 | ```bash
97 | $ docker pull ageron/handson-ml:latest-gpu
98 | ```
99 |
100 | If you prefer to build the image yourself:
101 |
102 | ```bash
103 | $ cd /path/to/project/handson-ml/docker
104 | $ docker-compose build
105 | ```
106 |
107 | ### Run the notebooks with `docker-compose` (version 1.28 or above)
108 |
109 | If you have `docker-compose` version 1.28 or above, that's great! You can simply run:
110 |
111 | ```bash
112 | $ cd /path/to/project/handson-ml/docker
113 | $ docker-compose up
114 | [...]
115 | or http://127.0.0.1:8888/?token=[...]
116 | ```
117 |
118 | Then point your browser to the URL and Jupyter should appear. If you then open or create a notebook and execute the following code, a list containing your GPU device(s) should be displayed (success!):
119 |
120 | ```python
121 | import tensorflow as tf
122 |
123 | tf.config.list_physical_devices("GPU")
124 | ```
125 |
126 | To stop the server, just press Ctrl-C.
127 |
128 | ### Run the notebooks without `docker-compose`
129 |
130 | If you have a version of `docker-compose` earlier than 1.28, you will have to use `docker run` directly.
131 |
132 | If you are using Docker 19.03 or above, you can run:
133 |
134 | ```bash
135 | $ cd /path/to/project/handson-ml
136 | $ docker run --name handson-ml --gpus all -p 8888:8888 -p 6006:6006 --log-opt mode=non-blocking --log-opt max-buffer-size=50m -v `pwd`:/home/devel/handson-ml ageron/handson-ml:latest-gpu /opt/conda/envs/tf1/bin/jupyter notebook --ip='0.0.0.0' --port=8888 --no-browser
137 | ```
138 |
139 | If you are using an older version of Docker, then replace `--gpus all` with `--runtime=nvidia`.
140 |
141 | Now point your browser to the displayed URL: Jupyter should appear, and you can open a notebook and run `import tensorflow as tf` and `tf.config.list_physical_devices("GPU")` as above to confirm that TensorFlow does indeed see your GPU device(s).
142 |
143 | Lastly, to interrupt the server, press Ctrl-C, then run:
144 |
145 | ```bash
146 | $ docker rm handson-ml
147 | ```
148 |
149 | This will remove the container so you can start a new one later (but it will not remove the image or the notebooks, don't worry!).
150 |
151 | Have fun!
152 |
--------------------------------------------------------------------------------
/docker/bashrc.bash:
--------------------------------------------------------------------------------
1 | alias ll="ls -alF"
2 | alias nbd="nbdiff_checkpoint"
3 | alias tb="tensorboard --logdir=tf_logs"
4 |
--------------------------------------------------------------------------------
/docker/bin/nbclean_checkpoints:
--------------------------------------------------------------------------------
1 | #!/usr/bin/env python
2 |
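# Usage sketch (based on the argparse options defined in main() below):
#   nbclean_checkpoints              # clean checkpoints under the current directory
#   nbclean_checkpoints -d -v DIR    # dry run, verbose, for a specific directory
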
3 | import collections
4 | import glob
5 | import hashlib
6 | import os
7 | import subprocess
8 |
9 |
10 | class NotebookAnalyser:
11 |
12 | def __init__(self, dry_run=False, verbose=False, colorful=False):
13 | self._dry_run = dry_run
14 | self._verbose = verbose
15 | self._colors = collections.defaultdict(lambda: "")
16 | if colorful:
17 | for color in [
18 | NotebookAnalyser.COLOR_WHITE,
19 | NotebookAnalyser.COLOR_RED,
20 | NotebookAnalyser.COLOR_GREEN,
21 | NotebookAnalyser.COLOR_YELLOW,
22 | ]:
23 | self._colors[color] = "\033[{}m".format(color)
24 |
25 | NOTEBOOK_SUFFIX = ".ipynb"
26 | CHECKPOINT_DIR = NOTEBOOK_SUFFIX + "_checkpoints"
27 | CHECKPOINT_MASK = "*-checkpoint" + NOTEBOOK_SUFFIX
28 | CHECKPOINT_MASK_LEN = len(CHECKPOINT_MASK) - 1
29 |
30 | @staticmethod
31 | def get_hash(file_path):
32 | with open(file_path, "rb") as input:
33 | hash = hashlib.md5()
34 | for chunk in iter(lambda: input.read(4096), b""):
35 | hash.update(chunk)
36 | return hash.hexdigest()
37 |
38 | MESSAGE_ORPHANED = "missing "
39 | MESSAGE_MODIFIED = "modified"
40 | MESSAGE_DELETED = "DELETING"
41 |
42 | COLOR_WHITE = "0"
43 | COLOR_RED = "31"
44 | COLOR_GREEN = "32"
45 | COLOR_YELLOW = "33"
46 |
47 | def log(self, message, file, color=COLOR_WHITE):
48 | color_on = self._colors[color]
49 | color_off = self._colors[NotebookAnalyser.COLOR_WHITE]
50 | print("{}{}{}: {}".format(color_on, message, color_off, file))
51 |
52 | def clean_checkpoints(self, directory):
53 | for checkpoint_path in sorted(glob.glob(os.path.join(directory, NotebookAnalyser.CHECKPOINT_MASK))):
54 |
55 | workfile_dir = os.path.dirname(os.path.dirname(checkpoint_path))
56 | workfile_name = os.path.basename(checkpoint_path)[:-NotebookAnalyser.CHECKPOINT_MASK_LEN] + NotebookAnalyser.NOTEBOOK_SUFFIX
57 | workfile_path = os.path.join(workfile_dir, workfile_name)
58 |
59 | status = ""
60 | if not os.path.isfile(workfile_path):
61 | if self._verbose:
62 | self.log(NotebookAnalyser.MESSAGE_ORPHANED, workfile_path, NotebookAnalyser.COLOR_RED)
63 | else:
64 | checkpoint_stat = os.stat(checkpoint_path)
65 | workfile_stat = os.stat(workfile_path)
66 |
67 | modified = workfile_stat.st_size != checkpoint_stat.st_size
68 |
69 | if not modified:
70 | checkpoint_hash = NotebookAnalyser.get_hash(checkpoint_path)
71 | workfile_hash = NotebookAnalyser.get_hash(workfile_path)
72 | modified = checkpoint_hash != workfile_hash
73 |
74 | if modified:
75 | if self._verbose:
76 | self.log(NotebookAnalyser.MESSAGE_MODIFIED, workfile_path, NotebookAnalyser.COLOR_YELLOW)
77 | else:
78 | self.log(NotebookAnalyser.MESSAGE_DELETED, checkpoint_path, NotebookAnalyser.COLOR_GREEN)
79 | if not self._dry_run:
80 | os.remove(checkpoint_path)
81 |
82 | if not self._dry_run and not os.listdir(directory):
83 | self.log(NotebookAnalyser.MESSAGE_DELETED, directory, NotebookAnalyser.COLOR_GREEN)
84 | os.rmdir(directory)
85 |
86 | def clean_checkpoints_recursively(self, directory):
87 | for (root, subdirs, files) in os.walk(directory):
88 | subdirs.sort() # INFO: traverse alphabetically
89 | if NotebookAnalyser.CHECKPOINT_DIR in subdirs:
90 | subdirs.remove(NotebookAnalyser.CHECKPOINT_DIR) # INFO: don't recurse there
91 | self.clean_checkpoints(os.path.join(root, NotebookAnalyser.CHECKPOINT_DIR))
92 |
93 |
94 | def main():
95 | import argparse
96 | parser = argparse.ArgumentParser(description="Remove checkpointed versions of those jupyter notebooks that are identical to their working copies.",
97 | epilog="""Notebooks will be reported as either
98 | "DELETED" if the working copy and checkpointed version are identical
99 | (checkpoint will be deleted),
100 | "missing" if there is a checkpoint but no corresponding working file can be found
101 | or "modified" if notebook and the checkpoint are not byte-to-byte identical.
102 | If removal of checkpoints results in empty ".ipynb_checkpoints" directory
103 | that directory is also deleted.
104 | """) #, formatter_class=argparse.RawDescriptionHelpFormatter)
105 |     parser.add_argument("dirs", metavar="DIR", type=str, nargs="*", default=["."], help="directories to search")
106 | parser.add_argument("-d", "--dry-run", action="store_true", help="only print messages, don't perform any removals")
107 | parser.add_argument("-v", "--verbose", action="store_true", help="verbose mode")
108 | parser.add_argument("-c", "--color", action="store_true", help="colorful mode")
109 | args = parser.parse_args()
110 |
111 | analyser = NotebookAnalyser(args.dry_run, args.verbose, args.color)
112 | for directory in args.dirs:
113 | analyser.clean_checkpoints_recursively(directory)
114 |
115 | if __name__ == "__main__":
116 | main()
117 |
--------------------------------------------------------------------------------
/docker/bin/nbdiff_checkpoint:
--------------------------------------------------------------------------------
1 | #!/bin/bash
2 | if [[ "$#" -lt 1 || "$1" =~ ^((-h)|(--help))$ ]] ; then
3 | echo "usage: nbdiff_checkpoint NOTEBOOK.ipynb"
4 | echo
5 | echo "Show differences between given jupyter notebook and its checkpointed version (in .ipynb_checkpoints subdirectory)"
6 | exit
7 | fi
8 |
9 | DIRNAME=$(dirname "$1")
10 | BASENAME=$(basename "$1" .ipynb)
11 | shift
12 |
13 | WORKING_COPY=$DIRNAME/$BASENAME.ipynb
14 | CHECKPOINT_COPY=$DIRNAME/.ipynb_checkpoints/$BASENAME-checkpoint.ipynb
15 |
16 | echo "----- Analysing how to change $CHECKPOINT_COPY into $WORKING_COPY -----"
17 | nbdiff "$CHECKPOINT_COPY" "$WORKING_COPY" --ignore-details "$@"
18 |
--------------------------------------------------------------------------------
/docker/bin/rm_empty_subdirs:
--------------------------------------------------------------------------------
1 | #!/usr/bin/env python
2 |
3 | import os
4 |
5 | def remove_empty_directories(initial_dir,
6 | allow_initial_delete=False, ignore_nonexistant_initial=False,
7 | dry_run=False, quiet=False):
8 |
9 | FORBIDDEN_SUBDIRS = set([".git"])
10 |
11 | if not os.path.isdir(initial_dir) and not ignore_nonexistant_initial:
12 | raise RuntimeError("Initial directory '{}' not found!".format(initial_dir))
13 |
14 | message = "removed"
15 | if dry_run:
16 | message = "to be " + message
17 |
18 | deleted = set()
19 |
20 | for (directory, subdirs, files) in os.walk(initial_dir, topdown=False):
21 | forbidden = False
22 | parent = directory
23 | while parent:
24 | parent, dirname = os.path.split(parent)
25 | if dirname in FORBIDDEN_SUBDIRS:
26 | forbidden = True
27 | break
28 | if forbidden:
29 | continue
30 |
31 | is_empty = len(files) < 1 and len(set([os.path.join(directory, s) for s in subdirs]) - deleted) < 1
32 |
33 | if is_empty and (initial_dir != directory or allow_initial_delete):
34 | if not quiet:
35 | print("{}: {}".format(message, directory))
36 | deleted.add(directory)
37 | if not dry_run:
38 | os.rmdir(directory)
39 |
40 | def main():
41 | import argparse
42 | parser = argparse.ArgumentParser(description="Remove empty directories recursively in subtree.")
43 | parser.add_argument("dir", metavar="DIR", type=str, nargs="+", help="directory to be searched")
44 | parser.add_argument("-r", "--allow-dir-removal", action="store_true", help="allow deletion of DIR itself")
45 | parser.add_argument("-i", "--ignore-nonexistent-dir", action="store_true", help="don't throw an error if DIR doesn't exist")
46 | parser.add_argument("-d", "--dry-run", action="store_true", help="only print messages, don't perform any removals")
47 | parser.add_argument("-q", "--quiet", action="store_true", help="don't print names of directories being removed")
48 | args = parser.parse_args()
49 | for directory in args.dir:
50 | remove_empty_directories(directory, args.allow_dir_removal, args.ignore_nonexistent_dir,
51 | args.dry_run, args.quiet)
52 |
53 | if __name__ == "__main__":
54 | main()
55 |
--------------------------------------------------------------------------------
/docker/bin/tensorboard:
--------------------------------------------------------------------------------
1 | #!/bin/bash
2 | python -m tensorboard.main "$@"
3 |
--------------------------------------------------------------------------------
/docker/docker-compose.yml:
--------------------------------------------------------------------------------
1 | version: "3"
2 | services:
3 | handson-ml:
4 | build:
5 | context: ../
6 | dockerfile: ./docker/Dockerfile #Dockerfile.gpu
7 | args:
8 | - username=devel
9 | - userid=1000
10 | container_name: handson-ml
11 | image: ageron/handson-ml:latest #latest-gpu
12 | restart: unless-stopped
13 | logging:
14 | driver: json-file
15 | options:
16 | max-size: 50m
17 | ports:
18 | - "8888:8888"
19 | - "6006:6006"
20 | volumes:
21 | - ../:/home/devel/handson-ml
22 | command: /opt/conda/envs/tf1/bin/jupyter notebook --ip='0.0.0.0' --port=8888 --no-browser
23 | #deploy:
24 | # resources:
25 | # reservations:
26 | # devices:
27 | # - capabilities: [gpu]
28 |
--------------------------------------------------------------------------------
/docker/jupyter_notebook_config.py:
--------------------------------------------------------------------------------
1 | import os
2 | import subprocess
3 |
4 | def export_script_and_view(model, os_path, contents_manager):
5 | if model["type"] != "notebook":
6 | return
7 | dir_name, file_name = os.path.split(os_path)
8 | file_base, file_ext = os.path.splitext(file_name)
9 | if file_base.startswith("Untitled"):
10 | return
11 | export_name = file_base if file_ext == ".ipynb" else file_name
12 | subprocess.check_call(["jupyter", "nbconvert", "--to", "script", file_name, "--output", export_name + "_script"], cwd=dir_name)
13 | subprocess.check_call(["jupyter", "nbconvert", "--to", "html", file_name, "--output", export_name + "_view"], cwd=dir_name)
14 |
15 | c.FileContentsManager.post_save_hook = export_script_and_view
16 |
--------------------------------------------------------------------------------
/environment.yml:
--------------------------------------------------------------------------------
1 | name: tf1
2 | channels:
3 | - conda-forge
4 | - defaults
5 | dependencies:
6 | - atari_py=0.2 # used only in chapter 16
7 | - box2d-py=2.3 # used only in chapter 16
8 | - graphviz # used only in chapter 6 for dot files
9 | - gym=0.18 # used only in chapter 16
10 | - ipython=7.20 # a powerful Python shell
11 | - joblib=0.14 # used only in chapter 2 to save/load Scikit-Learn models
12 | - jupyter=1.0 # to edit and run Jupyter notebooks
13 | - matplotlib=3.3 # beautiful plots. See tutorial tools_matplotlib.ipynb
14 | - nbdime=2.1 # optional tool to diff Jupyter notebooks
15 | - nltk=3.4 # optionally used in chapter 3, exercise 4
16 | - numexpr=2.7 # used only in the Pandas tutorial for numerical expressions
17 | - numpy=1.19 # Powerful n-dimensional arrays and numerical computing tools
18 | - pandas=1.2 # data analysis and manipulation tool
19 | - pillow=8.1 # image manipulation library (used by matplotlib.image.imread)
20 | - pip # Python's package-management system
21 | - py-xgboost=0.90 # used only in chapter 7 for optimized Gradient Boosting
22 | - pyglet=1.5 # used only in chapter 16 to render environments
23 | - pyopengl=3.1 # used only in chapter 16 to render environments
24 | - python=3.7 # Python! Not using latest version as some libs lack support
25 | - python-graphviz # used only in chapter 6 for dot files
26 | #- pyvirtualdisplay=1.3 # used only in chapter 16 if on headless server
27 | - scikit-image=0.18.1 # used only in chapter 13 to resize images
28 | - scikit-learn=0.24 # machine learning library
29 | - scipy=1.6 # scientific/technical computing library
30 | - transformers=4.3 # Natural Language Processing lib for TF or PyTorch
31 | - wheel # built-package format for pip
32 | - widgetsnbextension=3.5 # interactive HTML widgets for Jupyter notebooks
33 | - pip:
34 | - tensorboard==1.15.0 # TensorFlow's visualization toolkit
35 | - tensorflow==1.15.5 # or tensorflow-gpu if you have a TF-compatible GPU
36 | - urlextract==1.2.0 # optionally used in chapter 3, exercise 4
37 |
--------------------------------------------------------------------------------
/extra_autodiff.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "markdown",
5 | "metadata": {},
6 | "source": [
7 | "**Appendix D – Autodiff**"
8 | ]
9 | },
10 | {
11 | "cell_type": "markdown",
12 | "metadata": {},
13 | "source": [
14 | "_This notebook contains toy implementations of various autodiff techniques, to explain how they works._\n",
15 | "\n",
16 | ""
21 | ]
22 | },
23 | {
24 | "cell_type": "markdown",
25 | "metadata": {},
26 | "source": [
27 | "**Warning**: this is the code for the 1st edition of the book. Please visit https://github.com/ageron/handson-ml2 for the 2nd edition code, with up-to-date notebooks using the latest library versions. In particular, the 1st edition is based on TensorFlow 1, while the 2nd edition uses TensorFlow 2, which is much simpler to use."
28 | ]
29 | },
30 | {
31 | "cell_type": "markdown",
32 | "metadata": {},
33 | "source": [
34 | "# Setup"
35 | ]
36 | },
37 | {
38 | "cell_type": "markdown",
39 | "metadata": {},
40 | "source": [
41 | "First, let's make sure this notebook works well in both python 2 and 3:"
42 | ]
43 | },
44 | {
45 | "cell_type": "code",
46 | "execution_count": 1,
47 | "metadata": {},
48 | "outputs": [],
49 | "source": [
50 | "# To support both python 2 and python 3\n",
51 | "from __future__ import absolute_import, division, print_function, unicode_literals"
52 | ]
53 | },
54 | {
55 | "cell_type": "markdown",
56 | "metadata": {},
57 | "source": [
58 | "# Introduction"
59 | ]
60 | },
61 | {
62 | "cell_type": "markdown",
63 | "metadata": {},
64 | "source": [
65 | "Suppose we want to compute the gradients of the function $f(x,y)=x^2y + y + 2$ with regards to the parameters x and y:"
66 | ]
67 | },
68 | {
69 | "cell_type": "code",
70 | "execution_count": 2,
71 | "metadata": {},
72 | "outputs": [],
73 | "source": [
74 | "def f(x,y):\n",
75 | " return x*x*y + y + 2"
76 | ]
77 | },
78 | {
79 | "cell_type": "markdown",
80 | "metadata": {},
81 | "source": [
82 | "One approach is to solve this analytically:\n",
83 | "\n",
84 | "$\\dfrac{\\partial f}{\\partial x} = 2xy$\n",
85 | "\n",
86 | "$\\dfrac{\\partial f}{\\partial y} = x^2 + 1$"
87 | ]
88 | },
89 | {
90 | "cell_type": "code",
91 | "execution_count": 3,
92 | "metadata": {},
93 | "outputs": [],
94 | "source": [
95 | "def df(x,y):\n",
96 | " return 2*x*y, x*x + 1"
97 | ]
98 | },
99 | {
100 | "cell_type": "markdown",
101 | "metadata": {},
102 | "source": [
103 | "So for example $\\dfrac{\\partial f}{\\partial x}(3,4) = 24$ and $\\dfrac{\\partial f}{\\partial y}(3,4) = 10$."
104 | ]
105 | },
106 | {
107 | "cell_type": "code",
108 | "execution_count": 4,
109 | "metadata": {},
110 | "outputs": [
111 | {
112 | "data": {
113 | "text/plain": [
114 | "(24, 10)"
115 | ]
116 | },
117 | "execution_count": 4,
118 | "metadata": {},
119 | "output_type": "execute_result"
120 | }
121 | ],
122 | "source": [
123 | "df(3, 4)"
124 | ]
125 | },
126 | {
127 | "cell_type": "markdown",
128 | "metadata": {},
129 | "source": [
130 | "Perfect! We can also find the equations for the second order derivatives (also called Hessians):\n",
131 | "\n",
132 | "$\\dfrac{\\partial^2 f}{\\partial x \\partial x} = \\dfrac{\\partial (2xy)}{\\partial x} = 2y$\n",
133 | "\n",
134 | "$\\dfrac{\\partial^2 f}{\\partial x \\partial y} = \\dfrac{\\partial (2xy)}{\\partial y} = 2x$\n",
135 | "\n",
136 | "$\\dfrac{\\partial^2 f}{\\partial y \\partial x} = \\dfrac{\\partial (x^2 + 1)}{\\partial x} = 2x$\n",
137 | "\n",
138 | "$\\dfrac{\\partial^2 f}{\\partial y \\partial y} = \\dfrac{\\partial (x^2 + 1)}{\\partial y} = 0$"
139 | ]
140 | },
141 | {
142 | "cell_type": "markdown",
143 | "metadata": {},
144 | "source": [
145 | "At x=3 and y=4, these Hessians are respectively 8, 6, 6, 0. Let's use the equations above to compute them:"
146 | ]
147 | },
148 | {
149 | "cell_type": "code",
150 | "execution_count": 5,
151 | "metadata": {},
152 | "outputs": [],
153 | "source": [
154 | "def d2f(x, y):\n",
155 | " return [2*y, 2*x], [2*x, 0]"
156 | ]
157 | },
158 | {
159 | "cell_type": "code",
160 | "execution_count": 6,
161 | "metadata": {},
162 | "outputs": [
163 | {
164 | "data": {
165 | "text/plain": [
166 | "([8, 6], [6, 0])"
167 | ]
168 | },
169 | "execution_count": 6,
170 | "metadata": {},
171 | "output_type": "execute_result"
172 | }
173 | ],
174 | "source": [
175 | "d2f(3, 4)"
176 | ]
177 | },
178 | {
179 | "cell_type": "markdown",
180 | "metadata": {},
181 | "source": [
182 | "Perfect, but this requires some mathematical work. It is not too hard in this case, but for a deep neural network, it is pratically impossible to compute the derivatives this way. So let's look at various ways to automate this!"
183 | ]
184 | },
185 | {
186 | "cell_type": "markdown",
187 | "metadata": {},
188 | "source": [
189 | "# Numeric differentiation"
190 | ]
191 | },
192 | {
193 | "cell_type": "markdown",
194 | "metadata": {},
195 | "source": [
196 | "Here, we compute an approxiation of the gradients using the equation: $\\dfrac{\\partial f}{\\partial x} = \\displaystyle{\\lim_{\\epsilon \\to 0}}\\dfrac{f(x+\\epsilon, y) - f(x, y)}{\\epsilon}$ (and there is a similar definition for $\\dfrac{\\partial f}{\\partial y}$)."
197 | ]
198 | },
199 | {
200 | "cell_type": "code",
201 | "execution_count": 7,
202 | "metadata": {},
203 | "outputs": [],
204 | "source": [
205 | "def gradients(func, vars_list, eps=0.0001):\n",
206 | " partial_derivatives = []\n",
207 | " base_func_eval = func(*vars_list)\n",
208 | " for idx in range(len(vars_list)):\n",
209 | " tweaked_vars = vars_list[:]\n",
210 | " tweaked_vars[idx] += eps\n",
211 | " tweaked_func_eval = func(*tweaked_vars)\n",
212 | " derivative = (tweaked_func_eval - base_func_eval) / eps\n",
213 | " partial_derivatives.append(derivative)\n",
214 | " return partial_derivatives"
215 | ]
216 | },
217 | {
218 | "cell_type": "code",
219 | "execution_count": 8,
220 | "metadata": {},
221 | "outputs": [],
222 | "source": [
223 | "def df(x, y):\n",
224 | " return gradients(f, [x, y])"
225 | ]
226 | },
227 | {
228 | "cell_type": "code",
229 | "execution_count": 9,
230 | "metadata": {},
231 | "outputs": [
232 | {
233 | "data": {
234 | "text/plain": [
235 | "[24.000400000048216, 10.000000000047748]"
236 | ]
237 | },
238 | "execution_count": 9,
239 | "metadata": {},
240 | "output_type": "execute_result"
241 | }
242 | ],
243 | "source": [
244 | "df(3, 4)"
245 | ]
246 | },
247 | {
248 | "cell_type": "markdown",
249 | "metadata": {},
250 | "source": [
251 | "It works well!"
252 | ]
253 | },
254 | {
255 | "cell_type": "markdown",
256 | "metadata": {},
257 | "source": [
258 | "The good news is that it is pretty easy to compute the Hessians. First let's create functions that compute the first order derivatives (also called Jacobians):"
259 | ]
260 | },
261 | {
262 | "cell_type": "code",
263 | "execution_count": 10,
264 | "metadata": {},
265 | "outputs": [
266 | {
267 | "data": {
268 | "text/plain": [
269 | "(24.000400000048216, 10.000000000047748)"
270 | ]
271 | },
272 | "execution_count": 10,
273 | "metadata": {},
274 | "output_type": "execute_result"
275 | }
276 | ],
277 | "source": [
278 | "def dfdx(x, y):\n",
279 | " return gradients(f, [x,y])[0]\n",
280 | "\n",
281 | "def dfdy(x, y):\n",
282 | " return gradients(f, [x,y])[1]\n",
283 | "\n",
284 | "dfdx(3., 4.), dfdy(3., 4.)"
285 | ]
286 | },
287 | {
288 | "cell_type": "markdown",
289 | "metadata": {},
290 | "source": [
291 | "Now we can simply apply the `gradients()` function to these functions:"
292 | ]
293 | },
294 | {
295 | "cell_type": "code",
296 | "execution_count": 11,
297 | "metadata": {},
298 | "outputs": [],
299 | "source": [
300 | "def d2f(x, y):\n",
301 | "    return [gradients(dfdx, [x, y]), gradients(dfdy, [x, y])]"
302 | ]
303 | },
304 | {
305 | "cell_type": "code",
306 | "execution_count": 12,
307 | "metadata": {},
308 | "outputs": [
309 | {
310 | "data": {
311 | "text/plain": [
312 | "[[7.999999951380232, 6.000099261882497],\n",
313 | " [6.000099261882497, -1.4210854715202004e-06]]"
314 | ]
315 | },
316 | "execution_count": 12,
317 | "metadata": {},
318 | "output_type": "execute_result"
319 | }
320 | ],
321 | "source": [
322 | "d2f(3, 4)"
323 | ]
324 | },
325 | {
326 | "cell_type": "markdown",
327 | "metadata": {},
328 | "source": [
329 | "So everything works well, but the result is approximate, and computing the gradients of a function with regards to $n$ variables requires calling that function $n$ times. In deep neural nets, there are often thousands of parameters to tweak using gradient descent (which requires computing the gradients of the loss function with regards to each of these parameters), so this approach would be much too slow."
330 | ]
331 | },
332 | {
333 | "cell_type": "markdown",
334 | "metadata": {},
335 | "source": [
336 | "## Implementing a Toy Computation Graph"
337 | ]
338 | },
339 | {
340 | "cell_type": "markdown",
341 | "metadata": {},
342 | "source": [
343 | "Rather than this numerical approach, let's implement some symbolic autodiff techniques. For this, we will need to define classes to represent constants, variables and operations."
344 | ]
345 | },
346 | {
347 | "cell_type": "code",
348 | "execution_count": 13,
349 | "metadata": {},
350 | "outputs": [],
351 | "source": [
352 | "class Const(object):\n",
353 | " def __init__(self, value):\n",
354 | " self.value = value\n",
355 | " def evaluate(self):\n",
356 | " return self.value\n",
357 | " def __str__(self):\n",
358 | " return str(self.value)\n",
359 | "\n",
360 | "class Var(object):\n",
361 | " def __init__(self, name, init_value=0):\n",
362 | " self.value = init_value\n",
363 | " self.name = name\n",
364 | " def evaluate(self):\n",
365 | " return self.value\n",
366 | " def __str__(self):\n",
367 | " return self.name\n",
368 | "\n",
369 | "class BinaryOperator(object):\n",
370 | " def __init__(self, a, b):\n",
371 | " self.a = a\n",
372 | " self.b = b\n",
373 | "\n",
374 | "class Add(BinaryOperator):\n",
375 | " def evaluate(self):\n",
376 | " return self.a.evaluate() + self.b.evaluate()\n",
377 | " def __str__(self):\n",
378 | " return \"{} + {}\".format(self.a, self.b)\n",
379 | "\n",
380 | "class Mul(BinaryOperator):\n",
381 | " def evaluate(self):\n",
382 | " return self.a.evaluate() * self.b.evaluate()\n",
383 | " def __str__(self):\n",
384 | " return \"({}) * ({})\".format(self.a, self.b)"
385 | ]
386 | },
387 | {
388 | "cell_type": "markdown",
389 | "metadata": {},
390 | "source": [
391 | "Good, now we can build a computation graph to represent the function $f$:"
392 | ]
393 | },
394 | {
395 | "cell_type": "code",
396 | "execution_count": 14,
397 | "metadata": {},
398 | "outputs": [],
399 | "source": [
400 | "x = Var(\"x\")\n",
401 | "y = Var(\"y\")\n",
402 | "f = Add(Mul(Mul(x, x), y), Add(y, Const(2))) # f(x,y) = x²y + y + 2"
403 | ]
404 | },
405 | {
406 | "cell_type": "markdown",
407 | "metadata": {},
408 | "source": [
409 | "And we can run this graph to compute $f$ at any point, for example $f(3, 4)$."
410 | ]
411 | },
412 | {
413 | "cell_type": "code",
414 | "execution_count": 15,
415 | "metadata": {},
416 | "outputs": [
417 | {
418 | "data": {
419 | "text/plain": [
420 | "42"
421 | ]
422 | },
423 | "execution_count": 15,
424 | "metadata": {},
425 | "output_type": "execute_result"
426 | }
427 | ],
428 | "source": [
429 | "x.value = 3\n",
430 | "y.value = 4\n",
431 | "f.evaluate()"
432 | ]
433 | },
434 | {
435 | "cell_type": "markdown",
436 | "metadata": {},
437 | "source": [
438 | "Perfect, it found the ultimate answer."
439 | ]
440 | },
441 | {
442 | "cell_type": "markdown",
443 | "metadata": {},
444 | "source": [
445 | "## Computing gradients"
446 | ]
447 | },
448 | {
449 | "cell_type": "markdown",
450 | "metadata": {},
451 | "source": [
452 | "The autodiff methods we will present below are all based on the *chain rule*."
453 | ]
454 | },
455 | {
456 | "cell_type": "markdown",
457 | "metadata": {},
458 | "source": [
459 | "Suppose we have two functions $u$ and $v$, and we apply them sequentially to some input $x$, and we get the result $z$. So we have $z = v(u(x))$, which we can rewrite as $z = v(s)$ and $s = u(x)$. Now we can apply the chain rule to get the partial derivative of the output $z$ with regards to the input $x$:\n",
460 | "\n",
461 | "$ \\dfrac{\\partial z}{\\partial x} = \\dfrac{\\partial s}{\\partial x} \\cdot \\dfrac{\\partial z}{\\partial s}$"
462 | ]
463 | },
464 | {
465 | "cell_type": "markdown",
466 | "metadata": {},
467 | "source": [
468 | "Now if $z$ is the output of a sequence of functions which have intermediate outputs $s_1, s_2, ..., s_n$, the chain rule still applies:\n",
469 | "\n",
470 | "$ \\dfrac{\\partial z}{\\partial x} = \\dfrac{\\partial s_1}{\\partial x} \\cdot \\dfrac{\\partial s_2}{\\partial s_1} \\cdot \\dfrac{\\partial s_3}{\\partial s_2} \\cdot \\dots \\cdot \\dfrac{\\partial s_{n-1}}{\\partial s_{n-2}} \\cdot \\dfrac{\\partial s_n}{\\partial s_{n-1}} \\cdot \\dfrac{\\partial z}{\\partial s_n}$"
471 | ]
472 | },
473 | {
474 | "cell_type": "markdown",
475 | "metadata": {},
476 | "source": [
477 | "In forward mode autodiff, the algorithm computes these terms \"forward\" (i.e., in the same order as the computations required to compute the output $z$), that is from left to right: first $\\dfrac{\\partial s_1}{\\partial x}$, then $\\dfrac{\\partial s_2}{\\partial s_1}$, and so on. In reverse mode autodiff, the algorithm computes these terms \"backwards\", from right to left: first $\\dfrac{\\partial z}{\\partial s_n}$, then $\\dfrac{\\partial s_n}{\\partial s_{n-1}}$, and so on.\n",
478 | "\n",
479 | "For example, suppose you want to compute the derivative of the function $z(x)=\\sin(x^2)$ at x=3, using forward mode autodiff. The algorithm would first compute the partial derivative $\\dfrac{\\partial s_1}{\\partial x}=\\dfrac{\\partial x^2}{\\partial x}=2x=6$. Next, it would compute $\\dfrac{\\partial z}{\\partial x}=\\dfrac{\\partial s_1}{\\partial x}\\cdot\\dfrac{\\partial z}{\\partial s_1}= 6 \\cdot \\dfrac{\\partial \\sin(s_1)}{\\partial s_1}=6 \\cdot \\cos(s_1)=6 \\cdot \\cos(3^2)\\approx-5.46$."
480 | ]
481 | },
482 | {
483 | "cell_type": "markdown",
484 | "metadata": {},
485 | "source": [
486 | "Let's verify this result using the `gradients()` function defined earlier:"
487 | ]
488 | },
489 | {
490 | "cell_type": "code",
491 | "execution_count": 16,
492 | "metadata": {},
493 | "outputs": [
494 | {
495 | "data": {
496 | "text/plain": [
497 | "[-5.46761419430053]"
498 | ]
499 | },
500 | "execution_count": 16,
501 | "metadata": {},
502 | "output_type": "execute_result"
503 | }
504 | ],
505 | "source": [
506 | "from math import sin\n",
507 | "\n",
508 | "def z(x):\n",
509 | " return sin(x**2)\n",
510 | "\n",
511 | "gradients(z, [3])"
512 | ]
513 | },
514 | {
515 | "cell_type": "markdown",
516 | "metadata": {},
517 | "source": [
518 | "Look good. Now let's do the same thing using reverse mode autodiff. This time the algorithm would start from the right hand side so it would compute $\\dfrac{\\partial z}{\\partial s_1} = \\dfrac{\\partial \\sin(s_1)}{\\partial s_1}=\\cos(s_1)=\\cos(3^2)\\approx -0.91$. Next it would compute $\\dfrac{\\partial z}{\\partial x}=\\dfrac{\\partial s_1}{\\partial x}\\cdot\\dfrac{\\partial z}{\\partial s_1} \\approx \\dfrac{\\partial s_1}{\\partial x} \\cdot -0.91 = \\dfrac{\\partial x^2}{\\partial x} \\cdot -0.91=2x \\cdot -0.91 = 6\\cdot-0.91=-5.46$."
519 | ]
520 | },
521 | {
522 | "cell_type": "markdown",
523 | "metadata": {},
524 | "source": [
525 | "Of course both approaches give the same result (except for rounding errors), and with a single input and output they involve the same number of computations. But when there are several inputs or outputs, they can have very different performance. Indeed, if there are many inputs, the right-most terms will be needed to compute the partial derivatives with regards to each input, so it is a good idea to compute these right-most terms first. That means using reverse-mode autodiff. This way, the right-most terms can be computed just once and used to compute all the partial derivatives. Conversely, if there are many outputs, forward-mode is generally preferable because the left-most terms can be computed just once to compute the partial derivatives of the different outputs. In Deep Learning, there are typically thousands of model parameters, meaning there are lots of inputs, but few outputs. In fact, there is generally just one output during training: the loss. This is why reverse mode autodiff is used in TensorFlow and all major Deep Learning libraries."
526 | ]
527 | },
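  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As a rule of thumb: for a function with $n$ inputs and $m$ outputs, forward mode needs on the order of $n$ passes, while reverse mode needs on the order of $m$ passes. During neural network training, $m = 1$ (the loss), hence reverse mode."
   ]
  },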
528 | {
529 | "cell_type": "markdown",
530 | "metadata": {},
531 | "source": [
532 | "There's one additional complexity in reverse mode autodiff: the value of $s_i$ is generally required when computing $\\dfrac{\\partial s_{i+1}}{\\partial s_i}$, and computing $s_i$ requires first computing $s_{i-1}$, which requires computing $s_{i-2}$, and so on. So basically, a first pass forward through the network is required to compute $s_1$, $s_2$, $s_3$, $\\dots$, $s_{n-1}$ and $s_n$, and then the algorithm can compute the partial derivatives from right to left. Storing all the intermediate values $s_i$ in RAM is sometimes a problem, especially when handling images, and when using GPUs which often have limited RAM: to limit this problem, one can reduce the number of layers in the neural network, or configure TensorFlow to make it swap these values from GPU RAM to CPU RAM. Another approach is to only cache every other intermediate value, $s_1$, $s_3$, $s_5$, $\\dots$, $s_{n-4}$, $s_{n-2}$ and $s_n$. This means that when the algorithm computes the partial derivatives, if an intermediate value $s_i$ is missing, it will need to recompute it based on the previous intermediate value $s_{i-1}$. This trades off CPU for RAM (if you are interested, check out [this paper](https://pdfs.semanticscholar.org/f61e/9fd5a4878e1493f7a6b03774a61c17b7e9a4.pdf))."
533 | ]
534 | },
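  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To make that last idea more concrete, here is a toy sketch (not from the book, unrelated to TensorFlow's actual implementation, and with made-up helper names): cache only every other intermediate value during the forward pass, and recompute any missing $s_i$ from the closest cached value when it is needed."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Toy sketch (not from the book): cache only every other intermediate value\n",
    "# and recompute the missing ones from the closest cached value when needed.\n",
    "import math\n",
    "\n",
    "chain = [lambda v: v**2, math.sin, math.exp, lambda v: 2 * v]  # computes s_1, ..., s_4\n",
    "\n",
    "def forward(x):\n",
    "    cache = {0: x}  # s_0 is the input\n",
    "    value = x\n",
    "    for i, func in enumerate(chain, start=1):\n",
    "        value = func(value)\n",
    "        if i % 2 == 1 or i == len(chain):  # keep s_1, s_3, ... plus the final output\n",
    "            cache[i] = value\n",
    "    return value, cache\n",
    "\n",
    "def intermediate(i, cache):\n",
    "    if i in cache:\n",
    "        return cache[i]\n",
    "    j = max(k for k in cache if k < i)  # closest cached value before s_i\n",
    "    value = cache[j]\n",
    "    for k in range(j + 1, i + 1):  # recompute the missing steps (more compute, less RAM)\n",
    "        value = chain[k - 1](value)\n",
    "    return value\n",
    "\n",
    "output, cache = forward(3.0)\n",
    "intermediate(2, cache)  # s_2 was not cached, so it is recomputed from s_1"
   ]
  },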
535 | {
536 | "cell_type": "markdown",
537 | "metadata": {},
538 | "source": [
539 | "### Forward mode autodiff"
540 | ]
541 | },
542 | {
543 | "cell_type": "code",
544 | "execution_count": 17,
545 | "metadata": {},
546 | "outputs": [],
547 | "source": [
548 | "Const.gradient = lambda self, var: Const(0)\n",
549 | "Var.gradient = lambda self, var: Const(1) if self is var else Const(0)\n",
550 | "Add.gradient = lambda self, var: Add(self.a.gradient(var), self.b.gradient(var))\n",
551 | "Mul.gradient = lambda self, var: Add(Mul(self.a, self.b.gradient(var)), Mul(self.a.gradient(var), self.b))\n",
552 | "\n",
553 | "x = Var(name=\"x\", init_value=3.)\n",
554 | "y = Var(name=\"y\", init_value=4.)\n",
555 | "f = Add(Mul(Mul(x, x), y), Add(y, Const(2))) # f(x,y) = x²y + y + 2\n",
556 | "\n",
557 | "dfdx = f.gradient(x) # 2xy\n",
558 | "dfdy = f.gradient(y) # x² + 1"
559 | ]
560 | },
561 | {
562 | "cell_type": "code",
563 | "execution_count": 18,
564 | "metadata": {},
565 | "outputs": [
566 | {
567 | "data": {
568 | "text/plain": [
569 | "(24.0, 10.0)"
570 | ]
571 | },
572 | "execution_count": 18,
573 | "metadata": {},
574 | "output_type": "execute_result"
575 | }
576 | ],
577 | "source": [
578 | "dfdx.evaluate(), dfdy.evaluate()"
579 | ]
580 | },
581 | {
582 | "cell_type": "markdown",
583 | "metadata": {},
584 | "source": [
585 | "Since the output of the `gradient()` method is fully symbolic, we are not limited to the first order derivatives, we can also compute second order derivatives, and so on:"
586 | ]
587 | },
588 | {
589 | "cell_type": "code",
590 | "execution_count": 19,
591 | "metadata": {},
592 | "outputs": [],
593 | "source": [
594 | "d2fdxdx = dfdx.gradient(x) # 2y\n",
595 | "d2fdxdy = dfdx.gradient(y) # 2x\n",
596 | "d2fdydx = dfdy.gradient(x) # 2x\n",
597 | "d2fdydy = dfdy.gradient(y) # 0"
598 | ]
599 | },
600 | {
601 | "cell_type": "code",
602 | "execution_count": 20,
603 | "metadata": {},
604 | "outputs": [
605 | {
606 | "data": {
607 | "text/plain": [
608 | "[[8.0, 6.0], [6.0, 0.0]]"
609 | ]
610 | },
611 | "execution_count": 20,
612 | "metadata": {},
613 | "output_type": "execute_result"
614 | }
615 | ],
616 | "source": [
617 | "[[d2fdxdx.evaluate(), d2fdxdy.evaluate()],\n",
618 | " [d2fdydx.evaluate(), d2fdydy.evaluate()]]"
619 | ]
620 | },
621 | {
622 | "cell_type": "markdown",
623 | "metadata": {},
624 | "source": [
625 | "Note that the result is now exact, not an approximation (up to the limit of the machine's float precision, of course)."
626 | ]
627 | },
628 | {
629 | "cell_type": "markdown",
630 | "metadata": {},
631 | "source": [
632 | "### Forward mode autodiff using dual numbers"
633 | ]
634 | },
635 | {
636 | "cell_type": "markdown",
637 | "metadata": {},
638 | "source": [
639 | "A nice way to apply forward mode autodiff is to use [dual numbers](https://en.wikipedia.org/wiki/Dual_number). In short, a dual number $z$ has the form $z = a + b\\epsilon$, where $a$ and $b$ are real numbers, and $\\epsilon$ is an infinitesimal number, positive but smaller than all real numbers, and such that $\\epsilon^2=0$.\n",
640 | "It can be shown that $f(x + \\epsilon) = f(x) + \\dfrac{\\partial f}{\\partial x}\\epsilon$, so simply by computing $f(x + \\epsilon)$ we get both the value of $f(x)$ and the partial derivative of $f$ with regards to $x$. "
641 | ]
642 | },
643 | {
644 | "cell_type": "markdown",
645 | "metadata": {},
646 | "source": [
647 | "Dual numbers have their own arithmetic rules, which are generally quite natural. For example:\n",
648 | "\n",
649 | "**Addition**\n",
650 | "\n",
651 | "$(a_1 + b_1\\epsilon) + (a_2 + b_2\\epsilon) = (a_1 + a_2) + (b_1 + b_2)\\epsilon$\n",
652 | "\n",
653 | "**Subtraction**\n",
654 | "\n",
655 | "$(a_1 + b_1\\epsilon) - (a_2 + b_2\\epsilon) = (a_1 - a_2) + (b_1 - b_2)\\epsilon$\n",
656 | "\n",
657 | "**Multiplication**\n",
658 | "\n",
659 | "$(a_1 + b_1\\epsilon) \\times (a_2 + b_2\\epsilon) = (a_1 a_2) + (a_1 b_2 + a_2 b_1)\\epsilon + b_1 b_2\\epsilon^2 = (a_1 a_2) + (a_1b_2 + a_2b_1)\\epsilon$\n",
660 | "\n",
661 | "**Division**\n",
662 | "\n",
663 | "$\\dfrac{a_1 + b_1\\epsilon}{a_2 + b_2\\epsilon} = \\dfrac{a_1 + b_1\\epsilon}{a_2 + b_2\\epsilon} \\cdot \\dfrac{a_2 - b_2\\epsilon}{a_2 - b_2\\epsilon} = \\dfrac{a_1 a_2 + (b_1 a_2 - a_1 b_2)\\epsilon - b_1 b_2\\epsilon^2}{{a_2}^2 + (a_2 b_2 - a_2 b_2)\\epsilon - {b_2}^2\\epsilon} = \\dfrac{a_1}{a_2} + \\dfrac{a_1 b_2 - b_1 a_2}{{a_2}^2}\\epsilon$\n",
664 | "\n",
665 | "**Power**\n",
666 | "\n",
667 | "$(a + b\\epsilon)^n = a^n + (n a^{n-1}b)\\epsilon$\n",
668 | "\n",
669 | "etc."
670 | ]
671 | },
672 | {
673 | "cell_type": "markdown",
674 | "metadata": {},
675 | "source": [
676 | "Let's create a class to represent dual numbers, and implement a few operations (addition and multiplication). You can try adding some more if you want."
677 | ]
678 | },
679 | {
680 | "cell_type": "code",
681 | "execution_count": 21,
682 | "metadata": {},
683 | "outputs": [],
684 | "source": [
685 | "class DualNumber(object):\n",
686 | " def __init__(self, value=0.0, eps=0.0):\n",
687 | " self.value = value\n",
688 | " self.eps = eps\n",
689 | " def __add__(self, b):\n",
690 | " return DualNumber(self.value + self.to_dual(b).value,\n",
691 | " self.eps + self.to_dual(b).eps)\n",
692 | " def __radd__(self, a):\n",
693 | " return self.to_dual(a).__add__(self)\n",
694 | " def __mul__(self, b):\n",
695 | " return DualNumber(self.value * self.to_dual(b).value,\n",
696 | " self.eps * self.to_dual(b).value + self.value * self.to_dual(b).eps)\n",
697 | " def __rmul__(self, a):\n",
698 | " return self.to_dual(a).__mul__(self)\n",
699 | " def __str__(self):\n",
700 | " if self.eps:\n",
701 | " return \"{:.1f} + {:.1f}ε\".format(self.value, self.eps)\n",
702 | " else:\n",
703 | " return \"{:.1f}\".format(self.value)\n",
704 | " def __repr__(self):\n",
705 | " return str(self)\n",
706 | " @classmethod\n",
707 | " def to_dual(cls, n):\n",
708 | " if hasattr(n, \"value\"):\n",
709 | " return n\n",
710 | " else:\n",
711 | " return cls(n)"
712 | ]
713 | },
714 | {
715 | "cell_type": "markdown",
716 | "metadata": {},
717 | "source": [
718 | "$3 + (3 + 4 \\epsilon) = 6 + 4\\epsilon$"
719 | ]
720 | },
721 | {
722 | "cell_type": "code",
723 | "execution_count": 22,
724 | "metadata": {},
725 | "outputs": [
726 | {
727 | "data": {
728 | "text/plain": [
729 | "6.0 + 4.0ε"
730 | ]
731 | },
732 | "execution_count": 22,
733 | "metadata": {},
734 | "output_type": "execute_result"
735 | }
736 | ],
737 | "source": [
738 | "3 + DualNumber(3, 4)"
739 | ]
740 | },
741 | {
742 | "cell_type": "markdown",
743 | "metadata": {},
744 | "source": [
745 | "$(3 + 4ε)\\times(5 + 7ε)$ = $3 \\times 5 + 3 \\times 7ε + 4ε \\times 5 + 4ε \\times 7ε$ = $15 + 21ε + 20ε + 28ε^2$ = $15 + 41ε + 28 \\times 0$ = $15 + 41ε$"
746 | ]
747 | },
748 | {
749 | "cell_type": "code",
750 | "execution_count": 23,
751 | "metadata": {},
752 | "outputs": [
753 | {
754 | "data": {
755 | "text/plain": [
756 | "15.0 + 41.0ε"
757 | ]
758 | },
759 | "execution_count": 23,
760 | "metadata": {},
761 | "output_type": "execute_result"
762 | }
763 | ],
764 | "source": [
765 | "DualNumber(3, 4) * DualNumber(5, 7)"
766 | ]
767 | },
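  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As suggested above, you can add more operations. Here is one possible way (just a sketch, not the book's code; the helper names are made up) to implement subtraction and division, following the arithmetic rules listed earlier:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# One possible way (not from the book) to support subtraction and division,\n",
    "# following the dual-number arithmetic rules listed earlier.\n",
    "def _dual_sub(self, b):\n",
    "    b = self.to_dual(b)\n",
    "    return DualNumber(self.value - b.value, self.eps - b.eps)\n",
    "\n",
    "def _dual_truediv(self, b):\n",
    "    b = self.to_dual(b)\n",
    "    return DualNumber(self.value / b.value,\n",
    "                      (self.eps * b.value - self.value * b.eps) / b.value ** 2)\n",
    "\n",
    "DualNumber.__sub__ = _dual_sub\n",
    "DualNumber.__truediv__ = _dual_truediv\n",
    "\n",
    "DualNumber(3, 4) / DualNumber(2, 1)  # expected: 3/2 + (4*2 - 3*1)/2**2 ε = 1.5 + 1.25ε"
   ]
  },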
768 | {
769 | "cell_type": "markdown",
770 | "metadata": {},
771 | "source": [
772 | "Now let's see if the dual numbers work with our toy computation framework:"
773 | ]
774 | },
775 | {
776 | "cell_type": "code",
777 | "execution_count": 24,
778 | "metadata": {},
779 | "outputs": [
780 | {
781 | "data": {
782 | "text/plain": [
783 | "42.0"
784 | ]
785 | },
786 | "execution_count": 24,
787 | "metadata": {},
788 | "output_type": "execute_result"
789 | }
790 | ],
791 | "source": [
792 | "x.value = DualNumber(3.0)\n",
793 | "y.value = DualNumber(4.0)\n",
794 | "\n",
795 | "f.evaluate()"
796 | ]
797 | },
798 | {
799 | "cell_type": "markdown",
800 | "metadata": {},
801 | "source": [
802 | "Yep, sure works. Now let's use this to compute the partial derivatives of $f$ with regards to $x$ and $y$ at x=3 and y=4:"
803 | ]
804 | },
805 | {
806 | "cell_type": "code",
807 | "execution_count": 25,
808 | "metadata": {},
809 | "outputs": [],
810 | "source": [
811 | "x.value = DualNumber(3.0, 1.0) # 3 + ε\n",
812 | "y.value = DualNumber(4.0) # 4\n",
813 | "\n",
814 | "dfdx = f.evaluate().eps\n",
815 | "\n",
816 | "x.value = DualNumber(3.0) # 3\n",
817 | "y.value = DualNumber(4.0, 1.0) # 4 + ε\n",
818 | "\n",
819 | "dfdy = f.evaluate().eps"
820 | ]
821 | },
822 | {
823 | "cell_type": "code",
824 | "execution_count": 26,
825 | "metadata": {},
826 | "outputs": [
827 | {
828 | "data": {
829 | "text/plain": [
830 | "24.0"
831 | ]
832 | },
833 | "execution_count": 26,
834 | "metadata": {},
835 | "output_type": "execute_result"
836 | }
837 | ],
838 | "source": [
839 | "dfdx"
840 | ]
841 | },
842 | {
843 | "cell_type": "code",
844 | "execution_count": 27,
845 | "metadata": {},
846 | "outputs": [
847 | {
848 | "data": {
849 | "text/plain": [
850 | "10.0"
851 | ]
852 | },
853 | "execution_count": 27,
854 | "metadata": {},
855 | "output_type": "execute_result"
856 | }
857 | ],
858 | "source": [
859 | "dfdy"
860 | ]
861 | },
862 | {
863 | "cell_type": "markdown",
864 | "metadata": {},
865 | "source": [
866 | "Great! However, in this implementation we are limited to first order derivatives.\n",
867 | "Now let's look at reverse mode."
868 | ]
869 | },
870 | {
871 | "cell_type": "markdown",
872 | "metadata": {},
873 | "source": [
874 | "### Reverse mode autodiff"
875 | ]
876 | },
877 | {
878 | "cell_type": "markdown",
879 | "metadata": {},
880 | "source": [
881 | "Let's rewrite our toy framework to add reverse mode autodiff:"
882 | ]
883 | },
884 | {
885 | "cell_type": "code",
886 | "execution_count": 28,
887 | "metadata": {},
888 | "outputs": [],
889 | "source": [
890 | "class Const(object):\n",
891 | " def __init__(self, value):\n",
892 | " self.value = value\n",
893 | " def evaluate(self):\n",
894 | " return self.value\n",
895 | " def backpropagate(self, gradient):\n",
896 | " pass\n",
897 | " def __str__(self):\n",
898 | " return str(self.value)\n",
899 | "\n",
900 | "class Var(object):\n",
901 | " def __init__(self, name, init_value=0):\n",
902 | " self.value = init_value\n",
903 | " self.name = name\n",
904 | " self.gradient = 0\n",
905 | " def evaluate(self):\n",
906 | " return self.value\n",
907 | " def backpropagate(self, gradient):\n",
908 | " self.gradient += gradient\n",
909 | " def __str__(self):\n",
910 | " return self.name\n",
911 | "\n",
912 | "class BinaryOperator(object):\n",
913 | " def __init__(self, a, b):\n",
914 | " self.a = a\n",
915 | " self.b = b\n",
916 | "\n",
917 | "class Add(BinaryOperator):\n",
918 | " def evaluate(self):\n",
919 | " self.value = self.a.evaluate() + self.b.evaluate()\n",
920 | " return self.value\n",
921 | " def backpropagate(self, gradient):\n",
922 | " self.a.backpropagate(gradient)\n",
923 | " self.b.backpropagate(gradient)\n",
924 | " def __str__(self):\n",
925 | " return \"{} + {}\".format(self.a, self.b)\n",
926 | "\n",
927 | "class Mul(BinaryOperator):\n",
928 | " def evaluate(self):\n",
929 | " self.value = self.a.evaluate() * self.b.evaluate()\n",
930 | " return self.value\n",
931 | " def backpropagate(self, gradient):\n",
932 | " self.a.backpropagate(gradient * self.b.value)\n",
933 | " self.b.backpropagate(gradient * self.a.value)\n",
934 | " def __str__(self):\n",
935 | " return \"({}) * ({})\".format(self.a, self.b)"
936 | ]
937 | },
938 | {
939 | "cell_type": "code",
940 | "execution_count": 29,
941 | "metadata": {},
942 | "outputs": [],
943 | "source": [
944 | "x = Var(\"x\", init_value=3)\n",
945 | "y = Var(\"y\", init_value=4)\n",
946 | "f = Add(Mul(Mul(x, x), y), Add(y, Const(2))) # f(x,y) = x²y + y + 2\n",
947 | "\n",
948 | "result = f.evaluate()\n",
949 | "f.backpropagate(1.0)"
950 | ]
951 | },
952 | {
953 | "cell_type": "code",
954 | "execution_count": 30,
955 | "metadata": {},
956 | "outputs": [
957 | {
958 | "name": "stdout",
959 | "output_type": "stream",
960 | "text": [
961 | "((x) * (x)) * (y) + y + 2\n"
962 | ]
963 | }
964 | ],
965 | "source": [
966 | "print(f)"
967 | ]
968 | },
969 | {
970 | "cell_type": "code",
971 | "execution_count": 31,
972 | "metadata": {},
973 | "outputs": [
974 | {
975 | "data": {
976 | "text/plain": [
977 | "42"
978 | ]
979 | },
980 | "execution_count": 31,
981 | "metadata": {},
982 | "output_type": "execute_result"
983 | }
984 | ],
985 | "source": [
986 | "result"
987 | ]
988 | },
989 | {
990 | "cell_type": "code",
991 | "execution_count": 32,
992 | "metadata": {},
993 | "outputs": [
994 | {
995 | "data": {
996 | "text/plain": [
997 | "24.0"
998 | ]
999 | },
1000 | "execution_count": 32,
1001 | "metadata": {},
1002 | "output_type": "execute_result"
1003 | }
1004 | ],
1005 | "source": [
1006 | "x.gradient"
1007 | ]
1008 | },
1009 | {
1010 | "cell_type": "code",
1011 | "execution_count": 33,
1012 | "metadata": {},
1013 | "outputs": [
1014 | {
1015 | "data": {
1016 | "text/plain": [
1017 | "10.0"
1018 | ]
1019 | },
1020 | "execution_count": 33,
1021 | "metadata": {},
1022 | "output_type": "execute_result"
1023 | }
1024 | ],
1025 | "source": [
1026 | "y.gradient"
1027 | ]
1028 | },
1029 | {
1030 | "cell_type": "markdown",
1031 | "metadata": {},
1032 | "source": [
1033 | "Again, in this implementation the outputs are just numbers, not symbolic expressions, so we are limited to first order derivatives. However, we could have made the `backpropagate()` methods return symbolic expressions rather than values (e.g., return `Add(2,3)` rather than 5). This would make it possible to compute second order gradients (and beyond). This is what TensorFlow does, as do all the major libraries that implement autodiff."
1034 | ]
1035 | },
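  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As a rough illustration of that idea (a sketch only, not the book's code; the `*Sym` class names are made up), `backpropagate()` can pass graph nodes around instead of numbers, so each gradient is itself a small computation graph that can be evaluated (and, with a bit more bookkeeping, differentiated again):"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Sketch only (not the book's code): accumulate symbolic gradient expressions\n",
    "# (graph nodes) instead of numbers, reusing Const, Add and Mul defined above.\n",
    "class VarSym(Var):\n",
    "    def __init__(self, name, init_value=0):\n",
    "        Var.__init__(self, name, init_value)\n",
    "        self.sym_gradient = Const(0)\n",
    "    def backpropagate(self, gradient):  # gradient is now a node, not a number\n",
    "        self.sym_gradient = Add(self.sym_gradient, gradient)\n",
    "\n",
    "class MulSym(Mul):\n",
    "    def backpropagate(self, gradient):\n",
    "        self.a.backpropagate(MulSym(gradient, self.b))\n",
    "        self.b.backpropagate(MulSym(gradient, self.a))\n",
    "\n",
    "x = VarSym(\"x\", init_value=3)\n",
    "y = VarSym(\"y\", init_value=4)\n",
    "f = Add(MulSym(MulSym(x, x), y), Add(y, Const(2)))  # f(x,y) = x²y + y + 2\n",
    "f.evaluate()\n",
    "f.backpropagate(Const(1))\n",
    "x.sym_gradient.evaluate(), y.sym_gradient.evaluate()  # (24, 10), computed from graphs"
   ]
  },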
1036 | {
1037 | "cell_type": "markdown",
1038 | "metadata": {},
1039 | "source": [
1040 | "### Reverse mode autodiff using TensorFlow"
1041 | ]
1042 | },
1043 | {
1044 | "cell_type": "code",
1045 | "execution_count": 34,
1046 | "metadata": {},
1047 | "outputs": [],
1048 | "source": [
1049 | "try:\n",
1050 | " # %tensorflow_version only exists in Colab.\n",
1051 | " %tensorflow_version 1.x\n",
1052 | "except Exception:\n",
1053 | " pass\n",
1054 | "\n",
1055 | "import tensorflow as tf"
1056 | ]
1057 | },
1058 | {
1059 | "cell_type": "code",
1060 | "execution_count": 35,
1061 | "metadata": {},
1062 | "outputs": [
1063 | {
1064 | "data": {
1065 | "text/plain": [
1066 | "(42.0, [24.0, 10.0])"
1067 | ]
1068 | },
1069 | "execution_count": 35,
1070 | "metadata": {},
1071 | "output_type": "execute_result"
1072 | }
1073 | ],
1074 | "source": [
1075 | "tf.reset_default_graph()\n",
1076 | "\n",
1077 | "x = tf.Variable(3., name=\"x\")\n",
1078 | "y = tf.Variable(4., name=\"y\")\n",
1079 | "f = x*x*y + y + 2\n",
1080 | "\n",
1081 | "jacobians = tf.gradients(f, [x, y])\n",
1082 | "\n",
1083 | "init = tf.global_variables_initializer()\n",
1084 | "\n",
1085 | "with tf.Session() as sess:\n",
1086 | " init.run()\n",
1087 | " f_val, jacobians_val = sess.run([f, jacobians])\n",
1088 | "\n",
1089 | "f_val, jacobians_val"
1090 | ]
1091 | },
1092 | {
1093 | "cell_type": "markdown",
1094 | "metadata": {},
1095 | "source": [
1096 | "Since everything is symbolic, we can compute second order derivatives, and beyond. However, when we compute the derivative of a tensor with regards to a variable that it does not depend on, instead of returning 0.0, the `gradients()` function returns None, which cannot be evaluated by `sess.run()`. So beware of `None` values. Here we just replace them with zero tensors."
1097 | ]
1098 | },
1099 | {
1100 | "cell_type": "code",
1101 | "execution_count": 36,
1102 | "metadata": {},
1103 | "outputs": [
1104 | {
1105 | "data": {
1106 | "text/plain": [
1107 | "([8.0, 6.0], [6.0, 0.0])"
1108 | ]
1109 | },
1110 | "execution_count": 36,
1111 | "metadata": {},
1112 | "output_type": "execute_result"
1113 | }
1114 | ],
1115 | "source": [
1116 | "hessians_x = tf.gradients(jacobians[0], [x, y])\n",
1117 | "hessians_y = tf.gradients(jacobians[1], [x, y])\n",
1118 | "\n",
1119 | "def replace_none_with_zero(tensors):\n",
1120 | " return [tensor if tensor is not None else tf.constant(0.)\n",
1121 | " for tensor in tensors]\n",
1122 | "\n",
1123 | "hessians_x = replace_none_with_zero(hessians_x)\n",
1124 | "hessians_y = replace_none_with_zero(hessians_y)\n",
1125 | "\n",
1126 | "init = tf.global_variables_initializer()\n",
1127 | "\n",
1128 | "with tf.Session() as sess:\n",
1129 | " init.run()\n",
1130 | " hessians_x_val, hessians_y_val = sess.run([hessians_x, hessians_y])\n",
1131 | "\n",
1132 | "hessians_x_val, hessians_y_val"
1133 | ]
1134 | },
1135 | {
1136 | "cell_type": "markdown",
1137 | "metadata": {},
1138 | "source": [
1139 | "And that's all folks! Hope you enjoyed this notebook."
1140 | ]
1141 | },
1142 | {
1143 | "cell_type": "code",
1144 | "execution_count": null,
1145 | "metadata": {},
1146 | "outputs": [],
1147 | "source": []
1148 | }
1149 | ],
1150 | "metadata": {
1151 | "kernelspec": {
1152 | "display_name": "Python 3",
1153 | "language": "python",
1154 | "name": "python3"
1155 | },
1156 | "language_info": {
1157 | "codemirror_mode": {
1158 | "name": "ipython",
1159 | "version": 3
1160 | },
1161 | "file_extension": ".py",
1162 | "mimetype": "text/x-python",
1163 | "name": "python",
1164 | "nbconvert_exporter": "python",
1165 | "pygments_lexer": "ipython3",
1166 | "version": "3.7.10"
1167 | },
1168 | "nav_menu": {
1169 | "height": "603px",
1170 | "width": "616px"
1171 | },
1172 | "toc": {
1173 | "navigate_menu": true,
1174 | "number_sections": true,
1175 | "sideBar": true,
1176 | "threshold": 6,
1177 | "toc_cell": false,
1178 | "toc_section_display": "block",
1179 | "toc_window_display": true
1180 | }
1181 | },
1182 | "nbformat": 4,
1183 | "nbformat_minor": 1
1184 | }
1185 |
--------------------------------------------------------------------------------
/extra_tensorflow_reproducibility.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "markdown",
5 | "metadata": {},
6 | "source": [
7 | "**TensorFlow Reproducibility**\n",
8 | "\n",
9 | "This notebook explains how to get fully reproducible code with TensorFlow.\n",
10 | "\n",
11 | ""
16 | ]
17 | },
18 | {
19 | "cell_type": "markdown",
20 | "metadata": {},
21 | "source": [
22 | "**Warning**: this notebook accompanies the 1st edition of the book. Please visit https://github.com/ageron/handson-ml2 for the 2nd edition project, with up-to-date notebooks using the latest library versions. In particular, the 1st edition is based on TensorFlow 1, while the 2nd edition uses TensorFlow 2, which is much simpler to use."
23 | ]
24 | },
25 | {
26 | "cell_type": "markdown",
27 | "metadata": {},
28 | "source": [
29 | "Watch [this video](https://youtu.be/Ys8ofBeR2kA) to understand the key ideas behind TensorFlow reproducibility:"
30 | ]
31 | },
32 | {
33 | "cell_type": "code",
34 | "execution_count": 2,
35 | "metadata": {},
36 | "outputs": [
37 | {
38 | "data": {
39 | "text/html": [
40 | "\n",
41 | " VIDEO \n",
48 | " "
49 | ],
50 | "text/plain": [
51 | ""
52 | ]
53 | },
54 | "execution_count": 2,
55 | "metadata": {},
56 | "output_type": "execute_result"
57 | }
58 | ],
59 | "source": [
60 | "from IPython.display import IFrame\n",
61 | "IFrame(src=\"https://www.youtube.com/embed/Ys8ofBeR2kA\", width=560, height=315, frameborder=\"0\", allowfullscreen=True)"
62 | ]
63 | },
64 | {
65 | "cell_type": "markdown",
66 | "metadata": {},
67 | "source": [
68 | "**Warning**: this is the code for the 1st edition of the book. Please visit https://github.com/ageron/handson-ml2 for the 2nd edition code, with up-to-date notebooks using the latest library versions. In particular, the 1st edition is based on TensorFlow 1, while the 2nd edition uses TensorFlow 2, which is much simpler to use."
69 | ]
70 | },
71 | {
72 | "cell_type": "code",
73 | "execution_count": 1,
74 | "metadata": {},
75 | "outputs": [
76 | {
77 | "name": "stderr",
78 | "output_type": "stream",
79 | "text": [
80 | "/Users/ageron/.virtualenvs/ml/lib/python3.6/importlib/_bootstrap.py:219: RuntimeWarning: compiletime version 3.5 of module 'tensorflow.python.framework.fast_tensor_util' does not match runtime version 3.6\n",
81 | " return f(*args, **kwds)\n"
82 | ]
83 | }
84 | ],
85 | "source": [
86 | "from __future__ import division, print_function, unicode_literals\n",
87 | "\n",
88 | "try:\n",
89 | " # %tensorflow_version only exists in Colab.\n",
90 | " %tensorflow_version 1.x\n",
91 | "except Exception:\n",
92 | " pass\n",
93 | "\n",
94 | "import numpy as np\n",
95 | "import tensorflow as tf\n",
96 | "from tensorflow import keras"
97 | ]
98 | },
99 | {
100 | "cell_type": "markdown",
101 | "metadata": {},
102 | "source": [
103 | "## Checklist"
104 | ]
105 | },
106 | {
107 | "cell_type": "markdown",
108 | "metadata": {},
109 | "source": [
110 | "1. Do not run TensorFlow on the GPU.\n",
111 | "2. Beware of multithreading, and make TensorFlow single-threaded.\n",
112 | "3. Set all the random seeds.\n",
113 | "4. Eliminate any other source of variability."
114 | ]
115 | },
116 | {
117 | "cell_type": "markdown",
118 | "metadata": {},
119 | "source": [
120 | "## Do Not Run TensorFlow on the GPU"
121 | ]
122 | },
123 | {
124 | "cell_type": "markdown",
125 | "metadata": {},
126 | "source": [
127 | "Some operations (like `tf.reduce_sum()`) have favor performance over precision, and their outputs may vary slightly across runs. To get reproducible results, make sure TensorFlow runs on the CPU:"
128 | ]
129 | },
130 | {
131 | "cell_type": "code",
132 | "execution_count": 2,
133 | "metadata": {},
134 | "outputs": [],
135 | "source": [
136 | "import os\n",
137 | "os.environ[\"CUDA_VISIBLE_DEVICES\"]=\"\""
138 | ]
139 | },
140 | {
141 | "cell_type": "markdown",
142 | "metadata": {},
143 | "source": [
144 | "## Beware of Multithreading"
145 | ]
146 | },
147 | {
148 | "cell_type": "markdown",
149 | "metadata": {},
150 | "source": [
151 | "Because floats have limited precision, the order of execution matters:"
152 | ]
153 | },
154 | {
155 | "cell_type": "code",
156 | "execution_count": 3,
157 | "metadata": {},
158 | "outputs": [
159 | {
160 | "data": {
161 | "text/plain": [
162 | "1.4285714285714286"
163 | ]
164 | },
165 | "execution_count": 3,
166 | "metadata": {},
167 | "output_type": "execute_result"
168 | }
169 | ],
170 | "source": [
171 | "2. * 5. / 7."
172 | ]
173 | },
174 | {
175 | "cell_type": "code",
176 | "execution_count": 4,
177 | "metadata": {},
178 | "outputs": [
179 | {
180 | "data": {
181 | "text/plain": [
182 | "1.4285714285714284"
183 | ]
184 | },
185 | "execution_count": 4,
186 | "metadata": {},
187 | "output_type": "execute_result"
188 | }
189 | ],
190 | "source": [
191 | "2. / 7. * 5."
192 | ]
193 | },
194 | {
195 | "cell_type": "markdown",
196 | "metadata": {},
197 | "source": [
198 | "You should make sure TensorFlow runs your ops on a single thread:"
199 | ]
200 | },
201 | {
202 | "cell_type": "code",
203 | "execution_count": 5,
204 | "metadata": {},
205 | "outputs": [],
206 | "source": [
207 | "config = tf.ConfigProto(intra_op_parallelism_threads=1,\n",
208 | " inter_op_parallelism_threads=1)\n",
209 | "\n",
210 | "with tf.Session(config=config) as sess:\n",
211 | " #... this will run single threaded\n",
212 | " pass"
213 | ]
214 | },
215 | {
216 | "cell_type": "markdown",
217 | "metadata": {},
218 | "source": [
219 | "The thread pools for all sessions are created when you create the first session, so all sessions in the rest of this notebook will be single-threaded:"
220 | ]
221 | },
222 | {
223 | "cell_type": "code",
224 | "execution_count": 6,
225 | "metadata": {},
226 | "outputs": [],
227 | "source": [
228 | "with tf.Session() as sess:\n",
229 | " #... also single-threaded!\n",
230 | " pass"
231 | ]
232 | },
233 | {
234 | "cell_type": "markdown",
235 | "metadata": {},
236 | "source": [
237 | "## Set all the random seeds!"
238 | ]
239 | },
240 | {
241 | "cell_type": "markdown",
242 | "metadata": {},
243 | "source": [
244 | "### Python's built-in `hash()` function"
245 | ]
246 | },
247 | {
248 | "cell_type": "code",
249 | "execution_count": 7,
250 | "metadata": {},
251 | "outputs": [
252 | {
253 | "name": "stdout",
254 | "output_type": "stream",
255 | "text": [
256 | "{'n', 'k', 'l', 'h', 'r', 'a', 'i', 't', 'd', 's', 'g', 'T', ' ', 'y', 'e', 'u'}\n",
257 | "{'n', 'k', 'l', 'h', 'r', 'a', 'i', 't', 'd', 's', 'g', 'T', ' ', 'y', 'e', 'u'}\n"
258 | ]
259 | }
260 | ],
261 | "source": [
262 | "print(set(\"Try restarting the kernel and running this again\"))\n",
263 | "print(set(\"Try restarting the kernel and running this again\"))"
264 | ]
265 | },
266 | {
267 | "cell_type": "markdown",
268 | "metadata": {},
269 | "source": [
270 | "Since Python 3.3, the result will be different every time, unless you start Python with the `PYTHONHASHSEED` environment variable set to `0`:"
271 | ]
272 | },
273 | {
274 | "cell_type": "markdown",
275 | "metadata": {},
276 | "source": [
277 | "```shell\n",
278 | "PYTHONHASHSEED=0 python\n",
279 | "```\n",
280 | "\n",
281 | "```pycon\n",
282 | ">>> print(set(\"Now the output is stable across runs\"))\n",
283 | "{'n', 'b', 'h', 'o', 'i', 'a', 'r', 't', 'p', 'N', 's', 'c', ' ', 'l', 'e', 'w', 'u'}\n",
284 | ">>> exit()\n",
285 | "```\n",
286 | "\n",
287 | "```shell\n",
288 | "PYTHONHASHSEED=0 python\n",
289 | "```\n",
290 | "```pycon\n",
291 | ">>> print(set(\"Now the output is stable across runs\"))\n",
292 | "{'n', 'b', 'h', 'o', 'i', 'a', 'r', 't', 'p', 'N', 's', 'c', ' ', 'l', 'e', 'w', 'u'}\n",
293 | "```"
294 | ]
295 | },
296 | {
297 | "cell_type": "markdown",
298 | "metadata": {},
299 | "source": [
300 | "Alternatively, you could set this environment variable system-wide, but that's probably not a good idea, because this automatic randomization was [introduced for security reasons](http://ocert.org/advisories/ocert-2011-003.html)."
301 | ]
302 | },
303 | {
304 | "cell_type": "markdown",
305 | "metadata": {},
306 | "source": [
307 | "Unfortunately, setting the environment variable from within Python (e.g., using `os.environ[\"PYTHONHASHSEED\"]=\"0\"`) will not work, because Python reads it upon startup. For Jupyter notebooks, you have to start the Jupyter server like this:\n",
308 | "\n",
309 | "```shell\n",
310 | "PYTHONHASHSEED=0 jupyter notebook\n",
311 | "```"
312 | ]
313 | },
314 | {
315 | "cell_type": "code",
316 | "execution_count": 8,
317 | "metadata": {},
318 | "outputs": [],
319 | "source": [
320 | "if os.environ.get(\"PYTHONHASHSEED\") != \"0\":\n",
321 | " raise Exception(\"You must set PYTHONHASHSEED=0 when starting the Jupyter server to get reproducible results.\")"
322 | ]
323 | },
324 | {
325 | "cell_type": "markdown",
326 | "metadata": {},
327 | "source": [
328 | "### Python Random Number Generators (RNGs)"
329 | ]
330 | },
331 | {
332 | "cell_type": "code",
333 | "execution_count": 9,
334 | "metadata": {},
335 | "outputs": [
336 | {
337 | "name": "stdout",
338 | "output_type": "stream",
339 | "text": [
340 | "0.6394267984578837\n",
341 | "0.025010755222666936\n",
342 | "\n",
343 | "0.6394267984578837\n",
344 | "0.025010755222666936\n"
345 | ]
346 | }
347 | ],
348 | "source": [
349 | "import random\n",
350 | "\n",
351 | "random.seed(42)\n",
352 | "print(random.random())\n",
353 | "print(random.random())\n",
354 | "\n",
355 | "print()\n",
356 | "\n",
357 | "random.seed(42)\n",
358 | "print(random.random())\n",
359 | "print(random.random())"
360 | ]
361 | },
362 | {
363 | "cell_type": "markdown",
364 | "metadata": {},
365 | "source": [
366 | "### NumPy RNGs"
367 | ]
368 | },
369 | {
370 | "cell_type": "code",
371 | "execution_count": 10,
372 | "metadata": {},
373 | "outputs": [
374 | {
375 | "name": "stdout",
376 | "output_type": "stream",
377 | "text": [
378 | "0.3745401188473625\n",
379 | "0.9507143064099162\n",
380 | "\n",
381 | "0.3745401188473625\n",
382 | "0.9507143064099162\n"
383 | ]
384 | }
385 | ],
386 | "source": [
387 | "import numpy as np\n",
388 | "\n",
389 | "np.random.seed(42)\n",
390 | "print(np.random.rand())\n",
391 | "print(np.random.rand())\n",
392 | "\n",
393 | "print()\n",
394 | "\n",
395 | "np.random.seed(42)\n",
396 | "print(np.random.rand())\n",
397 | "print(np.random.rand())"
398 | ]
399 | },
400 | {
401 | "cell_type": "markdown",
402 | "metadata": {},
403 | "source": [
404 | "### TensorFlow RNGs"
405 | ]
406 | },
407 | {
408 | "cell_type": "markdown",
409 | "metadata": {},
410 | "source": [
411 | "TensorFlow's behavior is more complex because of two things:\n",
412 | "* you create a graph, and then you execute it. The random seed must be set before you create the random operations.\n",
413 | "* there are two seeds: one at the graph level, and one at the individual random operation level."
414 | ]
415 | },
416 | {
417 | "cell_type": "code",
418 | "execution_count": 11,
419 | "metadata": {},
420 | "outputs": [
421 | {
422 | "name": "stdout",
423 | "output_type": "stream",
424 | "text": [
425 | "0.63789964\n",
426 | "0.8774011\n",
427 | "\n",
428 | "0.63789964\n",
429 | "0.8774011\n"
430 | ]
431 | }
432 | ],
433 | "source": [
434 | "import tensorflow as tf\n",
435 | "\n",
436 | "tf.set_random_seed(42)\n",
437 | "rnd = tf.random_uniform(shape=[])\n",
438 | "\n",
439 | "with tf.Session() as sess:\n",
440 | " print(rnd.eval())\n",
441 | " print(rnd.eval())\n",
442 | "\n",
443 | "print()\n",
444 | "\n",
445 | "with tf.Session() as sess:\n",
446 | " print(rnd.eval())\n",
447 | " print(rnd.eval())"
448 | ]
449 | },
450 | {
451 | "cell_type": "markdown",
452 | "metadata": {},
453 | "source": [
454 | "Every time you reset the graph, you need to set the seed again:"
455 | ]
456 | },
457 | {
458 | "cell_type": "code",
459 | "execution_count": 12,
460 | "metadata": {},
461 | "outputs": [
462 | {
463 | "name": "stdout",
464 | "output_type": "stream",
465 | "text": [
466 | "0.63789964\n",
467 | "0.8774011\n",
468 | "\n",
469 | "0.63789964\n",
470 | "0.8774011\n"
471 | ]
472 | }
473 | ],
474 | "source": [
475 | "tf.reset_default_graph()\n",
476 | "\n",
477 | "tf.set_random_seed(42)\n",
478 | "rnd = tf.random_uniform(shape=[])\n",
479 | "\n",
480 | "with tf.Session() as sess:\n",
481 | " print(rnd.eval())\n",
482 | " print(rnd.eval())\n",
483 | "\n",
484 | "print()\n",
485 | "\n",
486 | "with tf.Session() as sess:\n",
487 | " print(rnd.eval())\n",
488 | " print(rnd.eval())"
489 | ]
490 | },
491 | {
492 | "cell_type": "markdown",
493 | "metadata": {},
494 | "source": [
495 | "If you create your own graph, it will ignore the default graph's seed:"
496 | ]
497 | },
498 | {
499 | "cell_type": "code",
500 | "execution_count": 13,
501 | "metadata": {},
502 | "outputs": [
503 | {
504 | "name": "stdout",
505 | "output_type": "stream",
506 | "text": [
507 | "0.5718187\n",
508 | "0.6233171\n",
509 | "\n",
510 | "0.32140207\n",
511 | "0.46593904\n"
512 | ]
513 | }
514 | ],
515 | "source": [
516 | "tf.reset_default_graph()\n",
517 | "tf.set_random_seed(42)\n",
518 | "\n",
519 | "graph = tf.Graph()\n",
520 | "with graph.as_default():\n",
521 | " rnd = tf.random_uniform(shape=[])\n",
522 | "\n",
523 | "with tf.Session(graph=graph):\n",
524 | " print(rnd.eval())\n",
525 | " print(rnd.eval())\n",
526 | "\n",
527 | "print()\n",
528 | "\n",
529 | "with tf.Session(graph=graph):\n",
530 | " print(rnd.eval())\n",
531 | " print(rnd.eval())"
532 | ]
533 | },
534 | {
535 | "cell_type": "markdown",
536 | "metadata": {},
537 | "source": [
538 | "You must set its own seed:"
539 | ]
540 | },
541 | {
542 | "cell_type": "code",
543 | "execution_count": 14,
544 | "metadata": {},
545 | "outputs": [
546 | {
547 | "name": "stdout",
548 | "output_type": "stream",
549 | "text": [
550 | "0.63789964\n",
551 | "0.8774011\n",
552 | "\n",
553 | "0.63789964\n",
554 | "0.8774011\n"
555 | ]
556 | }
557 | ],
558 | "source": [
559 | "graph = tf.Graph()\n",
560 | "with graph.as_default():\n",
561 | " tf.set_random_seed(42)\n",
562 | " rnd = tf.random_uniform(shape=[])\n",
563 | "\n",
564 | "with tf.Session(graph=graph):\n",
565 | " print(rnd.eval())\n",
566 | " print(rnd.eval())\n",
567 | "\n",
568 | "print()\n",
569 | "\n",
570 | "with tf.Session(graph=graph):\n",
571 | " print(rnd.eval())\n",
572 | " print(rnd.eval())"
573 | ]
574 | },
575 | {
576 | "cell_type": "markdown",
577 | "metadata": {},
578 | "source": [
579 | "If you set the seed after the random operation is created, the seed has no effet:"
580 | ]
581 | },
582 | {
583 | "cell_type": "code",
584 | "execution_count": 15,
585 | "metadata": {},
586 | "outputs": [
587 | {
588 | "name": "stdout",
589 | "output_type": "stream",
590 | "text": [
591 | "0.087068915\n",
592 | "0.6322479\n",
593 | "\n",
594 | "0.17158246\n",
595 | "0.2868148\n"
596 | ]
597 | }
598 | ],
599 | "source": [
600 | "tf.reset_default_graph()\n",
601 | "\n",
602 | "rnd = tf.random_uniform(shape=[])\n",
603 | "\n",
604 | "tf.set_random_seed(42) # BAD, NO EFFECT!\n",
605 | "with tf.Session() as sess:\n",
606 | " print(rnd.eval())\n",
607 | " print(rnd.eval())\n",
608 | "\n",
609 | "print()\n",
610 | "\n",
611 | "tf.set_random_seed(42) # BAD, NO EFFECT!\n",
612 | "with tf.Session() as sess:\n",
613 | " print(rnd.eval())\n",
614 | " print(rnd.eval())"
615 | ]
616 | },
617 | {
618 | "cell_type": "markdown",
619 | "metadata": {},
620 | "source": [
621 | "#### A note about operation seeds"
622 | ]
623 | },
624 | {
625 | "cell_type": "markdown",
626 | "metadata": {},
627 | "source": [
628 | "You can also set a seed for each individual random operation. When you do, it is combined with the graph seed into the final seed used by that op. The following table summarizes how this works:\n",
629 | "\n",
630 | "| Graph seed | Op seed | Resulting seed |\n",
631 | "|------------|---------|--------------------------------|\n",
632 | "| None | None | Random |\n",
633 | "| graph_seed | None | f(graph_seed, op_index) |\n",
634 | "| None | op_seed | f(default_graph_seed, op_seed) |\n",
635 | "| graph_seed | op_seed | f(graph_seed, op_seed) |\n",
636 | "\n",
637 | "* `f()` is a deterministic function.\n",
638 | "* `op_index = graph._last_id` when there is a graph seed, different random ops without op seeds will have different outputs. However, each of them will have the same sequence of outputs at every run.\n",
639 | "\n",
640 | "In eager mode, there is a global seed instead of graph seed (since there is no graph in eager mode)."
641 | ]
642 | },
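643 | {
644 | "cell_type": "markdown",
645 | "metadata": {},
646 | "source": [
647 | "For example, here is a minimal eager-mode sketch (assuming a fresh Python process, since eager execution must be enabled before any other TensorFlow operation):\n",
648 | "\n",
649 | "```python\n",
650 | "import tensorflow as tf\n",
651 | "\n",
652 | "tf.enable_eager_execution()  # TF 1.x only; call right after importing TensorFlow\n",
653 | "tf.set_random_seed(42)       # acts as the global seed in eager mode\n",
654 | "\n",
655 | "print(tf.random_uniform(shape=[]))  # evaluated immediately, no Session needed\n",
656 | "print(tf.random_uniform(shape=[]))\n",
657 | "```"
658 | ]
659 | },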
643 | {
644 | "cell_type": "code",
645 | "execution_count": 16,
646 | "metadata": {},
647 | "outputs": [
648 | {
649 | "name": "stdout",
650 | "output_type": "stream",
651 | "text": [
652 | "0.95227146\n",
653 | "0.95227146\n",
654 | "0.55099714\n",
655 | "0.8960779\n",
656 | "0.8960779\n",
657 | "0.54318357\n",
658 | "\n",
659 | "0.95227146\n",
660 | "0.95227146\n",
661 | "0.6398845\n",
662 | "0.8960779\n",
663 | "0.8960779\n",
664 | "0.24617589\n"
665 | ]
666 | }
667 | ],
668 | "source": [
669 | "tf.reset_default_graph()\n",
670 | "\n",
671 | "rnd1 = tf.random_uniform(shape=[], seed=42)\n",
672 | "rnd2 = tf.random_uniform(shape=[], seed=42)\n",
673 | "rnd3 = tf.random_uniform(shape=[])\n",
674 | "\n",
675 | "with tf.Session() as sess:\n",
676 | " print(rnd1.eval())\n",
677 | " print(rnd2.eval())\n",
678 | " print(rnd3.eval())\n",
679 | " print(rnd1.eval())\n",
680 | " print(rnd2.eval())\n",
681 | " print(rnd3.eval())\n",
682 | "\n",
683 | "print()\n",
684 | "\n",
685 | "with tf.Session() as sess:\n",
686 | " print(rnd1.eval())\n",
687 | " print(rnd2.eval())\n",
688 | " print(rnd3.eval())\n",
689 | " print(rnd1.eval())\n",
690 | " print(rnd2.eval())\n",
691 | " print(rnd3.eval())"
692 | ]
693 | },
694 | {
695 | "cell_type": "markdown",
696 | "metadata": {},
697 | "source": [
698 | "In the following example, you may think that all random ops will have the same random seed, but `rnd3` will actually have a different seed:"
699 | ]
700 | },
701 | {
702 | "cell_type": "code",
703 | "execution_count": 17,
704 | "metadata": {},
705 | "outputs": [
706 | {
707 | "name": "stdout",
708 | "output_type": "stream",
709 | "text": [
710 | "0.4163028\n",
711 | "0.4163028\n",
712 | "0.96100175\n",
713 | "0.033224702\n",
714 | "0.033224702\n",
715 | "0.17637014\n",
716 | "\n",
717 | "0.4163028\n",
718 | "0.4163028\n",
719 | "0.96100175\n",
720 | "0.033224702\n",
721 | "0.033224702\n",
722 | "0.17637014\n"
723 | ]
724 | }
725 | ],
726 | "source": [
727 | "tf.reset_default_graph()\n",
728 | "\n",
729 | "tf.set_random_seed(42)\n",
730 | "\n",
731 | "rnd1 = tf.random_uniform(shape=[], seed=42)\n",
732 | "rnd2 = tf.random_uniform(shape=[], seed=42)\n",
733 | "rnd3 = tf.random_uniform(shape=[])\n",
734 | "\n",
735 | "with tf.Session() as sess:\n",
736 | " print(rnd1.eval())\n",
737 | " print(rnd2.eval())\n",
738 | " print(rnd3.eval())\n",
739 | " print(rnd1.eval())\n",
740 | " print(rnd2.eval())\n",
741 | " print(rnd3.eval())\n",
742 | "\n",
743 | "print()\n",
744 | "\n",
745 | "with tf.Session() as sess:\n",
746 | " print(rnd1.eval())\n",
747 | " print(rnd2.eval())\n",
748 | " print(rnd3.eval())\n",
749 | " print(rnd1.eval())\n",
750 | " print(rnd2.eval())\n",
751 | " print(rnd3.eval())"
752 | ]
753 | },
754 | {
755 | "cell_type": "markdown",
756 | "metadata": {},
757 | "source": [
758 | "#### Estimators API"
759 | ]
760 | },
761 | {
762 | "cell_type": "markdown",
763 | "metadata": {},
764 | "source": [
765 | "**Tip**: in a Jupyter notebook, you probably want to set the random seeds regularly so that you can come back and run the notebook from there (instead of from the beginning) and still get reproducible outputs."
766 | ]
767 | },
768 | {
769 | "cell_type": "code",
770 | "execution_count": 18,
771 | "metadata": {},
772 | "outputs": [],
773 | "source": [
774 | "random.seed(42)\n",
775 | "np.random.seed(42)\n",
776 | "tf.set_random_seed(42)"
777 | ]
778 | },
779 | {
780 | "cell_type": "markdown",
781 | "metadata": {},
782 | "source": [
783 | "If you use the Estimators API, make sure to create a `RunConfig` and set its `tf_random_seed`, then pass it to the constructor of your estimator:"
784 | ]
785 | },
786 | {
787 | "cell_type": "code",
788 | "execution_count": 19,
789 | "metadata": {},
790 | "outputs": [
791 | {
792 | "name": "stdout",
793 | "output_type": "stream",
794 | "text": [
795 | "WARNING:tensorflow:Using temporary folder as model directory: /var/folders/wy/h39t6kb11pnbb0pzhksd_fqh0000gn/T/tmp2xxrubio\n",
796 | "INFO:tensorflow:Using config: {'_model_dir': '/var/folders/wy/h39t6kb11pnbb0pzhksd_fqh0000gn/T/tmp2xxrubio', '_tf_random_seed': 42, '_save_summary_steps': 100, '_save_checkpoints_steps': None, '_save_checkpoints_secs': 600, '_session_config': None, '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': 100, '_train_distribute': None, '_device_fn': None, '_service': None, '_cluster_spec': , '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master': '', '_evaluation_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1}\n"
797 | ]
798 | }
799 | ],
800 | "source": [
801 | "my_config = tf.estimator.RunConfig(tf_random_seed=42)\n",
802 | "\n",
803 | "feature_cols = [tf.feature_column.numeric_column(\"X\", shape=[28 * 28])]\n",
804 | "dnn_clf = tf.estimator.DNNClassifier(hidden_units=[300, 100], n_classes=10,\n",
805 | " feature_columns=feature_cols,\n",
806 | " config=my_config)"
807 | ]
808 | },
809 | {
810 | "cell_type": "markdown",
811 | "metadata": {},
812 | "source": [
813 | "Let's try it on MNIST:"
814 | ]
815 | },
816 | {
817 | "cell_type": "code",
818 | "execution_count": 20,
819 | "metadata": {},
820 | "outputs": [],
821 | "source": [
822 | "(X_train, y_train), (X_test, y_test) = tf.keras.datasets.mnist.load_data()\n",
823 | "X_train = X_train.astype(np.float32).reshape(-1, 28*28) / 255.0\n",
824 | "y_train = y_train.astype(np.int32)"
825 | ]
826 | },
827 | {
828 | "cell_type": "markdown",
829 | "metadata": {},
830 | "source": [
831 | "Unfortunately, the `numpy_input_fn` does not allow us to set the seed when `shuffle=True`, so we must shuffle the data ourself and set `shuffle=False`."
832 | ]
833 | },
834 | {
835 | "cell_type": "code",
836 | "execution_count": 21,
837 | "metadata": {},
838 | "outputs": [
839 | {
840 | "name": "stdout",
841 | "output_type": "stream",
842 | "text": [
843 | "INFO:tensorflow:Calling model_fn.\n",
844 | "INFO:tensorflow:Done calling model_fn.\n",
845 | "INFO:tensorflow:Create CheckpointSaverHook.\n",
846 | "INFO:tensorflow:Graph was finalized.\n",
847 | "INFO:tensorflow:Running local_init_op.\n",
848 | "INFO:tensorflow:Done running local_init_op.\n",
849 | "INFO:tensorflow:Saving checkpoints for 0 into /var/folders/wy/h39t6kb11pnbb0pzhksd_fqh0000gn/T/tmp2xxrubio/model.ckpt.\n",
850 | "INFO:tensorflow:loss = 73.945915, step = 1\n",
851 | "INFO:tensorflow:global_step/sec: 348.999\n",
852 | "INFO:tensorflow:loss = 21.020527, step = 101 (0.287 sec)\n",
853 | "INFO:tensorflow:global_step/sec: 431.365\n",
854 | "INFO:tensorflow:loss = 8.926933, step = 201 (0.232 sec)\n",
855 | "INFO:tensorflow:global_step/sec: 438.11\n",
856 | "INFO:tensorflow:loss = 2.3184745, step = 301 (0.228 sec)\n",
857 | "INFO:tensorflow:global_step/sec: 437.696\n",
858 | "INFO:tensorflow:loss = 10.654381, step = 401 (0.228 sec)\n",
859 | "INFO:tensorflow:global_step/sec: 452.808\n",
860 | "INFO:tensorflow:loss = 4.2829914, step = 501 (0.221 sec)\n",
861 | "INFO:tensorflow:global_step/sec: 450.062\n",
862 | "INFO:tensorflow:loss = 2.497019, step = 601 (0.222 sec)\n",
863 | "INFO:tensorflow:global_step/sec: 451.86\n",
864 | "INFO:tensorflow:loss = 3.9215999, step = 701 (0.221 sec)\n",
865 | "INFO:tensorflow:global_step/sec: 442.86\n",
866 | "INFO:tensorflow:loss = 3.8031044, step = 801 (0.226 sec)\n",
867 | "INFO:tensorflow:global_step/sec: 444.581\n",
868 | "INFO:tensorflow:loss = 3.9209557, step = 901 (0.225 sec)\n",
869 | "INFO:tensorflow:global_step/sec: 439.603\n",
870 | "INFO:tensorflow:loss = 5.506338, step = 1001 (0.227 sec)\n",
871 | "INFO:tensorflow:global_step/sec: 444.545\n",
872 | "INFO:tensorflow:loss = 2.6690354, step = 1101 (0.225 sec)\n",
873 | "INFO:tensorflow:global_step/sec: 445.176\n",
874 | "INFO:tensorflow:loss = 6.559507, step = 1201 (0.225 sec)\n",
875 | "INFO:tensorflow:global_step/sec: 443.365\n",
876 | "INFO:tensorflow:loss = 5.707597, step = 1301 (0.225 sec)\n",
877 | "INFO:tensorflow:global_step/sec: 447.822\n",
878 | "<<314 more lines>>\n",
879 | "INFO:tensorflow:loss = 0.48648793, step = 17101 (0.227 sec)\n",
880 | "INFO:tensorflow:global_step/sec: 454.872\n",
881 | "INFO:tensorflow:loss = 0.49331194, step = 17201 (0.220 sec)\n",
882 | "INFO:tensorflow:global_step/sec: 443.025\n",
883 | "INFO:tensorflow:loss = 0.32060045, step = 17301 (0.226 sec)\n",
884 | "INFO:tensorflow:global_step/sec: 440.069\n",
885 | "INFO:tensorflow:loss = 0.13167329, step = 17401 (0.227 sec)\n",
886 | "INFO:tensorflow:global_step/sec: 448.211\n",
887 | "INFO:tensorflow:loss = 0.05688939, step = 17501 (0.223 sec)\n",
888 | "INFO:tensorflow:global_step/sec: 450.458\n",
889 | "INFO:tensorflow:loss = 0.36213198, step = 17601 (0.222 sec)\n",
890 | "INFO:tensorflow:global_step/sec: 428.842\n",
891 | "INFO:tensorflow:loss = 0.36243188, step = 17701 (0.233 sec)\n",
892 | "INFO:tensorflow:global_step/sec: 456.734\n",
893 | "INFO:tensorflow:loss = 0.20977254, step = 17801 (0.219 sec)\n",
894 | "INFO:tensorflow:global_step/sec: 432.647\n",
895 | "INFO:tensorflow:loss = 0.09754325, step = 17901 (0.231 sec)\n",
896 | "INFO:tensorflow:global_step/sec: 389.941\n",
897 | "INFO:tensorflow:loss = 0.03494991, step = 18001 (0.256 sec)\n",
898 | "INFO:tensorflow:global_step/sec: 434.925\n",
899 | "INFO:tensorflow:loss = 0.17031653, step = 18101 (0.230 sec)\n",
900 | "INFO:tensorflow:global_step/sec: 445.735\n",
901 | "INFO:tensorflow:loss = 0.3200203, step = 18201 (0.224 sec)\n",
902 | "INFO:tensorflow:global_step/sec: 444.929\n",
903 | "INFO:tensorflow:loss = 0.18385477, step = 18301 (0.225 sec)\n",
904 | "INFO:tensorflow:global_step/sec: 445.546\n",
905 | "INFO:tensorflow:loss = 0.20921718, step = 18401 (0.225 sec)\n",
906 | "INFO:tensorflow:global_step/sec: 450.454\n",
907 | "INFO:tensorflow:loss = 0.01868303, step = 18501 (0.222 sec)\n",
908 | "INFO:tensorflow:global_step/sec: 445.762\n",
909 | "INFO:tensorflow:loss = 0.051421717, step = 18601 (0.224 sec)\n",
910 | "INFO:tensorflow:global_step/sec: 445.921\n",
911 | "INFO:tensorflow:loss = 0.047041617, step = 18701 (0.224 sec)\n",
912 | "INFO:tensorflow:Saving checkpoints for 18750 into /var/folders/wy/h39t6kb11pnbb0pzhksd_fqh0000gn/T/tmp2xxrubio/model.ckpt.\n",
913 | "INFO:tensorflow:Loss for final step: 0.46282205.\n"
914 | ]
915 | },
916 | {
917 | "data": {
918 | "text/plain": [
919 | ""
920 | ]
921 | },
922 | "execution_count": 21,
923 | "metadata": {},
924 | "output_type": "execute_result"
925 | }
926 | ],
927 | "source": [
928 | "indices = np.random.permutation(len(X_train))\n",
929 | "X_train_shuffled = X_train[indices]\n",
930 | "y_train_shuffled = y_train[indices]\n",
931 | "\n",
932 | "input_fn = tf.estimator.inputs.numpy_input_fn(\n",
933 | " x={\"X\": X_train_shuffled}, y=y_train_shuffled, num_epochs=10, batch_size=32, shuffle=False)\n",
934 | "dnn_clf.train(input_fn=input_fn)"
935 | ]
936 | },
937 | {
938 | "cell_type": "markdown",
939 | "metadata": {},
940 | "source": [
941 | "The final loss should be exactly 0.46282205."
942 | ]
943 | },
944 | {
945 | "cell_type": "markdown",
946 | "metadata": {},
947 | "source": [
948 | "Instead of using the `numpy_input_fn()` function (which cannot reproducibly shuffle the dataset at each epoch), you can create your own input function using the Data API and set its shuffling seed:"
949 | ]
950 | },
951 | {
952 | "cell_type": "code",
953 | "execution_count": 22,
954 | "metadata": {},
955 | "outputs": [],
956 | "source": [
957 | "def create_dataset(X, y=None, n_epochs=1, batch_size=32,\n",
958 | " buffer_size=1000, seed=None):\n",
959 | " dataset = tf.data.Dataset.from_tensor_slices(({\"X\": X}, y))\n",
960 | " dataset = dataset.repeat(n_epochs)\n",
961 | " dataset = dataset.shuffle(buffer_size, seed=seed)\n",
962 | " return dataset.batch(batch_size)\n",
963 | "\n",
964 | "input_fn=lambda: create_dataset(X_train, y_train, seed=42)"
965 | ]
966 | },
967 | {
968 | "cell_type": "code",
969 | "execution_count": 23,
970 | "metadata": {},
971 | "outputs": [
972 | {
973 | "name": "stdout",
974 | "output_type": "stream",
975 | "text": [
976 | "WARNING:tensorflow:Using temporary folder as model directory: /var/folders/wy/h39t6kb11pnbb0pzhksd_fqh0000gn/T/tmpawwl1lf0\n",
977 | "INFO:tensorflow:Using config: {'_model_dir': '/var/folders/wy/h39t6kb11pnbb0pzhksd_fqh0000gn/T/tmpawwl1lf0', '_tf_random_seed': 42, '_save_summary_steps': 100, '_save_checkpoints_steps': None, '_save_checkpoints_secs': 600, '_session_config': None, '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': 100, '_train_distribute': None, '_device_fn': None, '_service': None, '_cluster_spec': , '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master': '', '_evaluation_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1}\n",
978 | "INFO:tensorflow:Calling model_fn.\n",
979 | "INFO:tensorflow:Done calling model_fn.\n",
980 | "INFO:tensorflow:Create CheckpointSaverHook.\n",
981 | "INFO:tensorflow:Graph was finalized.\n",
982 | "INFO:tensorflow:Running local_init_op.\n",
983 | "INFO:tensorflow:Done running local_init_op.\n",
984 | "INFO:tensorflow:Saving checkpoints for 0 into /var/folders/wy/h39t6kb11pnbb0pzhksd_fqh0000gn/T/tmpawwl1lf0/model.ckpt.\n",
985 | "INFO:tensorflow:loss = 80.279686, step = 1\n",
986 | "INFO:tensorflow:global_step/sec: 161.253\n",
987 | "INFO:tensorflow:loss = 16.09288, step = 101 (0.621 sec)\n",
988 | "INFO:tensorflow:global_step/sec: 433.582\n",
989 | "INFO:tensorflow:loss = 5.605775, step = 201 (0.231 sec)\n",
990 | "INFO:tensorflow:global_step/sec: 447.561\n",
991 | "INFO:tensorflow:loss = 12.584702, step = 301 (0.224 sec)\n",
992 | "INFO:tensorflow:global_step/sec: 442.148\n",
993 | "INFO:tensorflow:loss = 2.089463, step = 401 (0.226 sec)\n",
994 | "INFO:tensorflow:global_step/sec: 434.492\n",
995 | "INFO:tensorflow:loss = 9.2258215, step = 501 (0.230 sec)\n",
996 | "INFO:tensorflow:global_step/sec: 447.994\n",
997 | "INFO:tensorflow:loss = 8.11821, step = 601 (0.223 sec)\n",
998 | "INFO:tensorflow:global_step/sec: 442.723\n",
999 | "INFO:tensorflow:loss = 0.653025, step = 701 (0.226 sec)\n",
1000 | "INFO:tensorflow:global_step/sec: 425.438\n",
1001 | "INFO:tensorflow:loss = 4.331424, step = 801 (0.235 sec)\n",
1002 | "INFO:tensorflow:global_step/sec: 444.471\n",
1003 | "INFO:tensorflow:loss = 1.55325, step = 901 (0.225 sec)\n",
1004 | "INFO:tensorflow:global_step/sec: 436.037\n",
1005 | "INFO:tensorflow:loss = 5.208349, step = 1001 (0.229 sec)\n",
1006 | "INFO:tensorflow:global_step/sec: 433.071\n",
1007 | "INFO:tensorflow:loss = 0.80289483, step = 1101 (0.231 sec)\n",
1008 | "INFO:tensorflow:global_step/sec: 436.717\n",
1009 | "INFO:tensorflow:loss = 3.1879468, step = 1201 (0.229 sec)\n",
1010 | "INFO:tensorflow:global_step/sec: 452.687\n",
1011 | "INFO:tensorflow:loss = 5.55963, step = 1301 (0.221 sec)\n",
1012 | "INFO:tensorflow:global_step/sec: 446.2\n",
1013 | "INFO:tensorflow:loss = 12.830038, step = 1401 (0.224 sec)\n",
1014 | "INFO:tensorflow:global_step/sec: 450.525\n",
1015 | "INFO:tensorflow:loss = 6.8311796, step = 1501 (0.222 sec)\n",
1016 | "INFO:tensorflow:global_step/sec: 452.967\n",
1017 | "INFO:tensorflow:loss = 1.635078, step = 1601 (0.221 sec)\n",
1018 | "INFO:tensorflow:global_step/sec: 453.743\n",
1019 | "INFO:tensorflow:loss = 1.9616288, step = 1701 (0.220 sec)\n",
1020 | "INFO:tensorflow:global_step/sec: 450.01\n",
1021 | "INFO:tensorflow:loss = 1.4227519, step = 1801 (0.222 sec)\n",
1022 | "INFO:tensorflow:Saving checkpoints for 1875 into /var/folders/wy/h39t6kb11pnbb0pzhksd_fqh0000gn/T/tmpawwl1lf0/model.ckpt.\n",
1023 | "INFO:tensorflow:Loss for final step: 1.0556093.\n"
1024 | ]
1025 | },
1026 | {
1027 | "data": {
1028 | "text/plain": [
1029 | ""
1030 | ]
1031 | },
1032 | "execution_count": 23,
1033 | "metadata": {},
1034 | "output_type": "execute_result"
1035 | }
1036 | ],
1037 | "source": [
1038 | "random.seed(42)\n",
1039 | "np.random.seed(42)\n",
1040 | "tf.set_random_seed(42)\n",
1041 | "\n",
1042 | "my_config = tf.estimator.RunConfig(tf_random_seed=42)\n",
1043 | "\n",
1044 | "feature_cols = [tf.feature_column.numeric_column(\"X\", shape=[28 * 28])]\n",
1045 | "dnn_clf = tf.estimator.DNNClassifier(hidden_units=[300, 100], n_classes=10,\n",
1046 | " feature_columns=feature_cols,\n",
1047 | " config=my_config)\n",
1048 | "dnn_clf.train(input_fn=input_fn)"
1049 | ]
1050 | },
1051 | {
1052 | "cell_type": "markdown",
1053 | "metadata": {},
1054 | "source": [
1055 | "The final loss should be exactly 1.0556093."
1056 | ]
1057 | },
1074 | {
1075 | "cell_type": "markdown",
1076 | "metadata": {},
1077 | "source": [
1078 | "#### Keras API"
1079 | ]
1080 | },
1081 | {
1082 | "cell_type": "markdown",
1083 | "metadata": {},
1084 | "source": [
1085 | "If you use the Keras API, all you need to do is set the random seed any time you clear the session:"
1086 | ]
1087 | },
1088 | {
1089 | "cell_type": "code",
1090 | "execution_count": 24,
1091 | "metadata": {},
1092 | "outputs": [
1093 | {
1094 | "name": "stdout",
1095 | "output_type": "stream",
1096 | "text": [
1097 | "Epoch 1/10\n",
1098 | "60000/60000 [==============================] - 5s 78us/step - loss: 0.5929 - acc: 0.8450\n",
1099 | "Epoch 2/10\n",
1100 | "60000/60000 [==============================] - 4s 75us/step - loss: 0.2804 - acc: 0.9199\n",
1101 | "Epoch 3/10\n",
1102 | "60000/60000 [==============================] - 4s 74us/step - loss: 0.2276 - acc: 0.9350\n",
1103 | "Epoch 4/10\n",
1104 | "60000/60000 [==============================] - 4s 74us/step - loss: 0.1933 - acc: 0.9449\n",
1105 | "Epoch 5/10\n",
1106 | "60000/60000 [==============================] - 4s 74us/step - loss: 0.1682 - acc: 0.9518\n",
1107 | "Epoch 6/10\n",
1108 | "60000/60000 [==============================] - 4s 74us/step - loss: 0.1490 - acc: 0.9573\n",
1109 | "Epoch 7/10\n",
1110 | "60000/60000 [==============================] - 4s 74us/step - loss: 0.1332 - acc: 0.9622\n",
1111 | "Epoch 8/10\n",
1112 | "60000/60000 [==============================] - 5s 75us/step - loss: 0.1202 - acc: 0.9658\n",
1113 | "Epoch 9/10\n",
1114 | "60000/60000 [==============================] - 4s 75us/step - loss: 0.1090 - acc: 0.9693\n",
1115 | "Epoch 10/10\n",
1116 | "60000/60000 [==============================] - 4s 75us/step - loss: 0.1000 - acc: 0.9716\n"
1117 | ]
1118 | },
1119 | {
1120 | "data": {
1121 | "text/plain": [
1122 | ""
1123 | ]
1124 | },
1125 | "execution_count": 24,
1126 | "metadata": {},
1127 | "output_type": "execute_result"
1128 | }
1129 | ],
1130 | "source": [
1131 | "keras.backend.clear_session()\n",
1132 | "\n",
1133 | "random.seed(42)\n",
1134 | "np.random.seed(42)\n",
1135 | "tf.set_random_seed(42)\n",
1136 | "\n",
1137 | "model = keras.models.Sequential([\n",
1138 | " keras.layers.Dense(300, activation=\"relu\"),\n",
1139 | " keras.layers.Dense(100, activation=\"relu\"),\n",
1140 | " keras.layers.Dense(10, activation=\"softmax\"),\n",
1141 | "])\n",
1142 | "model.compile(loss=\"sparse_categorical_crossentropy\", optimizer=\"sgd\",\n",
1143 | " metrics=[\"accuracy\"])\n",
1144 | "model.fit(X_train, y_train, epochs=10)"
1145 | ]
1146 | },
1147 | {
1148 | "cell_type": "markdown",
1149 | "metadata": {},
1150 | "source": [
1151 | "You should get exactly 97.16% accuracy on the training set at the end of training."
1152 | ]
1153 | },
1154 | {
1155 | "cell_type": "markdown",
1156 | "metadata": {},
1157 | "source": [
1158 | "## Eliminate other sources of variability"
1159 | ]
1160 | },
1161 | {
1162 | "cell_type": "markdown",
1163 | "metadata": {},
1164 | "source": [
1165 | "For example, `os.listdir()` returns file names in an order that depends on how the files were indexed by the file system:"
1166 | ]
1167 | },
1168 | {
1169 | "cell_type": "code",
1170 | "execution_count": 25,
1171 | "metadata": {},
1172 | "outputs": [
1173 | {
1174 | "data": {
1175 | "text/plain": [
1176 | "['my_test_foo_1',\n",
1177 | " 'my_test_foo_6',\n",
1178 | " 'my_test_foo_8',\n",
1179 | " 'my_test_foo_9',\n",
1180 | " 'my_test_foo_7',\n",
1181 | " 'my_test_foo_0',\n",
1182 | " 'my_test_foo_5',\n",
1183 | " 'my_test_foo_2',\n",
1184 | " 'my_test_foo_3',\n",
1185 | " 'my_test_foo_4']"
1186 | ]
1187 | },
1188 | "execution_count": 25,
1189 | "metadata": {},
1190 | "output_type": "execute_result"
1191 | }
1192 | ],
1193 | "source": [
1194 | "for i in range(10):\n",
1195 | " with open(\"my_test_foo_{}\".format(i), \"w\"):\n",
1196 | " pass\n",
1197 | "\n",
1198 | "[f for f in os.listdir() if f.startswith(\"my_test_foo_\")]"
1199 | ]
1200 | },
1201 | {
1202 | "cell_type": "code",
1203 | "execution_count": 26,
1204 | "metadata": {},
1205 | "outputs": [
1206 | {
1207 | "data": {
1208 | "text/plain": [
1209 | "['my_test_bar_4',\n",
1210 | " 'my_test_bar_3',\n",
1211 | " 'my_test_bar_2',\n",
1212 | " 'my_test_bar_5',\n",
1213 | " 'my_test_bar_0',\n",
1214 | " 'my_test_bar_7',\n",
1215 | " 'my_test_bar_9',\n",
1216 | " 'my_test_bar_8',\n",
1217 | " 'my_test_bar_6',\n",
1218 | " 'my_test_bar_1']"
1219 | ]
1220 | },
1221 | "execution_count": 26,
1222 | "metadata": {},
1223 | "output_type": "execute_result"
1224 | }
1225 | ],
1226 | "source": [
1227 | "for i in range(10):\n",
1228 | " with open(\"my_test_bar_{}\".format(i), \"w\"):\n",
1229 | " pass\n",
1230 | "\n",
1231 | "[f for f in os.listdir() if f.startswith(\"my_test_bar_\")]"
1232 | ]
1233 | },
1234 | {
1235 | "cell_type": "markdown",
1236 | "metadata": {},
1237 | "source": [
1238 | "You should sort the file names before you use them:"
1239 | ]
1240 | },
1241 | {
1242 | "cell_type": "code",
1243 | "execution_count": 27,
1244 | "metadata": {},
1245 | "outputs": [],
1246 | "source": [
1247 | "filenames = os.listdir()\n",
1248 | "filenames.sort()"
1249 | ]
1250 | },
1251 | {
1252 | "cell_type": "code",
1253 | "execution_count": 28,
1254 | "metadata": {},
1255 | "outputs": [
1256 | {
1257 | "data": {
1258 | "text/plain": [
1259 | "['my_test_foo_0',\n",
1260 | " 'my_test_foo_1',\n",
1261 | " 'my_test_foo_2',\n",
1262 | " 'my_test_foo_3',\n",
1263 | " 'my_test_foo_4',\n",
1264 | " 'my_test_foo_5',\n",
1265 | " 'my_test_foo_6',\n",
1266 | " 'my_test_foo_7',\n",
1267 | " 'my_test_foo_8',\n",
1268 | " 'my_test_foo_9']"
1269 | ]
1270 | },
1271 | "execution_count": 28,
1272 | "metadata": {},
1273 | "output_type": "execute_result"
1274 | }
1275 | ],
1276 | "source": [
1277 | "[f for f in filenames if f.startswith(\"my_test_foo_\")]"
1278 | ]
1279 | },
1280 | {
1281 | "cell_type": "code",
1282 | "execution_count": 29,
1283 | "metadata": {},
1284 | "outputs": [],
1285 | "source": [
1286 | "for f in os.listdir():\n",
1287 | " if f.startswith(\"my_test_foo_\") or f.startswith(\"my_test_bar_\"):\n",
1288 | " os.remove(f)"
1289 | ]
1290 | },
1291 | {
1292 | "cell_type": "markdown",
1293 | "metadata": {},
1294 | "source": [
1295 | "I hope you enjoyed this notebook. If you do not get reproducible results, or if they are different than mine, then please [file an issue](https://github.com/ageron/handson-ml/issues) on github, specifying what version of Python, TensorFlow, and NumPy you are using, as well as your O.S. version. Thank you!"
1296 | ]
1297 | },
1298 | {
1299 | "cell_type": "markdown",
1300 | "metadata": {},
1301 | "source": [
1302 | "If you want to learn more about Deep Learning and TensorFlow, check out my book [Hands-On Machine Learning with Scitkit-Learn and TensorFlow](http://homl.info/amazon), O'Reilly. You can also follow me on twitter [@aureliengeron](https://twitter.com/aureliengeron) or watch my videos on YouTube at [youtube.com/c/AurelienGeron](https://www.youtube.com/c/AurelienGeron)."
1303 | ]
1304 | },
1305 | {
1306 | "cell_type": "code",
1307 | "execution_count": null,
1308 | "metadata": {},
1309 | "outputs": [],
1310 | "source": []
1311 | }
1312 | ],
1313 | "metadata": {
1314 | "kernelspec": {
1315 | "display_name": "Python 3",
1316 | "language": "python",
1317 | "name": "python3"
1318 | },
1319 | "language_info": {
1320 | "codemirror_mode": {
1321 | "name": "ipython",
1322 | "version": 3
1323 | },
1324 | "file_extension": ".py",
1325 | "mimetype": "text/x-python",
1326 | "name": "python",
1327 | "nbconvert_exporter": "python",
1328 | "pygments_lexer": "ipython3",
1329 | "version": "3.7.10"
1330 | }
1331 | },
1332 | "nbformat": 4,
1333 | "nbformat_minor": 2
1334 | }
1335 |
--------------------------------------------------------------------------------
/images/ann/README:
--------------------------------------------------------------------------------
1 | Images generated by the notebooks
2 |
--------------------------------------------------------------------------------
/images/autoencoders/README:
--------------------------------------------------------------------------------
1 | Images generated by the notebooks
2 |
--------------------------------------------------------------------------------
/images/classification/README:
--------------------------------------------------------------------------------
1 | Images generated by the notebooks
2 |
--------------------------------------------------------------------------------
/images/cnn/README:
--------------------------------------------------------------------------------
1 | Images generated by the notebooks
2 |
--------------------------------------------------------------------------------
/images/cnn/test_image.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ageron/handson-ml/ac1310a3cc1567ecfb4b798715c804627076775f/images/cnn/test_image.png
--------------------------------------------------------------------------------
/images/decision_trees/README:
--------------------------------------------------------------------------------
1 | Images generated by the notebooks
2 |
--------------------------------------------------------------------------------
/images/deep/README:
--------------------------------------------------------------------------------
1 | Images generated by the notebooks
2 |
--------------------------------------------------------------------------------
/images/distributed/README:
--------------------------------------------------------------------------------
1 | Images generated by the notebooks
2 |
--------------------------------------------------------------------------------
/images/end_to_end_project/README:
--------------------------------------------------------------------------------
1 | Images generated by the notebooks
2 |
--------------------------------------------------------------------------------
/images/end_to_end_project/california.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ageron/handson-ml/ac1310a3cc1567ecfb4b798715c804627076775f/images/end_to_end_project/california.png
--------------------------------------------------------------------------------
/images/ensembles/README:
--------------------------------------------------------------------------------
1 | Images generated by the notebooks
2 |
--------------------------------------------------------------------------------
/images/fundamentals/README:
--------------------------------------------------------------------------------
1 | Images generated by the notebooks
2 |
--------------------------------------------------------------------------------
/images/rl/README:
--------------------------------------------------------------------------------
1 | Images generated by the notebooks
2 |
--------------------------------------------------------------------------------
/images/rnn/README:
--------------------------------------------------------------------------------
1 | Images generated by the notebooks
2 |
--------------------------------------------------------------------------------
/images/svm/README:
--------------------------------------------------------------------------------
1 | Images generated by the notebooks
2 |
--------------------------------------------------------------------------------
/images/tensorflow/README:
--------------------------------------------------------------------------------
1 | Images generated by the notebooks
2 |
--------------------------------------------------------------------------------
/images/training_linear_models/README:
--------------------------------------------------------------------------------
1 | Images generated by the notebooks
2 |
--------------------------------------------------------------------------------
/images/unsupervised_learning/README:
--------------------------------------------------------------------------------
1 | Images generated by the notebooks
2 |
--------------------------------------------------------------------------------
/images/unsupervised_learning/ladybug.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ageron/handson-ml/ac1310a3cc1567ecfb4b798715c804627076775f/images/unsupervised_learning/ladybug.png
--------------------------------------------------------------------------------
/index.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "markdown",
5 | "metadata": {},
6 | "source": [
7 | "# Machine Learning Notebooks\n",
8 | "\n",
9 | "*Welcome to the Machine Learning Notebooks!*\n",
10 | "\n",
11 | "[Prerequisites](#Prerequisites) (see below)\n",
12 | "\n",
13 | ""
18 | ]
19 | },
20 | {
21 | "cell_type": "markdown",
22 | "metadata": {},
23 | "source": [
24 | "**Warning**: this project contains the code for the 1st edition of the book. Please visit https://github.com/ageron/handson-ml2 for the 2nd edition code, with up-to-date notebooks using the latest library versions. In particular, the 1st edition is based on TensorFlow 1, while the 2nd edition uses TensorFlow 2, which is much simpler to use."
25 | ]
26 | },
27 | {
28 | "cell_type": "markdown",
29 | "metadata": {},
30 | "source": [
31 | "## Notebooks\n",
32 | "1. [The Machine Learning landscape](01_the_machine_learning_landscape.ipynb)\n",
33 | "2. [End-to-end Machine Learning project](02_end_to_end_machine_learning_project.ipynb)\n",
34 | "3. [Classification](03_classification.ipynb)\n",
35 | "4. [Training Linear Models](04_training_linear_models.ipynb)\n",
36 | "5. [Support Vector Machines](05_support_vector_machines.ipynb)\n",
37 | "6. [Decision Trees](06_decision_trees.ipynb)\n",
38 | "7. [Ensemble Learning and Random Forests](07_ensemble_learning_and_random_forests.ipynb)\n",
39 | "8. [Dimensionality Reduction & Unsupervised Learning](08_dimensionality_reduction.ipynb)\n",
40 | "9. [Up and running with TensorFlow](09_up_and_running_with_tensorflow.ipynb)\n",
41 | "10. [Introduction to Artificial Neural Networks](10_introduction_to_artificial_neural_networks.ipynb)\n",
42 | "11. [Deep Learning](11_deep_learning.ipynb)\n",
43 | "12. [Distributed TensorFlow](12_distributed_tensorflow.ipynb)\n",
44 | "13. [Convolutional Neural Networks](13_convolutional_neural_networks.ipynb)\n",
45 | "14. [Recurrent Neural Networks](14_recurrent_neural_networks.ipynb)\n",
46 | "15. [Autoencoders](15_autoencoders.ipynb)\n",
47 | "16. [Reinforcement Learning](16_reinforcement_learning.ipynb)"
48 | ]
49 | },
50 | {
51 | "cell_type": "markdown",
52 | "metadata": {},
53 | "source": [
54 | "## Scientific Python tutorials\n",
55 | "* [NumPy](tools_numpy.ipynb)\n",
56 | "* [Matplotlib](tools_matplotlib.ipynb)\n",
57 | "* [Pandas](tools_pandas.ipynb)"
58 | ]
59 | },
60 | {
61 | "cell_type": "markdown",
62 | "metadata": {},
63 | "source": [
64 | "## Math Tutorials\n",
65 | "* [Linear Algebra](math_linear_algebra.ipynb)\n",
66 | "* [Differential Calculus](math_differential_calculus.ipynb)"
67 | ]
68 | },
69 | {
70 | "cell_type": "markdown",
71 | "metadata": {},
72 | "source": [
73 | "## Extra Material\n",
74 | "* [Capsule Networks](extra_capsnets.ipynb)\n",
75 | "* [TensorFlow Reproducibility](extra_tensorflow_reproducibility.ipynb)"
76 | ]
77 | },
78 | {
79 | "cell_type": "markdown",
80 | "metadata": {},
81 | "source": [
82 | "## Misc.\n",
83 | "* [Equations](book_equations.ipynb) (list of equations in the book)"
84 | ]
85 | },
86 | {
87 | "cell_type": "markdown",
88 | "metadata": {
89 | "collapsed": true
90 | },
91 | "source": [
92 | "## Prerequisites\n",
93 | "### To understand\n",
94 | "* **Python** – you don't need to be an expert python programmer, but you do need to know the basics. If you don't, the official [Python tutorial](https://docs.python.org/3/tutorial/) is a good place to start.\n",
95 | "* **Scientific Python** – We will be using a few popular python libraries, in particular NumPy, matplotlib and pandas. If you are not familiar with these libraries, you should probably start by going through the tutorials in the Tools section (especially NumPy).\n",
96 | "* **Math** – We will also use some notions of Linear Algebra, Calculus, Statistics and Probability theory. You should be able to follow along if you learned these in the past as it won't be very advanced, but if you don't know about these topics or you need a refresher then go through the appropriate introduction in the Math section.\n",
97 | "\n",
98 | "### To run the examples\n",
99 | "* **Jupyter** – These notebooks are based on Jupyter. If you just plan to read without running any code, there's really nothing more to know, just keep reading! But if you want to experiment with the code examples you need to:\n",
100 | " * follow the [installation instructions](https://github.com/ageron/handson-ml/#installation),\n",
101 | " * learn how to use Jupyter. Start the User Interface Tour from the Help menu."
102 | ]
103 | },
104 | {
105 | "cell_type": "code",
106 | "execution_count": null,
107 | "metadata": {
108 | "collapsed": true
109 | },
110 | "outputs": [],
111 | "source": []
112 | }
113 | ],
114 | "metadata": {
115 | "kernelspec": {
116 | "display_name": "Python 3",
117 | "language": "python",
118 | "name": "python3"
119 | },
120 | "language_info": {
121 | "codemirror_mode": {
122 | "name": "ipython",
123 | "version": 3
124 | },
125 | "file_extension": ".py",
126 | "mimetype": "text/x-python",
127 | "name": "python",
128 | "nbconvert_exporter": "python",
129 | "pygments_lexer": "ipython3",
130 | "version": "3.7.10"
131 | },
132 | "nav_menu": {},
133 | "toc": {
134 | "navigate_menu": true,
135 | "number_sections": true,
136 | "sideBar": true,
137 | "threshold": 6,
138 | "toc_cell": false,
139 | "toc_section_display": "block",
140 | "toc_window_display": false
141 | }
142 | },
143 | "nbformat": 4,
144 | "nbformat_minor": 1
145 | }
146 |
--------------------------------------------------------------------------------
/ml-project-checklist.md:
--------------------------------------------------------------------------------
1 | This checklist can guide you through your Machine Learning projects. There are eight main steps:
2 |
3 | 1. Frame the problem and look at the big picture.
4 | 2. Get the data.
5 | 3. Explore the data to gain insights.
6 | 4. Prepare the data to better expose the underlying data patterns to Machine Learning algorithms.
7 | 5. Explore many different models and short-list the best ones.
8 | 6. Fine-tune your models and combine them into a great solution.
9 | 7. Present your solution.
10 | 8. Launch, monitor, and maintain your system.
11 |
12 | Obviously, you should feel free to adapt this checklist to your needs.
13 |
14 | # Frame the problem and look at the big picture
15 | 1. Define the objective in business terms.
16 | 2. How will your solution be used?
17 | 3. What are the current solutions/workarounds (if any)?
18 | 4. How should you frame this problem (supervised/unsupervised, online/offline, etc.)?
19 | 5. How should performance be measured?
20 | 6. Is the performance measure aligned with the business objective?
21 | 7. What would be the minimum performance needed to reach the business objective?
22 | 8. What are comparable problems? Can you reuse experience or tools?
23 | 9. Is human expertise available?
24 | 10. How would you solve the problem manually?
25 | 11. List the assumptions you or others have made so far.
26 | 12. Verify assumptions if possible.
27 |
28 | # Get the data
29 | Note: automate as much as possible so you can easily get fresh data.
30 |
31 | 1. List the data you need and how much you need.
32 | 2. Find and document where you can get that data.
33 | 3. Check how much space it will take.
34 | 4. Check legal obligations, and get the authorization if necessary.
35 | 5. Get access authorizations.
36 | 6. Create a workspace (with enough storage space).
37 | 7. Get the data.
38 | 8. Convert the data to a format you can easily manipulate (without changing the data itself).
39 | 9. Ensure sensitive information is deleted or protected (e.g., anonymized).
40 | 10. Check the size and type of data (time series, sample, geographical, etc.).
41 | 11. Sample a test set, put it aside, and never look at it (no data snooping!). See the sketch below this list.
42 |
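43 | A minimal sketch of step 11, assuming the data is loaded in a pandas DataFrame named `data` (the name is illustrative) and using Scikit-Learn:
44 | 
45 | ```python
46 | from sklearn.model_selection import train_test_split
47 | 
48 | # "data" is a placeholder for your own DataFrame.
49 | # Hold out 20% of the rows as a test set; fixing random_state makes the split reproducible.
50 | train_set, test_set = train_test_split(data, test_size=0.2, random_state=42)
51 | ```
52 | 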
43 | # Explore the data
44 | Note: try to get insights from a field expert for these steps.
45 |
46 | 1. Create a copy of the data for exploration (sampling it down to a manageable size if necessary).
47 | 2. Create a Jupyter notebook to keep a record of your data exploration.
48 | 3. Study each attribute and its characteristics:
49 | - Name
50 | - Type (categorical, int/float, bounded/unbounded, text, structured, etc.)
51 | - % of missing values
52 | - Noisiness and type of noise (stochastic, outliers, rounding errors, etc.)
53 | - Possibly useful for the task?
54 | - Type of distribution (Gaussian, uniform, logarithmic, etc.)
55 | 4. For supervised learning tasks, identify the target attribute(s).
56 | 5. Visualize the data.
57 | 6. Study the correlations between attributes (see the sketch after this list).
58 | 7. Study how you would solve the problem manually.
59 | 8. Identify the promising transformations you may want to apply.
60 | 9. Identify extra data that would be useful (go back to "Get the data" above).
61 | 10. Document what you have learned.
62 |
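63 | A minimal sketch of step 6, again assuming a pandas DataFrame named `data`, here with a numerical target column called `target` (both names are illustrative):
64 | 
65 | ```python
66 | from pandas.plotting import scatter_matrix
67 | 
68 | # "data", "target", "feature_a" and "feature_b" are placeholder names.
69 | # Linear (Pearson) correlations between every numerical attribute and the target.
70 | corr_matrix = data.corr()
71 | print(corr_matrix["target"].sort_values(ascending=False))
72 | 
73 | # Scatter plots for a few promising attributes, to spot non-linear relationships.
74 | scatter_matrix(data[["target", "feature_a", "feature_b"]], figsize=(12, 8))
75 | ```
76 | 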
63 | # Prepare the data
64 | Notes:
65 | - Work on copies of the data (keep the original dataset intact).
66 | - Write functions for all data transformations you apply, for five reasons:
67 | - So you can easily prepare the data the next time you get a fresh dataset
68 | - So you can apply these transformations in future projects
69 | - To clean and prepare the test set
70 | - To clean and prepare new data instances
71 | - To make it easy to treat your preparation choices as hyperparameters
72 |
73 | 1. Data cleaning:
74 | - Fix or remove outliers (optional).
75 | - Fill in missing values (e.g., with zero, mean, median...) or drop their rows (or columns).
76 | 2. Feature selection (optional):
77 | - Drop the attributes that provide no useful information for the task.
78 | 3. Feature engineering, where appropriate:
79 | - Discretize continuous features.
80 | - Decompose features (e.g., categorical, date/time, etc.).
81 | - Add promising transformations of features (e.g., log(x), sqrt(x), x^2, etc.).
82 | - Aggregate features into promising new features.
83 | 4. Feature scaling: standardize or normalize features (a sketch follows this list).
84 |
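85 | A minimal sketch of steps 1 and 4 combined, assuming a purely numerical feature DataFrame named `X_train` (the name is illustrative):
86 | 
87 | ```python
88 | from sklearn.impute import SimpleImputer
89 | from sklearn.pipeline import Pipeline
90 | from sklearn.preprocessing import StandardScaler
91 | 
92 | # "X_train" is a placeholder for your own numerical feature DataFrame.
93 | # Fill missing values with the median, then standardize each feature.
94 | num_pipeline = Pipeline([
95 |     ("imputer", SimpleImputer(strategy="median")),
96 |     ("scaler", StandardScaler()),
97 | ])
98 | X_train_prepared = num_pipeline.fit_transform(X_train)
99 | ```
100 | 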
85 | # Short-list promising models
86 | Notes:
87 | - If the data is huge, you may want to sample smaller training sets so you can train many different models in a reasonable time (be aware that this penalizes complex models such as large neural nets or Random Forests).
88 | - Once again, try to automate these steps as much as possible.
89 |
90 | 1. Train many quick and dirty models from different categories (e.g., linear, Naive Bayes, SVM, Random Forest, neural net, etc.) using standard parameters.
91 | 2. Measure and compare their performance.
92 | - For each model, use N-fold cross-validation and compute the mean and standard deviation of the performance measure on the N folds (see the sketch at the end of this section).
93 | 3. Analyze the most significant variables for each algorithm.
94 | 4. Analyze the types of errors the models make.
95 | - What data would a human have used to avoid these errors?
96 | 5. Have a quick round of feature selection and engineering.
97 | 6. Have one or two more quick iterations of the five previous steps.
98 | 7. Short-list the top three to five most promising models, preferring models that make different types of errors.
99 |
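100 | A minimal sketch of step 2, assuming prepared training features `X_train_prepared` and labels `y_train` (names are illustrative) and a classification task:
101 | 
102 | ```python
103 | from sklearn.ensemble import RandomForestClassifier
104 | from sklearn.linear_model import LogisticRegression
105 | from sklearn.model_selection import cross_val_score
106 | 
107 | # "X_train_prepared" and "y_train" are placeholder names.
108 | # Compare two quick baseline models using 5-fold cross-validation.
109 | for model in (LogisticRegression(max_iter=1000), RandomForestClassifier(random_state=42)):
110 |     scores = cross_val_score(model, X_train_prepared, y_train, cv=5, scoring="accuracy")
111 |     print(type(model).__name__, scores.mean(), scores.std())
112 | ```
113 | 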
100 | # Fine-Tune the System
101 | Notes:
102 | - You will want to use as much data as possible for this step, especially as you move toward the end of fine-tuning.
103 | - As always, automate what you can.
104 |
105 | 1. Fine-tune the hyperparameters using cross-validation.
106 | - Treat your data transformation choices as hyperparameters, especially when you are not sure about them (e.g., should I replace missing values with zero or the median value? Or just drop the rows?).
107 | - Unless there are very few hyperparameter values to explore, prefer random search over grid search (a sketch follows this section). If training is very long, you may prefer a Bayesian optimization approach (e.g., using Gaussian process priors, as described by Jasper Snoek, Hugo Larochelle, and Ryan Adams ([https://goo.gl/PEFfGr](https://goo.gl/PEFfGr))).
108 | 2. Try Ensemble methods. Combining your best models will often perform better than running them individually.
109 | 3. Once you are confident about your final model, measure its performance on the test set to estimate the generalization error.
110 |
111 | > Don't tweak your model after measuring the generalization error: you would just start overfitting the test set.
112 |
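113 | A minimal random-search sketch for step 1, reusing the illustrative `X_train_prepared` and `y_train` names from the previous section:
114 | 
115 | ```python
116 | from scipy.stats import randint
117 | from sklearn.ensemble import RandomForestClassifier
118 | from sklearn.model_selection import RandomizedSearchCV
119 | 
120 | # "X_train_prepared" and "y_train" are placeholder names.
121 | # Try 20 random hyperparameter combinations, each evaluated with 5-fold cross-validation.
122 | param_distribs = {"n_estimators": randint(50, 500), "max_depth": randint(3, 30)}
123 | search = RandomizedSearchCV(RandomForestClassifier(random_state=42), param_distribs,
124 |                             n_iter=20, cv=5, scoring="accuracy", random_state=42)
125 | search.fit(X_train_prepared, y_train)
126 | print(search.best_params_, search.best_score_)
127 | ```
128 | 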
113 | # Present your solution
114 | 1. Document what you have done.
115 | 2. Create a nice presentation.
116 | - Make sure you highlight the big picture first.
117 | 3. Explain why your solution achieves the business objective.
118 | 4. Don't forget to present interesting points you noticed along the way.
119 | - Describe what worked and what did not.
120 | - List your assumptions and your system's limitations.
121 | 5. Ensure your key findings are communicated through beautiful visualizations or easy-to-remember statements (e.g., "the median income is the number-one predictor of housing prices").
122 |
123 | # Launch!
124 | 1. Get your solution ready for production (plug into production data inputs, write unit tests, etc.).
125 | 2. Write monitoring code to check your system's live performance at regular intervals and trigger alerts when it drops.
126 | - Beware of slow degradation too: models tend to "rot" as data evolves.
127 | - Measuring performance may require a human pipeline (e.g., via a crowdsourcing service).
128 | - Also monitor your inputs' quality (e.g., a malfunctioning sensor sending random values, or another team's output becoming stale). This is particularly important for online learning systems.
129 | 3. Retrain your models on a regular basis on fresh data (automate as much as possible).
130 |
--------------------------------------------------------------------------------
/requirements.txt:
--------------------------------------------------------------------------------
1 | # TensorFlow is much easier to install using Anaconda, especially
2 | # on Windows or when using a GPU. Please see the installation
3 | # instructions in INSTALL.md
4 |
5 | ##### Core scientific packages
6 | jupyter==1.0.0
7 | matplotlib==3.3.4
8 | numpy==1.22.0
9 | pandas==1.2.2
10 | scipy==1.6.0
11 |
12 | ##### Machine Learning packages
13 | scikit-learn==0.24.1
14 |
15 | # Optional: the XGBoost library is only used in chapter 7
16 | xgboost==1.3.3
17 |
18 | ##### TensorFlow-related packages
19 |
20 | # If you want to use a GPU, it must have CUDA Compute Capability 3.5 or
21 | # higher support, and you must install CUDA, cuDNN and more: see
22 | # tensorflow.org for the detailed installation instructions.
23 |
24 | tensorflow==1.15.5 # or tensorflow-gpu==1.15.5 for GPU support
25 |
26 | tensorboard==1.15.0
27 |
28 | ##### Reinforcement Learning library (chapter 16)
29 |
30 | # There are a few dependencies you need to install first, check out:
31 | # https://github.com/openai/gym#installing-everything
32 | gym[atari,Box2D]==0.18.0
33 | # On Windows, install atari_py using:
34 | # pip install --no-index -f https://github.com/Kojoley/atari-py/releases atari_py
35 |
36 | ##### Image manipulation
37 | Pillow==9.0.1
38 | graphviz==0.16
39 | pyglet==1.5.0
40 | scikit-image==0.18.1
41 |
42 | #pyvirtualdisplay # needed in chapter 16, if on a headless server
43 | # (i.e., without screen, e.g., Colab or VM)
44 |
45 |
46 | ##### Additional utilities
47 |
48 | # Efficient jobs (caching, parallelism, persistence)
49 | joblib==0.14.1
50 |
51 | # Nice utility to diff Jupyter Notebooks.
52 | nbdime==2.1.1
53 |
54 | # May be useful with Pandas for complex "where" clauses (e.g., Pandas
55 | # tutorial).
56 | numexpr==2.7.2
57 |
58 | # Optional: these libraries can be useful in the classification chapter,
59 | # exercise 4.
60 | nltk==3.6.6
61 | urlextract==1.2.0
62 |
63 |
--------------------------------------------------------------------------------