├── .idea
│   ├── .gitignore
│   ├── Getting started with TF 2.iml
│   ├── inspectionProfiles
│   │   └── profiles_settings.xml
│   ├── misc.xml
│   ├── modules.xml
│   └── vcs.xml
├── .vscode
│   └── settings.json
├── Capstone Project
│   ├── .ipynb_checkpoints
│   │   └── Capstone Project-checkpoint.ipynb
│   ├── Capstone Project.ipynb
│   ├── Capstone Project.pdf
│   ├── SeqMode
│   │   ├── checkpoint
│   │   ├── mySeqModel.data-00000-of-00002
│   │   ├── mySeqModel.data-00001-of-00002
│   │   └── mySeqModel.index
│   ├── checkpoint
│   ├── mySeqModel.data-00000-of-00002
│   ├── mySeqModel.data-00001-of-00002
│   └── mySeqModel.index
├── Introduction to Google Colab
│   └── readme.md
├── Saving and Loading Model Weights
│   ├── Explanation of saved files.ipynb
│   ├── ProgrammingTutorial.ipynb
│   ├── Saving model architecture only.ipynb
│   ├── data
│   │   ├── bird1.jpg
│   │   ├── bird2.jpg
│   │   ├── cat1.jpg
│   │   └── imagenet_categories.txt
│   ├── model_checkpoints
│   │   ├── saved_model.pb
│   │   └── variables
│   │       ├── variables.data-00000-of-00002
│   │       ├── variables.data-00001-of-00002
│   │       └── variables.index
│   └── readme.md
├── TF.Keras Sequential API Basics
│   ├── .ipynb_checkpoints
│   │   ├── Mnisit_Fashion-checkpoint.ipynb
│   │   ├── TF Keras Week 1 Tutorial-checkpoint.ipynb
│   │   └── Weight Initializers-checkpoint.ipynb
│   ├── MNISIT.ipynb
│   ├── Metrics.ipynb
│   ├── Mnisit_Fashion.ipynb
│   ├── TF Keras Week 1 Tutorial.ipynb
│   ├── Weight Initializers.ipynb
│   └── readme.md
├── Validation, Regularization and Callbacks
│   ├── 0.PNG
│   ├── 1.PNG
│   ├── Batch normalisation.ipynb
│   ├── Custom Callback.ipynb
│   ├── Validation_Regularization_CallBacks.ipynb
│   ├── Week 3 Programming Assignment.ipynb
│   └── readme.md
└── readme.md

/.idea/.gitignore: -------------------------------------------------------------------------------- 1 | # Default ignored files 2 | /shelf/ 3 | /workspace.xml 4 | # Datasource local storage ignored files 5 | /dataSources/ 6 | /dataSources.local.xml 7 | # Editor-based HTTP Client requests 8 | /httpRequests/ -------------------------------------------------------------------------------- /.idea/Getting started with TF 2.iml: -------------------------------------------------------------------------------- (XML content not preserved in this capture) -------------------------------------------------------------------------------- /.idea/inspectionProfiles/profiles_settings.xml: -------------------------------------------------------------------------------- (XML content not preserved in this capture) -------------------------------------------------------------------------------- /.idea/misc.xml: -------------------------------------------------------------------------------- (XML content not preserved in this capture) -------------------------------------------------------------------------------- /.idea/modules.xml: -------------------------------------------------------------------------------- (XML content not preserved in this capture) -------------------------------------------------------------------------------- /.idea/vcs.xml: -------------------------------------------------------------------------------- (XML content not preserved in this capture) -------------------------------------------------------------------------------- /.vscode/settings.json: -------------------------------------------------------------------------------- 1 | { 2 | "python.pythonPath": "C:\\Anaconda\\envs\\myenv\\python.exe" 3 | } -------------------------------------------------------------------------------- /Capstone Project/Capstone Project.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "# Capstone Project\n", 8 | "## Image 
classifier for the SVHN dataset\n", 9 | "### Instructions\n", 10 | "\n", 11 | "In this notebook, you will create a neural network that classifies real-world images of digits. You will use concepts from throughout this course in building, training, testing, validating and saving your TensorFlow classifier model.\n", 12 | "\n", 13 | "This project is peer-assessed. Within this notebook you will find instructions in each section for how to complete the project. Pay close attention to the instructions as the peer review will be carried out according to a grading rubric that checks key parts of the project instructions. Feel free to add extra cells into the notebook as required.\n", 14 | "\n", 15 | "### How to submit\n", 16 | "\n", 17 | "When you have completed the Capstone project notebook, you will submit a pdf of the notebook for peer review. First ensure that the notebook has been fully executed from beginning to end, and all of the cell outputs are visible. This is important, as the grading rubric depends on the reviewer being able to view the outputs of your notebook. Save the notebook as a pdf (File -> Download as -> PDF via LaTeX). You should then submit this pdf for review.\n", 18 | "\n", 19 | "### Let's get started!\n", 20 | "\n", 21 | "We'll start by running some imports, and loading the dataset. For this project you are free to make further imports throughout the notebook as you wish. " 22 | ] 23 | }, 24 | { 25 | "cell_type": "code", 26 | "execution_count": null, 27 | "metadata": {}, 28 | "outputs": [], 29 | "source": [ 30 | "import tensorflow as tf\n", 31 | "from scipy.io import loadmat" 32 | ] 33 | }, 34 | { 35 | "cell_type": "code", 36 | "execution_count": null, 37 | "metadata": {}, 38 | "outputs": [], 39 | "source": [ 40 | "from tensorflow.keras.models import Sequential\n", 41 | "from tensorflow.keras.layers import Conv2D, Flatten, BatchNormalization, MaxPool2D, Dense\n", 42 | "import pandas as pd \n", 43 | "import numpy as np\n", 44 | "import matplotlib.pyplot as plt\n", 45 | "from sklearn.model_selection import train_test_split\n", 46 | "%matplotlib inline" 47 | ] 48 | }, 49 | { 50 | "cell_type": "markdown", 51 | "metadata": {}, 52 | "source": [ 53 | "![SVHN overview image](data/svhn_examples.jpg)\n", 54 | "For the capstone project, you will use the [SVHN dataset](http://ufldl.stanford.edu/housenumbers/). This is an image dataset of over 600,000 digit images in all, and is a harder dataset than MNIST as the numbers appear in the context of natural scene images. SVHN is obtained from house numbers in Google Street View images. \n", 55 | "\n", 56 | "* Y. Netzer, T. Wang, A. Coates, A. Bissacco, B. Wu and A. Y. Ng. \"Reading Digits in Natural Images with Unsupervised Feature Learning\". NIPS Workshop on Deep Learning and Unsupervised Feature Learning, 2011.\n", 57 | "\n", 58 | "Your goal is to develop an end-to-end workflow for building, training, validating, evaluating and saving a neural network that classifies a real-world image into one of ten classes." 59 | ] 60 | }, 61 | { 62 | "cell_type": "code", 63 | "execution_count": null, 64 | "metadata": {}, 65 | "outputs": [], 66 | "source": [ 67 | "# Run this cell to load the dataset\n", 68 | "\n", 69 | "train = loadmat('data/train_32x32.mat')\n", 70 | "test = loadmat('data/test_32x32.mat')" 71 | ] 72 | }, 73 | { 74 | "cell_type": "markdown", 75 | "metadata": {}, 76 | "source": [ 77 | "Both `train` and `test` are dictionaries with keys `X` and `y` for the input images and labels respectively."
78 | ] 79 | }, 80 | { 81 | "cell_type": "markdown", 82 | "metadata": {}, 83 | "source": [ 84 | "## 1. Inspect and preprocess the dataset\n", 85 | "* Extract the training and testing images and labels separately from the train and test dictionaries loaded for you.\n", 86 | "* Select a random sample of images and corresponding labels from the dataset (at least 10), and display them in a figure.\n", 87 | "* Convert the training and test images to grayscale by taking the average across all colour channels for each pixel. _Hint: retain the channel dimension, which will now have size 1._\n", 88 | "* Select a random sample of the grayscale images and corresponding labels from the dataset (at least 10), and display them in a figure." 89 | ] 90 | }, 91 | { 92 | "cell_type": "code", 93 | "execution_count": null, 94 | "metadata": {}, 95 | "outputs": [], 96 | "source": [ 97 | "X_train = train['X']\n", 98 | "X_test = test['X']\n", 99 | "y_train = train['y']\n", 100 | "y_test = test['y']" 101 | ] 102 | }, 103 | { 104 | "cell_type": "code", 105 | "execution_count": null, 106 | "metadata": {}, 107 | "outputs": [], 108 | "source": [ 109 | "X_train.shape, X_test.shape" 110 | ] 111 | }, 112 | { 113 | "cell_type": "code", 114 | "execution_count": null, 115 | "metadata": {}, 116 | "outputs": [], 117 | "source": [ 118 | "X_train = np.moveaxis(X_train, -1, 0)\n", 119 | "X_test = np.moveaxis(X_test, -1 , 0)" 120 | ] 121 | }, 122 | { 123 | "cell_type": "code", 124 | "execution_count": null, 125 | "metadata": {}, 126 | "outputs": [], 127 | "source": [ 128 | "X_train.shape, X_test.shape" 129 | ] 130 | }, 131 | { 132 | "cell_type": "code", 133 | "execution_count": null, 134 | "metadata": {}, 135 | "outputs": [], 136 | "source": [ 137 | "for i in range(10):\n", 138 | " plt.imshow(X_train[i, :, :, :,])\n", 139 | " plt.show()\n", 140 | " print(y_train[i])" 141 | ] 142 | }, 143 | { 144 | "cell_type": "code", 145 | "execution_count": null, 146 | "metadata": {}, 147 | "outputs": [], 148 | "source": [ 149 | "X_train_gs = np.mean(X_train, 3).reshape(73257, 32, 32, 1)/255\n", 150 | "X_test_gs = np.mean(X_test,3).reshape(26032, 32,32 ,1)/255\n", 151 | "X_train_for_plotting = np.mean(X_train,3)" 152 | ] 153 | }, 154 | { 155 | "cell_type": "code", 156 | "execution_count": null, 157 | "metadata": {}, 158 | "outputs": [], 159 | "source": [ 160 | "X_train_gs.shape" 161 | ] 162 | }, 163 | { 164 | "cell_type": "code", 165 | "execution_count": null, 166 | "metadata": {}, 167 | "outputs": [], 168 | "source": [ 169 | "for i in range(10):\n", 170 | " plt.imshow(X_train_for_plotting[i, :, :,])\n", 171 | " plt.show()\n", 172 | " print(y_train[i])" 173 | ] 174 | }, 175 | { 176 | "cell_type": "code", 177 | "execution_count": null, 178 | "metadata": {}, 179 | "outputs": [], 180 | "source": [ 181 | "X_train[0].shape" 182 | ] 183 | }, 184 | { 185 | "cell_type": "code", 186 | "execution_count": null, 187 | "metadata": {}, 188 | "outputs": [], 189 | "source": [ 190 | "from sklearn.preprocessing import OneHotEncoder\n", 191 | "\n", 192 | "enc = OneHotEncoder().fit(y_train)\n", 193 | "y_train_oh = enc.transform(y_train).toarray()\n", 194 | "y_test_oh = enc.transform(y_test).toarray()" 195 | ] 196 | }, 197 | { 198 | "cell_type": "code", 199 | "execution_count": null, 200 | "metadata": {}, 201 | "outputs": [], 202 | "source": [ 203 | "y_test_oh[0]" 204 | ] 205 | }, 206 | { 207 | "cell_type": "code", 208 | "execution_count": null, 209 | "metadata": {}, 210 | "outputs": [], 211 | "source": [ 212 | "plt.imshow(X_test[0])" 213 | ] 214 | }, 215 | { 216 | 
"cell_type": "markdown", 217 | "metadata": {}, 218 | "source": [ 219 | "## 2. MLP neural network classifier\n", 220 | "* Build an MLP classifier model using the Sequential API. Your model should use only Flatten and Dense layers, with the final layer having a 10-way softmax output. \n", 221 | "* You should design and build the model yourself. Feel free to experiment with different MLP architectures. _Hint: to achieve a reasonable accuracy you won't need to use more than 4 or 5 layers._\n", 222 | "* Print out the model summary (using the summary() method)\n", 223 | "* Compile and train the model (we recommend a maximum of 30 epochs), making use of both training and validation sets during the training run. \n", 224 | "* Your model should track at least one appropriate metric, and use at least two callbacks during training, one of which should be a ModelCheckpoint callback.\n", 225 | "* As a guide, you should aim to achieve a final categorical cross entropy training loss of less than 1.0 (the validation loss might be higher).\n", 226 | "* Plot the learning curves for loss vs epoch and accuracy vs epoch for both training and validation sets.\n", 227 | "* Compute and display the loss and accuracy of the trained model on the test set." 228 | ] 229 | }, 230 | { 231 | "cell_type": "code", 232 | "execution_count": null, 233 | "metadata": {}, 234 | "outputs": [], 235 | "source": [ 236 | "from tensorflow.keras.callbacks import ModelCheckpoint, EarlyStopping" 237 | ] 238 | }, 239 | { 240 | "cell_type": "code", 241 | "execution_count": null, 242 | "metadata": {}, 243 | "outputs": [], 244 | "source": [ 245 | "checkpoint = ModelCheckpoint(filepath = 'SeqMode\\\\mySeqModel', save_best_only=True, save_weights_only=True, monitor='val_loss', verbose=1)\n", 246 | "earlystop = EarlyStopping(patience=5, monitor='loss')" 247 | ] 248 | }, 249 | { 250 | "cell_type": "code", 251 | "execution_count": null, 252 | "metadata": {}, 253 | "outputs": [], 254 | "source": [] 255 | }, 256 | { 257 | "cell_type": "code", 258 | "execution_count": null, 259 | "metadata": {}, 260 | "outputs": [], 261 | "source": [ 262 | "model2 = Sequential([\n", 263 | " Flatten(input_shape=X_train[0].shape),\n", 264 | " Dense(128*4, activation='relu'),\n", 265 | " Dense(64, activation='relu'),\n", 266 | " BatchNormalization(),\n", 267 | " Dense(64, activation='relu'),\n", 268 | " tf.keras.layers.Dropout(0.5),\n", 269 | " Dense(32, activation='relu'),\n", 270 | " Dense(10, activation='softmax')\n", 271 | "])\n", 272 | "model2.summary()" 273 | ] 274 | }, 275 | { 276 | "cell_type": "code", 277 | "execution_count": null, 278 | "metadata": {}, 279 | "outputs": [], 280 | "source": [ 281 | "model2.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['acc'])" 282 | ] 283 | }, 284 | { 285 | "cell_type": "code", 286 | "execution_count": null, 287 | "metadata": {}, 288 | "outputs": [], 289 | "source": [ 290 | "history = model2.fit(X_train, y_train_oh, callbacks=[checkpoint, earlystop], batch_size=128, validation_data=(X_test, y_test_oh), epochs=30)" 291 | ] 292 | }, 293 | { 294 | "cell_type": "code", 295 | "execution_count": null, 296 | "metadata": {}, 297 | "outputs": [], 298 | "source": [ 299 | "!dir" 300 | ] 301 | }, 302 | { 303 | "cell_type": "code", 304 | "execution_count": null, 305 | "metadata": {}, 306 | "outputs": [], 307 | "source": [ 308 | "plt.plot(history.history['loss'])\n", 309 | "plt.plot(history.history['val_loss'])\n", 310 | "plt.xlabel('Epochs')\n", 311 | "plt.ylabel('Loss')\n", 312 | "plt.legend(['loss','val_loss'], 
loc='upper right')\n", 313 | "plt.title(\"Loss\")" 314 | ] 315 | }, 316 | { 317 | "cell_type": "code", 318 | "execution_count": null, 319 | "metadata": {}, 320 | "outputs": [], 321 | "source": [ 322 | "plt.plot(history.history['acc'])\n", 323 | "plt.plot(history.history['val_acc'])\n", 324 | "plt.xlabel('Epochs')\n", 325 | "plt.ylabel('Acc')\n", 326 | "plt.legend(['acc','val_acc'], loc='lower right')\n", 327 | "plt.title(\"Accuracy\")" 328 | ] 329 | }, 330 | { 331 | "cell_type": "markdown", 332 | "metadata": {}, 333 | "source": [ 334 | "## 3. CNN neural network classifier\n", 335 | "* Build a CNN classifier model using the Sequential API. Your model should use the Conv2D, MaxPool2D, BatchNormalization, Flatten, Dense and Dropout layers. The final layer should again have a 10-way softmax output. \n", 336 | "* You should design and build the model yourself. Feel free to experiment with different CNN architectures. _Hint: to achieve a reasonable accuracy you won't need to use more than 2 or 3 convolutional layers and 2 fully connected layers._\n", 337 | "* The CNN model should use fewer trainable parameters than your MLP model.\n", 338 | "* Compile and train the model (we recommend a maximum of 30 epochs), making use of both training and validation sets during the training run.\n", 339 | "* Your model should track at least one appropriate metric, and use at least two callbacks during training, one of which should be a ModelCheckpoint callback.\n", 340 | "* You should aim to beat the MLP model performance with fewer parameters!\n", 341 | "* Plot the learning curves for loss vs epoch and accuracy vs epoch for both training and validation sets.\n", 342 | "* Compute and display the loss and accuracy of the trained model on the test set." 343 | ] 344 | }, 345 | { 346 | "cell_type": "code", 347 | "execution_count": null, 348 | "metadata": {}, 349 | "outputs": [], 350 | "source": [ 351 | "model3 = Sequential([\n", 352 | "    Conv2D(filters=16, kernel_size=3, activation='relu', input_shape=X_train[0].shape),\n", 353 | "    MaxPool2D(pool_size=(3,3), strides=1),\n", 354 | "    Conv2D(filters=32, kernel_size=3, padding='valid', strides=1, activation='relu'),\n", 355 | "    MaxPool2D(pool_size=(1,1), strides=3),\n", 356 | "    BatchNormalization(),\n", 357 | "    Conv2D(filters=32, kernel_size=3, padding='valid', strides=2, activation='relu'),\n", 358 | "    tf.keras.layers.Dropout(0.5),\n", 359 | "    Flatten(),\n", 360 | "    Dense(128, activation='relu'),\n", 361 | "    Dense(32, activation='relu'),\n", 362 | "    tf.keras.layers.Dropout(0.3),\n", 363 | "    Dense(10, activation='softmax')\n", 364 | "])" 365 | ] 366 | }, 367 | { 368 | "cell_type": "code", 369 | "execution_count": null, 370 | "metadata": {}, 371 | "outputs": [], 372 | "source": [ 373 | "model3.summary()" 374 | ] 375 | }, 376 | { 377 | "cell_type": "code", 378 | "execution_count": null, 379 | "metadata": {}, 380 | "outputs": [], 381 | "source": [ 382 | "## Fewer parameters than the MLP model" 383 | ] 384 | }, 385 | { 386 | "cell_type": "code", 387 | "execution_count": null, 388 | "metadata": {}, 389 | "outputs": [], 390 | "source": [ 391 | "model3.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['acc'])" 392 | ] 393 | }, 394 | { 395 | "cell_type": "code", 396 | "execution_count": null, 397 | "metadata": {}, 398 | "outputs": [], 399 | "source": [ 400 | "callback1 = ModelCheckpoint(filepath='CNNweights', save_best_only=True, save_weights_only=True, save_freq=5000, monitor='val_acc')\n", 401 | "callback2 = 
EarlyStopping(monitor='loss', patience=7, verbose=1)" 402 | ] 403 | }, 404 | { 405 | "cell_type": "code", 406 | "execution_count": null, 407 | "metadata": {}, 408 | "outputs": [], 409 | "source": [ 410 | "X_train.shape" 411 | ] 412 | }, 413 | { 414 | "cell_type": "code", 415 | "execution_count": null, 416 | "metadata": {}, 417 | "outputs": [], 418 | "source": [ 419 | "history = model3.fit(X_train, y_train_oh, callbacks=[callback1, callback2], batch_size=256, validation_data=(X_test, y_test_oh), epochs=30)" 420 | ] 421 | }, 422 | { 423 | "cell_type": "markdown", 424 | "metadata": {}, 425 | "source": [ 426 | "### We can see that the CNN improves accuracy considerably compared to the plain dense model, in only 4 epochs and with far fewer parameters" 427 | ] 428 | }, 429 | { 430 | "cell_type": "markdown", 431 | "metadata": {}, 432 | "source": [ 433 | "## 4. Get model predictions\n", 434 | "* Load the best weights for the MLP and CNN models that you saved during the training run.\n", 435 | "* Randomly select 5 images and corresponding labels from the test set and display the images with their labels.\n", 436 | "* Alongside the image and label, show each model’s predictive distribution as a bar chart, and the final model prediction given by the label with maximum probability." 437 | ] 438 | }, 439 | { 440 | "cell_type": "code", 441 | "execution_count": null, 442 | "metadata": {}, 443 | "outputs": [], 444 | "source": [ 445 | "model2.load_weights('SeqMode\\\\mySeqModel')" 446 | ] 447 | }, 448 | { 449 | "cell_type": "code", 450 | "execution_count": null, 451 | "metadata": {}, 452 | "outputs": [], 453 | "source": [ 454 | "import random" 455 | ] 456 | }, 457 | { 458 | "cell_type": "code", 459 | "execution_count": null, 460 | "metadata": {}, 461 | "outputs": [], 462 | "source": [ 463 | "num_test_images = X_test.shape[0]\n", 464 | "\n", 465 | "random_inx = np.random.choice(num_test_images, 5)\n", 466 | "random_test_images = X_test[random_inx, ...]\n", 467 | "random_test_labels = y_test[random_inx, ...]\n", 468 | "\n", 469 | "predictions = model2.predict(random_test_images)\n", 470 | "\n", 471 | "fig, axes = plt.subplots(5, 2, figsize=(16, 12))\n", 472 | "fig.subplots_adjust(hspace=0.4, wspace=-0.2)\n", 473 | "\n", 474 | "for i, (prediction, image, label) in enumerate(zip(predictions, random_test_images, random_test_labels)):\n", 475 | "    axes[i, 0].imshow(np.squeeze(image))\n", 476 | "    axes[i, 0].get_xaxis().set_visible(False)\n", 477 | "    axes[i, 0].get_yaxis().set_visible(False)\n", 478 | "    axes[i, 0].text(10., -1.5, f'Digit {label}')\n", 479 | "    axes[i, 1].bar(np.arange(1,11), prediction)\n", 480 | "    axes[i, 1].set_xticks(np.arange(1,11))\n", 481 | "    axes[i, 1].set_title(\"Categorical distribution. 
Model prediction\")\n", 482 | " \n", 483 | "plt.show()" 484 | ] 485 | }, 486 | { 487 | "cell_type": "code", 488 | "execution_count": null, 489 | "metadata": {}, 490 | "outputs": [], 491 | "source": [ 492 | "num_test_images = X_test.shape[0]\n", 493 | "\n", 494 | "random_inx = np.random.choice(num_test_images, 5)\n", 495 | "random_test_images = X_test[random_inx, ...]\n", 496 | "random_test_labels = y_test[random_inx, ...]\n", 497 | "\n", 498 | "predictions = model3.predict(random_test_images)\n", 499 | "\n", 500 | "fig, axes = plt.subplots(5, 2, figsize=(16, 12))\n", 501 | "fig.subplots_adjust(hspace=0.4, wspace=-0.2)\n", 502 | "\n", 503 | "for i, (prediction, image, label) in enumerate(zip(predictions, random_test_images, random_test_labels)):\n", 504 | " axes[i, 0].imshow(np.squeeze(image))\n", 505 | " axes[i, 0].get_xaxis().set_visible(False)\n", 506 | " axes[i, 0].get_yaxis().set_visible(False)\n", 507 | " axes[i, 0].text(10., -1.5, f'Digit {label}')\n", 508 | " axes[i, 1].bar(np.arange(1,11), prediction)\n", 509 | " axes[i, 1].set_xticks(np.arange(1,11))\n", 510 | " axes[i, 1].set_title(\"Categorical distribution. Model prediction\")\n", 511 | " \n", 512 | "plt.show()" 513 | ] 514 | }, 515 | { 516 | "cell_type": "code", 517 | "execution_count": null, 518 | "metadata": {}, 519 | "outputs": [], 520 | "source": [] 521 | }, 522 | { 523 | "cell_type": "code", 524 | "execution_count": null, 525 | "metadata": {}, 526 | "outputs": [], 527 | "source": [] 528 | } 529 | ], 530 | "metadata": { 531 | "kernelspec": { 532 | "display_name": "Python 3.7.7 64-bit ('myenv': conda)", 533 | "language": "python", 534 | "name": "python37764bitmyenvconda4a11ba26287d4d1c969b9946e31eb2a2" 535 | }, 536 | "language_info": { 537 | "codemirror_mode": { 538 | "name": "ipython", 539 | "version": 3 540 | }, 541 | "file_extension": ".py", 542 | "mimetype": "text/x-python", 543 | "name": "python", 544 | "nbconvert_exporter": "python", 545 | "pygments_lexer": "ipython3", 546 | "version": "3.7.7" 547 | } 548 | }, 549 | "nbformat": 4, 550 | "nbformat_minor": 2 551 | } -------------------------------------------------------------------------------- /Capstone Project/Capstone Project.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ahmadmustafaanis/Getting-Started-with-Tensorflow-2/6318490c1b07018f94389a68905d7b51029c35aa/Capstone Project/Capstone Project.pdf -------------------------------------------------------------------------------- /Capstone Project/SeqMode/checkpoint: -------------------------------------------------------------------------------- 1 | model_checkpoint_path: "mySeqModel" 2 | all_model_checkpoint_paths: "mySeqModel" 3 | -------------------------------------------------------------------------------- /Capstone Project/SeqMode/mySeqModel.data-00000-of-00002: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ahmadmustafaanis/Getting-Started-with-Tensorflow-2/6318490c1b07018f94389a68905d7b51029c35aa/Capstone Project/SeqMode/mySeqModel.data-00000-of-00002 -------------------------------------------------------------------------------- /Capstone Project/SeqMode/mySeqModel.data-00001-of-00002: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ahmadmustafaanis/Getting-Started-with-Tensorflow-2/6318490c1b07018f94389a68905d7b51029c35aa/Capstone Project/SeqMode/mySeqModel.data-00001-of-00002 
-------------------------------------------------------------------------------- /Capstone Project/SeqMode/mySeqModel.index: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ahmadmustafaanis/Getting-Started-with-Tensorflow-2/6318490c1b07018f94389a68905d7b51029c35aa/Capstone Project/SeqMode/mySeqModel.index -------------------------------------------------------------------------------- /Capstone Project/checkpoint: -------------------------------------------------------------------------------- 1 | model_checkpoint_path: "mySeqModel" 2 | all_model_checkpoint_paths: "mySeqModel" 3 | -------------------------------------------------------------------------------- /Capstone Project/mySeqModel.data-00000-of-00002: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ahmadmustafaanis/Getting-Started-with-Tensorflow-2/6318490c1b07018f94389a68905d7b51029c35aa/Capstone Project/mySeqModel.data-00000-of-00002 -------------------------------------------------------------------------------- /Capstone Project/mySeqModel.data-00001-of-00002: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ahmadmustafaanis/Getting-Started-with-Tensorflow-2/6318490c1b07018f94389a68905d7b51029c35aa/Capstone Project/mySeqModel.data-00001-of-00002 -------------------------------------------------------------------------------- /Capstone Project/mySeqModel.index: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ahmadmustafaanis/Getting-Started-with-Tensorflow-2/6318490c1b07018f94389a68905d7b51029c35aa/Capstone Project/mySeqModel.index -------------------------------------------------------------------------------- /Introduction to Google Colab/readme.md: -------------------------------------------------------------------------------- 1 | 2 | ### Introduction to Google Colab 3 | Why Google Colab? 4 | - Provides a browser-based Jupyter notebook 5 | - Ready to use 6 | - GPUs & TPUs 7 | - Store data with Google Drive 8 | - Pre-installed packages 9 | --- 10 | 11 | ##### Basics 12 | - Go to `File > New Notebook` for a new notebook. 13 | - All notebooks are saved in your Google Drive. 14 | - To locate a notebook in Drive, go to `File > Locate in Drive`. 15 | 16 | ##### Important Shortcuts 17 | - `Ctrl + M` followed by `B` to make a new code block. 18 | - `Ctrl + M` followed by `M` to convert Code -> Markdown 19 | - `Ctrl + M` followed by `Y` to convert Markdown -> Code 20 | 21 | ##### Change Language 22 | - To make a Python 2 notebook, go to this link: [Python 2 for Colab](https://colab.research.google.com/notebook#create=true&language=python2) 23 | 24 | ##### GPU & TPU 25 | - Go to `Runtime > Change Runtime Type` and select GPU or TPU from there. 26 | 27 | ##### Load data from Drive 28 | - Use this code snippet to import data from Drive: 29 | ```python3 30 | from google.colab import drive 31 | drive.mount('gdrive') 32 | 33 | my_file = open('gdrive/mydrive/...yourpathtofile/file.txt') 34 | print(my_file.read()) 35 | ``` 36 | 37 | #### Bash commands 38 | - Use bash commands by adding `!` before them. 39 | - e.g. `!pip install numpy`, `!ls` or `!dir`, etc. 
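
##### Check your runtime
- A quick way to confirm that the GPU runtime is actually active after changing the runtime type (a minimal sketch; an empty list means the runtime is still CPU-only):
```python3
import tensorflow as tf

# Lists the GPUs that TensorFlow can see on this runtime
print(tf.config.list_physical_devices('GPU'))
```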
40 | -------------------------------------------------------------------------------- /Saving and Loading Model Weights/Explanation of saved files.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "# Explanation of saved files\n", 8 | "In this reading, we'll take a closer look at the files saved by the ModelCheckpoint callback, when saving weights only. " 9 | ] 10 | }, 11 | { 12 | "cell_type": "markdown", 13 | "metadata": {}, 14 | "source": [ 15 | "Previously, you experimented with the ModelCheckpoint callback, which can be used to save model weights during training. You looked at the saved files using the `! ls` command. The saved files were the following:\n", 16 | "\n", 17 | "```\n", 18 | "-rw-r--r-- 1 aph416 staff 87B 2 Nov 17:04 checkpoint\n", 19 | "-rw-r--r-- 1 aph416 staff 2.0K 2 Nov 17:04 checkpoint.index\n", 20 | "-rw-r--r-- 1 aph416 staff 174K 2 Nov 17:04 checkpoint.data-00000-of-00001\n", 21 | "```\n", 22 | "\n", 23 | "So, what are each of these files?" 24 | ] 25 | }, 26 | { 27 | "cell_type": "markdown", 28 | "metadata": {}, 29 | "source": [ 30 | "#### `checkpoint`\n", 31 | "This file is by far the smallest, at only 87 bytes. It's actually so small that we can just look at it directly. It's a human readable file with the following text:\n", 32 | "```\n", 33 | "model_checkpoint_path: \"checkpoint\"\n", 34 | "all_model_checkpoint_paths: \"checkpoint\"\n", 35 | "```\n", 36 | "This is metadata that indicates where the actual model data is stored." 37 | ] 38 | }, 39 | { 40 | "cell_type": "markdown", 41 | "metadata": {}, 42 | "source": [ 43 | "#### `checkpoint.index`\n", 44 | "This file tells TensorFlow which weights are stored where. When running models on distributed systems, there may be different *shards*, meaning the full model may have to be recomposed from multiple sources. In the last notebook, you created a single model on a single machine, so there is only one *shard* and all weights are stored in the same place." 45 | ] 46 | }, 47 | { 48 | "cell_type": "markdown", 49 | "metadata": {}, 50 | "source": [ 51 | "#### `checkpoint.data-00000-of-00001`\n", 52 | "This file contains the actual weights from the model. It is by far the largest of the 3 files. Recall that the model you trained had around 14000 parameters, meaning this file is roughly 12 bytes per saved weight." 
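,
    "\n",
    "\n",
    "As a quick check, you can also list the variables stored in these files programmatically. A minimal sketch (assuming the checkpoint files with prefix `checkpoint` are in the working directory):\n",
    "```python\n",
    "import tensorflow as tf\n",
    "\n",
    "# Print the name and shape of every variable saved in the checkpoint\n",
    "for name, shape in tf.train.list_variables('checkpoint'):\n",
    "    print(name, shape)\n",
    "```"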
53 | ] 54 | }, 55 | { 56 | "cell_type": "markdown", 57 | "metadata": {}, 58 | "source": [ 59 | "### Further reading and resources\n", 60 | "* https://www.tensorflow.org/tutorials/keras/save_and_load#what_are_these_files" 61 | ] 62 | } 63 | ], 64 | "metadata": { 65 | "kernelspec": { 66 | "display_name": "Python 3", 67 | "language": "python", 68 | "name": "python3" 69 | }, 70 | "language_info": { 71 | "codemirror_mode": { 72 | "name": "ipython", 73 | "version": 3 74 | }, 75 | "file_extension": ".py", 76 | "mimetype": "text/x-python", 77 | "name": "python", 78 | "nbconvert_exporter": "python", 79 | "pygments_lexer": "ipython3", 80 | "version": "3.7.1" 81 | } 82 | }, 83 | "nbformat": 4, 84 | "nbformat_minor": 2 85 | } 86 | -------------------------------------------------------------------------------- /Saving and Loading Model Weights/Saving model architecture only.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "# Saving model architecture only\n", 8 | "\n", 9 | "In this reading you will learn how to save a model's architecture, but not its weights." 10 | ] 11 | }, 12 | { 13 | "cell_type": "code", 14 | "execution_count": null, 15 | "metadata": {}, 16 | "outputs": [], 17 | "source": [ 18 | "import tensorflow as tf\n", 19 | "from tensorflow.keras.models import Sequential\n", 20 | "from tensorflow.keras.layers import Dense\n", 21 | "import json\n", 22 | "import numpy as np" 23 | ] 24 | }, 25 | { 26 | "cell_type": "markdown", 27 | "metadata": {}, 28 | "source": [ 29 | "In previous videos and notebooks you have learned how to save a model's weights, as well as the entire model - weights and architecture." 30 | ] 31 | }, 32 | { 33 | "cell_type": "markdown", 34 | "metadata": {}, 35 | "source": [ 36 | "### Accessing a model's configuration\n", 37 | "A model's *configuration* refers to its architecture. TensorFlow has a convenient way to retrieve a model's architecture as a dictionary. We start by creating a simple fully connected feedforward neural network with 1 hidden layer." 38 | ] 39 | }, 40 | { 41 | "cell_type": "code", 42 | "execution_count": null, 43 | "metadata": {}, 44 | "outputs": [], 45 | "source": [ 46 | "# Build the model\n", 47 | "\n", 48 | "model = Sequential([\n", 49 | "    Dense(units=32, input_shape=(32, 32, 3), activation='relu', name='dense_1'),\n", 50 | "    Dense(units=10, activation='softmax', name='dense_2')\n", 51 | "])" 52 | ] 53 | }, 54 | { 55 | "cell_type": "markdown", 56 | "metadata": {}, 57 | "source": [ 58 | "A TensorFlow model has an inbuilt method `get_config` which returns the model's architecture as a dictionary:" 59 | ] 60 | }, 61 | { 62 | "cell_type": "code", 63 | "execution_count": null, 64 | "metadata": { 65 | "tags": [] 66 | }, 67 | "outputs": [], 68 | "source": [ 69 | "# Get the model config\n", 70 | "\n", 71 | "config_dict = model.get_config()\n", 72 | "print(config_dict)" 73 | ] 74 | }, 75 | { 76 | "cell_type": "markdown", 77 | "metadata": {}, 78 | "source": [ 79 | "### Creating a new model from the config\n", 80 | "A new TensorFlow model can be created from this config dictionary. This model will have reinitialized weights, which are not the same as the original model."
81 | ] 82 | }, 83 | { 84 | "cell_type": "code", 85 | "execution_count": null, 86 | "metadata": {}, 87 | "outputs": [], 88 | "source": [ 89 | "# Create a model from the config dictionary\n", 90 | "\n", 91 | "model_same_config = tf.keras.Sequential.from_config(config_dict)" 92 | ] 93 | }, 94 | { 95 | "cell_type": "markdown", 96 | "metadata": {}, 97 | "source": [ 98 | "We can check explicitly that the config of both models is the same, but the weights are not: " 99 | ] 100 | }, 101 | { 102 | "cell_type": "code", 103 | "execution_count": null, 104 | "metadata": { 105 | "tags": [] 106 | }, 107 | "outputs": [], 108 | "source": [ 109 | "# Check the new model is the same architecture\n", 110 | "\n", 111 | "print('Same config:', \n", 112 | "      model.get_config() == model_same_config.get_config())\n", 113 | "print('Same value for first weight matrix:', \n", 114 | "      np.allclose(model.weights[0].numpy(), model_same_config.weights[0].numpy()))" 115 | ] 116 | }, 117 | { 118 | "cell_type": "markdown", 119 | "metadata": {}, 120 | "source": [ 121 | "For models that are not `Sequential` models, use `tf.keras.Model.from_config` instead of `tf.keras.Sequential.from_config`." 122 | ] 123 | }, 124 | { 125 | "cell_type": "markdown", 126 | "metadata": {}, 127 | "source": [ 128 | "### Other file formats: JSON and YAML\n", 129 | "It is also possible to obtain a model's config in JSON or YAML formats. This follows the same pattern:" 130 | ] 131 | }, 132 | { 133 | "cell_type": "code", 134 | "execution_count": null, 135 | "metadata": { 136 | "tags": [] 137 | }, 138 | "outputs": [], 139 | "source": [ 140 | "# Convert the model to JSON\n", 141 | "\n", 142 | "json_string = model.to_json()\n", 143 | "print(json_string)" 144 | ] 145 | }, 146 | { 147 | "cell_type": "markdown", 148 | "metadata": {}, 149 | "source": [ 150 | "The JSON format can easily be written out and saved as a file:" 151 | ] 152 | }, 153 | { 154 | "cell_type": "code", 155 | "execution_count": null, 156 | "metadata": {}, 157 | "outputs": [], 158 | "source": [ 159 | "# Write out JSON config file\n", 160 | "\n", 161 | "with open('config.json', 'w') as f:\n", 162 | "    json.dump(json_string, f)\n", 163 | "del json_string" 164 | ] 165 | }, 166 | { 167 | "cell_type": "code", 168 | "execution_count": null, 169 | "metadata": {}, 170 | "outputs": [], 171 | "source": [ 172 | "# Read in JSON config file again\n", 173 | "\n", 174 | "with open('config.json', 'r') as f:\n", 175 | "    json_string = json.load(f)" 176 | ] 177 | }, 178 | { 179 | "cell_type": "code", 180 | "execution_count": null, 181 | "metadata": {}, 182 | "outputs": [], 183 | "source": [ 184 | "# Reinitialize the model\n", 185 | "\n", 186 | "model_same_config = tf.keras.models.model_from_json(json_string)" 187 | ] 188 | }, 189 | { 190 | "cell_type": "code", 191 | "execution_count": null, 192 | "metadata": { 193 | "tags": [] 194 | }, 195 | "outputs": [], 196 | "source": [ 197 | "# Check the new model is the same architecture, but different weights\n", 198 | "\n", 199 | "print('Same config:', \n", 200 | "      model.get_config() == model_same_config.get_config())\n", 201 | "print('Same value for first weight matrix:', \n", 202 | "      np.allclose(model.weights[0].numpy(), model_same_config.weights[0].numpy()))" 203 | ] 204 | }, 205 | { 206 | "cell_type": "markdown", 207 | "metadata": {}, 208 | "source": [ 209 | "The YAML format is similar. The details of writing out YAML files, loading them and using them to create a new model are the same as for the JSON files, so we won't show them here."
210 | ] 211 | }, 212 | { 213 | "cell_type": "code", 214 | "execution_count": null, 215 | "metadata": { 216 | "tags": [] 217 | }, 218 | "outputs": [], 219 | "source": [ 220 | "# Convert the model to YAML\n", 221 | "\n", 222 | "yaml_string = model.to_yaml()\n", 223 | "print(yaml_string)" 224 | ] 225 | }, 226 | { 227 | "cell_type": "markdown", 228 | "metadata": {}, 229 | "source": [ 230 | "Writing out, reading in and using YAML files to create models is similar to JSON files. " 231 | ] 232 | }, 233 | { 234 | "cell_type": "markdown", 235 | "metadata": {}, 236 | "source": [ 237 | "### Further reading and resources\n", 238 | "* https://www.tensorflow.org/guide/keras/save_and_serialize#architecture-only_saving\n", 239 | "* https://keras.io/getting-started/faq/#how-can-i-save-a-keras-model" 240 | ] 241 | } 242 | ], 243 | "metadata": { 244 | "kernelspec": { 245 | "display_name": "Python 3.7.7 64-bit ('myenv': conda)", 246 | "language": "python", 247 | "name": "python37764bitmyenvconda4a11ba26287d4d1c969b9946e31eb2a2" 248 | }, 249 | "language_info": { 250 | "codemirror_mode": { 251 | "name": "ipython", 252 | "version": 3 253 | }, 254 | "file_extension": ".py", 255 | "mimetype": "text/x-python", 256 | "name": "python", 257 | "nbconvert_exporter": "python", 258 | "pygments_lexer": "ipython3", 259 | "version": "3.7.7-final" 260 | } 261 | }, 262 | "nbformat": 4, 263 | "nbformat_minor": 2 264 | } -------------------------------------------------------------------------------- /Saving and Loading Model Weights/data/bird1.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ahmadmustafaanis/Getting-Started-with-Tensorflow-2/6318490c1b07018f94389a68905d7b51029c35aa/Saving and Loading Model Weights/data/bird1.jpg -------------------------------------------------------------------------------- /Saving and Loading Model Weights/data/bird2.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ahmadmustafaanis/Getting-Started-with-Tensorflow-2/6318490c1b07018f94389a68905d7b51029c35aa/Saving and Loading Model Weights/data/bird2.jpg -------------------------------------------------------------------------------- /Saving and Loading Model Weights/data/cat1.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ahmadmustafaanis/Getting-Started-with-Tensorflow-2/6318490c1b07018f94389a68905d7b51029c35aa/Saving and Loading Model Weights/data/cat1.jpg -------------------------------------------------------------------------------- /Saving and Loading Model Weights/data/imagenet_categories.txt: -------------------------------------------------------------------------------- 1 | background 2 | tench 3 | goldfish 4 | great white shark 5 | tiger shark 6 | hammerhead 7 | electric ray 8 | stingray 9 | cock 10 | hen 11 | ostrich 12 | brambling 13 | goldfinch 14 | house finch 15 | junco 16 | indigo bunting 17 | robin 18 | bulbul 19 | jay 20 | magpie 21 | chickadee 22 | water ouzel 23 | kite 24 | bald eagle 25 | vulture 26 | great grey owl 27 | European fire salamander 28 | common newt 29 | eft 30 | spotted salamander 31 | axolotl 32 | bullfrog 33 | tree frog 34 | tailed frog 35 | loggerhead 36 | leatherback turtle 37 | mud turtle 38 | terrapin 39 | box turtle 40 | banded gecko 41 | common iguana 42 | American chameleon 43 | whiptail 44 | agama 45 | frilled lizard 46 | alligator lizard 47 | Gila monster 48 | green lizard 49 | African 
chameleon 50 | Komodo dragon 51 | African crocodile 52 | American alligator 53 | triceratops 54 | thunder snake 55 | ringneck snake 56 | hognose snake 57 | green snake 58 | king snake 59 | garter snake 60 | water snake 61 | vine snake 62 | night snake 63 | boa constrictor 64 | rock python 65 | Indian cobra 66 | green mamba 67 | sea snake 68 | horned viper 69 | diamondback 70 | sidewinder 71 | trilobite 72 | harvestman 73 | scorpion 74 | black and gold garden spider 75 | barn spider 76 | garden spider 77 | black widow 78 | tarantula 79 | wolf spider 80 | tick 81 | centipede 82 | black grouse 83 | ptarmigan 84 | ruffed grouse 85 | prairie chicken 86 | peacock 87 | quail 88 | partridge 89 | African grey 90 | macaw 91 | sulphur-crested cockatoo 92 | lorikeet 93 | coucal 94 | bee eater 95 | hornbill 96 | hummingbird 97 | jacamar 98 | toucan 99 | drake 100 | red-breasted merganser 101 | goose 102 | black swan 103 | tusker 104 | echidna 105 | platypus 106 | wallaby 107 | koala 108 | wombat 109 | jellyfish 110 | sea anemone 111 | brain coral 112 | flatworm 113 | nematode 114 | conch 115 | snail 116 | slug 117 | sea slug 118 | chiton 119 | chambered nautilus 120 | Dungeness crab 121 | rock crab 122 | fiddler crab 123 | king crab 124 | American lobster 125 | spiny lobster 126 | crayfish 127 | hermit crab 128 | isopod 129 | white stork 130 | black stork 131 | spoonbill 132 | flamingo 133 | little blue heron 134 | American egret 135 | bittern 136 | crane 137 | limpkin 138 | European gallinule 139 | American coot 140 | bustard 141 | ruddy turnstone 142 | red-backed sandpiper 143 | redshank 144 | dowitcher 145 | oystercatcher 146 | pelican 147 | king penguin 148 | albatross 149 | grey whale 150 | killer whale 151 | dugong 152 | sea lion 153 | Chihuahua 154 | Japanese spaniel 155 | Maltese dog 156 | Pekinese 157 | Shih-Tzu 158 | Blenheim spaniel 159 | papillon 160 | toy terrier 161 | Rhodesian ridgeback 162 | Afghan hound 163 | basset 164 | beagle 165 | bloodhound 166 | bluetick 167 | black-and-tan coonhound 168 | Walker hound 169 | English foxhound 170 | redbone 171 | borzoi 172 | Irish wolfhound 173 | Italian greyhound 174 | whippet 175 | Ibizan hound 176 | Norwegian elkhound 177 | otterhound 178 | Saluki 179 | Scottish deerhound 180 | Weimaraner 181 | Staffordshire bullterrier 182 | American Staffordshire terrier 183 | Bedlington terrier 184 | Border terrier 185 | Kerry blue terrier 186 | Irish terrier 187 | Norfolk terrier 188 | Norwich terrier 189 | Yorkshire terrier 190 | wire-haired fox terrier 191 | Lakeland terrier 192 | Sealyham terrier 193 | Airedale 194 | cairn 195 | Australian terrier 196 | Dandie Dinmont 197 | Boston bull 198 | miniature schnauzer 199 | giant schnauzer 200 | standard schnauzer 201 | Scotch terrier 202 | Tibetan terrier 203 | silky terrier 204 | soft-coated wheaten terrier 205 | West Highland white terrier 206 | Lhasa 207 | flat-coated retriever 208 | curly-coated retriever 209 | golden retriever 210 | Labrador retriever 211 | Chesapeake Bay retriever 212 | German short-haired pointer 213 | vizsla 214 | English setter 215 | Irish setter 216 | Gordon setter 217 | Brittany spaniel 218 | clumber 219 | English springer 220 | Welsh springer spaniel 221 | cocker spaniel 222 | Sussex spaniel 223 | Irish water spaniel 224 | kuvasz 225 | schipperke 226 | groenendael 227 | malinois 228 | briard 229 | kelpie 230 | komondor 231 | Old English sheepdog 232 | Shetland sheepdog 233 | collie 234 | Border collie 235 | Bouvier des Flandres 236 | Rottweiler 237 | German shepherd 238 | Doberman 
239 | miniature pinscher 240 | Greater Swiss Mountain dog 241 | Bernese mountain dog 242 | Appenzeller 243 | EntleBucher 244 | boxer 245 | bull mastiff 246 | Tibetan mastiff 247 | French bulldog 248 | Great Dane 249 | Saint Bernard 250 | Eskimo dog 251 | malamute 252 | Siberian husky 253 | dalmatian 254 | affenpinscher 255 | basenji 256 | pug 257 | Leonberg 258 | Newfoundland 259 | Great Pyrenees 260 | Samoyed 261 | Pomeranian 262 | chow 263 | keeshond 264 | Brabancon griffon 265 | Pembroke 266 | Cardigan 267 | toy poodle 268 | miniature poodle 269 | standard poodle 270 | Mexican hairless 271 | timber wolf 272 | white wolf 273 | red wolf 274 | coyote 275 | dingo 276 | dhole 277 | African hunting dog 278 | hyena 279 | red fox 280 | kit fox 281 | Arctic fox 282 | grey fox 283 | tabby 284 | tiger cat 285 | Persian cat 286 | Siamese cat 287 | Egyptian cat 288 | cougar 289 | lynx 290 | leopard 291 | snow leopard 292 | jaguar 293 | lion 294 | tiger 295 | cheetah 296 | brown bear 297 | American black bear 298 | ice bear 299 | sloth bear 300 | mongoose 301 | meerkat 302 | tiger beetle 303 | ladybug 304 | ground beetle 305 | long-horned beetle 306 | leaf beetle 307 | dung beetle 308 | rhinoceros beetle 309 | weevil 310 | fly 311 | bee 312 | ant 313 | grasshopper 314 | cricket 315 | walking stick 316 | cockroach 317 | mantis 318 | cicada 319 | leafhopper 320 | lacewing 321 | dragonfly 322 | damselfly 323 | admiral 324 | ringlet 325 | monarch 326 | cabbage butterfly 327 | sulphur butterfly 328 | lycaenid 329 | starfish 330 | sea urchin 331 | sea cucumber 332 | wood rabbit 333 | hare 334 | Angora 335 | hamster 336 | porcupine 337 | fox squirrel 338 | marmot 339 | beaver 340 | guinea pig 341 | sorrel 342 | zebra 343 | hog 344 | wild boar 345 | warthog 346 | hippopotamus 347 | ox 348 | water buffalo 349 | bison 350 | ram 351 | bighorn 352 | ibex 353 | hartebeest 354 | impala 355 | gazelle 356 | Arabian camel 357 | llama 358 | weasel 359 | mink 360 | polecat 361 | black-footed ferret 362 | otter 363 | skunk 364 | badger 365 | armadillo 366 | three-toed sloth 367 | orangutan 368 | gorilla 369 | chimpanzee 370 | gibbon 371 | siamang 372 | guenon 373 | patas 374 | baboon 375 | macaque 376 | langur 377 | colobus 378 | proboscis monkey 379 | marmoset 380 | capuchin 381 | howler monkey 382 | titi 383 | spider monkey 384 | squirrel monkey 385 | Madagascar cat 386 | indri 387 | Indian elephant 388 | African elephant 389 | lesser panda 390 | giant panda 391 | barracouta 392 | eel 393 | coho 394 | rock beauty 395 | anemone fish 396 | sturgeon 397 | gar 398 | lionfish 399 | puffer 400 | abacus 401 | abaya 402 | academic gown 403 | accordion 404 | acoustic guitar 405 | aircraft carrier 406 | airliner 407 | airship 408 | altar 409 | ambulance 410 | amphibian 411 | analog clock 412 | apiary 413 | apron 414 | ashcan 415 | assault rifle 416 | backpack 417 | bakery 418 | balance beam 419 | balloon 420 | ballpoint 421 | Band Aid 422 | banjo 423 | bannister 424 | barbell 425 | barber chair 426 | barbershop 427 | barn 428 | barometer 429 | barrel 430 | barrow 431 | baseball 432 | basketball 433 | bassinet 434 | bassoon 435 | bathing cap 436 | bath towel 437 | bathtub 438 | beach wagon 439 | beacon 440 | beaker 441 | bearskin 442 | beer bottle 443 | beer glass 444 | bell cote 445 | bib 446 | bicycle-built-for-two 447 | bikini 448 | binder 449 | binoculars 450 | birdhouse 451 | boathouse 452 | bobsled 453 | bolo tie 454 | bonnet 455 | bookcase 456 | bookshop 457 | bottlecap 458 | bow 459 | bow tie 460 | brass 461 | brassiere 
462 | breakwater 463 | breastplate 464 | broom 465 | bucket 466 | buckle 467 | bulletproof vest 468 | bullet train 469 | butcher shop 470 | cab 471 | caldron 472 | candle 473 | cannon 474 | canoe 475 | can opener 476 | cardigan 477 | car mirror 478 | carousel 479 | carpenter's kit 480 | carton 481 | car wheel 482 | cash machine 483 | cassette 484 | cassette player 485 | castle 486 | catamaran 487 | CD player 488 | cello 489 | cellular telephone 490 | chain 491 | chainlink fence 492 | chain mail 493 | chain saw 494 | chest 495 | chiffonier 496 | chime 497 | china cabinet 498 | Christmas stocking 499 | church 500 | cinema 501 | cleaver 502 | cliff dwelling 503 | cloak 504 | clog 505 | cocktail shaker 506 | coffee mug 507 | coffeepot 508 | coil 509 | combination lock 510 | computer keyboard 511 | confectionery 512 | container ship 513 | convertible 514 | corkscrew 515 | cornet 516 | cowboy boot 517 | cowboy hat 518 | cradle 519 | crane 520 | crash helmet 521 | crate 522 | crib 523 | Crock Pot 524 | croquet ball 525 | crutch 526 | cuirass 527 | dam 528 | desk 529 | desktop computer 530 | dial telephone 531 | diaper 532 | digital clock 533 | digital watch 534 | dining table 535 | dishrag 536 | dishwasher 537 | disk brake 538 | dock 539 | dogsled 540 | dome 541 | doormat 542 | drilling platform 543 | drum 544 | drumstick 545 | dumbbell 546 | Dutch oven 547 | electric fan 548 | electric guitar 549 | electric locomotive 550 | entertainment center 551 | envelope 552 | espresso maker 553 | face powder 554 | feather boa 555 | file 556 | fireboat 557 | fire engine 558 | fire screen 559 | flagpole 560 | flute 561 | folding chair 562 | football helmet 563 | forklift 564 | fountain 565 | fountain pen 566 | four-poster 567 | freight car 568 | French horn 569 | frying pan 570 | fur coat 571 | garbage truck 572 | gasmask 573 | gas pump 574 | goblet 575 | go-kart 576 | golf ball 577 | golfcart 578 | gondola 579 | gong 580 | gown 581 | grand piano 582 | greenhouse 583 | grille 584 | grocery store 585 | guillotine 586 | hair slide 587 | hair spray 588 | half track 589 | hammer 590 | hamper 591 | hand blower 592 | hand-held computer 593 | handkerchief 594 | hard disc 595 | harmonica 596 | harp 597 | harvester 598 | hatchet 599 | holster 600 | home theater 601 | honeycomb 602 | hook 603 | hoopskirt 604 | horizontal bar 605 | horse cart 606 | hourglass 607 | iPod 608 | iron 609 | jack-o'-lantern 610 | jean 611 | jeep 612 | jersey 613 | jigsaw puzzle 614 | jinrikisha 615 | joystick 616 | kimono 617 | knee pad 618 | knot 619 | lab coat 620 | ladle 621 | lampshade 622 | laptop 623 | lawn mower 624 | lens cap 625 | letter opener 626 | library 627 | lifeboat 628 | lighter 629 | limousine 630 | liner 631 | lipstick 632 | Loafer 633 | lotion 634 | loudspeaker 635 | loupe 636 | lumbermill 637 | magnetic compass 638 | mailbag 639 | mailbox 640 | maillot 641 | maillot 642 | manhole cover 643 | maraca 644 | marimba 645 | mask 646 | matchstick 647 | maypole 648 | maze 649 | measuring cup 650 | medicine chest 651 | megalith 652 | microphone 653 | microwave 654 | military uniform 655 | milk can 656 | minibus 657 | miniskirt 658 | minivan 659 | missile 660 | mitten 661 | mixing bowl 662 | mobile home 663 | Model T 664 | modem 665 | monastery 666 | monitor 667 | moped 668 | mortar 669 | mortarboard 670 | mosque 671 | mosquito net 672 | motor scooter 673 | mountain bike 674 | mountain tent 675 | mouse 676 | mousetrap 677 | moving van 678 | muzzle 679 | nail 680 | neck brace 681 | necklace 682 | nipple 683 | notebook 684 | obelisk 
685 | oboe 686 | ocarina 687 | odometer 688 | oil filter 689 | organ 690 | oscilloscope 691 | overskirt 692 | oxcart 693 | oxygen mask 694 | packet 695 | paddle 696 | paddlewheel 697 | padlock 698 | paintbrush 699 | pajama 700 | palace 701 | panpipe 702 | paper towel 703 | parachute 704 | parallel bars 705 | park bench 706 | parking meter 707 | passenger car 708 | patio 709 | pay-phone 710 | pedestal 711 | pencil box 712 | pencil sharpener 713 | perfume 714 | Petri dish 715 | photocopier 716 | pick 717 | pickelhaube 718 | picket fence 719 | pickup 720 | pier 721 | piggy bank 722 | pill bottle 723 | pillow 724 | ping-pong ball 725 | pinwheel 726 | pirate 727 | pitcher 728 | plane 729 | planetarium 730 | plastic bag 731 | plate rack 732 | plow 733 | plunger 734 | Polaroid camera 735 | pole 736 | police van 737 | poncho 738 | pool table 739 | pop bottle 740 | pot 741 | potter's wheel 742 | power drill 743 | prayer rug 744 | printer 745 | prison 746 | projectile 747 | projector 748 | puck 749 | punching bag 750 | purse 751 | quill 752 | quilt 753 | racer 754 | racket 755 | radiator 756 | radio 757 | radio telescope 758 | rain barrel 759 | recreational vehicle 760 | reel 761 | reflex camera 762 | refrigerator 763 | remote control 764 | restaurant 765 | revolver 766 | rifle 767 | rocking chair 768 | rotisserie 769 | rubber eraser 770 | rugby ball 771 | rule 772 | running shoe 773 | safe 774 | safety pin 775 | saltshaker 776 | sandal 777 | sarong 778 | sax 779 | scabbard 780 | scale 781 | school bus 782 | schooner 783 | scoreboard 784 | screen 785 | screw 786 | screwdriver 787 | seat belt 788 | sewing machine 789 | shield 790 | shoe shop 791 | shoji 792 | shopping basket 793 | shopping cart 794 | shovel 795 | shower cap 796 | shower curtain 797 | ski 798 | ski mask 799 | sleeping bag 800 | slide rule 801 | sliding door 802 | slot 803 | snorkel 804 | snowmobile 805 | snowplow 806 | soap dispenser 807 | soccer ball 808 | sock 809 | solar dish 810 | sombrero 811 | soup bowl 812 | space bar 813 | space heater 814 | space shuttle 815 | spatula 816 | speedboat 817 | spider web 818 | spindle 819 | sports car 820 | spotlight 821 | stage 822 | steam locomotive 823 | steel arch bridge 824 | steel drum 825 | stethoscope 826 | stole 827 | stone wall 828 | stopwatch 829 | stove 830 | strainer 831 | streetcar 832 | stretcher 833 | studio couch 834 | stupa 835 | submarine 836 | suit 837 | sundial 838 | sunglass 839 | sunglasses 840 | sunscreen 841 | suspension bridge 842 | swab 843 | sweatshirt 844 | swimming trunks 845 | swing 846 | switch 847 | syringe 848 | table lamp 849 | tank 850 | tape player 851 | teapot 852 | teddy 853 | television 854 | tennis ball 855 | thatch 856 | theater curtain 857 | thimble 858 | thresher 859 | throne 860 | tile roof 861 | toaster 862 | tobacco shop 863 | toilet seat 864 | torch 865 | totem pole 866 | tow truck 867 | toyshop 868 | tractor 869 | trailer truck 870 | tray 871 | trench coat 872 | tricycle 873 | trimaran 874 | tripod 875 | triumphal arch 876 | trolleybus 877 | trombone 878 | tub 879 | turnstile 880 | typewriter keyboard 881 | umbrella 882 | unicycle 883 | upright 884 | vacuum 885 | vase 886 | vault 887 | velvet 888 | vending machine 889 | vestment 890 | viaduct 891 | violin 892 | volleyball 893 | waffle iron 894 | wall clock 895 | wallet 896 | wardrobe 897 | warplane 898 | washbasin 899 | washer 900 | water bottle 901 | water jug 902 | water tower 903 | whiskey jug 904 | whistle 905 | wig 906 | window screen 907 | window shade 908 | Windsor tie 909 | wine bottle 910 
| wing 911 | wok 912 | wooden spoon 913 | wool 914 | worm fence 915 | wreck 916 | yawl 917 | yurt 918 | web site 919 | comic book 920 | crossword puzzle 921 | street sign 922 | traffic light 923 | book jacket 924 | menu 925 | plate 926 | guacamole 927 | consomme 928 | hot pot 929 | trifle 930 | ice cream 931 | ice lolly 932 | French loaf 933 | bagel 934 | pretzel 935 | cheeseburger 936 | hotdog 937 | mashed potato 938 | head cabbage 939 | broccoli 940 | cauliflower 941 | zucchini 942 | spaghetti squash 943 | acorn squash 944 | butternut squash 945 | cucumber 946 | artichoke 947 | bell pepper 948 | cardoon 949 | mushroom 950 | Granny Smith 951 | strawberry 952 | orange 953 | lemon 954 | fig 955 | pineapple 956 | banana 957 | jackfruit 958 | custard apple 959 | pomegranate 960 | hay 961 | carbonara 962 | chocolate sauce 963 | dough 964 | meat loaf 965 | pizza 966 | potpie 967 | burrito 968 | red wine 969 | espresso 970 | cup 971 | eggnog 972 | alp 973 | bubble 974 | cliff 975 | coral reef 976 | geyser 977 | lakeside 978 | promontory 979 | sandbar 980 | seashore 981 | valley 982 | volcano 983 | ballplayer 984 | groom 985 | scuba diver 986 | rapeseed 987 | daisy 988 | yellow lady's slipper 989 | corn 990 | acorn 991 | hip 992 | buckeye 993 | coral fungus 994 | agaric 995 | gyromitra 996 | stinkhorn 997 | earthstar 998 | hen-of-the-woods 999 | bolete 1000 | ear 1001 | toilet tissue -------------------------------------------------------------------------------- /Saving and Loading Model Weights/model_checkpoints/saved_model.pb: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ahmadmustafaanis/Getting-Started-with-Tensorflow-2/6318490c1b07018f94389a68905d7b51029c35aa/Saving and Loading Model Weights/model_checkpoints/saved_model.pb -------------------------------------------------------------------------------- /Saving and Loading Model Weights/model_checkpoints/variables/variables.data-00000-of-00002: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ahmadmustafaanis/Getting-Started-with-Tensorflow-2/6318490c1b07018f94389a68905d7b51029c35aa/Saving and Loading Model Weights/model_checkpoints/variables/variables.data-00000-of-00002 -------------------------------------------------------------------------------- /Saving and Loading Model Weights/model_checkpoints/variables/variables.data-00001-of-00002: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ahmadmustafaanis/Getting-Started-with-Tensorflow-2/6318490c1b07018f94389a68905d7b51029c35aa/Saving and Loading Model Weights/model_checkpoints/variables/variables.data-00001-of-00002 -------------------------------------------------------------------------------- /Saving and Loading Model Weights/model_checkpoints/variables/variables.index: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ahmadmustafaanis/Getting-Started-with-Tensorflow-2/6318490c1b07018f94389a68905d7b51029c35aa/Saving and Loading Model Weights/model_checkpoints/variables/variables.index -------------------------------------------------------------------------------- /Saving and Loading Model Weights/readme.md: -------------------------------------------------------------------------------- 1 | # Saving and Loading Model Weights 2 | 3 | ----- 4 | 5 | It is important to save your model's progress as it is not feasible to train 
your model from scratch again and again. Retraining the model repeatedly costs time and computational resources, and diverts you from solving the actual problem.

-----

There are 2 different formats in which the model weights can be saved:
- HDF5 format (used by Keras)
- TensorFlow native format

Normally it does not matter which format you use, but the TensorFlow native format is generally preferred.

----

You can save your model during training, at every epoch, or you can save the model weights once, after you have completed the training and are satisfied with the results.

### Saving during training

We will use the built-in callback known as `ModelCheckpoint` to save our model weights during training.

We will create a dummy model for binary classification to see how to save models.
1. Create the model
```python3
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.callbacks import ModelCheckpoint

model = Sequential([
    Dense(10, input_shape=(10,), activation='relu'),
    Dense(1, activation='sigmoid')
])

model.compile(optimizer='SGD', loss='binary_crossentropy', metrics=['acc'])
```
2. Create a checkpoint using the built-in callback and fit the model

```python3
checkpoint = ModelCheckpoint('my_model_weights', save_weights_only=True)
model.fit(X, y, epochs=10, callbacks=[checkpoint])  # pass the checkpoint in callbacks
```

* Here we are saving in the TensorFlow native format. The file name for the saved weights is `my_model_weights`.


* The callback saves the weights at every epoch, and since we pass a single fixed file name, each epoch overwrites the previous file. We will see later how to overcome this.
* 3 different files will be created, with names like

    * checkpoint
    * my_model_weights.data-00000-of-00001
    * my_model_weights.index

* To save in HDF5 format instead, just give the file name in `ModelCheckpoint` a `.h5` extension, i.e.

```python3
checkpoint = ModelCheckpoint('my_model_weights.h5', save_weights_only=True)
model.fit(X, y, epochs=10, callbacks=[checkpoint])  # pass the checkpoint in callbacks
```
* In this case, only one file will be created, named `my_model_weights.h5`.

-----

### Saving after training, i.e. manual saving

We can also save the model manually: after training, when all the epochs are done and we are happy with the weights, we save them once. We do not have to use a callback in this case; we can simply use the built-in `model.save_weights` method.

Let's check the code:

```python3
model = Sequential([
    #Layers
    #Layers
    #Layers
])
model.compile(optimizer='adam', loss='mse', metrics=['mae'])

model.fit(X, y, epochs=100)

model.save_weights('my_model')     # saves the weights in the native TensorFlow format, creating 3 files
model.save_weights('my_model.h5')  # saves in HDF5 format
```
-----

Explanation of saved files:
To see an explanation of the files created by the `ModelCheckpoint` callback, refer to [this](Explanation%20of%20saved%20files.ipynb) notebook.

----


## Loading Weights

Since we have only saved the weights of the model, not its architecture, we have to rebuild a model with the same architecture before loading the weights into it.

Taking our first example where we built a binary classifier, I will take the same model and rewrite it:

```python3
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.callbacks import ModelCheckpoint

model = Sequential([
    Dense(10, input_shape=(10,), activation='relu'),
    Dense(1, activation='sigmoid')
])

model.load_weights('my_model_weights')  # use the same file name you used to store the weights
```
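A quick way to sanity-check the whole save-and-restore cycle is to compare predictions made just before saving with predictions from the restored model. This is only a sketch: `X` stands for the same dummy input data used when fitting above.

```python3
import numpy as np

# Before saving: keep some reference predictions from the trained model
preds_before = model.predict(X)
model.save_weights('my_model_weights')

# ... later: rebuild the same architecture (as above) and load the weights back
model.load_weights('my_model_weights')
preds_after = model.predict(X)

# A correctly restored model reproduces the original predictions exactly
assert np.allclose(preds_before, preds_after)
```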

## Saving Criteria
Here we will learn how to save a model based on specific criteria. Let's say we want to save the weights of a model after every 5,000 training samples it has seen. We can do that with the following code:

```python3
from tensorflow.keras.callbacks import ModelCheckpoint

checkpoint_5k_path = 'model_checkpoint_5k/checkpoint_{epoch:02d}_{batch:04d}'

checkpoint_5k = ModelCheckpoint(filepath=checkpoint_5k_path, save_weights_only=True, save_freq=5000)
```
Notice that we have `{epoch:02d}_{batch:04d}` in our path, so the callback will not overwrite the weights; instead it will create new files for each epoch and batch.

There are several other important parameters of the `ModelCheckpoint` callback that can help you:

* `save_best_only`: if `True`, only the best weights are saved, judged by `monitor`, which can be any of your `loss` values or `metrics`, computed on the validation data if any.
* `monitor`: the quantity to track; used together with `save_best_only`.
* `mode`: used with `monitor`, already discussed earlier (e.g. `'max'` for a metric like accuracy, `'min'` for a loss).
* `save_freq`: can be set to `'epoch'` to save the weights at every epoch.

A sketch of `save_best_only`, `monitor` and `mode` working together is given below.
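For example, the following sketch (the file path and the validation split are chosen here just for illustration) keeps only the best weights seen during training, judged by the validation accuracy of the binary classifier we compiled earlier with `metrics=['acc']`:

```python3
from tensorflow.keras.callbacks import ModelCheckpoint

best_checkpoint = ModelCheckpoint(
    filepath='model_checkpoint_best/checkpoint',
    save_weights_only=True,
    save_best_only=True,  # only overwrite the file when the monitored value improves
    monitor='val_acc',    # matches the 'acc' metric the model was compiled with
    mode='max'            # a higher validation accuracy counts as better
)

model.fit(X, y, epochs=10, validation_split=0.2, callbacks=[best_checkpoint])
```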
You can learn more about it in [this](ProgrammingTutorial.ipynb) programming tutorial under *Model Saving Criteria*.

----

## Saving and loading the Entire Model with Architecture

Sometimes you have to save not only the weights but also the model architecture. The basic code to do this is:

```python3
from tensorflow.keras.callbacks import ModelCheckpoint

checkpoint = ModelCheckpoint('my_model_arch/my_model', save_weights_only=False)
```
By default `save_weights_only` is set to `False`, so there is no need to mention it explicitly.

Our folder directory would be
* my_model_arch/my_model/assets
* my_model_arch/my_model/saved_model.pb
* my_model_arch/my_model/variables/variables.data-00000-of-00001
* my_model_arch/my_model/variables/variables.index


We can also save it in `.h5` format by just adding `.h5` at the end of the file path. In that case only 1 file would be saved.

**Manually Saving the Model**

We can manually save the entire model, architecture included, using `model.save('filepath')` or `model.save('filepath.h5')`.

**Loading the Entire Model**

To load the entire model, we can use the `load_model` function which is built into Keras.

```python3
from tensorflow.keras.models import load_model

new_model = load_model('my_model')

new_model_h5 = load_model('model.h5')
```

To learn more about it, refer to [this](ProgrammingTutorial.ipynb) notebook under Saving the Entire model; an in-depth treatment of this topic is given in [this](Saving%20model%20architecture%20only.ipynb) notebook.

## Loading Pre-Trained Keras Models

Keras provides well-known deep learning models, their architectures, and their pre-trained weights. The first time you load a model, its weights are automatically downloaded to `~/.keras/models`. Famous available models are
* Xception
* VGG16
* VGG19
* ResNet / ResNet v2
* Inception v3
* Inception ResNet v2
* MobileNet / MobileNet v2
* DenseNet
* NASNet

To use one, we import it from `tensorflow.keras.applications`. Let's see the code.

```python3
from tensorflow.keras.applications.resnet50 import ResNet50

model = ResNet50(weights='imagenet')  # weights pre-trained on the ImageNet dataset
```

Let's say we do not want the ImageNet weights; then we can use

`model = ResNet50(weights=None)`, in which case the model is randomly initialised and we have to train it from scratch.

We can use it for **Transfer Learning** if we exclude the top dense layers. We can do this with
```python3
model = ResNet50(weights='imagenet', include_top=False)
```
#### Prediction using pre-trained ResNet50

```python3
from tensorflow.keras.applications.resnet50 import ResNet50, preprocess_input, decode_predictions
from tensorflow.keras.preprocessing import image
import numpy as np

model = ResNet50(weights='imagenet', include_top=True)

image_input = image.load_img('path', target_size=(224, 224))  # 224x224 is the input size for ResNet50
image_input = image.img_to_array(image_input)
image_input = preprocess_input(image_input[np.newaxis, ...])  # add a new axis for the batch dimension

preds = model.predict(image_input)

decoded_pred = decode_predictions(preds)[0]

print(f"Your image is of {decoded_pred}")

```
You can learn more about it in [this](ProgrammingTutorial.ipynb) notebook under the "Loading pre-trained Keras models" section.

-----

## TensorFlow Hub Models
TensorFlow also provides TensorFlow Hub models, which are focused on network modules; you can think of them as separate, reusable components of a TensorFlow graph.

TensorFlow Hub is a separate library, and you need to install it:

```bash
$ conda activate yourDeepLearningVenv
$ pip install "tensorflow>=2.0.0"
$ pip install --upgrade tensorflow-hub
```
You can browse all the available models at [tfhub](https://tfhub.dev/).
To use them:
```python3
from tensorflow.keras.models import Sequential
import tensorflow_hub as hub

model_url = "https://tfhub.dev/google/imagenet/mobilenet_v1_050_160/classification/4"  # taken from the documentation

module = Sequential([
    hub.KerasLayer(model_url)  # pass the URL variable, not the literal string "model_url"
])

module.build(input_shape=[None, 160, 160, 3])
```
The output of this model covers 1001 classes, whose labels you can find in the documentation.
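As a rough sketch of running a prediction with the module built above (the random array is only a stand-in for a real, preprocessed image batch):

```python3
import numpy as np

# Stand-in for a single 160x160 RGB image scaled to [0, 1], with a batch dimension
img_batch = np.random.rand(1, 160, 160, 3).astype(np.float32)

logits = module.predict(img_batch)  # shape (1, 1001)
print(np.argmax(logits, axis=-1))   # index into the module's 1001 class labels
```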
10 | ] 11 | }, 12 | { 13 | "cell_type": "code", 14 | "execution_count": 1, 15 | "metadata": {}, 16 | "outputs": [ 17 | { 18 | "name": "stdout", 19 | "output_type": "stream", 20 | "text": [ 21 | "2.2.0\n" 22 | ] 23 | } 24 | ], 25 | "source": [ 26 | "%matplotlib inline\n", 27 | "import tensorflow as tf\n", 28 | "import pandas as pd\n", 29 | "print(tf.__version__)" 30 | ] 31 | }, 32 | { 33 | "cell_type": "markdown", 34 | "metadata": {}, 35 | "source": [ 36 | "### Default weights and biases\n", 37 | "\n", 38 | "In the models we have worked with so far, we have not specified the initial values of the weights and biases in each layer of our neural networks.\n", 39 | "\n", 40 | "The default values of the weights and biases in TensorFlow depend on the type of layers we are using. \n", 41 | "\n", 42 | "For example, in a `Dense` layer, the biases are set to zero (`zeros`) by default, while the weights are set according to `glorot_uniform`, the Glorot uniform initialiser. \n", 43 | "\n", 44 | "The Glorot uniform initialiser draws the weights uniformly at random from the closed interval $[-c,c]$, where $$c = \\sqrt{\\frac{6}{n_{input}+n_{output}}}$$" 45 | ] 46 | }, 47 | { 48 | "cell_type": "markdown", 49 | "metadata": {}, 50 | "source": [ 51 | "and $n_{input}$ and $n_{output}$ are the number of inputs to, and outputs from the layer respectively." 52 | ] 53 | }, 54 | { 55 | "cell_type": "markdown", 56 | "metadata": {}, 57 | "source": [ 58 | "### Initialising your own weights and biases\n", 59 | "We often would like to initialise our own weights and biases, and TensorFlow makes this process quite straightforward.\n", 60 | "\n", 61 | "When we construct a model in TensorFlow, each layer has optional arguments `kernel_initialiser` and `bias_initialiser`, which are used to set the weights and biases respectively.\n", 62 | "\n", 63 | "If a layer has no weights or biases (e.g. it is a max pooling layer), then trying to set either `kernel_initialiser` or `bias_initialiser` will throw an error.\n", 64 | "\n", 65 | "Let's see an example, which uses some of the different initialisations available in Keras." 66 | ] 67 | }, 68 | { 69 | "cell_type": "code", 70 | "execution_count": null, 71 | "metadata": {}, 72 | "outputs": [], 73 | "source": [ 74 | "from tensorflow.keras.models import Sequential\n", 75 | "from tensorflow.keras.layers import Flatten, Dense, Conv1D, MaxPooling1D " 76 | ] 77 | }, 78 | { 79 | "cell_type": "code", 80 | "execution_count": null, 81 | "metadata": {}, 82 | "outputs": [], 83 | "source": [ 84 | "# Construct a model\n", 85 | "\n", 86 | "model = Sequential([\n", 87 | " Conv1D(filters=16, kernel_size=3, input_shape=(128, 64), kernel_initializer='random_uniform', bias_initializer=\"zeros\", activation='relu'),\n", 88 | " MaxPooling1D(pool_size=4),\n", 89 | " Flatten(),\n", 90 | " Dense(64, kernel_initializer='he_uniform', bias_initializer='ones', activation='relu'),\n", 91 | "])" 92 | ] 93 | }, 94 | { 95 | "cell_type": "markdown", 96 | "metadata": {}, 97 | "source": [ 98 | "As the following example illustrates, we can also instantiate initialisers in a slightly different manner, allowing us to set optional arguments of the initialisation method." 
99 | ] 100 | }, 101 | { 102 | "cell_type": "code", 103 | "execution_count": null, 104 | "metadata": {}, 105 | "outputs": [], 106 | "source": [ 107 | "# Add some layers to our model\n", 108 | "\n", 109 | "model.add(Dense(64, \n", 110 | " kernel_initializer=tf.keras.initializers.RandomNormal(mean=0.0, stddev=0.05), \n", 111 | " bias_initializer=tf.keras.initializers.Constant(value=0.4), \n", 112 | " activation='relu'),)\n", 113 | "\n", 114 | "model.add(Dense(8, \n", 115 | " kernel_initializer=tf.keras.initializers.Orthogonal(gain=1.0, seed=None), \n", 116 | " bias_initializer=tf.keras.initializers.Constant(value=0.4), \n", 117 | " activation='relu'))" 118 | ] 119 | }, 120 | { 121 | "cell_type": "markdown", 122 | "metadata": {}, 123 | "source": [ 124 | "### Custom weight and bias initialisers\n", 125 | "It is also possible to define your own weight and bias initialisers.\n", 126 | "Initializers must take in two arguments, the `shape` of the tensor to be initialised, and its `dtype`.\n", 127 | "\n", 128 | "Here is a small example, which also shows how you can use your custom initializer in a layer." 129 | ] 130 | }, 131 | { 132 | "cell_type": "code", 133 | "execution_count": null, 134 | "metadata": {}, 135 | "outputs": [], 136 | "source": [ 137 | "import tensorflow.keras.backend as K" 138 | ] 139 | }, 140 | { 141 | "cell_type": "code", 142 | "execution_count": null, 143 | "metadata": {}, 144 | "outputs": [], 145 | "source": [ 146 | "# Define a custom initializer\n", 147 | "\n", 148 | "def my_init(shape, dtype=None):\n", 149 | " return K.random_normal(shape, dtype=dtype)\n", 150 | "\n", 151 | "model.add(Dense(64, kernel_initializer=my_init))" 152 | ] 153 | }, 154 | { 155 | "cell_type": "markdown", 156 | "metadata": {}, 157 | "source": [ 158 | "Let's take a look at the summary of our finalised model." 159 | ] 160 | }, 161 | { 162 | "cell_type": "code", 163 | "execution_count": null, 164 | "metadata": {}, 165 | "outputs": [], 166 | "source": [ 167 | "# Print the model summary\n", 168 | "\n", 169 | "model.summary()" 170 | ] 171 | }, 172 | { 173 | "cell_type": "markdown", 174 | "metadata": {}, 175 | "source": [ 176 | "### Visualising the initialised weights and biases" 177 | ] 178 | }, 179 | { 180 | "cell_type": "markdown", 181 | "metadata": {}, 182 | "source": [ 183 | "Finally, we can see the effect of our initialisers on the weights and biases by plotting histograms of the resulting values. Compare these plots with the selected initialisers for each layer above." 
184 | ] 185 | }, 186 | { 187 | "cell_type": "code", 188 | "execution_count": null, 189 | "metadata": {}, 190 | "outputs": [], 191 | "source": [ 192 | "import matplotlib.pyplot as plt" 193 | ] 194 | }, 195 | { 196 | "cell_type": "code", 197 | "execution_count": null, 198 | "metadata": {}, 199 | "outputs": [], 200 | "source": [ 201 | "# Plot histograms of weight and bias values\n", 202 | "\n", 203 | "fig, axes = plt.subplots(5, 2, figsize=(12,16))\n", 204 | "fig.subplots_adjust(hspace=0.5, wspace=0.5)\n", 205 | "\n", 206 | "# Filter out the pooling and flatten layers, that don't have any weights\n", 207 | "weight_layers = [layer for layer in model.layers if len(layer.weights) > 0]\n", 208 | "\n", 209 | "for i, layer in enumerate(weight_layers):\n", 210 | " for j in [0, 1]:\n", 211 | " axes[i, j].hist(layer.weights[j].numpy().flatten(), align='left')\n", 212 | " axes[i, j].set_title(layer.weights[j].name)" 213 | ] 214 | }, 215 | { 216 | "cell_type": "markdown", 217 | "metadata": {}, 218 | "source": [ 219 | "## Further reading and resources \n", 220 | "* https://keras.io/initializers/\n", 221 | "* https://www.tensorflow.org/versions/r2.0/api_docs/python/tf/keras/initializers" 222 | ] 223 | } 224 | ], 225 | "metadata": { 226 | "kernelspec": { 227 | "display_name": "Python 3", 228 | "language": "python", 229 | "name": "python3" 230 | }, 231 | "language_info": { 232 | "codemirror_mode": { 233 | "name": "ipython", 234 | "version": 3 235 | }, 236 | "file_extension": ".py", 237 | "mimetype": "text/x-python", 238 | "name": "python", 239 | "nbconvert_exporter": "python", 240 | "pygments_lexer": "ipython3", 241 | "version": "3.7.7" 242 | } 243 | }, 244 | "nbformat": 4, 245 | "nbformat_minor": 4 246 | } 247 | -------------------------------------------------------------------------------- /TF.Keras Sequential API Basics/MNISIT.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": { 6 | "id": "O-21wiLf-gCD", 7 | "colab_type": "text" 8 | }, 9 | "source": [ 10 | "# Programming Assignment" 11 | ] 12 | }, 13 | { 14 | "cell_type": "markdown", 15 | "metadata": { 16 | "id": "fxkainBa-gCF", 17 | "colab_type": "text" 18 | }, 19 | "source": [ 20 | "## CNN classifier for the MNIST dataset" 21 | ] 22 | }, 23 | { 24 | "cell_type": "markdown", 25 | "metadata": { 26 | "id": "XQKECTiE-gCG", 27 | "colab_type": "text" 28 | }, 29 | "source": [ 30 | "### Instructions\n", 31 | "\n", 32 | "In this notebook, you will write code to build, compile and fit a convolutional neural network (CNN) model to the MNIST dataset of images of handwritten digits.\n", 33 | "\n", 34 | "Some code cells are provided you in the notebook. You should avoid editing provided code, and make sure to execute the cells in order to avoid unexpected errors. Some cells begin with the line: \n", 35 | "\n", 36 | "`#### GRADED CELL ####`\n", 37 | "\n", 38 | "Don't move or edit this first line - this is what the automatic grader looks for to recognise graded cells. These cells require you to write your own code to complete them, and are automatically graded when you submit the notebook. Don't edit the function name or signature provided in these cells, otherwise the automatic grader might not function properly. 
Inside these graded cells, you can use any functions or classes that are imported below, but make sure you don't use any variables that are outside the scope of the function.\n", 39 | "\n", 40 | "### How to submit\n", 41 | "\n", 42 | "Complete all the tasks you are asked for in the worksheet. When you have finished and are happy with your code, press the **Submit Assignment** button at the top of this notebook.\n", 43 | "\n", 44 | "### Let's get started!\n", 45 | "\n", 46 | "We'll start running some imports, and loading the dataset. Do not edit the existing imports in the following cell. If you would like to make further Tensorflow imports, you should add them here." 47 | ] 48 | }, 49 | { 50 | "cell_type": "code", 51 | "metadata": { 52 | "id": "eR7qaZCl-gCJ", 53 | "colab_type": "code", 54 | "colab": {} 55 | }, 56 | "source": [ 57 | "#### PACKAGE IMPORTS ####\n", 58 | "\n", 59 | "# Run this cell first to import all required packages. Do not make any imports elsewhere in the notebook\n", 60 | "\n", 61 | "import tensorflow as tf\n", 62 | "import pandas as pd\n", 63 | "import numpy as np\n", 64 | "import matplotlib.pyplot as plt\n", 65 | "from tensorflow.keras.models import Sequential\n", 66 | "from tensorflow.keras.layers import Dense, Conv2D, Flatten, MaxPool2D\n", 67 | "%matplotlib inline\n", 68 | "\n", 69 | "# If you would like to make further imports from Tensorflow, add them here\n", 70 | "\n" 71 | ], 72 | "execution_count": null, 73 | "outputs": [] 74 | }, 75 | { 76 | "cell_type": "markdown", 77 | "metadata": { 78 | "id": "VOQk31Sc-gCN", 79 | "colab_type": "text" 80 | }, 81 | "source": [ 82 | "#### The MNIST dataset\n", 83 | "\n", 84 | "In this assignment, you will use the [MNIST dataset](http://yann.lecun.com/exdb/mnist/). It consists of a training set of 60,000 handwritten digits with corresponding labels, and a test set of 10,000 images. The images have been normalised and centred. The dataset is frequently used in machine learning research, and has become a standard benchmark for image classification models. \n", 85 | "\n", 86 | "- Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner. \"Gradient-based learning applied to document recognition.\" Proceedings of the IEEE, 86(11):2278-2324, November 1998.\n", 87 | "\n", 88 | "Your goal is to construct a neural network that classifies images of handwritten digits into one of 10 classes." 89 | ] 90 | }, 91 | { 92 | "cell_type": "markdown", 93 | "metadata": { 94 | "id": "mOxMGi5e-gCP", 95 | "colab_type": "text" 96 | }, 97 | "source": [ 98 | "#### Load and preprocess the data" 99 | ] 100 | }, 101 | { 102 | "cell_type": "code", 103 | "metadata": { 104 | "id": "8zzRQzxA-gCQ", 105 | "colab_type": "code", 106 | "colab": { 107 | "base_uri": "https://localhost:8080/", 108 | "height": 51 109 | }, 110 | "outputId": "0f6024c0-c5ae-446d-fe38-877a74e2441a" 111 | }, 112 | "source": [ 113 | "# Run this cell to load the MNIST data\n", 114 | "\n", 115 | "mnist_data = tf.keras.datasets.mnist\n", 116 | "(train_images, train_labels), (test_images, test_labels) = mnist_data.load_data()" 117 | ], 118 | "execution_count": null, 119 | "outputs": [] 120 | }, 121 | { 122 | "cell_type": "markdown", 123 | "metadata": { 124 | "id": "MEeA_9-6-gCV", 125 | "colab_type": "text" 126 | }, 127 | "source": [ 128 | "First, preprocess the data by scaling the training and test images so their values lie in the range from 0 to 1." 
129 | ] 130 | }, 131 | { 132 | "cell_type": "code", 133 | "metadata": { 134 | "id": "8AW1YX_9-gCX", 135 | "colab_type": "code", 136 | "colab": {} 137 | }, 138 | "source": [ 139 | "#### GRADED CELL ####\n", 140 | "\n", 141 | "# Complete the following function. \n", 142 | "# Make sure to not change the function name or arguments.\n", 143 | "\n", 144 | "def scale_mnist_data(train_images, test_images):\n", 145 | " \"\"\"\n", 146 | " This function takes in the training and test images as loaded in the cell above, and scales them\n", 147 | " so that they have minimum and maximum values equal to 0 and 1 respectively.\n", 148 | " Your function should return a tuple (train_images, test_images) of scaled training and test images.\n", 149 | " \"\"\"\n", 150 | " return (train_images/255, test_images/255)\n", 151 | " \n", 152 | " " 153 | ], 154 | "execution_count": null, 155 | "outputs": [] 156 | }, 157 | { 158 | "cell_type": "code", 159 | "metadata": { 160 | "id": "XgMBPB9d-gCa", 161 | "colab_type": "code", 162 | "colab": {} 163 | }, 164 | "source": [ 165 | "# Run your function on the input data\n", 166 | "\n", 167 | "scaled_train_images, scaled_test_images = scale_mnist_data(train_images, test_images)" 168 | ], 169 | "execution_count": null, 170 | "outputs": [] 171 | }, 172 | { 173 | "cell_type": "code", 174 | "metadata": { 175 | "id": "YbhJPs-Sjvlt", 176 | "colab_type": "code", 177 | "colab": { 178 | "base_uri": "https://localhost:8080/", 179 | "height": 34 180 | }, 181 | "outputId": "3f0f59df-95c0-4488-a31f-1b13908426e4" 182 | }, 183 | "source": [ 184 | "scaled_train_images.shape" 185 | ], 186 | "execution_count": null, 187 | "outputs": [] 188 | }, 189 | { 190 | "cell_type": "code", 191 | "metadata": { 192 | "id": "g1r-ULOQv2o3", 193 | "colab_type": "code", 194 | "colab": {} 195 | }, 196 | "source": [ 197 | "# Add a dummy channel dimension\n", 198 | "\n", 199 | "scaled_train_images = scaled_train_images[..., np.newaxis]\n", 200 | "scaled_test_images = scaled_test_images[..., np.newaxis]" 201 | ], 202 | "execution_count": null, 203 | "outputs": [] 204 | }, 205 | { 206 | "cell_type": "code", 207 | "metadata": { 208 | "id": "5hqIOj9tj2bF", 209 | "colab_type": "code", 210 | "colab": { 211 | "base_uri": "https://localhost:8080/", 212 | "height": 51 213 | }, 214 | "outputId": "cdb0943f-a7fc-4545-9d4d-756540a557e2" 215 | }, 216 | "source": [ 217 | "print(scaled_train_images.shape)\n", 218 | "print(train_labels.shape)" 219 | ], 220 | "execution_count": null, 221 | "outputs": [] 222 | }, 223 | { 224 | "cell_type": "markdown", 225 | "metadata": { 226 | "id": "Cy--eSWq-gCc", 227 | "colab_type": "text" 228 | }, 229 | "source": [ 230 | "#### Build the convolutional neural network model" 231 | ] 232 | }, 233 | { 234 | "cell_type": "markdown", 235 | "metadata": { 236 | "id": "5rnippry-gCd", 237 | "colab_type": "text" 238 | }, 239 | "source": [ 240 | "We are now ready to construct a model to fit to the data. Using the Sequential API, build your CNN model according to the following spec:\n", 241 | "\n", 242 | "* The model should use the `input_shape` in the function argument to set the input size in the first layer.\n", 243 | "* A 2D convolutional layer with a 3x3 kernel and 8 filters. Use 'SAME' zero padding and ReLU activation functions. 
Make sure to provide the `input_shape` keyword argument in this first layer.\n", 244 | "* A max pooling layer, with a 2x2 window, and default strides.\n", 245 | "* A flatten layer, which unrolls the input into a one-dimensional tensor.\n", 246 | "* Two dense hidden layers, each with 64 units and ReLU activation functions.\n", 247 | "* A dense output layer with 10 units and the softmax activation function.\n", 248 | "\n", 249 | "In particular, your neural network should have six layers." 250 | ] 251 | }, 252 | { 253 | "cell_type": "code", 254 | "metadata": { 255 | "id": "N-N7ArQ1-gCe", 256 | "colab_type": "code", 257 | "colab": {} 258 | }, 259 | "source": [ 260 | "#### GRADED CELL ####\n", 261 | "\n", 262 | "# Complete the following function. \n", 263 | "# Make sure to not change the function name or arguments.\n", 264 | "\n", 265 | "def get_model(input_shape):\n", 266 | " \"\"\"\n", 267 | " This function should build a Sequential model according to the above specification. Ensure the \n", 268 | " weights are initialised by providing the input_shape argument in the first layer, given by the\n", 269 | " function argument.\n", 270 | " Your function should return the model.\n", 271 | " \"\"\"\n", 272 | "\n", 273 | " model = Sequential([\n", 274 | " Conv2D(filters=8, kernel_size=3, padding='same', activation= 'relu', input_shape=input_shape),\n", 275 | " MaxPool2D((2,2)),\n", 276 | " Flatten(),\n", 277 | " Dense(64, activation='relu'),\n", 278 | " Dense(64, activation='relu'),\n", 279 | " Dense(10, activation='softmax')\n", 280 | "\n", 281 | " ])\n", 282 | " return model\n", 283 | " " 284 | ], 285 | "execution_count": null, 286 | "outputs": [] 287 | }, 288 | { 289 | "cell_type": "code", 290 | "metadata": { 291 | "id": "9L_2kj9A-gCi", 292 | "colab_type": "code", 293 | "colab": { 294 | "base_uri": "https://localhost:8080/", 295 | "height": 357 296 | }, 297 | "outputId": "3790d7f4-f9e4-43f1-91ef-53466ee1efd3" 298 | }, 299 | "source": [ 300 | "# Run your function to get the model\n", 301 | "\n", 302 | "model = get_model(scaled_train_images[0].shape)\n", 303 | "model.summary()" 304 | ], 305 | "execution_count": null, 306 | "outputs": [] 307 | }, 308 | { 309 | "cell_type": "markdown", 310 | "metadata": { 311 | "id": "uvrW1EA1-gCl", 312 | "colab_type": "text" 313 | }, 314 | "source": [ 315 | "#### Compile the model\n", 316 | "\n", 317 | "You should now compile the model using the `compile` method. To do so, you need to specify an optimizer, a loss function and a metric to judge the performance of your model." 318 | ] 319 | }, 320 | { 321 | "cell_type": "code", 322 | "metadata": { 323 | "id": "_x9mU2Li-gCm", 324 | "colab_type": "code", 325 | "colab": {} 326 | }, 327 | "source": [ 328 | "#### GRADED CELL ####\n", 329 | "\n", 330 | "# Complete the following function. \n", 331 | "# Make sure to not change the function name or arguments.\n", 332 | "\n", 333 | "def compile_model(model):\n", 334 | " \"\"\"\n", 335 | " This function takes in the model returned from your get_model function, and compiles it with an optimiser,\n", 336 | " loss function and metric.\n", 337 | " Compile the model using the Adam optimiser (with default settings), the cross-entropy loss function and\n", 338 | " accuracy as the only metric. 
\n", 339 | " Your function doesn't need to return anything; the model will be compiled in-place.\n", 340 | " \"\"\"\n", 341 | " model.compile(\n", 342 | " optimizer='adam',\n", 343 | " loss = 'sparse_categorical_crossentropy',\n", 344 | " metrics=['accuracy']\n", 345 | " )\n", 346 | " " 347 | ], 348 | "execution_count": null, 349 | "outputs": [] 350 | }, 351 | { 352 | "cell_type": "code", 353 | "metadata": { 354 | "id": "pY08R9yB-gCr", 355 | "colab_type": "code", 356 | "colab": {} 357 | }, 358 | "source": [ 359 | "# Run your function to compile the model\n", 360 | "\n", 361 | "compile_model(model)" 362 | ], 363 | "execution_count": null, 364 | "outputs": [] 365 | }, 366 | { 367 | "cell_type": "markdown", 368 | "metadata": { 369 | "id": "pHUcXibk-gCv", 370 | "colab_type": "text" 371 | }, 372 | "source": [ 373 | "#### Fit the model to the training data\n", 374 | "\n", 375 | "Now you should train the model on the MNIST dataset, using the model's `fit` method. Set the training to run for 5 epochs, and return the training history to be used for plotting the learning curves." 376 | ] 377 | }, 378 | { 379 | "cell_type": "code", 380 | "metadata": { 381 | "id": "cDnNXqN1-gCw", 382 | "colab_type": "code", 383 | "colab": {} 384 | }, 385 | "source": [ 386 | "#### GRADED CELL ####\n", 387 | "\n", 388 | "# Complete the following function. \n", 389 | "# Make sure to not change the function name or arguments.\n", 390 | "\n", 391 | "def train_model(model, scaled_train_images, train_labels):\n", 392 | " \"\"\"\n", 393 | " This function should train the model for 5 epochs on the scaled_train_images and train_labels. \n", 394 | " Your function should return the training history, as returned by model.fit.\n", 395 | " \"\"\"\n", 396 | " history = model.fit(scaled_train_images, train_labels, epochs=5, batch_size=256)\n", 397 | " return history" 398 | ], 399 | "execution_count": null, 400 | "outputs": [] 401 | }, 402 | { 403 | "cell_type": "code", 404 | "metadata": { 405 | "id": "Y1n3wh49-gCz", 406 | "colab_type": "code", 407 | "colab": { 408 | "base_uri": "https://localhost:8080/", 409 | "height": 187 410 | }, 411 | "outputId": "f8f5f8d3-24a0-468e-d500-86924c5559ba" 412 | }, 413 | "source": [ 414 | "# Run your function to train the model\n", 415 | "\n", 416 | "history = train_model(model, scaled_train_images, train_labels)" 417 | ], 418 | "execution_count": null, 419 | "outputs": [] 420 | }, 421 | { 422 | "cell_type": "markdown", 423 | "metadata": { 424 | "id": "rhd3yK0i-gC3", 425 | "colab_type": "text" 426 | }, 427 | "source": [ 428 | "#### Plot the learning curves\n", 429 | "\n", 430 | "We will now plot two graphs:\n", 431 | "* Epoch vs accuracy\n", 432 | "* Epoch vs loss\n", 433 | "\n", 434 | "We will load the model history into a pandas `DataFrame` and use the `plot` method to output the required graphs." 
435 | ] 436 | }, 437 | { 438 | "cell_type": "code", 439 | "metadata": { 440 | "id": "y0t2Xjgq-gC4", 441 | "colab_type": "code", 442 | "colab": {} 443 | }, 444 | "source": [ 445 | "# Run this cell to load the model history into a pandas DataFrame\n", 446 | "\n", 447 | "frame = pd.DataFrame(history.history)" 448 | ], 449 | "execution_count": null, 450 | "outputs": [] 451 | }, 452 | { 453 | "cell_type": "code", 454 | "metadata": { 455 | "id": "xQqYQiR4-gC7", 456 | "colab_type": "code", 457 | "colab": { 458 | "base_uri": "https://localhost:8080/", 459 | "height": 312 460 | }, 461 | "outputId": "1aef4154-2fd7-45ab-a7b1-935206e1a038" 462 | }, 463 | "source": [ 464 | "# Run this cell to make the Accuracy vs Epochs plot\n", 465 | "\n", 466 | "acc_plot = frame.plot(y=\"accuracy\", title=\"Accuracy vs Epochs\", legend=False)\n", 467 | "acc_plot.set(xlabel=\"Epochs\", ylabel=\"Accuracy\")" 468 | ], 469 | "execution_count": null, 470 | "outputs": [] 471 | }, 472 | { 473 | "cell_type": "code", 474 | "metadata": { 475 | "id": "JGgTGfH4-gDA", 476 | "colab_type": "code", 477 | "colab": { 478 | "base_uri": "https://localhost:8080/", 479 | "height": 312 480 | }, 481 | "outputId": "723acd80-8bb4-41e1-9d31-e2513cfbcc1d" 482 | }, 483 | "source": [ 484 | "# Run this cell to make the Loss vs Epochs plot\n", 485 | "\n", 486 | "acc_plot = frame.plot(y=\"loss\", title = \"Loss vs Epochs\",legend=False)\n", 487 | "acc_plot.set(xlabel=\"Epochs\", ylabel=\"Loss\")" 488 | ], 489 | "execution_count": null, 490 | "outputs": [] 491 | }, 492 | { 493 | "cell_type": "markdown", 494 | "metadata": { 495 | "id": "ziq-tFlU-gDD", 496 | "colab_type": "text" 497 | }, 498 | "source": [ 499 | "#### Evaluate the model\n", 500 | "\n", 501 | "Finally, you should evaluate the performance of your model on the test set, by calling the model's `evaluate` method." 502 | ] 503 | }, 504 | { 505 | "cell_type": "code", 506 | "metadata": { 507 | "id": "CSqA8zUi-gDE", 508 | "colab_type": "code", 509 | "colab": {} 510 | }, 511 | "source": [ 512 | "#### GRADED CELL ####\n", 513 | "\n", 514 | "# Complete the following function. \n", 515 | "# Make sure to not change the function name or arguments.\n", 516 | "\n", 517 | "def evaluate_model(model, scaled_test_images, test_labels):\n", 518 | " \"\"\"\n", 519 | " This function should evaluate the model on the scaled_test_images and test_labels. \n", 520 | " Your function should return a tuple (test_loss, test_accuracy).\n", 521 | " \"\"\"\n", 522 | " (test_loss, test_accuracy) = model.evaluate(scaled_test_images, test_labels)\n", 523 | " return (test_loss, test_accuracy)\n", 524 | " " 525 | ], 526 | "execution_count": null, 527 | "outputs": [] 528 | }, 529 | { 530 | "cell_type": "code", 531 | "metadata": { 532 | "id": "SSNhInQD-gDG", 533 | "colab_type": "code", 534 | "colab": { 535 | "base_uri": "https://localhost:8080/", 536 | "height": 68 537 | }, 538 | "outputId": "0bd678e6-ca72-43a8-ea7a-9ce7aa09c376" 539 | }, 540 | "source": [ 541 | "# Run your function to evaluate the model\n", 542 | "\n", 543 | "test_loss, test_accuracy = evaluate_model(model, scaled_test_images, test_labels)\n", 544 | "print(f\"Test loss: {test_loss}\")\n", 545 | "print(f\"Test accuracy: {test_accuracy}\")" 546 | ], 547 | "execution_count": null, 548 | "outputs": [] 549 | }, 550 | { 551 | "cell_type": "markdown", 552 | "metadata": { 553 | "id": "SP09yVMK-gDK", 554 | "colab_type": "text" 555 | }, 556 | "source": [ 557 | "#### Model predictions\n", 558 | "\n", 559 | "Let's see some model predictions! 
We will randomly select four images from the test data, and display the image and label for each. \n", 560 | "\n", 561 | "For each test image, model's prediction (the label with maximum probability) is shown, together with a plot showing the model's categorical distribution." 562 | ] 563 | }, 564 | { 565 | "cell_type": "code", 566 | "metadata": { 567 | "id": "ZrUM42t_-gDL", 568 | "colab_type": "code", 569 | "colab": { 570 | "base_uri": "https://localhost:8080/", 571 | "height": 716 572 | }, 573 | "outputId": "c3655154-ea5a-4a56-8f4b-bc3ec288cb07" 574 | }, 575 | "source": [ 576 | "# Run this cell to get model predictions on randomly selected test images\n", 577 | "\n", 578 | "num_test_images = scaled_test_images.shape[0]\n", 579 | "\n", 580 | "random_inx = np.random.choice(num_test_images, 4)\n", 581 | "random_test_images = scaled_test_images[random_inx, ...]\n", 582 | "random_test_labels = test_labels[random_inx, ...]\n", 583 | "\n", 584 | "predictions = model.predict(random_test_images)\n", 585 | "\n", 586 | "fig, axes = plt.subplots(4, 2, figsize=(16, 12))\n", 587 | "fig.subplots_adjust(hspace=0.4, wspace=-0.2)\n", 588 | "\n", 589 | "for i, (prediction, image, label) in enumerate(zip(predictions, random_test_images, random_test_labels)):\n", 590 | " axes[i, 0].imshow(np.squeeze(image))\n", 591 | " axes[i, 0].get_xaxis().set_visible(False)\n", 592 | " axes[i, 0].get_yaxis().set_visible(False)\n", 593 | " axes[i, 0].text(10., -1.5, f'Digit {label}')\n", 594 | " axes[i, 1].bar(np.arange(len(prediction)), prediction)\n", 595 | " axes[i, 1].set_xticks(np.arange(len(prediction)))\n", 596 | " axes[i, 1].set_title(f\"Categorical distribution. Model prediction: {np.argmax(prediction)}\")\n", 597 | " \n", 598 | "plt.show()" 599 | ], 600 | "execution_count": null, 601 | "outputs": [] 602 | }, 603 | { 604 | "cell_type": "markdown", 605 | "metadata": { 606 | "id": "_y6mwJLs-gDP", 607 | "colab_type": "text" 608 | }, 609 | "source": [ 610 | "Congratulations for completing this programming assignment! In the next week of the course we will take a look at including validation and regularisation in our model training, and introduce Keras callbacks." 611 | ] 612 | } 613 | ], 614 | "metadata": { 615 | "coursera": { 616 | "course_slug": "tensor-flow-2-1", 617 | "graded_item_id": "g0YqY", 618 | "launcher_item_id": "N6gmY" 619 | }, 620 | "kernelspec": { 621 | "display_name": "Python 3", 622 | "language": "python", 623 | "name": "python3" 624 | }, 625 | "language_info": { 626 | "codemirror_mode": { 627 | "name": "ipython", 628 | "version": 3 629 | }, 630 | "file_extension": ".py", 631 | "mimetype": "text/x-python", 632 | "name": "python", 633 | "nbconvert_exporter": "python", 634 | "pygments_lexer": "ipython3", 635 | "version": "3.7.1" 636 | }, 637 | "colab": { 638 | "name": "Week 2 Programming Assignment.ipynb", 639 | "provenance": [], 640 | "collapsed_sections": [] 641 | }, 642 | "accelerator": "GPU" 643 | }, 644 | "nbformat": 4, 645 | "nbformat_minor": 0 646 | } -------------------------------------------------------------------------------- /TF.Keras Sequential API Basics/Metrics.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "# Metrics in Keras" 8 | ] 9 | }, 10 | { 11 | "cell_type": "markdown", 12 | "metadata": {}, 13 | "source": [ 14 | "In this reading we will be exploring the different metrics in Keras that may be used to judge the performance of a model." 
15 | ] 16 | }, 17 | { 18 | "cell_type": "code", 19 | "execution_count": null, 20 | "metadata": {}, 21 | "outputs": [], 22 | "source": [ 23 | "import tensorflow as tf\n", 24 | "from tensorflow.keras.models import Sequential\n", 25 | "from tensorflow.keras.layers import Dense, Flatten\n", 26 | "import tensorflow.keras.backend as K\n", 27 | "print(tf.__version__)" 28 | ] 29 | }, 30 | { 31 | "cell_type": "markdown", 32 | "metadata": {}, 33 | "source": [ 34 | "One of the most common metrics used for classification problems in Keras is `'accuracy'`. \n", 35 | "\n", 36 | "We will begin with a simple example of a model that uses accuracy as a metric." 37 | ] 38 | }, 39 | { 40 | "cell_type": "code", 41 | "execution_count": null, 42 | "metadata": {}, 43 | "outputs": [], 44 | "source": [ 45 | "# Build the model\n", 46 | "\n", 47 | "model = Sequential([\n", 48 | " Flatten(input_shape=(28,28)),\n", 49 | " Dense(32, activation='relu'),\n", 50 | " Dense(32, activation='tanh'),\n", 51 | " Dense(10, activation='softmax'),\n", 52 | "])" 53 | ] 54 | }, 55 | { 56 | "cell_type": "code", 57 | "execution_count": null, 58 | "metadata": {}, 59 | "outputs": [], 60 | "source": [ 61 | "# Compile the model\n", 62 | "\n", 63 | "model.compile(optimizer='adam',\n", 64 | " loss='sparse_categorical_crossentropy',\n", 65 | " metrics=['accuracy'])" 66 | ] 67 | }, 68 | { 69 | "cell_type": "markdown", 70 | "metadata": {}, 71 | "source": [ 72 | "We now have a model that uses accuracy as a metric to judge its performance.\n", 73 | "\n", 74 | "But how is this metric actually calculated? We will break our discussion into two cases.\n" 75 | ] 76 | }, 77 | { 78 | "cell_type": "markdown", 79 | "metadata": {}, 80 | "source": [ 81 | "### Case 1 - Binary Classification with sigmoid activation function\n", 82 | "Suppose we are training a model for a binary classification problem with a sigmoid activation function (softmax activation functions are covered in the next case). \n", 83 | "\n", 84 | "Given a training example with input $x^{(i)}$, the model will output a float between 0 and 1. Based on whether this float is less than or greater than our \"threshold\" (which by default is set at 0.5), we round the float to get the predicted classification $y_{pred}$ from the model.\n", 85 | "\n", 86 | "The accuracy metric compares the value of $y_{pred}$ on each training example with the true output, the one-hot coded vector $y_{true}^{(i)}$ from our training data.\n", 87 | "\n", 88 | "Let $$\\delta(y_{pred}^{(i)},y_{true}^{(i)}) = \\begin{cases} 1 & y_{pred}=y_{true}\\\\\n", 89 | "0 & y_{pred}\\neq y_{true} \\end{cases}$$\n", 90 | "\n", 91 | "The accuracy metric computes the mean of $\\delta(y_{pred}^{(i)},y_{true}^{(i)})$ over all training examples.\n", 92 | "\n", 93 | "$$ accuracy = \\frac{1}{N} \\sum_{i=1}^N \\delta(y_{pred}^{(i)},y_{true}^{(i)}) $$\n", 94 | "\n", 95 | "This is implemented in the backend of Keras as follows. \n", 96 | "Note: We have set $y_{true}$ and $y_{pred}$ ourselves for the purposes of this example." 
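,
"\n",
"Working the example below through by hand first: rounding $y_{pred} = (0.4, 0.8, 0.3)$ with the default threshold of $0.5$ gives $(0, 1, 0)$, which agrees with $y_{true} = (0, 1, 1)$ in two of the three entries, so the cell should return an accuracy of $2/3 \\approx 0.67$."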
97 | ]
98 | },
99 | {
100 | "cell_type": "code",
101 | "execution_count": null,
102 | "metadata": {},
103 | "outputs": [],
104 | "source": [
105 | "# Sigmoid activation function\n",
106 | "\n",
107 | "y_true = tf.constant([0.0, 1.0, 1.0])\n",
108 | "y_pred = tf.constant([0.4, 0.8, 0.3])\n",
109 | "accuracy = K.mean(K.equal(y_true, K.round(y_pred)))\n",
110 | "accuracy"
111 | ]
112 | },
113 | {
114 | "cell_type": "markdown",
115 | "metadata": {},
116 | "source": [
117 | "### Case 2 - Categorical Classification\n",
118 | "Now suppose we are training a model for a classification problem which should sort data into $m>2$ different classes using a softmax activation function in the last layer.\n",
119 | "\n",
120 | "Given a training example with input $x^{(i)}$, the model will output a tensor of probabilities $p_1, p_2, \\dots p_m$, giving the likelihood (according to the model) that $x^{(i)}$ falls into each class.\n",
121 | "\n",
122 | "The accuracy metric works by determining the index of the maximum value in the $y_{pred}^{(i)}$ tensor, and comparing it to the index of the maximum value of $y_{true}^{(i)}$ to determine $\\delta(y_{pred}^{(i)},y_{true}^{(i)})$. It then computes the accuracy in the same way as for the binary classification case.\n",
123 | "\n",
124 | "$$ accuracy = \\frac{1}{N} \\sum_{i=1}^N \\delta(y_{pred}^{(i)},y_{true}^{(i)}) $$\n",
125 | "\n",
126 | "In the backend of Keras, the accuracy metric is implemented slightly differently depending on whether we have a binary classification problem ($m=2$) or a categorical classification problem. Note that the accuracy for binary classification problems is the same, no matter if we use a sigmoid or softmax activation function to obtain the output.\n"
127 | ]
128 | },
129 | {
130 | "cell_type": "code",
131 | "execution_count": null,
132 | "metadata": {},
133 | "outputs": [],
134 | "source": [
135 | "# Binary classification with softmax\n",
136 | "\n",
137 | "y_true = tf.constant([[0.0,1.0],[1.0,0.0],[1.0,0.0],[0.0,1.0]])\n",
138 | "y_pred = tf.constant([[0.4,0.6], [0.3,0.7], [0.05,0.95],[0.33,0.67]])\n",
139 | "accuracy = K.mean(K.equal(y_true, K.round(y_pred)))\n",
140 | "accuracy"
141 | ]
142 | },
143 | {
144 | "cell_type": "code",
145 | "execution_count": null,
146 | "metadata": {},
147 | "outputs": [],
148 | "source": [
149 | "# Categorical classification with m>2\n",
150 | "\n",
151 | "y_true = tf.constant([[0.0,1.0,0.0,0.0],[1.0,0.0,0.0,0.0],[0.0,0.0,1.0,0.0]])\n",
152 | "y_pred = tf.constant([[0.4,0.6,0.0,0.0], [0.3,0.2,0.1,0.4], [0.05,0.35,0.5,0.1]])\n",
153 | "accuracy = K.mean(K.equal(K.argmax(y_true, axis=-1), K.argmax(y_pred, axis=-1)))\n",
154 | "accuracy"
155 | ]
156 | },
157 | {
158 | "cell_type": "markdown",
159 | "metadata": {},
160 | "source": [
161 | "## Other examples of metrics\n",
162 | "We will now look at some other metrics in Keras. A full list is available at https://keras.io/metrics/."
163 | ]
164 | },
165 | {
166 | "cell_type": "markdown",
167 | "metadata": {},
168 | "source": [
169 | "### Binary accuracy and categorical accuracy\n",
170 | "The `binary_accuracy` and `categorical_accuracy` metrics are, by default, identical to Cases 1 and 2 respectively of the `accuracy` metric explained above. \n",
171 | "\n",
172 | "However, using `binary_accuracy` allows you to use the optional `threshold` argument, which sets the minimum value of $y_{pred}$ which will be rounded to 1. 
As mentioned above, it is set as `threshold=0.5` by default.\n", 173 | "\n", 174 | "Below we give some examples of how to compile a model with `binary_accuracy` with and without a threshold." 175 | ] 176 | }, 177 | { 178 | "cell_type": "code", 179 | "execution_count": null, 180 | "metadata": {}, 181 | "outputs": [], 182 | "source": [ 183 | "# Compile the model with default threshold (=0.5)\n", 184 | "\n", 185 | "model.compile(optimizer='adam',\n", 186 | " loss='sparse_categorical_crossentropy',\n", 187 | " metrics=['binary_accuracy'])" 188 | ] 189 | }, 190 | { 191 | "cell_type": "code", 192 | "execution_count": null, 193 | "metadata": {}, 194 | "outputs": [], 195 | "source": [ 196 | "# The threshold can be specified as follows\n", 197 | "\n", 198 | "model.compile(optimizer='adam',\n", 199 | " loss='sparse_categorical_crossentropy',\n", 200 | " metrics=[tf.keras.metrics.BinaryAccuracy(threshold=0.5)])" 201 | ] 202 | }, 203 | { 204 | "cell_type": "markdown", 205 | "metadata": {}, 206 | "source": [ 207 | "### Sparse categorical accuracy\n", 208 | "\n", 209 | "This is a very similar metric to categorical accuracy with one major difference - the label $y_{true}$ of each training example is not expected to be a one-hot encoded vector, but to be a tensor consisting of a single integer. This integer is then compared to the index of the maximum argument of $y_{pred}$ to determine $\\delta(y_{pred}^{(i)},y_{true}^{(i)})$." 210 | ] 211 | }, 212 | { 213 | "cell_type": "code", 214 | "execution_count": null, 215 | "metadata": {}, 216 | "outputs": [], 217 | "source": [ 218 | "# Two examples of compiling a model with a sparse categorical accuracy metric\n", 219 | "\n", 220 | "model.compile(optimizer='adam',\n", 221 | " loss='sparse_categorical_crossentropy',\n", 222 | " metrics=[\"sparse_categorical_accuracy\"])\n", 223 | "\n", 224 | "model.compile(optimizer='adam',\n", 225 | " loss='sparse_categorical_crossentropy',\n", 226 | " metrics=[tf.keras.metrics.SparseCategoricalAccuracy()])" 227 | ] 228 | }, 229 | { 230 | "cell_type": "markdown", 231 | "metadata": {}, 232 | "source": [ 233 | "### (Sparse) Top $k$-categorical accuracy \n", 234 | "In top $k$-categorical accuracy, instead of computing how often the model correctly predicts the label of a training example, the metric computes how often the model has $y_{true}$ in the top $k$ of its predictions. By default, $k=5$.\n", 235 | "\n", 236 | "As before, the main difference between top $k$-categorical accuracy and its sparse version is that the former assumes $y_{true}$ is a one-hot encoded vector, whereas the sparse version assumes $y_{true}$ is an integer." 
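,
"\n",
"As a minimal sketch of what the sparse variant computes (the tensors below are made up purely for illustration), the same quantity can be reproduced directly with `tf.math.in_top_k`:\n",
"\n",
"```python\n",
"y_true = tf.constant([1, 2])                              # integer labels\n",
"y_pred = tf.constant([[0.1, 0.3, 0.6], [0.5, 0.4, 0.1]])  # predicted scores\n",
"\n",
"# The first label is among the top 2 predictions, the second is not,\n",
"# so the result should be 0.5\n",
"K.mean(K.cast(tf.math.in_top_k(y_true, y_pred, k=2), 'float32'))\n",
"```"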
237 | ]
238 | },
239 | {
240 | "cell_type": "code",
241 | "execution_count": null,
242 | "metadata": {},
243 | "outputs": [],
244 | "source": [
245 | "# Compile a model with a top-k categorical accuracy metric with default k (=5)\n",
246 | "\n",
247 | "model.compile(optimizer='adam',\n",
248 | "              loss='sparse_categorical_crossentropy',\n",
249 | "              metrics=[\"top_k_categorical_accuracy\"])"
250 | ]
251 | },
252 | {
253 | "cell_type": "code",
254 | "execution_count": null,
255 | "metadata": {},
256 | "outputs": [],
257 | "source": [
258 | "# Specify k instead with the sparse top-k categorical accuracy\n",
259 | "\n",
260 | "model.compile(optimizer='adam',\n",
261 | "              loss='sparse_categorical_crossentropy',\n",
262 | "              metrics=[tf.keras.metrics.SparseTopKCategoricalAccuracy(k=3)])"
263 | ]
264 | },
265 | {
266 | "cell_type": "markdown",
267 | "metadata": {},
268 | "source": [
269 | "## Custom metrics\n",
270 | "It is also possible to define your own custom metric in Keras.\n",
271 | "You will need to make sure that your metric takes in (at least) two arguments called `y_true` and `y_pred` and then outputs a single tensor value."
272 | ]
273 | },
274 | {
275 | "cell_type": "code",
276 | "execution_count": null,
277 | "metadata": {},
278 | "outputs": [],
279 | "source": [
280 | "# Define a custom metric\n",
281 | "\n",
282 | "def mean_pred(y_true, y_pred):\n",
283 | "    return K.mean(y_pred)"
284 | ]
285 | },
286 | {
287 | "cell_type": "markdown",
288 | "metadata": {},
289 | "source": [
290 | "We can then use this metric when we compile our model as follows."
291 | ]
292 | },
293 | {
294 | "cell_type": "code",
295 | "execution_count": null,
296 | "metadata": {},
297 | "outputs": [],
298 | "source": [
299 | "# Compile the model with the custom metric\n",
300 | "\n",
301 | "model.compile(optimizer='adam',\n",
302 | "              loss='sparse_categorical_crossentropy',\n",
303 | "              metrics=[mean_pred])"
304 | ]
305 | },
306 | {
307 | "cell_type": "markdown",
308 | "metadata": {},
309 | "source": [
310 | "## Multiple metrics\n",
311 | "Finally, it is possible to use multiple metrics to judge the performance of your model. 
\n", 312 | "\n", 313 | "\n", 314 | "Here's an example:" 315 | ] 316 | }, 317 | { 318 | "cell_type": "code", 319 | "execution_count": null, 320 | "metadata": {}, 321 | "outputs": [], 322 | "source": [ 323 | "# Compile the model with multiple metrics\n", 324 | "\n", 325 | "model.compile(optimizer='adam',\n", 326 | " loss='sparse_categorical_crossentropy',\n", 327 | " metrics=[mean_pred, \"accuracy\",tf.keras.metrics.SparseTopKCategoricalAccuracy(k=3)])" 328 | ] 329 | }, 330 | { 331 | "cell_type": "markdown", 332 | "metadata": {}, 333 | "source": [ 334 | "### Sources and Further Reading\n", 335 | "* The metrics page on the Keras website: https://keras.io/metrics/\n", 336 | "* The source code for the metrics: https://github.com/keras-team/keras/blob/master/keras/metrics.py\n" 337 | ] 338 | } 339 | ], 340 | "metadata": { 341 | "kernelspec": { 342 | "display_name": "Python 3", 343 | "language": "python", 344 | "name": "python3" 345 | }, 346 | "language_info": { 347 | "codemirror_mode": { 348 | "name": "ipython", 349 | "version": 3 350 | }, 351 | "file_extension": ".py", 352 | "mimetype": "text/x-python", 353 | "name": "python", 354 | "nbconvert_exporter": "python", 355 | "pygments_lexer": "ipython3", 356 | "version": "3.7.1" 357 | } 358 | }, 359 | "nbformat": 4, 360 | "nbformat_minor": 2 361 | } -------------------------------------------------------------------------------- /TF.Keras Sequential API Basics/Mnisit_Fashion.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "code", 5 | "execution_count": null, 6 | "metadata": {}, 7 | "outputs": [], 8 | "source": [ 9 | "import tensorflow as tf \n", 10 | "from tensorflow.keras.layers import Dense, Flatten, Conv2D, MaxPooling2D" 11 | ] 12 | }, 13 | { 14 | "cell_type": "code", 15 | "execution_count": null, 16 | "metadata": {}, 17 | "outputs": [], 18 | "source": [ 19 | "from tensorflow.keras.models import Sequential" 20 | ] 21 | }, 22 | { 23 | "cell_type": "code", 24 | "execution_count": null, 25 | "metadata": {}, 26 | "outputs": [], 27 | "source": [ 28 | "from keras.datasets import fashion_mnist" 29 | ] 30 | }, 31 | { 32 | "cell_type": "code", 33 | "execution_count": null, 34 | "metadata": {}, 35 | "outputs": [], 36 | "source": [ 37 | "(train_images,train_labels),(test_images,test_labels) = fashion_mnist.load_data()" 38 | ] 39 | }, 40 | { 41 | "cell_type": "code", 42 | "execution_count": null, 43 | "metadata": {}, 44 | "outputs": [], 45 | "source": [ 46 | "train_images.shape" 47 | ] 48 | }, 49 | { 50 | "cell_type": "code", 51 | "execution_count": null, 52 | "metadata": {}, 53 | "outputs": [], 54 | "source": [ 55 | "model = Sequential([\n", 56 | " Conv2D(16, kernel_size = 3, strides=(2,2), activation='relu', input_shape=(28,28,1)),\n", 57 | " MaxPooling2D((2,2)),\n", 58 | " Flatten(),\n", 59 | " Dense(64,activation='relu'),\n", 60 | " Dense(10,activation='softmax')\n", 61 | "])\n", 62 | "model.summary()" 63 | ] 64 | }, 65 | { 66 | "cell_type": "code", 67 | "execution_count": null, 68 | "metadata": {}, 69 | "outputs": [], 70 | "source": [ 71 | "train_images = train_images / 255\n", 72 | "test_images = test_images / 255" 73 | ] 74 | }, 75 | { 76 | "cell_type": "code", 77 | "execution_count": null, 78 | "metadata": {}, 79 | "outputs": [], 80 | "source": [ 81 | "import matplotlib.pyplot as plt\n", 82 | "import numpy as np\n", 83 | "import pandas as pd" 84 | ] 85 | }, 86 | { 87 | "cell_type": "code", 88 | "execution_count": null, 89 | "metadata": {}, 90 | "outputs": [], 91 | 
"source": [ 92 | "labels = [\n", 93 | " 'T-shirt/top',\n", 94 | " 'Trouser',\n", 95 | " 'Pullover',\n", 96 | " 'Dress',\n", 97 | " 'Coat',\n", 98 | " 'Sandal',\n", 99 | " 'Shirt',\n", 100 | " 'Sneaker',\n", 101 | " 'Bag',\n", 102 | " 'Ankle boot'\n", 103 | "]" 104 | ] 105 | }, 106 | { 107 | "cell_type": "code", 108 | "execution_count": null, 109 | "metadata": {}, 110 | "outputs": [], 111 | "source": [ 112 | "plt.imshow(train_images[0])\n", 113 | "print(labels[train_labels[0]])" 114 | ] 115 | }, 116 | { 117 | "cell_type": "code", 118 | "execution_count": null, 119 | "metadata": {}, 120 | "outputs": [], 121 | "source": [ 122 | "model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.005),\n", 123 | " loss=tf.keras.losses.SparseCategoricalCrossentropy(),\n", 124 | " metrics=[tf.keras.metrics.SparseCategoricalAccuracy(),tf.keras.metrics.MeanAbsoluteError()]\n", 125 | " )" 126 | ] 127 | }, 128 | { 129 | "cell_type": "code", 130 | "execution_count": null, 131 | "metadata": {}, 132 | "outputs": [], 133 | "source": [ 134 | "model.summary()" 135 | ] 136 | }, 137 | { 138 | "cell_type": "code", 139 | "execution_count": null, 140 | "metadata": {}, 141 | "outputs": [], 142 | "source": [ 143 | "history = model.fit(train_images[...,np.newaxis], train_labels, epochs=20)" 144 | ] 145 | }, 146 | { 147 | "cell_type": "code", 148 | "execution_count": null, 149 | "metadata": {}, 150 | "outputs": [], 151 | "source": [ 152 | "df = pd.DataFrame(history.history)" 153 | ] 154 | }, 155 | { 156 | "cell_type": "code", 157 | "execution_count": null, 158 | "metadata": {}, 159 | "outputs": [], 160 | "source": [ 161 | "df.head()" 162 | ] 163 | }, 164 | { 165 | "cell_type": "code", 166 | "execution_count": null, 167 | "metadata": {}, 168 | "outputs": [], 169 | "source": [ 170 | "plt.plot(df['sparse_categorical_accuracy'])\n", 171 | "plt.xlabel(\"Epochs\")\n", 172 | "plt.ylabel('Accuracy')\n", 173 | "plt.show()" 174 | ] 175 | }, 176 | { 177 | "cell_type": "code", 178 | "execution_count": null, 179 | "metadata": {}, 180 | "outputs": [], 181 | "source": [ 182 | "plt.plot(df['loss'].astype('float64'))\n", 183 | "plt.xlabel('Epochs')\n", 184 | "plt.ylabel('Loss')\n", 185 | "plt.show()" 186 | ] 187 | }, 188 | { 189 | "cell_type": "code", 190 | "execution_count": null, 191 | "metadata": {}, 192 | "outputs": [], 193 | "source": [ 194 | "loss, sparse_accuracy,mse = model.evaluate(test_images[...,np.newaxis], test_labels)" 195 | ] 196 | }, 197 | { 198 | "cell_type": "code", 199 | "execution_count": null, 200 | "metadata": {}, 201 | "outputs": [], 202 | "source": [] 203 | }, 204 | { 205 | "cell_type": "code", 206 | "execution_count": null, 207 | "metadata": {}, 208 | "outputs": [], 209 | "source": [] 210 | } 211 | ], 212 | "metadata": { 213 | "kernelspec": { 214 | "display_name": "Python 3", 215 | "language": "python", 216 | "name": "python3" 217 | }, 218 | "language_info": { 219 | "codemirror_mode": { 220 | "name": "ipython", 221 | "version": 3 222 | }, 223 | "file_extension": ".py", 224 | "mimetype": "text/x-python", 225 | "name": "python", 226 | "nbconvert_exporter": "python", 227 | "pygments_lexer": "ipython3", 228 | "version": "3.7.7" 229 | } 230 | }, 231 | "nbformat": 4, 232 | "nbformat_minor": 2 233 | } -------------------------------------------------------------------------------- /TF.Keras Sequential API Basics/Weight Initializers.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "# 
Weight and bias initialisers \n", 8 | "\n", 9 | "In this reading we investigate different ways to initialise weights and biases in the layers of neural networks." 10 | ] 11 | }, 12 | { 13 | "cell_type": "code", 14 | "execution_count": null, 15 | "metadata": {}, 16 | "outputs": [], 17 | "source": [ 18 | "%matplotlib inline\n", 19 | "import tensorflow as tf\n", 20 | "import pandas as pd\n", 21 | "print(tf.__version__)" 22 | ] 23 | }, 24 | { 25 | "cell_type": "markdown", 26 | "metadata": {}, 27 | "source": [ 28 | "### Default weights and biases\n", 29 | "\n", 30 | "In the models we have worked with so far, we have not specified the initial values of the weights and biases in each layer of our neural networks.\n", 31 | "\n", 32 | "The default values of the weights and biases in TensorFlow depend on the type of layers we are using. \n", 33 | "\n", 34 | "For example, in a `Dense` layer, the biases are set to zero (`zeros`) by default, while the weights are set according to `glorot_uniform`, the Glorot uniform initialiser. \n", 35 | "\n", 36 | "The Glorot uniform initialiser draws the weights uniformly at random from the closed interval $[-c,c]$, where $$c = \\sqrt{\\frac{6}{n_{input}+n_{output}}}$$" 37 | ] 38 | }, 39 | { 40 | "cell_type": "markdown", 41 | "metadata": {}, 42 | "source": [ 43 | "and $n_{input}$ and $n_{output}$ are the number of inputs to, and outputs from the layer respectively." 44 | ] 45 | }, 46 | { 47 | "cell_type": "markdown", 48 | "metadata": {}, 49 | "source": [ 50 | "### Initialising your own weights and biases\n", 51 | "We often would like to initialise our own weights and biases, and TensorFlow makes this process quite straightforward.\n", 52 | "\n", 53 | "When we construct a model in TensorFlow, each layer has optional arguments `kernel_initializer` and `bias_initializer`, which are used to set the weights and biases respectively.\n", 54 | "\n", 55 | "If a layer has no weights or biases (e.g. it is a max pooling layer), then trying to set either `kernel_initializer` or `bias_initializer` will throw an error.\n", 56 | "\n", 57 | "Let's see an example, which uses some of the different initialisations available in Keras." 58 | ] 59 | }, 60 | { 61 | "cell_type": "code", 62 | "execution_count": null, 63 | "metadata": {}, 64 | "outputs": [], 65 | "source": [ 66 | "from tensorflow.keras.models import Sequential\n", 67 | "from tensorflow.keras.layers import Flatten, Dense, Conv1D, MaxPooling1D " 68 | ] 69 | }, 70 | { 71 | "cell_type": "code", 72 | "execution_count": null, 73 | "metadata": {}, 74 | "outputs": [], 75 | "source": [ 76 | "# Construct a model\n", 77 | "\n", 78 | "model = Sequential([\n", 79 | "    Conv1D(filters=16, kernel_size=3, input_shape=(128, 64), kernel_initializer='random_uniform', bias_initializer=\"zeros\", activation='relu'),\n", 80 | "    MaxPooling1D(pool_size=4),\n", 81 | "    Flatten(),\n", 82 | "    Dense(64, kernel_initializer='he_uniform', bias_initializer='ones', activation='relu'),\n", 83 | "])" 84 | ] 85 | }, 86 | { 87 | "cell_type": "markdown", 88 | "metadata": {}, 89 | "source": [ 90 | "As the following example illustrates, we can also instantiate initialisers in a slightly different manner, allowing us to set optional arguments of the initialisation method."
91 | ] 92 | }, 93 | { 94 | "cell_type": "code", 95 | "execution_count": null, 96 | "metadata": {}, 97 | "outputs": [], 98 | "source": [ 99 | "# Add some layers to our model\n", 100 | "\n", 101 | "model.add(Dense(64, \n", 102 | " kernel_initializer=tf.keras.initializers.RandomNormal(mean=0.0, stddev=0.05), \n", 103 | " bias_initializer=tf.keras.initializers.Constant(value=0.4), \n", 104 | " activation='relu'),)\n", 105 | "\n", 106 | "model.add(Dense(8, \n", 107 | " kernel_initializer=tf.keras.initializers.Orthogonal(gain=1.0, seed=None), \n", 108 | " bias_initializer=tf.keras.initializers.Constant(value=0.4), \n", 109 | " activation='relu'))" 110 | ] 111 | }, 112 | { 113 | "cell_type": "markdown", 114 | "metadata": {}, 115 | "source": [ 116 | "### Custom weight and bias initialisers\n", 117 | "It is also possible to define your own weight and bias initialisers.\n", 118 | "Initializers must take in two arguments, the `shape` of the tensor to be initialised, and its `dtype`.\n", 119 | "\n", 120 | "Here is a small example, which also shows how you can use your custom initializer in a layer." 121 | ] 122 | }, 123 | { 124 | "cell_type": "code", 125 | "execution_count": null, 126 | "metadata": {}, 127 | "outputs": [], 128 | "source": [ 129 | "import tensorflow.keras.backend as K" 130 | ] 131 | }, 132 | { 133 | "cell_type": "code", 134 | "execution_count": null, 135 | "metadata": {}, 136 | "outputs": [], 137 | "source": [ 138 | "# Define a custom initializer\n", 139 | "\n", 140 | "def my_init(shape, dtype=None):\n", 141 | " return K.random_normal(shape, dtype=dtype)\n", 142 | "\n", 143 | "model.add(Dense(64, kernel_initializer=my_init))" 144 | ] 145 | }, 146 | { 147 | "cell_type": "markdown", 148 | "metadata": {}, 149 | "source": [ 150 | "Let's take a look at the summary of our finalised model." 151 | ] 152 | }, 153 | { 154 | "cell_type": "code", 155 | "execution_count": null, 156 | "metadata": {}, 157 | "outputs": [], 158 | "source": [ 159 | "# Print the model summary\n", 160 | "\n", 161 | "model.summary()" 162 | ] 163 | }, 164 | { 165 | "cell_type": "markdown", 166 | "metadata": {}, 167 | "source": [ 168 | "### Visualising the initialised weights and biases" 169 | ] 170 | }, 171 | { 172 | "cell_type": "markdown", 173 | "metadata": {}, 174 | "source": [ 175 | "Finally, we can see the effect of our initialisers on the weights and biases by plotting histograms of the resulting values. Compare these plots with the selected initialisers for each layer above." 
176 | ] 177 | }, 178 | { 179 | "cell_type": "code", 180 | "execution_count": null, 181 | "metadata": {}, 182 | "outputs": [], 183 | "source": [ 184 | "import matplotlib.pyplot as plt" 185 | ] 186 | }, 187 | { 188 | "cell_type": "code", 189 | "execution_count": null, 190 | "metadata": {}, 191 | "outputs": [], 192 | "source": [ 193 | "# Plot histograms of weight and bias values\n", 194 | "\n", 195 | "fig, axes = plt.subplots(5, 2, figsize=(12,16))\n", 196 | "fig.subplots_adjust(hspace=0.5, wspace=0.5)\n", 197 | "\n", 198 | "# Filter out the pooling and flatten layers, which don't have any weights\n", 199 | "weight_layers = [layer for layer in model.layers if len(layer.weights) > 0]\n", 200 | "\n", 201 | "for i, layer in enumerate(weight_layers):\n", 202 | "    for j in [0, 1]:\n", 203 | "        axes[i, j].hist(layer.weights[j].numpy().flatten(), align='left')\n", 204 | "        axes[i, j].set_title(layer.weights[j].name)" 205 | ] 206 | }, 207 | { 208 | "cell_type": "markdown", 209 | "metadata": {}, 210 | "source": [ 211 | "## Further reading and resources \n", 212 | "* https://keras.io/initializers/\n", 213 | "* https://www.tensorflow.org/versions/r2.0/api_docs/python/tf/keras/initializers" 214 | ] 215 | } 216 | ], 217 | "metadata": { 218 | "kernelspec": { 219 | "display_name": "Python 3", 220 | "language": "python", 221 | "name": "python3" 222 | }, 223 | "language_info": { 224 | "codemirror_mode": { 225 | "name": "ipython", 226 | "version": 3 227 | }, 228 | "file_extension": ".py", 229 | "mimetype": "text/x-python", 230 | "name": "python", 231 | "nbconvert_exporter": "python", 232 | "pygments_lexer": "ipython3", 233 | "version": "3.7.7" 234 | } 235 | }, 236 | "nbformat": 4, 237 | "nbformat_minor": 4 238 | } -------------------------------------------------------------------------------- /TF.Keras Sequential API Basics/readme.md: -------------------------------------------------------------------------------- 1 | # tensorflow.keras Sequential API 2 | --- 3 | The vast majority of models can be built using the Sequential API 4 | 5 | --- 6 | ### Imports 7 | ```python3 8 | from tensorflow.keras.models import Sequential 9 | from tensorflow.keras.layers import Dense 10 | ``` 11 | 12 | ### Creating a model 13 | ```python3 14 | model = Sequential([ 15 | Dense(64,activation = 'relu'), 16 | Dense(10,activation = 'softmax') 17 | ]) 18 | ``` 19 | 20 | This will create a model with 64 hidden units and 10 output units. We have not defined the input size yet. We can define the number of inputs either when training begins or in the first layer. 21 | 22 | ```python3 23 | model = Sequential([ 24 | Dense(64,activation = 'relu', input_shape=(784,)), 25 | Dense(10,activation = 'softmax') 26 | ]) 27 | ``` 28 | 29 | Here we are giving a 784-dimensional vector as input. If our input is 2D, we have to flatten it first. 30 | 31 | ```python3 32 | from tensorflow.keras.layers import Flatten, Dense 33 | from tensorflow.keras.models import Sequential 34 | model = Sequential([ 35 | Flatten(input_shape=(28,28)), 36 | Dense(64,activation = 'relu'), 37 | Dense(10,activation = 'softmax') 38 | ]) 39 | ``` 40 | 41 | We can also use `model.add` to add layers instead of passing them when creating an instance.
42 | 43 | ```python3 44 | model = Sequential() 45 | model.add(Dense(64,activation = 'relu', input_shape=(784,))) 46 | model.add(Dense(10,activation = 'softmax')) 47 | ``` 48 | 49 | ### Convolutional Layers with tf.keras 50 | - There are 2 main building blocks in a CNN 51 | - Pooling Layers 52 | - Convolutional Layers 53 | 54 | - Let's see a coding example 55 | 56 | ```python3 57 | from tensorflow.keras.models import Sequential 58 | from tensorflow.keras.layers import Dense, Conv2D, MaxPool2D, Flatten 59 | 60 | model = Sequential() 61 | model.add(Conv2D(16, kernel_size=3, activation = 'relu', input_shape=(32,32,3))) 62 | model.add(MaxPool2D((3,3))) 63 | model.add(Flatten()) 64 | model.add(Dense(64,activation='relu')) 65 | model.add(Dense(10,activation='softmax')) 66 | ``` 67 | 68 | To see the details of the model, use `model.summary()`. 69 | 70 | 71 | ```python3 72 | print(model.summary()) 73 | ``` 74 | 75 | ```Model: "sequential" 76 | _________________________________________________________________ 77 | Layer (type) Output Shape Param # 78 | ================================================================= 79 | conv2d (Conv2D) (None, 30, 30, 16) 448 80 | _________________________________________________________________ 81 | max_pooling2d (MaxPooling2D) (None, 10, 10, 16) 0 82 | _________________________________________________________________ 83 | flatten (Flatten) (None, 1600) 0 84 | _________________________________________________________________ 85 | dense (Dense) (None, 64) 102464 86 | _________________________________________________________________ 87 | dense_1 (Dense) (None, 10) 650 88 | ================================================================= 89 | Total params: 103,562 90 | Trainable params: 103,562 91 | Non-trainable params: 0 92 | _________________________________________________________________ 93 | None 94 | ``` 95 | 96 | We can add padding and strides in our Conv2D layer. 97 | 98 | ```python3 99 | model = Sequential([ 100 | Conv2D(16,(3,3),strides = (2,2) ,padding='same', input_shape=(28,28,1)), 101 | MaxPooling2D((3,3)) 102 | ]) 103 | ``` 104 | Here we have `same` padding and strides of 2 by 2. 105 | 106 | We can change the expected input shape from (x, y, no. of channels) to (no. of channels, x, y) by simply adding `data_format='channels_first'` in MaxPool2D and Conv2D. 107 | 108 | ```python3 109 | model = Sequential([ 110 | Conv2D(16,(3,3),padding='same', input_shape=(1,28,28),data_format='channels_first'), 111 | MaxPooling2D((3,3), data_format='channels_first') 112 | ]) 113 | ``` 114 | ---- 115 | 116 | ### Custom Initialization of Weights and Biases 117 | It is possible to initialize custom weights and biases in TensorFlow with ease. Refer to [this](TF%20Keras%20Week%201%20Tutorial.ipynb) notebook. 118 | 119 | ---- 120 | ### Compiling of Model 121 | When we have the structure of a model ready, we can call the compile method on the model to associate it with a loss function, an optimizer, and metrics. 122 | 123 | ```python3 124 | model = Sequential([ 125 | Dense(64,activation='relu', input_shape = (32,)), 126 | Dense(1, activation = 'sigmoid') 127 | ]) 128 | 129 | model.compile( 130 | loss='binary_crossentropy', # 'sgd' is an optimizer, not a loss; binary cross-entropy suits the sigmoid output 131 | optimizer = 'adam', 132 | metrics=['accuracy','mse'] #mse=mean squared error 133 | ) 134 | ``` 135 | Now there is a better way to do this: instead of passing things as strings like 'adam', we use the objects given by tf.keras. This allows us to set more options.
136 | ```python3 137 | import tensorflow as tf 138 | from tensorflow.keras.models import Sequential 139 | from tensorflow.keras.layers import Dense 140 | 141 | model = Sequential([ 142 | Dense(64,activation='relu', input_shape = (32,)), 143 | Dense(1, activation = 'sigmoid') 144 | ]) 145 | 146 | model.compile( 147 | optimizer=tf.keras.optimizers.SGD(), 148 | loss=tf.keras.losses.BinaryCrossentropy(), 149 | metrics = [tf.keras.metrics.BinaryAccuracy(), tf.keras.metrics.MeanAbsoluteError()] 150 | ) 151 | ``` 152 | 153 | Now, why do we use these objects? Because they have several parameters of their own, which we can pass in. 154 | For example: 155 | 156 | - tf.keras.optimizers.SGD(learning_rate=0.001,momentum=0.9) 157 | - tf.keras.losses.BinaryCrossentropy(from_logits=True) 158 | - With `from_logits=True`, the final layer keeps a linear activation and the loss applies the sigmoid internally, which is numerically more stable than an explicit sigmoid output layer (a short sketch of this appears after the training example below). 159 | 160 | Similarly, there are several other parameters which you can pass. You can explore them in the documentation. 161 | 162 | To learn more about metrics, refer to [this](Metrics.ipynb) notebook. 163 | 164 | ----- 165 | 166 | ## Training of Model 167 | We can train a model using the fit method in Keras. We pass the training set and the training labels into it. We also specify several hyperparameters such as the number of epochs and the batch size. Let's look at an example. 168 | ```python3 169 | import tensorflow as tf 170 | from tensorflow.keras.models import Sequential 171 | from tensorflow.keras.layers import Dense 172 | 173 | model = Sequential([ 174 | Dense(64,activation='relu', input_shape = (32,)), 175 | Dense(1, activation = 'sigmoid') 176 | ]) 177 | 178 | model.compile( 179 | optimizer=tf.keras.optimizers.SGD(), 180 | loss=tf.keras.losses.BinaryCrossentropy(), 181 | metrics = [tf.keras.metrics.BinaryAccuracy(), tf.keras.metrics.MeanAbsoluteError()] 182 | ) 183 | 184 | history = model.fit(X_train,y_train,epochs=10, batch_size=256) 185 | ``` 186 | 187 | This will train the model for 10 epochs with a batch size of 256.
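As promised above, here is a minimal sketch of the `from_logits` idea. This snippet is illustrative only: the layer sizes and the `logit_model` name are placeholders, not part of the tutorial's running example.

```python3
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

# The final layer is linear, so the model outputs raw logits rather than probabilities
logit_model = Sequential([
    Dense(64, activation='relu', input_shape=(32,)),
    Dense(1)  # no sigmoid here
])

logit_model.compile(
    optimizer=tf.keras.optimizers.SGD(learning_rate=0.001),
    # from_logits=True tells the loss to apply the sigmoid internally,
    # which is more numerically stable than a sigmoid output layer
    loss=tf.keras.losses.BinaryCrossentropy(from_logits=True),
    # with logits, the usual 0.5 probability threshold corresponds to a logit of 0.0
    metrics=[tf.keras.metrics.BinaryAccuracy(threshold=0.0)]
)
```

At prediction time you would pass the outputs through `tf.nn.sigmoid` yourself to recover probabilities.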
We store the result of `fit` in the `history` variable, which we can use to analyze the performance of the model based on its metrics and loss. 188 | 189 | 190 | We normally convert `history.history` into a DataFrame holding the model's training statistics, and then visualize them. Let's see. 191 | ```python3 192 | import pandas as pd 193 | import matplotlib.pyplot as plt 194 | 195 | df = pd.DataFrame(history.history) 196 | plt.plot(df['loss']) 197 | plt.xlabel('epoch') 198 | plt.ylabel('loss') 199 | plt.show() 200 | #same for other metrics 201 | ``` 202 | 203 | ---- 204 | 205 | ## Evaluation and Prediction 206 | We can evaluate the model on the test set and predict on new examples. To evaluate, as you might have guessed, we use the model.evaluate function in Keras. 207 | Let's see an example 208 | ```python3 209 | loss, test_binary_acc, test_MAE = model.evaluate(X_test,y_test) 210 | ``` 211 | Since we compiled the model with a loss function plus binary accuracy and MAE as metrics, `evaluate` returns 3 values. 212 |
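One practical tip: if you forget the order in which `evaluate` returns its values, the model's `metrics_names` attribute lists them. A quick check (the printed names are indicative and depend on how the model was compiled; in TF 2 the list is only populated once the model has been trained or evaluated):

```python3
print(model.metrics_names)  # e.g. ['loss', 'binary_accuracy', 'mean_absolute_error']
```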
213 | For predictions, we can use model.predict on a new image, and use `np.argmax` to recover the predicted label. Let's see an example 214 | 215 | ```python3 216 | pred = model.predict(new_image[np.newaxis, ...])  # add a batch dimension 217 | print(labels[np.argmax(pred)]) 218 | ``` 219 | Here `new_image` is an image already loaded and preprocessed into a NumPy array matching the model's input shape, and `labels` is a list containing all the class labels. 220 | 221 | ----- 222 | This winds up the tutorial for the TensorFlow Keras Sequential API. Now it's your turn to try it on the MNIST and FASHION MNIST data sets. -------------------------------------------------------------------------------- /Validation, Regularization and Callbacks/0.PNG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ahmadmustafaanis/Getting-Started-with-Tensorflow-2/6318490c1b07018f94389a68905d7b51029c35aa/Validation, Regularization and Callbacks/0.PNG -------------------------------------------------------------------------------- /Validation, Regularization and Callbacks/1.PNG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ahmadmustafaanis/Getting-Started-with-Tensorflow-2/6318490c1b07018f94389a68905d7b51029c35aa/Validation, Regularization and Callbacks/1.PNG -------------------------------------------------------------------------------- /Validation, Regularization and Callbacks/Batch normalisation.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "# Batch normalisation layers\n", 8 | "\n", 9 | "In this reading we will look at incorporating batch normalisation into our models and look at an example of how we do this in practice.\n", 10 | "\n", 11 | "As usual, let's first import tensorflow." 12 | ] 13 | }, 14 | { 15 | "cell_type": "code", 16 | "execution_count": null, 17 | "metadata": {}, 18 | "outputs": [], 19 | "source": [ 20 | "import tensorflow as tf\n", 21 | "print(tf.__version__)" 22 | ] 23 | }, 24 | { 25 | "cell_type": "markdown", 26 | "metadata": {}, 27 | "source": [ 28 | "We will be working with the diabetes dataset that we have been using in this week's screencasts. \n", 29 | "\n", 30 | "Let's load and pre-process the dataset."
31 | ] 32 | }, 33 | { 34 | "cell_type": "code", 35 | "execution_count": null, 36 | "metadata": {}, 37 | "outputs": [], 38 | "source": [ 39 | "# Load the dataset\n", 40 | "\n", 41 | "from sklearn.datasets import load_diabetes\n", 42 | "diabetes_dataset = load_diabetes()" 43 | ] 44 | }, 45 | { 46 | "cell_type": "code", 47 | "execution_count": null, 48 | "metadata": {}, 49 | "outputs": [], 50 | "source": [ 51 | "# Save the input and target variables\n", 52 | "\n", 53 | "from sklearn.model_selection import train_test_split\n", 54 | "\n", 55 | "data = diabetes_dataset['data']\n", 56 | "targets = diabetes_dataset['target']" 57 | ] 58 | }, 59 | { 60 | "cell_type": "code", 61 | "execution_count": null, 62 | "metadata": {}, 63 | "outputs": [], 64 | "source": [ 65 | "# Normalise the target data (this will make clearer training curves)\n", 66 | "\n", 67 | "targets = (targets - targets.mean(axis=0)) / (targets.std())" 68 | ] 69 | }, 70 | { 71 | "cell_type": "code", 72 | "execution_count": null, 73 | "metadata": {}, 74 | "outputs": [], 75 | "source": [ 76 | "# Split the dataset into training and test datasets \n", 77 | "\n", 78 | "train_data, test_data, train_targets, test_targets = train_test_split(data, targets, test_size=0.1)" 79 | ] 80 | }, 81 | { 82 | "cell_type": "markdown", 83 | "metadata": {}, 84 | "source": [ 85 | "### Batch normalisation - defining the model" 86 | ] 87 | }, 88 | { 89 | "cell_type": "markdown", 90 | "metadata": {}, 91 | "source": [ 92 | "We can implement batch normalisation into our model by adding it in the same way as any other layer." 93 | ] 94 | }, 95 | { 96 | "cell_type": "code", 97 | "execution_count": null, 98 | "metadata": {}, 99 | "outputs": [], 100 | "source": [ 101 | "from tensorflow.keras.models import Sequential\n", 102 | "from tensorflow.keras.layers import Flatten, Dense, Conv2D, MaxPooling2D, BatchNormalization, Dropout" 103 | ] 104 | }, 105 | { 106 | "cell_type": "code", 107 | "execution_count": null, 108 | "metadata": {}, 109 | "outputs": [], 110 | "source": [ 111 | "# Build the model\n", 112 | "\n", 113 | "model = Sequential([\n", 114 | " Dense(64, input_shape=[train_data.shape[1],], activation=\"relu\"),\n", 115 | " BatchNormalization(), # <- Batch normalisation layer\n", 116 | " Dropout(0.5),\n", 117 | " BatchNormalization(), # <- Batch normalisation layer\n", 118 | " Dropout(0.5),\n", 119 | " Dense(256, activation='relu'),\n", 120 | "])\n", 121 | "\n", 122 | "# NB: We have not added the output layer because we still have more layers to add!" 123 | ] 124 | }, 125 | { 126 | "cell_type": "code", 127 | "execution_count": null, 128 | "metadata": { 129 | "scrolled": true 130 | }, 131 | "outputs": [], 132 | "source": [ 133 | "# Print the model summary\n", 134 | "\n", 135 | "model.summary()" 136 | ] 137 | }, 138 | { 139 | "cell_type": "markdown", 140 | "metadata": {}, 141 | "source": [ 142 | "Recall that there are some parameters and hyperparameters associated with batch normalisation.\n", 143 | "\n", 144 | "* The hyperparameter **momentum** is the weighting given to the previous running mean when re-computing it with an extra minibatch. By **default**, it is set to 0.99.\n", 145 | "\n", 146 | "* The hyperparameter **$\\epsilon$** is used for numeric stability when performing the normalisation over the minibatch. By **default** it is set to 0.001.\n", 147 | "\n", 148 | "* The parameters **$\\beta$** and **$\\gamma$** are used to implement an affine transformation after normalisation. 
By **default**, $\\beta$ is an all-zeros vector, and $\\gamma$ is an all-ones vector.\n", 149 | "\n", 150 | "### Customising parameters\n", 151 | "These can all be changed (along with various other properties) by adding optional arguments to `tf.keras.layers.BatchNormalization()`.\n", 152 | "\n", 153 | "We can also specify the axis for batch normalisation. By default, it is set as -1.\n", 154 | "\n", 155 | "Let's see an example." 156 | ] 157 | }, 158 | { 159 | "cell_type": "code", 160 | "execution_count": null, 161 | "metadata": {}, 162 | "outputs": [], 163 | "source": [ 164 | "# Add a customised batch normalisation layer\n", 165 | "\n", 166 | "model.add(tf.keras.layers.BatchNormalization(\n", 167 | "    momentum=0.95, \n", 168 | "    epsilon=0.005,\n", 169 | "    axis = -1,\n", 170 | "    beta_initializer=tf.keras.initializers.RandomNormal(mean=0.0, stddev=0.05), \n", 171 | "    gamma_initializer=tf.keras.initializers.Constant(value=0.9)\n", 172 | "))" 173 | ] 174 | }, 175 | { 176 | "cell_type": "code", 177 | "execution_count": null, 178 | "metadata": {}, 179 | "outputs": [], 180 | "source": [ 181 | "# Add the output layer\n", 182 | "\n", 183 | "model.add(Dense(1))" 184 | ] 185 | }, 186 | { 187 | "cell_type": "markdown", 188 | "metadata": {}, 189 | "source": [ 190 | "## Compile and fit the model" 191 | ] 192 | }, 193 | { 194 | "cell_type": "markdown", 195 | "metadata": {}, 196 | "source": [ 197 | "Let's now compile and fit our model with batch normalisation, and track the progress on training and validation sets.\n", 198 | "\n", 199 | "First we compile our model." 200 | ] 201 | }, 202 | { 203 | "cell_type": "code", 204 | "execution_count": null, 205 | "metadata": {}, 206 | "outputs": [], 207 | "source": [ 208 | "# Compile the model\n", 209 | "\n", 210 | "model.compile(optimizer='adam',\n", 211 | "              loss='mse',\n", 212 | "              metrics=['mae'])" 213 | ] 214 | }, 215 | { 216 | "cell_type": "markdown", 217 | "metadata": {}, 218 | "source": [ 219 | "Now we fit the model to the data." 220 | ] 221 | }, 222 | { 223 | "cell_type": "code", 224 | "execution_count": null, 225 | "metadata": {}, 226 | "outputs": [], 227 | "source": [ 228 | "# Train the model\n", 229 | "\n", 230 | "history = model.fit(train_data, train_targets, epochs=100, validation_split=0.15, batch_size=64,verbose=False)" 231 | ] 232 | }, 233 | { 234 | "cell_type": "markdown", 235 | "metadata": {}, 236 | "source": [ 237 | "Finally, we plot the training and validation loss and mean absolute error to observe how the performance of our model improves over time."
238 | ] 239 | }, 240 | { 241 | "cell_type": "code", 242 | "execution_count": null, 243 | "metadata": {}, 244 | "outputs": [], 245 | "source": [ 246 | "# Plot the learning curves\n", 247 | "\n", 248 | "import pandas as pd\n", 249 | "import numpy as np\n", 250 | "import matplotlib.pyplot as plt\n", 251 | "%matplotlib inline\n", 252 | "\n", 253 | "frame = pd.DataFrame(history.history)\n", 254 | "epochs = np.arange(len(frame))\n", 255 | "\n", 256 | "fig = plt.figure(figsize=(12,4))\n", 257 | "\n", 258 | "# Loss plot\n", 259 | "ax = fig.add_subplot(121)\n", 260 | "ax.plot(epochs, frame['loss'], label=\"Train\")\n", 261 | "ax.plot(epochs, frame['val_loss'], label=\"Validation\")\n", 262 | "ax.set_xlabel(\"Epochs\")\n", 263 | "ax.set_ylabel(\"Loss\")\n", 264 | "ax.set_title(\"Loss vs Epochs\")\n", 265 | "ax.legend()\n", 266 | "\n", 267 | "# Accuracy plot\n", 268 | "ax = fig.add_subplot(122)\n", 269 | "ax.plot(epochs, frame['mae'], label=\"Train\")\n", 270 | "ax.plot(epochs, frame['val_mae'], label=\"Validation\")\n", 271 | "ax.set_xlabel(\"Epochs\")\n", 272 | "ax.set_ylabel(\"Mean Absolute Error\")\n", 273 | "ax.set_title(\"Mean Absolute Error vs Epochs\")\n", 274 | "ax.legend()" 275 | ] 276 | }, 277 | { 278 | "cell_type": "markdown", 279 | "metadata": {}, 280 | "source": [ 281 | "## Further reading and resources \n", 282 | "* https://keras.io/layers/normalization/\n", 283 | "* https://www.tensorflow.org/versions/r2.0/api_docs/python/tf/keras/layers/BatchNormalization" 284 | ] 285 | } 286 | ], 287 | "metadata": { 288 | "kernelspec": { 289 | "display_name": "Python 3", 290 | "language": "python", 291 | "name": "python3" 292 | }, 293 | "language_info": { 294 | "codemirror_mode": { 295 | "name": "ipython", 296 | "version": 3 297 | }, 298 | "file_extension": ".py", 299 | "mimetype": "text/x-python", 300 | "name": "python", 301 | "nbconvert_exporter": "python", 302 | "pygments_lexer": "ipython3", 303 | "version": "3.7.1" 304 | } 305 | }, 306 | "nbformat": 4, 307 | "nbformat_minor": 2 308 | } -------------------------------------------------------------------------------- /Validation, Regularization and Callbacks/Custom Callback.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "code", 5 | "execution_count": null, 6 | "metadata": {}, 7 | "outputs": [], 8 | "source": [ 9 | "import tensorflow as tf" 10 | ] 11 | }, 12 | { 13 | "cell_type": "code", 14 | "execution_count": null, 15 | "metadata": {}, 16 | "outputs": [], 17 | "source": [ 18 | "from tensorflow.keras.layers import Dense, BatchNormalization\n", 19 | "from tensorflow.keras.models import Sequential" 20 | ] 21 | }, 22 | { 23 | "cell_type": "code", 24 | "execution_count": null, 25 | "metadata": {}, 26 | "outputs": [], 27 | "source": [ 28 | "from sklearn.datasets import load_diabetes\n", 29 | "from sklearn.model_selection import train_test_split\n", 30 | "diabetes = load_diabetes()\n", 31 | "\n", 32 | "data = diabetes['data']\n", 33 | "targets = diabetes['target']" 34 | ] 35 | }, 36 | { 37 | "cell_type": "code", 38 | "execution_count": null, 39 | "metadata": {}, 40 | "outputs": [], 41 | "source": [ 42 | "X_train,X_val,y_train,y_val = train_test_split(data, targets, test_size = 0.1)" 43 | ] 44 | }, 45 | { 46 | "cell_type": "markdown", 47 | "metadata": {}, 48 | "source": [ 49 | "#### Dummy Model" 50 | ] 51 | }, 52 | { 53 | "cell_type": "code", 54 | "execution_count": null, 55 | "metadata": {}, 56 | "outputs": [], 57 | "source": [ 58 | "model = Sequential([\n", 59 | " 
Dense(128, activation='relu', input_shape=(X_train.shape[1],)),\n", 60 | "    Dense(64,activation='relu'),\n", 61 | "    tf.keras.layers.BatchNormalization(),\n", 62 | "    Dense(64, activation='relu'),\n", 63 | "    Dense(64, activation='relu'),\n", 64 | "    Dense(1)  \n", 65 | "])" 66 | ] 67 | }, 68 | { 69 | "cell_type": "code", 70 | "execution_count": null, 71 | "metadata": {}, 72 | "outputs": [], 73 | "source": [ 74 | "model.compile(\n", 75 | "    optimizer='adam',\n", 76 | "    loss='mse',\n", 77 | "    metrics=['mae']\n", 78 | ")" 79 | ] 80 | }, 81 | { 82 | "cell_type": "markdown", 83 | "metadata": {}, 84 | "source": [ 85 | "### Custom Callback\n", 86 | "##### We will use the `logs` dictionary to access the loss and metric values" 87 | ] 88 | }, 89 | { 90 | "cell_type": "code", 91 | "execution_count": null, 92 | "metadata": {}, 93 | "outputs": [], 94 | "source": [ 95 | "class customCallback(tf.keras.callbacks.Callback):\n", 96 | "    \n", 97 | "    def on_train_batch_end(self,batch,logs = None):\n", 98 | "        if batch % 2 == 0:  # report every second batch\n", 99 | "            print( f\"\\n After Batch {batch}, loss is {logs['loss']}\" )\n", 100 | "\n", 101 | "    def on_test_batch_end(self,batch,logs=None):\n", 102 | "        if batch % 2 == 0:\n", 103 | "            print(f\"\\n After Batch {batch}, loss is {logs['loss']} \")\n", 104 | "\n", 105 | "    def on_epoch_end(self, epoch, logs = None):\n", 106 | "        print(f\"Epoch {epoch}, Mean Absolute Error is {logs['mae']}, Loss is {logs['loss']}\")\n", 107 | "    \n", 108 | "    def on_predict_batch_end(self, batch, logs=None):\n", 109 | "        print(f\"Finished Prediction on Batch {batch}\")" 110 | ] 111 | }, 112 | { 113 | "cell_type": "code", 114 | "execution_count": null, 115 | "metadata": {}, 116 | "outputs": [], 117 | "source": [ 118 | "history = model.fit(X_train, y_train, epochs=10, callbacks=[customCallback()], verbose = 0, batch_size=2**6)" 119 | ] 120 | }, 121 | { 122 | "cell_type": "code", 123 | "execution_count": null, 124 | "metadata": {}, 125 | "outputs": [], 126 | "source": [ 127 | "model.evaluate(X_val,y_val, callbacks=[customCallback()], verbose=0, batch_size=10)" 128 | ] 129 | }, 130 | { 131 | "cell_type": "code", 132 | "execution_count": null, 133 | "metadata": {}, 134 | "outputs": [], 135 | "source": [ 136 | "model.predict(X_val,batch_size=10, callbacks=[customCallback()], verbose=False )" 137 | ] 138 | }, 139 | { 140 | "cell_type": "markdown", 141 | "metadata": {}, 142 | "source": [ 143 | "### We will define a custom callback to reduce the learning rate w.r.t. the number of epochs\n", 144 | "\n", 145 | "##### It is going to be a slightly more complex custom callback" 146 | ] 147 | }, 148 | { 149 | "cell_type": "code", 150 | "execution_count": null, 151 | "metadata": {}, 152 | "outputs": [], 153 | "source": [ 154 | "lr_schedule = [\n", 155 | "    (5,0.05), (10,0.03), (15,0.02), (20,0.01)\n", 156 | "]\n", 157 | "# We get the new learning rate by looking the epoch up in the schedule above.\n", 158 | "def get_new_learning_rate(epoch, lr):\n", 159 | "    for i in lr_schedule:\n", 160 | "        if epoch == i[0]:\n", 161 | "            lr = i[1]\n", 162 | "\n", 163 | "    return lr" 164 | ] 165 | }, 166 | { 167 | "cell_type": "code", 168 | "execution_count": null, 169 | "metadata": {}, 170 | "outputs": [], 171 | "source": [ 172 | "class Learning_rate_scheduler( tf.keras.callbacks.Callback ):\n", 173 | "    def __init__(self, new_lr):\n", 174 | "        super(Learning_rate_scheduler, self).__init__()\n", 175 | "        # adding the new learning rate function to our callback\n", 176 | "        self.new_lr = new_lr\n", 177 | "    \n", 178 | "    def on_epoch_begin(self, epoch, logs=None):\n", 179 | "        # we will 
check if our optimizer has a learning rate attribute or not\n", 180 | "        try:\n", 181 | "            curr_rate = tf.keras.backend.get_value(self.model.optimizer.lr)\n", 182 | "\n", 183 | "            # calling the auxiliary function (passed in as the new_lr parameter) to get the scheduled learning rate\n", 184 | "\n", 185 | "            scheduled_rate = self.new_lr(epoch, curr_rate)\n", 186 | "\n", 187 | "            tf.keras.backend.set_value(self.model.optimizer.lr, scheduled_rate)\n", 188 | "\n", 189 | "            print(f\"Learning Rate for Epoch {epoch} is {tf.keras.backend.get_value(self.model.optimizer.lr)}\")\n", 190 | "\n", 191 | "        except Exception as E:\n", 192 | "            print(f'{E}\\n Most probably your optimizer does not have a learning rate attribute.')" 193 | ] 194 | }, 195 | { 196 | "cell_type": "code", 197 | "execution_count": null, 198 | "metadata": {}, 199 | "outputs": [], 200 | "source": [ 201 | "model = Sequential([\n", 202 | "    \n", 203 | "    Dense(128, activation='relu', input_shape=(X_train.shape[1],)),\n", 204 | "    Dense(64,activation='relu'),\n", 205 | "    tf.keras.layers.BatchNormalization(),\n", 206 | "    Dense(64, activation='relu'),\n", 207 | "    Dense(64, activation='relu'),\n", 208 | "    Dense(1)  \n", 209 | "])" 210 | ] 211 | }, 212 | { 213 | "cell_type": "code", 214 | "execution_count": null, 215 | "metadata": {}, 216 | "outputs": [], 217 | "source": [ 218 | "model.compile(loss='mse',\n", 219 | "              optimizer=\"adam\",\n", 220 | "              metrics=['mae', 'mse'])" 221 | ] 222 | }, 223 | { 224 | "cell_type": "code", 225 | "execution_count": null, 226 | "metadata": {}, 227 | "outputs": [], 228 | "source": [ 229 | "model.fit(X_train, y_train, epochs=25, batch_size=64, callbacks=[Learning_rate_scheduler(get_new_learning_rate)], verbose=False)" 230 | ] 231 | }, 232 | { 233 | "cell_type": "code", 234 | "execution_count": null, 235 | "metadata": {}, 236 | "outputs": [], 237 | "source": [] 238 | } 239 | ], 240 | "metadata": { 241 | "language_info": { 242 | "codemirror_mode": { 243 | "name": "ipython", 244 | "version": 3 245 | }, 246 | "file_extension": ".py", 247 | "mimetype": "text/x-python", 248 | "name": "python", 249 | "nbconvert_exporter": "python", 250 | "pygments_lexer": "ipython3", 251 | "version": "3.7.7-final" 252 | }, 253 | "orig_nbformat": 2, 254 | "kernelspec": { 255 | "name": "python37764bitmyenvconda4a11ba26287d4d1c969b9946e31eb2a2", 256 | "language": "python", 257 | "display_name": "Python 3.7.7 64-bit ('myenv': conda)" 258 | } 259 | }, 260 | "nbformat": 4, 261 | "nbformat_minor": 2 262 | } -------------------------------------------------------------------------------- /Validation, Regularization and Callbacks/Validation_Regularization_CallBacks.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "code", 5 | "execution_count": null, 6 | "metadata": { 7 | "scrolled": true 8 | }, 9 | "outputs": [], 10 | "source": [ 11 | "import tensorflow as tf\n", 12 | "print(tf.__version__)" 13 | ] 14 | }, 15 | { 16 | "cell_type": "markdown", 17 | "metadata": {}, 18 | "source": [ 19 | "# Validation, regularisation and callbacks" 20 | ] 21 | }, 22 | { 23 | "cell_type": "markdown", 24 | "metadata": {}, 25 | "source": [ 26 | " ## Coding tutorials\n", 27 | " #### [1. Validation sets](#coding_tutorial_1)\n", 28 | " #### [2. Model regularisation](#coding_tutorial_2)\n", 29 | " #### [3. Introduction to callbacks](#coding_tutorial_3)\n", 30 | " #### [4. 
Early stopping / patience](#coding_tutorial_4)" 31 | ] 32 | }, 33 | { 34 | "cell_type": "markdown", 35 | "metadata": {}, 36 | "source": [ 37 | "***\n", 38 | "\n", 39 | "## Validation sets" 40 | ] 41 | }, 42 | { 43 | "cell_type": "markdown", 44 | "metadata": {}, 45 | "source": [ 46 | "#### Load the data" 47 | ] 48 | }, 49 | { 50 | "cell_type": "code", 51 | "execution_count": null, 52 | "metadata": {}, 53 | "outputs": [], 54 | "source": [ 55 | "# Load the diabetes dataset\n", 56 | "from sklearn.datasets import load_diabetes\n", 57 | "diabetes_dataset = load_diabetes()\n", 58 | "print(diabetes_dataset[\"DESCR\"])" 59 | ] 60 | }, 61 | { 62 | "cell_type": "code", 63 | "execution_count": null, 64 | "metadata": {}, 65 | "outputs": [], 66 | "source": [ 67 | "diabetes_dataset.keys()" 68 | ] 69 | }, 70 | { 71 | "cell_type": "code", 72 | "execution_count": null, 73 | "metadata": {}, 74 | "outputs": [], 75 | "source": [ 76 | "# Save the input and target variables\n", 77 | "data = diabetes_dataset['data']\n", 78 | "target = diabetes_dataset['target']\n" 79 | ] 80 | }, 81 | { 82 | "cell_type": "code", 83 | "execution_count": null, 84 | "metadata": {}, 85 | "outputs": [], 86 | "source": [ 87 | "import numpy as np\n", 88 | "# Normalise the target data (this will make clearer training curves)\n", 89 | "target = (target - np.mean(target))/np.std(target)\n", 90 | "target" 91 | ] 92 | }, 93 | { 94 | "cell_type": "code", 95 | "execution_count": null, 96 | "metadata": {}, 97 | "outputs": [], 98 | "source": [ 99 | "# Split the data into train and test sets\n", 100 | "from sklearn.model_selection import train_test_split\n", 101 | "\n", 102 | "X_train,X_test,Y_train,Y_test = train_test_split(data,target,test_size=0.1 )\n", 103 | "print(X_train.shape)\n", 104 | "print(Y_train.shape)\n", 105 | "\n", 106 | "print(X_test.shape)\n", 107 | "print(Y_test.shape)" 108 | ] 109 | }, 110 | { 111 | "cell_type": "markdown", 112 | "metadata": {}, 113 | "source": [ 114 | "#### Train a feedforward neural network model" 115 | ] 116 | }, 117 | { 118 | "cell_type": "code", 119 | "execution_count": null, 120 | "metadata": {}, 121 | "outputs": [], 122 | "source": [ 123 | "X_train.shape[1]" 124 | ] 125 | }, 126 | { 127 | "cell_type": "code", 128 | "execution_count": null, 129 | "metadata": {}, 130 | "outputs": [], 131 | "source": [ 132 | "# Build the model\n", 133 | "\n", 134 | "from tensorflow.keras.models import Sequential\n", 135 | "from tensorflow.keras.layers import Dense\n", 136 | "\n", 137 | "def get_model():\n", 138 | "    model = Sequential([\n", 139 | "        Dense(128, activation='relu', input_shape=(X_train.shape[1],)),\n", 140 | "        Dense(128,activation='relu'),\n", 141 | "        Dense(128,activation='relu'),\n", 142 | "        Dense(128,activation='relu'),\n", 143 | "        Dense(128,activation='relu'),\n", 144 | "        Dense(128,activation='relu'),\n", 145 | "        Dense(1,activation='linear')\n", 146 | "    ])\n", 147 | "    return model\n", 148 | "model = get_model()" 149 | ] 150 | }, 151 | { 152 | "cell_type": "code", 153 | "execution_count": null, 154 | "metadata": {}, 155 | "outputs": [], 156 | "source": [ 157 | "# Print the model summary\n", 158 | "\n", 159 | "model.summary()" 160 | ] 161 | }, 162 | { 163 | "cell_type": "code", 164 | "execution_count": null, 165 | "metadata": {}, 166 | "outputs": [], 167 | "source": [ 168 | "# Compile the model\n", 169 | "\n", 170 | "model.compile(optimizer='adam',loss='mse', metrics=['mae'])" 171 | ] 172 | }, 173 | { 174 | "cell_type": "code", 175 | "execution_count": null, 176 | "metadata": {}, 177 | "outputs": [], 178 | 
"source": [ 179 | "# Train the model, with some of the data reserved for validation\n", 180 | "history = model.fit(X_train,Y_train,validation_split=0.15,epochs=100,verbose=2,batch_size=64)\n" 181 | ] 182 | }, 183 | { 184 | "cell_type": "code", 185 | "execution_count": null, 186 | "metadata": {}, 187 | "outputs": [], 188 | "source": [ 189 | "# Evaluate the model on the test set\n", 190 | "loss,mae = model.evaluate(X_test,Y_test)\n" 191 | ] 192 | }, 193 | { 194 | "cell_type": "markdown", 195 | "metadata": {}, 196 | "source": [ 197 | "#### Plot the learning curves" 198 | ] 199 | }, 200 | { 201 | "cell_type": "code", 202 | "execution_count": null, 203 | "metadata": {}, 204 | "outputs": [], 205 | "source": [ 206 | "import matplotlib.pyplot as plt\n", 207 | "%matplotlib inline" 208 | ] 209 | }, 210 | { 211 | "cell_type": "code", 212 | "execution_count": null, 213 | "metadata": {}, 214 | "outputs": [], 215 | "source": [ 216 | "# Plot the training and validation loss\n", 217 | "\n", 218 | "plt.plot(history.history['loss'])\n", 219 | "plt.plot(history.history['val_loss'])\n", 220 | "plt.title('Loss vs. epochs')\n", 221 | "plt.ylabel('Loss')\n", 222 | "plt.xlabel('Epoch')\n", 223 | "plt.legend(['Training', 'Validation'], loc='upper right')\n", 224 | "plt.show()" 225 | ] 226 | }, 227 | { 228 | "cell_type": "markdown", 229 | "metadata": {}, 230 | "source": [ 231 | "***\n", 232 | "\n", 233 | "## Model regularisation" 234 | ] 235 | }, 236 | { 237 | "cell_type": "markdown", 238 | "metadata": {}, 239 | "source": [ 240 | "#### Adding regularisation with weight decay and dropout" 241 | ] 242 | }, 243 | { 244 | "cell_type": "code", 245 | "execution_count": null, 246 | "metadata": {}, 247 | "outputs": [], 248 | "source": [ 249 | "from tensorflow.keras.layers import Dropout\n", 250 | "from tensorflow.keras import regularizers" 251 | ] 252 | }, 253 | { 254 | "cell_type": "code", 255 | "execution_count": null, 256 | "metadata": {}, 257 | "outputs": [], 258 | "source": [ 259 | "def get_regularised_model(wd, rate):\n", 260 | " model = Sequential([\n", 261 | " Dense(128, activation=\"relu\", kernel_regularizer=regularizers.l2(wd), input_shape=(X_train.shape[1],)),\n", 262 | " Dropout(rate),\n", 263 | " Dense(128, activation=\"relu\", kernel_regularizer=regularizers.l2(wd),),\n", 264 | " Dropout(rate),\n", 265 | " Dense(128, activation=\"relu\", kernel_regularizer=regularizers.l2(wd),),\n", 266 | " Dropout(rate),\n", 267 | " Dense(128, activation=\"relu\", kernel_regularizer=regularizers.l2(wd),),\n", 268 | " Dropout(rate),\n", 269 | " Dense(128, activation=\"relu\", kernel_regularizer=regularizers.l2(wd),),\n", 270 | " Dropout(rate),\n", 271 | " Dense(128, activation=\"relu\", kernel_regularizer=regularizers.l2(wd),),\n", 272 | " Dropout(rate),\n", 273 | " Dense(1)\n", 274 | " ])\n", 275 | " return model" 276 | ] 277 | }, 278 | { 279 | "cell_type": "code", 280 | "execution_count": null, 281 | "metadata": {}, 282 | "outputs": [], 283 | "source": [ 284 | "# Re-build the model with weight decay and dropout layers\n", 285 | "\n", 286 | "model = get_regularised_model(0.01,0.05)" 287 | ] 288 | }, 289 | { 290 | "cell_type": "code", 291 | "execution_count": null, 292 | "metadata": {}, 293 | "outputs": [], 294 | "source": [ 295 | "# Compile the model\n", 296 | "\n", 297 | "model.compile(optimizer='adam', loss='mse', metrics=['mae'])" 298 | ] 299 | }, 300 | { 301 | "cell_type": "code", 302 | "execution_count": null, 303 | "metadata": {}, 304 | "outputs": [], 305 | "source": [ 306 | "# Train the model, with some of the data 
reserved for validation\n", 307 | "\n", 308 | "history = model.fit(X_train, Y_train, validation_split=0.20, epochs=100, batch_size=2**7, verbose=False )" 309 | ] 310 | }, 311 | { 312 | "cell_type": "code", 313 | "execution_count": null, 314 | "metadata": {}, 315 | "outputs": [], 316 | "source": [ 317 | "# Evaluate the model on the test set\n", 318 | "loss, mae = model.evaluate(X_test,Y_test, verbose=2)\n" 319 | ] 320 | }, 321 | { 322 | "cell_type": "markdown", 323 | "metadata": {}, 324 | "source": [ 325 | "#### Plot the learning curves" 326 | ] 327 | }, 328 | { 329 | "cell_type": "code", 330 | "execution_count": null, 331 | "metadata": {}, 332 | "outputs": [], 333 | "source": [ 334 | "# Plot the training and validation loss\n", 335 | "\n", 336 | "import matplotlib.pyplot as plt\n", 337 | "\n", 338 | "plt.plot(history.history['loss'])\n", 339 | "plt.plot(history.history['val_loss'])\n", 340 | "plt.title('Loss vs. epochs')\n", 341 | "plt.ylabel('Loss')\n", 342 | "plt.xlabel('Epoch')\n", 343 | "plt.legend(['Training', 'Validation'], loc='upper right')\n", 344 | "plt.show()" 345 | ] 346 | }, 347 | { 348 | "cell_type": "markdown", 349 | "metadata": {}, 350 | "source": [ 351 | "***\n", 352 | "\n", 353 | "## Introduction to callbacks" 354 | ] 355 | }, 356 | { 357 | "cell_type": "markdown", 358 | "metadata": {}, 359 | "source": [ 360 | "#### Example training callback" 361 | ] 362 | }, 363 | { 364 | "cell_type": "code", 365 | "execution_count": null, 366 | "metadata": {}, 367 | "outputs": [], 368 | "source": [ 369 | "# Write a custom callback\n", 370 | "from tensorflow.keras.callbacks import Callback\n", 371 | "\n", 372 | "\n", 373 | "\n", 374 | "class myCustomTrainingCallBack(Callback):\n", 375 | " \n", 376 | " def on_train_begin(self, logs= None):\n", 377 | " print(\"Training Begin!! \")\n", 378 | " \n", 379 | " def on_train_batch_begin(self, batch, logs=None):\n", 380 | " print(f\"Training Batch {batch} Begin\")\n", 381 | " \n", 382 | " def on_train_batch_end(self, batch, logs=None):\n", 383 | " print(f\"Training Batch {batch} Ended\")\n", 384 | " \n", 385 | " def on_epoch_begin(self, epoch, logs=None):\n", 386 | " print(f\"Epoch #{epoch} Begin\" )\n", 387 | " \n", 388 | " def on_epoch_end(self, epoch, logs=None):\n", 389 | " print(f\"Epoch #{epoch} End\")\n", 390 | " \n", 391 | " def on_train_end(self,logs=None):\n", 392 | " print(\"Training Ended\")\n", 393 | " \n", 394 | " \n", 395 | "class myCustomTestingCallBack(Callback):\n", 396 | " \n", 397 | " def on_test_begin(self, logs= None):\n", 398 | " print(\"Testing Begin!! \")\n", 399 | " \n", 400 | " def on_test_batch_begin(self, batch, logs=None):\n", 401 | " print(f\"Testing Batch {batch} Begin\")\n", 402 | " \n", 403 | " def on_test_batch_end(self, batch, logs=None):\n", 404 | " print(f\"Testing Batch {batch} Ended\")\n", 405 | " \n", 406 | " \n", 407 | " def on_test_end(self,logs=None):\n", 408 | " print(\"Testing Ended\")\n", 409 | " \n", 410 | "\n", 411 | "class myCustomPredictionCallBack(Callback):\n", 412 | " \n", 413 | " def on_predict_begin(self, logs= None):\n", 414 | " print(\"Prediction Begin!! 
\")\n", 415 | "    \n", 416 | "    def on_predict_batch_begin(self, batch, logs=None):\n", 417 | "        print(f\"Prediction Batch {batch} Begin\")\n", 418 | "    \n", 419 | "    def on_predict_batch_end(self, batch, logs=None):\n", 420 | "        print(f\"Prediction Batch {batch} Ended\")\n", 421 | "    \n", 422 | "    \n", 423 | "    def on_predict_end(self,logs=None):\n", 424 | "        print(\"Prediction Ended\")" 425 | ] 426 | }, 427 | { 428 | "cell_type": "code", 429 | "execution_count": null, 430 | "metadata": {}, 431 | "outputs": [], 432 | "source": [ 433 | "# Re-build the model\n", 434 | "\n", 435 | "model = get_regularised_model(1e-5, 0.3)\n" 436 | ] 437 | }, 438 | { 439 | "cell_type": "code", 440 | "execution_count": null, 441 | "metadata": {}, 442 | "outputs": [], 443 | "source": [ 444 | "# Compile the model\n", 445 | "model.compile(optimizer='Adam', loss='mse')" 446 | ] 447 | }, 448 | { 449 | "cell_type": "markdown", 450 | "metadata": {}, 451 | "source": [ 452 | "#### Train the model with the callback" 453 | ] 454 | }, 455 | { 456 | "cell_type": "code", 457 | "execution_count": null, 458 | "metadata": {}, 459 | "outputs": [], 460 | "source": [ 461 | "# Train the model, with some of the data reserved for validation\n", 462 | "\n", 463 | "history = model.fit(X_train,Y_train, epochs=5, batch_size=128, validation_split=0.1, callbacks=[myCustomTrainingCallBack()], verbose=0)" 464 | ] 465 | }, 466 | { 467 | "cell_type": "code", 468 | "execution_count": null, 469 | "metadata": {}, 470 | "outputs": [], 471 | "source": [ 472 | "# Evaluate the model\n", 473 | "\n", 474 | "model.evaluate(X_test, Y_test, verbose=False, callbacks=[myCustomTestingCallBack()] )" 475 | ] 476 | }, 477 | { 478 | "cell_type": "code", 479 | "execution_count": null, 480 | "metadata": {}, 481 | "outputs": [], 482 | "source": [ 483 | "# Make predictions with the model\n", 484 | "\n", 485 | "model.predict(X_test, verbose=False, callbacks=[myCustomPredictionCallBack()])" 486 | ] 487 | }, 488 | { 489 | "cell_type": "markdown", 490 | "metadata": {}, 491 | "source": [ 492 | "***\n", 493 | "\n", 494 | "## Early stopping / patience" 495 | ] 496 | }, 497 | { 498 | "cell_type": "markdown", 499 | "metadata": {}, 500 | "source": [ 501 | "#### Re-train the models with early stopping" 502 | ] 503 | }, 504 | { 505 | "cell_type": "code", 506 | "execution_count": null, 507 | "metadata": {}, 508 | "outputs": [], 509 | "source": [ 510 | "# Re-train the unregularised model\n", 511 | "unreg_model = get_model()\n", 512 | "unreg_model.compile(optimizer='adam', loss='mse')\n", 513 | "unreg_hist = unreg_model.fit(X_train,Y_train, epochs=100,\n", 514 | "                             verbose=False,validation_split=0.2,\n", 515 | "                             callbacks=[tf.keras.callbacks.EarlyStopping(patience=3)] )\n" 516 | ] 517 | }, 518 | { 519 | "cell_type": "code", 520 | "execution_count": null, 521 | "metadata": {}, 522 | "outputs": [], 523 | "source": [ 524 | "# Evaluate the model on the test set\n", 525 | "unreg_model.evaluate(X_test, Y_test)\n" 526 | ] 527 | }, 528 | { 529 | "cell_type": "code", 530 | "execution_count": null, 531 | "metadata": {}, 532 | "outputs": [], 533 | "source": [ 534 | "# Re-train the regularised model\n", 535 | "regularized_model = get_regularised_model(1e-8,0.01)\n", 536 | "regularized_model.compile(optimizer='adam', loss='mse')\n", 537 | "regularized_hist = regularized_model.fit(X_train,Y_train, epochs=100,\n", 538 | "                                         verbose=False,validation_split=0.2,\n", 539 | "                                         callbacks=[tf.keras.callbacks.EarlyStopping(patience=3)] )\n" 540 | ] 541 | }, 542 | { 543 | "cell_type": "code", 544 | "execution_count": 
null, 545 | "metadata": {}, 546 | "outputs": [], 547 | "source": [ 548 | "# Evaluate the model on the test set\n", 549 | "\n", 550 | "regularized_model.evaluate(X_test,Y_test)" 551 | ] 552 | }, 553 | { 554 | "cell_type": "markdown", 555 | "metadata": {}, 556 | "source": [ 557 | "#### Plot the learning curves" 558 | ] 559 | }, 560 | { 561 | "cell_type": "code", 562 | "execution_count": null, 563 | "metadata": {}, 564 | "outputs": [], 565 | "source": [ 566 | "# Plot the training and validation loss\n", 567 | "\n", 568 | "import matplotlib.pyplot as plt\n", 569 | "\n", 570 | "fig = plt.figure(figsize=(12, 5))\n", 571 | "\n", 572 | "fig.add_subplot(121)\n", 573 | "\n", 574 | "plt.plot(unreg_hist.history['loss'])\n", 575 | "plt.plot(unreg_hist.history['val_loss'])\n", 576 | "plt.title('Unregularised model: loss vs. epochs')\n", 577 | "plt.ylabel('Loss')\n", 578 | "plt.xlabel('Epoch')\n", 579 | "plt.legend(['Training', 'Validation'], loc='upper right')\n", 580 | "\n", 581 | "fig.add_subplot(122)\n", 582 | "\n", 583 | "plt.plot(regularized_hist.history['loss'])\n", 584 | "plt.plot(regularized_hist.history['val_loss'])\n", 585 | "plt.title('Regularised model: loss vs. epochs')\n", 586 | "plt.ylabel('Loss')\n", 587 | "plt.xlabel('Epoch')\n", 588 | "plt.legend(['Training', 'Validation'], loc='upper right')\n", 589 | "\n", 590 | "plt.show()" 591 | ] 592 | }, 593 | { 594 | "cell_type": "code", 595 | "execution_count": null, 596 | "outputs": [], 597 | "source": [ 598 | "\n" 599 | ], 600 | "metadata": { 601 | "collapsed": false, 602 | "pycharm": { 603 | "name": "#%%\n" 604 | } 605 | } 606 | } 607 | ], 608 | "metadata": { 609 | "kernelspec": { 610 | "name": "python37764bitmyenvconda4a11ba26287d4d1c969b9946e31eb2a2", 611 | "language": "python", 612 | "display_name": "Python 3.7.7 64-bit ('myenv': conda)" 613 | }, 614 | "language_info": { 615 | "codemirror_mode": { 616 | "name": "ipython", 617 | "version": 3 618 | }, 619 | "file_extension": ".py", 620 | "mimetype": "text/x-python", 621 | "name": "python", 622 | "nbconvert_exporter": "python", 623 | "pygments_lexer": "ipython3", 624 | "version": "3.7.1" 625 | } 626 | }, 627 | "nbformat": 4, 628 | "nbformat_minor": 2 629 | } -------------------------------------------------------------------------------- /Validation, Regularization and Callbacks/Week 3 Programming Assignment.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "# Programming Assignment" 8 | ] 9 | }, 10 | { 11 | "cell_type": "markdown", 12 | "metadata": {}, 13 | "source": [ 14 | "## Model validation on the Iris dataset" 15 | ] 16 | }, 17 | { 18 | "cell_type": "markdown", 19 | "metadata": {}, 20 | "source": [ 21 | "### Instructions\n", 22 | "\n", 23 | "In this notebook, you will build, compile and fit a neural network model to the Iris dataset. You will also implement validation, regularisation and callbacks to improve your model.\n", 24 | "\n", 25 | "Some code cells are provided for you in the notebook. You should avoid editing provided code, and make sure to execute the cells in order to avoid unexpected errors. Some cells begin with the line: \n", 26 | "\n", 27 | "`#### GRADED CELL ####`\n", 28 | "\n", 29 | "Don't move or edit this first line - this is what the automatic grader looks for to recognise graded cells. These cells require you to write your own code to complete them, and are automatically graded when you submit the notebook. 
Don't edit the function name or signature provided in these cells, otherwise the automatic grader might not function properly. Inside these graded cells, you can use any functions or classes that are imported below, but make sure you don't use any variables that are outside the scope of the function.\n", 30 | "\n", 31 | "### How to submit\n", 32 | "\n", 33 | "Complete all the tasks you are asked for in the worksheet. When you have finished and are happy with your code, press the **Submit Assignment** button at the top of this notebook.\n", 34 | "\n", 35 | "### Let's get started!\n", 36 | "\n", 37 | "We'll start running some imports, and loading the dataset. Do not edit the existing imports in the following cell. If you would like to make further Tensorflow imports, you should add them here." 38 | ] 39 | }, 40 | { 41 | "cell_type": "code", 42 | "execution_count": null, 43 | "metadata": {}, 44 | "outputs": [], 45 | "source": [ 46 | "#### PACKAGE IMPORTS ####\n", 47 | "\n", 48 | "# Run this cell first to import all required packages. Do not make any imports elsewhere in the notebook\n", 49 | "from numpy.random import seed\n", 50 | "seed(8)\n", 51 | "import tensorflow as tf\n", 52 | "import numpy as np\n", 53 | "import matplotlib.pyplot as plt\n", 54 | "from sklearn import datasets, model_selection \n", 55 | "%matplotlib inline\n", 56 | "\n", 57 | "# If you would like to make further imports from tensorflow, add them here\n", 58 | "import tensorflow as tf\n", 59 | "from tensorflow.keras.layers import Dense, Dropout\n", 60 | "from tensorflow.keras.models import Sequential\n", 61 | "from sklearn.datasets import load_iris\n", 62 | "from sklearn.model_selection import train_test_split" 63 | ] 64 | }, 65 | { 66 | "cell_type": "markdown", 67 | "metadata": {}, 68 | "source": [ 69 | "\n", 70 | "\"Drawing\"\n", 71 | "\"Drawing\"\n", 72 | "\"Drawing\"\n", 73 | "" 74 | ] 75 | }, 76 | { 77 | "cell_type": "markdown", 78 | "metadata": {}, 79 | "source": [ 80 | "#### The Iris dataset\n", 81 | "\n", 82 | "In this assignment, you will use the [Iris dataset](https://scikit-learn.org/stable/auto_examples/datasets/plot_iris_dataset.html). It consists of 50 samples from each of three species of Iris (Iris setosa, Iris virginica and Iris versicolor). Four features were measured from each sample: the length and the width of the sepals and petals, in centimeters. For a reference, see the following papers:\n", 83 | "\n", 84 | "- R. A. Fisher. \"The use of multiple measurements in taxonomic problems\". Annals of Eugenics. 7 (2): 179–188, 1936.\n", 85 | "\n", 86 | "Your goal is to construct a neural network that classifies each sample into the correct class, as well as applying validation and regularisation techniques." 87 | ] 88 | }, 89 | { 90 | "cell_type": "markdown", 91 | "metadata": {}, 92 | "source": [ 93 | "#### Load and preprocess the data\n", 94 | "\n", 95 | "First read in the Iris dataset using `datasets.load_iris()`, and split the dataset into training and test sets." 96 | ] 97 | }, 98 | { 99 | "cell_type": "code", 100 | "execution_count": null, 101 | "metadata": {}, 102 | "outputs": [], 103 | "source": [ 104 | "#### GRADED CELL ####\n", 105 | "\n", 106 | "# Complete the following function. 
\n", 107 | "# Make sure to not change the function name or arguments.\n", 108 | "\n", 109 | "def read_in_and_split_data(iris_data):\n", 110 | "    \"\"\"\n", 111 | "    This function takes the Iris dataset as loaded by sklearn.datasets.load_iris(), and then \n", 112 | "    splits so that the training set includes 90% of the full dataset, with the test set \n", 113 | "    making up the remaining 10%.\n", 114 | "    Your function should return a tuple (train_data, test_data, train_targets, test_targets) \n", 115 | "    of appropriately split training and test data and targets.\n", 116 | "    \n", 117 | "    If you would like to import any further packages to aid you in this task, please do so in the \n", 118 | "    Package Imports cell above.\n", 119 | "    \"\"\"\n", 120 | "    X = iris_data['data']\n", 121 | "    y = iris_data['target']\n", 122 | "    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.1)  # 90% train / 10% test, as the docstring requires\n", 123 | "    return X_train, X_test, y_train, y_test" 124 | ] 125 | }, 126 | { 127 | "cell_type": "code", 128 | "execution_count": null, 129 | "metadata": {}, 130 | "outputs": [], 131 | "source": [ 132 | "# Run your function to generate the test and training data.\n", 133 | "\n", 134 | "iris_data = datasets.load_iris()\n", 135 | "#print(iris_data.keys())\n", 136 | "train_data, test_data , train_targets, test_targets = read_in_and_split_data(iris_data)" 137 | ] 138 | }, 139 | { 140 | "cell_type": "markdown", 141 | "metadata": { 142 | "pycharm": { 143 | "name": "#%% md\n" 144 | } 145 | }, 146 | "source": [ 147 | "We will now convert the training and test targets using a one hot encoder.\n", 148 | "\n" 149 | ] 150 | }, 151 | { 152 | "cell_type": "code", 153 | "execution_count": null, 154 | "outputs": [], 155 | "source": [ 156 | "train_data.shape" 157 | ], 158 | "metadata": { 159 | "collapsed": false, 160 | "pycharm": { 161 | "name": "#%%\n" 162 | } 163 | } 164 | }, 165 | { 166 | "cell_type": "code", 167 | "execution_count": null, 168 | "outputs": [], 169 | "source": [ 170 | "# Convert targets to a one-hot encoding\n", 171 | "\n", 172 | "train_targets = tf.keras.utils.to_categorical(np.array(train_targets))\n", 173 | "test_targets = tf.keras.utils.to_categorical(np.array(test_targets))" 174 | ], 175 | "metadata": { 176 | "collapsed": false, 177 | "pycharm": { 178 | "name": "#%%\n" 179 | } 180 | } 181 | }, 182 | { 183 | "cell_type": "markdown", 184 | "metadata": {}, 185 | "source": [ 186 | "#### Build the neural network model\n" 187 | ] 188 | }, 189 | { 190 | "cell_type": "markdown", 191 | "metadata": {}, 192 | "source": [ 193 | "You can now construct a model to fit to the data. Using the Sequential API, build your model according to the following specifications:\n", 194 | "\n", 195 | "* The model should use the `input_shape` in the function argument to set the input size in the first layer.\n", 196 | "* The first layer should be a dense layer with 64 units.\n", 197 | "* The weights of the first layer should be initialised with the He uniform initializer.\n", 198 | "* The biases of the first layer should be all initially equal to one.\n", 199 | "* There should then be a further four dense layers, each with 128 units.\n", 200 | "* This should be followed with four dense layers, each with 64 units.\n", 201 | "* All of these Dense layers should use the ReLU activation function.\n", 202 | "* The output Dense layer should have 3 units and the softmax activation function.\n", 203 | "\n", 204 | "In total, the network should have 10 layers."
205 | ] 206 | }, 207 | { 208 | "cell_type": "code", 209 | "execution_count": null, 210 | "metadata": {}, 211 | "outputs": [], 212 | "source": [ 213 | "#### GRADED CELL ####\n", 214 | "\n", 215 | "# Complete the following function. \n", 216 | "# Make sure to not change the function name or arguments.\n", 217 | "\n", 218 | "def get_model(input_shape):\n", 219 | " \"\"\"\n", 220 | " This function should build a Sequential model according to the above specification. Ensure the \n", 221 | " weights are initialised by providing the input_shape argument in the first layer, given by the\n", 222 | " function argument.\n", 223 | " Your function should return the model.\n", 224 | " \"\"\"\n", 225 | " model = Sequential([\n", 226 | " Dense(64, input_shape=input_shape, kernel_initializer=tf.keras.initializers.he_uniform(), bias_initializer='ones', activation='relu'),\n", 227 | " Dense(128, activation='relu'),\n", 228 | " Dense(128, activation='relu'),\n", 229 | " Dense(128, activation='relu'),\n", 230 | " Dense(128, activation='relu'),\n", 231 | " Dense(64, activation='relu'),\n", 232 | " Dense(64, activation='relu'),\n", 233 | " Dense(64, activation='relu'),\n", 234 | " Dense(64, activation='relu'),\n", 235 | " Dense(3, activation='softmax')\n", 236 | " ])\n", 237 | " return model\n", 238 | " " 239 | ] 240 | }, 241 | { 242 | "cell_type": "code", 243 | "execution_count": null, 244 | "metadata": {}, 245 | "outputs": [], 246 | "source": [ 247 | "# Run your function to get the model\n", 248 | "\n", 249 | "model = get_model(train_data[0].shape)" 250 | ] 251 | }, 252 | { 253 | "cell_type": "markdown", 254 | "metadata": {}, 255 | "source": [ 256 | "#### Compile the model\n", 257 | "\n", 258 | "You should now compile the model using the `compile` method. Remember that you need to specify an optimizer, a loss function and a metric to judge the performance of your model." 259 | ] 260 | }, 261 | { 262 | "cell_type": "code", 263 | "execution_count": null, 264 | "metadata": {}, 265 | "outputs": [], 266 | "source": [ 267 | "#### GRADED CELL ####\n", 268 | "\n", 269 | "# Complete the following function. \n", 270 | "# Make sure to not change the function name or arguments.\n", 271 | "\n", 272 | "def compile_model(model):\n", 273 | " \"\"\"\n", 274 | " This function takes in the model returned from your get_model function, and compiles it with an optimiser,\n", 275 | " loss function and metric.\n", 276 | " Compile the model using the Adam optimiser (with learning rate set to 0.0001), \n", 277 | " the categorical crossentropy loss function and accuracy as the only metric. \n", 278 | " Your function doesn't need to return anything; the model will be compiled in-place.\n", 279 | " \"\"\"\n", 280 | " model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.0001), loss=tf.keras.losses.categorical_crossentropy, metrics=['acc'] )\n", 281 | " " 282 | ] 283 | }, 284 | { 285 | "cell_type": "code", 286 | "execution_count": null, 287 | "metadata": {}, 288 | "outputs": [], 289 | "source": [ 290 | "# Run your function to compile the model\n", 291 | "\n", 292 | "compile_model(model)" 293 | ] 294 | }, 295 | { 296 | "cell_type": "markdown", 297 | "metadata": {}, 298 | "source": [ 299 | "#### Fit the model to the training data\n", 300 | "\n", 301 | "Now you should train the model on the Iris dataset, using the model's `fit` method. 
\n", 302 | "* Run the training for a fixed number of epochs, given by the function's `epochs` argument.\n", 303 | "* Return the training history to be used for plotting the learning curves.\n", 304 | "* Set the batch size to 40.\n", 305 | "* Set the validation set to be 15% of the training set." 306 | ] 307 | }, 308 | { 309 | "cell_type": "code", 310 | "execution_count": null, 311 | "metadata": {}, 312 | "outputs": [], 313 | "source": [ 314 | "#### GRADED CELL ####\n", 315 | "\n", 316 | "# Complete the following function. \n", 317 | "# Make sure to not change the function name or arguments.\n", 318 | "\n", 319 | "def train_model(model, train_data, train_targets, epochs):\n", 320 | " \"\"\"\n", 321 | " This function should train the model for the given number of epochs on the \n", 322 | " train_data and train_targets. \n", 323 | " Your function should return the training history, as returned by model.fit.\n", 324 | " \"\"\"\n", 325 | " history = model.fit(train_data, train_targets, epochs = epochs, batch_size=40, validation_split=0.15)\n", 326 | " return history\n", 327 | " " 328 | ] 329 | }, 330 | { 331 | "cell_type": "markdown", 332 | "metadata": {}, 333 | "source": [ 334 | "Run the following cell to run the training for 800 epochs." 335 | ] 336 | }, 337 | { 338 | "cell_type": "code", 339 | "execution_count": null, 340 | "metadata": { 341 | "tags": [ 342 | "outputPrepend" 343 | ] 344 | }, 345 | "outputs": [], 346 | "source": [ 347 | "# Run your function to train the model\n", 348 | "\n", 349 | "history = train_model(model, train_data, train_targets, epochs=800)" 350 | ] 351 | }, 352 | { 353 | "cell_type": "code", 354 | "execution_count": null, 355 | "metadata": { 356 | "tags": [], 357 | "pycharm": { 358 | "name": "#%%\n" 359 | } 360 | }, 361 | "outputs": [], 362 | "source": [ 363 | "model.evaluate(test_data,test_targets)" 364 | ] 365 | }, 366 | { 367 | "cell_type": "markdown", 368 | "metadata": { 369 | "pycharm": { 370 | "name": "#%% md\n" 371 | } 372 | }, 373 | "source": [ 374 | "#### Plot the learning curves\n", 375 | "\n", 376 | "We will now plot two graphs:\n", 377 | "* Epoch vs accuracy\n", 378 | "* Epoch vs loss\n" 379 | ] 380 | }, 381 | { 382 | "cell_type": "code", 383 | "execution_count": null, 384 | "outputs": [], 385 | "source": [ 386 | "# Run this cell to plot the epoch vs accuracy graph\n", 387 | "\n", 388 | "try:\n", 389 | " plt.plot(history.history['accuracy'])\n", 390 | " plt.plot(history.history['val_accuracy'])\n", 391 | "except KeyError:\n", 392 | " plt.plot(history.history['acc'])\n", 393 | " plt.plot(history.history['val_acc'])\n", 394 | "plt.title('Accuracy vs. epochs')\n", 395 | "plt.ylabel('Acc')\n", 396 | "plt.xlabel('Epoch')\n", 397 | "plt.legend(['Training Set', 'Validation Set'], loc='lower right')\n", 398 | "plt.show() " 399 | ], 400 | "metadata": { 401 | "collapsed": false, 402 | "pycharm": { 403 | "name": "#%%\n" 404 | } 405 | } 406 | }, 407 | { 408 | "cell_type": "code", 409 | "execution_count": null, 410 | "metadata": { 411 | "pycharm": { 412 | "name": "#%%\n" 413 | } 414 | }, 415 | "outputs": [], 416 | "source": [ 417 | "#Run this cell to plot the epoch vs loss graph\n", 418 | "plt.plot(history.history['loss'])\n", 419 | "plt.plot(history.history['val_loss'])\n", 420 | "plt.title('Loss vs. epochs')\n", 421 | "plt.ylabel('Loss')\n", 422 | "plt.xlabel('Epoch')\n", 423 | "plt.legend(['Training', 'Validation'], loc='upper left')\n", 424 | "plt.show() " 425 | ] 426 | }, 427 | { 428 | "cell_type": "markdown", 429 | "metadata": {}, 430 | "source": [ 431 | "Oh no! 
We have overfit our dataset. You should now try to mitigate this overfitting." 432 | ] 433 | }, 434 | { 435 | "cell_type": "markdown", 436 | "metadata": {}, 437 | "source": [ 438 | "#### Reducing overfitting in the model" 439 | ] 440 | }, 441 | { 442 | "cell_type": "markdown", 443 | "metadata": { 444 | "pycharm": { 445 | "name": "#%% md\n" 446 | } 447 | }, 448 | "source": [ 449 | "You should now define a new regularised model.\n", 450 | "The specs for the regularised model are the same as our original model, with the addition of two dropout layers, weight decay, and a batch normalisation layer. \n", 451 | "\n", 452 | "In particular:\n", 453 | "\n", 454 | "* Add a dropout layer after the 3rd Dense layer\n", 455 | "* Then there should be two more Dense layers with 128 units before a batch normalisation layer\n", 456 | "* Following this, two more Dense layers with 64 units and then another Dropout layer\n", 457 | "* Two more Dense layers with 64 units and then the final 3-way softmax layer\n", 458 | "* Add weight decay (l2 kernel regularisation) in all Dense layers except the final softmax layer" 459 | ] 460 | }, 461 | { 462 | "cell_type": "code", 463 | "execution_count": null, 464 | "metadata": { 465 | "pycharm": { 466 | "name": "#%%\n" 467 | } 468 | }, 469 | "outputs": [], 470 | "source": [ 471 | "#### GRADED CELL ####\n", 472 | "\n", 473 | "# Complete the following function. \n", 474 | "# Make sure to not change the function name or arguments.\n", 475 | "\n", 476 | "def get_regularised_model(input_shape, dropout_rate, weight_decay):\n", 477 | " \"\"\"\n", 478 | " This function should build a regularised Sequential model according to the above specification.\n", 479 | " The dropout_rate argument in the function should be used to set the Dropout rate for all Dropout layers.\n", 480 | " L2 kernel regularisation (weight decay) should be added using the weight_decay argument to\n", 481 | " set the weight decay coefficient in all Dense layers that use L2 regularisation.\n", 482 | " Ensure the weights are initialised by providing the input_shape argument in the first layer, given by the\n", 483 | " function argument input_shape.\n", 484 | " Your function should return the model.\n", 485 | " \"\"\"\n", 486 | " model = Sequential([\n", 487 | " Dense(64, input_shape=input_shape, kernel_initializer=tf.keras.initializers.he_uniform(), bias_initializer='ones', activation='relu', kernel_regularizer=tf.keras.regularizers.l2(weight_decay)),\n", 488 | " Dense(128, activation='relu', kernel_regularizer=tf.keras.regularizers.l2(weight_decay)),\n", 489 | " Dense(128, activation='relu', kernel_regularizer=tf.keras.regularizers.l2(weight_decay)),\n", 490 | " Dropout(dropout_rate),\n", 491 | " Dense(128, activation='relu', kernel_regularizer=tf.keras.regularizers.l2(weight_decay)),\n", 492 | " Dense(128, activation='relu', kernel_regularizer=tf.keras.regularizers.l2(weight_decay)),\n", 493 | " tf.keras.layers.BatchNormalization(),\n", 494 | " Dense(64, activation='relu', kernel_regularizer=tf.keras.regularizers.l2(weight_decay)),\n", 495 | " Dense(64, activation='relu', kernel_regularizer=tf.keras.regularizers.l2(weight_decay)),\n", 496 | " Dropout(dropout_rate),\n", 497 | " Dense(64, activation='relu', kernel_regularizer=tf.keras.regularizers.l2(weight_decay)),\n", 498 | " Dense(64, activation='relu', kernel_regularizer=tf.keras.regularizers.l2(weight_decay)),\n", 499 | " Dense(3, activation='softmax')\n", 500 | " ])\n", 501 | " return model\n", 502 | "\n" 503 | ] 504 | }, 505 | { 506 | "cell_type": 
"markdown", 507 | "metadata": { 508 | "pycharm": { 509 | "name": "#%% md\n" 510 | } 511 | }, 512 | "source": [ 513 | "#### Instantiate, compile and train the model" 514 | ] 515 | }, 516 | { 517 | "cell_type": "code", 518 | "execution_count": null, 519 | "metadata": {}, 520 | "outputs": [], 521 | "source": [ 522 | "newreg_model = get_regularised_model(train_data[0].shape, 0.5, 0.001)\n", 523 | "\n", 524 | "# Instantiate the model, using a dropout rate of 0.3 and weight decay coefficient of 0.001\n" 525 | ] 526 | }, 527 | { 528 | "cell_type": "code", 529 | "execution_count": null, 530 | "metadata": {}, 531 | "outputs": [], 532 | "source": [ 533 | "# Compile the model\n", 534 | "\n", 535 | "compile_model(newreg_model)" 536 | ] 537 | }, 538 | { 539 | "cell_type": "code", 540 | "execution_count": null, 541 | "outputs": [], 542 | "source": [ 543 | "# Train the model\n", 544 | "\n", 545 | "reg_history = train_model(newreg_model, train_data, train_targets, epochs=800)" 546 | ], 547 | "metadata": { 548 | "collapsed": false, 549 | "pycharm": { 550 | "name": "#%%\n" 551 | } 552 | } 553 | }, 554 | { 555 | "cell_type": "code", 556 | "execution_count": null, 557 | "outputs": [], 558 | "source": [ 559 | "newreg_model.evaluate(test_data,test_targets)" 560 | ], 561 | "metadata": { 562 | "collapsed": false, 563 | "pycharm": { 564 | "name": "#%%\n" 565 | } 566 | } 567 | }, 568 | { 569 | "cell_type": "code", 570 | "execution_count": null, 571 | "metadata": { 572 | "tags": [ 573 | "outputPrepend" 574 | ] 575 | }, 576 | "outputs": [], 577 | "source": [ 578 | "#### Plot the learning curves\n", 579 | "\n", 580 | "Let's now plot the loss and accuracy for the training and validation sets." 581 | ] 582 | }, 583 | { 584 | "cell_type": "code", 585 | "execution_count": null, 586 | "metadata": { 587 | "tags": [] 588 | }, 589 | "outputs": [], 590 | "source": [ 591 | "newreg_model.evaluate(test_data,test_targets)" 592 | ] 593 | }, 594 | { 595 | "cell_type": "markdown", 596 | "metadata": {}, 597 | "source": [ 598 | "#### Plot the learning curves\n", 599 | "\n", 600 | "Let's now plot the loss and accuracy for the training and validation sets." 601 | ] 602 | }, 603 | { 604 | "cell_type": "code", 605 | "execution_count": null, 606 | "metadata": {}, 607 | "outputs": [], 608 | "source": [ 609 | "#Run this cell to plot the new accuracy vs epoch graph\n", 610 | "\n", 611 | "try:\n", 612 | " plt.plot(reg_history.history['accuracy'])\n", 613 | " plt.plot(reg_history.history['val_accuracy'])\n", 614 | "except KeyError:\n", 615 | " plt.plot(reg_history.history['acc'])\n", 616 | " plt.plot(reg_history.history['val_acc'])\n", 617 | "plt.title('Accuracy vs. epochs')\n", 618 | "plt.ylabel('Loss')\n", 619 | "plt.xlabel('Epoch')\n", 620 | "plt.legend(['Training', 'Validation'], loc='lower right')\n", 621 | "plt.show() " 622 | ] 623 | }, 624 | { 625 | "cell_type": "code", 626 | "execution_count": null, 627 | "metadata": {}, 628 | "outputs": [], 629 | "source": [ 630 | "#Run this cell to plot the new loss vs epoch graph\n", 631 | "\n", 632 | "plt.plot(reg_history.history['loss'])\n", 633 | "plt.plot(reg_history.history['val_loss'])\n", 634 | "plt.title('Loss vs. 
epochs')\n", 635 | "plt.ylabel('Loss')\n", 636 | "plt.xlabel('Epoch')\n", 637 | "plt.legend(['Training', 'Validation'], loc='upper right')\n", 638 | "plt.show() " 639 | ] 640 | }, 641 | { 642 | "cell_type": "markdown", 643 | "metadata": {}, 644 | "source": [ 645 | "We can see that the regularisation has helped to reduce the overfitting of the network.\n", 646 | "You will now incorporate callbacks into a new training run that implements early stopping and learning rate reduction on plateaux.\n", 647 | "\n", 648 | "Fill in the function below so that:\n", 649 | "\n", 650 | "* It creates an `EarlyStopping` callback object and a `ReduceLROnPlateau` callback object\n", 651 | "* The early stopping callback is used and monitors validation loss with the mode set to `\"min\"` and patience of 30.\n", 652 | "* The learning rate reduction on plateaux is used with a learning rate factor of 0.2 and a patience of 20." 653 | ] 654 | }, 655 | { 656 | "cell_type": "code", 657 | "execution_count": null, 658 | "metadata": {}, 659 | "outputs": [], 660 | "source": [ 661 | "#### GRADED CELL ####\n", 662 | "\n", 663 | "# Complete the following function. \n", 664 | "# Make sure to not change the function name or arguments.\n", 665 | "\n", 666 | "def get_callbacks():\n", 667 | " \"\"\"\n", 668 | " This function should create and return a tuple (early_stopping, learning_rate_reduction) callbacks.\n", 669 | " The callbacks should be instantiated according to the above requirements.\n", 670 | " \"\"\"\n", 671 | " earlstop = tf.keras.callbacks.EarlyStopping( monitor='val_loss', patience=30, mode='min')\n", 672 | " lrr = tf.keras.callbacks.ReduceLROnPlateau(patience=20, factor=0.2)\n", 673 | " return earlstop,lrr\n", 674 | " " 675 | ] 676 | }, 677 | { 678 | "cell_type": "markdown", 679 | "metadata": {}, 680 | "source": [ 681 | "Run the cell below to instantiate and train the regularised model with the callbacks." 682 | ] 683 | }, 684 | { 685 | "cell_type": "code", 686 | "execution_count": null, 687 | "metadata": {}, 688 | "outputs": [], 689 | "source": [ 690 | "call_model = get_regularised_model(train_data[0].shape, 0.3, 0.0001)\n", 691 | "compile_model(call_model)\n", 692 | "early_stopping, learning_rate_reduction = get_callbacks()\n", 693 | "call_history = call_model.fit(train_data, train_targets, epochs=800, validation_split=0.15,\n", 694 | " callbacks=[early_stopping, learning_rate_reduction], verbose=0)" 695 | ] 696 | }, 697 | { 698 | "cell_type": "code", 699 | "execution_count": null, 700 | "metadata": {}, 701 | "outputs": [], 702 | "source": [ 703 | "learning_rate_reduction.patience" 704 | ] 705 | }, 706 | { 707 | "cell_type": "markdown", 708 | "metadata": {}, 709 | "source": [ 710 | "Finally, let's replot the accuracy and loss graphs for our new model." 711 | ] 712 | }, 713 | { 714 | "cell_type": "code", 715 | "execution_count": null, 716 | "metadata": {}, 717 | "outputs": [], 718 | "source": [ 719 | "try:\n", 720 | " plt.plot(call_history.history['accuracy'])\n", 721 | " plt.plot(call_history.history['val_accuracy'])\n", 722 | "except KeyError:\n", 723 | " plt.plot(call_history.history['acc'])\n", 724 | " plt.plot(call_history.history['val_acc'])\n", 725 | "plt.title('Accuracy vs. 
epochs')\n", 726 | "plt.ylabel('Accuracy')\n", 727 | "plt.xlabel('Epoch')\n", 728 | "plt.legend(['Training', 'Validation'], loc='lower right')\n", 729 | "plt.show() " 730 | ] 731 | }, 732 | { 733 | "cell_type": "code", 734 | "execution_count": null, 735 | "metadata": {}, 736 | "outputs": [], 737 | "source": [ 738 | "plt.plot(call_history.history['loss'])\n", 739 | "plt.plot(call_history.history['val_loss'])\n", 740 | "plt.title('Loss vs. epochs')\n", 741 | "plt.ylabel('Loss')\n", 742 | "plt.xlabel('Epoch')\n", 743 | "plt.legend(['Training', 'Validation'], loc='upper right')\n", 744 | "plt.show() " 745 | ] 746 | }, 747 | { 748 | "cell_type": "code", 749 | "execution_count": null, 750 | "metadata": {}, 751 | "outputs": [], 752 | "source": [ 753 | "# Evaluate the model on the test set\n", 754 | "\n", 755 | "test_loss, test_acc = call_model.evaluate(test_data, test_targets, verbose=0)\n", 756 | "print(\"Test loss: {:.3f}\\nTest accuracy: {:.2f}%\".format(test_loss, 100 * test_acc))" 757 | ] 758 | }, 759 | { 760 | "cell_type": "markdown", 761 | "metadata": {}, 762 | "source": [ 763 | "Congratulations for completing this programming assignment! In the next week of the course we will learn how to save and load pre-trained models." 764 | ] 765 | } 766 | ], 767 | "metadata": { 768 | "coursera": { 769 | "course_slug": "tensor-flow-2-1", 770 | "graded_item_id": "mtZ4n", 771 | "launcher_item_id": "WphgK" 772 | }, 773 | "kernelspec": { 774 | "name": "python37764bitmyenvconda4a11ba26287d4d1c969b9946e31eb2a2", 775 | "language": "python", 776 | "display_name": "Python 3.7.7 64-bit ('myenv': conda)" 777 | }, 778 | "language_info": { 779 | "codemirror_mode": { 780 | "name": "ipython", 781 | "version": 3 782 | }, 783 | "file_extension": ".py", 784 | "mimetype": "text/x-python", 785 | "name": "python", 786 | "nbconvert_exporter": "python", 787 | "pygments_lexer": "ipython3", 788 | "version": "3.7.7-final" 789 | } 790 | }, 791 | "nbformat": 4, 792 | "nbformat_minor": 2 793 | } -------------------------------------------------------------------------------- /Validation, Regularization and Callbacks/readme.md: -------------------------------------------------------------------------------- 1 | # Validation, Regularization, & Callbacks 2 | ---- 3 | 4 | ### Validation sets 5 | 6 | Measures on how much our data is performing outside the training data, while training. 7 | 8 | Means this Validation data is never used for training the model but is used to evaluate while the model is being trained. 9 | 10 | ### Ways to add validation sets 11 | 12 | #### 1st Way 13 | 14 | First way is to pass `validation_split` argument in `model.fit()` to specify the validation split. 15 | 16 | Using this way will automatically split data in training and validation sets. Lets see the code. 17 | 18 | ```python3 19 | 20 | hist = model.fit(X_train, y_train, epochs=20, verbose=0, validation_split=0.2) 21 | 22 | ``` 23 | 24 | Here it automaticallly splits X_train and y_train into sperate training & validation sets with 80% training data and 20% validation data. 25 | 26 | #### 2nd Way 27 | 28 | Second way is to seperately pass validation set into `model.fit()` which we get from different resources, using `validation_data()` parameter. 29 | 30 | Lets see the code. 
26 | #### 2nd Way 27 | 28 | The second way is to separately pass a validation set to `model.fit()`, which we may have obtained from a different source, using the `validation_data` parameter. 29 | 30 | Let's see the code. 31 | ```python3 32 | from tensorflow.keras.datasets import mnist 33 | 34 | (x_train, y_train), (x_test, y_test) = mnist.load_data() 35 | 36 | # Create and define model structure 37 | # Compile the model using model.compile 38 | 39 | model.fit(x_train, y_train, epochs=20, verbose=0, validation_data=(x_test, y_test)) 40 | 41 | ``` 42 | 43 | #### 3rd Way 44 | 45 | The third way to split a dataset into training and validation sets is to use the `train_test_split` function from `sklearn.model_selection`. 46 | 47 | Let's see it in code. 48 | 49 | ```python3 50 | 51 | from sklearn.model_selection import train_test_split 52 | 53 | X_train, X_val, y_train, y_val = train_test_split(trainData, trainLabels, test_size=0.1) 54 | 55 | model.fit(X_train, y_train, epochs=20, validation_data=(X_val, y_val)) 56 | ``` 57 | ---- 58 | 59 | ## Model Regularisation 60 | 61 | Now we will learn how to add regularisation terms to our model. Our main focus will be on L1 regularisation, L2 regularisation, and dropout. 62 | 63 | I won't teach the theory behind L1, L2, and dropout here, but you can easily look them up. 64 | 65 | Let's see how you add these in Keras. 66 | 67 | #### L1 and L2 68 | You can add L1 and L2 regularisation to any Dense or Conv layer, and the penalty will automatically be added to the loss function. All you need to do is add them to your layer, using `kernel_regularizer` for the weights and `bias_regularizer` for the biases. 69 | 70 | Let's check it in code. 71 | 72 | ```python3 73 | import tensorflow as tf 74 | from tensorflow.keras.layers import Dense, Dropout 75 | from tensorflow.keras.models import Sequential 76 | 77 | model = Sequential([ 78 | Dense(64, activation='relu', kernel_regularizer=tf.keras.regularizers.l1(0.05)), 79 | Dense(64, activation='relu', kernel_regularizer=tf.keras.regularizers.l2(0.01)), 80 | Dense(64, activation='relu', kernel_regularizer=tf.keras.regularizers.l1_l2( 81 | l1=0.01, l2=0.01)), 82 | Dropout(0.5), 83 | Dense(1, activation='sigmoid') 84 | ]) 85 | 86 | ``` 87 | 88 | L1 is computed as `λ * Σᵢ|wᵢ|` 89 | 90 | and L2 is computed as `λ * Σᵢwᵢ²` 91 | 92 | where `λ` is the coefficient passed to the regularizer (e.g. `0.05` above) and `wᵢ` are the layer's weights; this penalty term is added to the loss. 93 | 94 | 95 | Here we add L1, L2, and combined L1+L2 regularisation using tf.keras. 96 | 97 | We also added a Dropout layer, which randomly drops 50% of the neurons during training; each unit is kept or dropped according to a Bernoulli distribution, which is why this is also known as Bernoulli dropout. 98 | 99 | Remember that dropout is only active during the `model.fit()` phase; it is not applied in `model.evaluate()` or `model.predict()`. 100 | 101 | --- 102 | 103 | ## Batch Normalization 104 | 105 | You can add batch normalisation to a model with a `tf.keras.layers.BatchNormalization()` layer. Learn more about batch normalization in [this](Batch%20normalisation.ipynb) notebook. 106 | 107 | ---- 108 | 109 | ## CallBacks 110 | 111 | Callbacks are objects that can monitor the loss and metrics at certain points during training, and can perform actions based on them. 112 | 113 | Besides the callbacks used here in training, there are several other types of callbacks that monitor different quantities and act on them. 114 | 115 | ```python3 116 | from tensorflow.keras.callbacks import Callback 117 | ``` 118 | 119 | There are 2 ways to use callbacks in Keras: 120 | - the `Callback` base class (which we just imported), from which we make our own subclass 121 | - the built-in callbacks 122 | 123 | Let's create our own subclass first. 
124 | 125 | ```python3 126 | from tensorflow.keras.callbacks import Callback 127 | 128 | class My_Callback(Callback): 129 | # We override the built-in methods 130 | 131 | def on_train_begin(self, logs=None): 132 | # do something at the start of training 133 | pass 134 | 135 | def on_train_batch_begin(self, batch, logs=None): 136 | # do something at the start of each batch 137 | pass 138 | 139 | def on_epoch_end(self, epoch, logs=None): 140 | # do something at the end of each epoch 141 | pass 142 | 143 | 144 | history = model.fit(X_train, y_train, epochs=5, callbacks=[My_Callback()]) 145 | 146 | ``` 147 | 148 | Here `history` is itself an example of a callback: its job is to store the loss and metrics in dictionary format in its `history` attribute, i.e. it stores the loss and metrics in `history.history`. 149 | 150 | For details on custom callbacks and their applications, refer to [this](Custom%20Callback.ipynb) notebook, where we implement learning rate decay with Keras, reducing the learning rate as the number of epochs increases. 151 | 152 | ## Early Stopping 153 | 154 | Early stopping is another important technique. Say we have set the number of epochs to 10000 and started training, but the model is not improving, or is even getting worse. How do we stop it, in order to save computational resources? 155 | This is where early stopping comes in. 156 | 157 | Early stopping is a callback that is built into `keras`. Let's code it first, and then I will 158 | explain its components step by step. I will not define the structure of my model here, and I 159 | assume that you know the basic structure of a model. 160 | 161 | ```python 162 | from tensorflow.keras.callbacks import EarlyStopping 163 | 164 | early_stopping = EarlyStopping() 165 | 166 | model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy']) 167 | 168 | model.fit(X_train, y_train, validation_split=0.01, epochs=100, callbacks=[early_stopping]) 169 | ``` 170 | 171 | Here we have added `EarlyStopping` to our model, which will stop training based on the validation loss `val_loss` by default. 172 | It has a `monitor` argument, on the basis of which it stops the training. By default, 173 | `monitor` is set to `val_loss`, meaning it evaluates performance on the validation loss. We can change 174 | it to validation accuracy too. 175 | 176 | ```python 177 | early_stopping = EarlyStopping(monitor='val_accuracy') 178 | ``` 179 | 180 | Now it will evaluate on the basis of accuracy on the validation set. 181 | 182 | Another important parameter in early stopping is `patience`. By default `patience` is 0, 183 | which means that for any epoch, if the performance measured by `monitor` is worse than in the previous 184 | epoch, training is terminated. 185 | 186 | Normally we do not want to terminate training based on a single bad epoch, because the model may improve in the next one, 187 | so we increase this value. 188 | 189 | ```python 190 | early_stopping = EarlyStopping(monitor='val_accuracy', patience=5) 191 | ``` 
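To make this concrete, here is a minimal, self-contained sketch in which early stopping cuts a long training run short. The tiny model and the random, Iris-shaped data are illustrative assumptions, not part of the course material:

```python3
import numpy as np
import tensorflow as tf
from tensorflow.keras.callbacks import EarlyStopping
from tensorflow.keras.layers import Dense
from tensorflow.keras.models import Sequential

# Illustrative toy data: 150 samples, 4 features, 3 classes (Iris-like shapes)
X = np.random.rand(150, 4)
y = tf.keras.utils.to_categorical(np.random.randint(0, 3, 150))

model = Sequential([
    Dense(16, activation='relu', input_shape=(4,)),
    Dense(3, activation='softmax')
])
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])

# Stop once val_loss has gone 5 consecutive epochs without improving
early_stopping = EarlyStopping(monitor='val_loss', patience=5)

history = model.fit(X, y, validation_split=0.2, epochs=1000,
                    callbacks=[early_stopping], verbose=0)

# On random labels val_loss stops improving quickly, so this usually
# prints a number far below 1000
print('Stopped after', len(history.history['loss']), 'epochs')
```

Note that `patience=5` counts epochs since the last improvement of the monitored value, not the total number of bad epochs over the whole run.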
192 | One other parameter which is commonly used is `min_delta`, which is 0 by default. Say your model improves 193 | by 0.00001 at every epoch, which is practically no improvement at all. Early stopping will still count 194 | it as an improvement, because it is greater than `min_delta`. 195 | 196 | So with `min_delta` we specify a minimum amount of change: early stopping will only count a change in the 197 | `monitor` value as an improvement if it is greater than `min_delta`. 198 | 199 | ```python 200 | early_stopping = EarlyStopping(monitor='val_accuracy', patience=5, min_delta=0.1) 201 | ``` 202 | In this case, if `val_accuracy` increases by less than 0.1, it won't be counted as an improvement. 203 | 204 | The last parameter I want to talk about is `mode`. As you might have guessed, it is not necessarily the case that the value of 205 | `monitor` should always increase; e.g. if `monitor` is `val_loss`, it should decrease with each epoch. By default `mode` is `"auto"`, 206 | but it can be set to `"max"` or `"min"` for an increasing or a decreasing monitored value respectively, e.g. `EarlyStopping(monitor='val_loss', mode='min')`. 207 | 208 | ------ 209 | 210 | ###### You can find all the work of this week in [this](Validation_Regularization_CallBacks.ipynb) file. 211 | ### Now it is your turn to check your skills: open [this]() file and start using these concepts on the Iris dataset. 212 | 213 | -------------------------------------------------------------------------------- /readme.md: -------------------------------------------------------------------------------- 1 | # Getting Started with Tensorflow 2 2 | ------- 3 | 4 | #### This is a detailed repository with all the information and code you need to get started with TensorFlow, one of the most popular deep learning frameworks. 5 | #### It assumes that you have basic knowledge of deep learning and understand how a neural network works. 6 | 7 | #### Work through the material in the following sequence: read the readme section first, then attempt the notebooks. 8 | * [Introduction To Google Colab](https://github.com/ahmadmustafaanis/Getting-Started-with-Tensorflow-2/tree/master/Introduction%20to%20Google%20Colab) 9 | * [Tensorflow Keras API Basics](https://github.com/ahmadmustafaanis/Getting-Started-with-Tensorflow-2/tree/master/TF.Keras%20Sequential%20API%20Basics) 10 | * [Validation, Regularization, and Callbacks](https://github.com/ahmadmustafaanis/Getting-Started-with-Tensorflow-2/tree/master/Validation%2C%20Regularization%20and%20Callbacks) 11 | * [Saving and Loading Model Weights](https://github.com/ahmadmustafaanis/Getting-Started-with-Tensorflow-2/tree/master/Saving%20and%20Loading%20Model%20Weights) 12 | * [Capstone Project](https://github.com/ahmadmustafaanis/Getting-Started-with-Tensorflow-2/tree/master/Capstone%20Project) 13 | 14 | Everything here was learned during the course [Getting Started with Tensorflow 2](https://www.coursera.org/learn/getting-started-with-tensor-flow2). --------------------------------------------------------------------------------