├── .gitattributes ├── 9781484266151.jpg ├── Chapter 1.ipynb ├── Chapter1 ├── Chapter 1.ipynb ├── Chapter1 ├── Coin.png └── Mario.png ├── Chapter2 ├── Chapter2 ├── DogVsCatClassification.ipynb ├── MNIST.png └── MNISTImageUsingCNN.ipynb ├── Chapter3 ├── Chapter3 ├── GermanTrafficClassificationUsingLeNet-5.ipynb ├── Important └── LeNet MNIST Digit Classification.ipynb ├── Chapter4 ├── CIFAR10_AlexNet.ipynb ├── CIFAR10_VGG.ipynb └── Chapter4 ├── Chapter5 ├── Chapter5 └── YOLO.ipynb ├── Chapter6 ├── Chapter6 ├── Chapter6.ipynb └── Important ├── Chapter7 ├── Dataset and codes for Chapter7 ├── Important └── VideoAnalyticsChapter7.ipynb ├── Chapter8 ├── Chapter 8.ipynb ├── Chapter8 └── Hoover.jpg ├── Coin.png ├── Contributing.md ├── LICENSE.txt ├── Mario.png └── README.md /.gitattributes: -------------------------------------------------------------------------------- 1 | # Auto detect text files and perform LF normalization 2 | * text=auto 3 | -------------------------------------------------------------------------------- /9781484266151.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Apress/computer-vision-using-deep-learning/5c08fdbc4ffd67297512cc4d929c111c885e0c75/9781484266151.jpg -------------------------------------------------------------------------------- /Chapter1/Chapter1: -------------------------------------------------------------------------------- 1 | 2 | -------------------------------------------------------------------------------- /Chapter1/Coin.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Apress/computer-vision-using-deep-learning/5c08fdbc4ffd67297512cc4d929c111c885e0c75/Chapter1/Coin.png -------------------------------------------------------------------------------- /Chapter1/Mario.png: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/Apress/computer-vision-using-deep-learning/5c08fdbc4ffd67297512cc4d929c111c885e0c75/Chapter1/Mario.png -------------------------------------------------------------------------------- /Chapter2/Chapter2: -------------------------------------------------------------------------------- 1 | 2 | -------------------------------------------------------------------------------- /Chapter2/DogVsCatClassification.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "code", 5 | "execution_count": null, 6 | "metadata": {}, 7 | "outputs": [], 8 | "source": [ 9 | "# Here we are building an Image Classifier using Keras. The dataset used is from Kaggle Dog vs Cat Image \n", 10 | "# Classification Problem. First let's build the dataset\n", 11 | "# Step 1: Download the dataset from Kaggle\n", 12 | "# Step 2: Unzip the dataset\n", 13 | "# Step 3: You will find 2 folders, test and train\n", 14 | "# Step 4: Delete the test folder. We will create our own test folder.\n", 15 | "# Step 5: Inside both the train and test folders, create 2 sub-folders cats and dogs.\n", 16 | "# Step 7: Put all the cat images in the cats folder and all the dog images in the dogs folder\n", 17 | "# Step 8: Take some images (I took 2000) from train>cats folder and put them in test>cats folder\n", 18 | "# Step 9: Take some images (I took 2000) from train>dogs folder and put them in test>dogs folder\n", 19 | "# Step 10: Your dataset is ready" 20 | ] 21 | }, 22 | { 23 | "cell_type": "code", 24 | "execution_count": 69, 25 | "metadata": {}, 26 | "outputs": [], 27 | "source": [ 28 | "from keras.models import Sequential # Import the Sequential model class. \n", 29 | "#Generally there are two types of models, sequential and functional. 
Sequential is the most common one\n", 30 | "from keras.layers import Conv2D,Activation,MaxPooling2D,Dense,Flatten,Dropout\n", 31 | "import numpy as np" 32 | ] 33 | }, 34 | { 35 | "cell_type": "code", 36 | "execution_count": null, 37 | "metadata": {}, 38 | "outputs": [], 39 | "source": [ 40 | "#Initialize a catDogImageclassifier variable here" 41 | ] 42 | }, 43 | { 44 | "cell_type": "code", 45 | "execution_count": 70, 46 | "metadata": {}, 47 | "outputs": [], 48 | "source": [ 49 | "catDogImageclassifier = Sequential()" 50 | ] 51 | }, 52 | { 53 | "cell_type": "code", 54 | "execution_count": null, 55 | "metadata": {}, 56 | "outputs": [], 57 | "source": [ 58 | "# We are adding layers to our network here.\n", 59 | "# Conv2D: 2 dimensional convolutional layer\n", 60 | "# 32: number of filters. \n", 61 | "# 3,3: size of the filter (3 rows, 3 columns)\n", 62 | "# Input Image shape is 64*64*3 - height*width*RGB. Each number represents pixel intensity (0-255)\n" 63 | ] 64 | }, 65 | { 66 | "cell_type": "code", 67 | "execution_count": 71, 68 | "metadata": {}, 69 | "outputs": [], 70 | "source": [ 71 | "catDogImageclassifier.add(Conv2D(32,(3,3),input_shape=(64,64,3)))" 72 | ] 73 | }, 74 | { 75 | "cell_type": "code", 76 | "execution_count": null, 77 | "metadata": {}, 78 | "outputs": [], 79 | "source": [ 80 | "# The output is a set of feature maps, one per filter, computed from each input image" 81 | ] 82 | }, 83 | { 84 | "cell_type": "code", 85 | "execution_count": null, 86 | "metadata": {}, 87 | "outputs": [], 88 | "source": [ 89 | "# Let's add the activation function now. We are using ReLU (Rectified Linear Unit). \n", 90 | "#The activation function computes its output from the layer's input. 
In the feature map output from the previous layer,\n", 91 | "#the activation function will replace all the negative values with zero" 92 | ] 93 | }, 94 | { 95 | "cell_type": "code", 96 | "execution_count": 72, 97 | "metadata": {}, 98 | "outputs": [], 99 | "source": [ 100 | "catDogImageclassifier.add(Activation('relu'))" 101 | ] 102 | }, 103 | { 104 | "cell_type": "code", 105 | "execution_count": null, 106 | "metadata": {}, 107 | "outputs": [], 108 | "source": [ 109 | "# We do not want our network to be overly complex computationally, so we add a pooling layer\n", 110 | "# The pooling layer will reduce the dimensions. Max pooling with a 2x2 window keeps the maximum value in each window, so the \n", 111 | "# significant features will be retained" 112 | ] 113 | }, 114 | { 115 | "cell_type": "code", 116 | "execution_count": 73, 117 | "metadata": {}, 118 | "outputs": [], 119 | "source": [ 120 | "catDogImageclassifier.add(MaxPooling2D(pool_size =(2,2)))" 121 | ] 122 | }, 123 | { 124 | "cell_type": "code", 125 | "execution_count": null, 126 | "metadata": {}, 127 | "outputs": [], 128 | "source": [ 129 | "# Add the three remaining convolutional blocks. 
" 130 | ] 131 | }, 132 | { 133 | "cell_type": "code", 134 | "execution_count": 74, 135 | "metadata": {}, 136 | "outputs": [], 137 | "source": [ 138 | "catDogImageclassifier.add(Conv2D(32,(3,3))) # Convolutional Layer\n", 139 | "catDogImageclassifier.add(Activation('relu')) # Activation Layer\n", 140 | "catDogImageclassifier.add(MaxPooling2D(pool_size =(2,2))) # Pooling Layer\n", 141 | "catDogImageclassifier.add(Conv2D(32,(3,3))) # Convolutional Layer\n", 142 | "catDogImageclassifier.add(Activation('relu')) # Activation Layer\n", 143 | "catDogImageclassifier.add(MaxPooling2D(pool_size =(2,2))) # Pooling Layer\n", 144 | "catDogImageclassifier.add(Conv2D(32,(3,3))) # Convolutional Layer\n", 145 | "catDogImageclassifier.add(Activation('relu')) # Activation Layer\n", 146 | "catDogImageclassifier.add(MaxPooling2D(pool_size =(2,2))) # Pooling Layer" 147 | ] 148 | }, 149 | { 150 | "cell_type": "code", 151 | "execution_count": null, 152 | "metadata": {}, 153 | "outputs": [], 154 | "source": [ 155 | "# Overfitting is a nuisance. We will fight it using dropout. First, prepare the data for the dense layers 
\n", 156 | "# by flattening it to 1 dimension" 157 | ] 158 | }, 159 | { 160 | "cell_type": "code", 161 | "execution_count": 75, 162 | "metadata": {}, 163 | "outputs": [], 164 | "source": [ 165 | "catDogImageclassifier.add(Flatten())" 166 | ] 167 | }, 168 | { 169 | "cell_type": "code", 170 | "execution_count": null, 171 | "metadata": {}, 172 | "outputs": [], 173 | "source": [ 174 | "# Add a dense layer now, followed by ReLU activation" 175 | ] 176 | }, 177 | { 178 | "cell_type": "code", 179 | "execution_count": 76, 180 | "metadata": {}, 181 | "outputs": [], 182 | "source": [ 183 | "catDogImageclassifier.add(Dense(64))\n", 184 | "catDogImageclassifier.add(Activation('relu'))" 185 | ] 186 | }, 187 | { 188 | "cell_type": "code", 189 | "execution_count": null, 190 | "metadata": {}, 191 | "outputs": [], 192 | "source": [ 193 | "# Here add the dropout layer\n", 194 | "# Overfitting means that the model works well on the training data but fails on the testing data" 195 | ] 196 | }, 197 | { 198 | "cell_type": "code", 199 | "execution_count": 77, 200 | "metadata": {}, 201 | "outputs": [], 202 | "source": [ 203 | "catDogImageclassifier.add(Dropout(0.5))" 204 | ] 205 | }, 206 | { 207 | "cell_type": "code", 208 | "execution_count": null, 209 | "metadata": {}, 210 | "outputs": [], 211 | "source": [ 212 | "# Add one more fully connected layer to produce the output. With two classes, a single output unit is enough" 213 | ] 214 | }, 215 | { 216 | "cell_type": "code", 217 | "execution_count": 78, 218 | "metadata": {}, 219 | "outputs": [], 220 | "source": [ 221 | "catDogImageclassifier.add(Dense(1))" 222 | ] 223 | }, 224 | { 225 | "cell_type": "code", 226 | "execution_count": null, 227 | "metadata": {}, 228 | "outputs": [], 229 | "source": [ 230 | "# Sigmoid function to convert the output to a probability" 231 | ] 232 | }, 233 | { 234 | "cell_type": "code", 235 | "execution_count": 79, 236 | "metadata": {}, 237 | "outputs": [], 238 | "source": [ 239 | 
"catDogImageclassifier.add(Activation('sigmoid'))" 240 | ] 241 | }, 242 | { 243 | "cell_type": "code", 244 | "execution_count": null, 245 | "metadata": {}, 246 | "outputs": [], 247 | "source": [ 248 | "# Let us look at how our network looks" 249 | ] 250 | }, 251 | { 252 | "cell_type": "code", 253 | "execution_count": 80, 254 | "metadata": {}, 255 | "outputs": [ 256 | { 257 | "name": "stdout", 258 | "output_type": "stream", 259 | "text": [ 260 | "_________________________________________________________________\n", 261 | "Layer (type) Output Shape Param # \n", 262 | "=================================================================\n", 263 | "conv2d_27 (Conv2D) (None, 62, 62, 32) 896 \n", 264 | "_________________________________________________________________\n", 265 | "activation_22 (Activation) (None, 62, 62, 32) 0 \n", 266 | "_________________________________________________________________\n", 267 | "max_pooling2d_20 (MaxPooling (None, 31, 31, 32) 0 \n", 268 | "_________________________________________________________________\n", 269 | "conv2d_28 (Conv2D) (None, 29, 29, 32) 9248 \n", 270 | "_________________________________________________________________\n", 271 | "activation_23 (Activation) (None, 29, 29, 32) 0 \n", 272 | "_________________________________________________________________\n", 273 | "max_pooling2d_21 (MaxPooling (None, 14, 14, 32) 0 \n", 274 | "_________________________________________________________________\n", 275 | "conv2d_29 (Conv2D) (None, 12, 12, 32) 9248 \n", 276 | "_________________________________________________________________\n", 277 | "activation_24 (Activation) (None, 12, 12, 32) 0 \n", 278 | "_________________________________________________________________\n", 279 | "max_pooling2d_22 (MaxPooling (None, 6, 6, 32) 0 \n", 280 | "_________________________________________________________________\n", 281 | "conv2d_30 (Conv2D) (None, 4, 4, 32) 9248 \n", 282 | "_________________________________________________________________\n", 283 | 
"activation_25 (Activation) (None, 4, 4, 32) 0 \n", 284 | "_________________________________________________________________\n", 285 | "max_pooling2d_23 (MaxPooling (None, 2, 2, 32) 0 \n", 286 | "_________________________________________________________________\n", 287 | "flatten_2 (Flatten) (None, 128) 0 \n", 288 | "_________________________________________________________________\n", 289 | "dense_3 (Dense) (None, 64) 8256 \n", 290 | "_________________________________________________________________\n", 291 | "activation_26 (Activation) (None, 64) 0 \n", 292 | "_________________________________________________________________\n", 293 | "dropout_2 (Dropout) (None, 64) 0 \n", 294 | "_________________________________________________________________\n", 295 | "dense_4 (Dense) (None, 1) 65 \n", 296 | "_________________________________________________________________\n", 297 | "activation_27 (Activation) (None, 1) 0 \n", 298 | "=================================================================\n", 299 | "Total params: 36,961\n", 300 | "Trainable params: 36,961\n", 301 | "Non-trainable params: 0\n", 302 | "_________________________________________________________________\n" 303 | ] 304 | } 305 | ], 306 | "source": [ 307 | "catDogImageclassifier.summary()" 308 | ] 309 | }, 310 | { 311 | "cell_type": "code", 312 | "execution_count": null, 313 | "metadata": {}, 314 | "outputs": [], 315 | "source": [ 316 | "# A quick look at the network summary states that total number of parameters in our network are 36,961. 
\n", 317 | "#Play around with different network structures and have a look how this number changes" 318 | ] 319 | }, 320 | { 321 | "cell_type": "code", 322 | "execution_count": 81, 323 | "metadata": {}, 324 | "outputs": [], 325 | "source": [ 326 | "catDogImageclassifier.compile(optimizer ='rmsprop',# rmsprop is the optimizer using Gradient Descent\n", 327 | " loss ='binary_crossentropy', # Loss or cost function for the model\n", 328 | " metrics =['accuracy']) # The KPI " 329 | ] 330 | }, 331 | { 332 | "cell_type": "code", 333 | "execution_count": null, 334 | "metadata": {}, 335 | "outputs": [], 336 | "source": [ 337 | "# Let us do some data augmentation here. It helps to fight overfitting. Zoom, scale etc. \n", 338 | "# There is a function ImageDataGenerator which is used here" 339 | ] 340 | }, 341 | { 342 | "cell_type": "code", 343 | "execution_count": 82, 344 | "metadata": {}, 345 | "outputs": [], 346 | "source": [ 347 | "from keras.preprocessing.image import ImageDataGenerator\n", 348 | "train_datagen = ImageDataGenerator(rescale =1./255,\n", 349 | " shear_range =0.25,\n", 350 | " zoom_range = 0.25,\n", 351 | " horizontal_flip =True)\n", 352 | "test_datagen = ImageDataGenerator(rescale = 1./255)" 353 | ] 354 | }, 355 | { 356 | "cell_type": "code", 357 | "execution_count": null, 358 | "metadata": {}, 359 | "outputs": [], 360 | "source": [ 361 | "# Load the training data" 362 | ] 363 | }, 364 | { 365 | "cell_type": "code", 366 | "execution_count": 83, 367 | "metadata": {}, 368 | "outputs": [ 369 | { 370 | "name": "stdout", 371 | "output_type": "stream", 372 | "text": [ 373 | "Found 23000 images belonging to 2 classes.\n" 374 | ] 375 | } 376 | ], 377 | "source": [ 378 | "training_set = train_datagen.flow_from_directory('/Users/vaibhavverdhan/Book Writing/DogsCats/train',target_size=(64,64),batch_size= 32,class_mode='binary')" 379 | ] 380 | }, 381 | { 382 | "cell_type": "code", 383 | "execution_count": null, 384 | "metadata": {}, 385 | "outputs": [], 386 | "source": 
[ 387 | "# Load the testing data" 388 | ] 389 | }, 390 | { 391 | "cell_type": "code", 392 | "execution_count": 84, 393 | "metadata": {}, 394 | "outputs": [ 395 | { 396 | "name": "stdout", 397 | "output_type": "stream", 398 | "text": [ 399 | "Found 2000 images belonging to 2 classes.\n" 400 | ] 401 | } 402 | ], 403 | "source": [ 404 | "test_set = test_datagen.flow_from_directory('/Users/vaibhavverdhan/Book Writing/DogsCats/test',\n", 405 | " target_size = (64,64),\n", 406 | " batch_size = 32,\n", 407 | " class_mode ='binary')" 408 | ] 409 | }, 410 | { 411 | "cell_type": "code", 412 | "execution_count": null, 413 | "metadata": {}, 414 | "outputs": [], 415 | "source": [ 416 | "# Let us begin the training now. Steps per epoch is 625 and number of epochs is 10. \n", 417 | "#An epoch is one full cycle through the training data\n", 418 | "# Steps and batch size need to be understood next. For example: if we have 1000 images and a batch size of 10, it means\n", 419 | "# number of steps = 1000/10, so 100 steps are required.\n", 420 | "# Depending on the complexity of the network, the number of epochs given etc., the training will take time.\n", 421 | "# The test dataset is passed as validation_data here." 
422 | ] 423 | }, 424 | { 425 | "cell_type": "code", 426 | "execution_count": 85, 427 | "metadata": {}, 428 | "outputs": [ 429 | { 430 | "name": "stdout", 431 | "output_type": "stream", 432 | "text": [ 433 | "Epoch 1/10\n", 434 | "625/625 [==============================] - 185s 296ms/step - loss: 0.6721 - acc: 0.5822 - val_loss: 0.6069 - val_acc: 0.6610\n", 435 | "Epoch 2/10\n", 436 | "625/625 [==============================] - 152s 243ms/step - loss: 0.5960 - acc: 0.6831 - val_loss: 0.5151 - val_acc: 0.7543\n", 437 | "Epoch 3/10\n", 438 | "625/625 [==============================] - 151s 242ms/step - loss: 0.5452 - acc: 0.7217 - val_loss: 0.4891 - val_acc: 0.7545\n", 439 | "Epoch 4/10\n", 440 | "625/625 [==============================] - 150s 239ms/step - loss: 0.5069 - acc: 0.7568 - val_loss: 0.4657 - val_acc: 0.7743\n", 441 | "Epoch 5/10\n", 442 | "625/625 [==============================] - 150s 240ms/step - loss: 0.4813 - acc: 0.7713 - val_loss: 0.4407 - val_acc: 0.7925\n", 443 | "Epoch 6/10\n", 444 | "625/625 [==============================] - 152s 243ms/step - loss: 0.4526 - acc: 0.7866 - val_loss: 0.4374 - val_acc: 0.7924\n", 445 | "Epoch 7/10\n", 446 | "625/625 [==============================] - 151s 241ms/step - loss: 0.4458 - acc: 0.7953 - val_loss: 0.3891 - val_acc: 0.8324\n", 447 | "Epoch 8/10\n", 448 | "625/625 [==============================] - 151s 242ms/step - loss: 0.4177 - acc: 0.8123 - val_loss: 0.3917 - val_acc: 0.8221\n", 449 | "Epoch 9/10\n", 450 | "625/625 [==============================] - 155s 248ms/step - loss: 0.4158 - acc: 0.8158 - val_loss: 0.3947 - val_acc: 0.8176\n", 451 | "Epoch 10/10\n", 452 | "625/625 [==============================] - 151s 241ms/step - loss: 0.4021 - acc: 0.8201 - val_loss: 0.3783 - val_acc: 0.8221\n" 453 | ] 454 | }, 455 | { 456 | "data": { 457 | "text/plain": [ 458 | "" 459 | ] 460 | }, 461 | "execution_count": 85, 462 | "metadata": {}, 463 | "output_type": "execute_result" 464 | } 465 | ], 466 | "source": [ 467 | 
"from IPython.display import display\n", 468 | "from PIL import Image\n", 469 | "catDogImageclassifier.fit_generator(training_set,\n", 470 | " steps_per_epoch =625,\n", 471 | " epochs = 10,\n", 472 | " validation_data =test_set,\n", 473 | " validation_steps = 1000)" 474 | ] 475 | }, 476 | { 477 | "cell_type": "code", 478 | "execution_count": null, 479 | "metadata": {}, 480 | "outputs": [], 481 | "source": [ 482 | "# We can see here that in the final epoch we got a validation accuracy of 82.21%. \n", 483 | "#We can also see that in Epoch 7 we got a validation accuracy of 83.24%, which is better than the final accuracy.\n", 484 | "# There are ways to set a checkpoint during training and save that version of the model; \n", 485 | "#we will look at them in subsequent chapters" 486 | ] 487 | }, 488 | { 489 | "cell_type": "code", 490 | "execution_count": null, 491 | "metadata": {}, 492 | "outputs": [], 493 | "source": [ 494 | "# We are saving the final model as a file here. The model can then be loaded again as and when required.\n", 495 | "# The model will be saved as an HDF5 file, so it can be reused later." 496 | ] 497 | }, 498 | { 499 | "cell_type": "code", 500 | "execution_count": 38, 501 | "metadata": {}, 502 | "outputs": [], 503 | "source": [ 504 | "catDogImageclassifier.save('catdog_cnn_model.h5')" 505 | ] 506 | }, 507 | { 508 | "cell_type": "code", 509 | "execution_count": null, 510 | "metadata": {}, 511 | "outputs": [], 512 | "source": [ 513 | "# Load the saved model. The saved file is loaded using load_model." 514 | ] 515 | }, 516 | { 517 | "cell_type": "code", 518 | "execution_count": 39, 519 | "metadata": {}, 520 | "outputs": [], 521 | "source": [ 522 | "from keras.models import load_model \n", 523 | "catDogImageclassifier = load_model('catdog_cnn_model.h5')" 524 | ] 525 | }, 526 | { 527 | "cell_type": "code", 528 | "execution_count": null, 529 | "metadata": {}, 530 | "outputs": [], 531 | "source": [ 532 | "# Check how the model predicts on an unseen image. 
" 533 | ] 534 | }, 535 | { 536 | "cell_type": "code", 537 | "execution_count": 40, 538 | "metadata": {}, 539 | "outputs": [ 540 | { 541 | "name": "stdout", 542 | "output_type": "stream", 543 | "text": [ 544 | "dog\n" 545 | ] 546 | } 547 | ], 548 | "source": [ 549 | "import numpy as np\n", 550 | "from keras.preprocessing import image\n", 551 | "an_image =image.load_img('/Users/vaibhavverdhan/Book Writing/2.jpg',target_size =(64,64))# Load the image\n", 552 | "# Convert the image to an array of numbers and rescale to [0, 1], matching the 1./255 rescaling used in training\n", 553 | "an_image =image.img_to_array(an_image) / 255.0\n", 554 | "#Add a batch dimension: predict expects input of shape (batch, 64, 64, 3) \n", 555 | "an_image =np.expand_dims(an_image, axis =0)\n", 556 | "# call the predict method here\n", 557 | "verdict = catDogImageclassifier.predict(an_image)\n", 558 | "if verdict[0][0] >= 0.5:\n", 559 | " prediction = 'dog'\n", 560 | "else:\n", 561 | " prediction = 'cat'\n", 562 | "# Let us print our final prediction \n", 563 | "print(prediction)" 564 | ] 565 | }, 566 | { 567 | "cell_type": "code", 568 | "execution_count": 6, 569 | "metadata": {}, 570 | "outputs": [], 571 | "source": [ 572 | "# In this example, we have designed a Neural Network using Keras. We trained it using images of Cats and Dogs. \n", 573 | "#And then tested it.\n", 574 | "# It is possible to train a multi-class classifier too. The onus lies in getting the images for \n", 575 | "# each of the classes." 
576 | ] 577 | }, 578 | { 579 | "cell_type": "code", 580 | "execution_count": null, 581 | "metadata": {}, 582 | "outputs": [], 583 | "source": [] 584 | }, 585 | { 586 | "cell_type": "code", 587 | "execution_count": null, 588 | "metadata": {}, 589 | "outputs": [], 590 | "source": [] 591 | } 592 | ], 593 | "metadata": { 594 | "kernelspec": { 595 | "display_name": "Python 3", 596 | "language": "python", 597 | "name": "python3" 598 | }, 599 | "language_info": { 600 | "codemirror_mode": { 601 | "name": "ipython", 602 | "version": 3 603 | }, 604 | "file_extension": ".py", 605 | "mimetype": "text/x-python", 606 | "name": "python", 607 | "nbconvert_exporter": "python", 608 | "pygments_lexer": "ipython3", 609 | "version": "3.6.8" 610 | } 611 | }, 612 | "nbformat": 4, 613 | "nbformat_minor": 2 614 | } 615 | -------------------------------------------------------------------------------- /Chapter2/MNIST.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Apress/computer-vision-using-deep-learning/5c08fdbc4ffd67297512cc4d929c111c885e0c75/Chapter2/MNIST.png -------------------------------------------------------------------------------- /Chapter2/MNISTImageUsingCNN.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "## We are going to build an Image Classification solution here. The dataset for this is available on Kaggle Link" 8 | ] 9 | }, 10 | { 11 | "cell_type": "code", 12 | "execution_count": 1, 13 | "metadata": {}, 14 | "outputs": [ 15 | { 16 | "name": "stderr", 17 | "output_type": "stream", 18 | "text": [ 19 | "Using TensorFlow backend.\n" 20 | ] 21 | } 22 | ], 23 | "source": [ 24 | "import keras" 25 | ] 26 | }, 27 | { 28 | "cell_type": "markdown", 29 | "metadata": {}, 30 | "source": [ 31 | "#### Import all the Keras libraries here. 
Keras comes with many layers and we are importing all the layers required here. \n" 32 | ] 33 | }, 34 | { 35 | "cell_type": "code", 36 | "execution_count": 2, 37 | "metadata": {}, 38 | "outputs": [], 39 | "source": [ 40 | "from keras.datasets import mnist ## Data set is imported here\n", 41 | "from keras.models import Sequential # Import the Sequential model class. \n", 42 | "#Generally there are two types of models, sequential and functional. Sequential is the most common one\n", 43 | "from keras.layers import Dense, Dropout, Flatten\n", 44 | "from keras.layers import Conv2D, MaxPooling2D\n", 45 | "from keras import backend as K" 46 | ] 47 | }, 48 | { 49 | "cell_type": "code", 50 | "execution_count": null, 51 | "metadata": {}, 52 | "outputs": [], 53 | "source": [ 54 | "# Let's define the hyperparameters here" 55 | ] 56 | }, 57 | { 58 | "cell_type": "code", 59 | "execution_count": 3, 60 | "metadata": {}, 61 | "outputs": [], 62 | "source": [ 63 | "batch_size = 256\n", 64 | "num_classes = 10\n", 65 | "epochs = 10" 66 | ] 67 | }, 68 | { 69 | "cell_type": "code", 70 | "execution_count": 4, 71 | "metadata": {}, 72 | "outputs": [], 73 | "source": [ 74 | "# input image dimensions\n", 75 | "image_rows, image_cols = 28, 28" 76 | ] 77 | }, 78 | { 79 | "cell_type": "code", 80 | "execution_count": 5, 81 | "metadata": {}, 82 | "outputs": [], 83 | "source": [ 84 | "# Load the dataset, which comes already split into train and test sets\n", 85 | "(x_train, y_train), (x_test, y_test) = mnist.load_data()" 86 | ] 87 | }, 88 | { 89 | "cell_type": "code", 90 | "execution_count": null, 91 | "metadata": {}, 92 | "outputs": [], 93 | "source": [ 94 | "# An image has one dimension each for channel, row and column. 
There are 2 ways to represent it.\n", 95 | "# If the image format is channels_first, the channel comes first: [channel][row][col];\n", 96 | "# with channels_last it is [row][col][channel]" 97 | ] 98 | }, 99 | { 100 | "cell_type": "code", 101 | "execution_count": 6, 102 | "metadata": {}, 103 | "outputs": [], 104 | "source": [ 105 | "if K.image_data_format() == 'channels_first':\n", 106 | " x_train = x_train.reshape(x_train.shape[0], 1, image_rows, image_cols)\n", 107 | " x_test = x_test.reshape(x_test.shape[0], 1, image_rows, image_cols)\n", 108 | " input_shape = (1, image_rows, image_cols)\n", 109 | "else:\n", 110 | " x_train = x_train.reshape(x_train.shape[0], image_rows, image_cols, 1)\n", 111 | " x_test = x_test.reshape(x_test.shape[0], image_rows, image_cols, 1)\n", 112 | " input_shape = (image_rows, image_cols, 1)" 113 | ] 114 | }, 115 | { 116 | "cell_type": "code", 117 | "execution_count": 7, 118 | "metadata": {}, 119 | "outputs": [ 120 | { 121 | "name": "stdout", 122 | "output_type": "stream", 123 | "text": [ 124 | "x_train shape: (60000, 28, 28, 1)\n", 125 | "60000 train samples\n", 126 | "10000 test samples\n" 127 | ] 128 | } 129 | ], 130 | "source": [ 131 | "x_train = x_train.astype('float32')\n", 132 | "x_test = x_test.astype('float32')\n", 133 | "x_train /= 255\n", 134 | "x_test /= 255\n", 135 | "print('x_train shape:', x_train.shape)\n", 136 | "print(x_train.shape[0], 'train samples')\n", 137 | "print(x_test.shape[0], 'test samples')" 138 | ] 139 | }, 140 | { 141 | "cell_type": "code", 142 | "execution_count": 8, 143 | "metadata": {}, 144 | "outputs": [], 145 | "source": [ 146 | "# convert class vectors to binary class matrices\n", 147 | "y_train = keras.utils.to_categorical(y_train, num_classes)\n", 148 | "y_test = keras.utils.to_categorical(y_test, num_classes)" 149 | ] 150 | }, 151 | { 152 | "cell_type": "code", 153 | "execution_count": null, 154 | "metadata": {}, 155 | "outputs": [], 156 | "source": [ 157 | "# Let's create our 
network here\n", 158 | "# We are adding layers to our network here.\n", 159 | "# Conv2D: 2 dimensional convolutional layer. 32: number of filters. 3,3: size of the filter (3 rows, 3 columns)\n", 160 | "# Input image shape is 28*28*1 - height*width*channel (grayscale). Pixel intensities (0-255) have been scaled to 0-1\n", 161 | "# The output is a set of feature maps, one per filter, computed from each input image\n", 162 | "\n", 163 | "# Let's add the activation function now. We are using ReLU (Rectified Linear Unit).\n", 164 | "# The activation function computes its output from the layer's input. \n", 165 | "# In the feature map output from the previous layer, \n", 166 | "# the activation function will replace all the negative values with zero\n", 167 | "# We do not want our network to be overly complex computationally, so we add a pooling layer\n", 168 | "# The pooling layer will reduce the dimensions. Max pooling with a 2x2 window \n", 169 | "# keeps the maximum value in each window, so the significant features will be retained\n", 170 | "# We fight overfitting using dropout. Prepare the data for the dense layers by flattening it. 
Flattening reduces it to 1 dimension\n", 171 | "\n" 172 | ] 173 | }, 174 | { 175 | "cell_type": "code", 176 | "execution_count": 9, 177 | "metadata": {}, 178 | "outputs": [ 179 | { 180 | "name": "stdout", 181 | "output_type": "stream", 182 | "text": [ 183 | "WARNING:tensorflow:From /Users/vaibhavverdhan/anaconda3/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py:263: colocate_with (from tensorflow.python.framework.ops) is deprecated and will be removed in a future version.\n", 184 | "Instructions for updating:\n", 185 | "Colocations handled automatically by placer.\n", 186 | "WARNING:tensorflow:From /Users/vaibhavverdhan/anaconda3/lib/python3.6/site-packages/keras/backend/tensorflow_backend.py:3445: calling dropout (from tensorflow.python.ops.nn_ops) with keep_prob is deprecated and will be removed in a future version.\n", 187 | "Instructions for updating:\n", 188 | "Please use `rate` instead of `keep_prob`. Rate should be set to `rate = 1 - keep_prob`.\n" 189 | ] 190 | } 191 | ], 192 | "source": [ 193 | "model = Sequential()\n", 194 | "model.add(Conv2D(32, kernel_size=(3, 3),\n", 195 | " activation='relu',\n", 196 | " input_shape=input_shape))\n", 197 | "model.add(Conv2D(64, (3, 3), activation='relu'))\n", 198 | "model.add(MaxPooling2D(pool_size=(2, 2)))\n", 199 | "model.add(Dropout(0.45))\n", 200 | "model.add(Flatten())\n", 201 | "model.add(Dense(128, activation='relu'))\n", 202 | "model.add(Dropout(0.55))\n", 203 | "model.add(Dense(num_classes, activation='softmax'))" 204 | ] 205 | }, 206 | { 207 | "cell_type": "code", 208 | "execution_count": null, 209 | "metadata": {}, 210 | "outputs": [], 211 | "source": [ 212 | "# Compile the model now. 
We are using categorical cross-entropy as the loss function and Adadelta for optimization" 213 | ] 214 | }, 215 | { 216 | "cell_type": "code", 217 | "execution_count": 10, 218 | "metadata": {}, 219 | "outputs": [], 220 | "source": [ 221 | "model.compile(loss=keras.losses.categorical_crossentropy,\n", 222 | " optimizer=keras.optimizers.Adadelta(),\n", 223 | " metrics=['accuracy'])" 224 | ] 225 | }, 226 | { 227 | "cell_type": "code", 228 | "execution_count": null, 229 | "metadata": {}, 230 | "outputs": [], 231 | "source": [ 232 | "# The network is ready. Fit the model now" 233 | ] 234 | }, 235 | { 236 | "cell_type": "code", 237 | "execution_count": 11, 238 | "metadata": {}, 239 | "outputs": [ 240 | { 241 | "name": "stdout", 242 | "output_type": "stream", 243 | "text": [ 244 | "WARNING:tensorflow:From /Users/vaibhavverdhan/anaconda3/lib/python3.6/site-packages/tensorflow/python/ops/math_ops.py:3066: to_int32 (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.\n", 245 | "Instructions for updating:\n", 246 | "Use tf.cast instead.\n", 247 | "Train on 60000 samples, validate on 10000 samples\n", 248 | "Epoch 1/10\n", 249 | "60000/60000 [==============================] - 55s 916us/step - loss: 0.3784 - acc: 0.8814 - val_loss: 0.0852 - val_acc: 0.9711\n", 250 | "Epoch 2/10\n", 251 | "60000/60000 [==============================] - 56s 938us/step - loss: 0.1267 - acc: 0.9625 - val_loss: 0.0512 - val_acc: 0.9818\n", 252 | "Epoch 3/10\n", 253 | "60000/60000 [==============================] - 112s 2ms/step - loss: 0.0962 - acc: 0.9720 - val_loss: 0.0393 - val_acc: 0.9873\n", 254 | "Epoch 4/10\n", 255 | "60000/60000 [==============================] - 57s 956us/step - loss: 0.0788 - acc: 0.9763 - val_loss: 0.0335 - val_acc: 0.9884\n", 256 | "Epoch 5/10\n", 257 | "60000/60000 [==============================] - 57s 951us/step - loss: 0.0698 - acc: 0.9793 - val_loss: 0.0336 - val_acc: 0.9881\n", 258 | "Epoch 6/10\n", 259 | "60000/60000 
[==============================] - 448s 7ms/step - loss: 0.0622 - acc: 0.9807 - val_loss: 0.0375 - val_acc: 0.9869\n", 260 | "Epoch 7/10\n", 261 | "60000/60000 [==============================] - 56s 931us/step - loss: 0.0579 - acc: 0.9823 - val_loss: 0.0286 - val_acc: 0.9903\n", 262 | "Epoch 8/10\n", 263 | "60000/60000 [==============================] - 57s 944us/step - loss: 0.0520 - acc: 0.9837 - val_loss: 0.0288 - val_acc: 0.9904\n", 264 | "Epoch 9/10\n", 265 | "60000/60000 [==============================] - 2774s 46ms/step - loss: 0.0479 - acc: 0.9854 - val_loss: 0.0307 - val_acc: 0.9898\n", 266 | "Epoch 10/10\n", 267 | "60000/60000 [==============================] - 56s 936us/step - loss: 0.0469 - acc: 0.9858 - val_loss: 0.0292 - val_acc: 0.9900\n", 268 | "Test loss: 0.029244637563071593\n", 269 | "Test accuracy: 0.99\n" 270 | ] 271 | } 272 | ], 273 | "source": [ 274 | "model.fit(x_train, y_train,\n", 275 | " batch_size=batch_size,\n", 276 | " epochs=epochs,\n", 277 | " verbose=1,\n", 278 | " validation_data=(x_test, y_test))\n", 279 | "score = model.evaluate(x_test, y_test, verbose=0)\n", 280 | "print('Test loss:', score[0])\n", 281 | "print('Test accuracy:', score[1])" 282 | ] 283 | }, 284 | { 285 | "cell_type": "code", 286 | "execution_count": null, 287 | "metadata": {}, 288 | "outputs": [], 289 | "source": [ 290 | " " 291 | ] 292 | } 293 | ], 294 | "metadata": { 295 | "kernelspec": { 296 | "display_name": "Python 3", 297 | "language": "python", 298 | "name": "python3" 299 | }, 300 | "language_info": { 301 | "codemirror_mode": { 302 | "name": "ipython", 303 | "version": 3 304 | }, 305 | "file_extension": ".py", 306 | "mimetype": "text/x-python", 307 | "name": "python", 308 | "nbconvert_exporter": "python", 309 | "pygments_lexer": "ipython3", 310 | "version": "3.6.8" 311 | } 312 | }, 313 | "nbformat": 4, 314 | "nbformat_minor": 2 315 | } 316 | -------------------------------------------------------------------------------- /Chapter3/Chapter3: 
-------------------------------------------------------------------------------- 1 | 2 | -------------------------------------------------------------------------------- /Chapter3/Important: -------------------------------------------------------------------------------- 1 | The dataset for German Traffic Classification can be downloaded from this link: 2 | https://drive.google.com/drive/folders/1chK1QIeP0q5lUAvCjOjKqhZYTGDzgF4K?usp=sharing 3 | -------------------------------------------------------------------------------- /Chapter3/LeNet MNIST Digit Classification.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "code", 5 | "execution_count": 48, 6 | "metadata": {}, 7 | "outputs": [], 8 | "source": [ 9 | "import keras\n", 10 | "from keras.optimizers import SGD\n", 11 | "from sklearn.preprocessing import LabelBinarizer\n", 12 | "from sklearn.model_selection import train_test_split\n", 13 | "from sklearn.metrics import classification_report\n", 14 | "from sklearn import datasets\n", 15 | "from keras import backend as K\n", 16 | "import matplotlib.pyplot as plt\n", 17 | "import numpy as np" 18 | ] 19 | }, 20 | { 21 | "cell_type": "code", 22 | "execution_count": 49, 23 | "metadata": {}, 24 | "outputs": [], 25 | "source": [ 26 | "from keras.datasets import mnist ## Data set is imported here\n", 27 | "from keras.models import Sequential\n", 28 | "from keras.layers.convolutional import Conv2D\n", 29 | "from keras.layers.convolutional import MaxPooling2D\n", 30 | "from keras.layers.core import Activation\n", 31 | "from keras.layers.core import Flatten\n", 32 | "from keras.layers.core import Dense\n", 33 | "from keras import backend as K" 34 | ] 35 | }, 36 | { 37 | "cell_type": "code", 38 | "execution_count": 22, 39 | "metadata": {}, 40 | "outputs": [], 41 | "source": [] 42 | }, 43 | { 44 | "cell_type": "code", 45 | "execution_count": null, 46 | "metadata": {}, 47 | "outputs": [], 48 | 
"source": [] 49 | }, 50 | { 51 | "cell_type": "code", 52 | "execution_count": 37, 53 | "metadata": {}, 54 | "outputs": [], 55 | "source": [] 56 | }, 57 | { 58 | "cell_type": "code", 59 | "execution_count": 78, 60 | "metadata": {}, 61 | "outputs": [], 62 | "source": [ 63 | "image_rows, image_cols = 28, 28\n", 64 | "batch_size = 256\n", 65 | "num_classes = 10\n", 66 | "epochs = 10" 67 | ] 68 | }, 69 | { 70 | "cell_type": "code", 71 | "execution_count": 80, 72 | "metadata": {}, 73 | "outputs": [], 74 | "source": [ 75 | "(x_train, y_train), (x_test, y_test) = mnist.load_data()" 76 | ] 77 | }, 78 | { 79 | "cell_type": "code", 80 | "execution_count": 81, 81 | "metadata": {}, 82 | "outputs": [ 83 | { 84 | "name": "stdout", 85 | "output_type": "stream", 86 | "text": [ 87 | "x_train shape: (60000, 28, 28)\n", 88 | "60000 train samples\n", 89 | "10000 test samples\n" 90 | ] 91 | } 92 | ], 93 | "source": [ 94 | "x_train = x_train.astype('float32')\n", 95 | "x_test = x_test.astype('float32')\n", 96 | "x_train /= 255\n", 97 | "x_test /= 255\n", 98 | "print('x_train shape:', x_train.shape)\n", 99 | "print(x_train.shape[0], 'train samples')\n", 100 | "print(x_test.shape[0], 'test samples')" 101 | ] 102 | }, 103 | { 104 | "cell_type": "code", 105 | "execution_count": 82, 106 | "metadata": {}, 107 | "outputs": [], 108 | "source": [ 109 | "# convert class vectors to binary class matrices\n", 110 | "y_train = keras.utils.to_categorical(y_train, num_classes)\n", 111 | "y_test = keras.utils.to_categorical(y_test, num_classes)" 112 | ] 113 | }, 114 | { 115 | "cell_type": "code", 116 | "execution_count": 83, 117 | "metadata": {}, 118 | "outputs": [], 119 | "source": [ 120 | "if K.image_data_format() == 'channels_first':\n", 121 | " x_train = x_train.reshape(x_train.shape[0], 1, image_rows, image_cols)\n", 122 | " x_test = x_test.reshape(x_test.shape[0], 1, image_rows, image_cols)\n", 123 | " input_shape = (1, image_rows, image_cols)\n", 124 | "else:\n", 125 | " x_train = 
x_train.reshape(x_train.shape[0], image_rows, image_cols, 1)\n", 126 | " x_test = x_test.reshape(x_test.shape[0], image_rows, image_cols, 1)\n", 127 | " input_shape = (image_rows, image_cols, 1)" 128 | ] 129 | }, 130 | { 131 | "cell_type": "code", 132 | "execution_count": 84, 133 | "metadata": {}, 134 | "outputs": [], 135 | "source": [ 136 | "model = Sequential()" 137 | ] 138 | }, 139 | { 140 | "cell_type": "code", 141 | "execution_count": 85, 142 | "metadata": {}, 143 | "outputs": [], 144 | "source": [ 145 | "model.add(Conv2D(20, (5, 5), padding=\"same\",input_shape=input_shape))" 146 | ] 147 | }, 148 | { 149 | "cell_type": "code", 150 | "execution_count": 86, 151 | "metadata": {}, 152 | "outputs": [], 153 | "source": [ 154 | "model.add(Activation(\"relu\"))\n", 155 | "model.add(MaxPooling2D(pool_size=(2, 2), strides=(2, 2)))" 156 | ] 157 | }, 158 | { 159 | "cell_type": "code", 160 | "execution_count": 87, 161 | "metadata": {}, 162 | "outputs": [], 163 | "source": [ 164 | "model.add(Conv2D(50, (5, 5), padding=\"same\"))\n", 165 | "model.add(Activation(\"relu\"))\n", 166 | "model.add(MaxPooling2D(pool_size=(2, 2), strides=(2, 2)))" 167 | ] 168 | }, 169 | { 170 | "cell_type": "code", 171 | "execution_count": 88, 172 | "metadata": {}, 173 | "outputs": [], 174 | "source": [ 175 | "model.add(Flatten())\n", 176 | "model.add(Dense(500))\n", 177 | "model.add(Activation(\"relu\"))" 178 | ] 179 | }, 180 | { 181 | "cell_type": "code", 182 | "execution_count": 89, 183 | "metadata": {}, 184 | "outputs": [], 185 | "source": [ 186 | "model.add(Dense(num_classes))\n", 187 | "model.add(Activation(\"softmax\"))" 188 | ] 189 | }, 190 | { 191 | "cell_type": "code", 192 | "execution_count": 90, 193 | "metadata": {}, 194 | "outputs": [], 195 | "source": [ 196 | "model.compile(loss=keras.losses.categorical_crossentropy,\n", 197 | " optimizer=keras.optimizers.Adadelta(),\n", 198 | " metrics=['accuracy'])" 199 | ] 200 | }, 201 | { 202 | "cell_type": "code", 203 | "execution_count": 91, 
204 | "metadata": {}, 205 | "outputs": [ 206 | { 207 | "name": "stdout", 208 | "output_type": "stream", 209 | "text": [ 210 | "Train on 60000 samples, validate on 10000 samples\n", 211 | "Epoch 1/10\n", 212 | "60000/60000 [==============================] - 36s 607us/step - loss: 0.2736 - acc: 0.9127 - val_loss: 0.1051 - val_acc: 0.9649\n", 213 | "Epoch 2/10\n", 214 | "60000/60000 [==============================] - 37s 622us/step - loss: 0.0590 - acc: 0.9813 - val_loss: 0.0490 - val_acc: 0.9835\n", 215 | "Epoch 3/10\n", 216 | "60000/60000 [==============================] - 37s 614us/step - loss: 0.0387 - acc: 0.9879 - val_loss: 0.0939 - val_acc: 0.9671\n", 217 | "Epoch 4/10\n", 218 | "60000/60000 [==============================] - 37s 625us/step - loss: 0.0285 - acc: 0.9910 - val_loss: 0.0267 - val_acc: 0.9905\n", 219 | "Epoch 5/10\n", 220 | "60000/60000 [==============================] - 37s 615us/step - loss: 0.0215 - acc: 0.9933 - val_loss: 0.0305 - val_acc: 0.9896\n", 221 | "Epoch 6/10\n", 222 | "60000/60000 [==============================] - 37s 614us/step - loss: 0.0164 - acc: 0.9949 - val_loss: 0.0228 - val_acc: 0.9920\n", 223 | "Epoch 7/10\n", 224 | "60000/60000 [==============================] - 37s 614us/step - loss: 0.0136 - acc: 0.9955 - val_loss: 0.0236 - val_acc: 0.9918\n", 225 | "Epoch 8/10\n", 226 | "60000/60000 [==============================] - 37s 616us/step - loss: 0.0106 - acc: 0.9969 - val_loss: 0.0279 - val_acc: 0.9909\n", 227 | "Epoch 9/10\n", 228 | "60000/60000 [==============================] - 37s 617us/step - loss: 0.0082 - acc: 0.9976 - val_loss: 0.0246 - val_acc: 0.9917\n", 229 | "Epoch 10/10\n", 230 | "60000/60000 [==============================] - 37s 620us/step - loss: 0.0062 - acc: 0.9983 - val_loss: 0.0316 - val_acc: 0.9907\n" 231 | ] 232 | } 233 | ], 234 | "source": [ 235 | "theLeNetModel = model.fit(x_train, y_train,\n", 236 | " batch_size=batch_size,\n", 237 | " epochs=epochs,\n", 238 | " verbose=1,\n", 239 | " 
validation_data=(x_test, y_test))\n" 240 | ] 241 | }, 242 | { 243 | "cell_type": "code", 244 | "execution_count": 92, 245 | "metadata": {}, 246 | "outputs": [], 247 | "source": [ 248 | "score = model.evaluate(x_test, y_test, verbose=0)\n" 249 | ] 250 | }, 251 | { 252 | "cell_type": "code", 253 | "execution_count": 93, 254 | "metadata": {}, 255 | "outputs": [ 256 | { 257 | "data": { 258 | "text/plain": [ 259 | "Text(0, 0.5, 'acc')" 260 | ] 261 | }, 262 | "execution_count": 93, 263 | "metadata": {}, 264 | "output_type": "execute_result" 265 | }, 266 | { 267 | "data": { 268 | "image/png": "\n", 269 | "text/plain": [ 270 | "
" 271 | ] 272 | }, 273 | "metadata": { 274 | "needs_background": "light" 275 | }, 276 | "output_type": "display_data" 277 | } 278 | ], 279 | "source": [ 280 | "import matplotlib.pyplot as plt\n", 281 | "f, ax = plt.subplots()\n", 282 | "ax.plot([None] + theLeNetModel.history['acc'], 'o-')\n", 283 | "ax.plot([None] + theLeNetModel.history['val_acc'], 'x-')\n", 284 | "ax.legend(['Train acc', 'Validation acc'], loc = 0)\n", 285 | "ax.set_title('Training/Validation acc per Epoch')\n", 286 | "ax.set_xlabel('Epoch')\n", 287 | "ax.set_ylabel('acc')" 288 | ] 289 | }, 290 | { 291 | "cell_type": "code", 292 | "execution_count": 94, 293 | "metadata": {}, 294 | "outputs": [ 295 | { 296 | "data": { 297 | "text/plain": [ 298 | "Text(0, 0.5, 'acc')" 299 | ] 300 | }, 301 | "execution_count": 94, 302 | "metadata": {}, 303 | "output_type": "execute_result" 304 | }, 305 | { 306 | "data": { 307 | "image/png": "\n", 308 | "text/plain": [ 309 | "
" 310 | ] 311 | }, 312 | "metadata": { 313 | "needs_background": "light" 314 | }, 315 | "output_type": "display_data" 316 | } 317 | ], 318 | "source": [ 319 | "import matplotlib.pyplot as plt\n", 320 | "f, ax = plt.subplots()\n", 321 | "ax.plot([None] + theLeNetModel.history['loss'], 'o-')\n", 322 | "ax.plot([None] + theLeNetModel.history['val_loss'], 'x-')\n", 323 | "ax.legend(['Train loss', 'Validation loss'], loc = 0)\n", 324 | "ax.set_title('Training/Validation loss per Epoch')\n", 325 | "ax.set_xlabel('Epoch')\n", 326 | "ax.set_ylabel('acc')" 327 | ] 328 | }, 329 | { 330 | "cell_type": "code", 331 | "execution_count": null, 332 | "metadata": {}, 333 | "outputs": [], 334 | "source": [] 335 | }, 336 | { 337 | "cell_type": "code", 338 | "execution_count": null, 339 | "metadata": {}, 340 | "outputs": [], 341 | "source": [] 342 | }, 343 | { 344 | "cell_type": "code", 345 | "execution_count": null, 346 | "metadata": {}, 347 | "outputs": [], 348 | "source": [] 349 | }, 350 | { 351 | "cell_type": "code", 352 | "execution_count": null, 353 | "metadata": {}, 354 | "outputs": [], 355 | "source": [] 356 | }, 357 | { 358 | "cell_type": "code", 359 | "execution_count": null, 360 | "metadata": {}, 361 | "outputs": [], 362 | "source": [] 363 | } 364 | ], 365 | "metadata": { 366 | "kernelspec": { 367 | "display_name": "Python 3", 368 | "language": "python", 369 | "name": "python3" 370 | }, 371 | "language_info": { 372 | "codemirror_mode": { 373 | "name": "ipython", 374 | "version": 3 375 | }, 376 | "file_extension": ".py", 377 | "mimetype": "text/x-python", 378 | "name": "python", 379 | "nbconvert_exporter": "python", 380 | "pygments_lexer": "ipython3", 381 | "version": "3.6.8" 382 | } 383 | }, 384 | "nbformat": 4, 385 | "nbformat_minor": 2 386 | } 387 | -------------------------------------------------------------------------------- /Chapter4/Chapter4: -------------------------------------------------------------------------------- 1 | 2 | 
-------------------------------------------------------------------------------- /Chapter5/Chapter5: -------------------------------------------------------------------------------- 1 | 2 | -------------------------------------------------------------------------------- /Chapter5/YOLO.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "code", 5 | "execution_count": 2, 6 | "metadata": {}, 7 | "outputs": [], 8 | "source": [ 9 | "import cv2\n", 10 | "from imutils.video import VideoStream\n", 11 | "import os\n", 12 | "import numpy as np" 13 | ] 14 | }, 15 | { 16 | "cell_type": "code", 17 | "execution_count": 4, 18 | "metadata": {}, 19 | "outputs": [], 20 | "source": [ 21 | "localPath_labels = \"coco.names\"\n", 22 | "localPath_weights = \"yolov3.weights\"\n", 23 | "localPath_config = \"yolov3.cfg\"\n", 24 | "labels = open(localPath_labels).read().strip().split(\"\\n\")\n", 25 | "scaling = 0.005\n", 26 | "confidence_threshold = 0.5\n", 27 | "nms_threshold = 0.005 # Non-Maximum Suppression threshold value\n", 28 | "model = cv2.dnn.readNetFromDarknet(localPath_config, localPath_weights)" 29 | ] 30 | }, 31 | { 32 | "cell_type": "code", 33 | "execution_count": null, 34 | "metadata": {}, 35 | "outputs": [], 36 | "source": [] 37 | }, 38 | { 39 | "cell_type": "code", 40 | "execution_count": null, 41 | "metadata": {}, 42 | "outputs": [], 43 | "source": [] 44 | }, 45 | { 46 | "cell_type": "code", 47 | "execution_count": 10, 48 | "metadata": {}, 49 | "outputs": [ 50 | { 51 | "name": "stdout", 52 | "output_type": "stream", 53 | "text": [ 54 | "\n" 55 | ] 56 | } 57 | ], 58 | "source": [ 59 | "# Start the video stream\n", 60 | "cap = VideoStream(src=0).start()\n", 61 | "# Get the names of the output layers (flatten() copes with both old and new OpenCV return shapes)\n", 62 | "layers_name = model.getLayerNames()\n", 63 | "\n", 64 | "output_layer = [layers_name[i - 1] for i in np.array(model.getUnconnectedOutLayers()).flatten()]\n", 65 | "print(model.getLayerNames())\n" 66 | ] 67 | }, 68 | { 69 | 
"cell_type": "code", 70 | "execution_count": null, 71 | "metadata": {}, 72 | "outputs": [], 73 | "source": [ 74 | "while True:\n", 75 | " frame = cap.read()\n", 76 | " (h, w) = frame.shape[:2]\n", 77 | " blob = cv2.dnn.blobFromImage(frame, 1 / 255.0, (416, 416), swapRB=True, crop=False)\n", 78 | " model.setInput(blob)\n", 79 | " nnoutputs = model.forward(output_layer)\n", 80 | " confidence_scores = []\n", 81 | " box_dimensions = []\n", 82 | " class_ids = []\n", 83 | "\n", 84 | " for output in nnoutputs:\n", 85 | " for detection in output:\n", 86 | " scores = detection[5:]\n", 87 | " class_id = np.argmax(scores)\n", 88 | " confidence = scores[class_id]\n", 89 | " if confidence > confidence_threshold :\n", 90 | " box = detection[0:4] * np.array([w, h, w, h])\n", 91 | " (center_x, center_y, width, height) = box.astype(\"int\")\n", 92 | " x = int(center_x - (width / 2))\n", 93 | " y = int(center_y - (height / 2))\n", 94 | " box_dimensions.append([x, y, int(width), int(height)])\n", 95 | " confidence_scores.append(float(confidence))\n", 96 | " class_ids.append(class_id)\n", 97 | " ind = cv2.dnn.NMSBoxes(box_dimensions, confidence_scores, confidence_threshold, nms_threshold)\n", 98 | " for i in ind:\n", 99 | " i = i[0]\n", 100 | " (x, y, w, h) = (box_dimensions[i][0], box_dimensions[i][1],box_dimensions[i][2], box_dimensions[i][3])\n", 101 | " cv2.rectangle(frame,(x, y), (x + w, y + h), (0, 255, 255), 2)\n", 102 | " label = \"{}: {:.4f}\".format(labels[class_ids[i]], confidence_scores[i])\n", 103 | " cv2.putText(frame, label, (x, y - 5), cv2.FONT_HERSHEY_SIMPLEX, 0.5, (255,0,255), 2)\n", 104 | " cv2.imshow(\"Yolo\", frame)\n", 105 | " if cv2.waitKey(1) & 0xFF == ord(\"q\"):\n", 106 | " break\n", 107 | "cv2.destroyAllWindows()\n", 108 | "cap.stop()" 109 | ] 110 | }, 111 | { 112 | "cell_type": "code", 113 | "execution_count": null, 114 | "metadata": {}, 115 | "outputs": [], 116 | "source": [] 117 | }, 118 | { 119 | "cell_type": "code", 120 | "execution_count": null, 121 | 
"metadata": {}, 122 | "outputs": [], 123 | "source": [] 124 | }, 125 | { 126 | "cell_type": "code", 127 | "execution_count": null, 128 | "metadata": {}, 129 | "outputs": [], 130 | "source": [] 131 | } 132 | ], 133 | "metadata": { 134 | "kernelspec": { 135 | "display_name": "Python 3", 136 | "language": "python", 137 | "name": "python3" 138 | }, 139 | "language_info": { 140 | "codemirror_mode": { 141 | "name": "ipython", 142 | "version": 3 143 | }, 144 | "file_extension": ".py", 145 | "mimetype": "text/x-python", 146 | "name": "python", 147 | "nbconvert_exporter": "python", 148 | "pygments_lexer": "ipython3", 149 | "version": "3.7.4" 150 | } 151 | }, 152 | "nbformat": 4, 153 | "nbformat_minor": 2 154 | } 155 | -------------------------------------------------------------------------------- /Chapter6/Chapter6: -------------------------------------------------------------------------------- 1 | 2 | -------------------------------------------------------------------------------- /Chapter6/Chapter6.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "code", 5 | "execution_count": 1, 6 | "metadata": {}, 7 | "outputs": [ 8 | { 9 | "name": "stderr", 10 | "output_type": "stream", 11 | "text": [ 12 | "Using TensorFlow backend.\n" 13 | ] 14 | } 15 | ], 16 | "source": [ 17 | "from keras.models import model_from_json\n" 18 | ] 19 | }, 20 | { 21 | "cell_type": "code", 22 | "execution_count": 19, 23 | "metadata": {}, 24 | "outputs": [], 25 | "source": [ 26 | "from inception_resnet_v1 import *\n", 27 | "import numpy as np\n", 28 | "\n", 29 | "from keras.models import Sequential\n", 30 | "from keras.models import load_model\n", 31 | "from keras.models import model_from_json\n", 32 | "from keras.layers.core import Dense, Activation\n", 33 | "from keras.utils import np_utils\n", 34 | "\n", 35 | "from keras.preprocessing.image import load_img, save_img, img_to_array\n", 36 | "from 
keras.applications.imagenet_utils import preprocess_input\n", 37 | "\n", 38 | "import matplotlib.pyplot as plt\n", 39 | "from keras.preprocessing import image" 40 | ] 41 | }, 42 | { 43 | "cell_type": "code", 44 | "execution_count": 30, 45 | "metadata": {}, 46 | "outputs": [], 47 | "source": [ 48 | "face_model = InceptionResNetV1()" 49 | ] 50 | }, 51 | { 52 | "cell_type": "code", 53 | "execution_count": 31, 54 | "metadata": {}, 55 | "outputs": [], 56 | "source": [ 57 | "face_model.load_weights('facenet_weights.h5')" 58 | ] 59 | }, 60 | { 61 | "cell_type": "code", 62 | "execution_count": 27, 63 | "metadata": {}, 64 | "outputs": [], 65 | "source": [ 66 | "def normalize(x):\n", 67 | " return x / np.sqrt(np.sum(np.multiply(x, x)))\n" 68 | ] 69 | }, 70 | { 71 | "cell_type": "code", 72 | "execution_count": 32, 73 | "metadata": {}, 74 | "outputs": [], 75 | "source": [ 76 | "def getEuclideanDistance(source, validate):\n", 77 | " euclidean_dist = source - validate\n", 78 | " euclidean_dist = np.sum(np.multiply(euclidean_dist, euclidean_dist))\n", 79 | " euclidean_dist = np.sqrt(euclidean_dist)\n", 80 | " return euclidean_dist" 81 | ] 82 | }, 83 | { 84 | "cell_type": "code", 85 | "execution_count": 34, 86 | "metadata": {}, 87 | "outputs": [], 88 | "source": [ 89 | "def preprocess_data(image_path):\n", 90 | " image = load_img(image_path, target_size=(160, 160))\n", 91 | " image = img_to_array(image)\n", 92 | " image = np.expand_dims(image, axis=0)\n", 93 | " image = preprocess_input(image)\n", 94 | " return image" 95 | ] 96 | }, 97 | { 98 | "cell_type": "code", 99 | "execution_count": 21, 100 | "metadata": {}, 101 | "outputs": [], 102 | "source": [ 103 | "img1_representation = normalize(face_model.predict(preprocess_data('image_1.jpeg'))[0,:])\n", 104 | "img2_representation = normalize(face_model.predict(preprocess_data('image_2.jpeg'))[0,:])\n", 105 | " \n", 106 | "euclidean_distance = getEuclideanDistance(img1_representation, img2_representation)" 107 | ] 108 | }, 109 | { 
110 | "cell_type": "code", 111 | "execution_count": 22, 112 | "metadata": {}, 113 | "outputs": [ 114 | { 115 | "data": { 116 | "text/plain": [ 117 | "0.70589006" 118 | ] 119 | }, 120 | "execution_count": 22, 121 | "metadata": {}, 122 | "output_type": "execute_result" 123 | } 124 | ], 125 | "source": [ 126 | "euclidean_distance" 127 | ] 128 | }, 129 | { 130 | "cell_type": "code", 131 | "execution_count": 25, 132 | "metadata": {}, 133 | "outputs": [], 134 | "source": [ 135 | "def getCosineSimilarity(source, validate):\n", 136 | " a = np.matmul(np.transpose(source), validate)\n", 137 | " b = np.sum(np.multiply(source, source))\n", 138 | " c = np.sum(np.multiply(validate, validate))\n", 139 | " return 1 - (a / (np.sqrt(b) * np.sqrt(c)))" 140 | ] 141 | }, 142 | { 143 | "cell_type": "code", 144 | "execution_count": 37, 145 | "metadata": {}, 146 | "outputs": [], 147 | "source": [ 148 | "img1_representation = (face_model.predict(preprocess_data('image_1.jpeg'))[0,:])\n", 149 | "img2_representation = (face_model.predict(preprocess_data('image_2.jpeg'))[0,:])\n", 150 | " \n", 151 | "cosine = getCosineSimilarity(img1_representation, img2_representation)" 152 | ] 153 | }, 154 | { 155 | "cell_type": "code", 156 | "execution_count": 38, 157 | "metadata": {}, 158 | "outputs": [ 159 | { 160 | "data": { 161 | "text/plain": [ 162 | "0.2491404414176941" 163 | ] 164 | }, 165 | "execution_count": 38, 166 | "metadata": {}, 167 | "output_type": "execute_result" 168 | } 169 | ], 170 | "source": [ 171 | "cosine" 172 | ] 173 | }, 174 | { 175 | "cell_type": "code", 176 | "execution_count": null, 177 | "metadata": {}, 178 | "outputs": [], 179 | "source": [] 180 | }, 181 | { 182 | "cell_type": "code", 183 | "execution_count": null, 184 | "metadata": {}, 185 | "outputs": [], 186 | "source": [] 187 | }, 188 | { 189 | "cell_type": "code", 190 | "execution_count": null, 191 | "metadata": {}, 192 | "outputs": [], 193 | "source": [ 194 | "# import all the necessary libraries\n", 195 | "import 
cv2\n", 196 | "import imutils\n", 197 | "import numpy as np\n", 198 | "from sklearn.metrics import pairwise\n", 199 | "\n", 200 | "\n", 201 | "# global variables\n", 202 | "bg = None" 203 | ] 204 | }, 205 | { 206 | "cell_type": "code", 207 | "execution_count": null, 208 | "metadata": {}, 209 | "outputs": [], 210 | "source": [ 211 | "#-------------------------------------------------------------------------------\n", 212 | "# Function - To find the running average over the background\n", 213 | "#-------------------------------------------------------------------------------\n", 214 | "def run_avg(image, accumWeight):\n", 215 | " global bg\n", 216 | " # initialize the background\n", 217 | " if bg is None:\n", 218 | " bg = image.copy().astype(\"float\")\n", 219 | " return\n", 220 | "\n", 221 | " # compute weighted average, accumulate it and update the background\n", 222 | " cv2.accumulateWeighted(image, bg, accumWeight)" 223 | ] 224 | }, 225 | { 226 | "cell_type": "code", 227 | "execution_count": null, 228 | "metadata": {}, 229 | "outputs": [], 230 | "source": [ 231 | "#------------------------------------------------------------------------------#\n", 232 | "#segment function starts, to segment the region of hand in the image\n", 233 | "#-------------------------------------------------------------------------------\n", 234 | "def segment(image, threshold=25):\n", 235 | " global bg\n", 236 | " # find the absolute difference between background and current frame\n", 237 | " diff = cv2.absdiff(bg.astype(\"uint8\"), image)\n", 238 | "\n", 239 | " # threshold the diff image so that we get the foreground\n", 240 | " thresholded = cv2.threshold(diff, threshold, 255, cv2.THRESH_BINARY)[1]\n", 241 | "\n", 242 | " # get the contours in the thresholded image ([-2] selects them under both OpenCV 3 and 4)\n", 243 | " cnts = cv2.findContours(thresholded.copy(), cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)[-2]\n", 244 | "\n", 245 | " # return None, if no contours detected\n", 246 | " if len(cnts) == 0:\n", 247 | " 
return\n", 248 | " else:\n", 249 | " # based on contour area, get the maximum contour which is the hand\n", 250 | " segmented = max(cnts, key=cv2.contourArea)\n", 251 | " return (thresholded, segmented)" 252 | ] 253 | }, 254 | { 255 | "cell_type": "code", 256 | "execution_count": null, 257 | "metadata": {}, 258 | "outputs": [], 259 | "source": [ 260 | "#------------------------------------------------------------------------------#\n", 261 | "#segment function ends----------------------------------------------------------------------#\n", 262 | "\n", 263 | "# Function - To count the number of fingers in the segmented hand region\n", 264 | "#-------------------------------------------------------------------------------\n", 265 | "from sklearn.metrics import pairwise\n", 266 | "def count(thresholded, segmented):\n", 267 | "\t# find the convex hull of the segmented hand region\n", 268 | "\tchull = cv2.convexHull(segmented)\n", 269 | "\n", 270 | "\t# find the most extreme points in the convex hull\n", 271 | "\textreme_top = tuple(chull[chull[:, :, 1].argmin()][0])\n", 272 | "\textreme_bottom = tuple(chull[chull[:, :, 1].argmax()][0])\n", 273 | "\textreme_left = tuple(chull[chull[:, :, 0].argmin()][0])\n", 274 | "\textreme_right = tuple(chull[chull[:, :, 0].argmax()][0])\n", 275 | "\n", 276 | "\t# find the center of the palm\n", 277 | "\tcX = int((extreme_left[0] + extreme_right[0]) / 2)\n", 278 | "\tcY = int((extreme_top[1] + extreme_bottom[1]) / 2)\n", 279 | "\n", 280 | "\t# find the maximum euclidean distance between the center of the palm\n", 281 | "\t# and the most extreme points of the convex hull\n", 282 | "\tdistance = pairwise.euclidean_distances([(cX, cY)], Y=[extreme_left, extreme_right, extreme_top, extreme_bottom])[0]\n", 283 | "\tmaximum_distance = distance[distance.argmax()]\n", 284 | "\t\n", 285 | "\t# calculate the radius of the circle with 80% of the max euclidean distance obtained\n", 286 | "\tradius = int(0.8 * maximum_distance)\n", 287 | "\t\n", 
288 | "\t# find the circumference of the circle\n", 289 | "\tcircumference = (2 * np.pi * radius)\n", 290 | "\n", 291 | "\t# take out the circular region of interest which has \n", 292 | "\t# the palm and the fingers\n", 293 | "\tcircular_roi = np.zeros(thresholded.shape[:2], dtype=\"uint8\")\n", 294 | "\t\n", 295 | "\t# draw the circular ROI\n", 296 | "\tcv2.circle(circular_roi, (cX, cY), radius, 255, 1)\n", 297 | "\t\n", 298 | "\t# take the bitwise AND of the thresholded hand with itself, using the circular ROI as the mask,\n", 299 | "\t# which keeps only the hand pixels that lie on the circle\n", 300 | "\tcircular_roi = cv2.bitwise_and(thresholded, thresholded, mask=circular_roi)\n", 301 | "\n", 302 | "\t# compute the contours in the circular ROI ([-2] selects them under both OpenCV 3 and 4)\n", 303 | "\tcnts = cv2.findContours(circular_roi.copy(), cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)[-2]\n", 304 | "\n", 305 | "\t# initialize the finger count\n", 306 | "\tcount = 0\n", 307 | "\n", 308 | "\t# loop through the contours found\n", 309 | "\tfor c in cnts:\n", 310 | "\t\t# compute the bounding box of the contour\n", 311 | "\t\t(x, y, w, h) = cv2.boundingRect(c)\n", 312 | "\n", 313 | "\t\t# increment the count of fingers only if -\n", 314 | "\t\t# 1. The contour region is not the wrist (bottom area)\n", 315 | "\t\t# 2. 
The number of points along the contour does not exceed\n", 316 | "\t\t# 20% of the circumference of the circular ROI\n", 317 | "\t\tif ((cY + (cY * 0.20)) > (y + h)) and ((circumference * 0.20) > c.shape[0]):\n", 318 | "\t\t\tcount += 1\n", 319 | "\n", 320 | "\treturn count" 321 | ] 322 | }, 323 | { 324 | "cell_type": "code", 325 | "execution_count": null, 326 | "metadata": {}, 327 | "outputs": [], 328 | "source": [ 329 | "#-------------------------------------------------------------------------------\n", 330 | "# Main function\n", 331 | "#-------------------------------------------------------------------------------\n", 332 | "if __name__ == \"__main__\":\n", 333 | " # initialize accumulated weight\n", 334 | " accumWeight = 0.5\n", 335 | "\n", 336 | " # get the reference to the webcam\n", 337 | " camera = cv2.VideoCapture(0)\n", 338 | "\n", 339 | " # region of interest (ROI) coordinates\n", 340 | " top, right, bottom, left = 20, 450, 325, 690\n", 341 | "\n", 342 | " # initialize num of frames\n", 343 | " num_frames = 0\n", 344 | "\n", 345 | " # calibration indicator\n", 346 | " calibrated = False\n", 347 | "\n", 348 | " # keep looping, until interrupted\n", 349 | " while(True):\n", 350 | " # get the current frame\n", 351 | " (grabbed, frame) = camera.read()\n", 352 | "\n", 353 | " # resize the frame\n", 354 | " frame = imutils.resize(frame, width=700)\n", 355 | "\n", 356 | " # flip the frame so that it is not the mirror view\n", 357 | " frame = cv2.flip(frame, 1)\n", 358 | "\n", 359 | " # clone the frame\n", 360 | " clone = frame.copy()\n", 361 | "\n", 362 | " # get the height and width of the frame\n", 363 | " (height, width) = frame.shape[:2]\n", 364 | "\n", 365 | " # get the ROI\n", 366 | " roi = frame[top:bottom, right:left]\n", 367 | "\n", 368 | " # convert the roi to grayscale and blur it\n", 369 | " gray = cv2.cvtColor(roi, cv2.COLOR_BGR2GRAY)\n", 370 | " gray = cv2.GaussianBlur(gray, (7, 7), 0)\n", 371 | "\n", 372 | " # to get the background, keep 
looking till a threshold is reached\n", 373 | " # so that our weighted average model gets calibrated\n", 374 | " if num_frames < 30:\n", 375 | " run_avg(gray, accumWeight)\n", 376 | " if num_frames == 1:\n", 377 | " print(\"Calibration is in progress...\")\n", 378 | " elif num_frames == 29:\n", 379 | " print(\"Calibration is successful...\")\n", 380 | " else:\n", 381 | " # segment the hand region\n", 382 | " hand = segment(gray)\n", 383 | "\n", 384 | " # check whether hand region is segmented\n", 385 | " if hand is not None:\n", 386 | " # if yes, unpack the thresholded image and\n", 387 | " # segmented region\n", 388 | " (thresholded, segmented) = hand\n", 389 | "\n", 390 | " # draw the segmented region and display the frame\n", 391 | " cv2.drawContours(clone, [segmented + (right, top)], -1, (0, 0, 255))\n", 392 | " \n", 393 | " # count the number of fingers\n", 394 | " fingers = count(thresholded, segmented)\n", 395 | "\n", 396 | " cv2.putText(clone, str(fingers), (70, 45), cv2.FONT_HERSHEY_SIMPLEX, 1, (0,0,255), 2)\n", 397 | " \n", 398 | " # show the thresholded image\n", 399 | " cv2.imshow(\"Thresholded\", thresholded)\n", 400 | "\n", 401 | " # draw the rectangle that marks the region of interest\n", 402 | " cv2.rectangle(clone, (left, top), (right, bottom), (0,255,0), 2)\n", 403 | "\n", 404 | " # increment the number of frames\n", 405 | " num_frames += 1\n", 406 | "\n", 407 | " # display the frame with segmented hand\n", 408 | " cv2.imshow(\"Video Feed\", clone)\n", 409 | "\n", 410 | " # observe the keypress by the user\n", 411 | " keypress = cv2.waitKey(1) & 0xFF\n", 412 | "\n", 413 | " # if the user pressed \"q\", then stop looping\n", 414 | " if keypress == ord(\"q\"):\n", 415 | " break" 416 | ] 417 | }, 418 | { 419 | "cell_type": "code", 420 | "execution_count": null, 421 | "metadata": {}, 422 | "outputs": [], 423 | "source": [] 424 | }, 425 | { 426 | "cell_type": "code", 427 | "execution_count": null, 428 | "metadata": {}, 429 | "outputs": [], 430 | "source": [] 431 | }, 432 | { 
433 | "cell_type": "code", 434 | "execution_count": null, 435 | "metadata": {}, 436 | "outputs": [], 437 | "source": [] 438 | }, 439 | { 440 | "cell_type": "code", 441 | "execution_count": null, 442 | "metadata": {}, 443 | "outputs": [], 444 | "source": [] 445 | }, 446 | { 447 | "cell_type": "code", 448 | "execution_count": null, 449 | "metadata": {}, 450 | "outputs": [], 451 | "source": [] 452 | }, 453 | { 454 | "cell_type": "code", 455 | "execution_count": null, 456 | "metadata": {}, 457 | "outputs": [], 458 | "source": [] 459 | } 460 | ], 461 | "metadata": { 462 | "kernelspec": { 463 | "display_name": "Python 3", 464 | "language": "python", 465 | "name": "python3" 466 | }, 467 | "language_info": { 468 | "codemirror_mode": { 469 | "name": "ipython", 470 | "version": 3 471 | }, 472 | "file_extension": ".py", 473 | "mimetype": "text/x-python", 474 | "name": "python", 475 | "nbconvert_exporter": "python", 476 | "pygments_lexer": "ipython3", 477 | "version": "3.7.4" 478 | } 479 | }, 480 | "nbformat": 4, 481 | "nbformat_minor": 2 482 | } 483 | -------------------------------------------------------------------------------- /Chapter6/Important: -------------------------------------------------------------------------------- 1 | The 'facenet_weights.h5' file can be downloaded from this Google Drive link: 2 | https://drive.google.com/drive/folders/1yNNCw3n3DscEMc3yoALKDrolb6WEA9U0?usp=sharing 3 | -------------------------------------------------------------------------------- /Chapter7/Dataset and codes for Chapter7: -------------------------------------------------------------------------------- 1 | 2 | -------------------------------------------------------------------------------- /Chapter7/Important: -------------------------------------------------------------------------------- 1 | The dataset, compiled model, weights, and images are uploaded to this Google Drive folder. Due to the size of the dataset, they cannot be uploaded to GitHub.
2 | The link is https://drive.google.com/drive/u/6/folders/1QS9P5SjTqGofGp-ZCh69c8EQjCQ6VWme 3 | -------------------------------------------------------------------------------- /Chapter8/Chapter8: -------------------------------------------------------------------------------- 1 | 2 | -------------------------------------------------------------------------------- /Chapter8/Hoover.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Apress/computer-vision-using-deep-learning/5c08fdbc4ffd67297512cc4d929c111c885e0c75/Chapter8/Hoover.jpg -------------------------------------------------------------------------------- /Coin.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Apress/computer-vision-using-deep-learning/5c08fdbc4ffd67297512cc4d929c111c885e0c75/Coin.png -------------------------------------------------------------------------------- /Contributing.md: -------------------------------------------------------------------------------- 1 | # Contributing to Apress Source Code 2 | 3 | Copyright for Apress source code belongs to the author(s). However, under fair use you are encouraged to fork and contribute minor corrections and updates for the benefit of the author(s) and other readers. 4 | 5 | ## How to Contribute 6 | 7 | 1. Make sure you have a GitHub account. 8 | 2. Fork the repository for the relevant book. 9 | 3. Create a new branch on which to make your change, e.g. 10 | `git checkout -b my_code_contribution` 11 | 4. Commit your change. Include a commit message describing the correction. Please note that if your commit message is not clear, the correction will not be accepted. 12 | 5. Submit a pull request. 13 | 14 | Thank you for your contribution! 
-------------------------------------------------------------------------------- /LICENSE.txt: -------------------------------------------------------------------------------- 1 | Freeware License, some rights reserved 2 | 3 | Copyright (c) 2021 Vaibhav Verdhan 4 | 5 | Permission is hereby granted, free of charge, to anyone obtaining a copy 6 | of this software and associated documentation files (the "Software"), 7 | to work with the Software within the limits of freeware distribution and fair use. 8 | This includes the rights to use, copy, and modify the Software for personal use. 9 | Users are also allowed and encouraged to submit corrections and modifications 10 | to the Software for the benefit of other users. 11 | 12 | It is not allowed to reuse, modify, or redistribute the Software for 13 | commercial use in any way, or for a user’s educational materials such as books 14 | or blog articles without prior permission from the copyright holder. 15 | 16 | The above copyright notice and this permission notice need to be included 17 | in all copies or substantial portions of the software. 18 | 19 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 20 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 21 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 22 | AUTHORS OR COPYRIGHT HOLDERS OR APRESS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 23 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 24 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 25 | SOFTWARE. 
26 | 27 | 28 | -------------------------------------------------------------------------------- /Mario.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Apress/computer-vision-using-deep-learning/5c08fdbc4ffd67297512cc4d929c111c885e0c75/Mario.png -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Apress Source Code 2 | 3 | This repository accompanies [*Computer Vision Using Deep Learning: Neural Network Architectures with Python, Keras, and TensorFlow*](https://www.apress.com/9781484266151) by Vaibhav Verdhan (Apress, 2021). 4 | 5 | [comment]: #cover 6 | ![Cover image](9781484266151.jpg) 7 | 8 | Download the files as a zip using the green button, or clone the repository to your machine using Git. 9 | 10 | ## Releases 11 | 12 | Release v1.0 corresponds to the code in the published book, without corrections or updates. 13 | 14 | ## Contributions 15 | 16 | See the file Contributing.md for more information on how you can contribute to this repository. --------------------------------------------------------------------------------