├── requirements.txt
├── Image
│   └── Detect_image.png
├── LICENSE
├── README.md
├── object_detection_using_vgg16_with_tensorflow.py
└── Object_Detection_Using_VGG16_With_Tensorflow.ipynb


/requirements.txt:
--------------------------------------------------------------------------------
tensorflow
numpy
matplotlib
scikit-learn


--------------------------------------------------------------------------------
/Image/Detect_image.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/zubairsamo/Object-Detection-With-Tensorflow-Using-VGG16/HEAD/Image/Detect_image.png


--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
MIT License

Copyright (c) 2020 zubair samo

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.


--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------

Object-Detection-With-Tensorflow-Using-VGG16


[![Build Status](https://img.shields.io/badge/Build-Passing-brightgreen.svg?style=for-the-badge&logo=appveyor)](#)
[![Open Source Love svg1](https://badges.frapsoft.com/os/v1/open-source.svg?v=103)](#)
[![contributions welcome](https://img.shields.io/badge/contributions-welcome-brightgreen.svg?style=flat&label=Contributions&colorA=red&colorB=black )](#)
[![GitHub Forks](https://img.shields.io/github/forks/zubairsamo/Object-Detection-With-Tensorflow-Using-VGG16.svg?style=social&label=Fork&maxAge=2592000)](https://github.com/zubairsamo/Object-Detection-With-Tensorflow-Using-VGG16/fork)
[![GitHub Issues](https://img.shields.io/github/issues/zubairsamo/Object-Detection-With-Tensorflow-Using-VGG16.svg?style=flat&label=Issues&maxAge=2592000)](https://github.com/zubairsamo/Object-Detection-With-Tensorflow-Using-VGG16/issues)
# VGG16 Architecture
The input to the convolutional network is a fixed-size 224 × 224 × 3 image. In the original network, the preprocessing step subtracts the mean RGB value from each pixel (the code in this repository instead scales pixel values to [0, 1]). The image is passed
through a stack of convolutional layers with 3 × 3 receptive fields, the smallest size that
captures the notion of left/right, up/down, and centre. One configuration in the VGG paper also uses 1 × 1 convolutional filters, which act as a linear
transformation of the input channels; the 16-layer network used here has only 3 × 3 convolutions. The stride is fixed to 1 pixel, and the padding is chosen so that the
spatial resolution is preserved after convolution. Five max-pooling layers each operate over
a 2 × 2 pixel window with stride 2. I have used the VGG16 weights pre-trained
on the ImageNet dataset to extract features. The feature for each image is a tensor of dimension 7
× 7 × 512. Fig 1 represents the architecture of the convolutional layers in
VGG16.
# Description
Object detection is one of the fundamental problems in the field of artificial intelligence,
with applications in robotics, automation, and human-computer interaction. The aim is to
track an arbitrary object in consecutive frames of a video segment by localizing it inside
bounding boxes. The most common representation of a bounding box is the pair of its
top-left and bottom-right coordinates in the frame, measured from the origin of the
image grid.
The VGG16 model secured first place in the ILSVRC 2014 object localization task, and its accuracy
in predicting the location of these boxes is high. Nevertheless, the trade-off between accuracy
and computational cost is evident and raises the need for faster
approaches. In this project, VGG16 is used with weights pre-trained on
ImageNet as a frozen feature extractor. This transfer-learning setup is advantageous because it avoids
having to train the model on a large-scale dataset like ImageNet.
# Run it now
[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/15tA57gXnWprZjc5J_V7591AUdSQWrTF6?usp=sharing)

# Dataset
https://drive.google.com/drive/folders/1NxnVWN-aJuHdTibSOPstzZcFiIHEgx6Y?usp=sharing

# Requirements

Run the commands below in a terminal to clone the repository and install its dependencies.
```
git clone https://github.com/zubairsamo/Object-Detection-With-Tensorflow-Using-VGG16.git
pip install -r requirements.txt
```
The script also imports `cv2` and `imutils`, which are not listed in `requirements.txt`; if they are missing, install them with `pip install opencv-python imutils`.

# Usage
Open `Object_Detection_Using_VGG16_With_Tensorflow.ipynb` in Jupyter or Colab and run the cells in order.

## Author
You can get in touch with me on my LinkedIn Profile:

#### Zubair Samo
[![LinkedIn Link](https://img.shields.io/badge/Connect-ZubairSamo-blue.svg?logo=linkedin&longCache=true&style=social&label=Connect)](https://linkedin.com/in/zubair-samo-3a2764197)

You can also follow my GitHub Profile to stay updated about my latest projects: [![GitHub Follow](https://img.shields.io/badge/Connect-zubairsamo-blue.svg?logo=Github&longCache=true&style=social&label=Follow)](https://github.com/zubairsamo)

If you liked the repo, kindly support it by giving it a star ⭐!

## Contributions Welcome
[![forthebadge](https://forthebadge.com/images/badges/built-with-love.svg)](#)

If you find a bug in the code or have an improvement in mind, feel free to open a pull request.

## License
[![MIT](https://img.shields.io/cocoapods/l/AFNetworking.svg?style=style&label=License&maxAge=2592000)](../master/LICENSE)

Copyright (c) 2020 Zubair Samo


--------------------------------------------------------------------------------
/object_detection_using_vgg16_with_tensorflow.py:
--------------------------------------------------------------------------------
# -*- coding: utf-8 -*-
"""Object Detection Using VGG16 With Tensorflow.ipynb

Automatically generated by Colaboratory.

Original file is located at
    https://colab.research.google.com/drive/15tA57gXnWprZjc5J_V7591AUdSQWrTF6
"""

import os

base_path = "/content/drive/MyDrive/Applied_Ai_Course/Datasets"
images = os.path.sep.join([base_path, "images"])
annotations = os.path.sep.join([base_path, "airplanes.csv"])

# Load the dataset.
# The airplane annotations are a CSV file, so each row holds one bounding box.
with open(annotations) as f:
    rows = f.read().strip().split("\n")

# Three parallel lists: image arrays, normalized bounding boxes, and file names.
data = []
targets = []
filenames = []

import cv2
from tensorflow.keras.preprocessing.image import load_img
from tensorflow.keras.preprocessing.image import img_to_array

for row in rows:
    row = row.split(",")
    # Each box is given by its top-left (start) and bottom-right (end) corners.
    (filename, startX, startY, endX, endY) = row

    imagepaths = os.path.sep.join([images, filename])
    image = cv2.imread(imagepaths)
    (h, w) = image.shape[:2]

    # The CSV values are strings, so convert them to float, then scale the
    # corners to [0, 1] relative to the original image size.
    startX = float(startX) / w
    startY = float(startY) / h
    endX = float(endX) / w
    endY = float(endY) / h

    # Reload the image at VGG16's 224x224 input size and convert it to a float array.
    image = load_img(imagepaths, target_size=(224, 224))
    image = img_to_array(image)

    targets.append((startX, startY, endX, endY))
    filenames.append(filename)
    data.append(image)

# Convert to float32 arrays and scale pixel values to [0, 1].
import numpy as np
data = np.array(data, dtype="float32") / 255.0
targets = np.array(targets, dtype="float32")

from sklearn.model_selection import train_test_split

# Hold out 10% of the data for testing.
split = train_test_split(data, targets, filenames, test_size=0.10, random_state=42)

# train_test_split returns a (train, test) pair for each input, in order.
(train_images, test_images) = split[:2]
(train_targets, test_targets) = split[2:4]
(train_filenames, test_filenames) = split[4:]

# Import the pre-trained VGG16 that ships with Keras.
from tensorflow.keras.applications import VGG16
from tensorflow.keras.layers import Input, Flatten, Dense

# VGG16 was a top entry in the ImageNet challenge (ILSVRC 2014).
# include_top=False drops the fully connected classifier, so only the
# convolutional base is loaded.
vgg = VGG16(weights="imagenet", include_top=False, input_tensor=Input(shape=(224, 224, 3)))

vgg.summary()

# Freeze the convolutional base; only the new head is trained.
vgg.trainable = False

flatten = vgg.output
flatten = Flatten()(flatten)

# Bounding-box regression head with four outputs: (startX, startY, endX, endY).
# ReLU on the output layer is kept from the notebook; a sigmoid is a common
# alternative when the targets lie in [0, 1].
bboxhead = Dense(128, activation="relu")(flatten)
bboxhead = Dense(64, activation="relu")(bboxhead)
bboxhead = Dense(32, activation="relu")(bboxhead)
bboxhead = Dense(4, activation="relu")(bboxhead)

from tensorflow.keras.models import Model
model = Model(inputs=vgg.input, outputs=bboxhead)

model.summary()

# Train with Adam (learning rate 1e-4) and mean squared error on the box coordinates.
from tensorflow.keras.optimizers import Adam

opt = Adam(1e-4)

model.compile(loss="mse", optimizer=opt)

# The held-out test split doubles as validation data here.
history = model.fit(train_images, train_targets, validation_data=(test_images, test_targets), batch_size=32, epochs=50, verbose=1)

# Save the trained model, then load it back from the same path for inference.
model.save("detect_Planes.h5")

from tensorflow.keras.models import load_model

model = load_model("detect_Planes.h5")

imagepath = "/content/drive/MyDrive/Applied_Ai_Course/Datasets/images/image_0111.jpg"

# Preprocess the test image exactly like the training images.
image = load_img(imagepath, target_size=(224, 224))
image = img_to_array(image) / 255.0
image = np.expand_dims(image, axis=0)

preds = model.predict(image)[0]
(startX, startY, endX, endY) = preds

import imutils

# Draw on the same image that was passed to the model.
image = cv2.imread(imagepath)
image = imutils.resize(image, width=600)

(h, w) = image.shape[:2]

# Scale the normalized predictions back to pixel coordinates.
startX = int(startX * w)
startY = int(startY * h)

endX = int(endX * w)
endY = int(endY * h)

cv2.rectangle(image, (startX, startY), (endX, endY), (0, 255, 0), 3)

import matplotlib.pyplot as plt

# OpenCV stores images as BGR; convert to RGB before showing with matplotlib.
plt.imshow(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))
plt.show()


--------------------------------------------------------------------------------
/Object_Detection_Using_VGG16_With_Tensorflow.ipynb:
--------------------------------------------------------------------------------
1 | { 2 | "nbformat": 4, 3 | "nbformat_minor": 0, 4 | "metadata": { 5 | "colab": { 6 | "name": "Object Detection Using VGG16 With Tensorflow.ipynb", 7 | "provenance": [], 8 | "collapsed_sections": [] 9 | }, 10 | "kernelspec": { 11 | "name": "python3", 12 | "display_name": "Python 3" 13 | }, 14 | "accelerator": "GPU" 15 | }, 16 | "cells": [ 17 | { 18 | "cell_type": "markdown", 19 | "metadata": { 20 | "id": "iVKGF4unVslV" 21 | }, 22 | "source": [ 23 | "" 24 | ] 25 | }, 26 | { 27 | "cell_type": "code", 28 | "metadata": { 29 | "id": "Vl_dLDnV_Lys" 30 | }, 31 | "source": [ 32 | "import os\n", 33 | "base_path=\"/content/drive/MyDrive/Applied_Ai_Course/Datasets\"\n", 34 | "images=os.path.sep.join([base_path,'images'])\n", 35 | "annotations=os.path.sep.join([base_path,'airplanes.csv'])" 36 | ], 
37 | "execution_count": 175, 38 | "outputs": [] 39 | }, 40 | { 41 | "cell_type": "code", 42 | "metadata": { 43 | "id": "ztWsfMIXFFlK" 44 | }, 45 | "source": [ 46 | "# Lets Load Dataset\n", 47 | "# airplanes annotation is a Csv file thats why we can see through with rows\n", 48 | "\n", 49 | "rows= open(annotations).read().strip().split(\"\\n\")\n", 50 | "\n", 51 | "# lets make three list where we save our exact bounding boxes\n", 52 | "data=[]\n", 53 | "targets=[]\n", 54 | "filenames=[]" 55 | ], 56 | "execution_count": 176, 57 | "outputs": [] 58 | }, 59 | { 60 | "cell_type": "code", 61 | "metadata": { 62 | "id": "-C1Ynxw7FveF" 63 | }, 64 | "source": [ 65 | "# After load we have to split dataset according to images\n", 66 | "# import some usefull libraries\n", 67 | "import cv2\n", 68 | "from tensorflow.keras.preprocessing.image import load_img\n", 69 | "# we also save images into array format so import img_array library too\n", 70 | "from tensorflow.keras.preprocessing.image import img_to_array\n", 71 | "for row in rows:\n", 72 | " row=row.split(\",\")\n", 73 | " # we always create rectangle with h+w so we have to know where exactly we should start from\n", 74 | " (filename,startX,startY,endX,endY)=row\n", 75 | "\n", 76 | " imagepaths=os.path.sep.join([images,filename])\n", 77 | " image=cv2.imread(imagepaths)\n", 78 | " (h,w)=image.shape[:2]\n", 79 | "\n", 80 | " # initializing starting point\n", 81 | " # Why we take in float because when we convert into array so then will trouble happen\n", 82 | " startX = float(startX) / w\n", 83 | " startY = float(startY) / h\n", 84 | " # Also initialize ending point \n", 85 | " endX = float(endX) / w\n", 86 | " endY = float(endY) / h\n", 87 | " #load image and give them default size\n", 88 | " image=load_img(imagepaths,target_size=(224,224))\n", 89 | " # see here if we cant take it into float then we face trouble \n", 90 | " image=img_to_array(image)\n", 91 | "\n", 92 | " # Lets append into data , targets ,filenames\n", 93 | " 
targets.append((startX,startY,endX,endY))\n", 94 | " filenames.append(filename)\n", 95 | " data.append(image)\n" 96 | ], 97 | "execution_count": 177, 98 | "outputs": [] 99 | }, 100 | { 101 | "cell_type": "code", 102 | "metadata": { 103 | "id": "0J6zfNkZJKi_" 104 | }, 105 | "source": [ 106 | "# Normalizing Data here also we face would face issues if we take input as integer\n", 107 | "import numpy as np\n", 108 | "data=np.array(data,dtype='float32') / 255.0\n", 109 | "targets=np.array(targets,dtype='float32')" 110 | ], 111 | "execution_count": 178, 112 | "outputs": [] 113 | }, 114 | { 115 | "cell_type": "code", 116 | "metadata": { 117 | "id": "cI3z3eOUJdCI" 118 | }, 119 | "source": [ 120 | "# we should seperate data into train and split so import sklearn library \n", 121 | "from sklearn.model_selection import train_test_split" 122 | ], 123 | "execution_count": 179, 124 | "outputs": [] 125 | }, 126 | { 127 | "cell_type": "code", 128 | "metadata": { 129 | "id": "2WwSEjEJK1MI" 130 | }, 131 | "source": [ 132 | "# split into testing and training\n", 133 | "split=train_test_split(data,targets,filenames,test_size=0.10,random_state=42)" 134 | ], 135 | "execution_count": 180, 136 | "outputs": [] 137 | }, 138 | { 139 | "cell_type": "code", 140 | "metadata": { 141 | "id": "sxxnaBHhLIyR" 142 | }, 143 | "source": [ 144 | "# lets split into steps\n", 145 | "(train_images,test_images) = split[:2]\n", 146 | "(train_targets,test_targets) = split[2:4]\n", 147 | "(train_filenames,test_filenames) = split[4:]\n" 148 | ], 149 | "execution_count": 181, 150 | "outputs": [] 151 | }, 152 | { 153 | "cell_type": "code", 154 | "metadata": { 155 | "id": "MZBKv2k_L62z" 156 | }, 157 | "source": [ 158 | "# lets import pre trained VGG16 Which is already Builtin for computer vision\n", 159 | "from tensorflow.keras.applications import VGG16\n", 160 | "from tensorflow.keras.layers import Input" 161 | ], 162 | "execution_count": 182, 163 | "outputs": [] 164 | }, 165 | { 166 | "cell_type": "code", 167 | 
"metadata": { 168 | "id": "YpIvvM48MUOs" 169 | }, 170 | "source": [ 171 | "# Imagenet is a competition every year held and VGG16 is winner of between 2013-14\n", 172 | "# so here we just want limited layers so thats why we false included_top \n", 173 | "vgg=VGG16(weights='imagenet',include_top=False,input_tensor=Input(shape=(224,224,3)))" 174 | ], 175 | "execution_count": 183, 176 | "outputs": [] 177 | }, 178 | { 179 | "cell_type": "code", 180 | "metadata": { 181 | "colab": { 182 | "base_uri": "https://localhost:8080/" 183 | }, 184 | "id": "gPaFO3AZNZOE", 185 | "outputId": "98af118e-9bc5-4ad4-aac0-6c1c965740a9" 186 | }, 187 | "source": [ 188 | "vgg.summary()" 189 | ], 190 | "execution_count": 184, 191 | "outputs": [ 192 | { 193 | "output_type": "stream", 194 | "text": [ 195 | "Model: \"vgg16\"\n", 196 | "_________________________________________________________________\n", 197 | "Layer (type) Output Shape Param # \n", 198 | "=================================================================\n", 199 | "input_4 (InputLayer) [(None, 224, 224, 3)] 0 \n", 200 | "_________________________________________________________________\n", 201 | "block1_conv1 (Conv2D) (None, 224, 224, 64) 1792 \n", 202 | "_________________________________________________________________\n", 203 | "block1_conv2 (Conv2D) (None, 224, 224, 64) 36928 \n", 204 | "_________________________________________________________________\n", 205 | "block1_pool (MaxPooling2D) (None, 112, 112, 64) 0 \n", 206 | "_________________________________________________________________\n", 207 | "block2_conv1 (Conv2D) (None, 112, 112, 128) 73856 \n", 208 | "_________________________________________________________________\n", 209 | "block2_conv2 (Conv2D) (None, 112, 112, 128) 147584 \n", 210 | "_________________________________________________________________\n", 211 | "block2_pool (MaxPooling2D) (None, 56, 56, 128) 0 \n", 212 | "_________________________________________________________________\n", 213 | "block3_conv1 
(Conv2D) (None, 56, 56, 256) 295168 \n", 214 | "_________________________________________________________________\n", 215 | "block3_conv2 (Conv2D) (None, 56, 56, 256) 590080 \n", 216 | "_________________________________________________________________\n", 217 | "block3_conv3 (Conv2D) (None, 56, 56, 256) 590080 \n", 218 | "_________________________________________________________________\n", 219 | "block3_pool (MaxPooling2D) (None, 28, 28, 256) 0 \n", 220 | "_________________________________________________________________\n", 221 | "block4_conv1 (Conv2D) (None, 28, 28, 512) 1180160 \n", 222 | "_________________________________________________________________\n", 223 | "block4_conv2 (Conv2D) (None, 28, 28, 512) 2359808 \n", 224 | "_________________________________________________________________\n", 225 | "block4_conv3 (Conv2D) (None, 28, 28, 512) 2359808 \n", 226 | "_________________________________________________________________\n", 227 | "block4_pool (MaxPooling2D) (None, 14, 14, 512) 0 \n", 228 | "_________________________________________________________________\n", 229 | "block5_conv1 (Conv2D) (None, 14, 14, 512) 2359808 \n", 230 | "_________________________________________________________________\n", 231 | "block5_conv2 (Conv2D) (None, 14, 14, 512) 2359808 \n", 232 | "_________________________________________________________________\n", 233 | "block5_conv3 (Conv2D) (None, 14, 14, 512) 2359808 \n", 234 | "_________________________________________________________________\n", 235 | "block5_pool (MaxPooling2D) (None, 7, 7, 512) 0 \n", 236 | "=================================================================\n", 237 | "Total params: 14,714,688\n", 238 | "Trainable params: 14,714,688\n", 239 | "Non-trainable params: 0\n", 240 | "_________________________________________________________________\n" 241 | ], 242 | "name": "stdout" 243 | } 244 | ] 245 | }, 246 | { 247 | "cell_type": "code", 248 | "metadata": { 249 | "id": "0YPGZE1oNdvP" 250 | }, 251 | "source": [ 252 | 
"from tensorflow.keras.layers import Input,Flatten,Dense" 253 | ], 254 | "execution_count": 185, 255 | "outputs": [] 256 | }, 257 | { 258 | "cell_type": "code", 259 | "metadata": { 260 | "id": "Qg2i026tNnOM" 261 | }, 262 | "source": [ 263 | "# we use VGG16 as per our requirement not use whole \n", 264 | "vgg.trainable = False\n", 265 | "\n", 266 | "flatten = vgg.output\n", 267 | "\n", 268 | "flatten = Flatten()(flatten)" 269 | ], 270 | "execution_count": 186, 271 | "outputs": [] 272 | }, 273 | { 274 | "cell_type": "code", 275 | "metadata": { 276 | "id": "zBCbiwSMODz-" 277 | }, 278 | "source": [ 279 | "# Lets make bboxhead\n", 280 | "bboxhead = Dense(128,activation=\"relu\")(flatten)\n", 281 | "bboxhead = Dense(64,activation=\"relu\")(bboxhead)\n", 282 | "bboxhead = Dense(32,activation=\"relu\")(bboxhead)\n", 283 | "bboxhead = Dense(4,activation=\"relu\")(bboxhead)" 284 | ], 285 | "execution_count": 187, 286 | "outputs": [] 287 | }, 288 | { 289 | "cell_type": "code", 290 | "metadata": { 291 | "id": "6whx45l1OdFW" 292 | }, 293 | "source": [ 294 | "# lets import Model\n", 295 | "from tensorflow.keras.models import Model\n", 296 | "model = Model(inputs = vgg.input,outputs = bboxhead)" 297 | ], 298 | "execution_count": 188, 299 | "outputs": [] 300 | }, 301 | { 302 | "cell_type": "code", 303 | "metadata": { 304 | "colab": { 305 | "base_uri": "https://localhost:8080/" 306 | }, 307 | "id": "nhp9ptwpO7_7", 308 | "outputId": "95426c6f-3ea8-49df-c5b5-fcf5b098b9ff" 309 | }, 310 | "source": [ 311 | "model.summary()" 312 | ], 313 | "execution_count": 189, 314 | "outputs": [ 315 | { 316 | "output_type": "stream", 317 | "text": [ 318 | "Model: \"functional_7\"\n", 319 | "_________________________________________________________________\n", 320 | "Layer (type) Output Shape Param # \n", 321 | "=================================================================\n", 322 | "input_4 (InputLayer) [(None, 224, 224, 3)] 0 \n", 323 | 
"_________________________________________________________________\n", 324 | "block1_conv1 (Conv2D) (None, 224, 224, 64) 1792 \n", 325 | "_________________________________________________________________\n", 326 | "block1_conv2 (Conv2D) (None, 224, 224, 64) 36928 \n", 327 | "_________________________________________________________________\n", 328 | "block1_pool (MaxPooling2D) (None, 112, 112, 64) 0 \n", 329 | "_________________________________________________________________\n", 330 | "block2_conv1 (Conv2D) (None, 112, 112, 128) 73856 \n", 331 | "_________________________________________________________________\n", 332 | "block2_conv2 (Conv2D) (None, 112, 112, 128) 147584 \n", 333 | "_________________________________________________________________\n", 334 | "block2_pool (MaxPooling2D) (None, 56, 56, 128) 0 \n", 335 | "_________________________________________________________________\n", 336 | "block3_conv1 (Conv2D) (None, 56, 56, 256) 295168 \n", 337 | "_________________________________________________________________\n", 338 | "block3_conv2 (Conv2D) (None, 56, 56, 256) 590080 \n", 339 | "_________________________________________________________________\n", 340 | "block3_conv3 (Conv2D) (None, 56, 56, 256) 590080 \n", 341 | "_________________________________________________________________\n", 342 | "block3_pool (MaxPooling2D) (None, 28, 28, 256) 0 \n", 343 | "_________________________________________________________________\n", 344 | "block4_conv1 (Conv2D) (None, 28, 28, 512) 1180160 \n", 345 | "_________________________________________________________________\n", 346 | "block4_conv2 (Conv2D) (None, 28, 28, 512) 2359808 \n", 347 | "_________________________________________________________________\n", 348 | "block4_conv3 (Conv2D) (None, 28, 28, 512) 2359808 \n", 349 | "_________________________________________________________________\n", 350 | "block4_pool (MaxPooling2D) (None, 14, 14, 512) 0 \n", 351 | 
"_________________________________________________________________\n", 352 | "block5_conv1 (Conv2D) (None, 14, 14, 512) 2359808 \n", 353 | "_________________________________________________________________\n", 354 | "block5_conv2 (Conv2D) (None, 14, 14, 512) 2359808 \n", 355 | "_________________________________________________________________\n", 356 | "block5_conv3 (Conv2D) (None, 14, 14, 512) 2359808 \n", 357 | "_________________________________________________________________\n", 358 | "block5_pool (MaxPooling2D) (None, 7, 7, 512) 0 \n", 359 | "_________________________________________________________________\n", 360 | "flatten_3 (Flatten) (None, 25088) 0 \n", 361 | "_________________________________________________________________\n", 362 | "dense_12 (Dense) (None, 128) 3211392 \n", 363 | "_________________________________________________________________\n", 364 | "dense_13 (Dense) (None, 64) 8256 \n", 365 | "_________________________________________________________________\n", 366 | "dense_14 (Dense) (None, 32) 2080 \n", 367 | "_________________________________________________________________\n", 368 | "dense_15 (Dense) (None, 4) 132 \n", 369 | "=================================================================\n", 370 | "Total params: 17,936,548\n", 371 | "Trainable params: 3,221,860\n", 372 | "Non-trainable params: 14,714,688\n", 373 | "_________________________________________________________________\n" 374 | ], 375 | "name": "stdout" 376 | } 377 | ] 378 | }, 379 | { 380 | "cell_type": "code", 381 | "metadata": { 382 | "id": "Fye3qz96O9y2" 383 | }, 384 | "source": [ 385 | "# Lets fit our model \n", 386 | "# Optimization \n", 387 | "from tensorflow.keras.optimizers import Adam\n", 388 | "\n", 389 | "opt = Adam(1e-4)" 390 | ], 391 | "execution_count": 190, 392 | "outputs": [] 393 | }, 394 | { 395 | "cell_type": "code", 396 | "metadata": { 397 | "id": "o1G-qSwgPX6u" 398 | }, 399 | "source": [ 400 | "model.compile(loss='mse',optimizer=opt)" 401 | ], 402 | 
"execution_count": 191, 403 | "outputs": [] 404 | }, 405 | { 406 | "cell_type": "code", 407 | "metadata": { 408 | "colab": { 409 | "base_uri": "https://localhost:8080/" 410 | }, 411 | "id": "rOfPJd6RPek5", 412 | "outputId": "fade4076-7c0d-44e4-abb0-c0e9deb9edc6" 413 | }, 414 | "source": [ 415 | "history = model.fit(train_images,train_targets,validation_data=(test_images,test_targets),batch_size=32,epochs=50,verbose=1)" 416 | ], 417 | "execution_count": 192, 418 | "outputs": [ 419 | { 420 | "output_type": "stream", 421 | "text": [ 422 | "Epoch 1/50\n", 423 | "23/23 [==============================] - 4s 155ms/step - loss: 0.0342 - val_loss: 0.0186\n", 424 | "Epoch 2/50\n", 425 | "23/23 [==============================] - 3s 152ms/step - loss: 0.0124 - val_loss: 0.0080\n", 426 | "Epoch 3/50\n", 427 | "23/23 [==============================] - 4s 153ms/step - loss: 0.0072 - val_loss: 0.0069\n", 428 | "Epoch 4/50\n", 429 | "23/23 [==============================] - 4s 155ms/step - loss: 0.0061 - val_loss: 0.0066\n", 430 | "Epoch 5/50\n", 431 | "23/23 [==============================] - 4s 155ms/step - loss: 0.0057 - val_loss: 0.0064\n", 432 | "Epoch 6/50\n", 433 | "23/23 [==============================] - 3s 151ms/step - loss: 0.0042 - val_loss: 0.0026\n", 434 | "Epoch 7/50\n", 435 | "23/23 [==============================] - 3s 148ms/step - loss: 0.0017 - val_loss: 0.0022\n", 436 | "Epoch 8/50\n", 437 | "23/23 [==============================] - 3s 147ms/step - loss: 0.0011 - val_loss: 0.0019\n", 438 | "Epoch 9/50\n", 439 | "23/23 [==============================] - 3s 146ms/step - loss: 7.5519e-04 - val_loss: 0.0020\n", 440 | "Epoch 10/50\n", 441 | "23/23 [==============================] - 3s 142ms/step - loss: 6.2409e-04 - val_loss: 0.0018\n", 442 | "Epoch 11/50\n", 443 | "23/23 [==============================] - 3s 141ms/step - loss: 5.1805e-04 - val_loss: 0.0017\n", 444 | "Epoch 12/50\n", 445 | "23/23 [==============================] - 3s 140ms/step - loss: 4.3773e-04 - 
val_loss: 0.0018\n", 446 | "Epoch 13/50\n", 447 | "23/23 [==============================] - 3s 139ms/step - loss: 3.7624e-04 - val_loss: 0.0017\n", 448 | "Epoch 14/50\n", 449 | "23/23 [==============================] - 3s 138ms/step - loss: 3.0916e-04 - val_loss: 0.0016\n", 450 | "Epoch 15/50\n", 451 | "23/23 [==============================] - 3s 138ms/step - loss: 2.7746e-04 - val_loss: 0.0017\n", 452 | "Epoch 16/50\n", 453 | "23/23 [==============================] - 3s 139ms/step - loss: 2.4081e-04 - val_loss: 0.0017\n", 454 | "Epoch 17/50\n", 455 | "23/23 [==============================] - 3s 139ms/step - loss: 2.0326e-04 - val_loss: 0.0016\n", 456 | "Epoch 18/50\n", 457 | "23/23 [==============================] - 3s 139ms/step - loss: 1.8227e-04 - val_loss: 0.0017\n", 458 | "Epoch 19/50\n", 459 | "23/23 [==============================] - 3s 139ms/step - loss: 1.6671e-04 - val_loss: 0.0016\n", 460 | "Epoch 20/50\n", 461 | "23/23 [==============================] - 3s 139ms/step - loss: 1.4739e-04 - val_loss: 0.0016\n", 462 | "Epoch 21/50\n", 463 | "23/23 [==============================] - 3s 140ms/step - loss: 1.2999e-04 - val_loss: 0.0016\n", 464 | "Epoch 22/50\n", 465 | "23/23 [==============================] - 3s 142ms/step - loss: 1.1637e-04 - val_loss: 0.0016\n", 466 | "Epoch 23/50\n", 467 | "23/23 [==============================] - 3s 144ms/step - loss: 1.0369e-04 - val_loss: 0.0016\n", 468 | "Epoch 24/50\n", 469 | "23/23 [==============================] - 3s 145ms/step - loss: 9.5637e-05 - val_loss: 0.0016\n", 470 | "Epoch 25/50\n", 471 | "23/23 [==============================] - 3s 148ms/step - loss: 8.6480e-05 - val_loss: 0.0016\n", 472 | "Epoch 26/50\n", 473 | "23/23 [==============================] - 3s 148ms/step - loss: 8.1371e-05 - val_loss: 0.0016\n", 474 | "Epoch 27/50\n", 475 | "23/23 [==============================] - 3s 148ms/step - loss: 7.6879e-05 - val_loss: 0.0016\n", 476 | "Epoch 28/50\n", 477 | "23/23 [==============================] - 3s 
147ms/step - loss: 7.5892e-05 - val_loss: 0.0016\n", 478 | "Epoch 29/50\n", 479 | "23/23 [==============================] - 3s 147ms/step - loss: 6.9565e-05 - val_loss: 0.0016\n", 480 | "Epoch 30/50\n", 481 | "23/23 [==============================] - 3s 145ms/step - loss: 6.9834e-05 - val_loss: 0.0016\n", 482 | "Epoch 31/50\n", 483 | "23/23 [==============================] - 3s 145ms/step - loss: 7.2559e-05 - val_loss: 0.0016\n", 484 | "Epoch 32/50\n", 485 | "23/23 [==============================] - 3s 145ms/step - loss: 7.9856e-05 - val_loss: 0.0016\n", 486 | "Epoch 33/50\n", 487 | "23/23 [==============================] - 3s 143ms/step - loss: 8.3668e-05 - val_loss: 0.0016\n", 488 | "Epoch 34/50\n", 489 | "23/23 [==============================] - 3s 143ms/step - loss: 9.3816e-05 - val_loss: 0.0016\n", 490 | "Epoch 35/50\n", 491 | "23/23 [==============================] - 3s 142ms/step - loss: 9.1081e-05 - val_loss: 0.0017\n", 492 | "Epoch 36/50\n", 493 | "23/23 [==============================] - 3s 142ms/step - loss: 7.7571e-05 - val_loss: 0.0015\n", 494 | "Epoch 37/50\n", 495 | "23/23 [==============================] - 3s 142ms/step - loss: 6.3817e-05 - val_loss: 0.0016\n", 496 | "Epoch 38/50\n", 497 | "23/23 [==============================] - 3s 142ms/step - loss: 5.2386e-05 - val_loss: 0.0017\n", 498 | "Epoch 39/50\n", 499 | "23/23 [==============================] - 3s 142ms/step - loss: 5.4663e-05 - val_loss: 0.0015\n", 500 | "Epoch 40/50\n", 501 | "23/23 [==============================] - 3s 142ms/step - loss: 5.1314e-05 - val_loss: 0.0016\n", 502 | "Epoch 41/50\n", 503 | "23/23 [==============================] - 3s 142ms/step - loss: 5.9441e-05 - val_loss: 0.0015\n", 504 | "Epoch 42/50\n", 505 | "23/23 [==============================] - 3s 142ms/step - loss: 5.2642e-05 - val_loss: 0.0016\n", 506 | "Epoch 43/50\n", 507 | "23/23 [==============================] - 3s 143ms/step - loss: 5.3741e-05 - val_loss: 0.0015\n", 508 | "Epoch 44/50\n", 509 | "23/23 
[==============================] - 3s 143ms/step - loss: 4.8830e-05 - val_loss: 0.0016\n", 510 | "Epoch 45/50\n", 511 | "23/23 [==============================] - 3s 143ms/step - loss: 5.1866e-05 - val_loss: 0.0015\n", 512 | "Epoch 46/50\n", 513 | "23/23 [==============================] - 3s 144ms/step - loss: 4.8851e-05 - val_loss: 0.0016\n", 514 | "Epoch 47/50\n", 515 | "23/23 [==============================] - 3s 144ms/step - loss: 5.3998e-05 - val_loss: 0.0016\n", 516 | "Epoch 48/50\n", 517 | "23/23 [==============================] - 3s 145ms/step - loss: 7.1642e-05 - val_loss: 0.0015\n", 518 | "Epoch 49/50\n", 519 | "23/23 [==============================] - 3s 144ms/step - loss: 7.0317e-05 - val_loss: 0.0016\n", 520 | "Epoch 50/50\n", 521 | "23/23 [==============================] - 3s 145ms/step - loss: 7.0439e-05 - val_loss: 0.0017\n" 522 | ], 523 | "name": "stdout" 524 | } 525 | ] 526 | }, 527 | { 528 | "cell_type": "code", 529 | "metadata": { 530 | "id": "lxlowj7qPyyv" 531 | }, 532 | "source": [ 533 | "# lets save model \n", 534 | "model.save('detect_Planes.h5')" 535 | ], 536 | "execution_count": 228, 537 | "outputs": [] 538 | }, 539 | { 540 | "cell_type": "code", 541 | "metadata": { 542 | "id": "42ly_eqHQh6v" 543 | }, 544 | "source": [ 545 | "from tensorflow.keras.models import load_model" 546 | ], 547 | "execution_count": 229, 548 | "outputs": [] 549 | }, 550 | { 551 | "cell_type": "code", 552 | "metadata": { 553 | "id": "THgB3xU0QqGI" 554 | }, 555 | "source": [ 556 | "model=load_model('/content/detect_Planes.h5')" 557 | ], 558 | "execution_count": 230, 559 | "outputs": [] 560 | }, 561 | { 562 | "cell_type": "code", 563 | "metadata": { 564 | "id": "w-tWP_UtQuwk" 565 | }, 566 | "source": [ 567 | "imagepath='/content/drive/MyDrive/Applied_Ai_Course/Datasets/images/image_0111.jpg'" 568 | ], 569 | "execution_count": 231, 570 | "outputs": [] 571 | }, 572 | { 573 | "cell_type": "code", 574 | "metadata": { 575 | "id": "IV_2eMaxQ6pa" 576 | }, 577 | "source": [ 578 
| "image = load_img(imagepath,\n", 579 | " target_size=(224,224))\n", 580 | "image = img_to_array(image) / 255.0\n", 581 | "image = np.expand_dims(image,axis=0)" 582 | ], 583 | "execution_count": 232, 584 | "outputs": [] 585 | }, 586 | { 587 | "cell_type": "code", 588 | "metadata": { 589 | "id": "XVLc6q9_RO8m", 590 | "colab": { 591 | "base_uri": "https://localhost:8080/" 592 | }, 593 | "outputId": "101aa2c3-0af5-4be8-ef7e-13a7c8680530" 594 | }, 595 | "source": [ 596 | "preds=model.predict(image)[0]\n", 597 | "(startX,startY,endX,endY)=preds" 598 | ], 599 | "execution_count": 233, 600 | "outputs": [ 601 | { 602 | "output_type": "stream", 603 | "text": [ 604 | "WARNING:tensorflow:5 out of the last 11 calls to .predict_function at 0x7fb51c1a81e0> triggered tf.function retracing. Tracing is expensive and the excessive number of tracings could be due to (1) creating @tf.function repeatedly in a loop, (2) passing tensors with different shapes, (3) passing Python objects instead of tensors. For (1), please define your @tf.function outside of the loop. For (2), @tf.function has experimental_relax_shapes=True option that relaxes argument shapes that can avoid unnecessary retracing. 
For (3), please refer to https://www.tensorflow.org/tutorials/customization/performance#python_or_tensor_args and https://www.tensorflow.org/api_docs/python/tf/function for more details.\n" 605 | ], 606 | "name": "stdout" 607 | } 608 | ] 609 | }, 610 | { 611 | "cell_type": "code", 612 | "metadata": { 613 | "id": "B0tlESgNRkcW" 614 | }, 615 | "source": [ 616 | "import imutils" 617 | ], 618 | "execution_count": 234, 619 | "outputs": [] 620 | }, 621 | { 622 | "cell_type": "code", 623 | "metadata": { 624 | "id": "UjPHbP2sRnUH" 625 | }, 626 | "source": [ 627 | "image=cv2.imread(imagepath)\n", 628 | "image=imutils.resize(image,width=600)" 629 | ], 630 | "execution_count": 235, 631 | "outputs": [] 632 | }, 633 | { 634 | "cell_type": "code", 635 | "metadata": { 636 | "id": "mMj9dh6IRuAm" 637 | }, 638 | "source": [ 639 | "(h,w)=image.shape[:2]" 640 | ], 641 | "execution_count": 236, 642 | "outputs": [] 643 | }, 644 | { 645 | "cell_type": "code", 646 | "metadata": { 647 | "id": "oHiDSiASSKXE" 648 | }, 649 | "source": [ 650 | "startX=int(startX * w)\n", 651 | "startY=int(startY * h)\n", 652 | "\n", 653 | "endX=int(endX * w)\n", 654 | "endY=int(endY * h)" 655 | ], 656 | "execution_count": 237, 657 | "outputs": [] 658 | }, 659 | { 660 | "cell_type": "code", 661 | "metadata": { 662 | "colab": { 663 | "base_uri": "https://localhost:8080/" 664 | }, 665 | "id": "Wo2Ps1aEScXZ", 666 | "outputId": "8266c719-e5a5-49fe-be0d-ba5098824f32" 667 | }, 668 | "source": [ 669 | "cv2.rectangle(image,(startX,startY),(endX,endY),(0,255,0),3)" 670 | ], 671 | "execution_count": 238, 672 | "outputs": [ 673 | { 674 | "output_type": "execute_result", 675 | "data": { 676 | "text/plain": [ 677 | "array([[[255, 255, 255],\n", 678 | " [255, 255, 255],\n", 679 | " [255, 255, 255],\n", 680 | " ...,\n", 681 | " [255, 255, 255],\n", 682 | " [255, 255, 255],\n", 683 | " [255, 255, 255]],\n", 684 | "\n", 685 | " [[255, 255, 255],\n", 686 | " [255, 255, 255],\n", 687 | " [255, 255, 255],\n", 688 | " ...,\n", 689
| " [255, 255, 255],\n", 690 | " [255, 255, 255],\n", 691 | " [255, 255, 255]],\n", 692 | "\n", 693 | " [[255, 255, 255],\n", 694 | " [255, 255, 255],\n", 695 | " [255, 255, 255],\n", 696 | " ...,\n", 697 | " [255, 255, 255],\n", 698 | " [255, 255, 255],\n", 699 | " [255, 255, 255]],\n", 700 | "\n", 701 | " ...,\n", 702 | "\n", 703 | " [[255, 255, 255],\n", 704 | " [255, 255, 255],\n", 705 | " [255, 255, 255],\n", 706 | " ...,\n", 707 | " [255, 255, 255],\n", 708 | " [255, 255, 255],\n", 709 | " [255, 255, 255]],\n", 710 | "\n", 711 | " [[255, 255, 255],\n", 712 | " [255, 255, 255],\n", 713 | " [255, 255, 255],\n", 714 | " ...,\n", 715 | " [255, 255, 255],\n", 716 | " [255, 255, 255],\n", 717 | " [255, 255, 255]],\n", 718 | "\n", 719 | " [[255, 255, 255],\n", 720 | " [255, 255, 255],\n", 721 | " [255, 255, 255],\n", 722 | " ...,\n", 723 | " [255, 255, 255],\n", 724 | " [255, 255, 255],\n", 725 | " [255, 255, 255]]], dtype=uint8)" 726 | ] 727 | }, 728 | "metadata": { 729 | "tags": [] 730 | }, 731 | "execution_count": 238 732 | } 733 | ] 734 | }, 735 | { 736 | "cell_type": "code", 737 | "metadata": { 738 | "id": "uklFcl2AStTe" 739 | }, 740 | "source": [ 741 | "\n", 742 | "from google.colab.patches import cv2_imshow\n" 743 | ], 744 | "execution_count": 239, 745 | "outputs": [] 746 | }, 747 | { 748 | "cell_type": "code", 749 | "metadata": { 750 | "colab": { 751 | "base_uri": "https://localhost:8080/", 752 | "height": 181 753 | }, 754 | "id": "jUNMby9FS3OJ", 755 | "outputId": "7a89dff4-d565-4be3-cae3-5cc030a95719" 756 | }, 757 | "source": [ 758 | "import matplotlib.pyplot as plt\n", 759 | "plt.imshow(image)\n", 760 | "cv2.waitKey(0)" 761 | ], 762 | "execution_count": 240, 763 | "outputs": [ 764 | { 765 | "output_type": "execute_result", 766 | "data": { 767 | "text/plain": [ 768 | "-1" 769 | ] 770 | }, 771 | "metadata": { 772 | "tags": [] 773 | }, 774 | "execution_count": 240 775 | }, 776 | { 777 | "output_type": "display_data", 778 | "data": { 779 | "image/png": "\n", 
780 | "text/plain": [ 781 | "
" 782 | ] 783 | }, 784 | "metadata": { 785 | "tags": [], 786 | "needs_background": "light" 787 | } 788 | } 789 | ] 790 | }, 791 | { 792 | "cell_type": "code", 793 | "metadata": { 794 | "id": "sknQm7nFTmVZ" 795 | }, 796 | "source": [ 797 | "" 798 | ], 799 | "execution_count": 240, 800 | "outputs": [] 801 | } 802 | ] 803 | } --------------------------------------------------------------------------------
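The last inference cells of the notebook multiply the four predicted values by the width and height of the displayed image to get pixel coordinates for `cv2.rectangle`. A minimal sketch of that scaling step as a standalone helper follows. The name `scale_box` is hypothetical, and the clamping to the image bounds is an added assumption (the notebook does not clamp); it only guards against predictions that fall slightly outside the 0 to 1 range.

```python
def scale_box(preds, width, height):
    """Convert a normalised (startX, startY, endX, endY) box to pixel coordinates.

    `preds` is assumed to hold four values in roughly the 0..1 range, as the
    notebook's model output does. `width` and `height` are the dimensions of
    the image the box will be drawn on.
    """
    start_x, start_y, end_x, end_y = (float(v) for v in preds)

    def clamp(value, upper):
        # Truncate to int (as the notebook does) and keep the result inside
        # the image. The clamping itself is an assumption, not notebook code.
        return max(0, min(int(value), upper))

    return (
        clamp(start_x * width, width - 1),
        clamp(start_y * height, height - 1),
        clamp(end_x * width, width - 1),
        clamp(end_y * height, height - 1),
    )


# Example with exactly representable fractions on a 600x400 image.
print(scale_box((0.25, 0.5, 0.75, 1.0), 600, 400))  # prints (150, 200, 450, 399)
```

In the notebook this would be called as `scale_box(preds, w, h)` after `(h,w)=image.shape[:2]`, and the result passed to `cv2.rectangle`. Note also that `cv2.imread` returns images in BGR channel order while `plt.imshow` expects RGB, so converting with `cv2.cvtColor(image, cv2.COLOR_BGR2RGB)` before plotting is likely needed for correct colours.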