├── requirements.txt
├── Image
│   └── Detect_image.png
├── LICENSE
├── README.md
├── object_detection_using_vgg16_with_tensorflow.py
└── Object_Detection_Using_VGG16_With_Tensorflow.ipynb
/requirements.txt:
--------------------------------------------------------------------------------
1 | tensorflow
2 | numpy
3 | matplotlib
4 | scikit-learn
5 | opencv-python
6 | imutils
--------------------------------------------------------------------------------
/Image/Detect_image.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/zubairsamo/Object-Detection-With-Tensorflow-Using-VGG16/HEAD/Image/Detect_image.png
--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
1 | MIT License
2 |
3 | Copyright (c) 2020 zubair samo
4 |
5 | Permission is hereby granted, free of charge, to any person obtaining a copy
6 | of this software and associated documentation files (the "Software"), to deal
7 | in the Software without restriction, including without limitation the rights
8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9 | copies of the Software, and to permit persons to whom the Software is
10 | furnished to do so, subject to the following conditions:
11 |
12 | The above copyright notice and this permission notice shall be included in all
13 | copies or substantial portions of the Software.
14 |
15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21 | SOFTWARE.
22 |
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # Object-Detection-With-Tensorflow-Using-VGG16
2 |
3 |
4 |
5 |
6 |
7 |
11 | [Fork](https://github.com/zubairsamo/Object-Detection-With-Tensorflow-Using-VGG16/fork)
12 | [Issues](https://github.com/zubairsamo/Object-Detection-With-Tensorflow-Using-VGG16/issues)
13 | # VGG16 Architecture
14 | The input to the convolutional network is a fixed-size 224 X 224 X 3 image. In the original VGG paper, the preprocessing step subtracts the mean RGB value from each pixel. The image is passed
15 | through a stack of convolutional layers with 3 X 3 receptive fields (the smallest size that
16 | captures the notions of left/right, up/down, and center). One configuration in the original paper also uses 1 X 1 convolutional filters, which linearly
17 | transform the input channels; the Keras VGG16 used here uses 3 X 3 filters throughout. The stride is fixed to 1 pixel and the padding is chosen so that the
18 | spatial resolution is preserved after convolution. Max-pooling is performed by five layers over
19 | a 2 X 2 pixel window, with stride 2. I have used the weights of a VGG16 model pre-trained
20 | on the ImageNet dataset to extract features. The feature for each image is a tensor of dimension 7
21 | X 7 X 512. Fig 1 represents the architecture of the convolutional layers in
22 | VGG16.
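The 7 X 7 X 512 feature shape follows from the five stride-2 pooling layers, since each one halves the spatial size of the 224 X 224 input. A minimal sketch of that arithmetic, assuming size-preserving convolutions as described above (the function name `vgg16_feature_shape` is illustrative and not part of this repository):

```python
def vgg16_feature_shape(input_size=224, num_pools=5, channels=512):
    """Return (height, width, channels) after the convolutional blocks.

    Assumes padded 3x3 convolutions (which keep the spatial size)
    and 2x2 max-pooling with stride 2 (which halves it).
    """
    size = input_size
    for _ in range(num_pools):
        size //= 2  # each pooling layer halves height and width
    return (size, size, channels)


shape = vgg16_feature_shape()
flattened = shape[0] * shape[1] * shape[2]
print(shape, flattened)
```

The flattened length is the input size of the first dense layer that the script places on top of the VGG16 features.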
23 | # Description
24 | Object detection is one of the fundamental problems in the field of artificial intelligence,
25 | with applications in robotics, automation, and human-computer interaction. The aim is to
26 | track an arbitrary object in consecutive frames of a video segment by localizing it inside
27 | bounding boxes. The most common representation of these bounding boxes is in terms of
28 | the top-left and bottom-right coordinates in the frame, measured from the origin of the
29 | image.
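As a rough sketch of this representation, the helper functions below (hypothetical names, not part of the repository) convert a box between pixel coordinates and coordinates scaled to the range [0, 1] by the image width and height. The training script uses the scaled form as its regression targets and maps predictions back to pixels before drawing. The example values are made up for illustration.

```python
def normalize_box(box, width, height):
    """Scale (startX, startY, endX, endY) from pixels to the range [0, 1]."""
    start_x, start_y, end_x, end_y = box
    return (start_x / width, start_y / height, end_x / width, end_y / height)


def denormalize_box(box, width, height):
    """Map a normalized box back to integer pixel coordinates."""
    start_x, start_y, end_x, end_y = box
    return (int(start_x * width), int(start_y * height),
            int(end_x * width), int(end_y * height))


# A box annotated on a 400 x 200 image, redrawn on a 600 x 300 copy.
norm = normalize_box((50, 50, 350, 150), width=400, height=200)
print(norm)
print(denormalize_box(norm, width=600, height=300))
```

Because the targets are relative to the image size, the same prediction can be drawn on a copy of the image resized to any resolution.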
30 | The VGG16 model secured the first position in ILSVRC 2014 for object localization, and its accuracy
31 | in predicting the location of these boxes is high. Nevertheless, the trade-off between accuracy
32 | and computational cost is evident and raises the need for faster
33 | approaches. In this project, the VGG16 model is initialized with weights pre-trained on
34 | ImageNet and used for feature extraction. This transfer learning approach is advantageous because it avoids
35 | the need to train the model on a large-scale dataset like ImageNet.
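To make the saving concrete, the sketch below counts the weights of the dense bounding-box head defined in the script (layer sizes 128, 64, 32 and 4 on top of the flattened 7 X 7 X 512 features). Only these weights are trained; the convolutional VGG16 weights stay frozen. The helper name `dense_params` is illustrative.

```python
def dense_params(n_in, n_out):
    """Parameters of a fully connected layer: weights plus one bias per unit."""
    return n_in * n_out + n_out


# Flattened VGG16 features followed by the four dense layers of the head.
layer_sizes = [7 * 7 * 512, 128, 64, 32, 4]
head_params = sum(
    dense_params(n_in, n_out)
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:])
)
print(head_params)
```

This gives 3,221,860 trainable parameters against 14,714,688 frozen ones in the VGG16 convolutional base, which matches the `model.summary()` output in the notebook.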
36 | # Run it now
37 | [Open in Colab](https://colab.research.google.com/drive/15tA57gXnWprZjc5J_V7591AUdSQWrTF6?usp=sharing)
38 |
39 | # Dataset
40 | https://drive.google.com/drive/folders/1NxnVWN-aJuHdTibSOPstzZcFiIHEgx6Y?usp=sharing
41 |
42 | # Requirements
43 |
44 | Clone the repository, then run the commands below in a terminal to install the dependencies listed in `requirements.txt`.
45 | ```
46 | git clone https://github.com/zubairsamo/Object-Detection-With-Tensorflow-Using-VGG16.git
47 | cd Object-Detection-With-Tensorflow-Using-VGG16
48 | pip install -r requirements.txt
49 | ```
50 |
51 | # Usage
52 | Open and run `Object_Detection_Using_VGG16_With_Tensorflow.ipynb`, or run the exported script `object_detection_using_vgg16_with_tensorflow.py`.
53 |
54 |
55 | ## Author
56 | You can get in touch with me on my LinkedIn Profile:
57 |
58 | #### Zubair Samo
59 | [LinkedIn](https://linkedin.com/in/zubair-samo-3a2764197)
61 |
62 | You can also follow my GitHub profile to stay updated about my latest projects: [GitHub](https://github.com/zubairsamo)
63 |
64 | If you liked the repo then kindly support it by giving it a star ⭐!
65 |
66 | ## Contributions Welcome
68 |
69 | If you find a bug in the code or have an improvement in mind, feel free to open a pull request.
70 |
71 | ## License
72 | [MIT License](../master/LICENSE)
73 |
74 | Copyright (c) 2020 Zubair Samo
75 |
--------------------------------------------------------------------------------
/object_detection_using_vgg16_with_tensorflow.py:
--------------------------------------------------------------------------------
1 | # -*- coding: utf-8 -*-
2 | """Object Detection Using VGG16 With Tensorflow.ipynb
3 |
4 | Automatically generated by Colaboratory.
5 |
6 | Original file is located at
7 | https://colab.research.google.com/drive/15tA57gXnWprZjc5J_V7591AUdSQWrTF6
8 | """
9 |
10 | import os
11 | base_path="/content/drive/MyDrive/Applied_Ai_Course/Datasets"
12 | images=os.path.sep.join([base_path,'images'])
13 | annotations=os.path.sep.join([base_path,'airplanes.csv'])
14 |
15 | # Load the dataset
16 | # The airplane annotations are stored in a CSV file, so read it row by row
17 |
18 | rows= open(annotations).read().strip().split("\n")
19 |
20 | # Three lists to hold the image arrays, the bounding-box targets, and the filenames
21 | data=[]
22 | targets=[]
23 | filenames=[]
24 |
25 | # Build the dataset: one image array and one normalized bounding box per CSV row
26 | # Import the libraries needed to load the images
27 | import cv2
28 | from tensorflow.keras.preprocessing.image import load_img
29 | # img_to_array converts a loaded image into a NumPy array
30 | from tensorflow.keras.preprocessing.image import img_to_array
31 | for row in rows:
32 |     row=row.split(",")
33 |     # Each row holds the filename and the top-left (start) and bottom-right (end) corners of the box
34 |     (filename,startX,startY,endX,endY)=row
35 |
36 |     imagepaths=os.path.sep.join([images,filename])
37 |     image=cv2.imread(imagepaths)
38 |     (h,w)=image.shape[:2]
39 |
40 |     # Scale the starting corner to the range [0, 1] relative to the image size
41 |     # The CSV values are strings, so convert them to float before dividing
42 |     startX = float(startX) / w
43 |     startY = float(startY) / h
44 |     # Scale the ending corner the same way
45 |     endX = float(endX) / w
46 |     endY = float(endY) / h
47 |     # Load the image resized to the 224 x 224 input size of VGG16
48 |     image=load_img(imagepaths,target_size=(224,224))
49 |     # Convert the loaded image to a NumPy array
50 |     image=img_to_array(image)
51 |
52 |     # Append the target box, the filename, and the image array to the lists
53 |     targets.append((startX,startY,endX,endY))
54 |     filenames.append(filename)
55 |     data.append(image)
56 |
57 | # Convert to float32 arrays and scale the pixel values to the range [0, 1]
58 | import numpy as np
59 | data=np.array(data,dtype='float32') / 255.0
60 | targets=np.array(targets,dtype='float32')
61 |
62 | # Split the data into training and test sets with scikit-learn
63 | from sklearn.model_selection import train_test_split
64 |
65 | # Hold out 10% of the samples for testing
66 | split=train_test_split(data,targets,filenames,test_size=0.10,random_state=42)
67 |
68 | # train_test_split returns a (train, test) pair for each input array, in order
69 | (train_images,test_images) = split[:2]
70 | (train_targets,test_targets) = split[2:4]
71 | (train_filenames,test_filenames) = split[4:]
72 |
73 | # Import the VGG16 architecture, which Keras provides with pre-trained weights
74 | from tensorflow.keras.applications import VGG16
75 | from tensorflow.keras.layers import Input
76 |
77 | # VGG won the localization task of the 2014 ImageNet challenge (ILSVRC)
78 | # include_top=False drops the fully connected classifier and keeps only the convolutional layers
79 | vgg=VGG16(weights='imagenet',include_top=False,input_tensor=Input(shape=(224,224,3)))
80 |
81 | vgg.summary()
82 |
83 | from tensorflow.keras.layers import Input,Flatten,Dense
84 |
85 | # Freeze the VGG16 layers so that only the new head is trained
86 | vgg.trainable = False
87 |
88 | flatten = vgg.output
89 |
90 | flatten = Flatten()(flatten)
91 |
92 | # Build the bounding-box regression head; the four outputs are (startX, startY, endX, endY)
93 | bboxhead = Dense(128,activation="relu")(flatten)
94 | bboxhead = Dense(64,activation="relu")(bboxhead)
95 | bboxhead = Dense(32,activation="relu")(bboxhead)
96 | bboxhead = Dense(4,activation="relu")(bboxhead)
97 |
98 | # Build the full model from the VGG16 input to the bounding-box head
99 | from tensorflow.keras.models import Model
100 | model = Model(inputs = vgg.input,outputs = bboxhead)
101 |
102 | model.summary()
103 |
104 | # Compile and train the model
105 | # Adam optimizer with a learning rate of 1e-4
106 | from tensorflow.keras.optimizers import Adam
107 |
108 | opt = Adam(1e-4)
109 |
110 | model.compile(loss='mse',optimizer=opt)
111 |
112 | history = model.fit(train_images,train_targets,validation_data=(test_images,test_targets),batch_size=32,epochs=50,verbose=1)
113 |
114 | # Save the trained model
115 | model.save('detect_Planes.h5')
116 |
117 | from tensorflow.keras.models import load_model
118 |
119 | model=load_model('/content/detect_Planes.h5')
120 |
121 | imagepath='/content/drive/MyDrive/Applied_Ai_Course/Datasets/images/image_0111.jpg'
122 |
123 | image = load_img(imagepath,
124 | target_size=(224,224))
125 | image = img_to_array(image) / 255.0
126 | image = np.expand_dims(image,axis=0)
127 |
128 | preds=model.predict(image)[0]
129 | (startX,startY,endX,endY)=preds
130 |
131 | import imutils
132 |
133 | image=cv2.imread(imagepath)
134 | image=imutils.resize(image,width=600)
135 |
136 | (h,w)=image.shape[:2]
137 |
138 | startX=int(startX * w)
139 | startY=int(startY * h)
140 |
141 | endX=int(endX * w)
142 | endY=int(endY * h)
143 |
144 | cv2.rectangle(image,(startX,startY),(endX,endY),(0,255,0),3)
145 |
146 | # OpenCV stores images in BGR order, so convert to RGB before plotting
147 |
148 | import matplotlib.pyplot as plt
149 | plt.imshow(cv2.cvtColor(image,cv2.COLOR_BGR2RGB))
150 | plt.show()
151 |
152 |
--------------------------------------------------------------------------------
/Object_Detection_Using_VGG16_With_Tensorflow.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "nbformat": 4,
3 | "nbformat_minor": 0,
4 | "metadata": {
5 | "colab": {
6 | "name": "Object Detection Using VGG16 With Tensorflow.ipynb",
7 | "provenance": [],
8 | "collapsed_sections": []
9 | },
10 | "kernelspec": {
11 | "name": "python3",
12 | "display_name": "Python 3"
13 | },
14 | "accelerator": "GPU"
15 | },
16 | "cells": [
17 | {
18 | "cell_type": "markdown",
19 | "metadata": {
20 | "id": "iVKGF4unVslV"
21 | },
22 | "source": [
23 | ""
24 | ]
25 | },
26 | {
27 | "cell_type": "code",
28 | "metadata": {
29 | "id": "Vl_dLDnV_Lys"
30 | },
31 | "source": [
32 | "import os\n",
33 | "base_path=\"/content/drive/MyDrive/Applied_Ai_Course/Datasets\"\n",
34 | "images=os.path.sep.join([base_path,'images'])\n",
35 | "annotations=os.path.sep.join([base_path,'airplanes.csv'])"
36 | ],
37 | "execution_count": 175,
38 | "outputs": []
39 | },
40 | {
41 | "cell_type": "code",
42 | "metadata": {
43 | "id": "ztWsfMIXFFlK"
44 | },
45 | "source": [
46 | "# Load the dataset\n",
47 | "# The airplane annotations are stored in a CSV file, so read it row by row\n",
48 | "\n",
49 | "rows= open(annotations).read().strip().split(\"\\n\")\n",
50 | "\n",
51 | "# Three lists to hold the image arrays, the bounding-box targets, and the filenames\n",
52 | "data=[]\n",
53 | "targets=[]\n",
54 | "filenames=[]"
55 | ],
56 | "execution_count": 176,
57 | "outputs": []
58 | },
59 | {
60 | "cell_type": "code",
61 | "metadata": {
62 | "id": "-C1Ynxw7FveF"
63 | },
64 | "source": [
65 | "# Build the dataset: one image array and one normalized bounding box per CSV row\n",
66 | "# Import the libraries needed to load the images\n",
67 | "import cv2\n",
68 | "from tensorflow.keras.preprocessing.image import load_img\n",
69 | "# img_to_array converts a loaded image into a NumPy array\n",
70 | "from tensorflow.keras.preprocessing.image import img_to_array\n",
71 | "for row in rows:\n",
72 | "    row=row.split(\",\")\n",
73 | "    # Each row holds the filename and the top-left (start) and bottom-right (end) corners of the box\n",
74 | "    (filename,startX,startY,endX,endY)=row\n",
75 | "\n",
76 | "    imagepaths=os.path.sep.join([images,filename])\n",
77 | "    image=cv2.imread(imagepaths)\n",
78 | "    (h,w)=image.shape[:2]\n",
79 | "\n",
80 | "    # Scale the starting corner to the range [0, 1] relative to the image size\n",
81 | "    # The CSV values are strings, so convert them to float before dividing\n",
82 | "    startX = float(startX) / w\n",
83 | "    startY = float(startY) / h\n",
84 | "    # Scale the ending corner the same way\n",
85 | "    endX = float(endX) / w\n",
86 | "    endY = float(endY) / h\n",
87 | "    # Load the image resized to the 224 x 224 input size of VGG16\n",
88 | "    image=load_img(imagepaths,target_size=(224,224))\n",
89 | "    # Convert the loaded image to a NumPy array\n",
90 | "    image=img_to_array(image)\n",
91 | "\n",
92 | "    # Append the target box, the filename, and the image array to the lists\n",
93 | "    targets.append((startX,startY,endX,endY))\n",
94 | "    filenames.append(filename)\n",
95 | "    data.append(image)\n"
96 | ],
97 | "execution_count": 177,
98 | "outputs": []
99 | },
100 | {
101 | "cell_type": "code",
102 | "metadata": {
103 | "id": "0J6zfNkZJKi_"
104 | },
105 | "source": [
106 | "# Convert to float32 arrays and scale the pixel values to the range [0, 1]\n",
107 | "import numpy as np\n",
108 | "data=np.array(data,dtype='float32') / 255.0\n",
109 | "targets=np.array(targets,dtype='float32')"
110 | ],
111 | "execution_count": 178,
112 | "outputs": []
113 | },
114 | {
115 | "cell_type": "code",
116 | "metadata": {
117 | "id": "cI3z3eOUJdCI"
118 | },
119 | "source": [
120 | "# Split the data into training and test sets with scikit-learn\n",
121 | "from sklearn.model_selection import train_test_split"
122 | ],
123 | "execution_count": 179,
124 | "outputs": []
125 | },
126 | {
127 | "cell_type": "code",
128 | "metadata": {
129 | "id": "2WwSEjEJK1MI"
130 | },
131 | "source": [
132 | "# Hold out 10% of the samples for testing\n",
133 | "split=train_test_split(data,targets,filenames,test_size=0.10,random_state=42)"
134 | ],
135 | "execution_count": 180,
136 | "outputs": []
137 | },
138 | {
139 | "cell_type": "code",
140 | "metadata": {
141 | "id": "sxxnaBHhLIyR"
142 | },
143 | "source": [
144 | "# train_test_split returns a (train, test) pair for each input array, in order\n",
145 | "(train_images,test_images) = split[:2]\n",
146 | "(train_targets,test_targets) = split[2:4]\n",
147 | "(train_filenames,test_filenames) = split[4:]\n"
148 | ],
149 | "execution_count": 181,
150 | "outputs": []
151 | },
152 | {
153 | "cell_type": "code",
154 | "metadata": {
155 | "id": "MZBKv2k_L62z"
156 | },
157 | "source": [
158 | "# Import the VGG16 architecture, which Keras provides with pre-trained weights\n",
159 | "from tensorflow.keras.applications import VGG16\n",
160 | "from tensorflow.keras.layers import Input"
161 | ],
162 | "execution_count": 182,
163 | "outputs": []
164 | },
165 | {
166 | "cell_type": "code",
167 | "metadata": {
168 | "id": "YpIvvM48MUOs"
169 | },
170 | "source": [
171 | "# VGG won the localization task of the 2014 ImageNet challenge (ILSVRC)\n",
172 | "# include_top=False drops the fully connected classifier and keeps only the convolutional layers\n",
173 | "vgg=VGG16(weights='imagenet',include_top=False,input_tensor=Input(shape=(224,224,3)))"
174 | ],
175 | "execution_count": 183,
176 | "outputs": []
177 | },
178 | {
179 | "cell_type": "code",
180 | "metadata": {
181 | "colab": {
182 | "base_uri": "https://localhost:8080/"
183 | },
184 | "id": "gPaFO3AZNZOE",
185 | "outputId": "98af118e-9bc5-4ad4-aac0-6c1c965740a9"
186 | },
187 | "source": [
188 | "vgg.summary()"
189 | ],
190 | "execution_count": 184,
191 | "outputs": [
192 | {
193 | "output_type": "stream",
194 | "text": [
195 | "Model: \"vgg16\"\n",
196 | "_________________________________________________________________\n",
197 | "Layer (type) Output Shape Param # \n",
198 | "=================================================================\n",
199 | "input_4 (InputLayer) [(None, 224, 224, 3)] 0 \n",
200 | "_________________________________________________________________\n",
201 | "block1_conv1 (Conv2D) (None, 224, 224, 64) 1792 \n",
202 | "_________________________________________________________________\n",
203 | "block1_conv2 (Conv2D) (None, 224, 224, 64) 36928 \n",
204 | "_________________________________________________________________\n",
205 | "block1_pool (MaxPooling2D) (None, 112, 112, 64) 0 \n",
206 | "_________________________________________________________________\n",
207 | "block2_conv1 (Conv2D) (None, 112, 112, 128) 73856 \n",
208 | "_________________________________________________________________\n",
209 | "block2_conv2 (Conv2D) (None, 112, 112, 128) 147584 \n",
210 | "_________________________________________________________________\n",
211 | "block2_pool (MaxPooling2D) (None, 56, 56, 128) 0 \n",
212 | "_________________________________________________________________\n",
213 | "block3_conv1 (Conv2D) (None, 56, 56, 256) 295168 \n",
214 | "_________________________________________________________________\n",
215 | "block3_conv2 (Conv2D) (None, 56, 56, 256) 590080 \n",
216 | "_________________________________________________________________\n",
217 | "block3_conv3 (Conv2D) (None, 56, 56, 256) 590080 \n",
218 | "_________________________________________________________________\n",
219 | "block3_pool (MaxPooling2D) (None, 28, 28, 256) 0 \n",
220 | "_________________________________________________________________\n",
221 | "block4_conv1 (Conv2D) (None, 28, 28, 512) 1180160 \n",
222 | "_________________________________________________________________\n",
223 | "block4_conv2 (Conv2D) (None, 28, 28, 512) 2359808 \n",
224 | "_________________________________________________________________\n",
225 | "block4_conv3 (Conv2D) (None, 28, 28, 512) 2359808 \n",
226 | "_________________________________________________________________\n",
227 | "block4_pool (MaxPooling2D) (None, 14, 14, 512) 0 \n",
228 | "_________________________________________________________________\n",
229 | "block5_conv1 (Conv2D) (None, 14, 14, 512) 2359808 \n",
230 | "_________________________________________________________________\n",
231 | "block5_conv2 (Conv2D) (None, 14, 14, 512) 2359808 \n",
232 | "_________________________________________________________________\n",
233 | "block5_conv3 (Conv2D) (None, 14, 14, 512) 2359808 \n",
234 | "_________________________________________________________________\n",
235 | "block5_pool (MaxPooling2D) (None, 7, 7, 512) 0 \n",
236 | "=================================================================\n",
237 | "Total params: 14,714,688\n",
238 | "Trainable params: 14,714,688\n",
239 | "Non-trainable params: 0\n",
240 | "_________________________________________________________________\n"
241 | ],
242 | "name": "stdout"
243 | }
244 | ]
245 | },
246 | {
247 | "cell_type": "code",
248 | "metadata": {
249 | "id": "0YPGZE1oNdvP"
250 | },
251 | "source": [
252 | "from tensorflow.keras.layers import Input,Flatten,Dense"
253 | ],
254 | "execution_count": 185,
255 | "outputs": []
256 | },
257 | {
258 | "cell_type": "code",
259 | "metadata": {
260 | "id": "Qg2i026tNnOM"
261 | },
262 | "source": [
263 | "# Freeze the VGG16 layers so that only the new head is trained\n",
264 | "vgg.trainable = False\n",
265 | "\n",
266 | "flatten = vgg.output\n",
267 | "\n",
268 | "flatten = Flatten()(flatten)"
269 | ],
270 | "execution_count": 186,
271 | "outputs": []
272 | },
273 | {
274 | "cell_type": "code",
275 | "metadata": {
276 | "id": "zBCbiwSMODz-"
277 | },
278 | "source": [
279 | "# Build the bounding-box regression head; the four outputs are (startX, startY, endX, endY)\n",
280 | "bboxhead = Dense(128,activation=\"relu\")(flatten)\n",
281 | "bboxhead = Dense(64,activation=\"relu\")(bboxhead)\n",
282 | "bboxhead = Dense(32,activation=\"relu\")(bboxhead)\n",
283 | "bboxhead = Dense(4,activation=\"relu\")(bboxhead)"
284 | ],
285 | "execution_count": 187,
286 | "outputs": []
287 | },
288 | {
289 | "cell_type": "code",
290 | "metadata": {
291 | "id": "6whx45l1OdFW"
292 | },
293 | "source": [
294 | "# Build the full model from the VGG16 input to the bounding-box head\n",
295 | "from tensorflow.keras.models import Model\n",
296 | "model = Model(inputs = vgg.input,outputs = bboxhead)"
297 | ],
298 | "execution_count": 188,
299 | "outputs": []
300 | },
301 | {
302 | "cell_type": "code",
303 | "metadata": {
304 | "colab": {
305 | "base_uri": "https://localhost:8080/"
306 | },
307 | "id": "nhp9ptwpO7_7",
308 | "outputId": "95426c6f-3ea8-49df-c5b5-fcf5b098b9ff"
309 | },
310 | "source": [
311 | "model.summary()"
312 | ],
313 | "execution_count": 189,
314 | "outputs": [
315 | {
316 | "output_type": "stream",
317 | "text": [
318 | "Model: \"functional_7\"\n",
319 | "_________________________________________________________________\n",
320 | "Layer (type) Output Shape Param # \n",
321 | "=================================================================\n",
322 | "input_4 (InputLayer) [(None, 224, 224, 3)] 0 \n",
323 | "_________________________________________________________________\n",
324 | "block1_conv1 (Conv2D) (None, 224, 224, 64) 1792 \n",
325 | "_________________________________________________________________\n",
326 | "block1_conv2 (Conv2D) (None, 224, 224, 64) 36928 \n",
327 | "_________________________________________________________________\n",
328 | "block1_pool (MaxPooling2D) (None, 112, 112, 64) 0 \n",
329 | "_________________________________________________________________\n",
330 | "block2_conv1 (Conv2D) (None, 112, 112, 128) 73856 \n",
331 | "_________________________________________________________________\n",
332 | "block2_conv2 (Conv2D) (None, 112, 112, 128) 147584 \n",
333 | "_________________________________________________________________\n",
334 | "block2_pool (MaxPooling2D) (None, 56, 56, 128) 0 \n",
335 | "_________________________________________________________________\n",
336 | "block3_conv1 (Conv2D) (None, 56, 56, 256) 295168 \n",
337 | "_________________________________________________________________\n",
338 | "block3_conv2 (Conv2D) (None, 56, 56, 256) 590080 \n",
339 | "_________________________________________________________________\n",
340 | "block3_conv3 (Conv2D) (None, 56, 56, 256) 590080 \n",
341 | "_________________________________________________________________\n",
342 | "block3_pool (MaxPooling2D) (None, 28, 28, 256) 0 \n",
343 | "_________________________________________________________________\n",
344 | "block4_conv1 (Conv2D) (None, 28, 28, 512) 1180160 \n",
345 | "_________________________________________________________________\n",
346 | "block4_conv2 (Conv2D) (None, 28, 28, 512) 2359808 \n",
347 | "_________________________________________________________________\n",
348 | "block4_conv3 (Conv2D) (None, 28, 28, 512) 2359808 \n",
349 | "_________________________________________________________________\n",
350 | "block4_pool (MaxPooling2D) (None, 14, 14, 512) 0 \n",
351 | "_________________________________________________________________\n",
352 | "block5_conv1 (Conv2D) (None, 14, 14, 512) 2359808 \n",
353 | "_________________________________________________________________\n",
354 | "block5_conv2 (Conv2D) (None, 14, 14, 512) 2359808 \n",
355 | "_________________________________________________________________\n",
356 | "block5_conv3 (Conv2D) (None, 14, 14, 512) 2359808 \n",
357 | "_________________________________________________________________\n",
358 | "block5_pool (MaxPooling2D) (None, 7, 7, 512) 0 \n",
359 | "_________________________________________________________________\n",
360 | "flatten_3 (Flatten) (None, 25088) 0 \n",
361 | "_________________________________________________________________\n",
362 | "dense_12 (Dense) (None, 128) 3211392 \n",
363 | "_________________________________________________________________\n",
364 | "dense_13 (Dense) (None, 64) 8256 \n",
365 | "_________________________________________________________________\n",
366 | "dense_14 (Dense) (None, 32) 2080 \n",
367 | "_________________________________________________________________\n",
368 | "dense_15 (Dense) (None, 4) 132 \n",
369 | "=================================================================\n",
370 | "Total params: 17,936,548\n",
371 | "Trainable params: 3,221,860\n",
372 | "Non-trainable params: 14,714,688\n",
373 | "_________________________________________________________________\n"
374 | ],
375 | "name": "stdout"
376 | }
377 | ]
378 | },
379 | {
380 | "cell_type": "code",
381 | "metadata": {
382 | "id": "Fye3qz96O9y2"
383 | },
384 | "source": [
385 | "# Compile and train the model\n",
386 | "# Adam optimizer with a learning rate of 1e-4\n",
387 | "from tensorflow.keras.optimizers import Adam\n",
388 | "\n",
389 | "opt = Adam(1e-4)"
390 | ],
391 | "execution_count": 190,
392 | "outputs": []
393 | },
394 | {
395 | "cell_type": "code",
396 | "metadata": {
397 | "id": "o1G-qSwgPX6u"
398 | },
399 | "source": [
400 | "model.compile(loss='mse',optimizer=opt)"
401 | ],
402 | "execution_count": 191,
403 | "outputs": []
404 | },
405 | {
406 | "cell_type": "code",
407 | "metadata": {
408 | "colab": {
409 | "base_uri": "https://localhost:8080/"
410 | },
411 | "id": "rOfPJd6RPek5",
412 | "outputId": "fade4076-7c0d-44e4-abb0-c0e9deb9edc6"
413 | },
414 | "source": [
415 | "history = model.fit(train_images,train_targets,validation_data=(test_images,test_targets),batch_size=32,epochs=50,verbose=1)"
416 | ],
417 | "execution_count": 192,
418 | "outputs": [
419 | {
420 | "output_type": "stream",
421 | "text": [
422 | "Epoch 1/50\n",
423 | "23/23 [==============================] - 4s 155ms/step - loss: 0.0342 - val_loss: 0.0186\n",
424 | "Epoch 2/50\n",
425 | "23/23 [==============================] - 3s 152ms/step - loss: 0.0124 - val_loss: 0.0080\n",
426 | "Epoch 3/50\n",
427 | "23/23 [==============================] - 4s 153ms/step - loss: 0.0072 - val_loss: 0.0069\n",
428 | "Epoch 4/50\n",
429 | "23/23 [==============================] - 4s 155ms/step - loss: 0.0061 - val_loss: 0.0066\n",
430 | "Epoch 5/50\n",
431 | "23/23 [==============================] - 4s 155ms/step - loss: 0.0057 - val_loss: 0.0064\n",
432 | "Epoch 6/50\n",
433 | "23/23 [==============================] - 3s 151ms/step - loss: 0.0042 - val_loss: 0.0026\n",
434 | "Epoch 7/50\n",
435 | "23/23 [==============================] - 3s 148ms/step - loss: 0.0017 - val_loss: 0.0022\n",
436 | "Epoch 8/50\n",
437 | "23/23 [==============================] - 3s 147ms/step - loss: 0.0011 - val_loss: 0.0019\n",
438 | "Epoch 9/50\n",
439 | "23/23 [==============================] - 3s 146ms/step - loss: 7.5519e-04 - val_loss: 0.0020\n",
440 | "Epoch 10/50\n",
441 | "23/23 [==============================] - 3s 142ms/step - loss: 6.2409e-04 - val_loss: 0.0018\n",
442 | "Epoch 11/50\n",
443 | "23/23 [==============================] - 3s 141ms/step - loss: 5.1805e-04 - val_loss: 0.0017\n",
444 | "Epoch 12/50\n",
445 | "23/23 [==============================] - 3s 140ms/step - loss: 4.3773e-04 - val_loss: 0.0018\n",
446 | "Epoch 13/50\n",
447 | "23/23 [==============================] - 3s 139ms/step - loss: 3.7624e-04 - val_loss: 0.0017\n",
448 | "Epoch 14/50\n",
449 | "23/23 [==============================] - 3s 138ms/step - loss: 3.0916e-04 - val_loss: 0.0016\n",
450 | "Epoch 15/50\n",
451 | "23/23 [==============================] - 3s 138ms/step - loss: 2.7746e-04 - val_loss: 0.0017\n",
452 | "Epoch 16/50\n",
453 | "23/23 [==============================] - 3s 139ms/step - loss: 2.4081e-04 - val_loss: 0.0017\n",
454 | "Epoch 17/50\n",
455 | "23/23 [==============================] - 3s 139ms/step - loss: 2.0326e-04 - val_loss: 0.0016\n",
456 | "Epoch 18/50\n",
457 | "23/23 [==============================] - 3s 139ms/step - loss: 1.8227e-04 - val_loss: 0.0017\n",
458 | "Epoch 19/50\n",
459 | "23/23 [==============================] - 3s 139ms/step - loss: 1.6671e-04 - val_loss: 0.0016\n",
460 | "Epoch 20/50\n",
461 | "23/23 [==============================] - 3s 139ms/step - loss: 1.4739e-04 - val_loss: 0.0016\n",
462 | "Epoch 21/50\n",
463 | "23/23 [==============================] - 3s 140ms/step - loss: 1.2999e-04 - val_loss: 0.0016\n",
464 | "Epoch 22/50\n",
465 | "23/23 [==============================] - 3s 142ms/step - loss: 1.1637e-04 - val_loss: 0.0016\n",
466 | "Epoch 23/50\n",
467 | "23/23 [==============================] - 3s 144ms/step - loss: 1.0369e-04 - val_loss: 0.0016\n",
468 | "Epoch 24/50\n",
469 | "23/23 [==============================] - 3s 145ms/step - loss: 9.5637e-05 - val_loss: 0.0016\n",
470 | "Epoch 25/50\n",
471 | "23/23 [==============================] - 3s 148ms/step - loss: 8.6480e-05 - val_loss: 0.0016\n",
472 | "Epoch 26/50\n",
473 | "23/23 [==============================] - 3s 148ms/step - loss: 8.1371e-05 - val_loss: 0.0016\n",
474 | "Epoch 27/50\n",
475 | "23/23 [==============================] - 3s 148ms/step - loss: 7.6879e-05 - val_loss: 0.0016\n",
476 | "Epoch 28/50\n",
477 | "23/23 [==============================] - 3s 147ms/step - loss: 7.5892e-05 - val_loss: 0.0016\n",
478 | "Epoch 29/50\n",
479 | "23/23 [==============================] - 3s 147ms/step - loss: 6.9565e-05 - val_loss: 0.0016\n",
480 | "Epoch 30/50\n",
481 | "23/23 [==============================] - 3s 145ms/step - loss: 6.9834e-05 - val_loss: 0.0016\n",
482 | "Epoch 31/50\n",
483 | "23/23 [==============================] - 3s 145ms/step - loss: 7.2559e-05 - val_loss: 0.0016\n",
484 | "Epoch 32/50\n",
485 | "23/23 [==============================] - 3s 145ms/step - loss: 7.9856e-05 - val_loss: 0.0016\n",
486 | "Epoch 33/50\n",
487 | "23/23 [==============================] - 3s 143ms/step - loss: 8.3668e-05 - val_loss: 0.0016\n",
488 | "Epoch 34/50\n",
489 | "23/23 [==============================] - 3s 143ms/step - loss: 9.3816e-05 - val_loss: 0.0016\n",
490 | "Epoch 35/50\n",
491 | "23/23 [==============================] - 3s 142ms/step - loss: 9.1081e-05 - val_loss: 0.0017\n",
492 | "Epoch 36/50\n",
493 | "23/23 [==============================] - 3s 142ms/step - loss: 7.7571e-05 - val_loss: 0.0015\n",
494 | "Epoch 37/50\n",
495 | "23/23 [==============================] - 3s 142ms/step - loss: 6.3817e-05 - val_loss: 0.0016\n",
496 | "Epoch 38/50\n",
497 | "23/23 [==============================] - 3s 142ms/step - loss: 5.2386e-05 - val_loss: 0.0017\n",
498 | "Epoch 39/50\n",
499 | "23/23 [==============================] - 3s 142ms/step - loss: 5.4663e-05 - val_loss: 0.0015\n",
500 | "Epoch 40/50\n",
501 | "23/23 [==============================] - 3s 142ms/step - loss: 5.1314e-05 - val_loss: 0.0016\n",
502 | "Epoch 41/50\n",
503 | "23/23 [==============================] - 3s 142ms/step - loss: 5.9441e-05 - val_loss: 0.0015\n",
504 | "Epoch 42/50\n",
505 | "23/23 [==============================] - 3s 142ms/step - loss: 5.2642e-05 - val_loss: 0.0016\n",
506 | "Epoch 43/50\n",
507 | "23/23 [==============================] - 3s 143ms/step - loss: 5.3741e-05 - val_loss: 0.0015\n",
508 | "Epoch 44/50\n",
509 | "23/23 [==============================] - 3s 143ms/step - loss: 4.8830e-05 - val_loss: 0.0016\n",
510 | "Epoch 45/50\n",
511 | "23/23 [==============================] - 3s 143ms/step - loss: 5.1866e-05 - val_loss: 0.0015\n",
512 | "Epoch 46/50\n",
513 | "23/23 [==============================] - 3s 144ms/step - loss: 4.8851e-05 - val_loss: 0.0016\n",
514 | "Epoch 47/50\n",
515 | "23/23 [==============================] - 3s 144ms/step - loss: 5.3998e-05 - val_loss: 0.0016\n",
516 | "Epoch 48/50\n",
517 | "23/23 [==============================] - 3s 145ms/step - loss: 7.1642e-05 - val_loss: 0.0015\n",
518 | "Epoch 49/50\n",
519 | "23/23 [==============================] - 3s 144ms/step - loss: 7.0317e-05 - val_loss: 0.0016\n",
520 | "Epoch 50/50\n",
521 | "23/23 [==============================] - 3s 145ms/step - loss: 7.0439e-05 - val_loss: 0.0017\n"
522 | ],
523 | "name": "stdout"
524 | }
525 | ]
526 | },
527 | {
528 | "cell_type": "code",
529 | "metadata": {
530 | "id": "lxlowj7qPyyv"
531 | },
532 | "source": [
533 | "# Save the trained model to an HDF5 file\n",
534 | "model.save('detect_Planes.h5')"
535 | ],
536 | "execution_count": 228,
537 | "outputs": []
538 | },
539 | {
540 | "cell_type": "code",
541 | "metadata": {
542 | "id": "42ly_eqHQh6v"
543 | },
544 | "source": [
545 | "from tensorflow.keras.models import load_model"
546 | ],
547 | "execution_count": 229,
548 | "outputs": []
549 | },
550 | {
551 | "cell_type": "code",
552 | "metadata": {
553 | "id": "THgB3xU0QqGI"
554 | },
555 | "source": [
556 | "model=load_model('/content/detect_Planes.h5')"
557 | ],
558 | "execution_count": 230,
559 | "outputs": []
560 | },
561 | {
562 | "cell_type": "code",
563 | "metadata": {
564 | "id": "w-tWP_UtQuwk"
565 | },
566 | "source": [
567 | "imagepath='/content/drive/MyDrive/Applied_Ai_Course/Datasets/images/image_0111.jpg'"
568 | ],
569 | "execution_count": 231,
570 | "outputs": []
571 | },
572 | {
573 | "cell_type": "code",
574 | "metadata": {
575 | "id": "IV_2eMaxQ6pa"
576 | },
577 | "source": [
578 | "image = load_img(imagepath,\n",
579 | " target_size=(224,224))\n",
580 | "image = img_to_array(image) / 255.0\n",
581 | "image = np.expand_dims(image,axis=0)"
582 | ],
583 | "execution_count": 232,
584 | "outputs": []
585 | },
586 | {
587 | "cell_type": "code",
588 | "metadata": {
589 | "id": "XVLc6q9_RO8m",
590 | "colab": {
591 | "base_uri": "https://localhost:8080/"
592 | },
593 | "outputId": "101aa2c3-0af5-4be8-ef7e-13a7c8680530"
594 | },
595 | "source": [
596 | "preds=model.predict(image)[0]\n",
597 | "(startX,startY,endX,endY)=preds"
598 | ],
599 | "execution_count": 233,
600 | "outputs": [
601 | {
602 | "output_type": "stream",
603 | "text": [
604 | "WARNING:tensorflow:5 out of the last 11 calls to <function Model.make_predict_function.<locals>.predict_function at 0x7fb51c1a81e0> triggered tf.function retracing. Tracing is expensive and the excessive number of tracings could be due to (1) creating @tf.function repeatedly in a loop, (2) passing tensors with different shapes, (3) passing Python objects instead of tensors. For (1), please define your @tf.function outside of the loop. For (2), @tf.function has experimental_relax_shapes=True option that relaxes argument shapes that can avoid unnecessary retracing. For (3), please refer to https://www.tensorflow.org/tutorials/customization/performance#python_or_tensor_args and https://www.tensorflow.org/api_docs/python/tf/function for more details.\n"
605 | ],
606 | "name": "stdout"
607 | }
608 | ]
609 | },
610 | {
611 | "cell_type": "code",
612 | "metadata": {
613 | "id": "B0tlESgNRkcW"
614 | },
615 | "source": [
616 | "import imutils"
617 | ],
618 | "execution_count": 234,
619 | "outputs": []
620 | },
621 | {
622 | "cell_type": "code",
623 | "metadata": {
624 | "id": "UjPHbP2sRnUH"
625 | },
626 | "source": [
627 | "image=cv2.imread(imagepath)  # reload the same image the prediction was made on\n",
628 | "image=imutils.resize(image,width=600)"
629 | ],
630 | "execution_count": 235,
631 | "outputs": []
632 | },
633 | {
634 | "cell_type": "code",
635 | "metadata": {
636 | "id": "mMj9dh6IRuAm"
637 | },
638 | "source": [
639 | "(h,w)=image.shape[:2]"
640 | ],
641 | "execution_count": 236,
642 | "outputs": []
643 | },
644 | {
645 | "cell_type": "code",
646 | "metadata": {
647 | "id": "oHiDSiASSKXE"
648 | },
649 | "source": [
650 | "startX=int(startX * w)\n",
651 | "startY=int(startY * h)\n",
652 | "\n",
653 | "endX=int(endX * w)\n",
654 | "endY=int(endY * h)"
655 | ],
656 | "execution_count": 237,
657 | "outputs": []
658 | },
659 | {
660 | "cell_type": "code",
661 | "metadata": {
662 | "colab": {
663 | "base_uri": "https://localhost:8080/"
664 | },
665 | "id": "Wo2Ps1aEScXZ",
666 | "outputId": "8266c719-e5a5-49fe-be0d-ba5098824f32"
667 | },
668 | "source": [
669 | "cv2.rectangle(image,(startX,startY),(endX,endY),(0,255,0),3)"
670 | ],
671 | "execution_count": 238,
672 | "outputs": [
673 | {
674 | "output_type": "execute_result",
675 | "data": {
676 | "text/plain": [
677 | "array([[[255, 255, 255],\n",
678 | " [255, 255, 255],\n",
679 | " [255, 255, 255],\n",
680 | " ...,\n",
681 | " [255, 255, 255],\n",
682 | " [255, 255, 255],\n",
683 | " [255, 255, 255]],\n",
684 | "\n",
685 | " [[255, 255, 255],\n",
686 | " [255, 255, 255],\n",
687 | " [255, 255, 255],\n",
688 | " ...,\n",
689 | " [255, 255, 255],\n",
690 | " [255, 255, 255],\n",
691 | " [255, 255, 255]],\n",
692 | "\n",
693 | " [[255, 255, 255],\n",
694 | " [255, 255, 255],\n",
695 | " [255, 255, 255],\n",
696 | " ...,\n",
697 | " [255, 255, 255],\n",
698 | " [255, 255, 255],\n",
699 | " [255, 255, 255]],\n",
700 | "\n",
701 | " ...,\n",
702 | "\n",
703 | " [[255, 255, 255],\n",
704 | " [255, 255, 255],\n",
705 | " [255, 255, 255],\n",
706 | " ...,\n",
707 | " [255, 255, 255],\n",
708 | " [255, 255, 255],\n",
709 | " [255, 255, 255]],\n",
710 | "\n",
711 | " [[255, 255, 255],\n",
712 | " [255, 255, 255],\n",
713 | " [255, 255, 255],\n",
714 | " ...,\n",
715 | " [255, 255, 255],\n",
716 | " [255, 255, 255],\n",
717 | " [255, 255, 255]],\n",
718 | "\n",
719 | " [[255, 255, 255],\n",
720 | " [255, 255, 255],\n",
721 | " [255, 255, 255],\n",
722 | " ...,\n",
723 | " [255, 255, 255],\n",
724 | " [255, 255, 255],\n",
725 | " [255, 255, 255]]], dtype=uint8)"
726 | ]
727 | },
728 | "metadata": {
729 | "tags": []
730 | },
731 | "execution_count": 238
732 | }
733 | ]
734 | },
735 | {
736 | "cell_type": "code",
737 | "metadata": {
738 | "id": "uklFcl2AStTe"
739 | },
740 | "source": [
741 | "\n",
742 | "from google.colab.patches import cv2_imshow\n"
743 | ],
744 | "execution_count": 239,
745 | "outputs": []
746 | },
747 | {
748 | "cell_type": "code",
749 | "metadata": {
750 | "colab": {
751 | "base_uri": "https://localhost:8080/",
752 | "height": 181
753 | },
754 | "id": "jUNMby9FS3OJ",
755 | "outputId": "7a89dff4-d565-4be3-cae3-5cc030a95719"
756 | },
757 | "source": [
758 | "import matplotlib.pyplot as plt\n",
759 | "plt.imshow(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))  # OpenCV arrays are BGR; matplotlib expects RGB\n",
760 | "cv2.waitKey(0)"
761 | ],
762 | "execution_count": 240,
763 | "outputs": [
764 | {
765 | "output_type": "execute_result",
766 | "data": {
767 | "text/plain": [
768 | "-1"
769 | ]
770 | },
771 | "metadata": {
772 | "tags": []
773 | },
774 | "execution_count": 240
775 | },
776 | {
777 | "output_type": "display_data",
778 | "data": {
779 | "image/png": "\n",
780 | "text/plain": [
781 | ""
782 | ]
783 | },
784 | "metadata": {
785 | "tags": [],
786 | "needs_background": "light"
787 | }
788 | }
789 | ]
790 | },
791 | {
792 | "cell_type": "code",
793 | "metadata": {
794 | "id": "sknQm7nFTmVZ"
795 | },
796 | "source": [
797 | ""
798 | ],
799 | "execution_count": 240,
800 | "outputs": []
801 | }
802 | ]
803 | }
--------------------------------------------------------------------------------