├── Capstone Project_writeup.md
├── Capstone Project_writeup.pdf
├── Dataset
│   ├── pictures_per_wnid_dict.pickle
│   ├── pictures_summary.pickle
│   └── vehicles_synsets.txt
├── README.md
├── Vehicles_Categorization.html
├── Vehicles_Categorization.ipynb
├── Vehicles_Categorization.odt
├── images
│   ├── Layers_1_adam_acc.png
│   ├── Layers_1_adam_loss.png
│   ├── augmented_pics.png
│   ├── car_filters.png
│   ├── cnn_pooling.png
│   ├── dataset_characteristics.png
│   ├── deeplearning_google.png
│   ├── download_process.png
│   ├── image_transformations.png
│   ├── kernel.png
│   ├── model_results.png
│   ├── model_workflow.png
│   ├── monza_analysis.png
│   ├── neural_net.png
│   ├── optimizers.png
│   ├── sgd.png
│   ├── truck_filters.png
│   ├── truck_filters_4.png
│   └── truck_versus_car.png
├── models
│   ├── Layers_1_Adagrad_acc.png
│   ├── Layers_1_Adagrad_loss.png
│   ├── Layers_1_adam_acc.png
│   ├── Layers_1_adam_loss.png
│   ├── Layers_2_Adagrad_acc.png
│   ├── Layers_2_Adagrad_loss.png
│   ├── Layers_2_adam_acc.png
│   ├── Layers_2_adam_loss.png
│   ├── Layers_3_Adagrad_acc.png
│   ├── Layers_3_Adagrad_loss.png
│   ├── Layers_3_adam_acc.png
│   ├── Layers_3_adam_loss.png
│   ├── Layers_4_Adagrad_acc.png
│   ├── Layers_4_Adagrad_loss.png
│   ├── Layers_4_adam_acc.png
│   └── Layers_4_adam_loss.png
└── scripts
    ├── Dataset_wrangling.py
    ├── ImageProcessor.py
    ├── configuration.py
    ├── download_imagenet_images.py
    └── models.py
/Capstone Project_writeup.md:
--------------------------------------------------------------------------------
1 | # Capstone Project
2 | ## Machine Learning Engineer Nanodegree
3 | Rafael Castillo Alcibar
4 |
5 | September 15th, 2016
6 |
7 | ## I. Definition
8 |
9 | ### Project Overview
10 |
11 | This project is a proof of concept (POC) in which deep learning techniques are applied to vehicle recognition, a task of particular importance in traffic control and management: for example, companies operating road tolls need to detect fraud, since different fees apply to different vehicle types. Images used to train the neural nets are obtained from the [Imagenet](http://image-net.org/) dataset, which publicly distributes image URLs for hundreds of categories. Since the whole experiment is performed on a personal computer with limited computational resources, the POC scope is limited to a simple classification of two kinds of vehicles: trailer trucks versus sports cars. The POC's main goal is to determine the maximum accuracy (the percentage of predictions the model gets right) that different neural nets with basic architectures can reach using a limited set of images (fewer than 700) for training.
12 |
13 |
14 |
15 | The whole project is built with [Keras](https://github.com/fchollet/keras), a highly modular neural network library written in Python that runs on top of either TensorFlow or Theano and was developed with a focus on fast experimentation. In this case, Theano is the selected backend.
16 |
17 | #### Some Deep Learning Background
18 | The first general, working learning algorithm for supervised deep feedforward multilayer perceptrons was published by Ivakhnenko and Lapa in 1965 [ref 03]. In 1989, Yann LeCun et al. applied the standard backpropagation algorithm, which had existed as the reverse mode of automatic differentiation since 1970, to a deep neural network with the purpose of recognizing handwritten ZIP codes on mail. Despite the success of applying the algorithm, training the network on this dataset took approximately 3 days, making it impractical for general use. According to LeCun, by the early 2000s CNNs already processed an estimated 10% to 20% of all the checks written in the US in an industrial application. The significant additional impact of deep learning on image and object recognition was felt in the years 2011-2012. Although CNNs trained by backpropagation had been around for decades, fast GPU implementations of CNNs with max pooling were needed to make a dent in computer vision. In 2011, this approach achieved superhuman performance in a visual pattern recognition contest for the first time.
19 |
20 | Deep learning is often presented as a step towards realizing strong AI and thus many organizations have become interested in its use for particular applications. In December 2013, Facebook hired Yann LeCun to head its new artificial intelligence (AI) lab. The AI lab will develop deep learning techniques to help Facebook do tasks such as automatically tagging uploaded pictures with the names of the people in them.
21 |
22 | In 2014, Google also bought DeepMind Technologies, a British start-up that developed a system capable of learning how to play Atari video games using only raw pixels as data input. In 2015, they demonstrated the AlphaGo system, which achieved one of the long-standing "grand challenges" of AI by learning the game of Go well enough to beat a professional human Go player. [ref 04]
23 |
24 |
25 |
26 |
27 |
28 |
29 | ### Problem Statement
30 |
31 | This project uses a personal computer (Intel® Core™ i5-4310M CPU @ 2.70GHz × 4, 8 GB RAM, 64-bit). Different deep learning models are trained and validated, and their results compared, in order to determine which architecture maximizes prediction scores on the vehicle classification task while minimizing computational cost. Vehicle images are used to train and test the models, while a subset of images is held out for validation (images not previously used during the train/test phase and therefore unseen by the models).
32 |
33 | ### Metrics
34 |
35 | Since this is a binary classification problem (the model basically tries to answer the question: "Is the vehicle in this picture a trailer truck or a sports car?"), scores like precision, recall, F-score or accuracy are suitable. For simplicity, and since the dataset is balanced (there is a similar number of images for each class), accuracy is the score used to evaluate model performance. Accuracy gives an estimate of how often the model is correct in its predictions, that is, how often the model correctly flags a truck as a truck and a sports car as a sports car.
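
As a quick illustration (a minimal sketch using scikit-learn, which the project already lists among its requirements; the labels are purely illustrative):

```
from sklearn.metrics import accuracy_score

# 1 = trailer truck, 0 = sports car (illustrative encoding)
y_true = [1, 0, 1, 1, 0]
y_pred = [1, 0, 0, 1, 0]

print(accuracy_score(y_true, y_pred))  # 4 correct out of 5 -> 0.8
```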
36 |
37 | On the other hand, since computational resources are also a critical consideration, the number of minutes required to train each model is the second metric used. Combining both metrics allows the identification of the model that maximizes accuracy while minimizing computational resources.
38 |
39 |
40 |
41 | ## II. Analysis
42 |
43 | ### Data Exploration
44 |
45 | Images are collected from the Imagenet dataset, which contains hundreds of different categories; a [synset](https://en.wikipedia.org/wiki/WordNet) identifies each particular category. In this particular case, the following synsets are used:
46 |
47 | 1. n04467665: Trailer Trucks
48 | 2. n04285008: Sport-Cars
49 |
50 | Since image URLs are freely available, retrieving the pictures over HTTP only requires a small Python script that downloads them from the URLs listed at the following endpoint:
51 | ```
52 | http://www.image-net.org/api/text/imagenet.synset.geturls?wnid=[wnid]
53 | ```
54 |
55 | where ```[wnid]``` is one of the synsets selected previously. For further reference, please check [Imagenet documentation](http://image-net.org/download-imageurls).
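
As an illustration of the download step, here is a minimal Python 2 sketch (the project's actual ```download_imagenet_images.py``` adds multiprocessing, logging and redirect detection; the output file names are illustrative):

```
import urllib2

WNID = 'n04467665'  # trailer trucks
LIST_URL = 'http://www.image-net.org/api/text/imagenet.synset.geturls?wnid=' + WNID

urls = urllib2.urlopen(LIST_URL).read().splitlines()
for i, url in enumerate(urls[:10]):  # first 10 pictures only
    try:
        data = urllib2.urlopen(url, timeout=5).read()
    except Exception:
        continue  # dead links are common in the Imagenet URL lists
    with open('{0}_{1:04d}.jpg'.format(WNID, i), 'wb') as f:
        f.write(data)
```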
56 |
57 |
58 |
59 |
60 | The image below summarizes the main dataset characteristics: a total of 1550 pictures are available, with a mean height and width of 352 and 483 pixels respectively and 3 color channels (RGB). Since there are pictures with as little as 1px of height or width, pictures with less than 150px in either dimension are removed, as their resolution is considered too poor for the project; this step eliminated just 32 images.
61 |
62 |
63 |
64 | With regard to the different classes, just one of them includes more than 800 pictures. Since the dataset is at this stage unbalanced and small, the [Keras Data Generator](https://keras.io/preprocessing/image/) utility is employed to generate synthetic images from the pictures already available, as in the example picture below. This utility is crucial to balance the classes and generate new images to train the models [ref 05].
65 |
66 |
67 |
68 |
69 | ### Exploratory Visualization
70 |
71 | In order to use images as input for the deep learning models, they need to be converted into multidimensional arrays of numbers, where each pixel corresponds to a cell in the array. For this process, the SciPy ```ndimage``` module is used, as described in this [tutorial](http://www.scipy-lectures.org/advanced/image_processing/). In this project, images are resized to 150px in height and width and converted to grayscale (since color is not a determinant characteristic to differentiate between a truck and a sports car), which reduces the number of channels from three (RGB) to one.
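
A minimal sketch of this conversion (assuming ```scipy.ndimage``` and PIL, which the project's scripts already use; the file name is illustrative):

```
import numpy as np
from scipy import ndimage
from PIL import Image

# Load as grayscale: mode 'L' yields a single channel instead of three (RGB)
img = ndimage.imread('n04467665_0001.jpg', mode='L')
print(img.shape)  # e.g. (352, 483)

# Resize to the 150x150 input size expected by the models
resized = np.array(Image.fromarray(img).resize((150, 150), Image.BICUBIC))
print(resized.shape)  # (150, 150)
```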
72 |
73 |
74 |
75 | ### Algorithms and Techniques
76 |
77 | Different deep convolutional neural net architectures are used to perform this task, which is nowadays the best-known approach in the image recognition field. Image categorization is a complex task: for example, a grayscale image of size 150x150 becomes a vector of size 150·150 = 22500 for a fully connected neural network. Such huge dimensionality with no predefined features makes this problem unapproachable for standard supervised learning approaches, even when combined with dimensionality reduction techniques like PCA.
78 |
79 | Convolutional nets are selected as the most efficient technique to extract relevant information from images for classification tasks. When used for image recognition, convolutional neural networks (CNNs) consist of multiple layers of small kernels which process portions of the input image, called receptive fields. Kernels are small matrices (normally 3x3 or 5x5) applied over the input image to extract features from the data; this technique has been used in image processing for decades, from Photoshop filters to medical imaging. [This blog by Victor Powell](http://setosa.io/ev/image-kernels/) is an excellent resource for understanding how kernels work.
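
To make the idea concrete, here is a minimal sketch of applying a kernel to an image array (the 3x3 edge-detection kernel is a classic illustrative choice, not one learned by the project's nets):

```
import numpy as np
from scipy import ndimage

image = np.random.rand(150, 150)  # stand-in for a grayscale picture

# Edge-detection kernel: responds strongly where intensity changes
kernel = np.array([[-1, -1, -1],
                   [-1,  8, -1],
                   [-1, -1, -1]])

feature_map = ndimage.convolve(image, kernel, mode='nearest')
print(feature_map.shape)  # (150, 150): one response per receptive field
```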
80 |
81 |
82 |
83 | The outputs of these kernels are then tiled so that their input regions overlap, to obtain a better representation of the original image; this is repeated for every such layer. Convolutional networks may include local or global pooling layers, which combine the outputs of neuron clusters. Compared to other image classification algorithms, convolutional neural networks use relatively little pre-processing. This means that the network is responsible for learning the filters that in traditional algorithms were hand-engineered. The lack of dependence on prior knowledge and human effort in designing features is a major advantage for CNNs.
84 |
85 | Another important concept in CNNs is pooling, which is a form of non-linear down-sampling. There are several non-linear functions to implement pooling, among which max pooling is the most common. It partitions the input image into a set of non-overlapping rectangles and, for each such sub-region, outputs the maximum. The intuition is that once a feature has been found, its exact location isn't as important as its rough location relative to other features. The function of the pooling layer is to progressively reduce the spatial size of the representation, in order to reduce the number of parameters and the amount of computation in the network, and hence also control overfitting. It is common to periodically insert a pooling layer between successive convolutional layers in a CNN architecture. The pooling operation provides a form of translation invariance [ref 06].
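
A minimal numpy sketch of 2x2 max pooling, the variant used later in this project, on a toy 4x4 feature map:

```
import numpy as np

feature_map = np.arange(16).reshape(4, 4)

# Split into non-overlapping 2x2 blocks and keep the maximum of each block
pooled = feature_map.reshape(2, 2, 2, 2).max(axis=(1, 3))
print(pooled)
# [[ 5  7]
#  [13 15]]
```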
86 |
87 |
88 |
89 | The proposed net architecture for this particular problem is a neural net with 1 to 4 layers, where each layer consists of a convolutional stage followed by max pooling. On top of that sits a fully connected net with 150 nodes on the input side and 1 output node, with dropout applied. Dropout is a regularization technique that reduces overfitting in neural networks by preventing complex co-adaptations on the training data; it basically consists in randomly dropping nodes in the network during training to gain robustness in model predictions. An example of a proposed architecture is shown below:
90 |
91 |
92 |
93 |
94 | On top of this, two different optimizers are employed: ```Adam``` and ```Adagrad```. Optimizers are used to minimize the ```Cost``` function of a neural net. In the example below, we can see there are weights (W) and biases (b) for every node and connection between nodes in a neural network:
95 |
96 |
97 |
98 | A cost function is a measure of "how good" a neural network did with respect to a given training sample and the expected output. It may also depend on variables such as weights and biases. A cost function is a single value, not a vector, because it rates how well the neural network did as a whole.
99 |
100 | Specifically, a cost function is of the form:
101 | ```
102 | C(W,B,S,E)
103 | ```
104 | where ```W``` is our neural network's weights, ```B``` is our neural network's biases, ```S``` is the input of a single training sample, and ```E``` is the desired output of that training sample [ref 08].
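
The concrete cost function used later in this project is binary cross-entropy (log loss). As a minimal numpy sketch (the toy predictions are illustrative):

```
import numpy as np

def binary_crossentropy(y_true, y_pred, eps=1e-7):
    # Clip predictions to avoid log(0), then average the per-sample log loss
    y_pred = np.clip(y_pred, eps, 1 - eps)
    return -np.mean(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred))

y_true = np.array([1., 0., 1.])
y_pred = np.array([0.9, 0.2, 0.6])
print(binary_crossentropy(y_true, y_pred))  # small loss for mostly good predictions
```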
105 |
106 | While there are different ways to represent the ```Cost``` function, the goal of optimization is to minimize it. Different approaches exist; Stochastic Gradient Descent (SGD) tries to find minima by iteration. This is the most common approach, and different variants of this method give rise to the optimizers employed here (a usage sketch follows the list):
107 | * **AdaGrad** (adaptive gradient algorithm) is a modified stochastic gradient descent with a per-parameter learning rate, first published in 2011. Informally, it increases the learning rate for sparser parameters and decreases it for less sparse ones. This strategy often improves convergence over standard stochastic gradient descent.
108 | * **Adam** is also a method in which the learning rate is adapted for each parameter. The idea is to divide the learning rate for a weight by a running average of the magnitudes of recent gradients for that weight.
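
In Keras, switching between these optimizers is a one-line change at compile time. A minimal sketch (assuming the Keras 1.x API used elsewhere in this project; the toy model exists only to show the compile step):

```
from keras.models import Sequential
from keras.layers import Dense

model = Sequential()
model.add(Dense(1, input_dim=10, activation='sigmoid'))

# 'adam' and 'adagrad' select the two optimizers compared in this project
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
```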
109 |
110 |
111 |
112 | ### Benchmark
113 |
114 | In the study [Monza: Image Classification of Vehicle Make and Model Using Convolutional Neural Networks and Transfer Learning](http://cs231n.stanford.edu/reports/lediurfinal.pdf) [ref 07], several machine learning approaches are used for car detection and identification. A fine-grained dataset containing 196 different classes of cars is employed. That dataset is particularly challenging due to the freeform nature of the images, which contain cars in many different sizes, shapes, and poses; a similar scenario applies to the current dataset, although in this case there are just two classes. Study results are presented in terms of top-1 and top-5 accuracy for the different approaches used. For the deep learning approaches, accuracy values are around 0.8, so this is the value used to benchmark the current results.
115 |
116 |
117 |
118 | ## III. Methodology
119 |
120 | ### Data Preprocessing
121 |
122 | Vehicle images are downloaded from the Imagenet dataset; the included notebook describes the process step by step. Basically, it is required to set the configuration parameters in ```configuration.py``` and execute ```download_imagenet_images.py```.
123 |
124 | Once all files are downloaded, the different picture classes need to be organized following this structure:
125 | ```
126 | dataset\
127 | train\
128 | n04467665\
129 | n04467665_01.png
130 | n04467665_04.png
131 | ...
132 | n04285008\
133 | n04285008_01.png
134 | n04285008_04.png
135 | ...
136 | test\
137 | n04467665\
138 | n04467665_02.png
139 | n04467665_03.png
140 | ...
141 | n04285008\
142 | n04285008_02.png
143 | n04285008_03.png
144 | ...
145 | validation\
146 | n04467665\
147 | n04467665_07.png
148 | n04467665_09.png
149 | ...
150 | n04285008\
151 | n04285008_07.png
152 | n04285008_09.png
153 | ...
154 | ```
155 |
156 | For this purpose, ```Dataset_wrangling.py``` is employed. The next step is to eliminate pictures whose height or width is lower than 150px; a threshold of 150px is used since this is a common input dimension for deep learning models nowadays and it is the input size used by the models here. Finally, ```ImageProcessor.py``` is employed to perform several tasks (a configuration sketch follows this list):
157 | 1. Generate augmented images from the current images, using the Keras utility ImageDataGenerator, to train the models:
158 | - rescale=1./255: the downloaded images come with RGB coefficients in the range 0-255, so the values are normalized to span from 0 to 1 by this scaling
159 | - rotation_range=40: images are rotated randomly by 0-40 degrees
160 | - width_shift_range=0.01: range within which the image is randomly translated horizontally
161 | - height_shift_range=0.1: range within which the image is randomly translated vertically
162 | - shear_range=0.05: range within which shearing transformations are applied randomly
163 | - zoom_range=0.1: range within which the image is randomly zoomed
164 | - fill_mode='nearest': the method used to fill in newly introduced pixels
165 | 2. Resize pictures to a height and width of 150px.
166 | 3. Convert pictures to grayscale (since color is not an important feature to distinguish a truck from a car).
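
A minimal sketch of the generator configuration described above (mirroring the parameters just listed; the Keras 1.x import path is assumed):

```
from keras.preprocessing.image import ImageDataGenerator

datagen = ImageDataGenerator(
    rescale=1. / 255,        # normalize RGB coefficients from 0-255 to 0-1
    rotation_range=40,       # random rotations of up to 40 degrees
    width_shift_range=0.01,  # small random horizontal translations
    height_shift_range=0.1,  # random vertical translations
    shear_range=0.05,        # random shearing transformations
    zoom_range=0.1,          # random zoom
    fill_mode='nearest')     # fill newly introduced pixels with nearest values
```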
167 |
168 | ### Implementation
169 |
170 | For the implementation I have chosen Keras, a neural network library for Theano and TensorFlow written in Python. Different convolutional neural net architectures were applied to the task, with the intention of identifying the one that reaches a reasonable accuracy with minimum computational resources. Networks consist of an input layer, 1 to 4 convolutional layers, a fully connected layer, and an output layer. The convolutional layers use 3x3 convolutions with 32-64 output filters, each followed by 2x2 max pooling. Rectified linear units are used as activation functions, except for the final output neuron, which is sigmoid. After the fully connected layer, a dropout of 0.5 is applied (this helps prevent overfitting). For the loss function I have used log loss (binary cross-entropy). Two different optimizers are used and compared: ```adam``` and ```Adagrad```. A sketch of one such architecture is shown below.
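
As an illustration, here is a minimal sketch of the two-convolutional-layer variant using the Keras 1.x API (the project's actual builder is ```models.build_model()```; layer sizes follow the description above, with Theano channel ordering assumed):

```
from keras.models import Sequential
from keras.layers import Convolution2D, MaxPooling2D, Activation, Flatten, Dense, Dropout

model = Sequential()
# Convolutional block 1: 32 filters of 3x3 over a 150x150 grayscale input
model.add(Convolution2D(32, 3, 3, input_shape=(1, 150, 150)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
# Convolutional block 2: 64 filters of 3x3
model.add(Convolution2D(64, 3, 3))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
# Fully connected head with dropout, ending in a single sigmoid output
model.add(Flatten())
model.add(Dense(150))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(1))
model.add(Activation('sigmoid'))

model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
```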
171 |
172 | With regard to the difficulties encountered in the process, the first was understanding how data augmentation should be carried out: it took a bit of trial and error to figure out how much the images can be distorted so that the car/truck remains in the image at all times.
173 | A further difficulty was figuring out how to visualize the filters. Although I had a [solution to start from](https://keras.io/getting-started/faq/#how-can-i-record-the-training-validation-loss-accuracy-at-each-epoch), the code needed some rework, such as referencing my convolutional layers and writing a function to draw the images for all filters.
174 |
175 | On the code side, ```models.run_model()``` is used to train the model, validate it on the validation set, and generate the learning curves. ```models.build_model()``` builds the model with the network architecture defined by the given parameters. Data generated by ```models.data_augmentation()``` is used to train/test the model in batches of 32 images over 100 epochs, and to generate validation data for evaluating the trained model on images not previously used during train/test. ```models.learning_curves()``` graphically represents the process, giving an overview of the accuracy and loss values per epoch.
176 |
177 |
178 |
179 |
180 | ### Refinement
181 |
182 | Nets with 1 to 4 layers are tested in order to determine which configuration provides the best performance while minimizing computational resources. The table below shows net performance in terms of accuracy and minutes to train for the different layer counts and optimizers used:
183 |
184 |
185 |
186 | | Net Architecture | Accuracy | Minutes to train |
187 | |----------|:-------------:|------:|
188 | | Layers_1_adam | 0.86 | 21 |
189 | | Layers_1_Adagrad | 0.55 | 16 |
190 | | Layers_2_adam | 0.81 | 26 |
191 | | Layers_2_Adagrad | 0.55 | 22 |
192 | | Layers_3_adam | 0.87 | 25 |
193 | | Layers_3_Adagrad | 0.86 | 23 |
194 | | Layers_4_adam | 0.66 | 24 |
195 | | Layers_4_Adagrad | 0.84 | 25 |
196 |
197 |
198 | The neural net with 1 layer and the adam optimizer already meets the benchmark criterion, so no further refinement is required. Below are the accuracy and loss plots for the 1-layer, adam-optimizer configuration:
199 |
200 |
201 | ## IV. Results
202 |
203 |
204 | ### Model Evaluation and Validation
205 |
206 | A validation set containing 10% of the dataset, not used during the training/testing phase, is used to validate results. The final architecture selected, 1 layer with the adam optimizer, reaches an accuracy over 80%, which is in the range of the benchmark results. An accuracy of 80% means that the model is correct in 80 out of 100 predictions. Since the dataset is balanced (thanks to the data augmentation), accuracy is a perfectly valid metric in this scenario, and there is no need to investigate alternatives like precision and recall or F-scores.
207 |
208 |
209 |
210 |
211 |
212 | ### Justification
217 |
218 | The result obtained with the selected model was higher than expected. Even with a simplified problem (just two classes), the model is capable of reaching state-of-the-art accuracy, even on the validation set (data completely unseen by the model). We can consider this proof of concept satisfactory, as the model reaches the benchmark results.
219 |
220 |
221 | ## V. Conclusion
223 |
224 | ### Free-Form Visualization
225 |
226 | The following figures show some original pictures and how different filters represent them in the different convolutional layers. This gives us an idea of how the neural net decomposes the
227 | visual space.
228 |
229 |
230 |
231 |
232 |
233 |
234 |
235 |
236 | In both examples, for layer 1, the different filters focus mainly on shapes and the images are still recognizable, but in the higher layers this no longer happens and the output looks mostly like noise. As mentioned by ```@fchollet``` in his exceptional [post](https://blog.keras.io/how-convolutional-neural-networks-see-the-world.html): _"Does it mean that convnets are bad tools? Of course not, they serve their purpose just fine. What it means is that we should refrain from our natural tendency to anthropomorphize them and believe that they "understand", say, the concept of dog, or the appearance of a magpie, just because they are able to classify these objects with high accuracy. They don't, at least not to any extent that would make sense to us humans."_
237 |
238 | ### Reflection
239 |
240 | In this POC, a deep learning solution for automatic vehicle recognition is implemented. Image recognition has historically been a difficult task; however, over the last few years (thanks to increased computational resources) efficient methods have emerged to approach these kinds of problems. Deep multi-layer neural networks are capable of building up a hierarchy of abstractions that makes it possible to identify
241 | complex inputs (i.e. images), and this is the approach selected in this project.
242 |
243 | There were two major areas in the project: the first was data collection, the second was model building. Given that the collected dataset is small, a critical part of this project is the use of the Keras data augmentation utility, which helps to prevent overfitting and improve
244 | generalization.
245 |
246 | After this, building the different models attempted is not particularly complex (thanks to Keras again!), and although there is a significant number of parameters to experiment with (such as the type of activation functions, regularization methods, loss functions, error metrics, number of nodes in the fully connected layers, etc.), the work started from good architectures published by ```@fchollet``` and built from there. It is amazing to see how efficient this method is, and how fast it is possible to set up an architecture that performs well on the task.
247 |
248 | Although the final model meets expectations for the problem, further testing with more validation data would be desirable. The bottleneck here was the difficulty of data collection. Additional data could be used to guarantee that the model generalizes well enough in widely different environments.
249 |
250 |
251 |
252 | ### Improvement
253 |
254 | With regard to improvements, as already mentioned, gathering additional data would help generalization. A further direction in which to expand the project is multiclass classification, such that the model not only distinguishes cars from trucks, but recognizes many other vehicle types as well, such as vans, motorcycles, etc. In that case, it would potentially be necessary to expand the model architecture by adding more layers and neurons, so that the model is expressive enough to accommodate the additional complexity.
255 |
256 |
257 | ### References:
258 |
259 | [ref 01]: [Imagenet Classification with deep convolutional neural networks](https://papers.nips.cc/paper/4824-imagenet-classification-with-deep-convolutional-neural-networks.pdf)
260 |
261 | [ref 02]: [Multi-digit Number Recognition from Street View Imagery using Deep Convolutional Neural Networks](http://static.googleusercontent.com/media/research.google.com/es//pubs/archive/42241.pdf)
262 |
263 | [ref 03]: Ivakhnenko, A. G. and Lapa, V. G. (1965). Cybernetic Predicting Devices. CCM Information Corporation.
264 |
265 | [ref 04]: [Wikipedia Deep Learning History](https://en.wikipedia.org/wiki/Deep_learning#History)
266 |
267 | [ref 05]: [Building powerful image classification models using very little data](https://blog.keras.io/building-powerful-image-classification-models-using-very-little-data.html)
268 |
269 | [ref 06]: [Convolutional Neural Network](https://en.wikipedia.org/wiki/Convolutional_neural_network)
270 |
271 | [ref 07]: [Monza: Image Classification of Vehicle Make and Model Using Convolutional Neural Networks and Transfer Learning](http://cs231n.stanford.edu/reports/lediurfinal.pdf)
272 |
273 | [ref 08]: [Nielsen's book](http://neuralnetworksanddeeplearning.com/)
274 |
275 |
276 |
--------------------------------------------------------------------------------
/Capstone Project_writeup.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/kingkastle/Deep-Learning---Vehicle-Classification/e6bd3ecd42d2b65cd29f0ce6b052cf6f60160390/Capstone Project_writeup.pdf
--------------------------------------------------------------------------------
/Dataset/pictures_per_wnid_dict.pickle:
--------------------------------------------------------------------------------
1 | (dp0
2 | S'n04467665'
3 | p1
4 | cnumpy.core.multiarray
5 | scalar
6 | p2
7 | (cnumpy
8 | dtype
9 | p3
10 | (S'i8'
11 | p4
12 | I0
13 | I1
14 | tp5
15 | Rp6
16 | (I3
17 | S'<'
18 | p7
19 | NNNI-1
20 | I-1
21 | I0
22 | tp8
23 | bS'm\x03\x00\x00\x00\x00\x00\x00'
24 | p9
25 | tp10
26 | Rp11
27 | sS'n04285008'
28 | p12
29 | g2
30 | (g6
31 | S'\xd1\x02\x00\x00\x00\x00\x00\x00'
32 | p13
33 | tp14
34 | Rp15
35 | s.
--------------------------------------------------------------------------------
/Dataset/vehicles_synsets.txt:
--------------------------------------------------------------------------------
1 | n04467665
2 | n04285008
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # Deep-Learning Vehicle Classification
2 |
3 |
4 | This project is a proof of concept (POC) in which deep learning techniques are applied to vehicle recognition, a task of particular importance in traffic control and management: for example, companies operating road tolls need to detect fraud, since different fees apply to different vehicle types. Images used to train the neural nets are obtained from the [Imagenet](http://image-net.org/) dataset, which publicly distributes image URLs for hundreds of categories. Since the whole experiment is performed on a personal computer with limited computational resources, the POC scope is limited to a simple classification of two kinds of vehicles: trailer trucks versus sports cars. The POC's main goal is to determine the maximum accuracy (the percentage of predictions the model gets right) that different neural nets with basic architectures can reach using a limited set of images (fewer than 700) for training.
5 |
6 |
7 |
8 | ## COMPLETE REPORT:
9 | Report with full descriptions of motivation, methodology, results, etc.: [Deep Learning Vehicle Classification-Project_writeup](https://github.com/kingkastle/Deep-Learning---Vehicle-Classification/blob/master/Capstone%20Project_writeup.md)
10 |
11 |
12 | ## Requirements:
13 |
14 | The following libraries are necessary:
15 |
16 | ```
17 | # local scripts:
18 | from scripts import configuration # includes paths and parameters configurations
19 | from scripts import models # includes the different models
20 | from scripts import Dataset_wrangling # includes scripts from downloading pics to generate datasets
21 |
22 |
23 | # standard libraries
24 | import os
25 | import h5py
26 | from keras.preprocessing.image import ImageDataGenerator
27 | from keras.models import Sequential
28 | from keras.layers import Convolution2D,MaxPooling2D
29 | import numpy as np
30 | from sklearn.metrics import f1_score
31 | from sklearn.metrics import recall_score
32 | from datetime import datetime
33 | import configuration
34 | import pickle
35 | import multiprocessing
36 | import logging
37 | import urllib2
38 | import download_imagenet_images
39 | import pandas
41 | import seaborn as sns
42 | import matplotlib.pyplot as plt
43 | import matplotlib.image as mpimg
44 | from IPython.core.display import HTML,display
45 |
46 | ```
47 |
48 | The project is written entirely in Python 2.7.
49 |
50 | ## Instructions:
51 |
52 | Please follow the instructions given in ```Vehicles_Categorization.ipynb```
53 |
54 |
55 | Enjoy!
56 |
--------------------------------------------------------------------------------
/Vehicles_Categorization.odt:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/kingkastle/Deep-Learning---Vehicle-Classification/e6bd3ecd42d2b65cd29f0ce6b052cf6f60160390/Vehicles_Categorization.odt
--------------------------------------------------------------------------------
/images/Layers_1_adam_acc.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/kingkastle/Deep-Learning---Vehicle-Classification/e6bd3ecd42d2b65cd29f0ce6b052cf6f60160390/images/Layers_1_adam_acc.png
--------------------------------------------------------------------------------
/images/Layers_1_adam_loss.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/kingkastle/Deep-Learning---Vehicle-Classification/e6bd3ecd42d2b65cd29f0ce6b052cf6f60160390/images/Layers_1_adam_loss.png
--------------------------------------------------------------------------------
/images/augmented_pics.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/kingkastle/Deep-Learning---Vehicle-Classification/e6bd3ecd42d2b65cd29f0ce6b052cf6f60160390/images/augmented_pics.png
--------------------------------------------------------------------------------
/images/car_filters.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/kingkastle/Deep-Learning---Vehicle-Classification/e6bd3ecd42d2b65cd29f0ce6b052cf6f60160390/images/car_filters.png
--------------------------------------------------------------------------------
/images/cnn_pooling.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/kingkastle/Deep-Learning---Vehicle-Classification/e6bd3ecd42d2b65cd29f0ce6b052cf6f60160390/images/cnn_pooling.png
--------------------------------------------------------------------------------
/images/dataset_characteristics.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/kingkastle/Deep-Learning---Vehicle-Classification/e6bd3ecd42d2b65cd29f0ce6b052cf6f60160390/images/dataset_characteristics.png
--------------------------------------------------------------------------------
/images/deeplearning_google.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/kingkastle/Deep-Learning---Vehicle-Classification/e6bd3ecd42d2b65cd29f0ce6b052cf6f60160390/images/deeplearning_google.png
--------------------------------------------------------------------------------
/images/download_process.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/kingkastle/Deep-Learning---Vehicle-Classification/e6bd3ecd42d2b65cd29f0ce6b052cf6f60160390/images/download_process.png
--------------------------------------------------------------------------------
/images/image_transformations.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/kingkastle/Deep-Learning---Vehicle-Classification/e6bd3ecd42d2b65cd29f0ce6b052cf6f60160390/images/image_transformations.png
--------------------------------------------------------------------------------
/images/kernel.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/kingkastle/Deep-Learning---Vehicle-Classification/e6bd3ecd42d2b65cd29f0ce6b052cf6f60160390/images/kernel.png
--------------------------------------------------------------------------------
/images/model_results.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/kingkastle/Deep-Learning---Vehicle-Classification/e6bd3ecd42d2b65cd29f0ce6b052cf6f60160390/images/model_results.png
--------------------------------------------------------------------------------
/images/model_workflow.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/kingkastle/Deep-Learning---Vehicle-Classification/e6bd3ecd42d2b65cd29f0ce6b052cf6f60160390/images/model_workflow.png
--------------------------------------------------------------------------------
/images/monza_analysis.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/kingkastle/Deep-Learning---Vehicle-Classification/e6bd3ecd42d2b65cd29f0ce6b052cf6f60160390/images/monza_analysis.png
--------------------------------------------------------------------------------
/images/neural_net.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/kingkastle/Deep-Learning---Vehicle-Classification/e6bd3ecd42d2b65cd29f0ce6b052cf6f60160390/images/neural_net.png
--------------------------------------------------------------------------------
/images/optimizers.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/kingkastle/Deep-Learning---Vehicle-Classification/e6bd3ecd42d2b65cd29f0ce6b052cf6f60160390/images/optimizers.png
--------------------------------------------------------------------------------
/images/sgd.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/kingkastle/Deep-Learning---Vehicle-Classification/e6bd3ecd42d2b65cd29f0ce6b052cf6f60160390/images/sgd.png
--------------------------------------------------------------------------------
/images/truck_filters.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/kingkastle/Deep-Learning---Vehicle-Classification/e6bd3ecd42d2b65cd29f0ce6b052cf6f60160390/images/truck_filters.png
--------------------------------------------------------------------------------
/images/truck_filters_4.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/kingkastle/Deep-Learning---Vehicle-Classification/e6bd3ecd42d2b65cd29f0ce6b052cf6f60160390/images/truck_filters_4.png
--------------------------------------------------------------------------------
/images/truck_versus_car.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/kingkastle/Deep-Learning---Vehicle-Classification/e6bd3ecd42d2b65cd29f0ce6b052cf6f60160390/images/truck_versus_car.png
--------------------------------------------------------------------------------
/models/Layers_1_Adagrad_acc.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/kingkastle/Deep-Learning---Vehicle-Classification/e6bd3ecd42d2b65cd29f0ce6b052cf6f60160390/models/Layers_1_Adagrad_acc.png
--------------------------------------------------------------------------------
/models/Layers_1_Adagrad_loss.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/kingkastle/Deep-Learning---Vehicle-Classification/e6bd3ecd42d2b65cd29f0ce6b052cf6f60160390/models/Layers_1_Adagrad_loss.png
--------------------------------------------------------------------------------
/models/Layers_1_adam_acc.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/kingkastle/Deep-Learning---Vehicle-Classification/e6bd3ecd42d2b65cd29f0ce6b052cf6f60160390/models/Layers_1_adam_acc.png
--------------------------------------------------------------------------------
/models/Layers_1_adam_loss.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/kingkastle/Deep-Learning---Vehicle-Classification/e6bd3ecd42d2b65cd29f0ce6b052cf6f60160390/models/Layers_1_adam_loss.png
--------------------------------------------------------------------------------
/models/Layers_2_Adagrad_acc.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/kingkastle/Deep-Learning---Vehicle-Classification/e6bd3ecd42d2b65cd29f0ce6b052cf6f60160390/models/Layers_2_Adagrad_acc.png
--------------------------------------------------------------------------------
/models/Layers_2_Adagrad_loss.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/kingkastle/Deep-Learning---Vehicle-Classification/e6bd3ecd42d2b65cd29f0ce6b052cf6f60160390/models/Layers_2_Adagrad_loss.png
--------------------------------------------------------------------------------
/models/Layers_2_adam_acc.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/kingkastle/Deep-Learning---Vehicle-Classification/e6bd3ecd42d2b65cd29f0ce6b052cf6f60160390/models/Layers_2_adam_acc.png
--------------------------------------------------------------------------------
/models/Layers_2_adam_loss.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/kingkastle/Deep-Learning---Vehicle-Classification/e6bd3ecd42d2b65cd29f0ce6b052cf6f60160390/models/Layers_2_adam_loss.png
--------------------------------------------------------------------------------
/models/Layers_3_Adagrad_acc.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/kingkastle/Deep-Learning---Vehicle-Classification/e6bd3ecd42d2b65cd29f0ce6b052cf6f60160390/models/Layers_3_Adagrad_acc.png
--------------------------------------------------------------------------------
/models/Layers_3_Adagrad_loss.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/kingkastle/Deep-Learning---Vehicle-Classification/e6bd3ecd42d2b65cd29f0ce6b052cf6f60160390/models/Layers_3_Adagrad_loss.png
--------------------------------------------------------------------------------
/models/Layers_3_adam_acc.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/kingkastle/Deep-Learning---Vehicle-Classification/e6bd3ecd42d2b65cd29f0ce6b052cf6f60160390/models/Layers_3_adam_acc.png
--------------------------------------------------------------------------------
/models/Layers_3_adam_loss.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/kingkastle/Deep-Learning---Vehicle-Classification/e6bd3ecd42d2b65cd29f0ce6b052cf6f60160390/models/Layers_3_adam_loss.png
--------------------------------------------------------------------------------
/models/Layers_4_Adagrad_acc.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/kingkastle/Deep-Learning---Vehicle-Classification/e6bd3ecd42d2b65cd29f0ce6b052cf6f60160390/models/Layers_4_Adagrad_acc.png
--------------------------------------------------------------------------------
/models/Layers_4_Adagrad_loss.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/kingkastle/Deep-Learning---Vehicle-Classification/e6bd3ecd42d2b65cd29f0ce6b052cf6f60160390/models/Layers_4_Adagrad_loss.png
--------------------------------------------------------------------------------
/models/Layers_4_adam_acc.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/kingkastle/Deep-Learning---Vehicle-Classification/e6bd3ecd42d2b65cd29f0ce6b052cf6f60160390/models/Layers_4_adam_acc.png
--------------------------------------------------------------------------------
/models/Layers_4_adam_loss.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/kingkastle/Deep-Learning---Vehicle-Classification/e6bd3ecd42d2b65cd29f0ce6b052cf6f60160390/models/Layers_4_adam_loss.png
--------------------------------------------------------------------------------
/scripts/Dataset_wrangling.py:
--------------------------------------------------------------------------------
1 | '''
2 | Created on Aug 21, 2016
3 |
4 | @author: rafaelcastillo
5 |
6 | This script is used to build the final dataset from the downloaded pictures
7 | '''
8 |
9 | import pandas
10 | import numpy as np
11 | from os import listdir,walk
12 | from os.path import isfile, join
13 | from keras.preprocessing.image import ImageDataGenerator
14 | from scipy import ndimage
15 | import seaborn as sns
16 | sns.set(style="white")
17 | import matplotlib.pyplot as plt
18 | import os, shutil
19 | import random
20 | import Image
21 |
22 |
23 |
24 | plt.style.use('seaborn-muted') # Use the seaborn-muted style for visualizations
25 |
26 |
27 | def resize_pic(img_path,width,height):
28 | '''
29 | This function is used to resize pictures to the desired shape.
30 |
31 | Args:
32 | * img_path: path to local image (in jpg format) to convert
33 | * width: desired width of the output picture
34 | * height: desired height of the output picture
35 |
36 | Return:
37 | 0/1 depending of the conversion status.
38 | Generated picture is stored under a different name using img_path
39 | '''
40 | im1 = Image.open(img_path)
41 | # use cubic spline interpolation in a 4x4 environment filter options to resize the image
42 | try:
43 | im4 = im1.resize((width, height), Image.BICUBIC)
44 | im4.save(img_path.replace('.jpg','_good_shape.jpg'))
45 | except:
46 | #print "Unable to resize file: {0}".format(img_path)
47 | return 0
48 | return 1
49 |
50 |
51 | def generate_pics(dataset_path,family,df_hw,number_pics,width,height):
52 | '''
53 |     This function generates augmented pictures using the Keras utility ImageDataGenerator.
54 |
55 | Args:
56 | * dataset_path: dataset path
57 |         * family: vehicle category (wnid)
58 | * df_hw: dataframe with height width, channel and family name for each picture
59 | * number_pics: number of pictures to generate
60 | * width: desired width of the output picture
61 | * height: desired height of the output picture
62 |
63 | Return:
64 | none
65 |
66 | '''
67 |
68 | # Generate temporary folder and subfolder where the input pics to generate
69 | # fake pictures will be located.
70 | pictures_path = dataset_path + family
71 | if not os.path.exists(pictures_path + '/' + 'temp'):
72 | os.mkdir(pictures_path + '/' + 'temp')
73 | os.mkdir(pictures_path + '/' + 'temp/pics')
74 |
75 | # Get pictures paths:
76 | picture_files = df_hw[(df_hw['family']==family)&
77 | (df_hw['height']>=height)&
78 | (df_hw['width']>=width)]['name'].unique()
79 |
80 | # Select a number_pics of pictures using a random sample:
81 | if len(picture_files) < number_pics:
82 | print """Warning: There are insufficient pictures with optimum size to generate augmented pics.
83 | Process will repeat pictures to generate augmented data"""
84 | pics_diff = number_pics - len(picture_files)
85 | selected_pics = random.sample(picture_files,len(picture_files)) + random.sample(picture_files,pics_diff)
86 | else:
87 | selected_pics = random.sample(picture_files,number_pics)
88 |
89 |     # Get wnid for the vehicle class:
90 | wnid = selected_pics[0].split("_")[0]
91 |
92 | # Copy those files to the destination folder:
93 | for pic in selected_pics:
94 | shutil.copyfile(join(pictures_path, pic), pictures_path + '/' + 'temp/pics/' + pic)
95 |
96 | # Generate augmented pics. Docs: https://keras.io/preprocessing/image/
97 | datagen = ImageDataGenerator(
98 | rotation_range=40,
99 | width_shift_range=0.01,
100 | height_shift_range=0.1,
101 | shear_range=0.05,
102 | zoom_range=0.1,
103 | horizontal_flip=True,
104 | fill_mode='nearest')
105 |
106 | # the .flow_from_directory() command below generates batches of randomly transformed images
107 | # and saves the results to save_to_dir directory
108 | for _ in datagen.flow_from_directory(pictures_path + '/' + 'temp',
109 | target_size=(width,height),
110 | batch_size=number_pics,
111 | classes=None,
112 | class_mode=None,
113 | shuffle=True,
114 | save_to_dir=pictures_path,
115 | save_prefix='Generated',
116 | save_format='jpg'):
117 | break # otherwise the generator would loop indefinitely
118 |
119 | # Rename generated files:
120 | for pic in [join(pictures_path, f) for f in os.listdir(pictures_path) if 'Generated' in f]:
121 | shutil.move( pic, pic.replace('Generated',wnid + '_fake'))
122 |
123 | # Remove temporary folder:
124 | shutil.rmtree(pictures_path + '/' + 'temp')
125 |
126 |
127 |
128 | def sizes_distribution(dataset_path):
129 | '''
130 |     This function generates a dataframe that stores the height, width,
131 |     number of RGB channels and family name for every picture
132 |
133 | Args:
134 | * dataset_path: dataset path
135 |
136 | Return:
137 | * df_hw: dataframe with height width, channel and family name for each picture
138 |
139 | '''
140 | # Dataframe to store results
141 | df_hw = pandas.DataFrame(columns=['name','height','width','channels'])
142 | idx = 0
143 |
144 | # Get list with all directories in the Dataset
145 | dir_list = [x[1] for x in walk(dataset_path)][0]
146 | for dir in dir_list:
147 | current_dir = dataset_path + "/" + dir
148 | picture_names = [f for f in listdir(current_dir) if isfile(join(current_dir, f))]
149 | for picture in picture_names:
150 | try:
151 | img = ndimage.imread(current_dir + "/" + picture,mode='RGB').shape
152 | except:
153 | print 'Unable to open file {0}'.format(current_dir + "/" + picture)
154 | os.remove(current_dir + "/" + picture)
155 | continue
156 | df_hw.loc[idx] = [picture,img[0],img[1],img[2]]
157 | idx += 1
158 |
159 | # generate family column
160 | df_hw['family'] = df_hw['name'].apply(lambda x: x.split("_")[0])
161 |
162 | return df_hw
163 |
164 |
165 | def generate_sets(dataset_path,output_path,sizes,selected_classes):
166 | '''
167 | This function generates the train and test structure directories required by
168 | function data_augmentation in models.py
169 |
170 | Args:
171 | * dataset_path: dataset path with images
172 | * output_path: train and test dataset path
173 | * sizes: list with the train/test/validation sizes distributions, i.e: [0.6,0.2,0.2]
174 |         * selected_classes: list with the selected vehicle categories used for modeling
175 |
176 | Return:
177 | None
178 | '''
179 |
180 |     assert np.isclose(np.sum(sizes), 1.0), "Total sizes must sum to 1!, i.e. [.8,.1,.1]"
181 |
182 | try:
183 | os.mkdir(output_path)
184 | os.mkdir(output_path + '/' + 'train')
185 | os.mkdir(output_path + '/' + 'test')
186 | os.mkdir(output_path + '/' + 'validation')
187 | except:
188 | pass
189 |
190 | # Get list with all directories in the Dataset
191 | dir_list = [x[1] for x in walk(dataset_path)][0]
192 | for dir in dir_list:
193 |
194 | # Just process those flowers categories included in selected_classes
195 | if dir not in selected_classes: continue
196 |
197 | # Create folder in output directory
198 | current_dir = dataset_path + "/" + dir
199 | os.mkdir(current_dir.replace(dataset_path,output_path+"/train"))
200 | os.mkdir(current_dir.replace(dataset_path,output_path+"/test"))
201 | os.mkdir(current_dir.replace(dataset_path,output_path+"/validation"))
202 |
203 | # Get picture names and shuffle:
204 | picture_names = [f for f in listdir(current_dir) if isfile(join(current_dir, f))]
205 | random.shuffle(picture_names)
206 | train_size = int(len(picture_names)*sizes[0])
207 | test_size = int(len(picture_names)*sizes[1])
208 |
209 | # Copy files to the corresponding output directory
210 | for picture in picture_names[:train_size]:
211 | shutil.copyfile(current_dir + "/" + picture, current_dir.replace(dataset_path,output_path+"/train") + "/" + picture)
212 | for picture in picture_names[train_size:train_size + test_size]:
213 | shutil.copyfile(current_dir + "/" + picture, current_dir.replace(dataset_path,output_path+"/test") + "/" + picture)
214 | for picture in picture_names[train_size + test_size:]:
215 | shutil.copyfile(current_dir + "/" + picture, current_dir.replace(dataset_path,output_path+"/validation") + "/" + picture)
216 |
217 |
218 | if __name__ == '__main__':
219 | print "Dataset Wrangling module loaded..."
220 |
221 |
222 |
223 |
224 |
225 |
226 |
--------------------------------------------------------------------------------
/scripts/ImageProcessor.py:
--------------------------------------------------------------------------------
1 | import os, shutil
2 | import random
3 | from os import listdir
4 | from os.path import isfile, join
5 | import Image
6 | from scipy import ndimage
7 | from keras.preprocessing.image import ImageDataGenerator
8 |
9 |
10 | def resize_pic(img_path,width,height):
11 | '''
12 | This function is used to resize pictures to the desired shape.
13 |
14 | Args:
15 | * img_path: path to local image (in jpg format) to convert
16 | * width: desired width of the output picture
17 | * height: desired height of the output picture
18 |
19 | Return:
20 | 0/1 depending of the conversion status.
21 | Generated picture is stored under a different name using img_path
22 | '''
23 | im1 = Image.open(img_path)
24 | # use cubic spline interpolation in a 4x4 environment filter options to resize the image
25 | try:
26 | im4 = im1.resize((width, height), Image.BICUBIC)
27 | im4.save(img_path.replace('.jpg','_good_shape.jpg'))
28 | except:
29 | print "Unable to resize file: {0}".format(img_path)
30 | return 0
31 | return 1
32 |
33 |
34 | def generate_pics(pictures_path,number_pics,width,height):
35 | '''
36 |     This function generates augmented pictures using the Keras utility ImageDataGenerator.
37 |
38 | Args:
39 |         * pictures_path: path to the folder where a particular vehicle class is located
40 | * number_pics: number of pictures to generate
41 | * width: desired width of the output picture
42 | * height: desired height of the output picture
43 |
44 | Return:
45 | none
46 |
47 | '''
48 |
49 | # Generate temporary folder and subfolder where the input pics to generate
50 | # fake pictures will be located.
51 | if not os.path.exists(pictures_path + '/' + 'temp'):
52 | os.mkdir(pictures_path + '/' + 'temp')
53 | os.mkdir(pictures_path + '/' + 'temp/pics')
54 |
55 | # Get pictures paths:
56 | picture_files = [f for f in listdir(pictures_path)
57 | if isfile(join(pictures_path, f))
58 | if ndimage.imread(join(pictures_path,f),mode='RGB').shape[0]>height]
59 | if len(picture_files) < number_pics:
60 | print "Warning: there are insufficient files with optimum size to generate augmented pics"
61 | number_pics = len(picture_files)
62 |
63 | # Select a number_pics of pictures using a random sample:
64 | selected_pics = random.sample(picture_files,number_pics)
65 |
66 |     # Get wnid for the vehicle class:
67 | wnid = selected_pics[0].split("_")[0]
68 |
69 | # Copy those files to the destination folder:
70 | for pic in selected_pics:
71 | shutil.copyfile(join(pictures_path, pic), pictures_path + '/' + 'temp/pics/' + pic)
72 |
73 | # Generate augmented pics. Docs: https://keras.io/preprocessing/image/
74 | datagen = ImageDataGenerator(
75 | rotation_range=40,
76 | width_shift_range=0.01,
77 | height_shift_range=0.1,
78 | shear_range=0.05,
79 | zoom_range=0.1,
80 | horizontal_flip=True,
81 | fill_mode='nearest')
82 |
83 | # the .flow_from_directory() command below generates batches of randomly transformed images
84 | # and saves the results to save_to_dir directory
85 | for _ in datagen.flow_from_directory(pictures_path + '/' + 'temp',
86 | target_size=(width,height),
87 | batch_size=number_pics,
88 | classes=None,
89 | class_mode=None,
90 | shuffle=True,
91 | save_to_dir=pictures_path,
92 | save_prefix='Generated',
93 | save_format='jpg'):
94 | break # otherwise the generator would loop indefinitely
95 |
96 | # Rename generated files:
97 | for pic in [join(pictures_path, f) for f in listdir(pictures_path) if 'Generated' in f]:
98 | shutil.move( pic, pic.replace('Generated',wnid + '_fake'))
99 |
100 | # Remove temporary folder:
101 | shutil.rmtree(pictures_path + '/' + 'temp')
102 |
103 | if __name__ == '__main__':
104 |     # Example invocation with a local development path:
105 |     pictures_path = '/home/rafaelcastillo/MLND/Project5/DeepLearning/Dataset2/n12914923'
106 |     width = height = 250
107 |     generate_pics(pictures_path, 4, width, height)
108 |
--------------------------------------------------------------------------------
/scripts/configuration.py:
--------------------------------------------------------------------------------
1 | '''
2 | Created on Aug 27, 2016
3 |
4 | @author: rafaelcastillo
5 |
6 | This file includes all configuration parameters used along the project
7 | '''
8 |
9 | ## Configurations during the wrangling phase:
10 |
11 | path_dataset = '/home/rafaelcastillo/MLND/Project5/DeepLearning/Dataset/' # path to dataset in local directory
12 |
13 | dataset_train_test = '/home/rafaelcastillo/MLND/Project5/DeepLearning/Dataset_train_test' # path to dataset in the model required structure
14 |
15 | sizes = [.6,.3,.1] # sizes of the different sets: train/test/validation
16 |
17 | ## Classes:
18 | classes = ['n04467665','n04285008']
19 |
20 | ## Dimensions of the augmented pictures: (vehicles)
21 | height = 405
22 | width = 460
23 |
24 | ## models path:
25 | model_path = '/home/rafaelcastillo/MLND/Project5/DeepLearning/models'
26 |
27 | ## models input pics dimensions:
28 | model_height = 150
29 | model_width = 150
30 |
--------------------------------------------------------------------------------
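A quick, hypothetical sanity check for the values above (not part of the original scripts): it verifies that the train/test/validation fractions sum to 1 and that a folder of downloaded pictures exists for each class wnid.

```python
# Sketch only: sanity-check the configuration values.
import os
import configuration

# The train/test/validation fractions should sum to 1:
assert abs(sum(configuration.sizes) - 1.0) < 1e-9

# Each class wnid should already have a folder of downloaded pictures:
for wnid in configuration.classes:
    print wnid, os.path.exists(os.path.join(configuration.path_dataset, wnid))
```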
/scripts/download_imagenet_images.py:
--------------------------------------------------------------------------------
1 | # -*- coding: utf-8 -*-
2 | '''
3 | Created on Aug 19, 2016
4 |
5 | @author: rafaelcastillo
6 |
7 |
8 | This script is intended to download pictures from Imagenet (http://image-net.org/)
9 | based on synset wnids via HTTP requests, as described here:
10 |
11 | http://image-net.org/download-imageurls
12 |
13 | This project focuses on vehicle classification, and the corresponding
14 | "WordNet ID" (wnid) of each vehicle synset used in the project
15 | is stored in the local file:
16 |
17 |     'vehicles_synsets.txt'
18 |
19 | The wnids included in that file are used to download the pictures for each vehicle class:
20 |
21 | http://www.image-net.org/api/text/imagenet.synset.geturls?wnid=[wnid]
22 |
23 | '''
24 |
25 | import urllib2
26 | import os
27 | import numpy as np
28 | import logging
29 |
30 |
31 |
32 |
33 |
34 | def download(url,local_dataset,file_name):
35 | '''
36 |     This function downloads a single picture over HTTP
37 |
38 | Args:
39 | * url: url of picture to download
40 | * local_dataset: root path to locate pictures
41 |         * file_name: name given to the picture in the local folder
42 |
43 | Return:
44 | 1: picture downloaded successfully
45 | 0: picture not downloaded
46 |
47 | '''
48 | try:
49 | furl = urllib2.urlopen(url)
50 | finalurl = furl.geturl() # Since some pictures are no longer available, finalurl is used to detect url redirection:
51 | if url != finalurl:
52 | logging.info('File no longer available: {0}'.format(url))
53 | return 0
54 | wnid = file_name.split("_")[0]
55 | local_path = local_dataset + "/" + wnid
56 | if not os.path.exists(local_path):
57 | os.makedirs(local_path)
58 |         f = open("{0}.jpg".format(local_path + "/" + file_name),'wb')
59 | f.write(furl.read())
60 | f.close()
61 |     except Exception:
62 | logging.info('Unable to download file {0}'.format(url))
63 | return 0
64 | return 1
65 |
66 | def worker(procnum, process_number,return_list):
67 | '''
68 | Worker function to perform multiprocessing
69 |
70 |     Args:
71 |         * procnum: list with the url, local Dataset directory and file_name elements for the download function
72 |         * process_number: index of this worker's slot in return_list
73 |         * return_list: shared list in which each worker stores its download status
74 | Return:
75 | none
76 | '''
77 | url,local_dataset,file_name = procnum[0],procnum[1],procnum[2]
78 | download_status = download(url, local_dataset, file_name)
79 | return_list[process_number] = download_status
80 |
81 | def process_jobs(jobs,return_list):
82 | '''
83 |     Join all processes appended to jobs, terminating any still alive after a 2-second timeout
84 |
85 |     Args:
86 | * jobs: list with appended processes
87 | * return_list: list with all processes' outputs
88 |
89 | Return:
90 | Sum of all processes' outputs
91 | '''
92 | for p in jobs:
93 | p.join(2)
94 |         # If the process is still alive after the join timeout
95 | if p.is_alive():
96 | logging.info( "Process is running... let's kill it...")
97 | # Terminate process
98 | p.terminate()
99 | return np.sum(return_list)
100 |
101 |
102 | if __name__ == '__main__':
103 | print "download module loaded..."
104 |
105 |
106 |
107 |
108 |
109 |
110 |
111 |
112 |
113 |
114 |
--------------------------------------------------------------------------------
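Since the `__main__` block above only prints a message, here is a hedged driver sketch showing how `worker` and `process_jobs` were presumably meant to fit together: fetch the URL list for one wnid and download a small batch in parallel. The wnid is taken from `configuration.py`; the batch size and file-name pattern are assumptions.

```python
# Sketch only: parallel download of a small batch of pictures for one synset.
import multiprocessing
import urllib2
from download_imagenet_images import worker, process_jobs

local_dataset = '/home/rafaelcastillo/MLND/Project5/DeepLearning/Dataset'
wnid = 'n04285008'  # one of the two classes listed in configuration.py
urls = urllib2.urlopen(
    'http://www.image-net.org/api/text/imagenet.synset.geturls?wnid=' + wnid
).read().splitlines()

batch = urls[:8]  # small batch for illustration
manager = multiprocessing.Manager()
return_list = manager.list([0] * len(batch))  # one status slot per worker
jobs = []
for i, url in enumerate(batch):
    file_name = '{0}_{1}'.format(wnid, i)  # download() derives the wnid folder from this prefix
    p = multiprocessing.Process(target=worker,
                                args=([url, local_dataset, file_name], i, return_list))
    jobs.append(p)
    p.start()

# process_jobs joins each worker (2 s timeout), kills stragglers and sums the statuses:
print '{0} pictures downloaded'.format(process_jobs(jobs, return_list))
```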
/scripts/models.py:
--------------------------------------------------------------------------------
1 | '''
2 | Created on Aug 30, 2016
3 |
4 | @author: rafaelcastillo
5 |
6 | This script includes the modelling part
7 |
8 | ref: https://blog.keras.io/building-powerful-image-classification-models-using-very-little-data.html
9 | ref: https://blog.keras.io/how-convolutional-neural-networks-see-the-world.html
10 | '''
11 | from keras.preprocessing.image import ImageDataGenerator, array_to_img, img_to_array, load_img
12 | from keras.models import Sequential
13 | from keras.layers import Convolution2D, MaxPooling2D, AveragePooling2D
14 | from keras.layers import Activation, Dropout, Flatten, Dense
15 | from keras import backend as K
16 | from scipy import ndimage
17 | import matplotlib.pyplot as plt
18 | import matplotlib.image as mpimg
19 | from datetime import datetime
20 | import pandas, os
21 | import numpy as np
22 | import configuration, Dataset_wrangling
23 | from sklearn.metrics import accuracy_score
24 |
25 | def data_augmentation(path_dataset,pic_dims):
26 | """
27 |     This function creates generators that yield batches of input data for the models from the train/test directories.
28 |
29 | Args:
30 |         * path_dataset: path to the dataset root (with train/ and test/ subfolders)
31 | * pic_dims: list with width and height of pictures
32 |
33 | Returns:
34 |         * train and validation generators that yield the input images
35 | """
36 |
37 | # augmentation configuration used for training
38 | train_datagen = ImageDataGenerator(
39 | rescale=1/255.,
40 | rotation_range=30,
41 | width_shift_range=0.1,
42 | height_shift_range=0.1,
43 | shear_range=0,
44 | zoom_range=0.2,
45 | horizontal_flip=True,
46 | fill_mode='nearest')
47 |
48 |     # for testing only rescaling is applied (no augmentation)
49 | test_datagen = ImageDataGenerator(rescale=1./255)
50 |
51 | # reading images from the specified directory and generating batches of augmented data
52 | train_generator = train_datagen.flow_from_directory(
53 | '{0}/train'.format(path_dataset),
54 | color_mode="grayscale",
55 | target_size=(pic_dims[0], pic_dims[1]),
56 | batch_size=32,
57 | class_mode='binary')
58 |
59 |     # reading validation images from the test directory (rescaled only)
60 | validation_generator = test_datagen.flow_from_directory(
61 | '{0}/test'.format(path_dataset),
62 | color_mode="grayscale",
63 | target_size=(pic_dims[0], pic_dims[1]),
64 | batch_size=32,
65 | class_mode='binary')
66 |
67 | return train_generator, validation_generator
68 |
69 |
70 | def learning_curves(model_path,model_name,optimizer,history,show_plots):
71 | """
72 | Display and save learning curves.
73 |
74 | Args:
75 | * model_path: path to models directory
76 |         * model_name: name of the trained and validated model
77 |         * optimizer (unused here), history (Keras History from training), show_plots (display plots while executing)
78 | """
79 |
80 | # accuracy
81 | plt.figure()
82 | plt.plot(history.history['acc'])
83 | plt.plot(history.history['val_acc'])
84 | plt.title('accuracy of the model')
85 | plt.ylabel('accuracy')
86 | plt.xlabel('epoch')
87 |     plt.legend(['training set','validation set'], loc='lower right')
88 | plt.savefig(model_path + "/" + model_name + '_acc.png')
89 | if show_plots: plt.show()
90 |
91 | # loss
92 | plt.figure()
93 | plt.plot(history.history['loss'])
94 | plt.plot(history.history['val_loss'])
95 | plt.title('loss of the model')
96 | plt.ylabel('loss')
97 | plt.xlabel('epoch')
98 |     plt.legend(['training set','validation set'], loc='upper right')
99 | plt.savefig(model_path + "/" + model_name + '_loss.png')
100 | if show_plots: plt.show()
101 |
102 |
103 | def build_model(optimizer,pic_dims,layers):
104 | """
105 | Builds model with desired hyperparameters.
106 |
107 | Args:
108 | * optimizer: optimizer used in the model
109 | * pic_dims: list with width and height of pictures
110 | * layers: number of conv layers included in the net
111 |
112 | Returns:
113 | * model: defined model with the selected optimizer
114 | """
115 |
116 | # Define the first conv layer
117 | model = Sequential()
118 | model.add(Convolution2D(32, 3, 3, activation='relu', input_shape=(1, pic_dims[0], pic_dims[1]), name='conv1'))
119 | model.add(MaxPooling2D(pool_size=(2, 2)))
120 |
121 |     # Include as many conv layers as defined by the layers parameter
122 | num_layers = 1
123 | while (num_layers < layers):
124 | num_layers += 1
125 | model.add(Convolution2D(64, 3, 3, activation='relu', name='conv{0}'.format(num_layers)))
126 | model.add(MaxPooling2D(pool_size=(2, 2)))
127 |
128 | # Define on top a fully connected net
129 | model.add(Flatten())
130 | model.add(Dense(150))
131 | model.add(Activation('relu'))
132 | model.add(Dropout(0.5))
133 | model.add(Dense(1))
134 | model.add(Activation('sigmoid'))
135 |
136 | model.compile(loss='binary_crossentropy',
137 | optimizer=optimizer,
138 | metrics=['accuracy'])
139 |
140 | return model
141 |
142 | def run_model(optimizer,nb_epoch,model_path,path_dataset,pic_dims,layers):
143 | """
144 |     This function builds and trains the model, then evaluates it over the validation set.
145 |     Model weights are saved to a local file, along with the training time and the accuracy over the validation set.
146 |
147 | Args:
148 | * optimizer: optimizer used to calculate deep net loss
149 | * nb_epoch: number of epochs used during training process
150 | * model_path: path to models directory
151 | * path_dataset: path to input images
152 | * pic_dims: list with width and height of pictures
153 | * layers: number of conv layers included in the net
154 |
155 | Return:
156 |         * model: trained model
157 |         * model_name: name of the trained and validated model
158 |         * time_to_train: time (seconds) required to train the model
159 |         * acc: model accuracy over the validation set
160 | """
161 |
162 | # Generate connectors to augmented train/test data:
163 | train_generator, validation_generator = data_augmentation(path_dataset,pic_dims)
164 | model = build_model(optimizer,pic_dims,layers)
165 |
166 | #Start calculating train processing time:
167 | startTime = datetime.now()
168 |
169 | # Train model:
170 | history = model.fit_generator(
171 | train_generator,
172 | samples_per_epoch=300,
173 | nb_epoch=nb_epoch,
174 | validation_data=validation_generator,
175 | nb_val_samples=100,
176 | verbose=1)
177 |
178 | # End training processing:
179 | time_to_train = datetime.now() - startTime
180 |
181 | # Generate learning plots:
182 | model_name = "Layers_{0}_".format(layers) + str(optimizer)
183 | learning_curves(model_path,model_name,optimizer,history,False)
184 | model.save_weights(model_path + "/" + model_name +'.h5')
185 |
186 | # Calculate accuracy over validation set:
187 |     model_parameters = [model_path,optimizer,model_name,pic_dims,layers]
188 | acc = validate_model(model_parameters,path_dataset)
189 |
190 | return model,model_name,time_to_train,acc
191 |
192 | def validate_model(model_parameters,path_dataset):
193 | '''
194 |
195 |     This function evaluates a trained model over the validation dataset (rescaled only, not augmented)
196 |
197 | Args:
198 | * model_parameters: List that includes:[model_path, optimizer, model_name, [height, width], layers]
199 | * path_dataset: path to dataset (where validation data is included)
200 |
201 | Returns:
202 | * Accuracy score
203 | '''
204 |
205 |     model_path, optimizer, model_name, pic_dims, layers = model_parameters
206 |
207 | # Load model
208 | model = build_model(optimizer,pic_dims,layers)
209 | model.load_weights(model_path + "/" + model_name + '.h5')
210 |
211 | # Load test data and labels
212 | test_datagen = ImageDataGenerator(rescale=1./255)
213 | i = 0
214 | for test_data in test_datagen.flow_from_directory(
215 |             '{0}/validation'.format(path_dataset),
216 |             target_size=(pic_dims[0], pic_dims[1]),
217 | batch_size=200,
218 | color_mode="grayscale",
219 | class_mode='binary',
220 | shuffle=False):
221 | i += 1
222 | if i > 2:
223 | break # otherwise the generator would loop indefinitely
224 | X = test_data[0]
225 | y_true = test_data[1]
226 |
227 | # Predict on test data
228 | y_pred = model.predict(X, batch_size=1, verbose=0)
229 |
230 | # Round predictions to 1s and 0s
231 | y_pred = np.around(y_pred)
232 |
233 | return accuracy_score(y_true, y_pred)
234 |
235 | def visualize_filters(path_to_image,layer_name,filters):
236 | '''
237 |     This function visualizes two filters of a layer; it expects a trained net named 'model' in the module's global scope
238 |
239 | * Args:
240 | * path_to_image: Path to image
241 | * layer_name: name of the layer to visualize
242 |         * filters: list with the two filter indices to visualize
243 |
244 | * Returns:
245 | Generates a visualization
246 | '''
247 | # Get layer index from model.layers
248 | layer_index = [i for i,x in enumerate(model.layers) if x.name == layer_name][0]
249 |
250 | # Function to get the layer output
251 | get_layer_output = K.function([model.layers[0].input, K.learning_phase()],
252 | [model.layers[layer_index].output])
253 |
254 | # Resize picture
255 | Dataset_wrangling.resize_pic(path_to_image,150,150)
256 | X = ndimage.imread(path_to_image.replace('.jpg','_good_shape.jpg'),flatten=True)
257 | y = np.expand_dims(X, axis=0)
258 | os.remove(path_to_image.replace('.jpg','_good_shape.jpg'))
259 |
260 | # Get layer output for the image:
261 | layer_output = get_layer_output([np.expand_dims(y, axis=0), 0])[0]
262 |
263 | filter_1 = filters[0]
264 | filter_2 = filters[1]
265 |
266 | # Generate visualization:
267 | fig = plt.figure()
268 |
269 | ax1 = plt.subplot(131)
270 | ax2 = plt.subplot(132)
271 | ax3 = plt.subplot(133)
272 |
273 | ax1.imshow(mpimg.imread(path_to_image))
274 | ax2.imshow(layer_output[0,filter_1,:,:],cmap='hot',alpha=1)
275 | ax3.imshow(layer_output[0,filter_2,:,:],cmap='hot',alpha=1)
276 | ax1.set_title('Original')
277 | ax2.set_title('Filter: {0}'.format(filter_1))
278 | ax3.set_title('Filter: {0}'.format(filter_2))
279 |
280 | fig.suptitle('Image representation in Layer {0}'.format(layer_name), fontsize=14, fontweight='bold')
281 |
282 | plt.tight_layout()
283 |
284 |
285 |
286 |
287 | if __name__ == '__main__':
288 | print "Models module loaded..."
289 |
290 | results = pandas.DataFrame(columns=['model_name','time_to_train','accuracy'])
291 | dataset_path = '/home/rafaelcastillo/MLND/Project5/DeepLearning/Dataset_train_test'
292 | model_path = '/home/rafaelcastillo/MLND/Project5/DeepLearning/models'
293 | optimizer_list = ['adam', 'Adagrad']
294 | pic_dims = [150,150]
295 |     nb_epochs = 100
296 | for layers in [1,2,3,4]:
297 | for optimizer in optimizer_list:
298 |             model,model_name,time_to_train,acc = run_model(optimizer, nb_epochs, model_path, dataset_path,pic_dims,layers)
299 | results.loc[results.shape[0]+1,:] = [model_name,time_to_train,acc]
300 | results.to_csv(dataset_path + "/" + 'Net_results.csv',sep=",",index=False)
301 |
302 |
--------------------------------------------------------------------------------
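`visualize_filters` is defined above but never called in these scripts; below is a hypothetical, notebook-style usage sketch. It leans on the function's implicit global `model`, so a net is rebuilt and the saved weights loaded first; the weights file name follows the `Layers_{n}_{optimizer}` convention used by `run_model`, while the image path and filter indices are made-up examples.

```python
# Sketch only: inspect two conv1 filters of a trained net on one picture.
import models
import configuration

# Rebuild the 4-layer architecture and load the weights saved by run_model:
models.model = models.build_model('adam',
                                  [configuration.model_width, configuration.model_height],
                                  4)
models.model.load_weights(configuration.model_path + '/Layers_4_adam.h5')

# Hypothetical trailer-truck image; filters 0 and 12 of layer 'conv1':
models.visualize_filters('/home/rafaelcastillo/MLND/Project5/DeepLearning/Dataset/n04467665/n04467665_0.jpg',
                         'conv1', [0, 12])
```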