├── .gitattributes
├── Code
│   ├── image_recognition.py
│   └── my_image_recognition.py
├── Images
│   ├── My_images
│   │   ├── my_image_1.jpg
│   │   ├── my_image_2.jpg
│   │   ├── my_image_3.jpg
│   │   ├── my_image_4.jpg
│   │   ├── my_image_5.jpg
│   │   └── my_image_6.jpg
│   └── Outputs
│       ├── my_image_1_prediction.png
│       ├── my_image_2_prediction.png
│       ├── my_image_3_prediction.png
│       ├── my_image_4_prediction.png
│       ├── my_image_5_prediction.png
│       └── my_image_6_prediction.png
├── LICENSE
├── Models
│   ├── my_cifar10_model.h5
│   └── my_cifar10_model2_augmented.h5
└── README.md

/.gitattributes:
--------------------------------------------------------------------------------
# Auto detect text files and perform LF normalization
* text=auto
--------------------------------------------------------------------------------
/Code/image_recognition.py:
--------------------------------------------------------------------------------
# -*- coding: utf-8 -*-
"""
//////////////////////////////////////////////////////////////////////////////////////////
// Original author(s): https://medium.com/intuitive-deep-learning/build-your-first-convolutional-neural-network-to-recognize-images-84b9c78fe0ce
// Modified by: Aritz Lizoain
// Github: https://github.com/aritzLizoain
// My personal website: https://aritzlizoain.github.io/
// Description: Image recognition with Keras (CIFAR-10 standard dataset)
// Copyright 2020, Aritz Lizoain.
// License: MIT License
//////////////////////////////////////////////////////////////////////////////////////////

1) Data processing: one-hot encoding and scaling
2) Building and compiling the CNN
3) Training the model
4) Model training process evaluation
5) Evaluation of the model
6) Saving the trained model
"""

from keras.datasets import cifar10 # CIFAR-10 dataset
import matplotlib.pyplot as plt
import keras
from keras.models import Sequential #to set the CNN architecture
from keras.layers import Dense, Dropout, Flatten, Conv2D,\
    MaxPooling2D # NN layers
import pickle #to save the datasets
import random

"""
1) Data processing
*One-hot encode the labels with keras.utils.to_categorical
*Scale image pixel values: change the data type and divide by 255
"""

#Loading the dataset: CIFAR-10
(x_train, y_train), (x_test, y_test) = cifar10.load_data()
#50000 training and 10000 testing samples.
#Training images: 32 pixels in height, 32 pixels in width, 3 channels (RGB) in depth
#Labels: 1 number (corresponding to the category) for each image.

#Visualize a random image
random_index = random.randint(0, len(x_train)-1) #-1 keeps the index in range
img=plt.imshow(x_train[random_index])
print('The label of this category is: ',y_train[random_index])
#0 airplane, 1 automobile, 2 bird, 3 cat, 4 deer, 5 dog, 6 frog,
#7 horse, 8 ship, 9 truck

#One-hot encoding conversion with Keras
y_train_one_hot=keras.utils.to_categorical(y_train,10)
y_test_one_hot=keras.utils.to_categorical(y_test,10)
print('The one-hot label of the random image is:', y_train_one_hot[random_index])

#Pixel values take values between 0 and 255 (RGB scale) -> make them between 0 and 1
x_train=x_train.astype('float32') #convert the type to float32, which is
#a datatype that can store values with decimal points, then divide by 255
x_test=x_test.astype('float32')
x_train=x_train/255
x_test=x_test/255

#I will save x_test and y_test in order to test the model in the other file (my_image_recognition.py)
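# --- Added note (not in the original script): the repository tree above has no
# Data/ folder, so the pickle.dump() calls below would raise FileNotFoundError on
# a fresh clone. Creating the folder first avoids that; the path simply matches
# the relative path used below, and this import would ideally live at the top.
import os
os.makedirs("Data", exist_ok=True) #does nothing if the folder already exists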
pickle.dump(x_test, open("Data/x_test.dat", "wb"))
pickle.dump(y_test_one_hot, open("Data/y_test.dat", "wb"))
#BE CAREFUL WITH THE PATH AND WITH OVERWRITING DATA

"""
2) Building and compiling the CNN
*Defining the CNN architecture with the Keras Sequential model
*Compiling the model
"""

#ARCHITECTURE: (ConvX2, Max Pool, Dropout)X2, FC, Dropout, FC, Softmax

model=Sequential() #create an empty sequential model and then add layers

#Layer 1: conv layer, filter size 3X3, stride size 1 in both dimensions,
#depth 32. 'same' padding and ReLU activation are used in all conv layers.
#We use ReLU activation for all layers except the last one.
#Remember that ReLU maps negative values to zero (no negative activations here).
#The stride is not specified, so the default setting (1) is used.
#The input shape needs to be specified here, but not for the following layers.
model.add(Conv2D(32,(3,3), activation='relu', padding='same',\
    input_shape=(32,32,3)))

#Layer 2: conv layer, filter size 3X3, stride size 1 in both dimensions,
#depth 32. A padding of 1 would be needed to keep the same width and height;
#'same' padding (i.e. zero padding) does exactly that for all the conv layers.
model.add(Conv2D(32,(3,3), activation='relu', padding='same'))

#Layer 3: max pooling layer, pool size 2X2, stride 2 in both dimensions
#(the max pooling stride defaults to the pool size)
model.add(MaxPooling2D(pool_size=(2,2)))

#Layer 4: dropout layer with a 25% dropout probability, to prevent overfitting
model.add(Dropout(0.25))

#Layers 5-8: same as layers 1-4, but the conv depth is 64 instead of 32
model.add(Conv2D(64,(3,3), activation='relu', padding='same'))
model.add(Conv2D(64,(3,3), activation='relu', padding='same'))
model.add(MaxPooling2D(pool_size=(2,2)))
model.add(Dropout(0.25))

#Layer 9: FC (Fully-Connected) layer. The neurons are no longer in one row,
#but spatially arranged in a cube-like format. We need to flatten them into
#one row: a Flatten layer.
model.add(Flatten()) #now one row
model.add(Dense(512,activation='relu')) #dense (FC) layer

#Layer 10: another dropout, with a 50% probability
model.add(Dropout(0.5))

#Layers 11-12: another dense (FC) layer with 10 neurons and softmax activation.
#The last layer, softmax, only transforms the output of the previous layer
#into a probability distribution, which is the final goal.
model.add(Dense(10,activation='softmax'))

#end of architecture
model.summary()

#COMPILING THE MODEL: model.compile
#First we need to specify which algorithm to use for the optimization, which loss
#function to use, and which other metrics to track apart from the loss function.
#optimizer: adam. Adds some tweaks to stochastic gradient
#descent so that it reaches a lower loss faster.
#loss: categorical crossentropy = the loss function for classification.
#metrics: accuracy = we want to track accuracy on top of the loss function
model.compile(loss='categorical_crossentropy', optimizer='adam',\
    metrics=['accuracy'])
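# --- Optional suggestion (not in the original script): since training for 20 epochs
# is time-consuming, a ModelCheckpoint callback can keep the best weights seen so
# far. The file name here is an arbitrary example; uncomment and pass
# callbacks=[checkpoint] to model.fit below to use it.
# from keras.callbacks import ModelCheckpoint
# checkpoint = ModelCheckpoint('Models/checkpoint_best.h5', monitor='val_loss',
#                              save_best_only=True)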
"""
3) Training the model
(highly time-consuming)
"""

#We are fitting the parameters to the data. We specify the data we are
#training on, then the mini-batch size and how long we want to train for.
#Last, we specify the validation data, which tells us how we are doing on
#unseen data at each point. We didn't split it beforehand; here we specify
#how much of the dataset will be used as a validation set: in this case, 20%.
hist=model.fit(x_train,y_train_one_hot, batch_size=32,epochs=20,\
    validation_split=0.2)

"""
4) Model training process evaluation
If the improvements of the model on the training set are matched by
improvements on the validation set, overfitting does not seem to be a big
problem for this model.
"""

# LOSS
plt.plot(hist.history['loss']) #variable 1 to plot
plt.plot(hist.history['val_loss']) #variable 2 to plot
plt.title('Model loss') #title
plt.ylabel('Loss') #label y
plt.xlabel('Epoch') #label x
plt.legend(['Training', 'Validation'], loc='upper right') #legend
plt.show() #display the graph

# ACCURACY
plt.plot(hist.history['accuracy']) #variable 1 to plot
plt.plot(hist.history['val_accuracy']) #variable 2 to plot
plt.title('Model accuracy') #title
plt.ylabel('Accuracy') #label y
plt.xlabel('Epoch') #label x
plt.legend(['Training', 'Validation'], loc='lower right') #legend
plt.show() #display the graph

"""
5) Evaluation of the model (on the test set)
"""

print('The accuracy of the model on the test set is: ',\
    model.evaluate(x_test,y_test_one_hot)[1]*100,'%')
# The accuracy of the model on the test set is: 77.25%

"""
6) Saving the trained model
The model will be saved in the HDF5 file format.
In order to load it, run:
from keras.models import load_model
model = load_model('my_cifar10_model.h5')
"""

model.save('model_name.h5') #be careful, don't overwrite an existing model
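# --- Data augmentation (code not included in this repository) -------------------
# The repository also ships Models/my_cifar10_model2_augmented.h5 (78.04% test
# accuracy), trained with data augmentation, but the augmentation code itself is
# not part of the repository. Below is only a minimal sketch using Keras'
# ImageDataGenerator; every transformation and parameter value is an illustrative
# assumption, not the author's actual setup.
# from keras.preprocessing.image import ImageDataGenerator
# datagen = ImageDataGenerator(rotation_range=15, width_shift_range=0.1,
#                              height_shift_range=0.1, horizontal_flip=True)
# hist_augmented = model.fit_generator(
#     datagen.flow(x_train, y_train_one_hot, batch_size=32),
#     steps_per_epoch=len(x_train)//32, epochs=20)
# #(recent Keras/TensorFlow versions accept the generator directly in model.fit)
# model.save('my_cifar10_model2_augmented.h5') #same warning as above: don't overwrite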
--------------------------------------------------------------------------------
/Code/my_image_recognition.py:
--------------------------------------------------------------------------------
# -*- coding: utf-8 -*-
"""
//////////////////////////////////////////////////////////////////////////////////////////
// Original author(s): https://medium.com/intuitive-deep-learning/build-your-first-convolutional-neural-network-to-recognize-images-84b9c78fe0ce
// Modified by: Aritz Lizoain
// Github: https://github.com/aritzLizoain
// My personal website: https://aritzlizoain.github.io/
// Description: Image recognition with Keras (CIFAR-10 standard dataset)
// Copyright 2020, Aritz Lizoain.
// License: MIT License
//////////////////////////////////////////////////////////////////////////////////////////

Two trained models:
*my_cifar10_model : original model. 77.25% accuracy on the test set.
*my_cifar10_model2_augmented : model trained after applying data augmentation.
78.04% accuracy on the test set.

1) Loading a trained model
2) Predicting on the test set
3) Evaluation of the predictions
4) Predicting on OWN IMAGES
"""

from keras.models import load_model #loading the model
from skimage.transform import resize #resize image
import numpy as np
import matplotlib.pyplot as plt
import pickle #data loading
from sklearn.metrics import classification_report #classification report
import random

"""
1) Loading a trained model
"""

#model = load_model('Models/my_cifar10_model.h5') # original model
model = load_model('Models/my_cifar10_model2_augmented.h5') # +data augmentation model

"""
2) Predicting on the test set
"""

#Load the test set
x_test=pickle.load(open("Data/x_test.dat","rb"))
y_test=pickle.load(open("Data/y_test.dat","rb"))

y_test_label=np.argmax(np.round(y_test),axis=1)

number_to_class = ['airplane', 'automobile', 'bird', 'cat', 'deer', 'dog',\
    'frog', 'horse', 'ship', 'truck']

#Predict on the test set
predicted_classes=model.predict(x_test)
#finding the position (=label) of the prediction and of the solution
predicted_classes_label=np.argmax(np.round(predicted_classes),axis=1)

"""
3) Evaluation of the predictions
"""

#Counting correct answers
correct=np.where(predicted_classes_label==y_test_label)[0] #both need to have the same shape
print("Found", len(correct), "correct classes")
#Counting incorrect answers
incorrect=np.where(predicted_classes_label!=y_test_label)[0] #both need to have the same shape
print("Found", len(incorrect), "incorrect classes")
#Visualizing a random incorrect prediction
random_index = random.randint(0, len(incorrect)-1) #-1 keeps the index in range
plt.subplot(2,2,1)
plt.imshow(x_test[incorrect[random_index]].reshape(32,32,3),cmap='gray',interpolation='none')
plt.title('Predicted '+str(number_to_class[predicted_classes_label[incorrect[random_index]]])+\
    ', correct '+str(number_to_class[y_test_label[incorrect[random_index]]]))
plt.tight_layout()

#Classification report: sklearn.metrics.classification_report
#It helps identify the misclassified classes in more detail, showing for which
#classes the model performs better or worse.
target_names = [number_to_class[i] for i in range(10)]
print(classification_report(y_test_label, predicted_classes_label,\
    target_names=target_names))
#Recall: out of all the elements that truly belong to a class, how many are found
#Precision: out of everything predicted as a class, how many actually belong to it
#F1-score: the harmonic mean of precision and recall; especially informative on imbalanced sets
#Support: the number of occurrences of the given class in the dataset
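# --- Optional addition (not in the original script): a confusion matrix complements
# the classification report by showing WHICH classes get mixed up with which
# (e.g. cats vs. dogs). Minimal sketch, using only variables already defined above;
# the variable name conf_matrix is arbitrary.
from sklearn.metrics import confusion_matrix
conf_matrix = confusion_matrix(y_test_label, predicted_classes_label)
plt.figure(figsize=(6,6))
plt.imshow(conf_matrix, cmap='Blues') #rows: true class, columns: predicted class
plt.xticks(range(10), number_to_class, rotation=90)
plt.yticks(range(10), number_to_class)
plt.xlabel('Predicted class')
plt.ylabel('True class')
plt.colorbar()
plt.show()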
"""
4) Predicting on OWN IMAGES
*Prepare the image to recognize
*Predict on the image
*Analyze the prediction
"""

# PREPARE THE IMAGE TO RECOGNIZE
#reading the file as an array of pixel values
#the image will be resized to (32,32,3)
#if the original image is not square, the aspect ratio is lost when resizing
#and recognition is more likely to fail
my_image=plt.imread("Images/My_images/my_image_1.jpg") #path as in the repository tree
#resize the image to fit the model
#model input size: 32*32*3
my_image_resized=resize(my_image, (32,32,3)) #already scales values to 0-1
#visualize the image
img=plt.imshow(my_image) #only one of the two can be shown
#img_resized=plt.imshow(my_image_resized)

# PREDICT ON THE IMAGE
probabilities=model.predict(np.array([my_image_resized]))
#np.array changes the array of pixel values into a 4D array,
#because model.predict expects a 4D array (3D + the training-example dimension).
#The training and test sets were already consistent with this.
#The 10 output neurons correspond to a probability distribution over the classes.

# ANALYZE THE PREDICTION
index = np.argsort(probabilities[0,:])
print("Most likely class:", number_to_class[index[9]], "-- Probability:",\
    probabilities[0,index[9]])
print("Second most likely class:", number_to_class[index[8]], "-- Probability:",\
    probabilities[0,index[8]])
print("Third most likely class:", number_to_class[index[7]], "-- Probability:",\
    probabilities[0,index[7]])
print("Fourth most likely class:", number_to_class[index[6]], "-- Probability:",\
    probabilities[0,index[6]])
print("Fifth most likely class:", number_to_class[index[5]], "-- Probability:",\
    probabilities[0,index[5]])
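# --- Optional addition (not in the original script): the figures stored in
# Images/Outputs/my_image_N_prediction.png were presumably produced with code
# similar to the sketch below, but the exact plotting code is not included in the
# repository, so treat this as an illustration rather than the author's method.
for i in range(1, 7):
    image = plt.imread("Images/My_images/my_image_{}.jpg".format(i))
    image_resized = resize(image, (32,32,3))
    class_probabilities = model.predict(np.array([image_resized]))[0]
    fig, (ax_img, ax_bar) = plt.subplots(1, 2, figsize=(9,4))
    ax_img.imshow(image) #the original image
    ax_img.axis('off')
    ax_bar.barh(range(10), class_probabilities) #probability of each CIFAR-10 class
    ax_bar.set_yticks(range(10))
    ax_bar.set_yticklabels(number_to_class)
    ax_bar.set_xlabel('Probability')
    fig.tight_layout()
    #fig.savefig("Images/Outputs/my_image_{}_prediction.png".format(i)) #careful: would overwrite the existing outputs
    plt.show()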
--------------------------------------------------------------------------------
/Images/My_images/my_image_1.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/aritzLizoain/Image-Classification/0ed6b901ef12417d0ba7b76c648cafac30dffd08/Images/My_images/my_image_1.jpg
--------------------------------------------------------------------------------
/Images/My_images/my_image_2.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/aritzLizoain/Image-Classification/0ed6b901ef12417d0ba7b76c648cafac30dffd08/Images/My_images/my_image_2.jpg
--------------------------------------------------------------------------------
/Images/My_images/my_image_3.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/aritzLizoain/Image-Classification/0ed6b901ef12417d0ba7b76c648cafac30dffd08/Images/My_images/my_image_3.jpg
--------------------------------------------------------------------------------
/Images/My_images/my_image_4.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/aritzLizoain/Image-Classification/0ed6b901ef12417d0ba7b76c648cafac30dffd08/Images/My_images/my_image_4.jpg
--------------------------------------------------------------------------------
/Images/My_images/my_image_5.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/aritzLizoain/Image-Classification/0ed6b901ef12417d0ba7b76c648cafac30dffd08/Images/My_images/my_image_5.jpg
--------------------------------------------------------------------------------
/Images/My_images/my_image_6.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/aritzLizoain/Image-Classification/0ed6b901ef12417d0ba7b76c648cafac30dffd08/Images/My_images/my_image_6.jpg
--------------------------------------------------------------------------------
/Images/Outputs/my_image_1_prediction.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/aritzLizoain/Image-Classification/0ed6b901ef12417d0ba7b76c648cafac30dffd08/Images/Outputs/my_image_1_prediction.png
--------------------------------------------------------------------------------
/Images/Outputs/my_image_2_prediction.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/aritzLizoain/Image-Classification/0ed6b901ef12417d0ba7b76c648cafac30dffd08/Images/Outputs/my_image_2_prediction.png
--------------------------------------------------------------------------------
/Images/Outputs/my_image_3_prediction.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/aritzLizoain/Image-Classification/0ed6b901ef12417d0ba7b76c648cafac30dffd08/Images/Outputs/my_image_3_prediction.png
--------------------------------------------------------------------------------
/Images/Outputs/my_image_4_prediction.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/aritzLizoain/Image-Classification/0ed6b901ef12417d0ba7b76c648cafac30dffd08/Images/Outputs/my_image_4_prediction.png
--------------------------------------------------------------------------------
/Images/Outputs/my_image_5_prediction.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/aritzLizoain/Image-Classification/0ed6b901ef12417d0ba7b76c648cafac30dffd08/Images/Outputs/my_image_5_prediction.png
--------------------------------------------------------------------------------
/Images/Outputs/my_image_6_prediction.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/aritzLizoain/Image-Classification/0ed6b901ef12417d0ba7b76c648cafac30dffd08/Images/Outputs/my_image_6_prediction.png
--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
MIT License

Copyright (c) 2020 aritzLizoain

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
22 | -------------------------------------------------------------------------------- /Models/my_cifar10_model.h5: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aritzLizoain/Image-Classification/0ed6b901ef12417d0ba7b76c648cafac30dffd08/Models/my_cifar10_model.h5 -------------------------------------------------------------------------------- /Models/my_cifar10_model2_augmented.h5: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aritzLizoain/Image-Classification/0ed6b901ef12417d0ba7b76c648cafac30dffd08/Models/my_cifar10_model2_augmented.h5 -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Image Classification 2 | 3 |  4 |  5 |  6 | [](https://github.com/aritzLizoain/Image-Classification) 7 |  8 |  9 | 10 | Image recognition implementation with **Keras**. A **CNN** is built and trained with the **CIFAR-10** dataset. Two models are trained: one without data-augmentation (77.25% accuracy) and the other with data-augmentation (78.04% accuracy). Process: 11 | 12 | ``` image_recognition.py ``` 13 | * Data processing: one-hot encoding and scaling 14 | * Building and training the CNN 15 | * Training the model 16 | * Training process evaluation 17 | * Evaluation of the model 18 | * Saving the trained model 19 | 20 | ``` my_image_recognition.py ``` 21 | * Loading the trained model 22 | * Predicting on the test set 23 | * Evaluation of the predictions 24 | * Predicting on my own images 25 | 26 | Followed [Course](https://medium.com/intuitive-deep-learning/build-your-first-convolutional-neural-network-to-recognize-images-84b9c78fe0ce) 27 | 28 | ## Predicting on my own images 29 | 30 | Some are correct :heavy_check_mark: some are not :x: 31 | 32 |
33 |35 | 36 |![]()
34 |
37 |39 | 40 |![]()
38 |
41 |43 | 44 |![]()
42 |
45 |47 | 48 |![]()
46 |
49 |51 | 52 |![]()
50 |
53 |55 | --------------------------------------------------------------------------------![]()
54 |