├── README.md
├── Trash-classification.ipynb
└── Trash-classification.py

/README.md:
--------------------------------------------------------------------------------
# Trash classification

I have developed a Deep Convolutional Neural Network to classify trash images, using Keras as the deep learning library with TensorFlow as the backend. The system is designed to distinguish glass from plastic and can be used as the engine of a smart sorting machine. The pipeline of the model is as follows:

- First step: the imports

- Second step: data collection

- Third step: data preparation

- Fourth step: data normalization

- Fifth step: data transformation and augmentation

- Sixth step: defining the model

- Seventh step: compiling the model, which turns it into a computational graph

- Eighth step: training the model

- Last step: model evaluation

After applying all the above steps, we obtained a test accuracy of 0.91 and a test loss of 0.28.

# Dataset

- The original dataset has six classes (glass, paper, cardboard, plastic, metal, and trash) and consists of 2527 images. In this project, I used only the 501 glass images and the 482 plastic images.

- The modified dataset can be downloaded from https://bit.ly/3mcb3aS

# Usage

- Download the dataset

- Download the source code files 'Trash-classification'

- Install Anaconda, then launch Spyder

- Paste the source code

- Train the model (see the data-preparation sketch below for the `.npy` files the script expects)

- Test the model
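# Data preparation sketch

The training script loads the data from two NumPy arrays, `all_images_array.npy` and `all_labels.npy`, which are not produced by the script itself. Below is a minimal sketch of how they could be built from the downloaded images; the folder layout (`dataset/glass`, `dataset/plastic`), the use of Pillow, and the `.jpg` extension are assumptions, while the label encoding (0 = glass, 1 = plastic) matches the `test` function in the script.

```python
# Sketch: build the .npy arrays that Trash-classification.py expects.
# Assumed layout: dataset/glass/*.jpg and dataset/plastic/*.jpg (not part of this repo).
import os
from glob import glob

import numpy as np
from PIL import Image

classes = {'glass': 0, 'plastic': 1}   # assumed encoding, consistent with test(): 1 => plastic
images, labels = [], []
for name, label in classes.items():
    for path in sorted(glob(os.path.join('dataset', name, '*.jpg'))):
        img = Image.open(path).convert('RGB').resize((512, 384))  # (width, height) = (img_cols, img_rows)
        images.append(np.asarray(img, dtype=np.uint8))
        labels.append(label)

np.save('all_images_array.npy', np.stack(images))
np.save('all_labels.npy', np.array(labels))
```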
# Issues

If you encounter any issue or have feedback, please don't hesitate to [raise an issue](https://github.com/MostefaBen/Trash-classification/issues/new).

# Author

This project has been developed by [Mostefa Ben Naceur](https://fr.linkedin.com/in/mostefabennaceurphd)
--------------------------------------------------------------------------------
/Trash-classification.py:
--------------------------------------------------------------------------------

# coding: utf-8

# # Building a Smart system based on Deep Convolutional Neural Networks to classify Trash

# In[1]:


import keras
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten
from keras.applications import VGG16
from keras import models
from keras.optimizers import Adagrad
from keras.preprocessing.image import ImageDataGenerator
from keras.callbacks import EarlyStopping
import numpy as np
from glob import glob
import os
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
# for reproducibility
np.random.seed(78)


# In[ ]:


# Input image dimensions
img_rows, img_cols, img_chans = 384, 512, 3
input_shape = (img_rows, img_cols, img_chans)
batch_size = 8
num_classes = 2
epochs = 2000
data_augmentation = True


# In[2]:


def train(x_train, x_test, y_train, y_test):

    # Loading the pre-trained VGG16 convolutional base
    vgg_conv = VGG16(weights='imagenet', include_top=False, input_shape=input_shape)

    # Keep the base only up to 'block3_pool', i.e. drop the last 8 layers
    # (popping entries from vgg_conv.layers does not reliably remove them from the graph)
    vgg_conv = models.Model(inputs=vgg_conv.input, outputs=vgg_conv.get_layer('block3_pool').output)

    # Freezing all layers of the convolutional base
    for layer in vgg_conv.layers:
        layer.trainable = False

    # Building the deep learning model
    model = models.Sequential()

    # Adding the truncated VGG16 base
    model.add(vgg_conv)

    # Adding new, trainable classification layers
    model.add(Flatten())
    model.add(Dense(350, activation='relu'))
    model.add(Dropout(0.2))
    model.add(Dense(350, activation='relu'))
    model.add(Dropout(0.2))
    model.add(Dense(num_classes, activation='sigmoid'))

    model.compile(loss='binary_crossentropy', optimizer=Adagrad(lr=1e-5, decay=1e-6), metrics=['accuracy'])
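    # Added note: with one-hot labels, the two sigmoid units trained with
    # binary_crossentropy each act as an independent binary output; a softmax
    # output with categorical_crossentropy is the more conventional choice for
    # two mutually exclusive classes, but the original setup is kept here.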
    """
    # Optionally resume from the best previously saved weights
    files = glob('Model2**')
    print(files)
    list_models = []
    for model_ in files:
        list_models.append(float(model_[:-5].split('=')[1]))

    index = np.argmin(list_models)
    load_model = files[index]
    print(load_model)

    if load_model is not None:
        model.load_weights(load_model)
        print("weights are loaded")
    else:
        print("weights are None")
    """

    call = [
        EarlyStopping(monitor='val_loss', patience=20, verbose=1, mode='auto'),
    ]

    if not data_augmentation:
        print('Not using data augmentation.')
        history = model.fit(x_train, y_train,
                            batch_size=batch_size,
                            epochs=epochs,
                            validation_data=(x_test, y_test),
                            shuffle=True,
                            callbacks=call)
    else:
        print('Using real-time data augmentation.')
        # This will do preprocessing and real-time data augmentation:
        datagen = ImageDataGenerator(
            featurewise_center=False,             # set input mean to 0 over the dataset
            samplewise_center=False,              # set each sample mean to 0
            featurewise_std_normalization=False,  # divide inputs by std of the dataset
            samplewise_std_normalization=False,   # divide each input by its std
            zca_whitening=False,                  # apply ZCA whitening
            zca_epsilon=1e-06,                    # epsilon for ZCA whitening
            rotation_range=30,                    # randomly rotate images (degrees), tuned up from 0
            width_shift_range=0.1,                # randomly shift images horizontally (fraction of total width)
            height_shift_range=0.1,               # randomly shift images vertically (fraction of total height)
            shear_range=0.2,                      # range for random shear, tuned from 0 to 0.2
            zoom_range=0.3,                       # range for random zoom, tuned from 0 to 0.3
            channel_shift_range=0.2,              # range for random channel shifts, tuned from 0 to 0.2
            fill_mode='nearest',                  # mode for filling points outside the input boundaries
            cval=0.,                              # value used for fill_mode = "constant"
            horizontal_flip=True,                 # randomly flip images horizontally
            vertical_flip=True,                   # randomly flip images vertically, changed from False
            rescale=None,                         # rescaling factor (applied before any other transformation)
            preprocessing_function=None,          # function applied to each input
            data_format=None,                     # image data format, "channels_first" or "channels_last"
            validation_split=0.0)                 # fraction of images reserved for validation

        print("steps_per_epoch (number of batches per epoch):", int(len(x_train) / batch_size))
        # Fit the model on the batches generated by datagen.flow().
        history = model.fit_generator(datagen.flow(x_train, y_train, batch_size=batch_size),
                                      steps_per_epoch=800,
                                      epochs=epochs,
                                      validation_data=(x_test, y_test),
                                      workers=10,
                                      callbacks=call)

    weights = '{}.hdf5'.format('Model3_adagrad_' + 'val_acc:' + str(round(history.history['val_acc'][-1], 3))
                               + ' val_loss=' + str(round(history.history['val_loss'][-1], 3)))
    model.save_weights(weights)
    print('Model saved.')

    score = model.evaluate(x_test, y_test, batch_size=10, verbose=0)
    print('Test loss:', score[0])
    print('Test accuracy:', score[1])

    acc = history.history['acc']
    val_acc = history.history['val_acc']
    loss = history.history['loss']
    val_loss = history.history['val_loss']

    epoch = range(len(acc))

    plt.plot(epoch, acc, 'b', label='Training acc')
    plt.plot(epoch, val_acc, 'r', label='Validation acc')
    plt.title('Training and validation accuracy')
    plt.legend()
    plt.figure()

    plt.plot(epoch, loss, 'b', label='Training loss')
    plt.plot(epoch, val_loss, 'r', label='Validation loss')
    plt.title('Training and validation loss')
    plt.legend()
    plt.show()

    return model


# In[ ]:


def test(model, x_test):

    # Normalize one test image and add a batch dimension for prediction
    image = np.expand_dims((x_test[58] - np.mean(x_test)) / np.std(x_test), axis=0)

    plt.imshow(x_test[58])
    plt.show()

    out = model.predict(image)
    out = np.argmax(out)

    if out == 1:
        label = 'plastic'
    else:
        label = 'glass'

    return out, label
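
# In[ ]:


# Added sketch (not part of the original script): classify a single image file
# with a trained model. The image path, the use of Pillow, and passing the
# training-set mean/std for normalization are assumptions for illustration.
def predict_image(model, image_path, mean, std):
    from PIL import Image
    # Resize to the expected input size; PIL expects (width, height) = (img_cols, img_rows)
    img = np.asarray(Image.open(image_path).convert('RGB').resize((img_cols, img_rows)), dtype='float32')
    img = (img - mean) / std                               # same normalization as the training data
    out = np.argmax(model.predict(np.expand_dims(img, axis=0)))
    return out, ('plastic' if out == 1 else 'glass')       # 1 => plastic, 0 => glass, as in test()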

# In[ ]:


if __name__ == "__main__":

    # Load all images
    all_images_array = np.load('all_images_array.npy')

    # Load the class labels
    all_labels = np.load('all_labels.npy')

    # Split the dataset into train and test sets (70% / 30% respectively)
    x_train, x_test, y_train, y_test = train_test_split(all_images_array, all_labels, test_size=0.30, shuffle=True, random_state=78)

    # Data normalization to bring the features to the same scale
    x_train = x_train.astype('float32')
    x_test = x_test.astype('float32')
    #x_train /= 255
    #x_test /= 255

    x_train = (x_train - np.mean(x_train)) / np.std(x_train)
    x_test = (x_test - np.mean(x_test)) / np.std(x_test)

    print('x_train shape:', x_train.shape)
    print(x_train.shape[0], 'train samples')
    print(x_test.shape[0], 'test samples')

    # Convert class vectors to one-hot encoding
    y_train = keras.utils.to_categorical(y_train, num_classes)
    y_test = keras.utils.to_categorical(y_test, num_classes)

    model = train(x_train, x_test, y_train, y_test)

    prediction, label = test(model, x_test)

    print('The prediction of this object is:', prediction, '=>', label)


# In[ ]:


# Experiment notes
# Workers = 8
# steps_per_epoch 100:
# Test loss: 0.34983771067049546
# Test accuracy: 0.8220338942640919

# steps_per_epoch 200: stopped at epoch 72 (continued training)
# Test loss: 0.28055439813662386
# Test accuracy: 0.8932203276682709

# steps_per_epoch 400: stopped at epoch 12 (continued training)
# Test loss: 0.2784507023328442
# Test accuracy: 0.8949152453471039

# steps_per_epoch 800: stopped at epoch 9 (continued training)
# Test loss: 0.2843580770669347
# Test accuracy: 0.9050847372766269

# ==========================================
# Workers = 32
# steps_per_epoch 100: stopped at epoch 10 (continued training)
# Test loss: 0.2690500178200714
# Test accuracy: 0.9033898216182903

# steps_per_epoch 200: stopped at epoch 18 (continued training)
# Test loss: 0.2979729557252031
# Test accuracy: 0.9050847372766269

# steps_per_epoch 400: stopped at epoch 10 (continued training)
# Test loss: 0.2838988147234007
# Test accuracy: 0.9067796549554599

# steps_per_epoch 400: stopped at epoch 6 (continued training)
# Test loss: 0.2930123057468968
# Test accuracy: 0.9101694842516366

# Using workers = 32 with steps_per_epoch = 800 overwhelms the memory
--------------------------------------------------------------------------------