├── README.md
├── assets
│   └── feats.jpg
├── cats_n_dogs.ipynb
├── cats_n_dogs_BN.ipynb
├── img_clf.py
└── vgg_bn.py
/README.md:
--------------------------------------------------------------------------------
1 | # Keras Image Classification
2 |
3 | Classifies an image as containing either a dog or a cat (using Kaggle's public dataset), but could easily be extended to other image classification problems.
4 |
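5 | To run these scripts/notebooks, you must have Keras, NumPy, SciPy, and h5py installed; enabling GPU acceleration is highly recommended if that's an option. The code is written against the Keras 1.x API (for example `Convolution2D`, `nb_epoch`, and `samples_per_epoch`) with channels-first image ordering.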
6 |
7 | ## img_clf.py
8 | After some hyperparameter tuning, this script reaches roughly 96-98% accuracy on the validation data and a log loss of around 0.18 on Kaggle's hidden test data.
9 |
10 | Most of the code / strategy here was based on this Keras tutorial.
11 |
12 | Pre-trained VGG16 model weights can be downloaded here.
13 |
14 | The data directory structure I used was:
15 |
16 | * project
17 |   * data
18 |     * train
19 |       * dogs
20 |       * cats
21 |     * validation
22 |       * dogs
23 |       * cats
24 |     * test
25 |       * test
26 |
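The Kaggle download has no validation split, so the layout above has to be created by hand. A minimal sketch of one way to do it follows. It assumes the training images have already been sorted into `train/cats` and `train/dogs`; the function name `split_validation` and its parameters are hypothetical, not part of this repo. The default of 250 images per class matches `nb_validation_samples = 500` in `img_clf.py`.

```python
import os
import random
import shutil


def split_validation(train_dir, valid_dir, classes=("cats", "dogs"),
                     n_valid_per_class=250, seed=0):
    """Move a random sample of each class from train_dir into valid_dir.

    Hypothetical helper: assumes train_dir/<class>/ already exists for
    every class and holds at least n_valid_per_class files.
    """
    rng = random.Random(seed)
    moved = {}
    for cls in classes:
        src = os.path.join(train_dir, cls)
        dst = os.path.join(valid_dir, cls)
        os.makedirs(dst, exist_ok=True)
        # Sort first so the same seed always picks the same files.
        files = sorted(os.listdir(src))
        picked = rng.sample(files, n_valid_per_class)
        for name in picked:
            shutil.move(os.path.join(src, name), os.path.join(dst, name))
        moved[cls] = len(picked)
    return moved
```

Calling `split_validation('data/train', 'data/validation')` once, before training, would leave 24,500 training and 500 validation images if the full 25,000-image training set is in place.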
27 | ## cats_n_dogs.ipynb:
28 | This produced a slightly better score (0.161 log loss on the Kaggle test set). The improvement most likely comes from using larger images and ensembling a few models, even though the notebook uses no image augmentation.
29 |
30 | You might run into memory errors because of the large image dimensions. If so, reducing the number of folds and saving the model weights to disk rather than keeping the models in memory should do the trick. The notebook uses a slightly flatter directory structure, with the validation split happening after the images are loaded:
31 |
32 | * project
33 |   * data
34 |     * train
35 |       * dogs
36 |       * cats
37 |     * test
38 |       * test
39 |
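Because `cats_n_dogs.ipynb` splits the data after loading, the folds are just index lists over the loaded images. The notebook itself is not reproduced here, so the following is only a generic sketch of a k-fold split, not the notebook's code; `kfold_indices` is a hypothetical name.

```python
import random


def kfold_indices(n_samples, n_folds, seed=0):
    """Yield (train_idx, valid_idx) pairs; each sample is validated exactly once."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    # Deal the shuffled indices round-robin into n_folds groups.
    folds = [idx[i::n_folds] for i in range(n_folds)]
    for k in range(n_folds):
        valid = folds[k]
        train = [i for j, fold in enumerate(folds) if j != k for i in fold]
        yield train, valid
```

To keep memory use down, each pass through this loop could train one model, write its weights to disk with Keras's `model.save_weights(path)`, and keep only the path, rather than holding every fold's model in a list.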
40 | ## cats_n_dogs_BN.ipynb:
41 | This produced the best score (0.069 loss without any ensembling). The notebook incorporates some of the techniques from Jeremy Howard's deep learning class, with the inclusion of batch normalization being the biggest factor. I also added augmentation at prediction time (predictions are averaged over several augmented passes of the test set), which greatly improved performance.
42 |
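The prediction step in the notebook averages the outputs of several augmented passes and then clips them to `[0.01, 0.99]` before writing the submission. The sketch below shows why clipping matters for a log loss metric. It is plain Python rather than the notebook's NumPy code, it works on a single "dog" probability per image instead of the two-column softmax output, and all three function names are hypothetical.

```python
import math


def average_predictions(passes):
    """Average per-image probabilities across several augmented passes."""
    n_passes = len(passes)
    return [sum(col) / n_passes for col in zip(*passes)]


def clip(probs, c=0.01):
    """Keep every probability inside [c, 1 - c]."""
    return [min(max(p, c), 1.0 - c) for p in probs]


def log_loss(labels, probs):
    """Mean binary cross-entropy; labels are 0 (cat) or 1 (dog)."""
    total = 0.0
    for y, p in zip(labels, probs):
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(labels)
```

With `c = 0.01`, the penalty for a single confidently wrong image is capped at `-ln(0.01)`, roughly 4.6, whereas an unclipped prediction of exactly 0 or 1 on the wrong class would make the loss undefined. The cost is a small penalty on every correct, fully confident prediction.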
43 | Pre-trained model weights for VGG16 w/ batch normalization can be downloaded here.
44 |
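45 | The `Vgg16BN` class is defined in vgg_bn.py. Note that the notebook's `valid_path` points at `data/valid/`, so either name the validation directory `valid` or adjust that path. The data directory structure used was: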
46 |
47 | * project
48 |   * data
49 |     * train
50 |       * dogs
51 |       * cats
52 |     * validation
53 |       * dogs
54 |       * cats
55 |     * test
56 |       * test
57 |
--------------------------------------------------------------------------------
/assets/feats.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/rdcolema/keras-image-classification/9d4b678d33410a7c848dee85e86be5d624de7046/assets/feats.jpg
--------------------------------------------------------------------------------
/cats_n_dogs_BN.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "markdown",
5 | "metadata": {},
6 | "source": [
7 | "## Image Classification of Dogs vs. Cats Using CNN Ensemble"
8 | ]
9 | },
10 | {
11 | "cell_type": "markdown",
12 | "metadata": {},
13 | "source": [
14 | "Imports & environment"
15 | ]
16 | },
17 | {
18 | "cell_type": "code",
19 | "execution_count": 1,
20 | "metadata": {
21 | "collapsed": false
22 | },
23 | "outputs": [
24 | {
25 | "name": "stderr",
26 | "output_type": "stream",
27 | "text": [
28 | "Using Theano backend.\n",
29 | "Using gpu device 0: GeForce GTX 980M (CNMeM is enabled with initial size: 90.0% of memory, cuDNN 5105)\n",
30 | "/home/robert/anaconda3/lib/python3.5/site-packages/theano/sandbox/cuda/__init__.py:600: UserWarning: Your cuDNN version is more recent than the one Theano officially supports. If you see any problems, try updating Theano or downgrading cuDNN to version 5.\n",
31 | " warnings.warn(warn)\n",
32 | "/home/robert/anaconda3/lib/python3.5/site-packages/matplotlib/font_manager.py:273: UserWarning: Matplotlib is building the font cache using fc-list. This may take a moment.\n",
33 | " warnings.warn('Matplotlib is building the font cache using fc-list. This may take a moment.')\n",
34 | "/home/robert/anaconda3/lib/python3.5/site-packages/matplotlib/font_manager.py:273: UserWarning: Matplotlib is building the font cache using fc-list. This may take a moment.\n",
35 | " warnings.warn('Matplotlib is building the font cache using fc-list. This may take a moment.')\n"
36 | ]
37 | }
38 | ],
39 | "source": [
40 | "import os\n",
41 | "import numpy as np\n",
42 | "\n",
43 | "from glob import glob\n",
44 | "from shutil import copyfile\n",
45 | "from vgg_bn import Vgg16BN\n",
46 | "from keras.callbacks import ModelCheckpoint\n",
47 | "\n",
48 | "ROOT_DIR = os.getcwd()\n",
49 | "DATA_HOME_DIR = ROOT_DIR + '/data'\n",
50 | "%matplotlib inline"
51 | ]
52 | },
53 | {
54 | "cell_type": "markdown",
55 | "metadata": {},
56 | "source": [
57 | "Config & Hyperparameters"
58 | ]
59 | },
60 | {
61 | "cell_type": "code",
62 | "execution_count": 10,
63 | "metadata": {
64 | "collapsed": true
65 | },
66 | "outputs": [],
67 | "source": [
68 | "# paths\n",
69 | "data_path = DATA_HOME_DIR + '/'\n",
70 | "train_path = data_path + 'train/'\n",
71 | "valid_path = data_path + 'valid/'\n",
72 | "test_path = DATA_HOME_DIR + '/test/'\n",
73 | "model_path = ROOT_DIR + '/models/'\n",
74 | "submission_path = ROOT_DIR + '/submissions/'\n",
75 | "\n",
76 | "# data\n",
77 | "img_width, img_height = 224, 224\n",
78 | "batch_size = 64\n",
79 | "nb_train_samples = 23000\n",
80 | "nb_valid_samples = 2000\n",
81 | "nb_test_samples = 12500\n",
82 | "classes = [\"cats\", \"dogs\"]\n",
83 | "n_classes = len(classes)\n",
84 | "\n",
85 | "# model\n",
86 | "nb_epoch = 10\n",
87 | "nb_aug = 5\n",
88 | "lr = 0.001"
89 | ]
90 | },
91 | {
92 | "cell_type": "markdown",
93 | "metadata": {},
94 | "source": [
95 | "Build the VGG model w/ Batch Normalization"
96 | ]
97 | },
98 | {
99 | "cell_type": "code",
100 | "execution_count": 3,
101 | "metadata": {
102 | "collapsed": false,
103 | "scrolled": true
104 | },
105 | "outputs": [
106 | {
107 | "name": "stdout",
108 | "output_type": "stream",
109 | "text": [
110 | "____________________________________________________________________________________________________\n",
111 | "Layer (type) Output Shape Param # Connected to \n",
112 | "====================================================================================================\n",
113 | "lambda_1 (Lambda) (None, 3, 224, 224) 0 lambda_input_1[0][0] \n",
114 | "____________________________________________________________________________________________________\n",
115 | "zeropadding2d_1 (ZeroPadding2D) (None, 3, 226, 226) 0 lambda_1[0][0] \n",
116 | "____________________________________________________________________________________________________\n",
117 | "convolution2d_1 (Convolution2D) (None, 64, 224, 224) 0 zeropadding2d_1[0][0] \n",
118 | "____________________________________________________________________________________________________\n",
119 | "zeropadding2d_2 (ZeroPadding2D) (None, 64, 226, 226) 0 convolution2d_1[0][0] \n",
120 | "____________________________________________________________________________________________________\n",
121 | "convolution2d_2 (Convolution2D) (None, 64, 224, 224) 0 zeropadding2d_2[0][0] \n",
122 | "____________________________________________________________________________________________________\n",
123 | "maxpooling2d_1 (MaxPooling2D) (None, 64, 112, 112) 0 convolution2d_2[0][0] \n",
124 | "____________________________________________________________________________________________________\n",
125 | "zeropadding2d_3 (ZeroPadding2D) (None, 64, 114, 114) 0 maxpooling2d_1[0][0] \n",
126 | "____________________________________________________________________________________________________\n",
127 | "convolution2d_3 (Convolution2D) (None, 128, 112, 112) 0 zeropadding2d_3[0][0] \n",
128 | "____________________________________________________________________________________________________\n",
129 | "zeropadding2d_4 (ZeroPadding2D) (None, 128, 114, 114) 0 convolution2d_3[0][0] \n",
130 | "____________________________________________________________________________________________________\n",
131 | "convolution2d_4 (Convolution2D) (None, 128, 112, 112) 0 zeropadding2d_4[0][0] \n",
132 | "____________________________________________________________________________________________________\n",
133 | "maxpooling2d_2 (MaxPooling2D) (None, 128, 56, 56) 0 convolution2d_4[0][0] \n",
134 | "____________________________________________________________________________________________________\n",
135 | "zeropadding2d_5 (ZeroPadding2D) (None, 128, 58, 58) 0 maxpooling2d_2[0][0] \n",
136 | "____________________________________________________________________________________________________\n",
137 | "convolution2d_5 (Convolution2D) (None, 256, 56, 56) 0 zeropadding2d_5[0][0] \n",
138 | "____________________________________________________________________________________________________\n",
139 | "zeropadding2d_6 (ZeroPadding2D) (None, 256, 58, 58) 0 convolution2d_5[0][0] \n",
140 | "____________________________________________________________________________________________________\n",
141 | "convolution2d_6 (Convolution2D) (None, 256, 56, 56) 0 zeropadding2d_6[0][0] \n",
142 | "____________________________________________________________________________________________________\n",
143 | "zeropadding2d_7 (ZeroPadding2D) (None, 256, 58, 58) 0 convolution2d_6[0][0] \n",
144 | "____________________________________________________________________________________________________\n",
145 | "convolution2d_7 (Convolution2D) (None, 256, 56, 56) 0 zeropadding2d_7[0][0] \n",
146 | "____________________________________________________________________________________________________\n",
147 | "maxpooling2d_3 (MaxPooling2D) (None, 256, 28, 28) 0 convolution2d_7[0][0] \n",
148 | "____________________________________________________________________________________________________\n",
149 | "zeropadding2d_8 (ZeroPadding2D) (None, 256, 30, 30) 0 maxpooling2d_3[0][0] \n",
150 | "____________________________________________________________________________________________________\n",
151 | "convolution2d_8 (Convolution2D) (None, 512, 28, 28) 0 zeropadding2d_8[0][0] \n",
152 | "____________________________________________________________________________________________________\n",
153 | "zeropadding2d_9 (ZeroPadding2D) (None, 512, 30, 30) 0 convolution2d_8[0][0] \n",
154 | "____________________________________________________________________________________________________\n",
155 | "convolution2d_9 (Convolution2D) (None, 512, 28, 28) 0 zeropadding2d_9[0][0] \n",
156 | "____________________________________________________________________________________________________\n",
157 | "zeropadding2d_10 (ZeroPadding2D) (None, 512, 30, 30) 0 convolution2d_9[0][0] \n",
158 | "____________________________________________________________________________________________________\n",
159 | "convolution2d_10 (Convolution2D) (None, 512, 28, 28) 0 zeropadding2d_10[0][0] \n",
160 | "____________________________________________________________________________________________________\n",
161 | "maxpooling2d_4 (MaxPooling2D) (None, 512, 14, 14) 0 convolution2d_10[0][0] \n",
162 | "____________________________________________________________________________________________________\n",
163 | "zeropadding2d_11 (ZeroPadding2D) (None, 512, 16, 16) 0 maxpooling2d_4[0][0] \n",
164 | "____________________________________________________________________________________________________\n",
165 | "convolution2d_11 (Convolution2D) (None, 512, 14, 14) 0 zeropadding2d_11[0][0] \n",
166 | "____________________________________________________________________________________________________\n",
167 | "zeropadding2d_12 (ZeroPadding2D) (None, 512, 16, 16) 0 convolution2d_11[0][0] \n",
168 | "____________________________________________________________________________________________________\n",
169 | "convolution2d_12 (Convolution2D) (None, 512, 14, 14) 0 zeropadding2d_12[0][0] \n",
170 | "____________________________________________________________________________________________________\n",
171 | "zeropadding2d_13 (ZeroPadding2D) (None, 512, 16, 16) 0 convolution2d_12[0][0] \n",
172 | "____________________________________________________________________________________________________\n",
173 | "convolution2d_13 (Convolution2D) (None, 512, 14, 14) 0 zeropadding2d_13[0][0] \n",
174 | "____________________________________________________________________________________________________\n",
175 | "maxpooling2d_5 (MaxPooling2D) (None, 512, 7, 7) 0 convolution2d_13[0][0] \n",
176 | "____________________________________________________________________________________________________\n",
177 | "flatten_1 (Flatten) (None, 25088) 0 maxpooling2d_5[0][0] \n",
178 | "____________________________________________________________________________________________________\n",
179 | "dense_1 (Dense) (None, 4096) 0 flatten_1[0][0] \n",
180 | "____________________________________________________________________________________________________\n",
181 | "batchnormalization_1 (BatchNormal(None, 4096) 0 dense_1[0][0] \n",
182 | "____________________________________________________________________________________________________\n",
183 | "dropout_1 (Dropout) (None, 4096) 0 batchnormalization_1[0][0] \n",
184 | "____________________________________________________________________________________________________\n",
185 | "dense_2 (Dense) (None, 4096) 0 dropout_1[0][0] \n",
186 | "____________________________________________________________________________________________________\n",
187 | "batchnormalization_2 (BatchNormal(None, 4096) 0 dense_2[0][0] \n",
188 | "____________________________________________________________________________________________________\n",
189 | "dropout_2 (Dropout) (None, 4096) 0 batchnormalization_2[0][0] \n",
190 | "____________________________________________________________________________________________________\n",
191 | "dense_4 (Dense) (None, 2) 8194 dropout_2[0][0] \n",
192 | "====================================================================================================\n",
193 | "Total params: 8194\n",
194 | "____________________________________________________________________________________________________\n"
195 | ]
196 | }
197 | ],
198 | "source": [
199 | "vgg = Vgg16BN(size=(img_width, img_height), n_classes=n_classes, batch_size=batch_size, lr=lr)\n",
200 | "model = vgg.model\n",
201 | "\n",
202 | "model.summary()"
203 | ]
204 | },
205 | {
206 | "cell_type": "code",
207 | "execution_count": 12,
208 | "metadata": {
209 | "collapsed": true
210 | },
211 | "outputs": [],
212 | "source": [
213 | "info_string = \"{0}x{1}_{2}epoch_{3}aug_{4}lr_vgg16-bn\".format(img_width, img_height, nb_epoch, nb_aug, lr)\n",
214 | "ckpt_fn = model_path + '{val_loss:.2f}-loss_' + info_string + '.h5'\n",
215 | "\n",
216 | "ckpt = ModelCheckpoint(filepath=ckpt_fn,\n",
217 | "                       monitor='val_loss',\n",
218 | "                       save_best_only=True,\n",
219 | "                       save_weights_only=True)"
220 | ]
221 | },
222 | {
223 | "cell_type": "markdown",
224 | "metadata": {},
225 | "source": [
226 | "Train the Model"
227 | ]
228 | },
229 | {
230 | "cell_type": "code",
231 | "execution_count": 13,
232 | "metadata": {
233 | "collapsed": false
234 | },
235 | "outputs": [],
236 | "source": [
237 | "vgg.fit(train_path, valid_path,\n",
238 | "        nb_trn_samples=nb_train_samples,\n",
239 | "        nb_val_samples=nb_valid_samples,\n",
240 | "        nb_epoch=nb_epoch,\n",
241 | "        callbacks=[ckpt],\n",
242 | "        aug=nb_aug)"
243 | ]
244 | },
245 | {
246 | "cell_type": "markdown",
247 | "metadata": {},
248 | "source": [
249 | "Predict on Test Data"
250 | ]
251 | },
252 | {
253 | "cell_type": "code",
254 | "execution_count": 11,
255 | "metadata": {
256 | "collapsed": false
257 | },
258 | "outputs": [
259 | {
260 | "name": "stdout",
261 | "output_type": "stream",
262 | "text": [
263 | "Generating predictions for Augmentation... 0\n",
264 | "Found 12500 images belonging to 1 classes.\n",
265 | "Generating predictions for Augmentation... 1\n",
266 | "Found 12500 images belonging to 1 classes.\n",
267 | "Generating predictions for Augmentation... 2\n",
268 | "Found 12500 images belonging to 1 classes.\n",
269 | "Generating predictions for Augmentation... 3\n",
270 | "Found 12500 images belonging to 1 classes.\n",
271 | "Generating predictions for Augmentation... 4\n",
272 | "Found 12500 images belonging to 1 classes.\n",
273 | "Averaging Predictions Across Augmentations...\n"
274 | ]
275 | }
276 | ],
277 | "source": [
278 | "# generate predictions\n",
279 | "for aug in range(nb_aug):\n",
280 | "    print(\"Generating predictions for Augmentation {0}...\".format(aug + 1))\n",
281 | "    if aug == 0:\n",
282 | "        predictions, filenames = vgg.test(test_path, nb_test_samples, aug=nb_aug)\n",
283 | "    else:\n",
284 | "        aug_pred, filenames = vgg.test(test_path, nb_test_samples, aug=nb_aug)\n",
285 | "        predictions += aug_pred\n",
286 | "\n",
287 | "print(\"Averaging Predictions Across Augmentations...\")\n",
288 | "predictions /= nb_aug"
289 | ]
290 | },
291 | {
292 | "cell_type": "code",
293 | "execution_count": 14,
294 | "metadata": {
295 | "collapsed": false
296 | },
297 | "outputs": [],
298 | "source": [
299 | "# clip predictions\n",
300 | "c = 0.01\n",
301 | "preds = np.clip(predictions, c, 1-c)"
302 | ]
303 | },
304 | {
305 | "cell_type": "code",
306 | "execution_count": 15,
307 | "metadata": {
308 | "collapsed": false
309 | },
310 | "outputs": [
311 | {
312 | "name": "stdout",
313 | "output_type": "stream",
314 | "text": [
315 | "Writing Predictions to CSV...\n",
316 | "0 / 12500\n",
317 | "2500 / 12500\n",
318 | "5000 / 12500\n",
319 | "7500 / 12500\n",
320 | "10000 / 12500\n",
321 | "Done.\n"
322 | ]
323 | }
324 | ],
325 | "source": [
326 | "sub_file = submission_path + info_string + '.csv'\n",
327 | "\n",
328 | "with open(sub_file, 'w') as f:\n",
329 | "    print(\"Writing Predictions to CSV...\")\n",
330 | "    f.write('id,label\\n')\n",
331 | "    for i, image_name in enumerate(filenames):\n",
332 | "        pred = ['%.6f' % p for p in preds[i, :]]\n",
333 | "        if i % 2500 == 0:\n",
334 | "            print(i, '/', nb_test_samples)\n",
335 | "        f.write('%s,%s\\n' % (os.path.basename(image_name).replace('.jpg', ''), (pred[1])))\n",
336 | "    print(\"Done.\")"
337 | ]
338 | }
339 | ],
340 | "metadata": {
341 | "anaconda-cloud": {},
342 | "kernelspec": {
343 | "display_name": "Python [conda root]",
344 | "language": "python",
345 | "name": "conda-root-py"
346 | },
347 | "language_info": {
348 | "codemirror_mode": {
349 | "name": "ipython",
350 | "version": 3
351 | },
352 | "file_extension": ".py",
353 | "mimetype": "text/x-python",
354 | "name": "python",
355 | "nbconvert_exporter": "python",
356 | "pygments_lexer": "ipython3",
357 | "version": "3.5.2"
358 | }
359 | },
360 | "nbformat": 4,
361 | "nbformat_minor": 1
362 | }
363 |
--------------------------------------------------------------------------------
/img_clf.py:
--------------------------------------------------------------------------------
1 | from keras.preprocessing.image import ImageDataGenerator, array_to_img, img_to_array, load_img
2 | from keras.models import Sequential, model_from_json
3 | from keras.layers import Convolution2D, MaxPooling2D, ZeroPadding2D, Activation, Dropout, Flatten, Dense
4 | from keras.callbacks import EarlyStopping
5 | from keras import optimizers
6 | import numpy as np
7 | import csv
8 | from scipy.misc import imresize
9 | import os
10 | import h5py
11 |
12 |
13 | ### paths to weight files
14 | weights_path = '../vgg16_weights.h5' # this is the pretrained vgg16 weights
15 | top_model_weights_path = '../bottleneck_model.h5' # this is the best performing model before fine tuning
16 |
17 |
18 | ### paths to training and testing data
19 | train_data_dir = '../data/train'
20 | validation_data_dir = '../data/validation'
21 | test_data_dir = '../data/test'
22 |
23 | ### other hyperparameters
24 | nb_train_samples = 24500
25 | nb_validation_samples = 500
26 | nb_test_samples = 12500
27 | nb_epoch = 25
28 | img_width, img_height = 200, 200
29 |
30 | # (you'll have to divide up the dataset into the right directories to match this setup
31 | # since the kaggle dataset doesn't come with a validation split)
32 |
33 | early_stopping = EarlyStopping(monitor='val_loss', patience=2, verbose=1, mode='auto')
34 | # ^^ this stops training after validation loss stops improving
35 |
36 |
37 | def save_bottlebeck_features():
38 |     """builds the pretrained vgg16 model and runs it on our training and validation datasets"""
39 |     datagen = ImageDataGenerator(rescale=1./255)
40 |
41 |     # match the vgg16 architecture so we can load the pretrained weights into this model
42 |     model = Sequential()
43 |     model.add(ZeroPadding2D((1, 1), input_shape=(3, img_width, img_height)))
44 |
45 |     model.add(Convolution2D(64, 3, 3, activation='relu', name='conv1_1'))
46 |     model.add(ZeroPadding2D((1, 1)))
47 |     model.add(Convolution2D(64, 3, 3, activation='relu', name='conv1_2'))
48 |     model.add(MaxPooling2D((2, 2), strides=(2, 2)))
49 |
50 |     model.add(ZeroPadding2D((1, 1)))
51 |     model.add(Convolution2D(128, 3, 3, activation='relu', name='conv2_1'))
52 |     model.add(ZeroPadding2D((1, 1)))
53 |     model.add(Convolution2D(128, 3, 3, activation='relu', name='conv2_2'))
54 |     model.add(MaxPooling2D((2, 2), strides=(2, 2)))
55 |
56 |     model.add(ZeroPadding2D((1, 1)))
57 |     model.add(Convolution2D(256, 3, 3, activation='relu', name='conv3_1'))
58 |     model.add(ZeroPadding2D((1, 1)))
59 |     model.add(Convolution2D(256, 3, 3, activation='relu', name='conv3_2'))
60 |     model.add(ZeroPadding2D((1, 1)))
61 |     model.add(Convolution2D(256, 3, 3, activation='relu', name='conv3_3'))
62 |     model.add(MaxPooling2D((2, 2), strides=(2, 2)))
63 |
64 |     model.add(ZeroPadding2D((1, 1)))
65 |     model.add(Convolution2D(512, 3, 3, activation='relu', name='conv4_1'))
66 |     model.add(ZeroPadding2D((1, 1)))
67 |     model.add(Convolution2D(512, 3, 3, activation='relu', name='conv4_2'))
68 |     model.add(ZeroPadding2D((1, 1)))
69 |     model.add(Convolution2D(512, 3, 3, activation='relu', name='conv4_3'))
70 |     model.add(MaxPooling2D((2, 2), strides=(2, 2)))
71 |
72 |     model.add(ZeroPadding2D((1, 1)))
73 |     model.add(Convolution2D(512, 3, 3, activation='relu', name='conv5_1'))
74 |     model.add(ZeroPadding2D((1, 1)))
75 |     model.add(Convolution2D(512, 3, 3, activation='relu', name='conv5_2'))
76 |     model.add(ZeroPadding2D((1, 1)))
77 |     model.add(Convolution2D(512, 3, 3, activation='relu', name='conv5_3'))
78 |     model.add(MaxPooling2D((2, 2), strides=(2, 2)))
79 |
80 |     # load VGG16 weights
81 |     f = h5py.File(weights_path, 'r')  # open the weights file read-only
82 |
83 |     for k in range(f.attrs['nb_layers']):
84 |         if k >= len(model.layers):
85 |             break
86 |         g = f['layer_{}'.format(k)]
87 |         weights = [g['param_{}'.format(p)] for p in range(g.attrs['nb_params'])]
88 |         model.layers[k].set_weights(weights)
89 |
90 |     f.close()
91 |     print('Model loaded.')
92 |
93 |     generator = datagen.flow_from_directory(
94 |         train_data_dir,
95 |         target_size=(img_width, img_height),
96 |         batch_size=32,
97 |         class_mode=None,
98 |         shuffle=False)
99 |     bottleneck_features_train = model.predict_generator(generator, nb_train_samples)
100 |     np.save('bottleneck_features_train.npy', bottleneck_features_train)
101 |
102 |     generator = datagen.flow_from_directory(
103 |         validation_data_dir,
104 |         target_size=(img_width, img_height),
105 |         batch_size=32,
106 |         class_mode=None,
107 |         shuffle=False)
108 |     bottleneck_features_validation = model.predict_generator(generator, nb_validation_samples)
109 |     np.save('bottleneck_features_validation.npy', bottleneck_features_validation)
110 |
111 |
112 | def train_top_model():
113 |     """trains the classifier"""
114 |     train_data = np.load('bottleneck_features_train.npy')
115 |     train_labels = np.array([0] * (nb_train_samples // 2) + [1] * (nb_train_samples // 2))  # // keeps the counts ints on Python 3
116 |
117 |     validation_data = np.load('bottleneck_features_validation.npy')
118 |     validation_labels = np.array([0] * (nb_validation_samples // 2) + [1] * (nb_validation_samples // 2))
119 |
120 |     model = Sequential()
121 |     model.add(Flatten(input_shape=train_data.shape[1:]))
122 |     model.add(Dense(256, activation='relu'))
123 |     model.add(Dropout(0.5))
124 |     model.add(Dense(1, activation='sigmoid'))
125 |
126 |     model.compile(optimizer='rmsprop', loss='binary_crossentropy', metrics=['accuracy'])
127 |
128 |     model.fit(train_data, train_labels,
129 |               nb_epoch=nb_epoch,
130 |               batch_size=32,
131 |               validation_data=(validation_data, validation_labels),
132 |               callbacks=[early_stopping])
133 |
134 |     # save the model weights
135 |     model.save_weights(top_model_weights_path)
136 |
137 |
138 | def fine_tune():
139 |     """recreates top model architecture/weights and fine tunes with image augmentation and optimizations"""
140 |
141 |     # reconstruct vgg16 model
142 |     model = Sequential()
143 |     model.add(ZeroPadding2D((1, 1), input_shape=(3, img_width, img_height)))
144 |
145 |     model.add(Convolution2D(64, 3, 3, activation='relu', name='conv1_1'))
146 |     model.add(ZeroPadding2D((1, 1)))
147 |     model.add(Convolution2D(64, 3, 3, activation='relu', name='conv1_2'))
148 |     model.add(MaxPooling2D((2, 2), strides=(2, 2)))
149 |
150 |     model.add(ZeroPadding2D((1, 1)))
151 |     model.add(Convolution2D(128, 3, 3, activation='relu', name='conv2_1'))
152 |     model.add(ZeroPadding2D((1, 1)))
153 |     model.add(Convolution2D(128, 3, 3, activation='relu', name='conv2_2'))
154 |     model.add(MaxPooling2D((2, 2), strides=(2, 2)))
155 |
156 |     model.add(ZeroPadding2D((1, 1)))
157 |     model.add(Convolution2D(256, 3, 3, activation='relu', name='conv3_1'))
158 |     model.add(ZeroPadding2D((1, 1)))
159 |     model.add(Convolution2D(256, 3, 3, activation='relu', name='conv3_2'))
160 |     model.add(ZeroPadding2D((1, 1)))
161 |     model.add(Convolution2D(256, 3, 3, activation='relu', name='conv3_3'))
162 |     model.add(MaxPooling2D((2, 2), strides=(2, 2)))
163 |
164 |     model.add(ZeroPadding2D((1, 1)))
165 |     model.add(Convolution2D(512, 3, 3, activation='relu', name='conv4_1'))
166 |     model.add(ZeroPadding2D((1, 1)))
167 |     model.add(Convolution2D(512, 3, 3, activation='relu', name='conv4_2'))
168 |     model.add(ZeroPadding2D((1, 1)))
169 |     model.add(Convolution2D(512, 3, 3, activation='relu', name='conv4_3'))
170 |     model.add(MaxPooling2D((2, 2), strides=(2, 2)))
171 |
172 |     model.add(ZeroPadding2D((1, 1)))
173 |     model.add(Convolution2D(512, 3, 3, activation='relu', name='conv5_1'))
174 |     model.add(ZeroPadding2D((1, 1)))
175 |     model.add(Convolution2D(512, 3, 3, activation='relu', name='conv5_2'))
176 |     model.add(ZeroPadding2D((1, 1)))
177 |     model.add(Convolution2D(512, 3, 3, activation='relu', name='conv5_3'))
178 |     model.add(MaxPooling2D((2, 2), strides=(2, 2)))
179 |
180 |     # load vgg16 weights
181 |     f = h5py.File(weights_path, 'r')  # open the weights file read-only
182 |
183 |     for k in range(f.attrs['nb_layers']):
184 |         if k >= len(model.layers):
185 |             break
186 |         g = f['layer_{}'.format(k)]
187 |         weights = [g['param_{}'.format(p)] for p in range(g.attrs['nb_params'])]
188 |         model.layers[k].set_weights(weights)
189 |
190 |     f.close()
191 |
192 |     # add the classification layers
193 |     top_model = Sequential()
194 |     top_model.add(Flatten(input_shape=model.output_shape[1:]))
195 |     top_model.add(Dense(256, activation='relu'))
196 |     top_model.add(Dropout(0.5))
197 |     top_model.add(Dense(1, activation='sigmoid'))
198 |
199 |     top_model.load_weights(top_model_weights_path)
200 |
201 |     # add the model on top of the convolutional base
202 |     model.add(top_model)
203 |
204 |     # set the first 25 layers (up to the last conv block)
205 |     # to non-trainable (weights will not be updated)
206 |     for layer in model.layers[:25]:
207 |         layer.trainable = False
208 |
209 |     # compile the model with a SGD/momentum optimizer
210 |     # and a very slow learning rate.
211 |     model.compile(loss='binary_crossentropy',
212 |                   optimizer=optimizers.SGD(lr=1e-4, momentum=0.9),
213 |                   metrics=['accuracy'])
214 |
215 |     # prepare data augmentation configuration
216 |     train_datagen = ImageDataGenerator(
217 |         rescale=1./255,
218 |         shear_range=0.2,
219 |         zoom_range=0.2,
220 |         horizontal_flip=True)
221 |
222 |     test_datagen = ImageDataGenerator(rescale=1./255)
223 |
224 |     train_generator = train_datagen.flow_from_directory(
225 |         train_data_dir,
226 |         target_size=(img_height, img_width),
227 |         batch_size=32,
228 |         class_mode='binary')
229 |
230 |     validation_generator = test_datagen.flow_from_directory(
231 |         validation_data_dir,
232 |         target_size=(img_height, img_width),
233 |         batch_size=32,
234 |         class_mode='binary')
235 |
236 |     # fine-tune the model
237 |     model.fit_generator(
238 |         train_generator,
239 |         samples_per_epoch=nb_train_samples,
240 |         nb_epoch=nb_epoch,
241 |         validation_data=validation_generator,
242 |         nb_val_samples=nb_validation_samples,
243 |         callbacks=[early_stopping])
244 |
245 |     # save the model
246 |     json_string = model.to_json()
247 |
248 |     with open('final_model_architecture.json', 'w') as f:
249 |         f.write(json_string)
250 |
251 |     model.save_weights('final_weights.h5')
252 |
253 |     # return the model for convenience when making predictions
254 |     return model
255 |
256 |
257 | def predict_labels(model):
258 |     """writes test image labels and predictions to csv"""
259 |
260 |     test_datagen = ImageDataGenerator(rescale=1./255)
261 |     test_generator = test_datagen.flow_from_directory(
262 |         test_data_dir,
263 |         target_size=(img_height, img_width),
264 |         batch_size=32,
265 |         shuffle=False,
266 |         class_mode=None)
267 |
268 |     base_path = test_data_dir + "/test/"
269 |
270 |     with open("prediction.csv", "w") as f:
271 |         p_writer = csv.writer(f, delimiter=',', lineterminator='\n')
272 |         for _, _, imgs in os.walk(base_path):
273 |             for im in imgs:
274 |                 pic_id = im.split(".")[0]
275 |                 img = load_img(base_path + im)
276 |                 img = imresize(img, size=(img_height, img_width))
277 |                 test_x = img_to_array(img).reshape(3, img_height, img_width)
278 |                 test_x = test_x.reshape((1,) + test_x.shape)
279 |                 test_generator = test_datagen.flow(test_x,
280 |                                                    batch_size=1,
281 |                                                    shuffle=False)
282 |                 prediction = model.predict_generator(test_generator, 1)[0][0]
283 |                 p_writer.writerow([pic_id, prediction])
284 |
285 | def load_model():
286 |     """Loads a model from an earlier run"""
287 |
288 |     json_file = open('final_model_architecture.json', 'r')
289 |     model_json = json_file.read()
290 |     json_file.close()
291 |     model = model_from_json(model_json)
292 |     model.load_weights('final_weights.h5')
293 |     print("Model Loaded.")
294 |
295 |     return model
296 |
297 | if __name__ == "__main__":
298 |     save_bottlebeck_features()
299 |     train_top_model()
300 |     model = fine_tune()
301 |     predict_labels(model)
302 |
--------------------------------------------------------------------------------
/vgg_bn.py:
--------------------------------------------------------------------------------
1 | import numpy as np
2 |
3 | from keras.layers.normalization import BatchNormalization
4 | from keras.models import Sequential
5 | from keras.layers.core import Flatten, Dense, Dropout, Lambda
6 | from keras.layers.convolutional import Convolution2D, MaxPooling2D, ZeroPadding2D
7 | from keras.optimizers import Adam
8 | from keras.preprocessing.image import ImageDataGenerator
9 |
10 |
11 | vgg_mean = np.array([123.68, 116.779, 103.939], dtype=np.float32).reshape((3,1,1))
12 | def vgg_preprocess(x):
13 |     x = x - vgg_mean
14 |     return x[:, ::-1]  # reverse axis rgb->bgr
15 |
16 |
17 | class Vgg16BN():
18 |     """The VGG 16 Imagenet model with Batch Normalization for the Dense Layers"""
19 |
20 |     def __init__(self, size=(224, 224), n_classes=2, lr=0.001, batch_size=64):
21 |         self.weights_file = 'vgg16_bn.h5'  # download from: http://www.platform.ai/models/
22 |         self.size = size
23 |         self.n_classes = n_classes
24 |         self.lr = lr
25 |         self.batch_size = batch_size
26 |         self.build()
27 |
28 |     def predict(self, data):
29 |         return self.model.predict(data)
30 |
31 |     def ConvBlock(self, layers, filters):
32 |         model = self.model
33 |         for i in range(layers):
34 |             model.add(ZeroPadding2D((1, 1)))
35 |             model.add(Convolution2D(filters, 3, 3, activation='relu'))
36 |         model.add(MaxPooling2D((2, 2), strides=(2, 2)))
37 |
38 |     def FCBlock(self):
39 |         model = self.model
40 |         model.add(Dense(4096, activation='relu'))
41 |         model.add(BatchNormalization())
42 |         model.add(Dropout(0.5))
43 |
44 |     def build(self, ft=True):
45 |         model = self.model = Sequential()
46 |         model.add(Lambda(vgg_preprocess, input_shape=(3,) + self.size))
47 |
48 |         self.ConvBlock(2, 64)
49 |         self.ConvBlock(2, 128)
50 |         self.ConvBlock(3, 256)
51 |         self.ConvBlock(3, 512)
52 |         self.ConvBlock(3, 512)
53 |
54 |         model.add(Flatten())
55 |         self.FCBlock()
56 |         self.FCBlock()
57 |         model.add(Dense(self.n_classes, activation='softmax'))
58 |
59 |         model.load_weights(self.weights_file)
60 |
61 |         if ft:
62 |             self.finetune()
63 |
64 |         self.compile()
65 |
66 |     def finetune(self):
67 |         model = self.model
68 |         model.pop()
69 |         for layer in model.layers:
70 |             layer.trainable = False
71 |         model.add(Dense(self.n_classes, activation='softmax'))
72 |
73 |     def compile(self):
74 |         self.model.compile(optimizer=Adam(lr=self.lr),
75 |                            loss='categorical_crossentropy', metrics=['accuracy'])
76 |
77 |     def fit(self, trn_path, val_path, nb_trn_samples, nb_val_samples, nb_epoch=1, callbacks=None, aug=False):
78 |         if aug:
79 |             train_datagen = ImageDataGenerator(rotation_range=10, width_shift_range=0.05, zoom_range=0.05,
80 |                                                channel_shift_range=10, height_shift_range=0.05, shear_range=0.05,
81 |                                                horizontal_flip=True)
82 |         else:
83 |             train_datagen = ImageDataGenerator()
84 |
85 |         trn_gen = train_datagen.flow_from_directory(trn_path, target_size=self.size, batch_size=self.batch_size,
86 |                                                     class_mode='categorical', shuffle=True)
87 |
88 |         val_gen = ImageDataGenerator().flow_from_directory(val_path, target_size=self.size, batch_size=self.batch_size,
89 |                                                            class_mode='categorical', shuffle=True)
90 |
91 |         self.model.fit_generator(trn_gen, samples_per_epoch=nb_trn_samples, nb_epoch=nb_epoch, verbose=2,
92 |                                  validation_data=val_gen, nb_val_samples=nb_val_samples, callbacks=callbacks)
93 |
94 |     def test(self, test_path, nb_test_samples, aug=False):
95 |         if aug:
96 |             test_datagen = ImageDataGenerator(rotation_range=10, width_shift_range=0.05, zoom_range=0.05,
97 |                                               channel_shift_range=10, height_shift_range=0.05, shear_range=0.05,
98 |                                               horizontal_flip=True)
99 |         else:
100 |             test_datagen = ImageDataGenerator()
101 |
102 |         test_gen = test_datagen.flow_from_directory(test_path, target_size=self.size, batch_size=self.batch_size,
103 |                                                     class_mode=None, shuffle=False)
104 |
105 |         return self.model.predict_generator(test_gen, val_samples=nb_test_samples), test_gen.filenames
106 |
--------------------------------------------------------------------------------