├── README.md
├── assets
│   └── feats.jpg
├── cats_n_dogs.ipynb
├── cats_n_dogs_BN.ipynb
├── img_clf.py
└── vgg_bn.py

/README.md:
--------------------------------------------------------------------------------
# Keras Image Classification

Classifies an image as containing either a dog or a cat (using Kaggle's public dataset), but could easily be extended to other image classification problems.

To run these scripts/notebooks, you must have keras, numpy, scipy, and h5py installed; enabling GPU acceleration is highly recommended if that's an option.

## img_clf.py
After playing around with the hyperparameters a bit, this script reaches around 96-98% accuracy on the validation data, and when tested on Kaggle's hidden test data it achieved a log loss score of around 0.18.

Most of the code/strategy here was based on this Keras tutorial.

Pre-trained VGG16 model weights can be downloaded here.

The data directory structure I used was:

* project
    * data
        * train
            * dogs
            * cats
        * validation
            * dogs
            * cats
        * test
            * test

## cats_n_dogs.ipynb
This produced a slightly better score (0.161 log loss on the Kaggle test set). The improvement most likely comes from using larger images and ensembling a few models, despite the fact that there is no image augmentation in the notebook.

You might run into memory errors because of the large image dimensions; if so, reducing the number of folds and saving the model weights to disk rather than keeping the models in memory should do the trick. The notebook uses a slightly flatter directory structure, with the validation split happening after the images are loaded:

* project
    * data
        * train
            * dogs
            * cats
        * test
            * test

## cats_n_dogs_BN.ipynb
This produced the best score (0.069 log loss without any ensembling). The notebook incorporates some of the techniques from Jeremy Howard's deep learning class, with the inclusion of batch normalization being the biggest factor. I also added extra augmentation to the prediction step, which greatly improved performance.

Pre-trained model weights for VGG16 w/ batch normalization can be downloaded here.

The Vgg16BN class is defined in vgg_bn.py, and the data directory structure used was (note the valid folder name, matching valid_path in the notebook):

* project
    * data
        * train
            * dogs
            * cats
        * valid
            * dogs
            * cats
        * test
            * test
--------------------------------------------------------------------------------
/assets/feats.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/rdcolema/keras-image-classification/9d4b678d33410a7c848dee85e86be5d624de7046/assets/feats.jpg
--------------------------------------------------------------------------------
/cats_n_dogs_BN.ipynb:
--------------------------------------------------------------------------------
{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Image Classification of Dogs vs. 
Cats Using CNN Ensemble" 8 | ] 9 | }, 10 | { 11 | "cell_type": "markdown", 12 | "metadata": {}, 13 | "source": [ 14 | "Imports & environment" 15 | ] 16 | }, 17 | { 18 | "cell_type": "code", 19 | "execution_count": 1, 20 | "metadata": { 21 | "collapsed": false 22 | }, 23 | "outputs": [ 24 | { 25 | "name": "stderr", 26 | "output_type": "stream", 27 | "text": [ 28 | "Using Theano backend.\n", 29 | "Using gpu device 0: GeForce GTX 980M (CNMeM is enabled with initial size: 90.0% of memory, cuDNN 5105)\n", 30 | "/home/robert/anaconda3/lib/python3.5/site-packages/theano/sandbox/cuda/__init__.py:600: UserWarning: Your cuDNN version is more recent than the one Theano officially supports. If you see any problems, try updating Theano or downgrading cuDNN to version 5.\n", 31 | " warnings.warn(warn)\n", 32 | "/home/robert/anaconda3/lib/python3.5/site-packages/matplotlib/font_manager.py:273: UserWarning: Matplotlib is building the font cache using fc-list. This may take a moment.\n", 33 | " warnings.warn('Matplotlib is building the font cache using fc-list. This may take a moment.')\n", 34 | "/home/robert/anaconda3/lib/python3.5/site-packages/matplotlib/font_manager.py:273: UserWarning: Matplotlib is building the font cache using fc-list. This may take a moment.\n", 35 | " warnings.warn('Matplotlib is building the font cache using fc-list. This may take a moment.')\n" 36 | ] 37 | } 38 | ], 39 | "source": [ 40 | "import os\n", 41 | "import numpy as np\n", 42 | "\n", 43 | "from glob import glob\n", 44 | "from shutil import copyfile\n", 45 | "from vgg_bn import Vgg16BN\n", 46 | "from keras.callbacks import ModelCheckpoint\n", 47 | "\n", 48 | "ROOT_DIR = os.getcwd()\n", 49 | "DATA_HOME_DIR = ROOT_DIR + '/data'\n", 50 | "%matplotlib inline" 51 | ] 52 | }, 53 | { 54 | "cell_type": "markdown", 55 | "metadata": {}, 56 | "source": [ 57 | "Config & Hyperparameters" 58 | ] 59 | }, 60 | { 61 | "cell_type": "code", 62 | "execution_count": 10, 63 | "metadata": { 64 | "collapsed": true 65 | }, 66 | "outputs": [], 67 | "source": [ 68 | "# paths\n", 69 | "data_path = DATA_HOME_DIR + '/' \n", 70 | "train_path = data_path + '/train/'\n", 71 | "valid_path = data_path + '/valid/'\n", 72 | "test_path = DATA_HOME_DIR + '/test/'\n", 73 | "model_path = ROOT_DIR + '/models/'\n", 74 | "submission_path = ROOT_DIR + '/submissions/'\n", 75 | "\n", 76 | "# data\n", 77 | "img_width, img_height = 224, 224\n", 78 | "batch_size = 64\n", 79 | "nb_train_samples = 23000\n", 80 | "nb_valid_samples = 2000\n", 81 | "nb_test_samples = 12500\n", 82 | "classes = [\"cats\", \"dogs\"]\n", 83 | "n_classes = len(classes)\n", 84 | "\n", 85 | "# model\n", 86 | "nb_epoch = 10\n", 87 | "nb_aug = 5\n", 88 | "lr = 0.001" 89 | ] 90 | }, 91 | { 92 | "cell_type": "markdown", 93 | "metadata": {}, 94 | "source": [ 95 | "Build the VGG model w/ Batch Normalization" 96 | ] 97 | }, 98 | { 99 | "cell_type": "code", 100 | "execution_count": 3, 101 | "metadata": { 102 | "collapsed": false, 103 | "scrolled": true 104 | }, 105 | "outputs": [ 106 | { 107 | "name": "stdout", 108 | "output_type": "stream", 109 | "text": [ 110 | "____________________________________________________________________________________________________\n", 111 | "Layer (type) Output Shape Param # Connected to \n", 112 | "====================================================================================================\n", 113 | "lambda_1 (Lambda) (None, 3, 224, 224) 0 lambda_input_1[0][0] \n", 114 | 
"____________________________________________________________________________________________________\n", 115 | "zeropadding2d_1 (ZeroPadding2D) (None, 3, 226, 226) 0 lambda_1[0][0] \n", 116 | "____________________________________________________________________________________________________\n", 117 | "convolution2d_1 (Convolution2D) (None, 64, 224, 224) 0 zeropadding2d_1[0][0] \n", 118 | "____________________________________________________________________________________________________\n", 119 | "zeropadding2d_2 (ZeroPadding2D) (None, 64, 226, 226) 0 convolution2d_1[0][0] \n", 120 | "____________________________________________________________________________________________________\n", 121 | "convolution2d_2 (Convolution2D) (None, 64, 224, 224) 0 zeropadding2d_2[0][0] \n", 122 | "____________________________________________________________________________________________________\n", 123 | "maxpooling2d_1 (MaxPooling2D) (None, 64, 112, 112) 0 convolution2d_2[0][0] \n", 124 | "____________________________________________________________________________________________________\n", 125 | "zeropadding2d_3 (ZeroPadding2D) (None, 64, 114, 114) 0 maxpooling2d_1[0][0] \n", 126 | "____________________________________________________________________________________________________\n", 127 | "convolution2d_3 (Convolution2D) (None, 128, 112, 112) 0 zeropadding2d_3[0][0] \n", 128 | "____________________________________________________________________________________________________\n", 129 | "zeropadding2d_4 (ZeroPadding2D) (None, 128, 114, 114) 0 convolution2d_3[0][0] \n", 130 | "____________________________________________________________________________________________________\n", 131 | "convolution2d_4 (Convolution2D) (None, 128, 112, 112) 0 zeropadding2d_4[0][0] \n", 132 | "____________________________________________________________________________________________________\n", 133 | "maxpooling2d_2 (MaxPooling2D) (None, 128, 56, 56) 0 convolution2d_4[0][0] \n", 134 | "____________________________________________________________________________________________________\n", 135 | "zeropadding2d_5 (ZeroPadding2D) (None, 128, 58, 58) 0 maxpooling2d_2[0][0] \n", 136 | "____________________________________________________________________________________________________\n", 137 | "convolution2d_5 (Convolution2D) (None, 256, 56, 56) 0 zeropadding2d_5[0][0] \n", 138 | "____________________________________________________________________________________________________\n", 139 | "zeropadding2d_6 (ZeroPadding2D) (None, 256, 58, 58) 0 convolution2d_5[0][0] \n", 140 | "____________________________________________________________________________________________________\n", 141 | "convolution2d_6 (Convolution2D) (None, 256, 56, 56) 0 zeropadding2d_6[0][0] \n", 142 | "____________________________________________________________________________________________________\n", 143 | "zeropadding2d_7 (ZeroPadding2D) (None, 256, 58, 58) 0 convolution2d_6[0][0] \n", 144 | "____________________________________________________________________________________________________\n", 145 | "convolution2d_7 (Convolution2D) (None, 256, 56, 56) 0 zeropadding2d_7[0][0] \n", 146 | "____________________________________________________________________________________________________\n", 147 | "maxpooling2d_3 (MaxPooling2D) (None, 256, 28, 28) 0 convolution2d_7[0][0] \n", 148 | "____________________________________________________________________________________________________\n", 149 | "zeropadding2d_8 (ZeroPadding2D) (None, 256, 
30, 30) 0 maxpooling2d_3[0][0] \n", 150 | "____________________________________________________________________________________________________\n", 151 | "convolution2d_8 (Convolution2D) (None, 512, 28, 28) 0 zeropadding2d_8[0][0] \n", 152 | "____________________________________________________________________________________________________\n", 153 | "zeropadding2d_9 (ZeroPadding2D) (None, 512, 30, 30) 0 convolution2d_8[0][0] \n", 154 | "____________________________________________________________________________________________________\n", 155 | "convolution2d_9 (Convolution2D) (None, 512, 28, 28) 0 zeropadding2d_9[0][0] \n", 156 | "____________________________________________________________________________________________________\n", 157 | "zeropadding2d_10 (ZeroPadding2D) (None, 512, 30, 30) 0 convolution2d_9[0][0] \n", 158 | "____________________________________________________________________________________________________\n", 159 | "convolution2d_10 (Convolution2D) (None, 512, 28, 28) 0 zeropadding2d_10[0][0] \n", 160 | "____________________________________________________________________________________________________\n", 161 | "maxpooling2d_4 (MaxPooling2D) (None, 512, 14, 14) 0 convolution2d_10[0][0] \n", 162 | "____________________________________________________________________________________________________\n", 163 | "zeropadding2d_11 (ZeroPadding2D) (None, 512, 16, 16) 0 maxpooling2d_4[0][0] \n", 164 | "____________________________________________________________________________________________________\n", 165 | "convolution2d_11 (Convolution2D) (None, 512, 14, 14) 0 zeropadding2d_11[0][0] \n", 166 | "____________________________________________________________________________________________________\n", 167 | "zeropadding2d_12 (ZeroPadding2D) (None, 512, 16, 16) 0 convolution2d_11[0][0] \n", 168 | "____________________________________________________________________________________________________\n", 169 | "convolution2d_12 (Convolution2D) (None, 512, 14, 14) 0 zeropadding2d_12[0][0] \n", 170 | "____________________________________________________________________________________________________\n", 171 | "zeropadding2d_13 (ZeroPadding2D) (None, 512, 16, 16) 0 convolution2d_12[0][0] \n", 172 | "____________________________________________________________________________________________________\n", 173 | "convolution2d_13 (Convolution2D) (None, 512, 14, 14) 0 zeropadding2d_13[0][0] \n", 174 | "____________________________________________________________________________________________________\n", 175 | "maxpooling2d_5 (MaxPooling2D) (None, 512, 7, 7) 0 convolution2d_13[0][0] \n", 176 | "____________________________________________________________________________________________________\n", 177 | "flatten_1 (Flatten) (None, 25088) 0 maxpooling2d_5[0][0] \n", 178 | "____________________________________________________________________________________________________\n", 179 | "dense_1 (Dense) (None, 4096) 0 flatten_1[0][0] \n", 180 | "____________________________________________________________________________________________________\n", 181 | "batchnormalization_1 (BatchNormal(None, 4096) 0 dense_1[0][0] \n", 182 | "____________________________________________________________________________________________________\n", 183 | "dropout_1 (Dropout) (None, 4096) 0 batchnormalization_1[0][0] \n", 184 | "____________________________________________________________________________________________________\n", 185 | "dense_2 (Dense) (None, 4096) 0 dropout_1[0][0] \n", 186 | 
"____________________________________________________________________________________________________\n", 187 | "batchnormalization_2 (BatchNormal(None, 4096) 0 dense_2[0][0] \n", 188 | "____________________________________________________________________________________________________\n", 189 | "dropout_2 (Dropout) (None, 4096) 0 batchnormalization_2[0][0] \n", 190 | "____________________________________________________________________________________________________\n", 191 | "dense_4 (Dense) (None, 2) 8194 dropout_2[0][0] \n", 192 | "====================================================================================================\n", 193 | "Total params: 8194\n", 194 | "____________________________________________________________________________________________________\n" 195 | ] 196 | } 197 | ], 198 | "source": [ 199 | "vgg = Vgg16BN(size=(img_width, img_height), n_classes=n_classes, batch_size=batch_size, lr=lr)\n", 200 | "model = vgg.model\n", 201 | "\n", 202 | "model.summary()" 203 | ] 204 | }, 205 | { 206 | "cell_type": "code", 207 | "execution_count": 12, 208 | "metadata": { 209 | "collapsed": true 210 | }, 211 | "outputs": [], 212 | "source": [ 213 | "info_string = \"{0}x{1}_{2}epoch_{3}aug_{4}lr_vgg16-bn\".format(img_width, img_height, nb_epoch, nb_aug, lr)\n", 214 | "ckpt_fn = model_path + '{val_loss:.2f}-loss_' + info_string + '.h5'\n", 215 | "\n", 216 | "ckpt = ModelCheckpoint(filepath=ckpt_fn,\n", 217 | " monitor='val_loss',\n", 218 | " save_best_only=True,\n", 219 | " save_weights_only=True)" 220 | ] 221 | }, 222 | { 223 | "cell_type": "markdown", 224 | "metadata": {}, 225 | "source": [ 226 | "Train the Model" 227 | ] 228 | }, 229 | { 230 | "cell_type": "code", 231 | "execution_count": 13, 232 | "metadata": { 233 | "collapsed": false 234 | }, 235 | "outputs": [], 236 | "source": [ 237 | "vgg.fit(train_path, valid_path,\n", 238 | " nb_trn_samples=nb_train_samples,\n", 239 | " nb_val_samples=nb_valid_samples,\n", 240 | " nb_epoch=nb_epoch,\n", 241 | " callbacks=[ckpt],\n", 242 | " aug=nb_aug)" 243 | ] 244 | }, 245 | { 246 | "cell_type": "markdown", 247 | "metadata": {}, 248 | "source": [ 249 | "Predict on Test Data" 250 | ] 251 | }, 252 | { 253 | "cell_type": "code", 254 | "execution_count": 11, 255 | "metadata": { 256 | "collapsed": false 257 | }, 258 | "outputs": [ 259 | { 260 | "name": "stdout", 261 | "output_type": "stream", 262 | "text": [ 263 | "Generating predictions for Augmentation... 0\n", 264 | "Found 12500 images belonging to 1 classes.\n", 265 | "Generating predictions for Augmentation... 1\n", 266 | "Found 12500 images belonging to 1 classes.\n", 267 | "Generating predictions for Augmentation... 2\n", 268 | "Found 12500 images belonging to 1 classes.\n", 269 | "Generating predictions for Augmentation... 3\n", 270 | "Found 12500 images belonging to 1 classes.\n", 271 | "Generating predictions for Augmentation... 
4\n", 272 | "Found 12500 images belonging to 1 classes.\n", 273 | "Averaging Predictions Across Augmentations...\n" 274 | ] 275 | } 276 | ], 277 | "source": [ 278 | "# generate predictions\n", 279 | "for aug in range(nb_aug):\n", 280 | " print(\"Generating predictions for Augmentation {0}...\",format(aug+1))\n", 281 | " if aug == 0:\n", 282 | " predictions, filenames = vgg.test(test_path, nb_test_samples, aug=nb_aug)\n", 283 | " else:\n", 284 | " aug_pred, filenames = vgg.test(test_path, nb_test_samples, aug=nb_aug)\n", 285 | " predictions += aug_pred\n", 286 | "\n", 287 | "print(\"Averaging Predictions Across Augmentations...\")\n", 288 | "predictions /= nb_aug" 289 | ] 290 | }, 291 | { 292 | "cell_type": "code", 293 | "execution_count": 14, 294 | "metadata": { 295 | "collapsed": false 296 | }, 297 | "outputs": [], 298 | "source": [ 299 | "# clip predictions\n", 300 | "c = 0.01\n", 301 | "preds = np.clip(predictions, c, 1-c)" 302 | ] 303 | }, 304 | { 305 | "cell_type": "code", 306 | "execution_count": 15, 307 | "metadata": { 308 | "collapsed": false 309 | }, 310 | "outputs": [ 311 | { 312 | "name": "stdout", 313 | "output_type": "stream", 314 | "text": [ 315 | "Writing Predictions to CSV...\n", 316 | "0 / 12500\n", 317 | "2500 / 12500\n", 318 | "5000 / 12500\n", 319 | "7500 / 12500\n", 320 | "10000 / 12500\n", 321 | "Done.\n" 322 | ] 323 | } 324 | ], 325 | "source": [ 326 | "sub_file = submission_path + info_string + '.csv'\n", 327 | "\n", 328 | "with open(sub_file, 'w') as f:\n", 329 | " print(\"Writing Predictions to CSV...\")\n", 330 | " f.write('id,label\\n')\n", 331 | " for i, image_name in enumerate(filenames):\n", 332 | " pred = ['%.6f' % p for p in preds[i, :]]\n", 333 | " if i % 2500 == 0:\n", 334 | " print(i, '/', nb_test_samples)\n", 335 | " f.write('%s,%s\\n' % (os.path.basename(image_name).replace('.jpg', ''), (pred[1])))\n", 336 | " print(\"Done.\")" 337 | ] 338 | } 339 | ], 340 | "metadata": { 341 | "anaconda-cloud": {}, 342 | "kernelspec": { 343 | "display_name": "Python [conda root]", 344 | "language": "python", 345 | "name": "conda-root-py" 346 | }, 347 | "language_info": { 348 | "codemirror_mode": { 349 | "name": "ipython", 350 | "version": 3 351 | }, 352 | "file_extension": ".py", 353 | "mimetype": "text/x-python", 354 | "name": "python", 355 | "nbconvert_exporter": "python", 356 | "pygments_lexer": "ipython3", 357 | "version": "3.5.2" 358 | } 359 | }, 360 | "nbformat": 4, 361 | "nbformat_minor": 1 362 | } 363 | -------------------------------------------------------------------------------- /img_clf.py: -------------------------------------------------------------------------------- 1 | from keras.preprocessing.image import ImageDataGenerator, array_to_img, img_to_array, load_img 2 | from keras.models import Sequential, model_from_json 3 | from keras.layers import Convolution2D, MaxPooling2D, ZeroPadding2D, Activation, Dropout, Flatten, Dense 4 | from keras.callbacks import EarlyStopping 5 | from keras import optimizers 6 | import numpy as np 7 | import csv 8 | from scipy.misc import imresize 9 | import os 10 | import h5py 11 | 12 | 13 | ### paths to weight files 14 | weights_path = '../vgg16_weights.h5' # this is the pretrained vgg16 weights 15 | top_model_weights_path = '../bottleneck_model.h5' # this is the best performing model before fine tuning 16 | 17 | 18 | ### paths to training and testing data 19 | train_data_dir = '../data/train' 20 | validation_data_dir = '../data/validation' 21 | test_data_dir = '../data/test' 22 | 23 | ### other hyperparameters 24 | 
nb_train_samples = 24500
nb_validation_samples = 500
nb_test_samples = 12500
nb_epoch = 25
img_width, img_height = 200, 200

# (you'll have to divide up the dataset into the right directories to match this setup,
# since the kaggle dataset doesn't come with a validation split; a sketch of one way to
# create that split is appended at the end of this document)

early_stopping = EarlyStopping(monitor='val_loss', patience=2, verbose=1, mode='auto')
# ^^ this stops training once the validation loss stops improving


def save_bottleneck_features():
    """builds the pretrained vgg16 model and runs it on our training and validation datasets"""
    datagen = ImageDataGenerator(rescale=1./255)

    # match the vgg16 architecture so we can load the pretrained weights into this model
    model = Sequential()
    model.add(ZeroPadding2D((1, 1), input_shape=(3, img_width, img_height)))

    model.add(Convolution2D(64, 3, 3, activation='relu', name='conv1_1'))
    model.add(ZeroPadding2D((1, 1)))
    model.add(Convolution2D(64, 3, 3, activation='relu', name='conv1_2'))
    model.add(MaxPooling2D((2, 2), strides=(2, 2)))

    model.add(ZeroPadding2D((1, 1)))
    model.add(Convolution2D(128, 3, 3, activation='relu', name='conv2_1'))
    model.add(ZeroPadding2D((1, 1)))
    model.add(Convolution2D(128, 3, 3, activation='relu', name='conv2_2'))
    model.add(MaxPooling2D((2, 2), strides=(2, 2)))

    model.add(ZeroPadding2D((1, 1)))
    model.add(Convolution2D(256, 3, 3, activation='relu', name='conv3_1'))
    model.add(ZeroPadding2D((1, 1)))
    model.add(Convolution2D(256, 3, 3, activation='relu', name='conv3_2'))
    model.add(ZeroPadding2D((1, 1)))
    model.add(Convolution2D(256, 3, 3, activation='relu', name='conv3_3'))
    model.add(MaxPooling2D((2, 2), strides=(2, 2)))

    model.add(ZeroPadding2D((1, 1)))
    model.add(Convolution2D(512, 3, 3, activation='relu', name='conv4_1'))
    model.add(ZeroPadding2D((1, 1)))
    model.add(Convolution2D(512, 3, 3, activation='relu', name='conv4_2'))
    model.add(ZeroPadding2D((1, 1)))
    model.add(Convolution2D(512, 3, 3, activation='relu', name='conv4_3'))
    model.add(MaxPooling2D((2, 2), strides=(2, 2)))

    model.add(ZeroPadding2D((1, 1)))
    model.add(Convolution2D(512, 3, 3, activation='relu', name='conv5_1'))
    model.add(ZeroPadding2D((1, 1)))
    model.add(Convolution2D(512, 3, 3, activation='relu', name='conv5_2'))
    model.add(ZeroPadding2D((1, 1)))
    model.add(Convolution2D(512, 3, 3, activation='relu', name='conv5_3'))
    model.add(MaxPooling2D((2, 2), strides=(2, 2)))

    # load VGG16 weights
    f = h5py.File(weights_path, 'r')

    for k in range(f.attrs['nb_layers']):
        if k >= len(model.layers):
            break
        g = f['layer_{}'.format(k)]
        weights = [g['param_{}'.format(p)] for p in range(g.attrs['nb_params'])]
        model.layers[k].set_weights(weights)

    f.close()
    print('Model loaded.')
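
    # run each dataset once through the frozen convolutional base and cache the
    # resulting "bottleneck" features to disk; the dense classifier trained in
    # train_top_model() can then fit on these cached arrays without recomputing
    # the expensive convolutional passes every epoch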

    generator = datagen.flow_from_directory(
        train_data_dir,
        target_size=(img_width, img_height),
        batch_size=32,
        class_mode=None,
        shuffle=False)
    bottleneck_features_train = model.predict_generator(generator, nb_train_samples)
    np.save(open('bottleneck_features_train.npy', 'wb'), bottleneck_features_train)

    generator = datagen.flow_from_directory(
        validation_data_dir,
        target_size=(img_width, img_height),
        batch_size=32,
        class_mode=None,
        shuffle=False)
    bottleneck_features_validation = model.predict_generator(generator, nb_validation_samples)
    np.save(open('bottleneck_features_validation.npy', 'wb'), bottleneck_features_validation)


def train_top_model():
    """trains the classifier"""
    train_data = np.load(open('bottleneck_features_train.npy', 'rb'))
    train_labels = np.array([0] * (nb_train_samples // 2) + [1] * (nb_train_samples // 2))  # first half cats (0), second half dogs (1)

    validation_data = np.load(open('bottleneck_features_validation.npy', 'rb'))
    validation_labels = np.array([0] * (nb_validation_samples // 2) + [1] * (nb_validation_samples // 2))

    model = Sequential()
    model.add(Flatten(input_shape=train_data.shape[1:]))
    model.add(Dense(256, activation='relu'))
    model.add(Dropout(0.5))
    model.add(Dense(1, activation='sigmoid'))

    model.compile(optimizer='rmsprop', loss='binary_crossentropy', metrics=['accuracy'])

    model.fit(train_data, train_labels,
              nb_epoch=nb_epoch,
              batch_size=32,
              validation_data=(validation_data, validation_labels),
              callbacks=[early_stopping])

    # save the model weights
    model.save_weights(top_model_weights_path)


def fine_tune():
    """recreates top model architecture/weights and fine tunes with image augmentation and optimizations"""

    # reconstruct vgg16 model
    model = Sequential()
    model.add(ZeroPadding2D((1, 1), input_shape=(3, img_width, img_height)))

    model.add(Convolution2D(64, 3, 3, activation='relu', name='conv1_1'))
    model.add(ZeroPadding2D((1, 1)))
    model.add(Convolution2D(64, 3, 3, activation='relu', name='conv1_2'))
    model.add(MaxPooling2D((2, 2), strides=(2, 2)))

    model.add(ZeroPadding2D((1, 1)))
    model.add(Convolution2D(128, 3, 3, activation='relu', name='conv2_1'))
    model.add(ZeroPadding2D((1, 1)))
    model.add(Convolution2D(128, 3, 3, activation='relu', name='conv2_2'))
    model.add(MaxPooling2D((2, 2), strides=(2, 2)))

    model.add(ZeroPadding2D((1, 1)))
    model.add(Convolution2D(256, 3, 3, activation='relu', name='conv3_1'))
    model.add(ZeroPadding2D((1, 1)))
    model.add(Convolution2D(256, 3, 3, activation='relu', name='conv3_2'))
    model.add(ZeroPadding2D((1, 1)))
    model.add(Convolution2D(256, 3, 3, activation='relu', name='conv3_3'))
    model.add(MaxPooling2D((2, 2), strides=(2, 2)))

    model.add(ZeroPadding2D((1, 1)))
    model.add(Convolution2D(512, 3, 3, activation='relu', name='conv4_1'))
    model.add(ZeroPadding2D((1, 1)))
    model.add(Convolution2D(512, 3, 3, activation='relu', name='conv4_2'))
    model.add(ZeroPadding2D((1, 1)))
    model.add(Convolution2D(512, 3, 3, activation='relu', name='conv4_3'))
    model.add(MaxPooling2D((2, 2), strides=(2, 2)))

    model.add(ZeroPadding2D((1, 1)))
    model.add(Convolution2D(512, 3, 3, activation='relu', name='conv5_1'))
    model.add(ZeroPadding2D((1, 
1)))
    model.add(Convolution2D(512, 3, 3, activation='relu', name='conv5_2'))
    model.add(ZeroPadding2D((1, 1)))
    model.add(Convolution2D(512, 3, 3, activation='relu', name='conv5_3'))
    model.add(MaxPooling2D((2, 2), strides=(2, 2)))

    # load vgg16 weights
    f = h5py.File(weights_path, 'r')

    for k in range(f.attrs['nb_layers']):
        if k >= len(model.layers):
            break
        g = f['layer_{}'.format(k)]
        weights = [g['param_{}'.format(p)] for p in range(g.attrs['nb_params'])]
        model.layers[k].set_weights(weights)

    f.close()

    # add the classification layers
    top_model = Sequential()
    top_model.add(Flatten(input_shape=model.output_shape[1:]))
    top_model.add(Dense(256, activation='relu'))
    top_model.add(Dropout(0.5))
    top_model.add(Dense(1, activation='sigmoid'))

    top_model.load_weights(top_model_weights_path)

    # add the model on top of the convolutional base
    model.add(top_model)

    # set the first 25 layers (up to the last conv block)
    # to non-trainable (weights will not be updated)
    for layer in model.layers[:25]:
        layer.trainable = False

    # compile the model with a SGD/momentum optimizer
    # and a very slow learning rate.
    model.compile(loss='binary_crossentropy',
                  optimizer=optimizers.SGD(lr=1e-4, momentum=0.9),
                  metrics=['accuracy'])

    # prepare data augmentation configuration
    train_datagen = ImageDataGenerator(
        rescale=1./255,
        shear_range=0.2,
        zoom_range=0.2,
        horizontal_flip=True)

    test_datagen = ImageDataGenerator(rescale=1./255)

    train_generator = train_datagen.flow_from_directory(
        train_data_dir,
        target_size=(img_height, img_width),
        batch_size=32,
        class_mode='binary')

    validation_generator = test_datagen.flow_from_directory(
        validation_data_dir,
        target_size=(img_height, img_width),
        batch_size=32,
        class_mode='binary')

    # fine-tune the model
    model.fit_generator(
        train_generator,
        samples_per_epoch=nb_train_samples,
        nb_epoch=nb_epoch,
        validation_data=validation_generator,
        nb_val_samples=nb_validation_samples,
        callbacks=[early_stopping])

    # save the model
    json_string = model.to_json()

    with open('final_model_architecture.json', 'w') as f:
        f.write(json_string)

    model.save_weights('final_weights.h5')

    # return the model for convenience when making predictions
    return model


def predict_labels(model):
    """writes test image labels and predictions to csv"""

    test_datagen = ImageDataGenerator(rescale=1./255)
    test_generator = test_datagen.flow_from_directory(
        test_data_dir,
        target_size=(img_height, img_width),
        batch_size=32,
        shuffle=False,
        class_mode=None)  # note: this batch generator goes unused; predictions below stream one image at a time

    base_path = test_data_dir + "/test/"

    with open("prediction.csv", "w") as f:
        p_writer = csv.writer(f, delimiter=',', lineterminator='\n')
        for _, _, imgs in os.walk(base_path):
            for im in imgs:
                pic_id = im.split(".")[0]
                img = load_img(base_path + im)
                img = imresize(img, size=(img_height, img_width))
                test_x = img_to_array(img).reshape(3, img_height, img_width)
                test_x = test_x.reshape((1,) + test_x.shape)
                test_generator = test_datagen.flow(test_x,
                                                   batch_size=1,
                                                   shuffle=False)
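                # each image gets its own single-item generator, so
                # predict_generator returns a 1x1 array; [0][0] extracts the
                # scalar probability that the image is a dog (class 1)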
                prediction = model.predict_generator(test_generator, 1)[0][0]
                p_writer.writerow([pic_id, prediction])


def load_model():
    """loads a model from an earlier run"""

    json_file = open('final_model_architecture.json', 'r')
    model_json = json_file.read()
    json_file.close()
    model = model_from_json(model_json)
    model.load_weights('final_weights.h5')
    print('Model loaded.')

    return model


if __name__ == "__main__":
    save_bottleneck_features()
    train_top_model()
    model = fine_tune()
    predict_labels(model)
--------------------------------------------------------------------------------
/vgg_bn.py:
--------------------------------------------------------------------------------
import numpy as np

from keras.layers.normalization import BatchNormalization
from keras.models import Sequential
from keras.layers.core import Flatten, Dense, Dropout, Lambda
from keras.layers.convolutional import Convolution2D, MaxPooling2D, ZeroPadding2D
from keras.optimizers import Adam
from keras.preprocessing.image import ImageDataGenerator


vgg_mean = np.array([123.68, 116.779, 103.939], dtype=np.float32).reshape((3, 1, 1))


def vgg_preprocess(x):
    x = x - vgg_mean  # subtract the imagenet channel means vgg was trained with
    return x[:, ::-1]  # reverse the channel axis: rgb -> bgr


class Vgg16BN():
    """The VGG 16 Imagenet model with Batch Normalization for the Dense Layers"""

    def __init__(self, size=(224, 224), n_classes=2, lr=0.001, batch_size=64):
        self.weights_file = 'vgg16_bn.h5'  # download from: http://www.platform.ai/models/
        self.size = size
        self.n_classes = n_classes
        self.lr = lr
        self.batch_size = batch_size
        self.build()

    def predict(self, data):
        return self.model.predict(data)

    def ConvBlock(self, layers, filters):
        model = self.model
        for i in range(layers):
            model.add(ZeroPadding2D((1, 1)))
            model.add(Convolution2D(filters, 3, 3, activation='relu'))
        model.add(MaxPooling2D((2, 2), strides=(2, 2)))

    def FCBlock(self):
        model = self.model
        model.add(Dense(4096, activation='relu'))
        model.add(BatchNormalization())
        model.add(Dropout(0.5))

    def build(self, ft=True):
        model = self.model = Sequential()
        model.add(Lambda(vgg_preprocess, input_shape=(3,) + self.size))

        self.ConvBlock(2, 64)
        self.ConvBlock(2, 128)
        self.ConvBlock(3, 256)
        self.ConvBlock(3, 512)
        self.ConvBlock(3, 512)

        model.add(Flatten())
        self.FCBlock()
        self.FCBlock()
        model.add(Dense(self.n_classes, activation='softmax'))

        model.load_weights(self.weights_file)

        if ft:
            self.finetune()

        self.compile()

    def finetune(self):
        model = self.model
        model.pop()
        for layer in model.layers:
            layer.trainable = False
        model.add(Dense(self.n_classes, activation='softmax'))

    def compile(self):
        self.model.compile(optimizer=Adam(lr=self.lr),
                           loss='categorical_crossentropy', metrics=['accuracy'])

    def fit(self, trn_path, val_path, nb_trn_samples, nb_val_samples, nb_epoch=1, callbacks=None, aug=False):
        if aug:
            train_datagen = ImageDataGenerator(rotation_range=10, width_shift_range=0.05, zoom_range=0.05,
                                               channel_shift_range=10, height_shift_range=0.05, shear_range=0.05,
                                               horizontal_flip=True)
        else:
            train_datagen = ImageDataGenerator()

        trn_gen = train_datagen.flow_from_directory(trn_path, target_size=self.size, batch_size=self.batch_size,
                                                    class_mode='categorical', shuffle=True)

        val_gen = ImageDataGenerator().flow_from_directory(val_path, target_size=self.size, batch_size=self.batch_size,
                                                           class_mode='categorical', shuffle=True)

        self.model.fit_generator(trn_gen, samples_per_epoch=nb_trn_samples, nb_epoch=nb_epoch, verbose=2,
                                 validation_data=val_gen, nb_val_samples=nb_val_samples, callbacks=callbacks)

    def test(self, test_path, nb_test_samples, aug=False):
        if aug:
            test_datagen = ImageDataGenerator(rotation_range=10, width_shift_range=0.05, zoom_range=0.05,
                                              channel_shift_range=10, height_shift_range=0.05, shear_range=0.05,
                                              horizontal_flip=True)
        else:
            test_datagen = ImageDataGenerator()

        test_gen = test_datagen.flow_from_directory(test_path, target_size=self.size, batch_size=self.batch_size,
                                                    class_mode=None, shuffle=False)

        return self.model.predict_generator(test_gen, val_samples=nb_test_samples), test_gen.filenames
--------------------------------------------------------------------------------
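Note on preparing the data for img_clf.py: the Kaggle download only provides a flat train folder (files named like cat.0.jpg and dog.0.jpg) and an unlabeled test folder, so the train/validation class subdirectories described in the README have to be created by hand. Below is a minimal sketch of one way to do that; the source and destination paths and the 250-images-per-class validation size (chosen to match nb_validation_samples = 500) are illustrative assumptions, not code from the repository.

```python
# Illustrative helper (not part of this repo): split Kaggle's flat train folder
# into the data/train/{cats,dogs} and data/validation/{cats,dogs} layout that
# img_clf.py expects.
import os
import random
import shutil

SRC = 'train'            # assumed: flat folder extracted from Kaggle's train.zip
DEST = 'data'            # assumed: project data root
N_VALID_PER_CLASS = 250  # 2 x 250 = 500 validation images, as in img_clf.py

random.seed(0)

for animal in ('cat', 'dog'):
    # kaggle names files like cat.123.jpg / dog.456.jpg
    files = [f for f in os.listdir(SRC) if f.startswith(animal + '.')]
    random.shuffle(files)
    splits = {'validation': files[:N_VALID_PER_CLASS],
              'train': files[N_VALID_PER_CLASS:]}
    for split, names in splits.items():
        out_dir = os.path.join(DEST, split, animal + 's')  # e.g. data/train/cats
        if not os.path.exists(out_dir):
            os.makedirs(out_dir)
        for name in names:
            shutil.copyfile(os.path.join(SRC, name), os.path.join(out_dir, name))
```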