├── .gitignore ├── README.md ├── data ├── Makefile ├── README.md ├── credentials.txt.template └── lena.png ├── docs ├── ft-dl-loss.png ├── ft-dlw-loss.png ├── fttl-keras-nov2016.pptx ├── tl-dl1-loss.png ├── tl-dl2-loss.png ├── vgg16-ft.dia ├── vgg16-ft.png ├── vgg16-original.dia ├── vgg16-original.png ├── vgg16-tl.dia └── vgg16-tl.png └── src ├── augment-images.ipynb ├── augment-images.py ├── caffe2keras-rebuild.ipynb ├── caffe2keras-rebuild.py ├── caffe2keras-save.py ├── confusion-to-heatmap.py ├── examine-trained-model.ipynb ├── ft-dl-train.py ├── ft-dlw-train.py ├── fttlutils.py ├── image-convolutions.ipynb ├── make-sample.py ├── preprocess-images.py ├── sample-images.py ├── tl-dl-aug-train.py ├── tl-dl1-train.py ├── tl-dl2-train.py ├── tl-lr-aug-train.py ├── tl-lr-train.py └── vectorize-images.py /.gitignore: -------------------------------------------------------------------------------- 1 | *pyc 2 | data/* 3 | .* 4 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Transfer Learning and Fine Tuning for Cross Domain Image Classification with Keras 2 | 3 | Supporting code for my talk at the [Accel.AI](http://accel.ai/) Demystifying Deep Learning and AI event on November 19-20, 2016 in Oakland, CA. 4 | 5 | Slides are [here](http://www.slideshare.net/sujitpal/transfer-learning-and-fine-tuning-for-cross-domain-image-classification-with-keras). 6 | 7 | ## Abstract: 8 | 9 | I describe how a Deep Convolutional Network (DCNN) trained on the ImageNet dataset can be used to classify images in a completely different domain. The intuition that the training process teaches the DCNN to extract good features from images is explored with visualizations. Transfer Learning freezes the bottom layers of the DCNN to extract image vectors from a training set in a different domain, which can then be used to train a new classifier for this domain. Fine Tuning involves training the pre-trained network further on the target domain. Both approaches are demonstrated using a VGG-16 network pre-trained on ImageNet to classify medical images into 5 categories. Code examples are provided using Keras. 10 | 11 | ## Dataset 12 | 13 | The dataset comes from the [Diabetic Retinopathy Detection competition](https://www.kaggle.com/c/diabetic-retinopathy-detection) on Kaggle, and consists of 35,126 digital color fundus photographs of the retina. The code here uses a sample of 1,000 images drawn from this dataset, 200 from each of the 5 Diabetic Retinopathy severity classes (No DR, Mild DR, Moderate DR, Severe DR and Proliferative DR). See [data/README.md](data/README.md) for details. 14 | 15 | ## VGG-16 Model 16 | 17 | 18 | 19 | ## Results 20 | 21 | ### Transfer Learning 22 | 23 | 24 | 25 | ### Transfer Learning + Logistic Regression 26 | 27 | This result (based on Cohen's Kappa score) places the entry at position 79-80 on the public leaderboard (as of Nov 9 2016).
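The idea behind this entry: VGG-16 (frozen, with its classification layers removed) turns each image into a feature vector, and a logistic regression classifier is trained on those vectors. Below is a minimal sketch, using random placeholder data in place of the preprocessed retina images; the repository's actual implementation is split across src/vectorize-images.py and src/tl-lr-train.py.

```python
from keras.applications.vgg16 import VGG16
from keras.models import Model
from sklearn.linear_model import LogisticRegression
import numpy as np

# placeholder batch standing in for preprocessed 224x224 RGB images
X = np.random.random((100, 224, 224, 3)).astype("float32")
y = np.random.randint(0, 5, size=100)

# VGG-16 truncated at block5_pool acts as a fixed feature extractor
vgg16 = VGG16(weights="imagenet", include_top=True)
extractor = Model(input=vgg16.input,
                  output=vgg16.get_layer("block5_pool").output)

# each image becomes a 7 * 7 * 512 = 25,088-dimensional vector
vectors = extractor.predict(X).reshape((X.shape[0], -1))

# train a conventional classifier on the frozen-network features
clf = LogisticRegression()
clf.fit(vectors, y)
```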
28 | 29 | Accuracy: 0.36333, Cohen's Kappa Score: 0.51096 30 | 31 | Confusion Matrix: 32 | [[15 19 17 9 0] 33 | [20 24 10 5 1] 34 | [12 13 13 12 10] 35 | [ 7 4 11 24 14] 36 | [ 4 2 11 10 33]] 37 | 38 | Classification Report: 39 | precision recall f1-score support 40 | 41 | 0 0.26 0.25 0.25 60 42 | 1 0.39 0.40 0.39 60 43 | 2 0.21 0.22 0.21 60 44 | 3 0.40 0.40 0.40 60 45 | 4 0.57 0.55 0.56 60 46 | 47 | avg / total 0.36 0.36 0.36 300 48 | 49 | ### Transfer Learning + 1 layer MLP 50 | 51 | This result (based on Cohen's Kappa score) places the entry at position 25-26 on the public leaderboard (as of Nov 9 2016). 52 | 53 | 54 | 55 | Final Model (DL#1) 56 | 57 | Accuracy: 0.66667, Cohen's Kappa Score: 0.74558 58 | 59 | Confusion Matrix: 60 | [[40 6 8 0 0] 61 | [ 5 46 12 2 0] 62 | [ 7 5 46 4 2] 63 | [ 5 5 8 31 11] 64 | [ 2 3 6 9 37]] 65 | 66 | Classification Report: 67 | precision recall f1-score support 68 | 69 | 1 0.68 0.74 0.71 54 70 | 2 0.71 0.71 0.71 65 71 | 3 0.57 0.72 0.64 64 72 | 4 0.67 0.52 0.58 60 73 | 5 0.74 0.65 0.69 57 74 | 75 | avg / total 0.67 0.67 0.67 300 76 | 77 | 78 | ### Transfer Learning + 2 layer MLP 79 | 80 | This configuration performs slightly worse than the 1 layer MLP, and the additional layer also makes the network slower to train. Other tuning included increasing the batch size and using the Adadelta optimizer with a lower learning rate. 81 | 82 | 83 | 84 | Final Model (DL#2) 85 | 86 | Accuracy: 0.63333, Cohen's Kappa Score: 0.70822 87 | 88 | Confusion Matrix: 89 | [[36 11 4 1 2] 90 | [13 45 4 1 2] 91 | [ 5 15 35 5 4] 92 | [ 5 8 6 31 10] 93 | [ 4 1 3 6 43]] 94 | 95 | Classification Report: 96 | precision recall f1-score support 97 | 98 | 1 0.57 0.67 0.62 54 99 | 2 0.56 0.69 0.62 65 100 | 3 0.67 0.55 0.60 64 101 | 4 0.70 0.52 0.60 60 102 | 5 0.70 0.75 0.73 57 103 | 104 | avg / total 0.64 0.63 0.63 300 105 | 106 | 107 | ### Fine Tuning 108 | 109 | 110 | 111 | ### Fine Tuning with Random Weights for FC 112 | 113 | This result (based on Cohen's Kappa score) places the entry at position 26-27 on the public leaderboard (as of Nov 9 2016). 114 | 115 | 116 | 117 | Final Model (FT#1) 118 | 119 | Accuracy: 0.61667, Cohen's Kappa Score: 0.74487 120 | 121 | Confusion Matrix: 122 | [[32 7 10 4 1] 123 | [13 37 11 4 0] 124 | [ 7 3 45 6 3] 125 | [ 1 4 8 32 15] 126 | [ 0 3 3 12 39]] 127 | 128 | Classification Report: 129 | precision recall f1-score support 130 | 131 | 0 0.60 0.59 0.60 54 132 | 1 0.69 0.57 0.62 65 133 | 2 0.58 0.70 0.64 64 134 | 3 0.55 0.53 0.54 60 135 | 4 0.67 0.68 0.68 57 136 | 137 | avg / total 0.62 0.62 0.62 300 138 | 139 | 140 | ### Fine Tuning with Learned Weights for FC 141 | 142 | This result (based on Cohen's Kappa score) places the entry at position 32-33 on the public leaderboard (as of Nov 9 2016). 143 | 144 | 145 | 146 | Final Model (FT#2) 147 | 148 | Accuracy: 0.63000, Cohen's Kappa Score: 0.72214 149 | 150 | Confusion Matrix: 151 | [[34 11 6 2 1] 152 | [ 5 49 8 3 0] 153 | [ 6 12 40 5 1] 154 | [ 3 9 10 30 8] 155 | [ 1 3 7 10 36]] 156 | 157 | Classification Report: 158 | precision recall f1-score support 159 | 160 | 0 0.69 0.63 0.66 54 161 | 1 0.58 0.75 0.66 65 162 | 2 0.56 0.62 0.59 64 163 | 3 0.60 0.50 0.55 60 164 | 4 0.78 0.63 0.70 57 165 | 166 | avg / total 0.64 0.63 0.63 300 167 | 168 | 169 | ## References 170 | 171 | * [How to use pre-trained models with Keras](https://keras.io/getting-started/faq/#how-can-i-use-pre-trained-models-in-keras) - Keras documentation by Francois Chollet.
172 | * [Keras Applications (pretrained Deep Learning Models)](https://keras.io/applications/) - Keras documentation by Francois Chollet. 173 | * [Keras Blog: Building Powerful Image Classification Models using very little data](https://blog.keras.io/building-powerful-image-classification-models-using-very-little-data.html) - blog post by Francois Chollet, creator of Keras. 174 | * [KDNuggets: Recycling Deep Learning Representations with Transfer Learning](http://www.kdnuggets.com/2015/08/recycling-deep-learning-representations-transfer-ml.html) - blog post by Zachary Chase Lipton. 175 | * [Deep Neural Networks in Azure: Transfer Learning and Fine Tuning](https://info.microsoft.com/CO-AAIoT-WBNR-FY17-09Sep-27-Deep-Neural-Networks-in-Azure-Transfer-Learning-and-Fine-tuning-253624_Registration.html) - webinar presented by Anusua Trivedi of Microsoft. 176 | * [The Neural Network Zoo](http://www.asimovinstitute.org/neural-network-zoo/) - blog post by Fjodor Van Veen of The Asimov Institute. 177 | 178 | -------------------------------------------------------------------------------- /data/Makefile: -------------------------------------------------------------------------------- 1 | # Requires presence of credentials.txt file containing login/password in the following format: 2 | # UserName=my_username&Password=my_password 3 | 4 | COMPETITION=diabetic-retinopathy-detection 5 | 6 | all: download_files 7 | 8 | session.cookie: credentials.txt 9 | curl -o /dev/null -c session.cookie https://www.kaggle.com/account/login 10 | curl -o /dev/null -c session.cookie -b session.cookie -L -d @credentials.txt https://www.kaggle.com/account/login 11 | 12 | files.txt: session.cookie 13 | curl -c session.cookie -b session.cookie -L http://www.kaggle.com/c/$(COMPETITION)/data | \ 14 | grep -o \"[^\"]*\/download[^\"]*\" | sed -e 's/"//g' -e 's/^/http:\/\/www.kaggle.com/' > files.txt 15 | 16 | download_files: files.txt session.cookie 17 | mkdir -p files 18 | cd files && xargs -n 1 curl -b ../session.cookie -L -O < ../files.txt 19 | 20 | .PHONY: clean 21 | 22 | clean: 23 | rm session.cookie files.txt files/*.zip 24 | -------------------------------------------------------------------------------- /data/README.md: -------------------------------------------------------------------------------- 1 | # Data for this project 2 | 3 | Data for this project comes from the [Kaggle Diabetic Retinopathy Detection Competition](https://www.kaggle.com/c/diabetic-retinopathy-detection). We use a random sample of 1,000 images from the 35,126 images in this dataset. 4 | 5 | The following Makefile will create the directory structure under this directory. Unfortunately, the curl commands download HTML pages instead of the actual train.zip.00[1-5] files, so you will need to download those off the site via the browser. 6 | 7 | You will also need to create the credentials.txt file from the provided template file, replacing the values with your Kaggle user name and password. Then run the following commands. 8 | 9 | make 10 | cd files 11 | 12 | Next, manually download trainLabels.csv.zip and sampleSubmission.csv.zip, replacing the HTML versions that the make command downloaded. Do the same for the train.zip.00[1-5] files. 13 | 14 | unzip -a trainLabels.csv.zip 15 | rm trainLabels.csv.zip 16 | unzip -a sampleSubmission.csv.zip 17 | rm sampleSubmission.csv.zip 18 | cd files 19 | rm test.zip* 20 | 7za x train.zip.001 21 | 22 | The last command will write the 35,126 images into the train directory under the current directory (files).
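A quick sanity check, assuming you are still in the files directory, is to count the extracted files:

    ls train | wc -l

This should print 35126.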
23 | 24 | You can now build the sample from the dataset. 25 | 26 | cd .. 27 | ../../src/make-sample.py 28 | 29 | This will generate a shell script sampleImages.sh in the current directory. 30 | 31 | cd files 32 | mkdir sample 33 | cd sample 34 | mkdir 0 1 2 3 4 35 | cd ../.. 36 | bash sampleImages.sh 37 | 38 | This will copy 1,000 images, 200 into each of the 5 category directories. 39 | 40 | Finally, create a directory to hold the models. 41 | 42 | cd .. # you are in files now 43 | mkdir models 44 | 45 | -------------------------------------------------------------------------------- /data/credentials.txt.template: -------------------------------------------------------------------------------- 1 | Username=your_kaggle_username&Password=your_kaggle_password 2 | -------------------------------------------------------------------------------- /data/lena.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/sujitpal/fttl-with-keras/330c5571f68958f27aaa56f80b14babc2afd3a45/data/lena.png -------------------------------------------------------------------------------- /docs/ft-dl-loss.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/sujitpal/fttl-with-keras/330c5571f68958f27aaa56f80b14babc2afd3a45/docs/ft-dl-loss.png -------------------------------------------------------------------------------- /docs/ft-dlw-loss.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/sujitpal/fttl-with-keras/330c5571f68958f27aaa56f80b14babc2afd3a45/docs/ft-dlw-loss.png -------------------------------------------------------------------------------- /docs/fttl-keras-nov2016.pptx: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/sujitpal/fttl-with-keras/330c5571f68958f27aaa56f80b14babc2afd3a45/docs/fttl-keras-nov2016.pptx -------------------------------------------------------------------------------- /docs/tl-dl1-loss.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/sujitpal/fttl-with-keras/330c5571f68958f27aaa56f80b14babc2afd3a45/docs/tl-dl1-loss.png -------------------------------------------------------------------------------- /docs/tl-dl2-loss.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/sujitpal/fttl-with-keras/330c5571f68958f27aaa56f80b14babc2afd3a45/docs/tl-dl2-loss.png -------------------------------------------------------------------------------- /docs/vgg16-ft.dia: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/sujitpal/fttl-with-keras/330c5571f68958f27aaa56f80b14babc2afd3a45/docs/vgg16-ft.dia -------------------------------------------------------------------------------- /docs/vgg16-ft.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/sujitpal/fttl-with-keras/330c5571f68958f27aaa56f80b14babc2afd3a45/docs/vgg16-ft.png -------------------------------------------------------------------------------- /docs/vgg16-original.dia: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/sujitpal/fttl-with-keras/330c5571f68958f27aaa56f80b14babc2afd3a45/docs/vgg16-original.dia
-------------------------------------------------------------------------------- /docs/vgg16-original.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/sujitpal/fttl-with-keras/330c5571f68958f27aaa56f80b14babc2afd3a45/docs/vgg16-original.png -------------------------------------------------------------------------------- /docs/vgg16-tl.dia: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/sujitpal/fttl-with-keras/330c5571f68958f27aaa56f80b14babc2afd3a45/docs/vgg16-tl.dia -------------------------------------------------------------------------------- /docs/vgg16-tl.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/sujitpal/fttl-with-keras/330c5571f68958f27aaa56f80b14babc2afd3a45/docs/vgg16-tl.png -------------------------------------------------------------------------------- /src/augment-images.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | from __future__ import division, print_function 3 | from keras.applications.vgg16 import VGG16 4 | from keras.applications.vgg16 import preprocess_input 5 | from keras.models import Model 6 | from keras.optimizers import SGD 7 | from keras.preprocessing import image 8 | from sklearn.cross_validation import StratifiedShuffleSplit 9 | import numpy as np 10 | import os 11 | 12 | def get_next_image_loc(imgdir): 13 | for root, dirs, files in os.walk(imgdir): 14 | for name in files: 15 | path = os.path.join(root, name).split(os.path.sep)[::-1] 16 | yield (path[1], path[0]) 17 | 18 | 19 | def write_vectors(model, X, y, tag, data_dir, batch_size): 20 | fXout = open(os.path.join(data_dir, 21 | "images-500-{:s}-X.txt".format(tag)), "wb") 22 | fyout = open(os.path.join(data_dir, 23 | "images-500-{:s}-y.txt".format(tag)), "wb") 24 | num_written = 0 25 | for i in range(0, X.shape[0], batch_size): 26 | Xbatch = X[i:i + batch_size] 27 | ybatch = y[i:i + batch_size] 28 | vecs = model.predict(Xbatch) 29 | for vec, label in zip(vecs, ybatch): 30 | vec = vec.flatten() 31 | vec_str = ",".join(["{:.5f}".format(v) for v in vec.tolist()]) 32 | fXout.write("{:s}\n".format(vec_str)) 33 | fyout.write("{:d}\n".format(label)) 34 | if num_written % 100 == 0: 35 | print("\twrote {:d} {:s} records".format(num_written, tag)) 36 | num_written += 1 37 | print("\twrote {:d} {:s} records, COMPLETE".format(num_written, tag)) 38 | fXout.close() 39 | fyout.close() 40 | 41 | 42 | ########################## main ########################## 43 | 44 | DATA_DIR = "../data" 45 | IMAGE_DIR = os.path.join(DATA_DIR, "images-500") 46 | IMAGE_WIDTH = 224 47 | BATCH_SIZE = 10 48 | NUM_TO_AUGMENT = 10 49 | 50 | np.random.seed(42) 51 | 52 | # load images and labels from images directory 53 | print("Loading images and labels from images directory...") 54 | xs, ys = [], [] 55 | for label, image_file in get_next_image_loc(IMAGE_DIR): 56 | ys.append(int(label)) 57 | img = image.load_img(os.path.join(IMAGE_DIR, label, image_file), 58 | target_size=(IMAGE_WIDTH, IMAGE_WIDTH)) 59 | img4d = image.img_to_array(img) 60 | img4d = np.expand_dims(img4d, axis=0) 61 | img4d = preprocess_input(img4d) 62 | xs.append(img4d[0]) 63 | X = np.array(xs) 64 | y = np.array(ys) 65 | 66 | # using regular train_test_split results in classes not being represented 67 | print("Initial split into train/val/test...") 68 | splitter = StratifiedShuffleSplit(y, 
n_iter=1, test_size=0.3, 69 | random_state=42) 70 | for train, test in splitter: 71 | Xtrain, Xtest, ytrain, ytest = X[train], X[test], y[train], y[test] 72 | break 73 | print(Xtrain.shape, Xtest.shape, ytrain.shape, ytest.shape) 74 | 75 | # instantiate ImageDataGenerator to create approximately 10 images for 76 | # each input training image 77 | print("Augmenting training set images...") 78 | datagen = image.ImageDataGenerator( 79 | featurewise_center=True, 80 | featurewise_std_normalization=True, 81 | rotation_range=20, 82 | width_shift_range=0.2, 83 | height_shift_range=0.2, 84 | shear_range=0.2, 85 | horizontal_flip=True) 86 | 87 | xtas, ytas = [], [] 88 | for i in range(Xtrain.shape[0]): 89 | num_aug = 0 90 | x = Xtrain[i][np.newaxis] 91 | datagen.fit(x) 92 | for x_aug in datagen.flow(x, batch_size=1): 93 | if num_aug >= NUM_TO_AUGMENT: 94 | break 95 | xtas.append(x_aug[0]) 96 | ytas.append(ytrain[i]) 97 | num_aug += 1 98 | 99 | Xtrain = np.array(xtas) 100 | ytrain = np.array(ytas) 101 | 102 | print(Xtrain.shape, Xtest.shape, ytrain.shape, ytest.shape) 103 | 104 | # Instantiate VGG16 model and remove bottleneck 105 | print("Instantiating VGG16 model and removing top layers...") 106 | vgg16_model = VGG16(weights="imagenet", include_top=True) 107 | model = Model(input=vgg16_model.input, 108 | output=vgg16_model.get_layer("block5_pool").output) 109 | sgd = SGD(lr=0.1, decay=1e-6, momentum=0.9, nesterov=True) 110 | model.compile(optimizer=sgd, loss="categorical_crossentropy") 111 | 112 | # Read each of train, validation and test vectors out to named files 113 | print("Writing vectors to files...") 114 | write_vectors(model, Xtrain, ytrain, "train", DATA_DIR, BATCH_SIZE) 115 | write_vectors(model, Xtest, ytest, "test", DATA_DIR, BATCH_SIZE) 116 | -------------------------------------------------------------------------------- /src/caffe2keras-rebuild.py: -------------------------------------------------------------------------------- 1 | from __future__ import division, print_function 2 | from keras import backend as K 3 | from keras.engine.topology import Layer, InputSpec 4 | from keras.layers import Input 5 | from keras.layers.core import Activation, Dense, Flatten 6 | from keras.layers.convolutional import Convolution2D, ZeroPadding2D 7 | from keras.layers.normalization import BatchNormalization 8 | from keras.layers.pooling import MaxPooling2D 9 | from keras.models import Model, load_model 10 | from scipy.misc import imresize 11 | import matplotlib.pyplot as plt 12 | import numpy as np 13 | import os 14 | import re 15 | 16 | 17 | class LocalResponseNormalization(Layer): 18 | 19 | def __init__(self, n=5, alpha=0.0005, beta=0.75, k=2, **kwargs): 20 | self.n = n 21 | self.alpha = alpha 22 | self.beta = beta 23 | self.k = k 24 | super(LocalResponseNormalization, self).__init__(**kwargs) 25 | 26 | def build(self, input_shape): 27 | self.shape = input_shape 28 | super(LocalResponseNormalization, self).build(input_shape) 29 | 30 | def call(self, x, mask=None): 31 | if K.image_dim_ordering() == "th": 32 | _, f, r, c = self.shape 33 | else: 34 | _, r, c, f = self.shape 35 | half_n = self.n // 2 36 | squared = K.square(x) 37 | pooled = K.pool2d(squared, (half_n, half_n), strides=(1, 1), 38 | border_mode="same", pool_mode="avg") 39 | if K.image_dim_ordering() == "th": 40 | summed = K.sum(pooled, axis=1, keepdims=True) 41 | averaged = (self.alpha / self.n) * K.repeat_elements(summed, f, axis=1) 42 | else: 43 | summed = K.sum(pooled, axis=3, keepdims=True) 44 | averaged = (self.alpha / self.n) *
K.repeat_elements(summed, f, axis=3) 45 | denom = K.pow(self.k + averaged, self.beta) 46 | return x / denom 47 | 48 | def get_output_shape_for(self, input_shape): 49 | return input_shape 50 | 51 | def transform_conv_weight(W): 52 | # for non FC layers, do this because Keras/Theano does convolution vs 53 | # Caffe correlation 54 | for i in range(W.shape[0]): 55 | for j in range(W.shape[1]): 56 | W[i, j] = np.rot90(W[i, j], 2) 57 | return W 58 | 59 | def transform_fc_weight(W): 60 | return W.T 61 | 62 | def preprocess_image(img, resize_wh, mean_image): 63 | # resize 64 | img4d = imresize(img, (resize_wh, resize_wh)) 65 | img4d = img4d.astype("float32") 66 | # BGR -> RGB 67 | img4d = img4d[:, :, ::-1] 68 | # swap axes to theano mode 69 | img4d = np.transpose(img4d, (2, 0, 1)) 70 | # add batch dimension 71 | img4d = np.expand_dims(img4d, axis=0) 72 | # subtract mean image 73 | img4d -= mean_image 74 | # clip to uint 75 | img4d = np.clip(img4d, 0, 255).astype("uint8") 76 | return img4d 77 | 78 | DATA_DIR = "../data/vgg-cnn" 79 | CAT_IMAGE = os.path.join(DATA_DIR, "cat.jpg") 80 | MEAN_IMAGE = os.path.join(DATA_DIR, "mean_image.npy") 81 | CAFFE_WEIGHTS_DIR = os.path.join(DATA_DIR, "saved-weights") 82 | LABEL_FILE = os.path.join(DATA_DIR, "caffe2keras-labels.txt") 83 | KERAS_MODEL_FILE = os.path.join(DATA_DIR, "vggcnn-keras.h5") 84 | RESIZE_WH = 224 85 | 86 | # caffe model layers (reference) 87 | CAFFE_LAYER_NAMES = [ 88 | "data", 89 | "conv1", "norm1", "pool1", 90 | "conv2", "pool2", 91 | "conv3", 92 | "conv4", 93 | "conv5", "pool5", 94 | "fc6", 95 | "fc7", 96 | "prob" 97 | ] 98 | CAFFE_LAYER_SHAPES = { 99 | "data" : (10, 3, 224, 224), 100 | "conv1": (10, 96, 109, 109), 101 | "norm1": (10, 96, 109, 109), 102 | "pool1": (10, 96, 37, 37), 103 | "conv2": (10, 256, 33, 33), 104 | "pool2": (10, 256, 17, 17), 105 | "conv3": (10, 512, 17, 17), 106 | "conv4": (10, 512, 17, 17), 107 | "conv5": (10, 512, 17, 17), 108 | "pool5": (10, 512, 6, 6), 109 | "fc6" : (10, 4096), 110 | "fc7" : (10, 4096), 111 | "fc8" : (10, 1000), 112 | "prob" : (10, 1000) 113 | } 114 | 115 | print("caffe:") 116 | for layer_name in CAFFE_LAYER_NAMES: 117 | print(layer_name, CAFFE_LAYER_SHAPES[layer_name]) 118 | 119 | # data (10, 3, 224, 224) 120 | # conv1 (10, 96, 109, 109) 121 | # norm1 (10, 96, 109, 109) 122 | # pool1 (10, 96, 37, 37) 123 | # conv2 (10, 256, 33, 33) 124 | # pool2 (10, 256, 17, 17) 125 | # conv3 (10, 512, 17, 17) 126 | # conv4 (10, 512, 17, 17) 127 | # conv5 (10, 512, 17, 17) 128 | # pool5 (10, 512, 6, 6) 129 | # fc6 (10, 4096) 130 | # fc7 (10, 4096) 131 | # prob (10, 1000) 132 | 133 | # set theano dimension ordering 134 | # NOTE: results match Caffe using Theano backend only 135 | K.set_image_dim_ordering("th") 136 | 137 | # load weights 138 | W_conv1 = transform_conv_weight(np.load(os.path.join(CAFFE_WEIGHTS_DIR, "W_conv1.npy"))) 139 | b_conv1 = np.load(os.path.join(CAFFE_WEIGHTS_DIR, "b_conv1.npy")) 140 | 141 | W_conv2 = transform_conv_weight(np.load(os.path.join(CAFFE_WEIGHTS_DIR, "W_conv2.npy"))) 142 | b_conv2 = np.load(os.path.join(CAFFE_WEIGHTS_DIR, "b_conv2.npy")) 143 | 144 | W_conv3 = transform_conv_weight(np.load(os.path.join(CAFFE_WEIGHTS_DIR, "W_conv3.npy"))) 145 | b_conv3 = np.load(os.path.join(CAFFE_WEIGHTS_DIR, "b_conv3.npy")) 146 | 147 | W_conv4 = transform_conv_weight(np.load(os.path.join(CAFFE_WEIGHTS_DIR, "W_conv4.npy"))) 148 | b_conv4 = np.load(os.path.join(CAFFE_WEIGHTS_DIR, "b_conv4.npy")) 149 | 150 | W_conv5 = transform_conv_weight(np.load(os.path.join(CAFFE_WEIGHTS_DIR, "W_conv5.npy"))) 151 
| b_conv5 = np.load(os.path.join(CAFFE_WEIGHTS_DIR, "b_conv5.npy")) 152 | 153 | W_fc6 = transform_fc_weight(np.load(os.path.join(CAFFE_WEIGHTS_DIR, "W_fc6.npy"))) 154 | b_fc6 = np.load(os.path.join(CAFFE_WEIGHTS_DIR, "b_fc6.npy")) 155 | 156 | W_fc7 = transform_fc_weight(np.load(os.path.join(CAFFE_WEIGHTS_DIR, "W_fc7.npy"))) 157 | b_fc7 = np.load(os.path.join(CAFFE_WEIGHTS_DIR, "b_fc7.npy")) 158 | 159 | W_fc8 = transform_fc_weight(np.load(os.path.join(CAFFE_WEIGHTS_DIR, "W_fc8.npy"))) 160 | b_fc8 = np.load(os.path.join(CAFFE_WEIGHTS_DIR, "b_fc8.npy")) 161 | 162 | # define network 163 | data = Input(shape=(3, 224, 224), name="DATA") 164 | 165 | conv1 = Convolution2D(96, 7, 7, subsample=(2, 2), 166 | weights=(W_conv1, b_conv1))(data) 167 | conv1 = Activation("relu", name="CONV1")(conv1) 168 | 169 | norm1 = LocalResponseNormalization(name="NORM1")(conv1) 170 | 171 | pool1 = ZeroPadding2D(padding=(0, 2, 0, 2))(norm1) 172 | pool1 = MaxPooling2D(pool_size=(3, 3), strides=(3, 3), name="POOL1")(pool1) 173 | 174 | conv2 = Convolution2D(256, 5, 5, weights=(W_conv2, b_conv2))(pool1) 175 | conv2 = Activation("relu", name="CONV2")(conv2) 176 | 177 | pool2 = ZeroPadding2D(padding=(0, 1, 0, 1))(conv2) 178 | pool2 = MaxPooling2D(pool_size=(2, 2), strides=(2, 2), name="POOL2")(pool2) 179 | 180 | conv3 = ZeroPadding2D(padding=(0, 2, 0, 2))(pool2) 181 | conv3 = Convolution2D(512, 3, 3, weights=(W_conv3, b_conv3))(conv3) 182 | conv3 = Activation("relu", name="CONV3")(conv3) 183 | 184 | conv4 = ZeroPadding2D(padding=(0, 2, 0, 2))(conv3) 185 | conv4 = Convolution2D(512, 3, 3, weights=(W_conv4, b_conv4))(conv4) 186 | conv4 = Activation("relu", name="CONV4")(conv4) 187 | 188 | conv5 = ZeroPadding2D(padding=(0, 2, 0, 2))(conv4) 189 | conv5 = Convolution2D(512, 3, 3, weights=(W_conv5, b_conv5))(conv5) 190 | conv5 = Activation("relu", name="CONV5")(conv5) 191 | 192 | pool5 = ZeroPadding2D(padding=(0, 1, 0, 1))(conv5) 193 | pool5 = MaxPooling2D(pool_size=(3, 3), strides=(3, 3), name="POOL5")(pool5) 194 | 195 | fc6 = Flatten()(pool5) 196 | fc6 = Dense(4096, weights=(W_fc6, b_fc6))(fc6) 197 | fc6 = Activation("relu", name="FC6")(fc6) 198 | 199 | fc7 = Dense(4096, weights=(W_fc7, b_fc7))(fc6) 200 | fc7 = Activation("relu", name="FC7")(fc7) 201 | 202 | fc8 = Dense(1000, weights=(W_fc8, b_fc8), name="FC8")(fc7) 203 | prob = Activation("softmax", name="PROB")(fc8) 204 | 205 | model = Model(input=[data], output=[prob]) 206 | 207 | model.compile(optimizer="adam", loss="categorical_crossentropy") 208 | 209 | print("keras") 210 | for layer in model.layers: 211 | print(layer.name, layer.output_shape) 212 | 213 | # DATA (None, 3, 224, 224) 214 | # convolution2d_1 (None, 96, 109, 109) 215 | # CONV1 (None, 96, 109, 109) 216 | # NORM1 (None, 96, 109, 109) 217 | # zeropadding2d_1 (None, 96, 111, 111) 218 | # POOL1 (None, 96, 37, 37) 219 | # convolution2d_2 (None, 256, 33, 33) 220 | # CONV2 (None, 256, 33, 33) 221 | # zeropadding2d_2 (None, 256, 34, 34) 222 | # POOL2 (None, 256, 17, 17) 223 | # zeropadding2d_3 (None, 256, 19, 19) 224 | # convolution2d_3 (None, 512, 17, 17) 225 | # CONV3 (None, 512, 17, 17) 226 | # zeropadding2d_4 (None, 512, 19, 19) 227 | # convolution2d_4 (None, 512, 17, 17) 228 | # CONV4 (None, 512, 17, 17) 229 | # zeropadding2d_5 (None, 512, 19, 19) 230 | # convolution2d_5 (None, 512, 17, 17) 231 | # CONV5 (None, 512, 17, 17) 232 | # zeropadding2d_6 (None, 512, 18, 18) 233 | # POOL5 (None, 512, 6, 6) 234 | # flatten_1 (None, 18432) 235 | # dense_1 (None, 4096) 236 | # FC6 (None, 4096) 237 | # dense_2 (None, 4096) 
238 | # FC7 (None, 4096) 239 | # FC8 (None, 1000) 240 | # PROB (None, 1000) 241 | 242 | # prediction 243 | id2label = {} 244 | flabel = open(LABEL_FILE, "rb") 245 | for line in flabel: 246 | lid, lname = line.strip().split("\t") 247 | id2label[int(lid)] = lname 248 | flabel.close() 249 | 250 | mean_image = np.load(MEAN_IMAGE) 251 | image = plt.imread(CAT_IMAGE) 252 | img4d = preprocess_image(image, RESIZE_WH, mean_image) 253 | 254 | print(image.shape, mean_image.shape, img4d.shape) 255 | 256 | preds = model.predict(img4d)[0] 257 | print(np.argmax(preds)) 258 | # 281 259 | 260 | top_preds = np.argsort(preds)[::-1][0:10] 261 | # array([281, 285, 282, 277, 287, 284, 283, 263, 387, 892]) 262 | 263 | pred_probas = [(x, id2label[x], preds[x]) for x in top_preds] 264 | print(pred_probas) 265 | 266 | print("Saving model...") 267 | model.save(KERAS_MODEL_FILE) 268 | 269 | -------------------------------------------------------------------------------- /src/caffe2keras-save.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | from __future__ import division, print_function 3 | import caffe 4 | import numpy as np 5 | import os 6 | 7 | CAFFE_HOME="/home/ubuntu/mnt/caffe" 8 | 9 | MODEL_DIR = os.path.join(CAFFE_HOME, "models", "vgg_cnn_s") 10 | MODEL_PROTO = os.path.join(MODEL_DIR, "deploy.prototxt") 11 | MODEL_WEIGHTS = os.path.join(MODEL_DIR, "VGG_CNN_S.caffemodel") 12 | MEAN_IMAGE = os.path.join(MODEL_DIR, "VGG_mean.binaryproto") 13 | 14 | OUTPUT_DIR = "../../data/vgg-cnn-weights" 15 | 16 | caffe.set_mode_cpu() 17 | net = caffe.Net(MODEL_PROTO, MODEL_WEIGHTS, caffe.TEST) 18 | 19 | for k, v in net.params.items(): 20 | print(k, v[0].data.shape, v[1].data.shape) 21 | np.save(os.path.join(OUTPUT_DIR, "W_{:s}.npy".format(k)), v[0].data) 22 | np.save(os.path.join(OUTPUT_DIR, "b_{:s}.npy".format(k)), v[1].data) 23 | 24 | # layer W.shape b.shape 25 | #conv1 (96, 3, 7, 7) (96,) 26 | #conv2 (256, 96, 5, 5) (256,) 27 | #conv3 (512, 256, 3, 3) (512,) 28 | #conv4 (512, 512, 3, 3) (512,) 29 | #conv5 (512, 512, 3, 3) (512,) 30 | #fc6 (4096, 18432) (4096,) 31 | #fc7 (4096, 4096) (4096,) 32 | #fc8 (1000, 4096) (1000,) 33 | 34 | blob = caffe.proto.caffe_pb2.BlobProto() 35 | with open(MEAN_IMAGE, 'rb') as fmean: 36 | mean_data = fmean.read() 37 | blob.ParseFromString(mean_data) 38 | mu = np.array(caffe.io.blobproto_to_array(blob)) 39 | print("Mean image:", mu.shape) 40 | np.save(os.path.join(OUTPUT_DIR, "mean_image.npy"), mu) 41 | 42 | #Mean image: (1, 3, 224, 224) 43 | 44 | for layer_name, blob in net.blobs.iteritems(): 45 | print(layer_name, blob.data.shape) 46 | 47 | #data (10, 3, 224, 224) 48 | #conv1 (10, 96, 109, 109) 49 | #norm1 (10, 96, 109, 109) 50 | #pool1 (10, 96, 37, 37) 51 | #conv2 (10, 256, 33, 33) 52 | #pool2 (10, 256, 17, 17) 53 | #conv3 (10, 512, 17, 17) 54 | #conv4 (10, 512, 17, 17) 55 | #conv5 (10, 512, 17, 17) 56 | #pool5 (10, 512, 6, 6) 57 | #fc6 (10, 4096) 58 | #fc7 (10, 4096) 59 | #fc8 (10, 1000) 60 | #prob (10, 1000) -------------------------------------------------------------------------------- /src/confusion-to-heatmap.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | import numpy as np 3 | import matplotlib.pyplot as plt 4 | 5 | conf_arr = [[33,2,0,0,0,0,0,0,0,1,3], 6 | [3,31,0,0,0,0,0,0,0,0,0], 7 | [0,4,41,0,0,0,0,0,0,0,1], 8 | [0,1,0,30,0,6,0,0,0,0,1], 9 | [0,0,0,0,38,10,0,0,0,0,0], 10 | [0,0,0,3,1,39,0,0,0,0,4], 11 | [0,2,2,0,4,1,31,0,0,0,2], 12 | 
[0,1,0,0,0,0,0,36,0,2,0], 13 | [0,0,0,0,0,0,1,5,37,5,1], 14 | [3,0,0,0,0,0,0,0,0,39,0], 15 | [0,0,0,0,0,0,0,0,0,0,38]] 16 | 17 | norm_conf = [] 18 | for i in conf_arr: 19 | a = 0 20 | tmp_arr = [] 21 | a = sum(i, 0) 22 | for j in i: 23 | tmp_arr.append(float(j)/float(a)) 24 | norm_conf.append(tmp_arr) 25 | 26 | fig = plt.figure() 27 | plt.clf() 28 | ax = fig.add_subplot(111) 29 | ax.set_aspect(1) 30 | res = ax.imshow(np.array(norm_conf), cmap=plt.cm.jet, 31 | interpolation='nearest') 32 | 33 | width, height = np.array(conf_arr).shape 34 | 35 | for x in xrange(width): 36 | for y in xrange(height): 37 | ax.annotate(str(conf_arr[x][y]), xy=(y, x), 38 | horizontalalignment='center', 39 | verticalalignment='center') 40 | 41 | cb = fig.colorbar(res) 42 | alphabet = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ' 43 | plt.xticks(range(width), alphabet[:width]) 44 | plt.yticks(range(height), alphabet[:height]) 45 | plt.savefig('confusion_matrix.png', format='png') 46 | -------------------------------------------------------------------------------- /src/ft-dl-train.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | from __future__ import division, print_function 3 | from keras.applications.vgg16 import VGG16 4 | from keras.applications.vgg16 import preprocess_input 5 | from keras.callbacks import ModelCheckpoint 6 | from keras.layers import Dense, Dropout, Reshape 7 | from keras.optimizers import SGD 8 | from keras.models import Model, load_model 9 | from keras.preprocessing import image 10 | from keras.utils import np_utils 11 | import numpy as np 12 | import os 13 | 14 | import fttlutils 15 | 16 | ################################# main ################################# 17 | 18 | DATA_DIR = "../data/files" 19 | MODEL_DIR = os.path.join(DATA_DIR, "models") 20 | IMAGE_DIR = os.path.join(DATA_DIR, "sample") 21 | BATCH_SIZE = 32 22 | NUM_EPOCHS = 20 23 | IMAGE_WIDTH = 224 24 | 25 | np.random.seed(42) 26 | 27 | # data 28 | ys, fs = [], [] 29 | flabels = open(os.path.join(DATA_DIR, "images-y.txt"), "rb") 30 | for line in flabels: 31 | ys.append(int(line.strip())) 32 | flabels.close() 33 | ffilenames = open(os.path.join(DATA_DIR, "images-f.txt"), "rb") 34 | for line in ffilenames: 35 | fs.append(line.strip()) 36 | ffilenames.close() 37 | xs = [] 38 | for y, f in zip(ys, fs): 39 | img = image.load_img(os.path.join(IMAGE_DIR, str(y), f), 40 | target_size=(IMAGE_WIDTH, IMAGE_WIDTH)) 41 | img4d = image.img_to_array(img) 42 | img4d = np.expand_dims(img4d, axis=0) 43 | img4d = preprocess_input(img4d) 44 | xs.append(img4d[0]) 45 | 46 | X = np.array(xs) 47 | y = np.array(ys) 48 | Y = np_utils.to_categorical(y, nb_classes=5) 49 | 50 | Xtrain, Xtest, Ytrain, Ytest = fttlutils.train_test_split( 51 | X, Y, test_size=0.3, random_state=42) 52 | print(Xtrain.shape, Xtest.shape, Ytrain.shape, Ytest.shape) 53 | 54 | # build model 55 | 56 | # (1) instantiate VGG16 and remove top layers 57 | vgg16_model = VGG16(weights="imagenet", include_top=True) 58 | # visualize layers 59 | #print("VGG16 model layers") 60 | #for i, layer in enumerate(vgg16_model.layers): 61 | # print(i, layer.name, layer.output_shape) 62 | # 63 | #(0, 'input_6', (None, 224, 224, 3)) 64 | #(1, 'block1_conv1', (None, 224, 224, 64)) 65 | #(2, 'block1_conv2', (None, 224, 224, 64)) 66 | #(3, 'block1_pool', (None, 112, 112, 64)) 67 | #(4, 'block2_conv1', (None, 112, 112, 128)) 68 | #(5, 'block2_conv2', (None, 112, 112, 128)) 69 | #(6, 'block2_pool', (None, 56, 56, 128)) 70 | #(7, 'block3_conv1', (None, 56, 56, 256)) 71 |
#(8, 'block3_conv2', (None, 56, 56, 256)) 72 | #(9, 'block3_conv3', (None, 56, 56, 256)) 73 | #(10, 'block3_pool', (None, 28, 28, 256)) 74 | #(11, 'block4_conv1', (None, 28, 28, 512)) 75 | #(12, 'block4_conv2', (None, 28, 28, 512)) 76 | #(13, 'block4_conv3', (None, 28, 28, 512)) 77 | #(14, 'block4_pool', (None, 14, 14, 512)) 78 | #(15, 'block5_conv1', (None, 14, 14, 512)) 79 | #(16, 'block5_conv2', (None, 14, 14, 512)) 80 | #(17, 'block5_conv3', (None, 14, 14, 512)) 81 | #(18, 'block5_pool', (None, 7, 7, 512)) 82 | #(19, 'flatten', (None, 25088)) 83 | #(20, 'fc1', (None, 4096)) 84 | #(21, 'fc2', (None, 4096)) 85 | #(22, 'predictions', (None, 1000)) 86 | 87 | # (2) remove the top layer 88 | base_model = Model(input=vgg16_model.input, 89 | output=vgg16_model.get_layer("block5_pool").output) 90 | 91 | # (3) attach a new top layer 92 | base_out = base_model.output 93 | base_out = Reshape((25088,))(base_out) 94 | top_fc1 = Dense(256, activation="relu")(base_out) 95 | top_fc1 = Dropout(0.5)(top_fc1) 96 | # output layer: (None, 5) 97 | top_preds = Dense(5, activation="softmax")(top_fc1) 98 | 99 | # (4) freeze weights until the last but one convolution layer (block4_pool) 100 | for layer in base_model.layers[0:14]: 101 | layer.trainable = False 102 | 103 | # (5) create new hybrid model 104 | model = Model(input=base_model.input, output=top_preds) 105 | 106 | # (6) compile and train the model 107 | sgd = SGD(lr=1e-4, momentum=0.9) 108 | model.compile(optimizer=sgd, loss="categorical_crossentropy", 109 | metrics=["accuracy"]) 110 | 111 | best_model = os.path.join(MODEL_DIR, "ft-dl-model-best.h5") 112 | checkpoint = ModelCheckpoint(filepath=best_model, verbose=1, 113 | save_best_only=True) 114 | history = model.fit([Xtrain], [Ytrain], nb_epoch=NUM_EPOCHS, 115 | batch_size=BATCH_SIZE, validation_split=0.1, 116 | callbacks=[checkpoint]) 117 | fttlutils.plot_loss(history) 118 | 119 | # evaluate final model 120 | Ytest_ = model.predict(Xtest) 121 | ytest = np_utils.categorical_probas_to_classes(Ytest) 122 | ytest_ = np_utils.categorical_probas_to_classes(Ytest_) 123 | fttlutils.print_stats(ytest, ytest_, "Final Model (FT#1)") 124 | model.save(os.path.join(MODEL_DIR, "ft-dl-model-final.h5")) 125 | 126 | # load best model and evaluate 127 | model = load_model(os.path.join(MODEL_DIR, "ft-dl-model-best.h5")) 128 | model.compile(optimizer=sgd, loss="categorical_crossentropy", 129 | metrics=["accuracy"]) 130 | Ytest_ = model.predict(Xtest) 131 | ytest = np_utils.categorical_probas_to_classes(Ytest) 132 | ytest_ = np_utils.categorical_probas_to_classes(Ytest_) 133 | fttlutils.print_stats(ytest, ytest_, "Best Model (FT#1)") 134 | -------------------------------------------------------------------------------- /src/ft-dlw-train.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | from __future__ import division, print_function 3 | from keras.applications.vgg16 import VGG16 4 | from keras.applications.vgg16 import preprocess_input 5 | from keras.callbacks import ModelCheckpoint 6 | from keras.layers import Dense, Dropout, Reshape 7 | from keras.optimizers import SGD 8 | from keras.models import Model, load_model 9 | from keras.preprocessing import image 10 | from keras.utils import np_utils 11 | import numpy as np 12 | import os 13 | 14 | import fttlutils 15 | 16 | ################################# main ################################# 17 | 18 | DATA_DIR = "../data/files" 19 | MODEL_DIR = os.path.join(DATA_DIR, "models") 20 | IMAGE_DIR = 
os.path.join(DATA_DIR, "sample") 21 | BATCH_SIZE = 32 22 | NUM_EPOCHS = 20 23 | IMAGE_WIDTH = 224 24 | 25 | np.random.seed(42) 26 | 27 | # data 28 | ys, fs = [], [] 29 | flabels = open(os.path.join(DATA_DIR, "images-y.txt"), "rb") 30 | for line in flabels: 31 | ys.append(int(line.strip())) 32 | flabels.close() 33 | ffilenames = open(os.path.join(DATA_DIR, "images-f.txt"), "rb") 34 | for line in ffilenames: 35 | fs.append(line.strip()) 36 | ffilenames.close() 37 | xs = [] 38 | for y, f in zip(ys, fs): 39 | img = image.load_img(os.path.join(IMAGE_DIR, str(y), f), 40 | target_size=(IMAGE_WIDTH, IMAGE_WIDTH)) 41 | img4d = image.img_to_array(img) 42 | img4d = np.expand_dims(img4d, axis=0) 43 | img4d = preprocess_input(img4d) 44 | xs.append(img4d[0]) 45 | 46 | X = np.array(xs) 47 | y = np.array(ys) 48 | Y = np_utils.to_categorical(y, nb_classes=5) 49 | 50 | Xtrain, Xtest, Ytrain, Ytest = fttlutils.train_test_split( 51 | X, Y, test_size=0.3, random_state=42) 52 | print(Xtrain.shape, Xtest.shape, Ytrain.shape, Ytest.shape) 53 | 54 | # build model 55 | 56 | # (1) instantiate VGG16 and remove top layers 57 | vgg16_model = VGG16(weights="imagenet", include_top=True) 58 | # visualize layers 59 | #print("VGG16 model layers") 60 | #for i, layer in enumerate(vgg16_model.layers): 61 | # print(i, layer.name, layer.output_shape) 62 | # 63 | #(0, 'input_6', (None, 224, 224, 3)) 64 | #(1, 'block1_conv1', (None, 224, 224, 64)) 65 | #(2, 'block1_conv2', (None, 224, 224, 64)) 66 | #(3, 'block1_pool', (None, 112, 112, 64)) 67 | #(4, 'block2_conv1', (None, 112, 112, 128)) 68 | #(5, 'block2_conv2', (None, 112, 112, 128)) 69 | #(6, 'block2_pool', (None, 56, 56, 128)) 70 | #(7, 'block3_conv1', (None, 56, 56, 256)) 71 | #(8, 'block3_conv2', (None, 56, 56, 256)) 72 | #(9, 'block3_conv3', (None, 56, 56, 256)) 73 | #(10, 'block3_pool', (None, 28, 28, 256)) 74 | #(11, 'block4_conv1', (None, 28, 28, 512)) 75 | #(12, 'block4_conv2', (None, 28, 28, 512)) 76 | #(13, 'block4_conv3', (None, 28, 28, 512)) 77 | #(14, 'block4_pool', (None, 14, 14, 512)) 78 | #(15, 'block5_conv1', (None, 14, 14, 512)) 79 | #(16, 'block5_conv2', (None, 14, 14, 512)) 80 | #(17, 'block5_conv3', (None, 14, 14, 512)) 81 | #(18, 'block5_pool', (None, 7, 7, 512)) 82 | #(19, 'flatten', (None, 25088)) 83 | #(20, 'fc1', (None, 4096)) 84 | #(21, 'fc2', (None, 4096)) 85 | #(22, 'predictions', (None, 1000)) 86 | 87 | # (2) remove the top layer 88 | base_model = Model(input=vgg16_model.input, 89 | output=vgg16_model.get_layer("block5_pool").output) 90 | 91 | # (3) load trained model and attach to top 92 | base_out = base_model.output 93 | base_out = Reshape((25088,))(base_out) 94 | top_fc1 = Dense(256, activation="relu", name="dl1fc1")(base_out) 95 | top_fc1 = Dropout(0.5)(top_fc1) 96 | # output layer: (None, 5) 97 | top_preds = Dense(5, activation="softmax", name="dl1preds")(top_fc1) 98 | 99 | # (4) freeze weights until the last but one convolution layer (block4_pool) 100 | for layer in base_model.layers[0:14]: 101 | layer.trainable = False 102 | 103 | # (5) create new hybrid model 104 | model = Model(input=base_model.input, output=top_preds) 105 | 106 | # (6) load weights for the final layers 107 | model.load_weights(os.path.join(MODEL_DIR, "tl-dl1-model-final.h5"), 108 | by_name=True) 109 | 110 | # (7) compile and train the model 111 | sgd = SGD(lr=1e-4, momentum=0.9) 112 | model.compile(optimizer=sgd, loss="categorical_crossentropy", 113 | metrics=["accuracy"]) 114 | 115 | best_model = os.path.join(MODEL_DIR, "ft-dlw-model-best.h5") 116 | checkpoint = 
ModelCheckpoint(filepath=best_model, verbose=1, 117 | save_best_only=True) 118 | history = model.fit([Xtrain], [Ytrain], nb_epoch=NUM_EPOCHS, 119 | batch_size=BATCH_SIZE, validation_split=0.1, 120 | callbacks=[checkpoint]) 121 | fttlutils.plot_loss(history) 122 | 123 | # evaluate final model 124 | Ytest_ = model.predict(Xtest) 125 | ytest = np_utils.categorical_probas_to_classes(Ytest) 126 | ytest_ = np_utils.categorical_probas_to_classes(Ytest_) 127 | fttlutils.print_stats(ytest, ytest_, "Final Model (FT#2)") 128 | model.save(os.path.join(MODEL_DIR, "ft-dlw-model-final.h5")) 129 | 130 | # load best model and evaluate 131 | model = load_model(os.path.join(MODEL_DIR, "ft-dlw-model-best.h5")) 132 | model.compile(optimizer=sgd, loss="categorical_crossentropy", 133 | metrics=["accuracy"]) 134 | Ytest_ = model.predict(Xtest) 135 | ytest = np_utils.categorical_probas_to_classes(Ytest) 136 | ytest_ = np_utils.categorical_probas_to_classes(Ytest_) 137 | fttlutils.print_stats(ytest, ytest_, "Best Model (FT#2)") 138 | -------------------------------------------------------------------------------- /src/fttlutils.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | from __future__ import division, print_function 3 | from sklearn.model_selection import StratifiedShuffleSplit 4 | from sklearn.metrics import accuracy_score 5 | from sklearn.metrics import confusion_matrix 6 | from sklearn.metrics import classification_report 7 | from sklearn.metrics import cohen_kappa_score 8 | import matplotlib.pyplot as plt 9 | 10 | def train_test_split(X, Y, test_size, random_state): 11 | # using regular train_test_split results in classes not being represented 12 | splitter = StratifiedShuffleSplit(n_splits=1, 13 | test_size=test_size, 14 | random_state=random_state) 15 | for train, test in splitter.split(X, Y): 16 | Xtrain, Xtest, Ytrain, Ytest = X[train], X[test], Y[train], Y[test] 17 | break 18 | return Xtrain, Xtest, Ytrain, Ytest 19 | 20 | def plot_loss(history): 21 | # visualize training loss and accuracy 22 | plt.subplot(211) 23 | plt.title("Accuracy") 24 | plt.plot(history.history["acc"], color="r", label="Train") 25 | plt.plot(history.history["val_acc"], color="b", label="Validation") 26 | plt.legend(loc="best") 27 | 28 | plt.subplot(212) 29 | plt.title("Loss") 30 | plt.plot(history.history["loss"], color="r", label="Train") 31 | plt.plot(history.history["val_loss"], color="b", label="Validation") 32 | plt.legend(loc="best") 33 | 34 | plt.tight_layout() 35 | plt.show() 36 | 37 | def print_stats(ytest, ytest_, model_name): 38 | print(model_name) 39 | print("Accuracy: {:.5f}, Cohen's Kappa Score: {:.5f}".format( 40 | accuracy_score(ytest, ytest_), 41 | cohen_kappa_score(ytest, ytest_, weights="quadratic"))) 42 | print("Confusion Matrix:") 43 | print(confusion_matrix(ytest, ytest_)) 44 | print("Classification Report:") 45 | print(classification_report(ytest, ytest_)) 46 | 47 | -------------------------------------------------------------------------------- /src/make-sample.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | from __future__ import division, print_function 3 | import numpy as np 4 | import os 5 | 6 | DOWNLOAD_DIR = "../data/files" 7 | SHELL_SCRIPT = os.path.join(DOWNLOAD_DIR, "sampleImages.sh") 8 | LABEL_FILE = os.path.join(DOWNLOAD_DIR, "trainLabels.csv") 9 | 10 | label2images = {} 11 | flab = open(LABEL_FILE, "rb") 12 | for line in flab: 13 | if
line.startswith("image,"): 14 | continue 15 | image_name, label = line.strip().split(",") 16 | if label2images.has_key(label): 17 | label2images[label].append(image_name) 18 | else: 19 | label2images[label] = [image_name] 20 | flab.close() 21 | 22 | fsh = open(SHELL_SCRIPT, "wb") 23 | for label in label2images.keys(): 24 | indices = np.arange(len(label2images[label])) 25 | print("label=", label, "len=", len(label2images[label])) 26 | sample_indices = np.random.choice(indices, size=200, replace=False) 27 | images = label2images[label] 28 | for ind in sample_indices: 29 | print("cp {:s}.jpeg sample/{:s}/".format(images[ind], label)) 30 | fsh.write("cp {:s}.jpeg sample/{:s}/\n".format(images[ind], label)) 31 | 32 | fsh.close() 33 | -------------------------------------------------------------------------------- /src/preprocess-images.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | from __future__ import division, print_function 3 | import cv2 4 | import numpy as np 5 | import matplotlib.pyplot as plt 6 | import os 7 | 8 | 9 | def plot_images(images): 10 | images = images[0:9] 11 | fig, axes = plt.subplots(3, 3) 12 | axes = np.ravel(axes) 13 | for i in range(len(images)): 14 | if len(images[i].shape) == 2: 15 | axes[i].imshow(images[i], cmap="gray") 16 | else: 17 | axes[i].imshow(images[i], interpolation="nearest") 18 | axes[i].set_xticks([]) 19 | axes[i].set_yticks([]) 20 | plt.xticks([]) 21 | plt.yticks([]) 22 | plt.tight_layout() 23 | plt.show() 24 | 25 | 26 | def get_next_image_loc(imgdir): 27 | for root, dirs, files in os.walk(imgdir): 28 | for name in files: 29 | path = os.path.join(root, name).split(os.path.sep)[::-1] 30 | yield (path[1], path[0]) 31 | 32 | 33 | def compute_edges(image): 34 | image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY) 35 | image = cv2.GaussianBlur(image, (11, 11), 0) 36 | sobel_x = cv2.Sobel(image, cv2.CV_64F, 1, 0) 37 | sobel_x = np.uint8(np.absolute(sobel_x)) 38 | sobel_y = cv2.Sobel(image, cv2.CV_64F, 0, 1) 39 | sobel_y = np.uint8(np.absolute(sobel_y)) 40 | edged = cv2.bitwise_or(sobel_x, sobel_y) 41 | return edged 42 | 43 | 44 | def crop_image_to_edge(image, threshold=10, margin=0.2): 45 | edged = compute_edges(image) 46 | # find edge along center and crop 47 | mid_y = edged.shape[0] // 2 48 | notblack_x = np.where(edged[mid_y, :] >= threshold)[0] 49 | if notblack_x.shape[0] == 0: 50 | lb_x = 0 51 | ub_x = edged.shape[1] 52 | else: 53 | lb_x = notblack_x[0] 54 | ub_x = notblack_x[-1] 55 | if lb_x > margin * edged.shape[1]: 56 | lb_x = 0 57 | if (edged.shape[1] - ub_x) > margin * edged.shape[1]: 58 | ub_x = edged.shape[1] 59 | mid_x = edged.shape[1] // 2 60 | notblack_y = np.where(edged[:, mid_x] >= threshold)[0] 61 | if notblack_y.shape[0] == 0: 62 | lb_y = 0 63 | ub_y = edged.shape[0] 64 | else: 65 | lb_y = notblack_y[0] 66 | ub_y = notblack_y[-1] 67 | if lb_y > margin * edged.shape[0]: 68 | lb_y = 0 69 | if (edged.shape[0] - ub_y) > margin * edged.shape[0]: 70 | ub_y = edged.shape[0] 71 | cropped = image[lb_y:ub_y, lb_x:ub_x, :] 72 | return cropped 73 | 74 | 75 | def crop_image_to_aspect(image, tar=1.2): 76 | # load image 77 | image_bw = cv2.cvtColor(image, cv2.COLOR_RGB2GRAY) 78 | # compute aspect ratio 79 | h, w = image_bw.shape[0], image_bw.shape[1] 80 | sar = h / w if h > w else w / h 81 | if sar < tar: 82 | return image 83 | else: 84 | k = 0.5 * (1.0 - (tar / sar)) 85 | if h > w: 86 | lb = int(k * h) 87 | ub = h - lb 88 | cropped = image[lb:ub, :, :] 89 | else: 90 | lb = int(k * w) 91 | ub = w
- lb 92 | cropped = image[:, lb:ub, :] 93 | return cropped 94 | 95 | 96 | def brighten_image_hsv(image, global_mean_v): 97 | image_hsv = cv2.cvtColor(image, cv2.COLOR_RGB2HSV) 98 | h, s, v = cv2.split(image_hsv) 99 | mean_v = int(np.mean(v)) 100 | v = v - mean_v + global_mean_v 101 | image_hsv = cv2.merge((h, s, v)) 102 | image_bright = cv2.cvtColor(image_hsv, cv2.COLOR_HSV2RGB) 103 | return image_bright 104 | 105 | 106 | def brighten_image_rgb(image, global_mean_rgb): 107 | r, g, b = cv2.split(image) 108 | m = np.array([np.mean(r), np.mean(g), np.mean(b)]) 109 | brightened = image + global_mean_rgb - m 110 | return brightened 111 | 112 | 113 | ############################# main ############################# 114 | 115 | DATA_DIR = "../data/files/sample" 116 | DATA_DIR2 = "../data/files/sample2" 117 | 118 | # random sample for printing 119 | sample_image_idxs = set(np.random.randint(0, high=1000, size=9).tolist()) 120 | sample_images = [] 121 | 122 | curr_idx = 0 123 | vs = [] 124 | mean_rgbs = [] 125 | for image_dir, image_name in get_next_image_loc(DATA_DIR): 126 | if curr_idx % 100 == 0: 127 | print("Reading {:d} images".format(curr_idx)) 128 | image = cv2.imread(os.path.join(DATA_DIR, image_dir, image_name)) 129 | if curr_idx in sample_image_idxs: 130 | sample_images.append(image) 131 | image_hsv = cv2.cvtColor(image, cv2.COLOR_BGR2HSV) 132 | h, s, v = cv2.split(image_hsv) 133 | vs.append(np.mean(v)) 134 | image_rgb = cv2.cvtColor(image, cv2.COLOR_BGR2RGB) 135 | r, g, b = cv2.split(image_rgb) 136 | mean_rgbs.append(np.array([np.mean(r), np.mean(g), np.mean(b)])) 137 | curr_idx += 1 138 | print("Reading {:d} images, complete".format(curr_idx)) 139 | global_mean_v = int(np.mean(np.array(vs))) 140 | global_mean_rgbs = np.mean(mean_rgbs, axis=0) 141 | 142 | # plot sample images at various steps 143 | sample_images_rgb = [cv2.cvtColor(simg, cv2.COLOR_BGR2RGB) 144 | for simg in sample_images] 145 | plot_images(sample_images_rgb) 146 | sample_cropped = [crop_image_to_aspect(simg) for simg in sample_images_rgb] 147 | sample_resized = [cv2.resize(simg, (int(1.2 * 224), 224)) 148 | for simg in sample_cropped] 149 | plot_images(sample_resized) 150 | sample_brightened_hsv = [brighten_image_hsv(simg, global_mean_v) 151 | for simg in sample_resized] 152 | plot_images(sample_brightened_hsv) 153 | sample_brightened_rgb = [brighten_image_rgb(simg, global_mean_rgbs) 154 | for simg in sample_resized] 155 | plot_images(sample_brightened_rgb) 156 | 157 | # save all images to disk 158 | curr_idx = 0 159 | for image_dir, image_name in get_next_image_loc(DATA_DIR): 160 | if curr_idx % 100 == 0: 161 | print("Writing {:d} preprocessed images".format(curr_idx)) 162 | image = cv2.imread(os.path.join(DATA_DIR, image_dir, image_name)) 163 | cropped = crop_image_to_aspect(image) 164 | resized = cv2.resize(cropped, (int(1.2 * 224), 224)) 165 | # brightened = brighten_image_hsv(resized, global_mean_v) 166 | brightened = brighten_image_rgb(resized, global_mean_rgbs) 167 | plt.imsave(os.path.join(DATA_DIR2, image_dir, image_name), brightened) 168 | curr_idx += 1 169 | print("Wrote {:d} images, complete".format(curr_idx)) 170 | -------------------------------------------------------------------------------- /src/sample-images.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | from __future__ import division, print_function 3 | import matplotlib.pyplot as plt 4 | import numpy as np 5 | import os 6 | 7 | IMAGE_DIR = "../data/files/sample" 8 | 9 |
################################## 10 | # Presentation Goals slide 11 | ################################## 12 | def presentation_goals(): 13 | files = os.listdir(IMAGE_DIR) 14 | fig, axes = plt.subplots(3, 3) 15 | axes = np.ravel(axes) 16 | for i in range(9): 17 | label = np.random.randint(5) 18 | files = os.listdir(os.path.join(IMAGE_DIR, str(label))) 19 | img_file = files[np.random.randint(len(files))] 20 | img = plt.imread(os.path.join(IMAGE_DIR, str(label), img_file)) 21 | axes[i].imshow(img, interpolation="nearest") 22 | axes[i].set_xticks([]) 23 | axes[i].set_yticks([]) 24 | plt.xticks([]) 25 | plt.yticks([]) 26 | plt.tight_layout() 27 | plt.show() 28 | 29 | ################################## 30 | # What is DR slide 31 | ################################## 32 | def what_is_dr(): 33 | plt.subplot(511) 34 | img = plt.imread(os.path.join(IMAGE_DIR, "0", "13363_left.jpeg")) 35 | plt.title("No DR") 36 | plt.imshow(img) 37 | plt.xticks([]) 38 | plt.yticks([]) 39 | 40 | plt.subplot(512) 41 | img = plt.imread(os.path.join(IMAGE_DIR, "1", "14664_left.jpeg")) 42 | plt.title("Mild DR") 43 | plt.imshow(img) 44 | plt.xticks([]) 45 | plt.yticks([]) 46 | 47 | plt.subplot(513) 48 | img = plt.imread(os.path.join(IMAGE_DIR, "2", "14323_left.jpeg")) 49 | plt.title("Moderate DR") 50 | plt.imshow(img) 51 | plt.xticks([]) 52 | plt.yticks([]) 53 | 54 | plt.subplot(514) 55 | img = plt.imread(os.path.join(IMAGE_DIR, "3", "12612_right.jpeg")) 56 | plt.title("Severe DR") 57 | plt.imshow(img) 58 | plt.xticks([]) 59 | plt.yticks([]) 60 | 61 | plt.subplot(515) 62 | plt.title("Proliferative DR") 63 | img = plt.imread(os.path.join(IMAGE_DIR, "4", "15376_left.jpeg")) 64 | plt.imshow(img) 65 | plt.xticks([]) 66 | plt.yticks([]) 67 | 68 | plt.tight_layout() 69 | plt.show() 70 | 71 | ######################## main ######################## 72 | presentation_goals() 73 | #what_is_dr() -------------------------------------------------------------------------------- /src/tl-dl-aug-train.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | from __future__ import division, print_function 3 | from keras.callbacks import ModelCheckpoint 4 | from keras.layers import Dense, Dropout, Input 5 | from keras.models import Model, load_model 6 | from keras.utils import np_utils 7 | from sklearn.metrics import * 8 | from sklearn.cross_validation import StratifiedShuffleSplit 9 | import numpy as np 10 | import matplotlib.pyplot as plt 11 | import os 12 | 13 | DATA_DIR = "../data" 14 | NUM_EPOCHS = 75 15 | BATCH_SIZE = 32 16 | 17 | # data 18 | print("Loading data...") 19 | Xtrain = np.loadtxt(os.path.join(DATA_DIR, "images-500-train-X.txt"), 20 | delimiter=",") 21 | ytrain = np.loadtxt(os.path.join(DATA_DIR, "images-500-train-y.txt"), 22 | delimiter=",", dtype=np.int) 23 | Ytrain = np_utils.to_categorical(ytrain - 1, nb_classes=5) 24 | print("\ttrain:", Xtrain.shape, Ytrain.shape) 25 | 26 | Xtest = np.loadtxt(os.path.join(DATA_DIR, "images-500-test-X.txt"), 27 | delimiter=",") 28 | ytest = np.loadtxt(os.path.join(DATA_DIR, "images-500-test-y.txt"), 29 | delimiter=",", dtype=np.int) 30 | Ytest = np_utils.to_categorical(ytest - 1, nb_classes=5) 31 | print("\ttest:", Xtest.shape, Ytest.shape) 32 | 33 | np.random.seed(42) 34 | 35 | # model 36 | # input: (None, 25088) 37 | imgvecs = Input(shape=(Xtrain.shape[1],), dtype="float32") 38 | # hidden layer: (None, 256) 39 | fc1 = Dense(256, activation="relu")(imgvecs) 40 | fc1_drop = Dropout(0.5)(fc1) 41 | # output layer: (None, 5) 42 |
labels = Dense(5, activation="softmax")(fc1_drop) 43 | 44 | ## model 2 45 | ## input: (None, 25088) 46 | #imgvecs = Input(shape=(Xtrain.shape[1],), dtype="float32") 47 | ## hidden layer: (None, 4096) 48 | #fc1 = Dense(4096, activation="relu")(imgvecs) 49 | #fc1_drop = Dropout(0.5)(fc1) 50 | ## hidden layer: (None, 256) 51 | #fc2 = Dense(256, activation="relu")(fc1_drop) 52 | #fc2_drop = Dropout(0.5)(fc2) 53 | ## output layer: (None, 5) 54 | #labels = Dense(5, activation="softmax")(fc2_drop) 55 | 56 | 57 | model = Model(input=[imgvecs], output=[labels]) 58 | 59 | model.compile(optimizer="adadelta", loss="categorical_crossentropy", 60 | metrics=["accuracy"]) 61 | 62 | best_model = os.path.join(DATA_DIR, "dl-model-aug-best.h5") 63 | checkpoint = ModelCheckpoint(filepath=best_model, verbose=1, 64 | save_best_only=True) 65 | history = model.fit([Xtrain], [Ytrain], nb_epoch=NUM_EPOCHS, 66 | batch_size=BATCH_SIZE, validation_split=0.1, 67 | callbacks=[checkpoint]) 68 | 69 | # visualize training loss and accuracy 70 | plt.subplot(211) 71 | plt.title("Accuracy") 72 | plt.plot(history.history["acc"], color="r", label="Train") 73 | plt.plot(history.history["val_acc"], color="b", label="Validation") 74 | plt.legend(loc="best") 75 | 76 | plt.subplot(212) 77 | plt.title("Loss") 78 | plt.plot(history.history["loss"], color="r", label="Train") 79 | plt.plot(history.history["val_loss"], color="b", label="Validation") 80 | plt.legend(loc="best") 81 | 82 | plt.tight_layout() 83 | plt.show() 84 | 85 | # evaluate final model 86 | Ytest_ = model.predict(Xtest) 87 | ytest = np_utils.categorical_probas_to_classes(Ytest) + 1 88 | ytest_ = np_utils.categorical_probas_to_classes(Ytest_) + 1 89 | 90 | print("Final model") 91 | print("Accuracy: {:.5f}".format(accuracy_score(ytest, ytest_))) 92 | print("Confusion Matrix:") 93 | print(confusion_matrix(ytest, ytest_)) 94 | print("Classification Report:") 95 | print(classification_report(ytest, ytest_)) 96 | 97 | model.save(os.path.join(DATA_DIR, "dl-model-aug-final.h5")) 98 | 99 | # load best model and evaluate 100 | 101 | model = load_model(os.path.join(DATA_DIR, "dl-model-aug-best.h5")) 102 | model.compile(optimizer="adadelta", loss="categorical_crossentropy", 103 | metrics=["accuracy"]) 104 | Ytest_ = model.predict(Xtest) 105 | ytest = np_utils.categorical_probas_to_classes(Ytest) + 1 106 | ytest_ = np_utils.categorical_probas_to_classes(Ytest_) + 1 107 | 108 | print("Best model") 109 | print("Accuracy: {:.5f}".format(accuracy_score(ytest, ytest_))) 110 | print("Confusion Matrix:") 111 | print(confusion_matrix(ytest, ytest_)) 112 | print("Classification Report:") 113 | print(classification_report(ytest, ytest_)) 114 | 115 | -------------------------------------------------------------------------------- /src/tl-dl1-train.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | from __future__ import division, print_function 3 | from keras.callbacks import ModelCheckpoint 4 | from keras.layers import Dense, Dropout, Input 5 | from keras.layers.normalization import BatchNormalization 6 | from keras.models import Model, load_model 7 | from keras.utils import np_utils 8 | import numpy as np 9 | import os 10 | 11 | import fttlutils 12 | 13 | DATA_DIR = "../data/files" 14 | MODEL_DIR = os.path.join(DATA_DIR, "models") 15 | NUM_EPOCHS = 50 16 | BATCH_SIZE = 32 17 | 18 | # data 19 | X = np.loadtxt(os.path.join(DATA_DIR, "images-X.txt"), delimiter=",") 20 | y = np.loadtxt(os.path.join(DATA_DIR, "images-y.txt"),
delimiter=",", 21 | dtype=np.int) 22 | Y = np_utils.to_categorical(y, nb_classes=5) 23 | 24 | np.random.seed(42) 25 | 26 | Xtrain, Xtest, Ytrain, Ytest = fttlutils.train_test_split( 27 | X, Y, test_size=0.3, random_state=42) 28 | print(Xtrain.shape, Xtest.shape, Ytrain.shape, Ytest.shape) 29 | 30 | # model 31 | # input: (None, 25088) 32 | imgvecs = Input(shape=(Xtrain.shape[1],), dtype="float32") 33 | # hidden layer: (None, 256) 34 | fc1 = Dense(256, 35 | activation="relu", 36 | init="he_uniform", 37 | name="dl1fc1")(imgvecs) 38 | fc1 = BatchNormalization()(fc1) 39 | fc1 = Dropout(0.5)(fc1) 40 | # output layer: (None, 5) 41 | predictions = Dense(5, activation="softmax", name="dl1preds")(fc1) 42 | 43 | model = Model(input=[imgvecs], output=[predictions]) 44 | 45 | model.compile(optimizer="adadelta", loss="categorical_crossentropy", 46 | metrics=["accuracy"]) 47 | 48 | best_model = os.path.join(MODEL_DIR, "tl-dl1-model-best.h5") 49 | checkpoint = ModelCheckpoint(filepath=best_model, verbose=1, 50 | save_best_only=True) 51 | history = model.fit([Xtrain], [Ytrain], nb_epoch=NUM_EPOCHS, 52 | batch_size=BATCH_SIZE, validation_split=0.1, 53 | callbacks=[checkpoint]) 54 | fttlutils.plot_loss(history) 55 | 56 | # evaluate final model 57 | Ytest_ = model.predict(Xtest) 58 | ytest = np_utils.categorical_probas_to_classes(Ytest) 59 | ytest_ = np_utils.categorical_probas_to_classes(Ytest_) 60 | fttlutils.print_stats(ytest, ytest_, "Final Model (DL#1)") 61 | model.save(os.path.join(MODEL_DIR, "tl-dl1-model-final.h5")) 62 | 63 | # load best model and evaluate 64 | model = load_model(os.path.join(MODEL_DIR, "tl-dl1-model-best.h5")) 65 | model.compile(optimizer="adadelta", loss="categorical_crossentropy", 66 | metrics=["accuracy"]) 67 | Ytest_ = model.predict(Xtest) 68 | ytest = np_utils.categorical_probas_to_classes(Ytest) 69 | ytest_ = np_utils.categorical_probas_to_classes(Ytest_) 70 | fttlutils.print_stats(ytest, ytest_, "Best Model (DL#1)") 71 | -------------------------------------------------------------------------------- /src/tl-dl2-train.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | from __future__ import division, print_function 3 | from keras.callbacks import ModelCheckpoint 4 | from keras.layers import Dense, Dropout, Input 5 | from keras.models import Model, load_model 6 | from keras.optimizers import Adadelta 7 | from keras.utils import np_utils 8 | import numpy as np 9 | import os 10 | 11 | import fttlutils 12 | 13 | DATA_DIR = "../data/files" 14 | MODEL_DIR = os.path.join(DATA_DIR, "models") 15 | NUM_EPOCHS = 50 16 | BATCH_SIZE = 64 17 | 18 | # data 19 | X = np.loadtxt(os.path.join(DATA_DIR, "images-X.txt"), delimiter=",") 20 | y = np.loadtxt(os.path.join(DATA_DIR, "images-y.txt"), delimiter=",", 21 | dtype=np.int) 22 | Y = np_utils.to_categorical(y, nb_classes=5) 23 | 24 | np.random.seed(42) 25 | 26 | Xtrain, Xtest, Ytrain, Ytest = fttlutils.train_test_split( 27 | X, Y, test_size=0.3, random_state=42) 28 | print(Xtrain.shape, Xtest.shape, Ytrain.shape, Ytest.shape) 29 | 30 | # model 2 31 | # input: (None, 25088) 32 | imgvecs = Input(shape=(Xtrain.shape[1],), dtype="float32") 33 | # hidden layer: (None, 2048) 34 | fc1 = Dense(256, activation="relu")(imgvecs) 35 | fc1 = Dropout(0.5)(fc1) 36 | ## hidden layer: (None, 256) 37 | fc2 = Dense(128, activation="relu")(fc1) 38 | fc2 = Dropout(0.5)(fc2) 39 | # output layer: (None, 5) 40 | predictions = Dense(5, activation="softmax")(fc2) 41 | 42 | model = Model(input=[imgvecs], 
/src/tl-lr-aug-train.py:
--------------------------------------------------------------------------------
# -*- coding: utf-8 -*-
from __future__ import division, print_function
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (accuracy_score, confusion_matrix,
                             classification_report)
import cPickle as pickle
import numpy as np
import os

DATA_DIR = "../data"

# data
print("Loading data...")
Xtrain = np.loadtxt(os.path.join(DATA_DIR, "images-500-train-X.txt"),
                    delimiter=",")
ytrain = np.loadtxt(os.path.join(DATA_DIR, "images-500-train-y.txt"),
                    delimiter=",", dtype=np.int)
print("\ttrain:", Xtrain.shape, ytrain.shape)

Xtest = np.loadtxt(os.path.join(DATA_DIR, "images-500-test-X.txt"),
                   delimiter=",")
ytest = np.loadtxt(os.path.join(DATA_DIR, "images-500-test-y.txt"),
                   delimiter=",", dtype=np.int)
print("\ttest:", Xtest.shape, ytest.shape)

np.random.seed(42)

# model
clf = LogisticRegression()
clf.fit(Xtrain, ytrain)

ytest_ = clf.predict(Xtest)

print("Accuracy: {:.3f}".format(accuracy_score(ytest, ytest_)))
print("Confusion Matrix:")
print(confusion_matrix(ytest, ytest_))
print("Classification Report:")
print(classification_report(ytest, ytest_))

with open(os.path.join(DATA_DIR, "lr-model-aug.pkl"), "wb") as fmodel:
    pickle.dump(clf, fmodel)
--------------------------------------------------------------------------------
/src/tl-lr-train.py:
--------------------------------------------------------------------------------
# -*- coding: utf-8 -*-
from __future__ import division, print_function
from sklearn.linear_model import LogisticRegression
import cPickle as pickle
import numpy as np
import os

import fttlutils

##################### main ######################

DATA_DIR = "../data/files"
MODEL_DIR = os.path.join(DATA_DIR, "models")

# data
X = np.loadtxt(os.path.join(DATA_DIR, "images-X.txt"), delimiter=",")
y = np.loadtxt(os.path.join(DATA_DIR, "images-y.txt"), delimiter=",",
               dtype=np.int)

Xtrain, Xtest, ytrain, ytest = fttlutils.train_test_split(
    X, y, test_size=0.3, random_state=42)
print(Xtrain.shape, Xtest.shape, ytrain.shape, ytest.shape)

# model
clf = LogisticRegression()
clf.fit(Xtrain, ytrain)

ytest_ = clf.predict(Xtest)
fttlutils.print_stats(ytest, ytest_, "LR Model")
with open(os.path.join(MODEL_DIR, "lr-model.pkl"), "wb") as fmodel:
    pickle.dump(clf, fmodel)
--------------------------------------------------------------------------------
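The pickled classifiers can be reloaded for scoring later; a minimal sketch, where Xnew is a hypothetical array of 25088-dimensional image vectors:

    import cPickle as pickle

    with open("../data/files/models/lr-model.pkl", "rb") as fmodel:
        clf = pickle.load(fmodel)
    ypred = clf.predict(Xnew)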
/src/vectorize-images.py:
--------------------------------------------------------------------------------
# -*- coding: utf-8 -*-
from __future__ import division, print_function
from keras.applications.vgg16 import VGG16
from keras.applications.vgg16 import preprocess_input
from keras.models import Model
from keras.optimizers import SGD
from keras.preprocessing import image
import numpy as np
import os

def get_next_image_loc(imgdir):
    # walk the image directory and yield a (label subdirectory, filename)
    # pair for every image found
    for root, dirs, files in os.walk(imgdir):
        for name in files:
            path = os.path.join(root, name).split(os.path.sep)[::-1]
            yield (path[1], path[0])


def vectorize_batch(image_locs, image_dir, image_width, model,
                    fvec_x, fvec_y, fvec_f):
    Xs, ys, fs = [], [], []
    for subdir, filename in image_locs:
        # preprocess image for loading into CNN
        img = image.load_img(os.path.join(image_dir, subdir, filename),
                             target_size=(image_width, image_width))
        img4d = image.img_to_array(img)
        img4d = np.expand_dims(img4d, axis=0)
        img4d = preprocess_input(img4d)
        Xs.append(img4d[0])
        ys.append(int(subdir))
        fs.append(filename)
    X = np.array(Xs)
    vecs = model.predict(X)
    # output shape is (batch_size, 7, 7, 512)
    for i in range(len(Xs)):
        vec = vecs[i].flatten()
        vec_str = ",".join(["{:.5f}".format(x) for x in vec.tolist()])
        fvec_x.write("{:s}\n".format(vec_str))
        fvec_y.write("{:d}\n".format(ys[i]))
        fvec_f.write("{:s}\n".format(fs[i]))
    return len(Xs)


############################ main ############################

DATA_DIR = "../data/files"
IMAGE_DIR = os.path.join(DATA_DIR, "sample")
BATCH_SIZE = 10
IMAGE_WIDTH = 224

VEC_FILE_X = os.path.join(DATA_DIR, "images-X.txt")
VEC_FILE_Y = os.path.join(DATA_DIR, "images-y.txt")
VEC_FILE_F = os.path.join(DATA_DIR, "images-f.txt")

# load VGG-16 pre-trained on ImageNet and truncate it at the last
# pooling layer; predictions from this model are the image vectors
vgg16_model = VGG16(weights="imagenet", include_top=True)
model = Model(input=vgg16_model.input,
              output=vgg16_model.get_layer("block5_pool").output)

# the weights are never updated here; we only call predict
sgd = SGD(lr=0.1, decay=1e-6, momentum=0.9, nesterov=True)
model.compile(optimizer=sgd, loss="categorical_crossentropy")

fvec_x = open(VEC_FILE_X, "wb")
fvec_y = open(VEC_FILE_Y, "wb")
fvec_f = open(VEC_FILE_F, "wb")

batch = []
nbr_written = 0
for image_loc in get_next_image_loc(IMAGE_DIR):
    batch.append(image_loc)
    if len(batch) == BATCH_SIZE:
        nbr_written += vectorize_batch(batch, IMAGE_DIR, IMAGE_WIDTH, model,
                                       fvec_x, fvec_y, fvec_f)
        print("Vectors generated for {:d} images...".format(nbr_written))
        batch = []
if len(batch) > 0:
    nbr_written += vectorize_batch(batch, IMAGE_DIR, IMAGE_WIDTH, model,
                                   fvec_x, fvec_y, fvec_f)
print("Vectors generated for {:d} images, COMPLETE".format(nbr_written))

fvec_x.close()
fvec_y.close()
fvec_f.close()
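# Note: each flattened block5_pool output is 7 x 7 x 512 = 25088 floats,
# which is why the classifier scripts above declare Input(shape=(25088,)).
# A quick sanity check on the generated files (a minimal sketch; the
# expected row count assumes the 1,000-image sample):
#
#   X = np.loadtxt(VEC_FILE_X, delimiter=",")
#   y = np.loadtxt(VEC_FILE_Y, dtype=np.int)
#   assert X.shape[1] == 7 * 7 * 512
#   print(X.shape, y.shape)   # expected: (1000, 25088) (1000,)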
--------------------------------------------------------------------------------