├── .gitignore ├── LICENSE ├── README.md ├── UT.py ├── clean.sh ├── custom_layers ├── __init__.py ├── scale_layer.py └── unpooling_layer.py ├── demo.py ├── history ├── best_val_loss=0.04720-2018-05-10 18-47-13.png └── best_val_loss=0.06925-2018-05-10 06-41-02.png ├── images ├── 0_gray.png ├── 0_image.png ├── 0_out.png ├── 1_gray.png ├── 1_image.png ├── 1_out.png ├── 2_gray.png ├── 2_image.png ├── 2_out.png ├── 3_gray.png ├── 3_image.png ├── 3_out.png ├── 4_gray.png ├── 4_image.png ├── 4_out.png ├── 5_gray.png ├── 5_image.png ├── 5_out.png ├── 6_gray.png ├── 6_image.png ├── 6_out.png ├── 7_gray.png ├── 7_image.png ├── 7_out.png ├── 8_gray.png ├── 8_image.png ├── 8_out.png ├── 9_gray.png ├── 9_image.png ├── 9_out.png ├── adadelta.png ├── nadam.png ├── random.jpg ├── segnet.jpg └── train.jpg ├── migrate.py ├── model.py ├── model.svg ├── new_start.py ├── plot_model.py ├── pre-process.py ├── resnet_152.py ├── test.py ├── train.py ├── utils.py └── vgg16.py /.gitignore: -------------------------------------------------------------------------------- 1 | .idea/ 2 | __pycache__/ 3 | models/ 4 | car_devkit.tgz 5 | cars_test.tgz 6 | cars_test/ 7 | cars_train.tgz 8 | cars_train/ 9 | data/ 10 | devkit/ 11 | logs/ 12 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | MIT License 2 | 3 | Copyright (c) 2018 Yang Liu 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 
13 | copies or substantial portions of the Software. 14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE. 22 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Convolutional Autoencoder 2 | 3 | This repository implements a convolutional autoencoder by fine-tuning SegNet on the Stanford Cars Dataset. 4 | 5 | 6 | ## Dependencies 7 | 8 | - [NumPy](http://docs.scipy.org/doc/numpy-1.10.1/user/install.html) 9 | - [TensorFlow](https://www.tensorflow.org/versions/r0.8/get_started/os_setup.html) 10 | - [Keras](https://keras.io/#installation) 11 | - [OpenCV](https://opencv-python-tutroals.readthedocs.io/en/latest/) 12 | 13 | ## Dataset 14 | 15 | We use the Cars Dataset, which contains 16,185 images of 196 classes of cars. The data is split into 8,144 training images and 8,041 testing images, with each class split roughly 50-50. 
16 | 17 | ![image](https://github.com/foamliu/Conv-Autoencoder/raw/master/images/random.jpg) 18 | 19 | You can download it from the [Cars Dataset](https://ai.stanford.edu/~jkrause/cars/car_dataset.html) page: 20 | 21 | ```bash 22 | $ cd Conv-Autoencoder 23 | $ wget http://imagenet.stanford.edu/internal/car196/cars_train.tgz 24 | $ wget http://imagenet.stanford.edu/internal/car196/cars_test.tgz 25 | $ wget --no-check-certificate https://ai.stanford.edu/~jkrause/cars/car_devkit.tgz 26 | ``` 27 | 28 | ## Architecture 29 | 30 | ![image](https://github.com/foamliu/Conv-Autoencoder/raw/master/images/segnet.jpg) 31 | 32 | 33 | ## ImageNet Pretrained Models 34 | 35 | Download the [VGG16](https://github.com/fchollet/deep-learning-models/releases/download/v0.1/vgg16_weights_tf_dim_ordering_tf_kernels.h5) weights into the "models" folder. 36 | 37 | 38 | ## Usage 39 | 40 | ### Data Pre-processing 41 | Extract the 8,144 training images and split them 80:20 (6,515 for training, 1,629 for validation): 42 | ```bash 43 | $ python pre-process.py 44 | ``` 45 | 46 | ### Train 47 | ```bash 48 | $ python train.py 49 | ``` 50 | 51 | To monitor training with TensorBoard, run this in your terminal: 52 | ```bash 53 | $ tensorboard --logdir path_to_current_dir/logs 54 | ``` 55 | 56 | ![image](https://github.com/foamliu/Conv-Autoencoder/raw/master/images/nadam.png) 57 | 58 | ### Demo 59 | Download the pre-trained [model](https://github.com/foamliu/Conv-Autoencoder/releases/download/v1.0/model.97-0.0201.hdf5) weights into the "models" folder, then run: 60 | 61 | ```bash 62 | $ python demo.py 63 | ``` 64 | 65 | Then check the results in the "images" folder; they should look something like this: 66 | 67 | |Input | Ground truth (gray) | Output | 68 | |---|---|---| 69 | |![image](https://github.com/foamliu/Conv-Autoencoder/raw/master/images/0_image.png) | ![image](https://github.com/foamliu/Conv-Autoencoder/raw/master/images/0_gray.png) | ![image](https://github.com/foamliu/Conv-Autoencoder/raw/master/images/0_out.png)| 70 | 
|![image](https://github.com/foamliu/Conv-Autoencoder/raw/master/images/1_image.png) | ![image](https://github.com/foamliu/Conv-Autoencoder/raw/master/images/1_gray.png) | ![image](https://github.com/foamliu/Conv-Autoencoder/raw/master/images/1_out.png)| 71 | |![image](https://github.com/foamliu/Conv-Autoencoder/raw/master/images/2_image.png) | ![image](https://github.com/foamliu/Conv-Autoencoder/raw/master/images/2_gray.png) | ![image](https://github.com/foamliu/Conv-Autoencoder/raw/master/images/2_out.png)| 72 | |![image](https://github.com/foamliu/Conv-Autoencoder/raw/master/images/3_image.png) | ![image](https://github.com/foamliu/Conv-Autoencoder/raw/master/images/3_gray.png) | ![image](https://github.com/foamliu/Conv-Autoencoder/raw/master/images/3_out.png)| 73 | |![image](https://github.com/foamliu/Conv-Autoencoder/raw/master/images/4_image.png) | ![image](https://github.com/foamliu/Conv-Autoencoder/raw/master/images/4_gray.png) | ![image](https://github.com/foamliu/Conv-Autoencoder/raw/master/images/4_out.png)| 74 | |![image](https://github.com/foamliu/Conv-Autoencoder/raw/master/images/5_image.png) | ![image](https://github.com/foamliu/Conv-Autoencoder/raw/master/images/5_gray.png) | ![image](https://github.com/foamliu/Conv-Autoencoder/raw/master/images/5_out.png)| 75 | |![image](https://github.com/foamliu/Conv-Autoencoder/raw/master/images/6_image.png) | ![image](https://github.com/foamliu/Conv-Autoencoder/raw/master/images/6_gray.png) | ![image](https://github.com/foamliu/Conv-Autoencoder/raw/master/images/6_out.png)| 76 | |![image](https://github.com/foamliu/Conv-Autoencoder/raw/master/images/7_image.png) | ![image](https://github.com/foamliu/Conv-Autoencoder/raw/master/images/7_gray.png) | ![image](https://github.com/foamliu/Conv-Autoencoder/raw/master/images/7_out.png)| 77 | |![image](https://github.com/foamliu/Conv-Autoencoder/raw/master/images/8_image.png) | ![image](https://github.com/foamliu/Conv-Autoencoder/raw/master/images/8_gray.png) | 
![image](https://github.com/foamliu/Conv-Autoencoder/raw/master/images/8_out.png)| 78 | |![image](https://github.com/foamliu/Conv-Autoencoder/raw/master/images/9_image.png) | ![image](https://github.com/foamliu/Conv-Autoencoder/raw/master/images/9_gray.png) | ![image](https://github.com/foamliu/Conv-Autoencoder/raw/master/images/9_out.png)| 79 | -------------------------------------------------------------------------------- /UT.py: -------------------------------------------------------------------------------- 1 | import os 2 | import numpy as np 3 | import cv2 as cv 4 | import random 5 | import keras.backend as K 6 | from utils import custom_loss 7 | 8 | i1 = random.randint(0, 8040)  # inclusive bounds, so i1 + 1 stays within 1..8041 9 | i2 = random.randint(0, 8040) 10 | 11 | image1 = os.path.join('data/test', '%05d.jpg' % (i1 + 1)) 12 | image2 = os.path.join('data/test', '%05d.jpg' % (i2 + 1)) 13 | 14 | bgr_img1 = cv.imread(image1) 15 | y_true = cv.cvtColor(bgr_img1, cv.COLOR_BGR2GRAY) 16 | y_true = np.array(y_true, np.float32) 17 | bgr_img2 = cv.imread(image2) 18 | y_pred = cv.cvtColor(bgr_img2, cv.COLOR_BGR2GRAY) 19 | y_pred = np.array(y_pred, np.float32) 20 | 21 | loss = custom_loss(y_true, y_pred) 22 | ret = K.eval(loss) 23 | print(ret) 24 | -------------------------------------------------------------------------------- /clean.sh: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env bash 2 | rm models/model.* 3 | rm -r logs -------------------------------------------------------------------------------- /custom_layers/__init__.py: -------------------------------------------------------------------------------- 1 | # Python Package 2 | -------------------------------------------------------------------------------- /custom_layers/scale_layer.py: -------------------------------------------------------------------------------- 1 | from keras.layers.core import Layer 2 | from keras.engine import InputSpec 3 | from keras import backend as K 4 | try: 5 | from keras 
import initializations 6 | except ImportError: 7 | from keras import initializers as initializations 8 | 9 | class Scale(Layer): 10 | '''Learns a set of weights and biases used for scaling the input data. 11 | The output is simply an element-wise multiplication of the input 12 | followed by the addition of a set of constants: 13 | 14 | out = in * gamma + beta, 15 | 16 | where 'gamma' and 'beta' are the weights and biases learned. 17 | 18 | # Arguments 19 | axis: integer, axis along which to normalize in mode 0. For instance, 20 | if your input tensor has shape (samples, channels, rows, cols), 21 | set axis to 1 to normalize per feature map (channels axis). 22 | momentum: momentum in the computation of the 23 | exponential average of the mean and standard deviation 24 | of the data, for feature-wise normalization. 25 | weights: Initialization weights. 26 | List of 2 Numpy arrays, with shapes: 27 | `[(input_shape,), (input_shape,)]` 28 | beta_init: name of initialization function for shift parameter 29 | (see [initializations](../initializations.md)), or alternatively, 30 | Theano/TensorFlow function to use for weights initialization. 31 | This parameter is only relevant if you don't pass a `weights` argument. 32 | gamma_init: name of initialization function for scale parameter (see 33 | [initializations](../initializations.md)), or alternatively, 34 | Theano/TensorFlow function to use for weights initialization. 35 | This parameter is only relevant if you don't pass a `weights` argument. 
36 | ''' 37 | def __init__(self, weights=None, axis=-1, momentum = 0.9, beta_init='zero', gamma_init='one', **kwargs): 38 | self.momentum = momentum 39 | self.axis = axis 40 | self.beta_init = initializations.get(beta_init) 41 | self.gamma_init = initializations.get(gamma_init) 42 | self.initial_weights = weights 43 | super(Scale, self).__init__(**kwargs) 44 | 45 | def build(self, input_shape): 46 | self.input_spec = [InputSpec(shape=input_shape)] 47 | shape = (int(input_shape[self.axis]),) 48 | 49 | # Compatibility with TensorFlow >= 1.0.0 50 | self.gamma = K.variable(self.gamma_init(shape), name='{}_gamma'.format(self.name)) 51 | self.beta = K.variable(self.beta_init(shape), name='{}_beta'.format(self.name)) 52 | #self.gamma = self.gamma_init(shape, name='{}_gamma'.format(self.name)) 53 | #self.beta = self.beta_init(shape, name='{}_beta'.format(self.name)) 54 | self.trainable_weights = [self.gamma, self.beta] 55 | 56 | if self.initial_weights is not None: 57 | self.set_weights(self.initial_weights) 58 | del self.initial_weights 59 | 60 | def call(self, x, mask=None): 61 | input_shape = self.input_spec[0].shape 62 | broadcast_shape = [1] * len(input_shape) 63 | broadcast_shape[self.axis] = input_shape[self.axis] 64 | 65 | out = K.reshape(self.gamma, broadcast_shape) * x + K.reshape(self.beta, broadcast_shape) 66 | return out 67 | 68 | def get_config(self): 69 | config = {"momentum": self.momentum, "axis": self.axis} 70 | base_config = super(Scale, self).get_config() 71 | return dict(list(base_config.items()) + list(config.items())) 72 | -------------------------------------------------------------------------------- /custom_layers/unpooling_layer.py: -------------------------------------------------------------------------------- 1 | from keras import backend as K 2 | from keras.engine.topology import Layer 3 | from keras.layers import Reshape, Concatenate, Lambda, Multiply 4 | 5 | 6 | class Unpooling(Layer): 7 | 8 | def __init__(self, orig, the_shape, **kwargs): 
9 | self.orig = orig 10 | self.the_shape = the_shape 11 | super(Unpooling, self).__init__(**kwargs) 12 | 13 | def call(self, x): 14 | # here we're going to reshape the data for a concatenation: 15 | # xReshaped and origReshaped are now split branches 16 | shape = list(self.the_shape) 17 | shape.insert(0, 1) 18 | shape = tuple(shape) 19 | xReshaped = Reshape(shape)(x) 20 | origReshaped = Reshape(shape)(self.orig) 21 | 22 | # concatenation - here, you unite both branches again 23 | # normally you don't need to reshape or use the axis var, 24 | # but here we want to keep track of what was x and what was orig. 25 | together = Concatenate(axis=1)([origReshaped, xReshaped]) 26 | 27 | bool_mask = Lambda(lambda t: K.greater_equal(t[:, 0], t[:, 1]), 28 | output_shape=self.the_shape)(together) 29 | mask = Lambda(lambda t: K.cast(t, dtype='float32'))(bool_mask) 30 | 31 | x = Multiply()([mask, x]) 32 | return x 33 | -------------------------------------------------------------------------------- /demo.py: -------------------------------------------------------------------------------- 1 | import os 2 | import random 3 | 4 | import cv2 as cv 5 | import keras.backend as K 6 | import numpy as np 7 | 8 | from model import create_model 9 | 10 | if __name__ == '__main__': 11 | img_rows, img_cols = 320, 320 12 | channel = 4 13 | 14 | model_weights_path = 'models/model.107-0.0197.hdf5' 15 | model = create_model() 16 | 17 | model.load_weights(model_weights_path) 18 | print(model.summary()) 19 | 20 | test_path = 'data/test/' 21 | test_images = [f for f in os.listdir(test_path) if 22 | os.path.isfile(os.path.join(test_path, f)) and f.endswith('.jpg')] 23 | 24 | samples = random.sample(test_images, 10) 25 | 26 | for i in range(len(samples)): 27 | image_name = samples[i] 28 | filename = os.path.join(test_path, image_name) 29 | 30 | print('Start processing image: {}'.format(filename)) 31 | 32 | x_test = np.empty((1, img_rows, img_cols, 4), dtype=np.float32) 33 | bgr_img = 
cv.imread(filename) 34 | gray_img = cv.cvtColor(bgr_img, cv.COLOR_BGR2GRAY) 35 | rgb_img = cv.cvtColor(bgr_img, cv.COLOR_BGR2RGB) 36 | rgb_img = rgb_img / 255. 37 | x_test[0, :, :, 0:3] = rgb_img 38 | x_test[0, :, :, 3] = np.random.uniform(0, 1, (img_rows, img_cols)) 39 | 40 | out = model.predict(x_test) 41 | # print(out.shape) 42 | 43 | out = np.reshape(out, (img_rows, img_cols)) 44 | out = out * 255.0 45 | out = out.astype(np.uint8) 46 | 47 | cv.imwrite('images/{}_image.png'.format(i), bgr_img) 48 | cv.imwrite('images/{}_out.png'.format(i), out) 49 | cv.imwrite('images/{}_gray.png'.format(i), gray_img) 50 | 51 | K.clear_session() 52 | -------------------------------------------------------------------------------- /history/best_val_loss=0.04720-2018-05-10 18-47-13.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/foamliu/Conv-Autoencoder/dfab9c0da8350b857fad4f22370f2c53d889912a/history/best_val_loss=0.04720-2018-05-10 18-47-13.png -------------------------------------------------------------------------------- /history/best_val_loss=0.06925-2018-05-10 06-41-02.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/foamliu/Conv-Autoencoder/dfab9c0da8350b857fad4f22370f2c53d889912a/history/best_val_loss=0.06925-2018-05-10 06-41-02.png -------------------------------------------------------------------------------- /images/0_gray.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/foamliu/Conv-Autoencoder/dfab9c0da8350b857fad4f22370f2c53d889912a/images/0_gray.png -------------------------------------------------------------------------------- /images/0_image.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/foamliu/Conv-Autoencoder/dfab9c0da8350b857fad4f22370f2c53d889912a/images/0_image.png 
-------------------------------------------------------------------------------- /images/0_out.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/foamliu/Conv-Autoencoder/dfab9c0da8350b857fad4f22370f2c53d889912a/images/0_out.png -------------------------------------------------------------------------------- /images/1_gray.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/foamliu/Conv-Autoencoder/dfab9c0da8350b857fad4f22370f2c53d889912a/images/1_gray.png -------------------------------------------------------------------------------- /images/1_image.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/foamliu/Conv-Autoencoder/dfab9c0da8350b857fad4f22370f2c53d889912a/images/1_image.png -------------------------------------------------------------------------------- /images/1_out.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/foamliu/Conv-Autoencoder/dfab9c0da8350b857fad4f22370f2c53d889912a/images/1_out.png -------------------------------------------------------------------------------- /images/2_gray.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/foamliu/Conv-Autoencoder/dfab9c0da8350b857fad4f22370f2c53d889912a/images/2_gray.png -------------------------------------------------------------------------------- /images/2_image.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/foamliu/Conv-Autoencoder/dfab9c0da8350b857fad4f22370f2c53d889912a/images/2_image.png -------------------------------------------------------------------------------- /images/2_out.png: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/foamliu/Conv-Autoencoder/dfab9c0da8350b857fad4f22370f2c53d889912a/images/2_out.png -------------------------------------------------------------------------------- /images/3_gray.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/foamliu/Conv-Autoencoder/dfab9c0da8350b857fad4f22370f2c53d889912a/images/3_gray.png -------------------------------------------------------------------------------- /images/3_image.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/foamliu/Conv-Autoencoder/dfab9c0da8350b857fad4f22370f2c53d889912a/images/3_image.png -------------------------------------------------------------------------------- /images/3_out.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/foamliu/Conv-Autoencoder/dfab9c0da8350b857fad4f22370f2c53d889912a/images/3_out.png -------------------------------------------------------------------------------- /images/4_gray.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/foamliu/Conv-Autoencoder/dfab9c0da8350b857fad4f22370f2c53d889912a/images/4_gray.png -------------------------------------------------------------------------------- /images/4_image.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/foamliu/Conv-Autoencoder/dfab9c0da8350b857fad4f22370f2c53d889912a/images/4_image.png -------------------------------------------------------------------------------- /images/4_out.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/foamliu/Conv-Autoencoder/dfab9c0da8350b857fad4f22370f2c53d889912a/images/4_out.png -------------------------------------------------------------------------------- 
/images/5_gray.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/foamliu/Conv-Autoencoder/dfab9c0da8350b857fad4f22370f2c53d889912a/images/5_gray.png -------------------------------------------------------------------------------- /images/5_image.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/foamliu/Conv-Autoencoder/dfab9c0da8350b857fad4f22370f2c53d889912a/images/5_image.png -------------------------------------------------------------------------------- /images/5_out.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/foamliu/Conv-Autoencoder/dfab9c0da8350b857fad4f22370f2c53d889912a/images/5_out.png -------------------------------------------------------------------------------- /images/6_gray.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/foamliu/Conv-Autoencoder/dfab9c0da8350b857fad4f22370f2c53d889912a/images/6_gray.png -------------------------------------------------------------------------------- /images/6_image.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/foamliu/Conv-Autoencoder/dfab9c0da8350b857fad4f22370f2c53d889912a/images/6_image.png -------------------------------------------------------------------------------- /images/6_out.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/foamliu/Conv-Autoencoder/dfab9c0da8350b857fad4f22370f2c53d889912a/images/6_out.png -------------------------------------------------------------------------------- /images/7_gray.png: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/foamliu/Conv-Autoencoder/dfab9c0da8350b857fad4f22370f2c53d889912a/images/7_gray.png -------------------------------------------------------------------------------- /images/7_image.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/foamliu/Conv-Autoencoder/dfab9c0da8350b857fad4f22370f2c53d889912a/images/7_image.png -------------------------------------------------------------------------------- /images/7_out.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/foamliu/Conv-Autoencoder/dfab9c0da8350b857fad4f22370f2c53d889912a/images/7_out.png -------------------------------------------------------------------------------- /images/8_gray.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/foamliu/Conv-Autoencoder/dfab9c0da8350b857fad4f22370f2c53d889912a/images/8_gray.png -------------------------------------------------------------------------------- /images/8_image.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/foamliu/Conv-Autoencoder/dfab9c0da8350b857fad4f22370f2c53d889912a/images/8_image.png -------------------------------------------------------------------------------- /images/8_out.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/foamliu/Conv-Autoencoder/dfab9c0da8350b857fad4f22370f2c53d889912a/images/8_out.png -------------------------------------------------------------------------------- /images/9_gray.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/foamliu/Conv-Autoencoder/dfab9c0da8350b857fad4f22370f2c53d889912a/images/9_gray.png -------------------------------------------------------------------------------- 
/images/9_image.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/foamliu/Conv-Autoencoder/dfab9c0da8350b857fad4f22370f2c53d889912a/images/9_image.png -------------------------------------------------------------------------------- /images/9_out.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/foamliu/Conv-Autoencoder/dfab9c0da8350b857fad4f22370f2c53d889912a/images/9_out.png -------------------------------------------------------------------------------- /images/adadelta.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/foamliu/Conv-Autoencoder/dfab9c0da8350b857fad4f22370f2c53d889912a/images/adadelta.png -------------------------------------------------------------------------------- /images/nadam.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/foamliu/Conv-Autoencoder/dfab9c0da8350b857fad4f22370f2c53d889912a/images/nadam.png -------------------------------------------------------------------------------- /images/random.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/foamliu/Conv-Autoencoder/dfab9c0da8350b857fad4f22370f2c53d889912a/images/random.jpg -------------------------------------------------------------------------------- /images/segnet.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/foamliu/Conv-Autoencoder/dfab9c0da8350b857fad4f22370f2c53d889912a/images/segnet.jpg -------------------------------------------------------------------------------- /images/train.jpg: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/foamliu/Conv-Autoencoder/dfab9c0da8350b857fad4f22370f2c53d889912a/images/train.jpg -------------------------------------------------------------------------------- /migrate.py: -------------------------------------------------------------------------------- 1 | import keras.backend as K 2 | import numpy as np 3 | 4 | from model import create_model 5 | from vgg16 import vgg16_model 6 | 7 | 8 | def migrate_model(new_model): 9 | old_model = vgg16_model(224, 224, 3) 10 | # print(old_model.summary()) 11 | old_layers = [l for l in old_model.layers] 12 | new_layers = [l for l in new_model.layers] 13 | 14 | old_conv1_1 = old_model.get_layer('conv1_1') 15 | old_weights = old_conv1_1.get_weights()[0] 16 | old_biases = old_conv1_1.get_weights()[1] 17 | new_weights = np.zeros((3, 3, 4, 64), dtype=np.float32) 18 | new_weights[:, :, 0:3, :] = old_weights 19 | new_weights[:, :, 3:4, :] = 0.0 20 | new_conv1_1 = new_model.get_layer('conv1_1') 21 | new_conv1_1.set_weights([new_weights, old_biases]) 22 | 23 | for i in range(2, 31): 24 | old_layer = old_layers[i] 25 | new_layer = new_layers[i + 1] 26 | new_layer.set_weights(old_layer.get_weights()) 27 | 28 | # flatten = old_model.get_layer('flatten') 29 | # f_dim = flatten.input_shape 30 | # print('f_dim: ' + str(f_dim)) 31 | # old_dense1 = old_model.get_layer('dense1') 32 | # input_shape = old_dense1.input_shape 33 | # output_dim = old_dense1.get_weights()[1].shape[0] 34 | # print('output_dim: ' + str(output_dim)) 35 | # W, b = old_dense1.get_weights() 36 | # shape = (7, 7, 512, output_dim) 37 | # new_W = W.reshape(shape) 38 | # new_conv6 = new_model.get_layer('conv6') 39 | # new_conv6.set_weights([new_W, b]) 40 | 41 | del old_model 42 | 43 | 44 | if __name__ == '__main__': 45 | model = create_model() 46 | migrate_model(model) 47 | print(model.summary()) 48 | model.save_weights('models/model_weights.h5') 49 | 50 | K.clear_session() 51 | 
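The channel-expansion step that `migrate_model` applies to `conv1_1` can be illustrated in isolation with NumPy. The sketch below is not part of the repository: `expand_kernel_channels` and `conv_response` are hypothetical helpers, and random arrays stand in for the real VGG16 weights. Under those assumptions it shows that a zero-filled fourth input channel leaves the response of the first convolution to the RGB channels unchanged, whatever the extra channel contains.

```python
import numpy as np


def expand_kernel_channels(old_kernel, new_in_channels):
    """Pad a (kh, kw, in, out) conv kernel with zero-filled input channels.

    Hypothetical helper for illustration; migrate.py does this inline.
    """
    kh, kw, old_in, out = old_kernel.shape
    if new_in_channels < old_in:
        raise ValueError('new_in_channels must be >= the existing channel count')
    new_kernel = np.zeros((kh, kw, new_in_channels, out), dtype=old_kernel.dtype)
    new_kernel[:, :, :old_in, :] = old_kernel  # copy the pretrained RGB weights
    return new_kernel


def conv_response(patch, kernel):
    """Response at one spatial position: sum over kh, kw and input channels."""
    return np.tensordot(patch, kernel, axes=([0, 1, 2], [0, 1, 2]))


rng = np.random.default_rng(0)
# Random stand-in for the pretrained conv1_1 kernel (assumption, not real weights).
old_kernel = rng.standard_normal((3, 3, 3, 64)).astype(np.float32)
new_kernel = expand_kernel_channels(old_kernel, 4)

rgb_patch = rng.random((3, 3, 3)).astype(np.float32)
extra = rng.random((3, 3, 1)).astype(np.float32)  # arbitrary 4th-channel content
four_channel_patch = np.concatenate([rgb_patch, extra], axis=-1)

old_out = conv_response(rgb_patch, old_kernel)
new_out = conv_response(four_channel_patch, new_kernel)
print(new_kernel.shape)
print(np.allclose(old_out, new_out, atol=1e-4))
```

Because the new channel's weights start at zero, the migrated first layer initially reproduces the pretrained layer's output on the RGB channels, and training is then free to learn how to use the extra channel.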
--------------------------------------------------------------------------------
/model.py:
--------------------------------------------------------------------------------
import keras.backend as K
from keras.layers import Input, Conv2D, UpSampling2D, BatchNormalization, ZeroPadding2D, MaxPooling2D
from keras.models import Model
from keras.utils import plot_model

from custom_layers.unpooling_layer import Unpooling


def create_model():
    # Encoder
    input_tensor = Input(shape=(320, 320, 4))
    x = ZeroPadding2D((1, 1))(input_tensor)
    x = Conv2D(64, (3, 3), activation='relu', name='conv1_1')(x)
    x = ZeroPadding2D((1, 1))(x)
    x = Conv2D(64, (3, 3), activation='relu', name='conv1_2')(x)
    orig_1 = x
    x = MaxPooling2D((2, 2), strides=(2, 2))(x)

    x = ZeroPadding2D((1, 1))(x)
    x = Conv2D(128, (3, 3), activation='relu', name='conv2_1')(x)
    x = ZeroPadding2D((1, 1))(x)
    x = Conv2D(128, (3, 3), activation='relu', name='conv2_2')(x)
    orig_2 = x
    x = MaxPooling2D((2, 2), strides=(2, 2))(x)

    x = ZeroPadding2D((1, 1))(x)
    x = Conv2D(256, (3, 3), activation='relu', name='conv3_1')(x)
    x = ZeroPadding2D((1, 1))(x)
    x = Conv2D(256, (3, 3), activation='relu', name='conv3_2')(x)
    x = ZeroPadding2D((1, 1))(x)
    x = Conv2D(256, (3, 3), activation='relu', name='conv3_3')(x)
    orig_3 = x
    x = MaxPooling2D((2, 2), strides=(2, 2))(x)

    x = ZeroPadding2D((1, 1))(x)
    x = Conv2D(512, (3, 3), activation='relu', name='conv4_1')(x)
    x = ZeroPadding2D((1, 1))(x)
    x = Conv2D(512, (3, 3), activation='relu', name='conv4_2')(x)
    x = ZeroPadding2D((1, 1))(x)
    x = Conv2D(512, (3, 3), activation='relu', name='conv4_3')(x)
    orig_4 = x
    x = MaxPooling2D((2, 2), strides=(2, 2))(x)

    x = ZeroPadding2D((1, 1))(x)
    x = Conv2D(512, (3, 3), activation='relu', name='conv5_1')(x)
    x = ZeroPadding2D((1, 1))(x)
    x = Conv2D(512, (3, 3), activation='relu', name='conv5_2')(x)
    x = ZeroPadding2D((1, 1))(x)
    x = Conv2D(512, (3, 3), activation='relu', name='conv5_3')(x)
    orig_5 = x
    x = MaxPooling2D((2, 2), strides=(2, 2))(x)

    # Decoder
    # x = Conv2D(4096, (7, 7), activation='relu', padding='valid', name='conv6')(x)
    # x = BatchNormalization()(x)
    # x = UpSampling2D(size=(7, 7))(x)

    x = Conv2D(512, (1, 1), activation='relu', padding='same', name='deconv6', kernel_initializer='he_normal',
               bias_initializer='zeros')(x)
    x = BatchNormalization()(x)
    x = UpSampling2D(size=(2, 2))(x)
    x = Unpooling(orig_5, (20, 20, 512))(x)

    x = Conv2D(512, (5, 5), activation='relu', padding='same', name='deconv5', kernel_initializer='he_normal',
               bias_initializer='zeros')(x)
    x = BatchNormalization()(x)
    x = UpSampling2D(size=(2, 2))(x)
    x = Unpooling(orig_4, (40, 40, 512))(x)

    x = Conv2D(256, (5, 5), activation='relu', padding='same', name='deconv4', kernel_initializer='he_normal',
               bias_initializer='zeros')(x)
    x = BatchNormalization()(x)
    x = UpSampling2D(size=(2, 2))(x)
    x = Unpooling(orig_3, (80, 80, 256))(x)

    x = Conv2D(128, (5, 5), activation='relu', padding='same', name='deconv3', kernel_initializer='he_normal',
               bias_initializer='zeros')(x)
    x = BatchNormalization()(x)
    x = UpSampling2D(size=(2, 2))(x)
    x = Unpooling(orig_2, (160, 160, 128))(x)

    x = Conv2D(64, (5, 5), activation='relu', padding='same', name='deconv2', kernel_initializer='he_normal',
               bias_initializer='zeros')(x)
    x = BatchNormalization()(x)
    x = UpSampling2D(size=(2, 2))(x)
    x = Unpooling(orig_1, (320, 320, 64))(x)

    x = Conv2D(64, (5, 5), activation='relu', padding='same', name='deconv1', kernel_initializer='he_normal',
               bias_initializer='zeros')(x)
    x = BatchNormalization()(x)

    x = Conv2D(1, (5, 5), activation='sigmoid', padding='same', name='pred', kernel_initializer='he_normal',
               bias_initializer='zeros')(x)

    model = Model(inputs=input_tensor, outputs=x)
    return model


if __name__ == '__main__':
    # create_model() takes no arguments; the input shape is fixed at (320, 320, 4).
    model = create_model()
    # input_layer = model.get_layer('input')
    print(model.summary())
    plot_model(model, to_file='model.svg', show_layer_names=True, show_shapes=True)

    K.clear_session()

--------------------------------------------------------------------------------
/model.svg:
--------------------------------------------------------------------------------
[Model graph, text labels only: input_input (None, 224, 224, 3); conv1_1 to conv5_3 with max_pooling2d_1 to max_pooling2d_5, ending at (None, 7, 7, 512); conv6 (None, 1, 1, 4096); batch_normalization_1 to batch_normalization_7 and up_sampling2d_1 to up_sampling2d_6 around deconv6 to deconv1, ending at (None, 224, 224, 64); pred (None, 224, 224, 1).]

--------------------------------------------------------------------------------
/new_start.py:
--------------------------------------------------------------------------------
import keras.backend as K
from keras.models import Sequential

from encoder_decoder import build_encoder, build_decoder
from utils import do_compile


def autoencoder(img_rows, img_cols, channel=4):
    model = Sequential()
    # Encoder
    build_encoder(model, img_rows, img_cols, channel)
    # Decoder
    build_decoder(model)
    # Compile
    do_compile(model)
    return model


if __name__ == '__main__':
    model = autoencoder(320, 320, 4)
    input_layer = model.get_layer('input')

    K.clear_session()

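The `autoencoder` function above hands compilation to `do_compile` in utils.py, which trains against `custom_loss`: the mean of sqrt((y_pred - y_true)^2 + epsilon^2) with epsilon = 1e-6. This is a smooth stand-in for the mean absolute error. The sketch below mirrors that formula in plain Python for illustration only; the name `custom_loss_py` is hypothetical, and the repository's version works on Keras backend tensors rather than lists.

```python
import math


def custom_loss_py(y_true, y_pred, epsilon=1e-6):
    # Plain-Python mirror of custom_loss in utils.py (hypothetical name, illustration only).
    # Each term is sqrt(diff^2 + epsilon^2), which stays differentiable at diff == 0.
    terms = [math.sqrt((p - t) ** 2 + epsilon ** 2) for t, p in zip(y_true, y_pred)]
    return sum(terms) / len(terms)


if __name__ == '__main__':
    y_true = [0.0, 0.5, 1.0]
    y_pred = [0.0, 0.25, 0.5]
    mae = sum(abs(p - t) for t, p in zip(y_true, y_pred)) / len(y_true)
    # The two values differ only by a term on the order of epsilon.
    print(custom_loss_py(y_true, y_pred), mae)
```

For identical inputs the loss bottoms out at epsilon rather than zero, which is the price of keeping the square root differentiable everywhere.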
--------------------------------------------------------------------------------
/plot_model.py:
--------------------------------------------------------------------------------
# dependency: pip install pydot & brew install graphviz
from new_start import autoencoder
from keras.utils import plot_model


if __name__ == '__main__':
    img_rows, img_cols = 320, 320
    channel = 3
    model = autoencoder(img_rows, img_cols, channel)
    plot_model(model, to_file='model.svg', show_layer_names=True, show_shapes=True)

--------------------------------------------------------------------------------
/pre-process.py:
--------------------------------------------------------------------------------
# -*- coding: utf-8 -*-

import os
import random
import tarfile

import cv2 as cv
import numpy as np
import scipy.io
from console_progressbar import ProgressBar


def ensure_folder(folder):
    if not os.path.exists(folder):
        os.makedirs(folder)


def save_train_data(fnames, bboxes):
    src_folder = 'cars_train'
    num_samples = len(fnames)

    train_split = 0.8
    num_train = int(round(num_samples * train_split))
    train_indexes = random.sample(range(num_samples), num_train)
    # The format string needs a placeholder, otherwise the indexes are never printed.
    print('train_indexes: {}'.format(str(train_indexes)))

    pb = ProgressBar(total=100, prefix='Save train data', suffix='', decimals=3, length=50, fill='=')

    for i in range(num_samples):
        fname = fnames[i]
        (x1, y1, x2, y2) = bboxes[i]
        src_path = os.path.join(src_folder, fname)
        src_image = cv.imread(src_path)
        height, width = src_image.shape[:2]
        # margins of 16 pixels
        margin = 16
        x1 = max(0, x1 - margin)
        y1 = max(0, y1 - margin)
        x2 = min(x2 + margin, width)
        y2 = min(y2 + margin, height)
        # print(fname)
        pb.print_progress_bar((i + 1) * 100 / num_samples)

        if i in train_indexes:
            dst_folder = 'data/train'
        else:
            dst_folder = 'data/valid'
        dst_path = os.path.join(dst_folder, fname)
        crop_image = src_image[y1:y2, x1:x2]
        # cv.resize expects dsize as (width, height)
        dst_img = cv.resize(src=crop_image, dsize=(img_width, img_height))
        cv.imwrite(dst_path, dst_img)
    print('\n')


def save_test_data(fnames, bboxes):
    src_folder = 'cars_test'
    dst_folder = 'data/test'
    num_samples = len(fnames)

    pb = ProgressBar(total=100, prefix='Save test data', suffix='', decimals=3, length=50, fill='=')

    for i in range(num_samples):
        fname = fnames[i]
        (x1, y1, x2, y2) = bboxes[i]
        src_path = os.path.join(src_folder, fname)
        src_image = cv.imread(src_path)
        height, width = src_image.shape[:2]
        # margins of 16 pixels
        margin = 16
        x1 = max(0, x1 - margin)
        y1 = max(0, y1 - margin)
        x2 = min(x2 + margin, width)
        y2 = min(y2 + margin, height)
        # print(fname)
        pb.print_progress_bar((i + 1) * 100 / num_samples)
        dst_path = os.path.join(dst_folder, fname)
        crop_image = src_image[y1:y2, x1:x2]
        # cv.resize expects dsize as (width, height)
        dst_img = cv.resize(src=crop_image, dsize=(img_width, img_height))
        cv.imwrite(dst_path, dst_img)
    print('\n')


def process_data(usage):
    print("Processing {} data...".format(usage))
    cars_annos = scipy.io.loadmat('devkit/cars_{}_annos'.format(usage))
    annotations = cars_annos['annotations']
    annotations = np.transpose(annotations)

    fnames = []
    bboxes = []

    for annotation in annotations:
        bbox_x1 = annotation[0][0][0][0]
        bbox_y1 = annotation[0][1][0][0]
        bbox_x2 = annotation[0][2][0][0]
        bbox_y2 = annotation[0][3][0][0]
        if usage == 'train':
            class_id = annotation[0][4][0][0]
            fname = annotation[0][5][0]
        else:
            fname = annotation[0][4][0]
        bboxes.append((bbox_x1, bbox_y1, bbox_x2, bbox_y2))
        fnames.append(fname)

    if usage == 'train':
        save_train_data(fnames, bboxes)
    else:
        save_test_data(fnames, bboxes)


if __name__ == '__main__':
    # parameters
    img_width, img_height = 320, 320

    print('Extracting cars_train.tgz...')
    if not os.path.exists('cars_train'):
        with tarfile.open('cars_train.tgz', "r:gz") as tar:
            tar.extractall()
    print('Extracting cars_test.tgz...')
    if not os.path.exists('cars_test'):
        with tarfile.open('cars_test.tgz', "r:gz") as tar:
            tar.extractall()
    print('Extracting car_devkit.tgz...')
    if not os.path.exists('devkit'):
        with tarfile.open('car_devkit.tgz', "r:gz") as tar:
            tar.extractall()

    cars_meta = scipy.io.loadmat('devkit/cars_meta')
    class_names = cars_meta['class_names']  # shape=(1, 196)
    class_names = np.transpose(class_names)
    print('class_names.shape: ' + str(class_names.shape))
    print('Sample class_name: [{}]'.format(class_names[8][0][0]))

    ensure_folder('data/train')
    ensure_folder('data/valid')
    ensure_folder('data/test')

    process_data('train')
    process_data('test')

    # clean up
    # shutil.rmtree('cars_train')
    # shutil.rmtree('cars_test')
    # shutil.rmtree('devkit')

--------------------------------------------------------------------------------
/resnet_152.py:
--------------------------------------------------------------------------------
# -*- coding: utf-8 -*-

from keras.optimizers import SGD
from keras.layers import Input, Dense, Conv2D, MaxPooling2D, AveragePooling2D, ZeroPadding2D, Flatten, Activation, add
from keras.layers.normalization import BatchNormalization
from keras.models import Model
from keras import backend as K

from custom_layers.scale_layer import Scale

import sys
sys.setrecursionlimit(3000)

def identity_block(input_tensor, kernel_size, filters, stage, block):
    '''The identity_block is the block that has no conv layer at shortcut
    # Arguments
        input_tensor: input tensor
        kernel_size: default 3, the kernel size of middle conv layer at main path
        filters: list of integers, the nb_filters of 3 conv layer at main path
        stage: integer, current stage label, used for generating layer names
        block: 'a','b'..., current block label, used for generating layer names
    '''
    eps = 1.1e-5
    nb_filter1, nb_filter2, nb_filter3 = filters
    conv_name_base = 'res' + str(stage) + block + '_branch'
    bn_name_base = 'bn' + str(stage) + block + '_branch'
    scale_name_base = 'scale' + str(stage) + block + '_branch'

    x = Conv2D(nb_filter1, (1, 1), name=conv_name_base + '2a', use_bias=False)(input_tensor)
    x = BatchNormalization(epsilon=eps, axis=bn_axis, name=bn_name_base + '2a')(x)
    x = Scale(axis=bn_axis, name=scale_name_base + '2a')(x)
    x = Activation('relu', name=conv_name_base + '2a_relu')(x)

    x = ZeroPadding2D((1, 1), name=conv_name_base + '2b_zeropadding')(x)
    x = Conv2D(nb_filter2, (kernel_size, kernel_size),
               name=conv_name_base + '2b', use_bias=False)(x)
    x = BatchNormalization(epsilon=eps, axis=bn_axis, name=bn_name_base + '2b')(x)
    x = Scale(axis=bn_axis, name=scale_name_base + '2b')(x)
    x = Activation('relu', name=conv_name_base + '2b_relu')(x)

    x = Conv2D(nb_filter3, (1, 1), name=conv_name_base + '2c', use_bias=False)(x)
    x = BatchNormalization(epsilon=eps, axis=bn_axis, name=bn_name_base + '2c')(x)
    x = Scale(axis=bn_axis, name=scale_name_base + '2c')(x)

    x = add([x, input_tensor], name='res' + str(stage) + block)
    x = Activation('relu', name='res' + str(stage) + block + '_relu')(x)
    return x

def conv_block(input_tensor, kernel_size, filters, stage, block, strides=(2, 2)):
    '''conv_block is the block that has a conv layer at shortcut
    # Arguments
        input_tensor: input tensor
        kernel_size: default 3, the kernel size of middle conv layer at main path
        filters: list of integers, the nb_filters of 3 conv layer at main path
        stage: integer, current stage label, used for generating layer names
        block: 'a','b'..., current block label, used for generating layer names
    Note that from stage 3, the first conv layer at main path is with subsample=(2,2)
    And the shortcut should have subsample=(2,2) as well
    '''
    eps = 1.1e-5
    nb_filter1, nb_filter2, nb_filter3 = filters
    conv_name_base = 'res' + str(stage) + block + '_branch'
    bn_name_base = 'bn' + str(stage) + block + '_branch'
    scale_name_base = 'scale' + str(stage) + block + '_branch'

    x = Conv2D(nb_filter1, (1, 1), strides=strides,
               name=conv_name_base + '2a', use_bias=False)(input_tensor)
    x = BatchNormalization(epsilon=eps, axis=bn_axis, name=bn_name_base + '2a')(x)
    x = Scale(axis=bn_axis, name=scale_name_base + '2a')(x)
    x = Activation('relu', name=conv_name_base + '2a_relu')(x)

    x = ZeroPadding2D((1, 1), name=conv_name_base + '2b_zeropadding')(x)
    x = Conv2D(nb_filter2, (kernel_size, kernel_size),
               name=conv_name_base + '2b', use_bias=False)(x)
    x = BatchNormalization(epsilon=eps, axis=bn_axis, name=bn_name_base + '2b')(x)
    x = Scale(axis=bn_axis, name=scale_name_base + '2b')(x)
    x = Activation('relu', name=conv_name_base + '2b_relu')(x)

    x = Conv2D(nb_filter3, (1, 1), name=conv_name_base + '2c', use_bias=False)(x)
    x = BatchNormalization(epsilon=eps, axis=bn_axis, name=bn_name_base + '2c')(x)
    x = Scale(axis=bn_axis, name=scale_name_base + '2c')(x)

    shortcut = Conv2D(nb_filter3, (1, 1), strides=strides,
                      name=conv_name_base + '1', use_bias=False)(input_tensor)
    shortcut = BatchNormalization(epsilon=eps, axis=bn_axis, name=bn_name_base + '1')(shortcut)
    shortcut = Scale(axis=bn_axis, name=scale_name_base + '1')(shortcut)

    x = add([x, shortcut], name='res' + str(stage) + block)
    x = Activation('relu', name='res' + str(stage) + block + '_relu')(x)
    return x

def resnet152_model(img_rows, img_cols, color_type=1, num_classes=None):
    """
    Resnet 152 Model for Keras

    Model Schema and layer naming follow that of the original Caffe implementation
    https://github.com/KaimingHe/deep-residual-networks

    ImageNet Pretrained Weights
    Theano: https://drive.google.com/file/d/0Byy2AcGyEVxfZHhUT3lWVWxRN28/view?usp=sharing
    TensorFlow: https://drive.google.com/file/d/0Byy2AcGyEVxfeXExMzNNOHpEODg/view?usp=sharing

    Parameters:
      img_rows, img_cols - resolution of inputs
      color_type - 1 for grayscale, 3 for color
      num_classes - number of class labels for our classification task
    """
    eps = 1.1e-5

    # Handle Dimension Ordering for different backends
    global bn_axis
    if K.image_dim_ordering() == 'tf':
        bn_axis = 3
        img_input = Input(shape=(img_rows, img_cols, color_type), name='data')
    else:
        bn_axis = 1
        img_input = Input(shape=(color_type, img_rows, img_cols), name='data')

    x = ZeroPadding2D((3, 3), name='conv1_zeropadding')(img_input)
    x = Conv2D(64, (7, 7), strides=(2, 2), name='conv1', use_bias=False)(x)
    x = BatchNormalization(epsilon=eps, axis=bn_axis, name='bn_conv1')(x)
    x = Scale(axis=bn_axis, name='scale_conv1')(x)
    x = Activation('relu', name='conv1_relu')(x)
    x = MaxPooling2D((3, 3), strides=(2, 2), name='pool1')(x)

    x = conv_block(x, 3, [64, 64, 256], stage=2, block='a', strides=(1, 1))
    x = identity_block(x, 3, [64, 64, 256], stage=2, block='b')
    x = identity_block(x, 3, [64, 64, 256], stage=2, block='c')

    x = conv_block(x, 3, [128, 128, 512], stage=3, block='a')
    for i in range(1,8):
        x = identity_block(x, 3, [128, 128, 512], stage=3, block='b'+str(i))

    x = conv_block(x, 3, [256, 256, 1024], stage=4, block='a')
    for i in range(1,36):
        x = identity_block(x, 3, [256, 256, 1024], stage=4, block='b'+str(i))

    x = conv_block(x, 3, [512, 512, 2048], stage=5, block='a')
    x = identity_block(x, 3, [512, 512, 2048], stage=5, block='b')
    x = identity_block(x, 3, [512, 512, 2048], stage=5, block='c')

    x_fc = AveragePooling2D((7, 7), name='avg_pool')(x)
    x_fc = Flatten()(x_fc)
    x_fc = Dense(1000, activation='softmax', name='fc1000')(x_fc)

    model = Model(img_input, x_fc)

    if K.image_dim_ordering() == 'th':
        # Use pre-trained weights for Theano backend
        weights_path = 'models/resnet152_weights_th.h5'
    else:
        # Use pre-trained weights for Tensorflow backend
        weights_path = 'models/resnet152_weights_tf.h5'

    model.load_weights(weights_path, by_name=True)

    # Truncate and replace softmax layer for transfer learning
    # Cannot use model.layers.pop() since model is not of Sequential() type
    # The method below works since pre-trained weights are stored in layers but not in the model
    x_newfc = AveragePooling2D((7, 7), name='avg_pool')(x)
    x_newfc = Flatten()(x_newfc)
    x_newfc = Dense(num_classes, activation='softmax', name='fc8')(x_newfc)

    model = Model(img_input, x_newfc)

    # Learning rate is changed to 0.001
    sgd = SGD(lr=1e-3, decay=1e-6, momentum=0.9, nesterov=True)
    model.compile(optimizer=sgd, loss='categorical_crossentropy', metrics=['accuracy'])

    return model

if __name__ == '__main__':

    # Example to fine-tune on 3000 samples from Cifar10

    img_rows, img_cols = 224, 224  # Resolution of inputs
    channel = 3
    num_classes = 10
    batch_size = 8
    epochs = 10

    # Load our model
    model = resnet152_model(img_rows, img_cols, channel, num_classes)

    print(model.output_shape)
    print(model.summary())

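As a sanity check on the block layout in `resnet152_model` above, the sketch below counts weighted layers from the per-stage block counts: one `conv_block` followed by 2, 7, 35 and 2 identity blocks for stages 2 to 5. It is a standalone illustration, not part of the repository, and the name `count_weighted_layers` is hypothetical. It assumes the usual counting convention: the three main-path convolutions of each bottleneck block, plus `conv1` and the final dense layer, with shortcut convolutions left out.

```python
# Hypothetical helper, not part of the repository.
# Blocks per stage as built in resnet152_model: one conv_block plus the identity blocks.
BLOCKS_PER_STAGE = {2: 1 + 2, 3: 1 + 7, 4: 1 + 35, 5: 1 + 2}


def count_weighted_layers(blocks_per_stage):
    # Each bottleneck block has three convolutions on its main path (branches 2a, 2b, 2c).
    main_path_convs = 3 * sum(blocks_per_stage.values())
    # Add the stem convolution (conv1) and the final fully connected layer.
    return main_path_convs + 2


if __name__ == '__main__':
    print(count_weighted_layers(BLOCKS_PER_STAGE))
```

The 50 bottleneck blocks contribute 150 convolutions, which together with the stem and the classifier gives the 152 in the model's name.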
--------------------------------------------------------------------------------
/test.py:
--------------------------------------------------------------------------------
import cv2 as cv
import keras.backend as K
import numpy as np

from utils import custom_loss

if __name__ == '__main__':
    file_id = '07647'
    filename = 'images/samples/{}.jpg'.format(file_id)
    bgr_img = cv.imread(filename)
    y_true = cv.cvtColor(bgr_img, cv.COLOR_BGR2GRAY)
    cv.imwrite('images/samples/{}_gray.jpg'.format(file_id), y_true)
    y_true = y_true / 255.
    y_true = y_true.astype(np.float32)

    filename = 'images/samples/{}_out.jpg'.format(file_id)
    y_pred = cv.imread(filename, 0)
    y_pred = y_pred / 255.
    y_pred = y_pred.astype(np.float32)
    loss = custom_loss(y_true, y_pred)
    print(K.eval(loss))

--------------------------------------------------------------------------------
/train.py:
--------------------------------------------------------------------------------
import keras
from keras.callbacks import ModelCheckpoint, EarlyStopping, ReduceLROnPlateau
from keras.optimizers import SGD

import migrate
from model import create_model
from utils import load_data, custom_loss

if __name__ == '__main__':
    batch_size = 16
    epochs = 1000
    patience = 50

    # Load our model
    model = create_model()
    migrate.migrate_model(model)
    # model.compile(optimizer='nadam', loss=custom_loss)
    sgd = SGD(lr=1e-3, decay=1e-6, momentum=0.9, nesterov=True)
    model.compile(optimizer=sgd, loss=custom_loss)

    print(model.summary())

    # Load our data
    x_train, y_train, x_valid, y_valid = load_data()

    # Callbacks
    tensor_board = keras.callbacks.TensorBoard(log_dir='./logs', histogram_freq=0, write_graph=True, write_images=True)
    trained_models_path = 'models/model'
    model_names = trained_models_path + '.{epoch:02d}-{val_loss:.4f}.hdf5'
    model_checkpoint = ModelCheckpoint(model_names, monitor='val_loss', verbose=1, save_best_only=True)
    early_stop = EarlyStopping('val_loss', patience=patience)
    reduce_lr = ReduceLROnPlateau('val_loss', factor=0.1, patience=int(patience / 4), verbose=1)
    callbacks = [tensor_board, model_checkpoint, early_stop, reduce_lr]

    # Start Fine-tuning
    model.fit(x_train,
              y_train,
              validation_data=(x_valid, y_valid),
              batch_size=batch_size,
              epochs=epochs,
              callbacks=callbacks,
              verbose=1
              )

--------------------------------------------------------------------------------
/utils.py:
--------------------------------------------------------------------------------
import os

import cv2 as cv
import keras.backend as K
import numpy as np
from console_progressbar import ProgressBar
from keras.optimizers import SGD


def custom_loss(y_true, y_pred):
    epsilon = 1e-6
    epsilon_sqr = K.constant(epsilon ** 2)
    return K.mean(K.sqrt(K.square(y_pred - y_true) + epsilon_sqr))


def load_data():
    # (num_samples, 320, 320, 4)
    num_samples = 8144
    train_split = 0.8
    batch_size = 16
    num_train = int(round(num_samples * train_split))
    num_valid = num_samples - num_train
    pb = ProgressBar(total=100, prefix='Loading data', suffix='', decimals=3, length=50, fill='=')

    x_train = np.empty((num_train, 320, 320, 4), dtype=np.float32)
    y_train = np.empty((num_train, 320, 320, 1), dtype=np.float32)
    x_valid = np.empty((num_valid, 320, 320, 4), dtype=np.float32)
    y_valid = np.empty((num_valid, 320, 320, 1), dtype=np.float32)

    i_train = i_valid = 0
    for root, dirs, files in os.walk("data", topdown=False):
        for name in files:
            filename = os.path.join(root, name)
            bgr_img = cv.imread(filename)
            gray_img = cv.cvtColor(bgr_img, cv.COLOR_BGR2GRAY)
            rgb_img = cv.cvtColor(bgr_img, cv.COLOR_BGR2RGB)
            if filename.startswith('data/train'):
                x_train[i_train, :, :, 0:3] = rgb_img / 255.
                x_train[i_train, :, :, 3] = np.random.uniform(0, 1, (320, 320))
                y_train[i_train, :, :, 0] = gray_img / 255.
                i_train += 1
            elif filename.startswith('data/valid'):
                x_valid[i_valid, :, :, 0:3] = rgb_img / 255.
                x_valid[i_valid, :, :, 3] = np.random.uniform(0, 1, (320, 320))
                y_valid[i_valid, :, :, 0] = gray_img / 255.
                i_valid += 1

            i = i_train + i_valid
            if i % batch_size == 0:
                pb.print_progress_bar(i * 100 / num_samples)

    return x_train, y_train, x_valid, y_valid


def do_compile(model):
    sgd = SGD(lr=1e-3, decay=1e-6, momentum=0.99, nesterov=True)
    # model.compile(optimizer='nadam', loss=custom_loss)
    model.compile(optimizer=sgd, loss=custom_loss)
    return model

--------------------------------------------------------------------------------
/vgg16.py:
--------------------------------------------------------------------------------
# -*- coding: utf-8 -*-

import keras.backend as K
from keras.layers import Conv2D, ZeroPadding2D, MaxPooling2D
from keras.layers import Dense, Dropout, Flatten
from keras.models import Sequential


def vgg16_model(img_rows, img_cols, channel=3):
    model = Sequential()
    # Encoder
    model.add(ZeroPadding2D((1, 1), input_shape=(img_rows, img_cols, channel), name='input'))
    model.add(Conv2D(64, (3, 3), activation='relu', name='conv1_1'))
    model.add(ZeroPadding2D((1, 1)))
    model.add(Conv2D(64, (3, 3), activation='relu', name='conv1_2'))
    model.add(MaxPooling2D((2, 2), strides=(2, 2)))

    model.add(ZeroPadding2D((1, 1)))
    model.add(Conv2D(128, (3, 3), activation='relu', name='conv2_1'))
    model.add(ZeroPadding2D((1, 1)))
    model.add(Conv2D(128, (3, 3), activation='relu', name='conv2_2'))
    model.add(MaxPooling2D((2, 2), strides=(2, 2)))

    model.add(ZeroPadding2D((1, 1)))
    model.add(Conv2D(256, (3, 3), activation='relu', name='conv3_1'))
    model.add(ZeroPadding2D((1, 1)))
    model.add(Conv2D(256, (3, 3), activation='relu', name='conv3_2'))
    model.add(ZeroPadding2D((1, 1)))
    model.add(Conv2D(256, (3, 3), activation='relu', name='conv3_3'))
    model.add(MaxPooling2D((2, 2), strides=(2, 2)))

    model.add(ZeroPadding2D((1, 1)))
    model.add(Conv2D(512, (3, 3), activation='relu', name='conv4_1'))
    model.add(ZeroPadding2D((1, 1)))
    model.add(Conv2D(512, (3, 3), activation='relu', name='conv4_2'))
    model.add(ZeroPadding2D((1, 1)))
    model.add(Conv2D(512, (3, 3), activation='relu', name='conv4_3'))
    model.add(MaxPooling2D((2, 2), strides=(2, 2)))

    model.add(ZeroPadding2D((1, 1)))
    model.add(Conv2D(512, (3, 3), activation='relu', name='conv5_1'))
    model.add(ZeroPadding2D((1, 1)))
    model.add(Conv2D(512, (3, 3), activation='relu', name='conv5_2'))
    model.add(ZeroPadding2D((1, 1)))
    model.add(Conv2D(512, (3, 3), activation='relu', name='conv5_3'))
    model.add(MaxPooling2D((2, 2), strides=(2, 2)))

    # Add Fully Connected Layer
    model.add(Flatten(name='flatten'))
    model.add(Dense(4096, activation='relu', name='dense1'))
    model.add(Dropout(0.5))
    model.add(Dense(4096, activation='relu', name='dense2'))
    model.add(Dropout(0.5))
    model.add(Dense(1000, activation='softmax', name='softmax'))

    # Loads ImageNet pre-trained data
    weights_path = 'models/vgg16_weights_tf_dim_ordering_tf_kernels.h5'
    model.load_weights(weights_path)

    return model


if __name__ == '__main__':
    model = vgg16_model(224, 224, 3)
    # input_layer = model.get_layer('input')
    print(model.summary())

    K.clear_session()

--------------------------------------------------------------------------------
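To see how the input size propagates through `vgg16_model` above, the sketch below traces the spatial shape across the five conv/pool stages. Every 3x3 convolution is preceded by `ZeroPadding2D((1, 1))`, so it preserves height and width, and every 2x2 max-pool with stride 2 halves them. The name `trace_vgg16_shapes` is hypothetical and the code is plain Python rather than Keras; it is meant as an illustration, not as part of the repository.

```python
# Hypothetical helper, not part of the repository; pure Python, no Keras needed.
STAGE_FILTERS = [64, 128, 256, 512, 512]  # output channels of conv1_x .. conv5_x


def trace_vgg16_shapes(img_rows, img_cols):
    """Return the (rows, cols, channels) shape after each max-pooling layer."""
    shapes = []
    rows, cols = img_rows, img_cols
    for filters in STAGE_FILTERS:
        # Padded 3x3 convolutions keep rows/cols unchanged;
        # MaxPooling2D((2, 2), strides=(2, 2)) halves them, rounding down.
        rows, cols = rows // 2, cols // 2
        shapes.append((rows, cols, filters))
    return shapes


if __name__ == '__main__':
    rows, cols, channels = trace_vgg16_shapes(224, 224)[-1]
    print((rows, cols, channels))
    # Number of values that Flatten passes to the first Dense layer.
    print(rows * cols * channels)
```

For a 224x224 input the trace ends at (7, 7, 512), so `flatten` feeds 25088 values into `dense1`. A 320x320 input, the size used by model.py and pre-process.py, would end at (10, 10, 512) instead.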