├── .gitignore ├── README.md ├── _config.yml ├── dataset ├── cifar100_dataset.py ├── cifar10_dataset.py ├── dataset.py └── mnist_dataset.py ├── models ├── alexnet.py ├── densenet.py ├── googlenet.py ├── imgclfmodel.py ├── inception_resnet_v1.py ├── inception_resnet_v2.py ├── inception_v2.py ├── inception_v3.py ├── inception_v4.py ├── resnet.py ├── resnet_v2.py └── vgg.py ├── overview.png ├── requirements.txt ├── test.py └── trainers ├── clftrainer.py └── predefined_loss.py /.gitignore: -------------------------------------------------------------------------------- 1 | *.tar.gz 2 | *.pyc 3 | *.floydexpt 4 | *.floydignore 5 | *.yml 6 | *.p 7 | cifar-10-batches-py/* 8 | *.ckpt-*.* 9 | checkpoint 10 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # DeepModels 2 | 3 | 4 | 5 | This repository implements and tests state-of-the-art deep learning models published since 2012, when AlexNet emerged. Pre-trained models for each dataset will be provided later. 6 | 7 | Trying out a state-of-the-art deep learning model also requires a dataset to feed into it and a training method. This repository has three main parts, **Dataset**, **Model**, and **Trainer**, to ease this process. 8 | 9 | A dataset and a model are passed to a trainer, which then knows how to run training from scratch, resume training from where the last run left off, and run transfer learning. 10 | 11 | ## Dependencies 12 | - numpy >= 1.14.5 13 | - scikit-image >= 0.12.3 14 | - tensorflow >= 1.6 15 | - tqdm >= 4.11.2 16 | - urllib3 >= 1.23 17 | 18 | ```sh 19 | # install all the requirements.
20 | 21 | pip install -r requirements.txt 22 | ``` 23 | 24 | ## Testing Environment 25 | - macOS High Sierra (10.13.6) + eGPU enclosure (Akitio Node) + NVIDIA GTX 1080Ti 26 | - [FloydHub](https://www.floydhub.com/) + NVIDIA TESLA K80, + NVIDIA TESLA V100 27 | - [GCP Cloud ML Engine](https://cloud.google.com/ml-engine/) + NVIDIA TESLA K80, + NVIDIA TESLA P100, + NVIDIA TESLA V100 28 | 29 | ## Pre-defined Classes 30 | #### Datasets 31 | - **[MNIST](http://yann.lecun.com/exdb/mnist)** 32 | - 10 classes of handwritten digit images, 28x28 pixels each 33 | - 60,000 training images, 10,000 testing images 34 | - **[CIFAR-10](https://www.cs.toronto.edu/~kriz/cifar.html)** 35 | - 10 classes of color images, 32x32 pixels each 36 | - 50,000 training images, 10,000 testing images 37 | - 6,000 images per class 38 | - **[CIFAR-100](https://www.cs.toronto.edu/~kriz/cifar.html)** 39 | - 100 classes of color images, 32x32 pixels each 40 | - 600 images per class 41 | - 500 training images, 100 testing images per class 42 | - **Things to be added** 43 | - **[STL-10](https://cs.stanford.edu/~acoates/stl10/)** 44 | - **[ImageNet](http://www.image-net.org/)** 45 | 46 | #### Models 47 | - **[AlexNet](https://papers.nips.cc/paper/4824-imagenet-classification-with-deep-convolutional-neural-networks.pdf)** | 2012 | [[CODE]](./models/alexnet.py) 48 | - **[VGG](https://arxiv.org/pdf/1409.1556.pdf)** | 2014 | [[CODE]](./models/vgg.py) 49 | - model types 50 | - **A:** 11 layers, **A-LRN:** 11 layers with LRN (Local Response Normalization) 51 | - **B:** 13 layers, **C:** 13 layers plus additional convolutional layers with 1x1 kernels 52 | - **D:** 16 layers (known as **VGG16**) 53 | - **E:** 19 layers (known as **VGG19**) 54 | - **[Inception V1 (GoogLeNet)](https://arxiv.org/pdf/1409.4842.pdf)** | 2014 | [[CODE]](./models/googlenet.py) 55 | - **[Residual Network](https://arxiv.org/pdf/1512.03385.pdf)** | 2015 | [[CODE]](./models/resnet.py) 56 | - model types (depth): 18, 34, 50, 101, 152
57 | - **[Inception V2](https://arxiv.org/pdf/1512.00567v3.pdf)** | 2015 | [[CODE]](./models/inception_v2.py) 58 | - **[Inception V3](https://arxiv.org/pdf/1512.00567v3.pdf)** | 2015 | [[CODE]](./models/inception_v3.py) 59 | - **[Residual Network V2](https://arxiv.org/pdf/1603.05027.pdf)** | 2016 | [[CODE]](./models/resnet_v2.py) 60 | - model types (depth): 18, 34, 50, 101, 152, 200 61 | - **[Inception V4](https://arxiv.org/pdf/1602.07261.pdf)** | 2016 | [[CODE]](./models/inception_v4.py) 62 | - **[Inception+Resnet V1](https://arxiv.org/pdf/1602.07261.pdf)** | 2016 | [[CODE]](./models/inception_resnet_v1.py) 63 | - **[Inception+Resnet V2](https://arxiv.org/pdf/1602.07261.pdf)** | 2016 | [[CODE]](./models/inception_resnet_v2.py) 64 | - **[DenseNet](https://arxiv.org/pdf/1608.06993.pdf)** | 2017 | [[CODE]](./models/densenet.py) 65 | - model types (depth): 121, 169, 201, 264 66 | - **Things to be added** 67 | - **[SqueezeNet](https://arxiv.org/abs/1602.07360)** | 2016 68 | - **[MobileNet](https://arxiv.org/pdf/1704.04861.pdf)** | 2017 69 | - **[NASNet](https://arxiv.org/pdf/1707.07012.pdf)** | 2017 70 | 71 | #### Trainers 72 | - ClfTrainer: Trainer for image classification like ILSVRC 73 | 74 | ## Pre-trained accuracy (coming soon) 75 | - AlexNet 76 | - VGG 77 | - Inception V1 (GoogLeNet) 78 | 79 | ## Example Usage Code Blocks 80 | #### Define hyper-parameters 81 | ```python 82 | learning_rate = 0.0001 83 | epochs = 1 84 | batch_size = 64 85 | ``` 86 | 87 | #### Train from nothing 88 | ```python 89 | from dataset.cifar10_dataset import Cifar10 90 | 91 | from models.googlenet import GoogLeNet 92 | from trainers.clftrainer import ClfTrainer 93 | 94 | inceptionv1 = GoogLeNet() 95 | cifar10_dataset = Cifar10() 96 | trainer = ClfTrainer(inceptionv1, cifar10_dataset) 97 | trainer.run_training(epochs, batch_size, learning_rate, 98 | './inceptionv1-cifar10.ckpt') 99 | ``` 100 | 101 | #### Train from where left off 102 | ```python 103 | from dataset.cifar10_dataset import 
Cifar10 104 | 105 | from models.googlenet import GoogLeNet 106 | from trainers.clftrainer import ClfTrainer 107 | 108 | inceptionv1 = GoogLeNet() 109 | cifar10_dataset = Cifar10() 110 | trainer = ClfTrainer(inceptionv1, cifar10_dataset) 111 | trainer.resume_training_from_ckpt(epochs, batch_size, learning_rate, 112 | './inceptionv1-cifar10.ckpt-1', './new-inceptionv1-cifar10.ckpt') 113 | ``` 114 | 115 | #### Transfer Learning 116 | ```python 117 | from dataset.cifar100_dataset import Cifar100 118 | 119 | from models.googlenet import GoogLeNet 120 | from trainers.clftrainer import ClfTrainer 121 | 122 | inceptionv1 = GoogLeNet() 123 | cifar100_dataset = Cifar100() 124 | trainer = ClfTrainer(inceptionv1, cifar100_dataset) 125 | trainer.run_transfer_learning(epochs, batch_size, learning_rate, 126 | './new-inceptionv1-cifar10.ckpt-1', './inceptionv1-cifar100.ckpt') 127 | ``` 128 | 129 | #### Testing 130 | ```python 131 | from dataset.cifar100_dataset import Cifar100 132 | 133 | from models.googlenet import GoogLeNet 134 | from trainers.clftrainer import ClfTrainer 135 | 136 | # prepare images to test 137 | images = ... 138 | 139 | inceptionv1 = GoogLeNet() 140 | cifar100_dataset = Cifar100() 141 | trainer = ClfTrainer(inceptionv1, cifar100_dataset) 142 | results = trainer.run_testing(images, './inceptionv1-cifar100.ckpt-1') 143 | ``` 144 | 145 | ## Basic Workflow 146 | 1. Define/Instantiate a dataset 147 | 2. Define/Instantiate a model 148 | 3. Define/Instantiate a trainer with the dataset and the model 149 | 4. 
Begin training/resuming/transfer learning 150 | 151 | ## References 152 | - [TensorFlow official website](https://www.tensorflow.org/) 153 | - [CNN Receptive Field Calculator](http://fomoro.com/tools/receptive-fields/index.html) 154 | - [A Simple Guide to the Versions of the Inception Network](https://towardsdatascience.com/a-simple-guide-to-the-versions-of-the-inception-network-7fc52b863202) 155 | - [UNDERSTANDING RESIDUAL NETWORKS](https://towardsdatascience.com/understanding-residual-networks-9add4b664b03) 156 | - [Improving Inception and Image Classification in TensorFlow](https://ai.googleblog.com/2016/08/improving-inception-and-image.html) 157 | -------------------------------------------------------------------------------- /_config.yml: -------------------------------------------------------------------------------- 1 | theme: jekyll-theme-cayman -------------------------------------------------------------------------------- /dataset/cifar100_dataset.py: -------------------------------------------------------------------------------- 1 | from urllib.request import urlretrieve 2 | from os.path import isfile, isdir 3 | import os 4 | 5 | from tqdm import tqdm 6 | import tarfile 7 | import pickle 8 | import numpy as np 9 | 10 | import skimage 11 | import skimage.io 12 | import skimage.transform 13 | 14 | from dataset.dataset import Dataset 15 | from dataset.dataset import DownloadProgress 16 | 17 | class Cifar100(Dataset): 18 | def __init__(self): 19 | Dataset.__init__(self, name='Cifar-100', path='cifar-100-python', num_classes=100, num_batch=5) 20 | self.width = 32 21 | self.height = 32 22 | 23 | def __download__(self): 24 | if not isfile('cifar-100-python.tar.gz'): 25 | with DownloadProgress(unit='B', unit_scale=True, miniters=1, desc='CIFAR-100 Dataset') as pbar: 26 | urlretrieve( 27 | 'https://www.cs.toronto.edu/~kriz/cifar-100-python.tar.gz', 28 | 'cifar-100-python.tar.gz', 29 | pbar.hook) 30 | else: 31 | print('cifar-100-python.tar.gz already exists') 
32 | 33 | if not isdir(self.path): 34 | with tarfile.open('cifar-100-python.tar.gz') as tar: 35 | tar.extractall() 36 | tar.close() 37 | else: 38 | print('cifar100 dataset already exists') 39 | 40 | def __load_batch__(self, batch_id=1): 41 | with open(self.path + '/train', mode='rb') as file: 42 | # note the encoding type is 'latin1' 43 | batch = pickle.load(file, encoding='latin1') 44 | 45 | features = batch['data'].reshape((len(batch['data']), 3, 32, 32)).transpose(0, 2, 3, 1) 46 | labels = batch['fine_labels'] 47 | 48 | return features, labels 49 | 50 | def __preprocess_and_save_data__(self, valid_ratio=0.1): 51 | valid_features = [] 52 | valid_labels = [] 53 | flag = True 54 | 55 | features, labels = self.__load_batch__() 56 | 57 | index_of_validation = int(len(features) * valid_ratio) 58 | total_train_num = len(features) - int(len(features) * valid_ratio) 59 | n_batches = self.num_batch 60 | 61 | for batch_i in range(1, n_batches + 1): 62 | batch_filename = self.path + '/cifar100_preprocess_batch_' + str(batch_i) + '.p' 63 | 64 | if isfile(batch_filename): 65 | print(batch_filename + ' already exists') 66 | flag = False 67 | else: 68 | start_index = int((batch_i-1) * total_train_num/n_batches) 69 | end_index = int(start_index + total_train_num/n_batches) 70 | 71 | self.save_preprocessed_data(features[start_index:end_index], labels[start_index:end_index], batch_filename) 72 | 73 | valid_features.extend(features[-index_of_validation:]) 74 | valid_labels.extend(labels[-index_of_validation:]) 75 | 76 | # preprocess the all stacked validation dataset 77 | self.save_preprocessed_data(np.array(valid_features), np.array(valid_labels), self.path + '/cifar100_preprocess_validation.p') 78 | 79 | # load the test dataset 80 | with open(self.path + '/test', mode='rb') as file: 81 | batch = pickle.load(file, encoding='latin1') 82 | 83 | # preprocess the testing data 84 | test_features = batch['data'].reshape((len(batch['data']), 3, 32, 32)).transpose(0, 2, 3, 1) 85 | 
test_labels = batch['fine_labels'] 86 | 87 | # Preprocess and Save all testing data 88 | self.save_preprocessed_data(np.array(test_features), np.array(test_labels), self.path + '/cifar100_preprocess_testing.p') 89 | 90 | def get_batches_from(self, features, labels, batch_size): 91 | for start in range(0, len(features), batch_size): 92 | end = min(start + batch_size, len(features)) 93 | yield features[start:end], labels[start:end] 94 | 95 | def get_training_batches_from_preprocessed(self, batch_id, batch_size, scale_to_imagenet=False): 96 | filename = self.path + '/cifar100_preprocess_batch_' + str(batch_id) + '.p' 97 | features, labels = pickle.load(open(filename, mode='rb')) 98 | 99 | if scale_to_imagenet: 100 | features = self.convert_to_imagenet_size(features) 101 | 102 | return self.get_batches_from(features, labels, batch_size) 103 | 104 | def get_valid_set(self, scale_to_imagenet=False): 105 | valid_features, valid_labels = pickle.load(open(self.path + '/cifar100_preprocess_validation.p', mode='rb')) 106 | 107 | if scale_to_imagenet: 108 | valid_features = self.convert_to_imagenet_size(valid_features) 109 | 110 | return valid_features, valid_labels 111 | -------------------------------------------------------------------------------- /dataset/cifar10_dataset.py: -------------------------------------------------------------------------------- 1 | from urllib.request import urlretrieve 2 | from os.path import isfile, isdir 3 | 4 | from tqdm import tqdm 5 | import tarfile 6 | import pickle 7 | import numpy as np 8 | 9 | import skimage 10 | import skimage.io 11 | import skimage.transform 12 | 13 | from dataset.dataset import Dataset 14 | from dataset.dataset import DownloadProgress 15 | 16 | class Cifar10(Dataset): 17 | def __init__(self): 18 | Dataset.__init__(self, name='Cifar-10', path='cifar-10-batches-py', num_classes=10, num_batch=5) 19 | self.width = 32 20 | self.height = 32 21 | 22 | def __download__(self): 23 | if not isfile('cifar-10-python.tar.gz'): 24 
| with DownloadProgress(unit='B', unit_scale=True, miniters=1, desc='CIFAR-10 Dataset') as pbar: 25 | urlretrieve( 26 | 'https://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz', 27 | 'cifar-10-python.tar.gz', 28 | pbar.hook) 29 | else: 30 | print('cifar-10-python.tar.gz already exists') 31 | 32 | if not isdir(self.path): 33 | with tarfile.open('cifar-10-python.tar.gz') as tar: 34 | tar.extractall() 35 | tar.close() 36 | else: 37 | print('cifar10 dataset already exists') 38 | 39 | def __load_batch__(self, batch_id): 40 | with open(self.path + '/data_batch_' + str(batch_id), mode='rb') as file: 41 | # note the encoding type is 'latin1' 42 | batch = pickle.load(file, encoding='latin1') 43 | 44 | features = batch['data'].reshape((len(batch['data']), 3, 32, 32)).transpose(0, 2, 3, 1) 45 | labels = batch['labels'] 46 | 47 | return features, labels 48 | 49 | def __preprocess_and_save_data__(self, valid_ratio=0.1): 50 | n_batches = self.num_batch 51 | valid_features = [] 52 | valid_labels = [] 53 | flag = True 54 | 55 | for batch_i in range(1, n_batches + 1): 56 | batch_filename = 'cifar10_preprocess_batch_' + str(batch_i) + '.p' 57 | 58 | if isfile(batch_filename): 59 | print(batch_filename + ' already exists') 60 | flag = False 61 | else: 62 | features, labels = self.__load_batch__(batch_i) 63 | index_of_validation = int(len(features) * valid_ratio) 64 | 65 | self.save_preprocessed_data(features[:-index_of_validation], labels[:-index_of_validation], batch_filename) 66 | 67 | valid_features.extend(features[-index_of_validation:]) 68 | valid_labels.extend(labels[-index_of_validation:]) 69 | 70 | if flag: 71 | self.save_preprocessed_data(np.array(valid_features), np.array(valid_labels), 72 | 'cifar10_preprocess_validation.p') 73 | 74 | # load the test dataset 75 | with open(self.path + '/test_batch', mode='rb') as file: 76 | batch = pickle.load(file, encoding='latin1') 77 | 78 | test_filename = 'cifar10_preprocess_testing.p' 79 | 80 | if isfile(test_filename): 81 | 
print(test_filename + ' already exists') 82 | else: 83 | # preprocess the testing data 84 | test_features = batch['data'].reshape((len(batch['data']), 3, 32, 32)).transpose(0, 2, 3, 1) 85 | test_labels = batch['labels'] 86 | 87 | # Preprocess and Save all testing data 88 | self.save_preprocessed_data(np.array(test_features), np.array(test_labels), test_filename) 89 | 90 | def get_batches_from(self, features, labels, batch_size): 91 | for start in range(0, len(features), batch_size): 92 | end = min(start + batch_size, len(features)) 93 | yield features[start:end], labels[start:end] 94 | 95 | def get_training_batches_from_preprocessed(self, batch_id, batch_size, scale_to_imagenet=False): 96 | filename = 'cifar10_preprocess_batch_' + str(batch_id) + '.p' 97 | features, labels = pickle.load(open(filename, mode='rb')) 98 | 99 | if scale_to_imagenet: 100 | features = self.convert_to_imagenet_size(features) 101 | 102 | return self.get_batches_from(features, labels, batch_size) 103 | 104 | def get_valid_set(self, scale_to_imagenet=False): 105 | valid_features, valid_labels = pickle.load(open('cifar10_preprocess_validation.p', mode='rb')) 106 | 107 | if scale_to_imagenet: 108 | valid_features = self.convert_to_imagenet_size(valid_features) 109 | 110 | return valid_features, valid_labels 111 | -------------------------------------------------------------------------------- /dataset/dataset.py: -------------------------------------------------------------------------------- 1 | from urllib.request import urlretrieve 2 | from os.path import isfile, isdir 3 | 4 | from tqdm import tqdm 5 | import pickle 6 | import numpy as np 7 | 8 | import skimage 9 | import skimage.io 10 | import skimage.transform 11 | 12 | class DownloadProgress(tqdm): 13 | last_block = 0 14 | 15 | def hook(self, block_num=1, block_size=1, total_size=None): 16 | self.total = total_size 17 | self.update((block_num - self.last_block) * block_size) 18 | self.last_block = block_num 19 | 20 | class Dataset: 21 | 
def __init__(self, name, path, num_classes=-1, num_batch=1): 22 | self.name = name 23 | self.path = path 24 | self.num_batch = num_batch 25 | self.num_classes = num_classes 26 | 27 | self.__download__() 28 | self.__preprocess_and_save_data__() 29 | 30 | def convert_to_imagenet_size(self, images): 31 | tmp_images = [] 32 | for image in images: 33 | tmp_image = skimage.transform.resize(image, (224, 224), mode='constant') 34 | tmp_images.append(tmp_image) 35 | 36 | return np.array(tmp_images) 37 | 38 | def save_preprocessed_data(self, features, labels, filename): 39 | labels = self.one_hot_encode(labels) 40 | pickle.dump((features, labels), open(filename, 'wb')) 41 | 42 | def one_hot_encode(self, x): 43 | encoded = np.zeros((len(x), self.num_classes)) 44 | 45 | for idx, val in enumerate(x): 46 | encoded[idx][val] = 1 47 | 48 | return encoded 49 | 50 | def __download__(self): 51 | raise NotImplementedError 52 | 53 | def __preprocess_and_save_data__(self): 54 | raise NotImplementedError 55 | 56 | # load downloaded files (one could be split into multiple files(batches) like CIFAR-10) 57 | def __load_batch__(self, batch_id): 58 | raise NotImplementedError 59 | 60 | def get_valid_set(self, scale_to_imagenet=False): 61 | raise NotImplementedError 62 | 63 | def get_batches_from(self, features, labels, batch_size): 64 | raise NotImplementedError 65 | 66 | def get_training_batches_from_preprocessed(self, batch_id, batch_size, scale_to_imagenet=False): 67 | raise NotImplementedError 68 | -------------------------------------------------------------------------------- /dataset/mnist_dataset.py: -------------------------------------------------------------------------------- 1 | from urllib.request import urlretrieve 2 | from os.path import isfile, isdir 3 | import os 4 | 5 | from tqdm import tqdm 6 | import gzip 7 | import shutil 8 | import struct 9 | import tarfile 10 | import pickle 11 | import numpy as np 12 | 13 | import skimage 14 | import skimage.io 15 | import 
skimage.transform 16 | 17 | from dataset.dataset import Dataset 18 | from dataset.dataset import DownloadProgress 19 | 20 | class Mnist(Dataset): 21 | def __init__(self): 22 | Dataset.__init__(self, name='MNIST', path='mnist_dataset', num_classes=10, num_batch=5) 23 | self.width = 28 24 | self.height = 28 25 | 26 | def __download__(self): 27 | # training dataset 28 | if not isfile('train-images-idx3-ubyte.gz'): 29 | with DownloadProgress(unit='B', unit_scale=True, miniters=1, desc='MNIST training dataset (images)') as pbar: 30 | urlretrieve( 31 | 'http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz', 32 | 'train-images-idx3-ubyte.gz', 33 | pbar.hook) 34 | else: 35 | print('train-images-idx3-ubyte.gz already exists') 36 | 37 | if not isfile('train-labels-idx1-ubyte.gz'): 38 | with DownloadProgress(unit='B', unit_scale=True, miniters=1, desc='MNIST training dataset (labels)') as pbar: 39 | urlretrieve( 40 | 'http://yann.lecun.com/exdb/mnist/train-labels-idx1-ubyte.gz', 41 | 'train-labels-idx1-ubyte.gz', 42 | pbar.hook) 43 | else: 44 | print('train-labels-idx1-ubyte.gz already exists') 45 | 46 | # testing dataset 47 | if not isfile('t10k-images-idx3-ubyte.gz'): 48 | with DownloadProgress(unit='B', unit_scale=True, miniters=1, desc='MNIST testing dataset (images)') as pbar: 49 | urlretrieve( 50 | 'http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz', 51 | 't10k-images-idx3-ubyte.gz', 52 | pbar.hook) 53 | else: 54 | print('t10k-images-idx3-ubyte.gz already exists') 55 | 56 | if not isfile('t10k-labels-idx1-ubyte.gz'): 57 | with DownloadProgress(unit='B', unit_scale=True, miniters=1, desc='MNIST testing dataset (labels)') as pbar: 58 | urlretrieve( 59 | 'http://yann.lecun.com/exdb/mnist/t10k-labels-idx1-ubyte.gz', 60 | 't10k-labels-idx1-ubyte.gz', 61 | pbar.hook) 62 | else: 63 | print('t10k-labels-idx1-ubyte.gz already exists') 64 | 65 | if not os.path.isdir(self.path): 66 | os.mkdir(self.path) 67 | 68 | # unzip 69 | with 
gzip.open('./train-images-idx3-ubyte.gz', 'rb') as gz_in: 70 | with open(self.path + '/train-images-idx3-ubyte', 'wb') as gz_out: 71 | shutil.copyfileobj(gz_in, gz_out) 72 | 73 | with gzip.open('./train-labels-idx1-ubyte.gz', 'rb') as gz_in: 74 | with open(self.path + '/train-labels-idx1-ubyte', 'wb') as gz_out: 75 | shutil.copyfileobj(gz_in, gz_out) 76 | 77 | with gzip.open('./t10k-images-idx3-ubyte.gz', 'rb') as gz_in: 78 | with open(self.path + '/t10k-images-idx3-ubyte', 'wb') as gz_out: 79 | shutil.copyfileobj(gz_in, gz_out) 80 | 81 | with gzip.open('./t10k-labels-idx1-ubyte.gz', 'rb') as gz_in: 82 | with open(self.path + '/t10k-labels-idx1-ubyte', 'wb') as gz_out: 83 | shutil.copyfileobj(gz_in, gz_out) 84 | 85 | def __load_batch__(self, batch_id=1): 86 | with open(self.path + '/train-labels-idx1-ubyte', 'rb') as train_label_file: 87 | magic, num = struct.unpack(">II", train_label_file.read(8)) 88 | labels = np.fromfile(train_label_file, dtype=np.int8) 89 | 90 | with open(self.path + '/train-images-idx3-ubyte', mode='rb') as train_image_file: 91 | magic, num, rows, cols = struct.unpack(">IIII", train_image_file.read(16)) 92 | features = np.fromfile(train_image_file, dtype=np.uint8).reshape(len(labels), rows, cols) 93 | 94 | return features, labels 95 | 96 | def __preprocess_and_save_data__(self, valid_ratio=0.1): 97 | valid_features = [] 98 | valid_labels = [] 99 | flag = True 100 | 101 | features, labels = self.__load_batch__() 102 | 103 | # converting 1 channel data to 3 channel data 104 | tmp_features = [] 105 | for feature in features: 106 | tmp_features.append(np.resize(feature, (28, 28, 3))) 107 | 108 | # reordering b, w, h, c 109 | # (batch, width, height, channel) 110 | features = np.asarray(tmp_features) 111 | features = features.reshape(len(labels), 3, 28, 28).transpose(0, 2, 3, 1) 112 | 113 | index_of_validation = int(len(features) * valid_ratio) 114 | total_train_num = len(features) - int(len(features) * valid_ratio) 115 | n_batches = self.num_batch 
116 | 117 | for batch_i in range(1, n_batches + 1): 118 | batch_filename = self.path + '/mnist_preprocess_batch_' + str(batch_i) + '.p' 119 | 120 | if isfile(batch_filename): 121 | print(batch_filename + ' already exists') 122 | flag = False 123 | else: 124 | start_index = int((batch_i-1) * total_train_num/n_batches) 125 | end_index = int(start_index + total_train_num/n_batches) 126 | 127 | self.save_preprocessed_data(features[start_index:end_index], labels[start_index:end_index], batch_filename) 128 | 129 | valid_features.extend(features[-index_of_validation:]) 130 | valid_labels.extend(labels[-index_of_validation:]) 131 | 132 | # preprocess the all stacked validation dataset 133 | self.save_preprocessed_data(np.array(valid_features), np.array(valid_labels), self.path + '/mnist_preprocess_validation.p') 134 | 135 | # load the test dataset 136 | with open(self.path + '/t10k-labels-idx1-ubyte', 'rb') as test_label_file: 137 | magic, num = struct.unpack(">II", test_label_file.read(8)) 138 | test_labels = np.fromfile(test_label_file, dtype=np.int8) 139 | 140 | with open(self.path + '/t10k-images-idx3-ubyte', mode='rb') as test_image_file: 141 | magic, num, rows, cols = struct.unpack(">IIII", test_image_file.read(16)) 142 | test_features = np.fromfile(test_image_file, dtype=np.uint8).reshape(len(test_labels), rows, cols) 143 | 144 | # converting 1 channel data to 3 channel data 145 | tmp_features = [] 146 | for feature in test_features: 147 | tmp_features.append(np.resize(feature, (28, 28, 3))) 148 | 149 | test_features = np.asarray(tmp_features) 150 | test_features = test_features.reshape(len(test_labels), 3, 28, 28).transpose(0, 2, 3, 1) 151 | 152 | # Preprocess and Save all testing data 153 | self.save_preprocessed_data(np.array(test_features), np.array(test_labels), self.path + '/mnist_preprocess_test.p') 154 | 155 | def get_batches_from(self, features, labels, batch_size): 156 | for start in range(0, len(features), batch_size): 157 | end = min(start + batch_size, 
len(features)) 158 | yield features[start:end], labels[start:end] 159 | 160 | def get_training_batches_from_preprocessed(self, batch_id, batch_size, scale_to_imagenet=False): 161 | filename = self.path + '/mnist_preprocess_batch_' + str(batch_id) + '.p' 162 | features, labels = pickle.load(open(filename, mode='rb')) 163 | 164 | if scale_to_imagenet: 165 | features = self.convert_to_imagenet_size(features) 166 | 167 | return self.get_batches_from(features, labels, batch_size) 168 | 169 | def get_valid_set(self, scale_to_imagenet=False): 170 | filename = self.path + '/mnist_preprocess_validation.p' 171 | valid_features, valid_labels = pickle.load(open(filename, mode='rb')) 172 | 173 | if scale_to_imagenet: 174 | valid_features = self.convert_to_imagenet_size(valid_features) 175 | 176 | return valid_features, valid_labels 177 | -------------------------------------------------------------------------------- /models/alexnet.py: -------------------------------------------------------------------------------- 1 | from models.imgclfmodel import ImgClfModel 2 | from dataset.dataset import Dataset 3 | 4 | import tensorflow as tf 5 | from tensorflow.contrib.layers import conv2d 6 | from tensorflow.contrib.layers import max_pool2d 7 | from tensorflow.contrib.layers import flatten 8 | from tensorflow.contrib.layers import fully_connected 9 | 10 | """ 11 | Implementation of AlexNet from ILSVRC 2012. The original architecture was invented by Alex Krizhevsky at the University of Toronto. 12 | The original architecture was split across 2 GPUs because of hardware limitations such as a lack of memory. 13 | This implementation does not reproduce the 2-GPU split. 14 | 15 | The main technical contributions of this architecture are "Local Response Normalization (LRN)", 16 | "Dropout", "Rectified Linear Unit (ReLU)", and "Overlapping Max-pooling". Some of these techniques are 17 | still in use today. 
AlexNet was not the first to introduce these techniques, 18 | but it showed that they are very useful in deep learning. 19 | """ 20 | class AlexNet(ImgClfModel): 21 | def __init__(self): 22 | ImgClfModel.__init__(self, scale_to_imagenet=True) 23 | 24 | def create_model(self, input): 25 | # 1st 26 | with tf.variable_scope('group1'): 27 | self.conv1 = conv2d(input, num_outputs=96, 28 | kernel_size=[11,11], stride=4, padding="VALID", 29 | activation_fn=tf.nn.relu) 30 | self.lrn1 = tf.nn.local_response_normalization(self.conv1, bias=2, alpha=0.0001, beta=0.75) 31 | self.pool1 = max_pool2d(self.lrn1, kernel_size=[3,3], stride=2) 32 | 33 | # 2nd 34 | with tf.variable_scope('group2'): 35 | self.conv2 = conv2d(self.pool1, num_outputs=256, 36 | kernel_size=[5,5], stride=1, padding="VALID", 37 | biases_initializer=tf.ones_initializer(), 38 | activation_fn=tf.nn.relu) 39 | self.lrn2 = tf.nn.local_response_normalization(self.conv2, bias=2, alpha=0.0001, beta=0.75) 40 | self.pool2 = max_pool2d(self.lrn2, kernel_size=[3,3], stride=2) 41 | 42 | #3rd 43 | with tf.variable_scope('group3'): 44 | self.conv3 = conv2d(self.pool2, num_outputs=384, 45 | kernel_size=[3,3], stride=1, padding="VALID", 46 | activation_fn=tf.nn.relu) 47 | 48 | #4th 49 | with tf.variable_scope('group4'): 50 | self.conv4 = conv2d(self.conv3, num_outputs=384, 51 | kernel_size=[3,3], stride=1, padding="VALID", 52 | biases_initializer=tf.ones_initializer(), 53 | activation_fn=tf.nn.relu) 54 | 55 | #5th 56 | with tf.variable_scope('group5'): 57 | self.conv5 = conv2d(self.conv4, num_outputs=256, 58 | kernel_size=[3,3], stride=1, padding="VALID", 59 | biases_initializer=tf.ones_initializer(), 60 | activation_fn=tf.nn.relu) 61 | self.pool5 = max_pool2d(self.conv5, kernel_size=[3,3], stride=2) 62 | 63 | #6th 64 | with tf.variable_scope('fcl'): 65 | self.flat = flatten(self.pool5) 66 | self.fcl1 = fully_connected(self.flat, num_outputs=4096, 67 | biases_initializer=tf.ones_initializer(), 
activation_fn=tf.nn.relu) 68 | self.dr1 = tf.nn.dropout(self.fcl1, 0.5) 69 | 70 | #7th 71 | self.fcl2 = fully_connected(self.dr1, num_outputs=4096, 72 | biases_initializer=tf.ones_initializer(), activation_fn=tf.nn.relu) 73 | self.dr2 = tf.nn.dropout(self.fcl2, 0.5) 74 | 75 | #output 76 | with tf.variable_scope('final'): 77 | self.out = fully_connected(self.dr2, num_outputs=self.num_classes, activation_fn=None) 78 | 79 | return [self.out] -------------------------------------------------------------------------------- /models/densenet.py: -------------------------------------------------------------------------------- 1 | from models.imgclfmodel import ImgClfModel 2 | from dataset.dataset import Dataset 3 | 4 | import tensorflow as tf 5 | from tensorflow.contrib.layers import conv2d 6 | from tensorflow.contrib.layers import max_pool2d 7 | from tensorflow.contrib.layers import avg_pool2d 8 | from tensorflow.contrib.layers import flatten 9 | from tensorflow.contrib.layers import fully_connected 10 | 11 | class DenseNet(ImgClfModel): 12 | # model_type = [121 | 169 | 201 | 264] 13 | def __init__(self, model_type='121', k=32, theta=0.5): 14 | ImgClfModel.__init__(self, scale_to_imagenet=True, model_type=model_type) 15 | self.k = k 16 | self.theta = theta 17 | 18 | def create_model(self, input): 19 | k = self.k 20 | theta = self.theta 21 | 22 | with tf.variable_scope('initial_block'): 23 | conv = tf.layers.batch_normalization(input) 24 | conv = tf.nn.relu(conv) 25 | conv = conv2d(conv, num_outputs=2*k, 26 | kernel_size=[7,7], stride=2, padding='SAME', 27 | activation_fn=None) 28 | 29 | pool = max_pool2d(conv, kernel_size=[3,3], stride=2, padding='SAME') 30 | prev_kernels = 2*k 31 | input_kernels = prev_kernels 32 | 33 | cur_layer = pool 34 | layers_concat = list() 35 | 36 | with tf.variable_scope('dense_block_1'): 37 | for i in range(6): 38 | cur_kernels = 4 * k 39 | bottlenect = tf.layers.batch_normalization(cur_layer) 40 | bottlenect = tf.nn.relu(bottlenect) 41 | 
bottlenect = conv2d(bottlenect, num_outputs=cur_kernels, 42 | kernel_size=[1,1], stride=1, padding='SAME', 43 | activation_fn=None) 44 | bottlenect = tf.layers.dropout(bottlenect, 0.2) 45 | 46 | cur_kernels = input_kernels + (k * i) 47 | conv = tf.layers.batch_normalization(bottlenect) 48 | conv = tf.nn.relu(conv) 49 | conv = conv2d(conv, num_outputs=cur_kernels, 50 | kernel_size=[3,3], stride=1, padding='SAME', 51 | activation_fn=None) 52 | conv = tf.layers.dropout(conv, 0.2) 53 | 54 | layers_concat.append(conv) 55 | cur_layer = tf.concat(layers_concat, 3) 56 | print(cur_layer) 57 | prev_kernels = cur_kernels 58 | 59 | with tf.variable_scope('transition_block_1'): 60 | bottlenect = tf.layers.batch_normalization(cur_layer) 61 | bottlenect = conv2d(bottlenect, num_outputs=int(prev_kernels*theta), 62 | kernel_size=[1,1], stride=1, padding='SAME', 63 | activation_fn=tf.nn.relu) 64 | bottlenect = tf.layers.dropout(bottlenect, 0.2) 65 | 66 | pool = avg_pool2d(bottlenect, kernel_size=[2,2], stride=2, padding='SAME') 67 | prev_kernels = int(prev_kernels*theta) 68 | input_kernels = prev_kernels 69 | 70 | cur_layer = pool 71 | layers_concat = list() 72 | 73 | with tf.variable_scope('dense_block_2'): 74 | for i in range(12): 75 | cur_kernels = 4 * k 76 | bottlenect = tf.layers.batch_normalization(cur_layer) 77 | bottlenect = tf.nn.relu(bottlenect) 78 | bottlenect = conv2d(bottlenect, num_outputs=cur_kernels, 79 | kernel_size=[1,1], stride=1, padding='SAME', 80 | activation_fn=None) 81 | bottlenect = tf.layers.dropout(bottlenect, 0.2) 82 | 83 | cur_kernels = input_kernels + (k * i) 84 | conv = tf.layers.batch_normalization(bottlenect) 85 | conv = tf.nn.relu(conv) 86 | conv = conv2d(conv, num_outputs=cur_kernels, 87 | kernel_size=[3,3], stride=1, padding='SAME', 88 | activation_fn=None) 89 | conv = tf.layers.dropout(conv, 0.2) 90 | 91 | layers_concat.append(conv) 92 | cur_layer = tf.concat(layers_concat, 3) 93 | print(cur_layer) 94 | prev_kernels = cur_kernels 95 | 96 | with 
tf.variable_scope('transition_block_2'): 97 | bottlenect = tf.layers.batch_normalization(cur_layer) 98 | bottlenect = conv2d(bottlenect, num_outputs=int(prev_kernels*theta), 99 | kernel_size=[1,1], stride=1, padding='SAME', 100 | activation_fn=tf.nn.relu) 101 | bottlenect = tf.layers.dropout(bottlenect, 0.2) 102 | 103 | pool = avg_pool2d(bottlenect, kernel_size=[2,2], stride=2, padding='SAME') 104 | prev_kernels = int(prev_kernels*theta) 105 | input_kernels = prev_kernels 106 | 107 | cur_layer = pool 108 | layers_concat = list() 109 | dense_block_3_iter = 24 110 | if self.model_type == "169": 111 | dense_block_3_iter = 32 112 | elif self.model_type == "201": 113 | dense_block_3_iter = 48 114 | elif self.model_type == "264": 115 | dense_block_3_iter = 64 116 | 117 | with tf.variable_scope('dense_block_3'): 118 | for i in range(dense_block_3_iter): 119 | cur_kernels = 4 * k 120 | bottlenect = tf.layers.batch_normalization(cur_layer) 121 | bottlenect = tf.nn.relu(bottlenect) 122 | bottlenect = conv2d(bottlenect, num_outputs=cur_kernels, 123 | kernel_size=[1,1], stride=1, padding='SAME', 124 | activation_fn=None) 125 | bottlenect = tf.layers.dropout(bottlenect, 0.2) 126 | 127 | cur_kernels = input_kernels + (k * i) 128 | conv = tf.layers.batch_normalization(bottlenect) 129 | conv = tf.nn.relu(conv) 130 | conv = conv2d(conv, num_outputs=cur_kernels, 131 | kernel_size=[3,3], stride=1, padding='SAME', 132 | activation_fn=None) 133 | conv = tf.layers.dropout(conv, 0.2) 134 | 135 | layers_concat.append(conv) 136 | cur_layer = tf.concat(layers_concat, 3) 137 | print(cur_layer) 138 | prev_kernels = cur_kernels 139 | 140 | with tf.variable_scope('transition_block_3'): 141 | bottlenect = tf.layers.batch_normalization(cur_layer) 142 | bottlenect = conv2d(bottlenect, num_outputs=int(prev_kernels*theta), 143 | kernel_size=[1,1], stride=1, padding='SAME', 144 | activation_fn=tf.nn.relu) 145 | bottlenect = tf.layers.dropout(bottlenect, 0.2) 146 | 147 | pool = avg_pool2d(bottlenect,
kernel_size=[2,2], stride=2, padding='SAME') 148 | prev_kernels = int(prev_kernels*theta) 149 | input_kernels = prev_kernels 150 | 151 | cur_layer = pool 152 | layers_concat = list() 153 | dense_block_4_iter = 16 154 | if self.model_type == "169" or self.model_type == "201": 155 | dense_block_4_iter = 32 156 | elif self.model_type == "264": 157 | dense_block_4_iter = 48 158 | 159 | with tf.variable_scope('dense_block_4'): 160 | for i in range(dense_block_4_iter): 161 | cur_kernels = 4 * k 162 | bottlenect = tf.layers.batch_normalization(cur_layer) 163 | bottlenect = tf.nn.relu(bottlenect) 164 | bottlenect = conv2d(bottlenect, num_outputs=cur_kernels, 165 | kernel_size=[1,1], stride=1, padding='SAME', 166 | activation_fn=None) 167 | bottlenect = tf.layers.dropout(bottlenect, 0.2) 168 | 169 | cur_kernels = input_kernels + (k * i) 170 | conv = tf.layers.batch_normalization(bottlenect) 171 | conv = tf.nn.relu(conv) 172 | conv = conv2d(conv, num_outputs=cur_kernels, 173 | kernel_size=[3,3], stride=1, padding='SAME', 174 | activation_fn=None) 175 | conv = tf.layers.dropout(conv, 0.2) 176 | 177 | layers_concat.append(conv) 178 | cur_layer = tf.concat(layers_concat, 3) 179 | print(cur_layer) 180 | prev_kernels = cur_kernels 181 | 182 | with tf.variable_scope('final'): 183 | pool = avg_pool2d(cur_layer, kernel_size=[7,7], stride=1, padding='SAME') 184 | flat = flatten(pool) 185 | self.out = fully_connected(flat, num_outputs=self.num_classes, activation_fn=None) 186 | 187 | return [self.out] 188 | -------------------------------------------------------------------------------- /models/googlenet.py: -------------------------------------------------------------------------------- 1 | from models.imgclfmodel import ImgClfModel 2 | from dataset.dataset import Dataset 3 | 4 | import tensorflow as tf 5 | from tensorflow.contrib.layers import conv2d 6 | from tensorflow.contrib.layers import max_pool2d 7 | from tensorflow.contrib.layers import avg_pool2d 8 | from
tensorflow.contrib.layers import flatten 9 | from tensorflow.contrib.layers import fully_connected 10 | 11 | """ 12 | Implementation of GoogLeNet from ILSVRC 2014. The original architecture was designed by Christian Szegedy et al. at Google. 13 | 14 | The main technical contributions of this architecture are the "concatenation of different kernels" and "auxiliary loss functions". 15 | """ 16 | class GoogLeNet(ImgClfModel): 17 | def __init__(self): 18 | ImgClfModel.__init__(self, scale_to_imagenet=True) 19 | 20 | def create_model(self, input): 21 | # STEM Network 22 | with tf.variable_scope('stem'): 23 | self.conv2d_1 = conv2d(input, num_outputs=64, 24 | kernel_size=[7,7], stride=2, padding="SAME", 25 | activation_fn=tf.nn.relu) 26 | self.pool_1 = max_pool2d(self.conv2d_1, kernel_size=[3,3], stride=2, padding='SAME') 27 | self.lrn_1 = tf.nn.local_response_normalization(self.pool_1, bias=2, alpha=0.0001,beta=0.75) 28 | 29 | self.conv2d_2 = conv2d(self.lrn_1, num_outputs=64, 30 | kernel_size=[1,1], stride=1, padding="SAME", 31 | activation_fn=tf.nn.relu) 32 | self.conv2d_3 = conv2d(self.conv2d_2, num_outputs=192, 33 | kernel_size=[3,3], stride=1, padding="SAME", 34 | activation_fn=tf.nn.relu) 35 | self.lrn_2 = tf.nn.local_response_normalization(self.conv2d_3, bias=2, alpha=0.0001,beta=0.75) 36 | self.pool_2 = max_pool2d(self.lrn_2, kernel_size=[3,3], stride=2, padding='SAME') 37 | 38 | # Inception 3 39 | # a, b 40 | inception_3_nums = { 41 | 'conv2d_1' : [64 , 128], 42 | 'conv2d_2_1': [96 , 128], 43 | 'conv2d_2_2': [128, 192], 44 | 'conv2d_3_1': [16 , 32], 45 | 'conv2d_3_2': [32 , 96], 46 | 'conv2d_4' : [32 , 64] 47 | } 48 | 49 | with tf.variable_scope('inception_3'): 50 | prev = self.pool_2 51 | for i in range(2): 52 | conv2d_1_kernels = inception_3_nums['conv2d_1'][i] 53 | conv2d_2_1_kernels = inception_3_nums['conv2d_2_1'][i] 54 | conv2d_2_2_kernels = inception_3_nums['conv2d_2_2'][i] 55 | conv2d_3_1_kernels = inception_3_nums['conv2d_3_1'][i] 56 | conv2d_3_2_kernels =
inception_3_nums['conv2d_3_2'][i] 57 | conv2d_4_kernels = inception_3_nums['conv2d_4'][i] 58 | 59 | conv2d_1 = conv2d(prev, num_outputs=conv2d_1_kernels, 60 | kernel_size=[1,1], stride=1, padding="SAME", 61 | activation_fn=tf.nn.relu) 62 | 63 | conv2d_2 = conv2d(prev, num_outputs=conv2d_2_1_kernels, 64 | kernel_size=[1,1], stride=1, padding="SAME", 65 | activation_fn=tf.nn.relu) 66 | conv2d_2 = conv2d(conv2d_2, num_outputs=conv2d_2_2_kernels, 67 | kernel_size=[3,3], stride=1, padding="SAME", 68 | activation_fn=tf.nn.relu) 69 | 70 | conv2d_3 = conv2d(prev, num_outputs=conv2d_3_1_kernels, 71 | kernel_size=[1,1], stride=1, padding="SAME", 72 | activation_fn=tf.nn.relu) 73 | conv2d_3 = conv2d(conv2d_3, num_outputs=conv2d_3_2_kernels, 74 | kernel_size=[5,5], stride=1, padding="SAME", 75 | activation_fn=tf.nn.relu) 76 | 77 | conv2d_4 = max_pool2d(prev, kernel_size=[3,3], stride=1, padding='SAME') 78 | conv2d_4 = conv2d(conv2d_4, num_outputs=conv2d_4_kernels, 79 | kernel_size=[1,1], stride=1, padding="SAME", 80 | activation_fn=tf.nn.relu) 81 | 82 | layers_concat = list() 83 | layers_concat.append(conv2d_1) 84 | layers_concat.append(conv2d_2) 85 | layers_concat.append(conv2d_3) 86 | layers_concat.append(conv2d_4) 87 | prev = tf.concat(layers_concat, 3) 88 | 89 | if i == 0: 90 | self.inception_3a = prev 91 | 92 | prev = max_pool2d(prev, kernel_size=[3,3], stride=2, padding='SAME') 93 | self.inception_3b = prev 94 | 95 | # Inception (4) 96 | # a, b, c, d, e 97 | inception_4_nums = { 98 | 'conv2d_1' : [192, 160, 128, 112, 256], 99 | 'conv2d_2_1': [96 , 112, 128, 144, 160], 100 | 'conv2d_2_2': [208, 224, 256, 288, 320], 101 | 'conv2d_3_1': [16 , 24, 24, 32, 32], 102 | 'conv2d_3_2': [48 , 64, 64, 64, 128], 103 | 'conv2d_4' : [64 , 64, 64, 64, 128] 104 | } 105 | 106 | with tf.variable_scope('inception_4'): 107 | for i in range(5): 108 | conv2d_1_kernels = inception_4_nums['conv2d_1'][i] 109 | conv2d_2_1_kernels = inception_4_nums['conv2d_2_1'][i] 110 | conv2d_2_2_kernels =
inception_4_nums['conv2d_2_2'][i] 111 | conv2d_3_1_kernels = inception_4_nums['conv2d_3_1'][i] 112 | conv2d_3_2_kernels = inception_4_nums['conv2d_3_2'][i] 113 | conv2d_4_kernels = inception_4_nums['conv2d_4'][i] 114 | 115 | conv2d_1 = conv2d(prev, num_outputs=conv2d_1_kernels, 116 | kernel_size=[1,1], stride=1, padding="SAME", 117 | activation_fn=tf.nn.relu) 118 | 119 | conv2d_2 = conv2d(prev, num_outputs=conv2d_2_1_kernels, 120 | kernel_size=[1,1], stride=1, padding="SAME", 121 | activation_fn=tf.nn.relu) 122 | conv2d_2 = conv2d(conv2d_2, num_outputs=conv2d_2_2_kernels, 123 | kernel_size=[3,3], stride=1, padding="SAME", 124 | activation_fn=tf.nn.relu) 125 | 126 | conv2d_3 = conv2d(prev, num_outputs=conv2d_3_1_kernels, 127 | kernel_size=[1,1], stride=1, padding="SAME", 128 | activation_fn=tf.nn.relu) 129 | conv2d_3 = conv2d(conv2d_3, num_outputs=conv2d_3_2_kernels, 130 | kernel_size=[5,5], stride=1, padding="SAME", 131 | activation_fn=tf.nn.relu) 132 | 133 | conv2d_4 = max_pool2d(prev, kernel_size=[3,3], stride=1, padding='SAME') 134 | conv2d_4 = conv2d(conv2d_4, num_outputs=conv2d_4_kernels, 135 | kernel_size=[1,1], stride=1, padding="SAME", 136 | activation_fn=tf.nn.relu) 137 | 138 | layers_concat = list() 139 | layers_concat.append(conv2d_1) 140 | layers_concat.append(conv2d_2) 141 | layers_concat.append(conv2d_3) 142 | layers_concat.append(conv2d_4) 143 | prev = tf.concat(layers_concat, 3) 144 | 145 | if i == 0: 146 | self.inception_4a = prev 147 | elif i == 1: 148 | self.inception_4b = prev 149 | elif i == 2: 150 | self.inception_4c = prev 151 | elif i == 3: 152 | self.inception_4d = prev 153 | 154 | prev = max_pool2d(prev, kernel_size=[3,3], stride=2, padding='SAME') 155 | self.inception_4e = prev 156 | 157 | # Inception (5) 158 | # a, b 159 | inception_5_nums = { 160 | 'conv2d_1' : [256, 384], 161 | 'conv2d_2_1': [160, 192], 162 | 'conv2d_2_2': [320, 384], 163 | 'conv2d_3_1': [32 , 48], 164 | 'conv2d_3_2': [128, 128], 165 | 'conv2d_4' : [128, 128] 166 | }
167 | 168 | with tf.variable_scope('inception_5'): 169 | for i in range(2): 170 | conv2d_1_kernels = inception_5_nums['conv2d_1'][i] 171 | conv2d_2_1_kernels = inception_5_nums['conv2d_2_1'][i] 172 | conv2d_2_2_kernels = inception_5_nums['conv2d_2_2'][i] 173 | conv2d_3_1_kernels = inception_5_nums['conv2d_3_1'][i] 174 | conv2d_3_2_kernels = inception_5_nums['conv2d_3_2'][i] 175 | conv2d_4_kernels = inception_5_nums['conv2d_4'][i] 176 | 177 | conv2d_1 = conv2d(prev, num_outputs=conv2d_1_kernels, 178 | kernel_size=[1,1], stride=1, padding="SAME", 179 | activation_fn=tf.nn.relu) 180 | 181 | conv2d_2 = conv2d(prev, num_outputs=conv2d_2_1_kernels, 182 | kernel_size=[1,1], stride=1, padding="SAME", 183 | activation_fn=tf.nn.relu) 184 | conv2d_2 = conv2d(conv2d_2, num_outputs=conv2d_2_2_kernels, 185 | kernel_size=[3,3], stride=1, padding="SAME", 186 | activation_fn=tf.nn.relu) 187 | 188 | conv2d_3 = conv2d(prev, num_outputs=conv2d_3_1_kernels, 189 | kernel_size=[1,1], stride=1, padding="SAME", 190 | activation_fn=tf.nn.relu) 191 | conv2d_3 = conv2d(conv2d_3, num_outputs=conv2d_3_2_kernels, 192 | kernel_size=[5,5], stride=1, padding="SAME", 193 | activation_fn=tf.nn.relu) 194 | 195 | conv2d_4 = max_pool2d(prev, kernel_size=[3,3], stride=1, padding='SAME') 196 | conv2d_4 = conv2d(conv2d_4, num_outputs=conv2d_4_kernels, 197 | kernel_size=[1,1], stride=1, padding="SAME", 198 | activation_fn=tf.nn.relu) 199 | 200 | layers_concat = list() 201 | layers_concat.append(conv2d_1) 202 | layers_concat.append(conv2d_2) 203 | layers_concat.append(conv2d_3) 204 | layers_concat.append(conv2d_4) 205 | prev = tf.concat(layers_concat, 3) 206 | 207 | if i == 0: 208 | self.inception_5a = prev 209 | 210 | self.inception_5b = prev 211 | 212 | with tf.variable_scope('final'): 213 | # Aux #1 output 214 | aux_avg_pool_1 = avg_pool2d(self.inception_4a, kernel_size=[5,5], stride=3, padding='SAME') 215 | aux_conv2d_1 = conv2d(aux_avg_pool_1, num_outputs=128, 216 | kernel_size=[1,1], stride=1,
padding="SAME", 217 | activation_fn=tf.nn.relu) 218 | aux_flat = flatten(aux_conv2d_1) 219 | aux_fcl_1 = fully_connected(aux_flat, num_outputs=1024, activation_fn=tf.nn.relu) 220 | aux_droupout_1 = tf.nn.dropout(aux_fcl_1, 0.7) 221 | self.aux_1_out = fully_connected(aux_droupout_1, num_outputs=self.num_classes, activation_fn=None) 222 | 223 | # Aux #2 output 224 | aux_avg_pool_1 = avg_pool2d(self.inception_4d, kernel_size=[5,5], stride=3, padding='SAME') 225 | aux_conv2d_1 = conv2d(aux_avg_pool_1, num_outputs=128, 226 | kernel_size=[1,1], stride=1, padding="SAME", 227 | activation_fn=tf.nn.relu) 228 | aux_flat = flatten(aux_conv2d_1) 229 | aux_fcl_1 = fully_connected(aux_flat, num_outputs=1024, activation_fn=tf.nn.relu) 230 | aux_droupout_1 = tf.nn.dropout(aux_fcl_1, 0.7) 231 | self.aux_2_out = fully_connected(aux_droupout_1, num_outputs=self.num_classes, activation_fn=None) 232 | 233 | # Final output 234 | self.final_avg_pool_1 = avg_pool2d(prev, kernel_size=[7,7], stride=1, padding='SAME') 235 | self.final_dropout = tf.nn.dropout(self.final_avg_pool_1, 0.4) 236 | self.final_flat = flatten(self.final_dropout) 237 | self.final_out = fully_connected(self.final_flat, num_outputs=self.num_classes, activation_fn=None) 238 | 239 | return [self.aux_1_out, self.aux_2_out, self.final_out] 240 | -------------------------------------------------------------------------------- /models/imgclfmodel.py: -------------------------------------------------------------------------------- 1 | import tensorflow as tf 2 | 3 | from dataset.dataset import Dataset 4 | 5 | class ImgClfModel: 6 | def __init__(self, scale_to_imagenet=False, model_type=None): 7 | self.scale_to_imagenet = scale_to_imagenet 8 | self.model_type = model_type 9 | 10 | def set_dataset(self, dataset=None): 11 | if dataset is not None: 12 | if isinstance(dataset, Dataset): 13 | print('Dataset is given as ' + dataset.name) 14 | 15 | width = dataset.width 16 | height = dataset.height 17 | 18 | if self.scale_to_imagenet: 
19 | width = 224 20 | height = 224 21 | 22 | self.num_classes = dataset.num_classes 23 | input = tf.placeholder(tf.float32, [None, width, height, 3], name='input') 24 | output = tf.placeholder(tf.int32, [None, dataset.num_classes], name='output') 25 | 26 | return (input, output) 27 | else: 28 | print('dataset is unknown type, please try with Dataset class type') 29 | else: 30 | raise TypeError 31 | 32 | def create_model(self, input): 33 | raise NotImplementedError 34 | -------------------------------------------------------------------------------- /models/inception_resnet_v1.py: -------------------------------------------------------------------------------- 1 | from models.imgclfmodel import ImgClfModel 2 | from dataset.dataset import Dataset 3 | 4 | import tensorflow as tf 5 | from tensorflow.contrib.layers import conv2d 6 | from tensorflow.contrib.layers import max_pool2d 7 | from tensorflow.contrib.layers import avg_pool2d 8 | from tensorflow.contrib.layers import flatten 9 | from tensorflow.contrib.layers import fully_connected 10 | 11 | class Inception_ResnetV1(ImgClfModel): 12 | def __init__(self): 13 | ImgClfModel.__init__(self, scale_to_imagenet=True) 14 | 15 | def create_model(self, input): 16 | with tf.variable_scope('stem'): 17 | stem = conv2d(input, num_outputs=32, 18 | kernel_size=[3,3], stride=2, padding='VALID', 19 | activation_fn=tf.nn.relu) 20 | stem = conv2d(stem, num_outputs=32, 21 | kernel_size=[3,3], stride=1, padding='VALID', 22 | activation_fn=tf.nn.relu) 23 | stem = conv2d(stem, num_outputs=64, 24 | kernel_size=[3,3], stride=1, padding='SAME', 25 | activation_fn=tf.nn.relu) 26 | stem = max_pool2d(stem, kernel_size=[3,3], stride=2, padding='VALID') 27 | stem = conv2d(stem, num_outputs=80, 28 | kernel_size=[1,1], stride=1, padding='SAME', 29 | activation_fn=tf.nn.relu) 30 | stem = conv2d(stem, num_outputs=192, 31 | kernel_size=[3,3], stride=2, padding='VALID', 32 | activation_fn=tf.nn.relu) 33 | stem = conv2d(stem, num_outputs=256, 34 | 
kernel_size=[3,3], stride=2, padding='VALID', 35 | activation_fn=tf.nn.relu) 36 | prev = stem 37 | 38 | with tf.variable_scope('inception_resnet_a'): 39 | for i in range(5): 40 | identity = prev 41 | 42 | branch_base = conv2d(prev, num_outputs=32, 43 | kernel_size=[1,1], stride=1, padding='SAME', 44 | activation_fn=tf.nn.relu) 45 | branch_a = tf.layers.batch_normalization(branch_base) 46 | 47 | branch_b = conv2d(branch_base, num_outputs=32, 48 | kernel_size=[3,3], stride=1, padding='SAME', 49 | activation_fn=tf.nn.relu) 50 | branch_b = tf.layers.batch_normalization(branch_b) 51 | 52 | branch_c = conv2d(branch_base, num_outputs=32, 53 | kernel_size=[3,3], stride=1, padding='SAME', 54 | activation_fn=tf.nn.relu) 55 | branch_c = conv2d(branch_c, num_outputs=32, 56 | kernel_size=[3,3], stride=1, padding='SAME', 57 | activation_fn=tf.nn.relu) 58 | branch_c = tf.layers.batch_normalization(branch_c) 59 | 60 | layers_concat = list() 61 | layers_concat.append(branch_a) 62 | layers_concat.append(branch_b) 63 | layers_concat.append(branch_c) 64 | merge = tf.concat(layers_concat, 3) 65 | merge = conv2d(merge, num_outputs=256, 66 | kernel_size=[1,1], stride=1, padding='SAME', 67 | activation_fn=tf.nn.relu) 68 | merge = tf.layers.batch_normalization(merge) 69 | 70 | prev = tf.nn.relu(merge + identity) 71 | prev = tf.layers.batch_normalization(prev) 72 | 73 | with tf.variable_scope('reduction_a'): 74 | branch_a = max_pool2d(prev, kernel_size=[3,3], stride=2, padding='VALID') 75 | 76 | branch_b = conv2d(prev, num_outputs=384, 77 | kernel_size=[3,3], stride=2, padding='VALID', 78 | activation_fn=tf.nn.relu) 79 | 80 | branch_c = conv2d(prev, num_outputs=192, 81 | kernel_size=[1,1], stride=1, padding='SAME', 82 | activation_fn=tf.nn.relu) 83 | branch_c = conv2d(branch_c, num_outputs=192, 84 | kernel_size=[3,3], stride=1, padding='SAME', 85 | activation_fn=tf.nn.relu) 86 | branch_c = conv2d(branch_c, num_outputs=256, 87 | kernel_size=[3,3], stride=2, padding='VALID', 88 | 
activation_fn=tf.nn.relu) 89 | layers_concat = list() 90 | layers_concat.append(branch_a) 91 | layers_concat.append(branch_b) 92 | layers_concat.append(branch_c) 93 | prev = tf.concat(layers_concat, 3) 94 | 95 | with tf.variable_scope('inception_resnet_b'): 96 | for i in range(10): 97 | identity = prev 98 | 99 | branch_base = conv2d(prev, num_outputs=128, 100 | kernel_size=[1,1], stride=1, padding='SAME', 101 | activation_fn=tf.nn.relu) 102 | branch_a = tf.layers.batch_normalization(branch_base) 103 | 104 | branch_b = conv2d(branch_base, num_outputs=128, 105 | kernel_size=[1,7], stride=1, padding='SAME', 106 | activation_fn=tf.nn.relu) 107 | branch_b = conv2d(branch_b, num_outputs=128, 108 | kernel_size=[7,1], stride=1, padding='SAME', 109 | activation_fn=tf.nn.relu) 110 | branch_b = tf.layers.batch_normalization(branch_b) 111 | 112 | layers_concat = list() 113 | layers_concat.append(branch_a) 114 | layers_concat.append(branch_b) 115 | merge = tf.concat(layers_concat, 3) 116 | merge = conv2d(merge, num_outputs=896, 117 | kernel_size=[1,1], stride=1, padding='SAME', 118 | activation_fn=tf.nn.relu) 119 | merge = tf.layers.batch_normalization(merge) 120 | 121 | prev = tf.nn.relu(merge + identity) 122 | prev = tf.layers.batch_normalization(prev) 123 | 124 | with tf.variable_scope('reduction_b'): 125 | branch_a = max_pool2d(prev, kernel_size=[3,3], stride=2, padding='VALID') 126 | 127 | branch_base = conv2d(prev, num_outputs=256, 128 | kernel_size=[1,1], stride=1, padding='SAME', 129 | activation_fn=tf.nn.relu) 130 | branch_b = conv2d(branch_base, num_outputs=384, 131 | kernel_size=[3,3], stride=2, padding='VALID', 132 | activation_fn=tf.nn.relu) 133 | 134 | branch_c = conv2d(branch_base, num_outputs=256, 135 | kernel_size=[3,3], stride=2, padding='VALID', 136 | activation_fn=tf.nn.relu) 137 | 138 | branch_d = conv2d(branch_base, num_outputs=256, 139 | kernel_size=[3,3], stride=1, padding='SAME', 140 | activation_fn=tf.nn.relu) 141 | branch_d = conv2d(branch_d, 
num_outputs=256, 142 | kernel_size=[3,3], stride=2, padding='VALID', 143 | activation_fn=tf.nn.relu) 144 | layers_concat = list() 145 | layers_concat.append(branch_a) 146 | layers_concat.append(branch_b) 147 | layers_concat.append(branch_c) 148 | layers_concat.append(branch_d) 149 | prev = tf.concat(layers_concat, 3) 150 | 151 | with tf.variable_scope('inception_resnet_c'): 152 | for i in range(5): 153 | identity = prev 154 | 155 | branch_base = conv2d(prev, num_outputs=192, 156 | kernel_size=[1,1], stride=1, padding='SAME', 157 | activation_fn=tf.nn.relu) 158 | branch_a = tf.layers.batch_normalization(branch_base) 159 | 160 | branch_b = conv2d(branch_base, num_outputs=192, 161 | kernel_size=[1,3], stride=1, padding='SAME', 162 | activation_fn=tf.nn.relu) 163 | branch_b = conv2d(branch_b, num_outputs=192, 164 | kernel_size=[3,1], stride=1, padding='SAME', 165 | activation_fn=tf.nn.relu) 166 | branch_b = tf.layers.batch_normalization(branch_b) 167 | 168 | layers_concat = list() 169 | layers_concat.append(branch_a) 170 | layers_concat.append(branch_b) 171 | merge = tf.concat(layers_concat, 3) 172 | merge = conv2d(merge, num_outputs=1792, 173 | kernel_size=[1,1], stride=1, padding='SAME', 174 | activation_fn=tf.nn.relu) 175 | merge = tf.layers.batch_normalization(merge) 176 | 177 | prev = tf.nn.relu(merge + identity) 178 | prev = tf.layers.batch_normalization(prev) 179 | 180 | with tf.variable_scope('final'): 181 | prev = avg_pool2d(prev, kernel_size=[3,3], stride=2, padding='SAME') 182 | flat = flatten(prev) 183 | dr = tf.nn.dropout(flat, 0.8) 184 | self.out = fully_connected(dr, num_outputs=self.num_classes, activation_fn=None) 185 | 186 | return [self.out] 187 | -------------------------------------------------------------------------------- /models/inception_resnet_v2.py: -------------------------------------------------------------------------------- 1 | from models.imgclfmodel import ImgClfModel 2 | from dataset.dataset import Dataset 3 | 4 | import tensorflow 
as tf 5 | from tensorflow.contrib.layers import conv2d 6 | from tensorflow.contrib.layers import max_pool2d 7 | from tensorflow.contrib.layers import avg_pool2d 8 | from tensorflow.contrib.layers import flatten 9 | from tensorflow.contrib.layers import fully_connected 10 | 11 | class Inception_ResnetV2(ImgClfModel): 12 | def __init__(self): 13 | ImgClfModel.__init__(self, scale_to_imagenet=True) 14 | 15 | def create_model(self, input): 16 | # STEM Network - separate into 3 parts by filter concatenation points 17 | with tf.variable_scope('stem'): 18 | stem_a = conv2d(input, num_outputs=32, 19 | kernel_size=[3,3], stride=2, padding='VALID', 20 | activation_fn=tf.nn.relu) 21 | stem_a = conv2d(stem_a, num_outputs=32, 22 | kernel_size=[3,3], stride=1, padding='VALID', 23 | activation_fn=tf.nn.relu) 24 | stem_a = conv2d(stem_a, num_outputs=64, 25 | kernel_size=[3,3], stride=1, padding='SAME', 26 | activation_fn=tf.nn.relu) 27 | branch_a = max_pool2d(stem_a, kernel_size=[3,3], stride=2, padding='VALID') 28 | branch_b = conv2d(stem_a, num_outputs=96, 29 | kernel_size=[3,3], stride=2, padding='VALID', 30 | activation_fn=tf.nn.relu) 31 | layers_concat = list() 32 | layers_concat.append(branch_a) 33 | layers_concat.append(branch_b) 34 | stem_b = tf.concat(layers_concat, 3) 35 | 36 | branch_a = conv2d(stem_b, num_outputs=64, 37 | kernel_size=[1,1], stride=1, padding='SAME', 38 | activation_fn=tf.nn.relu) 39 | branch_a = conv2d(branch_a, num_outputs=96, 40 | kernel_size=[3,3], stride=1, padding='VALID', 41 | activation_fn=tf.nn.relu) 42 | branch_b = conv2d(stem_b, num_outputs=64, 43 | kernel_size=[1,1], stride=1, padding='SAME', 44 | activation_fn=tf.nn.relu) 45 | branch_b = conv2d(branch_b, num_outputs=64, 46 | kernel_size=[7,1], stride=1, padding='SAME', 47 | activation_fn=tf.nn.relu) 48 | branch_b = conv2d(branch_b, num_outputs=64, 49 | kernel_size=[1,7], stride=1, padding='SAME', 50 | activation_fn=tf.nn.relu) 51 | branch_b = conv2d(branch_b, num_outputs=96, 52 | 
kernel_size=[3,3], stride=1, padding='VALID', 53 | activation_fn=tf.nn.relu) 54 | layers_concat = list() 55 | layers_concat.append(branch_a) 56 | layers_concat.append(branch_b) 57 | stem_c = tf.concat(layers_concat, 3) 58 | 59 | branch_a = conv2d(stem_c, num_outputs=192, 60 | kernel_size=[3,3], stride=2, padding='VALID', 61 | activation_fn=tf.nn.relu) 62 | branch_b = max_pool2d(stem_c, kernel_size=[3,3], stride=2, padding='VALID') 63 | layers_concat = list() 64 | layers_concat.append(branch_a) 65 | layers_concat.append(branch_b) 66 | prev = tf.concat(layers_concat, 3) 67 | 68 | with tf.variable_scope('inception_resnet_a'): 69 | for i in range(10): 70 | identity = prev 71 | 72 | branch_base = conv2d(prev, num_outputs=32, 73 | kernel_size=[1,1], stride=1, padding='SAME', 74 | activation_fn=tf.nn.relu) 75 | branch_a = tf.layers.batch_normalization(branch_base) 76 | 77 | branch_b = tf.layers.batch_normalization(branch_base) 78 | branch_b = conv2d(branch_b, num_outputs=32, 79 | kernel_size=[3,3], stride=1, padding='SAME', 80 | activation_fn=tf.nn.relu) 81 | branch_b = tf.layers.batch_normalization(branch_b) 82 | 83 | branch_c = tf.layers.batch_normalization(branch_base) 84 | branch_c = conv2d(branch_c, num_outputs=48, 85 | kernel_size=[3,3], stride=1, padding='SAME', 86 | activation_fn=tf.nn.relu) 87 | branch_c = tf.layers.batch_normalization(branch_c) 88 | branch_c = conv2d(branch_c, num_outputs=64, 89 | kernel_size=[3,3], stride=1, padding='SAME', 90 | activation_fn=tf.nn.relu) 91 | branch_c = tf.layers.batch_normalization(branch_c) 92 | 93 | layers_concat = list() 94 | layers_concat.append(branch_a) 95 | layers_concat.append(branch_b) 96 | layers_concat.append(branch_c) 97 | merge = tf.concat(layers_concat, 3) 98 | merge = conv2d(merge, num_outputs=384, 99 | kernel_size=[1,1], stride=1, padding='SAME', 100 | activation_fn=tf.nn.relu) 101 | merge = tf.layers.batch_normalization(merge) 102 | 103 | prev = tf.nn.relu(merge + identity) 104 | prev = 
tf.layers.batch_normalization(prev) 105 | 106 | with tf.variable_scope('reduction_a'): 107 | branch_a = max_pool2d(prev, kernel_size=[3,3], stride=2, padding='VALID') 108 | 109 | branch_b = conv2d(prev, num_outputs=384, 110 | kernel_size=[3,3], stride=2, padding='VALID', 111 | activation_fn=tf.nn.relu) 112 | 113 | branch_c = conv2d(prev, num_outputs=256, 114 | kernel_size=[1,1], stride=1, padding='SAME', 115 | activation_fn=tf.nn.relu) 116 | branch_c = conv2d(branch_c, num_outputs=256, 117 | kernel_size=[3,3], stride=1, padding='SAME', 118 | activation_fn=tf.nn.relu) 119 | branch_c = conv2d(branch_c, num_outputs=384, 120 | kernel_size=[3,3], stride=2, padding='VALID', 121 | activation_fn=tf.nn.relu) 122 | layers_concat = list() 123 | layers_concat.append(branch_a) 124 | layers_concat.append(branch_b) 125 | layers_concat.append(branch_c) 126 | prev = tf.concat(layers_concat, 3) 127 | 128 | with tf.variable_scope('inception_resnet_b'): 129 | for i in range(20): 130 | identity = prev 131 | 132 | branch_a = conv2d(prev, num_outputs=192, 133 | kernel_size=[1,1], stride=1, padding='SAME', 134 | activation_fn=tf.nn.relu) 135 | branch_a = tf.layers.batch_normalization(branch_a) 136 | 137 | branch_b = conv2d(prev, num_outputs=128, 138 | kernel_size=[1,1], stride=1, padding='SAME', 139 | activation_fn=tf.nn.relu) 140 | branch_b = tf.layers.batch_normalization(branch_b) 141 | branch_b = conv2d(branch_b, num_outputs=160, 142 | kernel_size=[1,7], stride=1, padding='SAME', 143 | activation_fn=tf.nn.relu) 144 | branch_b = tf.layers.batch_normalization(branch_b) 145 | branch_b = conv2d(branch_b, num_outputs=192, 146 | kernel_size=[7,1], stride=1, padding='SAME', 147 | activation_fn=tf.nn.relu) 148 | branch_b = tf.layers.batch_normalization(branch_b) 149 | 150 | layers_concat = list() 151 | layers_concat.append(branch_a) 152 | layers_concat.append(branch_b) 153 | merge = tf.concat(layers_concat, 3) 154 | merge = conv2d(merge, num_outputs=1152, 155 | kernel_size=[1,1], stride=1, 
padding='SAME', 156 | activation_fn=tf.nn.relu) 157 | merge = tf.layers.batch_normalization(merge) 158 | 159 | prev = tf.nn.relu(merge + identity) 160 | prev = tf.layers.batch_normalization(prev) 161 | 162 | with tf.variable_scope('reduction_b'): 163 | branch_a = max_pool2d(prev, kernel_size=[3,3], stride=2, padding='VALID') 164 | 165 | branch_base = conv2d(prev, num_outputs=256, 166 | kernel_size=[1,1], stride=1, padding='SAME', 167 | activation_fn=tf.nn.relu) 168 | branch_b = conv2d(branch_base, num_outputs=384, 169 | kernel_size=[3,3], stride=2, padding='VALID', 170 | activation_fn=tf.nn.relu) 171 | 172 | branch_c = conv2d(branch_base, num_outputs=256, 173 | kernel_size=[3,3], stride=2, padding='VALID', 174 | activation_fn=tf.nn.relu) 175 | 176 | branch_d = conv2d(branch_base, num_outputs=256, 177 | kernel_size=[3,3], stride=1, padding='SAME', 178 | activation_fn=tf.nn.relu) 179 | branch_d = conv2d(branch_d, num_outputs=256, 180 | kernel_size=[3,3], stride=2, padding='VALID', 181 | activation_fn=tf.nn.relu) 182 | layers_concat = list() 183 | layers_concat.append(branch_a) 184 | layers_concat.append(branch_b) 185 | layers_concat.append(branch_c) 186 | layers_concat.append(branch_d) 187 | prev = tf.concat(layers_concat, 3) 188 | 189 | with tf.variable_scope('inception_resnet_c'): 190 | for i in range(10): 191 | identity = prev 192 | 193 | branch_base = conv2d(prev, num_outputs=192, 194 | kernel_size=[1,1], stride=1, padding='SAME', 195 | activation_fn=tf.nn.relu) 196 | branch_a = tf.layers.batch_normalization(branch_base) 197 | 198 | branch_b = tf.layers.batch_normalization(branch_base) 199 | branch_b = conv2d(branch_b, num_outputs=224, 200 | kernel_size=[1,3], stride=1, padding='SAME', 201 | activation_fn=tf.nn.relu) 202 | branch_b = tf.layers.batch_normalization(branch_b) 203 | branch_b = conv2d(branch_b, num_outputs=256, 204 | kernel_size=[3,1], stride=1, padding='SAME', 205 | activation_fn=tf.nn.relu) 206 | 207 | layers_concat = list() 208 | 
layers_concat.append(branch_a) 209 | layers_concat.append(branch_b) 210 | merge = tf.concat(layers_concat, 3) 211 | merge = conv2d(merge, num_outputs=2048, 212 | kernel_size=[1,1], stride=1, padding='SAME', 213 | activation_fn=tf.nn.relu) 214 | merge = tf.layers.batch_normalization(merge) 215 | 216 | prev = tf.nn.relu(merge + identity) 217 | prev = tf.layers.batch_normalization(prev) 218 | 219 | 220 | with tf.variable_scope('final'): 221 | prev = avg_pool2d(prev, kernel_size=[3,3], stride=2, padding='SAME') 222 | flat = flatten(prev) 223 | dr = tf.nn.dropout(flat, 0.8) 224 | self.out = fully_connected(dr, num_outputs=self.num_classes, activation_fn=None) 225 | 226 | return [self.out] 227 | -------------------------------------------------------------------------------- /models/inception_v2.py: -------------------------------------------------------------------------------- 1 | from models.imgclfmodel import ImgClfModel 2 | from dataset.dataset import Dataset 3 | 4 | import tensorflow as tf 5 | from tensorflow.contrib.layers import conv2d 6 | from tensorflow.contrib.layers import max_pool2d 7 | from tensorflow.contrib.layers import avg_pool2d 8 | from tensorflow.contrib.layers import flatten 9 | from tensorflow.contrib.layers import fully_connected 10 | 11 | class InceptionV2(ImgClfModel): 12 | def __init__(self): 13 | ImgClfModel.__init__(self, scale_to_imagenet=True) 14 | 15 | def create_model(self, input): 16 | # STEM Network 17 | with tf.variable_scope('stem'): 18 | self.conv2d_1 = conv2d(input, num_outputs=32, 19 | kernel_size=[3,3], stride=2, padding='VALID', 20 | activation_fn=tf.nn.relu) 21 | self.conv2d_2 = conv2d(self.conv2d_1, num_outputs=32, 22 | kernel_size=[3,3], stride=1, padding='VALID', 23 | activation_fn=tf.nn.relu) 24 | self.conv2d_3 = conv2d(self.conv2d_2, num_outputs=64, 25 | kernel_size=[3,3], stride=1, padding='SAME', 26 | activation_fn=tf.nn.relu) 27 | self.pool_1 = max_pool2d(self.conv2d_3, kernel_size=[3,3], stride=2, padding='VALID') 28 | 
29 | self.conv2d_4 = conv2d(self.pool_1, num_outputs=80, 30 | kernel_size=[3,3], stride=1, padding='VALID', 31 | activation_fn=tf.nn.relu) 32 | self.conv2d_5 = conv2d(self.conv2d_4, num_outputs=192, 33 | kernel_size=[3,3], stride=2, padding='VALID', 34 | activation_fn=tf.nn.relu) 35 | self.pool_2 = max_pool2d(self.conv2d_5, kernel_size=[3,3], stride=2, padding='VALID') 36 | 37 | prev = self.pool_2 38 | 39 | # Inception (3) 1, 2, 3 40 | inception3_nums = { 41 | 'branch_a' : [64, 64, 64], 42 | 'branch_b_1': [48, 48, 48], 43 | 'branch_b_2': [64, 64, 64], 44 | 'branch_c_1': [64, 64, 64], 45 | 'branch_c_2': [96, 96, 96], 46 | 'branch_c_3': [96, 96, 96], 47 | 'branch_d' : [32, 64, 64] 48 | } 49 | 50 | with tf.variable_scope('inception_3'): 51 | for i in range(3): 52 | branch_a_kernels = inception3_nums['branch_a'][i] 53 | branch_b_1_kernels = inception3_nums['branch_b_1'][i] 54 | branch_b_2_kernels = inception3_nums['branch_b_2'][i] 55 | branch_c_1_kernels = inception3_nums['branch_c_1'][i] 56 | branch_c_2_kernels = inception3_nums['branch_c_2'][i] 57 | branch_c_3_kernels = inception3_nums['branch_c_3'][i] 58 | branch_d_kernels = inception3_nums['branch_d'][i] 59 | 60 | branch_a = conv2d(prev, num_outputs=branch_a_kernels, 61 | kernel_size=[1,1], stride=1, padding='SAME') 62 | 63 | branch_b = conv2d(prev, num_outputs=branch_b_1_kernels, 64 | kernel_size=[1,1], stride=1, padding='SAME') 65 | branch_b = conv2d(branch_b, num_outputs=branch_b_2_kernels, 66 | kernel_size=[3,3], stride=1, padding='SAME') 67 | 68 | branch_c = conv2d(prev, num_outputs=branch_c_1_kernels, 69 | kernel_size=[1,1], stride=1, padding='SAME') 70 | branch_c = conv2d(branch_c, num_outputs=branch_c_2_kernels, 71 | kernel_size=[3,3], stride=1, padding='SAME') 72 | branch_c = conv2d(branch_c, num_outputs=branch_c_3_kernels, 73 | kernel_size=[3,3], stride=1, padding='SAME') 74 | 75 | branch_d = avg_pool2d(prev, kernel_size=[3,3], stride=1, padding='SAME') 76 | branch_d = conv2d(branch_d, 
num_outputs=branch_d_kernels, 77 | kernel_size=[1,1], stride=1, padding='SAME') 78 | 79 | layers_concat = list() 80 | layers_concat.append(branch_a) 81 | layers_concat.append(branch_b) 82 | layers_concat.append(branch_c) 83 | layers_concat.append(branch_d) 84 | prev = tf.concat(layers_concat, 3) 85 | 86 | with tf.variable_scope('grid_reduction_a'): 87 | branch_a = conv2d(prev, num_outputs=384, 88 | kernel_size=[3,3], stride=2, padding='VALID') 89 | 90 | branch_b = conv2d(prev, num_outputs=64, 91 | kernel_size=[1,1], stride=1, padding='SAME') 92 | branch_b = conv2d(branch_b, num_outputs=96, 93 | kernel_size=[3,3], stride=1, padding='SAME') 94 | branch_b = conv2d(branch_b, num_outputs=96, 95 | kernel_size=[3,3], stride=2, padding='VALID') 96 | 97 | branch_c = max_pool2d(prev, kernel_size=[3,3], stride=2, padding='VALID') 98 | 99 | layers_concat = list() 100 | layers_concat.append(branch_a) 101 | layers_concat.append(branch_b) 102 | layers_concat.append(branch_c) 103 | prev = tf.concat(layers_concat, 3) 104 | 105 | inception5_nums = { 106 | 'branch_a' : [192, 192, 192, 192], 107 | 'branch_b_1': [128, 160, 160, 192], 108 | 'branch_b_2': [128, 160, 160, 192], 109 | 'branch_b_3': [192, 192, 192, 192], 110 | 'branch_c_1': [128, 160, 160, 192], 111 | 'branch_c_2': [128, 160, 160, 192], 112 | 'branch_c_3': [128, 160, 160, 192], 113 | 'branch_c_4': [128, 160, 160, 192], 114 | 'branch_c_5': [192, 192, 192, 192], 115 | 'branch_d' : [192, 192, 192, 192] 116 | } 117 | 118 | with tf.variable_scope('inception_5'): 119 | for i in range(4): 120 | branch_a_kernels = inception5_nums['branch_a'][i] 121 | branch_b_1_kernels = inception5_nums['branch_b_1'][i] 122 | branch_b_2_kernels = inception5_nums['branch_b_2'][i] 123 | branch_b_3_kernels = inception5_nums['branch_b_3'][i] 124 | branch_c_1_kernels = inception5_nums['branch_c_1'][i] 125 | branch_c_2_kernels = inception5_nums['branch_c_2'][i] 126 | branch_c_3_kernels = inception5_nums['branch_c_3'][i] 127 | branch_c_4_kernels = 
inception5_nums['branch_c_4'][i] 128 | branch_c_5_kernels = inception5_nums['branch_c_5'][i] 129 | branch_d_kernels = inception5_nums['branch_d'][i] 130 | 131 | branch_a = conv2d(prev, num_outputs=branch_a_kernels, 132 | kernel_size=[1,1], stride=1, padding='SAME') 133 | 134 | branch_b = conv2d(prev, num_outputs=branch_b_1_kernels, 135 | kernel_size=[1,1], stride=1, padding='SAME') 136 | branch_b = conv2d(branch_b, num_outputs=branch_b_2_kernels, 137 | kernel_size=[1,7], stride=1, padding='SAME') 138 | branch_b = conv2d(branch_b, num_outputs=branch_b_3_kernels, 139 | kernel_size=[7,1], stride=1, padding='SAME') 140 | 141 | branch_c = conv2d(prev, num_outputs=branch_c_1_kernels, 142 | kernel_size=[1,1], stride=1, padding='SAME') 143 | branch_c = conv2d(branch_c, num_outputs=branch_c_2_kernels, 144 | kernel_size=[7,7], stride=1, padding='SAME') 145 | branch_c = conv2d(branch_c, num_outputs=branch_c_3_kernels, 146 | kernel_size=[1,7], stride=1, padding='SAME') 147 | branch_c = conv2d(branch_c, num_outputs=branch_c_4_kernels, 148 | kernel_size=[7,1], stride=1, padding='SAME') 149 | branch_c = conv2d(branch_c, num_outputs=branch_c_5_kernels, 150 | kernel_size=[1,7], stride=1, padding='SAME') 151 | 152 | branch_d = avg_pool2d(prev, kernel_size=[3,3], stride=1, padding='SAME') 153 | branch_d = conv2d(branch_d, num_outputs=branch_d_kernels, 154 | kernel_size=[1,1], stride=1, padding='SAME') 155 | 156 | layers_concat = list() 157 | layers_concat.append(branch_a) 158 | layers_concat.append(branch_b) 159 | layers_concat.append(branch_c) 160 | layers_concat.append(branch_d) 161 | prev = tf.concat(layers_concat, 3) 162 | 163 | self.aux = prev 164 | 165 | with tf.variable_scope('grid_reduction_b'): 166 | branch_base = conv2d(prev, num_outputs=192, 167 | kernel_size=[1,1], stride=1, padding='SAME') 168 | 169 | branch_a = conv2d(branch_base, num_outputs=320, 170 | kernel_size=[3,3], stride=2, padding='VALID') 171 | 172 | branch_b = conv2d(branch_base, num_outputs=192, 173 | 
kernel_size=[1,7], stride=1, padding='SAME') 174 | branch_b = conv2d(branch_b, num_outputs=192, 175 | kernel_size=[7,1], stride=1, padding='SAME') 176 | branch_b = conv2d(branch_b, num_outputs=192, 177 | kernel_size=[3,3], stride=2, padding='VALID') 178 | 179 | branch_c = max_pool2d(prev, kernel_size=[3,3], stride=2, padding='VALID') 180 | 181 | layers_concat = list() 182 | layers_concat.append(branch_a) 183 | layers_concat.append(branch_b) 184 | layers_concat.append(branch_c) 185 | prev = tf.concat(layers_concat, 3) 186 | 187 | inception2_nums = { 188 | 'branch_a' : [320, 320], 189 | 'branch_b_1': [384, 384], 190 | 'branch_b_2': [384, 384], 191 | 'branch_b_3': [384, 384], 192 | 'branch_c_1': [448, 448], 193 | 'branch_c_2': [384, 384], 194 | 'branch_c_3': [384, 384], 195 | 'branch_d' : [192, 192] 196 | } 197 | 198 | with tf.variable_scope('inception_2'): 199 | for i in range(2): 200 | branch_a_kernels = inception2_nums['branch_a'][i] 201 | branch_b_1_kernels = inception2_nums['branch_b_1'][i] 202 | branch_b_2_kernels = inception2_nums['branch_b_2'][i] 203 | branch_b_3_kernels = inception2_nums['branch_b_3'][i] 204 | branch_c_1_kernels = inception2_nums['branch_c_1'][i] 205 | branch_c_2_kernels = inception2_nums['branch_c_2'][i] 206 | branch_c_3_kernels = inception2_nums['branch_c_3'][i] 207 | branch_d_kernels = inception2_nums['branch_d'][i] 208 | 209 | branch_a = conv2d(prev, num_outputs=branch_a_kernels, 210 | kernel_size=[1,1], stride=1, padding='SAME') 211 | 212 | branch_b = conv2d(prev, num_outputs=branch_b_1_kernels, 213 | kernel_size=[1,1], stride=1, padding='SAME') 214 | branch_b = conv2d(branch_b, num_outputs=branch_b_2_kernels, 215 | kernel_size=[1,3], stride=1, padding='SAME') 216 | branch_b = conv2d(branch_b, num_outputs=branch_b_3_kernels, 217 | kernel_size=[3,1], stride=1, padding='SAME') 218 | 219 | branch_c = conv2d(prev, num_outputs=branch_c_1_kernels, 220 | kernel_size=[1,1], stride=1, padding='SAME') 221 | branch_c = conv2d(branch_c, 
num_outputs=branch_c_2_kernels, 222 | kernel_size=[1,3], stride=1, padding='SAME') 223 | branch_c = conv2d(branch_c, num_outputs=branch_c_3_kernels, 224 | kernel_size=[3,1], stride=1, padding='SAME') 225 | 226 | branch_d = max_pool2d(prev, kernel_size=[3,3], stride=1, padding='SAME') 227 | branch_d = conv2d(branch_d, num_outputs=branch_d_kernels, 228 | kernel_size=[1,1], stride=1, padding='SAME') 229 | 230 | layers_concat = list() 231 | layers_concat.append(branch_a) 232 | layers_concat.append(branch_b) 233 | layers_concat.append(branch_c) 234 | layers_concat.append(branch_d) 235 | prev = tf.concat(layers_concat, 3) 236 | 237 | with tf.variable_scope('final'): 238 | self.aux_pool = avg_pool2d(self.aux, kernel_size=[5,5], stride=3, padding='VALID') 239 | self.aux_conv = conv2d(self.aux_pool, num_outputs=128, 240 | kernel_size=[1,1], stride=1, padding='SAME') 241 | self.aux_flat = flatten(self.aux_conv) 242 | self.aux_out = fully_connected(self.aux_flat, num_outputs=self.num_classes, activation_fn=None) 243 | 244 | self.final_pool = avg_pool2d(prev, kernel_size=[2,2], stride=1, padding='VALID') 245 | self.final_dropout = tf.nn.dropout(self.final_pool, 0.8) 246 | self.final_flat = flatten(self.final_dropout) 247 | self.final_out = fully_connected(self.final_flat, num_outputs=self.num_classes, activation_fn=None) 248 | 249 | return [self.aux_out, self.final_out] 250 | -------------------------------------------------------------------------------- /models/inception_v3.py: -------------------------------------------------------------------------------- 1 | from models.imgclfmodel import ImgClfModel 2 | from dataset.dataset import Dataset 3 | 4 | import tensorflow as tf 5 | from tensorflow.contrib.layers import conv2d 6 | from tensorflow.contrib.layers import max_pool2d 7 | from tensorflow.contrib.layers import avg_pool2d 8 | from tensorflow.contrib.layers import flatten 9 | from tensorflow.contrib.layers import fully_connected 10 | 11 | class InceptionV3(ImgClfModel): 
12 | def __init__(self): 13 | ImgClfModel.__init__(self, scale_to_imagenet=True) 14 | 15 | def create_model(self, input): 16 | # STEM Network 17 | with tf.variable_scope('stem'): 18 | self.conv2d_1 = conv2d(input, num_outputs=32, 19 | kernel_size=[3,3], stride=2, padding='VALID', 20 | activation_fn=tf.nn.relu) 21 | self.conv2d_2 = conv2d(self.conv2d_1, num_outputs=32, 22 | kernel_size=[3,3], stride=1, padding='VALID', 23 | activation_fn=tf.nn.relu) 24 | self.conv2d_3 = conv2d(self.conv2d_2, num_outputs=64, 25 | kernel_size=[3,3], stride=1, padding='SAME', 26 | activation_fn=tf.nn.relu) 27 | self.pool_1 = max_pool2d(self.conv2d_3, kernel_size=[3,3], stride=2, padding='VALID') 28 | 29 | self.conv2d_4 = conv2d(self.pool_1, num_outputs=80, 30 | kernel_size=[3,3], stride=1, padding='VALID', 31 | activation_fn=tf.nn.relu) 32 | self.conv2d_5 = conv2d(self.conv2d_4, num_outputs=192, 33 | kernel_size=[3,3], stride=2, padding='VALID', 34 | activation_fn=tf.nn.relu) 35 | self.pool_2 = max_pool2d(self.conv2d_5, kernel_size=[3,3], stride=2, padding='VALID') 36 | 37 | prev = self.pool_2 38 | 39 | # Inception (3) 1, 2, 3 40 | inception3_nums = { 41 | 'branch_a' : [64, 64, 64], 42 | 'branch_b_1': [48, 48, 48], 43 | 'branch_b_2': [64, 64, 64], 44 | 'branch_c_1': [64, 64, 64], 45 | 'branch_c_2': [96, 96, 96], 46 | 'branch_c_3': [96, 96, 96], 47 | 'branch_d' : [32, 64, 64] 48 | } 49 | 50 | with tf.variable_scope('inception_3'): 51 | for i in range(3): 52 | branch_a_kernels = inception3_nums['branch_a'][i] 53 | branch_b_1_kernels = inception3_nums['branch_b_1'][i] 54 | branch_b_2_kernels = inception3_nums['branch_b_2'][i] 55 | branch_c_1_kernels = inception3_nums['branch_c_1'][i] 56 | branch_c_2_kernels = inception3_nums['branch_c_2'][i] 57 | branch_c_3_kernels = inception3_nums['branch_c_3'][i] 58 | branch_d_kernels = inception3_nums['branch_d'][i] 59 | 60 | branch_a = conv2d(prev, num_outputs=branch_a_kernels, 61 | kernel_size=[1,1], stride=1, padding='SAME') 62 | 63 | branch_b = 
conv2d(prev, num_outputs=branch_b_1_kernels, 64 | kernel_size=[1,1], stride=1, padding='SAME') 65 | branch_b = conv2d(branch_b, num_outputs=branch_b_2_kernels, 66 | kernel_size=[3,3], stride=1, padding='SAME') 67 | 68 | branch_c = conv2d(prev, num_outputs=branch_c_1_kernels, 69 | kernel_size=[1,1], stride=1, padding='SAME') 70 | branch_c = conv2d(branch_c, num_outputs=branch_c_2_kernels, 71 | kernel_size=[3,3], stride=1, padding='SAME') 72 | branch_c = conv2d(branch_c, num_outputs=branch_c_3_kernels, 73 | kernel_size=[3,3], stride=1, padding='SAME') 74 | 75 | branch_d = avg_pool2d(prev, kernel_size=[3,3], stride=1, padding='SAME') 76 | branch_d = conv2d(branch_d, num_outputs=branch_d_kernels, 77 | kernel_size=[1,1], stride=1, padding='SAME') 78 | 79 | layers_concat = list() 80 | layers_concat.append(branch_a) 81 | layers_concat.append(branch_b) 82 | layers_concat.append(branch_c) 83 | layers_concat.append(branch_d) 84 | prev = tf.concat(layers_concat, 3) 85 | 86 | with tf.variable_scope('grid_reduction_a'): 87 | branch_a = conv2d(prev, num_outputs=384, 88 | kernel_size=[3,3], stride=2, padding='VALID') 89 | 90 | branch_b = conv2d(prev, num_outputs=64, 91 | kernel_size=[1,1], stride=1, padding='SAME') 92 | branch_b = conv2d(branch_b, num_outputs=96, 93 | kernel_size=[3,3], stride=1, padding='SAME') 94 | branch_b = conv2d(branch_b, num_outputs=96, 95 | kernel_size=[3,3], stride=2, padding='VALID') 96 | 97 | branch_c = max_pool2d(prev, kernel_size=[3,3], stride=2, padding='VALID') 98 | 99 | layers_concat = list() 100 | layers_concat.append(branch_a) 101 | layers_concat.append(branch_b) 102 | layers_concat.append(branch_c) 103 | prev = tf.concat(layers_concat, 3) 104 | 105 | inception5_nums = { 106 | 'branch_a' : [192, 192, 192, 192], 107 | 'branch_b_1': [128, 160, 160, 192], 108 | 'branch_b_2': [128, 160, 160, 192], 109 | 'branch_b_3': [192, 192, 192, 192], 110 | 'branch_c_1': [128, 160, 160, 192], 111 | 'branch_c_2': [128, 160, 160, 192], 112 | 'branch_c_3': [128, 
160, 160, 192], 113 | 'branch_c_4': [128, 160, 160, 192], 114 | 'branch_c_5': [192, 192, 192, 192], 115 | 'branch_d' : [192, 192, 192, 192] 116 | } 117 | 118 | with tf.variable_scope('inception_5'): 119 | for i in range(4): 120 | branch_a_kernels = inception5_nums['branch_a'][i] 121 | branch_b_1_kernels = inception5_nums['branch_b_1'][i] 122 | branch_b_2_kernels = inception5_nums['branch_b_2'][i] 123 | branch_b_3_kernels = inception5_nums['branch_b_3'][i] 124 | branch_c_1_kernels = inception5_nums['branch_c_1'][i] 125 | branch_c_2_kernels = inception5_nums['branch_c_2'][i] 126 | branch_c_3_kernels = inception5_nums['branch_c_3'][i] 127 | branch_c_4_kernels = inception5_nums['branch_c_4'][i] 128 | branch_c_5_kernels = inception5_nums['branch_c_5'][i] 129 | branch_d_kernels = inception5_nums['branch_d'][i] 130 | 131 | branch_a = conv2d(prev, num_outputs=branch_a_kernels, 132 | kernel_size=[1,1], stride=1, padding='SAME') 133 | 134 | branch_b = conv2d(prev, num_outputs=branch_b_1_kernels, 135 | kernel_size=[1,1], stride=1, padding='SAME') 136 | branch_b = conv2d(branch_b, num_outputs=branch_b_2_kernels, 137 | kernel_size=[1,7], stride=1, padding='SAME') 138 | branch_b = conv2d(branch_b, num_outputs=branch_b_3_kernels, 139 | kernel_size=[7,1], stride=1, padding='SAME') 140 | 141 | branch_c = conv2d(prev, num_outputs=branch_c_1_kernels, 142 | kernel_size=[1,1], stride=1, padding='SAME') 143 | branch_c = conv2d(branch_c, num_outputs=branch_c_2_kernels, 144 | kernel_size=[7,7], stride=1, padding='SAME') 145 | branch_c = conv2d(branch_c, num_outputs=branch_c_3_kernels, 146 | kernel_size=[1,7], stride=1, padding='SAME') 147 | branch_c = conv2d(branch_c, num_outputs=branch_c_4_kernels, 148 | kernel_size=[7,1], stride=1, padding='SAME') 149 | branch_c = conv2d(branch_c, num_outputs=branch_c_5_kernels, 150 | kernel_size=[1,7], stride=1, padding='SAME') 151 | 152 | branch_d = avg_pool2d(prev, kernel_size=[3,3], stride=1, padding='SAME') 153 | branch_d = conv2d(branch_d, 
num_outputs=branch_d_kernels, 154 | kernel_size=[1,1], stride=1, padding='SAME') 155 | 156 | layers_concat = list() 157 | layers_concat.append(branch_a) 158 | layers_concat.append(branch_b) 159 | layers_concat.append(branch_c) 160 | layers_concat.append(branch_d) 161 | prev = tf.concat(layers_concat, 3) 162 | 163 | self.aux = prev 164 | 165 | with tf.variable_scope('grid_reduction_b'): 166 | branch_base = conv2d(prev, num_outputs=192, 167 | kernel_size=[1,1], stride=1, padding='SAME') 168 | 169 | branch_a = conv2d(branch_base, num_outputs=320, 170 | kernel_size=[3,3], stride=2, padding='VALID') 171 | 172 | branch_b = conv2d(branch_base, num_outputs=192, 173 | kernel_size=[1,7], stride=1, padding='SAME') 174 | branch_b = conv2d(branch_b, num_outputs=192, 175 | kernel_size=[7,1], stride=1, padding='SAME') 176 | branch_b = conv2d(branch_b, num_outputs=192, 177 | kernel_size=[3,3], stride=2, padding='VALID') 178 | 179 | branch_c = max_pool2d(prev, kernel_size=[3,3], stride=2, padding='VALID') 180 | 181 | layers_concat = list() 182 | layers_concat.append(branch_a) 183 | layers_concat.append(branch_b) 184 | layers_concat.append(branch_c) 185 | prev = tf.concat(layers_concat, 3) 186 | 187 | inception2_nums = { 188 | 'branch_a' : [320, 320], 189 | 'branch_b_1': [384, 384], 190 | 'branch_b_2': [384, 384], 191 | 'branch_b_3': [384, 384], 192 | 'branch_c_1': [448, 448], 193 | 'branch_c_2': [384, 384], 194 | 'branch_c_3': [384, 384], 195 | 'branch_d' : [192, 192] 196 | } 197 | 198 | with tf.variable_scope('inception_2'): 199 | for i in range(2): 200 | branch_a_kernels = inception2_nums['branch_a'][i] 201 | branch_b_1_kernels = inception2_nums['branch_b_1'][i] 202 | branch_b_2_kernels = inception2_nums['branch_b_2'][i] 203 | branch_b_3_kernels = inception2_nums['branch_b_3'][i] 204 | branch_c_1_kernels = inception2_nums['branch_c_1'][i] 205 | branch_c_2_kernels = inception2_nums['branch_c_2'][i] 206 | branch_c_3_kernels = inception2_nums['branch_c_3'][i] 207 | branch_d_kernels 
= inception2_nums['branch_d'][i] 208 | 209 | branch_a = conv2d(prev, num_outputs=branch_a_kernels, 210 | kernel_size=[1,1], stride=1, padding='SAME') 211 | 212 | branch_b = conv2d(prev, num_outputs=branch_b_1_kernels, 213 | kernel_size=[1,1], stride=1, padding='SAME') 214 | branch_b = conv2d(branch_b, num_outputs=branch_b_2_kernels, 215 | kernel_size=[1,3], stride=1, padding='SAME') 216 | branch_b = conv2d(branch_b, num_outputs=branch_b_3_kernels, 217 | kernel_size=[3,1], stride=1, padding='SAME') 218 | 219 | branch_c = conv2d(prev, num_outputs=branch_c_1_kernels, 220 | kernel_size=[1,1], stride=1, padding='SAME') 221 | branch_c = conv2d(branch_c, num_outputs=branch_c_2_kernels, 222 | kernel_size=[1,3], stride=1, padding='SAME') 223 | branch_c = conv2d(branch_c, num_outputs=branch_c_3_kernels, 224 | kernel_size=[3,1], stride=1, padding='SAME') 225 | 226 | branch_d = max_pool2d(prev, kernel_size=[3,3], stride=1, padding='SAME') 227 | branch_d = conv2d(branch_d, num_outputs=branch_d_kernels, 228 | kernel_size=[1,1], stride=1, padding='SAME') 229 | 230 | layers_concat = list() 231 | layers_concat.append(branch_a) 232 | layers_concat.append(branch_b) 233 | layers_concat.append(branch_c) 234 | layers_concat.append(branch_d) 235 | prev = tf.concat(layers_concat, 3) 236 | 237 | with tf.variable_scope('final'): 238 | self.aux_pool = avg_pool2d(self.aux, kernel_size=[5,5], stride=3, padding='VALID') 239 | self.aux_conv = conv2d(self.aux_pool, num_outputs=128, 240 | kernel_size=[1,1], stride=1, padding='SAME') 241 | self.aux_flat = flatten(self.aux_conv) 242 | self.aux_bn = tf.layers.batch_normalization(self.aux_flat) 243 | self.aux_out = fully_connected(self.aux_bn, num_outputs=self.num_classes, activation_fn=None) 244 | 245 | self.final_pool = avg_pool2d(prev, kernel_size=[2,2], stride=1, padding='VALID') 246 | self.final_dropout = tf.nn.dropout(self.final_pool, 0.8) 247 | self.final_flat = flatten(self.final_dropout) 248 | self.final_bn = 
tf.layers.batch_normalization(self.final_flat) 249 | self.final_out = fully_connected(self.final_bn, num_outputs=self.num_classes, activation_fn=None) 250 | 251 | return [self.aux_out, self.final_out] 252 | -------------------------------------------------------------------------------- /models/inception_v4.py: -------------------------------------------------------------------------------- 1 | from models.imgclfmodel import ImgClfModel 2 | from dataset.dataset import Dataset 3 | 4 | import tensorflow as tf 5 | from tensorflow.contrib.layers import conv2d 6 | from tensorflow.contrib.layers import max_pool2d 7 | from tensorflow.contrib.layers import avg_pool2d 8 | from tensorflow.contrib.layers import flatten 9 | from tensorflow.contrib.layers import fully_connected 10 | 11 | class InceptionV4(ImgClfModel): 12 | def __init__(self): 13 | ImgClfModel.__init__(self, scale_to_imagenet=True) 14 | 15 | def create_model(self, input): 16 | # STEM Network - separate into 3 parts by filter concatenation points 17 | with tf.variable_scope('stem'): 18 | stem_a = conv2d(input, num_outputs=32, 19 | kernel_size=[3,3], stride=2, padding='VALID', 20 | activation_fn=tf.nn.relu) 21 | stem_a = conv2d(stem_a, num_outputs=32, 22 | kernel_size=[3,3], stride=1, padding='VALID', 23 | activation_fn=tf.nn.relu) 24 | stem_a = conv2d(stem_a, num_outputs=64, 25 | kernel_size=[3,3], stride=1, padding='SAME', 26 | activation_fn=tf.nn.relu) 27 | branch_a = max_pool2d(stem_a, kernel_size=[3,3], stride=2, padding='VALID') 28 | branch_b = conv2d(stem_a, num_outputs=96, 29 | kernel_size=[3,3], stride=2, padding='VALID', 30 | activation_fn=tf.nn.relu) 31 | layers_concat = list() 32 | layers_concat.append(branch_a) 33 | layers_concat.append(branch_b) 34 | stem_b = tf.concat(layers_concat, 3) 35 | 36 | branch_a = conv2d(stem_b, num_outputs=64, 37 | kernel_size=[1,1], stride=1, padding='SAME', 38 | activation_fn=tf.nn.relu) 39 | branch_a = conv2d(branch_a, num_outputs=96, 40 | kernel_size=[3,3], stride=1, 
padding='VALID', 41 | activation_fn=tf.nn.relu) 42 | branch_b = conv2d(stem_b, num_outputs=64, 43 | kernel_size=[1,1], stride=1, padding='SAME', 44 | activation_fn=tf.nn.relu) 45 | branch_b = conv2d(branch_b, num_outputs=64, 46 | kernel_size=[7,1], stride=1, padding='SAME', 47 | activation_fn=tf.nn.relu) 48 | branch_b = conv2d(branch_b, num_outputs=64, 49 | kernel_size=[1,7], stride=1, padding='SAME', 50 | activation_fn=tf.nn.relu) 51 | branch_b = conv2d(branch_b, num_outputs=96, 52 | kernel_size=[3,3], stride=1, padding='VALID', 53 | activation_fn=tf.nn.relu) 54 | layers_concat = list() 55 | layers_concat.append(branch_a) 56 | layers_concat.append(branch_b) 57 | stem_c = tf.concat(layers_concat, 3) 58 | 59 | branch_a = conv2d(stem_c, num_outputs=192, 60 | kernel_size=[3,3], stride=2, padding='VALID', 61 | activation_fn=tf.nn.relu) 62 | branch_b = max_pool2d(stem_c, kernel_size=[3,3], stride=2, padding='VALID') 63 | layers_concat = list() 64 | layers_concat.append(branch_a) 65 | layers_concat.append(branch_b) 66 | prev = tf.concat(layers_concat, 3) 67 | 68 | # 4 x Inception-A 69 | with tf.variable_scope('inception_a'): 70 | for i in range(4): 71 | branch_a = avg_pool2d(prev, kernel_size=[3,3], stride=1, padding='SAME') 72 | branch_a = conv2d(branch_a, num_outputs=96, 73 | kernel_size=[1,1], stride=1, padding='SAME', 74 | activation_fn=tf.nn.relu) 75 | 76 | branch_b = conv2d(prev, num_outputs=96, 77 | kernel_size=[1,1], stride=1, padding='SAME', 78 | activation_fn=tf.nn.relu) 79 | 80 | branch_c = conv2d(prev, num_outputs=64, 81 | kernel_size=[1,1], stride=1, padding='SAME', 82 | activation_fn=tf.nn.relu) 83 | branch_c = conv2d(branch_c, num_outputs=96, 84 | kernel_size=[3,3], stride=1, padding='SAME', 85 | activation_fn=tf.nn.relu) 86 | 87 | branch_d = conv2d(prev, num_outputs=64, 88 | kernel_size=[1,1], stride=1, padding='SAME', 89 | activation_fn=tf.nn.relu) 90 | branch_d = conv2d(branch_d, num_outputs=96, 91 | kernel_size=[3,3], stride=1, padding='SAME', 92 | 
activation_fn=tf.nn.relu) 93 | branch_d = conv2d(branch_d, num_outputs=96, 94 | kernel_size=[3,3], stride=1, padding='SAME', 95 | activation_fn=tf.nn.relu) 96 | layers_concat = list() 97 | layers_concat.append(branch_a) 98 | layers_concat.append(branch_b) 99 | layers_concat.append(branch_c) 100 | layers_concat.append(branch_d) 101 | prev = tf.concat(layers_concat, 3) 102 | 103 | with tf.variable_scope('reduction_a'): 104 | branch_a = max_pool2d(prev, kernel_size=[3,3], stride=2, padding='VALID') 105 | 106 | branch_b = conv2d(prev, num_outputs=384, 107 | kernel_size=[3,3], stride=2, padding='VALID', 108 | activation_fn=tf.nn.relu) 109 | 110 | #k=192, l=224, m=256 111 | branch_c = conv2d(prev, num_outputs=192, 112 | kernel_size=[1,1], stride=1, padding='SAME', 113 | activation_fn=tf.nn.relu) 114 | branch_c = conv2d(branch_c, num_outputs=224, 115 | kernel_size=[3,3], stride=1, padding='SAME', 116 | activation_fn=tf.nn.relu) 117 | branch_c = conv2d(branch_c, num_outputs=256, 118 | kernel_size=[3,3], stride=2, padding='VALID', 119 | activation_fn=tf.nn.relu) 120 | layers_concat = list() 121 | layers_concat.append(branch_a) 122 | layers_concat.append(branch_b) 123 | layers_concat.append(branch_c) 124 | prev = tf.concat(layers_concat, 3) 125 | 126 | # 7 x Inception-B 127 | with tf.variable_scope('inception_b'): 128 | for i in range(7): 129 | branch_a = avg_pool2d(prev, kernel_size=[3,3], stride=1, padding='SAME') 130 | branch_a = conv2d(branch_a, num_outputs=128, 131 | kernel_size=[1,1], stride=1, padding='SAME', 132 | activation_fn=tf.nn.relu) 133 | 134 | branch_b = conv2d(prev, num_outputs=384, 135 | kernel_size=[1,1], stride=1, padding='SAME', 136 | activation_fn=tf.nn.relu) 137 | 138 | branch_c = conv2d(prev, num_outputs=192, 139 | kernel_size=[1,1], stride=1, padding='SAME', 140 | activation_fn=tf.nn.relu) 141 | branch_c = conv2d(branch_c, num_outputs=224, 142 | kernel_size=[1,7], stride=1, padding='SAME', 143 | activation_fn=tf.nn.relu) 144 | branch_c = 
conv2d(branch_c, num_outputs=256, 145 | kernel_size=[7,1], stride=1, padding='SAME', 146 | activation_fn=tf.nn.relu) 147 | 148 | branch_d = conv2d(prev, num_outputs=192, 149 | kernel_size=[1,1], stride=1, padding='SAME', 150 | activation_fn=tf.nn.relu) 151 | branch_d = conv2d(branch_d, num_outputs=192, 152 | kernel_size=[1,7], stride=1, padding='SAME', 153 | activation_fn=tf.nn.relu) 154 | branch_d = conv2d(branch_d, num_outputs=224, 155 | kernel_size=[7,1], stride=1, padding='SAME', 156 | activation_fn=tf.nn.relu) 157 | branch_d = conv2d(branch_d, num_outputs=224, 158 | kernel_size=[1,7], stride=1, padding='SAME', 159 | activation_fn=tf.nn.relu) 160 | branch_d = conv2d(branch_d, num_outputs=256, 161 | kernel_size=[7,1], stride=1, padding='SAME', 162 | activation_fn=tf.nn.relu) 163 | layers_concat = list() 164 | layers_concat.append(branch_a) 165 | layers_concat.append(branch_b) 166 | layers_concat.append(branch_c) 167 | layers_concat.append(branch_d) 168 | prev = tf.concat(layers_concat, 3) 169 | 170 | with tf.variable_scope('reduction_b'): 171 | branch_a = max_pool2d(prev, kernel_size=[3,3], stride=2, padding='VALID') 172 | 173 | branch_b = conv2d(prev, num_outputs=192, 174 | kernel_size=[1,1], stride=1, padding='SAME', 175 | activation_fn=tf.nn.relu) 176 | branch_b = conv2d(branch_b, num_outputs=192, 177 | kernel_size=[3,3], stride=2, padding='VALID', 178 | activation_fn=tf.nn.relu) 179 | 180 | branch_c = conv2d(prev, num_outputs=256, 181 | kernel_size=[1,1], stride=1, padding='SAME', 182 | activation_fn=tf.nn.relu) 183 | branch_c = conv2d(branch_c, num_outputs=256, 184 | kernel_size=[1,7], stride=1, padding='SAME', 185 | activation_fn=tf.nn.relu) 186 | branch_c = conv2d(branch_c, num_outputs=320, 187 | kernel_size=[7,1], stride=1, padding='SAME', 188 | activation_fn=tf.nn.relu) 189 | branch_c = conv2d(branch_c, num_outputs=256, 190 | kernel_size=[3,3], stride=2, padding='VALID', 191 | activation_fn=tf.nn.relu) 192 | layers_concat = list() 193 | 
layers_concat.append(branch_a) 194 | layers_concat.append(branch_b) 195 | layers_concat.append(branch_c) 196 | prev = tf.concat(layers_concat, 3) 197 | 198 | # 3 x Inception-C 199 | with tf.variable_scope('inception_c'): 200 | for i in range(3): 201 | branch_a = avg_pool2d(prev, kernel_size=[3,3], stride=1, padding='SAME') 202 | branch_a = conv2d(branch_a, num_outputs=256, 203 | kernel_size=[1,1], stride=1, padding='SAME', 204 | activation_fn=tf.nn.relu) 205 | 206 | branch_b = conv2d(prev, num_outputs=256, 207 | kernel_size=[1,1], stride=1, padding='SAME', 208 | activation_fn=tf.nn.relu) 209 | 210 | branch_c = conv2d(prev, num_outputs=384, 211 | kernel_size=[1,1], stride=1, padding='SAME', 212 | activation_fn=tf.nn.relu) 213 | branch_c_a = conv2d(branch_c, num_outputs=256, 214 | kernel_size=[1,3], stride=1, padding='SAME', 215 | activation_fn=tf.nn.relu) 216 | branch_c_b = conv2d(branch_c, num_outputs=256, 217 | kernel_size=[3,1], stride=1, padding='SAME', 218 | activation_fn=tf.nn.relu) 219 | 220 | branch_d = conv2d(prev, num_outputs=384, 221 | kernel_size=[1,1], stride=1, padding='SAME', 222 | activation_fn=tf.nn.relu) 223 | branch_d = conv2d(branch_d, num_outputs=448, 224 | kernel_size=[1,3], stride=1, padding='SAME', 225 | activation_fn=tf.nn.relu) 226 | branch_d = conv2d(branch_d, num_outputs=512, 227 | kernel_size=[3,1], stride=1, padding='SAME', 228 | activation_fn=tf.nn.relu) 229 | branch_d_a = conv2d(branch_d, num_outputs=256, 230 | kernel_size=[1,3], stride=1, padding='SAME', 231 | activation_fn=tf.nn.relu) 232 | branch_d_b = conv2d(branch_d, num_outputs=256, 233 | kernel_size=[3,1], stride=1, padding='SAME', 234 | activation_fn=tf.nn.relu) 235 | layers_concat = list() 236 | layers_concat.append(branch_a) 237 | layers_concat.append(branch_b) 238 | layers_concat.append(branch_c_a) 239 | layers_concat.append(branch_c_b) 240 | layers_concat.append(branch_d_a) 241 | layers_concat.append(branch_d_b) 242 | prev = tf.concat(layers_concat, 3) 243 | 244 | # Finals 
245 | with tf.variable_scope('final'): 246 | prev = avg_pool2d(prev, kernel_size=[3,3], stride=2, padding='SAME') 247 | flat = flatten(prev) 248 | dr = tf.nn.dropout(flat, 0.8) 249 | self.out = fully_connected(dr, num_outputs=self.num_classes, activation_fn=None) 250 | 251 | return [self.out] 252 | -------------------------------------------------------------------------------- /models/resnet.py: -------------------------------------------------------------------------------- 1 | from models.imgclfmodel import ImgClfModel 2 | from dataset.dataset import Dataset 3 | 4 | import tensorflow as tf 5 | from tensorflow.contrib.layers import conv2d 6 | from tensorflow.contrib.layers import max_pool2d 7 | from tensorflow.contrib.layers import avg_pool2d 8 | from tensorflow.contrib.layers import flatten 9 | from tensorflow.contrib.layers import fully_connected 10 | 11 | """ 12 | Implementation of the Residual Network (ResNet), the winning architecture of ILSVRC 2015, proposed by Kaiming He et al. at Microsoft Research. 13 | 14 | Its main technical contributions are identity shortcut connections ("identity mapping"), which make it practical to train very deep networks. 15 | """ 16 | class ResNet(ImgClfModel): 17 | def __init__(self, model_type='50'): 18 | ImgClfModel.__init__(self, scale_to_imagenet=True, model_type=model_type) 19 | 20 | def create_model(self, input): 21 | with tf.variable_scope('conv1'): 22 | conv1 = conv2d(input, num_outputs=64, 23 | kernel_size=[7,7], stride=2, padding='SAME', 24 | activation_fn=None) 25 | conv1 = tf.layers.batch_normalization(conv1) 26 | conv1 = tf.nn.relu(conv1) 27 | self.conv1 = conv1 28 | 29 | with tf.variable_scope('conv2'): 30 | conv2 = max_pool2d(conv1, kernel_size=[3,3], stride=2, padding='SAME') 31 | 32 | if self.model_type == "18" or self.model_type == "34": 33 | conv2 = self.repeat_residual_blocks(repeat=2, 34 | x=conv2, 35 | block=self.residual_block_a, 36 | num_outputs=[64,64], kernel_sizes=[[3,3], [3,3]], 37 | pool=False) 38 | if self.model_type == "34": 39 | conv2 = 
self.repeat_residual_blocks(repeat=1, 40 | x=conv2, 41 | block=self.residual_block_a, 42 | num_outputs=[64,64], kernel_sizes=[[3,3], [3,3]], 43 | pool=False) 44 | 45 | elif self.model_type == "50" or self.model_type == "101" or self.model_type == "152": 46 | conv2 = self.repeat_residual_blocks(repeat=3, 47 | x=conv2, 48 | block=self.residual_block_b, 49 | num_outputs=[64,64,256], kernel_sizes=[[1,1], [3,3], [1,1]], 50 | pool=False) 51 | self.conv2 = conv2 52 | 53 | with tf.variable_scope('conv3'): 54 | if self.model_type == "18" or self.model_type == "34": 55 | conv3 = self.repeat_residual_blocks(repeat=2, 56 | x=conv2, 57 | block=self.residual_block_a, 58 | num_outputs=[128,128], kernel_sizes=[[3,3], [3,3]], 59 | pool=True) 60 | if self.model_type == "34": 61 | conv3 = self.repeat_residual_blocks(repeat=2, 62 | x=conv3, 63 | block=self.residual_block_a, 64 | num_outputs=[128,128], kernel_sizes=[[3,3], [3,3]], 65 | pool=False) 66 | 67 | elif self.model_type == "50" or self.model_type == "101" or self.model_type == "152": 68 | conv3 = self.repeat_residual_blocks(repeat=4, 69 | x=conv2, 70 | block=self.residual_block_b, 71 | num_outputs=[128,128,512], kernel_sizes=[[1,1], [3,3], [1,1]], 72 | pool=True) 73 | if self.model_type == "152": 74 | conv3 = self.repeat_residual_blocks(repeat=4, 75 | x=conv3, 76 | block=self.residual_block_b, 77 | num_outputs=[128,128,512], kernel_sizes=[[1,1], [3,3], [1,1]], 78 | pool=False) 79 | 80 | self.conv3 = conv3 81 | 82 | with tf.variable_scope('conv4'): 83 | if self.model_type == "18" or self.model_type == "34": 84 | conv4 = self.repeat_residual_blocks(repeat=2, 85 | x=conv3, 86 | block=self.residual_block_a, 87 | num_outputs=[256,256], kernel_sizes=[[3,3], [3,3]], 88 | pool=True) 89 | if self.model_type == "34": 90 | conv4 = self.repeat_residual_blocks(repeat=4, 91 | x=conv4, 92 | block=self.residual_block_a, 93 | num_outputs=[256,256], kernel_sizes=[[3,3], [3,3]], 94 | pool=False) 95 | 96 | elif self.model_type == "50" or 
self.model_type == "101" or self.model_type == "152": 97 | conv4 = self.repeat_residual_blocks(repeat=6, 98 | x=conv3, 99 | block=self.residual_block_b, 100 | num_outputs=[256,256,1024], kernel_sizes=[[1,1], [3,3], [1,1]], 101 | pool=True) 102 | 103 | if self.model_type == "101" or self.model_type == "152": 104 | conv4 = self.repeat_residual_blocks(repeat=17, 105 | x=conv4, 106 | block=self.residual_block_b, 107 | num_outputs=[256,256,1024], kernel_sizes=[[1,1], [3,3], [1,1]], 108 | pool=False) 109 | 110 | if self.model_type == "152": 111 | conv4 = self.repeat_residual_blocks(repeat=13, # 6 + 17 + 13 = 36 blocks in conv4 for ResNet-152 112 | x=conv4, 113 | block=self.residual_block_b, 114 | num_outputs=[256,256,1024], kernel_sizes=[[1,1], [3,3], [1,1]], 115 | pool=False) 116 | 117 | self.conv4 = conv4 118 | 119 | with tf.variable_scope('conv5'): 120 | if self.model_type == "18" or self.model_type == "34": 121 | conv5 = self.repeat_residual_blocks(repeat=2, 122 | x=conv4, 123 | block=self.residual_block_a, 124 | num_outputs=[512,512], kernel_sizes=[[3,3], [3,3]], 125 | pool=True) 126 | if self.model_type == "34": 127 | conv5 = self.repeat_residual_blocks(repeat=1, 128 | x=conv5, 129 | block=self.residual_block_a, 130 | num_outputs=[512,512], kernel_sizes=[[3,3], [3,3]], 131 | pool=False) 132 | 133 | elif self.model_type == "50" or self.model_type == "101" or self.model_type == "152": 134 | conv5 = self.repeat_residual_blocks(repeat=3, 135 | x=conv4, 136 | block=self.residual_block_b, 137 | num_outputs=[512,512,2048], kernel_sizes=[[1,1], [3,3], [1,1]], 138 | pool=True) 139 | 140 | self.conv5 = conv5 141 | 142 | with tf.variable_scope('before_final'): 143 | avg_pool = avg_pool2d(conv5, kernel_size=[3,3], stride=2, padding='SAME') 144 | flat = flatten(avg_pool) 145 | self.flat = flat 146 | 147 | with tf.variable_scope('final'): 148 | self.final_out = fully_connected(flat, num_outputs=self.num_classes, activation_fn=None) 149 | 150 | return [self.final_out] 151 | 152 | def repeat_residual_blocks(self, repeat, x, block, 
                               num_outputs, kernel_sizes, pool=True):
        out = x

        # The pooled (downsampling) block counts as the first repetition
        if pool:
            out = block(out, num_outputs, kernel_sizes, pool=True)
            repeat = repeat - 1

        # Chain the remaining blocks, feeding each one the previous block's output
        for i in range(repeat):
            out = block(out, num_outputs, kernel_sizes)

        return out

    # Applicable to 18, 34
    def residual_block_a(self, x, num_outputs, kernel_sizes=[[3,3], [3,3]], stride=1, pool=False):
        res = x
        out = x

        if pool:
            out = max_pool2d(out, kernel_size=[3,3], stride=2, padding='SAME')
            # Project the shortcut to the block's output depth and halve its spatial size
            res = conv2d(res, num_outputs=num_outputs[-1],
                         kernel_size=[1,1], stride=[2,2], padding='SAME',
                         activation_fn=None)
            res = tf.layers.batch_normalization(res)
            res = tf.nn.relu(res)

        for i in range(len(kernel_sizes)):
            num_output = num_outputs[i]
            kernel_size = kernel_sizes[i]

            out = conv2d(out, num_outputs=num_output,
                         kernel_size=kernel_size, stride=stride, padding='SAME',
                         activation_fn=None)
            out = tf.layers.batch_normalization(out)

            # ReLU after every conv except the last; the last ReLU follows the addition
            if i < len(kernel_sizes)-1:
                out = tf.nn.relu(out)

        f_x = tf.nn.relu(out + res)
        return f_x

    # Applicable to 50, 101, 152
    def residual_block_b(self, x, num_outputs, kernel_sizes=[[1,1], [3,3], [1,1]], stride=1, pool=False):
        res = x
        out = x

        first_num_output = num_outputs[0]
        last_num_output = num_outputs[len(num_outputs)-1]

        if pool:
            out = max_pool2d(out, kernel_size=[3,3], stride=2, padding='SAME')
            res = conv2d(res, num_outputs=last_num_output,
                         kernel_size=[1,1], stride=[2,2], padding='SAME',
                         activation_fn=None)
            res = tf.layers.batch_normalization(res)
            res = tf.nn.relu(res)
        else:
            res = conv2d(res, num_outputs=last_num_output,
                         kernel_size=[1,1], stride=[1,1], padding='SAME',
                         activation_fn=None)
            res = tf.layers.batch_normalization(res)
            res = tf.nn.relu(res)

        for i in
range(len(kernel_sizes)):
            num_output = num_outputs[i]
            kernel_size = kernel_sizes[i]

            out = conv2d(out, num_outputs=num_output,
                         kernel_size=kernel_size, stride=stride, padding='SAME',
                         activation_fn=None)
            out = tf.layers.batch_normalization(out)

            # ReLU after every conv except the last; the last ReLU follows the addition
            if i < len(kernel_sizes)-1:
                out = tf.nn.relu(out)

        f_x = tf.nn.relu(out + res)
        return f_x

--------------------------------------------------------------------------------
/models/resnet_v2.py:
--------------------------------------------------------------------------------
from models.imgclfmodel import ImgClfModel
from dataset.dataset import Dataset

import tensorflow as tf
from tensorflow.contrib.layers import conv2d
from tensorflow.contrib.layers import max_pool2d
from tensorflow.contrib.layers import avg_pool2d
from tensorflow.contrib.layers import flatten
from tensorflow.contrib.layers import fully_connected

class ResNetV2(ImgClfModel):
    def __init__(self, model_type='50'):
        ImgClfModel.__init__(self, scale_to_imagenet=True, model_type=model_type)

    def create_model(self, input):
        with tf.variable_scope('conv1'):
            conv1 = conv2d(input, num_outputs=64,
                           kernel_size=[7,7], stride=2, padding='SAME',
                           activation_fn=None)
            conv1 = tf.layers.batch_normalization(conv1)
            conv1 = tf.nn.relu(conv1)
            self.conv1 = conv1

        with tf.variable_scope('conv2'):
            conv2 = max_pool2d(conv1, kernel_size=[3,3], stride=2, padding='SAME')

            # Compare strings with ==; `is` tests object identity, not equality
            if self.model_type == "18" or self.model_type == "34":
                conv2 = self.repeat_residual_blocks(repeat=2,
                                                    x=conv2,
                                                    block=self.residual_block_a,
                                                    num_outputs=[64,64], kernel_sizes=[[3,3], [3,3]],
                                                    pool=False)
                if self.model_type == "34":
                    # One output depth per kernel size, and a comma between the two kernel sizes
                    conv2 = self.repeat_residual_blocks(repeat=2,
                                                        x=conv2,
                                                        block=self.residual_block_a,
                                                        num_outputs=[64,64], kernel_sizes=[[3,3], [3,3]],
                                                        pool=False)

            elif self.model_type == "50" or self.model_type == "101" or self.model_type == "152" or self.model_type == "200":
                conv2 = self.repeat_residual_blocks(repeat=3,
                                                    x=conv2,
                                                    block=self.residual_block_b,
                                                    num_outputs=[64,64,256], kernel_sizes=[[1,1], [3,3], [1,1]],
                                                    pool=False)
            self.conv2 = conv2

        with tf.variable_scope('conv3'):
            if self.model_type == "18" or self.model_type == "34":
                conv3 = self.repeat_residual_blocks(repeat=2,
                                                    x=conv2,
                                                    block=self.residual_block_a,
                                                    num_outputs=[128,128], kernel_sizes=[[3,3], [3,3]],
                                                    pool=True)
                if self.model_type == "34":
                    conv3 = self.repeat_residual_blocks(repeat=2,
                                                        x=conv3,
                                                        block=self.residual_block_a,
                                                        num_outputs=[128,128], kernel_sizes=[[3,3], [3,3]],
                                                        pool=False)

            elif self.model_type == "50" or self.model_type == "101" or self.model_type == "152" or self.model_type == "200":
                conv3 = self.repeat_residual_blocks(repeat=4,
                                                    x=conv2,
                                                    block=self.residual_block_b,
                                                    num_outputs=[128,128,512], kernel_sizes=[[1,1], [3,3], [1,1]],
                                                    pool=True)
                if self.model_type == "152":
                    conv3 = self.repeat_residual_blocks(repeat=4,
                                                        x=conv3,
                                                        block=self.residual_block_b,
                                                        num_outputs=[128,128,512], kernel_sizes=[[1,1], [3,3], [1,1]],
                                                        pool=False)

                if self.model_type == "200":
                    conv3 = self.repeat_residual_blocks(repeat=16,
                                                        x=conv3,
                                                        block=self.residual_block_b,
                                                        num_outputs=[128,128,512], kernel_sizes=[[1,1], [3,3], [1,1]],
                                                        pool=False)

            self.conv3 = conv3

        with tf.variable_scope('conv4'):
            if self.model_type == "18" or self.model_type == "34":
                conv4 = self.repeat_residual_blocks(repeat=2,
                                                    x=conv3,
                                                    block=self.residual_block_a,
                                                    num_outputs=[256,256], kernel_sizes=[[3,3], [3,3]],
                                                    pool=True)
                if self.model_type == "34":
                    conv4 = self.repeat_residual_blocks(repeat=4,
                                                        x=conv4,
                                                        block=self.residual_block_a,
                                                        num_outputs=[256,256], kernel_sizes=[[3,3], [3,3]],
                                                        pool=False)

            elif self.model_type == "50" or self.model_type == "101" or self.model_type == "152" or self.model_type == "200":
                conv4 = self.repeat_residual_blocks(repeat=6,
                                                    x=conv3,
                                                    block=self.residual_block_b,
                                                    num_outputs=[256,256,1024], kernel_sizes=[[1,1], [3,3], [1,1]],
                                                    pool=True)

                if self.model_type == "101" or self.model_type == "152" or self.model_type == "200":
                    conv4 = self.repeat_residual_blocks(repeat=17,
                                                        x=conv4,
                                                        block=self.residual_block_b,
                                                        num_outputs=[256,256,1024], kernel_sizes=[[1,1], [3,3], [1,1]],
                                                        pool=False)

                if self.model_type == "152" or self.model_type == "200":
                    conv4 = self.repeat_residual_blocks(repeat=77,
                                                        x=conv4,
                                                        block=self.residual_block_b,
                                                        num_outputs=[256,256,1024], kernel_sizes=[[1,1], [3,3], [1,1]],
                                                        pool=False)

            self.conv4 = conv4

        with tf.variable_scope('conv5'):
            if self.model_type == "18" or self.model_type == "34":
                conv5 = self.repeat_residual_blocks(repeat=2,
                                                    x=conv4,
                                                    block=self.residual_block_a,
                                                    num_outputs=[512,512], kernel_sizes=[[3,3], [3,3]],
                                                    pool=True)
                if self.model_type == "34":
                    conv5 = self.repeat_residual_blocks(repeat=1,
                                                        x=conv5,
                                                        block=self.residual_block_a,
                                                        num_outputs=[512,512], kernel_sizes=[[3,3], [3,3]],
                                                        pool=True)

            elif self.model_type == "50" or self.model_type == "101" or self.model_type == "152" or self.model_type == "200":
                conv5 = self.repeat_residual_blocks(repeat=3,
                                                    x=conv4,
                                                    block=self.residual_block_b,
                                                    num_outputs=[512,512,2048], kernel_sizes=[[1,1], [3,3], [1,1]],
                                                    pool=True)

            self.conv5 = conv5

        with tf.variable_scope('before_final'):
            avg_pool = avg_pool2d(conv5, kernel_size=[3,3], stride=2, padding='SAME')
            flat = flatten(avg_pool)
            self.flat = flat

        with tf.variable_scope('final'):
            self.final_out = fully_connected(flat, num_outputs=self.num_classes, activation_fn=None)

        return [self.final_out]

    def repeat_residual_blocks(self, repeat, x, block, num_outputs, kernel_sizes, pool=True):
        out = x

        # The pooled (downsampling) block counts as the first repetition
        if pool:
            out = block(out, num_outputs, kernel_sizes, pool=True)
            repeat = repeat - 1

        # Chain the remaining blocks, feeding each one the previous block's output
        for i in range(repeat):
            out = block(out, num_outputs, kernel_sizes)

        return out

    # Applicable to 18, 34
    def residual_block_a(self, x, num_outputs, kernel_sizes=[[3,3], [3,3]], stride=1, pool=False):
        res = x
        out = x

        if pool:
            out = max_pool2d(out, kernel_size=[3,3], stride=2, padding='SAME')
            # Pre-activation shortcut: BN and ReLU first, then the 1x1 projection
            res = tf.layers.batch_normalization(res)
            res = tf.nn.relu(res)
            res = conv2d(res, num_outputs=num_outputs[-1],
                         kernel_size=[1,1], stride=[2,2], padding='SAME',
                         activation_fn=None)

        for i in range(len(kernel_sizes)):
            num_output = num_outputs[i]
            kernel_size = kernel_sizes[i]

            out = tf.layers.batch_normalization(out)
            out = tf.nn.relu(out)
            out = conv2d(out, num_outputs=num_output,
                         kernel_size=kernel_size, stride=stride, padding='SAME',
                         activation_fn=None)

        f_x = out + res
        return f_x

    # Applicable to 50, 101, 152
    def residual_block_b(self, x, num_outputs, kernel_sizes=[[1,1], [3,3], [1,1]], stride=1, pool=False):
        res = x
        out = x

        first_num_output = num_outputs[0]
        last_num_output = num_outputs[len(num_outputs)-1]

        if pool:
            out = max_pool2d(out, kernel_size=[3,3], stride=2, padding='SAME')
            res = tf.layers.batch_normalization(res)
            res = tf.nn.relu(res)
            res = conv2d(res, num_outputs=last_num_output,
                         kernel_size=[1,1], stride=[2,2], padding='SAME',
                         activation_fn=None)
        else:
            res =
tf.layers.batch_normalization(res)
            res = tf.nn.relu(res)
            res = conv2d(res, num_outputs=last_num_output,
                         kernel_size=[1,1], stride=[1,1], padding='SAME',
                         activation_fn=None)

        for i in range(len(kernel_sizes)):
            num_output = num_outputs[i]
            kernel_size = kernel_sizes[i]

            out = tf.layers.batch_normalization(out)
            out = tf.nn.relu(out)
            out = conv2d(out, num_outputs=num_output,
                         kernel_size=kernel_size, stride=stride, padding='SAME',
                         activation_fn=None)

        f_x = out + res
        return f_x

--------------------------------------------------------------------------------
/models/vgg.py:
--------------------------------------------------------------------------------
from models.imgclfmodel import ImgClfModel
from dataset.dataset import Dataset

import tensorflow as tf
from tensorflow.contrib.layers import conv2d
from tensorflow.contrib.layers import max_pool2d
from tensorflow.contrib.layers import flatten
from tensorflow.contrib.layers import fully_connected

"""
Implementation of VGGs from ILSVRC 2014. The original architecture was designed by VGG (Visual Geometry Group) @Oxford.
This one didn't win the ILSVRC 2014, but it took 2nd place. It is very popular and well known to many newcomers to deep learning.

The main technical contributions of this architecture are "3x3 filters" and a very simple architecture with greater depth.
"""
class VGG(ImgClfModel):
    def __init__(self, model_type='D'):
        ImgClfModel.__init__(self, scale_to_imagenet=True, model_type=model_type)

    """
    types
    A : 11 weight layers
    A-LRN : 11 weight layers with Local Response Normalization
    B : 13 weight layers
    C : 16 weight layers with 1x1 conv layers
    D : 16 weight layers
    E : 19 weight layers
    """
    def create_model(self, input):
        self.group1 = []
        self.group2 = []
        self.group3 = []
        self.group4 = []
        self.group5 = []

        with tf.variable_scope('group1'):
            # LAYER GROUP #1
            group_1 = conv2d(input, num_outputs=64,
                             kernel_size=[3,3], stride=1, padding='SAME',
                             activation_fn=tf.nn.relu)
            self.group1.append(group_1)

            if self.model_type == 'A-LRN':
                group_1 = tf.nn.local_response_normalization(group_1,
                                                             bias=2, alpha=0.0001, beta=0.75)
                self.group1.append(group_1)

            # Types B-E stack a second 3x3 conv here; A and A-LRN have only one
            if self.model_type != 'A' and self.model_type != 'A-LRN':
                group_1 = conv2d(group_1, num_outputs=64,
                                 kernel_size=[3,3], stride=1, padding='SAME',
                                 activation_fn=tf.nn.relu)
                self.group1.append(group_1)

            group_1 = max_pool2d(group_1, kernel_size=[2,2], stride=2)
            self.group1.append(group_1)

        with tf.variable_scope('group2'):
            # LAYER GROUP #2
            group_2 = conv2d(group_1, num_outputs=128,
                             kernel_size=[3, 3], padding='SAME',
                             activation_fn=tf.nn.relu)
            self.group2.append(group_2)

            # Types B-E stack a second 3x3 conv here; A and A-LRN have only one
            if self.model_type != 'A' and self.model_type != 'A-LRN':
                group_2 = conv2d(group_2, num_outputs=128,
                                 kernel_size=[3,3], stride=1, padding='SAME',
                                 activation_fn=tf.nn.relu)
                self.group2.append(group_2)

            group_2 = max_pool2d(group_2, kernel_size=[2,2], stride=2)
            self.group2.append(group_2)

        with tf.variable_scope('group3'):
            # LAYER GROUP #3
            group_3 = conv2d(group_2, num_outputs=256,
                             kernel_size=[3,3], stride=1, padding='SAME',
activation_fn=tf.nn.relu) 78 | self.group3.append(group_3) 79 | group_3 = conv2d(group_3, num_outputs=256, 80 | kernel_size=[3,3], stride=1, padding='SAME', 81 | activation_fn=tf.nn.relu) 82 | self.group3.append(group_3) 83 | 84 | if self.model_type == 'C': 85 | group_3 = conv2d(group_3, num_outputs=256, 86 | kernel_size=[1,1], stride=1, padding='SAME', 87 | activation_fn=tf.nn.relu) 88 | self.group3.append(group_3) 89 | 90 | if self.model_type == 'D' or self.model_type == 'E': 91 | group_3 = conv2d(group_3, num_outputs=256, 92 | kernel_size=[3,3], stride=1, padding='SAME', 93 | activation_fn=tf.nn.relu) 94 | self.group3.append(group_3) 95 | 96 | if self.model_type == 'E': 97 | group_3 = conv2d(group_3, num_outputs=256, 98 | kernel_size=[3,3], stride=1, padding='SAME', 99 | activation_fn=tf.nn.relu) 100 | self.group3.append(group_3) 101 | 102 | group_3 = max_pool2d(group_3, kernel_size=[2,2], stride=2) 103 | self.group3.append(group_3) 104 | 105 | with tf.variable_scope('group4'): 106 | # LAYER GROUP #4 107 | group_4 = conv2d(group_3, num_outputs=512, 108 | kernel_size=[3,3], stride=1, padding='SAME', 109 | activation_fn=tf.nn.relu) 110 | self.group4.append(group_4) 111 | group_4 = conv2d(group_4, num_outputs=512, 112 | kernel_size=[3,3], stride=1, padding='SAME', 113 | activation_fn=tf.nn.relu) 114 | self.group4.append(group_4) 115 | 116 | if self.model_type == 'C': 117 | group_4 = conv2d(group_4, num_outputs=512, 118 | kernel_size=[1,1], stride=1, padding='SAME', 119 | activation_fn=tf.nn.relu) 120 | self.group4.append(group_4) 121 | 122 | if self.model_type == 'D' or self.model_type == 'E': 123 | group_4 = conv2d(group_4, num_outputs=512, 124 | kernel_size=[3,3], stride=1, padding='SAME', 125 | activation_fn=tf.nn.relu) 126 | self.group4.append(group_4) 127 | 128 | if self.model_type == 'E': 129 | group_4 = conv2d(group_4, num_outputs=512, 130 | kernel_size=[3,3], stride=1, padding='SAME', 131 | activation_fn=tf.nn.relu) 132 | self.group4.append(group_4) 133 | 
134 | group_4 = max_pool2d(group_4, kernel_size=[2,2], stride=2) 135 | self.group4.append(group_4) 136 | 137 | with tf.variable_scope('group5'): 138 | # LAYER GROUP #5 139 | group_5 = conv2d(group_4, num_outputs=512, 140 | kernel_size=[3,3], stride=1, padding='SAME', 141 | activation_fn=tf.nn.relu) 142 | self.group5.append(group_5) 143 | group_5 = conv2d(group_5, num_outputs=512, 144 | kernel_size=[3,3], stride=1, padding='SAME', 145 | activation_fn=tf.nn.relu) 146 | self.group5.append(group_5) 147 | 148 | if self.model_type == 'C': 149 | group_5 = conv2d(group_5, num_outputs=512, 150 | kernel_size=[1,1], stride=1, padding='SAME', 151 | activation_fn=tf.nn.relu) 152 | self.group5.append(group_5) 153 | 154 | if self.model_type == 'D' or self.model_type == 'E': 155 | group_5 = conv2d(group_5, num_outputs=512, 156 | kernel_size=[3,3], stride=1, padding='SAME', 157 | activation_fn=tf.nn.relu) 158 | self.group5.append(group_5) 159 | 160 | if self.model_type == 'E': 161 | group_5 = conv2d(group_5, num_outputs=512, 162 | kernel_size=[3,3], stride=1, padding='SAME', 163 | activation_fn=tf.nn.relu) 164 | self.group5.append(group_5) 165 | 166 | group_5 = max_pool2d(group_5, kernel_size=[2,2], stride=2) 167 | self.group5.append(group_5) 168 | 169 | with tf.variable_scope('fcl'): 170 | # 1st FC 4096 171 | self.flat = flatten(group_5) 172 | self.fcl1 = fully_connected(self.flat, num_outputs=4096, activation_fn=tf.nn.relu) 173 | self.dr1 = tf.nn.dropout(self.fcl1, 0.5) 174 | 175 | # 2nd FC 4096 176 | self.fcl2 = fully_connected(self.dr1, num_outputs=4096, activation_fn=tf.nn.relu) 177 | self.dr2 = tf.nn.dropout(self.fcl2, 0.5) 178 | 179 | with tf.variable_scope('final'): 180 | # 3rd FC 1000 181 | self.out = fully_connected(self.dr2, num_outputs=self.num_classes, activation_fn=None) 182 | 183 | return [self.out] -------------------------------------------------------------------------------- /overview.png: 
-------------------------------------------------------------------------------- https://raw.githubusercontent.com/deep-diver/DeepModels/687fbe143d4385f84677460f7d2fcf6fc92e87fc/overview.png -------------------------------------------------------------------------------- /requirements.txt: -------------------------------------------------------------------------------- 1 | numpy>=1.14.5 2 | requests>=2.19.1 3 | scikit-image>=0.12.3 4 | tensorflow>=1.6.0rc0 5 | tqdm>=4.11.2 6 | urllib3>=1.23 7 | 8 | -------------------------------------------------------------------------------- /test.py: -------------------------------------------------------------------------------- 1 | import argparse 2 | import sys 3 | 4 | from dataset.cifar10_dataset import Cifar10 5 | from dataset.cifar100_dataset import Cifar100 6 | from dataset.mnist_dataset import Mnist 7 | 8 | from models.alexnet import AlexNet 9 | from models.vgg import VGG 10 | from models.googlenet import GoogLeNet 11 | from models.resnet import ResNet 12 | from models.inception_v2 import InceptionV2 13 | from models.inception_v3 import InceptionV3 14 | from models.densenet import DenseNet 15 | from trainers.clftrainer import ClfTrainer 16 | 17 | learning_rate = 0.0000001 18 | epochs = 1 19 | batch_size = 2 20 | 21 | import warnings 22 | 23 | def main(): 24 | dataset = Cifar10() 25 | # dataset = Cifar100() 26 | # dataset = Mnist() 27 | 28 | # model = AlexNet() 29 | # model = VGG() 30 | # model = GoogLeNet() 31 | # model = ResNet(model_type="101") 32 | # model = InceptionV3() 33 | model = DenseNet(model_type="201") 34 | 35 | # training 36 | trainer = ClfTrainer(model, dataset) 37 | trainer.run_training(epochs, batch_size, learning_rate, './cifar10-ckpt') 38 | # trainer.resume_training_from_ckpt(epochs, batch_size, learning_rate, './resnet101-cifar10-new-ckpt-3', './resnet101-cifar10-ckpt') 39 | # trainer.run_training(epochs, batch_size, learning_rate, './test-ckpt', options={'model_type': 'A' }) 40 | # 
trainer.resume_training_from_ckpt(epochs, batch_size, learning_rate, './inceptionv3-cifar10-ckpt-5', './inceptionv3-cifar10-new-ckpt') 41 | # trainer.resume_training_from_ckpt(epochs, batch_size, learning_rate, './resume-test-ckpt-1', './resume-test-ckpt') 42 | 43 | # resuming training 44 | # trainer.resume_training_from_ckpt(epochs, batch_size, learning_rate, './test-ckpt', './new-test-ckpt') 45 | #trainer.resume_training_from_ckpt(epochs, batch_size, learning_rate, './test-ckpt', './new-test-ckpt', options={'model_type': ... }) 46 | 47 | # transfer learning 48 | # new_dataset = Cifar100() 49 | # trainer = ClfTrainer(model, new_dataset) 50 | # trainer.run_transfer_learning(epochs, batch_size, learning_rate, './new-test-ckpt-1', './test-transfer-learning-ckpt') 51 | # trainer.run_transfer_learning(epochs, batch_size, learning_rate, './new-test-ckpt-1', './test-transfer-learning-ckpt', options={'model_type': ... }) 52 | 53 | # testing 54 | # images = ... 55 | # testing_result = trainer.run_testing(images, './test-transfer-learning-ckpt-1') 56 | # testing_result = trainer.run_testing(images, './test-transfer-learning-ckpt-1', options={'model_type': ...}) 57 | 58 | if __name__ == "__main__": 59 | main() 60 | -------------------------------------------------------------------------------- /trainers/clftrainer.py: -------------------------------------------------------------------------------- 1 | import time 2 | 3 | import tensorflow as tf 4 | 5 | from models.alexnet import AlexNet 6 | from models.vgg import VGG 7 | from models.googlenet import GoogLeNet 8 | from models.resnet import ResNet 9 | from models.inception_v2 import InceptionV2 10 | from models.inception_v3 import InceptionV3 11 | from trainers.predefined_loss import * 12 | 13 | class ClfTrainer: 14 | def __init__(self, clf_model, clf_dataset): 15 | self.clf_model = clf_model 16 | self.clf_dataset = clf_dataset 17 | 18 | def __run_train__(self, sess, input, output, 19 | batch_i, batch_size, 20 | cost_func, 
train_op, 21 | scale_to_imagenet=False): 22 | 23 | total_loss = 0 24 | count = 0 25 | 26 | for batch_features, batch_labels in self.clf_dataset.get_training_batches_from_preprocessed(batch_i, batch_size, scale_to_imagenet): 27 | loss, _ = sess.run([cost_func, train_op], 28 | feed_dict={input: batch_features, 29 | output: batch_labels}) 30 | total_loss = total_loss + loss 31 | count = count + 1 32 | 33 | return total_loss/count 34 | 35 | def __run_accuracy_in_valid_set__(self, sess, input, output, accuracy, batch_size, scale_to_imagenet=False): 36 | valid_features, valid_labels = self.clf_dataset.get_valid_set(scale_to_imagenet) 37 | 38 | valid_acc = 0 39 | for batch_valid_features, batch_valid_labels in self.clf_dataset.get_batches_from(valid_features, valid_labels, batch_size): 40 | valid_acc += sess.run(accuracy, 41 | feed_dict={input:batch_valid_features, 42 | output:batch_valid_labels}) 43 | 44 | tmp_num = valid_features.shape[0]/batch_size 45 | return valid_acc/tmp_num 46 | 47 | def __train__(self, input, output, 48 | cost_func, train_op, accuracy, 49 | epochs, batch_size, save_model_path, 50 | save_every_epoch=1): 51 | with tf.Session() as sess: 52 | sess.run(tf.global_variables_initializer()) 53 | 54 | print('starting training ... ') 55 | for epoch in range(epochs): 56 | n_batches = self.clf_dataset.num_batch 57 | 58 | for batch_i in range(1, n_batches + 1): 59 | loss = self.__run_train__(sess, 60 | input, output, 61 | batch_i, batch_size, 62 | cost_func, train_op, 63 | self.clf_model.scale_to_imagenet) 64 | print('Epoch {:>2}, {} Batch {}: '.format(epoch + 1, self.clf_dataset.name, batch_i), end='') 65 | print('Avg. 
Loss: {} '.format(loss), end='') 66 | 67 | valid_acc = self.__run_accuracy_in_valid_set__(sess, 68 | input, output, 69 | accuracy, batch_size, 70 | self.clf_model.scale_to_imagenet) 71 | print('Validation Accuracy {:.6f}'.format(valid_acc)) 72 | 73 | if epoch % save_every_epoch == 0: 74 | print('epoch: {} is saved...'.format(epoch+1)) 75 | saver = tf.train.Saver() 76 | saver.save(sess, save_model_path, global_step=epoch+1, write_meta_graph=False) 77 | 78 | def __get_simple_losses_and_accuracy__(self, out_layers, output, learning_rate, options=None): 79 | is_loss_weights_considered = False 80 | label_smoothings = [0 for i in range(len(out_layers))] 81 | 82 | if options is not None: 83 | if 'loss_weights' in options and \ 84 | len(options['loss_weights']) is len(out_layers): 85 | is_loss_weights_considered = True 86 | 87 | if 'label_smoothings' in options and \ 88 | len(options['label_smoothings']) is len(out_layers): 89 | label_smoothings = options['label_smoothings'] 90 | 91 | aux_cost_sum = 0 92 | if is_loss_weights_considered: 93 | for i in range(len(out_layers) - 1): 94 | aux_out_layer = out_layers[i] 95 | aux_label_smoothing = label_smoothings[i] 96 | aux_cost = tf.losses.softmax_cross_entropy(output, aux_out_layer, label_smoothing=aux_label_smoothing, reduction=tf.losses.Reduction.MEAN) 97 | aux_cost_sum += aux_cost * options['loss_weights'][i] 98 | 99 | final_out_layer = out_layers[len(out_layers)-1] 100 | final_label_smoothing = label_smoothings[len(out_layers)-1] 101 | cost = tf.losses.softmax_cross_entropy(output, final_out_layer, label_smoothing=final_label_smoothing, reduction=tf.losses.Reduction.MEAN) 102 | 103 | if is_loss_weights_considered: 104 | cost = cost * options['loss_weights'][len(out_layers)-1] 105 | 106 | optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate) 107 | gradients = optimizer.compute_gradients(cost+aux_cost_sum) 108 | train_op = optimizer.apply_gradients(gradients) 109 | 110 | correct_pred = 
tf.equal(tf.argmax(final_out_layer, 1), tf.argmax(output, 1))
        accuracy = tf.reduce_mean(tf.cast(correct_pred, tf.float32))

        return cost, train_op, accuracy

    def __get_losses_and_accuracy__(self, model, output, out_layers, learning_rate, options=None):
        # Use the optimizer from the paper only when options explicitly asks for it
        optimizer_from_paper_flag = True

        if options is None or options.get('optimizer_from_paper', False) is False:
            optimizer_from_paper_flag = False

        if isinstance(model, AlexNet):
            return get_alexnet_trainer(output, out_layers, learning_rate) if optimizer_from_paper_flag else \
                   self.__get_simple_losses_and_accuracy__(out_layers, output, learning_rate, None)
        elif isinstance(model, VGG):
            return get_vgg_trainer(output, out_layers, learning_rate) if optimizer_from_paper_flag else \
                   self.__get_simple_losses_and_accuracy__(out_layers, output, learning_rate, None)
        elif isinstance(model, GoogLeNet):
            return get_googlenet_trainer(output, out_layers, learning_rate) if optimizer_from_paper_flag else \
                   self.__get_simple_losses_and_accuracy__(out_layers, output, learning_rate, {'loss_weights': [0.3, 0.3, 1.0]})
        elif isinstance(model, ResNet):
            return get_resnet_trainer(output, out_layers, learning_rate) if optimizer_from_paper_flag else \
                   self.__get_simple_losses_and_accuracy__(out_layers, output, learning_rate, None)
        elif isinstance(model, InceptionV2):
            return get_inceptionv2_trainer(output, out_layers, learning_rate) if optimizer_from_paper_flag else \
                   self.__get_simple_losses_and_accuracy__(out_layers, output, learning_rate, {'loss_weights': [0.4, 1.0]})
        elif isinstance(model, InceptionV3):
            return get_inceptionv3_trainer(output, out_layers, learning_rate) if optimizer_from_paper_flag else \
                   self.__get_simple_losses_and_accuracy__(out_layers, output, learning_rate, {'loss_weights': [0.4, 1.0], 'label_smoothings': [0.1, 0.1]})
        else:
            return
self.__get_simple_losses_and_accuracy__(out_layers, output, learning_rate, options) 141 | 142 | # default to use AdamOptimizer w/ softmax_cross_entropy_with_logits_v2 143 | def run_training(self, 144 | epochs, batch_size, learning_rate, 145 | save_model_to, save_every_epoch=1, 146 | options=None): 147 | input, output = self.clf_model.set_dataset(self.clf_dataset) 148 | out_layers = self.clf_model.create_model(input) 149 | 150 | cost, train_op, accuracy = self.__get_losses_and_accuracy__(self.clf_model, output, out_layers, learning_rate) 151 | 152 | self.__train__(input, output, 153 | cost, train_op, accuracy, 154 | epochs, batch_size, 155 | save_model_to, save_every_epoch) 156 | 157 | def resume_training_from_ckpt(self, epochs, batch_size, learning_rate, save_model_from, save_model_to, save_every_epoch=1, options=None): 158 | graph = tf.Graph() 159 | with graph.as_default(): 160 | input, output = self.clf_model.set_dataset(self.clf_dataset) 161 | out_layers = self.clf_model.create_model(input) 162 | 163 | cost, train_op, accuracy = self.__get_losses_and_accuracy__(self.clf_model, output, out_layers, learning_rate) 164 | 165 | with tf.Session(graph=graph) as sess: 166 | sess.run(tf.global_variables_initializer()) 167 | 168 | saver = tf.train.Saver(tf.trainable_variables()) 169 | saver.restore(sess, save_model_from) 170 | 171 | print('starting training ... ') 172 | for epoch in range(epochs): 173 | n_batches = self.clf_dataset.num_batch 174 | 175 | for batch_i in range(1, n_batches + 1): 176 | loss = self.__run_train__(sess, 177 | input, output, 178 | batch_i, batch_size, 179 | cost, train_op, 180 | self.clf_model.scale_to_imagenet) 181 | print('Epoch {:>2}, {} Batch {}: '.format(epoch + 1, self.clf_dataset.name, batch_i), end='') 182 | print('Avg. 
Loss: {} '.format(loss), end='')

                # Evaluate on the validation set once per epoch.
                valid_acc = self.__run_accuracy_in_valid_set__(sess,
                                                               input, output,
                                                               accuracy, batch_size,
                                                               self.clf_model.scale_to_imagenet)
                print('Validation Accuracy {:.6f}'.format(valid_acc))

                if epoch % save_every_epoch == 0:
                    print('epoch: {} is saved...'.format(epoch+1))
                    saver1 = tf.train.Saver()
                    saver1.save(sess, save_model_to, global_step=epoch+1, write_meta_graph=False)

    def run_transfer_learning(self,
                              epochs, batch_size, learning_rate,
                              save_model_from, save_model_to, save_every_epoch=1, options=None):
        graph = tf.Graph()
        with graph.as_default():
            input, output = self.clf_model.set_dataset(self.clf_dataset)
            out_layers = self.clf_model.create_model(input)

            cost, train_op, accuracy = self.__get_losses_and_accuracy__(self.clf_model, output, out_layers, learning_rate)

        with tf.Session(graph=graph) as sess:
            sess.run(tf.global_variables_initializer())

            # Restore only the model variables whose names do not contain 'final';
            # the remaining variables keep their fresh initialization.
            var_list = []
            for var in tf.model_variables():
                if 'final' not in var.name:
                    var_list.append(var)

            saver = tf.train.Saver(var_list)
            saver.restore(sess, save_model_from)

            print('starting training ... ')
            for epoch in range(epochs):
                n_batches = self.clf_dataset.num_batch

                for batch_i in range(1, n_batches + 1):
                    loss = self.__run_train__(sess,
                                              input, output,
                                              batch_i, batch_size,
                                              cost, train_op,
                                              self.clf_model.scale_to_imagenet)
                    print('Epoch {:>2}, {} Batch {}: '.format(epoch + 1, self.clf_dataset.name, batch_i), end='')
                    print('Avg. Loss: {} '.format(loss), end='')

                # Evaluate on the validation set once per epoch.
                valid_acc = self.__run_accuracy_in_valid_set__(sess,
                                                               input, output,
                                                               accuracy, batch_size,
                                                               self.clf_model.scale_to_imagenet)
                print('Validation Accuracy {:.6f}'.format(valid_acc))

                if epoch % save_every_epoch == 0:
                    print('epoch: {} is saved...'.format(epoch+1))
                    # A saver without a variable list checkpoints all variables.
                    saver2 = tf.train.Saver()
                    saver2.save(sess, save_model_to, global_step=epoch+1, write_meta_graph=False)

    def run_testing(self,
                    data, save_model_from, options=None):
        graph = tf.Graph()
        with graph.as_default():
            input, _ = self.clf_model.set_dataset(self.clf_dataset)
            out_layers = self.clf_model.create_model(input)

            # The last entry of out_layers holds the logits of the main classifier.
            final_out_layer = out_layers[-1]
            softmax_result = tf.nn.softmax(final_out_layer)

        with tf.Session(graph=graph) as sess:
            sess.run(tf.global_variables_initializer())

            saver = tf.train.Saver(tf.trainable_variables())
            saver.restore(sess, save_model_from)

            results = sess.run(softmax_result,
                               feed_dict={input:data})

        return results

--------------------------------------------------------------------------------
/trainers/predefined_loss.py:
--------------------------------------------------------------------------------
import tensorflow as tf

# Every trainer takes the one-hot labels (output), the list of logits tensors
# produced by the model (out_layers, main classifier last), and the learning rate.
# Each returns the main-classifier cost, the training op, and the accuracy tensor.

def get_alexnet_trainer(output, out_layers, learning_rate):
    final_out_layer = out_layers[-1]
    cost = tf.losses.softmax_cross_entropy(output, final_out_layer, reduction=tf.losses.Reduction.MEAN)

    optimizer = tf.train.MomentumOptimizer(learning_rate=learning_rate, momentum=0.9)
    gradients = optimizer.compute_gradients(cost)
    train_op = optimizer.apply_gradients(gradients)

    correct_pred = tf.equal(tf.argmax(final_out_layer, 1), tf.argmax(output, 1))
    accuracy = tf.reduce_mean(tf.cast(correct_pred, tf.float32))

    return cost, train_op, accuracy

def get_vgg_trainer(output, out_layers, learning_rate):
    final_out_layer = out_layers[-1]
    cost = tf.losses.softmax_cross_entropy(output, final_out_layer, reduction=tf.losses.Reduction.MEAN)

    optimizer = tf.train.MomentumOptimizer(learning_rate=learning_rate, momentum=0.9)
    gradients = optimizer.compute_gradients(cost)
    train_op = optimizer.apply_gradients(gradients)

    correct_pred = tf.equal(tf.argmax(final_out_layer, 1), tf.argmax(output, 1))
    accuracy = tf.reduce_mean(tf.cast(correct_pred, tf.float32))

    return cost, train_op, accuracy

def get_googlenet_trainer(output, out_layers, learning_rate):
    # All entries except the last are auxiliary classifiers; each contributes
    # its cross-entropy loss with a weight of 0.3.
    aux_cost_sum = 0
    for i in range(len(out_layers) - 1):
        aux_out_layer = out_layers[i]
        aux_cost = tf.losses.softmax_cross_entropy(output, aux_out_layer, reduction=tf.losses.Reduction.MEAN)
        aux_cost_sum += aux_cost * 0.3

    final_out_layer = out_layers[-1]
    cost = tf.losses.softmax_cross_entropy(output, final_out_layer, reduction=tf.losses.Reduction.MEAN)

    # Gradients are taken over the main plus auxiliary losses,
    # while the returned cost is the main loss only.
    optimizer = tf.train.MomentumOptimizer(learning_rate=learning_rate, momentum=0.9)
    gradients = optimizer.compute_gradients(cost+aux_cost_sum)
    train_op = optimizer.apply_gradients(gradients)

    correct_pred = tf.equal(tf.argmax(final_out_layer, 1), tf.argmax(output, 1))
    accuracy = tf.reduce_mean(tf.cast(correct_pred, tf.float32))

    return cost, train_op, accuracy

def get_resnet_trainer(output, out_layers, learning_rate):
    final_out_layer = out_layers[-1]
    cost = tf.losses.softmax_cross_entropy(output, final_out_layer, reduction=tf.losses.Reduction.MEAN)

    optimizer = tf.train.MomentumOptimizer(learning_rate=learning_rate, momentum=0.9)
    gradients = optimizer.compute_gradients(cost)
    train_op = optimizer.apply_gradients(gradients)

    correct_pred = tf.equal(tf.argmax(final_out_layer, 1), tf.argmax(output, 1))
    accuracy = tf.reduce_mean(tf.cast(correct_pred, tf.float32))

    return cost, train_op, accuracy

def get_inceptionv2_trainer(output, out_layers, learning_rate):
    return get_googlenet_trainer(output, out_layers, learning_rate)

def get_inceptionv3_trainer(output, out_layers, learning_rate):
    # Same weighted auxiliary losses as GoogLeNet, with label smoothing of 0.1
    # and RMSProp instead of plain momentum.
    aux_cost_sum = 0
    for i in range(len(out_layers) - 1):
        aux_out_layer = out_layers[i]
        aux_cost = tf.losses.softmax_cross_entropy(output, aux_out_layer, label_smoothing=0.1, reduction=tf.losses.Reduction.MEAN)
        aux_cost_sum += aux_cost * 0.3

    final_out_layer = out_layers[-1]
    cost = tf.losses.softmax_cross_entropy(output, final_out_layer, label_smoothing=0.1, reduction=tf.losses.Reduction.MEAN)

    optimizer = tf.train.RMSPropOptimizer(learning_rate=learning_rate, momentum=0.9)
    gradients = optimizer.compute_gradients(cost+aux_cost_sum)
    train_op = optimizer.apply_gradients(gradients)

    correct_pred = tf.equal(tf.argmax(final_out_layer, 1), tf.argmax(output, 1))
    accuracy = tf.reduce_mean(tf.cast(correct_pred, tf.float32))

    return cost, train_op, accuracy

--------------------------------------------------------------------------------
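The GoogLeNet and Inception v3 trainers in `predefined_loss.py` optimize the main cross-entropy loss plus each auxiliary classifier's loss weighted by 0.3, and the Inception v3 variant also applies label smoothing of 0.1. The arithmetic can be sketched without TensorFlow as below. This is only an illustrative stand-in, not the repository's code: `softmax_cross_entropy` and `combined_loss` are hypothetical pure-Python helpers, the batch mean is assumed to match `Reduction.MEAN` with unit weights, and the smoothing step follows the formula documented for `tf.losses.softmax_cross_entropy`.

```python
import math

def softmax_cross_entropy(onehot_labels, logits, label_smoothing=0.0):
    """Mean softmax cross-entropy over a batch (hypothetical helper, plain lists)."""
    total = 0.0
    for labels, row in zip(onehot_labels, logits):
        num_classes = len(row)
        # Label smoothing: y * (1 - s) + s / num_classes.
        smoothed = [y * (1.0 - label_smoothing) + label_smoothing / num_classes
                    for y in labels]
        # Numerically stable log-sum-exp of the logits.
        m = max(row)
        log_z = m + math.log(sum(math.exp(v - m) for v in row))
        total += -sum(y * (v - log_z) for y, v in zip(smoothed, row))
    return total / len(logits)

def combined_loss(onehot_labels, out_layers, aux_weight=0.3, label_smoothing=0.0):
    """Return (main_loss, training_loss); the last entry of out_layers is the main head."""
    *aux_layers, final_layer = out_layers
    main = softmax_cross_entropy(onehot_labels, final_layer, label_smoothing)
    aux = sum(aux_weight * softmax_cross_entropy(onehot_labels, layer, label_smoothing)
              for layer in aux_layers)
    return main, main + aux

# One sample, two classes, one auxiliary head and one main head.
labels = [[1.0, 0.0]]
heads = [[[0.0, 0.0]], [[2.0, 0.0]]]
main_loss, training_loss = combined_loss(labels, heads, label_smoothing=0.1)
```

As in the trainers, the first value corresponds to the reported `cost` (main head only), while the second is the quantity whose gradients would drive the update.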