├── README.md ├── cbof_paper ├── README.md ├── mnist_demo.py └── model │ ├── __init__.py │ ├── base_learner.py │ ├── cbof.py │ ├── cnn.py │ ├── cnn_feat.py │ ├── datasets.py │ └── nbof.py ├── datasets ├── __init__.py └── mnist.py ├── mnist_example.py └── models ├── __init__.py ├── bof.py └── learner_base.py /README.md: -------------------------------------------------------------------------------- 1 | # Bag-of-Features Pooling for Deep Convolutional Neural Networks 2 | 3 | **IMPORTANT: Given the uncertain future of theano, we also provide a [*keras*-based implementation](https://github.com/passalis/keras_cbof) of the proposed method.** 4 | 5 | In this repository we provide an efficient and simple re-implementation of the [Bag-of-Features Pooling method for Deep Convolutional Neural Networks](https://arxiv.org/abs/1707.08105) using the Lasagne framework. The provided Lasagne layer can be used in any Lasagne-based model. The distance between the extracted feature vectors and the codebook is calculated using convolutional layers (exploiting that the squared distance can be calculated using three inner products, i.e., ||x-y||^2 = x^2 + y^2 - 2xy), significantly speeding up training and testing (a short numerical sketch of this identity is included below, after the citation). 6 | 7 | We provide an example of using the proposed method in mnist_example.py and we compare BoF pooling to plain SPP pooling. The proposed method can both increase the classification performance and provide better scale invariance, as shown below (the classification error on the MNIST dataset is reported): 8 | 9 | 10 | | Model | Scale = 1 | Scale = 0.8 | Scale = 0.7 | 11 | | ------------- | --------- | --------- | --------- | 12 | | SPP | 0.68 % | 4.08 % | 36.78 % | 13 | | BoF Pooling | **0.54 %** | **1.40 %** | **17.60 %** | 14 | 15 | Note that this is not the implementation used for conducting the experiments in our [paper](https://arxiv.org/abs/1707.08105). The original (slower, but more flexible) implementation can be found in [cbof_paper](cbof_paper). 16 | 17 | If you use this code in your work, please cite the following paper: 18 | 19 |
20 | @InProceedings{cbof_iccv,
21 | author = {Passalis, Nikolaos and Tefas, Anastasios},
22 | title = {Bag-of-Features Pooling for Deep Convolutional Neural Networks},
23 | booktitle = {Proceedings of the IEEE International Conference on Computer Vision},
24 | year = {2017}
25 | }
26 | 
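For clarity, here is a minimal NumPy sketch (not part of the repository; names and shapes are illustrative) of the identity mentioned above: the squared Euclidean distance between a feature vector and a codeword expands into three inner products, so all feature-to-codeword distances can be obtained from a single matrix product (implemented with a 1x1 convolution in the actual layer).

```python
import numpy as np

rng = np.random.RandomState(0)
X = rng.rand(100, 64)   # feature vectors extracted from a convolutional feature map
V = rng.rand(16, 64)    # codebook with 16 codewords

# Direct computation of all pairwise Euclidean distances
direct = np.sqrt(((X[:, None, :] - V[None, :, :]) ** 2).sum(-1))

# Same result via three inner products: ||x-y||^2 = <x,x> + <y,y> - 2<x,y>
xx = (X ** 2).sum(axis=1)[:, None]   # (100, 1)
vv = (V ** 2).sum(axis=1)[None, :]   # (1, 16)
xv = X @ V.T                         # (100, 16), computed as a 1x1 convolution in the layer
fast = np.sqrt(np.maximum(xx + vv - 2 * xv, 0))

print(np.allclose(direct, fast))     # True
```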
27 | 28 | 29 | ### Acknowledgment 30 | This project has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No 731667 (MULTIDRONE). This publication reflects the authors’ views only. The European Commission is not responsible for any use that may be made of the information it contains. 31 | -------------------------------------------------------------------------------- /cbof_paper/README.md: -------------------------------------------------------------------------------- 1 | # Bag-of-Features Pooling for Deep Convolutional Neural Networks 2 | 3 | This implementation is based on the one used for conducting the experiments in the [Bag-of-Features Pooling method for Deep Convolutional Neural Networks](https://arxiv.org/abs/1707.08105) paper. It is slower than the *lasagne*-based implementation that we provide in the [main repository](). However, it is also more flexible, e.g., it allows for using separate codebooks for each spatial region (a small sketch of this idea is given at the end of this README). 4 | 5 | Note that the obtained results might slightly vary due to the non-deterministic behaviour of the libraries (CUDA) used for the GPU calculations and the clustering algorithm used for the initialization of the codebook. For the results reported here, we explicitly avoided using non-deterministic algorithms during the optimization. To do so, you can add the following to the *.theanorc* configuration file: 6 | 7 |
 8 | [dnn.conv]
 9 | algo_bwd_filter=deterministic
10 | algo_bwd_data=deterministic
11 | 
12 | 13 | After using this configuration and fixing the seeds, the following results should be obtained (classification error on MNIST for 28 x 28 and 20 x 20 test images): 14 | 15 | 16 | | Model | 28 x 28 | 20 x 20 | 17 | | ------------- | --------- | --------- | 18 | | CNN | 0.56 % | - | 19 | | GMP | 0.78 % | 3.31 % | 20 | | SPP | 0.55 % | 1.49 % | 21 | | CBoF (64, 1) | **0.47 %** | **0.99 %** | 22 | 23 | 24 | If you use this code in your work, please cite the following paper: 25 | 26 |
27 | @InProceedings{cbof_iccv,
28 | author = {Passalis, Nikolaos and Tefas, Anastasios},
29 | title = {Bag-of-Features Pooling for Deep Convolutional Neural Networks},
30 | booktitle = {Proceedings of the IEEE International Conference on Computer Vision},
31 | year = {2017}
32 | }
33 | 
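The following is a minimal NumPy sketch (not part of this implementation; shapes, names and parameter values are illustrative) of the idea of using a separate codebook per spatial region: the feature map is split into a 2 x 2 grid, each region is soft-quantized against its own codebook, and the resulting histograms are concatenated before being fed to the classifier.

```python
import numpy as np

def soft_histogram(features, codebook, sigma):
    """Soft-quantize feature vectors (n_feats, d) against a codebook (k, d)."""
    dist = np.sqrt(((features[:, None, :] - codebook[None, :, :]) ** 2).sum(-1))  # (n_feats, k)
    sim = np.exp(-dist * sigma)                         # RBF-like memberships
    membership = sim / sim.sum(axis=1, keepdims=True)   # softmax over the codewords
    return membership.mean(axis=0)                      # (k,) histogram of the region

rng = np.random.RandomState(0)
feature_map = rng.rand(8, 8, 64)                        # (x, y, n_filters) output of a conv layer
codebooks = [rng.rand(16, 64) for _ in range(4)]        # one 16-codeword codebook per region

pivot = feature_map.shape[0] // 2
regions = [feature_map[:pivot, :pivot], feature_map[:pivot, pivot:],
           feature_map[pivot:, :pivot], feature_map[pivot:, pivot:]]
histograms = [soft_histogram(r.reshape(-1, 64), V, sigma=10.0)
              for r, V in zip(regions, codebooks)]
representation = np.concatenate(histograms)             # fed to the MLP classifier
print(representation.shape)                             # (64,) = 4 regions x 16 codewords
```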
34 | 35 | -------------------------------------------------------------------------------- /cbof_paper/mnist_demo.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | import sklearn.utils 3 | 4 | from model.cbof import CBoF 5 | from model.cnn import CNN_Simple 6 | from model.datasets import load_mnist, resize_mnist_data 7 | 8 | 9 | # Set the path to mnist.pkl.gz before running the code 10 | # Download mnist.pkl.gz from http://www.iro.umontreal.ca/~lisa/deep/data/mnist/mnist.pkl.gz 11 | 12 | def run_demo_mnist(model='cbof', n_iters=50, seed=1, ): 13 | 14 | # Load mnist data 15 | train_data, valid_data, test_data, train_labels, valid_labels, test_labels = load_mnist( 16 | dataset='/home/nick/Data/Datasets/mnist.pkl.gz') 17 | 18 | if model != 'plain': 19 | train_data_20 = resize_mnist_data(train_data, 20, 20) 20 | train_data_24 = resize_mnist_data(train_data, 24, 24) 21 | train_data_32 = resize_mnist_data(train_data, 32, 32) 22 | train_data_36 = resize_mnist_data(train_data, 36, 36) 23 | test_data_20 = resize_mnist_data(test_data, 20, 20) 24 | 25 | # Set seeds for reproducibility 26 | sklearn.utils.check_random_state(seed) 27 | np.random.seed(seed) 28 | 29 | eta = 0.0001 30 | if model == 'cbof': 31 | cnn = CBoF(learning_rate=eta, n_classes=10, bof_layer=(1, True, 64), hidden_neurons=(1000,)) 32 | cnn.init_bof(train_data[:50000, :]) 33 | elif model == 'spp': 34 | cnn = CNN_Simple(learning_rate=eta, hidden_neurons=(1000,), n_classes=10, use_spatial_pooling=True, 35 | pool_dims=[1, 2]) 36 | elif model == 'gmp': 37 | cnn = CNN_Simple(learning_rate=eta, hidden_neurons=(1000,), n_classes=10, use_spatial_pooling=True, 38 | pool_dims=[1]) 39 | elif model == 'plain': 40 | cnn = CNN_Simple(learning_rate=eta, hidden_neurons=(1000,), n_classes=10, use_spatial_pooling=False) 41 | 42 | best_valid, test_acc, best_iter = 0, 0, 0 43 | 44 | for i in range(n_iters): 45 | 46 | if model != 'plain': 47 | cnn.train_model(train_data_20, train_labels, batch_size=64) 48 | cnn.train_model(train_data_24, train_labels, batch_size=64) 49 | cnn.train_model(train_data_32, train_labels, batch_size=64) 50 | cnn.train_model(train_data_36, train_labels, batch_size=64) 51 | loss = cnn.train_model(train_data, train_labels, batch_size=64) 52 | print("Iter: ", i, ", loss: ", loss) 53 | 54 | # Get validation accuracy 55 | valid_acc = cnn.test_model(valid_data, valid_labels) 56 | if valid_acc > best_valid: 57 | best_valid = valid_acc 58 | best_iter = i 59 | # Test the model! 
60 | test_acc = cnn.test_model(test_data, test_labels) 61 | test_acc_20 = 0 62 | if model != 'plain': 63 | test_acc_20 = cnn.test_model(test_data_20, test_labels) 64 | print("New validation best found, valid acc = ", valid_acc, " iter = ", i) 65 | print(test_acc, test_acc_20) 66 | 67 | print("Evaluated model = ", model) 68 | print("Best err = ", 100 - test_acc, "% found @ iter = ", best_iter) 69 | if model != 'plain': 70 | print("Err (20x20): ", 100 - test_acc_20) 71 | 72 | 73 | run_demo_mnist(model='plain') 74 | run_demo_mnist(model='gmp') 75 | run_demo_mnist(model='spp') 76 | run_demo_mnist(model='cbof') 77 | 78 | -------------------------------------------------------------------------------- /cbof_paper/model/__init__.py: -------------------------------------------------------------------------------- 1 | 2 | -------------------------------------------------------------------------------- /cbof_paper/model/base_learner.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | from tqdm import tqdm 3 | 4 | 5 | class Base_Learner: 6 | def __init__(self): 7 | self.train_fn = None 8 | self.train_mlp_fn = None 9 | self.test_fn = None 10 | 11 | def train_model(self, train_data, train_labels, batch_size=32, pre_train=False): 12 | """ 13 | Trains the model 14 | :param train_data: 15 | :param train_labels: 16 | :param batch_size: 17 | :param pre_train: 18 | :return: 19 | """ 20 | n_batches = int(np.floor(train_data.shape[0] / batch_size)) 21 | loss = 0 22 | for i in tqdm(range(n_batches)): 23 | cur_data = train_data[i * batch_size:(i + 1) * batch_size, :] 24 | cur_labels = train_labels[i * batch_size:(i + 1) * batch_size] 25 | if pre_train: 26 | cur_loss = self.train_mlp_fn(np.float32(cur_data), np.int32(cur_labels)) 27 | else: 28 | cur_loss = self.train_fn(np.float32(cur_data), np.int32(cur_labels)) 29 | loss += cur_loss * batch_size 30 | 31 | if n_batches * batch_size < train_data.shape[0]: 32 | cur_data = train_data[n_batches * batch_size:, :] 33 | cur_labels = train_labels[n_batches * batch_size:] 34 | if pre_train: 35 | cur_loss = self.train_mlp_fn(np.float32(cur_data), np.int32(cur_labels)) 36 | else: 37 | cur_loss = self.train_fn(np.float32(cur_data), np.int32(cur_labels)) 38 | loss += cur_loss * train_data.shape[0] 39 | loss = loss / float(train_data.shape[0]) 40 | return loss 41 | 42 | def test_model(self, test_data, test_labels, batch_size=32): 43 | """ 44 | Predicts the labels and returns the accuracy and the precision 45 | :param test_data: 46 | :param test_labels: 47 | :param batch_size: 48 | :return: 49 | """ 50 | labels = np.zeros((0,)) 51 | n_batches = int(np.floor(test_data.shape[0] / batch_size)) 52 | 53 | for i in range(n_batches): 54 | cur_data = test_data[i * batch_size:(i + 1) * batch_size, :] 55 | labels = np.hstack((labels, self.test_fn(np.float32(cur_data)))) 56 | 57 | if n_batches * batch_size < test_data.shape[0]: 58 | cur_data = test_data[n_batches * batch_size:, :] 59 | labels = np.hstack((labels, self.test_fn(np.float32(cur_data)))) 60 | 61 | return 100 * np.mean(test_labels == labels) 62 | -------------------------------------------------------------------------------- /cbof_paper/model/cbof.py: -------------------------------------------------------------------------------- 1 | import lasagne 2 | import theano 3 | import theano.tensor as T 4 | from model.nbof import CBoF_Input_Layer 5 | from model.base_learner import Base_Learner 6 | from model.cnn_feat import CNN_Feature_Extractor 7 | 8 | 9 | class 
CBoF(Base_Learner): 10 | def __init__(self, n_classes=10, learning_rate=0.00001, bof_layer=(4, 0, 128), hidden_neurons=(1000,), 11 | dropout=(0.5,), feature_dropout=0, g=0.1): 12 | 13 | Base_Learner.__init__(self) 14 | 15 | input_var = T.ftensor4('inputs') 16 | target_var = T.ivector('targets') 17 | 18 | # Create the CNN feature extractor 19 | self.cnn_layer = CNN_Feature_Extractor(input_var, size=None) 20 | 21 | # Create the BoF layer 22 | (cnn_layer_id, spatial_level, n_codewords) = bof_layer 23 | self.bof_layer = CBoF_Input_Layer(input_var, self.cnn_layer, cnn_layer_id, level=spatial_level, 24 | n_codewords=n_codewords, g=g, pyramid=False) 25 | features = self.bof_layer.fused_features 26 | n_size_features = self.bof_layer.features_size 27 | 28 | # Create an output MLP 29 | network = lasagne.layers.InputLayer(shape=(None, n_size_features), input_var=features) 30 | if feature_dropout > 0: 31 | network = lasagne.layers.DropoutLayer(network, p=feature_dropout) 32 | for n, drop_rate in zip(hidden_neurons, dropout): 33 | network = lasagne.layers.DenseLayer(network, num_units=n, nonlinearity=lasagne.nonlinearities.elu, 34 | W=lasagne.init.Orthogonal()) 35 | network = lasagne.layers.DropoutLayer(network, p=drop_rate) 36 | 37 | network = lasagne.layers.DenseLayer(network, num_units=n_classes, 38 | nonlinearity=lasagne.nonlinearities.softmax, 39 | W=lasagne.init.Normal(std=1)) 40 | # Get network loss 41 | self.prediction_train = lasagne.layers.get_output(network, deterministic=False) 42 | loss = lasagne.objectives.categorical_crossentropy(self.prediction_train, target_var).mean() 43 | 44 | # Define training rules 45 | params_mlp = lasagne.layers.get_all_params(network, trainable=True) 46 | updates_mlp = lasagne.updates.adam(loss, params_mlp, learning_rate=learning_rate) 47 | updates = lasagne.updates.adam(loss, params_mlp, learning_rate=learning_rate) 48 | updates.update(lasagne.updates.adam(loss, self.cnn_layer.layer_params[cnn_layer_id], 49 | learning_rate=learning_rate)) 50 | updates.update(lasagne.updates.adam(loss, self.bof_layer.V, learning_rate=learning_rate)) 51 | updates.update(lasagne.updates.adam(loss, self.bof_layer.sigma, learning_rate=learning_rate)) 52 | 53 | # Define testing/validation 54 | prediction_test = lasagne.layers.get_output(network, deterministic=True) 55 | 56 | # Compile functions 57 | self.train_fn = theano.function([input_var, target_var], loss, updates=updates) 58 | self.train_mlp_fn = theano.function([input_var, target_var], loss, updates=updates_mlp) 59 | self.test_fn = theano.function([input_var], T.argmax(prediction_test, axis=1)) 60 | 61 | # Get the output of the bof module 62 | self.get_features_fn = theano.function([input_var], features) 63 | 64 | def init_bof(self, data): 65 | """ 66 | Initializes the BoF layer using k-means 67 | :param data: 68 | :return: 69 | """ 70 | self.bof_layer.initialize(data) 71 | -------------------------------------------------------------------------------- /cbof_paper/model/cnn.py: -------------------------------------------------------------------------------- 1 | import lasagne 2 | import lasagne.layers.dnn 3 | import theano 4 | import theano.tensor as T 5 | from model.cnn_feat import CNN_Feature_Extractor 6 | from model.base_learner import Base_Learner 7 | 8 | 9 | class CNN_Simple(Base_Learner): 10 | """ 11 | Implements the baseline models (CNN and SPP) 12 | """ 13 | 14 | def __init__(self, learning_rate=0.0001, hidden_neurons=(1000,), dropout=(0.5,), feature_dropout=0.5, n_classes=15, 15 | use_spatial_pooling=False, 
pool_dims=[2, 1], size=28): 16 | 17 | Base_Learner.__init__(self) 18 | 19 | input_var = T.ftensor4('inputs') 20 | target_var = T.ivector('targets') 21 | 22 | if use_spatial_pooling: 23 | size = None 24 | 25 | # Create the CNN feature extractor 26 | self.cnn_layer = CNN_Feature_Extractor(input_var, size=size, pool_size=[(2, 2), ()]) 27 | network = self.cnn_layer.networks[-1] 28 | cnn_params = self.cnn_layer.layer_params[-1] 29 | 30 | # Add spatial pooling layer, if needed 31 | if use_spatial_pooling: 32 | # network = lasagne.layers.Conv2DLayer(network, num_filters=64, filter_size=(1,1), 33 | # nonlinearity=lasagne.nonlinearities.rectify, 34 | # W=lasagne.init.GlorotUniform()) 35 | network = lasagne.layers.dnn.SpatialPyramidPoolingDNNLayer(network, pool_dims=pool_dims) 36 | else: 37 | # otherwise, add a regular 2x2 pooling layer 38 | network = lasagne.layers.MaxPool2DLayer(network, pool_size=(2, 2)) 39 | 40 | if feature_dropout > 0: 41 | network = lasagne.layers.DropoutLayer(network, p=feature_dropout) 42 | 43 | params_mlp = [] 44 | for n, drop_rate in zip(hidden_neurons, dropout): 45 | network = lasagne.layers.DenseLayer(network, num_units=n, nonlinearity=lasagne.nonlinearities.elu, 46 | W=lasagne.init.Orthogonal()) 47 | params_mlp.append(network.W) 48 | params_mlp.append(network.b) 49 | network = lasagne.layers.DropoutLayer(network, p=drop_rate) 50 | 51 | network = lasagne.layers.DenseLayer(network, num_units=n_classes, 52 | nonlinearity=lasagne.nonlinearities.softmax) 53 | params_mlp.append(network.W) 54 | params_mlp.append(network.b) 55 | 56 | # Get network loss 57 | prediction_train = lasagne.layers.get_output(network, deterministic=False) 58 | loss = lasagne.objectives.categorical_crossentropy(prediction_train, target_var).mean() 59 | 60 | # Define training rules 61 | updates_mlp = lasagne.updates.adam(loss, params_mlp, learning_rate=learning_rate) 62 | updates = lasagne.updates.adam(loss, params_mlp, learning_rate=learning_rate) 63 | updates.update(lasagne.updates.adam(loss, cnn_params, learning_rate=learning_rate)) 64 | 65 | # Define testing/validation 66 | prediction_test = lasagne.layers.get_output(network, deterministic=True) 67 | test_loss = lasagne.objectives.categorical_crossentropy(prediction_test, target_var).mean() 68 | test_acc = T.mean(T.eq(T.argmax(prediction_test, axis=1), target_var), dtype='float32') 69 | 70 | # Compile functions 71 | self.train_fn = theano.function([input_var, target_var], loss, updates=updates) 72 | self.test_fn = theano.function([input_var], T.argmax(prediction_test, axis=1)) 73 | self.val_fn = theano.function([input_var, target_var], [test_loss, test_acc]) 74 | 75 | self.train_mlp_fn = theano.function([input_var, target_var], loss, updates=updates_mlp) 76 | -------------------------------------------------------------------------------- /cbof_paper/model/cnn_feat.py: -------------------------------------------------------------------------------- 1 | import lasagne 2 | 3 | class Base_CNN_Feature_Extractor: 4 | def __init__(self): 5 | # Features extracted from each layer 6 | self.layer_features = [] 7 | 8 | # Feature dimension per layer 9 | self.features_dim = [] 10 | 11 | # Cumulative parameters for each layer 12 | self.layer_params = [] 13 | 14 | # Lasagne network reference for each layer 15 | self.networks = [] 16 | 17 | def get_features(self, layer): 18 | """ 19 | Returns all the feature vectors of a layer 20 | :param layer: the layer from which to extract the feature vectors 21 | :return: the feature vectors 22 | """ 23 | 24 | features = 
self.layer_features[layer] 25 | features = features.reshape((features.shape[0], features.shape[1] * features.shape[2], features.shape[3])) 26 | return features 27 | 28 | def get_spatial_features(self, layer, i, level=1): 29 | """ 30 | Returns the features of the i-th region of the layer (only 2x2 segmentation is supported) 31 | :param layer: the layer from which to extract the features 32 | :param i: the region of the layer to extract the features 33 | :return: the feature vectors 34 | """ 35 | # This function assumes a square image input 36 | pivot = self.layer_features[layer].shape[1] // 2 37 | if level == 1: 38 | if i == 0: 39 | features = self.layer_features[layer][:, :pivot, :pivot, :] 40 | elif i == 1: 41 | features = self.layer_features[layer][:, :pivot, pivot:, :] 42 | elif i == 2: 43 | features = self.layer_features[layer][:, pivot:, :pivot, :] 44 | elif i == 3: 45 | features = self.layer_features[layer][:, pivot:, pivot:, :] 46 | else: 47 | print("Wrong region number") 48 | assert False 49 | else: 50 | print("Only spatial levels 1 and 2 are supported, got ", level) 51 | assert False 52 | 53 | features = features.reshape((features.shape[0], features.shape[1] * features.shape[2], features.shape[3])) 54 | return features 55 | 56 | 57 | class CNN_Feature_Extractor(Base_CNN_Feature_Extractor): 58 | """ 59 | Implements a simple convolutional feature extractor 60 | """ 61 | 62 | def __init__(self, input_var, size=28, channels=1, n_filters=[32, 64], filters_size=[(5, 5), (5, 5)], 63 | pool_size=[(2, 2), ()]): 64 | """ 65 | Defines a set of convolutional layer that extracts features 66 | :param input_var: input var of the network 67 | :param size: image input size (set to None to allow images of arbitrary size) 68 | :param channels: number of channels in the image 69 | :param n_filters: number of filters in each layer 70 | :param filters_size: size of the filters in each layer 71 | :param pool_size: pool size in each layer 72 | """ 73 | # Input 74 | network = lasagne.layers.InputLayer(shape=(None, channels, size, size), input_var=input_var) 75 | 76 | # Store the dimensionality of each feature vector (for use by the next layers) 77 | self.features_dim = [] 78 | # Store the features of each convolutional layer 79 | self.layer_features = [] 80 | self.layer_params = [] 81 | self.networks = [] 82 | 83 | # Define the layers 84 | for n, size, pool in zip(n_filters, filters_size, pool_size): 85 | network = lasagne.layers.Conv2DLayer(network, num_filters=n, filter_size=size, 86 | nonlinearity=lasagne.nonlinearities.rectify, 87 | W=lasagne.init.GlorotUniform()) 88 | self.features_dim.append(n) 89 | if pool: 90 | network = lasagne.layers.MaxPool2DLayer(network, pool_size=pool) 91 | 92 | # Save the output of each layer (after reordering the dimensions: n_samples, n_vectors, n_feats) 93 | self.layer_features.append(lasagne.layers.get_output(network, deterministic=True).transpose((0, 2, 3, 1))) 94 | self.layer_params.append(lasagne.layers.get_all_params(network, trainable=True)) 95 | self.networks.append(network) 96 | -------------------------------------------------------------------------------- /cbof_paper/model/datasets.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | 3 | 4 | def load_mnist(dataset='/home/nick/Data/Datasets/mnist.pkl.gz'): 5 | """ 6 | Loads the mnist dataset 7 | :return: 8 | """ 9 | import gzip 10 | import pickle 11 | 12 | with gzip.open(dataset, 'rb') as f: 13 | try: 14 | train_set, valid_set, test_set = pickle.load(f, 
encoding='latin1') 15 | except: 16 | train_set, valid_set, test_set = pickle.load(f) 17 | train_data = train_set[0].reshape((-1, 1, 28, 28)) 18 | valid_data = valid_set[0].reshape((-1, 1, 28, 28)) 19 | test_data = test_set[0].reshape((-1, 1, 28, 28)) 20 | return train_data, valid_data, test_data, train_set[1], valid_set[1], test_set[1] 21 | 22 | 23 | def resize_mnist_data(images, new_size_a, new_size_b=None): 24 | """ 25 | Resizes a set of images 26 | :param images: 27 | :param new_size: 28 | :return: 29 | """ 30 | from skimage.transform import resize 31 | 32 | if new_size_b is None: 33 | new_size_b = new_size_a 34 | 35 | resized_data = np.zeros((images.shape[0], 1, new_size_a, new_size_b)) 36 | for i in range(len(images)): 37 | resized_data[i, 0, :, :] = resize(images[i, 0, :, :], (new_size_a, new_size_b)) 38 | return np.float32(resized_data) 39 | 40 | 41 | 42 | 43 | -------------------------------------------------------------------------------- /cbof_paper/model/nbof.py: -------------------------------------------------------------------------------- 1 | import theano 2 | import numpy as np 3 | from sklearn.preprocessing import normalize as feature_normalizer 4 | import theano.tensor as T 5 | import theano.gradient 6 | import sklearn.cluster as cluster 7 | 8 | floatX = theano.config.floatX 9 | 10 | 11 | class NBoFInputLayer: 12 | """ 13 | Defines a Neural BoF input layer 14 | """ 15 | 16 | def __init__(self, g=0.1, feature_dimension=89, n_codewords=16): 17 | """ 18 | Initializes the Neural BoF object 19 | :param g: defines the softness of the quantization 20 | :param feature_dimension: dimension of the feature vectors 21 | :param n_codewords: number of codewords / RBF neurons to be used 22 | """ 23 | 24 | self.Nk = n_codewords 25 | self.D = feature_dimension 26 | 27 | # RBF-centers / codewords 28 | V = np.random.rand(self.Nk, self.D) 29 | self.V = theano.shared(value=V.astype(dtype=floatX), name='V', borrow=True) 30 | sigma = np.ones((self.Nk,)) / g 31 | self.sigma = theano.shared(value=sigma.astype(dtype=floatX), name='sigma', borrow=True) 32 | self.params = [self.V, self.sigma] 33 | 34 | # Tensor of input objects (n_objects, n_features, self.D) 35 | self.X = T.tensor3(name='X', dtype=floatX) 36 | 37 | # Feature matrix of an object (n_features, self.D) 38 | self.x = T.matrix(name='x', dtype=floatX) 39 | 40 | # Encode a set of objects 41 | """ 42 | Note that the number of features per object is fixed and same for all objects. 43 | The code can be easily extended by defining a feature vector mask, allowing for a variable number of feature 44 | vectors for each object (or alternatively separately encoding each object). 45 | """ 46 | self.encode_objects_theano = theano.function(inputs=[self.X], outputs=self.sym_histograms(self.X)) 47 | 48 | # Encodes only one object with an arbitrary number of features 49 | self.encode_object_theano = theano.function(inputs=[self.x], outputs=self.sym_histogram(self.x)) 50 | 51 | def sym_histogram(self, X): 52 | """ 53 | Computes a soft-quantized histogram of a set of feature vectors (X is a matrix). 
54 | :param X: matrix of feature vectors 55 | :return: 56 | """ 57 | distances = symbolic_distance_matrix(X, self.V) 58 | membership = T.nnet.softmax(-distances * self.sigma) 59 | histogram = T.mean(membership, axis=0) 60 | return histogram 61 | 62 | def sym_histograms(self, X): 63 | """ 64 | Encodes a set of objects (X is a tensor3) 65 | :param X: tensor3 containing the feature vectors for each object 66 | :return: 67 | """ 68 | histograms, updates = theano.map(self.sym_histogram, X) 69 | return histograms 70 | 71 | def initialize_dictionary(self, X, max_iter=100, redo=5, n_samples=50000, normalize=False): 72 | """ 73 | Samples some feature vectors from X and learns an initial dictionary 74 | :param X: list of objects 75 | :param max_iter: maximum k-means iters 76 | :param redo: number of times to repeat k-means clustering 77 | :param n_samples: number of feature vectors to sample from the objects 78 | :param normalize: use l_2 norm normalization for the feature vectors 79 | """ 80 | 81 | # Sample only a small number of feature vectors from each object 82 | samples_per_object = int(np.ceil(n_samples / len(X))) 83 | 84 | features = None 85 | print("Sampling feature vectors...") 86 | for i in (range(len(X))): 87 | idx = np.random.permutation(X[i].shape[0])[:samples_per_object + 1] 88 | cur_features = X[i][idx, :] 89 | if features is None: 90 | features = cur_features 91 | else: 92 | features = np.vstack((features, cur_features)) 93 | 94 | print("Clustering feature vectors...") 95 | features = np.float64(features) 96 | if normalize: 97 | features = feature_normalizer(features) 98 | 99 | V = cluster.k_means(features, n_clusters=self.Nk, max_iter=max_iter, n_init=redo) 100 | self.V.set_value(np.asarray(V[0], dtype=theano.config.floatX)) 101 | 102 | 103 | def symbolic_distance_matrix(A, B): 104 | """ 105 | Defines the symbolic matrix that contains the distances between the vectors of A and B 106 | :param A: 107 | :param B: 108 | :return: 109 | """ 110 | aa = T.sum(A * A, axis=1) 111 | bb = T.sum(B * B, axis=1) 112 | AB = T.dot(A, T.transpose(B)) 113 | 114 | AA = T.transpose(T.tile(aa, (bb.shape[0], 1))) 115 | BB = T.tile(bb, (aa.shape[0], 1)) 116 | 117 | D = AA + BB - 2 * AB 118 | D = T.maximum(D, 0) 119 | D = T.sqrt(D) 120 | return D 121 | 122 | 123 | class CBoF_Input_Layer: 124 | def __init__(self, input, cnn, layer, level=1, pyramid=False, g=0.1, n_codewords=16): 125 | """ 126 | Defines a CBoF layer for use with convolutional feature extractors 127 | :param input: symbolic input variable 128 | :param cnn: the cnn input model 129 | :param layer: the convolutional layer (id) to use 130 | :param spatial: if set to True, spatial pyramid is used 131 | :param g: the BoF softness variable 132 | :param n_codewords: number of codewords for each BoF unit 133 | """ 134 | self.bof = [] 135 | self.features = [] 136 | self.V = [] 137 | self.sigma = [] 138 | self.get_features = [] 139 | self.n = 1 140 | 141 | self.features_size = 0 142 | if level == 0 or pyramid: 143 | # Create the BoF object 144 | self.bof.append(NBoFInputLayer(g=g, feature_dimension=cnn.features_dim[layer], n_codewords=n_codewords)) 145 | self.V.append(self.bof[0].V) 146 | self.sigma.append(self.bof[0].sigma) 147 | # Extract the representation 148 | self.features.append(self.bof[0].sym_histograms(cnn.get_features(layer))) 149 | # Compile functions for extracting feature vectors 150 | self.get_features.append(theano.function([input], cnn.get_features(layer))) 151 | # Fuse the extracted representations 152 | self.fused_features = 
self.features[0] 153 | # Calculate length 154 | self.features_size += n_codewords 155 | if level == 1: 156 | for i in range(4 ** level): 157 | # Create the BoF object 158 | self.bof.append(NBoFInputLayer(g=g, feature_dimension=cnn.features_dim[layer], n_codewords=n_codewords)) 159 | self.V.append(self.bof[i].V) 160 | self.sigma.append(self.bof[i].sigma) 161 | # Extract the representation 162 | self.features.append(self.bof[i].sym_histograms(cnn.get_spatial_features(layer, i, level))) 163 | # Compile functions for extracting feature vectors 164 | self.get_features.append(theano.function([input], cnn.get_spatial_features(layer, i, level))) 165 | # Fuse the extracted representations 166 | self.fused_features = T.concatenate(tuple(self.features), axis=1) 167 | # Calculate length 168 | self.features_size += n_codewords * (4 ** level) 169 | 170 | def initialize(self, data, max_iter=100, redo=5, n_samples=50000, normalize=False): 171 | """ 172 | Initializes each of the spatial BoF layers in the CBoF layer 173 | :param data: input samples 174 | :param max_iter: max number of iterations for the k-means algorithm 175 | :param redo: number to redo the clustering 176 | :param n_samples: number of vectors to sample for clustering 177 | :param normalize: use l_2 norm normalization for the feature vectors 178 | :return: 179 | """ 180 | 181 | for i in range(len(self.bof)): 182 | features = [] 183 | for x in data: 184 | x_in = x.reshape((1, x.shape[0], x.shape[1], x.shape[2])) 185 | cur_features = self.get_features[i](np.float32(x_in)) 186 | features.append(cur_features) 187 | features = np.asarray(features) 188 | features = features.reshape((features.shape[0], features.shape[2], features.shape[3])) 189 | self.bof[i].initialize_dictionary(features, max_iter=max_iter, redo=redo, n_samples=n_samples, 190 | normalize=normalize) 191 | -------------------------------------------------------------------------------- /datasets/__init__.py: -------------------------------------------------------------------------------- 1 | 2 | -------------------------------------------------------------------------------- /datasets/mnist.py: -------------------------------------------------------------------------------- 1 | import sys 2 | import os 3 | import numpy as np 4 | from keras.datasets import mnist 5 | 6 | def load_mnist(): 7 | (X_train, y_train), (X_test, y_test) = mnist.load_data() 8 | X_train, X_test = X_train/255.0, X_test/255.0 9 | 10 | # Keep some validation data 11 | X_train, X_val = X_train[:-5000], X_train[-5000:] 12 | y_train, y_val = y_train[:-5000], y_train[-5000:] 13 | 14 | X_train = X_train.reshape(X_train.shape[0], 1, 28, 28) 15 | X_val = X_val.reshape(X_val.shape[0], 1, 28, 28) 16 | X_test = X_test.reshape(X_test.shape[0], 1, 28, 28) 17 | 18 | return np.float32(X_train), y_train, np.float32(X_val), y_val, np.float32(X_test), y_test 19 | -------------------------------------------------------------------------------- /mnist_example.py: -------------------------------------------------------------------------------- 1 | import lasagne 2 | import theano 3 | import theano.tensor as T 4 | import numpy as np 5 | from datasets.mnist import load_mnist 6 | from models.bof import CBoF_Layer 7 | from models.learner_base import LearnerBase 8 | 9 | 10 | class LeNeT_Model(LearnerBase): 11 | def __init__(self, pooling='spp', spatial_level=1, n_codewords=64, learning_rate=0.001): 12 | self.initializers = [] 13 | 14 | input_var = T.ftensor4('input_var') 15 | target_var = T.ivector('targets') 16 | 17 | network = 
lasagne.layers.InputLayer(shape=(None, 1, None, None), input_var=input_var) 18 | network = lasagne.layers.Conv2DLayer(network, num_filters=32, filter_size=(5, 5), 19 | nonlinearity=lasagne.nonlinearities.rectify) 20 | network = lasagne.layers.MaxPool2DLayer(network, pool_size=(2, 2)) 21 | network = lasagne.layers.Conv2DLayer(network, num_filters=64, filter_size=(5, 5), 22 | nonlinearity=lasagne.nonlinearities.rectify) 23 | if pooling == 'spp': 24 | network = lasagne.layers.SpatialPyramidPoolingLayer(network, pool_dims=[1, 2]) 25 | elif pooling == 'bof': 26 | network = CBoF_Layer(network, input_var=input_var, initializers=self.initializers, n_codewords=n_codewords, 27 | spatial_level=spatial_level) 28 | 29 | network = lasagne.layers.dropout(network, p=.5) 30 | network = lasagne.layers.DenseLayer(network, num_units=1000, nonlinearity=lasagne.nonlinearities.elu) 31 | network = lasagne.layers.dropout(network, p=.5) 32 | network = lasagne.layers.DenseLayer(network, num_units=10, nonlinearity=lasagne.nonlinearities.softmax) 33 | self.network = network 34 | 35 | train_prediction = lasagne.layers.get_output(network, deterministic=False) 36 | test_prediction = lasagne.layers.get_output(network, deterministic=True) 37 | loss = lasagne.objectives.categorical_crossentropy(train_prediction, target_var).mean() 38 | 39 | self.params = lasagne.layers.get_all_params(network, trainable=True) 40 | updates = lasagne.updates.adam(loss, self.params, learning_rate=learning_rate) 41 | 42 | self.train_fn = theano.function([input_var, target_var], loss, updates=updates) 43 | self.test_fn = theano.function([input_var], T.argmax(test_prediction, axis=1)) 44 | 45 | print "Model Compiled!" 46 | 47 | def initialize_model(self, data, n_samples=50000): 48 | for initializer in self.initializers: 49 | initializer(data, n_samples=n_samples) 50 | print "Model initialized!" 
51 | 52 | 53 | if __name__ == '__main__': 54 | np.random.seed(12345) 55 | 56 | X_train, y_train, X_val, y_val, X_test, y_test = load_mnist() 57 | 58 | for pool_type in ['bof', 'spp']: 59 | model = LeNeT_Model(pooling=pool_type) 60 | 61 | if pool_type == 'bof': 62 | model.initialize_model(X_train, n_samples=50000) 63 | 64 | model.train_model(X_train, y_train, validation_data=X_val, validation_labels=y_val, n_iters=50, batch_size=256) 65 | 66 | print "Evaluated model = ", pool_type 67 | print "Error = ", (1 - model.test_model(X_test, y_test)) * 100 68 | print "Error (0.7 scale) = ", (1 - model.test_model(X_test, y_test, scale=0.7)) * 100 69 | print "Error (0.8 scale) = ", (1 - model.test_model(X_test, y_test, scale=0.8)) * 100 70 | -------------------------------------------------------------------------------- /models/__init__.py: -------------------------------------------------------------------------------- 1 | 2 | -------------------------------------------------------------------------------- /models/bof.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | import theano 3 | import theano.tensor as T 4 | import lasagne 5 | from sklearn.cluster import KMeans 6 | from sklearn.metrics.pairwise import pairwise_distances 7 | 8 | 9 | class CBoF_Layer(lasagne.layers.Layer): 10 | """ 11 | Lasagne implementation of the CBoF Pooling Layer 12 | """ 13 | 14 | def __init__(self, incoming, n_codewords=24, V=lasagne.init.Normal(0.1), gamma=lasagne.init.Constant(0.1), 15 | eps=0.00001, input_var=None, initializers=None, spatial_level=1, **kwargs): 16 | """ 17 | Creates a BoF layer 18 | 19 | :param incoming: 20 | :param n_codewords: number of codewords 21 | :param V: initializer used for the codebook 22 | :param gamma: initializer used for the scaling factors 23 | :param eps: epsilon used to ensure numerical stability 24 | :param input_var: input_var of the model (used to compile a function that extract the features fed to layer) 25 | :param initializers: 26 | :param spatial_level: 0 (no spatial segmentation), 1 (first spatial level) 27 | :param pooling_type: either 'mean' or 'max' 28 | :param kwargs: 29 | """ 30 | super(CBoF_Layer, self).__init__(incoming, **kwargs) 31 | 32 | self.n_codewords = n_codewords 33 | self.spatial_level = spatial_level 34 | n_filters = self.input_shape[1] 35 | self.eps = eps 36 | 37 | # Create parameters 38 | self.V = self.add_param(V, (n_codewords, n_filters, 1, 1), name='V') 39 | self.gamma = self.add_param(gamma, (1, n_codewords, 1, 1), name='gamma') 40 | 41 | # Make gammas broadcastable 42 | self.gamma = T.addbroadcast(self.gamma, 0, 2, 3) 43 | 44 | # Compile function used for feature extraction 45 | if input_var is not None: 46 | self.features_fn = theano.function([input_var], lasagne.layers.get_output(incoming, deterministic=True)) 47 | 48 | if initializers is not None: 49 | initializers.append(self.initialize_layer) 50 | 51 | def get_output_for(self, input, **kwargs): 52 | distances = conv_pairwise_distance(input, self.V) 53 | similarities = T.exp(-distances / T.abs_(self.gamma)) 54 | norm = T.sum(similarities, 1).reshape((similarities.shape[0], 1, similarities.shape[2], similarities.shape[3])) 55 | membership = similarities / (norm + self.eps) 56 | 57 | histogram = T.mean(membership, axis=(2, 3)) 58 | if self.spatial_level == 1: 59 | pivot1, pivot2 = membership.shape[2] / 2, membership.shape[3] / 2 60 | h1 = T.mean(membership[:, :, :pivot1, :pivot2], axis=(2, 3)) 61 | h2 = T.mean(membership[:, :, :pivot1, 
pivot2:], axis=(2, 3)) 62 | h3 = T.mean(membership[:, :, pivot1:, :pivot2], axis=(2, 3)) 63 | h4 = T.mean(membership[:, :, pivot1:, pivot2:], axis=(2, 3)) 64 | # Pyramid is not used in the paper 65 | # histogram = T.horizontal_stack(h1, h2, h3, h4) 66 | histogram = T.horizontal_stack(histogram, h1, h2, h3, h4) 67 | return histogram 68 | 69 | def get_output_shape_for(self, input_shape): 70 | if self.spatial_level == 1: 71 | return (input_shape[0], 5 * self.n_codewords) 72 | return (input_shape[0], self.n_codewords) 73 | 74 | def initialize_layer(self, data, n_samples=10000): 75 | """ 76 | Initializes the layer using k-means (sigma is set to the mean pairwise distance) 77 | :param data: data 78 | :param n_samples: n_samples to keep for initializing the model 79 | :return: 80 | """ 81 | if self.features_fn is None: 82 | assert False 83 | 84 | idx = np.arange(data.shape[0]) 85 | np.random.shuffle(idx) 86 | 87 | features = [] 88 | for i in range(idx.shape[0]): 89 | feats = self.features_fn([data[idx[i]]]) 90 | feats = feats.transpose((0, 2, 3, 1)) 91 | feats = feats.reshape((-1, feats.shape[-1])) 92 | features.extend(feats) 93 | if len(features) > n_samples: 94 | break 95 | features = np.asarray(features) 96 | 97 | kmeans = KMeans(n_clusters=self.n_codewords, n_jobs=4, n_init=5) 98 | kmeans.fit(features) 99 | V = kmeans.cluster_centers_.copy() 100 | 101 | # Initialize gamma 102 | mean_distance = np.sum(pairwise_distances(V)) / (self.n_codewords * (self.n_codewords - 1)) 103 | self.gamma.set_value(self.gamma.get_value() * np.float32(mean_distance)) 104 | 105 | # Initialize codebook 106 | V = V.reshape((V.shape[0], V.shape[1], 1, 1)) 107 | self.V.set_value(np.float32(V)) 108 | 109 | 110 | def conv_pairwise_distance(feature_maps, codebook): 111 | """ 112 | Calculates the pairwise distances between the feature maps (n_samples, filters, x, y) 113 | :param feature_maps: 114 | :param codebook: 115 | :return: 116 | """ 117 | x_square = T.sum(feature_maps ** 2, axis=1) # n_samples, filters, x, y 118 | x_square = x_square.reshape((x_square.shape[0], 1, x_square.shape[1], x_square.shape[2])) 119 | x_square = T.addbroadcast(x_square, 1) 120 | 121 | y_square = T.sum(codebook ** 2, axis=1) 122 | y_square = y_square.reshape((1, y_square.shape[0], y_square.shape[1], y_square.shape[2])) 123 | y_square = T.addbroadcast(y_square, 0, 2, 3) 124 | 125 | inner_product = T.nnet.conv2d(feature_maps, codebook) 126 | dist = x_square + y_square - 2 * inner_product 127 | dist = T.sqrt(T.maximum(dist, 0)) 128 | return dist 129 | -------------------------------------------------------------------------------- /models/learner_base.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | from time import time 3 | from sklearn.metrics import accuracy_score 4 | import time 5 | import cv2 6 | 7 | 8 | class LearnerBase(): 9 | def __init__(self): 10 | self.train_fn = None 11 | self.test_fn = None 12 | self.lr = None 13 | 14 | self.best_param_values = [] 15 | self.params = [] 16 | 17 | def save_validation_parameters(self): 18 | """ 19 | Saves the best parameters found during the validation 20 | """ 21 | self.best_param_values = [] 22 | for i, param in enumerate(self.params): 23 | self.best_param_values.append(param.get_value()) 24 | 25 | def restore_validation_parameters(self): 26 | """ 27 | Restores the best parameters 28 | """ 29 | if len(self.best_param_values) > 0: 30 | for i, param in enumerate(self.params): 31 | param.set_value(self.best_param_values[i]) 32 | 33 | def 
train_model(self, data, labels, batch_size=32, n_iters=10, validation_data=None, validation_labels=None): 34 | 35 | loss = [] 36 | idx = np.arange(data.shape[0]) 37 | best_val_acc = 0 38 | 39 | for i in range(n_iters): 40 | np.random.shuffle(idx) 41 | cur_loss = 0 42 | n_batches = data.shape[0] / batch_size 43 | start_time = time.time() 44 | 45 | # Iterate mini-batches 46 | for j in range(n_batches): 47 | cur_idx = np.sort(idx[j * batch_size:(j + 1) * batch_size]) 48 | cur_data = data[cur_idx] 49 | cur_labels = labels[cur_idx] 50 | cur_loss += self.train_fn(cur_data, cur_labels) * cur_data.shape[0] 51 | 52 | # Last batch 53 | if n_batches * batch_size < data.shape[0]: 54 | # for cur_scale in scales: 55 | cur_idx = np.sort(idx[n_batches * batch_size:]) 56 | cur_data = data[cur_idx] 57 | cur_labels = labels[cur_idx] 58 | cur_loss += self.train_fn(cur_data, cur_labels) * cur_data.shape[0] 59 | 60 | loss.append(cur_loss / float(data.shape[0])) 61 | elapsed_time = time.time() - start_time 62 | 63 | print "Epoch %d loss = %5.4f, cur_time: %6.1f s time_left: %8.1f s" % \ 64 | (i + 1, loss[-1], elapsed_time, (n_iters - i) * elapsed_time) 65 | 66 | if validation_data is not None: 67 | val_acc = self.test_model(validation_data, validation_labels) 68 | if val_acc > best_val_acc: 69 | best_val_acc = val_acc 70 | print "New best found!", val_acc 71 | self.save_validation_parameters() 72 | 73 | if validation_data is not None: 74 | self.restore_validation_parameters() 75 | 76 | return loss 77 | 78 | def test_model(self, data, labels, scale=1, batch_size=100): 79 | """ 80 | 81 | :param data: images for testing 82 | :param labels: classes of the images 83 | :param scale: the scale used for the testing (only if global pooling/cbof is used) 84 | :param batch_size: batch size to be used for the testing 85 | :return: 86 | """ 87 | predicted_labels = [] 88 | 89 | # Resize images if needed 90 | if scale != 1: 91 | img_size = [int(x * scale) for x in data.shape[2:]] 92 | new_data = np.zeros((data.shape[0], data.shape[1], img_size[0], img_size[1])) 93 | for k in range(data.shape[0]): 94 | new_data[k] = resize_image(data[k], img_size) 95 | data = np.float32(new_data) 96 | 97 | n_batches = data.shape[0] / batch_size 98 | 99 | # Iterate mini-batches 100 | for j in range(n_batches): 101 | cur_data = data[j * batch_size:(j + 1) * batch_size] 102 | predicted_labels.extend(self.test_fn(cur_data)) 103 | # Last batch 104 | if n_batches * batch_size < data.shape[0]: 105 | cur_data = data[n_batches * batch_size:] 106 | predicted_labels.extend(self.test_fn(cur_data)) 107 | predicted_labels = np.asarray(predicted_labels) 108 | 109 | acc = accuracy_score(labels, predicted_labels) 110 | return acc 111 | 112 | 113 | def resize_image(img, size): 114 | img = img.transpose((1, 2, 0)) 115 | img = cv2.resize(img, (size[0], size[1])) 116 | 117 | if len(img.shape) == 2: 118 | img = img.reshape((1, img.shape[0], img.shape[1])) 119 | else: 120 | img = img.transpose((2, 0, 1)) 121 | return img 122 | --------------------------------------------------------------------------------