├── README.md
├── cnn.py
├── fully_connected_network.py
├── layers
│   ├── activation.py
│   ├── convolution.py
│   ├── flatten.py
│   ├── fully_connected.py
│   └── pooling.py
├── loss
│   └── losses.py
└── utilities
    ├── filereader.py
    ├── initializers.py
    ├── model.py
    ├── settings.py
    └── utils.py

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# Numpy CNN
A numpy based CNN implementation for classifying images.

**status: archived**

## Usage

Follow the steps listed below for using this repository after cloning it.
For examples, you can look at the code in [fully_connected_network.py](https://github.com/ElefHead/numpy-cnn/blob/master/fully_connected_network.py) and [cnn.py](https://github.com/ElefHead/numpy-cnn/blob/master/cnn.py).
Place the data inside a folder called `data` within the project root folder (this code works by default with CIFAR-10; for other datasets, the filereader in utilities can't be used).

After placing the data, the directory structure looks as follows
- root
    * data\
        * data_batch_1
        * data_batch_2
        * ..
    * layers\
    * loss\
    * utilities\
    * cnn.py
    * fully_connected_network.py

---

1) Import the required layer classes from the layers folder, for example
```python
from layers.fully_connected import FullyConnected
from layers.convolution import Convolution
from layers.pooling import Pooling
from layers.flatten import Flatten
```
2) Import the activations and losses in a similar way, for example
```python
from layers.activation import Elu, Softmax
from loss.losses import CategoricalCrossEntropy
```
3) Import the model class from the utilities folder
```python
from utilities.model import Model
```
4) Create a model using Model and the layer classes
```python
model = Model(
    Convolution(filters=5, padding='same'),
    Elu(),
    Pooling(mode='max', kernel_shape=(2, 2), stride=2),
    Flatten(),
    FullyConnected(units=10),
    Softmax(),
    name='cnn-model'
)
```
5) Set the model loss
```python
model.set_loss(CategoricalCrossEntropy)
```
6) Train the model using
```python
model.train(data, labels)
```
* Set `load_and_continue=True` to load previously trained weights and continue training
* By default the model uses Adam optimization with AMSGrad
* It also saves the weights after each epoch to a `models` folder within the project
7) For prediction, use
```python
prediction = model.predict(data)
```
8) For calculating accuracy, the model class provides its own function
```python
accuracy = model.evaluate(data, labels)
```
9) To load the model with its trained weights in a different place, follow the steps up to step 5 and then
```python
model.load_weights()
```
Note: You will need the same directory structure.


---
This was a fun project that started out as me trying to implement a CNN by myself for classifying CIFAR-10 images. In the process, I was able to implement reusable (numpy based)
library-ish code for creating CNNs with Adam optimization.

Anyone wanting to understand how backpropagation works in CNNs is welcome to try out this code, but for all practical usage there are better frameworks
with performance that this code cannot come close to replicating.
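For a complete end-to-end run, the steps above combine into a script like the following (a condensed version of [cnn.py](https://github.com/ElefHead/numpy-cnn/blob/master/cnn.py), assuming the CIFAR-10 batches sit in `./data`):
```python
from layers.fully_connected import FullyConnected
from layers.convolution import Convolution
from layers.pooling import Pooling
from layers.flatten import Flatten
from layers.activation import Elu, Softmax

from utilities.filereader import get_data
from utilities.model import Model

from loss.losses import CategoricalCrossEntropy

train_data, train_labels = get_data(num_samples=50000)
test_data, test_labels = get_data(num_samples=10000, dataset="testing")

train_data = train_data / 255
test_data = test_data / 255

model = Model(
    Convolution(filters=5, padding='same'),
    Elu(),
    Pooling(mode='max', kernel_shape=(2, 2), stride=2),
    Flatten(),
    FullyConnected(units=10),
    Softmax(),
    name='cnn-model'
)

model.set_loss(CategoricalCrossEntropy)
model.train(train_data, train_labels.T, epochs=2)

print('Testing accuracy = {}'.format(model.evaluate(test_data, test_labels)))
```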
The CNN implemented here is based on [Andrej Karpathy's notes](http://cs231n.github.io/convolutional-networks/)
--------------------------------------------------------------------------------
/cnn.py:
--------------------------------------------------------------------------------
from layers.fully_connected import FullyConnected
from layers.convolution import Convolution
from layers.pooling import Pooling
from layers.flatten import Flatten
from layers.activation import Elu, Softmax

from utilities.filereader import get_data
from utilities.model import Model

from loss.losses import CategoricalCrossEntropy

import numpy as np
np.random.seed(0)


if __name__ == '__main__':
    train_data, train_labels = get_data(num_samples=50000)
    test_data, test_labels = get_data(num_samples=10000, dataset="testing")

    train_data = train_data / 255
    test_data = test_data / 255

    print("Train data shape: {}, {}".format(train_data.shape, train_labels.shape))
    print("Test data shape: {}, {}".format(test_data.shape, test_labels.shape))

    model = Model(
        Convolution(filters=5, padding='same'),
        Elu(),
        Pooling(mode='max', kernel_shape=(2, 2), stride=2),
        Flatten(),
        FullyConnected(units=10),
        Softmax(),
        name='cnn5'
    )

    model.set_loss(CategoricalCrossEntropy)

    model.train(train_data, train_labels.T, epochs=2)  # set load_and_continue to True if you want to start from already trained weights
    # model.load_weights()  # uncomment if loading previously trained weights and comment above line to skip training and only load trained weights.

    print('Testing accuracy = {}'.format(model.evaluate(test_data, test_labels)))
--------------------------------------------------------------------------------
/fully_connected_network.py:
--------------------------------------------------------------------------------
from layers.fully_connected import FullyConnected
from layers.flatten import Flatten
from layers.activation import Elu, Softmax

from utilities.filereader import get_data
from utilities.model import Model

from loss.losses import CategoricalCrossEntropy


import numpy as np
np.random.seed(0)


if __name__ == '__main__':
    train_data, train_labels = get_data()
    test_data, test_labels = get_data(num_samples=10000, dataset="testing")

    train_data = train_data / 255
    test_data = test_data / 255
    train_labels = train_labels.T
    test_labels = test_labels.T

    print("Train data shape: {}, {}".format(train_data.shape, train_labels.shape))
    print("Test data shape: {}, {}".format(test_data.shape, test_labels.shape))

    model = Model(
        Flatten(),
        FullyConnected(units=200),
        Elu(),
        FullyConnected(units=200),
        Elu(),
        FullyConnected(units=10),
        Softmax(),
        name='fcn200'
    )

    model.set_loss(CategoricalCrossEntropy)
    model.train(train_data, train_labels, batch_size=128, epochs=50)

    print('Testing accuracy = {}'.format(model.evaluate(test_data, test_labels)))
--------------------------------------------------------------------------------
/layers/activation.py:
--------------------------------------------------------------------------------
import numpy as np


class Relu:
    def __init__(self):
        self.cache = {}
        self.has_units = False

    def has_weights(self):
        return self.has_units

    def forward_propagate(self, Z, save_cache=False):
        if save_cache:
            self.cache['Z'] = Z
        return np.where(Z >= 0, Z, 0)

    def back_propagate(self, dA):
        Z = self.cache['Z']
        return dA * np.where(Z >= 0, 1, 0)


class Softmax:
    def __init__(self):
        self.cache = {}
        self.has_units = False

    def has_weights(self):
        return self.has_units

    def forward_propagate(self, Z, save_cache=False):
        if save_cache:
            self.cache['Z'] = Z
        Z_ = Z - np.max(Z, axis=0, keepdims=True)  # stabilize per sample rather than with the global max
        e = np.exp(Z_)
        return e / np.sum(e, axis=0, keepdims=True)

    def back_propagate(self, dA):
        # This layer is paired with CategoricalCrossEntropy, whose derivative is
        # already taken with respect to the softmax logits (predictions - labels),
        # so the gradient passes through unchanged.
        return dA


class Elu:
    def __init__(self, alpha=1.2):
        self.cache = {}
        self.params = {
            'alpha': alpha
        }
        self.has_units = False

    def has_weights(self):
        return self.has_units

    def forward_propagate(self, Z, save_cache=False):
        if save_cache:
            self.cache['Z'] = Z
        return np.where(Z >= 0, Z, self.params['alpha'] * (np.exp(Z) - 1))

    def back_propagate(self, dA):
        alpha = self.params['alpha']
        Z = self.cache['Z']
        return dA * np.where(Z >= 0, 1, alpha * np.exp(Z))  # d/dZ of alpha*(exp(Z) - 1) is alpha*exp(Z)


class Selu:
    def __init__(self, alpha=1.6733, selu_lambda=1.0507):
        self.params = {
            'alpha': alpha,
            'lambda': selu_lambda
        }
        self.cache = {}
        self.has_units = False

    def has_weights(self):
        return self.has_units

    def forward_propagate(self, Z, save_cache=False):
        if save_cache:
            self.cache['Z'] = Z
        return self.params['lambda'] * np.where(Z >= 0, Z, self.params['alpha'] * (np.exp(Z) - 1))

    def back_propagate(self, dA):
        Z = self.cache['Z']
        selu_lambda, alpha = self.params['lambda'], self.params['alpha']
        return dA * selu_lambda * np.where(Z >= 0, 1, alpha * np.exp(Z))
--------------------------------------------------------------------------------
/layers/convolution.py:
--------------------------------------------------------------------------------
import numpy as np
import pickle
from os import path, makedirs, remove

from utilities.utils import pad_inputs
from utilities.initializers import glorot_uniform
from utilities.settings import get_layer_num, inc_layer_num


class Convolution:
    def __init__(self, filters, kernel_shape=(3, 3), padding='valid', stride=1, name=None):
        self.params = {
            'filters': filters,
            'padding': padding,
            'kernel_shape': kernel_shape,
            'stride': stride
        }
        self.cache = {}
        self.rmsprop_cache = {}
        self.momentum_cache = {}
        self.grads = {}
        self.has_units = True
        self.name = name
        self.type = 'conv'

    def has_weights(self):
        return self.has_units

    def save_weights(self, dump_path):
        dump_cache = {
            'cache': self.cache,
            'grads': self.grads,
            'momentum': self.momentum_cache,
            'rmsprop': self.rmsprop_cache
        }
        save_path = path.join(dump_path, self.name+'.pickle')
        makedirs(path.dirname(save_path), exist_ok=True)
        if path.isfile(save_path):  # an unconditional remove() raises FileNotFoundError on the first save
            remove(save_path)
        with open(save_path, 'wb') as d:
            pickle.dump(dump_cache, d)

    def load_weights(self, dump_path):
        if self.name is None:
            self.name = '{}_{}'.format(self.type, get_layer_num(self.type))
            inc_layer_num(self.type)
        read_path = path.join(dump_path, self.name+'.pickle')
        with open(read_path, 'rb') as r:
            dump_cache = pickle.load(r)
        self.cache = dump_cache['cache']
        self.grads = dump_cache['grads']
        self.momentum_cache = dump_cache['momentum']
        self.rmsprop_cache = dump_cache['rmsprop']

    def conv_single_step(self, input, W, b):
        '''
        Function to apply one filter to an input slice.
        :param input:[numpy array]: slice of input data of shape (f, f, n_C_prev)
        :param W:[numpy array]: One filter of shape (f, f, n_C_prev)
        :param b:[numpy array]: Bias value for the filter. Shape (1, 1, 1)
        :return:[float]: the convolved scalar value
        '''
        return np.sum(np.multiply(input, W)) + float(b)

    def forward_propagate(self, X, save_cache=False):
        '''
        Forward pass of the convolution layer.
        :param X:[numpy array]: batch of inputs of shape (m, height, width, channels)
        :param save_cache:[boolean]: if true, caches the input for the backward pass
        :return:[numpy array]: feature maps of shape (m, n_H, n_W, filters)
        '''
        if self.name is None:
            self.name = '{}_{}'.format(self.type, get_layer_num(self.type))
            inc_layer_num(self.type)

        (num_data_points, prev_height, prev_width, prev_channels) = X.shape
        filter_shape_h, filter_shape_w = self.params['kernel_shape']

        if 'W' not in self.params:
            shape = (filter_shape_h, filter_shape_w, prev_channels, self.params['filters'])
            self.params['W'], self.params['b'] = glorot_uniform(shape=shape)

        if self.params['padding'] == 'same':
            pad_h = int(((prev_height - 1)*self.params['stride'] + filter_shape_h - prev_height) / 2)
            pad_w = int(((prev_width - 1)*self.params['stride'] + filter_shape_w - prev_width) / 2)
            n_H = prev_height
            n_W = prev_width
        else:
            pad_h = 0
            pad_w = 0
            n_H = int((prev_height - filter_shape_h) / self.params['stride']) + 1
            n_W = int((prev_width - filter_shape_w) / self.params['stride']) + 1

        self.params['pad_h'], self.params['pad_w'] = pad_h, pad_w

        Z = np.zeros(shape=(num_data_points, n_H, n_W, self.params['filters']))

        X_pad = pad_inputs(X, (pad_h, pad_w))

        for i in range(num_data_points):
            x = X_pad[i]
            for h in range(n_H):
                for w in range(n_W):
                    vert_start = self.params['stride'] * h
                    vert_end = vert_start + filter_shape_h
                    horiz_start = self.params['stride'] * w
                    horiz_end = horiz_start + filter_shape_w

                    for c in range(self.params['filters']):

                        x_slice = x[vert_start: vert_end, horiz_start: horiz_end, :]

                        Z[i, h, w, c] = self.conv_single_step(x_slice, self.params['W'][:, :, :, c],
                                                              self.params['b'][:, :, :, c])

        if save_cache:
            self.cache['A'] = X

        return Z
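
    # For instance, with X of shape (m, 32, 32, 3), filters=5, kernel_shape=(3, 3)
    # and stride 1: padding 'same' gives pad_h = pad_w = 1 and Z of shape
    # (m, 32, 32, 5), while padding 'valid' gives Z of shape (m, 30, 30, 5).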

    def back_propagate(self, dZ):
        '''
        Backward pass of the convolution layer.
        :param dZ:[numpy array]: gradient of the loss w.r.t. this layer's output, shape (m, n_H, n_W, filters)
        :return:[numpy array]: gradient of the loss w.r.t. this layer's input
        '''
        A = self.cache['A']
        filter_shape_h, filter_shape_w = self.params['kernel_shape']
        pad_h, pad_w = self.params['pad_h'], self.params['pad_w']

        (num_data_points, prev_height, prev_width, prev_channels) = A.shape
        (_, n_H, n_W, _) = dZ.shape  # iterate over the output grid, which is smaller than the input for 'valid' padding

        dA = np.zeros((num_data_points, prev_height, prev_width, prev_channels))
        self.grads = self.init_cache()

        A_pad = pad_inputs(A, (pad_h, pad_w))
        dA_pad = pad_inputs(dA, (pad_h, pad_w))

        # slicing with -0 would yield an empty array, so only strip padding when there is any
        h_slice = slice(pad_h, -pad_h) if pad_h > 0 else slice(None)
        w_slice = slice(pad_w, -pad_w) if pad_w > 0 else slice(None)

        for i in range(num_data_points):
            a_pad = A_pad[i]
            da_pad = dA_pad[i]

            for h in range(n_H):
                for w in range(n_W):

                    vert_start = self.params['stride'] * h
                    vert_end = vert_start + filter_shape_h
                    horiz_start = self.params['stride'] * w
                    horiz_end = horiz_start + filter_shape_w

                    for c in range(self.params['filters']):
                        a_slice = a_pad[vert_start: vert_end, horiz_start: horiz_end, :]

                        da_pad[vert_start: vert_end, horiz_start: horiz_end, :] += self.params['W'][:, :, :, c] * dZ[i, h, w, c]
                        self.grads['dW'][:, :, :, c] += a_slice * dZ[i, h, w, c]
                        self.grads['db'][:, :, :, c] += dZ[i, h, w, c]
            dA[i, :, :, :] = da_pad[h_slice, w_slice, :]

        return dA

    def init_cache(self):
        cache = dict()
        cache['dW'] = np.zeros_like(self.params['W'])
        cache['db'] = np.zeros_like(self.params['b'])
        return cache

    def momentum(self, beta=0.9):
        if not self.momentum_cache:
            self.momentum_cache = self.init_cache()
        self.momentum_cache['dW'] = beta * self.momentum_cache['dW'] + (1 - beta) * self.grads['dW']
        self.momentum_cache['db'] = beta * self.momentum_cache['db'] + (1 - beta) * self.grads['db']

    def rmsprop(self, beta=0.999, amsgrad=True):
        if not self.rmsprop_cache:
            self.rmsprop_cache = self.init_cache()

        new_dW = beta * self.rmsprop_cache['dW'] + (1 - beta) * (self.grads['dW']**2)
        new_db = beta * self.rmsprop_cache['db'] + (1 - beta) * (self.grads['db']**2)

        if amsgrad:
            self.rmsprop_cache['dW'] = np.maximum(self.rmsprop_cache['dW'], new_dW)
            self.rmsprop_cache['db'] = np.maximum(self.rmsprop_cache['db'], new_db)
        else:
            self.rmsprop_cache['dW'] = new_dW
            self.rmsprop_cache['db'] = new_db

    def apply_grads(self, learning_rate=0.001, l2_penalty=1e-4, optimization='adam', epsilon=1e-8,
                    correct_bias=False, beta1=0.9, beta2=0.999, iter=999):
        if optimization != 'adam':
            self.params['W'] -= learning_rate * (self.grads['dW'] + l2_penalty * self.params['W'])
            self.params['b'] -= learning_rate * (self.grads['db'] + l2_penalty * self.params['b'])

        else:
            if correct_bias:
                W_first_moment = self.momentum_cache['dW'] / (1 - beta1 ** iter)
                b_first_moment = self.momentum_cache['db'] / (1 - beta1 ** iter)
                W_second_moment = self.rmsprop_cache['dW'] / (1 - beta2 ** iter)
                b_second_moment = self.rmsprop_cache['db'] / (1 - beta2 ** iter)
            else:
                W_first_moment = self.momentum_cache['dW']
                b_first_moment = self.momentum_cache['db']
                W_second_moment = self.rmsprop_cache['dW']
                b_second_moment = self.rmsprop_cache['db']

            W_learning_rate = learning_rate / (np.sqrt(W_second_moment) + epsilon)
            b_learning_rate = learning_rate / (np.sqrt(b_second_moment) + epsilon)

            self.params['W'] -= W_learning_rate * (W_first_moment + l2_penalty * self.params['W'])
            self.params['b'] -= b_learning_rate * (b_first_moment + l2_penalty * self.params['b'])
--------------------------------------------------------------------------------
/layers/flatten.py:
--------------------------------------------------------------------------------
import numpy as np


class Flatten:
    def __init__(self, transpose=True):
        self.shape = ()
        self.transpose = transpose
        self.has_units = False

    def has_weights(self):
        return self.has_units

    def forward_propagate(self, Z, save_cache=False):
        shape = Z.shape
        if save_cache:
            self.shape = shape
        data = np.ravel(Z).reshape(shape[0], -1)
        if self.transpose:
            data = data.T
        return data

    def back_propagate(self, Z):
        if self.transpose:
            Z = Z.T
        return Z.reshape(self.shape)
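
# Example: round-tripping a batch of two 32x32 RGB images. With transpose=True
# (the default), forward_propagate returns a (features, batch) array, and
# back_propagate restores the cached input shape.
# >>> f = Flatten()
# >>> f.forward_propagate(np.zeros((2, 32, 32, 3)), save_cache=True).shape
# (3072, 2)
# >>> f.back_propagate(np.zeros((3072, 2))).shape
# (2, 32, 32, 3)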
--------------------------------------------------------------------------------
/layers/fully_connected.py:
--------------------------------------------------------------------------------
import numpy as np
import pickle
from os import path, makedirs, remove

from utilities.initializers import he_normal
from utilities.settings import get_layer_num, inc_layer_num

np.random.seed(0)


class FullyConnected:
    def __init__(self, units=200, name=None):
        self.units = units
        self.params = {}
        self.cache = {}
        self.grads = {}
        self.momentum_cache = {}
        self.rmsprop_cache = {}
        self.has_units = True
        self.type = 'fc'
        self.name = name

    def has_weights(self):
        return self.has_units

    def save_weights(self, dump_path):
        dump_cache = {
            'cache': self.cache,
            'grads': self.grads,
            'momentum': self.momentum_cache,
            'rmsprop_cache': self.rmsprop_cache
        }
        save_path = path.join(dump_path, self.name+'.pickle')
        makedirs(path.dirname(save_path), exist_ok=True)
        if path.isfile(save_path):  # an unconditional remove() raises FileNotFoundError on the first save
            remove(save_path)
        with open(save_path, 'wb') as d:
            pickle.dump(dump_cache, d)

    def load_weights(self, dump_path):
        if self.name is None:
            self.name = '{}_{}'.format(self.type, get_layer_num(self.type))
            inc_layer_num(self.type)
        read_path = path.join(dump_path, self.name+'.pickle')
        with open(read_path, 'rb') as r:
            dump_cache = pickle.load(r)
        self.cache = dump_cache['cache']
        self.grads = dump_cache['grads']
        self.momentum_cache = dump_cache['momentum']
        self.rmsprop_cache = dump_cache['rmsprop_cache']

    def forward_propagate(self, X, save_cache=False):
        if self.name is None:
            self.name = '{}_{}'.format(self.type, get_layer_num(self.type))
            inc_layer_num(self.type)

        if 'W' not in self.params:
            self.params['W'], self.params['b'] = he_normal((X.shape[0], self.units))
        Z = np.dot(self.params['W'], X) + self.params['b']
        if save_cache:
            self.cache['A'] = X
        return Z

    def back_propagate(self, dZ):
        batch_size = dZ.shape[1]
        self.grads['dW'] = np.dot(dZ, self.cache['A'].T) / batch_size
        self.grads['db'] = np.sum(dZ, axis=1, keepdims=True) / batch_size  # average over the batch, matching dW
        return np.dot(self.params['W'].T, dZ)

    def init_cache(self):
        cache = dict()
        cache['dW'] = np.zeros_like(self.params['W'])
        cache['db'] = np.zeros_like(self.params['b'])
        return cache

    def momentum(self, beta=0.9):
        if not self.momentum_cache:
            self.momentum_cache = self.init_cache()
        self.momentum_cache['dW'] = beta * self.momentum_cache['dW'] + (1 - beta) * self.grads['dW']
        self.momentum_cache['db'] = beta * self.momentum_cache['db'] + (1 - beta) * self.grads['db']

    def rmsprop(self, beta=0.999, amsgrad=True):
        if not self.rmsprop_cache:
            self.rmsprop_cache = self.init_cache()

        new_dW = beta * self.rmsprop_cache['dW'] + (1 - beta) * (self.grads['dW']**2)
        new_db = beta * self.rmsprop_cache['db'] + (1 - beta) * (self.grads['db']**2)

        if amsgrad:
            self.rmsprop_cache['dW'] = np.maximum(self.rmsprop_cache['dW'], new_dW)
            self.rmsprop_cache['db'] = np.maximum(self.rmsprop_cache['db'], new_db)
        else:
            self.rmsprop_cache['dW'] = new_dW
            self.rmsprop_cache['db'] = new_db

    def apply_grads(self, learning_rate=0.001, l2_penalty=1e-4, optimization='adam', epsilon=1e-8,
                    correct_bias=False, beta1=0.9, beta2=0.999, iter=999):
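# Example: for flattened CIFAR-10 input X of shape (3072, batch_size),
# FullyConnected(units=200).forward_propagate(X) lazily initializes
# W of shape (200, 3072) and b of shape (200, 1) via he_normal, and
# returns Z = W.X + b of shape (200, batch_size).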
        if optimization != 'adam':
            self.params['W'] -= learning_rate * (self.grads['dW'] + l2_penalty * self.params['W'])
            self.params['b'] -= learning_rate * (self.grads['db'] + l2_penalty * self.params['b'])
        else:
            if correct_bias:
                W_first_moment = self.momentum_cache['dW'] / (1 - beta1 ** iter)
                b_first_moment = self.momentum_cache['db'] / (1 - beta1 ** iter)
                W_second_moment = self.rmsprop_cache['dW'] / (1 - beta2 ** iter)
                b_second_moment = self.rmsprop_cache['db'] / (1 - beta2 ** iter)
            else:
                W_first_moment = self.momentum_cache['dW']
                b_first_moment = self.momentum_cache['db']
                W_second_moment = self.rmsprop_cache['dW']
                b_second_moment = self.rmsprop_cache['db']

            W_learning_rate = learning_rate / (np.sqrt(W_second_moment) + epsilon)
            b_learning_rate = learning_rate / (np.sqrt(b_second_moment) + epsilon)

            self.params['W'] -= W_learning_rate * (W_first_moment + l2_penalty * self.params['W'])
            self.params['b'] -= b_learning_rate * (b_first_moment + l2_penalty * self.params['b'])
--------------------------------------------------------------------------------
/layers/pooling.py:
--------------------------------------------------------------------------------
import numpy as np
from utilities.settings import get_layer_num, inc_layer_num


class Pooling:
    def __init__(self, kernel_shape=(3, 3), stride=1, mode="max", name=None):
        '''
        :param kernel_shape:[tuple]: height and width of the pooling window
        :param stride:[int]: step size of the window
        :param mode:[string]: "max" or "average"
        '''
        self.params = {
            'kernel_shape': kernel_shape,
            'stride': stride,
            'mode': mode
        }
        self.type = 'pooling'
        self.cache = {}
        self.has_units = False
        self.name = name

    def has_weights(self):
        return self.has_units

    def forward_propagate(self, X, save_cache=False):
        '''
        :param X:[numpy array]: batch of inputs of shape (m, height, width, channels)
        :param save_cache:[boolean]: if true, caches the input for the backward pass
        :return:[numpy array]: pooled output of shape (m, n_H, n_W, channels)
        '''

        (num_data_points, prev_height, prev_width, prev_channels) = X.shape
        filter_shape_h, filter_shape_w = self.params['kernel_shape']

        n_H = int(1 + (prev_height - filter_shape_h) / self.params['stride'])
        n_W = int(1 + (prev_width - filter_shape_w) / self.params['stride'])
        n_C = prev_channels

        A = np.zeros((num_data_points, n_H, n_W, n_C))

        for i in range(num_data_points):
            for h in range(n_H):
                for w in range(n_W):

                    vert_start = h * self.params['stride']
                    vert_end = vert_start + filter_shape_h
                    horiz_start = w * self.params['stride']
                    horiz_end = horiz_start + filter_shape_w

                    for c in range(n_C):

                        if self.params['mode'] == 'average':
                            A[i, h, w, c] = np.mean(X[i, vert_start: vert_end, horiz_start: horiz_end, c])
                        else:
                            A[i, h, w, c] = np.max(X[i, vert_start: vert_end, horiz_start: horiz_end, c])
        if save_cache:
            self.cache['A'] = X

        return A

    def distribute_value(self, dz, shape):
        (n_H, n_W) = shape
        average = 1 / (n_H * n_W)
        return np.ones(shape) * dz * average

    def create_mask(self, x):
        return x == np.max(x)

    def back_propagate(self, dA):
        A = self.cache['A']
        filter_shape_h, filter_shape_w = self.params['kernel_shape']

        (num_data_points, prev_height, prev_width, prev_channels) = A.shape
        m, n_H, n_W, n_C = dA.shape

        dA_prev = np.zeros(shape=(num_data_points, prev_height, prev_width, prev_channels))

        for i in range(num_data_points):
            a = A[i]

            for h in range(n_H):
                for w in range(n_W):

                    vert_start = h * self.params['stride']
                    vert_end = vert_start + filter_shape_h
                    horiz_start = w * self.params['stride']
                    horiz_end = horiz_start + filter_shape_w

                    for c in range(n_C):

                        if self.params['mode'] == 'average':
                            da = dA[i, h, w, c]
                            dA_prev[i, vert_start: vert_end, horiz_start: horiz_end, c] += \
                                self.distribute_value(da, self.params['kernel_shape'])

                        else:
                            a_slice = a[vert_start: vert_end, horiz_start: horiz_end, c]
                            mask = self.create_mask(a_slice)
                            dA_prev[i, vert_start: vert_end, horiz_start: horiz_end, c] += \
                                dA[i, h, w, c] * mask

        return dA_prev
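
# Example: Pooling(mode='max', kernel_shape=(2, 2), stride=2) maps feature maps
# of shape (m, 32, 32, 5) to (m, 16, 16, 5); back_propagate routes each gradient
# entry to the argmax position within its window (or spreads it evenly for 'average').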
--------------------------------------------------------------------------------
/loss/losses.py:
--------------------------------------------------------------------------------
import numpy as np


class CategoricalCrossEntropy:
    @staticmethod
    def compute_loss(labels, predictions, epsilon=1e-8):
        '''
        The function to compute the categorical cross entropy loss, given training labels and predictions
        :param labels:[numpy array]: Training labels
        :param predictions:[numpy array]: Predicted labels
        :param epsilon:[float default=1e-8]: A small value for applying clipping for stability
        :return:[float]: The computed value of loss.
        '''
        predictions = predictions / np.sum(predictions, axis=0, keepdims=True)  # renormalize without mutating the caller's array
        predictions = np.clip(predictions, epsilon, 1. - epsilon)
        return -np.sum(labels * np.log(predictions))

    @staticmethod
    def compute_derivative(labels, predictions):
        '''
        The function to compute the derivative of categorical cross entropy, given labels and predictions
        :param labels:[numpy array]: Training labels
        :param predictions:[numpy array]: Predicted labels
        :return:[numpy array]: The gradient of the loss with respect to the softmax logits.
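        For softmax output p and one-hot labels y, the combined
        softmax + cross-entropy gradient with respect to the logits is p - y;
        e.g. p = [0.7, 0.2, 0.1] and y = [1, 0, 0] give [-0.3, 0.2, 0.1].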
        '''
        return predictions - labels  # gradient w.r.t. the logits; the Softmax layer passes this through unchanged
--------------------------------------------------------------------------------
/utilities/filereader.py:
--------------------------------------------------------------------------------
import pickle
import numpy as np
from os import path
from utilities.utils import to_categorical


TOTAL_BATCHES = 5
NUM_DIMENSIONS = 3072
NUM_CLASSES = 10
SAMPLES_PER_BATCH = 10000
MAX_TRAINING_SAMPLES = 50000
MAX_TESTING_SAMPLES = 10000
FILE_NAME = {
    'training': 'data_batch_',
    'testing': 'test_batch'
}


def unpickle(file, num_samples=10000):
    '''
    Function to read the data from the binary files
    Description of data taken from the CIFAR-10 website
    :param file: the path to the datafile
    :param num_samples: (remaining) samples required from a particular set (not same as num_samples in get_data)
    :return: data and one-hot-encoded labels
    '''
    with open(file, 'rb') as fo:
        data = pickle.load(fo, encoding='bytes')
    return data[b'data'][:num_samples, :], to_categorical(data[b'labels'][:num_samples], NUM_CLASSES)


def get_data(data_path="data", num_samples=50000, dataset="training"):
    '''
    Function that reads and returns the required training or testing data
    :param data_path: string: the relative folder path to where the data lies (default: ./data)
    :param num_samples: int: number of samples required (MAX 50000)
    :param dataset: string: training or testing, default is training
    :return: two numpy arrays, one containing data and the other containing the corresponding labels.
        data shape = [num_samples, 32, 32, 3] and labels shape = [num_samples, 10] for cifar-10 data;
        consistency checked with the keras cifar10 dataset
    '''
    if dataset == "testing" and num_samples > MAX_TESTING_SAMPLES:
        num_samples = MAX_TESTING_SAMPLES
    if dataset == "training" and num_samples > MAX_TRAINING_SAMPLES:
        num_samples = MAX_TRAINING_SAMPLES
    data = np.zeros(shape=(num_samples, NUM_DIMENSIONS))
    labels = np.zeros(shape=(NUM_CLASSES, num_samples))
    num_batches = -(-num_samples // SAMPLES_PER_BATCH)  # ceiling division, so exact multiples don't add an empty batch
    if num_batches > TOTAL_BATCHES:
        num_batches = TOTAL_BATCHES
    remaining = num_samples
    for _ in range(num_batches):
        file_name = FILE_NAME[dataset]+str(_+1) if dataset == "training" else FILE_NAME[dataset]
        file = path.join('.', data_path, file_name)
        if remaining > SAMPLES_PER_BATCH:
            ret_val = unpickle(file, SAMPLES_PER_BATCH)
            data[_*SAMPLES_PER_BATCH: SAMPLES_PER_BATCH*(_+1)] = ret_val[0]
            labels[:, _*SAMPLES_PER_BATCH: SAMPLES_PER_BATCH*(_+1)] = ret_val[1]
        else:
            ret_val = unpickle(file, remaining)
            data[_*SAMPLES_PER_BATCH:] = ret_val[0]
            labels[:, _*SAMPLES_PER_BATCH:] = ret_val[1]
        remaining = remaining - SAMPLES_PER_BATCH
    return data.reshape(-1, 3, 32, 32).transpose(0, 2, 3, 1).astype(np.float32), labels.T
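
# Example: train_data, train_labels = get_data(num_samples=1000) yields
# train_data.shape == (1000, 32, 32, 3) and train_labels.shape == (1000, 10).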
--------------------------------------------------------------------------------
/utilities/initializers.py:
--------------------------------------------------------------------------------
import numpy as np


def get_fans(shape):
    '''
    :param shape: (fan_in, fan_out) for a fully connected layer, or a 4-d kernel shape for a convolution
    :return:[int, int]: the numbers of input and output units implied by shape
    '''
    fan_in = shape[0] if len(shape) == 2 else np.prod(shape[1:])
    fan_out = shape[1] if len(shape) == 2 else shape[0]
    return fan_in, fan_out


def normal(shape, scale=0.05):
    '''
    :param shape: shape of the array to initialize
    :param scale: standard deviation of the normal distribution
    :return: array of the given shape sampled from N(0, scale)
    '''
    return np.random.normal(0, scale, size=shape)


def uniform(shape, scale=0.05):
    '''
    :param shape: shape of the array to initialize
    :param scale: half-width of the sampling interval
    :return: array of the given shape sampled from U(-scale, scale)
    '''
    return np.random.uniform(-scale, scale, size=shape)


def he_normal(shape):
    '''
    A function for smart normal distribution based initialization of parameters
    [He et al. https://arxiv.org/abs/1502.01852]
    :param shape: (fan_in, fan_out) for a fully connected layer, or a 4-d kernel shape for a convolution
    :return:[numpy array, numpy array]: A randomly initialized array of shape [fan_out, fan_in]
        (or the kernel shape) and the corresponding bias
    '''
    fan_in, fan_out = get_fans(shape)
    scale = np.sqrt(2. / fan_in)
    shape = (fan_out, fan_in) if len(shape) == 2 else shape  # For a fully connected network
    bias_shape = (fan_out, 1) if len(shape) == 2 else (
        1, 1, 1, shape[3])  # This supports only CNNs and fully connected networks
    return normal(shape, scale), uniform(bias_shape)


def he_uniform(shape):
    '''
    A function for smart uniform distribution based initialization of parameters
    [He et al. https://arxiv.org/abs/1502.01852]
    :param shape: (fan_in, fan_out) for a fully connected layer, or a 4-d kernel shape for a convolution
    :return:[numpy array, numpy array]: A randomly initialized array of shape [fan_out, fan_in] and
        the bias of shape [fan_out, 1]
    '''
    fan_in, fan_out = get_fans(shape)
    scale = np.sqrt(6. / fan_in)
    shape = (fan_out, fan_in) if len(shape) == 2 else shape  # For a fully connected network
    bias_shape = (fan_out, 1) if len(shape) == 2 else (
        1, 1, 1, shape[3])  # This supports only CNNs and fully connected networks
    return uniform(shape, scale), uniform(shape=bias_shape)


def glorot_normal(shape):
    '''
    A function for smart normal distribution based initialization of parameters
    [Glorot et al. http://proceedings.mlr.press/v9/glorot10a/glorot10a.pdf]
    :param shape: (fan_in, fan_out) for a fully connected layer, or a 4-d kernel shape for a convolution
    :return:[numpy array, numpy array]: A randomly initialized array of shape [fan_out, fan_in] and
        the bias of shape [fan_out, 1]
    '''
    fan_in, fan_out = get_fans(shape)
    scale = np.sqrt(2. / (fan_in + fan_out))
    shape = (fan_out, fan_in) if len(shape) == 2 else shape  # For a fully connected network
    bias_shape = (fan_out, 1) if len(shape) == 2 else (
        1, 1, 1, shape[3])  # This supports only CNNs and fully connected networks
    return normal(shape, scale), uniform(shape=bias_shape)


def glorot_uniform(shape):
    '''
    A function for smart uniform distribution based initialization of parameters
    [Glorot et al. http://proceedings.mlr.press/v9/glorot10a/glorot10a.pdf]
    :param shape: (fan_in, fan_out) for a fully connected layer, or a 4-d kernel shape for a convolution
    :return:[numpy array, numpy array]: A randomly initialized array of shape [fan_out, fan_in] and
        the bias of shape [fan_out, 1]
    '''
    fan_in, fan_out = get_fans(shape)
    scale = np.sqrt(6. / (fan_in + fan_out))
    shape = (fan_out, fan_in) if len(shape) == 2 else shape  # For a fully connected network
    bias_shape = (fan_out, 1) if len(shape) == 2 else (
        1, 1, 1, shape[3])  # This supports only CNNs and fully connected networks
    return uniform(shape, scale), uniform(shape=bias_shape)
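
# Example: he_normal((3072, 200)) returns W of shape (200, 3072) and b of
# shape (200, 1); glorot_uniform(shape=(3, 3, 3, 5)) returns a (3, 3, 3, 5)
# kernel and a (1, 1, 1, 5) bias.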
--------------------------------------------------------------------------------
/utilities/model.py:
--------------------------------------------------------------------------------
import numpy as np

from os import makedirs, path

from utilities.utils import get_batches, evaluate
from utilities.settings import set_network_name, get_models_path


class Model:
    def __init__(self, *model, **kwargs):
        self.model = model
        self.num_classes = 0
        self.batch_size = 0
        self.loss = None
        self.optimizer = None
        self.name = kwargs['name'] if 'name' in kwargs else None

    def set_batch_size(self, batch_size):
        self.batch_size = batch_size

    def set_loss(self, loss):
        self.loss = loss

    def set_name(self, name):
        set_network_name(name)

    def load_weights(self):
        for layer in self.model:
            if layer.has_weights():
                layer.load_weights(path.join(get_models_path(), self.name))

    def train(self, data, labels, batch_size=256, epochs=50, optimization='adam',
              save_model=True, load_and_continue=False):
        if self.loss is None:
            raise RuntimeError("Set loss first using 'model.set_loss()'")

        self.set_batch_size(batch_size)
        if save_model:
            self.set_name(self.name)

        if load_and_continue:
            for layer in self.model:
                if layer.has_weights():
                    layer.load_weights(path.join(get_models_path(), self.name))

        iter = 1
        for epoch in range(epochs):
            print('Running Epoch:', epoch + 1)
            for i, (x_batch, y_batch) in enumerate(get_batches(data, labels, batch_size=batch_size)):  # honor the requested batch size
                batch_preds = x_batch.copy()
                for num, layer in enumerate(self.model):
                    batch_preds = layer.forward_propagate(batch_preds, save_cache=True)
                dA = self.loss.compute_derivative(y_batch, batch_preds)
                for layer in reversed(self.model):
                    dA = layer.back_propagate(dA)
                    if layer.has_weights():
                        if optimization == 'adam':
                            layer.momentum()
                            layer.rmsprop()

                for layer in self.model:
                    if layer.has_weights():
                        layer.apply_grads(optimization=optimization, correct_bias=True, iter=iter)

                iter += 1  # one timestep per parameter update, as Adam's bias correction expects

            if save_model:
                for layer in self.model:  # save after each epoch, as described in the README
                    if layer.has_weights():
                        layer.save_weights(path.join(get_models_path(), self.name))

    def predict(self, data):
        if self.batch_size == 0:
            self.batch_size = data.shape[0]
        if self.num_classes == 0:
            predictions = np.zeros((1, data.shape[0]))
        else:
            predictions = np.zeros((self.num_classes, data.shape[0]))
        num_batches = data.shape[0] // self.batch_size
        for batch_num, x_batch in enumerate(get_batches(data, batch_size=self.batch_size, shuffle=False)):
            batch_preds = x_batch.copy()
            for layer in self.model:
                batch_preds = layer.forward_propagate(batch_preds, save_cache=False)
            M, N = batch_preds.shape
            if M != predictions.shape[0]:
                predictions = np.zeros(shape=(M, data.shape[0]))
            if batch_num <= num_batches - 1:
                predictions[:, batch_num * self.batch_size:(batch_num + 1) * self.batch_size] = batch_preds
            else:
                predictions[:, batch_num * self.batch_size:] = batch_preds
        return predictions
    def evaluate(self, data, labels):
        predictions = self.predict(data)
        M, N = predictions.shape
        if (M, N) == labels.shape:
            return evaluate(labels, predictions)
        elif (N, M) == labels.shape:
            return evaluate(labels.T, predictions)
        else:
            raise RuntimeError("Prediction and label shapes don't match")
--------------------------------------------------------------------------------
/utilities/settings.py:
--------------------------------------------------------------------------------
layer_nums = {
    'fc': 1,
    'conv': 1
}
network_name = None
models_path = 'models'


def init():
    global layer_nums
    layer_nums = {
        'fc': 1,
        'conv': 1
    }
    global network_name
    network_name = None


def get_layer_num(layer_type):
    global layer_nums
    return layer_nums[layer_type]


def get_models_path():
    return models_path


def inc_layer_num(layer_type):
    global layer_nums
    layer_nums[layer_type] += 1


def set_network_name(name):
    global network_name
    network_name = name


def get_network_name():
    if network_name is None:
        raise RuntimeError("Model name not set, set name as 'model.set_name()'")
    return network_name
--------------------------------------------------------------------------------
/utilities/utils.py:
--------------------------------------------------------------------------------
import matplotlib.pyplot as plt
import numpy as np


labels_to_name_map = {
    0: 'airplane',
    1: 'automobile',
    2: 'bird',
    3: 'cat',
    4: 'deer',
    5: 'dog',
    6: 'frog',
    7: 'horse',
    8: 'ship',
    9: 'truck'
}


def get_name(label):
    return labels_to_name_map[int(np.argmax(label))]


def pad_inputs(X, pad):
    '''
    Function to apply zero padding to the image
    :param X:[numpy array]: Dataset of shape (m, height, width, depth)
    :param pad:[tuple]: (pad_h, pad_w), the number of rows and columns of zeros to add on each side
    :return:[numpy array]: padded dataset
    '''
    return np.pad(X, ((0, 0), (pad[0], pad[0]), (pad[1], pad[1]), (0, 0)), 'constant')


def show_image(image, title=None, cmap=None):
    '''
    Function to display one image
    :param image: numpy float array: of shape (32, 32, 3)
    :return: Void
    '''
    if cmap is not None:
        plt.imshow(image, cmap=cmap)
    else:
        plt.imshow(image)
    if title is not None:
        plt.title(title)
    plt.show()


def plot_graph(Y, X=None, title=None, xlabel=None, ylabel=None):
    '''
    A function to plot a line graph.
    :param Y: Values for Y axis
    :param X: Values for X axis (optional)
    :param title:[string default=None]: Graph title.
    :param xlabel:[string default=None]: X axis label.
    :param ylabel:[string default=None]: Y axis label.
    :return: Void
    '''
    if X is None:
        plt.plot(Y)
    else:
        plt.plot(X, Y)
    if title is not None:
        plt.title(title)
    if xlabel is not None:
        plt.xlabel(xlabel)
    if ylabel is not None:
        plt.ylabel(ylabel)
    plt.show()


def to_categorical(labels, num_classes, axis=0):
    '''
    Function to one-hot-encode the labels
    :param labels:[list or vector]: list of ints: list of numbers (ranging 0-9 for CIFAR-10)
    :param num_classes:[int]: the total number of unique classes or categories.
    :param axis:[int Default=0]: decides the orientation; if 0, returns a column matrix of shape
        (num_classes, len(labels)), else a row matrix of shape (len(labels), num_classes)
    :return: numpy array of ints: one-hot-encoded labels
    '''
    ohe_labels = np.zeros((len(labels), num_classes)) if axis != 0 else np.zeros((num_classes, len(labels)))
    for _ in range(len(labels)):
        if axis == 0:
            ohe_labels[labels[_], _] = 1
        else:
            ohe_labels[_, labels[_]] = 1
    return ohe_labels
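
# Example: to_categorical([0, 2], num_classes=3) returns the column matrix
# [[1, 0], [0, 0], [0, 1]], i.e. one one-hot column per label.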

def get_batches(data, labels=None, batch_size=256, shuffle=True):
    '''
    Function to get data in batches.
    :param data:[numpy array]: training or test data. Assumes shape=[M, N] where M is the features and N is samples.
    :param labels:[numpy array, Default = None (for without labels)]: actual labels corresponding to the data.
        Assumes shape=[M, N] where M is number of classes/results per sample and N is number of samples.
    :param batch_size:[int, Default = 256]: required size of batch. If data can't be exactly divided by batch_size,
        remaining samples will be in a new batch
    :param shuffle:[boolean, Default = True]: if true, function will shuffle the data
    :return:[numpy array, numpy array]: batch data and corresponding labels
    '''
    N = data.shape[1] if len(data.shape) == 2 else data.shape[0]
    num_batches = N//batch_size
    if len(data.shape) == 2:
        data = data.T
    if shuffle:
        shuffled_indices = np.random.permutation(N)
        data = data[shuffled_indices]
        labels = labels[:, shuffled_indices] if labels is not None else None
    if num_batches == 0:
        if labels is not None:
            yield (data.T, labels) if len(data.shape) == 2 else (data, labels)
        else:
            yield data.T if len(data.shape) == 2 else data
    for batch_num in range(num_batches):
        if labels is not None:
            yield (data[batch_num*batch_size:(batch_num+1)*batch_size].T,
                   labels[:, batch_num*batch_size:(batch_num+1)*batch_size]) if len(data.shape) == 2 \
                else (data[batch_num*batch_size:(batch_num+1)*batch_size],
                      labels[:, batch_num*batch_size:(batch_num+1)*batch_size])
        else:
            yield data[batch_num*batch_size:(batch_num+1)*batch_size].T if len(data.shape) == 2 else \
                data[batch_num*batch_size:(batch_num+1)*batch_size]
    if N % batch_size != 0 and num_batches != 0:
        if labels is not None:
            yield (data[num_batches*batch_size:].T, labels[:, num_batches*batch_size:]) if len(data.shape) == 2 else \
                (data[num_batches*batch_size:], labels[:, num_batches*batch_size:])
        else:
            yield data[num_batches*batch_size:].T if len(data.shape) == 2 else data[num_batches*batch_size:]


def evaluate(labels, predictions):
    '''
    A function to compute the accuracy of the predictions on a scale of 0-1.
    :param labels:[numpy array]: Training labels (or testing/validation if available)
    :param predictions:[numpy array]: Predicted labels
    :return:[float]: a number between [0, 1] denoting the accuracy of the prediction
    '''
    return np.mean(np.argmax(labels, axis=0) == np.argmax(predictions, axis=0))
--------------------------------------------------------------------------------