├── LICENSE
├── README.md
├── VAE.py
├── dataset_loader.py
├── experiment.py
├── gym_datagenerator.py
├── perceptual_embedder.py
├── perceptual_networks.py
└── utility.py

/LICENSE:
--------------------------------------------------------------------------------
1 | MIT License
2 | 
3 | Copyright (c) 2020 Gustav Grund Pihlgren
4 | 
5 | Permission is hereby granted, free of charge, to any person obtaining a copy
6 | of this software and associated documentation files (the "Software"), to deal
7 | in the Software without restriction, including without limitation the rights
8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9 | copies of the Software, and to permit persons to whom the Software is
10 | furnished to do so, subject to the following conditions:
11 | 
12 | The above copyright notice and this permission notice shall be included in all
13 | copies or substantial portions of the Software.
14 | 
15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21 | SOFTWARE.
22 | 
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # Perceptual-Autoencoders
2 | Implementation of [Improving Image Autoencoder Embeddings with Perceptual Loss](https://arxiv.org/abs/2001.03444) and [Pretraining Image Encoders without Reconstruction via Feature Prediction Loss](https://arxiv.org/abs/2003.07441)
3 | 
4 | ## Cite papers or repository
5 | 
6 | If you are using the repository or its work as part of a scientific work, you should cite the following paper:
7 | ```
8 | @INPROCEEDINGS{pihlgren2020improving,
9 | author={G. G. {Pihlgren} and F. {Sandin} and M. {Liwicki}},
10 | booktitle={2020 International Joint Conference on Neural Networks (IJCNN)},
11 | title={Improving Image Autoencoder Embeddings with Perceptual Loss},
12 | year={2020},
13 | pages={1-7},
14 | doi={10.1109/IJCNN48605.2020.9207431}
15 | }
16 | ```
17 | 
18 | If you are using anything from perceptual_embedder.py (i.e. FeaturePredictorCVAE, FeatureAutoencoder, PerceptualFeatureToImgCVAE, or FeatureToImgCVAE), you should also cite this paper:
19 | ```
20 | @INPROCEEDINGS{pihlgren2021pretraining,
21 | author={Grund Pihlgren, Gustav and Sandin, Fredrik and Liwicki, Marcus},
22 | booktitle={2020 25th International Conference on Pattern Recognition (ICPR)},
23 | title={Pretraining Image Encoders without Reconstruction via Feature Prediction Loss},
24 | year={2021},
25 | pages={4105-4111},
26 | doi={10.1109/ICPR48806.2021.9412239}
27 | }
28 | ```
29 | 
30 | 
31 | ## Requirements
32 | The repository has been tested with Python 3.6 and 3.7, PyTorch 1.2.0, Torchvision 0.4.0, and SciPy 1.3.1
33 | 
34 | To use the OpenAI gym part of the repository (gym_datagenerator.py) you additionally need OpenAI gym, with all its requirements for the desired gym environments, as well as opencv-python (for cv2).
35 | Since gym_datagenerator.py generates files that do not require OpenAI gym, it can be run in an independent environment.
36 | The repository has been tested with gym 0.14.0 and opencv-python 4.0.0.21
37 | 
38 | ## Datasets
39 | The repository has been set up to work with the [STL-10](http://ai.stanford.edu/~acoates/stl10/) and [SVHN](http://ufldl.stanford.edu/housenumbers/) datasets, as well as with data generated by gym_datagenerator.py for the LunarLander-v2 environment.
40 | 
41 | The STL-10 binaries can be found here: http://ai.stanford.edu/~acoates/stl10/stl10_binary.tar.gz
42 | 
43 | The SVHN binaries can be found here: [train_32x32.mat](http://ufldl.stanford.edu/housenumbers/train_32x32.mat), [test_32x32.mat](http://ufldl.stanford.edu/housenumbers/test_32x32.mat), [extra_32x32.mat](http://ufldl.stanford.edu/housenumbers/extra_32x32.mat)
44 | 
45 | LunarLander-v2 data is generated by executing `python gym_datagenerator.py`. Rename the generated file to something suitable and rerun the command for as many datasets as you need. Three are recommended (one for training autoencoders, one for training predictors, and one for testing), but if you're more concerned with time and memory than with making a rigorous experiment you can use one file for all three purposes. You must then edit `experiment.py` by adding these files at their correct positions. Unless the code has been edited properly, running `experiment.py` with the `--data lunarlander` flag will result in an error telling you what needs to be done.
46 | 
47 | The binaries (the files containing the actual data) for the three datasets need to be put in `datasets/LunarLander-v2`, `datasets/stl10`, and `datasets/svhn` respectively.
48 | 
49 | ## Running experiments
50 | To run experiments, run `python experiment.py --data lunarlander|stl10|svhn`
51 | The experiments can take many additional parameters, which can be listed by running `python experiment.py --help`
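As a concrete example, the following call trains the convolutional autoencoder on SVHN with both pixel-wise and AlexNet perceptual loss and then trains and evaluates predictors on the resulting embeddings. The flag values are only illustrative; see `--help` for the full list and the defaults:
```
python experiment.py --data svhn \
    --ae_epochs 50 --ae_networks FourLayerCVAE \
    --ae_zs 64 128 --ae_gammas 0.0 0.01 \
    --perceptual_nets None alexnet --perceptual_layers 5 \
    --predictor_epochs 500
```
Trained autoencoders and predictors are recorded in `autoencoder_index.csv` and `results.csv` (the default index and results paths), and settings that already appear there are skipped, so the same command can be rerun to resume or extend an experiment.

The components can also be used directly from Python. The sketch below (file paths and hyperparameters are only examples) trains a perceptual-loss autoencoder on STL-10 and encodes the validation images:
```
import torch
from VAE import FourLayerCVAE, train_autoencoder, encode_data
from perceptual_networks import SimpleExtractor
from dataset_loader import load_stl_data, split_data

# Load the unlabeled STL-10 images (N, 3, 96, 96) and make the default 80/20 split
images, _ = load_stl_data('./datasets/stl10/unlabeled_X.bin')
(train_imgs,), (val_imgs,) = split_data([images])

# Autoencoder trained with AlexNet (layer 5) perceptual loss;
# pass perceptual_net=None for ordinary pixel-wise loss
model = FourLayerCVAE(
    input_size=(96, 96), z_dimensions=64, variational=True, gamma=0.01,
    perceptual_net=SimpleExtractor('alexnet', 5)
)
model, model_file, val_loss, epochs = train_autoencoder(
    (train_imgs, val_imgs), model, epochs=50, batch_size=64,
    gpu=torch.cuda.is_available()
)

# Encode images into z-dimensional embeddings (the means of the latent distributions)
embeddings = encode_data(model, val_imgs, batch_size=256)
```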
52 | 
--------------------------------------------------------------------------------
/VAE.py:
--------------------------------------------------------------------------------
1 | # Library imports
2 | import random
3 | import torch
4 | import numpy as np
5 | import torchvision.models as models
6 | from torch.nn import functional as F
7 | from torch.utils.data import TensorDataset, DataLoader
8 | import torch.nn as nn
9 | import datetime
10 | import time
11 | import sys
12 | import os
13 | import matplotlib.pyplot as plt
14 | 
15 | # File imports
16 | from utility import run_training, EarlyStopper
17 | 
18 | def _create_coder(channels, kernel_sizes, strides, conv_types,
19 |     activation_types, paddings=(0,0), batch_norms=False
20 |     ):
21 |     '''
22 |     Function that creates en- or decoders based on parameters
23 |     Args:
24 |         channels ([int]): Channel sizes per layer.
1 more than layers 25 | kernel_sizes ([int]): Kernel sizes per layer 26 | strides ([int]): Strides per layer 27 | conv_types ([f()->type]): Type of the convoultion module per layer 28 | activation_types ([f()->type]): Type of activation function per layer 29 | paddings ([(int, int)]): The padding per layer 30 | batch_norms ([bool]): Whether to use batchnorm on each layer 31 | Returns (nn.Sequential): The created coder 32 | ''' 33 | if not isinstance(conv_types, list): 34 | conv_types = [conv_types for _ in range(len(kernel_sizes))] 35 | 36 | if not isinstance(activation_types, list): 37 | activation_types = [activation_types for _ in range(len(kernel_sizes))] 38 | 39 | if not isinstance(paddings, list): 40 | paddings = [paddings for _ in range(len(kernel_sizes))] 41 | 42 | if not isinstance(batch_norms, list): 43 | batch_norms = [batch_norms for _ in range(len(kernel_sizes))] 44 | 45 | coder = nn.Sequential() 46 | for layer in range(len(channels)-1): 47 | coder.add_module( 48 | 'conv'+ str(layer), 49 | conv_types[layer]( 50 | in_channels=channels[layer], 51 | out_channels=channels[layer+1], 52 | kernel_size=kernel_sizes[layer], 53 | stride=strides[layer] 54 | ) 55 | ) 56 | if batch_norms[layer]: 57 | coder.add_module( 58 | 'norm'+str(layer), 59 | nn.BatchNorm2d(channels[layer+1]) 60 | ) 61 | if not activation_types[layer] is None: 62 | coder.add_module('acti'+str(layer),activation_types[layer]()) 63 | 64 | return coder 65 | 66 | class TemplateVAE(nn.Module): 67 | ''' 68 | A template class for Variational Autoencoders to minimize code duplication 69 | Args: 70 | input_size (int,int): The height and width of the input image 71 | z_dimensions (int): The number of latent dimensions in the encoding 72 | variational (bool): Whether the model is variational or not 73 | gamma (float): The weight of the KLD loss 74 | perceptual_net: Which perceptual network to use (None for pixel-wise) 75 | ''' 76 | 77 | def __str__(self): 78 | string = super().__str__()[:-1] 79 | string = string + ' (variational): {}\n (gamma): {}\n)'.format( 80 | self.variational,self.gamma 81 | ) 82 | return string 83 | 84 | def __repr__(self): 85 | string = super().__repr__()[:-1] 86 | string = string + ' (variational): {}\n (gamma): {}\n)'.format( 87 | self.variational,self.gamma 88 | ) 89 | return string 90 | 91 | def encode(self, x): 92 | x = self.encoder(x) 93 | x = x.view(x.size(0),-1) 94 | mu = self.mu(x) 95 | logvar = self.logvar(x) 96 | return mu, logvar 97 | 98 | def sample(self, mu, logvar): 99 | std = logvar.mul(0.5).exp_() 100 | eps = torch.autograd.Variable(std.data.new(std.size()).normal_()) 101 | out = eps.mul(std).add_(mu) 102 | return out 103 | 104 | def decode(self, z): 105 | return self.decoder(z) 106 | 107 | def forward(self, x): 108 | mu, logvar = self.encode(x) 109 | if self.variational: 110 | z = self.sample(mu, logvar) 111 | else: 112 | z = mu 113 | rec_x = self.decode(z) 114 | return rec_x, z, mu, logvar 115 | 116 | def loss(self, output, x): 117 | rec_x, z, mu, logvar = output 118 | if self.perceptual_loss: 119 | x = self.perceptual_net(x) 120 | rec_x = self.perceptual_net(rec_x) 121 | else: 122 | x = x.reshape(x.size(0), -1) 123 | rec_x = rec_x.view(x.size(0), -1) 124 | REC = F.mse_loss(rec_x, x, reduction='mean') 125 | 126 | if self.variational: 127 | KLD = -1 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp()) 128 | return REC + self.gamma*KLD, REC, KLD 129 | else: 130 | return [REC] 131 | 132 | class FourLayerCVAE(TemplateVAE): 133 | ''' 134 | A Convolutional Variational Autoencoder for images 135 
| Args: 136 | input_size (int,int): The height and width of the input image 137 | acceptable sizes are 64+16*n 138 | z_dimensions (int): The number of latent dimensions in the encoding 139 | variational (bool): Whether the model is variational or not 140 | gamma (float): The weight of the KLD loss 141 | perceptual_net: Which perceptual network to use (None for pixel-wise) 142 | ''' 143 | 144 | def __init__(self, input_size=(64,64), z_dimensions=32, 145 | variational=True, gamma=20.0, perceptual_net=None 146 | ): 147 | super().__init__() 148 | 149 | #Parameter check 150 | if (input_size[0] - 64) % 16 != 0 or (input_size[1] - 64) % 16 != 0: 151 | raise ValueError( 152 | f'Input_size is {input_size}, but must be 64+16*N' 153 | ) 154 | 155 | #Attributes 156 | self.input_size = input_size 157 | self.z_dimensions = z_dimensions 158 | self.variational = variational 159 | self.gamma = gamma 160 | self.perceptual_net = perceptual_net 161 | 162 | self.perceptual_loss = not perceptual_net is None 163 | 164 | encoder_channels = [3,32,64,128,256] 165 | self.encoder = _create_coder( 166 | encoder_channels, [4,4,4,4], [2,2,2,2], 167 | nn.Conv2d, nn.ReLU, 168 | batch_norms=[True,True,True,True] 169 | ) 170 | 171 | f = lambda x: np.floor((x - (2,2))/2) 172 | conv_sizes = f(f(f(f(np.array(input_size))))) 173 | conv_flat_size = int(encoder_channels[-1]*conv_sizes[0]*conv_sizes[1]) 174 | self.mu = nn.Linear(conv_flat_size, self.z_dimensions) 175 | self.logvar = nn.Linear(conv_flat_size, self.z_dimensions) 176 | 177 | g = lambda x: int((x-64)/16)+1 178 | deconv_flat_size = g(input_size[0]) * g(input_size[1]) * 1024 179 | self.dense = nn.Linear(self.z_dimensions, deconv_flat_size) 180 | 181 | self.decoder = _create_coder( 182 | [1024,128,64,32,3], [5,5,6,6], [2,2,2,2], 183 | nn.ConvTranspose2d, 184 | [nn.ReLU,nn.ReLU,nn.ReLU,nn.Sigmoid], 185 | batch_norms=[True,True,True,False] 186 | ) 187 | 188 | self.relu = nn.ReLU() 189 | 190 | def decode(self, z): 191 | y = self.dense(z) 192 | y = self.relu(y) 193 | y = y.view( 194 | y.size(0), 1024, 195 | int((self.input_size[0]-64)/16)+1, 196 | int((self.input_size[1]-64)/16)+1 197 | ) 198 | y = self.decoder(y) 199 | return y 200 | 201 | def show(imgs, block=False, save=None, heading='Figure', fig_axs=None, torchy=True): 202 | ''' 203 | Paints a column of torch images 204 | Args: 205 | imgs ([3darray]): Array of images in shape (channels, width, height) 206 | block (bool): Whether the image should interupt program flow 207 | save (str / None): Path to save the image under. 
Will not save if None 208 | heading (str)): The heading to put on the image 209 | fig_axs (plt.Figure, axes.Axes): Figure and Axes to paint on 210 | Returns (plt.Figure, axes.Axes): The Figure and Axes that was painted 211 | ''' 212 | if fig_axs is None: 213 | fig, axs = plt.subplots(1,len(imgs)) 214 | if len(imgs) == 1: 215 | axs = [axs] 216 | else: 217 | fig, axs = fig_axs 218 | plt.figure(fig.number) 219 | fig.canvas.set_window_title(heading) 220 | for i, img in enumerate(imgs): 221 | if torchy: 222 | img = img[0].detach().permute(1,2,0) 223 | plt.axes(axs[i]) 224 | plt.imshow(img) 225 | plt.show(block=block) 226 | plt.pause(0.001) 227 | if not save is None: 228 | plt.savefig(save) 229 | return fig, axs 230 | 231 | def show_recreation(dataset, model, block=False, save=None): 232 | ''' 233 | Shows a random image and the encoders attempted recreation 234 | Args: 235 | dataset (data.Dataset): Torch Dataset with the image data 236 | model (nn.Module): (V)AE model to be run 237 | block (bool): Whether to stop execution until user closes image 238 | save (str / None): Path to save the image under. Will not save if None 239 | ''' 240 | with torch.no_grad(): 241 | img1 = dataset[random.randint(0,len(dataset)-1)][0].unsqueeze(0) 242 | if next(model.parameters()).is_cuda: 243 | img1 = img1.cuda() 244 | img2, z, mu, logvar = model(img1) 245 | show( 246 | [img1.cpu(),img2.cpu()], block=block, save=save, 247 | heading='Random image recreation' 248 | ) 249 | 250 | def train_autoencoder(data, model, epochs, batch_size, gpu=False, 251 | display=False, save_path='checkpoints' 252 | ): 253 | ''' 254 | Trains an autoencoder with the given data 255 | Args: 256 | data (tensor, tensor): Tuple with train and validation data 257 | model (nn.Module / str): Model or path to model to train 258 | epochs (int): Number of epochs to run 259 | batch_size (int): Size of batches 260 | gpu (bool): Whether to train on the GPU 261 | display (bool): Whether to display the recreated images 262 | save_path (str): Path to folder where the trained network will be stored 263 | Returns (nn.Module, str, float, int): The model, path, val loss, and epochs 264 | ''' 265 | train_data, val_data = data 266 | train_data = TensorDataset(train_data, train_data) 267 | val_data = TensorDataset(val_data, val_data) 268 | train_loader = DataLoader(train_data, batch_size, shuffle=True) 269 | val_loader = DataLoader(val_data, batch_size, shuffle=True) 270 | 271 | if isinstance(model, str) and epochs != 0: 272 | model = torch.load(model, map_location='cpu') 273 | 274 | if gpu: 275 | model = model.cuda() 276 | 277 | optimizer = torch.optim.Adam(model.parameters()) 278 | 279 | early_stop = EarlyStopper(patience=max(10, epochs/20)) 280 | if display: 281 | epoch_update = lambda _a, _b, _c : show_recreation( 282 | train_data, model, block=False, save=save_path+'/image.png' 283 | ) or early_stop(_a,_b,_c) 284 | else: 285 | epoch_update = early_stop 286 | if epochs != 0: 287 | print( 288 | ( 289 | 'Starting autoencoder training. 
'
290 |                 f'Best checkpoint stored in ./{save_path}'
291 |             )
292 |         )
293 |         model, model_file, val_loss, actual_epochs = run_training(
294 |             model = model,
295 |             train_loader = train_loader,
296 |             val_loader = val_loader,
297 |             loss = model.loss,
298 |             optimizer = optimizer,
299 |             save_path = save_path,
300 |             epochs = epochs,
301 |             epoch_update = epoch_update
302 |         )
303 |     elif isinstance(model, str):
304 |         model_file, val_loss, actual_epochs = model, None, 0  # not trained here
305 |     else:
306 |         model_file, val_loss, actual_epochs = None, None, 0  # not trained here
307 | 
308 |     if display:
309 |         for batch_id in range(len(train_data)):
310 |             show_recreation(train_data, model, block=True)
311 | 
312 |     return model, model_file, val_loss, actual_epochs
313 | 
314 | def encode_data(autoencoder, data, batch_size=512):
315 |     dataset = TensorDataset(data)
316 |     data_loader = DataLoader(dataset, batch_size, shuffle=False)
317 |     gpu = next(autoencoder.parameters()).is_cuda
318 |     encoded_batches = []
319 |     autoencoder.eval()
320 |     with torch.no_grad():
321 |         for i, batch in enumerate(data_loader):
322 |             batch = batch[0]
323 |             if gpu:
324 |                 batch = batch.cuda()
325 |             coded_batch = autoencoder.encode(batch)
326 |             if gpu:
327 |                 coded_batch = (coded_batch[0].cpu(), coded_batch[1].cpu())
328 |                 batch = batch.cpu()
329 |             encoded_batches.append(coded_batch[0])
330 |     autoencoder.train()
331 |     return torch.cat(encoded_batches, dim=0)
--------------------------------------------------------------------------------
/dataset_loader.py:
--------------------------------------------------------------------------------
1 | import torch
2 | from torch.utils.data import TensorDataset, DataLoader, Dataset
3 | import pickle
4 | import numpy as np
5 | import scipy.io as sio
6 | 
7 | class PreprocessDataset(Dataset):
8 |     '''
9 |     A Dataset that must be fractioned and where each fraction needs to be preprocessed
10 |     Args:
11 |         datas ([[any]]): A list of the data where each data contains datapoints
12 |         preprocess (f(any)->tensor): Function from datapoints to tensor
13 |     '''
14 |     def __init__(self, datas, preprocess):
15 |         self.datas = datas
16 |         self.preprocess = preprocess
17 | 
18 |     def __getitem__(self, index):
19 |         return tuple(self.preprocess(data[index]) for data in self.datas)
20 | 
21 |     def __len__(self):
22 |         return len(self.datas[0])
23 | 
24 | 
25 | def split_data(datas, split_sizes=[0.8, 0.2]):
26 |     '''
27 |     Splits the dataset into sets of the given proportions
28 |     Args:
29 |         datas ([tensor]): The data to be split
30 |         split_sizes ([float]): The relative sizes of the splits
31 |     Returns ([[tensor]]): The list of splits
32 |     '''
33 |     start_index = 0
34 |     splits = []
35 |     split_sizes = [split_size/sum(split_sizes) for split_size in split_sizes]
36 |     for split_size in split_sizes:
37 |         end_index = min(
38 |             int(datas[0].size(0)*split_size)+start_index, datas[0].size(0)
39 |         )
40 |         splits.append(
41 |             [data[start_index:end_index] for data in datas]
42 |         )
43 |         start_index = end_index
44 |     return splits
45 | 
46 | def load_pickled_gym_data(path_to_data, val_split=0.2):
47 |     '''
48 |     Takes pickled gym data and prepares it for pytorch use
49 |     Args:
50 |         path_to_data (str): Path to the .pickle file with data
51 |         val_split (float): What fraction of data to use for validation
52 |     Returns ({data}): A dict with data in training and validation splits
53 |     '''
54 |     assert val_split <= 1 and val_split >= 0, \
55 |         'val_split must be between 0 and 1'
56 | 
57 |     data = pickle.load(open(path_to_data, 'rb'))
58 |     parameters = data['parameters']
59 |     data_size = parameters['rollouts']*parameters['timesteps_per_rollout']
60 |     val_index = data_size - 
int(data_size*val_split) 61 | val_index = val_index - (val_index % parameters['timesteps_per_rollout']) 62 | 63 | for key, value in data.items(): 64 | if key == 'parameters': 65 | continue 66 | assert len(value) == data_size, \ 67 | 'non-parameter data should contain data_size ({}) entries'.format( 68 | data_size 69 | ) 70 | if key == 'imgs': 71 | value = np.transpose(value, (0,3,1,2)) 72 | if (np.array(value).dtype.kind in ['f','u','i']): 73 | value = torch.from_numpy(np.array(value, dtype=np.float32)) 74 | train, valid = value[:val_index], value[val_index:data_size] 75 | data[key] = train, valid 76 | return data 77 | 78 | def load_lunarlander_data(path_to_data, keep_off_screen=True): 79 | ''' 80 | Takes pickled gym LunarLander-v2 data and prepares it for pytorch use 81 | Args: 82 | path_to_data (str): Path to the .pickle file with data 83 | keep_off_screen (bool): Whether to keep images with lander off-screen 84 | Returns (tensor, tensor): The images and corresponing lander positions 85 | ''' 86 | 87 | data = load_pickled_gym_data(path_to_data, 0) 88 | images = data['imgs'][0].float() 89 | labels = data['observations'][0] 90 | labels = labels.narrow(1,0,2).float() 91 | if not keep_off_screen: 92 | #Remove data where the lander is off screen (-1<=x<=1 & -0.5<=y<=1.5) 93 | condition = ( 94 | (labels[:,0]<=1) & (labels[:,0]>=-1) & 95 | (labels[:,1]<=1.5) & (labels[:,1]>=-0.5) 96 | ) 97 | labels = labels[condition, :] 98 | images = images[condition, :] 99 | return images, labels 100 | 101 | def load_svhn_data(path_to_data): 102 | ''' 103 | Reads and returns the data for the svhn dataset 104 | Args: 105 | path_to_data (str): Path to the binary file containing images and labels 106 | Returns (tensor, tensor): The images wrap-padded to be 64x64 and the labels 107 | ''' 108 | 109 | data = sio.loadmat(path_to_data) 110 | images = data['X'] 111 | images = np.transpose(images, (3,2,0,1)) 112 | images = np.pad(images, ((0,0),(0,0),(0,32),(0,32)), mode='wrap') 113 | images = images/255 114 | images = torch.from_numpy(images).float() 115 | labels = data['y'] 116 | labels = labels.reshape((-1)) 117 | labels = labels-1 118 | labels = np.eye(10)[labels] 119 | labels = torch.from_numpy(labels).float() 120 | return images, labels 121 | 122 | def load_stl_data(path_to_images, path_to_labels=None): 123 | ''' 124 | Reads and returns the images and labels for the STL-10 dataset 125 | Args: 126 | path_to_images (str): Path to the binary file containing images 127 | path_to_labels (str): Path to the binary file containing labels 128 | Returns (tensor, tensor): The images with channels first and labels 129 | ''' 130 | 131 | with open(path_to_images, 'rb') as f: 132 | everything = np.fromfile(f, dtype=np.uint8) 133 | images = np.reshape(everything, (-1, 3, 96, 96)) 134 | images = images/255 135 | images = torch.from_numpy(images).float() 136 | 137 | if not path_to_labels is None: 138 | with open(path_to_labels, 'rb') as f: 139 | labels = np.fromfile(f, dtype=np.uint8) 140 | labels = labels-1 141 | labels = np.eye(10)[labels] 142 | labels = torch.from_numpy(labels).float() 143 | else: 144 | labels = None 145 | 146 | return images, labels -------------------------------------------------------------------------------- /experiment.py: -------------------------------------------------------------------------------- 1 | # Library imports 2 | import numpy as np 3 | import matplotlib.pyplot as plt 4 | import torch 5 | import torch.nn as nn 6 | from torch.utils.data import TensorDataset, DataLoader 7 | import random 8 | 
import math 9 | import datetime 10 | import time 11 | import argparse 12 | import os 13 | import csv 14 | import sys 15 | from itertools import combinations_with_replacement, product 16 | 17 | # File imports 18 | from utility import run_training, run_epoch, fc_net, EarlyStopper 19 | from VAE import FourLayerCVAE, train_autoencoder, encode_data 20 | from perceptual_networks import SimpleExtractor, architecture_features 21 | from perceptual_embedder import FeaturePredictorCVAE, FeatureAutoencoder, \ 22 | PerceptualFeatureToImgCVAE, FeatureToImgCVAE 23 | 24 | # Dataset imports 25 | from dataset_loader import split_data, load_lunarlander_data, \ 26 | load_svhn_data, load_stl_data 27 | 28 | 29 | def generate_autoencoders(index_file, dataset_name, data, epochs=100, 30 | batch_size=512, networks=[FourLayerCVAE], 31 | z_dims=[32,64,128], gammas=[0,0.001,0.01], 32 | perceptual_nets=[None, SimpleExtractor('alexnet', 5)], repetitions=1 33 | ): 34 | ''' 35 | Trains autoencoders with all combinations of the given parameters that are 36 | missing from index_file and adds them to index_file 37 | Args: 38 | index_file (str): Path to file to save model paths and parameters in 39 | dataset_name (str): Name of the dataset 40 | data (tensor, tensor): Tuple with train and validation data 41 | epochs (int): Maximum number of epochs to train each autoencoder for 42 | batch_size (int): Size of the batches 43 | networks ([f()->nn.Module]): Autoencoder implementations 44 | z_dims ([int]): The z_dim values to try 45 | gammas ([float]): The gamma values to try (0 = non-variational) 46 | perceptual_nets ([nn.Module/None]): Perceptual networks for loss 47 | repetitions (int): How many AEs to train with each setting 48 | ''' 49 | 50 | #Create the index path + file if they don't exist already 51 | path = index_file.split(sep='/')[:-1] 52 | if len(path) > 0: 53 | try: 54 | os.makedirs('/'.join(path)) 55 | except FileExistsError: 56 | pass 57 | if not os.path.isfile(index_file): 58 | print(f'Creating a autoencoder index file at {index_file}...') 59 | with open(index_file, 'a') as index: 60 | index_writer = csv.writer(index, delimiter='\t') 61 | index_writer.writerow([ 62 | 'autoencoder_path', 63 | 'dataset_name', 64 | 'input_size', 65 | 'epochs', 66 | 'network', 67 | 'z_dim', 68 | 'gamma', 69 | 'perceptual_net', 70 | 'actual_epochs', 71 | 'process_time', 72 | 'validation_loss' 73 | ]) 74 | 75 | input_size = (data[0].size()[2], data[0].size()[3]) 76 | 77 | # For each parameter combination 78 | for network, z_dim, gamma, perceptual_net in product( 79 | networks, z_dims, gammas, perceptual_nets 80 | ): 81 | 82 | parameters = [ 83 | dataset_name, 84 | str(input_size), 85 | str(epochs), 86 | str(network), 87 | str(z_dim), 88 | str(gamma), 89 | str(perceptual_net) 90 | ] 91 | 92 | # Don't train more AEs per setting than necessary 93 | already_trained = 0 94 | with open(index_file, 'r') as index: 95 | index_reader = csv.reader(index, delimiter='\t') 96 | try: 97 | field_names = next(index_reader) 98 | except StopIteration: 99 | raise RuntimeError( 100 | f'Header is missing in {index_file} ' 101 | f'Delete the file and run again' 102 | ) 103 | for row in index_reader: 104 | if list(row[1:-3]) == parameters: 105 | already_trained += 1 106 | 107 | # Train as many AEs as are missing for this parameter setting 108 | for _ in range(repetitions-already_trained): 109 | 110 | # Initialize an autoencoder model with the given parameters 111 | model = network( 112 | input_size = input_size, 113 | z_dimensions = z_dim, 114 | variational = 
(gamma != 0), 115 | gamma = gamma, 116 | perceptual_net = perceptual_net 117 | ) 118 | 119 | # Train the autoencoder with the data and meassure the time it takes 120 | timestamp = time.process_time() 121 | model, model_path, val_loss, actual_epochs = train_autoencoder( 122 | data, 123 | model, 124 | epochs, 125 | batch_size, 126 | gpu=torch.cuda.is_available(), 127 | display=False, 128 | save_path='checkpoints' 129 | ) 130 | elapsed_time = time.process_time() - timestamp 131 | 132 | # Save the path and parameters to index_file 133 | with open(index_file, 'a') as index: 134 | index_writer = csv.writer(index, delimiter='\t') 135 | index_writer.writerow([ 136 | model_path, 137 | dataset_name, 138 | str(input_size), 139 | str(epochs), 140 | str(network), 141 | str(z_dim), 142 | str(gamma), 143 | str(perceptual_net), 144 | str(actual_epochs), 145 | str(elapsed_time), 146 | str(val_loss) 147 | ]) 148 | 149 | def generate_dense_architectures(hidden_sizes, hidden_nrs): 150 | ''' 151 | Given acceptable sizes for hidden layers and acceptable number of layers, 152 | generates all feasible architectures to test. 153 | 154 | Args: 155 | hidden_sizes ([int]): List of acceptable sizes of the hidden layers 156 | hidden_nrs ([int]): List of acceptable number of layers 157 | 158 | Returns ([[int]]): List of architectures consisting of list of layer sizes 159 | ''' 160 | archs = [] 161 | hidden_sizes.sort(reverse=True) 162 | for hidden_nr in hidden_nrs: 163 | archs = archs + list(combinations_with_replacement(hidden_sizes, hidden_nr)) 164 | return [list(arch) for arch in archs] 165 | 166 | def run_experiment(results_file, dataset_name, train_data, validation_data, 167 | test_data, autoencoder_index, epochs, batch_size, predictor_architectures, 168 | predictor_hidden_functions, predictor_output_functions, 169 | allowed_ae_parameters={}, ae_repetitions=1, predictor_repetitions=1 170 | ): 171 | ''' 172 | Trains and tests fully connected networks with the given architectures on 173 | the given data, using autoencoders from autoencoder_index to encode the 174 | images. 
The results of the tests are saved to result_file 175 | Args: 176 | results_file (str): Path of the results file 177 | dataset_name (str): Name of the dataset (used to pick the correct AEs) 178 | train_data (tensor, tensor): Data and labels to train models on 179 | validation_data (tensor, tensor): Data and labels to validate models on 180 | test_data (tensor, tensor): Data and labels to test models on 181 | autoencoder_index (str): Path to index file of trained autoencoders 182 | epochs (int): Number of epochs to train each model for 183 | batch_size (int): Size of batches 184 | predictor_architectures ([[int]]): Architectures defined by layer sizes 185 | predictor_hidden_functions ([f()->nn.Module]): Hidden layer functions 186 | predictor_out_functions ([f()->nn.Module]): Output activation functions 187 | allowed_ae_parameters ({[any]}): Allowed parameters (all if empty) 188 | ae_repetitions (int): Nr of AEs with the same settings to test 189 | predictor_repetitions (int): Nr of predictors to train per setting 190 | ''' 191 | 192 | #Create the results path + file if they don't exist already 193 | path = results_file.split(sep='/')[:-1] 194 | if len(path) > 0: 195 | try: 196 | os.makedirs('/'.join(path)) 197 | except FileExistsError: 198 | pass 199 | if not os.path.isfile(results_file): 200 | with open(results_file, 'a') as results: 201 | results_writer = csv.writer(results, delimiter='\t') 202 | results_writer.writerow([ 203 | 'autoencoder_path', 204 | 'dataset_name', 205 | 'input_size', 206 | 'autoencoder_epochs', 207 | 'autoencoder_network', 208 | 'z_dim', 209 | 'gamma', 210 | 'perceptual_net', 211 | 'autoencoder_actual_epochs', 212 | 'autoencoder_time', 213 | 'autoencoder_val_loss', 214 | 'predictor_path', 215 | 'architecture', 216 | 'hidden_function', 217 | 'out_function', 218 | 'predictor_epochs', 219 | 'predictor_actual_epochs', 220 | 'predictor_train_time', 221 | 'autoencode_test_time', 222 | 'predictor_test_time', 223 | 'validation_mse', 224 | 'test_mse', 225 | 'mean_l1_distance', 226 | 'mean_l2_distance', 227 | 'accuracy' 228 | ]) 229 | 230 | # Setup variables and losses that is used by all tests 231 | image_size = (train_data[0].size()[2], train_data[0].size()[3]) 232 | label_size = train_data[1].size()[1] 233 | loss_function = torch.nn.MSELoss() 234 | losses = lambda output, target : [ 235 | loss_function(output, target), 236 | torch.mean(torch.norm(output-target,1,dim=1)), 237 | torch.mean(torch.norm(output-target,2,dim=1)), 238 | torch.mean( 239 | torch.eq(torch.max(output,1)[1], torch.max(target,1)[1]).float() 240 | ) 241 | ] 242 | 243 | # Collect paths and parameters of all autoencoders to use 244 | autoencoders = [] 245 | repetition_counter = {} 246 | with open(autoencoder_index, 'r') as index: 247 | index_reader = csv.reader(index, delimiter='\t') 248 | try: 249 | field_names = next(index_reader) 250 | except StopIteration: 251 | raise RuntimeError( 252 | f'Header is missing in {autoencoder_index} ' 253 | f'Delete the file and run again' 254 | ) 255 | for row in index_reader: 256 | if row[1] != dataset_name or row[2] != str(image_size): 257 | continue 258 | allowed_autoencoder = True 259 | for i, key in enumerate(field_names): 260 | if not key in allowed_ae_parameters: 261 | continue 262 | if row[i] not in allowed_ae_parameters[key]: 263 | allowed_autoencoder = False 264 | break 265 | if allowed_autoencoder: 266 | key = tuple(row[1:-3]) 267 | if not key in repetition_counter: 268 | repetition_counter[key] = 1 269 | autoencoders.append(row) 270 | elif 
repetition_counter[key] < ae_repetitions: 271 | repetition_counter[key] = repetition_counter[key] + 1 272 | autoencoders.append(row) 273 | 274 | 275 | # For all autoencoders run the test with all predictors 276 | for autoencoder_parameters in autoencoders: 277 | autoencoder_path = autoencoder_parameters[0] 278 | encoding_size = int(autoencoder_parameters[5]) 279 | autoencoder = torch.load(autoencoder_path, map_location='cpu') 280 | 281 | # Encode and prepare the data only once for each AE 282 | ae_encoded = False 283 | 284 | # Train and test all predictors on the given data 285 | for architecture, hidden_func, out_func in product( 286 | predictor_architectures, 287 | predictor_hidden_functions, 288 | predictor_output_functions 289 | ): 290 | 291 | # Initialize the predictor 292 | architecture = architecture.copy() 293 | architecture.append(label_size) 294 | act_functs = [hidden_func]*(len(architecture)-1) + [out_func] 295 | predictor = fc_net( 296 | input_size = encoding_size, 297 | layers = architecture, 298 | activation_functions = act_functs 299 | ) 300 | optimizer = torch.optim.Adam(predictor.parameters()) 301 | 302 | # Don't train more predictors per setting than necessary 303 | parameters = [ 304 | autoencoder_path, 305 | str(architecture), 306 | str(hidden_func), 307 | str(out_func), 308 | str(epochs) 309 | ] 310 | already_tested = 0 311 | with open(results_file, 'r') as results: 312 | results_reader = csv.reader(results, delimiter='\t') 313 | try: 314 | field_names = next(results_reader) 315 | except StopIteration: 316 | raise RuntimeError( 317 | f'Header is missing in {results_file} ' 318 | f'Delete the file and run again' 319 | ) 320 | for row in results_reader: 321 | if list([row[i] for i in [0,12,13,14,15]]) == parameters: 322 | already_tested += 1 323 | 324 | # Train as many predictors as are missing for this parameter setting 325 | for _ in range(predictor_repetitions-already_tested): 326 | 327 | # If it's the first iteration with this AE, prepare the data 328 | if not ae_encoded: 329 | print(f'Encoding data with autoencoder at {autoencoder_path}...') 330 | train_encoded = encode_data(autoencoder,train_data[0],batch_size) 331 | train_dataset = TensorDataset(train_encoded, train_data[1]) 332 | train_loader = DataLoader(train_dataset, batch_size, shuffle=True) 333 | 334 | val_encoded = encode_data(autoencoder,validation_data[0],batch_size) 335 | val_dataset = TensorDataset(val_encoded, validation_data[1]) 336 | val_loader = DataLoader(val_dataset, batch_size, shuffle=False) 337 | 338 | timestamp = time.process_time() 339 | test_encoded = encode_data(autoencoder,test_data[0],batch_size) 340 | test_dataset = TensorDataset(test_encoded, test_data[1]) 341 | test_loader = DataLoader(test_dataset, batch_size, shuffle=False) 342 | autoencode_test_time = time.process_time() - timestamp 343 | 344 | ae_encoded = True 345 | 346 | # Train the predictor and meassure the time it takes 347 | early_stop = EarlyStopper(patience=max(10, epochs/20)) 348 | timestamp = time.process_time() 349 | ( 350 | predictor, predictor_path, validation_loss, actual_epochs 351 | ) = run_training( 352 | predictor, train_loader, val_loader, losses, 353 | optimizer, 'checkpoints', epochs, epoch_update=early_stop 354 | ) 355 | train_time = time.process_time() - timestamp 356 | 357 | # Test the predictor and meassure the time it takes 358 | timestamp = time.process_time() 359 | test_losses = run_epoch( 360 | predictor, test_loader, losses, optimizer, 361 | epoch_name='Test',train=False 362 | ) 363 | test_time = 
time.process_time() - timestamp 364 | print() 365 | 366 | # Write the results to a .csv file 367 | with open(results_file, 'a') as results: 368 | results_writer = csv.writer( 369 | results, 370 | delimiter='\t', 371 | quotechar='"', 372 | quoting=csv.QUOTE_MINIMAL 373 | ) 374 | results_writer.writerow( 375 | autoencoder_parameters + 376 | [ 377 | predictor_path, architecture, str(hidden_func), 378 | str(out_func), epochs, actual_epochs, train_time, 379 | autoencode_test_time, test_time, validation_loss 380 | ] + 381 | test_losses 382 | ) 383 | 384 | def main(): 385 | ''' 386 | Given the autoencoder parameters and a dataset trains those autoencoders 387 | that are missing and then trains and tests the predictors specified by the 388 | predictor parameters for each autoencoer. 389 | ''' 390 | # Create parser and parse input 391 | parser = argparse.ArgumentParser() 392 | parser.add_argument( 393 | #To add a dataset, append its name here and preprocessing later 394 | '--data', type=str, choices=['lunarlander','stl10','svhn'], 395 | required=True, help='The dataset to test on' 396 | ) 397 | parser.add_argument( 398 | '--ae_epochs', type=int, default=50, 399 | help='Nr of epochs to train autoencoders for' 400 | ) 401 | parser.add_argument( 402 | '--ae_batch_size', type=int, default=512, 403 | help='Size of autoencoder batches' 404 | ) 405 | parser.add_argument( 406 | #To add an autoencoder, append its name here and preprocessing later 407 | '--ae_networks', type=str, default=['FourLayerCVAE'], nargs='+', 408 | choices=[ 409 | 'FourLayerCVAE', 'FeaturePredictorCVAE', 'FeatureAutoencoder', 410 | 'PerceptualFeatureToImgCVAE', 'FeatureToImgCVAE' 411 | ], 412 | help='The different autoencoder networks to use' 413 | ) 414 | parser.add_argument( 415 | '--ae_zs', type=int, default=[64,128], nargs='+', 416 | help='The different autoencoder z_dims to use' 417 | ) 418 | parser.add_argument( 419 | '--ae_gammas', type=float, default=[0.0,0.01], nargs='+', 420 | help='The different autoencoder gammas to use' 421 | ) 422 | parser.add_argument( 423 | '--perceptual_nets', type=str, default=['None', 'alexnet'], nargs='+', 424 | help='The different perceptual networks to use for autoencoders' 425 | ) 426 | parser.add_argument( 427 | '--perceptual_layers', type=int, default=[5], nargs='+', 428 | help='The different feature extraction layers to test' 429 | ) 430 | parser.add_argument( 431 | '--predictor_epochs', type=int, default=500, 432 | help='Nr of epochs to train predictors for' 433 | ) 434 | parser.add_argument( 435 | '--predictor_batch_size', type=int, default=512, 436 | help='Size of predictor batches' 437 | ) 438 | parser.add_argument( 439 | '--autoencoder_index', type=str, default='autoencoder_index.csv', 440 | help='Path to store/load autoencoder paths/parameters to/from' 441 | 442 | ) 443 | parser.add_argument( 444 | '--results_path', type=str, default='results.csv', 445 | help='Path to save results to' 446 | 447 | ) 448 | parser.add_argument( 449 | '--ae_repetitions', type=int, default=1, 450 | help='How many AEs to train with each hyperparamter setting' 451 | 452 | ) 453 | parser.add_argument( 454 | '--predictor_repetitions', type=int, default=1, 455 | help='How many predictors per AE and hyperparameter setting to train' 456 | 457 | ) 458 | #TODO: Implement 459 | #parser.add_argument( 460 | # '--no_gpu', action='store_true', 461 | # help='GPUs will not be used even if they are available' 462 | #) 463 | #TODO: Implement 464 | #parser.add_argument( 465 | # '--memory_wary', action='store_true', 466 | # 
help='Will attempt to lower RAM usage (possibly at cost of speed)' 467 | #) 468 | #TODO: Add arguments to use non-default architectures and functions 469 | 470 | args = parser.parse_args() 471 | 472 | # Load autoencoder dataset, add code here to add new datasets 473 | print('Loading data for autoencoder training...') 474 | if args.data == 'lunarlander': 475 | raise NotImplementedError( 476 | 'Use gym_datagenerator.py to generate data ' 477 | 'then uncomment and add file names below' 478 | ) 479 | #data, _ = load_lunarlander_data( 480 | # './datasets/LunarLander-v2/' 481 | #) 482 | elif args.data == 'stl10': 483 | data, _ = load_stl_data('./datasets/stl10/unlabeled_X.bin') 484 | elif args.data == 'svhn': 485 | data, _ = load_svhn_data('./datasets/svhn/extra_32x32.mat') 486 | else: 487 | raise ValueError( 488 | f'Dataset {args.data} does not match any implemented dataset name' 489 | ) 490 | train_data, validation_data = split_data([data]) 491 | train_data = train_data[0] 492 | validation_data = validation_data[0] 493 | 494 | # Get autoencoder networks, add code here to add new autoencoders 495 | networks = [] 496 | for network in args.ae_networks: 497 | if network == 'FourLayerCVAE': 498 | networks.append(FourLayerCVAE) 499 | elif network == 'FeaturePredictorCVAE': 500 | networks.append(FeaturePredictorCVAE) 501 | elif network == 'FeatureAutoencoder': 502 | networks.append(FeatureAutoencoder) 503 | elif network == 'PerceptualFeatureToImgCVAE': 504 | networks.append(PerceptualFeatureToImgCVAE) 505 | elif network == 'FeatureToImgCVAE': 506 | networks.append(FeatureToImgCVAE) 507 | else: 508 | raise ValueError( 509 | f'{network} does not match any known autoencoder' 510 | ) 511 | 512 | # Get perceptual networks, add code here to add new perceptual networks 513 | perceptual_nets = [] 514 | for perceptual_net in args.perceptual_nets: 515 | if perceptual_net == 'None': 516 | perceptual_nets.append(None) 517 | elif perceptual_net in architecture_features: 518 | for layer in args.perceptual_layers: 519 | perceptual_nets.append(SimpleExtractor(perceptual_net, layer)) 520 | else: 521 | raise ValueError( 522 | f'{perceptual_net} does not match any known perceptual net\n' 523 | 'Select from: \n\t' + '\n\t'.join(architecture_features.keys()) 524 | ) 525 | 526 | # Train the missing autoencoders 527 | generate_autoencoders( 528 | index_file = args.autoencoder_index, 529 | dataset_name = args.data, 530 | data = (train_data, validation_data), 531 | epochs = args.ae_epochs, 532 | batch_size = args.ae_batch_size, 533 | networks = networks, 534 | z_dims = args.ae_zs, 535 | gammas = args.ae_gammas, 536 | perceptual_nets = perceptual_nets, 537 | repetitions = args.ae_repetitions 538 | ) 539 | 540 | # Load the predictor training and testing data, code here to add dataset 541 | print('Loading data for predictor training and testing...') 542 | if args.data == 'lunarlander': 543 | raise NotImplementedError( 544 | 'Use gym_datagenerator.py to generate data ' 545 | 'then uncomment and add file names below' 546 | ) 547 | #data, labels = load_lunarlander_data( 548 | # './datasets/LunarLander-v2/', 549 | # keep_off_screen=False 550 | #) 551 | #test_data, test_labels = load_lunarlander_data( 552 | # './datasets/LunarLander-v2/', 553 | # keep_off_screen=False 554 | #) 555 | elif args.data == 'stl10': 556 | data, labels = load_stl_data( 557 | './datasets/stl10/train_X.bin', 558 | './datasets/stl10/train_y.bin' 559 | ) 560 | test_data, test_labels = load_stl_data( 561 | './datasets/stl10/test_X.bin', 562 | 
'./datasets/stl10/test_y.bin' 563 | ) 564 | elif args.data == 'svhn': 565 | data, labels = load_svhn_data( 566 | './datasets/svhn/train_32x32.mat' 567 | ) 568 | test_data, test_labels = load_svhn_data( 569 | './datasets/svhn/test_32x32.mat' 570 | ) 571 | else: 572 | raise ValueError( 573 | f'Dataset {args.data} does not match any implemented dataset name' 574 | ) 575 | train_data, validation_data = split_data([data, labels]) 576 | test_data = (test_data, test_labels) 577 | 578 | # Create architectures TODO: Add ability to control this 579 | architectures = [ 580 | [], [32], [64], [32,32], [64,32], [64,64], [128,128] 581 | ] 582 | 583 | # Set hidden and out functions TODO: Add ability to control this 584 | hidden_functions = [nn.LeakyReLU] 585 | out_functions = [None] 586 | 587 | # Run experiments 588 | allowed_ae_parameters = { 589 | 'epochs' : [str(args.ae_epochs)], 590 | 'network' : [str(network) for network in networks], 591 | 'z_dim' : [str(z) for z in args.ae_zs], 592 | 'gamma' : [str(gamma) for gamma in args.ae_gammas], 593 | 'perceptual_net' : [str(net) for net in perceptual_nets] 594 | } 595 | run_experiment( 596 | results_file = args.results_path, 597 | dataset_name = args.data, 598 | train_data = train_data, 599 | validation_data = validation_data, 600 | test_data = test_data, 601 | autoencoder_index = args.autoencoder_index, 602 | epochs = args.predictor_epochs, 603 | batch_size = args.predictor_batch_size, 604 | predictor_architectures = architectures, 605 | predictor_hidden_functions = hidden_functions, 606 | predictor_output_functions = out_functions, 607 | allowed_ae_parameters = allowed_ae_parameters, 608 | ae_repetitions = args.ae_repetitions, 609 | predictor_repetitions = args.predictor_repetitions 610 | ) 611 | 612 | # When this file is executed independently, execute the main function 613 | if __name__ == "__main__": 614 | main() -------------------------------------------------------------------------------- /gym_datagenerator.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | import pickle 3 | import random 4 | import signal 5 | import multiprocessing 6 | import gym 7 | import cv2 8 | import os 9 | import matplotlib.pyplot as plt 10 | 11 | def init_worker(): 12 | ''' 13 | Setup worker to throw exceptions back to the main process 14 | ''' 15 | signal.signal(signal.SIGINT, signal.SIG_IGN) 16 | 17 | def collect_rollout_data(environment, agent, timesteps, image_size): 18 | ''' 19 | Runs one rollout in the given environment with the given agent 20 | Args: 21 | environment (str): ID of openai gym to run 22 | agent (f() -> Object / None): Agent policy. 
Random if None 23 | timesteps (int): Nr of timesteps to record rollout for 24 | image_size (int, int): Size of images to be stored in pixels 25 | Returns ([np.array],[float],[bool],[any],[any]): Data from each timestep 26 | ''' 27 | imgs = [] 28 | rewards = [] 29 | dones = [] 30 | actions = [] 31 | observations = [] 32 | rets = (imgs, rewards, dones, actions, observations) 33 | 34 | env = gym.make(environment) 35 | observation = env.reset() 36 | if not agent is None: 37 | actor = agent() 38 | for _ in range(timesteps): 39 | #Each timestep render the env, take an action and update env 40 | if environment != 'CarRacing-v0': 41 | img = env.render('rgb_array') 42 | else: 43 | img = observation 44 | if agent is None: 45 | action = env.action_space.sample() 46 | else: 47 | action = actor(observation) 48 | observation, reward, done, info = env.step(action) 49 | 50 | #Downsize, covert to float np.array, and store image 51 | small_image = np.array( 52 | np.true_divide( 53 | cv2.resize( 54 | img, image_size, 55 | interpolation=cv2.INTER_CUBIC 56 | ), 57 | 255 58 | ), 59 | dtype = np.float16 60 | ) 61 | 62 | #Collect data 63 | imgs.append(small_image) 64 | rewards.append(reward) 65 | dones.append(done) 66 | actions.append(action) 67 | if environment != 'CarRacing-v0': 68 | observations.append(observation) 69 | #Close environement and return data 70 | env.close() 71 | return rets 72 | 73 | def generate_gym_data( 74 | environment='LunarLander-v2', 75 | rollouts=700, 76 | timesteps_per_rollout=150, 77 | image_size=(64,64), 78 | save_file=None, 79 | agent=None, 80 | workers=1 81 | ): 82 | ''' 83 | Creates a .pickle file containing images, actions, parameters, etc 84 | of a number of rollouts in a given Gym environment 85 | Args: 86 | environment (str): ID of openai gym to run 87 | rollouts (int): How many runs will be recorded 88 | timesteps_per_rollout (int): Nr of timesteps recorded per rollout 89 | image_size (int, int): Size of images to be stored in pixels 90 | save_file (str / None): Name of the file to store the dataset in 91 | agent (f() -> Object / None): Agent policy. 
Random if None 92 | ''' 93 | #Creating a save_file name if None is provided 94 | if save_file is None: 95 | save_file = f'{environment}_{rollouts*timesteps_per_rollout}.pickle' 96 | if not os.path.isdir('datasets/' + environment): 97 | os.mkdir('datasets/' + environment) 98 | save_file = 'datasets/' + environment + '/' + save_file 99 | 100 | #Init dict for data 101 | data = { 102 | 'imgs' : [], 103 | 'rewards' : [], 104 | 'dones' : [], 105 | 'actions' : [], 106 | 'parameters' : { 107 | 'environment' : environment, 108 | 'rollouts' : rollouts, 109 | 'timesteps_per_rollout' : timesteps_per_rollout, 110 | 'image_size' : image_size, 111 | 'agent' : agent.__class__.__name__ 112 | } 113 | } 114 | if environment != 'CarRacing-v0': 115 | data['observations'] = [] 116 | 117 | 118 | pool = multiprocessing.Pool(workers, init_worker) 119 | 120 | #Run several rollout in parallel 121 | try: 122 | processes = [ 123 | pool.apply_async( 124 | collect_rollout_data, 125 | (environment, agent, timesteps_per_rollout, image_size) 126 | ) 127 | for _ in range(rollouts) 128 | ] 129 | for i, process in enumerate(processes): 130 | imgs, rewards, dones, actions, observations = process.get() 131 | data['imgs'] += imgs 132 | data['rewards'] += rewards 133 | data['dones'] += dones 134 | data['actions'] += actions 135 | if environment != 'CarRacing-v0': 136 | data['observations'] += observations 137 | except Exception as e: 138 | pool.close() 139 | pool.terminate() 140 | pool.join() 141 | raise e 142 | else: 143 | pool.close() 144 | pool.join() 145 | 146 | #Save all collected data and parameters in a .pickle file 147 | pickle.dump(data, open(save_file, 'wb')) 148 | 149 | if __name__ == '__main__': 150 | ''' 151 | If run directly this will generate data from the LunarLander-v2 environment 152 | ''' 153 | generate_gym_data( 154 | rollouts=700, 155 | timesteps_per_rollout=150, 156 | workers=4 157 | ) -------------------------------------------------------------------------------- /perceptual_embedder.py: -------------------------------------------------------------------------------- 1 | # Library imports 2 | import torch 3 | import numpy as np 4 | import torchvision.models as models 5 | from torch.nn import functional as F 6 | import torch.nn as nn 7 | import matplotlib.pyplot as plt 8 | 9 | # File imports 10 | from utility import run_training, EarlyStopper 11 | from VAE import _create_coder, TemplateVAE 12 | 13 | class FeaturePredictorCVAE(TemplateVAE): 14 | ''' 15 | A Convolutional Variational autoencoder trained with feature prediction 16 | I-F-FP procedure in the paper 17 | Args: 18 | input_size (int,int): The height and width of the input image 19 | acceptable sizes are 64+16*n 20 | z_dimensions (int): The number of latent dimensions in the encoding 21 | variational (bool): Whether the model is variational or not 22 | gamma (float): The weight of the KLD loss 23 | perceptual_net: Which perceptual network to use 24 | ''' 25 | 26 | def __init__(self, input_size=(64,64), z_dimensions=32, 27 | variational=True, gamma=20.0, perceptual_net=None 28 | ): 29 | super().__init__() 30 | 31 | #Parameter check 32 | if (input_size[0] - 64) % 16 != 0 or (input_size[1] - 64) % 16 != 0: 33 | raise ValueError( 34 | f'Input_size is {input_size}, but must be 64+16*N' 35 | ) 36 | assert perceptual_net != None, \ 37 | 'For FeaturePredictorCVAE, perceptual_net cannot be None' 38 | 39 | #Attributes 40 | self.input_size = input_size 41 | self.z_dimensions = z_dimensions 42 | self.variational = variational 43 | self.gamma = gamma 44 | 
self.perceptual_net = perceptual_net 45 | 46 | inp = torch.rand((1,3,input_size[0],input_size[1])) 47 | out = self.perceptual_net( 48 | inp.to(next(perceptual_net.parameters()).device) 49 | ) 50 | self.perceptual_size = out.numel() 51 | self.perceptual_loss = True 52 | 53 | encoder_channels = [3,32,64,128,256] 54 | self.encoder = _create_coder( 55 | encoder_channels, [4,4,4,4], [2,2,2,2], 56 | nn.Conv2d, nn.ReLU, 57 | batch_norms=[True,True,True,True] 58 | ) 59 | 60 | f = lambda x: np.floor((x - (2,2))/2) 61 | conv_sizes = f(f(f(f(np.array(input_size))))) 62 | conv_flat_size = int(encoder_channels[-1]*conv_sizes[0]*conv_sizes[1]) 63 | self.mu = nn.Linear(conv_flat_size, self.z_dimensions) 64 | self.logvar = nn.Linear(conv_flat_size, self.z_dimensions) 65 | 66 | g = lambda x: int((x-64)/16)+1 67 | deconv_flat_size = g(input_size[0]) * g(input_size[1]) * 1024 68 | 69 | hidden_layer_size = int(min(self.perceptual_size/2, 2048)) 70 | self.decoder = nn.Sequential( 71 | nn.Linear(self.z_dimensions, hidden_layer_size), 72 | nn.ReLU(), 73 | nn.Linear(hidden_layer_size, self.perceptual_size) 74 | ) 75 | 76 | def loss(self, output, x): 77 | rec_y, z, mu, logvar = output 78 | 79 | y = self.perceptual_net(x) 80 | REC = F.mse_loss(rec_y, y, reduction='mean') 81 | 82 | if self.variational: 83 | KLD = -1 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp()) 84 | return REC + self.gamma*KLD, REC, KLD 85 | else: 86 | return [REC] 87 | 88 | class FeatureAutoencoder(TemplateVAE): 89 | ''' 90 | An fc autoencoder that autoencodes the features of a perceptual network 91 | F-F-FP procedure in the paper 92 | Args: 93 | input_size (int,int): The height and width of the input image 94 | acceptable sizes are 64+16*n 95 | z_dimensions (int): The number of latent dimensions in the encoding 96 | variational (bool): Whether the model is variational or not 97 | gamma (float): The weight of the KLD loss 98 | perceptual_net: Which perceptual network to use 99 | ''' 100 | 101 | def __init__(self, input_size=(64,64), z_dimensions=32, 102 | variational=True, gamma=20.0, perceptual_net=None 103 | ): 104 | super().__init__() 105 | 106 | #Parameter check 107 | if (input_size[0] - 64) % 16 != 0 or (input_size[1] - 64) % 16 != 0: 108 | raise ValueError( 109 | f'Input_size is {input_size}, but must be 64+16*N' 110 | ) 111 | assert perceptual_net != None, \ 112 | 'For FeatureAutoencoder, perceptual_net cannot be None' 113 | 114 | #Attributes 115 | self.input_size = input_size 116 | self.z_dimensions = z_dimensions 117 | self.variational = variational 118 | self.gamma = gamma 119 | self.perceptual_net = perceptual_net 120 | 121 | inp = torch.rand((1,3,input_size[0],input_size[1])) 122 | out = self.perceptual_net( 123 | inp.to(next(perceptual_net.parameters()).device) 124 | ) 125 | self.perceptual_size = out.numel() 126 | self.perceptual_loss = True 127 | 128 | hidden_layer_size = int(min(self.perceptual_size/2, 2048)) 129 | 130 | self.encoder = nn.Sequential( 131 | nn.Linear(self.perceptual_size, hidden_layer_size), 132 | nn.ReLU(), 133 | ) 134 | 135 | self.mu = nn.Linear(hidden_layer_size, self.z_dimensions) 136 | self.logvar = nn.Linear(hidden_layer_size, self.z_dimensions) 137 | 138 | self.decoder = nn.Sequential( 139 | nn.Linear(self.z_dimensions, hidden_layer_size), 140 | nn.ReLU(), 141 | nn.Linear(hidden_layer_size, self.perceptual_size) 142 | ) 143 | 144 | def encode(self, x): 145 | y = self.perceptual_net(x) 146 | y = y.view(y.size(0),-1) 147 | y = self.encoder(y) 148 | mu = self.mu(y) 149 | logvar = self.logvar(y) 150 | return 
mu, logvar 151 | 152 | def loss(self, output, x): 153 | rec_y, z, mu, logvar = output 154 | 155 | y = self.perceptual_net(x) 156 | y = y.view(y.size(0),-1) 157 | 158 | REC = F.mse_loss(rec_y, y, reduction='mean') 159 | 160 | if self.variational: 161 | KLD = -1 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp()) 162 | return REC + self.gamma*KLD, REC, KLD 163 | else: 164 | return [REC] 165 | 166 | class PerceptualFeatureToImgCVAE(TemplateVAE): 167 | ''' 168 | A CVAE that encodes perceptual features and reconstructs the images 169 | Trained with perceptual loss 170 | F-I-PS in the paper 171 | Args: 172 | input_size (int,int): The height and width of the input image 173 | acceptable sizes are 64+16*n 174 | z_dimensions (int): The number of latent dimensions in the encoding 175 | variational (bool): Whether the model is variational or not 176 | gamma (float): The weight of the KLD loss 177 | perceptual_net: Which feature extraction and perceptual net to use 178 | ''' 179 | 180 | def __init__(self, input_size=(64,64), z_dimensions=32, 181 | variational=True, gamma=20.0, perceptual_net=None 182 | ): 183 | super().__init__() 184 | 185 | #Parameter check 186 | if (input_size[0] - 64) % 16 != 0 or (input_size[1] - 64) % 16 != 0: 187 | raise ValueError( 188 | f'Input_size is {input_size}, but must be 64+16*N' 189 | ) 190 | assert perceptual_net != None, \ 191 | 'For PerceptualFeatureToImgCVAE, perceptual_net cannot be None' 192 | 193 | #Attributes 194 | self.input_size = input_size 195 | self.z_dimensions = z_dimensions 196 | self.variational = variational 197 | self.gamma = gamma 198 | self.perceptual_net = perceptual_net 199 | 200 | inp = torch.rand((1,3,input_size[0],input_size[1])) 201 | out = self.perceptual_net( 202 | inp.to(next(perceptual_net.parameters()).device) 203 | ) 204 | self.perceptual_size = out.numel() 205 | self.perceptual_loss = True 206 | 207 | hidden_layer_size = int(min(self.perceptual_size/2, 2048)) 208 | 209 | self.encoder = nn.Sequential( 210 | nn.Linear(self.perceptual_size, hidden_layer_size), 211 | nn.ReLU(), 212 | ) 213 | 214 | self.mu = nn.Linear(hidden_layer_size, self.z_dimensions) 215 | self.logvar = nn.Linear(hidden_layer_size, self.z_dimensions) 216 | 217 | g = lambda x: int((x-64)/16)+1 218 | deconv_flat_size = g(input_size[0]) * g(input_size[1]) * 1024 219 | self.dense = nn.Linear(self.z_dimensions, deconv_flat_size) 220 | 221 | self.decoder = _create_coder( 222 | [1024,128,64,32,3], [5,5,6,6], [2,2,2,2], 223 | nn.ConvTranspose2d, 224 | [nn.ReLU,nn.ReLU,nn.ReLU,nn.Sigmoid], 225 | batch_norms=[True,True,True,False] 226 | ) 227 | 228 | self.relu = nn.ReLU() 229 | 230 | def encode(self, x): 231 | y = self.perceptual_net(x) 232 | y = y.view(y.size(0),-1) 233 | y = self.encoder(y) 234 | mu = self.mu(y) 235 | logvar = self.logvar(y) 236 | return mu, logvar 237 | 238 | def decode(self, z): 239 | y = self.dense(z) 240 | y = self.relu(y) 241 | y = y.view( 242 | y.size(0), 1024, 243 | int((self.input_size[0]-64)/16)+1, 244 | int((self.input_size[1]-64)/16)+1 245 | ) 246 | y = self.decoder(y) 247 | return y 248 | 249 | class FeatureToImgCVAE(PerceptualFeatureToImgCVAE): 250 | ''' 251 | A CVAE that encodes perceptual features and reconstructs the images 252 | Trained with pixel-wise loss 253 | F-I-PW in the paper 254 | Args: 255 | input_size (int,int): The height and width of the input image 256 | acceptable sizes are 64+16*n 257 | z_dimensions (int): The number of latent dimensions in the encoding 258 | variational (bool): Whether the model is variational or not 259 | 
gamma (float): The weight of the KLD loss 260 | perceptual_net: Which feature extraction net to use 261 | ''' 262 | 263 | def loss(self, output, x): 264 | rec_x, z, mu, logvar = output 265 | 266 | x = x.reshape(x.size(0), -1) 267 | rec_x = rec_x.view(x.size(0), -1) 268 | REC = F.mse_loss(rec_x, x, reduction='mean') 269 | 270 | if self.variational: 271 | KLD = -1 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp()) 272 | return REC + self.gamma*KLD, REC, KLD 273 | else: 274 | return [REC] -------------------------------------------------------------------------------- /perceptual_networks.py: -------------------------------------------------------------------------------- 1 | import os 2 | import torch.nn as nn 3 | import torchvision.models as models 4 | 5 | # Dictionary of torchvision models and the attribute 'paths' to their features 6 | architecture_features = { 7 | 'alexnet' : ['features'], 8 | 'vgg11' : ['features'], 9 | 'vgg11_bn' : ['features'], 10 | 'vgg13' : ['features'], 11 | 'vgg13_bn' : ['features'], 12 | 'vgg16' : ['features'], 13 | 'vgg16_bn' : ['features'], 14 | 'vgg19' : ['features'], 15 | 'vgg19_bn' : ['features'], 16 | 'densenet121' : ['features'], 17 | 'densenet161' : ['features'], 18 | 'densenet169' : ['features'], 19 | 'densenet201' : ['features'], 20 | 'resnet18' : [], 21 | 'resnet34' : [], 22 | 'resnet50' : [], 23 | 'resnet101' : [], 24 | 'resnet152' : [], 25 | 'wide_resnet50_2' : [], 26 | 'wide_resnet101_2' : [], 27 | 'shufflenet_v2_x1_0' : [], 28 | 'shufflenet_v2_x2_0' : [], 29 | 'mobilenet_v2' : ['features'], 30 | 'googlenet' : [], 31 | 'inception_v3' : [], 32 | 'squeezenet1_0' : ['features'], 33 | 'squeezenet1_1' : ['features'] 34 | } 35 | 36 | def AlexNet(layer=5, pretrained=True, frozen=True, sigmoid_out=True): 37 | return SimpleExtractor('alexnet',layer,pretrained,frozen,sigmoid_out) 38 | 39 | class SimpleExtractor(nn.Module): 40 | ''' 41 | A simple feature extractor for torchvision models 42 | Args: 43 | architecture (str): The architecture to extract from 44 | layer (int): The sub-module in 'features' to extract at 45 | frozen (bool): Whether to freeze the network so it cannot be trained 46 | sigmoid_out (bool): Whether to normalize the output with a sigmoid 47 | ''' 48 | def __init__(self, architecture, layer, pretrained=True, frozen=True, sigmoid_out=True): 49 | super(SimpleExtractor, self).__init__() 50 | self.architecture = architecture 51 | self.layer = layer 52 | self.frozen = frozen 53 | self.sigmoid_out = sigmoid_out 54 | 55 | os.environ['TORCH_HOME'] = './' 56 | original_model = models.__dict__[architecture](pretrained=pretrained) 57 | original_features = original_model 58 | for attribute in architecture_features[architecture]: 59 | original_features = getattr(original_features, attribute) 60 | self.features = nn.Sequential( 61 | *list(original_features.children())[:layer] 62 | ) 63 | if sigmoid_out: 64 | self.features.add_module('sigmoid',nn.Sigmoid()) 65 | if frozen: 66 | self.eval() 67 | for param in self.features.parameters(): 68 | param.requires_grad = False 69 | 70 | def forward(self, x): 71 | x = self.features(x) 72 | x = x.view(x.size(0), -1) 73 | return x 74 | 75 | def __str__(self): 76 | return ( 77 | f'{self.architecture}(layer={self.layer}, ' 78 | f'frozen={self.frozen}, sigmoid_out={self.sigmoid_out})' 79 | ) 80 | -------------------------------------------------------------------------------- /utility.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import torch.nn as nn 3 | from torch.utils.data import TensorDataset, DataLoader 4 |
import time 5 | import pickle 6 | import numpy as np 7 | import datetime 8 | import matplotlib.pyplot as plt 9 | 10 | def run_epoch(model, dataloader, loss, optimizer, 11 | epoch_name='Epoch', train=True 12 | ): 13 | ''' 14 | Runs one epoch of training or evaluation for a given model 15 | Automatically moves data to the GPU if the model is on the GPU 16 | Args: 17 | model (nn.Module): The network to be trained 18 | dataloader (data.DataLoader): Torch DataLoader to load epoch data 19 | loss (f(output, target)->[tensor]): Loss calculation function 20 | optimizer (optim.Optimizer): Optimizer for use in training 21 | epoch_name (str): Name of the epoch (usually a number) 22 | train (bool): Whether to run this epoch to train or just to evaluate 23 | Returns: ([float]) The mean batch losses of the epoch 24 | ''' 25 | start_time = time.time() 26 | gpu = next(model.parameters()).is_cuda 27 | 28 | if train: 29 | model.train() 30 | else: 31 | model.eval() 32 | epoch_losses = [] 33 | for batch_id, (batch_data, batch_labels) in enumerate(dataloader): 34 | if gpu: 35 | batch_data = batch_data.cuda() 36 | batch_labels = batch_labels.cuda() 37 | output = model(batch_data) 38 | losses = loss(output, batch_labels) 39 | if batch_id == 0: 40 | epoch_losses = [ 41 | loss.item() for loss in losses 42 | ] 43 | else: 44 | epoch_losses = [ 45 | epoch_losses[i] + losses[i].item() for i in range(len(losses)) 46 | ] 47 | if train: 48 | optimizer.zero_grad() 49 | losses[0].backward() 50 | optimizer.step() 51 | print( 52 | '\r{} - [{}/{}] - Losses: {}, Time elapsed: {}s'.format( 53 | epoch_name, batch_id+1, len(dataloader), 54 | ', '.join( 55 | ['{0:.5f}'.format(l/(batch_id+1)) for l in epoch_losses] 56 | ), 57 | '{0:.1f}'.format(time.time()-start_time) 58 | ),end='' 59 | ) 60 | 61 | return [l/(batch_id+1) for l in epoch_losses] 62 | 63 | def run_training(model, train_loader, val_loader, loss, 64 | optimizer, save_path, epochs, epoch_update=None 65 | ): 66 | ''' 67 | Args: 68 | model (nn.Module): The network to be trained 69 | train_loader (data.DataLoader): DataLoader for training data 70 | val_loader (data.DataLoader): DataLoader for validation data 71 | loss (f(output, target)->[tensor]): Loss calculation function 72 | optimizer (optim.Optimizer): Optimizer for use in training 73 | save_path (str): Path to folder where the model will be stored 74 | epochs (int): Number of epochs to train for 75 | epoch_update (f(epoch, train_loss, val_loss) -> bool): Function to run 76 | at the end of an epoch. Returns whether to early stop 77 | Returns (nn.Module, str, float, int): The model, path, val loss, and epochs 78 | ''' 79 | save_file = ( 80 | model.
__class__.__name__ + 81 | datetime.datetime.now().strftime('_%Y-%m-%d_%Hh%Mm%Ss.pt') 82 | ) 83 | if save_path != '': 84 | save_file = save_path + '/' + save_file 85 | 86 | torch_model_save(model, save_file) 87 | best_validation_loss = float('inf') 88 | best_epoch = 0 89 | for epoch in range(1,epochs+1): 90 | training_losses = run_epoch( 91 | model, train_loader, loss, optimizer, 92 | 'Train {}'.format(epoch), train=True 93 | ) 94 | 95 | validation_losses = run_epoch( 96 | model, val_loader, loss, optimizer, 97 | 'Validation {}'.format(epoch), train=False 98 | ) 99 | 100 | print( 101 | f'\rEpoch {epoch} - ' 102 | f'Train loss {training_losses[0]:.5f} - ' 103 | f'Validation loss {validation_losses[0]:.5f}', 104 | ' '*35 105 | ) 106 | 107 | if validation_losses[0] < best_validation_loss: 108 | torch_model_save(model, save_file) 109 | best_validation_loss = validation_losses[0] 110 | best_epoch = epoch 111 | 112 | if not epoch_update is None: 113 | early_stop = epoch_update(epoch, training_losses, validation_losses) 114 | if early_stop: 115 | break 116 | 117 | model = torch.load(save_file) 118 | return model, save_file, best_validation_loss, best_epoch 119 | 120 | class EarlyStopper(): 121 | ''' 122 | An implementation of Early stopping for run_training 123 | Args: 124 | patience (int): How many epochs without progress until stopping early 125 | ''' 126 | 127 | def __init__(self, patience=20): 128 | self.patience = patience 129 | self.current_patience = patience 130 | self.best_loss = 99999999999999 131 | 132 | def __call__(self, epoch, train_losses, val_losses): 133 | if val_losses[0] < self.best_loss: 134 | self.best_loss = val_losses[0] 135 | self.current_patience = self.patience 136 | else: 137 | self.current_patience -= 1 138 | if self.current_patience == 0: 139 | return True 140 | return False 141 | 142 | def fc_net(input_size, layers, activation_functions): 143 | ''' 144 | Creates a simple fully connected network 145 | Args: 146 | input_size (int): Input size to the network 147 | layers ([int]): Layer sizes 148 | activation_functions ([f()->nn.Module]): class of activation functions 149 | Returns: (nn.Sequential) 150 | ''' 151 | if not isinstance(activation_functions, list): 152 | activation_functions = [ 153 | activation_functions for _ in range(len(layers)+1) 154 | ] 155 | 156 | network = nn.Sequential() 157 | layers.insert(0,input_size) 158 | for layer_id in range(len(layers)-1): 159 | network.add_module( 160 | 'linear{}'.format(layer_id), 161 | nn.Linear(layers[layer_id], layers[layer_id+1]) 162 | ) 163 | if not activation_functions[layer_id] is None: 164 | network.add_module( 165 | 'activation{}'.format(layer_id), 166 | activation_functions[layer_id]() 167 | ) 168 | return network 169 | 170 | def torch_model_save(model, file_path): 171 | ''' 172 | Saves a cpu version of the given model at file_path 173 | Args: 174 | model (nn.Module): Model to save 175 | file_path (str): Path to file to store the model in 176 | ''' 177 | device = next(model.parameters()).device 178 | model.cpu() 179 | torch.save(model, file_path) 180 | model.to(device) 181 | --------------------------------------------------------------------------------
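Usage sketch (illustrative, not an exact reproduction of experiment.py): the snippet below wires a frozen AlexNet extractor from perceptual_networks.py into a FeatureAutoencoder from perceptual_embedder.py and trains it with run_training and EarlyStopper from utility.py. The tensors `train_images` and `val_images` are placeholders for float image batches of shape (N, 3, 64, 64) scaled to [0, 1]; the batch size, learning rate, and epoch count are assumed defaults; and the model's forward pass (inherited from TemplateVAE, defined earlier in perceptual_embedder.py) is assumed to return the (reconstruction, z, mu, logvar) tuple that its loss method unpacks.

```python
import torch
from torch.utils.data import TensorDataset, DataLoader

from perceptual_networks import AlexNet
from perceptual_embedder import FeatureAutoencoder
from utility import run_training, EarlyStopper

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

# Frozen, pretrained AlexNet features serve as both the encoder input and the loss target
perceptual_net = AlexNet(layer=5, frozen=True, sigmoid_out=True).to(device)

model = FeatureAutoencoder(
    input_size=(64, 64), z_dimensions=32,
    variational=True, gamma=20.0,
    perceptual_net=perceptual_net
).to(device)

# Autoencoder targets are the inputs themselves, so data and labels coincide.
# train_images / val_images are placeholder tensors of shape (N, 3, 64, 64).
train_loader = DataLoader(
    TensorDataset(train_images, train_images), batch_size=64, shuffle=True
)
val_loader = DataLoader(TensorDataset(val_images, val_images), batch_size=64)

optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

model, save_file, best_val_loss, best_epoch = run_training(
    model, train_loader, val_loader, model.loss, optimizer,
    save_path='', epochs=100, epoch_update=EarlyStopper(patience=20)
)
print(f'Best validation loss {best_val_loss:.5f} at epoch {best_epoch} ({save_file})')
```

run_training checkpoints the model with the lowest validation loss under a timestamped filename and returns that model together with the file path, the best validation loss, and the epoch at which it occurred.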
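After training, the encoder alone is the part used to produce image embeddings for downstream prediction tasks. A minimal sketch, continuing from the example above and assuming `images` is a placeholder batch of shape (N, 3, 64, 64):

```python
# Hedged sketch: using the trained model from the example above as an image encoder.
model.eval()
with torch.no_grad():
    mu, logvar = model.encode(images.to(device))

# The posterior mean is a natural deterministic embedding for downstream tasks;
# logvar is only needed when sampling from the latent distribution.
embeddings = mu
```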