├── .DS_Store
├── .gitattributes
├── 2Dcae.png
├── 3dCAE.png
├── README.md
├── code
│   ├── .DS_Store
│   ├── TransposedConv3DLayer.py
│   ├── cae2d.py
│   └── cae3d.py
└── results
    ├── .DS_Store
    ├── cae3depochs.png
    ├── data.png
    ├── losscae2d.png
    ├── losscae3d.png
    ├── results3D_cae.png
    └── results_2dcae.png

--------------------------------------------------------------------------------
/.DS_Store:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/laurahanu/2D-and-3D-Deep-Autoencoder/3b9590d46aaabc20ad500f2010e5c735ffb2dfb4/.DS_Store
--------------------------------------------------------------------------------
/.gitattributes:
--------------------------------------------------------------------------------
# Auto detect text files and perform LF normalization
* text=auto
--------------------------------------------------------------------------------
/2Dcae.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/laurahanu/2D-and-3D-Deep-Autoencoder/3b9590d46aaabc20ad500f2010e5c735ffb2dfb4/2Dcae.png
--------------------------------------------------------------------------------
/3dCAE.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/laurahanu/2D-and-3D-Deep-Autoencoder/3b9590d46aaabc20ad500f2010e5c735ffb2dfb4/3dCAE.png
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# 2D-and-3D-Deep-Autoencoder
Convolutional autoencoder application on MRI images

Application of a deep convolutional autoencoder network to MRI images of knees. The MRI database used was provided by Imperial College London; however, similar databases can be found on the OAI website (http://www.oai.ucsf.edu/), an observational study dedicated to monitoring the natural evolution of osteoarthritis.
The original dataset comprised 3D images of size 160x64x64 from 180 patients.
The 2D dataset used in this project combined the images from different patients into 28800 2D black-and-white MRI images of size 64x64.
The 3D dataset has a shape of 180x1x160x64x64.

# Prerequisites
- Python, Lasagne (developer version), Theano (developer version), NumPy, Matplotlib, scikit-image
- NVIDIA GPU (compute capability 5.0 or above)

# Architectures

- **2D CAE**

![alt text](2Dcae.png)

- **3D CAE**

![alt text](3dCAE.png)

# Results

## 2D CAE

The hyperparameters used were:
- a learning rate of 0.0005
- a decay rate of 0.5
- a batch size of 128 images
- a latent (z) space of size 100

The network was trained for 100 epochs (22500 iterations).

- **Comparison between input images (top) and decoded images (bottom)**

![alt text](results/results_2dcae.png)

- **Loss function for 2D CAE**

![alt text](results/losscae2d.png)

## 3D CAE
Since only 180 3D images were available, the network was trained for 500 epochs with a batch size of 4 images. A larger batch size could not be used due to GPU memory constraints. Better results might be obtained by tweaking these hyperparameters.
To quickly visualize and monitor the results, slices at different depths of the 3D images were extracted and compared with the original images at the same depth.
- **Results for 3D CAE**

![alt text](results/results3D_cae.png)

- **Training evolution 3D CAE**

![alt text](results/cae3depochs.png)

- **Loss function for 3D CAE**

![alt text](results/losscae3d.png)

# License
This project is licensed under Imperial College London.
# Acknowledgements

The CAE model used in this application was kindly provided by Antonia Creswell and can be found here: https://github.com/ToniCreswell/ConvolutionalAutoEncoder
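# Data format

The scripts under `code/` expect the datasets as NumPy arrays with the shapes described above. A minimal sanity check (a sketch only; the file names here are placeholders, not files shipped with this repository):

```python
import numpy as np

data2d = np.load('mri_slices_2d.npy')   # placeholder path to the 2D dataset
data3d = np.load('mri_volumes_3d.npy')  # placeholder path to the 3D dataset

assert data2d.shape == (28800, 1, 64, 64)       # images, channels, height, width
assert data3d.shape == (180, 1, 160, 64, 64)    # volumes, channels, depth, height, width
```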
--------------------------------------------------------------------------------
/code/.DS_Store:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/laurahanu/2D-and-3D-Deep-Autoencoder/3b9590d46aaabc20ad500f2010e5c735ffb2dfb4/code/.DS_Store
--------------------------------------------------------------------------------
/code/TransposedConv3DLayer.py:
--------------------------------------------------------------------------------
import theano
from theano import tensor as T
from theano.tensor.nnet.abstract_conv import AbstractConv3d_gradInputs

import lasagne
from lasagne.utils import as_tuple
from lasagne.layers.conv import BaseConvLayer, conv_input_length


class TransposedConv3DLayer(BaseConvLayer):  # pragma: no cover
    """
    lasagne.layers.TransposedConv3DLayer(incoming, num_filters, filter_size,
    stride=(1, 1, 1), crop=0, untie_biases=False,
    W=lasagne.init.GlorotUniform(), b=lasagne.init.Constant(0.),
    nonlinearity=lasagne.nonlinearities.rectify, flip_filters=False, **kwargs)

    3D transposed convolution layer

    Performs the backward pass of a 3D convolution (also called transposed
    convolution, fractionally-strided convolution or deconvolution in the
    literature) on its input and optionally adds a bias and applies an
    elementwise nonlinearity.

    Parameters
    ----------
    incoming : a :class:`Layer` instance or a tuple
        The layer feeding into this layer, or the expected input shape. The
        output of this layer should be a 5D tensor, with shape
        ``(batch_size, num_input_channels, input_depth, input_rows,
        input_columns)``.
    num_filters : int
        The number of learnable convolutional filters this layer has.
    filter_size : int or iterable of int
        An integer or a 3-element tuple specifying the size of the filters.
    stride : int or iterable of int
        An integer or a 3-element tuple specifying the stride of the
        transposed convolution operation. For the transposed convolution, this
        gives the dilation factor for the input -- increasing it increases the
        output size.
    crop : int, iterable of int, 'full', 'same' or 'valid' (default: 0)
        By default, the transposed convolution is computed where the input and
        the filter overlap by at least one position (a full convolution). When
        ``stride=1``, this yields an output that is larger than the input by
        ``filter_size - 1``. It can be thought of as a valid convolution padded
        with zeros. The `crop` argument allows you to decrease the amount of
        this zero-padding, reducing the output size. It is the counterpart to
        the `pad` argument in a non-transposed convolution.
        A single integer results in symmetric cropping of the given size on all
        borders, a tuple of three integers allows different symmetric cropping
        per dimension.
        ``'full'`` disables zero-padding. It is equivalent to computing the
        convolution wherever the input and the filter fully overlap.
        ``'same'`` pads with half the filter size (rounded down) on both sides.
        When ``stride=1`` this results in an output size equal to the input
        size. Even filter size is not supported.
        ``'valid'`` is an alias for ``0`` (no cropping / a full convolution).
        Note that ``'full'`` and ``'same'`` can be faster than equivalent
        integer values due to optimizations by Theano.
    untie_biases : bool (default: False)
        If ``False``, the layer will have a bias parameter for each channel,
        which is shared across all positions in this channel. As a result, the
        `b` attribute will be a vector (1D).
        If ``True``, the layer will have separate bias parameters for each
        position in each channel. As a result, the `b` attribute will be a
        4D tensor.
    W : Theano shared variable, expression, numpy array or callable
        Initial value, expression or initializer for the weights.
        This should be a 5D tensor with shape
        ``(num_input_channels, num_filters, filter_depth, filter_rows,
        filter_columns)``.
        Note that the first two dimensions are swapped compared to a
        non-transposed convolution.
        See :func:`lasagne.utils.create_param` for more information.
    b : Theano shared variable, expression, numpy array, callable or ``None``
        Initial value, expression or initializer for the biases. If set to
        ``None``, the layer will have no biases. Otherwise, biases should be
        a 1D array with shape ``(num_filters,)`` if `untie_biases` is set to
        ``False``. If it is set to ``True``, its shape should be
        ``(num_filters, output_depth, output_rows, output_columns)`` instead.
        See :func:`lasagne.utils.create_param` for more information.
    nonlinearity : callable or None
        The nonlinearity that is applied to the layer activations. If None
        is provided, the layer will be linear.
    flip_filters : bool (default: False)
        Whether to flip the filters before sliding them over the input,
        performing a convolution, or not to flip them and perform a
        correlation (this is the default). Note that this flag is inverted
        compared to a non-transposed convolution.
    output_size : int or iterable of int or symbolic tuple of ints
        The output size of the transposed convolution. Allows to specify
        which of the possible output shapes to return when stride > 1.
        If not specified, the smallest shape will be returned.
    **kwargs
        Any additional keyword arguments are passed to the `Layer` superclass.

    Attributes
    ----------
    W : Theano shared variable or expression
        Variable or expression representing the filter weights.
    b : Theano shared variable or expression
        Variable or expression representing the biases.
    Notes
    -----
    The transposed convolution is implemented as the backward pass of a
    corresponding non-transposed convolution. It can be thought of as dilating
    the input (by adding ``stride - 1`` zeros between adjacent input elements),
    padding it with ``filter_size - 1 - crop`` zeros, and cross-correlating it
    with the filters. See [1]_ for more background.

    Examples
    --------
    To transpose an existing convolution, with tied filter weights:

    >>> from lasagne.layers import Conv3DLayer, TransposedConv3DLayer
    >>> conv = Conv3DLayer((None, 1, 32, 32, 32), 16, 3, stride=2, pad=2)
    >>> deconv = TransposedConv3DLayer(conv, conv.input_shape[1],
    ...     conv.filter_size, stride=conv.stride, crop=conv.pad,
    ...     W=conv.W, flip_filters=not conv.flip_filters)

    References
    ----------
    .. [1] Vincent Dumoulin, Francesco Visin (2016):
           A guide to convolution arithmetic for deep learning. arXiv.
           http://arxiv.org/abs/1603.07285,
           https://github.com/vdumoulin/conv_arithmetic
    """
    def __init__(self, incoming, num_filters, filter_size, stride=(1, 1, 1),
                 crop=0, untie_biases=False, W=lasagne.init.GlorotUniform(),
                 b=lasagne.init.Constant(0.),
                 nonlinearity=lasagne.nonlinearities.rectify,
                 flip_filters=False, output_size=None, **kwargs):
        # output_size must be set before calling the super constructor
        if (not isinstance(output_size, T.Variable) and
                output_size is not None):
            output_size = as_tuple(output_size, 3, int)
        self.output_size = output_size
        BaseConvLayer.__init__(self, incoming, num_filters, filter_size,
                               stride, crop, untie_biases, W, b, nonlinearity,
                               flip_filters, n=3, **kwargs)
        # rename self.pad to self.crop:
        self.crop = self.pad
        del self.pad

    def get_W_shape(self):
        num_input_channels = self.input_shape[1]
        # first two sizes are swapped compared to a forward convolution
        return (num_input_channels, self.num_filters) + self.filter_size

    def get_output_shape_for(self, input_shape):
        if self.output_size is not None:
            size = self.output_size
            if isinstance(self.output_size, T.Variable):
                size = (None, None, None)
            return input_shape[0], self.num_filters, size[0], size[1], size[2]

        # If self.output_size is not specified, return the smallest shape.
        # When called from the constructor, self.crop is still called self.pad:
        crop = getattr(self, 'crop', getattr(self, 'pad', None))
        crop = crop if isinstance(crop, tuple) else (crop,) * self.n
        batchsize = input_shape[0]
        return ((batchsize, self.num_filters) +
                tuple(conv_input_length(input, filter, stride, p)
                      for input, filter, stride, p
                      in zip(input_shape[2:], self.filter_size,
                             self.stride, crop)))

    def convolve(self, input, **kwargs):
        border_mode = 'half' if self.crop == 'same' else self.crop
        op = AbstractConv3d_gradInputs(
            imshp=self.output_shape, kshp=self.get_W_shape(),
            subsample=self.stride, border_mode=border_mode,
            filter_flip=not self.flip_filters)
        output_size = self.output_shape[2:]
        if isinstance(self.output_size, T.Variable):
            output_size = self.output_size
        elif any(s is None for s in output_size):
            output_size = self.get_output_shape_for(input.shape)[2:]
        conved = op(self.W, input, output_size)
        return conved


Deconv3DLayer = TransposedConv3DLayer
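
# Illustrative sketch (not from the original file): pairing a forward
# Conv3DLayer with the layer above should round-trip the spatial shape, since
# a transposed convolution computes out = stride*(in - 1) + filter_size - 2*crop.
# Assumes Conv3DLayer is available in this Lasagne build (see cae3d.py below).
if __name__ == '__main__':
    from lasagne.layers import InputLayer, Conv3DLayer, get_output_shape

    net = InputLayer((None, 1, 16, 16, 16))
    net = Conv3DLayer(net, num_filters=8, filter_size=4, stride=2, pad=1)
    print(get_output_shape(net))  # (None, 8, 8, 8, 8)

    # mirror it: out = 2*(8 - 1) + 4 - 2*1 = 16 on every spatial axis
    net = TransposedConv3DLayer(net, num_filters=1, filter_size=4, stride=2, crop=1)
    print(get_output_shape(net))  # (None, 1, 16, 16, 16)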
--------------------------------------------------------------------------------
/code/cae2d.py:
--------------------------------------------------------------------------------
# Convolutional autoencoder (Lasagne) for the 2D MRI dataset of shape
# (28800, 1, 64, 64): number of images, number of channels, image height, image width.

from lasagne.layers import InputLayer, reshape, DenseLayer
from lasagne.layers import Conv2DLayer, TransposedConv2DLayer as Deconv2DLayer
from lasagne.layers import get_output, get_output_shape, get_all_params, get_all_layers
from lasagne.nonlinearities import LeakyRectify as lrelu
from lasagne.nonlinearities import rectify as relu
from lasagne.nonlinearities import sigmoid
from lasagne.objectives import squared_error
from lasagne.updates import adam

import numpy as np
import theano
from theano import tensor as T
import time
import matplotlib
matplotlib.use('Agg')
from matplotlib import pyplot as plt
import scipy.io

floatX = theano.config.floatX

inPath = ''   # path to dataset
outPath = ''  # path to where you want the results to be saved

def get_args():
    print('getting args...')

def save_args():
    print('saving args...')

def build_net(nz=200):

    F = 32  # base number of filters

    # encoder: four strided 5x5 convolutions, then a dense layer down to nz units
    enc = InputLayer(shape=(None, 1, 64, 64))
    enc = Conv2DLayer(incoming=enc, num_filters=F*2, filter_size=5, stride=2, nonlinearity=lrelu(0.2), pad=2)
    enc = Conv2DLayer(incoming=enc, num_filters=F*4, filter_size=5, stride=2, nonlinearity=lrelu(0.2), pad=2)
    enc = Conv2DLayer(incoming=enc, num_filters=F*8, filter_size=5, stride=2, nonlinearity=lrelu(0.2), pad=2)
    enc = Conv2DLayer(incoming=enc, num_filters=F*8, filter_size=5, stride=2, nonlinearity=lrelu(0.2), pad=2)
    enc = reshape(incoming=enc, shape=(-1, F*8*4*4))
    enc = DenseLayer(incoming=enc, num_units=nz, nonlinearity=sigmoid)

    # decoder: dense layer back up, then four strided transposed convolutions
    dec = InputLayer(shape=(None, nz))
    dec = DenseLayer(incoming=dec, num_units=F*8*4*4)
    dec = reshape(incoming=dec, shape=(-1, F*8, 4, 4))
    dec = Deconv2DLayer(incoming=dec, num_filters=F*8, filter_size=4, stride=2, nonlinearity=relu, crop=1)
    dec = Deconv2DLayer(incoming=dec, num_filters=F*4, filter_size=4, stride=2, nonlinearity=relu, crop=1)
    dec = Deconv2DLayer(incoming=dec, num_filters=F*2, filter_size=4, stride=2, nonlinearity=relu, crop=1)
    dec = Deconv2DLayer(incoming=dec, num_filters=1, filter_size=4, stride=2, nonlinearity=sigmoid, crop=1)

    return enc, dec

# print the output shape of every layer in the network

enc, dec = build_net()
for l in get_all_layers(enc):
    print(get_output_shape(l))
for l in get_all_layers(dec):
    print(get_output_shape(l))


def prep_train(alpha=0.0002, beta=0.5, nz=200):

    E, D = build_net(nz=nz)

    x = T.tensor4('x')  # symbolic variable, input to the computational graph

    # get outputs z = E(x), x_hat = D(z)
    encoding = get_output(E, x)
    decoding = get_output(D, encoding)

    # get parameters of E and D
    params_e = get_all_params(E, trainable=True)
    params_d = get_all_params(D, trainable=True)
    params = params_e + params_d

    # calculate cost and updates
    cost = T.mean(squared_error(x, decoding))
    grad = T.grad(cost, params)
    updates = adam(grad, params, learning_rate=alpha, beta1=beta)

    # theano.function returns an actual python function used to evaluate real data
    train = theano.function(inputs=[x], outputs=cost, updates=updates)
    rec = theano.function(inputs=[x], outputs=decoding)
    test = theano.function(inputs=[x], outputs=cost)

    return train, test, rec, E, D
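
# Illustrative smoke test (not in the original script): compile the graph and
# run one Adam step on a random stand-in batch before training on real data.
def smoke_test():
    train_fn, test_fn, rec_fn, E, D = prep_train(alpha=0.0005, nz=100)
    batch = np.random.rand(8, 1, 64, 64).astype(floatX)  # stand-in mini-batch
    print(train_fn(batch))      # one Adam step; returns the mean squared error
    print(rec_fn(batch).shape)  # reconstructions, shape (8, 1, 64, 64)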
def train(trainData, testData, nz=100, alpha=0.00005, beta=0.5, batchSize=128, epoch=100):

    train, test, rec, E, D = prep_train(nz=nz, alpha=alpha)

    print(np.shape(trainData))

    sn, sc, sx, sy = np.shape(trainData)

    print(sn, sc, sx, sy)

    batches = int(np.floor(float(sn) / batchSize))

    # initialize arrays to store the cost functions
    trainCost_ = []
    testCost_ = []

    print('batches=', batches)

    timer = time.time()

    print('epoch \t batch \t train cost \t test cost \t time (s)')

    for e in range(epoch):

        for b in range(batches):

            trainCost = train(trainData[b*batchSize:(b+1)*batchSize])

            testCost = test(testData[:10])  # test on the first 10 images

            print(e, '\t', b, '\t', trainCost, '\t', testCost, '\t', time.time() - timer)

            timer = time.time()

            trainCost_.append(trainCost)
            testCost_.append(testCost)

            # save results every 100 iterations
            if b % 100 == 0:

                # create a montage to visualize how close the decoded images are
                # to the input images: 10 input images on top of their 10
                # reconstructions (other visualizations can be used as well)
                x_test1 = testData[:10]

                montageRow1 = np.hstack(x_test1[:10].reshape(-1, 64, 64))

                REC = rec(x_test1[:10])

                montageRow2 = np.hstack(REC[:10].reshape(-1, 64, 64))

                montage = np.vstack((montageRow1, montageRow2))

                scipy.io.savemat(outPath + 'montageREC' + str(e) + '.mat', dict(montage=montage))

                # save plot of the cost functions
                plt.clf()
                plt.plot(range(e*batches + b), trainCost_[:e*batches + b], label='train cost')
                plt.plot(range(e*batches + b), testCost_[:e*batches + b], label='test cost')
                plt.legend()
                plt.xlabel('Iterations')
                plt.ylabel('Cost Function')
                plt.savefig(outPath + 'cost_regular_{}.png'.format(e))

    return test, rec, E, D

def test(x, rec):
    return rec(x)

# LOAD DATA

data = np.load(inPath, mmap_mode='r').astype(floatX)

data = np.transpose(data, (0, 2, 1, 3))

sn, sc, sx, sy = np.shape(data)

# create training and testing datasets

x_train = data[20:, :, :, :]
x_test = data[:20, :, :, :]

test, rec, E, D = train(x_train, x_test)

# save example reconstructions at the end of training

REC = rec(x_test[:10])

print(np.shape(REC), np.shape(x_test[:10]))

fig = plt.figure()

montageRow1 = np.hstack(x_test[:3].reshape(-1, 64, 64))
montageRow2 = np.hstack(REC[:3].reshape(-1, 64, 64))

montage = np.vstack((montageRow1, montageRow2))

print('montage shape is ', montage.shape)

np.save(outPath + 'montageREC.npy', montage)

scipy.io.savemat(outPath + 'montageREC.mat', dict(montage=montage))
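
# Illustrative sketch (not in the original script): preview one of the saved
# montages; the 'montage' key matches the savemat calls above.
def preview_montage(path='montageREC.mat'):
    m = scipy.io.loadmat(path)['montage']
    plt.figure()
    plt.imshow(m, cmap='gray')
    plt.axis('off')
    plt.savefig('montage_preview.png')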
--------------------------------------------------------------------------------
/code/cae3d.py:
--------------------------------------------------------------------------------
# Convolutional autoencoder (Lasagne) for the 3D MRI dataset of shape
# (180, 1, 160, 64, 64): number of images, number of channels, depth, image height, image width.

from lasagne.layers import InputLayer, reshape, DenseLayer
from lasagne.layers import get_output, get_output_shape, get_all_params, get_all_layers
from lasagne.nonlinearities import LeakyRectify as lrelu
from lasagne.nonlinearities import rectify as relu
from lasagne.nonlinearities import sigmoid
from lasagne.objectives import squared_error
from lasagne.updates import adam

try:
    from lasagne.layers import Conv3DLayer
except ImportError:
    from lasagne.layers.dnn import Conv3DDNNLayer as Conv3DLayer

from TransposedConv3DLayer import TransposedConv3DLayer as Deconv3D

import numpy as np
import theano
from theano import tensor as T
import time
import matplotlib
matplotlib.use('Agg')
from matplotlib import pyplot as plt
import scipy.io

floatX = theano.config.floatX

inPath = ''   # path to dataset
outPath = ''  # path to where you want the results to be saved

def get_args():
    print('getting args...')

def save_args():
    print('saving args...')

def build_net(nz=200):

    input_depth = 160
    input_rows = 64
    input_columns = 64

    # encoder: four strided 5x5x5 convolutions, then a dense layer down to nz units
    enc = InputLayer(shape=(None, 1, input_depth, input_rows, input_columns))  # 5D tensor
    enc = Conv3DLayer(incoming=enc, num_filters=64, filter_size=5, stride=2, nonlinearity=lrelu(0.2), pad=2)
    enc = Conv3DLayer(incoming=enc, num_filters=128, filter_size=5, stride=2, nonlinearity=lrelu(0.2), pad=2)
    enc = Conv3DLayer(incoming=enc, num_filters=256, filter_size=5, stride=2, nonlinearity=lrelu(0.2), pad=2)
    enc = Conv3DLayer(incoming=enc, num_filters=256, filter_size=5, stride=2, nonlinearity=lrelu(0.2), pad=2)
    enc = reshape(incoming=enc, shape=(-1, 256*4*4*10))
    enc = DenseLayer(incoming=enc, num_units=nz, nonlinearity=sigmoid)

    # decoder: dense layer back up, then four strided transposed convolutions
    dec = InputLayer(shape=(None, nz))
    dec = DenseLayer(incoming=dec, num_units=256*4*4*10)
    dec = reshape(incoming=dec, shape=(-1, 256, 10, 4, 4))
    dec = Deconv3D(incoming=dec, num_filters=256, filter_size=4, stride=2, crop=1, nonlinearity=relu)
    dec = Deconv3D(incoming=dec, num_filters=128, filter_size=4, stride=2, crop=1, nonlinearity=relu)
    dec = Deconv3D(incoming=dec, num_filters=64, filter_size=4, stride=2, crop=1, nonlinearity=relu)
    dec = Deconv3D(incoming=dec, num_filters=1, filter_size=4, stride=2, crop=1, nonlinearity=sigmoid)

    return enc, dec

# print the output shape of every layer in the network

enc, dec = build_net()
for l in get_all_layers(enc):
    print(get_output_shape(l))
for l in get_all_layers(dec):
    print(get_output_shape(l))
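
# Sanity check of the encoder's shape arithmetic (illustrative, not in the
# original script): with pad=2, filter_size=5, stride=2, four convolutions
# map 160x64x64 down to 10x4x4, which gives the 256*4*4*10 bottleneck above.
def conv_out(n, filter_size=5, stride=2, pad=2):
    # standard convolution output length
    return (n + 2 * pad - filter_size) // stride + 1

d, h, w = 160, 64, 64
for _ in range(4):  # four strided convolutions
    d, h, w = conv_out(d), conv_out(h), conv_out(w)
print(d, h, w, 256 * d * h * w)  # 10 4 4 40960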
def prep_train(alpha=0.0002, beta=0.5, nz=200):

    E, D = build_net(nz=nz)

    x = T.tensor5('x')  # symbolic variable, input to the computational graph

    # get outputs z = E(x), x_hat = D(z)
    encoding = get_output(E, x)
    decoding = get_output(D, encoding)

    # get parameters of E and D
    params_e = get_all_params(E, trainable=True)
    params_d = get_all_params(D, trainable=True)
    params = params_e + params_d

    # calculate cost and updates
    cost = T.mean(squared_error(x, decoding))
    grad = T.grad(cost, params)
    updates = adam(grad, params, learning_rate=alpha, beta1=beta)

    # theano.function returns an actual python function used to evaluate real data
    train = theano.function(inputs=[x], outputs=cost, updates=updates)
    rec = theano.function(inputs=[x], outputs=decoding)
    test = theano.function(inputs=[x], outputs=cost)

    return train, test, rec, E, D


def train(trainData, testData, nz=200, alpha=0.00005, beta=0.5, batchSize=4, epoch=500):

    train, test, rec, E, D = prep_train(nz=nz, alpha=alpha)

    print(np.shape(trainData))

    sn, sc, sz, sx, sy = np.shape(trainData)

    print(sn, sc, sz, sx, sy)

    batches = int(np.floor(float(sn) / batchSize))

    # initialize arrays to store the cost functions
    trainCost_ = []
    testCost_ = []

    print('batches=', batches)

    timer = time.time()

    print('epoch \t batch \t train cost \t test cost \t time (s)')

    for e in range(epoch):

        for b in range(batches):

            trainCost = train(trainData[b*batchSize:(b+1)*batchSize])

            testCost = test(testData[:10])  # test on the first 10 images

            print(e, '\t', b, '\t', trainCost, '\t', testCost, '\t', time.time() - timer)

            timer = time.time()

            trainCost_.append(trainCost)
            testCost_.append(testCost)

            # save results every 10 iterations
            if b % 10 == 0:

                # create a montage to visualize how close the decoded images are
                # to the input images: here a 2x3 montage with 3 real images on
                # top of 3 decoded ones (other visualizations can be used as well)
                x_test1 = testData[0:20:7]

                montageRow1 = np.hstack(x_test1[:3].reshape(-1, 160, 64, 64).swapaxes(1, 3))

                REC = rec(x_test1[:3])

                montageRow2 = np.hstack(REC[:3].reshape(-1, 160, 64, 64).swapaxes(1, 3))

                montage = np.vstack((montageRow1, montageRow2))

                scipy.io.savemat(outPath + 'montageREC' + str(e) + '.mat', dict(montage=montage))

                # save plot of the cost functions
                plt.clf()
                plt.plot(range(e*batches + b), trainCost_[:e*batches + b], label='train cost')
                plt.plot(range(e*batches + b), testCost_[:e*batches + b], label='test cost')
                plt.legend()
                plt.xlabel('Iterations')
                plt.ylabel('Cost Function')
                plt.savefig(outPath + 'cost_regular_{}.png'.format(e))

    return test, rec, E, D

def test(x, rec):
    return rec(x)

# LOAD DATA

data = np.load(inPath, mmap_mode='r').astype(floatX)

data = np.transpose(data, (0, 2, 1, 3, 4))

sn, sc, sz, sx, sy = np.shape(data)

# normalize each depth slice of each volume to [0, 1] by its own maximum

for n in range(sn):
    for j in range(sz):
        data[n, :, j, :, :] = data[n, :, j, :, :] / data[n, :, j, :, :].max()
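
# Note (illustrative, not in the original script): the per-slice division above
# fails on an all-zero slice; a guarded variant with an assumed small epsilon:
def normalize_slices(vol_data, eps=1e-8):
    out = vol_data.copy()
    for n in range(out.shape[0]):
        for j in range(out.shape[2]):
            out[n, :, j, :, :] /= (out[n, :, j, :, :].max() + eps)
    return out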
# create training and testing datasets

x_train = data[20:, :, :, :, :]
x_test = data[:20, :, :, :, :]

test, rec, E, D = train(x_train, x_test)

# save example reconstructions at the end of training

REC = rec(x_test[:10])

print(np.shape(REC), np.shape(x_test[:10]))

fig = plt.figure()

montageRow1 = np.hstack(x_test[:3].reshape(-1, 160, 64, 64).swapaxes(1, 3))
montageRow2 = np.hstack(REC[:3].reshape(-1, 160, 64, 64).swapaxes(1, 3))

montage = np.vstack((montageRow1, montageRow2))

print('montage shape is ', montage.shape)

np.save(outPath + 'montageREC.npy', montage)

scipy.io.savemat(outPath + 'montageREC.mat', dict(montage=montage))
--------------------------------------------------------------------------------
/results/.DS_Store:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/laurahanu/2D-and-3D-Deep-Autoencoder/3b9590d46aaabc20ad500f2010e5c735ffb2dfb4/results/.DS_Store
--------------------------------------------------------------------------------
/results/cae3depochs.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/laurahanu/2D-and-3D-Deep-Autoencoder/3b9590d46aaabc20ad500f2010e5c735ffb2dfb4/results/cae3depochs.png
--------------------------------------------------------------------------------
/results/data.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/laurahanu/2D-and-3D-Deep-Autoencoder/3b9590d46aaabc20ad500f2010e5c735ffb2dfb4/results/data.png
--------------------------------------------------------------------------------
/results/losscae2d.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/laurahanu/2D-and-3D-Deep-Autoencoder/3b9590d46aaabc20ad500f2010e5c735ffb2dfb4/results/losscae2d.png
--------------------------------------------------------------------------------
/results/losscae3d.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/laurahanu/2D-and-3D-Deep-Autoencoder/3b9590d46aaabc20ad500f2010e5c735ffb2dfb4/results/losscae3d.png
--------------------------------------------------------------------------------
/results/results3D_cae.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/laurahanu/2D-and-3D-Deep-Autoencoder/3b9590d46aaabc20ad500f2010e5c735ffb2dfb4/results/results3D_cae.png
--------------------------------------------------------------------------------
/results/results_2dcae.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/laurahanu/2D-and-3D-Deep-Autoencoder/3b9590d46aaabc20ad500f2010e5c735ffb2dfb4/results/results_2dcae.png
--------------------------------------------------------------------------------