├── .gitignore ├── Code ├── MDBMCode │ ├── README.md │ ├── config.py │ ├── dbm.py │ ├── rbm.py │ ├── run_flickr_img_dbm.ipynb │ ├── run_flickr_img_dbm.py │ ├── run_flickr_multimodal_dbm.ipynb │ ├── run_flickr_multimodal_dbm.py │ ├── run_flickr_text_dbm.ipynb │ └── run_flickr_text_dbm.py ├── TrainingLog.txt └── txtGen │ ├── .ipynb_checkpoints │ └── textGen-checkpoint.ipynb │ ├── README.md │ ├── img_no_tag.npy │ ├── textGen.ipynb │ └── text_input_layer-00001-of-00001.npy ├── FP_AchariBerrada.pdf ├── README.md └── setup_env.sh /.gitignore: -------------------------------------------------------------------------------- 1 | ## Core latex/pdflatex auxiliary files: 2 | *.aux 3 | *.lof 4 | *.log 5 | *.ot 6 | *.fls 7 | *.out 8 | *.toc 9 | *.fmt 10 | *.fot 11 | *.cb 12 | *.cb2 13 | 14 | ## Intermediate documents: 15 | *.dvi 16 | *-converted-to.* 17 | # these rules might exclude image files for figures etc. 18 | *.ps 19 | *.eps 20 | # *.pdf 21 | *.tex 22 | .pdf 23 | *.sty 24 | 25 | mirflickr* 26 | *.zip 27 | ivy2016.pem 28 | 29 | #APPLE 30 | .DS_Store 31 | *.DS_Store 32 | 33 | #pem file 34 | MMDBM.pem 35 | #Research Project 36 | Paper 37 | ResearchProject 38 | Data 39 | #flickr_data 40 | #mirflickr25k 41 | #mirflickr25k_annotations_v080 42 | #features_edgehistogram 43 | 44 | ## Bibliography auxiliary files (bibtex/biblatex/biber): 45 | *.bbl 46 | *.bcf 47 | *.blg 48 | *-blx.aux 49 | *-blx.bib 50 | *.brf 51 | *.run.xml 52 | 53 | ## Build tool auxiliary files: 54 | *.fdb_latexmk 55 | *.synctex 56 | *.synctex.gz 57 | *.synctex.gz(busy) 58 | *.pdfsync 59 | 60 | ## Auxiliary and intermediate files from other packages: 61 | # algorithms 62 | *.alg 63 | *.loa 64 | 65 | # achemso 66 | acs-*.bib 67 | 68 | # amsthm 69 | *.thm 70 | 71 | # beamer 72 | *.nav 73 | *.snm 74 | *.vrb 75 | 76 | # cprotect 77 | *.cpt 78 | 79 | # fixme 80 | *.lox 81 | 82 | #(r)(e)ledmac/(r)(e)ledpar 83 | *.end 84 | *.?end 85 | *.[1-9] 86 | *.[1-9][0-9] 87 | *.[1-9][0-9][0-9] 88 | *.[1-9]R 89 | *.[1-9][0-9]R 90 | *.[1-9][0-9][0-9]R 91 | *.eledsec[1-9] 92 | *.eledsec[1-9]R 93 | *.eledsec[1-9][0-9] 94 | *.eledsec[1-9][0-9]R 95 | *.eledsec[1-9][0-9][0-9] 96 | *.eledsec[1-9][0-9][0-9]R 97 | 98 | # glossaries 99 | *.acn 100 | *.acr 101 | *.glg 102 | *.glo 103 | *.gls 104 | *.glsdefs 105 | 106 | # gnuplottex 107 | *-gnuplottex-* 108 | 109 | # hyperref 110 | *.brf 111 | 112 | # knitr 113 | *-concordance.tex 114 | # TODO Comment the next line if you want to keep your tikz graphics files 115 | *.tikz 116 | *-tikzDictionary 117 | 118 | # listings 119 | *.lol 120 | 121 | # makeidx 122 | *.idx 123 | *.ilg 124 | *.ind 125 | *.ist 126 | 127 | # minitoc 128 | *.maf 129 | *.mlf 130 | *.mlt 131 | *.mtc 132 | *.mtc[0-9] 133 | *.mtc[1-9][0-9] 134 | 135 | # minted 136 | _minted* 137 | *.pyg 138 | 139 | # morewrites 140 | *.mw 141 | 142 | # mylatexformat 143 | *.fmt 144 | 145 | # nomencl 146 | *.nlo 147 | 148 | # sagetex 149 | *.sagetex.sage 150 | *.sagetex.py 151 | *.sagetex.scmd 152 | 153 | # sympy 154 | *.sout 155 | *.sympy 156 | sympy-plots-for-*.tex/ 157 | 158 | # pdfcomment 159 | *.upa 160 | *.upb 161 | 162 | # pythontex 163 | *.pytxcode 164 | pythontex-files-*/ 165 | 166 | # thmtools 167 | *.loe 168 | 169 | # TikZ & PGF 170 | *.dpth 171 | *.md5 172 | *.auxlock 173 | 174 | # todonotes 175 | *.tdo 176 | 177 | # xindy 178 | *.xdy 179 | 180 | # xypic precompiled matrices 181 | *.xyc 182 | 183 | # endfloat 184 | *.ttt 185 | *.fff 186 | 187 | # Latexian 188 | TSWLatexianTemp* 189 | 190 | ## Editors: 191 | # WinEdt 192 | *.bak 193 | *.sav 194 | 195 | # 
Texpad
196 | .texpadtmp
197 | 
198 | # Kile
199 | *.backup
200 | 
201 | # KBibTeX
202 | *~[0-9]*
203 | 
--------------------------------------------------------------------------------
/Code/MDBMCode/README.md:
--------------------------------------------------------------------------------
 1 | This is the structure of the code that implements the Multimodal Deep Boltzmann Machine:
 2 | 
 3 | * **config.py** is a Python configuration file that stores the directory of the data and the directory in which to save the learned models.
 4 | 
 5 | * **rbm.py** is a Python class that implements an RBM. It supports the binary and Gaussian types of RBM.
 6 | I added another type to it, in order to include the Replicated Softmax Model of RBMs.
 7 | 
 8 | * **dbm.py** is a Python class that implements a Deep Boltzmann Machine. It is implemented as stacked RBMs.
 9 | 
10 | * **run_flickr_text_dbm.ipynb** is the notebook that trains the text-specific Deep Boltzmann Machine.
11 | It outputs **text_dbm_layer_i_W.npy** and **text_dbm_layer_i_b.npy** for layer i.
12 | 
13 | * **run_flickr_img_dbm.ipynb** is the notebook that trains the image-specific Deep Boltzmann Machine.
14 | It outputs **image_dbm_layer_i_W.npy** and **image_dbm_layer_i_b.npy** for layer i.
15 | 
16 | * **run_flickr_multimodal_dbm.ipynb** is the notebook that implements the multimodal Deep Boltzmann Machine using the weights learned by the two preceding notebooks.
17 | 
18 | 
19 | These files are my contribution to the code. To find the working code, you can go to the [Deep_Learning_Tensorflow](https://github.com/abyoussef/Deep-Learning-TensorFlow/) repository.
--------------------------------------------------------------------------------
/Code/MDBMCode/config.py:
--------------------------------------------------------------------------------
 1 | models_dir = 'model/stored_models/'  # dir to save/restore stored_models
 2 | data_dir = 'model/data/'  # directory to store algorithm data
 3 | summary_dir = 'model/logs/'  # directory to store tensorflow summaries
 4 | 
 5 | vocab_dir = '/home/ubuntu/pynb/Data/flickr_data/text/'  # directory of the 2000 most common tags
 6 | flickr_unlabeled_path = '/home/ubuntu/pynb/Data/flickr_data/image/unlabelled'  # directory of the unlabelled image data
 7 | flickr_labeled_path = '/home/ubuntu/pynb/Data/flickr_data/image/labelled'  # directory of the labelled image data
 8 | 
 9 | 
10 | img_vis = '/home/ubuntu/pynb/Data/images/'  # directory for real images
--------------------------------------------------------------------------------
/Code/MDBMCode/dbm.py:
--------------------------------------------------------------------------------
 1 | """Implementation of a Deep Boltzmann Machine as a stack of RBMs."""
 2 | 
 3 | 
 4 | from __future__ import absolute_import
 5 | from __future__ import division
 6 | from __future__ import print_function
 7 | 
 8 | import numpy as np
 9 | import os
10 | import tensorflow as tf
11 | 
12 | from unsupervised_model import UnsupervisedModel
13 | import rbm
14 | from yadlt.utils import utilities
15 | 
16 | 
17 | class DBM(UnsupervisedModel):
18 |     """Implementation of a Deep Boltzmann Machine as a stack of RBMs.
19 |     The interface of the class is sklearn-like.
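    Example usage (a minimal sketch, not taken from the original file; the
    hyperparameters mirror the run scripts in this repository, while the
    random matrices stand in for real feature data):

        import numpy as np
        import tensorflow as tf
        import dbm

        train_X = np.random.rand(3000, 3857).astype('float32')
        valid_X = np.random.rand(500, 3857).astype('float32')

        model = dbm.DBM(layers=[1024, 1024], do_pretrain=True,
                        noise=['gauss', 'bin'], num_epochs=[1, 1],
                        batch_size=[64, 64], learning_rate=[0.001, 0.001],
                        finetune_enc_act_func=[tf.nn.sigmoid, tf.nn.sigmoid],
                        finetune_dec_act_func=[tf.nn.sigmoid, tf.nn.sigmoid])
        model.pretrain(train_X, valid_X)               # greedy layer-wise RBM training
        model.fit(train_X, valid_X, train_X, valid_X)  # reconstruction finetuning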
20 | """ 21 | 22 | def __init__( 23 | self, layers, model_name='srbm', main_dir='srbm/', 24 | models_dir='models/', data_dir='data/', summary_dir='logs/', 25 | num_epochs=[10], batch_size=[10], dataset='mnist', 26 | learning_rate=[0.01], gibbs_k=[1], loss_func=['mean_squared'], 27 | momentum=0.5, finetune_dropout=1, verbose=1, 28 | finetune_loss_func='cross_entropy', finetune_enc_act_func=[tf.nn.relu], 29 | finetune_dec_act_func=[tf.nn.sigmoid], finetune_opt='gradient_descent', 30 | finetune_learning_rate=0.001, l2reg=5e-4, finetune_num_epochs=10, 31 | noise=['gauss'], stddev=0.1, finetune_batch_size=20, do_pretrain=False, 32 | tied_weights=False, regtype=['none'], finetune_reg_type='none'): 33 | """Constructor. 34 | 35 | :param layers: list containing the hidden units for each layer 36 | :param finetune_loss_func: Loss function for the softmax layer. 37 | string, default ['cross_entropy', 'mean_squared'] 38 | :param finetune_dropout: dropout parameter 39 | :param finetune_learning_rate: learning rate for the finetuning. 40 | float, default 0.001 41 | :param finetune_enc_act_func: activation function for the encoder 42 | finetuning phase 43 | :param finetune_dec_act_func: activation function for the decoder 44 | finetuning phase 45 | :param finetune_opt: optimizer for the finetuning phase 46 | :param finetune_num_epochs: Number of epochs for the finetuning. 47 | int, default 20 48 | :param finetune_batch_size: Size of each mini-batch for the finetuning. 49 | int, default 20 50 | :param verbose: Level of verbosity. 0 - silent, 1 - print accuracy. 51 | int, default 0 52 | :param do_pretrain: True: uses variables from pretraining, 53 | False: initialize new variables. 54 | """ 55 | # WARNING! This must be the first expression in the function or else it 56 | # will send other variables to expanded_args() 57 | # This function takes all the passed parameters that are lists and 58 | # expands them to the number of layers, if the number 59 | # of layers is more than the list of the parameter. 
 60 |         expanded_args = utilities.expand_args(**locals())
 61 | 
 62 |         UnsupervisedModel.__init__(
 63 |             self, model_name, main_dir, models_dir, data_dir, summary_dir)
 64 | 
 65 |         self._initialize_training_parameters(
 66 |             loss_func=finetune_loss_func, learning_rate=finetune_learning_rate,
 67 |             regtype=finetune_reg_type, num_epochs=finetune_num_epochs,
 68 |             batch_size=finetune_batch_size, l2reg=l2reg,
 69 |             dropout=finetune_dropout, dataset=dataset, opt=finetune_opt,
 70 |             momentum=momentum)
 71 | 
 72 |         self.do_pretrain = do_pretrain
 73 |         self.layers = layers
 74 |         self.tied_weights = tied_weights
 75 |         self.verbose = verbose
 76 | 
 77 |         self.finetune_enc_act_func = expanded_args['finetune_enc_act_func']
 78 |         self.finetune_dec_act_func = expanded_args['finetune_dec_act_func']
 79 | 
 80 |         self.input_ref = None
 81 | 
 82 |         # Model parameters
 83 |         self.encoding_w_ = []  # list of matrices of encoding weights per layer
 84 |         self.encoding_b_ = []  # list of arrays of encoding biases per layer
 85 | 
 86 |         self.decoding_w = []  # list of matrices of decoding weights per layer
 87 |         self.decoding_b = []  # list of arrays of decoding biases per layer
 88 | 
 89 |         self.reconstruction = None
 90 |         self.rbms = []
 91 |         self.rbm_graphs = []
 92 | 
 93 |         for l, layer in enumerate(layers):
 94 |             rbm_str = 'rbm-' + str(l + 1)
 95 |             new_rbm = rbm.RBM(
 96 |                 model_name=self.model_name + '-' + rbm_str,
 97 |                 loss_func=expanded_args['loss_func'][l],
 98 |                 models_dir=os.path.join(self.models_dir, rbm_str),
 99 |                 data_dir=os.path.join(self.data_dir, rbm_str),
100 |                 summary_dir=os.path.join(self.tf_summary_dir, rbm_str),
101 |                 visible_unit_type=expanded_args['noise'][l], stddev=stddev,
102 |                 num_hidden=expanded_args['layers'][l], main_dir=self.main_dir,
103 |                 learning_rate=expanded_args['learning_rate'][l],
104 |                 gibbs_sampling_steps=expanded_args['gibbs_k'][l],
105 |                 num_epochs=expanded_args['num_epochs'][l],
106 |                 batch_size=expanded_args['batch_size'][l],
107 |                 verbose=self.verbose, regtype=expanded_args['regtype'][l])
108 |             self.rbms.append(new_rbm)
109 |             self.rbm_graphs.append(tf.Graph())
110 | 
111 |     def pretrain(self, train_set, validation_set=None):
112 |         """Perform unsupervised pretraining of the stack of RBMs."""
113 |         self.do_pretrain = True
114 | 
115 |         def set_params_func(rbmmachine, rbmgraph):
116 |             params = rbmmachine.get_model_parameters(graph=rbmgraph)
117 |             self.encoding_w_.append(params['W'])
118 |             self.encoding_b_.append(params['bh_'])
119 | 
120 |         return UnsupervisedModel.pretrain_procedure(
121 |             self, self.rbms, self.rbm_graphs, set_params_func=set_params_func,
122 |             train_set=train_set, validation_set=validation_set)
123 | 
124 |     def _train_model(self, train_set, train_ref,
125 |                      validation_set, validation_ref):
126 |         """Train the model.
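        Shuffles the (input, reference) pairs jointly before batching so the
        two stay aligned. Note that zip() returns a list on Python 2 (which
        these notebooks target); under Python 3 it returns an iterator and
        would need to be wrapped in list() before np.random.shuffle can
        shuffle it in place.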
127 | 128 | :param train_set: training set 129 | :param train_ref: training reference data 130 | :param validation_set: validation set 131 | :param validation_ref: validation reference data 132 | :return: self 133 | """ 134 | shuff = zip(train_set, train_ref) 135 | 136 | for i in range(self.num_epochs): 137 | 138 | np.random.shuffle(shuff) 139 | batches = [_ for _ in utilities.gen_batches( 140 | shuff, self.batch_size)] 141 | 142 | for batch in batches: 143 | x_batch, y_batch = zip(*batch) 144 | self.tf_session.run( 145 | self.train_step, 146 | feed_dict={self.input_data: x_batch, 147 | self.input_labels: y_batch, 148 | self.keep_prob: self.dropout}) 149 | 150 | if validation_set is not None: 151 | feed = {self.input_data: validation_set, 152 | self.input_labels: validation_ref, 153 | self.keep_prob: 1} 154 | self._run_validation_error_and_summaries(i, feed) 155 | 156 | def build_model(self, n_features, regtype='none', 157 | encoding_w=None, encoding_b=None): 158 | """Create the computational graph for the reconstruction task. 159 | 160 | :param n_features: Number of features 161 | :param regtype: regularization type 162 | :param encoding_w: list of weights for the encoding layers. 163 | :param encoding_b: list of biases for the encoding layers. 164 | :return: self 165 | """ 166 | self._create_placeholders(n_features, n_features) 167 | 168 | if encoding_w and encoding_b: 169 | self.encoding_w_ = encoding_w 170 | self.encoding_b_ = encoding_b 171 | else: 172 | self._create_variables(n_features) 173 | 174 | self._create_encoding_layers() 175 | self._create_decoding_layers() 176 | 177 | vars = [] 178 | vars.extend(self.encoding_w_) 179 | vars.extend(self.encoding_b_) 180 | regterm = self.compute_regularization(vars) 181 | 182 | self._create_cost_function_node( 183 | self.reconstruction, self.input_labels, regterm=regterm) 184 | self._create_train_step_node() 185 | 186 | def _create_placeholders(self, n_features, n_classes): 187 | """Create the TensorFlow placeholders for the model. 188 | 189 | :param n_features: number of features of the first layer 190 | :param n_classes: number of classes 191 | :return: self 192 | """ 193 | self.input_data = tf.placeholder( 194 | tf.float32, [None, n_features], name='x-input') 195 | self.input_labels = tf.placeholder( 196 | tf.float32, [None, n_classes], name='y-input') 197 | self.keep_prob = tf.placeholder( 198 | tf.float32, name='keep-probs') 199 | 200 | def _create_variables(self, n_features): 201 | """Create the TensorFlow variables for the model. 202 | 203 | :param n_features: number of features 204 | :return: self 205 | """ 206 | if self.do_pretrain: 207 | self._create_variables_pretrain() 208 | else: 209 | self._create_variables_no_pretrain(n_features) 210 | 211 | def _create_variables_no_pretrain(self, n_features): 212 | """Create model variables (no previous unsupervised pretraining). 
213 | 214 | :param n_features: number of features 215 | :return: self 216 | """ 217 | self.encoding_w_ = [] 218 | self.encoding_b_ = [] 219 | 220 | for l, layer in enumerate(self.layers): 221 | 222 | if l == 0: 223 | self.encoding_w_.append(tf.Variable(tf.truncated_normal( 224 | shape=[n_features, self.layers[l]], stddev=0.1))) 225 | self.encoding_b_.append(tf.Variable(tf.truncated_normal( 226 | [self.layers[l]], stddev=0.1))) 227 | else: 228 | self.encoding_w_.append(tf.Variable(tf.truncated_normal( 229 | shape=[self.layers[l - 1], self.layers[l]], stddev=0.1))) 230 | self.encoding_b_.append(tf.Variable(tf.truncated_normal( 231 | [self.layers[l]], stddev=0.1))) 232 | 233 | def _create_variables_pretrain(self): 234 | """Create model variables (previous unsupervised pretraining). 235 | 236 | :return: self 237 | """ 238 | for l, layer in enumerate(self.layers): 239 | self.encoding_w_[l] = tf.Variable( 240 | self.encoding_w_[l], name='enc-w-{}'.format(l)) 241 | self.encoding_b_[l] = tf.Variable( 242 | self.encoding_b_[l], name='enc-b-{}'.format(l)) 243 | 244 | def _create_encoding_layers(self): 245 | """Create the encoding layers for supervised finetuning. 246 | 247 | :return: output of the final encoding layer. 248 | """ 249 | next_train = self.input_data 250 | self.layer_nodes = [] 251 | 252 | for l, layer in enumerate(self.layers): 253 | 254 | with tf.name_scope("encode-{}".format(l)): 255 | 256 | y_act = tf.add( 257 | tf.matmul(next_train, self.encoding_w_[l]), 258 | self.encoding_b_[l] 259 | ) 260 | 261 | if self.finetune_enc_act_func[l] is not None: 262 | layer_y = self.finetune_enc_act_func[l](y_act) 263 | 264 | else: 265 | layer_y = None 266 | 267 | # the input to the next layer is the output of this layer 268 | next_train = tf.nn.dropout(layer_y, self.keep_prob) 269 | 270 | self.layer_nodes.append(next_train) 271 | 272 | self.encode = next_train 273 | 274 | def _create_decoding_layers(self): 275 | """Create the decoding layers for reconstruction finetuning. 276 | 277 | :return: output of the final encoding layer. 
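        When tied_weights is True each decoding layer reuses the transposed
        encoding matrix (dec_w = transpose(encoding_w_[l])), so only the
        decoding biases are new trainable variables; otherwise every decoding
        layer gets its own weight variable initialized from the transposed
        encoder values. The output of the last decoding layer is stored in
        self.reconstruction.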
278 | """ 279 | next_decode = self.encode 280 | 281 | for l, layer in reversed(list(enumerate(self.layers))): 282 | 283 | with tf.name_scope("decode-{}".format(l)): 284 | 285 | # Create decoding variables 286 | if self.tied_weights: 287 | dec_w = tf.transpose(self.encoding_w_[l]) 288 | else: 289 | dec_w = tf.Variable(tf.transpose( 290 | self.encoding_w_[l].initialized_value())) 291 | 292 | dec_b = tf.Variable(tf.constant( 293 | 0.1, shape=[dec_w.get_shape().dims[1].value])) 294 | self.decoding_w.append(dec_w) 295 | self.decoding_b.append(dec_b) 296 | 297 | y_act = tf.add( 298 | tf.matmul(next_decode, dec_w), 299 | dec_b 300 | ) 301 | 302 | if self.finetune_dec_act_func[l] is not None: 303 | layer_y = self.finetune_dec_act_func[l](y_act) 304 | 305 | else: 306 | layer_y = None 307 | 308 | # the input to the next layer is the output of this layer 309 | next_decode = tf.nn.dropout(layer_y, self.keep_prob) 310 | 311 | self.layer_nodes.append(next_decode) 312 | 313 | self.reconstruction = next_decode 314 | 315 | -------------------------------------------------------------------------------- /Code/MDBMCode/rbm.py: -------------------------------------------------------------------------------- 1 | """Restricted Boltzmann Machine TensorFlow implementation.""" 2 | # Achari Berrada Youssef comment : 3 | # This is the implementation of the RBM 4 | # It was available in the Deep_Learning_Tensorflow Repository with 'bin' and 'gauss' 5 | # I added the 'rsm' in order to support the Replicated Softmax Model needed for the multi_DBM model 6 | from __future__ import absolute_import 7 | from __future__ import division 8 | from __future__ import print_function 9 | 10 | import numpy as np 11 | import tensorflow as tf 12 | 13 | from unsupervised_model import UnsupervisedModel 14 | from yadlt.utils import utilities 15 | 16 | 17 | class RBM(UnsupervisedModel): 18 | """Restricted Boltzmann Machine implementation using TensorFlow. 19 | 20 | The interface of the class is sklearn-like. 21 | """ 22 | 23 | def __init__( 24 | self, num_hidden, visible_unit_type='bin', main_dir='rbm/', 25 | models_dir='models/', data_dir='data/', summary_dir='logs/', 26 | model_name='rbm', dataset='mnist', loss_func='mean_squared', 27 | l2reg=5e-4, regtype='none', gibbs_sampling_steps=1, learning_rate=0.01, 28 | batch_size=10, num_epochs=10, stddev=0.1, D=[], verbose=0): 29 | """Constructor. 30 | 31 | :param num_hidden: number of hidden units 32 | :param loss_function: type of loss function 33 | :param visible_unit_type: type of the visible units (bin, gauss or rsm) 34 | :param gibbs_sampling_steps: optional, default 1 35 | :param stddev: default 0.1. Ignored if visible_unit_type is not 'gauss' 36 | :param D: default []. Optional documents dimensions array. Used only if 37 | visible_unit_type is 'rsm' 38 | :param verbose: level of verbosity. 
optional, default 0
 39 |         """
 40 |         UnsupervisedModel.__init__(self, model_name, main_dir, models_dir,
 41 |                                    data_dir, summary_dir)
 42 | 
 43 |         self._initialize_training_parameters(
 44 |             loss_func=loss_func, learning_rate=learning_rate,
 45 |             num_epochs=num_epochs, batch_size=batch_size, dataset=dataset,
 46 |             regtype=regtype, l2reg=l2reg)
 47 | 
 48 |         self.num_hidden = num_hidden
 49 |         self.visible_unit_type = visible_unit_type
 50 |         self.gibbs_sampling_steps = gibbs_sampling_steps
 51 |         self.stddev = stddev
 52 |         self.D = D
 53 |         self.verbose = verbose
 54 | 
 55 |         self.W = None
 56 |         self.bh_ = None
 57 |         self.bv_ = None
 58 | 
 59 |         self.w_upd8 = None
 60 |         self.bh_upd8 = None
 61 |         self.bv_upd8 = None
 62 | 
 63 |         self.cost = None
 64 | 
 65 |         self.input_data = None
 66 |         self.hrand = None
 67 |         self.vrand = None
 68 | 
 69 |     def _train_model(self, train_set, validation_set,
 70 |                      train_ref=None, validation_ref=None):
 71 |         """Train the model.
 72 | 
 73 |         :param train_set: training set
 74 |         :param validation_set: validation set. optional, default None
 75 |         :return: self
 76 |         """
 77 |         for i in range(self.num_epochs):
 78 |             self._run_train_step(train_set)
 79 | 
 80 |             if validation_set is not None:
 81 |                 self._run_validation_error_and_summaries(
 82 |                     i, self._create_feed_dict(validation_set))
 83 | 
 84 |     def _run_train_step(self, train_set):
 85 |         """Run a training step.
 86 | 
 87 |         A training step consists of randomly shuffling the training set,
 88 |         dividing it into batches, and running the variable update nodes for each batch.
 89 |         :param train_set: training set
 90 |         :return: self
 91 |         """
 92 |         np.random.shuffle(train_set)
 93 | 
 94 |         batches = [_ for _ in utilities.gen_batches(train_set,
 95 |                                                     self.batch_size)]
 96 |         updates = [self.w_upd8, self.bh_upd8, self.bv_upd8]
 97 | 
 98 |         for batch in batches:
 99 |             with tf.device("/gpu:0"):
100 |                 self.tf_session.run(updates, feed_dict=self._create_feed_dict(batch))
101 | 
102 |     def _create_feed_dict(self, data):
103 |         """Create the dictionary of data to feed to the tf session during training.
104 | 
105 |         :param data: training/validation set batch
106 |         :return: dictionary(self.input_data: data, self.hrand: random_uniform,
107 |                  self.vrand: random_uniform)
108 |         """
109 |         return {
110 |             self.input_data: data,
111 |             self.hrand: np.random.rand(data.shape[0], self.num_hidden),
112 |             self.vrand: np.random.rand(data.shape[0], data.shape[1])
113 |         }
114 | 
115 |     def build_model(self, n_features, regtype='none'):
116 |         """Build the Restricted Boltzmann Machine model in TensorFlow.
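        Builds the Contrastive Divergence update nodes used by
        _run_train_step. Per mini-batch, with v0/h0 the positive-phase
        activations and v_k/h_k those after the final Gibbs step, the
        assign_add nodes below implement

            W   <- W   + learning_rate * (v0^T h0 - v_k^T h_k) / batch_size
            bh_ <- bh_ + learning_rate * mean(h0 - h_k)
            bv_ <- bv_ + learning_rate * mean(v0 - v_k)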
117 | 118 | :param n_features: number of features 119 | :param regtype: regularization type 120 | :return: self 121 | """ 122 | self._create_placeholders(n_features) 123 | self._create_variables(n_features) 124 | self.encode = self.sample_hidden_from_visible(self.input_data)[0] 125 | self.reconstruction = self.sample_visible_from_hidden( 126 | self.encode, n_features)[0] 127 | 128 | hprob0, hstate0, vprob, vstate, hprob1, hstate1 =\ 129 | self.gibbs_sampling_step(self.input_data, n_features) 130 | positive = self.compute_positive_association(self.input_data, 131 | hprob0, hstate0) 132 | 133 | nn_input = vprob 134 | 135 | for step in range(self.gibbs_sampling_steps - 1): 136 | hprob, hstate, vprob, vstate, hprob1, hstate1 =\ 137 | self.gibbs_sampling_step(nn_input, n_features) 138 | nn_input = vprob 139 | 140 | negative = tf.matmul(tf.transpose(vprob), hprob1) 141 | 142 | self.w_upd8 = self.W.assign_add( 143 | self.learning_rate * (positive - negative) / self.batch_size) 144 | 145 | self.bh_upd8 = self.bh_.assign_add(tf.mul( 146 | self.learning_rate, tf.reduce_mean(tf.sub(hprob0, hprob1), 0))) 147 | 148 | self.bv_upd8 = self.bv_.assign_add(tf.mul( 149 | self.learning_rate, tf.reduce_mean( 150 | tf.sub(self.input_data, vprob), 0))) 151 | 152 | vars = [self.W, self.bh_, self.bv_] 153 | regterm = self.compute_regularization(vars) 154 | 155 | self._create_cost_function_node(vprob, self.input_data, 156 | regterm=regterm) 157 | 158 | def _create_placeholders(self, n_features): 159 | """Create the TensorFlow placeholders for the model. 160 | 161 | :param n_features: number of features 162 | :return: self 163 | """ 164 | self.input_data = tf.placeholder(tf.float32, [None, n_features], 165 | name='x-input') 166 | self.hrand = tf.placeholder(tf.float32, [None, self.num_hidden], 167 | name='hrand') 168 | self.vrand = tf.placeholder(tf.float32, [None, n_features], 169 | name='vrand') 170 | # not used in this model, created just to comply with 171 | # unsupervised_model.py 172 | self.input_labels = tf.placeholder(tf.float32) 173 | self.keep_prob = tf.placeholder(tf.float32, name='keep-probs') 174 | 175 | def _create_variables(self, n_features): 176 | """Create the TensorFlow variables for the model. 177 | 178 | :param n_features: number of features 179 | :return: self 180 | """ 181 | self.W = tf.Variable(tf.truncated_normal( 182 | shape=[n_features, self.num_hidden], stddev=0.1), name='weights') 183 | self.bh_ = tf.Variable(tf.constant(0.1, shape=[self.num_hidden]), 184 | name='hidden-bias') 185 | self.bv_ = tf.Variable(tf.constant(0.1, shape=[n_features]), 186 | name='visible-bias') 187 | 188 | def gibbs_sampling_step(self, visible, n_features): 189 | """Perform one step of gibbs sampling. 190 | 191 | :param visible: activations of the visible units 192 | :param n_features: number of features 193 | :return: tuple(hidden probs, hidden states, visible probs, 194 | new hidden probs, new hidden states) 195 | """ 196 | hprobs, hstates = self.sample_hidden_from_visible(visible) 197 | vprobs, vstates = self.sample_visible_from_hidden(hprobs, n_features) 198 | hprobs1, hstates1 = self.sample_hidden_from_visible(vprobs) 199 | 200 | return hprobs, hstates, vprobs, vstates, hprobs1, hstates1 201 | 202 | def sample_hidden_from_visible(self, visible): 203 | """Sample the hidden units from the visible units. 204 | 205 | This is the Positive phase of the Contrastive Divergence algorithm. 
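        Concretely, it computes P(h_j = 1 | v) = sigmoid(v W + bh_) and
        derives binary states by thresholding these probabilities against the
        uniform noise supplied through the hrand placeholder.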
206 | 207 | :param visible: activations of the visible units 208 | :return: tuple(hidden probabilities, hidden binary states) 209 | """ 210 | hprobs = tf.nn.sigmoid(tf.add(tf.matmul(visible, self.W), self.bh_)) 211 | hstates = utilities.sample_prob(hprobs, self.hrand) 212 | 213 | return hprobs, hstates 214 | 215 | def sample_visible_from_hidden(self, hidden, n_features): 216 | """Sample the visible units from the hidden units. 217 | 218 | This is the Negative phase of the Contrastive Divergence algorithm. 219 | :param hidden: activations of the hidden units 220 | :param n_features: number of features 221 | :return: visible probabilities 222 | """ 223 | visible_activation = tf.add( 224 | tf.matmul(hidden, tf.transpose(self.W)), 225 | self.bv_ 226 | ) 227 | 228 | if self.visible_unit_type == 'bin': 229 | vprobs = tf.nn.sigmoid(visible_activation) 230 | vstates = None 231 | 232 | elif self.visible_unit_type == 'gauss': 233 | vprobs = tf.truncated_normal( 234 | (1, n_features), mean=visible_activation, stddev=self.stddev) 235 | vstates = None 236 | 237 | elif self.visible_unit_type == 'rsm': 238 | vprobs = tf.nn.softmax(visible_activation) 239 | vstates = tf.multinomial(vprobs, 1) 240 | #vstates = [] 241 | #with tf.Session() as sess: 242 | # for i, vp in enumerate(sess.run(vprobs)): 243 | # vstates.append( 244 | # np.random.multinomial(self.D[i], vp, size=1)) 245 | 246 | else: 247 | vprobs = None 248 | vstates = None 249 | 250 | return vprobs, vstates 251 | 252 | def compute_positive_association(self, visible, 253 | hidden_probs, hidden_states): 254 | """Compute positive associations between visible and hidden units. 255 | 256 | :param visible: visible units 257 | :param hidden_probs: hidden units probabilities 258 | :param hidden_states: hidden units states 259 | :return: positive association = dot(visible.T, hidden) 260 | """ 261 | if self.visible_unit_type == 'bin': 262 | positive = tf.matmul(tf.transpose(visible), hidden_states) 263 | 264 | elif self.visible_unit_type == 'gauss': 265 | positive = tf.matmul(tf.transpose(visible), hidden_probs) 266 | 267 | elif self.visible_unit_type == 'rsm': 268 | positive = tf.matmul(tf.transpose(visible), hidden_states) 269 | 270 | else: 271 | positive = None 272 | 273 | return positive 274 | 275 | def load_model(self, shape, gibbs_sampling_steps, model_path): 276 | """Load a trained model from disk. 277 | 278 | The shape of the model (num_visible, num_hidden) and the number 279 | of gibbs sampling steps must be known in order to restore the model. 280 | :param shape: tuple(num_visible, num_hidden) 281 | :param gibbs_sampling_steps: 282 | :param model_path: 283 | :return: self 284 | """ 285 | n_features, self.num_hidden = shape[0], shape[1] 286 | self.gibbs_sampling_steps = gibbs_sampling_steps 287 | 288 | self.build_model(n_features) 289 | 290 | init_op = tf.initialize_all_variables() 291 | self.tf_saver = tf.train.Saver() 292 | 293 | with tf.Session() as self.tf_session: 294 | 295 | self.tf_session.run(init_op) 296 | self.tf_saver.restore(self.tf_session, model_path) 297 | 298 | def get_model_parameters(self, graph=None): 299 | """Return the model parameters in the form of numpy arrays. 
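        Example (a sketch, not from the original file; it assumes a trained
        and saved RBM and mirrors how the run scripts persist weights):

            params = trained_rbm.get_model_parameters()
            np.save('text_dbm_layer_1_W.npy', params['W'])
            np.save('text_dbm_layer_1_b.npy', params['bh_'])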
300 | 301 | :param graph: tf graph object 302 | :return: model parameters 303 | """ 304 | g = graph if graph is not None else self.tf_graph 305 | 306 | with g.as_default(): 307 | with tf.Session() as self.tf_session: 308 | self.tf_saver.restore(self.tf_session, self.model_path) 309 | 310 | return { 311 | 'W': self.W.eval(), 312 | 'bh_': self.bh_.eval(), 313 | 'bv_': self.bv_.eval() 314 | } 315 | 316 | -------------------------------------------------------------------------------- /Code/MDBMCode/run_flickr_img_dbm.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "code", 5 | "execution_count": null, 6 | "metadata": { 7 | "collapsed": true 8 | }, 9 | "outputs": [], 10 | "source": [ 11 | "# Achari Berrada Youssef\n", 12 | "\n", 13 | "# This is the script for training the Image Specific Deep Boltzmann Machine\n", 14 | "import os\n", 15 | "import numpy as np\n", 16 | "import tensorflow as tf\n", 17 | "import config\n", 18 | "\n", 19 | "import dbm\n" 20 | ] 21 | }, 22 | { 23 | "cell_type": "code", 24 | "execution_count": null, 25 | "metadata": { 26 | "collapsed": true 27 | }, 28 | "outputs": [], 29 | "source": [ 30 | "# Load all the .npy files in the flickr unlabeled images directory into flickr_u\n", 31 | "\n", 32 | "flickr_u = np.array([])\n", 33 | "files_count = 0\n", 34 | "\n", 35 | "for f in os.listdir(config.flickr_unlabeled_path):\n", 36 | " if files_count == 10: # only ten files for now\n", 37 | " break\n", 38 | " files_count += 1\n", 39 | " if f[-4:] == \".npy\":\n", 40 | " t = np.load(os.path.join(config.flickr_unlabeled_path, f))\n", 41 | " if flickr_u.shape == (0,):\n", 42 | " flickr_u = t\n", 43 | " else:\n", 44 | " flickr_u = np.concatenate((flickr_u, t), axis=0)\n" 45 | ] 46 | }, 47 | { 48 | "cell_type": "code", 49 | "execution_count": null, 50 | "metadata": { 51 | "collapsed": true 52 | }, 53 | "outputs": [], 54 | "source": [ 55 | "model = dbm.DBM(\n", 56 | " main_dir=\"flickr_rbm\", do_pretrain=True, layers=[1024, 1024],\n", 57 | " models_dir=config.models_dir, data_dir=config.data_dir, summary_dir=config.summary_dir,\n", 58 | " learning_rate=[0.001, 0.001], momentum=0.9, num_epochs=[1, 1], batch_size=[64, 64],\n", 59 | " stddev=0.1, verbose=1, gibbs_k=[1, 1], model_name=\"flickr_dbm\",\n", 60 | " finetune_learning_rate=0.01, finetune_enc_act_func=[tf.nn.sigmoid, tf.nn.sigmoid],\n", 61 | " finetune_dec_act_func=[tf.nn.sigmoid, tf.nn.sigmoid], finetune_num_epochs=5, finetune_batch_size=128,\n", 62 | " finetune_opt='momentum', finetune_loss_func=\"mean_squared\", finetune_dropout=0.5, noise=[\"gauss\", \"bin\"],\n", 63 | ")\n", 64 | "\n", 65 | "trainX = flickr_u[:3000]\n", 66 | "testX = flickr_u[3000:3500]" 67 | ] 68 | }, 69 | { 70 | "cell_type": "code", 71 | "execution_count": null, 72 | "metadata": { 73 | "collapsed": true 74 | }, 75 | "outputs": [], 76 | "source": [ 77 | "#Pretrain the model \n", 78 | "model.pretrain(trainX, testX)" 79 | ] 80 | }, 81 | { 82 | "cell_type": "code", 83 | "execution_count": null, 84 | "metadata": { 85 | "collapsed": true 86 | }, 87 | "outputs": [], 88 | "source": [ 89 | "# Fine tunning of the model \n", 90 | "model.fit(trainX, testX, trainX[:10], testX[:10], save_dbm_image_params=True) " 91 | ] 92 | }, 93 | { 94 | "cell_type": "code", 95 | "execution_count": null, 96 | "metadata": { 97 | "collapsed": true 98 | }, 99 | "outputs": [], 100 | "source": [ 101 | "# Output : \n", 102 | " # image_dbm_layer_1_W.npy\n", 103 | " # image_dbm_layer_1_b.npy\n", 104 | " # 
image_dbm_layer_2_W.npy\n", 105 | " # image_dbm_layer_2_b.npy" 106 | ] 107 | } 108 | ], 109 | "metadata": { 110 | "anaconda-cloud": {}, 111 | "kernelspec": { 112 | "display_name": "Python [conda env:py27]", 113 | "language": "python", 114 | "name": "conda-env-py27-py" 115 | }, 116 | "language_info": { 117 | "codemirror_mode": { 118 | "name": "ipython", 119 | "version": 2 120 | }, 121 | "file_extension": ".py", 122 | "mimetype": "text/x-python", 123 | "name": "python", 124 | "nbconvert_exporter": "python", 125 | "pygments_lexer": "ipython2", 126 | "version": "2.7.12" 127 | } 128 | }, 129 | "nbformat": 4, 130 | "nbformat_minor": 1 131 | } 132 | -------------------------------------------------------------------------------- /Code/MDBMCode/run_flickr_img_dbm.py: -------------------------------------------------------------------------------- 1 | # Achari Berrada Youssef 2 | 3 | # This is the script for training the Image Specific Deep Boltzmann Machine 4 | import os 5 | import numpy as np 6 | import tensorflow as tf 7 | import config 8 | 9 | import dbm 10 | 11 | # Load all the .npy files in the flickr unlabeled images directory into flickr_u 12 | 13 | flickr_u = np.array([]) 14 | files_count = 0 15 | 16 | for f in os.listdir(config.flickr_unlabeled_path): 17 | if files_count == 10: # only ten files for now 18 | break 19 | files_count += 1 20 | if f[-4:] == ".npy": 21 | t = np.load(os.path.join(config.flickr_unlabeled_path, f)) 22 | if flickr_u.shape == (0,): 23 | flickr_u = t 24 | else: 25 | flickr_u = np.concatenate((flickr_u, t), axis=0) 26 | 27 | model = dbm.DBM( 28 | main_dir="flickr_rbm", do_pretrain=True, layers=[1024, 1024], 29 | models_dir=config.models_dir, data_dir=config.data_dir, summary_dir=config.summary_dir, 30 | learning_rate=[0.001, 0.001], momentum=0.9, num_epochs=[1, 1], batch_size=[64, 64], 31 | stddev=0.1, verbose=1, gibbs_k=[1, 1], model_name="flickr_dbm", 32 | finetune_learning_rate=0.01, finetune_enc_act_func=[tf.nn.sigmoid, tf.nn.sigmoid], 33 | finetune_dec_act_func=[tf.nn.sigmoid, tf.nn.sigmoid], finetune_num_epochs=5, finetune_batch_size=128, 34 | finetune_opt='momentum', finetune_loss_func="mean_squared", finetune_dropout=0.5, noise=["gauss", "bin"], 35 | ) 36 | 37 | trainX = flickr_u[:3000] 38 | testX = flickr_u[3000:3500] 39 | 40 | 41 | model.pretrain(trainX, testX) 42 | 43 | model.fit(trainX, testX, trainX[:10], testX[:10], save_dbm_image_params=True) 44 | 45 | # Output : 46 | # image_dbm_layer_1_W.npy 47 | # image_dbm_layer_1_b.npy 48 | # image_dbm_layer_2_W.npy 49 | # image_dbm_layer_2_b.npy 50 | 51 | -------------------------------------------------------------------------------- /Code/MDBMCode/run_flickr_multimodal_dbm.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "code", 5 | "execution_count": null, 6 | "metadata": { 7 | "collapsed": true 8 | }, 9 | "outputs": [], 10 | "source": [ 11 | "# Achari Berrada Youssef\n", 12 | "# This is an implementation of the Multinomial DBM Architecture.\n", 13 | "\n", 14 | "import numpy as np\n", 15 | "import tensorflow as tf\n", 16 | "import config\n", 17 | "import os\n", 18 | "\n", 19 | "from yadlt.utils import utilities\n", 20 | "\n", 21 | "# Trained from the notebook run_flickr_image_dbm.ipynb\n", 22 | "image_layer1_W = np.load(\"image_dbm_layer_1_W.npy\")\n", 23 | "image_layer1_b = np.load(\"image_dbm_layer_1_b.npy\")\n", 24 | "image_layer2_W = np.load(\"image_dbm_layer_2_W.npy\")\n", 25 | "image_layer2_b = 
np.load(\"image_dbm_layer_2_b.npy\")\n", 26 | "\n", 27 | "# Trained from the notebook run_flickr_text_dbm.ipynb\n", 28 | "text_layer1_W = np.load(\"text_dbm_layer_1_W.npy\")\n", 29 | "text_layer1_b = np.load(\"text_dbm_layer_1_b.npy\")\n", 30 | "text_layer2_W = np.load(\"text_dbm_layer_2_W.npy\")\n", 31 | "text_layer2_b = np.load(\"text_dbm_layer_2_b.npy\")\n", 32 | "\n", 33 | "\n", 34 | "\n", 35 | "# Input placeholders\n", 36 | "img_input = tf.placeholder(tf.float32, [None, 3857], name=\"image-input\")\n", 37 | "txt_input = tf.placeholder(tf.float32, [None, 2000], name=\"text-input\")\n", 38 | "\n", 39 | "\n", 40 | "# ################# #\n", 41 | "# Image Forward DBM #\n", 42 | "# ################# #\n", 43 | "\n", 44 | "img_W1 = tf.Variable(image_layer1_W)\n", 45 | "img_b1 = tf.Variable(image_layer1_b)\n", 46 | "\n", 47 | "img_W2 = tf.Variable(image_layer2_W)\n", 48 | "img_b2 = tf.Variable(image_layer2_b)\n", 49 | "\n", 50 | "img_layer1 = tf.add(tf.matmul(img_input, img_W1), img_b1)\n", 51 | "img_layer1 = tf.nn.sigmoid(img_layer1)\n", 52 | "\n", 53 | "img_layer2 = tf.add(tf.matmul(img_layer1, img_W2), img_b2)\n", 54 | "img_layer2 = tf.nn.sigmoid(img_layer2)\n", 55 | "\n", 56 | "\n", 57 | "\n", 58 | "# ################ #\n", 59 | "# Text Forward DBM #\n", 60 | "# ################ #\n", 61 | "\n", 62 | "txt_W1 = tf.Variable(text_layer1_W)\n", 63 | "txt_b1 = tf.Variable(text_layer1_b)\n", 64 | "\n", 65 | "txt_W2 = tf.Variable(text_layer2_W)\n", 66 | "txt_b2 = tf.Variable(text_layer2_b)\n", 67 | "\n", 68 | "txt_layer1 = tf.add(tf.matmul(txt_input, txt_W1), txt_b1)\n", 69 | "txt_layer1 = tf.nn.sigmoid(txt_layer1)\n", 70 | "\n", 71 | "txt_layer2 = tf.add(tf.matmul(txt_layer1, txt_W2), txt_b2)\n", 72 | "txt_layer2 = tf.nn.sigmoid(txt_layer2)\n", 73 | "\n", 74 | "\n", 75 | "# ############## #\n", 76 | "# Multimodal DBM #\n", 77 | "# ############## #\n", 78 | "\n", 79 | "joint_representation_units = 512 # number of units in the joint representation layer\n", 80 | "\n", 81 | "multi_W = tf.Variable(tf.truncated_normal(shape=[2048, joint_representation_units], stddev=0.1), name='multimodal-weights')\n", 82 | "multi_b = tf.Variable(tf.constant(0.1, shape=[joint_representation_units]), name='multimodal-bias')\n", 83 | "\n", 84 | "multi_input = tf.concat([img_layer2, txt_layer2], 0)\n", 85 | "multi_output = tf.nn.sigmoid(tf.add(tf.matmul(multi_input, multi_W), multi_b))\n", 86 | "\n", 87 | "binary_output = utilities.sample_prob(hprobs, np.random.rand(data.shape[0], joint_representation_units))\n", 88 | "\n", 89 | "# ################## #\n", 90 | "# Image Backward DBM #\n", 91 | "# ################## #\n", 92 | "\n", 93 | "img_layer2b = tf.add(tf.matmul(binary_output, img_W2.T), img_b2)\n", 94 | "img_layer2b = tf.nn.sigmoid(img_layer2b)\n", 95 | "\n", 96 | "img_layer1b = tf.add(tf.matmul(img_layer2b, img_W1.T), img_b1)\n", 97 | "img_layer1b = tf.nn.sigmoid(img_layer1b)\n", 98 | "\n", 99 | "# ################# #\n", 100 | "# Text Backward DBM #\n", 101 | "# ################# #\n", 102 | "\n", 103 | "txt_layer2b = tf.add(tf.matmul(binary_output, txt_W2.T), txt_b2)\n", 104 | "txt_layer2b = tf.nn.sigmoid(txt_layer2b)\n", 105 | "\n", 106 | "txt_layer1b = tf.add(tf.matmul(txt_layer2b, txt_W1.T), txt_b1)\n", 107 | "txt_layer1b = tf.nn.sigmoid(txt_layer1b)\n", 108 | "\n", 109 | "# Now can use img_layer1b (which should be 3587) and txt_layer1b (which should be 2000) to compute the loss\n", 110 | "# function. 
For mean squared error cost:\n", 111 | "\n", 112 | "img_cost = tf.sqrt(tf.reduce_mean(tf.square(tf.sub(img_dataset, img_layer1b))))\n", 113 | "txt_cost = tf.sqrt(tf.reduce_mean(tf.square(tf.sub(txt_dataset, txt_layer1b))))\n", 114 | "\n", 115 | "# Optimizer\n", 116 | "\n", 117 | "img_train_step = tf.train.MomentumOptimizer(learning_rate, momentum).minimize(img_cost)\n", 118 | "txt_train_step = tf.train.MomentumOptimizer(learning_rate, momentum).minimize(txt_cost)\n", 119 | "\n", 120 | "# ####################### #\n", 121 | "# Code to train the model #\n", 122 | "# ####################### #\n", 123 | "\n", 124 | "# train_set and train_ref should be loaded, maybe from the labeled data directory\n", 125 | "labeled1 = np.load(os.path.join(config.flickr_labeled_path, \"combined-00001-of-00100.npy\"))\n", 126 | "labeled2 = np.load(os.path.join(config.flickr_labeled_path, \"combined-00002-of-00100.npy\"))\n", 127 | "labeled3 = np.load(os.path.join(config.flickr_labeled_path, \"combined-00003_0-of-00100.npy\"))\n", 128 | "labeled = np.concatenate((labeled1, labeled2, labeled3), axis=0)\n", 129 | "\n", 130 | "num_epochs = 5\n", 131 | "batch_size = 64\n", 132 | "\n", 133 | "shuff = zip(train_set, train_ref)\n", 134 | "\n", 135 | "with tf.Session() as sess:\n", 136 | " for i in range(num_epochs):\n", 137 | " np.random.shuffle(shuff)\n", 138 | " batches = [_ for _ in utilities.gen_batches(shuff, batch_size)]\n", 139 | "\n", 140 | " for batch in batches:\n", 141 | " x_batch, y_batch = zip(*batch)\n", 142 | " # run image train\n", 143 | " sess.run(img_train_step, feed_dict={img_input: x_batch, image_ref: y_batch})\n", 144 | " # run text train\n", 145 | " sess.run(txt_train_step, feed_dict={txt_input: x_batch, text_ref: y_batch})\n", 146 | "\n", 147 | " if validation_set is not None:\n", 148 | " img_loss = sess.run(img_cost, feed_dict={img_input: img_validation_input, img_ref: img_validation_ref})\n", 149 | " txt_loss = sess.run(txt_cost, feed_dict={txt_input: txt_validation_input, txt_ref: txt_validation_ref})\n", 150 | " print(\"Image loss: %f\" % (img_loss))\n", 151 | " print(\"Text loss: %f\" % (txt_loss))\n", 152 | "\n", 153 | "\n" 154 | ] 155 | } 156 | ], 157 | "metadata": { 158 | "anaconda-cloud": {}, 159 | "kernelspec": { 160 | "display_name": "Python [conda env:py27]", 161 | "language": "python", 162 | "name": "conda-env-py27-py" 163 | }, 164 | "language_info": { 165 | "codemirror_mode": { 166 | "name": "ipython", 167 | "version": 2 168 | }, 169 | "file_extension": ".py", 170 | "mimetype": "text/x-python", 171 | "name": "python", 172 | "nbconvert_exporter": "python", 173 | "pygments_lexer": "ipython2", 174 | "version": "2.7.12" 175 | } 176 | }, 177 | "nbformat": 4, 178 | "nbformat_minor": 1 179 | } 180 | -------------------------------------------------------------------------------- /Code/MDBMCode/run_flickr_multimodal_dbm.py: -------------------------------------------------------------------------------- 1 | # Achari Berrada Youssef 2 | # This is an implementation of the Multinomial DBM Architecture. 
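# Architecture sketch (as wired up below): each modality passes through its
# pretrained two-layer DBM; the two 1024-unit top layers are concatenated
# into a 2048-unit vector; a 512-unit joint layer is computed and sampled;
# and the joint code is propagated back down through transposed weights to
# reconstruct both modalities, giving one reconstruction cost per modality.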
  3 | 
  4 | import numpy as np
  5 | import tensorflow as tf
  6 | import config
  7 | import os
  8 | 
  9 | from yadlt.utils import utilities
 10 | 
 11 | # Trained from the notebook run_flickr_img_dbm.ipynb
 12 | image_layer1_W = np.load("image_dbm_layer_1_W.npy")
 13 | image_layer1_b = np.load("image_dbm_layer_1_b.npy")
 14 | image_layer2_W = np.load("image_dbm_layer_2_W.npy")
 15 | image_layer2_b = np.load("image_dbm_layer_2_b.npy")
 16 | 
 17 | # Trained from the notebook run_flickr_text_dbm.ipynb
 18 | text_layer1_W = np.load("text_dbm_layer_1_W.npy")
 19 | text_layer1_b = np.load("text_dbm_layer_1_b.npy")
 20 | text_layer2_W = np.load("text_dbm_layer_2_W.npy")
 21 | text_layer2_b = np.load("text_dbm_layer_2_b.npy")
 22 | 
 23 | 
 24 | # Input placeholders
 25 | img_input = tf.placeholder(tf.float32, [None, 3857], name="image-input")
 26 | txt_input = tf.placeholder(tf.float32, [None, 2000], name="text-input")
 27 | 
 28 | # Finetuning hyperparameters (assumed values, mirroring the unimodal scripts)
 29 | learning_rate = 0.01
 30 | momentum = 0.9
 31 | 
 32 | 
 33 | # ################# #
 34 | # Image Forward DBM #
 35 | # ################# #
 36 | 
 37 | img_W1 = tf.Variable(image_layer1_W)
 38 | img_b1 = tf.Variable(image_layer1_b)
 39 | 
 40 | img_W2 = tf.Variable(image_layer2_W)
 41 | img_b2 = tf.Variable(image_layer2_b)
 42 | 
 43 | img_layer1 = tf.add(tf.matmul(img_input, img_W1), img_b1)
 44 | img_layer1 = tf.nn.sigmoid(img_layer1)
 45 | 
 46 | img_layer2 = tf.add(tf.matmul(img_layer1, img_W2), img_b2)
 47 | img_layer2 = tf.nn.sigmoid(img_layer2)
 48 | 
 49 | 
 50 | # ################ #
 51 | # Text Forward DBM #
 52 | # ################ #
 53 | 
 54 | txt_W1 = tf.Variable(text_layer1_W)
 55 | txt_b1 = tf.Variable(text_layer1_b)
 56 | 
 57 | txt_W2 = tf.Variable(text_layer2_W)
 58 | txt_b2 = tf.Variable(text_layer2_b)
 59 | 
 60 | txt_layer1 = tf.add(tf.matmul(txt_input, txt_W1), txt_b1)
 61 | txt_layer1 = tf.nn.sigmoid(txt_layer1)
 62 | 
 63 | txt_layer2 = tf.add(tf.matmul(txt_layer1, txt_W2), txt_b2)
 64 | txt_layer2 = tf.nn.sigmoid(txt_layer2)
 65 | 
 66 | 
 67 | # ############## #
 68 | # Multimodal DBM #
 69 | # ############## #
 70 | 
 71 | joint_representation_units = 512  # number of units in the joint representation layer
 72 | 
 73 | multi_W = tf.Variable(tf.truncated_normal(shape=[2048, joint_representation_units], stddev=0.1), name='multimodal-weights')
 74 | multi_b = tf.Variable(tf.constant(0.1, shape=[joint_representation_units]), name='multimodal-bias')
 75 | 
 76 | # Concatenate along the feature axis: two 1024-unit top layers -> 2048 features
 77 | multi_input = tf.concat([img_layer2, txt_layer2], 1)
 78 | multi_output = tf.nn.sigmoid(tf.add(tf.matmul(multi_input, multi_W), multi_b))
 79 | 
 80 | # Sample binary states of the joint layer from its activation probabilities
 81 | binary_output = utilities.sample_prob(multi_output, tf.random_uniform(tf.shape(multi_output)))
 82 | 
 83 | # Project the joint layer back to the 2048-unit concatenation and split it
 84 | # into its image and text halves; the visible-side bias is a new variable
 85 | # because it is not produced by the pretraining notebooks.
 86 | multi_vb = tf.Variable(tf.constant(0.1, shape=[2048]), name='multimodal-visible-bias')
 87 | multi_back = tf.nn.sigmoid(tf.add(tf.matmul(binary_output, tf.transpose(multi_W)), multi_vb))
 88 | img_back = tf.slice(multi_back, [0, 0], [-1, 1024])
 89 | txt_back = tf.slice(multi_back, [0, 1024], [-1, 1024])
 90 | 
 91 | # ################## #
 92 | # Image Backward DBM #
 93 | # ################## #
 94 | 
 95 | # New visible bias: the pretrained biases all have hidden-layer dimensionality
 96 | img_bv = tf.Variable(tf.constant(0.1, shape=[3857]), name='image-visible-bias')
 97 | 
 98 | img_layer2b = tf.add(tf.matmul(img_back, tf.transpose(img_W2)), img_b1)
 99 | img_layer2b = tf.nn.sigmoid(img_layer2b)
100 | 
101 | img_layer1b = tf.add(tf.matmul(img_layer2b, tf.transpose(img_W1)), img_bv)
102 | img_layer1b = tf.nn.sigmoid(img_layer1b)
103 | 
104 | # ################# #
105 | # Text Backward DBM #
106 | # ################# #
107 | 
108 | txt_bv = tf.Variable(tf.constant(0.1, shape=[2000]), name='text-visible-bias')
109 | 
110 | txt_layer2b = tf.add(tf.matmul(txt_back, tf.transpose(txt_W2)), txt_b1)
111 | txt_layer2b = tf.nn.sigmoid(txt_layer2b)
112 | 
113 | txt_layer1b = tf.add(tf.matmul(txt_layer2b, tf.transpose(txt_W1)), txt_bv)
114 | txt_layer1b = tf.nn.sigmoid(txt_layer1b)
115 | 
116 | # img_layer1b (shape [None, 3857]) and txt_layer1b (shape [None, 2000]) can
117 | # now be used to compute the reconstruction losses against the inputs
118 | # themselves. Root mean squared error cost:
119 | 
120 | img_cost = tf.sqrt(tf.reduce_mean(tf.square(tf.sub(img_input, img_layer1b))))
121 | txt_cost = tf.sqrt(tf.reduce_mean(tf.square(tf.sub(txt_input, txt_layer1b))))
122 | 
123 | # Optimizer
124 | 
125 | img_train_step = tf.train.MomentumOptimizer(learning_rate, momentum).minimize(img_cost)
126 | txt_train_step = tf.train.MomentumOptimizer(learning_rate, momentum).minimize(txt_cost)
127 | 
128 | # ####################### #
129 | # Code to train the model #
130 | # ####################### #
131 | 
132 | # The training pairs come from the labeled data directory; the split below
133 | # assumes each row of the combined-*.npy files holds the 3857 image features
134 | # followed by the 2000 tag counts -- adjust it to the real layout.
135 | labeled1 = np.load(os.path.join(config.flickr_labeled_path, "combined-00001-of-00100.npy"))
136 | labeled2 = np.load(os.path.join(config.flickr_labeled_path, "combined-00002-of-00100.npy"))
137 | labeled3 = np.load(os.path.join(config.flickr_labeled_path, "combined-00003_0-of-00100.npy"))
138 | labeled = np.concatenate((labeled1, labeled2, labeled3), axis=0)
139 | 
140 | train_set = labeled[:-500, :3857]
141 | train_ref = labeled[:-500, 3857:5857]
142 | img_valid = labeled[-500:, :3857]
143 | txt_valid = labeled[-500:, 3857:5857]
144 | 
145 | num_epochs = 5
146 | batch_size = 64
147 | 
148 | shuff = list(zip(train_set, train_ref))
149 | 
150 | with tf.Session() as sess:
151 |     sess.run(tf.initialize_all_variables())
152 |     for i in range(num_epochs):
153 |         np.random.shuffle(shuff)
154 |         batches = [_ for _ in utilities.gen_batches(shuff, batch_size)]
155 | 
156 |         for batch in batches:
157 |             x_batch, y_batch = zip(*batch)
158 |             # Both placeholders must be fed: each cost depends on the joint layer
159 |             sess.run(img_train_step, feed_dict={img_input: x_batch, txt_input: y_batch})
160 |             sess.run(txt_train_step, feed_dict={img_input: x_batch, txt_input: y_batch})
161 | 
162 |         # Reconstruction losses on the held-out slice after each epoch
163 |         img_loss = sess.run(img_cost, feed_dict={img_input: img_valid, txt_input: txt_valid})
164 |         txt_loss = sess.run(txt_cost, feed_dict={img_input: img_valid, txt_input: txt_valid})
165 |         print("Image loss: %f" % (img_loss))
166 |         print("Text loss: %f" % (txt_loss))
167 | 
--------------------------------------------------------------------------------
/Code/MDBMCode/run_flickr_text_dbm.ipynb:
--------------------------------------------------------------------------------
 1 | {
 2 |  "cells": [
 3 |   {
 4 |    "cell_type": "code",
 5 |    "execution_count": null,
 6 |    "metadata": {
 7 |     "collapsed": true
 8 |    },
 9 |    "outputs": [],
10 |    "source": [
11 |     "# Achari Berrada Youssef\n",
12 |     "\n",
13 |     "# This script is for training the Text specific Deep Boltzmann Machine. 
\n", 14 | "import os\n", 15 | "import numpy as np\n", 16 | "import tensorflow as tf\n", 17 | "import config\n", 18 | "\n", 19 | "import dbm\n", 20 | "\n" 21 | ] 22 | }, 23 | { 24 | "cell_type": "code", 25 | "execution_count": null, 26 | "metadata": { 27 | "collapsed": true 28 | }, 29 | "outputs": [], 30 | "source": [ 31 | "# Load flickr text\n", 32 | "dataset = np.load(\"flickr_text_dataset.npy\")\n", 33 | "D = np.sum(dataset, axis=1) # length of each document\n", 34 | "\n", 35 | "model = dbm.DBM(\n", 36 | " main_dir=\"flickr_rbm\", do_pretrain=True, layers=[1024, 1024],\n", 37 | " models_dir=config.models_dir, data_dir=config.data_dir, summary_dir=config.summary_dir,\n", 38 | " learning_rate=[0.001, 0.001], momentum=0.9, num_epochs=[1, 1], batch_size=[64, 64],\n", 39 | " stddev=0.1, verbose=1, gibbs_k=[1, 1], model_name=\"flickr_dbm\",\n", 40 | " finetune_learning_rate=0.01, finetune_enc_act_func=[tf.nn.sigmoid, tf.nn.sigmoid],\n", 41 | " finetune_dec_act_func=[tf.nn.sigmoid, tf.nn.sigmoid], finetune_num_epochs=1, finetune_batch_size=128,\n", 42 | " finetune_opt='momentum', finetune_loss_func=\"mean_squared\", finetune_dropout=0.5, noise=[\"rsm\", \"bin\"],\n", 43 | ")\n", 44 | "\n" 45 | ] 46 | }, 47 | { 48 | "cell_type": "code", 49 | "execution_count": null, 50 | "metadata": { 51 | "collapsed": true 52 | }, 53 | "outputs": [], 54 | "source": [ 55 | "# Initialize documents lengths\n", 56 | "for i, _ in enumerate(model.rbms):\n", 57 | " model.rbms[i].D = D\n" 58 | ] 59 | }, 60 | { 61 | "cell_type": "code", 62 | "execution_count": null, 63 | "metadata": { 64 | "collapsed": true 65 | }, 66 | "outputs": [], 67 | "source": [ 68 | "# Pretraining phase \n", 69 | "model.pretrain(dataset[:500], dataset[500:1000])" 70 | ] 71 | }, 72 | { 73 | "cell_type": "code", 74 | "execution_count": null, 75 | "metadata": { 76 | "collapsed": true 77 | }, 78 | "outputs": [], 79 | "source": [ 80 | "# Fit and save the txt-DBM parameters \n", 81 | "# I put save_dbm_text_params as a quick hack to save the parameters of this dbm as a numpy array\n", 82 | "model.fit(dataset[:500], dataset[500:1000], dataset[1500:2000], dataset[2000:2500], save_dbm_text_params=True)\n" 83 | ] 84 | }, 85 | { 86 | "cell_type": "code", 87 | "execution_count": null, 88 | "metadata": { 89 | "collapsed": true 90 | }, 91 | "outputs": [], 92 | "source": [ 93 | "# Output : \n", 94 | " # text_dbm_layer_1_W.npy\n", 95 | " # text_dbm_layer_1_b.npy\n", 96 | " # text_dbm_layer_2_W.npy\n", 97 | " # text_dbm_layer_2_b.npy" 98 | ] 99 | } 100 | ], 101 | "metadata": { 102 | "anaconda-cloud": {}, 103 | "kernelspec": { 104 | "display_name": "Python [conda env:py27]", 105 | "language": "python", 106 | "name": "conda-env-py27-py" 107 | }, 108 | "language_info": { 109 | "codemirror_mode": { 110 | "name": "ipython", 111 | "version": 2 112 | }, 113 | "file_extension": ".py", 114 | "mimetype": "text/x-python", 115 | "name": "python", 116 | "nbconvert_exporter": "python", 117 | "pygments_lexer": "ipython2", 118 | "version": "2.7.12" 119 | } 120 | }, 121 | "nbformat": 4, 122 | "nbformat_minor": 1 123 | } 124 | -------------------------------------------------------------------------------- /Code/MDBMCode/run_flickr_text_dbm.py: -------------------------------------------------------------------------------- 1 | # Achari Berrada Youssef 2 | 3 | # This script is for training the Text specific Deep Boltzmann Machine. 
4 | import os 5 | import numpy as np 6 | import tensorflow as tf 7 | import config 8 | 9 | import dbm 10 | 11 | # Load flickr text 12 | dataset = np.load("flickr_text_dataset.npy") 13 | D = np.sum(dataset, axis=1) # length of each document 14 | 15 | model = dbm.DBM( 16 | main_dir="flickr_rbm", do_pretrain=True, layers=[1024, 1024], 17 | models_dir=config.models_dir, data_dir=config.data_dir, summary_dir=config.summary_dir, 18 | learning_rate=[0.001, 0.001], momentum=0.9, num_epochs=[1, 1], batch_size=[64, 64], 19 | stddev=0.1, verbose=1, gibbs_k=[1, 1], model_name="flickr_dbm", 20 | finetune_learning_rate=0.01, finetune_enc_act_func=[tf.nn.sigmoid, tf.nn.sigmoid], 21 | finetune_dec_act_func=[tf.nn.sigmoid, tf.nn.sigmoid], finetune_num_epochs=1, finetune_batch_size=128, 22 | finetune_opt='momentum', finetune_loss_func="mean_squared", finetune_dropout=0.5, noise=["rsm", "bin"], 23 | ) 24 | 25 | # Initialize documents lengths 26 | for i, _ in enumerate(model.rbms): 27 | model.rbms[i].D = D 28 | 29 | # Pretraining phase 30 | model.pretrain(dataset[:500], dataset[500:1000]) 31 | 32 | # Fit and save the txt-DBM parameters 33 | # I put save_dbm_text_params as a quick hack to save the parameters of this dbm as a numpy array 34 | model.fit(dataset[:500], dataset[500:1000], dataset[1500:2000], dataset[2000:2500], save_dbm_text_params=True) 35 | 36 | 37 | # Output : 38 | # text_dbm_layer_1_W.npy 39 | # text_dbm_layer_1_b.npy 40 | # text_dbm_layer_2_W.npy 41 | # text_dbm_layer_2_b.npy -------------------------------------------------------------------------------- /Code/TrainingLog.txt: -------------------------------------------------------------------------------- 1 | tep 5500 T_CE: 10.546 V_CE: 10.563 V_MAP: 0.460 V_prec50: 0.739 E_CE: 10.548 E_MAP: 0.456 E_prec50: 0.767 2 | Step 6000 T_CE: 10.464 V_CE: 10.495 V_MAP: 0.460 V_prec50: 0.739 E_CE: 10.480 E_MAP: 0.456 E_prec50: 0.766 3 | Step 6500 T_CE: 10.389 V_CE: 10.432 V_MAP: 0.461 V_prec50: 0.741 E_CE: 10.417 E_MAP: 0.457 E_prec50: 0.766 4 | Step 7000 T_CE: 10.318 V_CE: 10.374 V_MAP: 0.461 V_prec50: 0.743 E_CE: 10.359 E_MAP: 0.457 E_prec50: 0.766 5 | Step 7500 T_CE: 10.252 V_CE: 10.320 V_MAP: 0.462 V_prec50: 0.743 E_CE: 10.306 E_MAP: 0.458 E_prec50: 0.766 6 | Step 8000 T_CE: 10.190 V_CE: 10.270 V_MAP: 0.462 V_prec50: 0.745 E_CE: 10.256 E_MAP: 0.458 E_prec50: 0.768 7 | Step 8500 T_CE: 10.132 V_CE: 10.224 V_MAP: 0.463 V_prec50: 0.745 E_CE: 10.210 E_MAP: 0.459 E_prec50: 0.770 8 | Step 9000 T_CE: 10.077 V_CE: 10.181 V_MAP: 0.463 V_prec50: 0.744 E_CE: 10.167 E_MAP: 0.459 E_prec50: 0.772 9 | Step 9500 T_CE: 10.026 V_CE: 10.141 V_MAP: 0.464 V_prec50: 0.746 E_CE: 10.128 E_MAP: 0.460 E_prec50: 0.772 10 | Step 10000 T_CE: 9.977 V_CE: 10.104 V_MAP: 0.465 V_prec50: 0.748 E_CE: 10.091 E_MAP: 0.461 E_prec50: 0.776 11 | Writing checkpoint /home/ubuntu/pynb/flickr/dbn_models/classifiers/split_3/text_input_classifier_1485019805 12 | Best valid MAP : 0.4650 Test MAP 0.4609 13 | Writing current best model /home/ubuntu/pynb/flickr/dbn_models/classifiers/split_3/text_input_classifier_BEST 14 | Split 4 image_input 15 | Step 500 T_CE: 13.448 V_CE: 11.002 V_MAP: 0.394 V_prec50: 0.600 E_CE: 10.927 E_MAP: 0.393 E_prec50: 0.618 16 | Step 1000 T_CE: 7.975 V_CE: 10.074 V_MAP: 0.409 V_prec50: 0.617 E_CE: 9.953 E_MAP: 0.409 E_prec50: 0.648 17 | Step 1500 T_CE: 6.780 V_CE: 9.971 V_MAP: 0.417 V_prec50: 0.626 E_CE: 9.823 E_MAP: 0.414 E_prec50: 0.650 18 | Step 2000 T_CE: 6.148 V_CE: 10.014 V_MAP: 0.417 V_prec50: 0.634 E_CE: 9.851 E_MAP: 0.415 E_prec50: 0.650 19 | Step 2500 T_CE: 
5.727 V_CE: 10.152 V_MAP: 0.420 V_prec50: 0.631 E_CE: 10.006 E_MAP: 0.416 E_prec50: 0.658 20 | Step 3000 T_CE: 5.416 V_CE: 10.378 V_MAP: 0.416 V_prec50: 0.627 E_CE: 10.205 E_MAP: 0.412 E_prec50: 0.652 21 | Step 3500 T_CE: 5.170 V_CE: 10.575 V_MAP: 0.415 V_prec50: 0.636 E_CE: 10.364 E_MAP: 0.412 E_prec50: 0.655 22 | Step 4000 T_CE: 4.968 V_CE: 10.777 V_MAP: 0.410 V_prec50: 0.623 E_CE: 10.554 E_MAP: 0.408 E_prec50: 0.642 23 | Step 4500 T_CE: 4.797 V_CE: 11.057 V_MAP: 0.412 V_prec50: 0.627 E_CE: 10.841 E_MAP: 0.407 E_prec50: 0.650 24 | Step 5000 T_CE: 4.655 V_CE: 11.155 V_MAP: 0.409 V_prec50: 0.625 E_CE: 10.944 E_MAP: 0.405 E_prec50: 0.646 25 | Step 5500 T_CE: 4.530 V_CE: 11.382 V_MAP: 0.403 V_prec50: 0.618 E_CE: 11.159 E_MAP: 0.400 E_prec50: 0.632 26 | Step 6000 T_CE: 4.413 V_CE: 11.519 V_MAP: 0.405 V_prec50: 0.618 E_CE: 11.315 E_MAP: 0.401 E_prec50: 0.632 27 | Step 6500 T_CE: 4.308 V_CE: 11.716 V_MAP: 0.405 V_prec50: 0.621 E_CE: 11.468 E_MAP: 0.400 E_prec50: 0.630 28 | Step 7000 T_CE: 4.227 V_CE: 11.982 V_MAP: 0.402 V_prec50: 0.619 E_CE: 11.730 E_MAP: 0.399 E_prec50: 0.629 29 | Step 7500 T_CE: 4.151 V_CE: 12.090 V_MAP: 0.397 V_prec50: 0.609 E_CE: 11.858 E_MAP: 0.393 E_prec50: 0.623 30 | Step 8000 T_CE: 4.078 V_CE: 12.322 V_MAP: 0.399 V_prec50: 0.619 E_CE: 12.073 E_MAP: 0.396 E_prec50: 0.618 31 | Step 8500 T_CE: 4.011 V_CE: 12.535 V_MAP: 0.395 V_prec50: 0.604 E_CE: 12.261 E_MAP: 0.393 E_prec50: 0.621 32 | Step 9000 T_CE: 3.952 V_CE: 12.685 V_MAP: 0.394 V_prec50: 0.607 E_CE: 12.419 E_MAP: 0.390 E_prec50: 0.615 33 | Step 9500 T_CE: 3.895 V_CE: 12.844 V_MAP: 0.394 V_prec50: 0.605 E_CE: 12.589 E_MAP: 0.391 E_prec50: 0.614 34 | Step 10000 T_CE: 3.846 V_CE: 13.090 V_MAP: 0.394 V_prec50: 0.600 E_CE: 12.809 E_MAP: 0.391 E_prec50: 0.610 35 | Writing checkpoint /home/ubuntu/pynb/flickr/dbn_models/classifiers/split_4/image_input_classifier_1485019830 36 | Best valid MAP : 0.4199 Test MAP 0.4161 37 | Writing current best model /home/ubuntu/pynb/flickr/dbn_models/classifiers/split_4/image_input_classifier_BEST 38 | Split 4 image_hidden1 39 | Step 500 T_CE: 9.622 V_CE: 9.130 V_MAP: 0.404 V_prec50: 0.626 E_CE: 9.032 E_MAP: 0.410 E_prec50: 0.668 40 | Step 1000 T_CE: 8.539 V_CE: 8.839 V_MAP: 0.436 V_prec50: 0.664 E_CE: 8.734 E_MAP: 0.441 E_prec50: 0.712 41 | Step 1500 T_CE: 8.126 V_CE: 8.729 V_MAP: 0.453 V_prec50: 0.686 E_CE: 8.612 E_MAP: 0.456 E_prec50: 0.734 42 | Step 2000 T_CE: 7.842 V_CE: 8.653 V_MAP: 0.463 V_prec50: 0.695 E_CE: 8.521 E_MAP: 0.465 E_prec50: 0.739 43 | Step 2500 T_CE: 7.622 V_CE: 8.613 V_MAP: 0.469 V_prec50: 0.697 E_CE: 8.478 E_MAP: 0.470 E_prec50: 0.745 44 | Step 3000 T_CE: 7.441 V_CE: 8.604 V_MAP: 0.472 V_prec50: 0.702 E_CE: 8.464 E_MAP: 0.473 E_prec50: 0.748 45 | Step 3500 T_CE: 7.286 V_CE: 8.596 V_MAP: 0.474 V_prec50: 0.706 E_CE: 8.452 E_MAP: 0.475 E_prec50: 0.752 46 | Step 4000 T_CE: 7.152 V_CE: 8.602 V_MAP: 0.476 V_prec50: 0.706 E_CE: 8.451 E_MAP: 0.477 E_prec50: 0.755 47 | Step 4500 T_CE: 7.031 V_CE: 8.612 V_MAP: 0.476 V_prec50: 0.707 E_CE: 8.454 E_MAP: 0.477 E_prec50: 0.757 48 | Step 5000 T_CE: 6.924 V_CE: 8.628 V_MAP: 0.477 V_prec50: 0.712 E_CE: 8.471 E_MAP: 0.478 E_prec50: 0.759 49 | Step 5500 T_CE: 6.826 V_CE: 8.645 V_MAP: 0.476 V_prec50: 0.711 E_CE: 8.487 E_MAP: 0.478 E_prec50: 0.760 50 | Step 6000 T_CE: 6.735 V_CE: 8.678 V_MAP: 0.476 V_prec50: 0.712 E_CE: 8.520 E_MAP: 0.477 E_prec50: 0.761 51 | Step 6500 T_CE: 6.652 V_CE: 8.693 V_MAP: 0.476 V_prec50: 0.714 E_CE: 8.538 E_MAP: 0.477 E_prec50: 0.762 52 | Step 7000 T_CE: 6.576 V_CE: 8.730 V_MAP: 0.475 V_prec50: 0.713 E_CE: 8.560 
E_MAP: 0.477 E_prec50: 0.762 53 | Step 7500 T_CE: 6.503 V_CE: 8.737 V_MAP: 0.475 V_prec50: 0.709 E_CE: 8.566 E_MAP: 0.477 E_prec50: 0.759 54 | Step 8000 T_CE: 6.438 V_CE: 8.769 V_MAP: 0.474 V_prec50: 0.707 E_CE: 8.600 E_MAP: 0.476 E_prec50: 0.757 55 | Step 8500 T_CE: 6.375 V_CE: 8.790 V_MAP: 0.474 V_prec50: 0.705 E_CE: 8.618 E_MAP: 0.475 E_prec50: 0.759 56 | Step 9000 T_CE: 6.316 V_CE: 8.828 V_MAP: 0.473 V_prec50: 0.702 E_CE: 8.655 E_MAP: 0.474 E_prec50: 0.757 57 | Step 9500 T_CE: 6.260 V_CE: 8.861 V_MAP: 0.472 V_prec50: 0.704 E_CE: 8.689 E_MAP: 0.473 E_prec50: 0.756 58 | Step 10000 T_CE: 6.209 V_CE: 8.873 V_MAP: 0.471 V_prec50: 0.704 E_CE: 8.710 E_MAP: 0.472 E_prec50: 0.755 59 | Writing checkpoint /home/ubuntu/pynb/flickr/dbn_models/classifiers/split_4/image_hidden1_classifier_1485019852 60 | Best valid MAP : 0.4767 Test MAP 0.4782 61 | Writing current best model /home/ubuntu/pynb/flickr/dbn_models/classifiers/split_4/image_hidden1_classifier_BEST 62 | Split 4 image_hidden2 63 | Step 500 T_CE: 9.905 V_CE: 9.227 V_MAP: 0.407 V_prec50: 0.621 E_CE: 9.164 E_MAP: 0.406 E_prec50: 0.650 64 | Step 1000 T_CE: 8.794 V_CE: 8.900 V_MAP: 0.436 V_prec50: 0.664 E_CE: 8.824 E_MAP: 0.434 E_prec50: 0.691 65 | Step 1500 T_CE: 8.452 V_CE: 8.757 V_MAP: 0.451 V_prec50: 0.676 E_CE: 8.674 E_MAP: 0.449 E_prec50: 0.711 66 | Step 2000 T_CE: 8.235 V_CE: 8.677 V_MAP: 0.459 V_prec50: 0.689 E_CE: 8.591 E_MAP: 0.457 E_prec50: 0.727 67 | Step 2500 T_CE: 8.079 V_CE: 8.642 V_MAP: 0.465 V_prec50: 0.697 E_CE: 8.544 E_MAP: 0.464 E_prec50: 0.730 68 | Step 3000 T_CE: 7.954 V_CE: 8.601 V_MAP: 0.469 V_prec50: 0.698 E_CE: 8.503 E_MAP: 0.468 E_prec50: 0.735 69 | Step 3500 T_CE: 7.851 V_CE: 8.586 V_MAP: 0.472 V_prec50: 0.701 E_CE: 8.483 E_MAP: 0.470 E_prec50: 0.738 70 | Step 4000 T_CE: 7.763 V_CE: 8.581 V_MAP: 0.474 V_prec50: 0.701 E_CE: 8.476 E_MAP: 0.472 E_prec50: 0.741 71 | Step 4500 T_CE: 7.686 V_CE: 8.567 V_MAP: 0.475 V_prec50: 0.703 E_CE: 8.465 E_MAP: 0.474 E_prec50: 0.745 72 | Step 5000 T_CE: 7.619 V_CE: 8.565 V_MAP: 0.476 V_prec50: 0.704 E_CE: 8.457 E_MAP: 0.475 E_prec50: 0.744 73 | Step 5500 T_CE: 7.559 V_CE: 8.568 V_MAP: 0.476 V_prec50: 0.706 E_CE: 8.457 E_MAP: 0.476 E_prec50: 0.745 74 | Step 6000 T_CE: 7.503 V_CE: 8.576 V_MAP: 0.476 V_prec50: 0.707 E_CE: 8.470 E_MAP: 0.476 E_prec50: 0.748 75 | Step 6500 T_CE: 7.453 V_CE: 8.586 V_MAP: 0.477 V_prec50: 0.707 E_CE: 8.470 E_MAP: 0.476 E_prec50: 0.749 76 | Step 7000 T_CE: 7.408 V_CE: 8.581 V_MAP: 0.477 V_prec50: 0.708 E_CE: 8.474 E_MAP: 0.477 E_prec50: 0.749 77 | Step 7500 T_CE: 7.366 V_CE: 8.589 V_MAP: 0.477 V_prec50: 0.711 E_CE: 8.479 E_MAP: 0.477 E_prec50: 0.749 78 | Step 8000 T_CE: 7.327 V_CE: 8.599 V_MAP: 0.477 V_prec50: 0.711 E_CE: 8.484 E_MAP: 0.477 E_prec50: 0.750 79 | Step 8500 T_CE: 7.290 V_CE: 8.608 V_MAP: 0.477 V_prec50: 0.713 E_CE: 8.498 E_MAP: 0.477 E_prec50: 0.752 80 | Step 9000 T_CE: 7.256 V_CE: 8.624 V_MAP: 0.476 V_prec50: 0.713 E_CE: 8.510 E_MAP: 0.476 E_prec50: 0.752 81 | Step 9500 T_CE: 7.225 V_CE: 8.627 V_MAP: 0.476 V_prec50: 0.711 E_CE: 8.510 E_MAP: 0.476 E_prec50: 0.752 82 | Step 10000 T_CE: 7.196 V_CE: 8.631 V_MAP: 0.476 V_prec50: 0.713 E_CE: 8.516 E_MAP: 0.476 E_prec50: 0.755 83 | Writing checkpoint /home/ubuntu/pynb/flickr/dbn_models/classifiers/split_4/image_hidden2_classifier_1485019873 84 | Best valid MAP : 0.4772 Test MAP 0.4767 85 | Writing current best model /home/ubuntu/pynb/flickr/dbn_models/classifiers/split_4/image_hidden2_classifier_BEST 86 | Split 4 joint_hidden 87 | Step 500 T_CE: 8.522 V_CE: 7.899 V_MAP: 0.597 V_prec50: 0.826 E_CE: 7.777 
E_MAP: 0.592 E_prec50: 0.853 88 | Step 1000 T_CE: 7.301 V_CE: 7.634 V_MAP: 0.615 V_prec50: 0.844 E_CE: 7.494 E_MAP: 0.612 E_prec50: 0.868 89 | Step 1500 T_CE: 6.959 V_CE: 7.612 V_MAP: 0.622 V_prec50: 0.845 E_CE: 7.479 E_MAP: 0.620 E_prec50: 0.874 90 | Step 2000 T_CE: 6.745 V_CE: 7.516 V_MAP: 0.626 V_prec50: 0.850 E_CE: 7.360 E_MAP: 0.624 E_prec50: 0.876 91 | Step 2500 T_CE: 6.585 V_CE: 7.501 V_MAP: 0.628 V_prec50: 0.852 E_CE: 7.348 E_MAP: 0.626 E_prec50: 0.878 92 | Step 3000 T_CE: 6.452 V_CE: 7.504 V_MAP: 0.629 V_prec50: 0.852 E_CE: 7.349 E_MAP: 0.627 E_prec50: 0.874 93 | Step 3500 T_CE: 6.341 V_CE: 7.509 V_MAP: 0.629 V_prec50: 0.854 E_CE: 7.354 E_MAP: 0.628 E_prec50: 0.876 94 | Step 4000 T_CE: 6.249 V_CE: 7.526 V_MAP: 0.630 V_prec50: 0.853 E_CE: 7.361 E_MAP: 0.628 E_prec50: 0.876 95 | Step 4500 T_CE: 6.160 V_CE: 7.548 V_MAP: 0.630 V_prec50: 0.856 E_CE: 7.377 E_MAP: 0.628 E_prec50: 0.877 96 | Step 5000 T_CE: 6.084 V_CE: 7.540 V_MAP: 0.629 V_prec50: 0.853 E_CE: 7.371 E_MAP: 0.628 E_prec50: 0.877 97 | Step 5500 T_CE: 6.010 V_CE: 7.607 V_MAP: 0.629 V_prec50: 0.851 E_CE: 7.434 E_MAP: 0.627 E_prec50: 0.877 98 | Step 6000 T_CE: 5.947 V_CE: 7.583 V_MAP: 0.629 V_prec50: 0.851 E_CE: 7.417 E_MAP: 0.627 E_prec50: 0.878 99 | Step 6500 T_CE: 5.881 V_CE: 7.608 V_MAP: 0.628 V_prec50: 0.851 E_CE: 7.451 E_MAP: 0.626 E_prec50: 0.879 100 | Step 7000 T_CE: 5.827 V_CE: 7.648 V_MAP: 0.627 V_prec50: 0.849 E_CE: 7.486 E_MAP: 0.625 E_prec50: 0.878 101 | Step 7500 T_CE: 5.772 V_CE: 7.685 V_MAP: 0.627 V_prec50: 0.848 E_CE: 7.510 E_MAP: 0.624 E_prec50: 0.880 102 | Step 8000 T_CE: 5.727 V_CE: 7.701 V_MAP: 0.626 V_prec50: 0.846 E_CE: 7.540 E_MAP: 0.623 E_prec50: 0.878 103 | Step 8500 T_CE: 5.677 V_CE: 7.683 V_MAP: 0.626 V_prec50: 0.847 E_CE: 7.522 E_MAP: 0.623 E_prec50: 0.877 104 | Step 9000 T_CE: 5.635 V_CE: 7.742 V_MAP: 0.625 V_prec50: 0.848 E_CE: 7.569 E_MAP: 0.622 E_prec50: 0.878 105 | Step 9500 T_CE: 5.595 V_CE: 7.725 V_MAP: 0.624 V_prec50: 0.847 E_CE: 7.568 E_MAP: 0.621 E_prec50: 0.878 106 | Step 10000 T_CE: 5.553 V_CE: 7.768 V_MAP: 0.624 V_prec50: 0.846 E_CE: 7.631 E_MAP: 0.620 E_prec50: 0.877 107 | Writing checkpoint /home/ubuntu/pynb/flickr/dbn_models/classifiers/split_4/joint_hidden_classifier_1485019895 108 | Best valid MAP : 0.6300 Test MAP 0.6282 109 | Writing current best model /home/ubuntu/pynb/flickr/dbn_models/classifiers/split_4/joint_hidden_classifier_BEST 110 | Split 4 text_hidden2 111 | Step 500 T_CE: 10.060 V_CE: 9.404 V_MAP: 0.457 V_prec50: 0.722 E_CE: 9.276 E_MAP: 0.463 E_prec50: 0.776 112 | Step 1000 T_CE: 8.890 V_CE: 9.126 V_MAP: 0.489 V_prec50: 0.766 E_CE: 8.973 E_MAP: 0.497 E_prec50: 0.807 113 | Step 1500 T_CE: 8.586 V_CE: 9.047 V_MAP: 0.502 V_prec50: 0.784 E_CE: 8.880 E_MAP: 0.509 E_prec50: 0.825 114 | Step 2000 T_CE: 8.411 V_CE: 9.006 V_MAP: 0.507 V_prec50: 0.792 E_CE: 8.846 E_MAP: 0.516 E_prec50: 0.833 115 | Step 2500 T_CE: 8.288 V_CE: 8.994 V_MAP: 0.511 V_prec50: 0.797 E_CE: 8.829 E_MAP: 0.519 E_prec50: 0.838 116 | Step 3000 T_CE: 8.191 V_CE: 8.992 V_MAP: 0.513 V_prec50: 0.797 E_CE: 8.822 E_MAP: 0.521 E_prec50: 0.838 117 | Step 3500 T_CE: 8.110 V_CE: 9.007 V_MAP: 0.513 V_prec50: 0.799 E_CE: 8.836 E_MAP: 0.521 E_prec50: 0.839 118 | Step 4000 T_CE: 8.040 V_CE: 9.022 V_MAP: 0.514 V_prec50: 0.802 E_CE: 8.851 E_MAP: 0.521 E_prec50: 0.838 119 | Step 4500 T_CE: 7.981 V_CE: 9.038 V_MAP: 0.513 V_prec50: 0.803 E_CE: 8.865 E_MAP: 0.520 E_prec50: 0.836 120 | Step 5000 T_CE: 7.926 V_CE: 9.050 V_MAP: 0.513 V_prec50: 0.804 E_CE: 8.882 E_MAP: 0.520 E_prec50: 0.837 121 | Step 5500 T_CE: 7.878 V_CE: 9.059 
V_MAP: 0.512 V_prec50: 0.801 E_CE: 8.887 E_MAP: 0.519 E_prec50: 0.837 122 | Step 6000 T_CE: 7.833 V_CE: 9.076 V_MAP: 0.513 V_prec50: 0.802 E_CE: 8.914 E_MAP: 0.518 E_prec50: 0.839 123 | Step 6500 T_CE: 7.792 V_CE: 9.090 V_MAP: 0.512 V_prec50: 0.801 E_CE: 8.917 E_MAP: 0.518 E_prec50: 0.837 124 | Step 7000 T_CE: 7.755 V_CE: 9.105 V_MAP: 0.512 V_prec50: 0.798 E_CE: 8.938 E_MAP: 0.517 E_prec50: 0.838 125 | Step 7500 T_CE: 7.721 V_CE: 9.124 V_MAP: 0.511 V_prec50: 0.799 E_CE: 8.960 E_MAP: 0.516 E_prec50: 0.836 126 | Step 8000 T_CE: 7.689 V_CE: 9.146 V_MAP: 0.511 V_prec50: 0.800 E_CE: 8.981 E_MAP: 0.515 E_prec50: 0.836 127 | Step 8500 T_CE: 7.657 V_CE: 9.162 V_MAP: 0.510 V_prec50: 0.800 E_CE: 8.999 E_MAP: 0.514 E_prec50: 0.835 128 | Step 9000 T_CE: 7.630 V_CE: 9.210 V_MAP: 0.510 V_prec50: 0.798 E_CE: 9.054 E_MAP: 0.513 E_prec50: 0.835 129 | Step 9500 T_CE: 7.604 V_CE: 9.201 V_MAP: 0.509 V_prec50: 0.798 E_CE: 9.028 E_MAP: 0.512 E_prec50: 0.833 130 | Step 10000 T_CE: 7.580 V_CE: 9.208 V_MAP: 0.508 V_prec50: 0.798 E_CE: 9.043 E_MAP: 0.511 E_prec50: 0.832 131 | Writing checkpoint /home/ubuntu/pynb/flickr/dbn_models/classifiers/split_4/text_hidden2_classifier_1485019916 132 | Best valid MAP : 0.5139 Test MAP 0.5206 133 | Writing current best model /home/ubuntu/pynb/flickr/dbn_models/classifiers/split_4/text_hidden2_classifier_BEST 134 | Split 4 text_hidden1 135 | Step 500 T_CE: 9.824 V_CE: 9.375 V_MAP: 0.479 V_prec50: 0.757 E_CE: 9.230 E_MAP: 0.486 E_prec50: 0.803 136 | Step 1000 T_CE: 8.795 V_CE: 9.185 V_MAP: 0.500 V_prec50: 0.779 E_CE: 9.023 E_MAP: 0.507 E_prec50: 0.822 137 | Step 1500 T_CE: 8.501 V_CE: 9.167 V_MAP: 0.505 V_prec50: 0.789 E_CE: 8.991 E_MAP: 0.511 E_prec50: 0.826 138 | Step 2000 T_CE: 8.319 V_CE: 9.131 V_MAP: 0.507 V_prec50: 0.792 E_CE: 8.963 E_MAP: 0.513 E_prec50: 0.828 139 | Step 2500 T_CE: 8.187 V_CE: 9.145 V_MAP: 0.507 V_prec50: 0.794 E_CE: 8.972 E_MAP: 0.513 E_prec50: 0.831 140 | Step 3000 T_CE: 8.080 V_CE: 9.173 V_MAP: 0.508 V_prec50: 0.795 E_CE: 8.995 E_MAP: 0.513 E_prec50: 0.832 141 | Step 3500 T_CE: 7.994 V_CE: 9.207 V_MAP: 0.508 V_prec50: 0.793 E_CE: 9.027 E_MAP: 0.514 E_prec50: 0.835 142 | Step 4000 T_CE: 7.918 V_CE: 9.237 V_MAP: 0.508 V_prec50: 0.794 E_CE: 9.056 E_MAP: 0.511 E_prec50: 0.833 143 | Step 4500 T_CE: 7.854 V_CE: 9.268 V_MAP: 0.507 V_prec50: 0.793 E_CE: 9.085 E_MAP: 0.511 E_prec50: 0.835 144 | Step 5000 T_CE: 7.798 V_CE: 9.326 V_MAP: 0.507 V_prec50: 0.791 E_CE: 9.146 E_MAP: 0.509 E_prec50: 0.836 145 | Step 5500 T_CE: 7.748 V_CE: 9.322 V_MAP: 0.505 V_prec50: 0.789 E_CE: 9.135 E_MAP: 0.507 E_prec50: 0.834 146 | Step 6000 T_CE: 7.701 V_CE: 9.346 V_MAP: 0.504 V_prec50: 0.789 E_CE: 9.174 E_MAP: 0.507 E_prec50: 0.833 147 | Step 6500 T_CE: 7.660 V_CE: 9.399 V_MAP: 0.504 V_prec50: 0.789 E_CE: 9.217 E_MAP: 0.506 E_prec50: 0.832 148 | Step 7000 T_CE: 7.621 V_CE: 9.405 V_MAP: 0.502 V_prec50: 0.788 E_CE: 9.221 E_MAP: 0.504 E_prec50: 0.831 149 | Step 7500 T_CE: 7.589 V_CE: 9.439 V_MAP: 0.501 V_prec50: 0.788 E_CE: 9.264 E_MAP: 0.503 E_prec50: 0.830 150 | Step 8000 T_CE: 7.558 V_CE: 9.482 V_MAP: 0.501 V_prec50: 0.786 E_CE: 9.293 E_MAP: 0.502 E_prec50: 0.828 151 | Step 8500 T_CE: 7.528 V_CE: 9.512 V_MAP: 0.499 V_prec50: 0.785 E_CE: 9.334 E_MAP: 0.501 E_prec50: 0.829 152 | Step 9000 T_CE: 7.504 V_CE: 9.567 V_MAP: 0.498 V_prec50: 0.786 E_CE: 9.388 E_MAP: 0.499 E_prec50: 0.828 153 | Step 9500 T_CE: 7.480 V_CE: 9.559 V_MAP: 0.497 V_prec50: 0.784 E_CE: 9.366 E_MAP: 0.498 E_prec50: 0.829 154 | Step 10000 T_CE: 7.458 V_CE: 9.587 V_MAP: 0.495 V_prec50: 0.785 E_CE: 9.406 E_MAP: 0.495 
E_prec50: 0.827 155 | Writing checkpoint /home/ubuntu/pynb/flickr/dbn_models/classifiers/split_4/text_hidden1_classifier_1485019937 156 | Best valid MAP : 0.5082 Test MAP 0.5136 157 | Writing current best model /home/ubuntu/pynb/flickr/dbn_models/classifiers/split_4/text_hidden1_classifier_BEST 158 | Split 4 text_input 159 | Step 500 T_CE: 16.020 V_CE: 12.802 V_MAP: 0.318 V_prec50: 0.618 E_CE: 12.756 E_MAP: 0.327 E_prec50: 0.683 160 | Step 1000 T_CE: 12.191 V_CE: 11.890 V_MAP: 0.383 V_prec50: 0.680 E_CE: 11.833 E_MAP: 0.394 E_prec50: 0.750 161 | Step 1500 T_CE: 11.646 V_CE: 11.571 V_MAP: 0.410 V_prec50: 0.711 E_CE: 11.506 E_MAP: 0.423 E_prec50: 0.767 162 | Step 2000 T_CE: 11.384 V_CE: 11.376 V_MAP: 0.425 V_prec50: 0.722 E_CE: 11.303 E_MAP: 0.438 E_prec50: 0.771 163 | Step 2500 T_CE: 11.198 V_CE: 11.228 V_MAP: 0.434 V_prec50: 0.725 E_CE: 11.147 E_MAP: 0.447 E_prec50: 0.778 164 | Step 3000 T_CE: 11.047 V_CE: 11.105 V_MAP: 0.441 V_prec50: 0.728 E_CE: 11.017 E_MAP: 0.453 E_prec50: 0.782 165 | Step 3500 T_CE: 10.917 V_CE: 10.998 V_MAP: 0.446 V_prec50: 0.736 E_CE: 10.902 E_MAP: 0.457 E_prec50: 0.786 166 | Step 4000 T_CE: 10.801 V_CE: 10.902 V_MAP: 0.450 V_prec50: 0.737 E_CE: 10.800 E_MAP: 0.460 E_prec50: 0.788 167 | Step 4500 T_CE: 10.695 V_CE: 10.816 V_MAP: 0.452 V_prec50: 0.739 E_CE: 10.709 E_MAP: 0.463 E_prec50: 0.788 168 | Step 5000 T_CE: 10.599 V_CE: 10.739 V_MAP: 0.454 V_prec50: 0.743 E_CE: 10.625 E_MAP: 0.465 E_prec50: 0.788 169 | Step 5500 T_CE: 10.511 V_CE: 10.667 V_MAP: 0.455 V_prec50: 0.744 E_CE: 10.549 E_MAP: 0.467 E_prec50: 0.789 170 | Step 6000 T_CE: 10.430 V_CE: 10.602 V_MAP: 0.456 V_prec50: 0.742 E_CE: 10.479 E_MAP: 0.469 E_prec50: 0.791 171 | Step 6500 T_CE: 10.354 V_CE: 10.542 V_MAP: 0.457 V_prec50: 0.741 E_CE: 10.415 E_MAP: 0.469 E_prec50: 0.793 172 | Step 7000 T_CE: 10.283 V_CE: 10.487 V_MAP: 0.459 V_prec50: 0.743 E_CE: 10.355 E_MAP: 0.470 E_prec50: 0.794 173 | Step 7500 T_CE: 10.217 V_CE: 10.436 V_MAP: 0.459 V_prec50: 0.744 E_CE: 10.301 E_MAP: 0.471 E_prec50: 0.797 174 | Step 8000 T_CE: 10.156 V_CE: 10.388 V_MAP: 0.459 V_prec50: 0.744 E_CE: 10.250 E_MAP: 0.472 E_prec50: 0.795 175 | Step 8500 T_CE: 10.098 V_CE: 10.344 V_MAP: 0.460 V_prec50: 0.746 E_CE: 10.203 E_MAP: 0.472 E_prec50: 0.796 176 | Step 9000 T_CE: 10.044 V_CE: 10.303 V_MAP: 0.461 V_prec50: 0.746 E_CE: 10.159 E_MAP: 0.472 E_prec50: 0.795 177 | Step 9500 T_CE: 9.992 V_CE: 10.265 V_MAP: 0.462 V_prec50: 0.747 E_CE: 10.118 E_MAP: 0.473 E_prec50: 0.795 178 | Step 10000 T_CE: 9.944 V_CE: 10.229 V_MAP: 0.462 V_prec50: 0.749 E_CE: 10.080 E_MAP: 0.473 E_prec50: 0.797 179 | Writing checkpoint /home/ubuntu/pynb/flickr/dbn_models/classifiers/split_4/text_input_classifier_1485019958 180 | Best valid MAP : 0.4623 Test MAP 0.4733 181 | Writing current best model /home/ubuntu/pynb/flickr/dbn_models/classifiers/split_4/text_input_classifier_BEST 182 | Split 5 image_input 183 | Step 500 T_CE: 13.463 V_CE: 11.049 V_MAP: 0.394 V_prec50: 0.603 E_CE: 11.044 E_MAP: 0.384 E_prec50: 0.604 184 | Step 1000 T_CE: 7.993 V_CE: 10.068 V_MAP: 0.420 V_prec50: 0.643 E_CE: 9.996 E_MAP: 0.413 E_prec50: 0.653 185 | Step 1500 T_CE: 6.809 V_CE: 9.912 V_MAP: 0.422 V_prec50: 0.637 E_CE: 9.785 E_MAP: 0.413 E_prec50: 0.656 186 | Step 2000 T_CE: 6.179 V_CE: 10.007 V_MAP: 0.424 V_prec50: 0.647 E_CE: 9.878 E_MAP: 0.416 E_prec50: 0.655 187 | Step 2500 T_CE: 5.764 V_CE: 10.230 V_MAP: 0.421 V_prec50: 0.641 E_CE: 10.061 E_MAP: 0.413 E_prec50: 0.654 188 | Step 3000 T_CE: 5.460 V_CE: 10.354 V_MAP: 0.418 V_prec50: 0.641 E_CE: 10.194 E_MAP: 0.409 E_prec50: 0.641 189 | 
Step 3500 T_CE: 5.204 V_CE: 10.511 V_MAP: 0.415 V_prec50: 0.632 E_CE: 10.354 E_MAP: 0.405 E_prec50: 0.651 190 | Step 4000 T_CE: 5.005 V_CE: 10.745 V_MAP: 0.412 V_prec50: 0.630 E_CE: 10.573 E_MAP: 0.402 E_prec50: 0.642 191 | Step 4500 T_CE: 4.835 V_CE: 10.964 V_MAP: 0.410 V_prec50: 0.623 E_CE: 10.749 E_MAP: 0.398 E_prec50: 0.630 192 | Step 5000 T_CE: 4.691 V_CE: 11.172 V_MAP: 0.407 V_prec50: 0.624 E_CE: 10.946 E_MAP: 0.399 E_prec50: 0.640 193 | Step 5500 T_CE: 4.567 V_CE: 11.379 V_MAP: 0.407 V_prec50: 0.627 E_CE: 11.152 E_MAP: 0.398 E_prec50: 0.634 194 | Step 6000 T_CE: 4.450 V_CE: 11.622 V_MAP: 0.404 V_prec50: 0.614 E_CE: 11.412 E_MAP: 0.393 E_prec50: 0.622 195 | Step 6500 T_CE: 4.360 V_CE: 11.848 V_MAP: 0.405 V_prec50: 0.618 E_CE: 11.582 E_MAP: 0.396 E_prec50: 0.632 196 | Step 7000 T_CE: 4.263 V_CE: 11.913 V_MAP: 0.402 V_prec50: 0.613 E_CE: 11.703 E_MAP: 0.391 E_prec50: 0.624 197 | Step 7500 T_CE: 4.185 V_CE: 12.108 V_MAP: 0.400 V_prec50: 0.610 E_CE: 11.893 E_MAP: 0.388 E_prec50: 0.615 198 | Step 8000 T_CE: 4.119 V_CE: 12.373 V_MAP: 0.398 V_prec50: 0.612 E_CE: 12.090 E_MAP: 0.388 E_prec50: 0.623 199 | Step 8500 T_CE: 4.046 V_CE: 12.548 V_MAP: 0.397 V_prec50: 0.606 E_CE: 12.329 E_MAP: 0.386 E_prec50: 0.615 200 | Step 9000 T_CE: 3.992 V_CE: 12.706 V_MAP: 0.396 V_prec50: 0.606 E_CE: 12.468 E_MAP: 0.386 E_prec50: 0.614 201 | Step 9500 T_CE: 3.937 V_CE: 12.794 V_MAP: 0.396 V_prec50: 0.610 E_CE: 12.569 E_MAP: 0.386 E_prec50: 0.620 202 | Step 10000 T_CE: 3.885 V_CE: 13.040 V_MAP: 0.396 V_prec50: 0.607 E_CE: 12.808 E_MAP: 0.386 E_prec50: 0.621 203 | Writing checkpoint /home/ubuntu/pynb/flickr/dbn_models/classifiers/split_5/image_input_classifier_1485019983 204 | Best valid MAP : 0.4243 Test MAP 0.4155 205 | Writing current best model /home/ubuntu/pynb/flickr/dbn_models/classifiers/split_5/image_input_classifier_BEST 206 | Split 5 image_hidden1 207 | Step 500 T_CE: 9.616 V_CE: 9.216 V_MAP: 0.407 V_prec50: 0.621 E_CE: 9.014 E_MAP: 0.405 E_prec50: 0.657 208 | Step 1000 T_CE: 8.536 V_CE: 8.926 V_MAP: 0.437 V_prec50: 0.660 E_CE: 8.719 E_MAP: 0.433 E_prec50: 0.697 209 | Step 1500 T_CE: 8.130 V_CE: 8.796 V_MAP: 0.452 V_prec50: 0.680 E_CE: 8.585 E_MAP: 0.448 E_prec50: 0.714 210 | Step 2000 T_CE: 7.853 V_CE: 8.730 V_MAP: 0.463 V_prec50: 0.697 E_CE: 8.514 E_MAP: 0.456 E_prec50: 0.728 211 | Step 2500 T_CE: 7.637 V_CE: 8.683 V_MAP: 0.469 V_prec50: 0.701 E_CE: 8.475 E_MAP: 0.462 E_prec50: 0.740 212 | Step 3000 T_CE: 7.459 V_CE: 8.663 V_MAP: 0.474 V_prec50: 0.707 E_CE: 8.459 E_MAP: 0.465 E_prec50: 0.739 213 | Step 3500 T_CE: 7.308 V_CE: 8.653 V_MAP: 0.477 V_prec50: 0.704 E_CE: 8.446 E_MAP: 0.468 E_prec50: 0.746 214 | Step 4000 T_CE: 7.175 V_CE: 8.648 V_MAP: 0.479 V_prec50: 0.714 E_CE: 8.438 E_MAP: 0.470 E_prec50: 0.750 215 | Step 4500 T_CE: 7.057 V_CE: 8.669 V_MAP: 0.481 V_prec50: 0.713 E_CE: 8.454 E_MAP: 0.471 E_prec50: 0.750 216 | Step 5000 T_CE: 6.950 V_CE: 8.672 V_MAP: 0.480 V_prec50: 0.716 E_CE: 8.469 E_MAP: 0.470 E_prec50: 0.750 217 | Step 5500 T_CE: 6.854 V_CE: 8.681 V_MAP: 0.481 V_prec50: 0.714 E_CE: 8.469 E_MAP: 0.471 E_prec50: 0.750 218 | Step 6000 T_CE: 6.764 V_CE: 8.703 V_MAP: 0.482 V_prec50: 0.715 E_CE: 8.491 E_MAP: 0.471 E_prec50: 0.753 219 | Step 6500 T_CE: 6.682 V_CE: 8.725 V_MAP: 0.482 V_prec50: 0.716 E_CE: 8.511 E_MAP: 0.471 E_prec50: 0.750 220 | Step 7000 T_CE: 6.608 V_CE: 8.745 V_MAP: 0.481 V_prec50: 0.718 E_CE: 8.533 E_MAP: 0.470 E_prec50: 0.751 221 | Step 7500 T_CE: 6.537 V_CE: 8.762 V_MAP: 0.481 V_prec50: 0.715 E_CE: 8.555 E_MAP: 0.470 E_prec50: 0.751 222 | Step 8000 T_CE: 6.471 V_CE: 
8.803 V_MAP: 0.481 V_prec50: 0.715 E_CE: 8.586 E_MAP: 0.469 E_prec50: 0.749 223 | Step 8500 T_CE: 6.408 V_CE: 8.830 V_MAP: 0.480 V_prec50: 0.716 E_CE: 8.618 E_MAP: 0.469 E_prec50: 0.749 224 | Step 9000 T_CE: 6.351 V_CE: 8.852 V_MAP: 0.480 V_prec50: 0.715 E_CE: 8.639 E_MAP: 0.468 E_prec50: 0.748 225 | Step 9500 T_CE: 6.297 V_CE: 8.866 V_MAP: 0.479 V_prec50: 0.717 E_CE: 8.655 E_MAP: 0.468 E_prec50: 0.749 226 | Step 10000 T_CE: 6.246 V_CE: 8.886 V_MAP: 0.479 V_prec50: 0.714 E_CE: 8.670 E_MAP: 0.467 E_prec50: 0.746 227 | Writing checkpoint /home/ubuntu/pynb/flickr/dbn_models/classifiers/split_5/image_hidden1_classifier_1485020004 228 | Best valid MAP : 0.4818 Test MAP 0.4710 229 | Writing current best model /home/ubuntu/pynb/flickr/dbn_models/classifiers/split_5/image_hidden1_classifier_BEST 230 | Split 5 image_hidden2 231 | Step 500 T_CE: 9.909 V_CE: 9.313 V_MAP: 0.404 V_prec50: 0.615 E_CE: 9.121 E_MAP: 0.403 E_prec50: 0.646 232 | Step 1000 T_CE: 8.794 V_CE: 8.992 V_MAP: 0.432 V_prec50: 0.648 E_CE: 8.791 E_MAP: 0.429 E_prec50: 0.688 233 | Step 1500 T_CE: 8.454 V_CE: 8.846 V_MAP: 0.446 V_prec50: 0.671 E_CE: 8.641 E_MAP: 0.443 E_prec50: 0.703 234 | Step 2000 T_CE: 8.242 V_CE: 8.766 V_MAP: 0.454 V_prec50: 0.683 E_CE: 8.559 E_MAP: 0.452 E_prec50: 0.714 235 | Step 2500 T_CE: 8.087 V_CE: 8.720 V_MAP: 0.460 V_prec50: 0.693 E_CE: 8.511 E_MAP: 0.458 E_prec50: 0.724 236 | Step 3000 T_CE: 7.964 V_CE: 8.685 V_MAP: 0.464 V_prec50: 0.699 E_CE: 8.478 E_MAP: 0.462 E_prec50: 0.733 237 | Step 3500 T_CE: 7.864 V_CE: 8.674 V_MAP: 0.468 V_prec50: 0.703 E_CE: 8.465 E_MAP: 0.465 E_prec50: 0.735 238 | Step 4000 T_CE: 7.777 V_CE: 8.654 V_MAP: 0.470 V_prec50: 0.708 E_CE: 8.448 E_MAP: 0.467 E_prec50: 0.737 239 | Step 4500 T_CE: 7.703 V_CE: 8.648 V_MAP: 0.472 V_prec50: 0.711 E_CE: 8.441 E_MAP: 0.468 E_prec50: 0.736 240 | Step 5000 T_CE: 7.636 V_CE: 8.646 V_MAP: 0.473 V_prec50: 0.707 E_CE: 8.435 E_MAP: 0.470 E_prec50: 0.736 241 | Step 5500 T_CE: 7.577 V_CE: 8.645 V_MAP: 0.475 V_prec50: 0.708 E_CE: 8.440 E_MAP: 0.470 E_prec50: 0.736 242 | Step 6000 T_CE: 7.524 V_CE: 8.655 V_MAP: 0.475 V_prec50: 0.709 E_CE: 8.445 E_MAP: 0.471 E_prec50: 0.737 243 | Step 6500 T_CE: 7.476 V_CE: 8.654 V_MAP: 0.476 V_prec50: 0.709 E_CE: 8.445 E_MAP: 0.471 E_prec50: 0.737 244 | Step 7000 T_CE: 7.430 V_CE: 8.658 V_MAP: 0.476 V_prec50: 0.711 E_CE: 8.449 E_MAP: 0.471 E_prec50: 0.741 245 | Step 7500 T_CE: 7.390 V_CE: 8.674 V_MAP: 0.476 V_prec50: 0.712 E_CE: 8.466 E_MAP: 0.471 E_prec50: 0.739 246 | Step 8000 T_CE: 7.352 V_CE: 8.670 V_MAP: 0.476 V_prec50: 0.710 E_CE: 8.460 E_MAP: 0.471 E_prec50: 0.737 247 | Step 8500 T_CE: 7.316 V_CE: 8.680 V_MAP: 0.477 V_prec50: 0.710 E_CE: 8.471 E_MAP: 0.471 E_prec50: 0.739 248 | Step 9000 T_CE: 7.283 V_CE: 8.690 V_MAP: 0.476 V_prec50: 0.707 E_CE: 8.481 E_MAP: 0.471 E_prec50: 0.736 249 | Step 9500 T_CE: 7.252 V_CE: 8.700 V_MAP: 0.477 V_prec50: 0.708 E_CE: 8.489 E_MAP: 0.471 E_prec50: 0.739 250 | Step 10000 T_CE: 7.224 V_CE: 8.720 V_MAP: 0.477 V_prec50: 0.707 E_CE: 8.507 E_MAP: 0.471 E_prec50: 0.737 251 | Writing checkpoint /home/ubuntu/pynb/flickr/dbn_models/classifiers/split_5/image_hidden2_classifier_1485020026 252 | Best valid MAP : 0.4767 Test MAP 0.4709 253 | Writing current best model /home/ubuntu/pynb/flickr/dbn_models/classifiers/split_5/image_hidden2_classifier_BEST 254 | Split 5 joint_hidden 255 | Step 500 T_CE: 8.519 V_CE: 7.982 V_MAP: 0.589 V_prec50: 0.820 E_CE: 7.755 E_MAP: 0.584 E_prec50: 0.850 256 | Step 1000 T_CE: 7.298 V_CE: 7.714 V_MAP: 0.609 V_prec50: 0.840 E_CE: 7.473 E_MAP: 0.604 E_prec50: 0.865 
257 | Step 1500 T_CE: 6.954 V_CE: 7.650 V_MAP: 0.617 V_prec50: 0.853 E_CE: 7.405 E_MAP: 0.612 E_prec50: 0.870 258 | Step 2000 T_CE: 6.738 V_CE: 7.618 V_MAP: 0.621 V_prec50: 0.859 E_CE: 7.358 E_MAP: 0.616 E_prec50: 0.873 259 | Step 2500 T_CE: 6.573 V_CE: 7.567 V_MAP: 0.623 V_prec50: 0.860 E_CE: 7.309 E_MAP: 0.619 E_prec50: 0.873 260 | Step 3000 T_CE: 6.442 V_CE: 7.597 V_MAP: 0.625 V_prec50: 0.860 E_CE: 7.333 E_MAP: 0.620 E_prec50: 0.875 261 | Step 3500 T_CE: 6.331 V_CE: 7.574 V_MAP: 0.625 V_prec50: 0.860 E_CE: 7.310 E_MAP: 0.621 E_prec50: 0.876 262 | Step 4000 T_CE: 6.233 V_CE: 7.611 V_MAP: 0.625 V_prec50: 0.858 E_CE: 7.349 E_MAP: 0.621 E_prec50: 0.877 263 | Step 4500 T_CE: 6.143 V_CE: 7.662 V_MAP: 0.625 V_prec50: 0.858 E_CE: 7.387 E_MAP: 0.621 E_prec50: 0.875 264 | Step 5000 T_CE: 6.068 V_CE: 7.621 V_MAP: 0.625 V_prec50: 0.855 E_CE: 7.357 E_MAP: 0.621 E_prec50: 0.873 265 | Step 5500 T_CE: 5.999 V_CE: 7.644 V_MAP: 0.624 V_prec50: 0.856 E_CE: 7.374 E_MAP: 0.620 E_prec50: 0.873 266 | Step 6000 T_CE: 5.926 V_CE: 7.709 V_MAP: 0.624 V_prec50: 0.859 E_CE: 7.433 E_MAP: 0.620 E_prec50: 0.874 267 | Step 6500 T_CE: 5.869 V_CE: 7.716 V_MAP: 0.624 V_prec50: 0.858 E_CE: 7.457 E_MAP: 0.619 E_prec50: 0.873 268 | Step 7000 T_CE: 5.811 V_CE: 7.704 V_MAP: 0.623 V_prec50: 0.855 E_CE: 7.428 E_MAP: 0.618 E_prec50: 0.871 269 | Step 7500 T_CE: 5.760 V_CE: 7.752 V_MAP: 0.622 V_prec50: 0.853 E_CE: 7.485 E_MAP: 0.618 E_prec50: 0.872 270 | Step 8000 T_CE: 5.714 V_CE: 7.816 V_MAP: 0.622 V_prec50: 0.852 E_CE: 7.556 E_MAP: 0.617 E_prec50: 0.872 271 | Step 8500 T_CE: 5.662 V_CE: 7.775 V_MAP: 0.621 V_prec50: 0.852 E_CE: 7.511 E_MAP: 0.616 E_prec50: 0.871 272 | Step 9000 T_CE: 5.621 V_CE: 7.787 V_MAP: 0.621 V_prec50: 0.852 E_CE: 7.510 E_MAP: 0.615 E_prec50: 0.872 273 | Step 9500 T_CE: 5.577 V_CE: 7.833 V_MAP: 0.620 V_prec50: 0.850 E_CE: 7.559 E_MAP: 0.615 E_prec50: 0.871 274 | Step 10000 T_CE: 5.542 V_CE: 7.874 V_MAP: 0.620 V_prec50: 0.852 E_CE: 7.606 E_MAP: 0.614 E_prec50: 0.870 275 | Writing checkpoint /home/ubuntu/pynb/flickr/dbn_models/classifiers/split_5/joint_hidden_classifier_1485020047 276 | Best valid MAP : 0.6252 Test MAP 0.6205 277 | Writing current best model /home/ubuntu/pynb/flickr/dbn_models/classifiers/split_5/joint_hidden_classifier_BEST 278 | Split 5 text_hidden2 279 | Step 500 T_CE: 10.066 V_CE: 9.443 V_MAP: 0.458 V_prec50: 0.732 E_CE: 9.243 E_MAP: 0.465 E_prec50: 0.774 280 | Step 1000 T_CE: 8.888 V_CE: 9.160 V_MAP: 0.487 V_prec50: 0.777 E_CE: 8.943 E_MAP: 0.491 E_prec50: 0.804 281 | Step 1500 T_CE: 8.584 V_CE: 9.065 V_MAP: 0.499 V_prec50: 0.795 E_CE: 8.847 E_MAP: 0.502 E_prec50: 0.813 282 | Step 2000 T_CE: 8.409 V_CE: 9.033 V_MAP: 0.506 V_prec50: 0.796 E_CE: 8.811 E_MAP: 0.509 E_prec50: 0.817 283 | Step 2500 T_CE: 8.283 V_CE: 9.034 V_MAP: 0.509 V_prec50: 0.796 E_CE: 8.806 E_MAP: 0.511 E_prec50: 0.821 284 | Step 3000 T_CE: 8.185 V_CE: 9.031 V_MAP: 0.511 V_prec50: 0.803 E_CE: 8.803 E_MAP: 0.512 E_prec50: 0.821 285 | Step 3500 T_CE: 8.103 V_CE: 9.038 V_MAP: 0.511 V_prec50: 0.805 E_CE: 8.811 E_MAP: 0.513 E_prec50: 0.824 286 | Step 4000 T_CE: 8.033 V_CE: 9.050 V_MAP: 0.512 V_prec50: 0.807 E_CE: 8.824 E_MAP: 0.512 E_prec50: 0.821 287 | Step 4500 T_CE: 7.973 V_CE: 9.067 V_MAP: 0.513 V_prec50: 0.809 E_CE: 8.836 E_MAP: 0.512 E_prec50: 0.821 288 | Step 5000 T_CE: 7.918 V_CE: 9.078 V_MAP: 0.513 V_prec50: 0.810 E_CE: 8.847 E_MAP: 0.512 E_prec50: 0.820 289 | Step 5500 T_CE: 7.869 V_CE: 9.092 V_MAP: 0.512 V_prec50: 0.810 E_CE: 8.865 E_MAP: 0.510 E_prec50: 0.822 290 | Step 6000 T_CE: 7.824 V_CE: 9.124 V_MAP: 0.512 
V_prec50: 0.808 E_CE: 8.892 E_MAP: 0.509 E_prec50: 0.818 291 | Step 6500 T_CE: 7.785 V_CE: 9.133 V_MAP: 0.512 V_prec50: 0.807 E_CE: 8.902 E_MAP: 0.509 E_prec50: 0.820 292 | Step 7000 T_CE: 7.746 V_CE: 9.142 V_MAP: 0.509 V_prec50: 0.806 E_CE: 8.913 E_MAP: 0.508 E_prec50: 0.817 293 | Step 7500 T_CE: 7.714 V_CE: 9.162 V_MAP: 0.510 V_prec50: 0.807 E_CE: 8.937 E_MAP: 0.507 E_prec50: 0.820 294 | Step 8000 T_CE: 7.681 V_CE: 9.176 V_MAP: 0.508 V_prec50: 0.806 E_CE: 8.947 E_MAP: 0.506 E_prec50: 0.819 295 | Step 8500 T_CE: 7.651 V_CE: 9.186 V_MAP: 0.510 V_prec50: 0.805 E_CE: 8.956 E_MAP: 0.506 E_prec50: 0.820 296 | Step 9000 T_CE: 7.625 V_CE: 9.207 V_MAP: 0.509 V_prec50: 0.805 E_CE: 8.977 E_MAP: 0.505 E_prec50: 0.821 297 | Step 9500 T_CE: 7.598 V_CE: 9.227 V_MAP: 0.507 V_prec50: 0.805 E_CE: 8.999 E_MAP: 0.504 E_prec50: 0.821 298 | Step 10000 T_CE: 7.573 V_CE: 9.254 V_MAP: 0.506 V_prec50: 0.802 E_CE: 9.018 E_MAP: 0.503 E_prec50: 0.820 299 | Writing checkpoint /home/ubuntu/pynb/flickr/dbn_models/classifiers/split_5/text_hidden2_classifier_1485020068 300 | Best valid MAP : 0.5129 Test MAP 0.5116 301 | Writing current best model /home/ubuntu/pynb/flickr/dbn_models/classifiers/split_5/text_hidden2_classifier_BEST 302 | Split 5 text_hidden1 303 | Step 500 T_CE: 9.833 V_CE: 9.417 V_MAP: 0.478 V_prec50: 0.768 E_CE: 9.222 E_MAP: 0.479 E_prec50: 0.799 304 | Step 1000 T_CE: 8.805 V_CE: 9.207 V_MAP: 0.497 V_prec50: 0.794 E_CE: 8.996 E_MAP: 0.499 E_prec50: 0.814 305 | Step 1500 T_CE: 8.513 V_CE: 9.160 V_MAP: 0.502 V_prec50: 0.800 E_CE: 8.948 E_MAP: 0.504 E_prec50: 0.818 306 | Step 2000 T_CE: 8.332 V_CE: 9.162 V_MAP: 0.506 V_prec50: 0.800 E_CE: 8.946 E_MAP: 0.506 E_prec50: 0.817 307 | Step 2500 T_CE: 8.198 V_CE: 9.176 V_MAP: 0.507 V_prec50: 0.802 E_CE: 8.955 E_MAP: 0.505 E_prec50: 0.816 308 | Step 3000 T_CE: 8.092 V_CE: 9.188 V_MAP: 0.506 V_prec50: 0.807 E_CE: 8.967 E_MAP: 0.505 E_prec50: 0.820 309 | Step 3500 T_CE: 8.004 V_CE: 9.213 V_MAP: 0.505 V_prec50: 0.803 E_CE: 8.995 E_MAP: 0.503 E_prec50: 0.821 310 | Step 4000 T_CE: 7.928 V_CE: 9.248 V_MAP: 0.505 V_prec50: 0.804 E_CE: 9.028 E_MAP: 0.503 E_prec50: 0.820 311 | Step 4500 T_CE: 7.865 V_CE: 9.286 V_MAP: 0.503 V_prec50: 0.804 E_CE: 9.072 E_MAP: 0.502 E_prec50: 0.820 312 | Step 5000 T_CE: 7.808 V_CE: 9.314 V_MAP: 0.502 V_prec50: 0.802 E_CE: 9.094 E_MAP: 0.501 E_prec50: 0.818 313 | Step 5500 T_CE: 7.757 V_CE: 9.331 V_MAP: 0.502 V_prec50: 0.798 E_CE: 9.109 E_MAP: 0.499 E_prec50: 0.817 314 | Step 6000 T_CE: 7.713 V_CE: 9.403 V_MAP: 0.500 V_prec50: 0.796 E_CE: 9.173 E_MAP: 0.496 E_prec50: 0.815 315 | Step 6500 T_CE: 7.674 V_CE: 9.400 V_MAP: 0.499 V_prec50: 0.797 E_CE: 9.175 E_MAP: 0.497 E_prec50: 0.814 316 | Step 7000 T_CE: 7.635 V_CE: 9.428 V_MAP: 0.497 V_prec50: 0.792 E_CE: 9.201 E_MAP: 0.495 E_prec50: 0.812 317 | Step 7500 T_CE: 7.604 V_CE: 9.471 V_MAP: 0.497 V_prec50: 0.789 E_CE: 9.244 E_MAP: 0.493 E_prec50: 0.812 318 | Step 8000 T_CE: 7.571 V_CE: 9.496 V_MAP: 0.496 V_prec50: 0.786 E_CE: 9.277 E_MAP: 0.492 E_prec50: 0.810 319 | Step 8500 T_CE: 7.544 V_CE: 9.507 V_MAP: 0.497 V_prec50: 0.788 E_CE: 9.284 E_MAP: 0.491 E_prec50: 0.809 320 | Step 9000 T_CE: 7.520 V_CE: 9.555 V_MAP: 0.495 V_prec50: 0.788 E_CE: 9.329 E_MAP: 0.490 E_prec50: 0.807 321 | Step 9500 T_CE: 7.495 V_CE: 9.587 V_MAP: 0.493 V_prec50: 0.785 E_CE: 9.362 E_MAP: 0.489 E_prec50: 0.805 322 | Step 10000 T_CE: 7.472 V_CE: 9.624 V_MAP: 0.492 V_prec50: 0.784 E_CE: 9.392 E_MAP: 0.488 E_prec50: 0.804 323 | Writing checkpoint 
/home/ubuntu/pynb/flickr/dbn_models/classifiers/split_5/text_hidden1_classifier_1485020090 324 | Best valid MAP : 0.5067 Test MAP 0.5052 325 | Writing current best model /home/ubuntu/pynb/flickr/dbn_models/classifiers/split_5/text_hidden1_classifier_BEST 326 | Split 5 text_input 327 | Step 500 T_CE: 16.018 V_CE: 12.827 V_MAP: 0.320 V_prec50: 0.630 E_CE: 12.741 E_MAP: 0.324 E_prec50: 0.670 328 | Step 1000 T_CE: 12.186 V_CE: 11.913 V_MAP: 0.382 V_prec50: 0.687 E_CE: 11.814 E_MAP: 0.389 E_prec50: 0.723 329 | Step 1500 T_CE: 11.635 V_CE: 11.590 V_MAP: 0.407 V_prec50: 0.709 E_CE: 11.487 E_MAP: 0.414 E_prec50: 0.735 330 | Step 2000 T_CE: 11.367 V_CE: 11.392 V_MAP: 0.423 V_prec50: 0.724 E_CE: 11.286 E_MAP: 0.428 E_prec50: 0.746 331 | Step 2500 T_CE: 11.178 V_CE: 11.242 V_MAP: 0.434 V_prec50: 0.732 E_CE: 11.132 E_MAP: 0.437 E_prec50: 0.754 332 | Step 3000 T_CE: 11.025 V_CE: 11.117 V_MAP: 0.439 V_prec50: 0.732 E_CE: 11.004 E_MAP: 0.443 E_prec50: 0.760 333 | Step 3500 T_CE: 10.892 V_CE: 11.009 V_MAP: 0.444 V_prec50: 0.735 E_CE: 10.892 E_MAP: 0.447 E_prec50: 0.765 334 | Step 4000 T_CE: 10.775 V_CE: 10.913 V_MAP: 0.447 V_prec50: 0.739 E_CE: 10.793 E_MAP: 0.449 E_prec50: 0.766 335 | Step 4500 T_CE: 10.670 V_CE: 10.827 V_MAP: 0.449 V_prec50: 0.743 E_CE: 10.703 E_MAP: 0.450 E_prec50: 0.767 336 | Step 5000 T_CE: 10.575 V_CE: 10.749 V_MAP: 0.451 V_prec50: 0.744 E_CE: 10.622 E_MAP: 0.453 E_prec50: 0.767 337 | Step 5500 T_CE: 10.487 V_CE: 10.678 V_MAP: 0.454 V_prec50: 0.745 E_CE: 10.548 E_MAP: 0.454 E_prec50: 0.768 338 | Step 6000 T_CE: 10.406 V_CE: 10.613 V_MAP: 0.455 V_prec50: 0.749 E_CE: 10.481 E_MAP: 0.455 E_prec50: 0.771 339 | Step 6500 T_CE: 10.332 V_CE: 10.554 V_MAP: 0.457 V_prec50: 0.752 E_CE: 10.418 E_MAP: 0.457 E_prec50: 0.773 340 | Step 7000 T_CE: 10.262 V_CE: 10.499 V_MAP: 0.458 V_prec50: 0.753 E_CE: 10.361 E_MAP: 0.458 E_prec50: 0.777 341 | Step 7500 T_CE: 10.198 V_CE: 10.447 V_MAP: 0.459 V_prec50: 0.755 E_CE: 10.308 E_MAP: 0.458 E_prec50: 0.779 342 | Step 8000 T_CE: 10.137 V_CE: 10.400 V_MAP: 0.460 V_prec50: 0.757 E_CE: 10.258 E_MAP: 0.459 E_prec50: 0.780 343 | Step 8500 T_CE: 10.080 V_CE: 10.356 V_MAP: 0.461 V_prec50: 0.759 E_CE: 10.213 E_MAP: 0.460 E_prec50: 0.781 344 | Step 9000 T_CE: 10.027 V_CE: 10.315 V_MAP: 0.462 V_prec50: 0.760 E_CE: 10.170 E_MAP: 0.460 E_prec50: 0.782 345 | Step 9500 T_CE: 9.977 V_CE: 10.277 V_MAP: 0.462 V_prec50: 0.762 E_CE: 10.131 E_MAP: 0.461 E_prec50: 0.784 346 | Step 10000 T_CE: 9.930 V_CE: 10.242 V_MAP: 0.463 V_prec50: 0.762 E_CE: 10.094 E_MAP: 0.461 E_prec50: 0.783 347 | Writing checkpoint /home/ubuntu/pynb/flickr/dbn_models/classifiers/split_5/text_input_classifier_1485020111 348 | Best valid MAP : 0.4629 Test MAP 0.4614 349 | -------------------------------------------------------------------------------- /Code/txtGen/README.md: -------------------------------------------------------------------------------- 1 | Script for text generation for images without tags. 
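2 | 
3 | The idea: clamp the image pathway of the trained multimodal DBM on the features of the untagged images (`img_no_tag.npy`), propagate up to the shared representation, then propagate down the text pathway to obtain, for each image, a distribution over the 2000-tag vocabulary. The sketch below illustrates such an inference pass in NumPy. It is a deliberate simplification: it skips the joint layer, assumes the two pathways end in hidden layers of equal size, and the weight-file names are only assumptions following the naming convention described in `MDBMCode/README.md`. The actual procedure lives in `textGen.ipynb`.
4 | 
5 | ```python
6 | import numpy as np
7 | 
8 | def sigmoid(x):
9 |     return 1.0 / (1.0 + np.exp(-x))
10 | 
11 | # Image features of the untagged images (shipped in this directory).
12 | v_img = np.load('img_no_tag.npy')
13 | 
14 | # Hypothetical weight files, named after the convention in MDBMCode/README.md.
15 | W_i1, b_i1 = np.load('image_dbm_layer_1_W.npy'), np.load('image_dbm_layer_1_b.npy')
16 | W_i2, b_i2 = np.load('image_dbm_layer_2_W.npy'), np.load('image_dbm_layer_2_b.npy')
17 | W_t1 = np.load('text_dbm_layer_1_W.npy')  # text visible -> text hidden 1
18 | W_t2 = np.load('text_dbm_layer_2_W.npy')  # text hidden 1 -> text hidden 2
19 | b_t1 = np.load('text_dbm_layer_1_b.npy')
20 | 
21 | # Bottom-up mean-field pass through the image pathway.
22 | h_i1 = sigmoid(v_img @ W_i1 + b_i1)
23 | h_i2 = sigmoid(h_i1 @ W_i2 + b_i2)
24 | 
25 | # Top-down pass through the text pathway (weights used transposed; the
26 | # joint layer is skipped here for brevity).
27 | h_t1 = sigmoid(h_i2 @ W_t2.T + b_t1)
28 | logits = h_t1 @ W_t1.T  # replicated-softmax visible activations (bias omitted)
29 | 
30 | # Softmax over the 2000-tag vocabulary: one distribution per image.
31 | p = np.exp(logits - logits.max(axis=1, keepdims=True))
32 | p /= p.sum(axis=1, keepdims=True)
33 | 
34 | # The provided text_input_layer-00001-of-00001.npy holds an array of this kind.
35 | np.save('generated_text_layer.npy', p)
36 | print(np.argsort(-p, axis=1)[:, :5])  # indices of the 5 most probable tags per image
37 | ```
38 | 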
--------------------------------------------------------------------------------
/Code/txtGen/img_no_tag.npy:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/abyoussef/DRBM_Project/cff8f57c3f8e510d340de502b26eb527bad4ce23/Code/txtGen/img_no_tag.npy
--------------------------------------------------------------------------------
/Code/txtGen/text_input_layer-00001-of-00001.npy:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/abyoussef/DRBM_Project/cff8f57c3f8e510d340de502b26eb527bad4ce23/Code/txtGen/text_input_layer-00001-of-00001.npy
--------------------------------------------------------------------------------
/FP_AchariBerrada.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/abyoussef/DRBM_Project/cff8f57c3f8e510d340de502b26eb527bad4ce23/FP_AchariBerrada.pdf
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # Multimodal Deep Boltzmann Machine
2 | This is the code for my project on joint representation learning with multimodal Deep Boltzmann Machines. It was carried out for the Probabilistic Graphical Models course and the Object Recognition and Computer Vision course of the MVA Master.
3 | 
4 | The goal is to implement a multimodal DBM model with TensorFlow. It is inspired by the [paper](http://papers.nips.cc/paper/4683-multimodal-learning-with-deep-boltzmann-machines.pdf) by Nitish Srivastava et al. that appeared at NIPS 2012.
5 | 
6 | In the Code directory, you can find:
7 | * **MDBMCode** is an implementation of the Multimodal Deep Boltzmann Machine on top of the [Deep_Learning_Tensorflow](https://github.com/blackecho/Deep-Learning-TensorFlow) library.
8 | 
9 | * **txtGen** is the code for generating tags for images without annotations.
10 | * **TrainingLog.txt** is the log of my training runs (the T_/V_/E_ prefixes appear to denote train/validation/test metrics; CE is cross-entropy, MAP is mean average precision, and prec50 is precision at top 50).
11 | 
--------------------------------------------------------------------------------
/setup_env.sh:
--------------------------------------------------------------------------------
1 | #!/usr/bin/env bash
2 | 
3 | # Download the MIR Flickr 25K dataset and its annotations.
4 | wget http://press.liacs.nl/mirflickr/mirflickr25k/mirflickr25k.zip
5 | wget http://press.liacs.nl/mirflickr/mirflickr25k/mirflickr25k_annotations_v080.zip
6 | 
7 | # Download the features of all the images (1M).
8 | wget http://www.cs.toronto.edu/~nitish/multimodal/flickr_data.tar.gz
9 | 
10 | # Download the original images for viewing.
11 | wget http://press.liacs.nl/mirflickr/mirflickr1m/images0.zip
12 | 
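13 | # Extract the archives. This step is an assumed addition, not part of the
14 | # original script; move or symlink the extracted data so the paths match
15 | # those configured in Code/MDBMCode/config.py.
16 | unzip -q mirflickr25k.zip
17 | unzip -q mirflickr25k_annotations_v080.zip
18 | tar -xzf flickr_data.tar.gz
19 | unzip -q images0.zip
--------------------------------------------------------------------------------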