├── LICENSE
├── README.md
├── demo.py
├── input.py
├── model.py
└── operations.py

/LICENSE:
--------------------------------------------------------------------------------
1 | MIT License
2 | 
3 | Copyright (c) 2017 Arthur Meyer
4 | 
5 | Permission is hereby granted, free of charge, to any person obtaining a copy
6 | of this software and associated documentation files (the "Software"), to deal
7 | in the Software without restriction, including without limitation the rights
8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9 | copies of the Software, and to permit persons to whom the Software is
10 | furnished to do so, subject to the following conditions:
11 | 
12 | The above copyright notice and this permission notice shall be included in all
13 | copies or substantial portions of the Software.
14 | 
15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21 | SOFTWARE.
22 | 
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # Saliency_Detection_Convolutional_Autoencoder
2 | Saliency detection with a convolutional autoencoder, with an edge contrast penalty term added to the loss to enforce sharp edges
3 | 
4 | 
5 | 
6 | ## Description of the project
7 | This project is a convolutional autoencoder that performs saliency detection.
8 | Specifically, it generates saliency maps directly from raw pixel inputs.
9 | Both the encoder and the decoder are based on the VGG architecture.
10 | A specific penalty term has been added to the loss to improve performance, as well as direct connections between the convolutional and deconvolutional layers.
11 | 
12 | ## Model overview
13 | ![figure1](https://user-images.githubusercontent.com/26786663/27525317-b3026976-5a77-11e7-8767-8f4a06e5b696.jpg)
14 | 
15 | ## Sample results
16 | From left to right: image, ground truth, baseline, baseline with direct connections, baseline with direct connections and contrast penalty term added to the loss
17 | ![figure6-2e](https://user-images.githubusercontent.com/26786663/27525314-af6375da-5a77-11e7-882c-1646e016a0a3.jpg)
18 | 
19 | ## Requirements
20 | - Python 2.7.6 or 3
21 | - TensorFlow 0.12 with GPU support (the code uses pre-1.0 calls such as `tf.mul` and `tf.concat(axis, values)`, which changed in TensorFlow 1.0)
22 | - NumPy
23 | - Pillow (`input.py` imports `PIL.Image`)
24 | 
25 | 
26 | ## How to use
27 | 1. First, make sure that the images and their labels are located in the appropriate folders.
28 | If you plan to use the most common datasets, such as [MSRA10K](http://mmcheng.net/msra10k/), [ECSSD](http://www.cse.cuhk.edu.hk/leojia/projects/hsaliency/dataset.html) or [DUT-OMRON](http://saliencydetection.net/dut-omron/#outline-container-orgheadline8), note that `demo.py` reads validation and test data from `.../dataset_split/PHASE/NAME/images/*` and `.../dataset_split/PHASE/NAME/labels/*`, where `PHASE` is `valid` or `test` and `NAME = msra10k` for MSRA10K, `NAME = ecssd` for ECSSD and `NAME = dutomron` for DUT-OMRON, and training data from `.../dataset_split/train/images/*` and `.../dataset_split/train/labels/*`. These paths are relative to the folder containing the Python files.
29 | 
30 | 2. Then you need to have a model available (meaning a file with the weights of the network). By default, there are three available models: B, B+D and B+D+E. B is the baseline autoencoder, B+D adds direct connections, and B+D+E has both direct connections and the edge contrast penalty enabled. On the command line they are selected with `-m B`, `-m BD` and `-m BDE`.
If you have no model, you can start by training one from scratch using the following command:
31 | ```
32 | demo.py -o train -m B -init scratch
33 | ```
34 | If you want to train a model from available weights, such as VGG16 trained on [ILSVRC](http://www.image-net.org/papers/imagenet_cvpr09.pdf), you can use the following command:
35 | ```
36 | demo.py -o train -m B -init pretrain
37 | ```
38 | For example, you can download the weight file [here](https://www.cs.toronto.edu/~frossard/post/vgg16/) and place it in the folder `.../vgg_weight/*`
39 | To save the final weights of the baseline model so that other models can be trained from this point, make sure to use the command
40 | ```
41 | demo.py -o train -m B -init pretrain -s_copy
42 | ```
43 | Afterward, you can train a different model with these weights using a command such as
44 | ```
45 | demo.py -o train -m BDE -init restore_w_only
46 | ```
47 | Finally, you can resume the training of the same model and train for a final 1000 steps with
48 | ```
49 | demo.py -o train -m BDE -init restore -step 1000
50 | ```
51 | 
52 | 3. If a model is available, you can test its performance, for instance on the ECSSD dataset, with
53 | ```
54 | demo.py -o score -m B -p test -d ecssd -save
55 | ```
56 | This command will also save the resulting saliency maps in `.../log/MODEL_NAME/*` where `MODEL_NAME` depends on which model you are testing.
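Before training or scoring, it can help to confirm that every image has a matching label. The sketch below is not part of the repository: `find_missing_labels` is a hypothetical helper, written on the assumption that labels follow the naming rule used in `input.py` (image files containing `.jpg`, label files named after the part of the image name before the first dot, with a `.png` extension).

```python
import os


def find_missing_labels(folder_image, folder_label, format_image=".jpg"):
    """Return the image files that have no matching '.png' label file.

    Hypothetical helper: it mirrors the file selection and label naming
    used by the data iterator in input.py, but is not part of the project.
    """
    missing = []
    for name in sorted(os.listdir(folder_image)):
        if format_image not in name:
            continue  # input.py only picks up files containing this extension
        label = name.split(".")[0] + ".png"  # same naming rule as input.py
        if not os.path.isfile(os.path.join(folder_label, label)):
            missing.append(name)
    return missing


# Example call, assuming the layout that demo.py builds for `-p test -d ecssd`:
# root = os.path.join("dataset_split", "test", "ecssd")
# print(find_missing_labels(os.path.join(root, "images"), os.path.join(root, "labels")))
```

An empty list means each selected image has a label; any name returned would make the data thread fail when it tries to open the missing `.png` file.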
57 | 
--------------------------------------------------------------------------------
/demo.py:
--------------------------------------------------------------------------------
1 | """ --------------------------------------------------
2 | author: arthur meyer
3 | email: arthur.meyer.38@gmail.com
4 | status: final
5 | version: v2.0
6 | --------------------------------------------------"""
7 | 
8 | 
9 | 
10 | import os
11 | import sys
12 | import numpy as np
13 | import tensorflow as tf
14 | 
15 | import input
16 | import operations
17 | 
18 | 
19 | 
20 | HEIGHT = 224
21 | WIDTH = 224
22 | BATCH_SIZE = 16
23 | STEPS = 100*1000
24 | PATH = os.path.dirname(os.path.abspath(__file__))  # folder containing demo.py, independent of the path separator
25 | LOG_FOLDER = PATH + '/log'
26 | 
27 | L_R = 0.0001  # learning rate
28 | WD = 0.00001  # weight decay factor
29 | VERB = 0  # verbosity
30 | 
31 | MODEL_NAME = 'BDE'
32 | PHASE = 'test'
33 | DATASET = 'ecssd'
34 | AUX = 'msra10k'  # auxiliary dataset used for validation during training
35 | 
36 | 
37 | 
38 | def display_warning(name, specific = None):
39 |     """
40 |     Display a warning message for an invalid parameter, then exit
41 | 
42 |     Args:
43 |         name        : name of the parameter where the warning occurs
44 |         specific    : may include a specific message to display
45 |     """
46 | 
47 |     print('------------------------------------------------------')
48 |     print('---------------------- WARNING -----------------------')
49 |     print('------------------------------------------------------')
50 |     print('--------- invalid argument for parameter -%s ---------' % (name))
51 |     print('------------------------------------------------------')
52 |     print(specific)
53 |     print('------------------------------------------------------\n')
54 |     sys.exit(1)  # non-zero status signals the invalid argument to the caller
55 | 
56 | 
57 | 
58 | 
59 | def dataset_config():
60 |     """
61 |     Return the appropriate input managers
62 | 
63 |     Returns:
64 |         handler     : manager of data from the class input.py for the main data stream
65 |         handler_bis : manager of data from the class input.py for the validation data stream that is only used during training
66 |     """
67 | 
68 |     if PHASE == 'train':
69 |         folder_im = PATH + 
"/dataset_split/train/images/" 70 | folder_lab = PATH + "/dataset_split/train/labels/" 71 | handler = input.handler(HEIGHT, WIDTH, BATCH_SIZE, folder_im, folder_lab, random = True) 72 | 73 | print('------------------------------------------------------') 74 | print('Queueing data ....') 75 | print('------------------------------------------------------') 76 | print('from TRAINING dataset -- auxiliary dataset is %s' % (AUX.upper())) 77 | print('------------------------------------------------------') 78 | print('Height: %d -- width: %d -- batch size: %d -- log folder: %s' % (HEIGHT, WIDTH, BATCH_SIZE, LOG_FOLDER)) 79 | print('------------------------------------------------------\n') 80 | 81 | elif PHASE == 'valid' or PHASE == 'test': 82 | folder_im = PATH + "/dataset_split/" + PHASE + "/" + DATASET + "/images/" 83 | folder_lab = PATH + "/dataset_split/" + PHASE + "/" + DATASET + "/labels/" 84 | handler = input.handler(HEIGHT, WIDTH, BATCH_SIZE, folder_im, folder_lab, random = False) 85 | 86 | print('------------------------------------------------------') 87 | print('Queueing data ....') 88 | print('------------------------------------------------------') 89 | print('Dataset is %s -- from split %s' % (DATASET.upper(), PHASE.upper())) 90 | print('------------------------------------------------------') 91 | print('Height: %d -- width: %d -- batch size: %d -- log folder: %s' % (HEIGHT, WIDTH, BATCH_SIZE, LOG_FOLDER)) 92 | print('------------------------------------------------------\n') 93 | 94 | folder_im = PATH + "/dataset_split/valid/" + AUX + "/images/" 95 | folder_lab = PATH + "/dataset_split/valid/" + AUX + "/labels/" 96 | handler_bis = input.handler(HEIGHT, WIDTH, BATCH_SIZE, folder_im, folder_lab, random = False) 97 | 98 | return handler, handler_bis 99 | 100 | 101 | 102 | 103 | def model_config(): 104 | """ 105 | Return the approriate model 106 | 107 | Returns: 108 | model : create model with the appropriate configuration, B is for baseline BD for baseline 
with direct connections and BDE for baseline with direct connections and edge contrast penalty 109 | """ 110 | 111 | if MODEL_NAME == 'B': 112 | model = operations.create_model('VGG_CE_noDetails_further' , BATCH_SIZE, learning_rate = L_R, wd = WD, concat = False, l2_loss = False, penalty = False, verbosity = VERB) 113 | 114 | elif MODEL_NAME == 'BD': 115 | model = operations.create_model('VGG_CE_Details_c' , BATCH_SIZE, learning_rate = L_R, wd = WD, concat = True, l2_loss = False, penalty = False, verbosity = VERB) 116 | 117 | elif MODEL_NAME == 'BDE': 118 | model = operations.create_model('VGG_CE_Details_new_from_pretrain', BATCH_SIZE, learning_rate = L_R, wd = WD, concat = True, l2_loss = False, penalty = True, coef = 0.8, verbosity = VERB) 119 | 120 | elif MODEL_NAME == 'new': 121 | model = operations.create_model('test', BATCH_SIZE, learning_rate = L_R, wd = WD, concat = True, l2_loss = False, penalty = True, coef = 0.8, verbosity = VERB) 122 | 123 | return model 124 | 125 | 126 | 127 | 128 | def do_operation(sess, model): 129 | """ 130 | Do the wanted operation with the model given in parameter 131 | 132 | Args: 133 | sess : tensorflow session 134 | model : configured model 135 | """ 136 | 137 | if OPERATION == 'train': 138 | stream_input, stream_input_bis = dataset_config() 139 | operations.do_train(model, sess, stream_input, stream_input_bis, STEPS, LOG_FOLDER, INITIALISATION, weight_file = W_FILE, model_to_copy = M_COPY, model_copy_is_concat = C_C, valid = VALID, dataset = AUX, save_copy = S_COPY) 140 | 141 | elif OPERATION == 'score': 142 | stream_input, _ = dataset_config() 143 | operations.compute_score(model, sess, stream_input, LOG_FOLDER, DATASET, PHASE, write = WRITE, save = SAVE) 144 | 145 | elif OPERATION == 'tracking': 146 | operations.visual_tracking(model, sess, LOG_FOLDER, PATH + '/experiments/visual_tracking/' + TRACKING) 147 | 148 | elif OPERATION == 'infer': 149 | stream_input, _ = dataset_config() 150 | operations.compute_inter(model, sess, 
stream_input, LOG_FOLDER, DATASET, PHASE, ARITH_TYPE) 151 | 152 | elif OPERATION == 'nearest': 153 | stream_input, _ = dataset_config() 154 | operations.do_nearest(model, sess, stream_input, LOG_FOLDER, DATASET, PHASE) 155 | 156 | elif OPERATION == 'void': 157 | _, _ = dataset_config() 158 | 159 | 160 | 161 | 162 | if __name__ == '__main__': 163 | 164 | sess = tf.Session() 165 | l = sys.argv[1:] 166 | main_flag = False 167 | 168 | #PARSING Set the global variable with appropriate value 169 | for i in range(len(l)): 170 | 171 | if l[i] == '-batch': 172 | try: 173 | BATCH_SIZE = int(l[i+1]) 174 | except Exception: 175 | display_warning('batch') 176 | 177 | if l[i] == '-step': 178 | try: 179 | STEPS = int(l[i+1]) 180 | except Exception: 181 | display_warning('step') 182 | 183 | if l[i] == '-height': 184 | try: 185 | HEIGHT = int(l[i+1]) 186 | except Exception: 187 | display_warning('height') 188 | 189 | if l[i] == '-width': 190 | try: 191 | WIDTH = int(l[i+1]) 192 | except Exception: 193 | display_warning('width') 194 | 195 | if l[i] == '-lr': 196 | try: 197 | L_R = float(l[i+1]) 198 | except Exception: 199 | display_warning('lr') 200 | 201 | if l[i] == '-wd': 202 | try: 203 | WD = float(l[i+1]) 204 | except Exception: 205 | display_warning('wd') 206 | 207 | if l[i] == '-m': 208 | if l[i+1] in ['B', 'BD', 'BDE', 'new']: 209 | MODEL_NAME = l[i+1] 210 | else: 211 | display_warning('m') 212 | 213 | if l[i] == '-d': 214 | if l[i+1] in ['msra10k', 'ecssd', 'dutomron']: 215 | DATASET = l[i+1] 216 | else: 217 | display_warning('d') 218 | 219 | if l[i] == '-p': 220 | if l[i+1] in ['train', 'test', 'valid']: 221 | PHASE = l[i+1] 222 | else: 223 | display_warning('p') 224 | 225 | if l[i] == '-o': 226 | if l[i+1] in ['train', 'score', 'infer', 'nearest', 'tracking', 'void']: 227 | OPERATION = l[i+1] 228 | main_flag = True 229 | 230 | if l[i+1] == 'train': 231 | global W_FILE 232 | W_FILE = 'vgg_weight/vgg16_weights.npz' 233 | global M_COPY 234 | M_COPY = 
'saliency_VGG_CE_noDetails' 235 | global VALID 236 | VALID = True 237 | global S_COPY 238 | S_COPY = False 239 | global C_C 240 | C_C = False 241 | 242 | for j in range(len(l)): 243 | if l[j] == '-w_file': 244 | W_FILE = l[j+1] 245 | elif l[j] == '-m_copy': 246 | M_COPY = l[j+1] 247 | elif l[j] == '-no_valid': 248 | VALID = False 249 | elif l[j] == '-s_copy': 250 | S_COPY = True 251 | elif l[j] == '-c_c': 252 | C_C = True 253 | elif l[j] == '-aux': 254 | if l[j+1] in ['msra10k', 'ecssd', 'dutomron']: 255 | AUX = l[j+1] 256 | else: 257 | display_warning('aux') 258 | 259 | flag = False 260 | for j in range(len(l)): 261 | if l[j] == '-init': 262 | if flag: 263 | display_warning('init', specific ='Multiple arguments') 264 | else: 265 | if l[j+1] in ['scratch', 'restore_w_only', 'restore', 'pretrain']: 266 | global INITIALISATION 267 | INITIALISATION = l[j+1] 268 | flag = True 269 | else: 270 | display_warning('init', specific ='Value incorrect') 271 | if not flag: 272 | display_warning('init', specific ='No initialization specified') 273 | 274 | if l[i+1] == 'score': 275 | global WRITE 276 | WRITE = False 277 | global SAVE 278 | SAVE = False 279 | for e in l: 280 | if e == '-write': 281 | WRITE = True 282 | elif e == '-save': 283 | SAVE = True 284 | 285 | if l[i+1] == 'tracking': 286 | flag = False 287 | for j in range(len(l)): 288 | if l[j] == '-video': 289 | if flag: 290 | display_warning('video', specific ='Multiple arguments') 291 | else: 292 | if l[j+1] in ['BlurBody', 'Dog', 'Girl', 'Gym']: 293 | global TRACKING 294 | TRACKING = l[j+1] 295 | flag = True 296 | else: 297 | display_warning('video', specific ='Name incorrect') 298 | if not flag: 299 | display_warning('tracking', specific ='No video name specified') 300 | 301 | if l[i+1] == 'infer': 302 | flag = False 303 | for j in range(len(l)): 304 | if l[j] == '-type': 305 | if flag: 306 | display_warning('type', specific ='Multiple arguments') 307 | else: 308 | if l[j+1] in ['1', '2', '3']: 309 | global 
ARITH_TYPE 310 | ARITH_TYPE = int(l[j+1]) 311 | flag = True 312 | else: 313 | display_warning('type', specific ='Type incorrect') 314 | if not flag: 315 | display_warning('infer', specific ='No type specified') 316 | 317 | else: 318 | display_warning('o') 319 | 320 | if main_flag: 321 | if OPERATION == 'train': 322 | PHASE = 'train' 323 | VERB = 1 324 | model = model_config() 325 | do_operation(sess, model) 326 | else: 327 | print('Nothing to be done') -------------------------------------------------------------------------------- /input.py: -------------------------------------------------------------------------------- 1 | """ -------------------------------------------------- 2 | author: arthur meyer 3 | email: arthur.meyer.38@gmail.com 4 | status: final 5 | version: v2.0 6 | --------------------------------------------------""" 7 | 8 | 9 | 10 | from __future__ import division 11 | import os 12 | import threading 13 | import numpy as np 14 | import tensorflow as tf 15 | from PIL import Image 16 | 17 | 18 | 19 | class handler(object): 20 | """ 21 | This class run a thread that queue data 22 | Overall this class is responsible for managing the flow of data 23 | """ 24 | 25 | def __init__(self, hight, width, batch_size, folder_image, folder_label, format_image = '.jpg' , random = True): 26 | """ 27 | Args: 28 | hight : hight of samples 29 | width : width of samples 30 | batch_size : batch size 31 | folder_image : the folder where the images are 32 | folder_label : the folder where the ground truth are 33 | format_image : format of images (usually jpg) 34 | random : is the queue shuffled (for training) or not (FIFO for test related tasks) 35 | """ 36 | 37 | self.hight = hight 38 | self.width = width 39 | self.batch_size = batch_size 40 | self.image = np.array([f for f in os.listdir(folder_image) if format_image in f]) 41 | self.f1 = folder_image 42 | self.f2 = folder_label 43 | self.size_epoch = len(self.image) 44 | if random: 45 | self.queue = 
tf.RandomShuffleQueue(shapes=[(self.hight,self.width,3), (self.hight,self.width), []],dtypes=[tf.float32, tf.float32, tf.string],capacity=16*self.batch_size, min_after_dequeue=8*self.batch_size) 46 | else: 47 | self.queue = tf.FIFOQueue(shapes=[(self.hight,self.width,3), (self.hight,self.width), []],dtypes=[tf.float32, tf.float32, tf.string],capacity=16*self.batch_size) 48 | self.image_pl = tf.placeholder(tf.float32, shape=(batch_size,hight,width,3)) 49 | self.label_pl = tf.placeholder(tf.float32, shape=(batch_size,hight,width)) 50 | self.name_pl = tf.placeholder(tf.string, shape=(batch_size)) 51 | self.enqueue_op = self.queue.enqueue_many([self.image_pl, self.label_pl, self.name_pl]) 52 | 53 | 54 | 55 | def get_inputs(self): 56 | """ 57 | Getter for the data in the queue 58 | 59 | Returns: 60 | A tensor of size 'self.batch_size' of data 61 | """ 62 | 63 | return self.queue.dequeue_many(self.batch_size) 64 | 65 | 66 | 67 | def start_threads(self, sess): 68 | """ 69 | Start the thread where the data is put into the queue 70 | 71 | Args: 72 | sess : the context for the thread, here a tensorflow session 73 | 74 | Returns: 75 | t : the thread started 76 | """ 77 | 78 | t = threading.Thread(target=self._thread_main, args=(sess, )) 79 | t.daemon = True 80 | t.start() 81 | return t 82 | 83 | 84 | 85 | def _thread_main(self, sess): 86 | """ 87 | The main thread where data is queued 88 | 89 | Args: 90 | sess : the context for the thread, here a tensorflow session 91 | """ 92 | 93 | for images, labels, names in self._data_iterator(): 94 | sess.run(self.enqueue_op, feed_dict = {self.image_pl : images, self.label_pl : labels, self.name_pl : names}) 95 | 96 | 97 | 98 | def _data_iterator(self): 99 | """ 100 | The iterator on the data managed by the class. 
Here images are read and delivered to the queue
101 |         """
102 | 
103 |         while True: #Main loop where each epoch is shuffled
104 | 
105 |             batch_index = 0
106 |             index = np.arange(0, self.size_epoch)
107 |             np.random.shuffle(index)
108 |             shuffled_image = self.image[index]
109 | 
110 |             while batch_index + self.batch_size <= self.size_epoch: #Loop on one epoch (an incomplete last batch is dropped)
111 | 
112 |                 images_names = shuffled_image[batch_index : batch_index + self.batch_size]
113 |                 batch_index += self.batch_size
114 |                 images_batch = np.empty((0,self.hight,self.width,3))
115 |                 label_batch = np.empty((0,self.hight,self.width))
116 | 
117 |                 for f in images_names: #Loop on one batch
118 | 
119 |                     im = Image.open(os.path.join(self.f1, f)) #First the image
120 |                     im.load()
121 |                     im = im.resize((self.width ,self.hight ))
122 |                     im = np.asarray(im, dtype="uint8" ) #uint8 keeps pixel values in 0..255 (int8 would wrap values above 127 to negatives)
123 |                     if len(np.shape(im)) != 3 : #If not 3 channels then warning is displayed
124 |                         print('----- WARNING -----')
125 |                         print('This image is not in RGB format:')
126 |                         print(f)
127 |                     images_batch = np.append(images_batch, [im], axis=0)
128 | 
129 |                     im = Image.open(os.path.join(self.f2, os.path.splitext(f)[0] + '.png')) #Then the ground truth (same base name, .png extension)
130 |                     im.load()
131 |                     im = im.resize((self.width ,self.hight ))
132 |                     im = np.asarray(im, dtype="int16" )
133 |                     label_batch = np.append(label_batch, [im], axis=0)
134 | 
135 |                 index = np.arange(0, self.batch_size) #Shuffle samples within the batch
136 |                 np.random.shuffle(index)
137 |                 images_batch_shuffled = images_batch[index]
138 |                 label_batch_shuffled = label_batch[index]
139 |                 names_batch_shuffled = images_names[index]
140 | 
141 |                 yield images_batch_shuffled/255, label_batch_shuffled/255, names_batch_shuffled #Values scaled to [0, 1]
--------------------------------------------------------------------------------
/model.py:
--------------------------------------------------------------------------------
1 | """ --------------------------------------------------
2 | author: arthur meyer
3 | email: arthur.meyer.38@gmail.com
4 | status: final
5 | version: v2.0
6 | 
--------------------------------------------------""" 7 | 8 | 9 | 10 | from __future__ import division 11 | import tensorflow as tf 12 | import numpy as np 13 | 14 | 15 | 16 | class MODEL(object): 17 | """ 18 | Model description: 19 | conv : vgg 20 | deconv : vgg + 1 more 21 | fc layer : 2 22 | loss : flexible 23 | direct 24 | connections : flexible (if yes then 111 110) 25 | edge contrast : flexible 26 | """ 27 | 28 | def __init__(self, name, batch_size, learning_rate, wd, concat, l2_loss, penalty, coef): 29 | """ 30 | Args: 31 | name : name of the model (used to create a specific folder to save/load parameters) 32 | batch_size : batch size 33 | learning_rate : learning_rate 34 | wd : weight decay factor 35 | concat : does this model include direct connections? 36 | l2_loss : does this model use l2 loss (if not then cross entropy) 37 | penalty : whether to use the edge contrast penalty 38 | coef : coef for the edge contrast penalty 39 | """ 40 | 41 | self.name = 'saliency_' + name 42 | self.losses = 'loss_of_' + self.name 43 | self.losses_decay = 'loss_of_' + self.name +'_decay' 44 | self.batch_size = batch_size 45 | self.learning_rate = learning_rate 46 | self.wd = wd 47 | self.moving_avg_decay = 0.9999 48 | self.concat = concat 49 | self.l2_loss = l2_loss 50 | self.penalty = penalty 51 | self.coef = coef 52 | self.parameters_conv = [] 53 | self.parameters_deconv = [] 54 | self.deconv = [] 55 | 56 | with tf.device('/cpu:0'): 57 | # conv1_1 58 | with tf.variable_scope(self.name + '_' + 'conv1_1') as scope: 59 | kernel = tf.get_variable('kernel', (3, 3, 3, 64), initializer=tf.truncated_normal_initializer(stddev=1e-1, dtype=tf.float32), dtype=tf.float32) 60 | biases = tf.get_variable('biases', [64], initializer=tf.constant_initializer(0), dtype=tf.float32) 61 | weight_decay = tf.mul(tf.nn.l2_loss(kernel), self.wd) 62 | tf.add_to_collection(self.losses, weight_decay) 63 | tf.add_to_collection(self.losses_decay, weight_decay) 64 | self.parameters_conv += [kernel, 
biases] 65 | 66 | # conv1_2 67 | with tf.variable_scope(self.name + '_' + 'conv1_2') as scope: 68 | kernel = tf.get_variable('kernel', (3, 3, 64, 64), initializer=tf.truncated_normal_initializer(stddev=1e-1, dtype=tf.float32), dtype=tf.float32) 69 | biases = tf.get_variable('biases', [64], initializer=tf.constant_initializer(0), dtype=tf.float32) 70 | weight_decay = tf.mul(tf.nn.l2_loss(kernel), self.wd) 71 | tf.add_to_collection(self.losses, weight_decay) 72 | tf.add_to_collection(self.losses_decay, weight_decay) 73 | self.parameters_conv += [kernel, biases] 74 | 75 | # conv2_1 76 | with tf.variable_scope(self.name + '_' + 'conv2_1') as scope: 77 | kernel = tf.get_variable('kernel', (3, 3, 64, 128), initializer=tf.truncated_normal_initializer(stddev=1e-1, dtype=tf.float32), dtype=tf.float32) 78 | biases = tf.get_variable('biases', [128], initializer=tf.constant_initializer(0), dtype=tf.float32) 79 | weight_decay = tf.mul(tf.nn.l2_loss(kernel), self.wd) 80 | tf.add_to_collection(self.losses_decay, weight_decay) 81 | tf.add_to_collection(self.losses, weight_decay) 82 | self.parameters_conv += [kernel, biases] 83 | 84 | # conv2_2 85 | with tf.variable_scope(self.name + '_' + 'conv2_2') as scope: 86 | kernel = tf.get_variable('kernel', (3, 3, 128, 128), initializer=tf.truncated_normal_initializer(stddev=1e-1, dtype=tf.float32), dtype=tf.float32) 87 | biases = tf.get_variable('biases', [128], initializer=tf.constant_initializer(0), dtype=tf.float32) 88 | weight_decay = tf.mul(tf.nn.l2_loss(kernel), self.wd) 89 | tf.add_to_collection(self.losses_decay, weight_decay) 90 | tf.add_to_collection(self.losses, weight_decay) 91 | self.parameters_conv += [kernel, biases] 92 | 93 | # conv3_1 94 | with tf.variable_scope(self.name + '_' + 'conv3_1') as scope: 95 | kernel = tf.get_variable('kernel', (3, 3, 128, 256), initializer=tf.truncated_normal_initializer(stddev=1e-1, dtype=tf.float32), dtype=tf.float32) 96 | biases = tf.get_variable('biases', [256], 
initializer=tf.constant_initializer(0), dtype=tf.float32) 97 | weight_decay = tf.mul(tf.nn.l2_loss(kernel), self.wd) 98 | tf.add_to_collection(self.losses_decay, weight_decay) 99 | tf.add_to_collection(self.losses, weight_decay) 100 | self.parameters_conv += [kernel, biases] 101 | 102 | # conv3_2 103 | with tf.variable_scope(self.name + '_' + 'conv3_2') as scope: 104 | kernel = tf.get_variable('kernel', (3, 3, 256, 256), initializer=tf.truncated_normal_initializer(stddev=1e-1, dtype=tf.float32), dtype=tf.float32) 105 | biases = tf.get_variable('biases', [256], initializer=tf.constant_initializer(0), dtype=tf.float32) 106 | weight_decay = tf.mul(tf.nn.l2_loss(kernel), self.wd) 107 | tf.add_to_collection(self.losses_decay, weight_decay) 108 | tf.add_to_collection(self.losses, weight_decay) 109 | self.parameters_conv += [kernel, biases] 110 | 111 | # conv3_3 112 | with tf.variable_scope(self.name + '_' + 'conv3_3') as scope: 113 | kernel = tf.get_variable('kernel', (3, 3, 256, 256), initializer=tf.truncated_normal_initializer(stddev=1e-1, dtype=tf.float32), dtype=tf.float32) 114 | biases = tf.get_variable('biases', [256], initializer=tf.constant_initializer(0), dtype=tf.float32) 115 | weight_decay = tf.mul(tf.nn.l2_loss(kernel), self.wd) 116 | tf.add_to_collection(self.losses_decay, weight_decay) 117 | tf.add_to_collection(self.losses, weight_decay) 118 | self.parameters_conv += [kernel, biases] 119 | 120 | # conv4_1 121 | with tf.variable_scope(self.name + '_' + 'conv4_1') as scope: 122 | kernel = tf.get_variable('kernel', (3, 3, 256, 512), initializer=tf.truncated_normal_initializer(stddev=1e-1, dtype=tf.float32), dtype=tf.float32) 123 | biases = tf.get_variable('biases', [512], initializer=tf.constant_initializer(0), dtype=tf.float32) 124 | weight_decay = tf.mul(tf.nn.l2_loss(kernel), self.wd) 125 | tf.add_to_collection(self.losses, weight_decay) 126 | tf.add_to_collection(self.losses_decay, weight_decay) 127 | self.parameters_conv += [kernel, biases] 128 | 129 | # 
conv4_2 130 | with tf.variable_scope(self.name + '_' + 'conv4_2') as scope: 131 | kernel = tf.get_variable('kernel', (3, 3, 512, 512), initializer=tf.truncated_normal_initializer(stddev=1e-1, dtype=tf.float32), dtype=tf.float32) 132 | biases = tf.get_variable('biases', [512], initializer=tf.constant_initializer(0), dtype=tf.float32) 133 | weight_decay = tf.mul(tf.nn.l2_loss(kernel), self.wd) 134 | tf.add_to_collection(self.losses, weight_decay) 135 | tf.add_to_collection(self.losses_decay, weight_decay) 136 | self.parameters_conv += [kernel, biases] 137 | 138 | # conv4_3 139 | with tf.variable_scope(self.name + '_' + 'conv4_3') as scope: 140 | kernel = tf.get_variable('kernel', (3, 3, 512, 512), initializer=tf.truncated_normal_initializer(stddev=1e-1, dtype=tf.float32), dtype=tf.float32) 141 | biases = tf.get_variable('biases', [512], initializer=tf.constant_initializer(0), dtype=tf.float32) 142 | weight_decay = tf.mul(tf.nn.l2_loss(kernel), self.wd) 143 | tf.add_to_collection(self.losses, weight_decay) 144 | tf.add_to_collection(self.losses_decay, weight_decay) 145 | self.parameters_conv += [kernel, biases] 146 | 147 | # conv5_1 148 | with tf.variable_scope(self.name + '_' + 'conv5_1') as scope: 149 | kernel = tf.get_variable('kernel', (3, 3, 512, 512), initializer=tf.truncated_normal_initializer(stddev=1e-1, dtype=tf.float32), dtype=tf.float32) 150 | biases = tf.get_variable('biases', [512], initializer=tf.constant_initializer(0), dtype=tf.float32) 151 | weight_decay = tf.mul(tf.nn.l2_loss(kernel), self.wd) 152 | tf.add_to_collection(self.losses_decay, weight_decay) 153 | tf.add_to_collection(self.losses, weight_decay) 154 | self.parameters_conv += [kernel, biases] 155 | 156 | # conv5_2 157 | with tf.variable_scope(self.name + '_' + 'conv5_2') as scope: 158 | kernel = tf.get_variable('kernel', (3, 3, 512, 512), initializer=tf.truncated_normal_initializer(stddev=1e-1, dtype=tf.float32), dtype=tf.float32) 159 | biases = tf.get_variable('biases', [512], 
initializer=tf.constant_initializer(0), dtype=tf.float32) 160 | weight_decay = tf.mul(tf.nn.l2_loss(kernel), self.wd) 161 | tf.add_to_collection(self.losses_decay, weight_decay) 162 | tf.add_to_collection(self.losses, weight_decay) 163 | self.parameters_conv += [kernel, biases] 164 | 165 | # conv5_3 166 | with tf.variable_scope(self.name + '_' + 'conv5_3') as scope: 167 | kernel = tf.get_variable('kernel', (3, 3, 512, 512), initializer=tf.truncated_normal_initializer(stddev=1e-1, dtype=tf.float32), dtype=tf.float32) 168 | biases = tf.get_variable('biases', [512], initializer=tf.constant_initializer(0), dtype=tf.float32) 169 | weight_decay = tf.mul(tf.nn.l2_loss(kernel), self.wd) 170 | tf.add_to_collection(self.losses_decay, weight_decay) 171 | tf.add_to_collection(self.losses, weight_decay) 172 | self.parameters_conv += [kernel, biases] 173 | 174 | # fc1 175 | with tf.variable_scope(self.name + '_' + 'fc1') as scope: 176 | fc1w = tf.get_variable('fc1w', [7*7*512,4096], initializer=tf.truncated_normal_initializer(stddev=1e-1, dtype=tf.float32), dtype=tf.float32) 177 | fc1b = tf.get_variable('fc1b', [4096], initializer=tf.constant_initializer(0), dtype=tf.float32) 178 | self.parameters_conv += [fc1w, fc1b] 179 | 180 | # fc2 181 | with tf.variable_scope(self.name + '_' + 'fc2') as scope: 182 | fc2w = tf.get_variable('fc2w', [4096,4096], initializer=tf.truncated_normal_initializer(stddev=1e-1, dtype=tf.float32), dtype=tf.float32) 183 | fc2b = tf.get_variable('fc2b', [4096], initializer=tf.constant_initializer(0), dtype=tf.float32) 184 | self.parameters_conv += [fc2w, fc2b] 185 | 186 | # deconv0 187 | with tf.variable_scope(self.name + '_' + 'deconv0') as scope: 188 | if self.concat: 189 | kernel = tf.get_variable('kernel', (3, 3, 1, 195), initializer=tf.truncated_normal_initializer(stddev=1e-1, dtype=tf.float32), dtype=tf.float32) 190 | else: 191 | kernel = tf.get_variable('kernel', (3, 3, 1, 64), initializer=tf.truncated_normal_initializer(stddev=1e-1, 
dtype=tf.float32), dtype=tf.float32) 192 | biases = tf.get_variable('biases', [1], initializer=tf.constant_initializer(0), dtype=tf.float32) 193 | weight_decay = tf.mul(tf.nn.l2_loss(kernel), self.wd) 194 | tf.add_to_collection(self.losses, weight_decay) 195 | tf.add_to_collection(self.losses_decay, weight_decay) 196 | self.deconv += [kernel, biases] 197 | 198 | # deconv1_1 199 | with tf.variable_scope(self.name + '_' + 'deconv1_1') as scope: 200 | if self.concat: 201 | kernel = tf.get_variable('kernel', (3, 3, 64, 195), initializer=tf.truncated_normal_initializer(stddev=1e-1, dtype=tf.float32), dtype=tf.float32) 202 | else: 203 | kernel = tf.get_variable('kernel', (3, 3, 64, 64), initializer=tf.truncated_normal_initializer(stddev=1e-1, dtype=tf.float32), dtype=tf.float32) 204 | biases = tf.get_variable('biases', [64], initializer=tf.constant_initializer(0), dtype=tf.float32) 205 | weight_decay = tf.mul(tf.nn.l2_loss(kernel), self.wd) 206 | tf.add_to_collection(self.losses, weight_decay) 207 | tf.add_to_collection(self.losses_decay, weight_decay) 208 | self.parameters_deconv += [kernel, biases] 209 | 210 | # deconv1_2 211 | with tf.variable_scope(self.name + '_' + 'deconv1_2') as scope: 212 | if self.concat: 213 | kernel1 = tf.get_variable('kernel1', (3, 3, 64, 64), initializer=tf.truncated_normal_initializer(stddev=1e-1, dtype=tf.float32), dtype=tf.float32) 214 | kernel2 = tf.get_variable('kernel2', (3, 3, 64, 387), initializer=tf.truncated_normal_initializer(stddev=1e-1, dtype=tf.float32), dtype=tf.float32) 215 | biases = tf.get_variable('biases', [64], initializer=tf.constant_initializer(0), dtype=tf.float32) 216 | weight_decay = tf.mul(tf.nn.l2_loss(tf.concat(3,[kernel1,kernel2])), self.wd) 217 | tf.add_to_collection(self.losses, weight_decay) 218 | tf.add_to_collection(self.losses_decay, weight_decay) 219 | self.parameters_deconv += [[kernel1,kernel2], biases] 220 | else: 221 | kernel = tf.get_variable('kernel', (3, 3, 64, 64), 
initializer=tf.truncated_normal_initializer(stddev=1e-1, dtype=tf.float32), dtype=tf.float32) 222 | biases = tf.get_variable('biases', [64], initializer=tf.constant_initializer(0), dtype=tf.float32) 223 | weight_decay = tf.mul(tf.nn.l2_loss(kernel), self.wd) 224 | tf.add_to_collection(self.losses, weight_decay) 225 | tf.add_to_collection(self.losses_decay, weight_decay) 226 | self.parameters_deconv += [kernel, biases] 227 | 228 | # deconv2_1 229 | with tf.variable_scope(self.name + '_' + 'deconv2_1') as scope: 230 | if self.concat: 231 | kernel1 = tf.get_variable('kernel1', (3, 3, 64, 128), initializer=tf.truncated_normal_initializer(stddev=1e-1, dtype=tf.float32), dtype=tf.float32) 232 | kernel2 = tf.get_variable('kernel2', (3, 3, 64, 387), initializer=tf.truncated_normal_initializer(stddev=1e-1, dtype=tf.float32), dtype=tf.float32) 233 | biases = tf.get_variable('biases', [64], initializer=tf.constant_initializer(0), dtype=tf.float32) 234 | weight_decay = tf.mul(tf.nn.l2_loss(tf.concat(3,[kernel1,kernel2])), self.wd) 235 | tf.add_to_collection(self.losses, weight_decay) 236 | tf.add_to_collection(self.losses_decay, weight_decay) 237 | self.parameters_deconv += [[kernel1,kernel2], biases] 238 | else: 239 | kernel = tf.get_variable('kernel', (3, 3, 64, 128), initializer=tf.truncated_normal_initializer(stddev=1e-1, dtype=tf.float32), dtype=tf.float32) 240 | biases = tf.get_variable('biases', [64], initializer=tf.constant_initializer(0), dtype=tf.float32) 241 | weight_decay = tf.mul(tf.nn.l2_loss(kernel), self.wd) 242 | tf.add_to_collection(self.losses, weight_decay) 243 | tf.add_to_collection(self.losses_decay, weight_decay) 244 | self.parameters_deconv += [kernel, biases] 245 | 246 | # deconv2_2 247 | with tf.variable_scope(self.name + '_' + 'deconv2_2') as scope: 248 | kernel = tf.get_variable('kernel', (3, 3, 128, 128), initializer=tf.truncated_normal_initializer(stddev=1e-1, dtype=tf.float32), dtype=tf.float32) 249 | biases = tf.get_variable('biases', [128], 
initializer=tf.constant_initializer(0), dtype=tf.float32) 250 | weight_decay = tf.mul(tf.nn.l2_loss(kernel), self.wd) 251 | tf.add_to_collection(self.losses, weight_decay) 252 | tf.add_to_collection(self.losses_decay, weight_decay) 253 | self.parameters_deconv += [kernel, biases] 254 | 255 | # deconv3_1 256 | with tf.variable_scope(self.name + '_' + 'deconv3_1') as scope: 257 | kernel = tf.get_variable('kernel', (3, 3, 128, 256), initializer=tf.truncated_normal_initializer(stddev=1e-1, dtype=tf.float32), dtype=tf.float32) 258 | biases = tf.get_variable('biases', [128], initializer=tf.constant_initializer(0), dtype=tf.float32) 259 | weight_decay = tf.mul(tf.nn.l2_loss(kernel), self.wd) 260 | tf.add_to_collection(self.losses, weight_decay) 261 | tf.add_to_collection(self.losses_decay, weight_decay) 262 | self.parameters_deconv += [kernel, biases] 263 | 264 | # deconv3_2 265 | with tf.variable_scope(self.name + '_' + 'deconv3_2') as scope: 266 | kernel = tf.get_variable('kernel', (3, 3, 256, 256), initializer=tf.truncated_normal_initializer(stddev=1e-1, dtype=tf.float32), dtype=tf.float32) 267 | biases = tf.get_variable('biases', [256], initializer=tf.constant_initializer(0), dtype=tf.float32) 268 | weight_decay = tf.mul(tf.nn.l2_loss(kernel), self.wd) 269 | tf.add_to_collection(self.losses, weight_decay) 270 | tf.add_to_collection(self.losses_decay, weight_decay) 271 | self.parameters_deconv += [kernel, biases] 272 | 273 | # deconv3_3 274 | with tf.variable_scope(self.name + '_' + 'deconv3_3') as scope: 275 | kernel = tf.get_variable('kernel', (3, 3, 256, 256), initializer=tf.truncated_normal_initializer(stddev=1e-1, dtype=tf.float32), dtype=tf.float32) 276 | biases = tf.get_variable('biases', [256], initializer=tf.constant_initializer(0), dtype=tf.float32) 277 | weight_decay = tf.mul(tf.nn.l2_loss(kernel), self.wd) 278 | tf.add_to_collection(self.losses, weight_decay) 279 | tf.add_to_collection(self.losses_decay, weight_decay) 280 | self.parameters_deconv += 
[kernel, biases] 281 | 282 | # deconv4_1 283 | with tf.variable_scope(self.name + '_' + 'deconv4_1') as scope: 284 | kernel = tf.get_variable('kernel', (3, 3, 256, 512), initializer=tf.truncated_normal_initializer(stddev=1e-1, dtype=tf.float32), dtype=tf.float32) 285 | biases = tf.get_variable('biases', [256], initializer=tf.constant_initializer(0), dtype=tf.float32) 286 | weight_decay = tf.mul(tf.nn.l2_loss(kernel), self.wd) 287 | tf.add_to_collection(self.losses, weight_decay) 288 | tf.add_to_collection(self.losses_decay, weight_decay) 289 | self.parameters_deconv += [kernel, biases] 290 | 291 | # deconv4_2 292 | with tf.variable_scope(self.name + '_' + 'deconv4_2') as scope: 293 | kernel = tf.get_variable('kernel', (3, 3, 512, 512), initializer=tf.truncated_normal_initializer(stddev=1e-1, dtype=tf.float32), dtype=tf.float32) 294 | biases = tf.get_variable('biases', [512], initializer=tf.constant_initializer(0), dtype=tf.float32) 295 | weight_decay = tf.mul(tf.nn.l2_loss(kernel), self.wd) 296 | tf.add_to_collection(self.losses, weight_decay) 297 | tf.add_to_collection(self.losses_decay, weight_decay) 298 | self.parameters_deconv += [kernel, biases] 299 | 300 | # deconv4_3 301 | with tf.variable_scope(self.name + '_' + 'deconv4_3') as scope: 302 | kernel = tf.get_variable('kernel', (3, 3, 512, 512), initializer=tf.truncated_normal_initializer(stddev=1e-1, dtype=tf.float32), dtype=tf.float32) 303 | biases = tf.get_variable('biases', [512], initializer=tf.constant_initializer(0), dtype=tf.float32) 304 | weight_decay = tf.mul(tf.nn.l2_loss(kernel), self.wd) 305 | tf.add_to_collection(self.losses, weight_decay) 306 | tf.add_to_collection(self.losses_decay, weight_decay) 307 | self.parameters_deconv += [kernel, biases] 308 | 309 | # deconv5_1 310 | with tf.variable_scope(self.name + '_' + 'deconv5_1') as scope: 311 | kernel = tf.get_variable('kernel', (3, 3, 512, 512), initializer=tf.truncated_normal_initializer(stddev=1e-1, dtype=tf.float32), dtype=tf.float32) 312 | 
biases = tf.get_variable('biases', [512], initializer=tf.constant_initializer(0), dtype=tf.float32) 313 | weight_decay = tf.mul(tf.nn.l2_loss(kernel), self.wd) 314 | tf.add_to_collection(self.losses, weight_decay) 315 | tf.add_to_collection(self.losses_decay, weight_decay) 316 | self.parameters_deconv += [kernel, biases] 317 | 318 | # deconv5_2 319 | with tf.variable_scope(self.name + '_' + 'deconv5_2') as scope: 320 | kernel = tf.get_variable('kernel', (3, 3,512, 512), initializer=tf.truncated_normal_initializer(stddev=1e-1, dtype=tf.float32), dtype=tf.float32) 321 | biases = tf.get_variable('biases', [512], initializer=tf.constant_initializer(0), dtype=tf.float32) 322 | weight_decay = tf.mul(tf.nn.l2_loss(kernel), self.wd) 323 | tf.add_to_collection(self.losses, weight_decay) 324 | tf.add_to_collection(self.losses_decay, weight_decay) 325 | self.parameters_deconv += [kernel, biases] 326 | 327 | # deconv5_3 328 | with tf.variable_scope(self.name + '_' + 'deconv5_3') as scope: 329 | kernel = tf.get_variable('kernel', (3, 3, 512, 512), initializer=tf.truncated_normal_initializer(stddev=1e-1, dtype=tf.float32), dtype=tf.float32) 330 | biases = tf.get_variable('biases', [512], initializer=tf.constant_initializer(0), dtype=tf.float32) 331 | weight_decay = tf.mul(tf.nn.l2_loss(kernel), self.wd) 332 | tf.add_to_collection(self.losses, weight_decay) 333 | tf.add_to_collection(self.losses_decay, weight_decay) 334 | self.parameters_deconv += [kernel, biases] 335 | 336 | # de_fc1 337 | with tf.variable_scope(self.name + '_' + 'defc1') as scope: 338 | fc1w = tf.get_variable('fc1w', [4096,7*7*512], initializer=tf.truncated_normal_initializer(stddev=1e-1, dtype=tf.float32), dtype=tf.float32) 339 | fc1b = tf.get_variable('fc1b', [7*7*512], initializer=tf.constant_initializer(0), dtype=tf.float32) 340 | self.parameters_deconv += [fc1w, fc1b] 341 | 342 | # de_fc2 343 | with tf.variable_scope(self.name + '_' + 'defc2') as scope: 344 | fc2w = tf.get_variable('fc2w', [4096,4096], 
initializer=tf.truncated_normal_initializer(stddev=1e-1, dtype=tf.float32), dtype=tf.float32) 345 | fc2b = tf.get_variable('fc2b', [4096], initializer=tf.constant_initializer(0), dtype=tf.float32) 346 | self.parameters_deconv += [fc2w, fc2b] 347 | 348 | 349 | 350 | 351 | 352 | def display_info(self, verbosity): 353 | """ 354 | Display information about this model 355 | 356 | Args: 357 | verbosity : level of detail to display 358 | """ 359 | 360 | print('------------------------------------------------------') 361 | print('This model is %s' % (self.name)) 362 | print('------------------------------------------------------') 363 | if verbosity > 0: 364 | print('Learning rate: %0.8f -- Weight decay: %0.8f -- Cross entropy loss: %r' % (self.learning_rate , self.wd, not self.l2_loss)) 365 | print('------------------------------------------------------') 366 | print('Direct connections: %r' % (self.concat)) 367 | print('------------------------------------------------------') 368 | print('Edge contrast penalty: %r -- coefficient %0.5f' % (self.penalty, self.coef)) 369 | print('------------------------------------------------------\n') 370 | 371 | 372 | 373 | 374 | 375 | def infer(self, images, inter_layer = False, arithmetic = None, debug = False): 376 | """ 377 | Return saliency maps for the given images 378 | 379 | Args: 380 | images : input images 381 | inter_layer : whether to return the middle layer code 382 | arithmetic : type of special operation on the middle layer encoding (1 is add, 2 is subtract, 3 is linear combination) 383 | debug : whether to return an extra value used for debugging (control value) 384 | 385 | Returns: 386 | out : saliency maps of the input 387 | control_value : mean of the output activations, used to debug training (None unless debug is set) 388 | inter_layer_out : value of the middle layer (None unless inter_layer is set) 389 | """ 390 | 391 | control_value = None 392 | inter_layer_out = None 393 | 394 | if self.concat: 395 | detail = [] 396 | detail_bis = [] 397 | detail += [tf.image.resize_images(images,[112,112])] 398 
| detail_bis += [images] 399 | 400 | # conv1_1 401 | with tf.variable_scope(self.name + '_' + 'conv1_1') as scope: 402 | conv = tf.nn.conv2d(images, self.parameters_conv[0], [1, 1, 1, 1], padding='SAME') 403 | out = tf.nn.bias_add(conv, self.parameters_conv[1]) 404 | relu = tf.nn.relu(out) 405 | norm = tf.nn.lrn(relu, 4, bias=1.0, alpha=0.001 / 9.0, beta=0.75) 406 | if self.concat: 407 | detail += [tf.image.resize_images(norm,[112,112])] 408 | detail_bis += [norm] 409 | 410 | # conv1_2 411 | with tf.variable_scope(self.name + '_' + 'conv1_2') as scope: 412 | conv = tf.nn.conv2d(norm, self.parameters_conv[2], [1, 1, 1, 1], padding='SAME') 413 | out = tf.nn.bias_add(conv, self.parameters_conv[3]) 414 | relu = tf.nn.relu(out) 415 | norm = tf.nn.lrn(relu, 4, bias=1.0, alpha=0.001 / 9.0, beta=0.75) 416 | if self.concat: 417 | detail += [tf.image.resize_images(norm,[112,112])] 418 | detail_bis += [norm] 419 | 420 | # pool1 421 | pool1 = tf.nn.max_pool(norm,ksize=[1, 2, 2, 1],strides=[1, 2, 2, 1],padding='SAME',name='pool1') 422 | 423 | # conv2_1 424 | with tf.variable_scope(self.name + '_' + 'conv2_1') as scope: 425 | conv = tf.nn.conv2d(pool1, self.parameters_conv[4], [1, 1, 1, 1], padding='SAME') 426 | out = tf.nn.bias_add(conv, self.parameters_conv[5]) 427 | relu = tf.nn.relu(out) 428 | norm = tf.nn.lrn(relu, 4, bias=1.0, alpha=0.001 / 9.0, beta=0.75) 429 | if self.concat: 430 | detail += [norm] 431 | 432 | # conv2_2 433 | with tf.variable_scope(self.name + '_' + 'conv2_2') as scope: 434 | conv = tf.nn.conv2d(norm, self.parameters_conv[6], [1, 1, 1, 1], padding='SAME') 435 | out = tf.nn.bias_add(conv, self.parameters_conv[7]) 436 | relu = tf.nn.relu(out) 437 | norm = tf.nn.lrn(relu, 4, bias=1.0, alpha=0.001 / 9.0, beta=0.75) 438 | if self.concat: 439 | detail += [norm] 440 | 441 | # pool2 442 | pool2 = tf.nn.max_pool(norm,ksize=[1, 2, 2, 1],strides=[1, 2, 2, 1],padding='SAME',name='pool2') 443 | 444 | # conv3_1 445 | with tf.variable_scope(self.name + '_' + 'conv3_1') 
as scope: 446 | conv = tf.nn.conv2d(pool2, self.parameters_conv[8], [1, 1, 1, 1], padding='SAME') 447 | out = tf.nn.bias_add(conv, self.parameters_conv[9]) 448 | relu = tf.nn.relu(out) 449 | norm = tf.nn.lrn(relu, 4, bias=1.0, alpha=0.001 / 9.0, beta=0.75) 450 | 451 | # conv3_2 452 | with tf.variable_scope(self.name + '_' + 'conv3_2') as scope: 453 | conv = tf.nn.conv2d(norm, self.parameters_conv[10], [1, 1, 1, 1], padding='SAME') 454 | out = tf.nn.bias_add(conv, self.parameters_conv[11]) 455 | relu = tf.nn.relu(out) 456 | norm = tf.nn.lrn(relu, 4, bias=1.0, alpha=0.001 / 9.0, beta=0.75) 457 | 458 | # conv3_3 459 | with tf.variable_scope(self.name + '_' + 'conv3_3') as scope: 460 | conv = tf.nn.conv2d(norm, self.parameters_conv[12], [1, 1, 1, 1], padding='SAME') 461 | out = tf.nn.bias_add(conv, self.parameters_conv[13]) 462 | relu = tf.nn.relu(out) 463 | norm = tf.nn.lrn(relu, 4, bias=1.0, alpha=0.001 / 9.0, beta=0.75) 464 | 465 | # pool3 466 | pool3 = tf.nn.max_pool(norm, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME', name='pool3') 467 | 468 | # conv4_1 469 | with tf.variable_scope(self.name + '_' + 'conv4_1') as scope: 470 | conv = tf.nn.conv2d(pool3, self.parameters_conv[14], [1, 1, 1, 1], padding='SAME') 471 | out = tf.nn.bias_add(conv, self.parameters_conv[15]) 472 | relu = tf.nn.relu(out) 473 | norm = tf.nn.lrn(relu, 4, bias=1.0, alpha=0.001 / 9.0, beta=0.75) 474 | 475 | # conv4_2 476 | with tf.variable_scope(self.name + '_' + 'conv4_2') as scope: 477 | conv = tf.nn.conv2d(norm, self.parameters_conv[16], [1, 1, 1, 1], padding='SAME') 478 | out = tf.nn.bias_add(conv, self.parameters_conv[17]) 479 | relu = tf.nn.relu(out) 480 | norm = tf.nn.lrn(relu, 4, bias=1.0, alpha=0.001 / 9.0, beta=0.75) 481 | 482 | # conv4_3 483 | with tf.variable_scope(self.name + '_' + 'conv4_3') as scope: 484 | conv = tf.nn.conv2d(norm, self.parameters_conv[18], [1, 1, 1, 1], padding='SAME') 485 | out = tf.nn.bias_add(conv, self.parameters_conv[19]) 486 | relu = 
tf.nn.relu(out) 487 | norm = tf.nn.lrn(relu, 4, bias=1.0, alpha=0.001 / 9.0, beta=0.75) 488 | 489 | # pool4 490 | pool4 = tf.nn.max_pool(norm, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME', name='pool4') 491 | 492 | # conv5_1 493 | with tf.variable_scope(self.name + '_' + 'conv5_1') as scope: 494 | conv = tf.nn.conv2d(pool4, self.parameters_conv[20], [1, 1, 1, 1], padding='SAME') 495 | out = tf.nn.bias_add(conv, self.parameters_conv[21]) 496 | relu = tf.nn.relu(out) 497 | norm = tf.nn.lrn(relu, 4, bias=1.0, alpha=0.001 / 9.0, beta=0.75) 498 | 499 | # conv5_2 500 | with tf.variable_scope(self.name + '_' + 'conv5_2') as scope: 501 | conv = tf.nn.conv2d(norm, self.parameters_conv[22], [1, 1, 1, 1], padding='SAME') 502 | out = tf.nn.bias_add(conv, self.parameters_conv[23]) 503 | relu = tf.nn.relu(out) 504 | norm = tf.nn.lrn(relu, 4, bias=1.0, alpha=0.001 / 9.0, beta=0.75) 505 | 506 | # conv5_3 507 | with tf.variable_scope(self.name + '_' + 'conv5_3') as scope: 508 | conv = tf.nn.conv2d(norm, self.parameters_conv[24], [1, 1, 1, 1], padding='SAME') 509 | out = tf.nn.bias_add(conv, self.parameters_conv[25]) 510 | relu = tf.nn.relu(out) 511 | norm = tf.nn.lrn(relu, 4, bias=1.0, alpha=0.001 / 9.0, beta=0.75) 512 | 513 | # pool5 514 | pool5 = tf.nn.max_pool(norm,ksize=[1, 2, 2, 1],strides=[1, 2, 2, 1],padding='SAME',name='pool5') 515 | 516 | # fc1 517 | with tf.variable_scope(self.name + '_' + 'fc1') as scope: 518 | pool5_flat = tf.reshape(pool5, [self.batch_size, -1]) 519 | fc1l = tf.nn.bias_add(tf.matmul(pool5_flat, self.parameters_conv[26]), self.parameters_conv[27]) 520 | fc1 = tf.nn.relu(fc1l) 521 | 522 | # fc2 523 | with tf.variable_scope(self.name + '_' + 'fc2') as scope: 524 | fc2l = tf.nn.bias_add(tf.matmul(fc1, self.parameters_conv[28]), self.parameters_conv[29]) 525 | fc2 = tf.nn.relu(fc2l) 526 | if inter_layer: 527 | inter_layer_out = fc2 528 | if arithmetic is not None: 529 | if arithmetic == 3: 530 | im1 = 
tf.squeeze(tf.split(0,self.batch_size,fc2)[0]) 531 | im2 = tf.squeeze(tf.split(0,self.batch_size,fc2)[1]) 532 | vec = tf.sub(im2,im1) 533 | liste = [] 534 | for i in range(self.batch_size): 535 | liste.append(im1+i/15*vec) 536 | fc2 = tf.pack(liste) 537 | elif arithmetic == 2: 538 | norm = tf.sqrt(tf.reduce_sum(tf.square(fc2), 1, keep_dims=True)) 539 | fc2 = tf.div(fc2,norm) 540 | im1 = tf.squeeze(tf.split(0,self.batch_size,fc2)[0]) 541 | fc2 = tf.sub(fc2,im1) 542 | elif arithmetic == 1: 543 | norm = tf.sqrt(tf.reduce_sum(tf.square(fc2), 1, keep_dims=True)) 544 | fc2 = tf.div(fc2,norm) 545 | im1 = tf.squeeze(tf.split(0,self.batch_size,fc2)[0]) 546 | fc2 = tf.add(fc2,im1) 547 | 548 | # de-fc2 549 | with tf.variable_scope(self.name + '_' + 'defc2') as scope: 550 | fc2l = tf.nn.bias_add(tf.matmul(fc2, self.parameters_deconv[28]), self.parameters_deconv[29]) 551 | fc2 = tf.nn.relu(fc2l) 552 | 553 | # de-fc1 554 | with tf.variable_scope(self.name + '_' + 'defc1') as scope: 555 | fc1l = tf.nn.bias_add(tf.matmul(fc2, self.parameters_deconv[26]), self.parameters_deconv[27]) 556 | fc1 = tf.nn.relu(fc1l) 557 | pool5_flat = tf.reshape(fc1, pool5.get_shape()) 558 | 559 | 560 | # deconv5_3 561 | with tf.variable_scope(self.name + '_' + 'deconv5_3') as scope: 562 | deconv = tf.nn.conv2d_transpose(pool5_flat, self.parameters_deconv[24], (self.batch_size,14,14,512), strides= [1, 2, 2, 1], padding='SAME') 563 | bias = tf.nn.bias_add(deconv, self.parameters_deconv[25]) 564 | relu = tf.nn.relu(bias) 565 | norm = tf.nn.lrn(relu, 4, bias=1.0, alpha=0.001 / 9.0, beta=0.75) 566 | 567 | # deconv5_2 568 | with tf.variable_scope(self.name + '_' + 'deconv5_2') as scope: 569 | deconv = tf.nn.conv2d_transpose(norm, self.parameters_deconv[22], (self.batch_size,14,14,512), strides= [1, 1, 1, 1], padding='SAME') 570 | bias = tf.nn.bias_add(deconv, self.parameters_deconv[23]) 571 | relu = tf.nn.relu(bias) 572 | norm = tf.nn.lrn(relu, 4, bias=1.0, alpha=0.001 / 9.0, beta=0.75) 573 | 574 | # 
deconv5_1 575 | with tf.variable_scope(self.name + '_' + 'deconv5_1') as scope: 576 | deconv = tf.nn.conv2d_transpose(norm, self.parameters_deconv[20], (self.batch_size,14,14,512), strides= [1, 1, 1, 1], padding='SAME') 577 | bias = tf.nn.bias_add(deconv, self.parameters_deconv[21]) 578 | relu = tf.nn.relu(bias) 579 | norm = tf.nn.lrn(relu, 4, bias=1.0, alpha=0.001 / 9.0, beta=0.75) 580 | 581 | # deconv4_3 582 | with tf.variable_scope(self.name + '_' + 'deconv4_3') as scope: 583 | deconv = tf.nn.conv2d_transpose(norm, self.parameters_deconv[18], (self.batch_size,28,28,512), strides= [1, 2, 2, 1], padding='SAME') 584 | bias = tf.nn.bias_add(deconv, self.parameters_deconv[19]) 585 | relu = tf.nn.relu(bias) 586 | norm = tf.nn.lrn(relu, 4, bias=1.0, alpha=0.001 / 9.0, beta=0.75) 587 | 588 | # deconv4_2 589 | with tf.variable_scope(self.name + '_' + 'deconv4_2') as scope: 590 | deconv = tf.nn.conv2d_transpose(norm, self.parameters_deconv[16], (self.batch_size,28,28,512), strides= [1, 1, 1, 1], padding='SAME') 591 | bias = tf.nn.bias_add(deconv, self.parameters_deconv[17]) 592 | relu = tf.nn.relu(bias) 593 | norm = tf.nn.lrn(relu, 4, bias=1.0, alpha=0.001 / 9.0, beta=0.75) 594 | 595 | # deconv4_1 596 | with tf.variable_scope(self.name + '_' + 'deconv4_1') as scope: 597 | deconv = tf.nn.conv2d_transpose(norm, self.parameters_deconv[14], (self.batch_size,28,28,256), strides= [1, 1, 1, 1], padding='SAME') 598 | bias = tf.nn.bias_add(deconv, self.parameters_deconv[15]) 599 | relu = tf.nn.relu(bias) 600 | norm = tf.nn.lrn(relu, 4, bias=1.0, alpha=0.001 / 9.0, beta=0.75) 601 | 602 | # deconv3_3 603 | with tf.variable_scope(self.name + '_' + 'deconv3_3') as scope: 604 | deconv = tf.nn.conv2d_transpose(norm, self.parameters_deconv[12], (self.batch_size,56,56,256), strides= [1, 2, 2, 1], padding='SAME') 605 | bias = tf.nn.bias_add(deconv, self.parameters_deconv[13]) 606 | relu = tf.nn.relu(bias) 607 | norm = tf.nn.lrn(relu, 4, bias=1.0, alpha=0.001 / 9.0, beta=0.75) 608 | 609 | # 
deconv3_2 610 | with tf.variable_scope(self.name + '_' + 'deconv3_2') as scope: 611 | deconv = tf.nn.conv2d_transpose(norm, self.parameters_deconv[10], (self.batch_size,56,56,256), strides= [1, 1, 1, 1], padding='SAME') 612 | bias = tf.nn.bias_add(deconv, self.parameters_deconv[11]) 613 | relu = tf.nn.relu(bias) 614 | norm = tf.nn.lrn(relu, 4, bias=1.0, alpha=0.001 / 9.0, beta=0.75) 615 | 616 | # deconv3_1 617 | with tf.variable_scope(self.name + '_' + 'deconv3_1') as scope: 618 | deconv = tf.nn.conv2d_transpose(norm, self.parameters_deconv[8], (self.batch_size,56,56,128), strides= [1, 1, 1, 1], padding='SAME') 619 | bias = tf.nn.bias_add(deconv, self.parameters_deconv[9]) 620 | relu = tf.nn.relu(bias) 621 | norm = tf.nn.lrn(relu, 4, bias=1.0, alpha=0.001 / 9.0, beta=0.75) 622 | 623 | if self.concat: 624 | add = tf.concat(3,detail) 625 | add_bis = tf.concat(3,detail_bis) 626 | if arithmetic: 627 | add = tf.zeros_like(add) 628 | add_bis = tf.zeros_like(add_bis) 629 | 630 | # deconv2_2 631 | with tf.variable_scope(self.name + '_' + 'deconv2_2') as scope: 632 | deconv = tf.nn.conv2d_transpose(norm, self.parameters_deconv[6], (self.batch_size,112,112,128), strides= [1, 2, 2, 1], padding='SAME') 633 | bias = tf.nn.bias_add(deconv, self.parameters_deconv[7]) 634 | relu = tf.nn.relu(bias) 635 | norm = tf.nn.lrn(relu, 4, bias=1.0, alpha=0.001 / 9.0, beta=0.75) 636 | if self.concat: 637 | norm = tf.concat(3,[norm,add]) 638 | 639 | # deconv2_1 640 | with tf.variable_scope(self.name + '_' + 'deconv2_1') as scope: 641 | deconv = tf.nn.conv2d_transpose(norm, tf.concat(3,self.parameters_deconv[4]), (self.batch_size,112,112,64), strides= [1, 1, 1, 1], padding='SAME') 642 | bias = tf.nn.bias_add(deconv, self.parameters_deconv[5]) 643 | relu = tf.nn.relu(bias) 644 | norm = tf.nn.lrn(relu, 4, bias=1.0, alpha=0.001 / 9.0, beta=0.75) 645 | if self.concat: 646 | norm = tf.concat(3,[norm,add]) 647 | 648 | # deconv1_2 649 | with tf.variable_scope(self.name + '_' + 'deconv1_2') as scope: 
650 | deconv = tf.nn.conv2d_transpose(norm, tf.concat(3,self.parameters_deconv[2]), (self.batch_size,224,224,64), strides= [1, 2, 2, 1], padding='SAME') 651 | bias = tf.nn.bias_add(deconv, self.parameters_deconv[3]) 652 | relu = tf.nn.relu(bias) 653 | norm = tf.nn.lrn(relu, 4, bias=1.0, alpha=0.001 / 9.0, beta=0.75) 654 | if self.concat: 655 | norm = tf.concat(3,[norm,add_bis]) 656 | 657 | # deconv1_1 658 | with tf.variable_scope(self.name + '_' + 'deconv1_1') as scope: 659 | deconv = tf.nn.conv2d_transpose(norm, self.parameters_deconv[0], (self.batch_size,224,224,64), strides= [1, 1, 1, 1], padding='SAME') 660 | bias = tf.nn.bias_add(deconv, self.parameters_deconv[1]) 661 | relu = tf.nn.relu(bias) 662 | norm = tf.nn.lrn(relu, 4, bias=1.0, alpha=0.001 / 9.0, beta=0.75) 663 | if self.concat: 664 | norm = tf.concat(3,[norm,add_bis]) 665 | 666 | # deconv0 667 | with tf.variable_scope(self.name + '_' + 'deconv0') as scope: 668 | deconv = tf.nn.conv2d_transpose(norm, self.deconv[0], (self.batch_size,224,224,1), strides= [1, 1, 1, 1], padding='SAME') 669 | bias = tf.nn.bias_add(deconv, self.deconv[1]) 670 | relu = tf.sigmoid(bias) 671 | out = tf.squeeze(relu) 672 | 673 | if debug: 674 | control_value = tf.reduce_mean(relu) 675 | 676 | return out, control_value, inter_layer_out 677 | 678 | 679 | 680 | 681 | 682 | def loss(self, guess, labels, loss_bis = False): 683 | """ 684 | Return the loss for the given saliency maps and their corresponding ground truth 685 | 686 | Args: 687 | guess : predicted saliency maps 688 | labels : corresponding ground truth 689 | loss_bis : True for the auxiliary loss (validation while training), False for the main training loss 690 | 691 | Returns: 692 | loss_out : the loss value 693 | """ 694 | 695 | if self.l2_loss: 696 | reconstruction = tf.reduce_sum(tf.square(guess - labels), [1,2]) 697 | reconstruction_mean = tf.reduce_mean(reconstruction) 698 | if not loss_bis: 699 | tf.add_to_collection(self.losses, reconstruction_mean) 700 | else: 701 | guess_flat = 
tf.reshape(guess, [self.batch_size, -1]) 702 | labels_flat = tf.reshape(labels, [self.batch_size, -1]) 703 | zero = tf.fill(tf.shape(guess_flat), 1e-7) 704 | one = tf.fill(tf.shape(guess_flat), 1 - 1e-7) 705 | ret_1 = tf.select(guess_flat > 1e-7, guess_flat, zero) 706 | ret_2 = tf.select(ret_1 < 1 - 1e-7, ret_1, one) 707 | loss = tf.reduce_mean(- labels_flat * tf.log(ret_2) - (1. - labels_flat) * tf.log(1. - ret_2)) 708 | if not loss_bis: 709 | tf.add_to_collection(self.losses, loss) 710 | elif loss_bis: 711 | tf.add_to_collection(self.losses_decay, loss) 712 | 713 | if self.penalty and not loss_bis: 714 | labels_new = tf.reshape(labels, [self.batch_size, 224, 224, 1]) 715 | guess_new = tf.reshape(guess, [self.batch_size, 224, 224, 1]) 716 | filter_x = tf.constant(np.array([[0,0,0] , [-1,2,-1], [0,0,0]]).reshape((3,3,1,1)), dtype=tf.float32) 717 | filter_y = tf.constant(np.array([[0,-1,0] , [0,2,0], [0,-1,0]]).reshape((3,3,1,1)), dtype=tf.float32) 718 | gradient_x = tf.nn.conv2d(labels_new, filter_x, [1,1,1,1], padding = "SAME") 719 | gradient_y = tf.nn.conv2d(labels_new, filter_y, [1,1,1,1], padding = "SAME") 720 | result_x = tf.greater(gradient_x,0) 721 | result_y = tf.greater(gradient_y,0) 722 | keep = tf.cast(tf.logical_or(result_x,result_y), tf.float32) #edges 723 | 724 | filter_neighboor_1 = tf.constant(np.array([[0,0,0], [0,1,-1], [0,0,0]]).reshape((3,3,1)), dtype=tf.float32) 725 | filter_neighboor_2 = tf.constant(np.array([[0,-1,0], [0,1,0], [0,0,0]]).reshape((3,3,1)), dtype=tf.float32) 726 | filter_neighboor_3 = tf.constant(np.array([[0,0,0], [-1,1,0], [0,0,0]]).reshape((3,3,1)), dtype=tf.float32) 727 | filter_neighboor_4 = tf.constant(np.array([[0,0,0], [0,1,0], [0,-1,0]]).reshape((3,3,1)), dtype=tf.float32) 728 | filter_neighboor = tf.pack([filter_neighboor_1,filter_neighboor_2,filter_neighboor_3,filter_neighboor_4], axis = 3) 729 | compare = tf.square(keep * tf.nn.conv2d(guess_new, filter_neighboor, [1,1,1,1], padding = "SAME")) 730 | 731 | compare_m = 
tf.nn.conv2d(labels_new, filter_neighboor, [1,1,1,1], padding = "SAME") 732 | new_compare_m = tf.select(tf.equal(compare_m, 0), tf.ones([self.batch_size,224,224,4]), -1*tf.ones([self.batch_size,224,224,4])) #0 means both neighbours share a label, so minimize their difference; otherwise the labels differ, so maximize it 733 | final_compare_m = keep * new_compare_m 734 | 735 | score_ret = tf.reduce_sum(final_compare_m * compare, [1,2,3]) / (4*(tf.reduce_sum(keep,[1,2,3])+1e-7)) 736 | score = self.coef * tf.reduce_mean(score_ret) 737 | tf.add_to_collection(self.losses, score) 738 | 739 | if loss_bis: 740 | loss_out = tf.add_n(tf.get_collection(self.losses_decay)) 741 | else: 742 | loss_out = tf.add_n(tf.get_collection(self.losses)) 743 | 744 | return loss_out 745 | 746 | 747 | 748 | 749 | 750 | def train(self, loss, global_step): 751 | """ 752 | Return a training step for the tensorflow graph 753 | 754 | Args: 755 | loss : loss to minimize (with the Adam optimizer) 756 | global_step : current training step 757 | """ 758 | 759 | opt = tf.train.AdamOptimizer(self.learning_rate) 760 | grads = opt.compute_gradients(loss) 761 | apply_gradient_op = opt.apply_gradients(grads, global_step=global_step) 762 | 763 | variable_averages = tf.train.ExponentialMovingAverage(self.moving_avg_decay, global_step) 764 | variables_averages_op = variable_averages.apply(tf.trainable_variables()) 765 | 766 | with tf.control_dependencies([apply_gradient_op, variables_averages_op]): 767 | train_op = tf.no_op(name='train') 768 | 769 | return train_op -------------------------------------------------------------------------------- /operations.py: -------------------------------------------------------------------------------- 1 | """ -------------------------------------------------- 2 | author: arthur meyer 3 | email: arthur.meyer.38@gmail.com 4 | status: final 5 | version: v2.0 6 | --------------------------------------------------""" 7 | 8 | 9 | 10 | from __future__ import division 11 | from __future__ import print_function 12 | from datetime import datetime 
13 | 14 | import os 15 | import sys 16 | import time 17 | import shutil 18 | import tensorflow as tf 19 | import numpy as np 20 | from PIL import Image, ImageDraw 21 | 22 | import model 23 | 24 | 25 | 26 | def create_model(name, batch_size, learning_rate = 0.0001, wd = 0.00001, concat = False, l2_loss = False, penalty = False, coef = 0.4, verbosity = 0): 27 | """ 28 | Create a model from model.py with the given configuration 29 | 30 | Args: 31 | name : name of the model (used to create a specific folder to save/load parameters) 32 | batch_size : batch size 33 | learning_rate : learning rate (cross entropy is around 100 times bigger than l2) 34 | wd : weight decay factor 35 | concat : does this model include direct connections? 36 | l2_loss : does this model use l2 loss (if not then cross entropy) 37 | penalty : whether to use the edge contrast penalty 38 | coef : coefficient for the edge contrast penalty 39 | verbosity : level of detail to display 40 | 41 | Returns: 42 | my_model : created model 43 | """ 44 | 45 | my_model = model.MODEL(name, batch_size, learning_rate, wd, concat, l2_loss, penalty, coef) 46 | my_model.display_info(verbosity) 47 | return my_model 48 | 49 | 50 | 51 | 52 | def do_train(model, sess, stream_input, stream_input_aux, max_step, log_folder, mode, weight_file = None, model_to_copy = None, model_copy_is_concat = False, valid = True, dataset = None, save_copy = False): 53 | """ 54 | Train the model 55 | 56 | Args: 57 | model : model to train 58 | sess : tensorflow session 59 | stream_input : data manager 60 | stream_input_aux : data manager for auxiliary dataset (validation) 61 | max_step : number of steps to train 62 | log_folder : where is the log of the model 63 | mode : how to initialize weights 64 | weight_file : location of vgg model if pretraining 65 | model_to_copy : weights of model to copy if restoring only weight (from another model) 66 | model_copy_is_concat : whether the model to copy has direct connections 67 | valid : 
whether to use validation 68 | dataset : dataset for the auxiliary data using during validation 69 | save_copy : whether to save a copy of the model at the end with the weight only 70 | """ 71 | 72 | print('------------------------------------------------------') 73 | print('Starting training the model (number of steps is %d) ...'%(max_step)) 74 | print('------------------------------------------------------\n') 75 | 76 | global_step = tf.Variable(0, trainable = False) 77 | images, labels, _ = stream_input.get_inputs() 78 | guess, control, _ = model.infer(images, debug = True) 79 | loss = model.loss(guess, labels) 80 | train_op = model.train(loss, global_step) 81 | 82 | if valid: 83 | images_aux, labels_aux, _ = stream_input_aux.get_inputs() 84 | guess_aux, _, _ = model.infer(images_aux) 85 | loss_aux = model.loss(guess_aux, labels_aux, loss_bis = True) 86 | zeros = tf.zeros_like(labels_aux) 87 | ones = tf.ones_like(labels_aux) 88 | threshold = [i for i in range(255)] 89 | liste = [] 90 | 91 | for thres in threshold: 92 | predicted_class = tf.select(guess_aux*255 > thres, ones, zeros) 93 | true_positive = tf.reduce_sum(tf.cast(tf.logical_and(tf.equal(predicted_class, ones),tf.equal(labels_aux, ones)), tf.float32),[1,2]) 94 | false_positive = tf.reduce_sum(tf.cast(tf.logical_and(tf.equal(predicted_class, ones),tf.equal(labels_aux, zeros)), tf.float32),[1,2]) 95 | true_negative = tf.reduce_sum(tf.cast(tf.logical_and(tf.equal(predicted_class, zeros),tf.equal(labels_aux, zeros)), tf.float32),[1,2]) 96 | false_negative = tf.reduce_sum(tf.cast(tf.logical_and(tf.equal(predicted_class, zeros),tf.equal(labels_aux, ones)), tf.float32),[1,2]) 97 | precision = tf.reduce_sum(true_positive/(1e-8 + true_positive+false_positive)) 98 | recall = tf.reduce_sum(true_positive/(1e-8 + true_positive+false_negative)) 99 | liste.append(tf.pack([precision,recall])) 100 | result = tf.pack(liste) 101 | adaptive_threshold = (2*tf.reduce_mean(guess_aux,[0,1],keep_dims= True)) 102 | 
adaptive_output = tf.select(guess_aux > adaptive_threshold, ones, zeros) 103 | adaptive_true_positive = tf.reduce_sum(tf.cast(tf.logical_and(tf.equal(adaptive_output, ones),tf.equal(labels_aux, ones)), tf.float32),[1,2]) 104 | adaptive_false_positive = tf.reduce_sum(tf.cast(tf.logical_and(tf.equal(adaptive_output, ones),tf.equal(labels_aux, zeros)), tf.float32),[1,2]) 105 | adaptive_true_negative = tf.reduce_sum(tf.cast(tf.logical_and(tf.equal(adaptive_output, zeros),tf.equal(labels_aux, zeros)), tf.float32),[1,2]) 106 | adaptive_false_negative = tf.reduce_sum(tf.cast(tf.logical_and(tf.equal(adaptive_output, zeros),tf.equal(labels_aux, ones)), tf.float32),[1,2]) 107 | adaptive_precision = tf.reduce_sum(adaptive_true_positive / (1e-8 + adaptive_true_positive + adaptive_false_positive)) 108 | adaptive_recall = tf.reduce_sum(adaptive_true_positive / (1e-8 + adaptive_true_positive + adaptive_false_negative)) 109 | adaptive_f_measure = tf.reduce_sum(1.3 * adaptive_precision * adaptive_recall / (1e-8 + 0.3 * adaptive_precision + adaptive_recall)) 110 | 111 | print('------------------------------------------------------') #Initialisation of weights 112 | sess.run(tf.global_variables_initializer()) 113 | if mode == 'pretrain': 114 | print('Loading weights from vgg file...') 115 | load_weights(model, sess, weight_file) 116 | elif mode == 'restore': 117 | print('Restoring from previous checkpoint...') 118 | sess.run(global_step.assign(int(restore_model(model, sess, log_folder)))) 119 | elif mode == 'restore_w_only': 120 | print('Restoring (weights only) from model %s ...' 
% (model_to_copy)) 121 | restore_weight_from(model, model_to_copy, sess, log_folder, copy_concat = model_copy_is_concat) 122 | elif mode == 'scratch': 123 | print('Initializing the weights from scratch') 124 | print('------------------------------------------------------') 125 | print('Done!') 126 | print('------------------------------------------------------ \n') 127 | 128 | tf.train.start_queue_runners(sess=sess) 129 | stream_input.start_threads(sess) 130 | 131 | if valid: 132 | stream_input_aux.start_threads(sess) 133 | if tf.gfile.Exists(log_folder + '/' + model.name + '_validation_log'): 134 | tf.gfile.DeleteRecursively(log_folder + '/' + model.name + '_validation_log') 135 | tf.gfile.MakeDirs(log_folder + '/' + model.name + '_validation_log') 136 | 137 | for step in range(max_step): 138 | start_time = time.time() 139 | _, loss_value, control_value, step_b = sess.run([train_op, loss, control, tf.to_int32(global_step)]) 140 | duration = time.time() - start_time 141 | 142 | if step % 5 == 0: #Display progress 143 | print ('%s: step %d out of %d, loss = %.5f (%.1f examples/sec; %.3f sec/batch) --- control value is %.12f' % (datetime.now(), step_b, 144 | max_step-step+step_b, loss_value, stream_input.batch_size / duration, float(duration), control_value)) 145 | 146 | if step % 1000 == 0 and step != 0 : #Save model 147 | save_model(model, sess, log_folder, step_b) 148 | 149 | if valid and step % 5000 == 0: #Validation 150 | print('------------------------------------------------------') 151 | print('Doing validation ...') 152 | print('------------------------------------------------------ \n') 153 | 154 | loss_tot = 0 155 | num_iter = int(stream_input_aux.size_epoch / stream_input_aux.batch_size) 156 | counter = np.zeros((256,3)) 157 | 158 | for step1 in range(num_iter): 159 | sys.stdout.write('%d out of %d \r' %(step1, num_iter)) 160 | sys.stdout.flush() 161 | result_ret, adaptive_precision_ret, adaptive_recall_ret, adaptive_f_measure_ret, loss_value = 
sess.run([result, adaptive_precision, adaptive_recall, adaptive_f_measure, loss_aux]) 162 | loss_tot += loss_value 163 | loss_mean = loss_tot/(step1+1) 164 | for i in range(255): 165 | for j in range(2): 166 | counter[i,j] += result_ret[i,j] 167 | counter[255,0] += adaptive_precision_ret 168 | counter[255,1] += adaptive_recall_ret 169 | counter[255,2] += adaptive_f_measure_ret 170 | file = open(log_folder + '/' + model.name + '_validation_log/' + str(step_b) + ".txt" , 'w') 171 | file.write('model name is ' + model.name + '\n') 172 | file.write('number trained step is ' + str(step_b) + '\n') 173 | file.write('aux dataset is ' + str(dataset) + '\n') 174 | file.write('loss mean is ' + str(loss_mean) + '\n') 175 | file.write('split of dataset is valid\n') 176 | for i in range(256): 177 | precision = counter[i,0] / (num_iter * stream_input_aux.batch_size) 178 | recall = counter[i,1] / (num_iter * stream_input_aux.batch_size) 179 | file.write('Precision %0.02f percent -- Recall %0.02f percent\n' %(precision*100, recall*100)) 180 | if i == 255: 181 | f = counter[i,2] / (num_iter * stream_input_aux.batch_size) 182 | file.write('fscore %0.04f\n' %(f)) 183 | if i % 20 == 0: 184 | print('Precision %0.02f percent -- Recall %0.02f percent' %(precision*100, recall*100)) 185 | file.close() 186 | print('\n------------------------------------------------------') 187 | print('Done!') 188 | print('------------------------------------------------------ \n') 189 | 190 | save_model(model, sess, log_folder, step_b) #Final save 191 | print('------------------------------------------------------') 192 | print('Save done!') 193 | if save_copy: 194 | save_weight_only(model, sess, log_folder, step_b) #Weights-only copy 195 | print('Saving weights only done!') 196 | print('------------------------------------------------------ \n') 197 | 198 | 199 | 200 | 201 | def load_weights(model, sess, weight_file): 202 | """ 203 | Load weights from the given weight file (used to load the pretrained weights of the VGG model) 
204 | 205 | Args: 206 | model : model to restore variable to 207 | sess : tensorflow session 208 | weight_file : weight file name 209 | """ 210 | 211 | weights = np.load(weight_file) 212 | keys = sorted(weights.keys()) 213 | for i, k in enumerate(keys): 214 | if i <= 29: 215 | print('-- %s %s --' % (i,k)) 216 | print(np.shape(weights[k])) 217 | sess.run(model.parameters_conv[i].assign(weights[k])) 218 | 219 | 220 | 221 | 222 | def save_model(model, sess, log_path, step): 223 | """ 224 | Save model using tensorflow checkpoint (also save hidden variables) 225 | 226 | Args: 227 | model : model to save variable from 228 | sess : tensorflow session 229 | log_path : where to save 230 | step : number of step at time of saving 231 | """ 232 | 233 | path = log_path + '/' + model.name 234 | if tf.gfile.Exists(path): 235 | tf.gfile.DeleteRecursively(path) 236 | tf.gfile.MakeDirs(path) 237 | saver = tf.train.Saver() 238 | checkpoint_path = os.path.join(path, 'model.ckpt') 239 | saver.save(sess, checkpoint_path, global_step=step) 240 | 241 | 242 | 243 | 244 | def save_weight_only(model, sess, log_path, step): 245 | """ 246 | Save model but only weight (meaning no hidden variable) 247 | In practice use this to just transfer weights from one model to the other 248 | 249 | Args: 250 | model : model to save variable from 251 | sess : tensorflow session 252 | log_path : where to save 253 | step : number of step at time of saving 254 | """ 255 | 256 | path = log_path + '/' + model.name + '_weight_only' 257 | if tf.gfile.Exists(path): 258 | tf.gfile.DeleteRecursively(path) 259 | tf.gfile.MakeDirs(path) 260 | 261 | variable_to_save = {} 262 | for i in range(30): 263 | name = 'conv_' + str(i) 264 | variable_to_save[name] = model.parameters_conv[i] 265 | if i in [2, 4] and model.concat: 266 | name = 'deconv_' + str(i) 267 | variable_to_save[name] = model.parameters_deconv[i][0] 268 | name = 'deconv_' + str(i) + '_bis' 269 | variable_to_save[name] = model.parameters_deconv[i][1] 270 | 
else: 271 | name = 'deconv_' + str(i) 272 | variable_to_save[name] = model.parameters_deconv[i] 273 | if i < 2: 274 | name = 'deconv_bis_' + str(i) 275 | variable_to_save[name] = model.deconv[i] 276 | saver = tf.train.Saver(variable_to_save) 277 | checkpoint_path = os.path.join(path, 'model.ckpt') 278 | saver.save(sess, checkpoint_path, global_step=step) 279 | 280 | 281 | 282 | 283 | def restore_model(model, sess, log_path): 284 | """ 285 | Restore model (including hidden variables) 286 | In practice, use this to resume the training of the same model 287 | 288 | Args: 289 | model : model to restore variable to 290 | sess : tensorflow session 291 | log_path : where to restore from 292 | 293 | Returns: 294 | step_b : the step number at which training ended (string parsed from the checkpoint file name) 295 | """ 296 | 297 | path = log_path + '/' + model.name 298 | saver = tf.train.Saver() 299 | ckpt = tf.train.get_checkpoint_state(path) 300 | if ckpt and ckpt.model_checkpoint_path: 301 | saver.restore(sess, ckpt.model_checkpoint_path) 302 | return ckpt.model_checkpoint_path.split('/')[-1].split('-')[-1] 303 | else: 304 | print('------------------------------------------------------') 305 | print('No checkpoint file found') 306 | print('------------------------------------------------------ \n') 307 | exit() 308 | 309 | 310 | 311 | 312 | def restore_weight_from(model, name, sess, log_path, copy_concat = False): 313 | """ 314 | Restore model (excluding hidden variables) 315 | In practice, use this to train a model with the weights from another model. 
316 | This works as long as both models use the architecture from the original model.py 317 | Compatible with or without direct connections 318 | 319 | Args: 320 | model : model to restore variable to 321 | name : name of model to copy 322 | sess : tensorflow session 323 | log_path : where to restore from 324 | copy_concat : specify if the model to copy from also had direct connections 325 | 326 | Returns: 327 | step_b : the step number at which training ended (string parsed from the checkpoint file name) 328 | """ 329 | 330 | path = log_path + '/' + name + '_weight_only' 331 | 332 | variable_to_save = {} 333 | for i in range(30): 334 | name = 'conv_' + str(i) 335 | variable_to_save[name] = model.parameters_conv[i] 336 | if i < 2: 337 | if copy_concat == model.concat: 338 | name = 'deconv_' + str(i) 339 | variable_to_save[name] = model.parameters_deconv[i] 340 | name = 'deconv_bis_' + str(i) 341 | variable_to_save[name] = model.deconv[i] 342 | else: 343 | if i in [2, 4] and model.concat: 344 | name = 'deconv_' + str(i) 345 | variable_to_save[name] = model.parameters_deconv[i][0] 346 | if copy_concat: 347 | name = 'deconv_' + str(i) + '_bis' 348 | variable_to_save[name] = model.parameters_deconv[i][1] 349 | elif i in [2, 4] and not model.concat: 350 | name = 'deconv_' + str(i) 351 | variable_to_save[name] = model.parameters_deconv[i] 352 | else: 353 | name = 'deconv_' + str(i) 354 | variable_to_save[name] = model.parameters_deconv[i] 355 | 356 | saver = tf.train.Saver(variable_to_save) 357 | ckpt = tf.train.get_checkpoint_state(path) 358 | if ckpt and ckpt.model_checkpoint_path: 359 | saver.restore(sess, ckpt.model_checkpoint_path) 360 | return ckpt.model_checkpoint_path.split('/')[-1].split('-')[-1] 361 | else: 362 | print('------------------------------------------------------') 363 | print('No checkpoint file found') 364 | print('------------------------------------------------------ \n') 365 | exit() 366 | 367 | 368 | 369 | 370 | def compute_score(model, sess, stream_input, restore_path, dataset, split, write = False, 
save = False): 371 | """ 372 | Compute the precision recall score for a given model, with the addition of the F1 score. 373 | 374 | Args: 375 | model : model to compute the score of 376 | sess : tensorflow session 377 | stream_input : data manager 378 | restore_path : where is the restore file 379 | dataset : dataset tested 380 | split : which split (valid, test, etc.) 381 | write : whether to write the result in a file 382 | save : whether to save the resulting saliency maps 383 | """ 384 | 385 | print('------------------------------------------------------') 386 | print('Computing score of the model on %s from %s ...'%(dataset, split)) 387 | print('Write result file : %r -- Save images : %r' % (write, save)) 388 | print('------------------------------------------------------\n') 389 | 390 | images, labels, names = stream_input.get_inputs() 391 | guess, _, _ = model.infer(images) 392 | 393 | zeros = tf.zeros_like(labels) 394 | ones = tf.ones_like(labels) 395 | threshold = [i for i in range(255)] 396 | liste = [] 397 | for t in threshold: 398 | predicted_class = tf.select(guess*255 > t, ones, zeros) 399 | true_positive = tf.reduce_sum(tf.cast(tf.logical_and(tf.equal(predicted_class, ones), tf.equal(labels, ones)), tf.float32),[1,2]) 400 | false_positive = tf.reduce_sum(tf.cast(tf.logical_and(tf.equal(predicted_class, ones), tf.equal(labels, zeros)), tf.float32),[1,2]) 401 | true_negative = tf.reduce_sum(tf.cast(tf.logical_and(tf.equal(predicted_class, zeros), tf.equal(labels, zeros)), tf.float32),[1,2]) 402 | false_negative = tf.reduce_sum(tf.cast(tf.logical_and(tf.equal(predicted_class, zeros), tf.equal(labels, ones)), tf.float32),[1,2]) 403 | precision = tf.reduce_sum(true_positive / (1e-8 + true_positive + false_positive)) 404 | recall = tf.reduce_sum(true_positive / (1e-8 + true_positive + false_negative)) 405 | liste.append(tf.pack([precision,recall])) 406 | result = tf.pack(liste) 407 | 408 | adaptive_threshold = 2*tf.reduce_mean(guess,[1,2], keep_dims= True) 
409 | adaptive_output = tf.select(guess > adaptive_threshold, ones, zeros) 410 | adaptive_true_positive = tf.reduce_sum(tf.cast(tf.logical_and(tf.equal(adaptive_output, ones),tf.equal(labels, ones)), tf.float32),[1,2]) 411 | adaptive_false_positive = tf.reduce_sum(tf.cast(tf.logical_and(tf.equal(adaptive_output, ones),tf.equal(labels, zeros)), tf.float32),[1,2]) 412 | adaptive_true_negative = tf.reduce_sum(tf.cast(tf.logical_and(tf.equal(adaptive_output, zeros),tf.equal(labels, zeros)), tf.float32),[1,2]) 413 | adaptive_false_negative = tf.reduce_sum(tf.cast(tf.logical_and(tf.equal(adaptive_output, zeros),tf.equal(labels, ones)), tf.float32),[1,2]) 414 | adaptive_precision = tf.reduce_sum(adaptive_true_positive / (1e-8 + adaptive_true_positive + adaptive_false_positive)) 415 | adaptive_recall = tf.reduce_sum(adaptive_true_positive / (1e-8 + adaptive_true_positive + adaptive_false_negative)) 416 | adaptive_f_measure = tf.reduce_sum(1.3 * adaptive_precision * adaptive_recall / (1e-8 + 0.3 * adaptive_precision + adaptive_recall)) 417 | 418 | print('------------------------------------------------------') #Initialisation of weights 419 | sess.run(tf.global_variables_initializer()) 420 | print('Restoring from previous checkpoint...') 421 | ret_bis = restore_model(model, sess, restore_path) 422 | tf.train.start_queue_runners(sess=sess) 423 | stream_input.start_threads(sess) 424 | print('------------------------------------------------------') 425 | print('Done! 
Training ended at step %s' %(ret_bis)) 426 | print('------------------------------------------------------ \n') 427 | 428 | if save: #Save result images 429 | path = restore_path + '/' + model.name 430 | path += '/result_' + dataset + '/' 431 | if tf.gfile.Exists(path): 432 | tf.gfile.DeleteRecursively(path) 433 | tf.gfile.MakeDirs(path) 434 | 435 | num_iter = int(stream_input.size_epoch / stream_input.batch_size) 436 | counter = np.zeros((256,3)) 437 | 438 | print('------------------------------------------------------') 439 | for step in range(num_iter): #Compute score 440 | 441 | sys.stdout.write('%d out of %d \r' %(step, num_iter)) 442 | sys.stdout.flush() 443 | 444 | result_ret, adaptive_precision_ret, adaptive_recall_ret, adaptive_f_measure_ret, names_ret, images_ret, labels_ret, guess_ret = sess.run([result, adaptive_precision, adaptive_recall, adaptive_f_measure, names, images, labels, guess]) 445 | 446 | for i in range(stream_input.batch_size): 447 | 448 | if save: #Save result images (uint8 so that values up to 255 do not overflow) 449 | ret_path = path + names_ret[i] 450 | ret = np.asarray( images_ret[i]*255, dtype="uint8" ) 451 | Image.fromarray(ret, 'RGB').save(ret_path + '_' + 'im' + '.png') 452 | ret = np.asarray( labels_ret[i]*255, dtype="uint8" ) 453 | Image.fromarray(ret, 'P').save(ret_path + '_' + 'lab' + '.png') 454 | ret = np.asarray( guess_ret[i]*255, dtype="uint8" ) 455 | Image.fromarray(ret, 'P').save(ret_path + '_' + 'guess' + '.png') 456 | 457 | for i in range(255): 458 | for j in range(2): 459 | counter[i,j] += result_ret[i,j] 460 | counter[255,0] += adaptive_precision_ret 461 | counter[255,1] += adaptive_recall_ret 462 | counter[255,2] += adaptive_f_measure_ret 463 | print('------------------------------------------------------ \n') 464 | 465 | for i in range(256): 466 | if i ==255: 467 | print('\n------------------------------------------------------') 468 | precision = counter[i,0] / (num_iter * stream_input.batch_size) 469 | recall = counter[i,1] / (num_iter * stream_input.batch_size) 
470 | print('Precision %0.02f percent -- Recall %0.02f percent' %(precision*100, recall*100)) 471 | if i ==255: 472 | print('fscore %0.04f' %(counter[i,2] / (num_iter*stream_input.batch_size))) 473 | print('------------------------------------------------------ \n') 474 | 475 | if write: #Save score 476 | file = open(restore_path + '/' + model.name + "/" + dataset + ".txt" , 'w') 477 | file.write('model name is ' + model.name + '\n') 478 | file.write('number trained step is ' + str(ret_bis) + '\n') 479 | file.write('test dataset is ' + str(dataset) + '\n') 480 | file.write('split of dataset is ' + str(split) + '\n') 481 | for i in range(255): 482 | file.write('Precision %0.02f percent -- Recall %0.02f percent\n' %(counter[i,0]/ (num_iter * stream_input.batch_size)*100, counter[i,1]/ (num_iter * stream_input.batch_size)*100)) 483 | file.write('Precision %0.02f percent -- Recall %0.02f percent\n' %(counter[255,0]/ (num_iter * stream_input.batch_size)*100, counter[255,1]/ (num_iter * stream_input.batch_size)*100)) 484 | file.write('fscore %0.04f\n' %(counter[255,2]/ (num_iter * stream_input.batch_size))) 485 | file.close() 486 | print('------------------------------------------------------') 487 | print('Log file written') 488 | print('------------------------------------------------------ \n') 489 | 490 | 491 | 492 | 493 | def visual_tracking(model, sess, restore_path, folder_data): 494 | """ 495 | Naively compute the tracking bounding box for each frame of the given sequence 496 | 497 | Args: 498 | model : model 499 | sess : tensorflow session 500 | restore_path : where is the restore file 501 | folder_data : the sequence location 502 | """ 503 | 504 | print('------------------------------------------------------') 505 | print('Performing tracking on given sequence %s ...'%(folder_data)) 506 | print('------------------------------------------------------\n') 507 | 508 | batch_size = model.batch_size 509 | images = tf.placeholder(tf.float32, 
shape=(batch_size,224,224,3)) 510 | guess, _, _ = model.infer(images) 511 | zeros = tf.zeros_like(guess) 512 | ones = tf.ones_like(guess) 513 | threshold = 2*tf.reduce_mean(guess, keep_dims= True) #Adaptive threshold 514 | output = tf.select(guess > threshold, ones, zeros) 515 | 516 | print('------------------------------------------------------') #Initialisation of weights 517 | sess.run(tf.global_variables_initializer()) 518 | print('Restoring from previous checkpoint...') 519 | ret_bis = restore_model(model, sess, restore_path) 520 | print('------------------------------------------------------') 521 | print('Done! Training ended at step %s' %(ret_bis)) 522 | print('------------------------------------------------------ \n') 523 | 524 | path = folder_data + '/results_tracking/' #Output folder 525 | if tf.gfile.Exists(path): 526 | tf.gfile.DeleteRecursively(path) 527 | tf.gfile.MakeDirs(path) 528 | tf.gfile.MakeDirs(path + 'out/') 529 | tf.gfile.MakeDirs(path + 'binary/') 530 | tf.gfile.MakeDirs(path + 'bb/') 531 | 532 | 533 | images_batch = np.empty((0,224,224,3)) 534 | index_representation = np.empty((batch_size), dtype='a1000') 535 | index = 0 536 | current = 0 537 | tot = len([f for f in os.listdir(folder_data + '/img/') if ".jpg" in f]) 538 | 539 | print('------------------------------------------------------') 540 | for e in np.array([f for f in os.listdir(folder_data + '/img/') if ".jpg" in f]): 541 | 542 | current += 1 543 | sys.stdout.write('%d out of %d \r' %(current, tot)) 544 | sys.stdout.flush() 545 | 546 | im = Image.open(os.path.join(folder_data + '/img/', e)) #Read images one by one and add them to the batch 547 | im.load() 548 | w,h = im.size 549 | im = im.resize((224,224)) 550 | im_a = np.asarray(im, dtype="uint8" ) #uint8 keeps pixel values in 0..255 (int8 wraps values above 127 to negatives) 551 | images_batch = np.append(images_batch, [im_a], axis=0) 552 | index_representation[index] = e 553 | 554 | index += 1 555 | if index == batch_size: #Ready to be processed 556 | 557 | guess_ret, threshold_ret, output_ret = 
sess.run([guess, threshold, output], feed_dict={images : images_batch/255}) 558 | 559 | for i in range(batch_size): 560 | ret_out = np.asarray( guess_ret[i]*255, dtype="uint8" ) #Score output 561 | Image.fromarray(ret_out, 'P').save(path + 'out/' + index_representation[i] + '_out' + '.png') 562 | 563 | ret_out_b = np.asarray( output_ret[i]*255, dtype="uint8" ) #Binary output 564 | Image.fromarray(ret_out_b, 'P').save(path + 'binary/' + index_representation[i] + '_binary' + '.png') 565 | 566 | im = Image.fromarray(np.asarray(images_batch[i], dtype = np.uint8), 'RGB') 567 | draw = ImageDraw.Draw(im) #Bounding box 568 | ret_mask = np.nonzero(ret_out_b) 569 | x0= np.amin(ret_mask,axis = 1)[1] 570 | y0=np.amin(ret_mask,axis = 1)[0] 571 | x1=np.amax(ret_mask,axis = 1)[1] 572 | y1=np.amax(ret_mask,axis = 1)[0] 573 | draw.rectangle([x0,y0,x1,y1],outline='red') 574 | im_b = im.resize((2*w,2*h)) 575 | im_b.save(path + 'bb/' + index_representation[i] + '_final' + '.png') 576 | del draw 577 | 578 | images_batch = np.empty((0,224,224,3)) 579 | index_representation = np.empty((batch_size), dtype='a1000') 580 | index = 0 581 | print('------------------------------------------------------ \n') 582 | 583 | if index != 0: #Last batch not processed 584 | 585 | number = index 586 | while index != batch_size: 587 | im = np.zeros((224,224,3)) 588 | images_batch = np.append(images_batch, [im], axis=0) 589 | index +=1 590 | 591 | guess_ret, threshold_ret, output_ret = sess.run([guess, threshold, output], feed_dict={images : images_batch/255}) 592 | 593 | for i in range(number): 594 | ret_out = np.asarray( guess_ret[i]*255, dtype="uint8" ) #Score output 595 | Image.fromarray(ret_out, 'P').save(path + 'out/' + index_representation[i] + '_out' + '.png') 596 | 597 | ret_out_b = np.asarray( output_ret[i]*255, dtype="uint8" ) #Binary output 598 | Image.fromarray(ret_out_b, 'P').save(path + 'binary/' + index_representation[i] + '_binary' + '.png') 599 | 600 | im = 
Image.fromarray(np.asarray(images_batch[i], dtype = np.uint8), 'RGB') 601 | draw = ImageDraw.Draw(im) #Bounding box 602 | ret_mask = np.nonzero(ret_out_b) 603 | x0= np.amin(ret_mask,axis = 1)[1] 604 | y0=np.amin(ret_mask,axis = 1)[0] 605 | x1=np.amax(ret_mask,axis = 1)[1] 606 | y1=np.amax(ret_mask,axis = 1)[0] 607 | draw.rectangle([x0,y0,x1,y1],outline='red') 608 | im_b = im.resize((2*w,2*h)) 609 | im_b.save(path + 'bb/' + index_representation[i] + '_final' + '.png') 610 | del draw 611 | 612 | print('------------------------------------------------------ ') 613 | print('Done!') 614 | print('------------------------------------------------------ \n') 615 | 616 | 617 | 618 | 619 | def compute_inter(model, sess, stream_input, restore_path, dataset, split, arithmetic): 620 | """ 621 | From the real encodings of images, compute new encodings and save the final results 622 | 623 | Args: 624 | model : model 625 | sess : tensorflow session 626 | stream_input : data manager 627 | restore_path : where is the restore file 628 | dataset : which dataset 629 | split : which split (valid, test, etc.) 
630 | arithmetic : how to transform encoding (1 is add, 2 subtract, 3 is linear combination) 631 | """ 632 | 633 | print('------------------------------------------------------') 634 | print('Computing operations on encoding of images of %s from %s ...'%(dataset, split)) 635 | print('------------------------------------------------------') 636 | if arithmetic == 1: 637 | print('Operation is addition') 638 | elif arithmetic == 2: 639 | print('Operation is subtraction') 640 | elif arithmetic == 3: 641 | print('Operation is linear combination') 642 | print('------------------------------------------------------\n') 643 | 644 | images,labels, name = stream_input.get_inputs() 645 | guess, _, _ = model.infer(images, arithmetic = arithmetic) 646 | 647 | print('------------------------------------------------------') #Initialisation of weights 648 | sess.run(tf.global_variables_initializer()) 649 | print('Restoring from previous checkpoint...') 650 | ret_bis = restore_model(model, sess, restore_path) 651 | tf.train.start_queue_runners(sess=sess) 652 | stream_input.start_threads(sess) 653 | print('------------------------------------------------------') 654 | print('Done! 
Training ended at step %s' %(ret_bis)) 655 | print('------------------------------------------------------ \n') 656 | 657 | path = restore_path + '/' + model.name 658 | path += '/results_arith_' + dataset + '_' + split + '_' + str(arithmetic) + '/' 659 | if tf.gfile.Exists(path): 660 | tf.gfile.DeleteRecursively(path) 661 | tf.gfile.MakeDirs(path) 662 | 663 | num_iter = int(stream_input.size_epoch / stream_input.batch_size) 664 | print('------------------------------------------------------') 665 | print('There are %d data to process in %d iterations' %(int(stream_input.size_epoch), num_iter)) 666 | print('------------------------------------------------------ \n') 667 | 668 | for step in range(num_iter): 669 | 670 | sys.stdout.write('%d out of %d \r' %(step, num_iter)) 671 | sys.stdout.flush() 672 | 673 | images_ret, labels_ret, guess_ret, name_ret = sess.run([images, labels, guess, name]) 674 | 675 | for i in range(model.batch_size): 676 | 677 | ret_path = path + str(i + step*model.batch_size) + '_' + str(name_ret[i]) 678 | ret = np.asarray( images_ret[i]*255, dtype="uint8" ) #uint8 so that values up to 255 do not overflow 679 | Image.fromarray(ret, 'RGB').save(ret_path + '_' + 'im' + '.jpg') 680 | ret = np.asarray( labels_ret[i]*255, dtype="uint8" ) 681 | Image.fromarray(ret, 'P').save(ret_path + '_' + 'lab' + '.png') 682 | ret = np.asarray( guess_ret[i]*255, dtype="uint8" ) 683 | Image.fromarray(ret, 'P').save(ret_path + '_' + 'guess' + '.png') 684 | 685 | print('------------------------------------------------------ ') 686 | print('Done!') 687 | print('------------------------------------------------------ \n') 688 | 689 | 690 | 691 | 692 | def do_nearest(model, sess, stream_input, restore_path, dataset, split, k = 4): 693 | """ 694 | From encoding of images, find the nearest neighbor of each image 695 | 696 | Args: 697 | model : model 698 | sess : tensorflow session 699 | stream_input : data manager 700 | restore_path : where is the restore file 701 | dataset : which dataset 702 | split : which split (valid, 
test, etc.) 703 | k : number of closest 704 | """ 705 | 706 | print('------------------------------------------------------') 707 | print('Computing the %d nearest neighbors of images of %s from %s ...'%(k, dataset, split)) 708 | print('------------------------------------------------------\n') 709 | 710 | images, labels, name = stream_input.get_inputs() 711 | guess, _, inter_feature = model.infer(images, inter_layer = True) 712 | 713 | num_iter = int(stream_input.size_epoch / stream_input.batch_size) 714 | num_examples = num_iter * stream_input.batch_size 715 | dimension = inter_feature.get_shape()[1].value 716 | 717 | index_representation = np.empty((num_examples), dtype='a1000') 718 | representation = np.zeros((num_examples,dimension)) 719 | closest = np.zeros((num_examples,k+1)) 720 | 721 | print('------------------------------------------------------') #Initialisation of weights 722 | sess.run(tf.global_variables_initializer()) 723 | print('Restoring from previous checkpoint...') 724 | ret = restore_model(model, sess, restore_path) 725 | tf.train.start_queue_runners(sess=sess) 726 | stream_input.start_threads(sess) 727 | print('------------------------------------------------------') 728 | print('Done! 
Training ended at step %s' %(ret)) 729 | print('------------------------------------------------------ \n') 730 | 731 | print('------------------------------------------------------') 732 | print('There are %d data to process in %d iterations' %(num_examples, num_iter)) 733 | print('------------------------------------------------------ \n') 734 | 735 | for step in range(num_iter): 736 | 737 | sys.stdout.write('%d out of %d \r' %(step, num_iter)) 738 | sys.stdout.flush() 739 | 740 | code, name_ret = sess.run([inter_feature, name]) 741 | for i in range(stream_input.batch_size): 742 | for j in range(dimension): 743 | representation[i + step * stream_input.batch_size,j] = code[i,j] 744 | index_representation[i + step * stream_input.batch_size] = name_ret[i] 745 | 746 | print('------------------------------------------------------ ') 747 | print('Step 1: Done!') 748 | print('------------------------------------------------------ \n') 749 | 750 | path = restore_path + '/' + model.name 751 | path += '/neighbour_' + dataset + '_' + split + '/' 752 | if tf.gfile.Exists(path): 753 | tf.gfile.DeleteRecursively(path) 754 | tf.gfile.MakeDirs(path) 755 | 756 | for i in range(num_examples): 757 | 758 | sys.stdout.write('%d out of %d \r' %(i, num_examples)) 759 | sys.stdout.flush() 760 | 761 | ret = np.reshape(np.tile(representation[i],num_examples),(num_examples,dimension)) 762 | distance = np.sum(np.square(representation-ret),axis=1) 763 | closest[i] = np.argsort(distance)[0:k+1] 764 | shutil.copy2(os.path.join(stream_input.f1 , str(index_representation[i])),path + str(index_representation[i]).split('.')[0] + '_im' + '.jpg') 765 | for j in range(k+1): 766 | if j > 0: 767 | shutil.copy2(os.path.join(stream_input.f1, str(index_representation[int(closest[i,j])])),path + str(index_representation[i]).split('.')[0] + '_neighbour_' + str(j) + '_' + str(index_representation[int(closest[i,j])]).split('.')[0] + '.jpg') 768 | 769 | 
print('------------------------------------------------------ ') 770 | print('Step 2: Done!') 771 | print('------------------------------------------------------ \n') --------------------------------------------------------------------------------
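The adaptive-threshold score built in `compute_score` (binarise each saliency map at twice its mean value, then combine precision and recall with weights 1.3 and 0.3) can be sketched in plain NumPy as below. This is a minimal illustration and not part of the repository: the function name `adaptive_scores`, the assumed `(batch, height, width)` array layout, and the toy arrays are all hypothetical, and the sketch averages per-image precision and recall over a single batch rather than accumulating sums across batches as the TensorFlow code does.

```python
import numpy as np


def adaptive_scores(guess, labels, beta_sq=0.3, eps=1e-8):
    """Precision, recall and F-measure at an adaptive threshold.

    guess  : float array of shape (batch, H, W), saliency values in [0, 1] (assumed layout)
    labels : array of the same shape, 1 for salient pixels and 0 otherwise
    """
    # Per-image threshold: twice the mean saliency over the spatial axes.
    threshold = 2.0 * guess.mean(axis=(1, 2), keepdims=True)
    predicted = guess > threshold
    truth = labels > 0.5

    # Per-image pixel counts.
    tp = np.sum(predicted & truth, axis=(1, 2)).astype(np.float64)
    fp = np.sum(predicted & ~truth, axis=(1, 2)).astype(np.float64)
    fn = np.sum(~predicted & truth, axis=(1, 2)).astype(np.float64)

    # Average the per-image precision and recall over the batch.
    precision = np.mean(tp / (eps + tp + fp))
    recall = np.mean(tp / (eps + tp + fn))
    f_measure = (1.0 + beta_sq) * precision * recall / (eps + beta_sq * precision + recall)
    return precision, recall, f_measure


# Toy example: one 4x4 map whose salient top-left 2x2 block is predicted confidently.
guess = np.zeros((1, 4, 4))
guess[0, :2, :2] = 0.9
labels = np.zeros((1, 4, 4))
labels[0, :2, :2] = 1.0

precision, recall, f_measure = adaptive_scores(guess, labels)
print('Precision %0.02f percent -- Recall %0.02f percent' % (precision * 100, recall * 100))
print('fscore %0.04f' % f_measure)
```

Because the expression `1.3 * P * R / (0.3 * P + R)` scales linearly when `P` and `R` are both multiplied by the same factor, applying it to batch sums and dividing by the number of examples afterwards, as `compute_score` does, gives the same value per batch as applying it to batch means (up to the small `1e-8` term).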