├── README.md ├── data ├── README.md └── gen_market1501_and_cuhk03_train_set.py ├── evaluation ├── README.md ├── deploy_mgcam.prototxt ├── extract_feature_cuhk.py ├── extract_feature_market.py └── extract_feature_mars.py └── experiments ├── cuhk03-detected ├── layers.py ├── mgcam_iter_75000.caffemodel ├── mgcam_siamese_iter_20000.caffemodel ├── mgcam_siamese_train.prototxt ├── mgcam_train.prototxt ├── run_mgcam.sh ├── run_mgcam_siamese.sh ├── solver_mgcam.prototxt └── solver_mgcam_siamese.prototxt ├── cuhk03-labeled ├── layers.py ├── mgcam_iter_75000.caffemodel ├── mgcam_siamese_iter_20000.caffemodel ├── mgcam_siamese_train.prototxt ├── mgcam_train.prototxt ├── run_mgcam.sh ├── run_mgcam_siamese.sh ├── solver_mgcam.prototxt └── solver_mgcam_siamese.prototxt ├── market1501 ├── layers.py ├── mgcam_iter_75000.caffemodel ├── mgcam_siamese_iter_20000.caffemodel ├── mgcam_siamese_train.prototxt ├── mgcam_train.prototxt ├── run_mgcam.sh ├── run_mgcam_siamese.sh ├── solver_mgcam.prototxt └── solver_mgcam_siamese.prototxt └── mars ├── layers.py ├── mgcam_iter_75000.caffemodel ├── mgcam_siamese_iter_20000.caffemodel ├── mgcam_siamese_train.prototxt ├── mgcam_train.prototxt ├── run_mgcam.sh ├── run_mgcam_siamese.sh ├── solver_mgcam.prototxt └── solver_mgcam_siamese.prototxt /README.md: -------------------------------------------------------------------------------- 1 | # MGCAM 2 | -------------------------------------------------------------------------------- 3 | * Mask-guided Contrastive Attention Model (MGCAM) for Person Re-Identification 4 | * Code Version 1.0 5 | * E-mail: chunfeng.song@nlpr.ia.ac.cn 6 | -------------------------------------------------------------------------------- 7 | 8 | i. Overview 9 | ii. Copying 10 | iii. Use 11 | 12 | i. OVERVIEW 13 | ----------------------------- 14 | This code implements the paper: 15 | 16 | >Chunfeng Song, Yan Huang, Wanli Ouyang, and Liang Wang. Mask-guided 17 | Contrastive Attention Model for Person Re-Identification. 18 | In CVPR, 2018. 19 | 20 | If you find this work helpful for your research, please cite our paper [[PDF]](http://openaccess.thecvf.com/content_cvpr_2018/html/Song_Mask-Guided_Contrastive_Attention_CVPR_2018_paper.html). 21 | 22 | ii. COPYING 23 | ----------------------------- 24 | We share this code for research use only. We neither warrant 25 | correctness nor take any responsibility for the consequences of 26 | using this code. If you find any problem or inappropriate content 27 | in this code, feel free to contact us (chunfeng.song@nlpr.ia.ac.cn). 28 | 29 | iii. USE 30 | ----------------------------- 31 | This code runs on Caffe with Python layer support (pycaffe). You can install Caffe from [here](https://github.com/BVLC/caffe). 32 | 33 | (1) Data Preparation. 34 | Download the original datasets (MARS, Market-1501, and CUHK-03) and their masks from [Baidu Yun](https://pan.baidu.com/s/16ZrlM1f_1_T-eZHmQTTkYg) OR [Google Drive](https://drive.google.com/drive/folders/1QVBDpH0B4k6cXKFYXBJ3HNVET_3gY0to?usp=sharing). 35 | 36 | For Market-1501 and CUHK-03, you need to run the split code (./data/gen_market1501_and_cuhk03_train_set.py). 37 | 38 | (2) Model Training. 39 | Here, we take MARS as an example; the other datasets are handled in the same way. 40 | 41 | >cd ./experiments/mars 42 | 43 | First, edit 'im_path', 'gt_path' and 'dataset' in the prototxt file; e.g., for the MARS dataset, the MGCAM-only and MGCAM-Siamese versions are 'mgcam_train.prototxt' and 'mgcam_siamese_train.prototxt', respectively.
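The data layer ('layers.py' in each experiment folder) parses these values from the Python layer's param_str with YAML. As a minimal sketch of the expected format (the batch size and paths below are placeholders, not the repository's defaults), the string you put into the prototxt should parse like this:

```python
import yaml

# Hypothetical param_str; it mirrors the keys that layers.py reads via
# yaml.load(self.param_str): 'batch_size', 'im_path', 'gt_path' and 'dataset'.
# The values are placeholders -- point them at your own data folders.
param_str = ("{'batch_size': 16, "
             "'im_path': '../../data/mars/bbox_train', "
             "'gt_path': '../../data/mars/bbox_train_seg', "
             "'dataset': 'mars_train'}")

params = yaml.load(param_str)
assert set(params) == {'batch_size', 'im_path', 'gt_path', 'dataset'}
print(params['im_path'])
```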
44 | 45 | Then, we can train the MGCAM model from scratch with the command: 46 | >sh run_mgcam.sh 47 | 48 | It will take roughly 15 hours on a single Titan X. 49 | 50 | Finally, we can fine-tune the MGCAM model with the siamese loss by running the command: 51 | >sh run_mgcam_siamese.sh 52 | 53 | It will take roughly 5 hours on a single Titan X. 54 | 55 | (3) Evaluation. 56 | Taking MARS as an example, run the code in './evaluation/extract_feature_mars.py' to extract the IDE features, and then run the CMC and mAP evaluation with the [MARS-evaluation](https://github.com/liangzheng06/MARS-evaluation) code by Liang Zheng et al., or the [Re-Ranking](https://github.com/zhunzhong07/person-re-ranking) code by Zhun Zhong et al. 57 | -------------------------------------------------------------------------------- /data/README.md: -------------------------------------------------------------------------------- 1 | Dataset Preparation. 2 | --- 3 | 1) Download the MARS dataset from [here](http://www.liangzheng.com.cn/Project/project_mars.html). 4 | 5 | 2) Download the Market-1501 dataset from [here](http://www.liangzheng.org/Project/project_reid.html). 6 | 7 | 3) Download the CUHK03 dataset from [here](https://github.com/zhunzhong07/person-re-ranking). You need to extract the images into folders. You can also download the new protocol version [CUHK03-NP](https://github.com/zhunzhong07/person-re-ranking/tree/master/CUHK03-NP). If you use this dataset in your work, please cite their paper: 8 | 9 | >@inproceedings{zhong2017re, 10 | title={Re-ranking Person Re-identification with k-reciprocal Encoding}, 11 | author={Zhong, Zhun and Zheng, Liang and Cao, Donglin and Li, Shaozi}, 12 | booktitle={CVPR}, 13 | year={2017} 14 | } 15 | 16 | >@inproceedings{li2014deepreid, 17 | title={DeepReID: Deep Filter Pairing Neural Network for Person Re-identification}, 18 | author={Li, Wei and Zhao, Rui and Xiao, Tong and Wang, Xiaogang}, 19 | booktitle={CVPR}, 20 | year={2014} 21 | } 22 | 23 | * All masks can be downloaded from [Baidu Yun](https://pan.baidu.com/s/16ZrlM1f_1_T-eZHmQTTkYg) OR [Google Drive](https://drive.google.com/drive/folders/1QVBDpH0B4k6cXKFYXBJ3HNVET_3gY0to?usp=sharing). If you use the masks in your work, please cite our paper: 24 | 25 | >Chunfeng Song, Yan Huang, Wanli Ouyang, and Liang Wang. Mask-guided 26 | Contrastive Attention Model for Person Re-Identification. 27 | In CVPR, 2018.
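The masks are ordinary image files; the training and evaluation code reads them with OpenCV and converts them to a single grayscale channel. A minimal sanity check for one image/mask pair (the file names below are hypothetical examples; substitute any downloaded image and its mask):

```python
import cv2

# Hypothetical file names -- replace with a real image and its mask.
im = cv2.imread('./market-1501/bounding_box_train/0002_c1s1_000451_03.jpg')
seg = cv2.imread('./market-1501/bounding_box_train_seg/0002_c1s1_000451_03.png')
seg = cv2.cvtColor(seg, cv2.COLOR_BGR2GRAY)  # same conversion the MGCAM code applies
print(im.shape[:2], seg.shape)  # the mask should match the image spatially
```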
28 | 29 | Make sure that all the datasets are saved in the following structure: 30 | 31 | MARS: 32 | >./data 33 | >./data/mars 34 | >./data/mars/bbox_train 35 | >./data/mars/bbox_test 36 | >./data/mars/bbox_train_seg 37 | >./data/mars/bbox_test_seg 38 | 39 | Market-1501: 40 | >./data 41 | >./data/market-1501 42 | >./data/market-1501/bounding_box_train 43 | >./data/market-1501/bounding_box_test 44 | >./data/market-1501/query 45 | >./data/market-1501/bounding_box_train_seg 46 | >./data/market-1501/bounding_box_test_seg 47 | >./data/market-1501/query_seg 48 | 49 | CUHK03: 50 | >./data 51 | >./data/cuhk03 52 | >./data/cuhk03/labeled 53 | >./data/cuhk03/cuhk03_labeled_seg 54 | >./data/cuhk03/detected 55 | >./data/cuhk03/cuhk03_detected_seg 56 | 57 | CUHK03-NP: 58 | >./data 59 | >./data/cuhk03-np 60 | >./data/cuhk03-np/labeled 61 | >./data/cuhk03-np/labeled/bounding_box_train 62 | >./data/cuhk03-np/labeled/bounding_box_test 63 | >./data/cuhk03-np/labeled/query 64 | >./data/cuhk03-np/labeled/bounding_box_train_seg 65 | >./data/cuhk03-np/labeled/bounding_box_test_seg 66 | >./data/cuhk03-np/labeled/query_seg 67 | 68 | >./data/cuhk03-np/detected 69 | >./data/cuhk03-np/detected/bounding_box_train 70 | >./data/cuhk03-np/detected/bounding_box_test 71 | >./data/cuhk03-np/detected/query 72 | >./data/cuhk03-np/detected/bounding_box_train_seg 73 | >./data/cuhk03-np/detected/bounding_box_test_seg 74 | >./data/cuhk03-np/detected/query_seg 75 | 76 | Now, you can generate the training set by running "python gen_market1501_and_cuhk03_train_set.py". 77 | -------------------------------------------------------------------------------- /data/gen_market1501_and_cuhk03_train_set.py: -------------------------------------------------------------------------------- 1 | """ 2 | Market-1501 and CUHK-03 training data. 3 | 4 | by Chunfeng Song 5 | 6 | 2017/10/08 7 | 8 | This code is for research use only, please cite our paper: 9 | 10 | Chunfeng Song, Yan Huang, Wanli Ouyang, and Liang Wang. Mask-guided Contrastive Attention Model for Person Re-Identification. In CVPR, 2018. 11 | 12 | Contact us: chunfeng.song@nlpr.ia.ac.cn 13 | """ 14 | 15 | import os 16 | import shutil 17 | import numpy as np 18 | 19 | dataset = 'cuhk03-np/detected' #'cuhk03-np/labeled' or 'cuhk03-np/detected', cuhk03-np can be downloaded from https://github.com/zhunzhong07/person-re-ranking/tree/master/CUHK03-NP 20 | data_path = './' + dataset +'/bounding_box_train' #Path for original RGB training set. 21 | seg_path = './' + dataset +'/bounding_box_train_seg' #Path for binary masks of training set. 22 | save_data_path = './' + dataset +'/bounding_box_train_fold' 23 | save_seg_path = './' + dataset +'/bounding_box_train_seg_fold' 24 | if not os.path.exists(save_data_path): 25 | os.mkdir(save_data_path) 26 | os.mkdir(save_seg_path) 27 | pic_im_list = np.sort(os.listdir(data_path)) 28 | i = 0 29 | for pic in pic_im_list: 30 | if pic.lower().endswith('.jpg') or pic.lower().endswith('.png'): 31 | this_data_fold = os.path.join(save_data_path,pic[:4]) 32 | this_seg_fold = os.path.join(save_seg_path,pic[:4]) 33 | if not os.path.exists(this_data_fold): 34 | os.mkdir(this_data_fold) 35 | os.mkdir(this_seg_fold) 36 | i +=1 37 | new_im_path = os.path.join(this_data_fold,pic) 38 | new_seg_path = os.path.join(this_seg_fold,pic[:-4] + '.png') # masks go into the seg fold, not the image fold 39 | shutil.copy(os.path.join(data_path,pic),new_im_path) 40 | shutil.copy(os.path.join(seg_path,pic[:-4] + '.png'),new_seg_path) 41 | print '---->dealing num-%04d with %s!'%(i,pic) 42 | print 'DONE!!!'
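# Optional sanity check (not part of the original script, added here as a
# sketch): after the split, every identity folder should contain one mask
# per image. This reuses the save_data_path/save_seg_path variables above.
for fold in np.sort(os.listdir(save_data_path)):
    n_im = len(os.listdir(os.path.join(save_data_path, fold)))
    n_seg = len(os.listdir(os.path.join(save_seg_path, fold)))
    assert n_im == n_seg, 'identity %s: %d images but %d masks' % (fold, n_im, n_seg)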
43 | 44 | -------------------------------------------------------------------------------- /evaluation/README.md: -------------------------------------------------------------------------------- 1 | Extract features for evaluation. 2 | --- 3 | We recommend the popular [re-ranking](https://github.com/zhunzhong07/person-re-ranking) method for further evaluation. -------------------------------------------------------------------------------- /evaluation/deploy_mgcam.prototxt: -------------------------------------------------------------------------------- 1 | name: "MGCAM" 2 | 3 | input: "data" 4 | input_dim:1 5 | input_dim:3 6 | input_dim:160 7 | input_dim:64 8 | 9 | input: "one_mask" 10 | input_dim:1 11 | input_dim:1 12 | input_dim:40 13 | input_dim:16 14 | 15 | input: "mask" 16 | input_dim:1 17 | input_dim:1 18 | input_dim:160 19 | input_dim:64 20 | 21 | layer { 22 | name: "data_concate" 23 | type: "Concat" 24 | bottom: "data" 25 | bottom: "mask" 26 | top: "data_concate" 27 | } 28 | 29 | layer { 30 | name: "conv0_scale1_full" 31 | type: "Convolution" 32 | bottom: "data_concate" 33 | top: "conv0_scale1_full" 34 | param { 35 | name: "conv0_scale1_w_full" 36 | lr_mult: 1 37 | decay_mult: 1 38 | } 39 | param { 40 | name: "conv0_scale1_b_full" 41 | lr_mult: 2 42 | decay_mult: 0 43 | } 44 | convolution_param { 45 | num_output: 32 46 | pad: 2 47 | kernel_size: 5 48 | stride: 1 49 | weight_filler { 50 | type: "xavier" 51 | } 52 | bias_filler { 53 | type: "constant" 54 | } 55 | dilation: 1 56 | } 57 | } 58 | layer { 59 | name: "bn0_scale1_full" 60 | type: "BatchNorm" 61 | bottom: "conv0_scale1_full" 62 | top: "bn0_scale1_full" 63 | param { 64 | name: "bn0_scale1_mean_full" 65 | lr_mult: 0 66 | decay_mult: 0 67 | } 68 | param { 69 | name: "bn0_scale1_var_full" 70 | lr_mult: 0 71 | decay_mult: 0 72 | } 73 | param { 74 | name: "bn0_scale1_bias_full" 75 | lr_mult: 0 76 | decay_mult: 0 77 | } 78 | batch_norm_param { 79 | use_global_stats: true 80 | } 81 | } 82 | layer { 83 | name: "relu0_full" 84 | type: "ReLU" 85 | bottom: "bn0_scale1_full" 86 | top: "bn0_scale1_full" 87 | } 88 | layer { 89 | name: "pool0_full" 90 | type: "Pooling" 91 | bottom: "bn0_scale1_full" 92 | top: "pool0_full" 93 | pooling_param { 94 | pool: MAX 95 | kernel_size: 2 96 | stride: 2 97 | } 98 | } 99 | layer { 100 | name: "conv1_scale1_full" 101 | type: "Convolution" 102 | bottom: "pool0_full" 103 | top: "conv1_scale1_full" 104 | param { 105 | name: "conv1_scale1_w_full" 106 | lr_mult: 1 107 | decay_mult: 1 108 | } 109 | param { 110 | name: "conv1_scale1_b_full" 111 | lr_mult: 2 112 | decay_mult: 0 113 | } 114 | convolution_param { 115 | num_output: 32 116 | pad: 1 117 | kernel_size: 3 118 | stride: 1 119 | weight_filler { 120 | type: "xavier" 121 | } 122 | bias_filler { 123 | type: "constant" 124 | } 125 | dilation: 1 126 | } 127 | } 128 | layer { 129 | name: "conv1_scale2_full" 130 | type: "Convolution" 131 | bottom: "pool0_full" 132 | top: "conv1_scale2_full" 133 | param { 134 | name: "conv1_scale2_w_full" 135 | lr_mult: 1 136 | decay_mult: 1 137 | } 138 | param { 139 | name: "conv1_scale2_b_full" 140 | lr_mult: 2 141 | decay_mult: 0 142 | } 143 | convolution_param { 144 | num_output: 32 145 | pad: 2 146 | kernel_size: 3 147 | stride: 1 148 | weight_filler { 149 | type: "xavier" 150 | } 151 | bias_filler { 152 | type: "constant" 153 | } 154 | dilation: 2 155 | } 156 | } 157 | layer { 158 | name: "conv1_scale3_full" 159 | type: "Convolution" 160 | bottom: "pool0_full" 161 | top: "conv1_scale3_full" 162 | param { 163 | name:
"conv1_scale3_w_full" 164 | lr_mult: 1 165 | decay_mult: 1 166 | } 167 | param { 168 | name: "conv1_scale3_b_full" 169 | lr_mult: 2 170 | decay_mult: 0 171 | } 172 | convolution_param { 173 | num_output: 32 174 | pad: 3 175 | kernel_size: 3 176 | stride: 1 177 | weight_filler { 178 | type: "xavier" 179 | } 180 | bias_filler { 181 | type: "constant" 182 | } 183 | dilation: 3 184 | } 185 | } 186 | layer { 187 | name: "bn1_scale1_full" 188 | type: "BatchNorm" 189 | bottom: "conv1_scale1_full" 190 | top: "bn1_scale1_full" 191 | param { 192 | name: "bn1_scale1_mean_full" 193 | lr_mult: 0 194 | decay_mult: 0 195 | } 196 | param { 197 | name: "bn1_scale1_var_full" 198 | lr_mult: 0 199 | decay_mult: 0 200 | } 201 | param { 202 | name: "bn1_scale1_bias_full" 203 | lr_mult: 0 204 | decay_mult: 0 205 | } 206 | batch_norm_param { 207 | use_global_stats: true 208 | } 209 | } 210 | layer { 211 | name: "bn1_scale2_full" 212 | type: "BatchNorm" 213 | bottom: "conv1_scale2_full" 214 | top: "bn1_scale2_full" 215 | param { 216 | name: "bn1_scale2_mean_full" 217 | lr_mult: 0 218 | decay_mult: 0 219 | } 220 | param { 221 | name: "bn1_scale2_var_full" 222 | lr_mult: 0 223 | decay_mult: 0 224 | } 225 | param { 226 | name: "bn1_scale2_bias_full" 227 | lr_mult: 0 228 | decay_mult: 0 229 | } 230 | batch_norm_param { 231 | use_global_stats: true 232 | } 233 | } 234 | layer { 235 | name: "bn1_scale3_full" 236 | type: "BatchNorm" 237 | bottom: "conv1_scale3_full" 238 | top: "bn1_scale3_full" 239 | param { 240 | name: "bn1_scale3_mean_full" 241 | lr_mult: 0 242 | decay_mult: 0 243 | } 244 | param { 245 | name: "bn1_scale3_var_full" 246 | lr_mult: 0 247 | decay_mult: 0 248 | } 249 | param { 250 | name: "bn1_scale3_bias_full" 251 | lr_mult: 0 252 | decay_mult: 0 253 | } 254 | batch_norm_param { 255 | use_global_stats: true 256 | } 257 | } 258 | layer { 259 | name: "bn1_full" 260 | type: "Concat" 261 | bottom: "bn1_scale1_full" 262 | bottom: "bn1_scale2_full" 263 | bottom: "bn1_scale3_full" 264 | top: "bn1_full" 265 | concat_param { 266 | axis: 1 267 | } 268 | } 269 | layer { 270 | name: "relu1_full" 271 | type: "ReLU" 272 | bottom: "bn1_full" 273 | top: "bn1_full" 274 | } 275 | layer { 276 | name: "pool1_full" 277 | type: "Pooling" 278 | bottom: "bn1_full" 279 | top: "pool1_full" 280 | pooling_param { 281 | pool: MAX 282 | kernel_size: 2 283 | stride: 2 284 | } 285 | } 286 | layer { 287 | name: "conv2_scale1_full" 288 | type: "Convolution" 289 | bottom: "pool1_full" 290 | top: "conv2_scale1_full" 291 | param { 292 | name: "conv2_scale1_w_full" 293 | lr_mult: 1 294 | decay_mult: 1 295 | } 296 | param { 297 | name: "conv2_scale1_b_full" 298 | lr_mult: 2 299 | decay_mult: 0 300 | } 301 | convolution_param { 302 | num_output: 32 303 | pad: 1 304 | kernel_size: 3 305 | stride: 1 306 | weight_filler { 307 | type: "xavier" 308 | } 309 | bias_filler { 310 | type: "constant" 311 | } 312 | dilation: 1 313 | } 314 | } 315 | layer { 316 | name: "conv2_scale2_full" 317 | type: "Convolution" 318 | bottom: "pool1_full" 319 | top: "conv2_scale2_full" 320 | param { 321 | name: "conv2_scale2_w_full" 322 | lr_mult: 1 323 | decay_mult: 1 324 | } 325 | param { 326 | name: "conv2_scale2_b_full" 327 | lr_mult: 2 328 | decay_mult: 0 329 | } 330 | convolution_param { 331 | num_output: 32 332 | pad: 2 333 | kernel_size: 3 334 | stride: 1 335 | weight_filler { 336 | type: "xavier" 337 | } 338 | bias_filler { 339 | type: "constant" 340 | } 341 | dilation: 2 342 | } 343 | } 344 | layer { 345 | name: "conv2_scale3_full" 346 | type: "Convolution" 347 
| bottom: "pool1_full" 348 | top: "conv2_scale3_full" 349 | param { 350 | name: "conv2_scale3_w_full" 351 | lr_mult: 1 352 | decay_mult: 1 353 | } 354 | param { 355 | name: "conv2_scale3_b_full" 356 | lr_mult: 2 357 | decay_mult: 0 358 | } 359 | convolution_param { 360 | num_output: 32 361 | pad: 3 362 | kernel_size: 3 363 | stride: 1 364 | weight_filler { 365 | type: "xavier" 366 | } 367 | bias_filler { 368 | type: "constant" 369 | } 370 | dilation: 3 371 | } 372 | } 373 | layer { 374 | name: "bn2_scale1_full" 375 | type: "BatchNorm" 376 | bottom: "conv2_scale1_full" 377 | top: "bn2_scale1_full" 378 | param { 379 | name: "bn2_scale1_mean_full" 380 | lr_mult: 0 381 | decay_mult: 0 382 | } 383 | param { 384 | name: "bn2_scale1_var_full" 385 | lr_mult: 0 386 | decay_mult: 0 387 | } 388 | param { 389 | name: "bn2_scale1_bias_full" 390 | lr_mult: 0 391 | decay_mult: 0 392 | } 393 | batch_norm_param { 394 | use_global_stats: true 395 | } 396 | } 397 | layer { 398 | name: "bn2_scale2_full" 399 | type: "BatchNorm" 400 | bottom: "conv2_scale2_full" 401 | top: "bn2_scale2_full" 402 | param { 403 | name: "bn2_scale2_mean_full" 404 | lr_mult: 0 405 | decay_mult: 0 406 | } 407 | param { 408 | name: "bn2_scale2_var_full" 409 | lr_mult: 0 410 | decay_mult: 0 411 | } 412 | param { 413 | name: "bn2_scale2_bias_full" 414 | lr_mult: 0 415 | decay_mult: 0 416 | } 417 | batch_norm_param { 418 | use_global_stats: true 419 | } 420 | } 421 | layer { 422 | name: "bn2_scale3_full" 423 | type: "BatchNorm" 424 | bottom: "conv2_scale3_full" 425 | top: "bn2_scale3_full" 426 | param { 427 | name: "bn2_scale3_mean_full" 428 | lr_mult: 0 429 | decay_mult: 0 430 | } 431 | param { 432 | name: "bn2_scale3_var_full" 433 | lr_mult: 0 434 | decay_mult: 0 435 | } 436 | param { 437 | name: "bn2_scale3_bias_full" 438 | lr_mult: 0 439 | decay_mult: 0 440 | } 441 | batch_norm_param { 442 | use_global_stats: true 443 | } 444 | } 445 | layer { 446 | name: "bn2_full" 447 | type: "Concat" 448 | bottom: "bn2_scale1_full" 449 | bottom: "bn2_scale2_full" 450 | bottom: "bn2_scale3_full" 451 | top: "bn2_full" 452 | concat_param { 453 | axis: 1 454 | } 455 | } 456 | layer { 457 | name: "relu2_full" 458 | type: "ReLU" 459 | bottom: "bn2_full" 460 | top: "bn2_full" 461 | } 462 | 463 | 464 | layer { 465 | name: "make_att_mask" 466 | type: "Convolution" 467 | bottom: "bn2_full" 468 | top: "make_att_mask" 469 | param { 470 | lr_mult: 1 471 | decay_mult: 1 472 | } 473 | param { 474 | lr_mult: 2 475 | decay_mult: 0 476 | } 477 | convolution_param { 478 | num_output: 1 479 | pad: 1 480 | kernel_size: 3 481 | stride: 1 482 | weight_filler { 483 | type: "xavier" 484 | } 485 | bias_filler { 486 | type: "constant" 487 | } 488 | dilation: 1 489 | } 490 | } 491 | 492 | layer { 493 | name: "att_sigmoid" 494 | type: "Sigmoid" 495 | bottom: "make_att_mask" 496 | top: "make_att_mask" 497 | } 498 | 499 | layer { 500 | name: "make_att_mask_inv" 501 | type: "Eltwise" 502 | bottom: "one_mask" 503 | bottom: "make_att_mask" 504 | top: "make_att_mask_inv" 505 | eltwise_param { 506 | operation: SUM 507 | coeff: 1 508 | coeff: -1 509 | } 510 | } 511 | ############### Seg Loss ##################### 512 | 513 | layer { 514 | name: "tile_iner" 515 | type: "Tile" 516 | bottom: "make_att_mask" 517 | top: "att_iner" 518 | tile_param { 519 | tiles: 96 520 | axis: 1 521 | } 522 | } 523 | 524 | layer { 525 | name: "bn2_att_iner" 526 | type: "Eltwise" 527 | bottom: "bn2_full" 528 | bottom: "att_iner" 529 | top: "bn2_att_iner" 530 | eltwise_param { 531 | operation: PROD 532 | } 
533 | } 534 | 535 | layer { 536 | name: "tile_exter" 537 | type: "Tile" 538 | bottom: "make_att_mask_inv" 539 | top: "att_exter" 540 | tile_param { 541 | tiles: 96 542 | axis: 1 543 | } 544 | } 545 | 546 | layer { 547 | name: "bn2_att_exter" 548 | type: "Eltwise" 549 | bottom: "bn2_full" 550 | bottom: "att_exter" 551 | top: "bn2_att_exter" 552 | eltwise_param { 553 | operation: PROD 554 | } 555 | } 556 | 557 | layer { 558 | name: "pool2_full" 559 | type: "Pooling" 560 | bottom: "bn2_full" 561 | top: "pool2_full" 562 | pooling_param { 563 | pool: MAX 564 | kernel_size: 2 565 | stride: 2 566 | } 567 | } 568 | 569 | layer { 570 | name: "conv3_scale1_full" 571 | type: "Convolution" 572 | bottom: "pool2_full" 573 | top: "conv3_scale1_full" 574 | param { 575 | name: "conv3_scale1_w_full" 576 | lr_mult: 1 577 | decay_mult: 1 578 | } 579 | param { 580 | name: "conv3_scale1_b_full" 581 | lr_mult: 2 582 | decay_mult: 0 583 | } 584 | convolution_param { 585 | num_output: 32 586 | pad: 1 587 | kernel_size: 3 588 | stride: 1 589 | weight_filler { 590 | type: "xavier" 591 | } 592 | bias_filler { 593 | type: "constant" 594 | } 595 | dilation: 1 596 | } 597 | } 598 | layer { 599 | name: "conv3_scale2_full" 600 | type: "Convolution" 601 | bottom: "pool2_full" 602 | top: "conv3_scale2_full" 603 | param { 604 | name: "conv3_scale2_w_full" 605 | lr_mult: 1 606 | decay_mult: 1 607 | } 608 | param { 609 | name: "conv3_scale2_b_full" 610 | lr_mult: 2 611 | decay_mult: 0 612 | } 613 | convolution_param { 614 | num_output: 32 615 | pad: 2 616 | kernel_size: 3 617 | stride: 1 618 | weight_filler { 619 | type: "xavier" 620 | } 621 | bias_filler { 622 | type: "constant" 623 | } 624 | dilation: 2 625 | } 626 | } 627 | layer { 628 | name: "conv3_scale3_full" 629 | type: "Convolution" 630 | bottom: "pool2_full" 631 | top: "conv3_scale3_full" 632 | param { 633 | name: "conv3_scale3_w_full" 634 | lr_mult: 1 635 | decay_mult: 1 636 | } 637 | param { 638 | name: "conv3_scale3_b_full" 639 | lr_mult: 2 640 | decay_mult: 0 641 | } 642 | convolution_param { 643 | num_output: 32 644 | pad: 3 645 | kernel_size: 3 646 | stride: 1 647 | weight_filler { 648 | type: "xavier" 649 | } 650 | bias_filler { 651 | type: "constant" 652 | } 653 | dilation: 3 654 | } 655 | } 656 | layer { 657 | name: "bn3_scale1_full" 658 | type: "BatchNorm" 659 | bottom: "conv3_scale1_full" 660 | top: "bn3_scale1_full" 661 | param { 662 | name: "bn3_scale1_mean_full" 663 | lr_mult: 0 664 | decay_mult: 0 665 | } 666 | param { 667 | name: "bn3_scale1_var_full" 668 | lr_mult: 0 669 | decay_mult: 0 670 | } 671 | param { 672 | name: "bn3_scale1_bias_full" 673 | lr_mult: 0 674 | decay_mult: 0 675 | } 676 | batch_norm_param { 677 | use_global_stats: true 678 | } 679 | } 680 | layer { 681 | name: "bn3_scale2_full" 682 | type: "BatchNorm" 683 | bottom: "conv3_scale2_full" 684 | top: "bn3_scale2_full" 685 | param { 686 | name: "bn3_scale2_mean_full" 687 | lr_mult: 0 688 | decay_mult: 0 689 | } 690 | param { 691 | name: "bn3_scale2_var_full" 692 | lr_mult: 0 693 | decay_mult: 0 694 | } 695 | param { 696 | name: "bn3_scale2_bias_full" 697 | lr_mult: 0 698 | decay_mult: 0 699 | } 700 | batch_norm_param { 701 | use_global_stats: true 702 | } 703 | } 704 | layer { 705 | name: "bn3_scale3_full" 706 | type: "BatchNorm" 707 | bottom: "conv3_scale3_full" 708 | top: "bn3_scale3_full" 709 | param { 710 | name: "bn3_scale3_mean_full" 711 | lr_mult: 0 712 | decay_mult: 0 713 | } 714 | param { 715 | name: "bn3_scale3_var_full" 716 | lr_mult: 0 717 | decay_mult: 0 718 | } 719 | 
param { 720 | name: "bn3_scale3_bias_full" 721 | lr_mult: 0 722 | decay_mult: 0 723 | } 724 | batch_norm_param { 725 | use_global_stats: true 726 | } 727 | } 728 | layer { 729 | name: "bn3_full" 730 | type: "Concat" 731 | bottom: "bn3_scale1_full" 732 | bottom: "bn3_scale2_full" 733 | bottom: "bn3_scale3_full" 734 | top: "bn3_full" 735 | concat_param { 736 | axis: 1 737 | } 738 | } 739 | layer { 740 | name: "relu3_full" 741 | type: "ReLU" 742 | bottom: "bn3_full" 743 | top: "bn3_full" 744 | } 745 | layer { 746 | name: "pool3_full" 747 | type: "Pooling" 748 | bottom: "bn3_full" 749 | top: "pool3_full" 750 | pooling_param { 751 | pool: MAX 752 | kernel_size: 2 753 | stride: 2 754 | } 755 | } 756 | layer { 757 | name: "conv4_scale1_full" 758 | type: "Convolution" 759 | bottom: "pool3_full" 760 | top: "conv4_scale1_full" 761 | param { 762 | name: "conv4_scale1_w_full" 763 | lr_mult: 1 764 | decay_mult: 1 765 | } 766 | param { 767 | name: "conv4_scale1_b_full" 768 | lr_mult: 2 769 | decay_mult: 0 770 | } 771 | convolution_param { 772 | num_output: 32 773 | pad: 1 774 | kernel_size: 3 775 | stride: 1 776 | weight_filler { 777 | type: "xavier" 778 | } 779 | bias_filler { 780 | type: "constant" 781 | } 782 | dilation: 1 783 | } 784 | } 785 | layer { 786 | name: "conv4_scale2_full" 787 | type: "Convolution" 788 | bottom: "pool3_full" 789 | top: "conv4_scale2_full" 790 | param { 791 | name: "conv4_scale2_w_full" 792 | lr_mult: 1 793 | decay_mult: 1 794 | } 795 | param { 796 | name: "conv4_scale2_b_full" 797 | lr_mult: 2 798 | decay_mult: 0 799 | } 800 | convolution_param { 801 | num_output: 32 802 | pad: 2 803 | kernel_size: 3 804 | stride: 1 805 | weight_filler { 806 | type: "xavier" 807 | } 808 | bias_filler { 809 | type: "constant" 810 | } 811 | dilation: 2 812 | } 813 | } 814 | layer { 815 | name: "conv4_scale3_full" 816 | type: "Convolution" 817 | bottom: "pool3_full" 818 | top: "conv4_scale3_full" 819 | param { 820 | name: "conv4_scale3_w_full" 821 | lr_mult: 1 822 | decay_mult: 1 823 | } 824 | param { 825 | name: "conv4_scale3_b_full" 826 | lr_mult: 2 827 | decay_mult: 0 828 | } 829 | convolution_param { 830 | num_output: 32 831 | pad: 3 832 | kernel_size: 3 833 | stride: 1 834 | weight_filler { 835 | type: "xavier" 836 | } 837 | bias_filler { 838 | type: "constant" 839 | } 840 | dilation: 3 841 | } 842 | } 843 | layer { 844 | name: "bn4_scale1_full" 845 | type: "BatchNorm" 846 | bottom: "conv4_scale1_full" 847 | top: "bn4_scale1_full" 848 | param { 849 | name: "bn4_scale1_mean_full" 850 | lr_mult: 0 851 | decay_mult: 0 852 | } 853 | param { 854 | name: "bn4_scale1_var_full" 855 | lr_mult: 0 856 | decay_mult: 0 857 | } 858 | param { 859 | name: "bn4_scale1_bias_full" 860 | lr_mult: 0 861 | decay_mult: 0 862 | } 863 | batch_norm_param { 864 | use_global_stats: true 865 | } 866 | } 867 | layer { 868 | name: "bn4_scale2_full" 869 | type: "BatchNorm" 870 | bottom: "conv4_scale2_full" 871 | top: "bn4_scale2_full" 872 | param { 873 | name: "bn4_scale2_mean_full" 874 | lr_mult: 0 875 | decay_mult: 0 876 | } 877 | param { 878 | name: "bn4_scale2_var_full" 879 | lr_mult: 0 880 | decay_mult: 0 881 | } 882 | param { 883 | name: "bn4_scale2_bias_full" 884 | lr_mult: 0 885 | decay_mult: 0 886 | } 887 | batch_norm_param { 888 | use_global_stats: true 889 | } 890 | } 891 | layer { 892 | name: "bn4_scale3_full" 893 | type: "BatchNorm" 894 | bottom: "conv4_scale3_full" 895 | top: "bn4_scale3_full" 896 | param { 897 | name: "bn4_scale3_mean_full" 898 | lr_mult: 0 899 | decay_mult: 0 900 | } 901 | param { 902 | 
name: "bn4_scale3_var_full" 903 | lr_mult: 0 904 | decay_mult: 0 905 | } 906 | param { 907 | name: "bn4_scale3_bias_full" 908 | lr_mult: 0 909 | decay_mult: 0 910 | } 911 | batch_norm_param { 912 | use_global_stats: true 913 | } 914 | } 915 | layer { 916 | name: "bn4_full" 917 | type: "Concat" 918 | bottom: "bn4_scale1_full" 919 | bottom: "bn4_scale2_full" 920 | bottom: "bn4_scale3_full" 921 | top: "bn4_full" 922 | concat_param { 923 | axis: 1 924 | } 925 | } 926 | layer { 927 | name: "relu4_full" 928 | type: "ReLU" 929 | bottom: "bn4_full" 930 | top: "bn4_full" 931 | } 932 | layer { 933 | name: "pool4_full" 934 | type: "Pooling" 935 | bottom: "bn4_full" 936 | top: "pool4_full" 937 | pooling_param { 938 | pool: MAX 939 | kernel_size: 2 940 | stride: 2 941 | } 942 | } 943 | layer { 944 | name: "fc1_full" 945 | type: "InnerProduct" 946 | bottom: "pool4_full" 947 | top: "fc1_full" 948 | param { 949 | name: "fc1_w_full" 950 | lr_mult: 1 951 | decay_mult: 1 952 | } 953 | param { 954 | name: "fc1_b_full" 955 | lr_mult: 2 956 | decay_mult: 0 957 | } 958 | inner_product_param { 959 | num_output: 128 960 | weight_filler { 961 | type: "xavier" 962 | } 963 | bias_filler { 964 | type: "constant" 965 | } 966 | } 967 | } 968 | layer { 969 | name: "fc1_full_drop" 970 | type: "Dropout" 971 | bottom: "fc1_full" 972 | top: "fc1_full" 973 | dropout_param { 974 | dropout_ratio: 0.2 975 | } 976 | } 977 | 978 | layer { 979 | name: "pool2_iner" 980 | type: "Pooling" 981 | bottom: "bn2_att_iner" 982 | top: "pool2_iner" 983 | pooling_param { 984 | pool: MAX 985 | kernel_size: 2 986 | stride: 2 987 | } 988 | } 989 | 990 | layer { 991 | name: "conv3_scale1_iner" 992 | type: "Convolution" 993 | bottom: "pool2_iner" 994 | top: "conv3_scale1_iner" 995 | param { 996 | name: "conv3_scale1_w_iner" 997 | lr_mult: 1 998 | decay_mult: 1 999 | } 1000 | param { 1001 | name: "conv3_scale1_b_iner" 1002 | lr_mult: 2 1003 | decay_mult: 0 1004 | } 1005 | convolution_param { 1006 | num_output: 32 1007 | pad: 1 1008 | kernel_size: 3 1009 | stride: 1 1010 | weight_filler { 1011 | type: "xavier" 1012 | } 1013 | bias_filler { 1014 | type: "constant" 1015 | } 1016 | dilation: 1 1017 | } 1018 | } 1019 | layer { 1020 | name: "conv3_scale2_iner" 1021 | type: "Convolution" 1022 | bottom: "pool2_iner" 1023 | top: "conv3_scale2_iner" 1024 | param { 1025 | name: "conv3_scale2_w_iner" 1026 | lr_mult: 1 1027 | decay_mult: 1 1028 | } 1029 | param { 1030 | name: "conv3_scale2_b_iner" 1031 | lr_mult: 2 1032 | decay_mult: 0 1033 | } 1034 | convolution_param { 1035 | num_output: 32 1036 | pad: 2 1037 | kernel_size: 3 1038 | stride: 1 1039 | weight_filler { 1040 | type: "xavier" 1041 | } 1042 | bias_filler { 1043 | type: "constant" 1044 | } 1045 | dilation: 2 1046 | } 1047 | } 1048 | layer { 1049 | name: "conv3_scale3_iner" 1050 | type: "Convolution" 1051 | bottom: "pool2_iner" 1052 | top: "conv3_scale3_iner" 1053 | param { 1054 | name: "conv3_scale3_w_iner" 1055 | lr_mult: 1 1056 | decay_mult: 1 1057 | } 1058 | param { 1059 | name: "conv3_scale3_b_iner" 1060 | lr_mult: 2 1061 | decay_mult: 0 1062 | } 1063 | convolution_param { 1064 | num_output: 32 1065 | pad: 3 1066 | kernel_size: 3 1067 | stride: 1 1068 | weight_filler { 1069 | type: "xavier" 1070 | } 1071 | bias_filler { 1072 | type: "constant" 1073 | } 1074 | dilation: 3 1075 | } 1076 | } 1077 | layer { 1078 | name: "bn3_scale1_iner" 1079 | type: "BatchNorm" 1080 | bottom: "conv3_scale1_iner" 1081 | top: "bn3_scale1_iner" 1082 | param { 1083 | name: "bn3_scale1_mean_iner" 1084 | lr_mult: 0 1085 | 
decay_mult: 0 1086 | } 1087 | param { 1088 | name: "bn3_scale1_var_iner" 1089 | lr_mult: 0 1090 | decay_mult: 0 1091 | } 1092 | param { 1093 | name: "bn3_scale1_bias_iner" 1094 | lr_mult: 0 1095 | decay_mult: 0 1096 | } 1097 | batch_norm_param { 1098 | use_global_stats: true 1099 | } 1100 | } 1101 | layer { 1102 | name: "bn3_scale2_iner" 1103 | type: "BatchNorm" 1104 | bottom: "conv3_scale2_iner" 1105 | top: "bn3_scale2_iner" 1106 | param { 1107 | name: "bn3_scale2_mean_iner" 1108 | lr_mult: 0 1109 | decay_mult: 0 1110 | } 1111 | param { 1112 | name: "bn3_scale2_var_iner" 1113 | lr_mult: 0 1114 | decay_mult: 0 1115 | } 1116 | param { 1117 | name: "bn3_scale2_bias_iner" 1118 | lr_mult: 0 1119 | decay_mult: 0 1120 | } 1121 | batch_norm_param { 1122 | use_global_stats: true 1123 | } 1124 | } 1125 | layer { 1126 | name: "bn3_scale3_iner" 1127 | type: "BatchNorm" 1128 | bottom: "conv3_scale3_iner" 1129 | top: "bn3_scale3_iner" 1130 | param { 1131 | name: "bn3_scale3_mean_iner" 1132 | lr_mult: 0 1133 | decay_mult: 0 1134 | } 1135 | param { 1136 | name: "bn3_scale3_var_iner" 1137 | lr_mult: 0 1138 | decay_mult: 0 1139 | } 1140 | param { 1141 | name: "bn3_scale3_bias_iner" 1142 | lr_mult: 0 1143 | decay_mult: 0 1144 | } 1145 | batch_norm_param { 1146 | use_global_stats: true 1147 | } 1148 | } 1149 | layer { 1150 | name: "bn3_iner" 1151 | type: "Concat" 1152 | bottom: "bn3_scale1_iner" 1153 | bottom: "bn3_scale2_iner" 1154 | bottom: "bn3_scale3_iner" 1155 | top: "bn3_iner" 1156 | concat_param { 1157 | axis: 1 1158 | } 1159 | } 1160 | layer { 1161 | name: "relu3_iner" 1162 | type: "ReLU" 1163 | bottom: "bn3_iner" 1164 | top: "bn3_iner" 1165 | } 1166 | layer { 1167 | name: "pool3_iner" 1168 | type: "Pooling" 1169 | bottom: "bn3_iner" 1170 | top: "pool3_iner" 1171 | pooling_param { 1172 | pool: MAX 1173 | kernel_size: 2 1174 | stride: 2 1175 | } 1176 | } 1177 | layer { 1178 | name: "conv4_scale1_iner" 1179 | type: "Convolution" 1180 | bottom: "pool3_iner" 1181 | top: "conv4_scale1_iner" 1182 | param { 1183 | name: "conv4_scale1_w_iner" 1184 | lr_mult: 1 1185 | decay_mult: 1 1186 | } 1187 | param { 1188 | name: "conv4_scale1_b_iner" 1189 | lr_mult: 2 1190 | decay_mult: 0 1191 | } 1192 | convolution_param { 1193 | num_output: 32 1194 | pad: 1 1195 | kernel_size: 3 1196 | stride: 1 1197 | weight_filler { 1198 | type: "xavier" 1199 | } 1200 | bias_filler { 1201 | type: "constant" 1202 | } 1203 | dilation: 1 1204 | } 1205 | } 1206 | layer { 1207 | name: "conv4_scale2_iner" 1208 | type: "Convolution" 1209 | bottom: "pool3_iner" 1210 | top: "conv4_scale2_iner" 1211 | param { 1212 | name: "conv4_scale2_w_iner" 1213 | lr_mult: 1 1214 | decay_mult: 1 1215 | } 1216 | param { 1217 | name: "conv4_scale2_b_iner" 1218 | lr_mult: 2 1219 | decay_mult: 0 1220 | } 1221 | convolution_param { 1222 | num_output: 32 1223 | pad: 2 1224 | kernel_size: 3 1225 | stride: 1 1226 | weight_filler { 1227 | type: "xavier" 1228 | } 1229 | bias_filler { 1230 | type: "constant" 1231 | } 1232 | dilation: 2 1233 | } 1234 | } 1235 | layer { 1236 | name: "conv4_scale3_iner" 1237 | type: "Convolution" 1238 | bottom: "pool3_iner" 1239 | top: "conv4_scale3_iner" 1240 | param { 1241 | name: "conv4_scale3_w_iner" 1242 | lr_mult: 1 1243 | decay_mult: 1 1244 | } 1245 | param { 1246 | name: "conv4_scale3_b_iner" 1247 | lr_mult: 2 1248 | decay_mult: 0 1249 | } 1250 | convolution_param { 1251 | num_output: 32 1252 | pad: 3 1253 | kernel_size: 3 1254 | stride: 1 1255 | weight_filler { 1256 | type: "xavier" 1257 | } 1258 | bias_filler { 1259 | type: 
"constant" 1260 | } 1261 | dilation: 3 1262 | } 1263 | } 1264 | layer { 1265 | name: "bn4_scale1_iner" 1266 | type: "BatchNorm" 1267 | bottom: "conv4_scale1_iner" 1268 | top: "bn4_scale1_iner" 1269 | param { 1270 | name: "bn4_scale1_mean_iner" 1271 | lr_mult: 0 1272 | decay_mult: 0 1273 | } 1274 | param { 1275 | name: "bn4_scale1_var_iner" 1276 | lr_mult: 0 1277 | decay_mult: 0 1278 | } 1279 | param { 1280 | name: "bn4_scale1_bias_iner" 1281 | lr_mult: 0 1282 | decay_mult: 0 1283 | } 1284 | batch_norm_param { 1285 | use_global_stats: true 1286 | } 1287 | } 1288 | layer { 1289 | name: "bn4_scale2_iner" 1290 | type: "BatchNorm" 1291 | bottom: "conv4_scale2_iner" 1292 | top: "bn4_scale2_iner" 1293 | param { 1294 | name: "bn4_scale2_mean_iner" 1295 | lr_mult: 0 1296 | decay_mult: 0 1297 | } 1298 | param { 1299 | name: "bn4_scale2_var_iner" 1300 | lr_mult: 0 1301 | decay_mult: 0 1302 | } 1303 | param { 1304 | name: "bn4_scale2_bias_iner" 1305 | lr_mult: 0 1306 | decay_mult: 0 1307 | } 1308 | batch_norm_param { 1309 | use_global_stats: true 1310 | } 1311 | } 1312 | layer { 1313 | name: "bn4_scale3_iner" 1314 | type: "BatchNorm" 1315 | bottom: "conv4_scale3_iner" 1316 | top: "bn4_scale3_iner" 1317 | param { 1318 | name: "bn4_scale3_mean_iner" 1319 | lr_mult: 0 1320 | decay_mult: 0 1321 | } 1322 | param { 1323 | name: "bn4_scale3_var_iner" 1324 | lr_mult: 0 1325 | decay_mult: 0 1326 | } 1327 | param { 1328 | name: "bn4_scale3_bias_iner" 1329 | lr_mult: 0 1330 | decay_mult: 0 1331 | } 1332 | batch_norm_param { 1333 | use_global_stats: true 1334 | } 1335 | } 1336 | layer { 1337 | name: "bn4_iner" 1338 | type: "Concat" 1339 | bottom: "bn4_scale1_iner" 1340 | bottom: "bn4_scale2_iner" 1341 | bottom: "bn4_scale3_iner" 1342 | top: "bn4_iner" 1343 | concat_param { 1344 | axis: 1 1345 | } 1346 | } 1347 | layer { 1348 | name: "relu4_iner" 1349 | type: "ReLU" 1350 | bottom: "bn4_iner" 1351 | top: "bn4_iner" 1352 | } 1353 | layer { 1354 | name: "pool4_iner" 1355 | type: "Pooling" 1356 | bottom: "bn4_iner" 1357 | top: "pool4_iner" 1358 | pooling_param { 1359 | pool: MAX 1360 | kernel_size: 2 1361 | stride: 2 1362 | } 1363 | } 1364 | layer { 1365 | name: "fc1_iner" 1366 | type: "InnerProduct" 1367 | bottom: "pool4_iner" 1368 | top: "fc1_iner" 1369 | param { 1370 | name: "fc1_w_iner" 1371 | lr_mult: 1 1372 | decay_mult: 1 1373 | } 1374 | param { 1375 | name: "fc1_b_iner" 1376 | lr_mult: 2 1377 | decay_mult: 0 1378 | } 1379 | inner_product_param { 1380 | num_output: 128 1381 | weight_filler { 1382 | type: "xavier" 1383 | } 1384 | bias_filler { 1385 | type: "constant" 1386 | } 1387 | } 1388 | } 1389 | layer { 1390 | name: "fc1_iner_drop" 1391 | type: "Dropout" 1392 | bottom: "fc1_iner" 1393 | top: "fc1_iner" 1394 | dropout_param { 1395 | dropout_ratio: 0.2 1396 | } 1397 | } 1398 | 1399 | layer { 1400 | name: "pool2_exter" 1401 | type: "Pooling" 1402 | bottom: "bn2_att_exter" 1403 | top: "pool2_exter" 1404 | pooling_param { 1405 | pool: MAX 1406 | kernel_size: 2 1407 | stride: 2 1408 | } 1409 | } 1410 | 1411 | layer { 1412 | name: "conv3_scale1_exter" 1413 | type: "Convolution" 1414 | bottom: "pool2_exter" 1415 | top: "conv3_scale1_exter" 1416 | param { 1417 | name: "conv3_scale1_w_exter" 1418 | lr_mult: 1 1419 | decay_mult: 1 1420 | } 1421 | param { 1422 | name: "conv3_scale1_b_exter" 1423 | lr_mult: 2 1424 | decay_mult: 0 1425 | } 1426 | convolution_param { 1427 | num_output: 32 1428 | pad: 1 1429 | kernel_size: 3 1430 | stride: 1 1431 | weight_filler { 1432 | type: "xavier" 1433 | } 1434 | bias_filler { 1435 | 
type: "constant" 1436 | } 1437 | dilation: 1 1438 | } 1439 | } 1440 | layer { 1441 | name: "conv3_scale2_exter" 1442 | type: "Convolution" 1443 | bottom: "pool2_exter" 1444 | top: "conv3_scale2_exter" 1445 | param { 1446 | name: "conv3_scale2_w_exter" 1447 | lr_mult: 1 1448 | decay_mult: 1 1449 | } 1450 | param { 1451 | name: "conv3_scale2_b_exter" 1452 | lr_mult: 2 1453 | decay_mult: 0 1454 | } 1455 | convolution_param { 1456 | num_output: 32 1457 | pad: 2 1458 | kernel_size: 3 1459 | stride: 1 1460 | weight_filler { 1461 | type: "xavier" 1462 | } 1463 | bias_filler { 1464 | type: "constant" 1465 | } 1466 | dilation: 2 1467 | } 1468 | } 1469 | layer { 1470 | name: "conv3_scale3_exter" 1471 | type: "Convolution" 1472 | bottom: "pool2_exter" 1473 | top: "conv3_scale3_exter" 1474 | param { 1475 | name: "conv3_scale3_w_exter" 1476 | lr_mult: 1 1477 | decay_mult: 1 1478 | } 1479 | param { 1480 | name: "conv3_scale3_b_exter" 1481 | lr_mult: 2 1482 | decay_mult: 0 1483 | } 1484 | convolution_param { 1485 | num_output: 32 1486 | pad: 3 1487 | kernel_size: 3 1488 | stride: 1 1489 | weight_filler { 1490 | type: "xavier" 1491 | } 1492 | bias_filler { 1493 | type: "constant" 1494 | } 1495 | dilation: 3 1496 | } 1497 | } 1498 | layer { 1499 | name: "bn3_scale1_exter" 1500 | type: "BatchNorm" 1501 | bottom: "conv3_scale1_exter" 1502 | top: "bn3_scale1_exter" 1503 | param { 1504 | name: "bn3_scale1_mean_exter" 1505 | lr_mult: 0 1506 | decay_mult: 0 1507 | } 1508 | param { 1509 | name: "bn3_scale1_var_exter" 1510 | lr_mult: 0 1511 | decay_mult: 0 1512 | } 1513 | param { 1514 | name: "bn3_scale1_bias_exter" 1515 | lr_mult: 0 1516 | decay_mult: 0 1517 | } 1518 | batch_norm_param { 1519 | use_global_stats: true 1520 | } 1521 | } 1522 | layer { 1523 | name: "bn3_scale2_exter" 1524 | type: "BatchNorm" 1525 | bottom: "conv3_scale2_exter" 1526 | top: "bn3_scale2_exter" 1527 | param { 1528 | name: "bn3_scale2_mean_exter" 1529 | lr_mult: 0 1530 | decay_mult: 0 1531 | } 1532 | param { 1533 | name: "bn3_scale2_var_exter" 1534 | lr_mult: 0 1535 | decay_mult: 0 1536 | } 1537 | param { 1538 | name: "bn3_scale2_bias_exter" 1539 | lr_mult: 0 1540 | decay_mult: 0 1541 | } 1542 | batch_norm_param { 1543 | use_global_stats: true 1544 | } 1545 | } 1546 | layer { 1547 | name: "bn3_scale3_exter" 1548 | type: "BatchNorm" 1549 | bottom: "conv3_scale3_exter" 1550 | top: "bn3_scale3_exter" 1551 | param { 1552 | name: "bn3_scale3_mean_exter" 1553 | lr_mult: 0 1554 | decay_mult: 0 1555 | } 1556 | param { 1557 | name: "bn3_scale3_var_exter" 1558 | lr_mult: 0 1559 | decay_mult: 0 1560 | } 1561 | param { 1562 | name: "bn3_scale3_bias_exter" 1563 | lr_mult: 0 1564 | decay_mult: 0 1565 | } 1566 | batch_norm_param { 1567 | use_global_stats: true 1568 | } 1569 | } 1570 | layer { 1571 | name: "bn3_exter" 1572 | type: "Concat" 1573 | bottom: "bn3_scale1_exter" 1574 | bottom: "bn3_scale2_exter" 1575 | bottom: "bn3_scale3_exter" 1576 | top: "bn3_exter" 1577 | concat_param { 1578 | axis: 1 1579 | } 1580 | } 1581 | layer { 1582 | name: "relu3_exter" 1583 | type: "ReLU" 1584 | bottom: "bn3_exter" 1585 | top: "bn3_exter" 1586 | } 1587 | layer { 1588 | name: "pool3_exter" 1589 | type: "Pooling" 1590 | bottom: "bn3_exter" 1591 | top: "pool3_exter" 1592 | pooling_param { 1593 | pool: MAX 1594 | kernel_size: 2 1595 | stride: 2 1596 | } 1597 | } 1598 | layer { 1599 | name: "conv4_scale1_exter" 1600 | type: "Convolution" 1601 | bottom: "pool3_exter" 1602 | top: "conv4_scale1_exter" 1603 | param { 1604 | name: "conv4_scale1_w_exter" 1605 | lr_mult: 1 
1606 | decay_mult: 1 1607 | } 1608 | param { 1609 | name: "conv4_scale1_b_exter" 1610 | lr_mult: 2 1611 | decay_mult: 0 1612 | } 1613 | convolution_param { 1614 | num_output: 32 1615 | pad: 1 1616 | kernel_size: 3 1617 | stride: 1 1618 | weight_filler { 1619 | type: "xavier" 1620 | } 1621 | bias_filler { 1622 | type: "constant" 1623 | } 1624 | dilation: 1 1625 | } 1626 | } 1627 | layer { 1628 | name: "conv4_scale2_exter" 1629 | type: "Convolution" 1630 | bottom: "pool3_exter" 1631 | top: "conv4_scale2_exter" 1632 | param { 1633 | name: "conv4_scale2_w_exter" 1634 | lr_mult: 1 1635 | decay_mult: 1 1636 | } 1637 | param { 1638 | name: "conv4_scale2_b_exter" 1639 | lr_mult: 2 1640 | decay_mult: 0 1641 | } 1642 | convolution_param { 1643 | num_output: 32 1644 | pad: 2 1645 | kernel_size: 3 1646 | stride: 1 1647 | weight_filler { 1648 | type: "xavier" 1649 | } 1650 | bias_filler { 1651 | type: "constant" 1652 | } 1653 | dilation: 2 1654 | } 1655 | } 1656 | layer { 1657 | name: "conv4_scale3_exter" 1658 | type: "Convolution" 1659 | bottom: "pool3_exter" 1660 | top: "conv4_scale3_exter" 1661 | param { 1662 | name: "conv4_scale3_w_exter" 1663 | lr_mult: 1 1664 | decay_mult: 1 1665 | } 1666 | param { 1667 | name: "conv4_scale3_b_exter" 1668 | lr_mult: 2 1669 | decay_mult: 0 1670 | } 1671 | convolution_param { 1672 | num_output: 32 1673 | pad: 3 1674 | kernel_size: 3 1675 | stride: 1 1676 | weight_filler { 1677 | type: "xavier" 1678 | } 1679 | bias_filler { 1680 | type: "constant" 1681 | } 1682 | dilation: 3 1683 | } 1684 | } 1685 | layer { 1686 | name: "bn4_scale1_exter" 1687 | type: "BatchNorm" 1688 | bottom: "conv4_scale1_exter" 1689 | top: "bn4_scale1_exter" 1690 | param { 1691 | name: "bn4_scale1_mean_exter" 1692 | lr_mult: 0 1693 | decay_mult: 0 1694 | } 1695 | param { 1696 | name: "bn4_scale1_var_exter" 1697 | lr_mult: 0 1698 | decay_mult: 0 1699 | } 1700 | param { 1701 | name: "bn4_scale1_bias_exter" 1702 | lr_mult: 0 1703 | decay_mult: 0 1704 | } 1705 | batch_norm_param { 1706 | use_global_stats: true 1707 | } 1708 | } 1709 | layer { 1710 | name: "bn4_scale2_exter" 1711 | type: "BatchNorm" 1712 | bottom: "conv4_scale2_exter" 1713 | top: "bn4_scale2_exter" 1714 | param { 1715 | name: "bn4_scale2_mean_exter" 1716 | lr_mult: 0 1717 | decay_mult: 0 1718 | } 1719 | param { 1720 | name: "bn4_scale2_var_exter" 1721 | lr_mult: 0 1722 | decay_mult: 0 1723 | } 1724 | param { 1725 | name: "bn4_scale2_bias_exter" 1726 | lr_mult: 0 1727 | decay_mult: 0 1728 | } 1729 | batch_norm_param { 1730 | use_global_stats: true 1731 | } 1732 | } 1733 | layer { 1734 | name: "bn4_scale3_exter" 1735 | type: "BatchNorm" 1736 | bottom: "conv4_scale3_exter" 1737 | top: "bn4_scale3_exter" 1738 | param { 1739 | name: "bn4_scale3_mean_exter" 1740 | lr_mult: 0 1741 | decay_mult: 0 1742 | } 1743 | param { 1744 | name: "bn4_scale3_var_exter" 1745 | lr_mult: 0 1746 | decay_mult: 0 1747 | } 1748 | param { 1749 | name: "bn4_scale3_bias_exter" 1750 | lr_mult: 0 1751 | decay_mult: 0 1752 | } 1753 | batch_norm_param { 1754 | use_global_stats: true 1755 | } 1756 | } 1757 | layer { 1758 | name: "bn4_exter" 1759 | type: "Concat" 1760 | bottom: "bn4_scale1_exter" 1761 | bottom: "bn4_scale2_exter" 1762 | bottom: "bn4_scale3_exter" 1763 | top: "bn4_exter" 1764 | concat_param { 1765 | axis: 1 1766 | } 1767 | } 1768 | layer { 1769 | name: "relu4_exter" 1770 | type: "ReLU" 1771 | bottom: "bn4_exter" 1772 | top: "bn4_exter" 1773 | } 1774 | layer { 1775 | name: "pool4_exter" 1776 | type: "Pooling" 1777 | bottom: "bn4_exter" 1778 | top: 
"pool4_exter" 1779 | pooling_param { 1780 | pool: MAX 1781 | kernel_size: 2 1782 | stride: 2 1783 | } 1784 | } 1785 | layer { 1786 | name: "fc1_exter" 1787 | type: "InnerProduct" 1788 | bottom: "pool4_exter" 1789 | top: "fc1_exter" 1790 | param { 1791 | name: "fc1_w_exter" 1792 | lr_mult: 1 1793 | decay_mult: 1 1794 | } 1795 | param { 1796 | name: "fc1_b_exter" 1797 | lr_mult: 2 1798 | decay_mult: 0 1799 | } 1800 | inner_product_param { 1801 | num_output: 128 1802 | weight_filler { 1803 | type: "xavier" 1804 | } 1805 | bias_filler { 1806 | type: "constant" 1807 | } 1808 | } 1809 | } 1810 | layer { 1811 | name: "fc1_exter_drop" 1812 | type: "Dropout" 1813 | bottom: "fc1_exter" 1814 | top: "fc1_exter" 1815 | dropout_param { 1816 | dropout_ratio: 0.2 1817 | } 1818 | } -------------------------------------------------------------------------------- /evaluation/extract_feature_cuhk.py: -------------------------------------------------------------------------------- 1 | """ 2 | Extracting features with caffe models. 3 | 4 | by Chunfeng Song 5 | 6 | 2017/10/08 7 | 8 | This code is for research use only, please cite our paper: 9 | 10 | Chunfeng Song, Yan Huang, Wanli Ouyang, and Liang Wang. Mask-guided Contrastive Attention Model for Person Re-Identification. In CVPR, 2018. 11 | 12 | Contact us: chunfeng.song@nlpr.ia.ac.cn 13 | """ 14 | import caffe 15 | import numpy as n 16 | import cv2 17 | import scipy.io as spio 18 | import os 19 | 20 | def extract_feature(net,image_path, gt_path): 21 | # load image 22 | oim = cv2.imread(image_path) 23 | 24 | # resize image into caffe size 25 | inputImage = cv2.resize(oim, (64, 160)) 26 | inputImage = n.array(inputImage, dtype=n.float32) 27 | 28 | # substract mean 29 | inputImage[:, :, 0] = inputImage[:, :, 0] - 104.008 30 | inputImage[:, :, 1] = inputImage[:, :, 1] - 116.669 31 | inputImage[:, :, 2] = inputImage[:, :, 2] - 122.675 32 | 33 | # permute dimensions 34 | inputImage = inputImage.transpose([2, 0, 1]) 35 | inputImage = inputImage/256.0 36 | one_mask_ = n.zeros((1,40,16),dtype = n.float32) 37 | one_mask_ = one_mask_ + 1.0 38 | 39 | # mask 40 | mask_im= cv2.cvtColor(cv2.imread(gt_path),cv2.COLOR_BGR2GRAY) 41 | inputmask = n.array(cv2.resize(mask_im, (64, 160)), dtype=n.float32) 42 | inputmask = inputmask-127.5 43 | inputmask = inputmask/255.0 44 | inputmask = inputmask[n.newaxis, ...] 45 | 46 | #caffe forward 47 | net.blobs['data'].reshape(1,*inputImage.shape) 48 | net.blobs['one_mask'].reshape(1,*one_mask_.shape) 49 | net.blobs['mask'].reshape(1,*inputmask.shape) 50 | net.blobs['data'].data[...] = inputImage 51 | net.blobs['one_mask'].data[...] = one_mask_ 52 | net.blobs['mask'].data[...] 
= inputmask 53 | net.forward() 54 | 55 | #caffe output 56 | feature = n.squeeze(net.blobs['fc1_full'].data) 57 | return feature 58 | 59 | if __name__ == '__main__': 60 | pass 61 | prefix = 'labeled' #'labeled' or 'detected' 62 | prefix_2 = None #None or 'siamese' 63 | gpu_id = 0 64 | if prefix_2 is None: 65 | model_data = '../experiments/cuhk03-'+prefix+'/mgcam_iter_75000.caffemodel' 66 | else: 67 | model_data = '../experiments/cuhk03-'+prefix+'/mgcam_'+prefix_2+'_iter_20000.caffemodel' 68 | fea_dims = 128 69 | model_config = './deploy_mgcam.prototxt' 70 | caffe.set_mode_gpu() 71 | caffe.set_device(gpu_id) 72 | net = caffe.Net(model_config, model_data, caffe.TEST) 73 | image_path = '../data/cuhk03/cuhk03_'+prefix 74 | mask_path = '../data/cuhk03/cuhk03_'+ prefix + '_seg' 75 | # list images 76 | image_list = n.sort(os.listdir(image_path)) 77 | feature_all = n.zeros((fea_dims,len(image_list)),n.single) 78 | now = 0 79 | for item in image_list: 80 | if item.lower().endswith('.png'): 81 | this_image_path = os.path.join(image_path,item) 82 | this_mask_path = os.path.join(mask_path,item) 83 | this_fea = extract_feature(net,this_image_path,this_mask_path) 84 | feature_all[:,now] = this_fea[:] 85 | now +=1 86 | print '---->%04d of %d with %s is done!'%(now,len(image_list),item) 87 | if prefix_2 is None: 88 | spio.savemat('cuhk03-fea-' + prefix, {'feat': feature_all}) 89 | else: 90 | spio.savemat('cuhk03-fea-' + prefix + '-' + prefix_2, {'feat': feature_all}) 91 | -------------------------------------------------------------------------------- /evaluation/extract_feature_market.py: -------------------------------------------------------------------------------- 1 | """ 2 | Extracting features with caffe models. 3 | 4 | by Chunfeng Song 5 | 6 | 2017/10/08 7 | 8 | This code is for research use only, please cite our paper: 9 | 10 | Chunfeng Song, Yan Huang, Wanli Ouyang, and Liang Wang. Mask-guided Contrastive Attention Model for Person Re-Identification. In CVPR, 2018. 11 | 12 | Contact us: chunfeng.song@nlpr.ia.ac.cn 13 | """ 14 | import caffe 15 | import numpy as n 16 | import cv2 17 | import scipy.io as spio 18 | import os 19 | 20 | def extract_feature(net,image_path, gt_path): 21 | # load image 22 | oim = cv2.imread(image_path) 23 | 24 | # resize image into caffe size 25 | inputImage = cv2.resize(oim, (64, 160)) 26 | inputImage = n.array(inputImage, dtype=n.float32) 27 | 28 | # subtract mean 29 | inputImage[:, :, 0] = inputImage[:, :, 0] - 104.008 30 | inputImage[:, :, 1] = inputImage[:, :, 1] - 116.669 31 | inputImage[:, :, 2] = inputImage[:, :, 2] - 122.675 32 | 33 | # permute dimensions 34 | inputImage = inputImage.transpose([2, 0, 1]) 35 | inputImage = inputImage/256.0 36 | one_mask_ = n.zeros((1,40,16),dtype = n.float32) 37 | one_mask_ = one_mask_ + 1.0 38 | 39 | # mask 40 | mask_im= cv2.cvtColor(cv2.imread(gt_path),cv2.COLOR_BGR2GRAY) 41 | inputmask = n.array(cv2.resize(mask_im, (64, 160)), dtype=n.float32) 42 | inputmask = inputmask-127.5 43 | inputmask = inputmask/255.0 44 | inputmask = inputmask[n.newaxis, ...] 45 | 46 | #caffe forward 47 | net.blobs['data'].reshape(1,*inputImage.shape) 48 | net.blobs['one_mask'].reshape(1,*one_mask_.shape) 49 | net.blobs['mask'].reshape(1,*inputmask.shape) 50 | net.blobs['data'].data[...] = inputImage 51 | net.blobs['one_mask'].data[...] = one_mask_ 52 | net.blobs['mask'].data[...]
= inputmask 53 | net.forward() 54 | 55 | #caffe output 56 | feature = n.squeeze(net.blobs['fc1_full'].data) 57 | return feature 58 | 59 | if __name__ == '__main__': 60 | pass 61 | prefix_list = ['query', 'bounding_box_test','bounding_box_train'] 62 | prefix_2 = None #None or 'siamese' 63 | gpu_id = 0 64 | if prefix_2 is None: 65 | model_data = '../experiments/market1501/mgcam_iter_75000.caffemodel' 66 | else: 67 | model_data = '../experiments/market1501/mgcam_'+prefix_2+'_iter_20000.caffemodel' 68 | fea_dims = 128 69 | model_config = './deploy_mgcam.prototxt' 70 | data_path = '../data/market1501/' 71 | caffe.set_mode_gpu() 72 | caffe.set_device(gpu_id) 73 | net = caffe.Net(model_config, model_data, caffe.TEST) 74 | for prefix in prefix_list: 75 | image_path = os.path.join(data_path, prefix) 76 | mask_path = os.path.join(data_path, prefix + '_seg') 77 | # list images 78 | image_list = n.sort(os.listdir(image_path)) 79 | length = len([pic for pic in image_list if pic.lower().endswith('.jpg')]) # count only the .jpg images, so non-image files cannot break the matrix size 80 | feature_all = n.zeros((fea_dims,length),n.single) 81 | now = 0 82 | for item in image_list: 83 | if item.lower().endswith('.jpg'): 84 | this_image_path = os.path.join(image_path,item) 85 | this_mask_path = os.path.join(mask_path,item[:-4] + '.png') 86 | this_fea = extract_feature(net,this_image_path,this_mask_path) 87 | feature_all[:,now] = this_fea[:] 88 | now +=1 89 | print '---->%04d of %d with %s is done!'%(now,length,item) 90 | if prefix_2 is None: 91 | spio.savemat('market-fea-' + prefix, {'feat': feature_all}) 92 | else: 93 | spio.savemat('market-fea-' + prefix +'-'+ prefix_2, {'feat': feature_all}) 94 | -------------------------------------------------------------------------------- /evaluation/extract_feature_mars.py: -------------------------------------------------------------------------------- 1 | """ 2 | Extracting features with caffe models. 3 | 4 | by Chunfeng Song 5 | 6 | 2017/10/08 7 | 8 | This code is for research use only, please cite our paper: 9 | 10 | Chunfeng Song, Yan Huang, Wanli Ouyang, and Liang Wang. Mask-guided Contrastive Attention Model for Person Re-Identification. In CVPR, 2018. 11 | 12 | Contact us: chunfeng.song@nlpr.ia.ac.cn 13 | """ 14 | import caffe 15 | import numpy as n 16 | import cv2 17 | import scipy.io as spio 18 | import os 19 | 20 | def extract_feature(net,image_path, gt_path): 21 | # load image 22 | oim = cv2.imread(image_path) 23 | 24 | # resize image into caffe size 25 | inputImage = cv2.resize(oim, (64, 160)) 26 | inputImage = n.array(inputImage, dtype=n.float32) 27 | 28 | # subtract mean 29 | inputImage[:, :, 0] = inputImage[:, :, 0] - 104.008 30 | inputImage[:, :, 1] = inputImage[:, :, 1] - 116.669 31 | inputImage[:, :, 2] = inputImage[:, :, 2] - 122.675 32 | 33 | # permute dimensions 34 | inputImage = inputImage.transpose([2, 0, 1]) 35 | inputImage = inputImage/256.0 36 | one_mask_ = n.zeros((1,40,16),dtype = n.float32) 37 | one_mask_ = one_mask_ + 1.0 38 | 39 | # mask 40 | mask_im= cv2.cvtColor(cv2.imread(gt_path),cv2.COLOR_BGR2GRAY) 41 | inputmask = n.array(cv2.resize(mask_im, (64, 160)), dtype=n.float32) 42 | inputmask = inputmask-127.5 43 | inputmask = inputmask/255.0 44 | inputmask = inputmask[n.newaxis, ...] 45 | 46 | #caffe forward 47 | net.blobs['data'].reshape(1,*inputImage.shape) 48 | net.blobs['one_mask'].reshape(1,*one_mask_.shape) 49 | net.blobs['mask'].reshape(1,*inputmask.shape) 50 | net.blobs['data'].data[...] = inputImage 51 | net.blobs['one_mask'].data[...] = one_mask_ 52 | net.blobs['mask'].data[...]
= inputmask 53 | net.forward() 54 | 55 | #caffe output 56 | feature = n.squeeze(net.blobs['fc1_full'].data) 57 | return feature 58 | 59 | if __name__ == '__main__': 60 | pass 61 | prefix_list = ['train', 'test'] 62 | prefix_2 = None #None or 'siamese' 63 | gpu_id = 0 64 | if prefix_2 is None: 65 | model_data = '../experiments/mars/mgcam_iter_75000.caffemodel' 66 | else: 67 | model_data = '../experiments/mars/mgcam_'+prefix_2+'_iter_20000.caffemodel' 68 | fea_dims = 128 69 | model_config = './deploy_mgcam.prototxt' 70 | data_path = '../data/mars/' 71 | mars_eval_info_path = '../data/mars/MARS-evaluation-master/info/' # The evaluation info can be downloaded from: https://github.com/liangzheng06/MARS-evaluation 72 | caffe.set_mode_gpu() 73 | caffe.set_device(gpu_id) 74 | net = caffe.Net(model_config, model_data, caffe.TEST) 75 | for prefix in prefix_list: 76 | image_path = os.path.join(mars_eval_info_path, prefix + '_name.txt') 77 | files = open(image_path, 'r') 78 | img_list = files.readlines() 79 | data_num = len(img_list) 80 | feature_all = n.zeros((fea_dims,data_num),n.single) 81 | for i in xrange(data_num): 82 | this_line = img_list[i] 83 | this_path = os.path.join(data_path, 'bbox_'+prefix, this_line[:4], this_line[:-2]) 84 | this_gt_path = os.path.join(data_path, 'bbox_'+prefix+'_seg', this_line[:4], this_line[:-6] + '.png') 85 | if not os.path.exists(this_path) or not os.path.exists(this_gt_path): 86 | import pdb 87 | pdb.set_trace() 88 | print 'ERROR!!!' 89 | break 90 | this_fea = extract_feature(net,this_path,this_gt_path) 91 | feature_all[:,i] = this_fea[:] 92 | if i%1000==0: 93 | print '---->%d of %d is done!'%(i,data_num) 94 | files.close() # close the file handle (the original 'files.close' without parentheses was a no-op) 95 | if prefix_2 is None: 96 | spio.savemat('mars-fea-' + prefix, {prefix: feature_all}) 97 | else: 98 | spio.savemat('mars-fea-' + prefix + '-' + prefix_2, {prefix: feature_all}) 99 | print 'Extracting feature of %s is done!' %prefix -------------------------------------------------------------------------------- /experiments/cuhk03-detected/layers.py: -------------------------------------------------------------------------------- 1 | """ 2 | MGCAM data layer. 3 | 4 | by Chunfeng Song 5 | 6 | 2017/10/08 7 | 8 | This code is for research use only, please cite our paper: 9 | 10 | Chunfeng Song, Yan Huang, Wanli Ouyang, and Liang Wang. Mask-guided Contrastive Attention Model for Person Re-Identification. In CVPR, 2018. 11 | 12 | Contact us: chunfeng.song@nlpr.ia.ac.cn 13 | """ 14 | 15 | import caffe 16 | import numpy as np 17 | import yaml 18 | from random import shuffle 19 | import numpy.random as nr 20 | import cv2 21 | import os 22 | import pickle as cPickle 23 | import pdb 24 | 25 | def mypickle(filename, data): 26 | fo = open(filename, "wb") 27 | cPickle.dump(data, fo, protocol=cPickle.HIGHEST_PROTOCOL) 28 | fo.close() 29 | 30 | def myunpickle(filename): 31 | if not os.path.exists(filename): 32 | raise IOError("Path '%s' does not exist." % filename) # the original raised an undefined UnpickleError 33 | 34 | fo = open(filename, 'rb') 35 | dict = cPickle.load(fo) 36 | fo.close() 37 | return dict 38 | 39 | class MGCAM_DataLayer(caffe.Layer): 40 | """Data layer for training""" 41 | def setup(self, bottom, top): 42 | self.width = 64 43 | self.height = 160 # We resize all images into a size of 160*64. 44 | self.width_gt = 16 45 | self.height_gt = 40 # We resize all masks which are used to supervise attention learning into a size of 40*16.
46 | layer_params = yaml.load(self.param_str) 47 | self.batch_size = layer_params['batch_size'] 48 | self.im_path = layer_params['im_path'] 49 | self.gt_path = layer_params['gt_path'] 50 | self.dataset = layer_params['dataset'] 51 | self.labels, self.im_list, self.gt_list, self.im_dir_list, self.gt_dir_list = self.data_processor(self.dataset) 52 | self.idx = 0 53 | self.data_num = len(self.im_list) # Number of data pairs 54 | self.rnd_list = np.arange(self.data_num) # Random the images list 55 | shuffle(self.rnd_list) 56 | 57 | def forward(self, bottom, top): 58 | # Assign forward data 59 | top[0].data[...] = self.im 60 | top[1].data[...] = self.inner_label 61 | top[2].data[...] = self.exter_label 62 | top[3].data[...] = self.one_mask 63 | top[4].data[...] = self.label 64 | top[5].data[...] = self.label_plus 65 | top[6].data[...] = self.gt 66 | top[7].data[...] = self.mask 67 | 68 | def backward(self, top, propagate_down, bottom): 69 | """This layer does not propagate gradients.""" 70 | pass 71 | 72 | def reshape(self, bottom, top): 73 | # Load image + label image pairs 74 | self.im = [] 75 | self.label = [] 76 | self.inner_label = [] 77 | self.exter_label = [] 78 | self.one_mask = [] 79 | self.label_plus = [] 80 | self.gt = [] 81 | self.mask = [] 82 | 83 | for i in xrange(self.batch_size): 84 | if self.idx == self.data_num: 85 | self.idx = 0 86 | shuffle(self.rnd_list) #Randomly shuffle the list. 87 | cur_idx = self.rnd_list[self.idx] 88 | im_path = self.im_list[cur_idx] 89 | gt_path = self.gt_list[cur_idx] 90 | im_, gt_, mask_= self.load_data(im_path, gt_path) 91 | self.im.append(im_) 92 | self.gt.append(gt_) 93 | self.mask.append(mask_) 94 | self.label.append(self.labels[cur_idx]) 95 | self.inner_label.append(int(1)) 96 | self.exter_label.append(int(0)) 97 | one_mask_ = np.zeros((1,40,16),dtype = np.float32) 98 | one_mask_ = one_mask_ + 1.0 99 | self.one_mask.append(one_mask_) 100 | self.label_plus.append(self.labels[cur_idx]) #Here, we also give the ID-labels to background-stream. 
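# The eight tops filled above correspond to the blobs 'data', 'sim_iner', 'sim_exter', 'one_mask', 'label', 'label_plus', 'gt' and 'mask' in mgcam_train.prototxt: 'gt' (40x16) supervises the attention map through the Euclidean 'loss_seg', 'one_mask' is an all-ones map used to invert the predicted attention mask, and the inner/exter similarity labels serve as targets for the region-contrastive terms described in the paper.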
101 | self.idx +=1 102 | 103 | self.im = np.array(self.im).astype(np.float32) 104 | self.inner_label = np.array(self.inner_label).astype(np.float32) 105 | self.exter_label = np.array(self.exter_label).astype(np.float32) 106 | self.one_mask = np.array(self.one_mask).astype(np.float32) 107 | self.label = np.array(self.label).astype(np.float32) 108 | self.label_plus = np.array(self.label_plus).astype(np.float32) 109 | self.gt = np.array(self.gt).astype(np.float32) 110 | self.mask = np.array(self.mask).astype(np.float32) 111 | # Reshape tops to fit blobs 112 | top[0].reshape(*self.im.shape) 113 | top[1].reshape(*self.inner_label.shape) 114 | top[2].reshape(*self.exter_label.shape) 115 | top[3].reshape(*self.one_mask.shape) 116 | top[4].reshape(*self.label.shape) 117 | top[5].reshape(*self.label_plus.shape) 118 | top[6].reshape(*self.gt.shape) 119 | top[7].reshape(*self.mask.shape) 120 | 121 | def data_processor(self, data_name): 122 | data_dic = './' + data_name 123 | if not os.path.exists(data_dic): 124 | im_list = [] 125 | gt_list = [] 126 | labels = [] 127 | im_dir_list = [] 128 | gt_dir_list = [] 129 | new_id = 0 130 | id_list = np.sort(os.listdir(self.im_path)) 131 | for id in id_list: 132 | im_dir = os.path.join(self.im_path, id) 133 | gt_dir = os.path.join(self.gt_path, id) 134 | if not os.path.exists(im_dir): 135 | continue 136 | pic_im_list = np.sort(os.listdir(im_dir)) 137 | if len(pic_im_list)>1: 138 | for pic in pic_im_list: 139 | this_dir = os.path.join(self.im_path, id, pic) 140 | gt_pic = pic 141 | if not pic.lower().endswith('.png'): 142 | gt_pic = pic[:-4] + '.png' 143 | this_gt_dir = os.path.join(self.gt_path, id, gt_pic) 144 | im_list.append(this_dir) 145 | gt_list.append(this_gt_dir) 146 | labels.append(int(new_id)) 147 | new_id +=1 148 | im_dir_list.append(im_dir) 149 | gt_dir_list.append(gt_dir) 150 | dic = {'im_list':im_list,'gt_list':gt_list,'labels':labels,'im_dir_list':im_dir_list,'gt_dir_list':gt_dir_list} 151 | mypickle(data_dic, dic) 152 | # Load saved data dict to resume. 153 | else: 154 | dic = myunpickle(data_dic) 155 | im_list = dic['im_list'] 156 | gt_list = dic['gt_list'] 157 | labels = dic['labels'] 158 | im_dir_list = dic['im_dir_list'] 159 | gt_dir_list = dic['gt_dir_list'] 160 | return labels, im_list, gt_list, im_dir_list, gt_dir_list 161 | 162 | def load_data(self, im_path, gt_path): 163 | """ 164 | Load input image and preprocess for Caffe: 165 | - cast to float 166 | - switch channels RGB -> BGR 167 | - subtract mean 168 | - transpose to channel x height x width order 169 | """ 170 | oim = cv2.imread(im_path) 171 | inputImage = cv2.resize(oim, (self.width, self.height)) 172 | inputImage = np.array(inputImage, dtype=np.float32) 173 | 174 | # Substract mean 175 | inputImage[:, :, 0] = inputImage[:, :, 0] - 104.008 176 | inputImage[:, :, 1] = inputImage[:, :, 1] - 116.669 177 | inputImage[:, :, 2] = inputImage[:, :, 2] - 122.675 178 | 179 | # Permute dimensions 180 | if_flip = nr.randint(2) 181 | if if_flip == 0: # Also flip the image with 50% probability 182 | inputImage = inputImage[:,::-1,:] 183 | inputImage = inputImage.transpose([2, 0, 1]) 184 | inputImage = inputImage/256.0 185 | #GT 186 | mask_im= cv2.cvtColor(cv2.imread(gt_path),cv2.COLOR_BGR2GRAY) 187 | inputGt = np.array(cv2.resize(mask_im, (self.width_gt, self.height_gt)), dtype=np.float32) 188 | inputGt = inputGt/255.0 189 | if if_flip == 0: 190 | inputGt = inputGt[:,::-1] 191 | inputGt = inputGt[np.newaxis, ...] 
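# Note that the two mask-derived arrays differ on purpose: 'gt' above is 40x16 in [0, 1] and acts as the attention-map target, while 'mask' below keeps the full 160x64 input resolution, shifted to roughly [-0.5, 0.5], and is concatenated with the image as a fourth input channel (the 'data_concate' layer in the prototxt).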
192 | #Mask 193 | inputMask = np.array(cv2.resize(mask_im, (self.width, self.height)), dtype=np.float32) 194 | inputMask = inputMask-127.5 195 | inputMask = inputMask/255.0 196 | if if_flip == 0: 197 | inputMask = inputMask[:,::-1] 198 | inputMask = inputMask[np.newaxis, ...] 199 | return inputImage, inputGt, inputMask 200 | 201 | 202 | class MGCAM_SIA_DataLayer(caffe.Layer): 203 | """Data layer for training""" 204 | def setup(self, bottom, top): 205 | self.width = 64 206 | self.height = 160 # We resize all images into a size of 160*64. 207 | self.width_gt = 16 208 | self.height_gt = 40 # We resize all masks which are used to supervise attention learning into a size of 160*64. 209 | 210 | layer_params = yaml.load(self.param_str) 211 | self.batch_size = layer_params['batch_size'] 212 | self.pos_pair_num = int(0.30*self.batch_size) # There will be at least 30 percent postive pairs for each batch. 213 | self.im_path = layer_params['im_path'] 214 | self.gt_path = layer_params['gt_path'] 215 | self.dataset = layer_params['dataset'] 216 | self.labels, self.im_list, self.gt_list, self.im_dir_list, self.gt_dir_list = self.data_processor(self.dataset) 217 | self.idx = 0 218 | self.data_num = len(self.im_list) 219 | self.rnd_list = np.arange(self.data_num) 220 | shuffle(self.rnd_list) 221 | 222 | def forward(self, bottom, top): 223 | # Assign forward data 224 | top[0].data[...] = self.im 225 | top[1].data[...] = self.inner_label 226 | top[2].data[...] = self.exter_label 227 | top[3].data[...] = self.one_mask 228 | top[4].data[...] = self.label 229 | top[5].data[...] = self.label_plus 230 | top[6].data[...] = self.gt 231 | top[7].data[...] = self.mask 232 | top[8].data[...] = self.siam_label 233 | 234 | def backward(self, top, propagate_down, bottom): 235 | """This layer does not propagate gradients.""" 236 | pass 237 | 238 | def reshape(self, bottom, top): 239 | # Load image + label image pairs 240 | self.im = [] 241 | self.label = [] 242 | self.inner_label = [] 243 | self.exter_label = [] 244 | self.one_mask = [] 245 | self.label_plus = [] 246 | self.gt = [] 247 | self.mask = [] 248 | self.siam_label = [] 249 | 250 | for i in xrange(self.batch_size): 251 | if self.idx == self.data_num: 252 | self.idx = 0 253 | shuffle(self.rnd_list) 254 | cur_idx = self.rnd_list[self.idx] 255 | im_path = self.im_list[cur_idx] 256 | gt_path = self.gt_list[cur_idx] 257 | im_, gt_, mask_= self.load_data(im_path, gt_path) 258 | self.im.append(im_) 259 | self.gt.append(gt_) 260 | self.mask.append(mask_) 261 | self.label.append(self.labels[cur_idx]) 262 | self.inner_label.append(int(1)) 263 | self.exter_label.append(int(0)) 264 | one_mask_ = np.zeros((1,40,16),dtype = np.float32) 265 | one_mask_ = one_mask_ + 1.0 266 | self.one_mask.append(one_mask_) 267 | self.label_plus.append(self.labels[cur_idx])#Labels for backgrounds. We use the same labels with other two regions here. 268 | self.idx +=1 269 | 270 | for i in xrange(self.batch_size): 271 | if i > self.pos_pair_num: 272 | if self.idx == self.data_num: 273 | self.idx = 0 274 | shuffle(self.rnd_list)#Randomly shuffle the list. 275 | cur_idx = self.rnd_list[self.idx] 276 | self.idx +=1 277 | im_path = self.im_list[cur_idx] 278 | gt_path = self.gt_list[cur_idx] 279 | label = self.labels[cur_idx] 280 | if label==self.label[i]: 281 | self.siam_label.append(int(1))#In case of getting postive pairs, maybe not much. 282 | else: 283 | self.siam_label.append(int(0))#Negative pairs. 
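# The else-branch below handles the first samples of the batch (i <= pos_pair_num): the partner image is drawn from the same identity's folder, guaranteeing a positive pair. For the remaining samples the partner is picked at random and siam_label is set by comparing the two identity labels, so positives occur there only by chance.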
284 | else: 285 | im_dir = self.im_dir_list[self.label[i]] 286 | gt_dir = self.gt_dir_list[self.label[i]] 287 | im_list = np.sort(os.listdir(im_dir)) 288 | gt_list = np.sort(os.listdir(gt_dir)) 289 | tmp_list = np.arange(len(im_list)) 290 | shuffle(tmp_list) #Randomly select one. 291 | im_path = os.path.join(im_dir, im_list[tmp_list[0]]) 292 | gt_path = os.path.join(gt_dir, gt_list[tmp_list[0]]) 293 | label = self.label[i] 294 | self.siam_label.append(int(1))#This is a postive pair. 295 | 296 | im_, gt_, mask_= self.load_data(im_path, gt_path) 297 | self.im.append(im_) 298 | self.gt.append(gt_) 299 | self.mask.append(mask_) 300 | self.label.append(label) 301 | self.inner_label.append(int(1))#Allways be ones, for constrastive learning. 302 | self.exter_label.append(int(0)) 303 | one_mask_ = np.zeros((1,40,16),dtype = np.float32) 304 | one_mask_ = one_mask_ + 1.0 305 | self.one_mask.append(one_mask_) 306 | self.label_plus.append(label) 307 | 308 | self.im = np.array(self.im).astype(np.float32) 309 | self.inner_label = np.array(self.inner_label).astype(np.float32) 310 | self.exter_label = np.array(self.exter_label).astype(np.float32) 311 | self.one_mask = np.array(self.one_mask).astype(np.float32) 312 | self.label = np.array(self.label).astype(np.float32) 313 | self.label_plus = np.array(self.label_plus).astype(np.float32) 314 | self.gt = np.array(self.gt).astype(np.float32) 315 | self.mask = np.array(self.mask).astype(np.float32) 316 | self.siam_label = np.array(self.siam_label).astype(np.float32) 317 | # Reshape tops to fit blobs 318 | top[0].reshape(*self.im.shape) 319 | top[1].reshape(*self.inner_label.shape) 320 | top[2].reshape(*self.exter_label.shape) 321 | top[3].reshape(*self.one_mask.shape) 322 | top[4].reshape(*self.label.shape) 323 | top[5].reshape(*self.label_plus.shape) 324 | top[6].reshape(*self.gt.shape) 325 | top[7].reshape(*self.mask.shape) 326 | top[8].reshape(*self.siam_label.shape) 327 | 328 | def data_processor(self, data_name): 329 | data_dic = './' + data_name 330 | if not os.path.exists(data_dic): 331 | im_list = [] 332 | gt_list = [] 333 | labels = [] 334 | im_dir_list = [] 335 | gt_dir_list = [] 336 | new_id = 0 337 | id_list = np.sort(os.listdir(self.im_path)) 338 | for id in id_list: 339 | im_dir = os.path.join(self.im_path, id) 340 | gt_dir = os.path.join(self.gt_path, id) 341 | if not os.path.exists(im_dir): 342 | continue 343 | pic_im_list = np.sort(os.listdir(im_dir)) 344 | if len(pic_im_list)>1: 345 | for pic in pic_im_list: 346 | this_dir = os.path.join(self.im_path, id, pic) 347 | gt_pic = pic 348 | if not pic.lower().endswith('.png'): 349 | gt_pic = pic[:-4] + '.png' 350 | this_gt_dir = os.path.join(self.gt_path, id, gt_pic) 351 | im_list.append(this_dir) 352 | gt_list.append(this_gt_dir) 353 | labels.append(int(new_id)) 354 | new_id +=1 355 | im_dir_list.append(im_dir) 356 | gt_dir_list.append(gt_dir) 357 | dic = {'im_list':im_list,'gt_list':gt_list,'labels':labels,'im_dir_list':im_dir_list,'gt_dir_list':gt_dir_list} 358 | mypickle(data_dic, dic) 359 | # Load saved data dict to resume. 
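# (The index built above is cached to './' + dataset, i.e. the 'dataset' value from param_str, such as './market1501' in the shipped Market-1501 prototxt; delete that file to rebuild the image/label lists after changing im_path or gt_path.)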
360 | else: 361 | dic = myunpickle(data_dic) 362 | im_list = dic['im_list'] 363 | gt_list = dic['gt_list'] 364 | labels = dic['labels'] 365 | im_dir_list = dic['im_dir_list'] 366 | gt_dir_list = dic['gt_dir_list'] 367 | return labels, im_list, gt_list, im_dir_list, gt_dir_list 368 | 369 | def load_data(self, im_path, gt_path): 370 | """ 371 | Load input image and preprocess for Caffe: 372 | - cast to float 373 | - switch channels RGB -> BGR 374 | - subtract mean 375 | - transpose to channel x height x width order 376 | """ 377 | oim = cv2.imread(im_path) 378 | inputImage = cv2.resize(oim, (self.width, self.height)) 379 | inputImage = np.array(inputImage, dtype=np.float32) 380 | 381 | # Substract mean 382 | inputImage[:, :, 0] = inputImage[:, :, 0] - 104.008 383 | inputImage[:, :, 1] = inputImage[:, :, 1] - 116.669 384 | inputImage[:, :, 2] = inputImage[:, :, 2] - 122.675 385 | 386 | # Permute dimensions 387 | if_flip = nr.randint(2) 388 | if if_flip == 0: # Also flip the image with 50% probability 389 | inputImage = inputImage[:,::-1,:] 390 | inputImage = inputImage.transpose([2, 0, 1]) 391 | inputImage = inputImage/256.0 392 | #GT 393 | mask_im= cv2.cvtColor(cv2.imread(gt_path),cv2.COLOR_BGR2GRAY) 394 | inputGt = np.array(cv2.resize(mask_im, (self.width_gt, self.height_gt)), dtype=np.float32) 395 | inputGt = inputGt/255.0 396 | if if_flip == 0: 397 | inputGt = inputGt[:,::-1] 398 | inputGt = inputGt[np.newaxis, ...] 399 | #Mask 400 | inputMask = np.array(cv2.resize(mask_im, (self.width, self.height)), dtype=np.float32) 401 | inputMask = inputMask-127.5 402 | inputMask = inputMask/255.0 403 | if if_flip == 0: 404 | inputMask = inputMask[:,::-1] 405 | inputMask = inputMask[np.newaxis, ...] 406 | return inputImage, inputGt, inputMask 407 | -------------------------------------------------------------------------------- /experiments/cuhk03-detected/mgcam_iter_75000.caffemodel: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/developfeng/MGCAM/1b5957218ecaa7f13bf2107bc41b1349c05be9c8/experiments/cuhk03-detected/mgcam_iter_75000.caffemodel -------------------------------------------------------------------------------- /experiments/cuhk03-detected/mgcam_siamese_iter_20000.caffemodel: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/developfeng/MGCAM/1b5957218ecaa7f13bf2107bc41b1349c05be9c8/experiments/cuhk03-detected/mgcam_siamese_iter_20000.caffemodel -------------------------------------------------------------------------------- /experiments/cuhk03-detected/run_mgcam.sh: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env sh 2 | LOG=./mgcam-`date +%Y-%m-%d-%H-%M-%S`.log 3 | CAFFE=/path-to-caffe/build/tools/caffe 4 | 5 | $CAFFE train --solver=./solver_mgcam.prototxt --gpu=0 2>&1 | tee $LOG 6 | -------------------------------------------------------------------------------- /experiments/cuhk03-detected/run_mgcam_siamese.sh: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env sh 2 | LOG=./mgcam-siamese`date +%Y-%m-%d-%H-%M-%S`.log 3 | CAFFE=/path-to-caffe/build/tools/caffe 4 | 5 | $CAFFE train --solver=./solver_mgcam_siamese.prototxt --weights=./mgcam_iter_75000.caffemodel --gpu=0 2>&1 | tee $LOG 6 | -------------------------------------------------------------------------------- /experiments/cuhk03-detected/solver_mgcam.prototxt: 
-------------------------------------------------------------------------------- 1 | net: "mgcam_train.prototxt" 2 | 3 | test_iter: 10 4 | test_interval: 1000 5 | base_lr: 0.01 6 | lr_policy: "step" 7 | gamma: 0.1 8 | stepsize: 15000 9 | display: 10 10 | max_iter: 75000 11 | momentum: 0.9 12 | weight_decay: 0.005 13 | snapshot: 5000 14 | snapshot_prefix: "mgcam" 15 | solver_mode: GPU 16 | -------------------------------------------------------------------------------- /experiments/cuhk03-detected/solver_mgcam_siamese.prototxt: -------------------------------------------------------------------------------- 1 | net: "mgcam_siamese_train.prototxt" 2 | 3 | test_iter: 10 4 | test_interval: 1000 5 | base_lr: 0.0001 6 | lr_policy: "step" 7 | gamma: 0.1 8 | stepsize: 10000 9 | display: 10 10 | max_iter: 20000 11 | momentum: 0.9 12 | weight_decay: 0.005 13 | snapshot: 5000 14 | snapshot_prefix: "mgcam_siamese" 15 | solver_mode: GPU 16 | -------------------------------------------------------------------------------- /experiments/cuhk03-labeled/layers.py: -------------------------------------------------------------------------------- 1 | """ 2 | MGCAM data layer. 3 | 4 | by Chunfeng Song 5 | 6 | 2017/10/08 7 | 8 | This code is for research use only, please cite our paper: 9 | 10 | Chunfeng Song, Yan Huang, Wanli Ouyang, and Liang Wang. Mask-guided Contrastive Attention Model for Person Re-Identification. In CVPR, 2018. 11 | 12 | Contact us: chunfeng.song@nlpr.ia.ac.cn 13 | """ 14 | 15 | import caffe 16 | import numpy as np 17 | import yaml 18 | from random import shuffle 19 | import numpy.random as nr 20 | import cv2 21 | import os 22 | import pickle as cPickle 23 | import pdb 24 | 25 | def mypickle(filename, data): 26 | fo = open(filename, "wb") 27 | cPickle.dump(data, fo, protocol=cPickle.HIGHEST_PROTOCOL) 28 | fo.close() 29 | 30 | def myunpickle(filename): 31 | if not os.path.exists(filename): 32 | raise IOError("Path '%s' does not exist." % filename) 33 | 34 | fo = open(filename, 'rb') 35 | data = cPickle.load(fo) 36 | fo.close() 37 | return data 38 | 39 | class MGCAM_DataLayer(caffe.Layer): 40 | """Data layer for training""" 41 | def setup(self, bottom, top): 42 | self.width = 64 43 | self.height = 160 # We resize all images into a size of 160*64. 44 | self.width_gt = 16 45 | self.height_gt = 40 # We resize all masks which are used to supervise attention learning into a size of 40*16. 46 | layer_params = yaml.load(self.param_str) 47 | self.batch_size = layer_params['batch_size'] 48 | self.im_path = layer_params['im_path'] 49 | self.gt_path = layer_params['gt_path'] 50 | self.dataset = layer_params['dataset'] 51 | self.labels, self.im_list, self.gt_list, self.im_dir_list, self.gt_dir_list = self.data_processor(self.dataset) 52 | self.idx = 0 53 | self.data_num = len(self.im_list) # Number of data pairs 54 | self.rnd_list = np.arange(self.data_num) # Random the images list 55 | shuffle(self.rnd_list) 56 | 57 | def forward(self, bottom, top): 58 | # Assign forward data 59 | top[0].data[...] = self.im 60 | top[1].data[...] = self.inner_label 61 | top[2].data[...] = self.exter_label 62 | top[3].data[...] = self.one_mask 63 | top[4].data[...] = self.label 64 | top[5].data[...] = self.label_plus 65 | top[6].data[...] = self.gt 66 | top[7].data[...]
= self.mask 67 | 68 | def backward(self, top, propagate_down, bottom): 69 | """This layer does not propagate gradients.""" 70 | pass 71 | 72 | def reshape(self, bottom, top): 73 | # Load image + label image pairs 74 | self.im = [] 75 | self.label = [] 76 | self.inner_label = [] 77 | self.exter_label = [] 78 | self.one_mask = [] 79 | self.label_plus = [] 80 | self.gt = [] 81 | self.mask = [] 82 | 83 | for i in xrange(self.batch_size): 84 | if self.idx == self.data_num: 85 | self.idx = 0 86 | shuffle(self.rnd_list) #Randomly shuffle the list. 87 | cur_idx = self.rnd_list[self.idx] 88 | im_path = self.im_list[cur_idx] 89 | gt_path = self.gt_list[cur_idx] 90 | im_, gt_, mask_= self.load_data(im_path, gt_path) 91 | self.im.append(im_) 92 | self.gt.append(gt_) 93 | self.mask.append(mask_) 94 | self.label.append(self.labels[cur_idx]) 95 | self.inner_label.append(int(1)) 96 | self.exter_label.append(int(0)) 97 | one_mask_ = np.zeros((1,40,16),dtype = np.float32) 98 | one_mask_ = one_mask_ + 1.0 99 | self.one_mask.append(one_mask_) 100 | self.label_plus.append(self.labels[cur_idx]) #Here, we also give the ID-labels to background-stream. 101 | self.idx +=1 102 | 103 | self.im = np.array(self.im).astype(np.float32) 104 | self.inner_label = np.array(self.inner_label).astype(np.float32) 105 | self.exter_label = np.array(self.exter_label).astype(np.float32) 106 | self.one_mask = np.array(self.one_mask).astype(np.float32) 107 | self.label = np.array(self.label).astype(np.float32) 108 | self.label_plus = np.array(self.label_plus).astype(np.float32) 109 | self.gt = np.array(self.gt).astype(np.float32) 110 | self.mask = np.array(self.mask).astype(np.float32) 111 | # Reshape tops to fit blobs 112 | top[0].reshape(*self.im.shape) 113 | top[1].reshape(*self.inner_label.shape) 114 | top[2].reshape(*self.exter_label.shape) 115 | top[3].reshape(*self.one_mask.shape) 116 | top[4].reshape(*self.label.shape) 117 | top[5].reshape(*self.label_plus.shape) 118 | top[6].reshape(*self.gt.shape) 119 | top[7].reshape(*self.mask.shape) 120 | 121 | def data_processor(self, data_name): 122 | data_dic = './' + data_name 123 | if not os.path.exists(data_dic): 124 | im_list = [] 125 | gt_list = [] 126 | labels = [] 127 | im_dir_list = [] 128 | gt_dir_list = [] 129 | new_id = 0 130 | id_list = np.sort(os.listdir(self.im_path)) 131 | for id in id_list: 132 | im_dir = os.path.join(self.im_path, id) 133 | gt_dir = os.path.join(self.gt_path, id) 134 | if not os.path.exists(im_dir): 135 | continue 136 | pic_im_list = np.sort(os.listdir(im_dir)) 137 | if len(pic_im_list)>1: 138 | for pic in pic_im_list: 139 | this_dir = os.path.join(self.im_path, id, pic) 140 | gt_pic = pic 141 | if not pic.lower().endswith('.png'): 142 | gt_pic = pic[:-4] + '.png' 143 | this_gt_dir = os.path.join(self.gt_path, id, gt_pic) 144 | im_list.append(this_dir) 145 | gt_list.append(this_gt_dir) 146 | labels.append(int(new_id)) 147 | new_id +=1 148 | im_dir_list.append(im_dir) 149 | gt_dir_list.append(gt_dir) 150 | dic = {'im_list':im_list,'gt_list':gt_list,'labels':labels,'im_dir_list':im_dir_list,'gt_dir_list':gt_dir_list} 151 | mypickle(data_dic, dic) 152 | # Load saved data dict to resume. 
153 | else: 154 | dic = myunpickle(data_dic) 155 | im_list = dic['im_list'] 156 | gt_list = dic['gt_list'] 157 | labels = dic['labels'] 158 | im_dir_list = dic['im_dir_list'] 159 | gt_dir_list = dic['gt_dir_list'] 160 | return labels, im_list, gt_list, im_dir_list, gt_dir_list 161 | 162 | def load_data(self, im_path, gt_path): 163 | """ 164 | Load input image and preprocess for Caffe: 165 | - cast to float 166 | - switch channels RGB -> BGR 167 | - subtract mean 168 | - transpose to channel x height x width order 169 | """ 170 | oim = cv2.imread(im_path) 171 | inputImage = cv2.resize(oim, (self.width, self.height)) 172 | inputImage = np.array(inputImage, dtype=np.float32) 173 | 174 | # Substract mean 175 | inputImage[:, :, 0] = inputImage[:, :, 0] - 104.008 176 | inputImage[:, :, 1] = inputImage[:, :, 1] - 116.669 177 | inputImage[:, :, 2] = inputImage[:, :, 2] - 122.675 178 | 179 | # Permute dimensions 180 | if_flip = nr.randint(2) 181 | if if_flip == 0: # Also flip the image with 50% probability 182 | inputImage = inputImage[:,::-1,:] 183 | inputImage = inputImage.transpose([2, 0, 1]) 184 | inputImage = inputImage/256.0 185 | #GT 186 | mask_im= cv2.cvtColor(cv2.imread(gt_path),cv2.COLOR_BGR2GRAY) 187 | inputGt = np.array(cv2.resize(mask_im, (self.width_gt, self.height_gt)), dtype=np.float32) 188 | inputGt = inputGt/255.0 189 | if if_flip == 0: 190 | inputGt = inputGt[:,::-1] 191 | inputGt = inputGt[np.newaxis, ...] 192 | #Mask 193 | inputMask = np.array(cv2.resize(mask_im, (self.width, self.height)), dtype=np.float32) 194 | inputMask = inputMask-127.5 195 | inputMask = inputMask/255.0 196 | if if_flip == 0: 197 | inputMask = inputMask[:,::-1] 198 | inputMask = inputMask[np.newaxis, ...] 199 | return inputImage, inputGt, inputMask 200 | 201 | 202 | class MGCAM_SIA_DataLayer(caffe.Layer): 203 | """Data layer for training""" 204 | def setup(self, bottom, top): 205 | self.width = 64 206 | self.height = 160 # We resize all images into a size of 160*64. 207 | self.width_gt = 16 208 | self.height_gt = 40 # We resize all masks which are used to supervise attention learning into a size of 160*64. 209 | 210 | layer_params = yaml.load(self.param_str) 211 | self.batch_size = layer_params['batch_size'] 212 | self.pos_pair_num = int(0.30*self.batch_size) # There will be at least 30 percent postive pairs for each batch. 213 | self.im_path = layer_params['im_path'] 214 | self.gt_path = layer_params['gt_path'] 215 | self.dataset = layer_params['dataset'] 216 | self.labels, self.im_list, self.gt_list, self.im_dir_list, self.gt_dir_list = self.data_processor(self.dataset) 217 | self.idx = 0 218 | self.data_num = len(self.im_list) 219 | self.rnd_list = np.arange(self.data_num) 220 | shuffle(self.rnd_list) 221 | 222 | def forward(self, bottom, top): 223 | # Assign forward data 224 | top[0].data[...] = self.im 225 | top[1].data[...] = self.inner_label 226 | top[2].data[...] = self.exter_label 227 | top[3].data[...] = self.one_mask 228 | top[4].data[...] = self.label 229 | top[5].data[...] = self.label_plus 230 | top[6].data[...] = self.gt 231 | top[7].data[...] = self.mask 232 | top[8].data[...] 
= self.siam_label 233 | 234 | def backward(self, top, propagate_down, bottom): 235 | """This layer does not propagate gradients.""" 236 | pass 237 | 238 | def reshape(self, bottom, top): 239 | # Load image + label image pairs 240 | self.im = [] 241 | self.label = [] 242 | self.inner_label = [] 243 | self.exter_label = [] 244 | self.one_mask = [] 245 | self.label_plus = [] 246 | self.gt = [] 247 | self.mask = [] 248 | self.siam_label = [] 249 | 250 | for i in xrange(self.batch_size): 251 | if self.idx == self.data_num: 252 | self.idx = 0 253 | shuffle(self.rnd_list) 254 | cur_idx = self.rnd_list[self.idx] 255 | im_path = self.im_list[cur_idx] 256 | gt_path = self.gt_list[cur_idx] 257 | im_, gt_, mask_= self.load_data(im_path, gt_path) 258 | self.im.append(im_) 259 | self.gt.append(gt_) 260 | self.mask.append(mask_) 261 | self.label.append(self.labels[cur_idx]) 262 | self.inner_label.append(int(1)) 263 | self.exter_label.append(int(0)) 264 | one_mask_ = np.zeros((1,40,16),dtype = np.float32) 265 | one_mask_ = one_mask_ + 1.0 266 | self.one_mask.append(one_mask_) 267 | self.label_plus.append(self.labels[cur_idx])#Labels for backgrounds. We use the same labels with other two regions here. 268 | self.idx +=1 269 | 270 | for i in xrange(self.batch_size): 271 | if i > self.pos_pair_num: 272 | if self.idx == self.data_num: 273 | self.idx = 0 274 | shuffle(self.rnd_list)#Randomly shuffle the list. 275 | cur_idx = self.rnd_list[self.idx] 276 | self.idx +=1 277 | im_path = self.im_list[cur_idx] 278 | gt_path = self.gt_list[cur_idx] 279 | label = self.labels[cur_idx] 280 | if label==self.label[i]: 281 | self.siam_label.append(int(1))#In case of getting postive pairs, maybe not much. 282 | else: 283 | self.siam_label.append(int(0))#Negative pairs. 284 | else: 285 | im_dir = self.im_dir_list[self.label[i]] 286 | gt_dir = self.gt_dir_list[self.label[i]] 287 | im_list = np.sort(os.listdir(im_dir)) 288 | gt_list = np.sort(os.listdir(gt_dir)) 289 | tmp_list = np.arange(len(im_list)) 290 | shuffle(tmp_list) #Randomly select one. 291 | im_path = os.path.join(im_dir, im_list[tmp_list[0]]) 292 | gt_path = os.path.join(gt_dir, gt_list[tmp_list[0]]) 293 | label = self.label[i] 294 | self.siam_label.append(int(1))#This is a postive pair. 295 | 296 | im_, gt_, mask_= self.load_data(im_path, gt_path) 297 | self.im.append(im_) 298 | self.gt.append(gt_) 299 | self.mask.append(mask_) 300 | self.label.append(label) 301 | self.inner_label.append(int(1))#Allways be ones, for constrastive learning. 
302 | self.exter_label.append(int(0)) 303 | one_mask_ = np.zeros((1,40,16),dtype = np.float32) 304 | one_mask_ = one_mask_ + 1.0 305 | self.one_mask.append(one_mask_) 306 | self.label_plus.append(label) 307 | 308 | self.im = np.array(self.im).astype(np.float32) 309 | self.inner_label = np.array(self.inner_label).astype(np.float32) 310 | self.exter_label = np.array(self.exter_label).astype(np.float32) 311 | self.one_mask = np.array(self.one_mask).astype(np.float32) 312 | self.label = np.array(self.label).astype(np.float32) 313 | self.label_plus = np.array(self.label_plus).astype(np.float32) 314 | self.gt = np.array(self.gt).astype(np.float32) 315 | self.mask = np.array(self.mask).astype(np.float32) 316 | self.siam_label = np.array(self.siam_label).astype(np.float32) 317 | # Reshape tops to fit blobs 318 | top[0].reshape(*self.im.shape) 319 | top[1].reshape(*self.inner_label.shape) 320 | top[2].reshape(*self.exter_label.shape) 321 | top[3].reshape(*self.one_mask.shape) 322 | top[4].reshape(*self.label.shape) 323 | top[5].reshape(*self.label_plus.shape) 324 | top[6].reshape(*self.gt.shape) 325 | top[7].reshape(*self.mask.shape) 326 | top[8].reshape(*self.siam_label.shape) 327 | 328 | def data_processor(self, data_name): 329 | data_dic = './' + data_name 330 | if not os.path.exists(data_dic): 331 | im_list = [] 332 | gt_list = [] 333 | labels = [] 334 | im_dir_list = [] 335 | gt_dir_list = [] 336 | new_id = 0 337 | id_list = np.sort(os.listdir(self.im_path)) 338 | for id in id_list: 339 | im_dir = os.path.join(self.im_path, id) 340 | gt_dir = os.path.join(self.gt_path, id) 341 | if not os.path.exists(im_dir): 342 | continue 343 | pic_im_list = np.sort(os.listdir(im_dir)) 344 | if len(pic_im_list)>1: 345 | for pic in pic_im_list: 346 | this_dir = os.path.join(self.im_path, id, pic) 347 | gt_pic = pic 348 | if not pic.lower().endswith('.png'): 349 | gt_pic = pic[:-4] + '.png' 350 | this_gt_dir = os.path.join(self.gt_path, id, gt_pic) 351 | im_list.append(this_dir) 352 | gt_list.append(this_gt_dir) 353 | labels.append(int(new_id)) 354 | new_id +=1 355 | im_dir_list.append(im_dir) 356 | gt_dir_list.append(gt_dir) 357 | dic = {'im_list':im_list,'gt_list':gt_list,'labels':labels,'im_dir_list':im_dir_list,'gt_dir_list':gt_dir_list} 358 | mypickle(data_dic, dic) 359 | # Load saved data dict to resume. 
360 | else: 361 | dic = myunpickle(data_dic) 362 | im_list = dic['im_list'] 363 | gt_list = dic['gt_list'] 364 | labels = dic['labels'] 365 | im_dir_list = dic['im_dir_list'] 366 | gt_dir_list = dic['gt_dir_list'] 367 | return labels, im_list, gt_list, im_dir_list, gt_dir_list 368 | 369 | def load_data(self, im_path, gt_path): 370 | """ 371 | Load input image and preprocess for Caffe: 372 | - cast to float 373 | - switch channels RGB -> BGR 374 | - subtract mean 375 | - transpose to channel x height x width order 376 | """ 377 | oim = cv2.imread(im_path) 378 | inputImage = cv2.resize(oim, (self.width, self.height)) 379 | inputImage = np.array(inputImage, dtype=np.float32) 380 | 381 | # Substract mean 382 | inputImage[:, :, 0] = inputImage[:, :, 0] - 104.008 383 | inputImage[:, :, 1] = inputImage[:, :, 1] - 116.669 384 | inputImage[:, :, 2] = inputImage[:, :, 2] - 122.675 385 | 386 | # Permute dimensions 387 | if_flip = nr.randint(2) 388 | if if_flip == 0: # Also flip the image with 50% probability 389 | inputImage = inputImage[:,::-1,:] 390 | inputImage = inputImage.transpose([2, 0, 1]) 391 | inputImage = inputImage/256.0 392 | #GT 393 | mask_im= cv2.cvtColor(cv2.imread(gt_path),cv2.COLOR_BGR2GRAY) 394 | inputGt = np.array(cv2.resize(mask_im, (self.width_gt, self.height_gt)), dtype=np.float32) 395 | inputGt = inputGt/255.0 396 | if if_flip == 0: 397 | inputGt = inputGt[:,::-1] 398 | inputGt = inputGt[np.newaxis, ...] 399 | #Mask 400 | inputMask = np.array(cv2.resize(mask_im, (self.width, self.height)), dtype=np.float32) 401 | inputMask = inputMask-127.5 402 | inputMask = inputMask/255.0 403 | if if_flip == 0: 404 | inputMask = inputMask[:,::-1] 405 | inputMask = inputMask[np.newaxis, ...] 406 | return inputImage, inputGt, inputMask 407 | -------------------------------------------------------------------------------- /experiments/cuhk03-labeled/mgcam_iter_75000.caffemodel: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/developfeng/MGCAM/1b5957218ecaa7f13bf2107bc41b1349c05be9c8/experiments/cuhk03-labeled/mgcam_iter_75000.caffemodel -------------------------------------------------------------------------------- /experiments/cuhk03-labeled/mgcam_siamese_iter_20000.caffemodel: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/developfeng/MGCAM/1b5957218ecaa7f13bf2107bc41b1349c05be9c8/experiments/cuhk03-labeled/mgcam_siamese_iter_20000.caffemodel -------------------------------------------------------------------------------- /experiments/cuhk03-labeled/run_mgcam.sh: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env sh 2 | LOG=./mgcam-`date +%Y-%m-%d-%H-%M-%S`.log 3 | CAFFE=/path-to-caffe/build/tools/caffe 4 | 5 | $CAFFE train --solver=./solver_mgcam.prototxt --gpu=0 2>&1 | tee $LOG 6 | -------------------------------------------------------------------------------- /experiments/cuhk03-labeled/run_mgcam_siamese.sh: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env sh 2 | LOG=./mgcam-siamese`date +%Y-%m-%d-%H-%M-%S`.log 3 | CAFFE=/path-to-caffe/build/tools/caffe 4 | 5 | $CAFFE train --solver=./solver_mgcam_siamese.prototxt --weights=./mgcam_iter_75000.caffemodel --gpu=0 2>&1 | tee $LOG 6 | -------------------------------------------------------------------------------- /experiments/cuhk03-labeled/solver_mgcam.prototxt: 
-------------------------------------------------------------------------------- 1 | net: "mgcam_train.prototxt" 2 | 3 | test_iter: 10 4 | test_interval: 1000 5 | base_lr: 0.01 6 | lr_policy: "step" 7 | gamma: 0.1 8 | stepsize: 15000 9 | display: 10 10 | max_iter: 75000 11 | momentum: 0.9 12 | weight_decay: 0.005 13 | snapshot: 5000 14 | snapshot_prefix: "mgcam" 15 | solver_mode: GPU 16 | -------------------------------------------------------------------------------- /experiments/cuhk03-labeled/solver_mgcam_siamese.prototxt: -------------------------------------------------------------------------------- 1 | net: "mgcam_siamese_train.prototxt" 2 | 3 | test_iter: 10 4 | test_interval: 1000 5 | base_lr: 0.0001 6 | lr_policy: "step" 7 | gamma: 0.1 8 | stepsize: 10000 9 | display: 10 10 | max_iter: 20000 11 | momentum: 0.9 12 | weight_decay: 0.005 13 | snapshot: 5000 14 | snapshot_prefix: "mgcam_siamese" 15 | solver_mode: GPU 16 | -------------------------------------------------------------------------------- /experiments/market1501/layers.py: -------------------------------------------------------------------------------- 1 | """ 2 | MGCAM data layer. 3 | 4 | by Chunfeng Song 5 | 6 | 2017/10/08 7 | 8 | This code is for research use only, please cite our paper: 9 | 10 | Chunfeng Song, Yan Huang, Wanli Ouyang, and Liang Wang. Mask-guided Contrastive Attention Model for Person Re-Identification. In CVPR, 2018. 11 | 12 | Contact us: chunfeng.song@nlpr.ia.ac.cn 13 | """ 14 | 15 | import caffe 16 | import numpy as np 17 | import yaml 18 | from random import shuffle 19 | import numpy.random as nr 20 | import cv2 21 | import os 22 | import pickle as cPickle 23 | import pdb 24 | 25 | def mypickle(filename, data): 26 | fo = open(filename, "wb") 27 | cPickle.dump(data, fo, protocol=cPickle.HIGHEST_PROTOCOL) 28 | fo.close() 29 | 30 | def myunpickle(filename): 31 | if not os.path.exists(filename): 32 | raise IOError("Path '%s' does not exist." % filename) 33 | 34 | fo = open(filename, 'rb') 35 | data = cPickle.load(fo) 36 | fo.close() 37 | return data 38 | 39 | class MGCAM_DataLayer(caffe.Layer): 40 | """Data layer for training""" 41 | def setup(self, bottom, top): 42 | self.width = 64 43 | self.height = 160 # We resize all images into a size of 160*64. 44 | self.width_gt = 16 45 | self.height_gt = 40 # We resize all masks which are used to supervise attention learning into a size of 40*16. 46 | layer_params = yaml.load(self.param_str) 47 | self.batch_size = layer_params['batch_size'] 48 | self.im_path = layer_params['im_path'] 49 | self.gt_path = layer_params['gt_path'] 50 | self.dataset = layer_params['dataset'] 51 | self.labels, self.im_list, self.gt_list, self.im_dir_list, self.gt_dir_list = self.data_processor(self.dataset) 52 | self.idx = 0 53 | self.data_num = len(self.im_list) # Number of data pairs 54 | self.rnd_list = np.arange(self.data_num) # Random the images list 55 | shuffle(self.rnd_list) 56 | 57 | def forward(self, bottom, top): 58 | # Assign forward data 59 | top[0].data[...] = self.im 60 | top[1].data[...] = self.inner_label 61 | top[2].data[...] = self.exter_label 62 | top[3].data[...] = self.one_mask 63 | top[4].data[...] = self.label 64 | top[5].data[...] = self.label_plus 65 | top[6].data[...] = self.gt 66 | top[7].data[...]
= self.mask 67 | 68 | def backward(self, top, propagate_down, bottom): 69 | """This layer does not propagate gradients.""" 70 | pass 71 | 72 | def reshape(self, bottom, top): 73 | # Load image + label image pairs 74 | self.im = [] 75 | self.label = [] 76 | self.inner_label = [] 77 | self.exter_label = [] 78 | self.one_mask = [] 79 | self.label_plus = [] 80 | self.gt = [] 81 | self.mask = [] 82 | 83 | for i in xrange(self.batch_size): 84 | if self.idx == self.data_num: 85 | self.idx = 0 86 | shuffle(self.rnd_list) #Randomly shuffle the list. 87 | cur_idx = self.rnd_list[self.idx] 88 | im_path = self.im_list[cur_idx] 89 | gt_path = self.gt_list[cur_idx] 90 | im_, gt_, mask_= self.load_data(im_path, gt_path) 91 | self.im.append(im_) 92 | self.gt.append(gt_) 93 | self.mask.append(mask_) 94 | self.label.append(self.labels[cur_idx]) 95 | self.inner_label.append(int(1)) 96 | self.exter_label.append(int(0)) 97 | one_mask_ = np.zeros((1,40,16),dtype = np.float32) 98 | one_mask_ = one_mask_ + 1.0 99 | self.one_mask.append(one_mask_) 100 | self.label_plus.append(self.labels[cur_idx]) #Here, we also give the ID-labels to background-stream. 101 | self.idx +=1 102 | 103 | self.im = np.array(self.im).astype(np.float32) 104 | self.inner_label = np.array(self.inner_label).astype(np.float32) 105 | self.exter_label = np.array(self.exter_label).astype(np.float32) 106 | self.one_mask = np.array(self.one_mask).astype(np.float32) 107 | self.label = np.array(self.label).astype(np.float32) 108 | self.label_plus = np.array(self.label_plus).astype(np.float32) 109 | self.gt = np.array(self.gt).astype(np.float32) 110 | self.mask = np.array(self.mask).astype(np.float32) 111 | # Reshape tops to fit blobs 112 | top[0].reshape(*self.im.shape) 113 | top[1].reshape(*self.inner_label.shape) 114 | top[2].reshape(*self.exter_label.shape) 115 | top[3].reshape(*self.one_mask.shape) 116 | top[4].reshape(*self.label.shape) 117 | top[5].reshape(*self.label_plus.shape) 118 | top[6].reshape(*self.gt.shape) 119 | top[7].reshape(*self.mask.shape) 120 | 121 | def data_processor(self, data_name): 122 | data_dic = './' + data_name 123 | if not os.path.exists(data_dic): 124 | im_list = [] 125 | gt_list = [] 126 | labels = [] 127 | im_dir_list = [] 128 | gt_dir_list = [] 129 | new_id = 0 130 | id_list = np.sort(os.listdir(self.im_path)) 131 | for id in id_list: 132 | im_dir = os.path.join(self.im_path, id) 133 | gt_dir = os.path.join(self.gt_path, id) 134 | if not os.path.exists(im_dir): 135 | continue 136 | pic_im_list = np.sort(os.listdir(im_dir)) 137 | if len(pic_im_list)>1: 138 | for pic in pic_im_list: 139 | this_dir = os.path.join(self.im_path, id, pic) 140 | gt_pic = pic 141 | if not pic.lower().endswith('.png'): 142 | gt_pic = pic[:-4] + '.png' 143 | this_gt_dir = os.path.join(self.gt_path, id, gt_pic) 144 | im_list.append(this_dir) 145 | gt_list.append(this_gt_dir) 146 | labels.append(int(new_id)) 147 | new_id +=1 148 | im_dir_list.append(im_dir) 149 | gt_dir_list.append(gt_dir) 150 | dic = {'im_list':im_list,'gt_list':gt_list,'labels':labels,'im_dir_list':im_dir_list,'gt_dir_list':gt_dir_list} 151 | mypickle(data_dic, dic) 152 | # Load saved data dict to resume. 
153 | else: 154 | dic = myunpickle(data_dic) 155 | im_list = dic['im_list'] 156 | gt_list = dic['gt_list'] 157 | labels = dic['labels'] 158 | im_dir_list = dic['im_dir_list'] 159 | gt_dir_list = dic['gt_dir_list'] 160 | return labels, im_list, gt_list, im_dir_list, gt_dir_list 161 | 162 | def load_data(self, im_path, gt_path): 163 | """ 164 | Load input image and preprocess for Caffe: 165 | - cast to float 166 | - switch channels RGB -> BGR 167 | - subtract mean 168 | - transpose to channel x height x width order 169 | """ 170 | oim = cv2.imread(im_path) 171 | inputImage = cv2.resize(oim, (self.width, self.height)) 172 | inputImage = np.array(inputImage, dtype=np.float32) 173 | 174 | # Substract mean 175 | inputImage[:, :, 0] = inputImage[:, :, 0] - 104.008 176 | inputImage[:, :, 1] = inputImage[:, :, 1] - 116.669 177 | inputImage[:, :, 2] = inputImage[:, :, 2] - 122.675 178 | 179 | # Permute dimensions 180 | if_flip = nr.randint(2) 181 | if if_flip == 0: # Also flip the image with 50% probability 182 | inputImage = inputImage[:,::-1,:] 183 | inputImage = inputImage.transpose([2, 0, 1]) 184 | inputImage = inputImage/256.0 185 | #GT 186 | mask_im= cv2.cvtColor(cv2.imread(gt_path),cv2.COLOR_BGR2GRAY) 187 | inputGt = np.array(cv2.resize(mask_im, (self.width_gt, self.height_gt)), dtype=np.float32) 188 | inputGt = inputGt/255.0 189 | if if_flip == 0: 190 | inputGt = inputGt[:,::-1] 191 | inputGt = inputGt[np.newaxis, ...] 192 | #Mask 193 | inputMask = np.array(cv2.resize(mask_im, (self.width, self.height)), dtype=np.float32) 194 | inputMask = inputMask-127.5 195 | inputMask = inputMask/255.0 196 | if if_flip == 0: 197 | inputMask = inputMask[:,::-1] 198 | inputMask = inputMask[np.newaxis, ...] 199 | return inputImage, inputGt, inputMask 200 | 201 | 202 | class MGCAM_SIA_DataLayer(caffe.Layer): 203 | """Data layer for training""" 204 | def setup(self, bottom, top): 205 | self.width = 64 206 | self.height = 160 # We resize all images into a size of 160*64. 207 | self.width_gt = 16 208 | self.height_gt = 40 # We resize all masks which are used to supervise attention learning into a size of 160*64. 209 | 210 | layer_params = yaml.load(self.param_str) 211 | self.batch_size = layer_params['batch_size'] 212 | self.pos_pair_num = int(0.30*self.batch_size) # There will be at least 30 percent postive pairs for each batch. 213 | self.im_path = layer_params['im_path'] 214 | self.gt_path = layer_params['gt_path'] 215 | self.dataset = layer_params['dataset'] 216 | self.labels, self.im_list, self.gt_list, self.im_dir_list, self.gt_dir_list = self.data_processor(self.dataset) 217 | self.idx = 0 218 | self.data_num = len(self.im_list) 219 | self.rnd_list = np.arange(self.data_num) 220 | shuffle(self.rnd_list) 221 | 222 | def forward(self, bottom, top): 223 | # Assign forward data 224 | top[0].data[...] = self.im 225 | top[1].data[...] = self.inner_label 226 | top[2].data[...] = self.exter_label 227 | top[3].data[...] = self.one_mask 228 | top[4].data[...] = self.label 229 | top[5].data[...] = self.label_plus 230 | top[6].data[...] = self.gt 231 | top[7].data[...] = self.mask 232 | top[8].data[...] 
= self.siam_label 233 | 234 | def backward(self, top, propagate_down, bottom): 235 | """This layer does not propagate gradients.""" 236 | pass 237 | 238 | def reshape(self, bottom, top): 239 | # Load image + label image pairs 240 | self.im = [] 241 | self.label = [] 242 | self.inner_label = [] 243 | self.exter_label = [] 244 | self.one_mask = [] 245 | self.label_plus = [] 246 | self.gt = [] 247 | self.mask = [] 248 | self.siam_label = [] 249 | 250 | for i in xrange(self.batch_size): 251 | if self.idx == self.data_num: 252 | self.idx = 0 253 | shuffle(self.rnd_list) 254 | cur_idx = self.rnd_list[self.idx] 255 | im_path = self.im_list[cur_idx] 256 | gt_path = self.gt_list[cur_idx] 257 | im_, gt_, mask_= self.load_data(im_path, gt_path) 258 | self.im.append(im_) 259 | self.gt.append(gt_) 260 | self.mask.append(mask_) 261 | self.label.append(self.labels[cur_idx]) 262 | self.inner_label.append(int(1)) 263 | self.exter_label.append(int(0)) 264 | one_mask_ = np.zeros((1,40,16),dtype = np.float32) 265 | one_mask_ = one_mask_ + 1.0 266 | self.one_mask.append(one_mask_) 267 | self.label_plus.append(self.labels[cur_idx])#Labels for backgrounds. We use the same labels with other two regions here. 268 | self.idx +=1 269 | 270 | for i in xrange(self.batch_size): 271 | if i > self.pos_pair_num: 272 | if self.idx == self.data_num: 273 | self.idx = 0 274 | shuffle(self.rnd_list)#Randomly shuffle the list. 275 | cur_idx = self.rnd_list[self.idx] 276 | self.idx +=1 277 | im_path = self.im_list[cur_idx] 278 | gt_path = self.gt_list[cur_idx] 279 | label = self.labels[cur_idx] 280 | if label==self.label[i]: 281 | self.siam_label.append(int(1))#In case of getting postive pairs, maybe not much. 282 | else: 283 | self.siam_label.append(int(0))#Negative pairs. 284 | else: 285 | im_dir = self.im_dir_list[self.label[i]] 286 | gt_dir = self.gt_dir_list[self.label[i]] 287 | im_list = np.sort(os.listdir(im_dir)) 288 | gt_list = np.sort(os.listdir(gt_dir)) 289 | tmp_list = np.arange(len(im_list)) 290 | shuffle(tmp_list) #Randomly select one. 291 | im_path = os.path.join(im_dir, im_list[tmp_list[0]]) 292 | gt_path = os.path.join(gt_dir, gt_list[tmp_list[0]]) 293 | label = self.label[i] 294 | self.siam_label.append(int(1))#This is a postive pair. 295 | 296 | im_, gt_, mask_= self.load_data(im_path, gt_path) 297 | self.im.append(im_) 298 | self.gt.append(gt_) 299 | self.mask.append(mask_) 300 | self.label.append(label) 301 | self.inner_label.append(int(1))#Allways be ones, for constrastive learning. 
302 | self.exter_label.append(int(0)) 303 | one_mask_ = np.zeros((1,40,16),dtype = np.float32) 304 | one_mask_ = one_mask_ + 1.0 305 | self.one_mask.append(one_mask_) 306 | self.label_plus.append(label) 307 | 308 | self.im = np.array(self.im).astype(np.float32) 309 | self.inner_label = np.array(self.inner_label).astype(np.float32) 310 | self.exter_label = np.array(self.exter_label).astype(np.float32) 311 | self.one_mask = np.array(self.one_mask).astype(np.float32) 312 | self.label = np.array(self.label).astype(np.float32) 313 | self.label_plus = np.array(self.label_plus).astype(np.float32) 314 | self.gt = np.array(self.gt).astype(np.float32) 315 | self.mask = np.array(self.mask).astype(np.float32) 316 | self.siam_label = np.array(self.siam_label).astype(np.float32) 317 | # Reshape tops to fit blobs 318 | top[0].reshape(*self.im.shape) 319 | top[1].reshape(*self.inner_label.shape) 320 | top[2].reshape(*self.exter_label.shape) 321 | top[3].reshape(*self.one_mask.shape) 322 | top[4].reshape(*self.label.shape) 323 | top[5].reshape(*self.label_plus.shape) 324 | top[6].reshape(*self.gt.shape) 325 | top[7].reshape(*self.mask.shape) 326 | top[8].reshape(*self.siam_label.shape) 327 | 328 | def data_processor(self, data_name): 329 | data_dic = './' + data_name 330 | if not os.path.exists(data_dic): 331 | im_list = [] 332 | gt_list = [] 333 | labels = [] 334 | im_dir_list = [] 335 | gt_dir_list = [] 336 | new_id = 0 337 | id_list = np.sort(os.listdir(self.im_path)) 338 | for id in id_list: 339 | im_dir = os.path.join(self.im_path, id) 340 | gt_dir = os.path.join(self.gt_path, id) 341 | if not os.path.exists(im_dir): 342 | continue 343 | pic_im_list = np.sort(os.listdir(im_dir)) 344 | if len(pic_im_list)>1: 345 | for pic in pic_im_list: 346 | this_dir = os.path.join(self.im_path, id, pic) 347 | gt_pic = pic 348 | if not pic.lower().endswith('.png'): 349 | gt_pic = pic[:-4] + '.png' 350 | this_gt_dir = os.path.join(self.gt_path, id, gt_pic) 351 | im_list.append(this_dir) 352 | gt_list.append(this_gt_dir) 353 | labels.append(int(new_id)) 354 | new_id +=1 355 | im_dir_list.append(im_dir) 356 | gt_dir_list.append(gt_dir) 357 | dic = {'im_list':im_list,'gt_list':gt_list,'labels':labels,'im_dir_list':im_dir_list,'gt_dir_list':gt_dir_list} 358 | mypickle(data_dic, dic) 359 | # Load saved data dict to resume. 
360 | else: 361 | dic = myunpickle(data_dic) 362 | im_list = dic['im_list'] 363 | gt_list = dic['gt_list'] 364 | labels = dic['labels'] 365 | im_dir_list = dic['im_dir_list'] 366 | gt_dir_list = dic['gt_dir_list'] 367 | return labels, im_list, gt_list, im_dir_list, gt_dir_list 368 | 369 | def load_data(self, im_path, gt_path): 370 | """ 371 | Load input image and preprocess for Caffe: 372 | - cast to float 373 | - switch channels RGB -> BGR 374 | - subtract mean 375 | - transpose to channel x height x width order 376 | """ 377 | oim = cv2.imread(im_path) 378 | inputImage = cv2.resize(oim, (self.width, self.height)) 379 | inputImage = np.array(inputImage, dtype=np.float32) 380 | 381 | # Substract mean 382 | inputImage[:, :, 0] = inputImage[:, :, 0] - 104.008 383 | inputImage[:, :, 1] = inputImage[:, :, 1] - 116.669 384 | inputImage[:, :, 2] = inputImage[:, :, 2] - 122.675 385 | 386 | # Permute dimensions 387 | if_flip = nr.randint(2) 388 | if if_flip == 0: # Also flip the image with 50% probability 389 | inputImage = inputImage[:,::-1,:] 390 | inputImage = inputImage.transpose([2, 0, 1]) 391 | inputImage = inputImage/256.0 392 | #GT 393 | mask_im= cv2.cvtColor(cv2.imread(gt_path),cv2.COLOR_BGR2GRAY) 394 | inputGt = np.array(cv2.resize(mask_im, (self.width_gt, self.height_gt)), dtype=np.float32) 395 | inputGt = inputGt/255.0 396 | if if_flip == 0: 397 | inputGt = inputGt[:,::-1] 398 | inputGt = inputGt[np.newaxis, ...] 399 | #Mask 400 | inputMask = np.array(cv2.resize(mask_im, (self.width, self.height)), dtype=np.float32) 401 | inputMask = inputMask-127.5 402 | inputMask = inputMask/255.0 403 | if if_flip == 0: 404 | inputMask = inputMask[:,::-1] 405 | inputMask = inputMask[np.newaxis, ...] 406 | return inputImage, inputGt, inputMask 407 | -------------------------------------------------------------------------------- /experiments/market1501/mgcam_iter_75000.caffemodel: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/developfeng/MGCAM/1b5957218ecaa7f13bf2107bc41b1349c05be9c8/experiments/market1501/mgcam_iter_75000.caffemodel -------------------------------------------------------------------------------- /experiments/market1501/mgcam_siamese_iter_20000.caffemodel: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/developfeng/MGCAM/1b5957218ecaa7f13bf2107bc41b1349c05be9c8/experiments/market1501/mgcam_siamese_iter_20000.caffemodel -------------------------------------------------------------------------------- /experiments/market1501/mgcam_train.prototxt: -------------------------------------------------------------------------------- 1 | layer { 2 | name: "data" 3 | type: "Python" 4 | top: "data" 5 | top: "sim_iner" 6 | top: "sim_exter" 7 | top: "one_mask" 8 | top: "label" 9 | top: "label_plus" 10 | top: "gt" 11 | top: "mask" 12 | include { 13 | phase: TRAIN 14 | } 15 | python_param { 16 | module: "layers" 17 | layer: "MGCAM_DataLayer" 18 | param_str: "{'batch_size': 128,'im_path':'../../data/market1501/bounding_box_train_fold','gt_path':'../../data/market1501/bounding_box_train_seg_fold/','dataset':'market1501'}" 19 | } 20 | } 21 | 22 | layer { 23 | name: "data" 24 | type: "Python" 25 | top: "data" 26 | top: "sim_iner" 27 | top: "sim_exter" 28 | top: "one_mask" 29 | top: "label" 30 | top: "label_plus" 31 | top: "gt" 32 | top: "mask" 33 | include { 34 | phase: TEST 35 | }#Note the testing set is same with the training here. 
36 | python_param { 37 | module: "layers" 38 | layer: "MGCAM_DataLayer" 39 | param_str: "{'batch_size': 10,'im_path':'../../data/market1501/bounding_box_train_fold','gt_path':'../../data/market1501/bounding_box_train_seg_fold/','dataset':'market1501'}" 40 | } 41 | } 42 | 43 | layer { 44 | name: "data_concate" 45 | type: "Concat" 46 | bottom: "data" 47 | bottom: "mask" 48 | top: "data_concate" 49 | } 50 | 51 | layer { 52 | name: "conv0_scale1_full" 53 | type: "Convolution" 54 | bottom: "data_concate" 55 | top: "conv0_scale1_full" 56 | param { 57 | name: "conv0_scale1_w_full" 58 | lr_mult: 1 59 | decay_mult: 1 60 | } 61 | param { 62 | name: "conv0_scale1_b_full" 63 | lr_mult: 2 64 | decay_mult: 0 65 | } 66 | convolution_param { 67 | num_output: 32 68 | pad: 2 69 | kernel_size: 5 70 | stride: 1 71 | weight_filler { 72 | type: "xavier" 73 | } 74 | bias_filler { 75 | type: "constant" 76 | } 77 | dilation: 1 78 | } 79 | } 80 | layer { 81 | name: "bn0_scale1_full" 82 | type: "BatchNorm" 83 | bottom: "conv0_scale1_full" 84 | top: "bn0_scale1_full" 85 | param { 86 | name: "bn0_scale1_mean_full" 87 | lr_mult: 0 88 | decay_mult: 0 89 | } 90 | param { 91 | name: "bn0_scale1_var_full" 92 | lr_mult: 0 93 | decay_mult: 0 94 | } 95 | param { 96 | name: "bn0_scale1_bias_full" 97 | lr_mult: 0 98 | decay_mult: 0 99 | } 100 | batch_norm_param { 101 | use_global_stats: false 102 | } 103 | } 104 | layer { 105 | name: "relu0_full" 106 | type: "ReLU" 107 | bottom: "bn0_scale1_full" 108 | top: "bn0_scale1_full" 109 | } 110 | layer { 111 | name: "pool0_full" 112 | type: "Pooling" 113 | bottom: "bn0_scale1_full" 114 | top: "pool0_full" 115 | pooling_param { 116 | pool: MAX 117 | kernel_size: 2 118 | stride: 2 119 | } 120 | } 121 | layer { 122 | name: "conv1_scale1_full" 123 | type: "Convolution" 124 | bottom: "pool0_full" 125 | top: "conv1_scale1_full" 126 | param { 127 | name: "conv1_scale1_w_full" 128 | lr_mult: 1 129 | decay_mult: 1 130 | } 131 | param { 132 | name: "conv1_scale1_b_full" 133 | lr_mult: 2 134 | decay_mult: 0 135 | } 136 | convolution_param { 137 | num_output: 32 138 | pad: 1 139 | kernel_size: 3 140 | stride: 1 141 | weight_filler { 142 | type: "xavier" 143 | } 144 | bias_filler { 145 | type: "constant" 146 | } 147 | dilation: 1 148 | } 149 | } 150 | layer { 151 | name: "conv1_scale2_full" 152 | type: "Convolution" 153 | bottom: "pool0_full" 154 | top: "conv1_scale2_full" 155 | param { 156 | name: "conv1_scale2_w_full" 157 | lr_mult: 1 158 | decay_mult: 1 159 | } 160 | param { 161 | name: "conv1_scale2_b_full" 162 | lr_mult: 2 163 | decay_mult: 0 164 | } 165 | convolution_param { 166 | num_output: 32 167 | pad: 2 168 | kernel_size: 3 169 | stride: 1 170 | weight_filler { 171 | type: "xavier" 172 | } 173 | bias_filler { 174 | type: "constant" 175 | } 176 | dilation: 2 177 | } 178 | } 179 | layer { 180 | name: "conv1_scale3_full" 181 | type: "Convolution" 182 | bottom: "pool0_full" 183 | top: "conv1_scale3_full" 184 | param { 185 | name: "conv1_scale3_w_full" 186 | lr_mult: 1 187 | decay_mult: 1 188 | } 189 | param { 190 | name: "conv1_scale3_b_full" 191 | lr_mult: 2 192 | decay_mult: 0 193 | } 194 | convolution_param { 195 | num_output: 32 196 | pad: 3 197 | kernel_size: 3 198 | stride: 1 199 | weight_filler { 200 | type: "xavier" 201 | } 202 | bias_filler { 203 | type: "constant" 204 | } 205 | dilation: 3 206 | } 207 | } 208 | layer { 209 | name: "bn1_scale1_full" 210 | type: "BatchNorm" 211 | bottom: "conv1_scale1_full" 212 | top: "bn1_scale1_full" 213 | param { 214 | name: "bn1_scale1_mean_full" 
215 | lr_mult: 0 216 | decay_mult: 0 217 | } 218 | param { 219 | name: "bn1_scale1_var_full" 220 | lr_mult: 0 221 | decay_mult: 0 222 | } 223 | param { 224 | name: "bn1_scale1_bias_full" 225 | lr_mult: 0 226 | decay_mult: 0 227 | } 228 | batch_norm_param { 229 | use_global_stats: false 230 | } 231 | } 232 | layer { 233 | name: "bn1_scale2_full" 234 | type: "BatchNorm" 235 | bottom: "conv1_scale2_full" 236 | top: "bn1_scale2_full" 237 | param { 238 | name: "bn1_scale2_mean_full" 239 | lr_mult: 0 240 | decay_mult: 0 241 | } 242 | param { 243 | name: "bn1_scale2_var_full" 244 | lr_mult: 0 245 | decay_mult: 0 246 | } 247 | param { 248 | name: "bn1_scale2_bias_full" 249 | lr_mult: 0 250 | decay_mult: 0 251 | } 252 | batch_norm_param { 253 | use_global_stats: false 254 | } 255 | } 256 | layer { 257 | name: "bn1_scale3_full" 258 | type: "BatchNorm" 259 | bottom: "conv1_scale3_full" 260 | top: "bn1_scale3_full" 261 | param { 262 | name: "bn1_scale3_mean_full" 263 | lr_mult: 0 264 | decay_mult: 0 265 | } 266 | param { 267 | name: "bn1_scale3_var_full" 268 | lr_mult: 0 269 | decay_mult: 0 270 | } 271 | param { 272 | name: "bn1_scale3_bias_full" 273 | lr_mult: 0 274 | decay_mult: 0 275 | } 276 | batch_norm_param { 277 | use_global_stats: false 278 | } 279 | } 280 | layer { 281 | name: "bn1_full" 282 | type: "Concat" 283 | bottom: "bn1_scale1_full" 284 | bottom: "bn1_scale2_full" 285 | bottom: "bn1_scale3_full" 286 | top: "bn1_full" 287 | concat_param { 288 | axis: 1 289 | } 290 | } 291 | layer { 292 | name: "relu1_full" 293 | type: "ReLU" 294 | bottom: "bn1_full" 295 | top: "bn1_full" 296 | } 297 | layer { 298 | name: "pool1_full" 299 | type: "Pooling" 300 | bottom: "bn1_full" 301 | top: "pool1_full" 302 | pooling_param { 303 | pool: MAX 304 | kernel_size: 2 305 | stride: 2 306 | } 307 | } 308 | layer { 309 | name: "conv2_scale1_full" 310 | type: "Convolution" 311 | bottom: "pool1_full" 312 | top: "conv2_scale1_full" 313 | param { 314 | name: "conv2_scale1_w_full" 315 | lr_mult: 1 316 | decay_mult: 1 317 | } 318 | param { 319 | name: "conv2_scale1_b_full" 320 | lr_mult: 2 321 | decay_mult: 0 322 | } 323 | convolution_param { 324 | num_output: 32 325 | pad: 1 326 | kernel_size: 3 327 | stride: 1 328 | weight_filler { 329 | type: "xavier" 330 | } 331 | bias_filler { 332 | type: "constant" 333 | } 334 | dilation: 1 335 | } 336 | } 337 | layer { 338 | name: "conv2_scale2_full" 339 | type: "Convolution" 340 | bottom: "pool1_full" 341 | top: "conv2_scale2_full" 342 | param { 343 | name: "conv2_scale2_w_full" 344 | lr_mult: 1 345 | decay_mult: 1 346 | } 347 | param { 348 | name: "conv2_scale2_b_full" 349 | lr_mult: 2 350 | decay_mult: 0 351 | } 352 | convolution_param { 353 | num_output: 32 354 | pad: 2 355 | kernel_size: 3 356 | stride: 1 357 | weight_filler { 358 | type: "xavier" 359 | } 360 | bias_filler { 361 | type: "constant" 362 | } 363 | dilation: 2 364 | } 365 | } 366 | layer { 367 | name: "conv2_scale3_full" 368 | type: "Convolution" 369 | bottom: "pool1_full" 370 | top: "conv2_scale3_full" 371 | param { 372 | name: "conv2_scale3_w_full" 373 | lr_mult: 1 374 | decay_mult: 1 375 | } 376 | param { 377 | name: "conv2_scale3_b_full" 378 | lr_mult: 2 379 | decay_mult: 0 380 | } 381 | convolution_param { 382 | num_output: 32 383 | pad: 3 384 | kernel_size: 3 385 | stride: 1 386 | weight_filler { 387 | type: "xavier" 388 | } 389 | bias_filler { 390 | type: "constant" 391 | } 392 | dilation: 3 393 | } 394 | } 395 | layer { 396 | name: "bn2_scale1_full" 397 | type: "BatchNorm" 398 | bottom: 
"conv2_scale1_full" 399 | top: "bn2_scale1_full" 400 | param { 401 | name: "bn2_scale1_mean_full" 402 | lr_mult: 0 403 | decay_mult: 0 404 | } 405 | param { 406 | name: "bn2_scale1_var_full" 407 | lr_mult: 0 408 | decay_mult: 0 409 | } 410 | param { 411 | name: "bn2_scale1_bias_full" 412 | lr_mult: 0 413 | decay_mult: 0 414 | } 415 | batch_norm_param { 416 | use_global_stats: false 417 | } 418 | } 419 | layer { 420 | name: "bn2_scale2_full" 421 | type: "BatchNorm" 422 | bottom: "conv2_scale2_full" 423 | top: "bn2_scale2_full" 424 | param { 425 | name: "bn2_scale2_mean_full" 426 | lr_mult: 0 427 | decay_mult: 0 428 | } 429 | param { 430 | name: "bn2_scale2_var_full" 431 | lr_mult: 0 432 | decay_mult: 0 433 | } 434 | param { 435 | name: "bn2_scale2_bias_full" 436 | lr_mult: 0 437 | decay_mult: 0 438 | } 439 | batch_norm_param { 440 | use_global_stats: false 441 | } 442 | } 443 | layer { 444 | name: "bn2_scale3_full" 445 | type: "BatchNorm" 446 | bottom: "conv2_scale3_full" 447 | top: "bn2_scale3_full" 448 | param { 449 | name: "bn2_scale3_mean_full" 450 | lr_mult: 0 451 | decay_mult: 0 452 | } 453 | param { 454 | name: "bn2_scale3_var_full" 455 | lr_mult: 0 456 | decay_mult: 0 457 | } 458 | param { 459 | name: "bn2_scale3_bias_full" 460 | lr_mult: 0 461 | decay_mult: 0 462 | } 463 | batch_norm_param { 464 | use_global_stats: false 465 | } 466 | } 467 | layer { 468 | name: "bn2_full" 469 | type: "Concat" 470 | bottom: "bn2_scale1_full" 471 | bottom: "bn2_scale2_full" 472 | bottom: "bn2_scale3_full" 473 | top: "bn2_full" 474 | concat_param { 475 | axis: 1 476 | } 477 | } 478 | layer { 479 | name: "relu2_full" 480 | type: "ReLU" 481 | bottom: "bn2_full" 482 | top: "bn2_full" 483 | } 484 | 485 | 486 | layer { 487 | name: "make_att_mask" 488 | type: "Convolution" 489 | bottom: "bn2_full" 490 | top: "make_att_mask" 491 | param { 492 | lr_mult: 1 493 | decay_mult: 1 494 | } 495 | param { 496 | lr_mult: 2 497 | decay_mult: 0 498 | } 499 | convolution_param { 500 | num_output: 1 501 | pad: 1 502 | kernel_size: 3 503 | stride: 1 504 | weight_filler { 505 | type: "xavier" 506 | } 507 | bias_filler { 508 | type: "constant" 509 | } 510 | dilation: 1 511 | } 512 | } 513 | 514 | layer { 515 | name: "att_sigmoid" 516 | type: "Sigmoid" 517 | bottom: "make_att_mask" 518 | top: "make_att_mask" 519 | } 520 | 521 | ############### Seg Loss ##################### 522 | layer { 523 | name: "loss_seg" 524 | type: "EuclideanLoss" 525 | bottom: "gt" 526 | bottom: "make_att_mask" 527 | top: "loss_seg" 528 | loss_weight: 0.1 529 | } 530 | 531 | layer { 532 | name: "make_att_mask_inv" 533 | type: "Eltwise" 534 | bottom: "one_mask" 535 | bottom: "make_att_mask" 536 | top: "make_att_mask_inv" 537 | eltwise_param { 538 | operation: SUM 539 | coeff: 1 540 | coeff: -1 541 | } 542 | } 543 | ############### Seg Loss ##################### 544 | 545 | layer { 546 | name: "tile_iner" 547 | type: "Tile" 548 | bottom: "make_att_mask" 549 | top: "att_iner" 550 | tile_param { 551 | tiles: 96 552 | axis: 1 553 | } 554 | } 555 | 556 | layer { 557 | name: "bn2_att_iner" 558 | type: "Eltwise" 559 | bottom: "bn2_full" 560 | bottom: "att_iner" 561 | top: "bn2_att_iner" 562 | eltwise_param { 563 | operation: PROD 564 | } 565 | } 566 | 567 | layer { 568 | name: "tile_exter" 569 | type: "Tile" 570 | bottom: "make_att_mask_inv" 571 | top: "att_exter" 572 | tile_param { 573 | tiles: 96 574 | axis: 1 575 | } 576 | } 577 | 578 | layer { 579 | name: "bn2_att_exter" 580 | type: "Eltwise" 581 | bottom: "bn2_full" 582 | bottom: "att_exter" 583 | top: 
"bn2_att_exter" 584 | eltwise_param { 585 | operation: PROD 586 | } 587 | } 588 | 589 | layer { 590 | name: "pool2_full" 591 | type: "Pooling" 592 | bottom: "bn2_full" 593 | top: "pool2_full" 594 | pooling_param { 595 | pool: MAX 596 | kernel_size: 2 597 | stride: 2 598 | } 599 | } 600 | 601 | layer { 602 | name: "conv3_scale1_full" 603 | type: "Convolution" 604 | bottom: "pool2_full" 605 | top: "conv3_scale1_full" 606 | param { 607 | name: "conv3_scale1_w_full" 608 | lr_mult: 1 609 | decay_mult: 1 610 | } 611 | param { 612 | name: "conv3_scale1_b_full" 613 | lr_mult: 2 614 | decay_mult: 0 615 | } 616 | convolution_param { 617 | num_output: 32 618 | pad: 1 619 | kernel_size: 3 620 | stride: 1 621 | weight_filler { 622 | type: "xavier" 623 | } 624 | bias_filler { 625 | type: "constant" 626 | } 627 | dilation: 1 628 | } 629 | } 630 | layer { 631 | name: "conv3_scale2_full" 632 | type: "Convolution" 633 | bottom: "pool2_full" 634 | top: "conv3_scale2_full" 635 | param { 636 | name: "conv3_scale2_w_full" 637 | lr_mult: 1 638 | decay_mult: 1 639 | } 640 | param { 641 | name: "conv3_scale2_b_full" 642 | lr_mult: 2 643 | decay_mult: 0 644 | } 645 | convolution_param { 646 | num_output: 32 647 | pad: 2 648 | kernel_size: 3 649 | stride: 1 650 | weight_filler { 651 | type: "xavier" 652 | } 653 | bias_filler { 654 | type: "constant" 655 | } 656 | dilation: 2 657 | } 658 | } 659 | layer { 660 | name: "conv3_scale3_full" 661 | type: "Convolution" 662 | bottom: "pool2_full" 663 | top: "conv3_scale3_full" 664 | param { 665 | name: "conv3_scale3_w_full" 666 | lr_mult: 1 667 | decay_mult: 1 668 | } 669 | param { 670 | name: "conv3_scale3_b_full" 671 | lr_mult: 2 672 | decay_mult: 0 673 | } 674 | convolution_param { 675 | num_output: 32 676 | pad: 3 677 | kernel_size: 3 678 | stride: 1 679 | weight_filler { 680 | type: "xavier" 681 | } 682 | bias_filler { 683 | type: "constant" 684 | } 685 | dilation: 3 686 | } 687 | } 688 | layer { 689 | name: "bn3_scale1_full" 690 | type: "BatchNorm" 691 | bottom: "conv3_scale1_full" 692 | top: "bn3_scale1_full" 693 | param { 694 | name: "bn3_scale1_mean_full" 695 | lr_mult: 0 696 | decay_mult: 0 697 | } 698 | param { 699 | name: "bn3_scale1_var_full" 700 | lr_mult: 0 701 | decay_mult: 0 702 | } 703 | param { 704 | name: "bn3_scale1_bias_full" 705 | lr_mult: 0 706 | decay_mult: 0 707 | } 708 | batch_norm_param { 709 | use_global_stats: false 710 | } 711 | } 712 | layer { 713 | name: "bn3_scale2_full" 714 | type: "BatchNorm" 715 | bottom: "conv3_scale2_full" 716 | top: "bn3_scale2_full" 717 | param { 718 | name: "bn3_scale2_mean_full" 719 | lr_mult: 0 720 | decay_mult: 0 721 | } 722 | param { 723 | name: "bn3_scale2_var_full" 724 | lr_mult: 0 725 | decay_mult: 0 726 | } 727 | param { 728 | name: "bn3_scale2_bias_full" 729 | lr_mult: 0 730 | decay_mult: 0 731 | } 732 | batch_norm_param { 733 | use_global_stats: false 734 | } 735 | } 736 | layer { 737 | name: "bn3_scale3_full" 738 | type: "BatchNorm" 739 | bottom: "conv3_scale3_full" 740 | top: "bn3_scale3_full" 741 | param { 742 | name: "bn3_scale3_mean_full" 743 | lr_mult: 0 744 | decay_mult: 0 745 | } 746 | param { 747 | name: "bn3_scale3_var_full" 748 | lr_mult: 0 749 | decay_mult: 0 750 | } 751 | param { 752 | name: "bn3_scale3_bias_full" 753 | lr_mult: 0 754 | decay_mult: 0 755 | } 756 | batch_norm_param { 757 | use_global_stats: false 758 | } 759 | } 760 | layer { 761 | name: "bn3_full" 762 | type: "Concat" 763 | bottom: "bn3_scale1_full" 764 | bottom: "bn3_scale2_full" 765 | bottom: "bn3_scale3_full" 766 | top: 
"bn3_full" 767 | concat_param { 768 | axis: 1 769 | } 770 | } 771 | layer { 772 | name: "relu3_full" 773 | type: "ReLU" 774 | bottom: "bn3_full" 775 | top: "bn3_full" 776 | } 777 | layer { 778 | name: "pool3_full" 779 | type: "Pooling" 780 | bottom: "bn3_full" 781 | top: "pool3_full" 782 | pooling_param { 783 | pool: MAX 784 | kernel_size: 2 785 | stride: 2 786 | } 787 | } 788 | layer { 789 | name: "conv4_scale1_full" 790 | type: "Convolution" 791 | bottom: "pool3_full" 792 | top: "conv4_scale1_full" 793 | param { 794 | name: "conv4_scale1_w_full" 795 | lr_mult: 1 796 | decay_mult: 1 797 | } 798 | param { 799 | name: "conv4_scale1_b_full" 800 | lr_mult: 2 801 | decay_mult: 0 802 | } 803 | convolution_param { 804 | num_output: 32 805 | pad: 1 806 | kernel_size: 3 807 | stride: 1 808 | weight_filler { 809 | type: "xavier" 810 | } 811 | bias_filler { 812 | type: "constant" 813 | } 814 | dilation: 1 815 | } 816 | } 817 | layer { 818 | name: "conv4_scale2_full" 819 | type: "Convolution" 820 | bottom: "pool3_full" 821 | top: "conv4_scale2_full" 822 | param { 823 | name: "conv4_scale2_w_full" 824 | lr_mult: 1 825 | decay_mult: 1 826 | } 827 | param { 828 | name: "conv4_scale2_b_full" 829 | lr_mult: 2 830 | decay_mult: 0 831 | } 832 | convolution_param { 833 | num_output: 32 834 | pad: 2 835 | kernel_size: 3 836 | stride: 1 837 | weight_filler { 838 | type: "xavier" 839 | } 840 | bias_filler { 841 | type: "constant" 842 | } 843 | dilation: 2 844 | } 845 | } 846 | layer { 847 | name: "conv4_scale3_full" 848 | type: "Convolution" 849 | bottom: "pool3_full" 850 | top: "conv4_scale3_full" 851 | param { 852 | name: "conv4_scale3_w_full" 853 | lr_mult: 1 854 | decay_mult: 1 855 | } 856 | param { 857 | name: "conv4_scale3_b_full" 858 | lr_mult: 2 859 | decay_mult: 0 860 | } 861 | convolution_param { 862 | num_output: 32 863 | pad: 3 864 | kernel_size: 3 865 | stride: 1 866 | weight_filler { 867 | type: "xavier" 868 | } 869 | bias_filler { 870 | type: "constant" 871 | } 872 | dilation: 3 873 | } 874 | } 875 | layer { 876 | name: "bn4_scale1_full" 877 | type: "BatchNorm" 878 | bottom: "conv4_scale1_full" 879 | top: "bn4_scale1_full" 880 | param { 881 | name: "bn4_scale1_mean_full" 882 | lr_mult: 0 883 | decay_mult: 0 884 | } 885 | param { 886 | name: "bn4_scale1_var_full" 887 | lr_mult: 0 888 | decay_mult: 0 889 | } 890 | param { 891 | name: "bn4_scale1_bias_full" 892 | lr_mult: 0 893 | decay_mult: 0 894 | } 895 | batch_norm_param { 896 | use_global_stats: false 897 | } 898 | } 899 | layer { 900 | name: "bn4_scale2_full" 901 | type: "BatchNorm" 902 | bottom: "conv4_scale2_full" 903 | top: "bn4_scale2_full" 904 | param { 905 | name: "bn4_scale2_mean_full" 906 | lr_mult: 0 907 | decay_mult: 0 908 | } 909 | param { 910 | name: "bn4_scale2_var_full" 911 | lr_mult: 0 912 | decay_mult: 0 913 | } 914 | param { 915 | name: "bn4_scale2_bias_full" 916 | lr_mult: 0 917 | decay_mult: 0 918 | } 919 | batch_norm_param { 920 | use_global_stats: false 921 | } 922 | } 923 | layer { 924 | name: "bn4_scale3_full" 925 | type: "BatchNorm" 926 | bottom: "conv4_scale3_full" 927 | top: "bn4_scale3_full" 928 | param { 929 | name: "bn4_scale3_mean_full" 930 | lr_mult: 0 931 | decay_mult: 0 932 | } 933 | param { 934 | name: "bn4_scale3_var_full" 935 | lr_mult: 0 936 | decay_mult: 0 937 | } 938 | param { 939 | name: "bn4_scale3_bias_full" 940 | lr_mult: 0 941 | decay_mult: 0 942 | } 943 | batch_norm_param { 944 | use_global_stats: false 945 | } 946 | } 947 | layer { 948 | name: "bn4_full" 949 | type: "Concat" 950 | bottom: 
"bn4_scale1_full" 951 | bottom: "bn4_scale2_full" 952 | bottom: "bn4_scale3_full" 953 | top: "bn4_full" 954 | concat_param { 955 | axis: 1 956 | } 957 | } 958 | layer { 959 | name: "relu4_full" 960 | type: "ReLU" 961 | bottom: "bn4_full" 962 | top: "bn4_full" 963 | } 964 | layer { 965 | name: "pool4_full" 966 | type: "Pooling" 967 | bottom: "bn4_full" 968 | top: "pool4_full" 969 | pooling_param { 970 | pool: MAX 971 | kernel_size: 2 972 | stride: 2 973 | } 974 | } 975 | layer { 976 | name: "fc1_full" 977 | type: "InnerProduct" 978 | bottom: "pool4_full" 979 | top: "fc1_full" 980 | param { 981 | name: "fc1_w_full" 982 | lr_mult: 1 983 | decay_mult: 1 984 | } 985 | param { 986 | name: "fc1_b_full" 987 | lr_mult: 2 988 | decay_mult: 0 989 | } 990 | inner_product_param { 991 | num_output: 128 992 | weight_filler { 993 | type: "xavier" 994 | } 995 | bias_filler { 996 | type: "constant" 997 | } 998 | } 999 | } 1000 | layer { 1001 | name: "fc1_full_drop" 1002 | type: "Dropout" 1003 | bottom: "fc1_full" 1004 | top: "fc1_full" 1005 | dropout_param { 1006 | dropout_ratio: 0.2 1007 | } 1008 | } 1009 | 1010 | layer { 1011 | name: "fc2_full" 1012 | type: "InnerProduct" 1013 | bottom: "fc1_full" 1014 | top: "fc2_full" 1015 | param { 1016 | lr_mult: 1 1017 | decay_mult: 1 1018 | } 1019 | param { 1020 | lr_mult: 2 1021 | decay_mult: 0 1022 | } 1023 | inner_product_param { 1024 | num_output: 751 1025 | weight_filler { 1026 | type: "xavier" 1027 | } 1028 | bias_filler { 1029 | type: "constant" 1030 | } 1031 | } 1032 | } 1033 | layer { 1034 | name: "loss_cls_full" 1035 | type: "SoftmaxWithLoss" 1036 | bottom: "fc2_full" 1037 | bottom: "label" 1038 | top: "loss_cls_full" 1039 | loss_weight: 1 1040 | } 1041 | layer { 1042 | name: "acc_cls_full" 1043 | type: "Accuracy" 1044 | bottom: "fc2_full" 1045 | bottom: "label" 1046 | top: "acc_cls_full" 1047 | } 1048 | 1049 | layer { 1050 | name: "pool2_iner" 1051 | type: "Pooling" 1052 | bottom: "bn2_att_iner" 1053 | top: "pool2_iner" 1054 | pooling_param { 1055 | pool: MAX 1056 | kernel_size: 2 1057 | stride: 2 1058 | } 1059 | } 1060 | 1061 | layer { 1062 | name: "conv3_scale1_iner" 1063 | type: "Convolution" 1064 | bottom: "pool2_iner" 1065 | top: "conv3_scale1_iner" 1066 | param { 1067 | name: "conv3_scale1_w_iner" 1068 | lr_mult: 1 1069 | decay_mult: 1 1070 | } 1071 | param { 1072 | name: "conv3_scale1_b_iner" 1073 | lr_mult: 2 1074 | decay_mult: 0 1075 | } 1076 | convolution_param { 1077 | num_output: 32 1078 | pad: 1 1079 | kernel_size: 3 1080 | stride: 1 1081 | weight_filler { 1082 | type: "xavier" 1083 | } 1084 | bias_filler { 1085 | type: "constant" 1086 | } 1087 | dilation: 1 1088 | } 1089 | } 1090 | layer { 1091 | name: "conv3_scale2_iner" 1092 | type: "Convolution" 1093 | bottom: "pool2_iner" 1094 | top: "conv3_scale2_iner" 1095 | param { 1096 | name: "conv3_scale2_w_iner" 1097 | lr_mult: 1 1098 | decay_mult: 1 1099 | } 1100 | param { 1101 | name: "conv3_scale2_b_iner" 1102 | lr_mult: 2 1103 | decay_mult: 0 1104 | } 1105 | convolution_param { 1106 | num_output: 32 1107 | pad: 2 1108 | kernel_size: 3 1109 | stride: 1 1110 | weight_filler { 1111 | type: "xavier" 1112 | } 1113 | bias_filler { 1114 | type: "constant" 1115 | } 1116 | dilation: 2 1117 | } 1118 | } 1119 | layer { 1120 | name: "conv3_scale3_iner" 1121 | type: "Convolution" 1122 | bottom: "pool2_iner" 1123 | top: "conv3_scale3_iner" 1124 | param { 1125 | name: "conv3_scale3_w_iner" 1126 | lr_mult: 1 1127 | decay_mult: 1 1128 | } 1129 | param { 1130 | name: "conv3_scale3_b_iner" 1131 | lr_mult: 2 1132 
| decay_mult: 0 1133 | } 1134 | convolution_param { 1135 | num_output: 32 1136 | pad: 3 1137 | kernel_size: 3 1138 | stride: 1 1139 | weight_filler { 1140 | type: "xavier" 1141 | } 1142 | bias_filler { 1143 | type: "constant" 1144 | } 1145 | dilation: 3 1146 | } 1147 | } 1148 | layer { 1149 | name: "bn3_scale1_iner" 1150 | type: "BatchNorm" 1151 | bottom: "conv3_scale1_iner" 1152 | top: "bn3_scale1_iner" 1153 | param { 1154 | name: "bn3_scale1_mean_iner" 1155 | lr_mult: 0 1156 | decay_mult: 0 1157 | } 1158 | param { 1159 | name: "bn3_scale1_var_iner" 1160 | lr_mult: 0 1161 | decay_mult: 0 1162 | } 1163 | param { 1164 | name: "bn3_scale1_bias_iner" 1165 | lr_mult: 0 1166 | decay_mult: 0 1167 | } 1168 | batch_norm_param { 1169 | use_global_stats: false 1170 | } 1171 | } 1172 | layer { 1173 | name: "bn3_scale2_iner" 1174 | type: "BatchNorm" 1175 | bottom: "conv3_scale2_iner" 1176 | top: "bn3_scale2_iner" 1177 | param { 1178 | name: "bn3_scale2_mean_iner" 1179 | lr_mult: 0 1180 | decay_mult: 0 1181 | } 1182 | param { 1183 | name: "bn3_scale2_var_iner" 1184 | lr_mult: 0 1185 | decay_mult: 0 1186 | } 1187 | param { 1188 | name: "bn3_scale2_bias_iner" 1189 | lr_mult: 0 1190 | decay_mult: 0 1191 | } 1192 | batch_norm_param { 1193 | use_global_stats: false 1194 | } 1195 | } 1196 | layer { 1197 | name: "bn3_scale3_iner" 1198 | type: "BatchNorm" 1199 | bottom: "conv3_scale3_iner" 1200 | top: "bn3_scale3_iner" 1201 | param { 1202 | name: "bn3_scale3_mean_iner" 1203 | lr_mult: 0 1204 | decay_mult: 0 1205 | } 1206 | param { 1207 | name: "bn3_scale3_var_iner" 1208 | lr_mult: 0 1209 | decay_mult: 0 1210 | } 1211 | param { 1212 | name: "bn3_scale3_bias_iner" 1213 | lr_mult: 0 1214 | decay_mult: 0 1215 | } 1216 | batch_norm_param { 1217 | use_global_stats: false 1218 | } 1219 | } 1220 | layer { 1221 | name: "bn3_iner" 1222 | type: "Concat" 1223 | bottom: "bn3_scale1_iner" 1224 | bottom: "bn3_scale2_iner" 1225 | bottom: "bn3_scale3_iner" 1226 | top: "bn3_iner" 1227 | concat_param { 1228 | axis: 1 1229 | } 1230 | } 1231 | layer { 1232 | name: "relu3_iner" 1233 | type: "ReLU" 1234 | bottom: "bn3_iner" 1235 | top: "bn3_iner" 1236 | } 1237 | layer { 1238 | name: "pool3_iner" 1239 | type: "Pooling" 1240 | bottom: "bn3_iner" 1241 | top: "pool3_iner" 1242 | pooling_param { 1243 | pool: MAX 1244 | kernel_size: 2 1245 | stride: 2 1246 | } 1247 | } 1248 | layer { 1249 | name: "conv4_scale1_iner" 1250 | type: "Convolution" 1251 | bottom: "pool3_iner" 1252 | top: "conv4_scale1_iner" 1253 | param { 1254 | name: "conv4_scale1_w_iner" 1255 | lr_mult: 1 1256 | decay_mult: 1 1257 | } 1258 | param { 1259 | name: "conv4_scale1_b_iner" 1260 | lr_mult: 2 1261 | decay_mult: 0 1262 | } 1263 | convolution_param { 1264 | num_output: 32 1265 | pad: 1 1266 | kernel_size: 3 1267 | stride: 1 1268 | weight_filler { 1269 | type: "xavier" 1270 | } 1271 | bias_filler { 1272 | type: "constant" 1273 | } 1274 | dilation: 1 1275 | } 1276 | } 1277 | layer { 1278 | name: "conv4_scale2_iner" 1279 | type: "Convolution" 1280 | bottom: "pool3_iner" 1281 | top: "conv4_scale2_iner" 1282 | param { 1283 | name: "conv4_scale2_w_iner" 1284 | lr_mult: 1 1285 | decay_mult: 1 1286 | } 1287 | param { 1288 | name: "conv4_scale2_b_iner" 1289 | lr_mult: 2 1290 | decay_mult: 0 1291 | } 1292 | convolution_param { 1293 | num_output: 32 1294 | pad: 2 1295 | kernel_size: 3 1296 | stride: 1 1297 | weight_filler { 1298 | type: "xavier" 1299 | } 1300 | bias_filler { 1301 | type: "constant" 1302 | } 1303 | dilation: 2 1304 | } 1305 | } 1306 | layer { 1307 | name: 
"conv4_scale3_iner" 1308 | type: "Convolution" 1309 | bottom: "pool3_iner" 1310 | top: "conv4_scale3_iner" 1311 | param { 1312 | name: "conv4_scale3_w_iner" 1313 | lr_mult: 1 1314 | decay_mult: 1 1315 | } 1316 | param { 1317 | name: "conv4_scale3_b_iner" 1318 | lr_mult: 2 1319 | decay_mult: 0 1320 | } 1321 | convolution_param { 1322 | num_output: 32 1323 | pad: 3 1324 | kernel_size: 3 1325 | stride: 1 1326 | weight_filler { 1327 | type: "xavier" 1328 | } 1329 | bias_filler { 1330 | type: "constant" 1331 | } 1332 | dilation: 3 1333 | } 1334 | } 1335 | layer { 1336 | name: "bn4_scale1_iner" 1337 | type: "BatchNorm" 1338 | bottom: "conv4_scale1_iner" 1339 | top: "bn4_scale1_iner" 1340 | param { 1341 | name: "bn4_scale1_mean_iner" 1342 | lr_mult: 0 1343 | decay_mult: 0 1344 | } 1345 | param { 1346 | name: "bn4_scale1_var_iner" 1347 | lr_mult: 0 1348 | decay_mult: 0 1349 | } 1350 | param { 1351 | name: "bn4_scale1_bias_iner" 1352 | lr_mult: 0 1353 | decay_mult: 0 1354 | } 1355 | batch_norm_param { 1356 | use_global_stats: false 1357 | } 1358 | } 1359 | layer { 1360 | name: "bn4_scale2_iner" 1361 | type: "BatchNorm" 1362 | bottom: "conv4_scale2_iner" 1363 | top: "bn4_scale2_iner" 1364 | param { 1365 | name: "bn4_scale2_mean_iner" 1366 | lr_mult: 0 1367 | decay_mult: 0 1368 | } 1369 | param { 1370 | name: "bn4_scale2_var_iner" 1371 | lr_mult: 0 1372 | decay_mult: 0 1373 | } 1374 | param { 1375 | name: "bn4_scale2_bias_iner" 1376 | lr_mult: 0 1377 | decay_mult: 0 1378 | } 1379 | batch_norm_param { 1380 | use_global_stats: false 1381 | } 1382 | } 1383 | layer { 1384 | name: "bn4_scale3_iner" 1385 | type: "BatchNorm" 1386 | bottom: "conv4_scale3_iner" 1387 | top: "bn4_scale3_iner" 1388 | param { 1389 | name: "bn4_scale3_mean_iner" 1390 | lr_mult: 0 1391 | decay_mult: 0 1392 | } 1393 | param { 1394 | name: "bn4_scale3_var_iner" 1395 | lr_mult: 0 1396 | decay_mult: 0 1397 | } 1398 | param { 1399 | name: "bn4_scale3_bias_iner" 1400 | lr_mult: 0 1401 | decay_mult: 0 1402 | } 1403 | batch_norm_param { 1404 | use_global_stats: false 1405 | } 1406 | } 1407 | layer { 1408 | name: "bn4_iner" 1409 | type: "Concat" 1410 | bottom: "bn4_scale1_iner" 1411 | bottom: "bn4_scale2_iner" 1412 | bottom: "bn4_scale3_iner" 1413 | top: "bn4_iner" 1414 | concat_param { 1415 | axis: 1 1416 | } 1417 | } 1418 | layer { 1419 | name: "relu4_iner" 1420 | type: "ReLU" 1421 | bottom: "bn4_iner" 1422 | top: "bn4_iner" 1423 | } 1424 | layer { 1425 | name: "pool4_iner" 1426 | type: "Pooling" 1427 | bottom: "bn4_iner" 1428 | top: "pool4_iner" 1429 | pooling_param { 1430 | pool: MAX 1431 | kernel_size: 2 1432 | stride: 2 1433 | } 1434 | } 1435 | layer { 1436 | name: "fc1_iner" 1437 | type: "InnerProduct" 1438 | bottom: "pool4_iner" 1439 | top: "fc1_iner" 1440 | param { 1441 | name: "fc1_w_iner" 1442 | lr_mult: 1 1443 | decay_mult: 1 1444 | } 1445 | param { 1446 | name: "fc1_b_iner" 1447 | lr_mult: 2 1448 | decay_mult: 0 1449 | } 1450 | inner_product_param { 1451 | num_output: 128 1452 | weight_filler { 1453 | type: "xavier" 1454 | } 1455 | bias_filler { 1456 | type: "constant" 1457 | } 1458 | } 1459 | } 1460 | layer { 1461 | name: "fc1_iner_drop" 1462 | type: "Dropout" 1463 | bottom: "fc1_iner" 1464 | top: "fc1_iner" 1465 | dropout_param { 1466 | dropout_ratio: 0.2 1467 | } 1468 | } 1469 | 1470 | layer { 1471 | name: "fc2_iner" 1472 | type: "InnerProduct" 1473 | bottom: "fc1_iner" 1474 | top: "fc2_iner" 1475 | param { 1476 | lr_mult: 1 1477 | decay_mult: 1 1478 | } 1479 | param { 1480 | lr_mult: 2 1481 | decay_mult: 0 1482 | } 1483 | 
inner_product_param { 1484 | num_output: 751 1485 | weight_filler { 1486 | type: "xavier" 1487 | } 1488 | bias_filler { 1489 | type: "constant" 1490 | } 1491 | } 1492 | } 1493 | layer { 1494 | name: "loss_cls_iner" 1495 | type: "SoftmaxWithLoss" 1496 | bottom: "fc2_iner" 1497 | bottom: "label" 1498 | top: "loss_cls_iner" 1499 | loss_weight: 1 1500 | } 1501 | layer { 1502 | name: "acc_cls_iner" 1503 | type: "Accuracy" 1504 | bottom: "fc2_iner" 1505 | bottom: "label" 1506 | top: "acc_cls_iner" 1507 | } 1508 | 1509 | layer { 1510 | name: "pool2_exter" 1511 | type: "Pooling" 1512 | bottom: "bn2_att_exter" 1513 | top: "pool2_exter" 1514 | pooling_param { 1515 | pool: MAX 1516 | kernel_size: 2 1517 | stride: 2 1518 | } 1519 | } 1520 | 1521 | layer { 1522 | name: "conv3_scale1_exter" 1523 | type: "Convolution" 1524 | bottom: "pool2_exter" 1525 | top: "conv3_scale1_exter" 1526 | param { 1527 | name: "conv3_scale1_w_exter" 1528 | lr_mult: 1 1529 | decay_mult: 1 1530 | } 1531 | param { 1532 | name: "conv3_scale1_b_exter" 1533 | lr_mult: 2 1534 | decay_mult: 0 1535 | } 1536 | convolution_param { 1537 | num_output: 32 1538 | pad: 1 1539 | kernel_size: 3 1540 | stride: 1 1541 | weight_filler { 1542 | type: "xavier" 1543 | } 1544 | bias_filler { 1545 | type: "constant" 1546 | } 1547 | dilation: 1 1548 | } 1549 | } 1550 | layer { 1551 | name: "conv3_scale2_exter" 1552 | type: "Convolution" 1553 | bottom: "pool2_exter" 1554 | top: "conv3_scale2_exter" 1555 | param { 1556 | name: "conv3_scale2_w_exter" 1557 | lr_mult: 1 1558 | decay_mult: 1 1559 | } 1560 | param { 1561 | name: "conv3_scale2_b_exter" 1562 | lr_mult: 2 1563 | decay_mult: 0 1564 | } 1565 | convolution_param { 1566 | num_output: 32 1567 | pad: 2 1568 | kernel_size: 3 1569 | stride: 1 1570 | weight_filler { 1571 | type: "xavier" 1572 | } 1573 | bias_filler { 1574 | type: "constant" 1575 | } 1576 | dilation: 2 1577 | } 1578 | } 1579 | layer { 1580 | name: "conv3_scale3_exter" 1581 | type: "Convolution" 1582 | bottom: "pool2_exter" 1583 | top: "conv3_scale3_exter" 1584 | param { 1585 | name: "conv3_scale3_w_exter" 1586 | lr_mult: 1 1587 | decay_mult: 1 1588 | } 1589 | param { 1590 | name: "conv3_scale3_b_exter" 1591 | lr_mult: 2 1592 | decay_mult: 0 1593 | } 1594 | convolution_param { 1595 | num_output: 32 1596 | pad: 3 1597 | kernel_size: 3 1598 | stride: 1 1599 | weight_filler { 1600 | type: "xavier" 1601 | } 1602 | bias_filler { 1603 | type: "constant" 1604 | } 1605 | dilation: 3 1606 | } 1607 | } 1608 | layer { 1609 | name: "bn3_scale1_exter" 1610 | type: "BatchNorm" 1611 | bottom: "conv3_scale1_exter" 1612 | top: "bn3_scale1_exter" 1613 | param { 1614 | name: "bn3_scale1_mean_exter" 1615 | lr_mult: 0 1616 | decay_mult: 0 1617 | } 1618 | param { 1619 | name: "bn3_scale1_var_exter" 1620 | lr_mult: 0 1621 | decay_mult: 0 1622 | } 1623 | param { 1624 | name: "bn3_scale1_bias_exter" 1625 | lr_mult: 0 1626 | decay_mult: 0 1627 | } 1628 | batch_norm_param { 1629 | use_global_stats: false 1630 | } 1631 | } 1632 | layer { 1633 | name: "bn3_scale2_exter" 1634 | type: "BatchNorm" 1635 | bottom: "conv3_scale2_exter" 1636 | top: "bn3_scale2_exter" 1637 | param { 1638 | name: "bn3_scale2_mean_exter" 1639 | lr_mult: 0 1640 | decay_mult: 0 1641 | } 1642 | param { 1643 | name: "bn3_scale2_var_exter" 1644 | lr_mult: 0 1645 | decay_mult: 0 1646 | } 1647 | param { 1648 | name: "bn3_scale2_bias_exter" 1649 | lr_mult: 0 1650 | decay_mult: 0 1651 | } 1652 | batch_norm_param { 1653 | use_global_stats: false 1654 | } 1655 | } 1656 | layer { 1657 | name: 
"bn3_scale3_exter" 1658 | type: "BatchNorm" 1659 | bottom: "conv3_scale3_exter" 1660 | top: "bn3_scale3_exter" 1661 | param { 1662 | name: "bn3_scale3_mean_exter" 1663 | lr_mult: 0 1664 | decay_mult: 0 1665 | } 1666 | param { 1667 | name: "bn3_scale3_var_exter" 1668 | lr_mult: 0 1669 | decay_mult: 0 1670 | } 1671 | param { 1672 | name: "bn3_scale3_bias_exter" 1673 | lr_mult: 0 1674 | decay_mult: 0 1675 | } 1676 | batch_norm_param { 1677 | use_global_stats: false 1678 | } 1679 | } 1680 | layer { 1681 | name: "bn3_exter" 1682 | type: "Concat" 1683 | bottom: "bn3_scale1_exter" 1684 | bottom: "bn3_scale2_exter" 1685 | bottom: "bn3_scale3_exter" 1686 | top: "bn3_exter" 1687 | concat_param { 1688 | axis: 1 1689 | } 1690 | } 1691 | layer { 1692 | name: "relu3_exter" 1693 | type: "ReLU" 1694 | bottom: "bn3_exter" 1695 | top: "bn3_exter" 1696 | } 1697 | layer { 1698 | name: "pool3_exter" 1699 | type: "Pooling" 1700 | bottom: "bn3_exter" 1701 | top: "pool3_exter" 1702 | pooling_param { 1703 | pool: MAX 1704 | kernel_size: 2 1705 | stride: 2 1706 | } 1707 | } 1708 | layer { 1709 | name: "conv4_scale1_exter" 1710 | type: "Convolution" 1711 | bottom: "pool3_exter" 1712 | top: "conv4_scale1_exter" 1713 | param { 1714 | name: "conv4_scale1_w_exter" 1715 | lr_mult: 1 1716 | decay_mult: 1 1717 | } 1718 | param { 1719 | name: "conv4_scale1_b_exter" 1720 | lr_mult: 2 1721 | decay_mult: 0 1722 | } 1723 | convolution_param { 1724 | num_output: 32 1725 | pad: 1 1726 | kernel_size: 3 1727 | stride: 1 1728 | weight_filler { 1729 | type: "xavier" 1730 | } 1731 | bias_filler { 1732 | type: "constant" 1733 | } 1734 | dilation: 1 1735 | } 1736 | } 1737 | layer { 1738 | name: "conv4_scale2_exter" 1739 | type: "Convolution" 1740 | bottom: "pool3_exter" 1741 | top: "conv4_scale2_exter" 1742 | param { 1743 | name: "conv4_scale2_w_exter" 1744 | lr_mult: 1 1745 | decay_mult: 1 1746 | } 1747 | param { 1748 | name: "conv4_scale2_b_exter" 1749 | lr_mult: 2 1750 | decay_mult: 0 1751 | } 1752 | convolution_param { 1753 | num_output: 32 1754 | pad: 2 1755 | kernel_size: 3 1756 | stride: 1 1757 | weight_filler { 1758 | type: "xavier" 1759 | } 1760 | bias_filler { 1761 | type: "constant" 1762 | } 1763 | dilation: 2 1764 | } 1765 | } 1766 | layer { 1767 | name: "conv4_scale3_exter" 1768 | type: "Convolution" 1769 | bottom: "pool3_exter" 1770 | top: "conv4_scale3_exter" 1771 | param { 1772 | name: "conv4_scale3_w_exter" 1773 | lr_mult: 1 1774 | decay_mult: 1 1775 | } 1776 | param { 1777 | name: "conv4_scale3_b_exter" 1778 | lr_mult: 2 1779 | decay_mult: 0 1780 | } 1781 | convolution_param { 1782 | num_output: 32 1783 | pad: 3 1784 | kernel_size: 3 1785 | stride: 1 1786 | weight_filler { 1787 | type: "xavier" 1788 | } 1789 | bias_filler { 1790 | type: "constant" 1791 | } 1792 | dilation: 3 1793 | } 1794 | } 1795 | layer { 1796 | name: "bn4_scale1_exter" 1797 | type: "BatchNorm" 1798 | bottom: "conv4_scale1_exter" 1799 | top: "bn4_scale1_exter" 1800 | param { 1801 | name: "bn4_scale1_mean_exter" 1802 | lr_mult: 0 1803 | decay_mult: 0 1804 | } 1805 | param { 1806 | name: "bn4_scale1_var_exter" 1807 | lr_mult: 0 1808 | decay_mult: 0 1809 | } 1810 | param { 1811 | name: "bn4_scale1_bias_exter" 1812 | lr_mult: 0 1813 | decay_mult: 0 1814 | } 1815 | batch_norm_param { 1816 | use_global_stats: false 1817 | } 1818 | } 1819 | layer { 1820 | name: "bn4_scale2_exter" 1821 | type: "BatchNorm" 1822 | bottom: "conv4_scale2_exter" 1823 | top: "bn4_scale2_exter" 1824 | param { 1825 | name: "bn4_scale2_mean_exter" 1826 | lr_mult: 0 1827 | decay_mult: 
0 1828 | } 1829 | param { 1830 | name: "bn4_scale2_var_exter" 1831 | lr_mult: 0 1832 | decay_mult: 0 1833 | } 1834 | param { 1835 | name: "bn4_scale2_bias_exter" 1836 | lr_mult: 0 1837 | decay_mult: 0 1838 | } 1839 | batch_norm_param { 1840 | use_global_stats: false 1841 | } 1842 | } 1843 | layer { 1844 | name: "bn4_scale3_exter" 1845 | type: "BatchNorm" 1846 | bottom: "conv4_scale3_exter" 1847 | top: "bn4_scale3_exter" 1848 | param { 1849 | name: "bn4_scale3_mean_exter" 1850 | lr_mult: 0 1851 | decay_mult: 0 1852 | } 1853 | param { 1854 | name: "bn4_scale3_var_exter" 1855 | lr_mult: 0 1856 | decay_mult: 0 1857 | } 1858 | param { 1859 | name: "bn4_scale3_bias_exter" 1860 | lr_mult: 0 1861 | decay_mult: 0 1862 | } 1863 | batch_norm_param { 1864 | use_global_stats: false 1865 | } 1866 | } 1867 | layer { 1868 | name: "bn4_exter" 1869 | type: "Concat" 1870 | bottom: "bn4_scale1_exter" 1871 | bottom: "bn4_scale2_exter" 1872 | bottom: "bn4_scale3_exter" 1873 | top: "bn4_exter" 1874 | concat_param { 1875 | axis: 1 1876 | } 1877 | } 1878 | layer { 1879 | name: "relu4_exter" 1880 | type: "ReLU" 1881 | bottom: "bn4_exter" 1882 | top: "bn4_exter" 1883 | } 1884 | layer { 1885 | name: "pool4_exter" 1886 | type: "Pooling" 1887 | bottom: "bn4_exter" 1888 | top: "pool4_exter" 1889 | pooling_param { 1890 | pool: MAX 1891 | kernel_size: 2 1892 | stride: 2 1893 | } 1894 | } 1895 | layer { 1896 | name: "fc1_exter" 1897 | type: "InnerProduct" 1898 | bottom: "pool4_exter" 1899 | top: "fc1_exter" 1900 | param { 1901 | name: "fc1_w_exter" 1902 | lr_mult: 1 1903 | decay_mult: 1 1904 | } 1905 | param { 1906 | name: "fc1_b_exter" 1907 | lr_mult: 2 1908 | decay_mult: 0 1909 | } 1910 | inner_product_param { 1911 | num_output: 128 1912 | weight_filler { 1913 | type: "xavier" 1914 | } 1915 | bias_filler { 1916 | type: "constant" 1917 | } 1918 | } 1919 | } 1920 | layer { 1921 | name: "fc1_exter_drop" 1922 | type: "Dropout" 1923 | bottom: "fc1_exter" 1924 | top: "fc1_exter" 1925 | dropout_param { 1926 | dropout_ratio: 0.2 1927 | } 1928 | } 1929 | 1930 | layer { 1931 | name: "fc2_exter" 1932 | type: "InnerProduct" 1933 | bottom: "fc1_exter" 1934 | top: "fc2_exter" 1935 | param { 1936 | lr_mult: 1 1937 | decay_mult: 1 1938 | } 1939 | param { 1940 | lr_mult: 2 1941 | decay_mult: 0 1942 | } 1943 | inner_product_param { 1944 | num_output: 751 # Number of identities in the Market-1501 training set. 1945 | weight_filler { 1946 | type: "xavier" 1947 | } 1948 | bias_filler { 1949 | type: "constant" 1950 | } 1951 | } 1952 | } 1953 | layer { 1954 | name: "loss_cls_exter" 1955 | type: "SoftmaxWithLoss" 1956 | bottom: "fc2_exter" 1957 | bottom: "label_plus" 1958 | top: "loss_cls_exter" 1959 | loss_weight: 1 1960 | } 1961 | layer { 1962 | name: "acc_cls_exter" 1963 | type: "Accuracy" 1964 | bottom: "fc2_exter" 1965 | bottom: "label_plus" 1966 | top: "acc_cls_exter" 1967 | } 1968 | 1969 | layer { 1970 | name: "loss_pull" 1971 | type: "ContrastiveLoss" 1972 | bottom: "fc1_iner" 1973 | bottom: "fc1_full" 1974 | bottom: "sim_iner" 1975 | top: "loss_pull" 1976 | loss_weight: 0.01 1977 | contrastive_loss_param { 1978 | margin: 0.1 # We set the margin to 0 or a small value for body regions. 1979 | } 1980 | } 1981 | 1982 | layer { 1983 | name: "loss_push" 1984 | type: "ContrastiveLoss" 1985 | bottom: "fc1_exter" 1986 | bottom: "fc1_full" 1987 | bottom: "sim_exter" 1988 | top: "loss_push" 1989 | loss_weight: 0.01 1990 | contrastive_loss_param { 1991 | margin: 100 # We set the margin to 100 or 10 for background regions.
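# The two ContrastiveLoss terms implement the contrastive attention objective:
# 'loss_pull' (its similarity input sim_iner is filled with ones by the data
# layer) penalizes the squared distance between the body-stream embedding
# fc1_iner and the full-image embedding fc1_full, while 'loss_push' (sim_exter
# is filled with zeros) penalizes the background-stream embedding fc1_exter
# whenever it lies closer to fc1_full than the margin.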
1992 | } 1993 | } -------------------------------------------------------------------------------- /experiments/market1501/run_mgcam.sh: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env sh 2 | LOG=./mgcam-`date +%Y-%m-%d-%H-%M-%S`.log 3 | CAFFE=/path-to-caffe/build/tools/caffe 4 | 5 | $CAFFE train --solver=./solver_mgcam.prototxt --gpu=0 2>&1 | tee $LOG 6 | -------------------------------------------------------------------------------- /experiments/market1501/run_mgcam_siamese.sh: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env sh 2 | LOG=./mgcam-siamese-`date +%Y-%m-%d-%H-%M-%S`.log 3 | CAFFE=/path-to-caffe/build/tools/caffe 4 | 5 | $CAFFE train --solver=./solver_mgcam_siamese.prototxt --weights=./mgcam_iter_75000.caffemodel --gpu=0 2>&1 | tee $LOG 6 | -------------------------------------------------------------------------------- /experiments/market1501/solver_mgcam.prototxt: -------------------------------------------------------------------------------- 1 | net: "mgcam_train.prototxt" 2 | 3 | test_iter: 10 4 | test_interval: 1000 5 | base_lr: 0.01 6 | lr_policy: "step" 7 | gamma: 0.1 8 | stepsize: 15000 9 | display: 10 10 | max_iter: 75000 11 | momentum: 0.9 12 | weight_decay: 0.005 13 | snapshot: 5000 14 | snapshot_prefix: "mgcam" 15 | solver_mode: GPU 16 | -------------------------------------------------------------------------------- /experiments/market1501/solver_mgcam_siamese.prototxt: -------------------------------------------------------------------------------- 1 | net: "mgcam_siamese_train.prototxt" 2 | 3 | test_iter: 10 4 | test_interval: 1000 5 | base_lr: 0.0001 6 | lr_policy: "step" 7 | gamma: 0.1 8 | stepsize: 10000 9 | display: 10 10 | max_iter: 20000 11 | momentum: 0.9 12 | weight_decay: 0.005 13 | snapshot: 5000 14 | snapshot_prefix: "mgcam_siamese" 15 | solver_mode: GPU 16 | -------------------------------------------------------------------------------- /experiments/mars/layers.py: -------------------------------------------------------------------------------- 1 | """ 2 | MGCAM data layer. 3 | 4 | by Chunfeng Song 5 | 6 | 2017/10/08 7 | 8 | This code is for research use only; please cite our paper: 9 | 10 | Chunfeng Song, Yan Huang, Wanli Ouyang, and Liang Wang. Mask-guided Contrastive Attention Model for Person Re-Identification. In CVPR, 2018. 11 | 12 | Contact us: chunfeng.song@nlpr.ia.ac.cn 13 | """ 14 | 15 | import caffe 16 | import numpy as np 17 | import yaml 18 | from random import shuffle 19 | import numpy.random as nr 20 | import cv2 21 | import os 22 | import pickle as cPickle 23 | import pdb 24 | 25 | def mypickle(filename, data): 26 | fo = open(filename, "wb") 27 | cPickle.dump(data, fo, protocol=cPickle.HIGHEST_PROTOCOL) 28 | fo.close() 29 | 30 | def myunpickle(filename): 31 | if not os.path.exists(filename): 32 | raise IOError("Path '%s' does not exist." % filename) 33 | 34 | fo = open(filename, 'rb') 35 | data_dict = cPickle.load(fo) 36 | fo.close() 37 | return data_dict 38 | 39 | class MGCAM_DataLayer(caffe.Layer): 40 | """Data layer for training""" 41 | def setup(self, bottom, top): 42 | self.width = 64 43 | self.height = 160 # We resize all images into a size of 160*64. 44 | self.width_gt = 16 45 | self.height_gt = 40 # We resize all masks which are used to supervise attention learning into a size of 40*16.
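# param_str is parsed below as a YAML dict; the train prototxt passes, e.g.,
# "{'batch_size': 128,'im_path':'../../data/mars/bounding_box_train_fold',
#  'gt_path':'../../data/mars/bounding_box_train_seg_fold/','dataset':'mars'}".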
46 | layer_params = yaml.safe_load(self.param_str) 47 | self.batch_size = layer_params['batch_size'] 48 | self.im_path = layer_params['im_path'] 49 | self.gt_path = layer_params['gt_path'] 50 | self.dataset = layer_params['dataset'] 51 | self.labels, self.im_list, self.gt_list, self.im_dir_list, self.gt_dir_list = self.data_processor(self.dataset) 52 | self.idx = 0 53 | self.data_num = len(self.im_list) # Number of data pairs 54 | self.rnd_list = np.arange(self.data_num) # Index list over the images; shuffled below. 55 | shuffle(self.rnd_list) 56 | 57 | def forward(self, bottom, top): 58 | # Assign forward data 59 | top[0].data[...] = self.im 60 | top[1].data[...] = self.inner_label 61 | top[2].data[...] = self.exter_label 62 | top[3].data[...] = self.one_mask 63 | top[4].data[...] = self.label 64 | top[5].data[...] = self.label_plus 65 | top[6].data[...] = self.gt 66 | top[7].data[...] = self.mask 67 | 68 | def backward(self, top, propagate_down, bottom): 69 | """This layer does not propagate gradients.""" 70 | pass 71 | 72 | def reshape(self, bottom, top): 73 | # Load image + label image pairs 74 | self.im = [] 75 | self.label = [] 76 | self.inner_label = [] 77 | self.exter_label = [] 78 | self.one_mask = [] 79 | self.label_plus = [] 80 | self.gt = [] 81 | self.mask = [] 82 | 83 | for i in range(self.batch_size): 84 | if self.idx == self.data_num: 85 | self.idx = 0 86 | shuffle(self.rnd_list) #Randomly shuffle the list. 87 | cur_idx = self.rnd_list[self.idx] 88 | im_path = self.im_list[cur_idx] 89 | gt_path = self.gt_list[cur_idx] 90 | im_, gt_, mask_ = self.load_data(im_path, gt_path) 91 | self.im.append(im_) 92 | self.gt.append(gt_) 93 | self.mask.append(mask_) 94 | self.label.append(self.labels[cur_idx]) 95 | self.inner_label.append(int(1)) 96 | self.exter_label.append(int(0)) 97 | one_mask_ = np.zeros((1,40,16),dtype = np.float32) 98 | one_mask_ = one_mask_ + 1.0 99 | self.one_mask.append(one_mask_) 100 | self.label_plus.append(self.labels[cur_idx]) #Here, we also give the ID labels to the background stream.
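# one_mask is an all-ones 40x16 map; in the prototxt the predicted attention
# mask is subtracted from it (Eltwise SUM with coefficients 1 and -1) to form
# the inverted mask that gates the background stream.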
101 | self.idx +=1 102 | 103 | self.im = np.array(self.im).astype(np.float32) 104 | self.inner_label = np.array(self.inner_label).astype(np.float32) 105 | self.exter_label = np.array(self.exter_label).astype(np.float32) 106 | self.one_mask = np.array(self.one_mask).astype(np.float32) 107 | self.label = np.array(self.label).astype(np.float32) 108 | self.label_plus = np.array(self.label_plus).astype(np.float32) 109 | self.gt = np.array(self.gt).astype(np.float32) 110 | self.mask = np.array(self.mask).astype(np.float32) 111 | # Reshape tops to fit blobs 112 | top[0].reshape(*self.im.shape) 113 | top[1].reshape(*self.inner_label.shape) 114 | top[2].reshape(*self.exter_label.shape) 115 | top[3].reshape(*self.one_mask.shape) 116 | top[4].reshape(*self.label.shape) 117 | top[5].reshape(*self.label_plus.shape) 118 | top[6].reshape(*self.gt.shape) 119 | top[7].reshape(*self.mask.shape) 120 | 121 | def data_processor(self, data_name): 122 | data_dic = './' + data_name 123 | if not os.path.exists(data_dic): 124 | im_list = [] 125 | gt_list = [] 126 | labels = [] 127 | im_dir_list = [] 128 | gt_dir_list = [] 129 | new_id = 0 130 | id_list = np.sort(os.listdir(self.im_path)) 131 | for id in id_list: 132 | im_dir = os.path.join(self.im_path, id) 133 | gt_dir = os.path.join(self.gt_path, id) 134 | if not os.path.exists(im_dir): 135 | continue 136 | pic_im_list = np.sort(os.listdir(im_dir)) 137 | if len(pic_im_list)>1: 138 | for pic in pic_im_list: 139 | this_dir = os.path.join(self.im_path, id, pic) 140 | gt_pic = pic 141 | if not pic.lower().endswith('.png'): 142 | gt_pic = pic[:-4] + '.png' 143 | this_gt_dir = os.path.join(self.gt_path, id, gt_pic) 144 | im_list.append(this_dir) 145 | gt_list.append(this_gt_dir) 146 | labels.append(int(new_id)) 147 | new_id +=1 148 | im_dir_list.append(im_dir) 149 | gt_dir_list.append(gt_dir) 150 | dic = {'im_list':im_list,'gt_list':gt_list,'labels':labels,'im_dir_list':im_dir_list,'gt_dir_list':gt_dir_list} 151 | mypickle(data_dic, dic) 152 | # Load saved data dict to resume. 153 | else: 154 | dic = myunpickle(data_dic) 155 | im_list = dic['im_list'] 156 | gt_list = dic['gt_list'] 157 | labels = dic['labels'] 158 | im_dir_list = dic['im_dir_list'] 159 | gt_dir_list = dic['gt_dir_list'] 160 | return labels, im_list, gt_list, im_dir_list, gt_dir_list 161 | 162 | def load_data(self, im_path, gt_path): 163 | """ 164 | Load input image and preprocess for Caffe: 165 | - cast to float 166 | - switch channels RGB -> BGR 167 | - subtract mean 168 | - transpose to channel x height x width order 169 | """ 170 | oim = cv2.imread(im_path) 171 | inputImage = cv2.resize(oim, (self.width, self.height)) 172 | inputImage = np.array(inputImage, dtype=np.float32) 173 | 174 | # Subtract the per-channel (BGR) mean 175 | inputImage[:, :, 0] = inputImage[:, :, 0] - 104.008 176 | inputImage[:, :, 1] = inputImage[:, :, 1] - 116.669 177 | inputImage[:, :, 2] = inputImage[:, :, 2] - 122.675 178 | 179 | # Random horizontal flip, then permute to channel x height x width 180 | if_flip = nr.randint(2) 181 | if if_flip == 0: # Also flip the image with 50% probability 182 | inputImage = inputImage[:,::-1,:] 183 | inputImage = inputImage.transpose([2, 0, 1]) 184 | inputImage = inputImage/256.0 185 | #GT 186 | mask_im = cv2.cvtColor(cv2.imread(gt_path),cv2.COLOR_BGR2GRAY) 187 | inputGt = np.array(cv2.resize(mask_im, (self.width_gt, self.height_gt)), dtype=np.float32) 188 | inputGt = inputGt/255.0 189 | if if_flip == 0: 190 | inputGt = inputGt[:,::-1] 191 | inputGt = inputGt[np.newaxis, ...]
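# The full-resolution mask prepared below is shifted into roughly [-0.5, 0.5]
# and concatenated with the image as a fourth input channel by the
# 'data_concate' layer in the prototxt.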
192 | #Mask 193 | inputMask = np.array(cv2.resize(mask_im, (self.width, self.height)), dtype=np.float32) 194 | inputMask = inputMask-127.5 195 | inputMask = inputMask/255.0 196 | if if_flip == 0: 197 | inputMask = inputMask[:,::-1] 198 | inputMask = inputMask[np.newaxis, ...] 199 | return inputImage, inputGt, inputMask 200 | 201 | 202 | class MGCAM_SIA_DataLayer(caffe.Layer): 203 | """Data layer for training""" 204 | def setup(self, bottom, top): 205 | self.width = 64 206 | self.height = 160 # We resize all images into a size of 160*64. 207 | self.width_gt = 16 208 | self.height_gt = 40 # We resize all masks which are used to supervise attention learning into a size of 40*16. 209 | 210 | layer_params = yaml.safe_load(self.param_str) 211 | self.batch_size = layer_params['batch_size'] 212 | self.pos_pair_num = int(0.30*self.batch_size) # There will be at least 30 percent positive pairs in each batch. 213 | self.im_path = layer_params['im_path'] 214 | self.gt_path = layer_params['gt_path'] 215 | self.dataset = layer_params['dataset'] 216 | self.labels, self.im_list, self.gt_list, self.im_dir_list, self.gt_dir_list = self.data_processor(self.dataset) 217 | self.idx = 0 218 | self.data_num = len(self.im_list) 219 | self.rnd_list = np.arange(self.data_num) 220 | shuffle(self.rnd_list) 221 | 222 | def forward(self, bottom, top): 223 | # Assign forward data 224 | top[0].data[...] = self.im 225 | top[1].data[...] = self.inner_label 226 | top[2].data[...] = self.exter_label 227 | top[3].data[...] = self.one_mask 228 | top[4].data[...] = self.label 229 | top[5].data[...] = self.label_plus 230 | top[6].data[...] = self.gt 231 | top[7].data[...] = self.mask 232 | top[8].data[...] = self.siam_label 233 | 234 | def backward(self, top, propagate_down, bottom): 235 | """This layer does not propagate gradients.""" 236 | pass 237 | 238 | def reshape(self, bottom, top): 239 | # Load image + label image pairs 240 | self.im = [] 241 | self.label = [] 242 | self.inner_label = [] 243 | self.exter_label = [] 244 | self.one_mask = [] 245 | self.label_plus = [] 246 | self.gt = [] 247 | self.mask = [] 248 | self.siam_label = [] 249 | 250 | for i in range(self.batch_size): 251 | if self.idx == self.data_num: 252 | self.idx = 0 253 | shuffle(self.rnd_list) 254 | cur_idx = self.rnd_list[self.idx] 255 | im_path = self.im_list[cur_idx] 256 | gt_path = self.gt_list[cur_idx] 257 | im_, gt_, mask_ = self.load_data(im_path, gt_path) 258 | self.im.append(im_) 259 | self.gt.append(gt_) 260 | self.mask.append(mask_) 261 | self.label.append(self.labels[cur_idx]) 262 | self.inner_label.append(int(1)) 263 | self.exter_label.append(int(0)) 264 | one_mask_ = np.zeros((1,40,16),dtype = np.float32) 265 | one_mask_ = one_mask_ + 1.0 266 | self.one_mask.append(one_mask_) 267 | self.label_plus.append(self.labels[cur_idx])#Labels for backgrounds. We use the same labels as the other two streams here. 268 | self.idx +=1 269 | 270 | for i in range(self.batch_size): 271 | if i > self.pos_pair_num: 272 | if self.idx == self.data_num: 273 | self.idx = 0 274 | shuffle(self.rnd_list)#Randomly shuffle the list. 275 | cur_idx = self.rnd_list[self.idx] 276 | self.idx +=1 277 | im_path = self.im_list[cur_idx] 278 | gt_path = self.gt_list[cur_idx] 279 | label = self.labels[cur_idx] 280 | if label==self.label[i]: 281 | self.siam_label.append(int(1))#A randomly drawn pair may still be positive, though rarely. 282 | else: 283 | self.siam_label.append(int(0))#Negative pairs.
284 | else: 285 | im_dir = self.im_dir_list[self.label[i]] 286 | gt_dir = self.gt_dir_list[self.label[i]] 287 | im_list = np.sort(os.listdir(im_dir)) 288 | gt_list = np.sort(os.listdir(gt_dir)) 289 | tmp_list = np.arange(len(im_list)) 290 | shuffle(tmp_list) #Randomly select one. 291 | im_path = os.path.join(im_dir, im_list[tmp_list[0]]) 292 | gt_path = os.path.join(gt_dir, gt_list[tmp_list[0]]) 293 | label = self.label[i] 294 | self.siam_label.append(int(1))#This is a positive pair. 295 | 296 | im_, gt_, mask_ = self.load_data(im_path, gt_path) 297 | self.im.append(im_) 298 | self.gt.append(gt_) 299 | self.mask.append(mask_) 300 | self.label.append(label) 301 | self.inner_label.append(int(1))#Always ones, for contrastive learning. 302 | self.exter_label.append(int(0)) 303 | one_mask_ = np.zeros((1,40,16),dtype = np.float32) 304 | one_mask_ = one_mask_ + 1.0 305 | self.one_mask.append(one_mask_) 306 | self.label_plus.append(label) 307 | 308 | self.im = np.array(self.im).astype(np.float32) 309 | self.inner_label = np.array(self.inner_label).astype(np.float32) 310 | self.exter_label = np.array(self.exter_label).astype(np.float32) 311 | self.one_mask = np.array(self.one_mask).astype(np.float32) 312 | self.label = np.array(self.label).astype(np.float32) 313 | self.label_plus = np.array(self.label_plus).astype(np.float32) 314 | self.gt = np.array(self.gt).astype(np.float32) 315 | self.mask = np.array(self.mask).astype(np.float32) 316 | self.siam_label = np.array(self.siam_label).astype(np.float32) 317 | # Reshape tops to fit blobs 318 | top[0].reshape(*self.im.shape) 319 | top[1].reshape(*self.inner_label.shape) 320 | top[2].reshape(*self.exter_label.shape) 321 | top[3].reshape(*self.one_mask.shape) 322 | top[4].reshape(*self.label.shape) 323 | top[5].reshape(*self.label_plus.shape) 324 | top[6].reshape(*self.gt.shape) 325 | top[7].reshape(*self.mask.shape) 326 | top[8].reshape(*self.siam_label.shape) 327 | 328 | def data_processor(self, data_name): 329 | data_dic = './' + data_name 330 | if not os.path.exists(data_dic): 331 | im_list = [] 332 | gt_list = [] 333 | labels = [] 334 | im_dir_list = [] 335 | gt_dir_list = [] 336 | new_id = 0 337 | id_list = np.sort(os.listdir(self.im_path)) 338 | for id in id_list: 339 | im_dir = os.path.join(self.im_path, id) 340 | gt_dir = os.path.join(self.gt_path, id) 341 | if not os.path.exists(im_dir): 342 | continue 343 | pic_im_list = np.sort(os.listdir(im_dir)) 344 | if len(pic_im_list)>1: 345 | for pic in pic_im_list: 346 | this_dir = os.path.join(self.im_path, id, pic) 347 | gt_pic = pic 348 | if not pic.lower().endswith('.png'): 349 | gt_pic = pic[:-4] + '.png' 350 | this_gt_dir = os.path.join(self.gt_path, id, gt_pic) 351 | im_list.append(this_dir) 352 | gt_list.append(this_gt_dir) 353 | labels.append(int(new_id)) 354 | new_id +=1 355 | im_dir_list.append(im_dir) 356 | gt_dir_list.append(gt_dir) 357 | dic = {'im_list':im_list,'gt_list':gt_list,'labels':labels,'im_dir_list':im_dir_list,'gt_dir_list':gt_dir_list} 358 | mypickle(data_dic, dic) 359 | # Load saved data dict to resume.
360 | else: 361 | dic = myunpickle(data_dic) 362 | im_list = dic['im_list'] 363 | gt_list = dic['gt_list'] 364 | labels = dic['labels'] 365 | im_dir_list = dic['im_dir_list'] 366 | gt_dir_list = dic['gt_dir_list'] 367 | return labels, im_list, gt_list, im_dir_list, gt_dir_list 368 | 369 | def load_data(self, im_path, gt_path): 370 | """ 371 | Load input image and preprocess for Caffe: 372 | - cast to float 373 | - switch channels RGB -> BGR 374 | - subtract mean 375 | - transpose to channel x height x width order 376 | """ 377 | oim = cv2.imread(im_path) 378 | inputImage = cv2.resize(oim, (self.width, self.height)) 379 | inputImage = np.array(inputImage, dtype=np.float32) 380 | 381 | # Subtract the per-channel (BGR) mean 382 | inputImage[:, :, 0] = inputImage[:, :, 0] - 104.008 383 | inputImage[:, :, 1] = inputImage[:, :, 1] - 116.669 384 | inputImage[:, :, 2] = inputImage[:, :, 2] - 122.675 385 | 386 | # Random horizontal flip, then permute to channel x height x width 387 | if_flip = nr.randint(2) 388 | if if_flip == 0: # Also flip the image with 50% probability 389 | inputImage = inputImage[:,::-1,:] 390 | inputImage = inputImage.transpose([2, 0, 1]) 391 | inputImage = inputImage/256.0 392 | #GT 393 | mask_im = cv2.cvtColor(cv2.imread(gt_path),cv2.COLOR_BGR2GRAY) 394 | inputGt = np.array(cv2.resize(mask_im, (self.width_gt, self.height_gt)), dtype=np.float32) 395 | inputGt = inputGt/255.0 396 | if if_flip == 0: 397 | inputGt = inputGt[:,::-1] 398 | inputGt = inputGt[np.newaxis, ...] 399 | #Mask 400 | inputMask = np.array(cv2.resize(mask_im, (self.width, self.height)), dtype=np.float32) 401 | inputMask = inputMask-127.5 402 | inputMask = inputMask/255.0 403 | if if_flip == 0: 404 | inputMask = inputMask[:,::-1] 405 | inputMask = inputMask[np.newaxis, ...] 406 | return inputImage, inputGt, inputMask 407 | -------------------------------------------------------------------------------- /experiments/mars/mgcam_iter_75000.caffemodel: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/developfeng/MGCAM/1b5957218ecaa7f13bf2107bc41b1349c05be9c8/experiments/mars/mgcam_iter_75000.caffemodel -------------------------------------------------------------------------------- /experiments/mars/mgcam_siamese_iter_20000.caffemodel: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/developfeng/MGCAM/1b5957218ecaa7f13bf2107bc41b1349c05be9c8/experiments/mars/mgcam_siamese_iter_20000.caffemodel -------------------------------------------------------------------------------- /experiments/mars/mgcam_train.prototxt: -------------------------------------------------------------------------------- 1 | layer { 2 | name: "data" 3 | type: "Python" 4 | top: "data" 5 | top: "sim_iner" 6 | top: "sim_exter" 7 | top: "one_mask" 8 | top: "label" 9 | top: "label_plus" 10 | top: "gt" 11 | top: "mask" 12 | include { 13 | phase: TRAIN 14 | } 15 | python_param { 16 | module: "layers" 17 | layer: "MGCAM_DataLayer" 18 | param_str: "{'batch_size': 128,'im_path':'../../data/mars/bounding_box_train_fold','gt_path':'../../data/mars/bounding_box_train_seg_fold/','dataset':'mars'}" 19 | } 20 | } 21 | 22 | layer { 23 | name: "data" 24 | type: "Python" 25 | top: "data" 26 | top: "sim_iner" 27 | top: "sim_exter" 28 | top: "one_mask" 29 | top: "label" 30 | top: "label_plus" 31 | top: "gt" 32 | top: "mask" 33 | include { 34 | phase: TEST 35 | }# Note: the test set here is the same as the training set.
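# The TEST-phase copy of the data layer below only differs in batch_size
# (10 vs. 128); with test_iter: 10 and test_interval: 1000 in the solver,
# it simply monitors accuracy on training data during optimization.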
36 | python_param { 37 | module: "layers" 38 | layer: "MGCAM_DataLayer" 39 | param_str: "{'batch_size': 10,'im_path':'../../data/mars/bounding_box_train_fold','gt_path':'../../data/mars/bounding_box_train_seg_fold/','dataset':'mars'}" 40 | } 41 | } 42 | 43 | layer { 44 | name: "data_concate" 45 | type: "Concat" 46 | bottom: "data" 47 | bottom: "mask" 48 | top: "data_concate" 49 | } 50 | 51 | layer { 52 | name: "conv0_scale1_full" 53 | type: "Convolution" 54 | bottom: "data_concate" 55 | top: "conv0_scale1_full" 56 | param { 57 | name: "conv0_scale1_w_full" 58 | lr_mult: 1 59 | decay_mult: 1 60 | } 61 | param { 62 | name: "conv0_scale1_b_full" 63 | lr_mult: 2 64 | decay_mult: 0 65 | } 66 | convolution_param { 67 | num_output: 32 68 | pad: 2 69 | kernel_size: 5 70 | stride: 1 71 | weight_filler { 72 | type: "xavier" 73 | } 74 | bias_filler { 75 | type: "constant" 76 | } 77 | dilation: 1 78 | } 79 | } 80 | layer { 81 | name: "bn0_scale1_full" 82 | type: "BatchNorm" 83 | bottom: "conv0_scale1_full" 84 | top: "bn0_scale1_full" 85 | param { 86 | name: "bn0_scale1_mean_full" 87 | lr_mult: 0 88 | decay_mult: 0 89 | } 90 | param { 91 | name: "bn0_scale1_var_full" 92 | lr_mult: 0 93 | decay_mult: 0 94 | } 95 | param { 96 | name: "bn0_scale1_bias_full" 97 | lr_mult: 0 98 | decay_mult: 0 99 | } 100 | batch_norm_param { 101 | use_global_stats: false 102 | } 103 | } 104 | layer { 105 | name: "relu0_full" 106 | type: "ReLU" 107 | bottom: "bn0_scale1_full" 108 | top: "bn0_scale1_full" 109 | } 110 | layer { 111 | name: "pool0_full" 112 | type: "Pooling" 113 | bottom: "bn0_scale1_full" 114 | top: "pool0_full" 115 | pooling_param { 116 | pool: MAX 117 | kernel_size: 2 118 | stride: 2 119 | } 120 | } 121 | layer { 122 | name: "conv1_scale1_full" 123 | type: "Convolution" 124 | bottom: "pool0_full" 125 | top: "conv1_scale1_full" 126 | param { 127 | name: "conv1_scale1_w_full" 128 | lr_mult: 1 129 | decay_mult: 1 130 | } 131 | param { 132 | name: "conv1_scale1_b_full" 133 | lr_mult: 2 134 | decay_mult: 0 135 | } 136 | convolution_param { 137 | num_output: 32 138 | pad: 1 139 | kernel_size: 3 140 | stride: 1 141 | weight_filler { 142 | type: "xavier" 143 | } 144 | bias_filler { 145 | type: "constant" 146 | } 147 | dilation: 1 148 | } 149 | } 150 | layer { 151 | name: "conv1_scale2_full" 152 | type: "Convolution" 153 | bottom: "pool0_full" 154 | top: "conv1_scale2_full" 155 | param { 156 | name: "conv1_scale2_w_full" 157 | lr_mult: 1 158 | decay_mult: 1 159 | } 160 | param { 161 | name: "conv1_scale2_b_full" 162 | lr_mult: 2 163 | decay_mult: 0 164 | } 165 | convolution_param { 166 | num_output: 32 167 | pad: 2 168 | kernel_size: 3 169 | stride: 1 170 | weight_filler { 171 | type: "xavier" 172 | } 173 | bias_filler { 174 | type: "constant" 175 | } 176 | dilation: 2 177 | } 178 | } 179 | layer { 180 | name: "conv1_scale3_full" 181 | type: "Convolution" 182 | bottom: "pool0_full" 183 | top: "conv1_scale3_full" 184 | param { 185 | name: "conv1_scale3_w_full" 186 | lr_mult: 1 187 | decay_mult: 1 188 | } 189 | param { 190 | name: "conv1_scale3_b_full" 191 | lr_mult: 2 192 | decay_mult: 0 193 | } 194 | convolution_param { 195 | num_output: 32 196 | pad: 3 197 | kernel_size: 3 198 | stride: 1 199 | weight_filler { 200 | type: "xavier" 201 | } 202 | bias_filler { 203 | type: "constant" 204 | } 205 | dilation: 3 206 | } 207 | } 208 | layer { 209 | name: "bn1_scale1_full" 210 | type: "BatchNorm" 211 | bottom: "conv1_scale1_full" 212 | top: "bn1_scale1_full" 213 | param { 214 | name: "bn1_scale1_mean_full" 215 | lr_mult: 0 
216 | decay_mult: 0 217 | } 218 | param { 219 | name: "bn1_scale1_var_full" 220 | lr_mult: 0 221 | decay_mult: 0 222 | } 223 | param { 224 | name: "bn1_scale1_bias_full" 225 | lr_mult: 0 226 | decay_mult: 0 227 | } 228 | batch_norm_param { 229 | use_global_stats: false 230 | } 231 | } 232 | layer { 233 | name: "bn1_scale2_full" 234 | type: "BatchNorm" 235 | bottom: "conv1_scale2_full" 236 | top: "bn1_scale2_full" 237 | param { 238 | name: "bn1_scale2_mean_full" 239 | lr_mult: 0 240 | decay_mult: 0 241 | } 242 | param { 243 | name: "bn1_scale2_var_full" 244 | lr_mult: 0 245 | decay_mult: 0 246 | } 247 | param { 248 | name: "bn1_scale2_bias_full" 249 | lr_mult: 0 250 | decay_mult: 0 251 | } 252 | batch_norm_param { 253 | use_global_stats: false 254 | } 255 | } 256 | layer { 257 | name: "bn1_scale3_full" 258 | type: "BatchNorm" 259 | bottom: "conv1_scale3_full" 260 | top: "bn1_scale3_full" 261 | param { 262 | name: "bn1_scale3_mean_full" 263 | lr_mult: 0 264 | decay_mult: 0 265 | } 266 | param { 267 | name: "bn1_scale3_var_full" 268 | lr_mult: 0 269 | decay_mult: 0 270 | } 271 | param { 272 | name: "bn1_scale3_bias_full" 273 | lr_mult: 0 274 | decay_mult: 0 275 | } 276 | batch_norm_param { 277 | use_global_stats: false 278 | } 279 | } 280 | layer { 281 | name: "bn1_full" 282 | type: "Concat" 283 | bottom: "bn1_scale1_full" 284 | bottom: "bn1_scale2_full" 285 | bottom: "bn1_scale3_full" 286 | top: "bn1_full" 287 | concat_param { 288 | axis: 1 289 | } 290 | } 291 | layer { 292 | name: "relu1_full" 293 | type: "ReLU" 294 | bottom: "bn1_full" 295 | top: "bn1_full" 296 | } 297 | layer { 298 | name: "pool1_full" 299 | type: "Pooling" 300 | bottom: "bn1_full" 301 | top: "pool1_full" 302 | pooling_param { 303 | pool: MAX 304 | kernel_size: 2 305 | stride: 2 306 | } 307 | } 308 | layer { 309 | name: "conv2_scale1_full" 310 | type: "Convolution" 311 | bottom: "pool1_full" 312 | top: "conv2_scale1_full" 313 | param { 314 | name: "conv2_scale1_w_full" 315 | lr_mult: 1 316 | decay_mult: 1 317 | } 318 | param { 319 | name: "conv2_scale1_b_full" 320 | lr_mult: 2 321 | decay_mult: 0 322 | } 323 | convolution_param { 324 | num_output: 32 325 | pad: 1 326 | kernel_size: 3 327 | stride: 1 328 | weight_filler { 329 | type: "xavier" 330 | } 331 | bias_filler { 332 | type: "constant" 333 | } 334 | dilation: 1 335 | } 336 | } 337 | layer { 338 | name: "conv2_scale2_full" 339 | type: "Convolution" 340 | bottom: "pool1_full" 341 | top: "conv2_scale2_full" 342 | param { 343 | name: "conv2_scale2_w_full" 344 | lr_mult: 1 345 | decay_mult: 1 346 | } 347 | param { 348 | name: "conv2_scale2_b_full" 349 | lr_mult: 2 350 | decay_mult: 0 351 | } 352 | convolution_param { 353 | num_output: 32 354 | pad: 2 355 | kernel_size: 3 356 | stride: 1 357 | weight_filler { 358 | type: "xavier" 359 | } 360 | bias_filler { 361 | type: "constant" 362 | } 363 | dilation: 2 364 | } 365 | } 366 | layer { 367 | name: "conv2_scale3_full" 368 | type: "Convolution" 369 | bottom: "pool1_full" 370 | top: "conv2_scale3_full" 371 | param { 372 | name: "conv2_scale3_w_full" 373 | lr_mult: 1 374 | decay_mult: 1 375 | } 376 | param { 377 | name: "conv2_scale3_b_full" 378 | lr_mult: 2 379 | decay_mult: 0 380 | } 381 | convolution_param { 382 | num_output: 32 383 | pad: 3 384 | kernel_size: 3 385 | stride: 1 386 | weight_filler { 387 | type: "xavier" 388 | } 389 | bias_filler { 390 | type: "constant" 391 | } 392 | dilation: 3 393 | } 394 | } 395 | layer { 396 | name: "bn2_scale1_full" 397 | type: "BatchNorm" 398 | bottom: "conv2_scale1_full" 399 | top: 
"bn2_scale1_full" 400 | param { 401 | name: "bn2_scale1_mean_full" 402 | lr_mult: 0 403 | decay_mult: 0 404 | } 405 | param { 406 | name: "bn2_scale1_var_full" 407 | lr_mult: 0 408 | decay_mult: 0 409 | } 410 | param { 411 | name: "bn2_scale1_bias_full" 412 | lr_mult: 0 413 | decay_mult: 0 414 | } 415 | batch_norm_param { 416 | use_global_stats: false 417 | } 418 | } 419 | layer { 420 | name: "bn2_scale2_full" 421 | type: "BatchNorm" 422 | bottom: "conv2_scale2_full" 423 | top: "bn2_scale2_full" 424 | param { 425 | name: "bn2_scale2_mean_full" 426 | lr_mult: 0 427 | decay_mult: 0 428 | } 429 | param { 430 | name: "bn2_scale2_var_full" 431 | lr_mult: 0 432 | decay_mult: 0 433 | } 434 | param { 435 | name: "bn2_scale2_bias_full" 436 | lr_mult: 0 437 | decay_mult: 0 438 | } 439 | batch_norm_param { 440 | use_global_stats: false 441 | } 442 | } 443 | layer { 444 | name: "bn2_scale3_full" 445 | type: "BatchNorm" 446 | bottom: "conv2_scale3_full" 447 | top: "bn2_scale3_full" 448 | param { 449 | name: "bn2_scale3_mean_full" 450 | lr_mult: 0 451 | decay_mult: 0 452 | } 453 | param { 454 | name: "bn2_scale3_var_full" 455 | lr_mult: 0 456 | decay_mult: 0 457 | } 458 | param { 459 | name: "bn2_scale3_bias_full" 460 | lr_mult: 0 461 | decay_mult: 0 462 | } 463 | batch_norm_param { 464 | use_global_stats: false 465 | } 466 | } 467 | layer { 468 | name: "bn2_full" 469 | type: "Concat" 470 | bottom: "bn2_scale1_full" 471 | bottom: "bn2_scale2_full" 472 | bottom: "bn2_scale3_full" 473 | top: "bn2_full" 474 | concat_param { 475 | axis: 1 476 | } 477 | } 478 | layer { 479 | name: "relu2_full" 480 | type: "ReLU" 481 | bottom: "bn2_full" 482 | top: "bn2_full" 483 | } 484 | 485 | 486 | layer { 487 | name: "make_att_mask" 488 | type: "Convolution" 489 | bottom: "bn2_full" 490 | top: "make_att_mask" 491 | param { 492 | lr_mult: 1 493 | decay_mult: 1 494 | } 495 | param { 496 | lr_mult: 2 497 | decay_mult: 0 498 | } 499 | convolution_param { 500 | num_output: 1 501 | pad: 1 502 | kernel_size: 3 503 | stride: 1 504 | weight_filler { 505 | type: "xavier" 506 | } 507 | bias_filler { 508 | type: "constant" 509 | } 510 | dilation: 1 511 | } 512 | } 513 | 514 | layer { 515 | name: "att_sigmoid" 516 | type: "Sigmoid" 517 | bottom: "make_att_mask" 518 | top: "make_att_mask" 519 | } 520 | 521 | ############### Seg Loss ##################### 522 | layer { 523 | name: "loss_seg" 524 | type: "EuclideanLoss" 525 | bottom: "gt" 526 | bottom: "make_att_mask" 527 | top: "loss_seg" 528 | loss_weight: 0.1 529 | } 530 | 531 | layer { 532 | name: "make_att_mask_inv" 533 | type: "Eltwise" 534 | bottom: "one_mask" 535 | bottom: "make_att_mask" 536 | top: "make_att_mask_inv" 537 | eltwise_param { 538 | operation: SUM 539 | coeff: 1 540 | coeff: -1 541 | } 542 | } 543 | ############### Seg Loss ##################### 544 | 545 | layer { 546 | name: "tile_iner" 547 | type: "Tile" 548 | bottom: "make_att_mask" 549 | top: "att_iner" 550 | tile_param { 551 | tiles: 96 552 | axis: 1 553 | } 554 | } 555 | 556 | layer { 557 | name: "bn2_att_iner" 558 | type: "Eltwise" 559 | bottom: "bn2_full" 560 | bottom: "att_iner" 561 | top: "bn2_att_iner" 562 | eltwise_param { 563 | operation: PROD 564 | } 565 | } 566 | 567 | layer { 568 | name: "tile_exter" 569 | type: "Tile" 570 | bottom: "make_att_mask_inv" 571 | top: "att_exter" 572 | tile_param { 573 | tiles: 96 574 | axis: 1 575 | } 576 | } 577 | 578 | layer { 579 | name: "bn2_att_exter" 580 | type: "Eltwise" 581 | bottom: "bn2_full" 582 | bottom: "att_exter" 583 | top: "bn2_att_exter" 584 | 
eltwise_param { 585 | operation: PROD 586 | } 587 | } 588 | 589 | layer { 590 | name: "pool2_full" 591 | type: "Pooling" 592 | bottom: "bn2_full" 593 | top: "pool2_full" 594 | pooling_param { 595 | pool: MAX 596 | kernel_size: 2 597 | stride: 2 598 | } 599 | } 600 | 601 | layer { 602 | name: "conv3_scale1_full" 603 | type: "Convolution" 604 | bottom: "pool2_full" 605 | top: "conv3_scale1_full" 606 | param { 607 | name: "conv3_scale1_w_full" 608 | lr_mult: 1 609 | decay_mult: 1 610 | } 611 | param { 612 | name: "conv3_scale1_b_full" 613 | lr_mult: 2 614 | decay_mult: 0 615 | } 616 | convolution_param { 617 | num_output: 32 618 | pad: 1 619 | kernel_size: 3 620 | stride: 1 621 | weight_filler { 622 | type: "xavier" 623 | } 624 | bias_filler { 625 | type: "constant" 626 | } 627 | dilation: 1 628 | } 629 | } 630 | layer { 631 | name: "conv3_scale2_full" 632 | type: "Convolution" 633 | bottom: "pool2_full" 634 | top: "conv3_scale2_full" 635 | param { 636 | name: "conv3_scale2_w_full" 637 | lr_mult: 1 638 | decay_mult: 1 639 | } 640 | param { 641 | name: "conv3_scale2_b_full" 642 | lr_mult: 2 643 | decay_mult: 0 644 | } 645 | convolution_param { 646 | num_output: 32 647 | pad: 2 648 | kernel_size: 3 649 | stride: 1 650 | weight_filler { 651 | type: "xavier" 652 | } 653 | bias_filler { 654 | type: "constant" 655 | } 656 | dilation: 2 657 | } 658 | } 659 | layer { 660 | name: "conv3_scale3_full" 661 | type: "Convolution" 662 | bottom: "pool2_full" 663 | top: "conv3_scale3_full" 664 | param { 665 | name: "conv3_scale3_w_full" 666 | lr_mult: 1 667 | decay_mult: 1 668 | } 669 | param { 670 | name: "conv3_scale3_b_full" 671 | lr_mult: 2 672 | decay_mult: 0 673 | } 674 | convolution_param { 675 | num_output: 32 676 | pad: 3 677 | kernel_size: 3 678 | stride: 1 679 | weight_filler { 680 | type: "xavier" 681 | } 682 | bias_filler { 683 | type: "constant" 684 | } 685 | dilation: 3 686 | } 687 | } 688 | layer { 689 | name: "bn3_scale1_full" 690 | type: "BatchNorm" 691 | bottom: "conv3_scale1_full" 692 | top: "bn3_scale1_full" 693 | param { 694 | name: "bn3_scale1_mean_full" 695 | lr_mult: 0 696 | decay_mult: 0 697 | } 698 | param { 699 | name: "bn3_scale1_var_full" 700 | lr_mult: 0 701 | decay_mult: 0 702 | } 703 | param { 704 | name: "bn3_scale1_bias_full" 705 | lr_mult: 0 706 | decay_mult: 0 707 | } 708 | batch_norm_param { 709 | use_global_stats: false 710 | } 711 | } 712 | layer { 713 | name: "bn3_scale2_full" 714 | type: "BatchNorm" 715 | bottom: "conv3_scale2_full" 716 | top: "bn3_scale2_full" 717 | param { 718 | name: "bn3_scale2_mean_full" 719 | lr_mult: 0 720 | decay_mult: 0 721 | } 722 | param { 723 | name: "bn3_scale2_var_full" 724 | lr_mult: 0 725 | decay_mult: 0 726 | } 727 | param { 728 | name: "bn3_scale2_bias_full" 729 | lr_mult: 0 730 | decay_mult: 0 731 | } 732 | batch_norm_param { 733 | use_global_stats: false 734 | } 735 | } 736 | layer { 737 | name: "bn3_scale3_full" 738 | type: "BatchNorm" 739 | bottom: "conv3_scale3_full" 740 | top: "bn3_scale3_full" 741 | param { 742 | name: "bn3_scale3_mean_full" 743 | lr_mult: 0 744 | decay_mult: 0 745 | } 746 | param { 747 | name: "bn3_scale3_var_full" 748 | lr_mult: 0 749 | decay_mult: 0 750 | } 751 | param { 752 | name: "bn3_scale3_bias_full" 753 | lr_mult: 0 754 | decay_mult: 0 755 | } 756 | batch_norm_param { 757 | use_global_stats: false 758 | } 759 | } 760 | layer { 761 | name: "bn3_full" 762 | type: "Concat" 763 | bottom: "bn3_scale1_full" 764 | bottom: "bn3_scale2_full" 765 | bottom: "bn3_scale3_full" 766 | top: "bn3_full" 767 | 
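# Each block in this stream concatenates three parallel 3x3 convolutions with dilation 1, 2 and 3
# (and matching pad), so "bn3_full" is a 96-channel multi-scale feature map at unchanged resolution.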
concat_param { 768 | axis: 1 769 | } 770 | } 771 | layer { 772 | name: "relu3_full" 773 | type: "ReLU" 774 | bottom: "bn3_full" 775 | top: "bn3_full" 776 | } 777 | layer { 778 | name: "pool3_full" 779 | type: "Pooling" 780 | bottom: "bn3_full" 781 | top: "pool3_full" 782 | pooling_param { 783 | pool: MAX 784 | kernel_size: 2 785 | stride: 2 786 | } 787 | } 788 | layer { 789 | name: "conv4_scale1_full" 790 | type: "Convolution" 791 | bottom: "pool3_full" 792 | top: "conv4_scale1_full" 793 | param { 794 | name: "conv4_scale1_w_full" 795 | lr_mult: 1 796 | decay_mult: 1 797 | } 798 | param { 799 | name: "conv4_scale1_b_full" 800 | lr_mult: 2 801 | decay_mult: 0 802 | } 803 | convolution_param { 804 | num_output: 32 805 | pad: 1 806 | kernel_size: 3 807 | stride: 1 808 | weight_filler { 809 | type: "xavier" 810 | } 811 | bias_filler { 812 | type: "constant" 813 | } 814 | dilation: 1 815 | } 816 | } 817 | layer { 818 | name: "conv4_scale2_full" 819 | type: "Convolution" 820 | bottom: "pool3_full" 821 | top: "conv4_scale2_full" 822 | param { 823 | name: "conv4_scale2_w_full" 824 | lr_mult: 1 825 | decay_mult: 1 826 | } 827 | param { 828 | name: "conv4_scale2_b_full" 829 | lr_mult: 2 830 | decay_mult: 0 831 | } 832 | convolution_param { 833 | num_output: 32 834 | pad: 2 835 | kernel_size: 3 836 | stride: 1 837 | weight_filler { 838 | type: "xavier" 839 | } 840 | bias_filler { 841 | type: "constant" 842 | } 843 | dilation: 2 844 | } 845 | } 846 | layer { 847 | name: "conv4_scale3_full" 848 | type: "Convolution" 849 | bottom: "pool3_full" 850 | top: "conv4_scale3_full" 851 | param { 852 | name: "conv4_scale3_w_full" 853 | lr_mult: 1 854 | decay_mult: 1 855 | } 856 | param { 857 | name: "conv4_scale3_b_full" 858 | lr_mult: 2 859 | decay_mult: 0 860 | } 861 | convolution_param { 862 | num_output: 32 863 | pad: 3 864 | kernel_size: 3 865 | stride: 1 866 | weight_filler { 867 | type: "xavier" 868 | } 869 | bias_filler { 870 | type: "constant" 871 | } 872 | dilation: 3 873 | } 874 | } 875 | layer { 876 | name: "bn4_scale1_full" 877 | type: "BatchNorm" 878 | bottom: "conv4_scale1_full" 879 | top: "bn4_scale1_full" 880 | param { 881 | name: "bn4_scale1_mean_full" 882 | lr_mult: 0 883 | decay_mult: 0 884 | } 885 | param { 886 | name: "bn4_scale1_var_full" 887 | lr_mult: 0 888 | decay_mult: 0 889 | } 890 | param { 891 | name: "bn4_scale1_bias_full" 892 | lr_mult: 0 893 | decay_mult: 0 894 | } 895 | batch_norm_param { 896 | use_global_stats: false 897 | } 898 | } 899 | layer { 900 | name: "bn4_scale2_full" 901 | type: "BatchNorm" 902 | bottom: "conv4_scale2_full" 903 | top: "bn4_scale2_full" 904 | param { 905 | name: "bn4_scale2_mean_full" 906 | lr_mult: 0 907 | decay_mult: 0 908 | } 909 | param { 910 | name: "bn4_scale2_var_full" 911 | lr_mult: 0 912 | decay_mult: 0 913 | } 914 | param { 915 | name: "bn4_scale2_bias_full" 916 | lr_mult: 0 917 | decay_mult: 0 918 | } 919 | batch_norm_param { 920 | use_global_stats: false 921 | } 922 | } 923 | layer { 924 | name: "bn4_scale3_full" 925 | type: "BatchNorm" 926 | bottom: "conv4_scale3_full" 927 | top: "bn4_scale3_full" 928 | param { 929 | name: "bn4_scale3_mean_full" 930 | lr_mult: 0 931 | decay_mult: 0 932 | } 933 | param { 934 | name: "bn4_scale3_var_full" 935 | lr_mult: 0 936 | decay_mult: 0 937 | } 938 | param { 939 | name: "bn4_scale3_bias_full" 940 | lr_mult: 0 941 | decay_mult: 0 942 | } 943 | batch_norm_param { 944 | use_global_stats: false 945 | } 946 | } 947 | layer { 948 | name: "bn4_full" 949 | type: "Concat" 950 | bottom: "bn4_scale1_full" 951 | 
bottom: "bn4_scale2_full" 952 | bottom: "bn4_scale3_full" 953 | top: "bn4_full" 954 | concat_param { 955 | axis: 1 956 | } 957 | } 958 | layer { 959 | name: "relu4_full" 960 | type: "ReLU" 961 | bottom: "bn4_full" 962 | top: "bn4_full" 963 | } 964 | layer { 965 | name: "pool4_full" 966 | type: "Pooling" 967 | bottom: "bn4_full" 968 | top: "pool4_full" 969 | pooling_param { 970 | pool: MAX 971 | kernel_size: 2 972 | stride: 2 973 | } 974 | } 975 | layer { 976 | name: "fc1_full" 977 | type: "InnerProduct" 978 | bottom: "pool4_full" 979 | top: "fc1_full" 980 | param { 981 | name: "fc1_w_full" 982 | lr_mult: 1 983 | decay_mult: 1 984 | } 985 | param { 986 | name: "fc1_b_full" 987 | lr_mult: 2 988 | decay_mult: 0 989 | } 990 | inner_product_param { 991 | num_output: 128 992 | weight_filler { 993 | type: "xavier" 994 | } 995 | bias_filler { 996 | type: "constant" 997 | } 998 | } 999 | } 1000 | layer { 1001 | name: "fc1_full_drop" 1002 | type: "Dropout" 1003 | bottom: "fc1_full" 1004 | top: "fc1_full" 1005 | dropout_param { 1006 | dropout_ratio: 0.2 1007 | } 1008 | } 1009 | 1010 | layer { 1011 | name: "fc2_full" 1012 | type: "InnerProduct" 1013 | bottom: "fc1_full" 1014 | top: "fc2_full" 1015 | param { 1016 | lr_mult: 1 1017 | decay_mult: 1 1018 | } 1019 | param { 1020 | lr_mult: 2 1021 | decay_mult: 0 1022 | } 1023 | inner_product_param { 1024 | num_output: 625 1025 | weight_filler { 1026 | type: "xavier" 1027 | } 1028 | bias_filler { 1029 | type: "constant" 1030 | } 1031 | } 1032 | } 1033 | layer { 1034 | name: "loss_cls_full" 1035 | type: "SoftmaxWithLoss" 1036 | bottom: "fc2_full" 1037 | bottom: "label" 1038 | top: "loss_cls_full" 1039 | loss_weight: 1 1040 | } 1041 | layer { 1042 | name: "acc_cls_full" 1043 | type: "Accuracy" 1044 | bottom: "fc2_full" 1045 | bottom: "label" 1046 | top: "acc_cls_full" 1047 | } 1048 | 1049 | layer { 1050 | name: "pool2_iner" 1051 | type: "Pooling" 1052 | bottom: "bn2_att_iner" 1053 | top: "pool2_iner" 1054 | pooling_param { 1055 | pool: MAX 1056 | kernel_size: 2 1057 | stride: 2 1058 | } 1059 | } 1060 | 1061 | layer { 1062 | name: "conv3_scale1_iner" 1063 | type: "Convolution" 1064 | bottom: "pool2_iner" 1065 | top: "conv3_scale1_iner" 1066 | param { 1067 | name: "conv3_scale1_w_iner" 1068 | lr_mult: 1 1069 | decay_mult: 1 1070 | } 1071 | param { 1072 | name: "conv3_scale1_b_iner" 1073 | lr_mult: 2 1074 | decay_mult: 0 1075 | } 1076 | convolution_param { 1077 | num_output: 32 1078 | pad: 1 1079 | kernel_size: 3 1080 | stride: 1 1081 | weight_filler { 1082 | type: "xavier" 1083 | } 1084 | bias_filler { 1085 | type: "constant" 1086 | } 1087 | dilation: 1 1088 | } 1089 | } 1090 | layer { 1091 | name: "conv3_scale2_iner" 1092 | type: "Convolution" 1093 | bottom: "pool2_iner" 1094 | top: "conv3_scale2_iner" 1095 | param { 1096 | name: "conv3_scale2_w_iner" 1097 | lr_mult: 1 1098 | decay_mult: 1 1099 | } 1100 | param { 1101 | name: "conv3_scale2_b_iner" 1102 | lr_mult: 2 1103 | decay_mult: 0 1104 | } 1105 | convolution_param { 1106 | num_output: 32 1107 | pad: 2 1108 | kernel_size: 3 1109 | stride: 1 1110 | weight_filler { 1111 | type: "xavier" 1112 | } 1113 | bias_filler { 1114 | type: "constant" 1115 | } 1116 | dilation: 2 1117 | } 1118 | } 1119 | layer { 1120 | name: "conv3_scale3_iner" 1121 | type: "Convolution" 1122 | bottom: "pool2_iner" 1123 | top: "conv3_scale3_iner" 1124 | param { 1125 | name: "conv3_scale3_w_iner" 1126 | lr_mult: 1 1127 | decay_mult: 1 1128 | } 1129 | param { 1130 | name: "conv3_scale3_b_iner" 1131 | lr_mult: 2 1132 | decay_mult: 0 1133 | 
} 1134 | convolution_param { 1135 | num_output: 32 1136 | pad: 3 1137 | kernel_size: 3 1138 | stride: 1 1139 | weight_filler { 1140 | type: "xavier" 1141 | } 1142 | bias_filler { 1143 | type: "constant" 1144 | } 1145 | dilation: 3 1146 | } 1147 | } 1148 | layer { 1149 | name: "bn3_scale1_iner" 1150 | type: "BatchNorm" 1151 | bottom: "conv3_scale1_iner" 1152 | top: "bn3_scale1_iner" 1153 | param { 1154 | name: "bn3_scale1_mean_iner" 1155 | lr_mult: 0 1156 | decay_mult: 0 1157 | } 1158 | param { 1159 | name: "bn3_scale1_var_iner" 1160 | lr_mult: 0 1161 | decay_mult: 0 1162 | } 1163 | param { 1164 | name: "bn3_scale1_bias_iner" 1165 | lr_mult: 0 1166 | decay_mult: 0 1167 | } 1168 | batch_norm_param { 1169 | use_global_stats: false 1170 | } 1171 | } 1172 | layer { 1173 | name: "bn3_scale2_iner" 1174 | type: "BatchNorm" 1175 | bottom: "conv3_scale2_iner" 1176 | top: "bn3_scale2_iner" 1177 | param { 1178 | name: "bn3_scale2_mean_iner" 1179 | lr_mult: 0 1180 | decay_mult: 0 1181 | } 1182 | param { 1183 | name: "bn3_scale2_var_iner" 1184 | lr_mult: 0 1185 | decay_mult: 0 1186 | } 1187 | param { 1188 | name: "bn3_scale2_bias_iner" 1189 | lr_mult: 0 1190 | decay_mult: 0 1191 | } 1192 | batch_norm_param { 1193 | use_global_stats: false 1194 | } 1195 | } 1196 | layer { 1197 | name: "bn3_scale3_iner" 1198 | type: "BatchNorm" 1199 | bottom: "conv3_scale3_iner" 1200 | top: "bn3_scale3_iner" 1201 | param { 1202 | name: "bn3_scale3_mean_iner" 1203 | lr_mult: 0 1204 | decay_mult: 0 1205 | } 1206 | param { 1207 | name: "bn3_scale3_var_iner" 1208 | lr_mult: 0 1209 | decay_mult: 0 1210 | } 1211 | param { 1212 | name: "bn3_scale3_bias_iner" 1213 | lr_mult: 0 1214 | decay_mult: 0 1215 | } 1216 | batch_norm_param { 1217 | use_global_stats: false 1218 | } 1219 | } 1220 | layer { 1221 | name: "bn3_iner" 1222 | type: "Concat" 1223 | bottom: "bn3_scale1_iner" 1224 | bottom: "bn3_scale2_iner" 1225 | bottom: "bn3_scale3_iner" 1226 | top: "bn3_iner" 1227 | concat_param { 1228 | axis: 1 1229 | } 1230 | } 1231 | layer { 1232 | name: "relu3_iner" 1233 | type: "ReLU" 1234 | bottom: "bn3_iner" 1235 | top: "bn3_iner" 1236 | } 1237 | layer { 1238 | name: "pool3_iner" 1239 | type: "Pooling" 1240 | bottom: "bn3_iner" 1241 | top: "pool3_iner" 1242 | pooling_param { 1243 | pool: MAX 1244 | kernel_size: 2 1245 | stride: 2 1246 | } 1247 | } 1248 | layer { 1249 | name: "conv4_scale1_iner" 1250 | type: "Convolution" 1251 | bottom: "pool3_iner" 1252 | top: "conv4_scale1_iner" 1253 | param { 1254 | name: "conv4_scale1_w_iner" 1255 | lr_mult: 1 1256 | decay_mult: 1 1257 | } 1258 | param { 1259 | name: "conv4_scale1_b_iner" 1260 | lr_mult: 2 1261 | decay_mult: 0 1262 | } 1263 | convolution_param { 1264 | num_output: 32 1265 | pad: 1 1266 | kernel_size: 3 1267 | stride: 1 1268 | weight_filler { 1269 | type: "xavier" 1270 | } 1271 | bias_filler { 1272 | type: "constant" 1273 | } 1274 | dilation: 1 1275 | } 1276 | } 1277 | layer { 1278 | name: "conv4_scale2_iner" 1279 | type: "Convolution" 1280 | bottom: "pool3_iner" 1281 | top: "conv4_scale2_iner" 1282 | param { 1283 | name: "conv4_scale2_w_iner" 1284 | lr_mult: 1 1285 | decay_mult: 1 1286 | } 1287 | param { 1288 | name: "conv4_scale2_b_iner" 1289 | lr_mult: 2 1290 | decay_mult: 0 1291 | } 1292 | convolution_param { 1293 | num_output: 32 1294 | pad: 2 1295 | kernel_size: 3 1296 | stride: 1 1297 | weight_filler { 1298 | type: "xavier" 1299 | } 1300 | bias_filler { 1301 | type: "constant" 1302 | } 1303 | dilation: 2 1304 | } 1305 | } 1306 | layer { 1307 | name: "conv4_scale3_iner" 1308 | 
type: "Convolution" 1309 | bottom: "pool3_iner" 1310 | top: "conv4_scale3_iner" 1311 | param { 1312 | name: "conv4_scale3_w_iner" 1313 | lr_mult: 1 1314 | decay_mult: 1 1315 | } 1316 | param { 1317 | name: "conv4_scale3_b_iner" 1318 | lr_mult: 2 1319 | decay_mult: 0 1320 | } 1321 | convolution_param { 1322 | num_output: 32 1323 | pad: 3 1324 | kernel_size: 3 1325 | stride: 1 1326 | weight_filler { 1327 | type: "xavier" 1328 | } 1329 | bias_filler { 1330 | type: "constant" 1331 | } 1332 | dilation: 3 1333 | } 1334 | } 1335 | layer { 1336 | name: "bn4_scale1_iner" 1337 | type: "BatchNorm" 1338 | bottom: "conv4_scale1_iner" 1339 | top: "bn4_scale1_iner" 1340 | param { 1341 | name: "bn4_scale1_mean_iner" 1342 | lr_mult: 0 1343 | decay_mult: 0 1344 | } 1345 | param { 1346 | name: "bn4_scale1_var_iner" 1347 | lr_mult: 0 1348 | decay_mult: 0 1349 | } 1350 | param { 1351 | name: "bn4_scale1_bias_iner" 1352 | lr_mult: 0 1353 | decay_mult: 0 1354 | } 1355 | batch_norm_param { 1356 | use_global_stats: false 1357 | } 1358 | } 1359 | layer { 1360 | name: "bn4_scale2_iner" 1361 | type: "BatchNorm" 1362 | bottom: "conv4_scale2_iner" 1363 | top: "bn4_scale2_iner" 1364 | param { 1365 | name: "bn4_scale2_mean_iner" 1366 | lr_mult: 0 1367 | decay_mult: 0 1368 | } 1369 | param { 1370 | name: "bn4_scale2_var_iner" 1371 | lr_mult: 0 1372 | decay_mult: 0 1373 | } 1374 | param { 1375 | name: "bn4_scale2_bias_iner" 1376 | lr_mult: 0 1377 | decay_mult: 0 1378 | } 1379 | batch_norm_param { 1380 | use_global_stats: false 1381 | } 1382 | } 1383 | layer { 1384 | name: "bn4_scale3_iner" 1385 | type: "BatchNorm" 1386 | bottom: "conv4_scale3_iner" 1387 | top: "bn4_scale3_iner" 1388 | param { 1389 | name: "bn4_scale3_mean_iner" 1390 | lr_mult: 0 1391 | decay_mult: 0 1392 | } 1393 | param { 1394 | name: "bn4_scale3_var_iner" 1395 | lr_mult: 0 1396 | decay_mult: 0 1397 | } 1398 | param { 1399 | name: "bn4_scale3_bias_iner" 1400 | lr_mult: 0 1401 | decay_mult: 0 1402 | } 1403 | batch_norm_param { 1404 | use_global_stats: false 1405 | } 1406 | } 1407 | layer { 1408 | name: "bn4_iner" 1409 | type: "Concat" 1410 | bottom: "bn4_scale1_iner" 1411 | bottom: "bn4_scale2_iner" 1412 | bottom: "bn4_scale3_iner" 1413 | top: "bn4_iner" 1414 | concat_param { 1415 | axis: 1 1416 | } 1417 | } 1418 | layer { 1419 | name: "relu4_iner" 1420 | type: "ReLU" 1421 | bottom: "bn4_iner" 1422 | top: "bn4_iner" 1423 | } 1424 | layer { 1425 | name: "pool4_iner" 1426 | type: "Pooling" 1427 | bottom: "bn4_iner" 1428 | top: "pool4_iner" 1429 | pooling_param { 1430 | pool: MAX 1431 | kernel_size: 2 1432 | stride: 2 1433 | } 1434 | } 1435 | layer { 1436 | name: "fc1_iner" 1437 | type: "InnerProduct" 1438 | bottom: "pool4_iner" 1439 | top: "fc1_iner" 1440 | param { 1441 | name: "fc1_w_iner" 1442 | lr_mult: 1 1443 | decay_mult: 1 1444 | } 1445 | param { 1446 | name: "fc1_b_iner" 1447 | lr_mult: 2 1448 | decay_mult: 0 1449 | } 1450 | inner_product_param { 1451 | num_output: 128 1452 | weight_filler { 1453 | type: "xavier" 1454 | } 1455 | bias_filler { 1456 | type: "constant" 1457 | } 1458 | } 1459 | } 1460 | layer { 1461 | name: "fc1_iner_drop" 1462 | type: "Dropout" 1463 | bottom: "fc1_iner" 1464 | top: "fc1_iner" 1465 | dropout_param { 1466 | dropout_ratio: 0.2 1467 | } 1468 | } 1469 | 1470 | layer { 1471 | name: "fc2_iner" 1472 | type: "InnerProduct" 1473 | bottom: "fc1_iner" 1474 | top: "fc2_iner" 1475 | param { 1476 | lr_mult: 1 1477 | decay_mult: 1 1478 | } 1479 | param { 1480 | lr_mult: 2 1481 | decay_mult: 0 1482 | } 1483 | inner_product_param { 1484 | 
num_output: 625 1485 | weight_filler { 1486 | type: "xavier" 1487 | } 1488 | bias_filler { 1489 | type: "constant" 1490 | } 1491 | } 1492 | } 1493 | layer { 1494 | name: "loss_cls_iner" 1495 | type: "SoftmaxWithLoss" 1496 | bottom: "fc2_iner" 1497 | bottom: "label" 1498 | top: "loss_cls_iner" 1499 | loss_weight: 1 1500 | } 1501 | layer { 1502 | name: "acc_cls_iner" 1503 | type: "Accuracy" 1504 | bottom: "fc2_iner" 1505 | bottom: "label" 1506 | top: "acc_cls_iner" 1507 | } 1508 | 1509 | layer { 1510 | name: "pool2_exter" 1511 | type: "Pooling" 1512 | bottom: "bn2_att_exter" 1513 | top: "pool2_exter" 1514 | pooling_param { 1515 | pool: MAX 1516 | kernel_size: 2 1517 | stride: 2 1518 | } 1519 | } 1520 | 1521 | layer { 1522 | name: "conv3_scale1_exter" 1523 | type: "Convolution" 1524 | bottom: "pool2_exter" 1525 | top: "conv3_scale1_exter" 1526 | param { 1527 | name: "conv3_scale1_w_exter" 1528 | lr_mult: 1 1529 | decay_mult: 1 1530 | } 1531 | param { 1532 | name: "conv3_scale1_b_exter" 1533 | lr_mult: 2 1534 | decay_mult: 0 1535 | } 1536 | convolution_param { 1537 | num_output: 32 1538 | pad: 1 1539 | kernel_size: 3 1540 | stride: 1 1541 | weight_filler { 1542 | type: "xavier" 1543 | } 1544 | bias_filler { 1545 | type: "constant" 1546 | } 1547 | dilation: 1 1548 | } 1549 | } 1550 | layer { 1551 | name: "conv3_scale2_exter" 1552 | type: "Convolution" 1553 | bottom: "pool2_exter" 1554 | top: "conv3_scale2_exter" 1555 | param { 1556 | name: "conv3_scale2_w_exter" 1557 | lr_mult: 1 1558 | decay_mult: 1 1559 | } 1560 | param { 1561 | name: "conv3_scale2_b_exter" 1562 | lr_mult: 2 1563 | decay_mult: 0 1564 | } 1565 | convolution_param { 1566 | num_output: 32 1567 | pad: 2 1568 | kernel_size: 3 1569 | stride: 1 1570 | weight_filler { 1571 | type: "xavier" 1572 | } 1573 | bias_filler { 1574 | type: "constant" 1575 | } 1576 | dilation: 2 1577 | } 1578 | } 1579 | layer { 1580 | name: "conv3_scale3_exter" 1581 | type: "Convolution" 1582 | bottom: "pool2_exter" 1583 | top: "conv3_scale3_exter" 1584 | param { 1585 | name: "conv3_scale3_w_exter" 1586 | lr_mult: 1 1587 | decay_mult: 1 1588 | } 1589 | param { 1590 | name: "conv3_scale3_b_exter" 1591 | lr_mult: 2 1592 | decay_mult: 0 1593 | } 1594 | convolution_param { 1595 | num_output: 32 1596 | pad: 3 1597 | kernel_size: 3 1598 | stride: 1 1599 | weight_filler { 1600 | type: "xavier" 1601 | } 1602 | bias_filler { 1603 | type: "constant" 1604 | } 1605 | dilation: 3 1606 | } 1607 | } 1608 | layer { 1609 | name: "bn3_scale1_exter" 1610 | type: "BatchNorm" 1611 | bottom: "conv3_scale1_exter" 1612 | top: "bn3_scale1_exter" 1613 | param { 1614 | name: "bn3_scale1_mean_exter" 1615 | lr_mult: 0 1616 | decay_mult: 0 1617 | } 1618 | param { 1619 | name: "bn3_scale1_var_exter" 1620 | lr_mult: 0 1621 | decay_mult: 0 1622 | } 1623 | param { 1624 | name: "bn3_scale1_bias_exter" 1625 | lr_mult: 0 1626 | decay_mult: 0 1627 | } 1628 | batch_norm_param { 1629 | use_global_stats: false 1630 | } 1631 | } 1632 | layer { 1633 | name: "bn3_scale2_exter" 1634 | type: "BatchNorm" 1635 | bottom: "conv3_scale2_exter" 1636 | top: "bn3_scale2_exter" 1637 | param { 1638 | name: "bn3_scale2_mean_exter" 1639 | lr_mult: 0 1640 | decay_mult: 0 1641 | } 1642 | param { 1643 | name: "bn3_scale2_var_exter" 1644 | lr_mult: 0 1645 | decay_mult: 0 1646 | } 1647 | param { 1648 | name: "bn3_scale2_bias_exter" 1649 | lr_mult: 0 1650 | decay_mult: 0 1651 | } 1652 | batch_norm_param { 1653 | use_global_stats: false 1654 | } 1655 | } 1656 | layer { 1657 | name: "bn3_scale3_exter" 1658 | type: 
"BatchNorm" 1659 | bottom: "conv3_scale3_exter" 1660 | top: "bn3_scale3_exter" 1661 | param { 1662 | name: "bn3_scale3_mean_exter" 1663 | lr_mult: 0 1664 | decay_mult: 0 1665 | } 1666 | param { 1667 | name: "bn3_scale3_var_exter" 1668 | lr_mult: 0 1669 | decay_mult: 0 1670 | } 1671 | param { 1672 | name: "bn3_scale3_bias_exter" 1673 | lr_mult: 0 1674 | decay_mult: 0 1675 | } 1676 | batch_norm_param { 1677 | use_global_stats: false 1678 | } 1679 | } 1680 | layer { 1681 | name: "bn3_exter" 1682 | type: "Concat" 1683 | bottom: "bn3_scale1_exter" 1684 | bottom: "bn3_scale2_exter" 1685 | bottom: "bn3_scale3_exter" 1686 | top: "bn3_exter" 1687 | concat_param { 1688 | axis: 1 1689 | } 1690 | } 1691 | layer { 1692 | name: "relu3_exter" 1693 | type: "ReLU" 1694 | bottom: "bn3_exter" 1695 | top: "bn3_exter" 1696 | } 1697 | layer { 1698 | name: "pool3_exter" 1699 | type: "Pooling" 1700 | bottom: "bn3_exter" 1701 | top: "pool3_exter" 1702 | pooling_param { 1703 | pool: MAX 1704 | kernel_size: 2 1705 | stride: 2 1706 | } 1707 | } 1708 | layer { 1709 | name: "conv4_scale1_exter" 1710 | type: "Convolution" 1711 | bottom: "pool3_exter" 1712 | top: "conv4_scale1_exter" 1713 | param { 1714 | name: "conv4_scale1_w_exter" 1715 | lr_mult: 1 1716 | decay_mult: 1 1717 | } 1718 | param { 1719 | name: "conv4_scale1_b_exter" 1720 | lr_mult: 2 1721 | decay_mult: 0 1722 | } 1723 | convolution_param { 1724 | num_output: 32 1725 | pad: 1 1726 | kernel_size: 3 1727 | stride: 1 1728 | weight_filler { 1729 | type: "xavier" 1730 | } 1731 | bias_filler { 1732 | type: "constant" 1733 | } 1734 | dilation: 1 1735 | } 1736 | } 1737 | layer { 1738 | name: "conv4_scale2_exter" 1739 | type: "Convolution" 1740 | bottom: "pool3_exter" 1741 | top: "conv4_scale2_exter" 1742 | param { 1743 | name: "conv4_scale2_w_exter" 1744 | lr_mult: 1 1745 | decay_mult: 1 1746 | } 1747 | param { 1748 | name: "conv4_scale2_b_exter" 1749 | lr_mult: 2 1750 | decay_mult: 0 1751 | } 1752 | convolution_param { 1753 | num_output: 32 1754 | pad: 2 1755 | kernel_size: 3 1756 | stride: 1 1757 | weight_filler { 1758 | type: "xavier" 1759 | } 1760 | bias_filler { 1761 | type: "constant" 1762 | } 1763 | dilation: 2 1764 | } 1765 | } 1766 | layer { 1767 | name: "conv4_scale3_exter" 1768 | type: "Convolution" 1769 | bottom: "pool3_exter" 1770 | top: "conv4_scale3_exter" 1771 | param { 1772 | name: "conv4_scale3_w_exter" 1773 | lr_mult: 1 1774 | decay_mult: 1 1775 | } 1776 | param { 1777 | name: "conv4_scale3_b_exter" 1778 | lr_mult: 2 1779 | decay_mult: 0 1780 | } 1781 | convolution_param { 1782 | num_output: 32 1783 | pad: 3 1784 | kernel_size: 3 1785 | stride: 1 1786 | weight_filler { 1787 | type: "xavier" 1788 | } 1789 | bias_filler { 1790 | type: "constant" 1791 | } 1792 | dilation: 3 1793 | } 1794 | } 1795 | layer { 1796 | name: "bn4_scale1_exter" 1797 | type: "BatchNorm" 1798 | bottom: "conv4_scale1_exter" 1799 | top: "bn4_scale1_exter" 1800 | param { 1801 | name: "bn4_scale1_mean_exter" 1802 | lr_mult: 0 1803 | decay_mult: 0 1804 | } 1805 | param { 1806 | name: "bn4_scale1_var_exter" 1807 | lr_mult: 0 1808 | decay_mult: 0 1809 | } 1810 | param { 1811 | name: "bn4_scale1_bias_exter" 1812 | lr_mult: 0 1813 | decay_mult: 0 1814 | } 1815 | batch_norm_param { 1816 | use_global_stats: false 1817 | } 1818 | } 1819 | layer { 1820 | name: "bn4_scale2_exter" 1821 | type: "BatchNorm" 1822 | bottom: "conv4_scale2_exter" 1823 | top: "bn4_scale2_exter" 1824 | param { 1825 | name: "bn4_scale2_mean_exter" 1826 | lr_mult: 0 1827 | decay_mult: 0 1828 | } 1829 | param { 1830 
| name: "bn4_scale2_var_exter" 1831 | lr_mult: 0 1832 | decay_mult: 0 1833 | } 1834 | param { 1835 | name: "bn4_scale2_bias_exter" 1836 | lr_mult: 0 1837 | decay_mult: 0 1838 | } 1839 | batch_norm_param { 1840 | use_global_stats: false 1841 | } 1842 | } 1843 | layer { 1844 | name: "bn4_scale3_exter" 1845 | type: "BatchNorm" 1846 | bottom: "conv4_scale3_exter" 1847 | top: "bn4_scale3_exter" 1848 | param { 1849 | name: "bn4_scale3_mean_exter" 1850 | lr_mult: 0 1851 | decay_mult: 0 1852 | } 1853 | param { 1854 | name: "bn4_scale3_var_exter" 1855 | lr_mult: 0 1856 | decay_mult: 0 1857 | } 1858 | param { 1859 | name: "bn4_scale3_bias_exter" 1860 | lr_mult: 0 1861 | decay_mult: 0 1862 | } 1863 | batch_norm_param { 1864 | use_global_stats: false 1865 | } 1866 | } 1867 | layer { 1868 | name: "bn4_exter" 1869 | type: "Concat" 1870 | bottom: "bn4_scale1_exter" 1871 | bottom: "bn4_scale2_exter" 1872 | bottom: "bn4_scale3_exter" 1873 | top: "bn4_exter" 1874 | concat_param { 1875 | axis: 1 1876 | } 1877 | } 1878 | layer { 1879 | name: "relu4_exter" 1880 | type: "ReLU" 1881 | bottom: "bn4_exter" 1882 | top: "bn4_exter" 1883 | } 1884 | layer { 1885 | name: "pool4_exter" 1886 | type: "Pooling" 1887 | bottom: "bn4_exter" 1888 | top: "pool4_exter" 1889 | pooling_param { 1890 | pool: MAX 1891 | kernel_size: 2 1892 | stride: 2 1893 | } 1894 | } 1895 | layer { 1896 | name: "fc1_exter" 1897 | type: "InnerProduct" 1898 | bottom: "pool4_exter" 1899 | top: "fc1_exter" 1900 | param { 1901 | name: "fc1_w_exter" 1902 | lr_mult: 1 1903 | decay_mult: 1 1904 | } 1905 | param { 1906 | name: "fc1_b_exter" 1907 | lr_mult: 2 1908 | decay_mult: 0 1909 | } 1910 | inner_product_param { 1911 | num_output: 128 1912 | weight_filler { 1913 | type: "xavier" 1914 | } 1915 | bias_filler { 1916 | type: "constant" 1917 | } 1918 | } 1919 | } 1920 | layer { 1921 | name: "fc1_exter_drop" 1922 | type: "Dropout" 1923 | bottom: "fc1_exter" 1924 | top: "fc1_exter" 1925 | dropout_param { 1926 | dropout_ratio: 0.2 1927 | } 1928 | } 1929 | 1930 | layer { 1931 | name: "fc2_exter" 1932 | type: "InnerProduct" 1933 | bottom: "fc1_exter" 1934 | top: "fc2_exter" 1935 | param { 1936 | lr_mult: 1 1937 | decay_mult: 1 1938 | } 1939 | param { 1940 | lr_mult: 2 1941 | decay_mult: 0 1942 | } 1943 | inner_product_param { 1944 | num_output: 625 # For mars dataset training set. 1945 | weight_filler { 1946 | type: "xavier" 1947 | } 1948 | bias_filler { 1949 | type: "constant" 1950 | } 1951 | } 1952 | } 1953 | layer { 1954 | name: "loss_cls_exter" 1955 | type: "SoftmaxWithLoss" 1956 | bottom: "fc2_exter" 1957 | bottom: "label_plus" 1958 | top: "loss_cls_exter" 1959 | loss_weight: 1 1960 | } 1961 | layer { 1962 | name: "acc_cls_exter" 1963 | type: "Accuracy" 1964 | bottom: "fc2_exter" 1965 | bottom: "label_plus" 1966 | top: "acc_cls_exter" 1967 | } 1968 | 1969 | layer { 1970 | name: "loss_pull" 1971 | type: "ContrastiveLoss" 1972 | bottom: "fc1_iner" 1973 | bottom: "fc1_full" 1974 | bottom: "sim_iner" 1975 | top: "loss_pull" 1976 | loss_weight: 0.01 1977 | contrastive_loss_param { 1978 | margin: 0.1 # we set margin to 0 or small value for body regions. 1979 | } 1980 | } 1981 | 1982 | layer { 1983 | name: "loss_push" 1984 | type: "ContrastiveLoss" 1985 | bottom: "fc1_exter" 1986 | bottom: "fc1_full" 1987 | bottom: "sim_exter" 1988 | top: "loss_push" 1989 | loss_weight: 0.01 1990 | contrastive_loss_param { 1991 | margin: 100 # we set margin to 100 or 10 for background regions. 
1992 | } 1993 | } -------------------------------------------------------------------------------- /experiments/mars/run_mgcam.sh: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env sh 2 | LOG=./mgcam-`date +%Y-%m-%d-%H-%M-%S`.log 3 | CAFFE=/path-to-caffe/build/tools/caffe 4 | 5 | $CAFFE train --solver=./solver_mgcam.prototxt --gpu=0 2>&1 | tee $LOG 6 | -------------------------------------------------------------------------------- /experiments/mars/run_mgcam_siamese.sh: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env sh 2 | LOG=./mgcam-siamese-`date +%Y-%m-%d-%H-%M-%S`.log 3 | CAFFE=/path-to-caffe/build/tools/caffe 4 | 5 | $CAFFE train --solver=./solver_mgcam_siamese.prototxt --weights=./mgcam_iter_75000.caffemodel --gpu=0 2>&1 | tee $LOG 6 | -------------------------------------------------------------------------------- /experiments/mars/solver_mgcam.prototxt: -------------------------------------------------------------------------------- 1 | net: "mgcam_train.prototxt" 2 | 3 | test_iter: 10 4 | test_interval: 1000 5 | base_lr: 0.01 6 | lr_policy: "step" 7 | gamma: 0.1 8 | stepsize: 15000 9 | display: 10 10 | max_iter: 75000 11 | momentum: 0.9 12 | weight_decay: 0.005 13 | snapshot: 5000 14 | snapshot_prefix: "mgcam" 15 | solver_mode: GPU 16 | -------------------------------------------------------------------------------- /experiments/mars/solver_mgcam_siamese.prototxt: -------------------------------------------------------------------------------- 1 | net: "mgcam_siamese_train.prototxt" 2 | 3 | test_iter: 10 4 | test_interval: 1000 5 | base_lr: 0.0001 6 | lr_policy: "step" 7 | gamma: 0.1 8 | stepsize: 10000 9 | display: 10 10 | max_iter: 20000 11 | momentum: 0.9 12 | weight_decay: 0.005 13 | snapshot: 5000 14 | snapshot_prefix: "mgcam_siamese" 15 | solver_mode: GPU 16 | --------------------------------------------------------------------------------
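A note on the 'python_param' block at the head of 'mgcam_train.prototxt': Caffe passes the 'param_str' dictionary to the 'MGCAM_DataLayer' class in 'layers.py', which must implement the standard pycaffe layer interface. The sketch below only illustrates that interface and is not the repository's implementation: the class name, the 160x64 input size, and the random blobs are placeholders, and the real layer additionally produces the 'gt', 'one_mask', 'label_plus' and 'sim_*' tops consumed by the loss layers above.

import ast

import caffe
import numpy as np


class ToyDataLayer(caffe.Layer):
    """Minimal pycaffe data layer; illustrates the interface only."""

    def setup(self, bottom, top):
        # param_str carries the dict written in the prototxt, e.g.
        # "{'batch_size': 10, 'im_path': ..., 'gt_path': ..., 'dataset': 'mars'}"
        params = ast.literal_eval(self.param_str)
        self.batch_size = params['batch_size']
        self.im_path = params['im_path']
        self.gt_path = params['gt_path']

    def reshape(self, bottom, top):
        # An RGB crop, its binary body mask, and the identity label.
        top[0].reshape(self.batch_size, 3, 160, 64)
        top[1].reshape(self.batch_size, 1, 160, 64)
        top[2].reshape(self.batch_size)

    def forward(self, bottom, top):
        # A real layer would read and preprocess images from im_path/gt_path;
        # random blobs keep this sketch self-contained.
        top[0].data[...] = np.random.rand(self.batch_size, 3, 160, 64)
        top[1].data[...] = np.random.rand(self.batch_size, 1, 160, 64) > 0.5
        top[2].data[...] = np.random.randint(0, 625, self.batch_size)

    def backward(self, top, propagate_down, bottom):
        pass  # data layers produce no gradients

With 'layers.py' on the PYTHONPATH, the nets can also be trained from Python instead of the shell scripts, e.g. via caffe.SGDSolver('solver_mgcam.prototxt') followed by solver.solve().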