├── .gitignore ├── LICENSE ├── README.md ├── data ├── VOC2012 │ └── README.md ├── get_data.sh └── imagesets │ └── README.md ├── inference ├── cache_FCN8s_results.m ├── data │ └── README.md ├── ext │ └── README.md ├── generate_EDeconvNet_CRF_results.m ├── run_demo.m ├── setup.sh ├── startup.m ├── util │ ├── init_VOC2012_TEST.m │ ├── preprocess_image.m │ └── preprocess_image_bb.m └── voc_gt_cmap.mat ├── model ├── DeconvNet │ ├── DeconvNet_inference_deploy.prototxt │ └── README.md ├── FCN │ ├── README.md │ └── fcn-8s-pascal-deploy.prototxt ├── VGG_conv │ └── README.md ├── get_model.sh ├── stage_1_train_result │ └── README.md └── stage_2_train_result │ └── README.md ├── setup.sh └── training ├── 001_stage_1 ├── solver.prototxt ├── stage_1_train.prototxt └── start_train.sh ├── 001_start_train.sh ├── 002_stage_2 ├── solver.prototxt ├── stage_2_train.prototxt └── start_train.sh ├── 002_start_train.sh ├── 003_make_bn_layer_testable ├── BN_make_INFERENCE_script.py ├── DeconvNet_inference_deploy.prototxt ├── DeconvNet_inference_test.prototxt ├── DeconvNet_make_bn_layer_testable.prototxt └── start_test.sh └── 003_start_make_bn_layer_testable.sh /.gitignore: -------------------------------------------------------------------------------- 1 | caffe 2 | 3 | # data for training 4 | data/VOC2012/* 5 | *.png 6 | *.jpg 7 | data/imagesets/* 8 | *.txt 9 | 10 | # files created in training directory 11 | *.caffemodel 12 | *.caffemodel* 13 | training_log 14 | testing_log 15 | snapshot 16 | training/001_stage_1/* 17 | training/002_stage_2/* 18 | training/003_make_bn_layer_testable/* 19 | 20 | # temporarily created files 21 | *~ 22 | *.swp 23 | *.tar.gz 24 | *.tar 25 | *.zip 26 | 27 | # inference files 28 | inference/DeconvNet 29 | inference/FCN 30 | inference/data/* 31 | inference/ext/* 32 | inference/results 33 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | 
Copyright Pohang University of Science and Technology. All rights reserved. 2 | 3 | Contact persons: 4 | Hyeonowo Noh (hyeonwoonoh_ postech.ac.kr) 5 | 6 | This software is being made available for individual research use only. 7 | Any commercial use or redistribution of this software requires a license from 8 | the Pohang University of Science and Technology. 9 | 10 | You may use this work subject to the following conditions: 11 | 12 | 1. This work is provided "as is" by the copyright holder, with 13 | absolutely no warranties of correctness, fitness, intellectual property 14 | ownership, or anything else whatsoever. You use the work 15 | entirely at your own risk. The copyright holder will not be liable for 16 | any legal damages whatsoever connected with the use of this work. 17 | 18 | 2. The copyright holder retain all copyright to the work. All copies of 19 | the work and all works derived from it must contain (1) this copyright 20 | notice, and (2) additional notices describing the content, dates and 21 | copyright holder of modifications or additions made to the work, if 22 | any, including distribution and use conditions and intellectual property 23 | claims. Derived works must be clearly distinguished from the original 24 | work, both by name and by the prominent inclusion of explicit 25 | descriptions of overlaps and differences. 26 | 27 | 3. The names and trademarks of the copyright holder may not be used in 28 | advertising or publicity related to this work without specific prior 29 | written permission. 30 | 31 | 4. In return for the free use of this work, you are requested, but not 32 | legally required, to do the following: 33 | 34 | * If you become aware of factors that may significantly affect other 35 | users of the work, for example major bugs or 36 | deficiencies or possible intellectual property issues, you are 37 | requested to report them to the copyright holder, if possible 38 | including redistributable fixes or workarounds. 
39 | 40 | * If you use the work in scientific research or as part of a larger 41 | software system, you are requested to cite the use in any related 42 | publications or technical documentation. The work is based upon: 43 | 44 | Hyeonwoo Noh, Seunghoon Hong, Bohyung Han. 45 | Learning Deconvolution Network for Semantic Segmentation 46 | arXiv:1505.04366, 2015. 47 | 48 | @article{noh2015learning, 49 | title={Learning Deconvolution Network for Semantic Segmentation}, 50 | author={Noh, Hyeonwoo and Hong, Seunghoon and Han, Bohyung}, 51 | journal={arXiv preprint arXiv:1505.04366}, 52 | year={2015} 53 | } 54 | 55 | This copyright notice must be retained with all copies of the software, 56 | including any modified or derived versions. 57 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | ## DeconvNet: Learning Deconvolution Network for Semantic Segmentation 2 | 3 | Created by [Hyeonwoo Noh](http://cvlab.postech.ac.kr/~hyeonwoonoh/), [Seunghoon Hong](http://cvlab.postech.ac.kr/~maga33/) and [Bohyung Han](http://cvlab.postech.ac.kr/~bhhan/) at POSTECH 4 | 5 | Acknowledgements: Thanks to Yangqing Jia and the BVLC team for creating Caffe. 6 | 7 | ### Introduction 8 | 9 | DeconvNet is a state-of-the-art semantic segmentation system that combines bottom-up region proposals with a multi-layer deconvolution network. 10 | 11 | A detailed description of the system is provided in our technical report [arXiv tech report] http://arxiv.org/abs/1505.04366 12 | 13 | ### Citation 14 | 15 | If you're using this code in a publication, please cite our paper. 
16 | 17 | @article{noh2015learning, 18 | title={Learning Deconvolution Network for Semantic Segmentation}, 19 | author={Noh, Hyeonwoo and Hong, Seunghoon and Han, Bohyung}, 20 | journal={arXiv preprint arXiv:1505.04366}, 21 | year={2015} 22 | } 23 | 24 | ### Pre-trained Model 25 | 26 | If you need only the model definition and pre-trained model, you can download them from the following locations: 27 | 0. caffe for DeconvNet: https://github.com/HyeonwooNoh/caffe 28 | 0. DeconvNet model definition: http://cvlab.postech.ac.kr/research/deconvnet/model/DeconvNet/DeconvNet_inference_deploy.prototxt 29 | 0. Pre-trained DeconvNet weights: http://cvlab.postech.ac.kr/research/deconvnet/model/DeconvNet/DeconvNet_trainval_inference.caffemodel 30 | 31 | ### License 32 | 33 | This software is being made available for research purposes only. 34 | Check the LICENSE file for details. 35 | 36 | ### System Requirements 37 | 38 | This software was tested on Ubuntu 14.04 LTS (64-bit). 39 | 40 | **Prerequisites** 41 | 0. MATLAB (tested with 2014b on 64-bit Linux) 42 | 0. prerequisites for caffe (http://caffe.berkeleyvision.org/installation.html#prequequisites) 43 | 44 | ### Installing DeconvNet 45 | 46 | **Running "setup.sh" downloads all the files necessary for training and inference, including:** 47 | 0. caffe: a modified version of caffe that supports DeconvNet - https://github.com/HyeonwooNoh/caffe.git 48 | 0. data: data used for training stage 1 and 2 49 | 0. model: caffemodel of trained DeconvNet and other caffemodels required for training 50 | 51 | ### Training DeconvNet 52 | 53 | Training scripts are included in the *./training/* directory 54 | 55 | **To train DeconvNet, run the following scripts in order:** 56 | 0. 001\_start\_train.sh : script for first stage training 57 | 0. 002\_start\_train.sh : script for second stage training 58 | 0. 
003\_start\_make\_bn\_layer\_testable.sh : script converting the trained DeconvNet with BN layers to inference mode 59 | 60 | ### Inference EDeconvNet+CRF 61 | 62 | Run *run_demo.m* to reproduce EDeconvNet+CRF results on VOC2012 test data. 63 | 64 | **This script generates EDeconvNet+CRF results through the following steps:** 65 | 0. run FCN-8s and cache the scores [cache\_FCN8s\_results.m] 66 | 0. generate DeconvNet scores, ensemble them with the FCN-8s scores, and post-process with densecrf [generate\_EDeconvNet\_CRF\_results.m] 67 | 68 | *EDeconvNet+CRF obtains 72.5 mean I/U on PASCAL VOC 2012 Test* 69 | 70 | **External dependencies [can be downloaded by running the "setup.sh" script]** 71 | 0. FCN-8s model and weight file [https://github.com/BVLC/caffe/wiki/Model-Zoo] 72 | 0. densecrf with matlab wrapper [https://github.com/johannesu/meanfield-matlab.git] 73 | 0. cached proposal bounding boxes extracted with edgebox object proposal [https://github.com/pdollar/edges] 74 | 75 | 76 | 77 | 78 | 79 | 80 | 81 | -------------------------------------------------------------------------------- /data/VOC2012/README.md: -------------------------------------------------------------------------------- 1 | **This directory should contain data used for training stage 1 and 2** 2 | -------------------------------------------------------------------------------- /data/get_data.sh: -------------------------------------------------------------------------------- 1 | # download and extract data necessary for training 2 | cd VOC2012 3 | 4 | # download and extract stage 1 training data 5 | wget http://cvlab.postech.ac.kr/research/deconvnet/data/VOC_OBJECT.tar.gz 6 | tar -zxvf VOC_OBJECT.tar.gz 7 | rm -rf VOC_OBJECT.tar.gz 8 | 9 | # download and extract stage 2 training data 10 | wget http://cvlab.postech.ac.kr/research/deconvnet/data/VOC2012_SEG_AUG.tar.gz 11 | tar -zxvf VOC2012_SEG_AUG.tar.gz 12 | rm -rf VOC2012_SEG_AUG.tar.gz 13 | 14 | cd .. 
15 | 16 | # download and extract imagesets necessary for training 17 | cd imagesets 18 | 19 | # download and extract stage 1 image sets 20 | wget http://cvlab.postech.ac.kr/research/deconvnet/data/stage_1_train_imgset.tar.gz 21 | tar -zxvf stage_1_train_imgset.tar.gz 22 | rm -rf stage_1_train_imgset.tar.gz 23 | 24 | # download and extract stage 2 image sets 25 | wget http://cvlab.postech.ac.kr/research/deconvnet/data/stage_2_train_imgset.tar.gz 26 | tar -zxvf stage_2_train_imgset.tar.gz 27 | rm -rf stage_2_train_imgset.tar.gz 28 | 29 | cd .. 30 | 31 | 32 | -------------------------------------------------------------------------------- /data/imagesets/README.md: -------------------------------------------------------------------------------- 1 | **This directory contains image sets used for training stage 1 and 2** 2 | -------------------------------------------------------------------------------- /inference/cache_FCN8s_results.m: -------------------------------------------------------------------------------- 1 | 2 | function cache_FCN8s_results(config) 3 | 4 | fprintf('start caching FCN-8s results and scores\n'); 5 | 6 | %% initialization 7 | load(config.cmap); 8 | init_VOC2012_TEST; 9 | 10 | %% initialize caffe 11 | addpath(fullfile(config.Path.CNN.caffe_root, 'matlab/caffe')); 12 | fprintf('initializing caffe..\n'); 13 | if caffe('is_initialized') 14 | caffe('reset') 15 | end 16 | caffe('init', config.Path.CNN.model_proto, config.Path.CNN.model_data) 17 | caffe('set_device', config.gpuNum); 18 | caffe('set_mode_gpu'); 19 | caffe('set_phase_test'); 20 | fprintf('done\n'); 21 | 22 | %% initialize paths 23 | save_res_dir = [config.save_root, '/FCN8s/results']; 24 | save_res_path = [save_res_dir, '/%s.png']; 25 | save_score_dir = [config.save_root, '/FCN8s/scores']; 26 | save_score_path = [save_score_dir, '/%s.mat']; 27 | 28 | %% create directory 29 | if config.write_file 30 | if ~exist(save_res_dir), mkdir(save_res_dir), end 31 | if ~exist(save_score_dir), 
mkdir(save_score_dir), end 32 | end 33 | 34 | % load image set 35 | ids=textread(sprintf(VOCopts.seg.imgsetpath, config.imageset), '%s'); 36 | 37 | for i=1:length(ids) 38 | fprintf('progress: %d/%d [%s]...', i, length(ids), ids{i}); 39 | tic; 40 | 41 | % read image 42 | I=imread(sprintf(VOCopts.imgpath,ids{i})); 43 | 44 | input_data = preprocess_image(I, config.im_sz); 45 | cnn_output = caffe('forward', input_data); 46 | 47 | result = cnn_output{1}; 48 | 49 | tr_result = permute(result,[2,1,3]); 50 | score = tr_result(1:size(I,1),1:size(I,2),:); 51 | 52 | [~, result_seg] = max(score,[], 3); 53 | result_seg = uint8(result_seg-1); 54 | 55 | if config.write_file 56 | imwrite(result_seg, cmap, sprintf(save_res_path, ids{i})); 57 | save(sprintf(save_score_path, ids{i}), 'score'); 58 | else 59 | subplot(1,2,1); 60 | imshow(I); 61 | subplot(1,2,2); 62 | result_seg_im = reshape(cmap(int32(result_seg)+1,:),[size(result_seg,1),size(result_seg,2),3]); 63 | imshow(result_seg_im); 64 | waitforbuttonpress; 65 | end 66 | fprintf(' done [%f]\n', toc); 67 | end 68 | 69 | %% function end 70 | end 71 | 72 | 73 | 74 | 75 | 76 | 77 | 78 | -------------------------------------------------------------------------------- /inference/data/README.md: -------------------------------------------------------------------------------- 1 | **This directory should contain data necessary for EDeconvNet inference (VOC2012 test data, cached edge box object proposals)** 2 | -------------------------------------------------------------------------------- /inference/ext/README.md: -------------------------------------------------------------------------------- 1 | **This directory should contain external dependencies** 2 | -------------------------------------------------------------------------------- /inference/generate_EDeconvNet_CRF_results.m: -------------------------------------------------------------------------------- 1 | function generate_EDeconvNet_CRF_results(config) 2 | 3 | fprintf('start 
generating EDeconvNet+CRF results\n'); 4 | 5 | %% initialization 6 | load(config.cmap); 7 | init_VOC2012_TEST; 8 | 9 | %% initialize caffe 10 | addpath(fullfile(config.Path.CNN.caffe_root, 'matlab/caffe')); 11 | fprintf('initializing caffe..\n'); 12 | if caffe('is_initialized') 13 | caffe('reset') 14 | end 15 | caffe('init', config.Path.CNN.model_proto, config.Path.CNN.model_data) 16 | caffe('set_device', config.gpuNum); 17 | caffe('set_mode_gpu'); 18 | caffe('set_phase_test'); 19 | fprintf('done\n'); 20 | 21 | 22 | %% initialize paths 23 | save_res_dir = [config.save_root, '/EDeconvNet_CRF']; 24 | save_res_path = [save_res_dir, '/%s.png']; 25 | edgebox_cache_path = [config.edgebox_cache_dir '/%s.mat']; 26 | 27 | fcn_score_dir = [config.fcn_score_dir '/scores']; 28 | fcn_score_path = [fcn_score_dir '/%s.mat']; 29 | 30 | %% create directory 31 | if config.write_file 32 | if ~exist(save_res_dir), mkdir(save_res_dir), end 33 | end 34 | 35 | 36 | fprintf('start generating result\n'); 37 | fprintf('caffe model: %s\n', config.Path.CNN.model_proto); 38 | fprintf('caffe weight: %s\n', config.Path.CNN.model_data); 39 | 40 | 41 | %% read VOC2012 TEST image set 42 | ids=textread(sprintf(VOCopts.seg.imgsetpath, config.imageset), '%s'); 43 | 44 | for i=1:length(ids) 45 | fprintf('progress: %d/%d [%s]...', i, length(ids), ids{i}); 46 | tic; 47 | 48 | % read image 49 | I=imread(sprintf(VOCopts.imgpath,ids{i})); 50 | 51 | result_base = uint8(zeros(size(I,1), size(I,2))); 52 | prob_base = zeros(size(I,1),size(I,2),21); 53 | cnt_base = uint8(zeros(size(I,1),size(I,2))); 54 | norm_prob_base = zeros(size(I,1),size(I,2),21); 55 | 56 | % padding for easy cropping 57 | [img_height, img_width, ~] = size(I); 58 | pad_offset_col = img_height; 59 | pad_offset_row = img_width; 60 | 61 | % pad every images(I, cls_seg, inst_seg...) 
to make cropping easy 62 | padded_I = padarray(I,[pad_offset_row, pad_offset_col]); 63 | padded_result_base = padarray(result_base,[pad_offset_row, pad_offset_col]); 64 | padded_prob_base = padarray(prob_base,[pad_offset_row, pad_offset_col]); 65 | padded_cnt_base = padarray(cnt_base, [pad_offset_row, pad_offset_col]); 66 | norm_padded_prob_base = padarray(norm_prob_base,[pad_offset_row, pad_offset_col]); 67 | norm_padded_prob_base(:,:,1) = eps; 68 | 69 | padded_frame_255 = 255-padarray(uint8(ones(size(I,1),size(I,2))*255),[pad_offset_row, pad_offset_col]); 70 | 71 | padded_result_base = padded_result_base + padded_frame_255; 72 | 73 | %% load extended bounding box 74 | cache = load(sprintf(edgebox_cache_path, ids{i})); % boxes_padded 75 | boxes_padded = cache.boxes_padded; 76 | 77 | numBoxes = size(boxes_padded,1); 78 | cnt_process = 1; 79 | for bidx = 1:numBoxes 80 | box = boxes_padded(bidx,:); 81 | box_wd = box(3)-box(1)+1; 82 | box_ht = box(4)-box(2)+1; 83 | 84 | if min(box_wd, box_ht) < 112, continue; end 85 | 86 | input_data = preprocess_image_bb(padded_I, boxes_padded(bidx,:), config.im_sz); 87 | cnn_output = caffe('forward', input_data); 88 | 89 | segImg = permute(cnn_output{1}, [2, 1, 3]); 90 | segImg = imresize(segImg, [box_ht, box_wd], 'bilinear'); 91 | 92 | % accumulate prediction result 93 | cropped_prob_base = padded_prob_base(box(2):box(4),box(1):box(3),:); 94 | padded_prob_base(box(2):box(4),box(1):box(3),:) = max(cropped_prob_base,segImg); 95 | 96 | if mod(cnt_process, 10) == 0, fprintf(',%d', cnt_process); end 97 | if cnt_process >= config.max_proposal_num 98 | break; 99 | end 100 | 101 | cnt_process = cnt_process + 1; 102 | end 103 | 104 | %% save DeconvNet prediction score 105 | deconv_score = padded_prob_base(pad_offset_row:pad_offset_row+size(I,1)-1,pad_offset_col:pad_offset_col+size(I,2)-1,:); 106 | 107 | %% load fcn-8s score 108 | fcn_cache = load(sprintf(fcn_score_path, ids{i})); 109 | fcn_score = fcn_cache.score; 110 | 111 | %% ensemble 
112 | zero_mask = zeros(size(fcn_score)); 113 | fcn_score = max(zero_mask,fcn_score); 114 | 115 | ens_score = deconv_score .* fcn_score; 116 | [ens_segscore, ens_segmask] = max(ens_score, [], 3); 117 | ens_segmask = uint8(ens_segmask-1); 118 | 119 | %% densecrf 120 | fprintf('[densecrf.. '); 121 | prob_map = exp(ens_score - repmat(max(ens_score, [], 3), [1,1,size(ens_score,3)])); 122 | prob_map = prob_map ./ repmat(sum(prob_map, 3), [1,1, size(prob_map,3)]); 123 | unary = -log(prob_map); 124 | 125 | D = Densecrf(I,single(unary)); 126 | 127 | % Some settings. 128 | D.gaussian_x_stddev = 3; 129 | D.gaussian_y_stddev = 3; 130 | D.gaussian_weight = 3; 131 | 132 | D.bilateral_x_stddev = 20; 133 | D.bilateral_y_stddev = 20; 134 | D.bilateral_r_stddev = 3; 135 | D.bilateral_g_stddev = 3; 136 | D.bilateral_b_stddev = 3; 137 | D.bilateral_weight = 5; 138 | 139 | D.iterations = 10; 140 | 141 | D.mean_field; 142 | segmask = D.segmentation; 143 | resulting_seg = uint8(segmask-1); 144 | fprintf('done] '); 145 | 146 | %% save or display result 147 | if config.write_file 148 | imwrite(resulting_seg,cmap,sprintf(save_res_path, ids{i})); 149 | else 150 | subplot(1,2,1); 151 | imshow(I); 152 | subplot(1,2,2); 153 | resulting_seg_im = reshape(cmap(int32(resulting_seg)+1,:),[size(resulting_seg,1),size(resulting_seg,2),3]); 154 | imshow(resulting_seg_im); 155 | waitforbuttonpress; 156 | end 157 | fprintf(' done [%f]\n', toc); 158 | end 159 | 160 | %% function end 161 | end 162 | -------------------------------------------------------------------------------- /inference/run_demo.m: -------------------------------------------------------------------------------- 1 | clear all; close all; clc; 2 | 3 | %% startup 4 | startup; 5 | config.imageset = 'test'; 6 | config.cmap= './voc_gt_cmap.mat'; 7 | config.gpuNum = 0; 8 | config.Path.CNN.caffe_root = './caffe'; 9 | config.save_root = './results'; 10 | 11 | %% cache FCN-8s results 12 | config.write_file = 1; 13 | config.Path.CNN.script_path = 
'./FCN'; 14 | config.Path.CNN.model_data = [config.Path.CNN.script_path '/fcn-8s-pascal.caffemodel']; 15 | config.Path.CNN.model_proto = [config.Path.CNN.script_path '/fcn-8s-pascal-deploy.prototxt']; 16 | config.im_sz = 500; 17 | 18 | cache_FCN8s_results(config); 19 | 20 | %% generate EDeconvNet+CRF results 21 | 22 | config.write_file = 1; 23 | config.edgebox_cache_dir = './data/edgebox_cached/VOC2012_TEST'; 24 | config.Path.CNN.script_path = './DeconvNet'; 25 | config.Path.CNN.model_data = [config.Path.CNN.script_path '/DeconvNet_trainval_inference.caffemodel']; 26 | config.Path.CNN.model_proto = [config.Path.CNN.script_path '/DeconvNet_inference_deploy.prototxt']; 27 | config.max_proposal_num = 50; 28 | config.im_sz = 224; 29 | config.fcn_score_dir = './results/FCN8s'; 30 | 31 | generate_EDeconvNet_CRF_results(config); -------------------------------------------------------------------------------- /inference/setup.sh: -------------------------------------------------------------------------------- 1 | ############################ 2 | # prepare to run inference # 3 | ############################ 4 | 5 | # create symlinks 6 | # caffe 7 | ln -s ../caffe 8 | ln -s ../model/DeconvNet 9 | ln -s ../model/FCN 10 | 11 | # download necessary data for inference 12 | cd data 13 | # VOC2012 test data 14 | wget http://cvlab.postech.ac.kr/research/deconvnet/data/VOC2012_TEST.tar.gz 15 | tar -zxvf VOC2012_TEST.tar.gz 16 | rm -rf VOC2012_TEST.tar.gz 17 | 18 | # edgebox cached data for VOC2012 test 19 | wget http://cvlab.postech.ac.kr/research/deconvnet/data/edgebox_cached.tar.gz 20 | tar -zxvf edgebox_cached.tar.gz 21 | rm -rf edgebox_cached.tar.gz 22 | 23 | cd .. 24 | 25 | # download external dependencies 26 | cd ext 27 | # densecrf 28 | git clone https://github.com/johannesu/meanfield-matlab.git densecrf 29 | cd .. 
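The symlinks and downloads in inference/setup.sh feed the paths that run_demo.m, startup.m and init_VOC2012_TEST.m expect. As a minimal sketch, a sanity check run from the inference directory could look like the following. The function name `check_inference_setup` is hypothetical and not part of the repository, and the directory names `data/VOC2012_TEST` and `data/edgebox_cached/VOC2012_TEST` assume the tarballs extract to the paths those MATLAB scripts reference.

```shell
#!/bin/sh
# Sketch only: report which paths expected by the inference scripts are missing.
# check_inference_setup is a hypothetical helper, not part of the repository.
# The path list is taken from run_demo.m, startup.m and init_VOC2012_TEST.m;
# the extracted tarball layout is an assumption.
check_inference_setup() {
    root="${1:-.}"
    missing=0
    for p in caffe DeconvNet FCN ext/densecrf \
             data/VOC2012_TEST data/edgebox_cached/VOC2012_TEST; do
        # -e follows symlinks, so a dangling symlink is reported as missing
        if [ ! -e "$root/$p" ]; then
            echo "missing: $p"
            missing=$((missing + 1))
        fi
    done
    # exit status is the number of missing paths (0 means everything was found)
    return "$missing"
}
```

Calling `check_inference_setup . || echo "re-run setup.sh"` before launching MATLAB would surface a failed download or a dangling symlink early, rather than partway through the demo.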
30 | -------------------------------------------------------------------------------- /inference/startup.m: -------------------------------------------------------------------------------- 1 | clear all; close all; clc; 2 | %% initialize paths 3 | addpath(genpath('./ext/densecrf')); 4 | addpath(genpath('./util')); -------------------------------------------------------------------------------- /inference/util/init_VOC2012_TEST.m: -------------------------------------------------------------------------------- 1 | clear VOCopts 2 | 3 | % dataset 4 | % 5 | % Note for experienced users: the VOC2008-11 test sets are subsets 6 | % of the VOC2012 test set. You don't need to do anything special 7 | % to submit results for VOC2008-11. 8 | 9 | VOCopts.dataset='VOC2012_TEST'; 10 | 11 | % get devkit directory with forward slashes 12 | devkitroot='./data'; 13 | 14 | % change this path to point to your copy of the PASCAL VOC data 15 | VOCopts.datadir=[devkitroot '/']; 16 | 17 | % change this path to a writable directory for your results 18 | VOCopts.resdir=[devkitroot '/results/' VOCopts.dataset '/']; 19 | 20 | % change this path to a writable local directory for the example code 21 | VOCopts.localdir=[devkitroot '/local/' VOCopts.dataset '/']; 22 | 23 | % initialize the training set 24 | 25 | % VOCopts.trainset='train'; % use train for development 26 | VOCopts.trainset='trainval'; % use train+val for final challenge 27 | 28 | % initialize the test set 29 | 30 | % VOCopts.testset='val'; % use validation data for development test set 31 | VOCopts.testset='test'; % use test set for final challenge 32 | 33 | % initialize main challenge paths 34 | 35 | VOCopts.annopath=[VOCopts.datadir VOCopts.dataset '/Annotations/%s.xml']; 36 | VOCopts.imgpath=[VOCopts.datadir VOCopts.dataset '/JPEGImages/%s.jpg']; 37 | VOCopts.imgsetpath=[VOCopts.datadir VOCopts.dataset '/ImageSets/Main/%s.txt']; 38 | VOCopts.clsimgsetpath=[VOCopts.datadir VOCopts.dataset '/ImageSets/Main/%s_%s.txt']; 39 | 
VOCopts.clsrespath=[VOCopts.resdir 'Main/%s_cls_' VOCopts.testset '_%s.txt']; 40 | VOCopts.detrespath=[VOCopts.resdir 'Main/%s_det_' VOCopts.testset '_%s.txt']; 41 | 42 | % initialize segmentation task paths 43 | 44 | VOCopts.seg.clsimgpath=[VOCopts.datadir VOCopts.dataset '/SegmentationClass/%s.png']; 45 | VOCopts.seg.instimgpath=[VOCopts.datadir VOCopts.dataset '/SegmentationObject/%s.png']; 46 | 47 | VOCopts.seg.imgsetpath=[VOCopts.datadir VOCopts.dataset '/ImageSets/Segmentation/%s.txt']; 48 | 49 | VOCopts.seg.clsresdir=[VOCopts.resdir 'Segmentation/%s_%s_cls']; 50 | VOCopts.seg.instresdir=[VOCopts.resdir 'Segmentation/%s_%s_inst']; 51 | VOCopts.seg.clsrespath=[VOCopts.seg.clsresdir '/%s.png']; 52 | VOCopts.seg.instrespath=[VOCopts.seg.instresdir '/%s.png']; 53 | 54 | % initialize layout task paths 55 | 56 | VOCopts.layout.imgsetpath=[VOCopts.datadir VOCopts.dataset '/ImageSets/Layout/%s.txt']; 57 | VOCopts.layout.respath=[VOCopts.resdir 'Layout/%s_layout_' VOCopts.testset '.xml']; 58 | 59 | % initialize action task paths 60 | 61 | VOCopts.action.imgsetpath=[VOCopts.datadir VOCopts.dataset '/ImageSets/Action/%s.txt']; 62 | VOCopts.action.clsimgsetpath=[VOCopts.datadir VOCopts.dataset '/ImageSets/Action/%s_%s.txt']; 63 | VOCopts.action.respath=[VOCopts.resdir 'Action/%s_action_' VOCopts.testset '_%s.txt']; 64 | 65 | % initialize the VOC challenge options 66 | 67 | % classes 68 | 69 | VOCopts.classes={... 70 | 'aeroplane' 71 | 'bicycle' 72 | 'bird' 73 | 'boat' 74 | 'bottle' 75 | 'bus' 76 | 'car' 77 | 'cat' 78 | 'chair' 79 | 'cow' 80 | 'diningtable' 81 | 'dog' 82 | 'horse' 83 | 'motorbike' 84 | 'person' 85 | 'pottedplant' 86 | 'sheep' 87 | 'sofa' 88 | 'train' 89 | 'tvmonitor'}; 90 | 91 | VOCopts.nclasses=length(VOCopts.classes); 92 | 93 | % poses 94 | 95 | VOCopts.poses={... 96 | 'Unspecified' 97 | 'Left' 98 | 'Right' 99 | 'Frontal' 100 | 'Rear'}; 101 | 102 | VOCopts.nposes=length(VOCopts.poses); 103 | 104 | % layout parts 105 | 106 | VOCopts.parts={... 
107 | 'head' 108 | 'hand' 109 | 'foot'}; 110 | 111 | VOCopts.nparts=length(VOCopts.parts); 112 | 113 | VOCopts.maxparts=[1 2 2]; % max of each of above parts 114 | 115 | % actions 116 | 117 | VOCopts.actions={... 118 | 'other' % skip this when training classifiers 119 | 'jumping' 120 | 'phoning' 121 | 'playinginstrument' 122 | 'reading' 123 | 'ridingbike' 124 | 'ridinghorse' 125 | 'running' 126 | 'takingphoto' 127 | 'usingcomputer' 128 | 'walking'}; 129 | 130 | VOCopts.nactions=length(VOCopts.actions); 131 | 132 | % overlap threshold 133 | 134 | VOCopts.minoverlap=0.5; 135 | 136 | % annotation cache for evaluation 137 | 138 | VOCopts.annocachepath=[VOCopts.localdir '%s_anno.mat']; 139 | 140 | % options for example implementations 141 | 142 | VOCopts.exfdpath=[VOCopts.localdir '%s_fd.mat']; 143 | -------------------------------------------------------------------------------- /inference/util/preprocess_image.m: -------------------------------------------------------------------------------- 1 | function preprocessed_img = preprocess_image(img, img_sz) 2 | % B 104.00698793 G 116.66876762 R 122.67891434 3 | mean = [104.00698793; 116.66876762; 122.67891434]; 4 | mean = permute(mean,[2,3,1]); 5 | 6 | I_zero_mean = double(img(:,:,[3 2 1])) - repmat(mean,[size(img,1),size(img,2),1]); 7 | im = padarray(I_zero_mean,[img_sz - size(I_zero_mean,1), img_sz - size(I_zero_mean,2)],'post'); 8 | 9 | caffe_images = zeros(img_sz, img_sz, 3, 1, 'single'); 10 | caffe_images(:,:,:,1) = single(permute(im,[2, 1, 3])); 11 | 12 | preprocessed_img = {caffe_images}; 13 | end -------------------------------------------------------------------------------- /inference/util/preprocess_image_bb.m: -------------------------------------------------------------------------------- 1 | function preprocessed_img = preprocess_image_bb(img, box, img_sz) 2 | meanImg = [104.00698793, 116.66876762, 122.67891434]; % order = bgr 3 | meanImg = repmat(meanImg, [img_sz^2,1]); 4 | meanImg = reshape(meanImg, 
[img_sz, img_sz, 3]); 5 | 6 | crop = double(img(box(2):box(4),box(1):box(3),:)); 7 | crop = imresize(crop, [img_sz img_sz], 'bilinear'); % resize cropped image 8 | crop = crop(:,:,[3 2 1]) - meanImg; % convert color channel order rgb->bgr and subtract mean 9 | preprocessed_img = {single(permute(crop, [2 1 3]))}; 10 | 11 | end -------------------------------------------------------------------------------- /inference/voc_gt_cmap.mat: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/HyeonwooNoh/DeconvNet/87dd133d62b5407d09e4973bb9647dccb68e71f0/inference/voc_gt_cmap.mat -------------------------------------------------------------------------------- /model/DeconvNet/DeconvNet_inference_deploy.prototxt: -------------------------------------------------------------------------------- 1 | name: "DeconvNet_inference_deploy" 2 | 3 | input: "data" 4 | input_dim: 1 5 | input_dim: 3 6 | input_dim: 224 7 | input_dim: 224 8 | 9 | # 224 x 224 10 | # conv1_1 11 | layers { bottom: "data" top: "conv1_1" name: "conv1_1" type: CONVOLUTION 12 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 13 | convolution_param { num_output: 64 pad: 1 kernel_size: 3 }} 14 | layers { bottom: 'conv1_1' top: 'conv1_1' name: 'bn1_1' type: BN 15 | bn_param { scale_filler { type: 'constant' value: 1 } 16 | shift_filler { type: 'constant' value: 0.001 } 17 | bn_mode: INFERENCE } } 18 | layers { bottom: "conv1_1" top: "conv1_1" name: "relu1_1" type: RELU} 19 | # conv1_2 20 | layers { bottom: "conv1_1" top: "conv1_2" name: "conv1_2" type: CONVOLUTION 21 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 22 | convolution_param { num_output: 64 pad: 1 kernel_size: 3 }} 23 | layers { bottom: 'conv1_2' top: 'conv1_2' name: 'bn1_2' type: BN 24 | bn_param { scale_filler { type: 'constant' value: 1 } 25 | shift_filler { type: 'constant' value: 0.001 } 26 | bn_mode: INFERENCE } } 27 | layers { bottom: "conv1_2" top: "conv1_2" name: 
"relu1_2" type: RELU} 28 | 29 | # pool1 30 | layers { 31 | bottom: "conv1_2" top: "pool1" top:"pool1_mask" name: "pool1" type: POOLING 32 | pooling_param { pool: MAX kernel_size: 2 stride: 2 } 33 | } 34 | 35 | # 112 x 112 36 | # conv2_1 37 | layers { bottom: "pool1" top: "conv2_1" name: "conv2_1" type: CONVOLUTION 38 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 39 | convolution_param { num_output: 128 pad: 1 kernel_size: 3 }} 40 | layers { bottom: 'conv2_1' top: 'conv2_1' name: 'bn2_1' type: BN 41 | bn_param { scale_filler { type: 'constant' value: 1 } 42 | shift_filler { type: 'constant' value: 0.001 } 43 | bn_mode: INFERENCE } } 44 | layers { bottom: "conv2_1" top: "conv2_1" name: "relu2_1" type: RELU} 45 | # conv2_2 46 | layers { bottom: "conv2_1" top: "conv2_2" name: "conv2_2" type: CONVOLUTION 47 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 48 | convolution_param { num_output: 128 pad: 1 kernel_size: 3 }} 49 | layers { bottom: 'conv2_2' top: 'conv2_2' name: 'bn2_2' type: BN 50 | bn_param { scale_filler { type: 'constant' value: 1 } 51 | shift_filler { type: 'constant' value: 0.001 } 52 | bn_mode: INFERENCE } } 53 | layers { bottom: "conv2_2" top: "conv2_2" name: "relu2_2" type: RELU} 54 | 55 | # pool2 56 | layers { 57 | bottom: "conv2_2" top: "pool2" top: "pool2_mask" name: "pool2" type: POOLING 58 | pooling_param { pool: MAX kernel_size: 2 stride: 2 } 59 | } 60 | 61 | # 56 x 56 62 | # conv3_1 63 | layers { bottom: "pool2" top: "conv3_1" name: "conv3_1" type: CONVOLUTION 64 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 65 | convolution_param { num_output: 256 pad: 1 kernel_size: 3 }} 66 | layers { bottom: 'conv3_1' top: 'conv3_1' name: 'bn3_1' type: BN 67 | bn_param { scale_filler { type: 'constant' value: 1 } 68 | shift_filler { type: 'constant' value: 0.001 } 69 | bn_mode: INFERENCE } } 70 | layers { bottom: "conv3_1" top: "conv3_1" name: "relu3_1" type: RELU} 71 | # conv3_2 72 | layers { bottom: "conv3_1" top: "conv3_2" 
name: "conv3_2" type: CONVOLUTION 73 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 74 | convolution_param { num_output: 256 pad: 1 kernel_size: 3 }} 75 | layers { bottom: 'conv3_2' top: 'conv3_2' name: 'bn3_2' type: BN 76 | bn_param { scale_filler { type: 'constant' value: 1 } 77 | shift_filler { type: 'constant' value: 0.001 } 78 | bn_mode: INFERENCE } } 79 | layers { bottom: "conv3_2" top: "conv3_2" name: "relu3_2" type: RELU} 80 | # conv3_3 81 | layers { bottom: "conv3_2" top: "conv3_3" name: "conv3_3" type: CONVOLUTION 82 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 83 | convolution_param { num_output: 256 pad: 1 kernel_size: 3 }} 84 | layers { bottom: 'conv3_3' top: 'conv3_3' name: 'bn3_3' type: BN 85 | bn_param { scale_filler { type: 'constant' value: 1 } 86 | shift_filler { type: 'constant' value: 0.001 } 87 | bn_mode: INFERENCE } } 88 | layers { bottom: "conv3_3" top: "conv3_3" name: "relu3_3" type: RELU} 89 | 90 | # pool3 91 | layers { 92 | bottom: "conv3_3" top: "pool3" top: "pool3_mask" name: "pool3" type: POOLING 93 | pooling_param { pool: MAX kernel_size: 2 stride: 2 } 94 | } 95 | 96 | # 28 x 28 97 | # conv4_1 98 | layers { bottom: "pool3" top: "conv4_1" name: "conv4_1" type: CONVOLUTION 99 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 100 | convolution_param { num_output: 512 pad: 1 kernel_size: 3 }} 101 | layers { bottom: 'conv4_1' top: 'conv4_1' name: 'bn4_1' type: BN 102 | bn_param { scale_filler { type: 'constant' value: 1 } 103 | shift_filler { type: 'constant' value: 0.001 } 104 | bn_mode: INFERENCE } } 105 | layers { bottom: "conv4_1" top: "conv4_1" name: "relu4_1" type: RELU} 106 | # conv4_2 107 | layers { bottom: "conv4_1" top: "conv4_2" name: "conv4_2" type: CONVOLUTION 108 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 109 | convolution_param { num_output: 512 pad: 1 kernel_size: 3 }} 110 | layers { bottom: 'conv4_2' top: 'conv4_2' name: 'bn4_2' type: BN 111 | bn_param { scale_filler { type: 
'constant' value: 1 } 112 | shift_filler { type: 'constant' value: 0.001 } 113 | bn_mode: INFERENCE } } 114 | layers { bottom: "conv4_2" top: "conv4_2" name: "relu4_2" type: RELU} 115 | # conv4_3 116 | layers { bottom: "conv4_2" top: "conv4_3" name: "conv4_3" type: CONVOLUTION 117 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 118 | convolution_param { num_output: 512 pad: 1 kernel_size: 3 }} 119 | layers { bottom: 'conv4_3' top: 'conv4_3' name: 'bn4_3' type: BN 120 | bn_param { scale_filler { type: 'constant' value: 1 } 121 | shift_filler { type: 'constant' value: 0.001 } 122 | bn_mode: INFERENCE } } 123 | layers { bottom: "conv4_3" top: "conv4_3" name: "relu4_3" type: RELU} 124 | 125 | # pool4 126 | layers { 127 | bottom: "conv4_3" top: "pool4" top: "pool4_mask" name: "pool4" type: POOLING 128 | pooling_param { pool: MAX kernel_size: 2 stride: 2 } 129 | } 130 | 131 | # 14 x 14 132 | # conv5_1 133 | layers { bottom: "pool4" top: "conv5_1" name: "conv5_1" type: CONVOLUTION 134 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 135 | convolution_param { num_output: 512 pad: 1 kernel_size: 3 }} 136 | layers { bottom: 'conv5_1' top: 'conv5_1' name: 'bn5_1' type: BN 137 | bn_param { scale_filler { type: 'constant' value: 1 } 138 | shift_filler { type: 'constant' value: 0.001 } 139 | bn_mode: INFERENCE } } 140 | layers { bottom: "conv5_1" top: "conv5_1" name: "relu5_1" type: RELU} 141 | # conv5_2 142 | layers { bottom: "conv5_1" top: "conv5_2" name: "conv5_2" type: CONVOLUTION 143 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 144 | convolution_param { num_output: 512 pad: 1 kernel_size: 3 }} 145 | layers { bottom: 'conv5_2' top: 'conv5_2' name: 'bn5_2' type: BN 146 | bn_param { scale_filler { type: 'constant' value: 1 } 147 | shift_filler { type: 'constant' value: 0.001 } 148 | bn_mode: INFERENCE } } 149 | layers { bottom: "conv5_2" top: "conv5_2" name: "relu5_2" type: RELU} 150 | # conv5_3 151 | layers { bottom: "conv5_2" top: "conv5_3" 
name: "conv5_3" type: CONVOLUTION 152 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 153 | convolution_param { num_output: 512 pad: 1 kernel_size: 3 }} 154 | layers { bottom: 'conv5_3' top: 'conv5_3' name: 'bn5_3' type: BN 155 | bn_param { scale_filler { type: 'constant' value: 1 } 156 | shift_filler { type: 'constant' value: 0.001 } 157 | bn_mode: INFERENCE } } 158 | layers { bottom: "conv5_3" top: "conv5_3" name: "relu5_3" type: RELU} 159 | 160 | # pool5 161 | layers { 162 | bottom: "conv5_3" top: "pool5" top: "pool5_mask" name: "pool5" type: POOLING 163 | pooling_param { pool: MAX kernel_size: 2 stride: 2 } 164 | } 165 | 166 | # 7 x 7 167 | # fc6 168 | layers { bottom: 'pool5' top: 'fc6' name: 'fc6' type: CONVOLUTION 169 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 170 | convolution_param { kernel_size: 7 num_output: 4096 } } 171 | layers { bottom: 'fc6' top: 'fc6' name: 'bnfc6' type: BN 172 | bn_param { scale_filler { type: 'constant' value: 1 } 173 | shift_filler { type: 'constant' value: 0.001 } 174 | bn_mode: INFERENCE } } 175 | layers { bottom: "fc6" top: "fc6" name: "relu6" type: RELU} 176 | 177 | # 1 x 1 178 | # fc7 179 | layers { bottom: 'fc6' top: 'fc7' name: 'fc7' type: CONVOLUTION 180 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 181 | convolution_param { kernel_size: 1 num_output: 4096 } } 182 | layers { bottom: 'fc7' top: 'fc7' name: 'bnfc7' type: BN 183 | bn_param { scale_filler { type: 'constant' value: 1 } 184 | shift_filler { type: 'constant' value: 0.001 } 185 | bn_mode: INFERENCE } } 186 | layers { bottom: "fc7" top: "fc7" name: "relu7" type: RELU} 187 | 188 | # fc6-deconv 189 | layers { bottom: 'fc7' top: 'fc6-deconv' name: 'fc6-deconv' type: DECONVOLUTION 190 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 191 | convolution_param { num_output: 512 kernel_size: 7 192 | weight_filler { type: "gaussian" std: 0.01 } 193 | bias_filler { type: "constant" value: 0 }} } 194 | layers { bottom: 
'fc6-deconv' top: 'fc6-deconv' name: 'fc6-deconv-bn' type: BN 195 | bn_param { scale_filler { type: 'constant' value: 1 } 196 | shift_filler { type: 'constant' value: 0.001 } 197 | bn_mode: INFERENCE } } 198 | layers { bottom: 'fc6-deconv' top: 'fc6-deconv' name: 'fc6-deconv-relu' type: RELU } 199 | 200 | # 7 x 7 201 | # unpool5 202 | layers { type: UNPOOLING bottom: "fc6-deconv" bottom: "pool5_mask" top: "unpool5" name: "unpool5" 203 | unpooling_param { unpool: MAX kernel_size: 2 stride: 2 unpool_size: 14 } 204 | } 205 | 206 | # 14 x 14 207 | # deconv5_1 208 | layers { bottom: 'unpool5' top: 'deconv5_1' name: 'deconv5_1' type: DECONVOLUTION 209 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 210 | convolution_param { num_output: 512 pad: 1 kernel_size: 3 211 | weight_filler { type: "gaussian" std: 0.01 } 212 | bias_filler { type: "constant" value: 0 }} } 213 | layers { bottom: 'deconv5_1' top: 'deconv5_1' name: 'debn5_1' type: BN 214 | bn_param { scale_filler { type: 'constant' value: 1 } 215 | shift_filler { type: 'constant' value: 0.001 } 216 | bn_mode: INFERENCE } } 217 | layers { bottom: 'deconv5_1' top: 'deconv5_1' name: 'derelu5_1' type: RELU } 218 | # deconv5_2 219 | layers { bottom: 'deconv5_1' top: 'deconv5_2' name: 'deconv5_2' type: DECONVOLUTION 220 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 221 | convolution_param { num_output: 512 pad: 1 kernel_size: 3 222 | weight_filler { type: "gaussian" std: 0.01 } 223 | bias_filler { type: "constant" value: 0 }} } 224 | layers { bottom: 'deconv5_2' top: 'deconv5_2' name: 'debn5_2' type: BN 225 | bn_param { scale_filler { type: 'constant' value: 1 } 226 | shift_filler { type: 'constant' value: 0.001 } 227 | bn_mode: INFERENCE } } 228 | layers { bottom: 'deconv5_2' top: 'deconv5_2' name: 'derelu5_2' type: RELU } 229 | # deconv5_3 230 | layers { bottom: 'deconv5_2' top: 'deconv5_3' name: 'deconv5_3' type: DECONVOLUTION 231 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 232 | 
convolution_param { num_output: 512 pad: 1 kernel_size: 3 233 | weight_filler { type: "gaussian" std: 0.01 } 234 | bias_filler { type: "constant" value: 0 }} } 235 | layers { bottom: 'deconv5_3' top: 'deconv5_3' name: 'debn5_3' type: BN 236 | bn_param { scale_filler { type: 'constant' value: 1 } 237 | shift_filler { type: 'constant' value: 0.001 } 238 | bn_mode: INFERENCE } } 239 | layers { bottom: 'deconv5_3' top: 'deconv5_3' name: 'derelu5_3' type: RELU } 240 | 241 | # unpool4 242 | layers { type: UNPOOLING bottom: "deconv5_3" bottom: "pool4_mask" top: "unpool4" name: "unpool4" 243 | unpooling_param { unpool: MAX kernel_size: 2 stride: 2 unpool_size: 28 } 244 | } 245 | 246 | # 28 x 28 247 | # deconv4_1 248 | layers { bottom: 'unpool4' top: 'deconv4_1' name: 'deconv4_1' type: DECONVOLUTION 249 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 250 | convolution_param { num_output: 512 pad: 1 kernel_size: 3 251 | weight_filler { type: "gaussian" std: 0.01 } 252 | bias_filler { type: "constant" value: 0 }} } 253 | layers { bottom: 'deconv4_1' top: 'deconv4_1' name: 'debn4_1' type: BN 254 | bn_param { scale_filler { type: 'constant' value: 1 } 255 | shift_filler { type: 'constant' value: 0.001 } 256 | bn_mode: INFERENCE } } 257 | layers { bottom: 'deconv4_1' top: 'deconv4_1' name: 'derelu4_1' type: RELU } 258 | # deconv 4_2 259 | layers { bottom: 'deconv4_1' top: 'deconv4_2' name: 'deconv4_2' type: DECONVOLUTION 260 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 261 | convolution_param { num_output: 512 pad: 1 kernel_size: 3 262 | weight_filler { type: "gaussian" std: 0.01 } 263 | bias_filler { type: "constant" value: 0 }} } 264 | layers { bottom: 'deconv4_2' top: 'deconv4_2' name: 'debn4_2' type: BN 265 | bn_param { scale_filler { type: 'constant' value: 1 } 266 | shift_filler { type: 'constant' value: 0.001 } 267 | bn_mode: INFERENCE } } 268 | layers { bottom: 'deconv4_2' top: 'deconv4_2' name: 'derelu4_2' type: RELU } 269 | # deconv 4_3 270 | 
layers { bottom: 'deconv4_2' top: 'deconv4_3' name: 'deconv4_3' type: DECONVOLUTION 271 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 272 | convolution_param { num_output: 256 pad: 1 kernel_size: 3 273 | weight_filler { type: "gaussian" std: 0.01 } 274 | bias_filler { type: "constant" value: 0 }} } 275 | layers { bottom: 'deconv4_3' top: 'deconv4_3' name: 'debn4_3' type: BN 276 | bn_param { scale_filler { type: 'constant' value: 1 } 277 | shift_filler { type: 'constant' value: 0.001 } 278 | bn_mode: INFERENCE } } 279 | layers { bottom: 'deconv4_3' top: 'deconv4_3' name: 'derelu4_3' type: RELU } 280 | 281 | # unpool3 282 | layers { type: UNPOOLING bottom: "deconv4_3" bottom: "pool3_mask" top: "unpool3" name: "unpool3" 283 | unpooling_param { unpool: MAX kernel_size: 2 stride: 2 unpool_size: 56 } 284 | } 285 | 286 | # 56 x 56 287 | # deconv3_1 288 | layers { bottom: 'unpool3' top: 'deconv3_1' name: 'deconv3_1' type: DECONVOLUTION 289 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 290 | convolution_param { num_output:256 pad:1 kernel_size: 3 291 | weight_filler { type: "gaussian" std: 0.01 } 292 | bias_filler { type: "constant" value: 0 }} } 293 | layers { bottom: 'deconv3_1' top: 'deconv3_1' name: 'debn3_1' type: BN 294 | bn_param { scale_filler { type: 'constant' value: 1 } 295 | shift_filler { type: 'constant' value: 0.001 } 296 | bn_mode: INFERENCE } } 297 | layers { bottom: 'deconv3_1' top: 'deconv3_1' name: 'derelu3_1' type: RELU } 298 | # deconv3_2 299 | layers { bottom: 'deconv3_1' top: 'deconv3_2' name: 'deconv3_2' type: DECONVOLUTION 300 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 301 | convolution_param { num_output:256 pad:1 kernel_size: 3 302 | weight_filler { type: "gaussian" std: 0.01 } 303 | bias_filler { type: "constant" value: 0 }} } 304 | layers { bottom: 'deconv3_2' top: 'deconv3_2' name: 'debn3_2' type: BN 305 | bn_param { scale_filler { type: 'constant' value: 1 } 306 | shift_filler { type: 'constant' value: 
0.001 } 307 | bn_mode: INFERENCE } } 308 | layers { bottom: 'deconv3_2' top: 'deconv3_2' name: 'derelu3_2' type: RELU } 309 | # deconv3_3 310 | layers { bottom: 'deconv3_2' top: 'deconv3_3' name: 'deconv3_3' type: DECONVOLUTION 311 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 312 | convolution_param { num_output:128 pad:1 kernel_size: 3 313 | weight_filler { type: "gaussian" std: 0.01 } 314 | bias_filler { type: "constant" value: 0 }} } 315 | layers { bottom: 'deconv3_3' top: 'deconv3_3' name: 'debn3_3' type: BN 316 | bn_param { scale_filler { type: 'constant' value: 1 } 317 | shift_filler { type: 'constant' value: 0.001 } 318 | bn_mode: INFERENCE } } 319 | layers { bottom: 'deconv3_3' top: 'deconv3_3' name: 'derelu3_3' type: RELU } 320 | 321 | # unpool2 322 | layers { type: UNPOOLING bottom: "deconv3_3" bottom: "pool2_mask" top: "unpool2" name: "unpool2" 323 | unpooling_param { unpool: MAX kernel_size: 2 stride: 2 unpool_size: 112 } 324 | } 325 | 326 | # 112 x 112 327 | # deconv2_1 328 | layers { bottom: 'unpool2' top: 'deconv2_1' name: 'deconv2_1' type: DECONVOLUTION 329 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 330 | convolution_param { num_output:128 pad:1 kernel_size: 3 331 | weight_filler { type: "gaussian" std: 0.01 } 332 | bias_filler { type: "constant" value: 0 }} } 333 | layers { bottom: 'deconv2_1' top: 'deconv2_1' name: 'debn2_1' type: BN 334 | bn_param { scale_filler { type: 'constant' value: 1 } 335 | shift_filler { type: 'constant' value: 0.001 } 336 | bn_mode: INFERENCE } } 337 | layers { bottom: 'deconv2_1' top: 'deconv2_1' name: 'derelu2_1' type: RELU } 338 | # deconv2_2 339 | layers { bottom: 'deconv2_1' top: 'deconv2_2' name: 'deconv2_2' type: DECONVOLUTION 340 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 341 | convolution_param { num_output:64 pad:1 kernel_size: 3 342 | weight_filler { type: "gaussian" std: 0.01 } 343 | bias_filler { type: "constant" value: 0 }} } 344 | layers { bottom: 'deconv2_2' top: 
'deconv2_2' name: 'debn2_2' type: BN 345 | bn_param { scale_filler { type: 'constant' value: 1 } 346 | shift_filler { type: 'constant' value: 0.001 } 347 | bn_mode: INFERENCE } } 348 | layers { bottom: 'deconv2_2' top: 'deconv2_2' name: 'derelu2_2' type: RELU } 349 | 350 | # unpool1 351 | layers { type: UNPOOLING bottom: "deconv2_2" bottom: "pool1_mask" top: "unpool1" name: "unpool1" 352 | unpooling_param { unpool: MAX kernel_size: 2 stride: 2 unpool_size: 224 } 353 | } 354 | 355 | # deconv1_1 356 | layers { bottom: 'unpool1' top: 'deconv1_1' name: 'deconv1_1' type: DECONVOLUTION 357 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 358 | convolution_param { num_output:64 pad:1 kernel_size: 3 359 | weight_filler { type: "gaussian" std: 0.01 } 360 | bias_filler { type: "constant" value: 0 }} } 361 | layers { bottom: 'deconv1_1' top: 'deconv1_1' name: 'debn1_1' type: BN 362 | bn_param { scale_filler { type: 'constant' value: 1 } 363 | shift_filler { type: 'constant' value: 0.001 } 364 | bn_mode: INFERENCE } } 365 | layers { bottom: 'deconv1_1' top: 'deconv1_1' name: 'derelu1_1' type: RELU } 366 | 367 | # deconv1_2 368 | layers { bottom: 'deconv1_1' top: 'deconv1_2' name: 'deconv1_2' type: DECONVOLUTION 369 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 370 | convolution_param { num_output:64 pad:1 kernel_size: 3 371 | weight_filler { type: "gaussian" std: 0.01 } 372 | bias_filler { type: "constant" value: 0 }} } 373 | layers { bottom: 'deconv1_2' top: 'deconv1_2' name: 'debn1_2' type: BN 374 | bn_param { scale_filler { type: 'constant' value: 1 } 375 | shift_filler { type: 'constant' value: 0.001 } 376 | bn_mode: INFERENCE } } 377 | layers { bottom: 'deconv1_2' top: 'deconv1_2' name: 'derelu1_2' type: RELU } 378 | 379 | # seg-score 380 | layers { name: 'seg-score-voc' type: CONVOLUTION bottom: 'deconv1_2' top: 'seg-score' 381 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 382 | convolution_param { num_output: 21 kernel_size: 1 383 | 
weight_filler { type: "gaussian" std: 0.01 } 384 | bias_filler { type: "constant" value: 0 }} } 385 | 386 | -------------------------------------------------------------------------------- /model/DeconvNet/README.md: -------------------------------------------------------------------------------- 1 | **This directory contains the pre-trained DeconvNet definition and weight files** 2 | -------------------------------------------------------------------------------- /model/FCN/README.md: -------------------------------------------------------------------------------- 1 | **You need FCN-8s for inference (EDeconvNet+CRF)** 2 | -------------------------------------------------------------------------------- /model/FCN/fcn-8s-pascal-deploy.prototxt: -------------------------------------------------------------------------------- 1 | name: 'FCN' 2 | 3 | input: 'data' 4 | input_dim: 1 5 | input_dim: 3 6 | input_dim: 500 7 | input_dim: 500 8 | force_backward: true 9 | 10 | layers { bottom: 'data' top: 'conv1_1' name: 'conv1_1' type: CONVOLUTION 11 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 12 | convolution_param { engine: CAFFE num_output: 64 pad: 100 kernel_size: 3 } } 13 | layers { bottom: 'conv1_1' top: 'conv1_1' name: 'relu1_1' type: RELU } 14 | layers { bottom: 'conv1_1' top: 'conv1_2' name: 'conv1_2' type: CONVOLUTION 15 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 16 | convolution_param { engine: CAFFE num_output: 64 pad: 1 kernel_size: 3 } } 17 | layers { bottom: 'conv1_2' top: 'conv1_2' name: 'relu1_2' type: RELU } 18 | layers { name: 'pool1' bottom: 'conv1_2' top: 'pool1' type: POOLING 19 | pooling_param { pool: MAX kernel_size: 2 stride: 2 } } 20 | layers { name: 'conv2_1' bottom: 'pool1' top: 'conv2_1' type: CONVOLUTION 21 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 22 | convolution_param { engine: CAFFE num_output: 128 pad: 1 kernel_size: 3 } } 23 | layers { bottom: 'conv2_1' top: 'conv2_1' name: 'relu2_1' type: RELU } 24 | 
layers { bottom: 'conv2_1' top: 'conv2_2' name: 'conv2_2' type: CONVOLUTION 25 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 26 | convolution_param { engine: CAFFE num_output: 128 pad: 1 kernel_size: 3 } } 27 | layers { bottom: 'conv2_2' top: 'conv2_2' name: 'relu2_2' type: RELU } 28 | layers { bottom: 'conv2_2' top: 'pool2' name: 'pool2' type: POOLING 29 | pooling_param { pool: MAX kernel_size: 2 stride: 2 } } 30 | layers { bottom: 'pool2' top: 'conv3_1' name: 'conv3_1' type: CONVOLUTION 31 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 32 | convolution_param { engine: CAFFE num_output: 256 pad: 1 kernel_size: 3 } } 33 | layers { bottom: 'conv3_1' top: 'conv3_1' name: 'relu3_1' type: RELU } 34 | layers { bottom: 'conv3_1' top: 'conv3_2' name: 'conv3_2' type: CONVOLUTION 35 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 36 | convolution_param { engine: CAFFE num_output: 256 pad: 1 kernel_size: 3 } } 37 | layers { bottom: 'conv3_2' top: 'conv3_2' name: 'relu3_2' type: RELU } 38 | layers { bottom: 'conv3_2' top: 'conv3_3' name: 'conv3_3' type: CONVOLUTION 39 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 40 | convolution_param { engine: CAFFE num_output: 256 pad: 1 kernel_size: 3 } } 41 | layers { bottom: 'conv3_3' top: 'conv3_3' name: 'relu3_3' type: RELU } 42 | layers { bottom: 'conv3_3' top: 'pool3' name: 'pool3' type: POOLING 43 | pooling_param { pool: MAX kernel_size: 2 stride: 2 } } 44 | layers { bottom: 'pool3' top: 'conv4_1' name: 'conv4_1' type: CONVOLUTION 45 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 46 | convolution_param { engine: CAFFE num_output: 512 pad: 1 kernel_size: 3 } } 47 | layers { bottom: 'conv4_1' top: 'conv4_1' name: 'relu4_1' type: RELU } 48 | layers { bottom: 'conv4_1' top: 'conv4_2' name: 'conv4_2' type: CONVOLUTION 49 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 50 | convolution_param { engine: CAFFE num_output: 512 pad: 1 kernel_size: 3 } } 51 | layers { bottom: 
'conv4_2' top: 'conv4_2' name: 'relu4_2' type: RELU } 52 | layers { bottom: 'conv4_2' top: 'conv4_3' name: 'conv4_3' type: CONVOLUTION 53 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 54 | convolution_param { engine: CAFFE num_output: 512 pad: 1 kernel_size: 3 } } 55 | layers { bottom: 'conv4_3' top: 'conv4_3' name: 'relu4_3' type: RELU } 56 | layers { bottom: 'conv4_3' top: 'pool4' name: 'pool4' type: POOLING 57 | pooling_param { pool: MAX kernel_size: 2 stride: 2 } } 58 | layers { bottom: 'pool4' top: 'conv5_1' name: 'conv5_1' type: CONVOLUTION 59 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 60 | convolution_param { engine: CAFFE num_output: 512 pad: 1 kernel_size: 3 } } 61 | layers { bottom: 'conv5_1' top: 'conv5_1' name: 'relu5_1' type: RELU } 62 | layers { bottom: 'conv5_1' top: 'conv5_2' name: 'conv5_2' type: CONVOLUTION 63 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 64 | convolution_param { engine: CAFFE num_output: 512 pad: 1 kernel_size: 3 } } 65 | layers { bottom: 'conv5_2' top: 'conv5_2' name: 'relu5_2' type: RELU } 66 | layers { bottom: 'conv5_2' top: 'conv5_3' name: 'conv5_3' type: CONVOLUTION 67 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 68 | convolution_param { engine: CAFFE num_output: 512 pad: 1 kernel_size: 3 } } 69 | layers { bottom: 'conv5_3' top: 'conv5_3' name: 'relu5_3' type: RELU } 70 | layers { bottom: 'conv5_3' top: 'pool5' name: 'pool5' type: POOLING 71 | pooling_param { pool: MAX kernel_size: 2 stride: 2 } } 72 | layers { bottom: 'pool5' top: 'fc6' name: 'fc6' type: CONVOLUTION 73 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 74 | convolution_param { engine: CAFFE kernel_size: 7 num_output: 4096 } } 75 | layers { bottom: 'fc6' top: 'fc6' name: 'relu6' type: RELU } 76 | layers { bottom: 'fc6' top: 'fc6' name: 'drop6' type: DROPOUT 77 | dropout_param { dropout_ratio: 0.5 } } 78 | layers { bottom: 'fc6' top: 'fc7' name: 'fc7' type: CONVOLUTION 79 | blobs_lr: 1 blobs_lr: 2 
weight_decay: 1 weight_decay: 0 80 | convolution_param { engine: CAFFE kernel_size: 1 num_output: 4096 } } 81 | layers { bottom: 'fc7' top: 'fc7' name: 'relu7' type: RELU } 82 | layers { bottom: 'fc7' top: 'fc7' name: 'drop7' type: DROPOUT 83 | dropout_param { dropout_ratio: 0.5 } } 84 | layers { name: 'score-fr' type: CONVOLUTION bottom: 'fc7' top: 'score' 85 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 86 | convolution_param { engine: CAFFE num_output: 21 kernel_size: 1 } } 87 | 88 | layers { type: DECONVOLUTION name: 'score2' bottom: 'score' top: 'score2' 89 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 90 | convolution_param { kernel_size: 4 stride: 2 num_output: 21 } } 91 | 92 | layers { name: 'score-pool4' type: CONVOLUTION bottom: 'pool4' top: 'score-pool4' 93 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 94 | convolution_param { engine: CAFFE num_output: 21 kernel_size: 1 } } 95 | 96 | layers { type: CROP name: 'crop' bottom: 'score-pool4' bottom: 'score2' 97 | top: 'score-pool4c' } 98 | 99 | layers { type: ELTWISE name: 'fuse' bottom: 'score2' bottom: 'score-pool4c' 100 | top: 'score-fused' 101 | eltwise_param { operation: SUM } } 102 | 103 | layers { type: DECONVOLUTION name: 'score4' bottom: 'score-fused' 104 | top: 'score4' 105 | blobs_lr: 1 weight_decay: 1 106 | convolution_param { bias_term: false kernel_size: 4 stride: 2 num_output: 21 } } 107 | 108 | layers { name: 'score-pool3' type: CONVOLUTION bottom: 'pool3' top: 'score-pool3' 109 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 110 | convolution_param { engine: CAFFE num_output: 21 kernel_size: 1 } } 111 | 112 | layers { type: CROP name: 'crop' bottom: 'score-pool3' bottom: 'score4' 113 | top: 'score-pool3c' } 114 | 115 | layers { type: ELTWISE name: 'fuse' bottom: 'score4' bottom: 'score-pool3c' 116 | top: 'score-final' 117 | eltwise_param { operation: SUM } } 118 | 119 | layers { type: DECONVOLUTION name: 'upsample' 120 | bottom: 'score-final' 
top: 'bigscore' 121 | blobs_lr: 0 122 | convolution_param { bias_term: false num_output: 21 kernel_size: 16 stride: 8 } } 123 | 124 | layers { type: CROP name: 'crop' bottom: 'bigscore' bottom: 'data' top: 'upscore' } 125 | -------------------------------------------------------------------------------- /model/VGG_conv/README.md: -------------------------------------------------------------------------------- 1 | **This directory contains pre-trained VGG-16 weights whose fully-connected layers have been converted to convolution layers** 2 | -------------------------------------------------------------------------------- /model/get_model.sh: -------------------------------------------------------------------------------- 1 | 2 | cd VGG_conv 3 | wget http://cvlab.postech.ac.kr/research/deconvnet/model/VGG_conv/VGG_ILSVRC_16_layers_conv.caffemodel 4 | cd .. 5 | 6 | cd stage_1_train_result 7 | wget http://cvlab.postech.ac.kr/research/deconvnet/model/stage_1_train_result/stage_1_train_result.caffemodel 8 | cd .. 9 | 10 | cd stage_2_train_result 11 | wget http://cvlab.postech.ac.kr/research/deconvnet/model/stage_2_train_result/stage_2_train_result.caffemodel 12 | cd .. 13 | 14 | cd DeconvNet 15 | wget http://cvlab.postech.ac.kr/research/deconvnet/model/DeconvNet/DeconvNet_trainval_inference.caffemodel 16 | cd .. 17 | 18 | cd FCN 19 | wget http://dl.caffe.berkeleyvision.org/fcn-8s-pascal.caffemodel 20 | cd .. 
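
Because get_model.sh runs each `wget` without checking its exit status, a failed download can go unnoticed until Caffe tries to load the weights. The following is a minimal sketch, not part of the repository, of a check that could be run from the `model` directory afterwards. The file list is copied from the `wget` targets above; the helper name `missing_models` is hypothetical.

```python
from pathlib import Path

# Relative paths taken from the wget targets in get_model.sh.
EXPECTED_MODELS = [
    "VGG_conv/VGG_ILSVRC_16_layers_conv.caffemodel",
    "stage_1_train_result/stage_1_train_result.caffemodel",
    "stage_2_train_result/stage_2_train_result.caffemodel",
    "DeconvNet/DeconvNet_trainval_inference.caffemodel",
    "FCN/fcn-8s-pascal.caffemodel",
]


def missing_models(model_root="."):
    """Return the expected weight files that are absent or empty under model_root.

    Hypothetical helper: it only checks presence and non-zero size,
    not whether the file is a valid caffemodel.
    """
    root = Path(model_root)
    missing = []
    for rel in EXPECTED_MODELS:
        path = root / rel
        if not path.is_file() or path.stat().st_size == 0:
            missing.append(rel)
    return missing


if __name__ == "__main__":
    # Run from the model/ directory after get_model.sh has finished.
    for rel in missing_models("."):
        print("missing:", rel)
```

A size check is deliberately crude; it catches interrupted or skipped downloads but would not detect a truncated file.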
21 | 22 | -------------------------------------------------------------------------------- /model/stage_1_train_result/README.md: -------------------------------------------------------------------------------- 1 | **This directory contains the resulting weights of stage 1 training** 2 | -------------------------------------------------------------------------------- /model/stage_2_train_result/README.md: -------------------------------------------------------------------------------- 1 | **This directory contains the resulting weights of stage 2 training** 2 | -------------------------------------------------------------------------------- /setup.sh: -------------------------------------------------------------------------------- 1 | # get caffe (you should compile this caffe library with your own Makefile.config file) 2 | git clone https://github.com/HyeonwooNoh/caffe.git 3 | 4 | # get data 5 | cd data 6 | ./get_data.sh 7 | cd .. 8 | 9 | # get models 10 | cd model 11 | ./get_model.sh 12 | cd .. 13 | 14 | # prepare inference 15 | cd inference 16 | ./setup.sh 17 | cd .. 
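
The stage 1 solver in training/001_stage_1/solver.prototxt uses `lr_policy: "step"` with `base_lr: 0.01`, `gamma: 0.1`, `stepsize: 9000` and `max_iter: 20000`, together with `iter_size: 8` and a data-layer `batch_size: 8`. As a rough sketch of what those settings imply, assuming the standard Caffe step rule (`base_lr * gamma ^ floor(iter / stepsize)`) and gradient accumulation over `iter_size` passes, the schedule can be worked out as below. The function names are illustrative and do not come from the repository.

```python
# Values copied from training/001_stage_1/solver.prototxt and the
# TRAIN data layer of stage_1_train.prototxt.
BASE_LR = 0.01
GAMMA = 0.1
STEPSIZE = 9000
MAX_ITER = 20000
ITER_SIZE = 8
BATCH_SIZE = 8


def step_lr(iteration, base_lr=BASE_LR, gamma=GAMMA, stepsize=STEPSIZE):
    """Learning rate under Caffe's "step" policy at a given iteration."""
    return base_lr * gamma ** (iteration // stepsize)


def effective_batch_size(batch_size=BATCH_SIZE, iter_size=ITER_SIZE):
    """Images contributing to one weight update, assuming iter_size
    accumulates gradients over that many forward/backward passes."""
    return batch_size * iter_size


if __name__ == "__main__":
    # Print the rate at the start of each constant segment of the schedule.
    for start in range(0, MAX_ITER, STEPSIZE):
        print(f"iter {start:>5}: lr = {step_lr(start):.6f}")
    print("effective batch size:", effective_batch_size())
```

Under these assumptions the rate drops by a factor of ten at iterations 9000 and 18000, so only the last 2000 iterations run at the lowest rate.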
18 | 19 | -------------------------------------------------------------------------------- /training/001_stage_1/solver.prototxt: -------------------------------------------------------------------------------- 1 | net: "./stage_1_train.prototxt" 2 | base_lr: 0.01 3 | lr_policy: "step" 4 | stepsize: 9000 5 | gamma: 0.1 6 | iter_size: 8 7 | display: 1 8 | max_iter: 20000 9 | momentum: 0.9 10 | weight_decay: 0.0005 11 | test_interval: 500 12 | test_iter: 1000 13 | snapshot: 500 14 | snapshot_prefix: "./snapshot/stage_1_train" 15 | solver_mode: GPU 16 | -------------------------------------------------------------------------------- /training/001_stage_1/stage_1_train.prototxt: -------------------------------------------------------------------------------- 1 | name: "stage_1_training" 2 | 3 | layers { 4 | name: "data" 5 | type: IMAGE_SEG_DATA 6 | top: "data" 7 | top: "seg-label" 8 | image_data_param { 9 | root_folder: "./VOC2012" 10 | source: "./stage_1_train_imgset/train.txt" 11 | label_type: PIXEL 12 | batch_size: 8 13 | shuffle: true 14 | new_width: 250 15 | new_height: 250 16 | } 17 | transform_param { 18 | mean_value: 104.008 19 | mean_value: 116.669 20 | mean_value: 122.675 21 | crop_size: 224 22 | mirror: true 23 | } 24 | include: { phase: TRAIN } 25 | } 26 | layers { 27 | name: "data" 28 | type: IMAGE_SEG_DATA 29 | top: "data" 30 | top: "seg-label" 31 | image_data_param { 32 | root_folder: "./VOC2012" 33 | source: "./stage_1_train_imgset/val.txt" 34 | label_type: PIXEL 35 | batch_size: 8 36 | shuffle: true 37 | new_width: 250 38 | new_height: 250 39 | } 40 | transform_param { 41 | mean_value: 104.008 42 | mean_value: 116.669 43 | mean_value: 122.675 44 | crop_size: 224 45 | mirror: true 46 | } 47 | include: { phase: TEST } 48 | } 49 | 50 | # 224 x 224 51 | # conv1_1 52 | layers { bottom: "data" top: "conv1_1" name: "conv1_1" type: CONVOLUTION 53 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 54 | convolution_param { num_output: 64 pad: 1 
kernel_size: 3 }} 55 | layers { bottom: 'conv1_1' top: 'conv1_1' name: 'bn1_1' type: BN 56 | bn_param { scale_filler { type: 'constant' value: 1 } 57 | shift_filler { type: 'constant' value: 0.001 } } } 58 | layers { bottom: "conv1_1" top: "conv1_1" name: "relu1_1" type: RELU} 59 | # conv1_2 60 | layers { bottom: "conv1_1" top: "conv1_2" name: "conv1_2" type: CONVOLUTION 61 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 62 | convolution_param { num_output: 64 pad: 1 kernel_size: 3 }} 63 | layers { bottom: 'conv1_2' top: 'conv1_2' name: 'bn1_2' type: BN 64 | bn_param { scale_filler { type: 'constant' value: 1 } 65 | shift_filler { type: 'constant' value: 0.001 } } } 66 | layers { bottom: "conv1_2" top: "conv1_2" name: "relu1_2" type: RELU} 67 | 68 | # pool1 69 | layers { 70 | bottom: "conv1_2" top: "pool1" top:"pool1_mask" name: "pool1" type: POOLING 71 | pooling_param { pool: MAX kernel_size: 2 stride: 2 } 72 | } 73 | 74 | # 112 x 112 75 | # conv2_1 76 | layers { bottom: "pool1" top: "conv2_1" name: "conv2_1" type: CONVOLUTION 77 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 78 | convolution_param { num_output: 128 pad: 1 kernel_size: 3 }} 79 | layers { bottom: 'conv2_1' top: 'conv2_1' name: 'bn2_1' type: BN 80 | bn_param { scale_filler { type: 'constant' value: 1 } 81 | shift_filler { type: 'constant' value: 0.001 } } } 82 | layers { bottom: "conv2_1" top: "conv2_1" name: "relu2_1" type: RELU} 83 | # conv2_2 84 | layers { bottom: "conv2_1" top: "conv2_2" name: "conv2_2" type: CONVOLUTION 85 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 86 | convolution_param { num_output: 128 pad: 1 kernel_size: 3 }} 87 | layers { bottom: 'conv2_2' top: 'conv2_2' name: 'bn2_2' type: BN 88 | bn_param { scale_filler { type: 'constant' value: 1 } 89 | shift_filler { type: 'constant' value: 0.001 } } } 90 | layers { bottom: "conv2_2" top: "conv2_2" name: "relu2_2" type: RELU} 91 | 92 | # pool2 93 | layers { 94 | bottom: "conv2_2" top: "pool2" top: 
"pool2_mask" name: "pool2" type: POOLING 95 | pooling_param { pool: MAX kernel_size: 2 stride: 2 } 96 | } 97 | 98 | # 56 x 56 99 | # conv3_1 100 | layers { bottom: "pool2" top: "conv3_1" name: "conv3_1" type: CONVOLUTION 101 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 102 | convolution_param { num_output: 256 pad: 1 kernel_size: 3 }} 103 | layers { bottom: 'conv3_1' top: 'conv3_1' name: 'bn3_1' type: BN 104 | bn_param { scale_filler { type: 'constant' value: 1 } 105 | shift_filler { type: 'constant' value: 0.001 } } } 106 | layers { bottom: "conv3_1" top: "conv3_1" name: "relu3_1" type: RELU} 107 | # conv3_2 108 | layers { bottom: "conv3_1" top: "conv3_2" name: "conv3_2" type: CONVOLUTION 109 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 110 | convolution_param { num_output: 256 pad: 1 kernel_size: 3 }} 111 | layers { bottom: 'conv3_2' top: 'conv3_2' name: 'bn3_2' type: BN 112 | bn_param { scale_filler { type: 'constant' value: 1 } 113 | shift_filler { type: 'constant' value: 0.001 } } } 114 | layers { bottom: "conv3_2" top: "conv3_2" name: "relu3_2" type: RELU} 115 | # conv3_3 116 | layers { bottom: "conv3_2" top: "conv3_3" name: "conv3_3" type: CONVOLUTION 117 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 118 | convolution_param { num_output: 256 pad: 1 kernel_size: 3 }} 119 | layers { bottom: 'conv3_3' top: 'conv3_3' name: 'bn3_3' type: BN 120 | bn_param { scale_filler { type: 'constant' value: 1 } 121 | shift_filler { type: 'constant' value: 0.001 } } } 122 | layers { bottom: "conv3_3" top: "conv3_3" name: "relu3_3" type: RELU} 123 | 124 | # pool3 125 | layers { 126 | bottom: "conv3_3" top: "pool3" top: "pool3_mask" name: "pool3" type: POOLING 127 | pooling_param { pool: MAX kernel_size: 2 stride: 2 } 128 | } 129 | 130 | # 28 x 28 131 | # conv4_1 132 | layers { bottom: "pool3" top: "conv4_1" name: "conv4_1" type: CONVOLUTION 133 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 134 | convolution_param { num_output: 
512 pad: 1 kernel_size: 3 }} 135 | layers { bottom: 'conv4_1' top: 'conv4_1' name: 'bn4_1' type: BN 136 | bn_param { scale_filler { type: 'constant' value: 1 } 137 | shift_filler { type: 'constant' value: 0.001 } } } 138 | layers { bottom: "conv4_1" top: "conv4_1" name: "relu4_1" type: RELU} 139 | # conv4_2 140 | layers { bottom: "conv4_1" top: "conv4_2" name: "conv4_2" type: CONVOLUTION 141 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 142 | convolution_param { num_output: 512 pad: 1 kernel_size: 3 }} 143 | layers { bottom: 'conv4_2' top: 'conv4_2' name: 'bn4_2' type: BN 144 | bn_param { scale_filler { type: 'constant' value: 1 } 145 | shift_filler { type: 'constant' value: 0.001 } } } 146 | layers { bottom: "conv4_2" top: "conv4_2" name: "relu4_2" type: RELU} 147 | # conv4_3 148 | layers { bottom: "conv4_2" top: "conv4_3" name: "conv4_3" type: CONVOLUTION 149 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 150 | convolution_param { num_output: 512 pad: 1 kernel_size: 3 }} 151 | layers { bottom: 'conv4_3' top: 'conv4_3' name: 'bn4_3' type: BN 152 | bn_param { scale_filler { type: 'constant' value: 1 } 153 | shift_filler { type: 'constant' value: 0.001 } } } 154 | layers { bottom: "conv4_3" top: "conv4_3" name: "relu4_3" type: RELU} 155 | 156 | # pool4 157 | layers { 158 | bottom: "conv4_3" top: "pool4" top: "pool4_mask" name: "pool4" type: POOLING 159 | pooling_param { pool: MAX kernel_size: 2 stride: 2 } 160 | } 161 | 162 | # 14 x 14 163 | # conv5_1 164 | layers { bottom: "pool4" top: "conv5_1" name: "conv5_1" type: CONVOLUTION 165 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 166 | convolution_param { num_output: 512 pad: 1 kernel_size: 3 }} 167 | layers { bottom: 'conv5_1' top: 'conv5_1' name: 'bn5_1' type: BN 168 | bn_param { scale_filler { type: 'constant' value: 1 } 169 | shift_filler { type: 'constant' value: 0.001 } } } 170 | layers { bottom: "conv5_1" top: "conv5_1" name: "relu5_1" type: RELU} 171 | # conv5_2 172 | layers 
{ bottom: "conv5_1" top: "conv5_2" name: "conv5_2" type: CONVOLUTION 173 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 174 | convolution_param { num_output: 512 pad: 1 kernel_size: 3 }} 175 | layers { bottom: 'conv5_2' top: 'conv5_2' name: 'bn5_2' type: BN 176 | bn_param { scale_filler { type: 'constant' value: 1 } 177 | shift_filler { type: 'constant' value: 0.001 } } } 178 | layers { bottom: "conv5_2" top: "conv5_2" name: "relu5_2" type: RELU} 179 | # conv5_3 180 | layers { bottom: "conv5_2" top: "conv5_3" name: "conv5_3" type: CONVOLUTION 181 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 182 | convolution_param { num_output: 512 pad: 1 kernel_size: 3 }} 183 | layers { bottom: 'conv5_3' top: 'conv5_3' name: 'bn5_3' type: BN 184 | bn_param { scale_filler { type: 'constant' value: 1 } 185 | shift_filler { type: 'constant' value: 0.001 } } } 186 | layers { bottom: "conv5_3" top: "conv5_3" name: "relu5_3" type: RELU} 187 | 188 | # pool5 189 | layers { 190 | bottom: "conv5_3" top: "pool5" top: "pool5_mask" name: "pool5" type: POOLING 191 | pooling_param { pool: MAX kernel_size: 2 stride: 2 } 192 | } 193 | 194 | # 7 x 7 195 | # fc6 196 | layers { bottom: 'pool5' top: 'fc6' name: 'fc6' type: CONVOLUTION 197 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 198 | convolution_param { kernel_size: 7 num_output: 4096 } } 199 | layers { bottom: 'fc6' top: 'fc6' name: 'bnfc6' type: BN 200 | bn_param { scale_filler { type: 'constant' value: 1 } 201 | shift_filler { type: 'constant' value: 0.001 } } } 202 | layers { bottom: "fc6" top: "fc6" name: "relu6" type: RELU} 203 | 204 | # 1 x 1 205 | # fc7 206 | layers { bottom: 'fc6' top: 'fc7' name: 'fc7' type: CONVOLUTION 207 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 208 | convolution_param { kernel_size: 1 num_output: 4096 } } 209 | layers { bottom: 'fc7' top: 'fc7' name: 'bnfc7' type: BN 210 | bn_param { scale_filler { type: 'constant' value: 1 } 211 | shift_filler { type: 'constant' 
value: 0.001 } } } 212 | layers { bottom: "fc7" top: "fc7" name: "relu7" type: RELU} 213 | 214 | # fc6-deconv 215 | layers { bottom: 'fc7' top: 'fc6-deconv' name: 'fc6-deconv' type: DECONVOLUTION 216 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 217 | convolution_param { num_output: 512 kernel_size: 7 218 | weight_filler { type: "gaussian" std: 0.01 } 219 | bias_filler { type: "constant" value: 0 }} } 220 | layers { bottom: 'fc6-deconv' top: 'fc6-deconv' name: 'fc6-deconv-bn' type: BN 221 | bn_param { scale_filler { type: 'constant' value: 1 } 222 | shift_filler { type: 'constant' value: 0.001 } } } 223 | layers { bottom: 'fc6-deconv' top: 'fc6-deconv' name: 'fc6-deconv-relu' type: RELU } 224 | 225 | # 7 x 7 226 | # unpool5 227 | layers { type: UNPOOLING bottom: "fc6-deconv" bottom: "pool5_mask" top: "unpool5" name: "unpool5" 228 | unpooling_param { unpool: MAX kernel_size: 2 stride: 2 unpool_size: 14 } 229 | } 230 | 231 | # 14 x 14 232 | # deconv5_1 233 | layers { bottom: 'unpool5' top: 'deconv5_1' name: 'deconv5_1' type: DECONVOLUTION 234 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 235 | convolution_param { num_output: 512 pad: 1 kernel_size: 3 236 | weight_filler { type: "gaussian" std: 0.01 } 237 | bias_filler { type: "constant" value: 0 }} } 238 | layers { bottom: 'deconv5_1' top: 'deconv5_1' name: 'debn5_1' type: BN 239 | bn_param { scale_filler { type: 'constant' value: 1 } 240 | shift_filler { type: 'constant' value: 0.001 } } } 241 | layers { bottom: 'deconv5_1' top: 'deconv5_1' name: 'derelu5_1' type: RELU } 242 | # deconv5_2 243 | layers { bottom: 'deconv5_1' top: 'deconv5_2' name: 'deconv5_2' type: DECONVOLUTION 244 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 245 | convolution_param { num_output: 512 pad: 1 kernel_size: 3 246 | weight_filler { type: "gaussian" std: 0.01 } 247 | bias_filler { type: "constant" value: 0 }} } 248 | layers { bottom: 'deconv5_2' top: 'deconv5_2' name: 'debn5_2' type: BN 249 | bn_param { 
scale_filler { type: 'constant' value: 1 } 250 | shift_filler { type: 'constant' value: 0.001 } } } 251 | layers { bottom: 'deconv5_2' top: 'deconv5_2' name: 'derelu5_2' type: RELU } 252 | # deconv5_3 253 | layers { bottom: 'deconv5_2' top: 'deconv5_3' name: 'deconv5_3' type: DECONVOLUTION 254 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 255 | convolution_param { num_output: 512 pad: 1 kernel_size: 3 256 | weight_filler { type: "gaussian" std: 0.01 } 257 | bias_filler { type: "constant" value: 0 }} } 258 | layers { bottom: 'deconv5_3' top: 'deconv5_3' name: 'debn5_3' type: BN 259 | bn_param { scale_filler { type: 'constant' value: 1 } 260 | shift_filler { type: 'constant' value: 0.001 } } } 261 | layers { bottom: 'deconv5_3' top: 'deconv5_3' name: 'derelu5_3' type: RELU } 262 | 263 | # unpool4 264 | layers { type: UNPOOLING bottom: "deconv5_3" bottom: "pool4_mask" top: "unpool4" name: "unpool4" 265 | unpooling_param { unpool: MAX kernel_size: 2 stride: 2 unpool_size: 28 } 266 | } 267 | 268 | # 28 x 28 269 | # deconv4_1 270 | layers { bottom: 'unpool4' top: 'deconv4_1' name: 'deconv4_1' type: DECONVOLUTION 271 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 272 | convolution_param { num_output: 512 pad: 1 kernel_size: 3 273 | weight_filler { type: "gaussian" std: 0.01 } 274 | bias_filler { type: "constant" value: 0 }} } 275 | layers { bottom: 'deconv4_1' top: 'deconv4_1' name: 'debn4_1' type: BN 276 | bn_param { scale_filler { type: 'constant' value: 1 } 277 | shift_filler { type: 'constant' value: 0.001 } } } 278 | layers { bottom: 'deconv4_1' top: 'deconv4_1' name: 'derelu4_1' type: RELU } 279 | # deconv 4_2 280 | layers { bottom: 'deconv4_1' top: 'deconv4_2' name: 'deconv4_2' type: DECONVOLUTION 281 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 282 | convolution_param { num_output: 512 pad: 1 kernel_size: 3 283 | weight_filler { type: "gaussian" std: 0.01 } 284 | bias_filler { type: "constant" value: 0 }} } 285 | layers { 
bottom: 'deconv4_2' top: 'deconv4_2' name: 'debn4_2' type: BN 286 | bn_param { scale_filler { type: 'constant' value: 1 } 287 | shift_filler { type: 'constant' value: 0.001 } } } 288 | layers { bottom: 'deconv4_2' top: 'deconv4_2' name: 'derelu4_2' type: RELU } 289 | # deconv 4_3 290 | layers { bottom: 'deconv4_2' top: 'deconv4_3' name: 'deconv4_3' type: DECONVOLUTION 291 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 292 | convolution_param { num_output: 256 pad: 1 kernel_size: 3 293 | weight_filler { type: "gaussian" std: 0.01 } 294 | bias_filler { type: "constant" value: 0 }} } 295 | layers { bottom: 'deconv4_3' top: 'deconv4_3' name: 'debn4_3' type: BN 296 | bn_param { scale_filler { type: 'constant' value: 1 } 297 | shift_filler { type: 'constant' value: 0.001 } } } 298 | layers { bottom: 'deconv4_3' top: 'deconv4_3' name: 'derelu4_3' type: RELU } 299 | 300 | # unpool3 301 | layers { type: UNPOOLING bottom: "deconv4_3" bottom: "pool3_mask" top: "unpool3" name: "unpool3" 302 | unpooling_param { unpool: MAX kernel_size: 2 stride: 2 unpool_size: 56 } 303 | } 304 | 305 | # 56 x 56 306 | # deconv3_1 307 | layers { bottom: 'unpool3' top: 'deconv3_1' name: 'deconv3_1' type: DECONVOLUTION 308 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 309 | convolution_param { num_output:256 pad:1 kernel_size: 3 310 | weight_filler { type: "gaussian" std: 0.01 } 311 | bias_filler { type: "constant" value: 0 }} } 312 | layers { bottom: 'deconv3_1' top: 'deconv3_1' name: 'debn3_1' type: BN 313 | bn_param { scale_filler { type: 'constant' value: 1 } 314 | shift_filler { type: 'constant' value: 0.001 } } } 315 | layers { bottom: 'deconv3_1' top: 'deconv3_1' name: 'derelu3_1' type: RELU } 316 | # deconv3_2 317 | layers { bottom: 'deconv3_1' top: 'deconv3_2' name: 'deconv3_2' type: DECONVOLUTION 318 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 319 | convolution_param { num_output:256 pad:1 kernel_size: 3 320 | weight_filler { type: "gaussian" std: 0.01 
} 321 | bias_filler { type: "constant" value: 0 }} } 322 | layers { bottom: 'deconv3_2' top: 'deconv3_2' name: 'debn3_2' type: BN 323 | bn_param { scale_filler { type: 'constant' value: 1 } 324 | shift_filler { type: 'constant' value: 0.001 } } } 325 | layers { bottom: 'deconv3_2' top: 'deconv3_2' name: 'derelu3_2' type: RELU } 326 | # deconv3_3 327 | layers { bottom: 'deconv3_2' top: 'deconv3_3' name: 'deconv3_3' type: DECONVOLUTION 328 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 329 | convolution_param { num_output:128 pad:1 kernel_size: 3 330 | weight_filler { type: "gaussian" std: 0.01 } 331 | bias_filler { type: "constant" value: 0 }} } 332 | layers { bottom: 'deconv3_3' top: 'deconv3_3' name: 'debn3_3' type: BN 333 | bn_param { scale_filler { type: 'constant' value: 1 } 334 | shift_filler { type: 'constant' value: 0.001 } } } 335 | layers { bottom: 'deconv3_3' top: 'deconv3_3' name: 'derelu3_3' type: RELU } 336 | 337 | # unpool2 338 | layers { type: UNPOOLING bottom: "deconv3_3" bottom: "pool2_mask" top: "unpool2" name: "unpool2" 339 | unpooling_param { unpool: MAX kernel_size: 2 stride: 2 unpool_size: 112 } 340 | } 341 | 342 | # 112 x 112 343 | # deconv2_1 344 | layers { bottom: 'unpool2' top: 'deconv2_1' name: 'deconv2_1' type: DECONVOLUTION 345 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 346 | convolution_param { num_output:128 pad:1 kernel_size: 3 347 | weight_filler { type: "gaussian" std: 0.01 } 348 | bias_filler { type: "constant" value: 0 }} } 349 | layers { bottom: 'deconv2_1' top: 'deconv2_1' name: 'debn2_1' type: BN 350 | bn_param { scale_filler { type: 'constant' value: 1 } 351 | shift_filler { type: 'constant' value: 0.001 } } } 352 | layers { bottom: 'deconv2_1' top: 'deconv2_1' name: 'derelu2_1' type: RELU } 353 | # deconv2_2 354 | layers { bottom: 'deconv2_1' top: 'deconv2_2' name: 'deconv2_2' type: DECONVOLUTION 355 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 356 | convolution_param { num_output:64 
pad:1 kernel_size: 3 357 | weight_filler { type: "gaussian" std: 0.01 } 358 | bias_filler { type: "constant" value: 0 }} } 359 | layers { bottom: 'deconv2_2' top: 'deconv2_2' name: 'debn2_2' type: BN 360 | bn_param { scale_filler { type: 'constant' value: 1 } 361 | shift_filler { type: 'constant' value: 0.001 } } } 362 | layers { bottom: 'deconv2_2' top: 'deconv2_2' name: 'derelu2_2' type: RELU } 363 | 364 | # unpool1 365 | layers { type: UNPOOLING bottom: "deconv2_2" bottom: "pool1_mask" top: "unpool1" name: "unpool1" 366 | unpooling_param { unpool: MAX kernel_size: 2 stride: 2 unpool_size: 224 } 367 | } 368 | 369 | # deconv1_1 370 | layers { bottom: 'unpool1' top: 'deconv1_1' name: 'deconv1_1' type: DECONVOLUTION 371 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 372 | convolution_param { num_output:64 pad:1 kernel_size: 3 373 | weight_filler { type: "gaussian" std: 0.01 } 374 | bias_filler { type: "constant" value: 0 }} } 375 | layers { bottom: 'deconv1_1' top: 'deconv1_1' name: 'debn1_1' type: BN 376 | bn_param { scale_filler { type: 'constant' value: 1 } 377 | shift_filler { type: 'constant' value: 0.001 } } } 378 | layers { bottom: 'deconv1_1' top: 'deconv1_1' name: 'derelu1_1' type: RELU } 379 | 380 | # deconv1_2 381 | layers { bottom: 'deconv1_1' top: 'deconv1_2' name: 'deconv1_2' type: DECONVOLUTION 382 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 383 | convolution_param { num_output:64 pad:1 kernel_size: 3 384 | weight_filler { type: "gaussian" std: 0.01 } 385 | bias_filler { type: "constant" value: 0 }} } 386 | layers { bottom: 'deconv1_2' top: 'deconv1_2' name: 'debn1_2' type: BN 387 | bn_param { scale_filler { type: 'constant' value: 1 } 388 | shift_filler { type: 'constant' value: 0.001 } } } 389 | layers { bottom: 'deconv1_2' top: 'deconv1_2' name: 'derelu1_2' type: RELU } 390 | 391 | # seg-score 392 | layers { name: 'seg-score-voc' type: CONVOLUTION bottom: 'deconv1_2' top: 'seg-score' 393 | blobs_lr: 1 blobs_lr: 2 
weight_decay: 1 weight_decay: 0 394 | convolution_param { num_output: 21 kernel_size: 1 395 | weight_filler { type: "gaussian" std: 0.01 } 396 | bias_filler { type: "constant" value: 0 }} } 397 | 398 | 399 | # score and accuracy 400 | layers { 401 | name: "seg-accuracy" 402 | type: ELTWISE_ACCURACY 403 | bottom: "seg-score" 404 | bottom: "seg-label" 405 | top: "seg-accuracy" 406 | eltwise_accuracy_param { 407 | ignore_label: 255 408 | } 409 | include: { phase: TEST } 410 | } 411 | 412 | layers { 413 | name: "seg-loss" 414 | type: SOFTMAX_LOSS 415 | bottom: "seg-score" 416 | bottom: "seg-label" 417 | top: "seg-loss" 418 | loss_param { 419 | ignore_label: 255 420 | } 421 | } 422 | 423 | -------------------------------------------------------------------------------- /training/001_stage_1/start_train.sh: -------------------------------------------------------------------------------- 1 | LOGDIR=./training_log 2 | CAFFE=./caffe/build/tools/caffe 3 | SOLVER=./solver.prototxt 4 | WEIGHTS=./model/VGG_conv/VGG_ILSVRC_16_layers_conv.caffemodel 5 | 6 | GLOG_log_dir=$LOGDIR $CAFFE train -solver $SOLVER -weights $WEIGHTS -gpu 0 7 | 8 | #send_notify_mail "039_voc_single_object_finetune_from_031 train script is finished" 9 | -------------------------------------------------------------------------------- /training/001_start_train.sh: -------------------------------------------------------------------------------- 1 | ########################## 2 | # start stage 1 training # 3 | ########################## 4 | cd 001_stage_1 5 | 6 | ## create symlinks 7 | 8 | # image data 9 | ln -s ../../data/VOC2012 10 | # training / validation imageset 11 | ln -s ../../data/imagesets/stage_1_train_imgset 12 | # caffe 13 | ln -s ../../caffe 14 | # pre-trained caffe model (VGG16 - fully convolutional) 15 | ln -s ../../model 16 | # training result saving path 17 | ln -s ../../model/stage_1_train_result 18 | 19 | ## create directories 20 | mkdir snapshot 21 | mkdir training_log 22 | 23 | ## start
training 24 | ./start_train.sh 25 | 26 | ## copy and rename trained model 27 | cp ./snapshot/stage_1_train_iter_20000.caffemodel ./stage_1_train_result/stage_1_train_result.caffemodel 28 | -------------------------------------------------------------------------------- /training/002_stage_2/solver.prototxt: -------------------------------------------------------------------------------- 1 | net: "./stage_2_train.prototxt" 2 | base_lr: 0.01 3 | lr_policy: "step" 4 | gamma: 0.1 5 | stepsize: 10000 6 | iter_size: 8 7 | display: 1 8 | max_iter: 40000 9 | momentum: 0.9 10 | weight_decay: 0.0005 11 | test_interval: 500 12 | test_iter: 1000 13 | snapshot: 500 14 | snapshot_prefix: "./snapshot/stage_2_train" 15 | solver_mode: GPU 16 | -------------------------------------------------------------------------------- /training/002_stage_2/stage_2_train.prototxt: -------------------------------------------------------------------------------- 1 | name: "stage_2_train" 2 | 3 | layers { 4 | name: "data" 5 | type: WINDOW_SEG_DATA 6 | top: "data" 7 | top: "seg-label" 8 | image_data_param { 9 | root_folder: "./VOC2012" 10 | source: "./stage_2_train_imgset/train.txt" 11 | label_type: PIXEL 12 | batch_size: 8 13 | shuffle: true 14 | new_width: 250 15 | new_height: 250 16 | } 17 | transform_param { 18 | mean_value: 104.008 19 | mean_value: 116.669 20 | mean_value: 122.675 21 | crop_size: 224 22 | mirror: true 23 | } 24 | include: { phase: TRAIN } 25 | } 26 | layers { 27 | name: "data" 28 | type: WINDOW_SEG_DATA 29 | top: "data" 30 | top: "seg-label" 31 | image_data_param { 32 | root_folder: "./VOC2012" 33 | source: "./stage_2_train_imgset/val.txt" 34 | label_type: PIXEL 35 | batch_size: 8 36 | shuffle: true 37 | new_width: 224 38 | new_height: 224 39 | } 40 | transform_param { 41 | mean_value: 104.008 42 | mean_value: 116.669 43 | mean_value: 122.675 44 | crop_size: 224 45 | mirror: true 46 | } 47 | include: { phase: TEST } 48 | } 49 | 50 | # 224 x 224 51 | # conv1_1 52 | layers { 
bottom: "data" top: "conv1_1" name: "conv1_1" type: CONVOLUTION 53 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 54 | convolution_param { num_output: 64 pad: 1 kernel_size: 3 }} 55 | layers { bottom: 'conv1_1' top: 'conv1_1' name: 'bn1_1' type: BN 56 | bn_param { scale_filler { type: 'constant' value: 1 } 57 | shift_filler { type: 'constant' value: 0.001 } } } 58 | layers { bottom: "conv1_1" top: "conv1_1" name: "relu1_1" type: RELU} 59 | # conv1_2 60 | layers { bottom: "conv1_1" top: "conv1_2" name: "conv1_2" type: CONVOLUTION 61 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 62 | convolution_param { num_output: 64 pad: 1 kernel_size: 3 }} 63 | layers { bottom: 'conv1_2' top: 'conv1_2' name: 'bn1_2' type: BN 64 | bn_param { scale_filler { type: 'constant' value: 1 } 65 | shift_filler { type: 'constant' value: 0.001 } } } 66 | layers { bottom: "conv1_2" top: "conv1_2" name: "relu1_2" type: RELU} 67 | 68 | # pool1 69 | layers { 70 | bottom: "conv1_2" top: "pool1" top:"pool1_mask" name: "pool1" type: POOLING 71 | pooling_param { pool: MAX kernel_size: 2 stride: 2 } 72 | } 73 | 74 | # 112 x 112 75 | # conv2_1 76 | layers { bottom: "pool1" top: "conv2_1" name: "conv2_1" type: CONVOLUTION 77 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 78 | convolution_param { num_output: 128 pad: 1 kernel_size: 3 }} 79 | layers { bottom: 'conv2_1' top: 'conv2_1' name: 'bn2_1' type: BN 80 | bn_param { scale_filler { type: 'constant' value: 1 } 81 | shift_filler { type: 'constant' value: 0.001 } } } 82 | layers { bottom: "conv2_1" top: "conv2_1" name: "relu2_1" type: RELU} 83 | # conv2_2 84 | layers { bottom: "conv2_1" top: "conv2_2" name: "conv2_2" type: CONVOLUTION 85 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 86 | convolution_param { num_output: 128 pad: 1 kernel_size: 3 }} 87 | layers { bottom: 'conv2_2' top: 'conv2_2' name: 'bn2_2' type: BN 88 | bn_param { scale_filler { type: 'constant' value: 1 } 89 | shift_filler { type: 
'constant' value: 0.001 } } } 90 | layers { bottom: "conv2_2" top: "conv2_2" name: "relu2_2" type: RELU} 91 | 92 | # pool2 93 | layers { 94 | bottom: "conv2_2" top: "pool2" top: "pool2_mask" name: "pool2" type: POOLING 95 | pooling_param { pool: MAX kernel_size: 2 stride: 2 } 96 | } 97 | 98 | # 56 x 56 99 | # conv3_1 100 | layers { bottom: "pool2" top: "conv3_1" name: "conv3_1" type: CONVOLUTION 101 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 102 | convolution_param { num_output: 256 pad: 1 kernel_size: 3 }} 103 | layers { bottom: 'conv3_1' top: 'conv3_1' name: 'bn3_1' type: BN 104 | bn_param { scale_filler { type: 'constant' value: 1 } 105 | shift_filler { type: 'constant' value: 0.001 } } } 106 | layers { bottom: "conv3_1" top: "conv3_1" name: "relu3_1" type: RELU} 107 | # conv3_2 108 | layers { bottom: "conv3_1" top: "conv3_2" name: "conv3_2" type: CONVOLUTION 109 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 110 | convolution_param { num_output: 256 pad: 1 kernel_size: 3 }} 111 | layers { bottom: 'conv3_2' top: 'conv3_2' name: 'bn3_2' type: BN 112 | bn_param { scale_filler { type: 'constant' value: 1 } 113 | shift_filler { type: 'constant' value: 0.001 } } } 114 | layers { bottom: "conv3_2" top: "conv3_2" name: "relu3_2" type: RELU} 115 | # conv3_3 116 | layers { bottom: "conv3_2" top: "conv3_3" name: "conv3_3" type: CONVOLUTION 117 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 118 | convolution_param { num_output: 256 pad: 1 kernel_size: 3 }} 119 | layers { bottom: 'conv3_3' top: 'conv3_3' name: 'bn3_3' type: BN 120 | bn_param { scale_filler { type: 'constant' value: 1 } 121 | shift_filler { type: 'constant' value: 0.001 } } } 122 | layers { bottom: "conv3_3" top: "conv3_3" name: "relu3_3" type: RELU} 123 | 124 | # pool3 125 | layers { 126 | bottom: "conv3_3" top: "pool3" top: "pool3_mask" name: "pool3" type: POOLING 127 | pooling_param { pool: MAX kernel_size: 2 stride: 2 } 128 | } 129 | 130 | # 28 x 28 131 | # conv4_1 
132 | layers { bottom: "pool3" top: "conv4_1" name: "conv4_1" type: CONVOLUTION 133 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 134 | convolution_param { num_output: 512 pad: 1 kernel_size: 3 }} 135 | layers { bottom: 'conv4_1' top: 'conv4_1' name: 'bn4_1' type: BN 136 | bn_param { scale_filler { type: 'constant' value: 1 } 137 | shift_filler { type: 'constant' value: 0.001 } } } 138 | layers { bottom: "conv4_1" top: "conv4_1" name: "relu4_1" type: RELU} 139 | # conv4_2 140 | layers { bottom: "conv4_1" top: "conv4_2" name: "conv4_2" type: CONVOLUTION 141 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 142 | convolution_param { num_output: 512 pad: 1 kernel_size: 3 }} 143 | layers { bottom: 'conv4_2' top: 'conv4_2' name: 'bn4_2' type: BN 144 | bn_param { scale_filler { type: 'constant' value: 1 } 145 | shift_filler { type: 'constant' value: 0.001 } } } 146 | layers { bottom: "conv4_2" top: "conv4_2" name: "relu4_2" type: RELU} 147 | # conv4_3 148 | layers { bottom: "conv4_2" top: "conv4_3" name: "conv4_3" type: CONVOLUTION 149 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 150 | convolution_param { num_output: 512 pad: 1 kernel_size: 3 }} 151 | layers { bottom: 'conv4_3' top: 'conv4_3' name: 'bn4_3' type: BN 152 | bn_param { scale_filler { type: 'constant' value: 1 } 153 | shift_filler { type: 'constant' value: 0.001 } } } 154 | layers { bottom: "conv4_3" top: "conv4_3" name: "relu4_3" type: RELU} 155 | 156 | # pool4 157 | layers { 158 | bottom: "conv4_3" top: "pool4" top: "pool4_mask" name: "pool4" type: POOLING 159 | pooling_param { pool: MAX kernel_size: 2 stride: 2 } 160 | } 161 | 162 | # 14 x 14 163 | # conv5_1 164 | layers { bottom: "pool4" top: "conv5_1" name: "conv5_1" type: CONVOLUTION 165 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 166 | convolution_param { num_output: 512 pad: 1 kernel_size: 3 }} 167 | layers { bottom: 'conv5_1' top: 'conv5_1' name: 'bn5_1' type: BN 168 | bn_param { scale_filler { type: 
'constant' value: 1 } 169 | shift_filler { type: 'constant' value: 0.001 } } } 170 | layers { bottom: "conv5_1" top: "conv5_1" name: "relu5_1" type: RELU} 171 | # conv5_2 172 | layers { bottom: "conv5_1" top: "conv5_2" name: "conv5_2" type: CONVOLUTION 173 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 174 | convolution_param { num_output: 512 pad: 1 kernel_size: 3 }} 175 | layers { bottom: 'conv5_2' top: 'conv5_2' name: 'bn5_2' type: BN 176 | bn_param { scale_filler { type: 'constant' value: 1 } 177 | shift_filler { type: 'constant' value: 0.001 } } } 178 | layers { bottom: "conv5_2" top: "conv5_2" name: "relu5_2" type: RELU} 179 | # conv5_3 180 | layers { bottom: "conv5_2" top: "conv5_3" name: "conv5_3" type: CONVOLUTION 181 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 182 | convolution_param { num_output: 512 pad: 1 kernel_size: 3 }} 183 | layers { bottom: 'conv5_3' top: 'conv5_3' name: 'bn5_3' type: BN 184 | bn_param { scale_filler { type: 'constant' value: 1 } 185 | shift_filler { type: 'constant' value: 0.001 } } } 186 | layers { bottom: "conv5_3" top: "conv5_3" name: "relu5_3" type: RELU} 187 | 188 | # pool5 189 | layers { 190 | bottom: "conv5_3" top: "pool5" top: "pool5_mask" name: "pool5" type: POOLING 191 | pooling_param { pool: MAX kernel_size: 2 stride: 2 } 192 | } 193 | 194 | # 7 x 7 195 | # fc6 196 | layers { bottom: 'pool5' top: 'fc6' name: 'fc6' type: CONVOLUTION 197 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 198 | convolution_param { kernel_size: 7 num_output: 4096 } } 199 | layers { bottom: 'fc6' top: 'fc6' name: 'bnfc6' type: BN 200 | bn_param { scale_filler { type: 'constant' value: 1 } 201 | shift_filler { type: 'constant' value: 0.001 } } } 202 | layers { bottom: "fc6" top: "fc6" name: "relu6" type: RELU} 203 | 204 | # 1 x 1 205 | # fc7 206 | layers { bottom: 'fc6' top: 'fc7' name: 'fc7' type: CONVOLUTION 207 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 208 | convolution_param { kernel_size: 
1 num_output: 4096 } } 209 | layers { bottom: 'fc7' top: 'fc7' name: 'bnfc7' type: BN 210 | bn_param { scale_filler { type: 'constant' value: 1 } 211 | shift_filler { type: 'constant' value: 0.001 } } } 212 | layers { bottom: "fc7" top: "fc7" name: "relu7" type: RELU} 213 | 214 | # fc6-deconv 215 | layers { bottom: 'fc7' top: 'fc6-deconv' name: 'fc6-deconv' type: DECONVOLUTION 216 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 217 | convolution_param { num_output: 512 kernel_size: 7 218 | weight_filler { type: "gaussian" std: 0.01 } 219 | bias_filler { type: "constant" value: 0 }} } 220 | layers { bottom: 'fc6-deconv' top: 'fc6-deconv' name: 'fc6-deconv-bn' type: BN 221 | bn_param { scale_filler { type: 'constant' value: 1 } 222 | shift_filler { type: 'constant' value: 0.001 } } } 223 | layers { bottom: 'fc6-deconv' top: 'fc6-deconv' name: 'fc6-deconv-relu' type: RELU } 224 | 225 | # 7 x 7 226 | # unpool5 227 | layers { type: UNPOOLING bottom: "fc6-deconv" bottom: "pool5_mask" top: "unpool5" name: "unpool5" 228 | unpooling_param { unpool: MAX kernel_size: 2 stride: 2 unpool_size: 14 } 229 | } 230 | 231 | # 14 x 14 232 | # deconv5_1 233 | layers { bottom: 'unpool5' top: 'deconv5_1' name: 'deconv5_1' type: DECONVOLUTION 234 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 235 | convolution_param { num_output: 512 pad: 1 kernel_size: 3 236 | weight_filler { type: "gaussian" std: 0.01 } 237 | bias_filler { type: "constant" value: 0 }} } 238 | layers { bottom: 'deconv5_1' top: 'deconv5_1' name: 'debn5_1' type: BN 239 | bn_param { scale_filler { type: 'constant' value: 1 } 240 | shift_filler { type: 'constant' value: 0.001 } } } 241 | layers { bottom: 'deconv5_1' top: 'deconv5_1' name: 'derelu5_1' type: RELU } 242 | # deconv5_2 243 | layers { bottom: 'deconv5_1' top: 'deconv5_2' name: 'deconv5_2' type: DECONVOLUTION 244 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 245 | convolution_param { num_output: 512 pad: 1 kernel_size: 3 246 | 
weight_filler { type: "gaussian" std: 0.01 } 247 | bias_filler { type: "constant" value: 0 }} } 248 | layers { bottom: 'deconv5_2' top: 'deconv5_2' name: 'debn5_2' type: BN 249 | bn_param { scale_filler { type: 'constant' value: 1 } 250 | shift_filler { type: 'constant' value: 0.001 } } } 251 | layers { bottom: 'deconv5_2' top: 'deconv5_2' name: 'derelu5_2' type: RELU } 252 | # deconv5_3 253 | layers { bottom: 'deconv5_2' top: 'deconv5_3' name: 'deconv5_3' type: DECONVOLUTION 254 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 255 | convolution_param { num_output: 512 pad: 1 kernel_size: 3 256 | weight_filler { type: "gaussian" std: 0.01 } 257 | bias_filler { type: "constant" value: 0 }} } 258 | layers { bottom: 'deconv5_3' top: 'deconv5_3' name: 'debn5_3' type: BN 259 | bn_param { scale_filler { type: 'constant' value: 1 } 260 | shift_filler { type: 'constant' value: 0.001 } } } 261 | layers { bottom: 'deconv5_3' top: 'deconv5_3' name: 'derelu5_3' type: RELU } 262 | 263 | # unpool4 264 | layers { type: UNPOOLING bottom: "deconv5_3" bottom: "pool4_mask" top: "unpool4" name: "unpool4" 265 | unpooling_param { unpool: MAX kernel_size: 2 stride: 2 unpool_size: 28 } 266 | } 267 | 268 | # 28 x 28 269 | # deconv4_1 270 | layers { bottom: 'unpool4' top: 'deconv4_1' name: 'deconv4_1' type: DECONVOLUTION 271 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 272 | convolution_param { num_output: 512 pad: 1 kernel_size: 3 273 | weight_filler { type: "gaussian" std: 0.01 } 274 | bias_filler { type: "constant" value: 0 }} } 275 | layers { bottom: 'deconv4_1' top: 'deconv4_1' name: 'debn4_1' type: BN 276 | bn_param { scale_filler { type: 'constant' value: 1 } 277 | shift_filler { type: 'constant' value: 0.001 } } } 278 | layers { bottom: 'deconv4_1' top: 'deconv4_1' name: 'derelu4_1' type: RELU } 279 | # deconv 4_2 280 | layers { bottom: 'deconv4_1' top: 'deconv4_2' name: 'deconv4_2' type: DECONVOLUTION 281 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 
0 282 | convolution_param { num_output: 512 pad: 1 kernel_size: 3 283 | weight_filler { type: "gaussian" std: 0.01 } 284 | bias_filler { type: "constant" value: 0 }} } 285 | layers { bottom: 'deconv4_2' top: 'deconv4_2' name: 'debn4_2' type: BN 286 | bn_param { scale_filler { type: 'constant' value: 1 } 287 | shift_filler { type: 'constant' value: 0.001 } } } 288 | layers { bottom: 'deconv4_2' top: 'deconv4_2' name: 'derelu4_2' type: RELU } 289 | # deconv 4_3 290 | layers { bottom: 'deconv4_2' top: 'deconv4_3' name: 'deconv4_3' type: DECONVOLUTION 291 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 292 | convolution_param { num_output: 256 pad: 1 kernel_size: 3 293 | weight_filler { type: "gaussian" std: 0.01 } 294 | bias_filler { type: "constant" value: 0 }} } 295 | layers { bottom: 'deconv4_3' top: 'deconv4_3' name: 'debn4_3' type: BN 296 | bn_param { scale_filler { type: 'constant' value: 1 } 297 | shift_filler { type: 'constant' value: 0.001 } } } 298 | layers { bottom: 'deconv4_3' top: 'deconv4_3' name: 'derelu4_3' type: RELU } 299 | 300 | # unpool3 301 | layers { type: UNPOOLING bottom: "deconv4_3" bottom: "pool3_mask" top: "unpool3" name: "unpool3" 302 | unpooling_param { unpool: MAX kernel_size: 2 stride: 2 unpool_size: 56 } 303 | } 304 | 305 | # 56 x 56 306 | # deconv3_1 307 | layers { bottom: 'unpool3' top: 'deconv3_1' name: 'deconv3_1' type: DECONVOLUTION 308 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 309 | convolution_param { num_output:256 pad:1 kernel_size: 3 310 | weight_filler { type: "gaussian" std: 0.01 } 311 | bias_filler { type: "constant" value: 0 }} } 312 | layers { bottom: 'deconv3_1' top: 'deconv3_1' name: 'debn3_1' type: BN 313 | bn_param { scale_filler { type: 'constant' value: 1 } 314 | shift_filler { type: 'constant' value: 0.001 } } } 315 | layers { bottom: 'deconv3_1' top: 'deconv3_1' name: 'derelu3_1' type: RELU } 316 | # deconv3_2 317 | layers { bottom: 'deconv3_1' top: 'deconv3_2' name: 'deconv3_2' type: 
DECONVOLUTION 318 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 319 | convolution_param { num_output:256 pad:1 kernel_size: 3 320 | weight_filler { type: "gaussian" std: 0.01 } 321 | bias_filler { type: "constant" value: 0 }} } 322 | layers { bottom: 'deconv3_2' top: 'deconv3_2' name: 'debn3_2' type: BN 323 | bn_param { scale_filler { type: 'constant' value: 1 } 324 | shift_filler { type: 'constant' value: 0.001 } } } 325 | layers { bottom: 'deconv3_2' top: 'deconv3_2' name: 'derelu3_2' type: RELU } 326 | # deconv3_3 327 | layers { bottom: 'deconv3_2' top: 'deconv3_3' name: 'deconv3_3' type: DECONVOLUTION 328 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 329 | convolution_param { num_output:128 pad:1 kernel_size: 3 330 | weight_filler { type: "gaussian" std: 0.01 } 331 | bias_filler { type: "constant" value: 0 }} } 332 | layers { bottom: 'deconv3_3' top: 'deconv3_3' name: 'debn3_3' type: BN 333 | bn_param { scale_filler { type: 'constant' value: 1 } 334 | shift_filler { type: 'constant' value: 0.001 } } } 335 | layers { bottom: 'deconv3_3' top: 'deconv3_3' name: 'derelu3_3' type: RELU } 336 | 337 | # unpool2 338 | layers { type: UNPOOLING bottom: "deconv3_3" bottom: "pool2_mask" top: "unpool2" name: "unpool2" 339 | unpooling_param { unpool: MAX kernel_size: 2 stride: 2 unpool_size: 112 } 340 | } 341 | 342 | # 112 x 112 343 | # deconv2_1 344 | layers { bottom: 'unpool2' top: 'deconv2_1' name: 'deconv2_1' type: DECONVOLUTION 345 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 346 | convolution_param { num_output:128 pad:1 kernel_size: 3 347 | weight_filler { type: "gaussian" std: 0.01 } 348 | bias_filler { type: "constant" value: 0 }} } 349 | layers { bottom: 'deconv2_1' top: 'deconv2_1' name: 'debn2_1' type: BN 350 | bn_param { scale_filler { type: 'constant' value: 1 } 351 | shift_filler { type: 'constant' value: 0.001 } } } 352 | layers { bottom: 'deconv2_1' top: 'deconv2_1' name: 'derelu2_1' type: RELU } 353 | # deconv2_2 354 | 
layers { bottom: 'deconv2_1' top: 'deconv2_2' name: 'deconv2_2' type: DECONVOLUTION 355 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 356 | convolution_param { num_output:64 pad:1 kernel_size: 3 357 | weight_filler { type: "gaussian" std: 0.01 } 358 | bias_filler { type: "constant" value: 0 }} } 359 | layers { bottom: 'deconv2_2' top: 'deconv2_2' name: 'debn2_2' type: BN 360 | bn_param { scale_filler { type: 'constant' value: 1 } 361 | shift_filler { type: 'constant' value: 0.001 } } } 362 | layers { bottom: 'deconv2_2' top: 'deconv2_2' name: 'derelu2_2' type: RELU } 363 | 364 | # unpool1 365 | layers { type: UNPOOLING bottom: "deconv2_2" bottom: "pool1_mask" top: "unpool1" name: "unpool1" 366 | unpooling_param { unpool: MAX kernel_size: 2 stride: 2 unpool_size: 224 } 367 | } 368 | # deconv1_1 369 | layers { bottom: 'unpool1' top: 'deconv1_1' name: 'deconv1_1' type: DECONVOLUTION 370 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 371 | convolution_param { num_output:64 pad:1 kernel_size: 3 372 | weight_filler { type: "gaussian" std: 0.01 } 373 | bias_filler { type: "constant" value: 0 }} } 374 | layers { bottom: 'deconv1_1' top: 'deconv1_1' name: 'debn1_1' type: BN 375 | bn_param { scale_filler { type: 'constant' value: 1 } 376 | shift_filler { type: 'constant' value: 0.001 } } } 377 | layers { bottom: 'deconv1_1' top: 'deconv1_1' name: 'derelu1_1' type: RELU } 378 | 379 | # deconv1_2 380 | layers { bottom: 'deconv1_1' top: 'deconv1_2' name: 'deconv1_2' type: DECONVOLUTION 381 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 382 | convolution_param { num_output:64 pad:1 kernel_size: 3 383 | weight_filler { type: "gaussian" std: 0.01 } 384 | bias_filler { type: "constant" value: 0 }} } 385 | layers { bottom: 'deconv1_2' top: 'deconv1_2' name: 'debn1_2' type: BN 386 | bn_param { scale_filler { type: 'constant' value: 1 } 387 | shift_filler { type: 'constant' value: 0.001 } } } 388 | layers { bottom: 'deconv1_2' top: 'deconv1_2' name: 
'derelu1_2' type: RELU } 389 | 390 | # seg-score 391 | layers { name: 'seg-score-voc' type: CONVOLUTION bottom: 'deconv1_2' top: 'seg-score' 392 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 393 | convolution_param { num_output: 21 kernel_size: 1 394 | weight_filler { type: "gaussian" std: 0.01 } 395 | bias_filler { type: "constant" value: 0 }} } 396 | 397 | # score and accuracy 398 | layers { 399 | name: "seg-accuracy" 400 | type: ELTWISE_ACCURACY 401 | bottom: "seg-score" 402 | bottom: "seg-label" 403 | top: "seg-accuracy" 404 | eltwise_accuracy_param { 405 | ignore_label: 255 406 | } 407 | include: { phase: TEST } 408 | } 409 | 410 | layers { 411 | name: "seg-loss" 412 | type: SOFTMAX_LOSS 413 | bottom: "seg-score" 414 | bottom: "seg-label" 415 | top: "seg-loss" 416 | loss_param { 417 | ignore_label: 255 418 | } 419 | } 420 | 421 | -------------------------------------------------------------------------------- /training/002_stage_2/start_train.sh: -------------------------------------------------------------------------------- 1 | LOGDIR=./training_log 2 | CAFFE=./caffe/build/tools/caffe 3 | SOLVER=./solver.prototxt 4 | WEIGHTS=./model/stage_1_train_result/stage_1_train_result.caffemodel 5 | 6 | GLOG_log_dir=$LOGDIR $CAFFE train -solver $SOLVER -weights $WEIGHTS -gpu 0 7 | 8 | -------------------------------------------------------------------------------- /training/002_start_train.sh: -------------------------------------------------------------------------------- 1 | ########################## 2 | # start stage 2 training # 3 | ########################## 4 | cd 002_stage_2 5 | 6 | ## create symlinks 7 | 8 | # image data 9 | ln -s ../../data/VOC2012 10 | # training / validation imageset 11 | ln -s ../../data/imagesets/stage_2_train_imgset 12 | # caffe 13 | ln -s ../../caffe 14 | # pre-trained caffe model (the model trained in stage 1) 15 | ln -s ../../model 16 | # path for saving the training result 17 | ln -s ../../model/stage_2_train_result 18 | 
19 | ## create directories 20 | mkdir snapshot 21 | mkdir training_log 22 | 23 | ## start training 24 | ./start_train.sh 25 | 26 | ## copy and rename trained model 27 | cp ./snapshot/stage_2_train_iter_40000.caffemodel ./stage_2_train_result/stage_2_train_result.caffemodel 28 | 29 | -------------------------------------------------------------------------------- /training/003_make_bn_layer_testable/BN_make_INFERENCE_script.py: -------------------------------------------------------------------------------- 1 | 2 | ### User Configuration 3 | 4 | iteration = 500 5 | gpu_id_num = 0 6 | 7 | ## path configuration 8 | caffe_root = './caffe' 9 | script_path = '.' 10 | caffe_model = script_path + '/DeconvNet_make_bn_layer_testable.prototxt' 11 | caffe_weight = script_path + '/model/stage_2_train_result/stage_2_train_result.caffemodel' 12 | caffe_inference_weight = script_path + '/DeconvNet_trainval_inference.caffemodel' 13 | 14 | ## modify this definition according to model definition 15 | bn_blobs= ['conv1_1','conv1_2', 16 | 'conv2_1','conv2_2', 17 | 'conv3_1','conv3_2','conv3_3', 18 | 'conv4_1','conv4_2','conv4_3', 19 | 'conv5_1','conv5_2','conv5_3', 20 | 'fc6','fc7', 21 | 'fc6-deconv', 22 | 'deconv5_1','deconv5_2','deconv5_3', 23 | 'deconv4_1','deconv4_2','deconv4_3', 24 | 'deconv3_1','deconv3_2','deconv3_3', 25 | 'deconv2_1','deconv2_2', 26 | 'deconv1_1','deconv1_2'] 27 | bn_layers=['bn1_1','bn1_2', 28 | 'bn2_1','bn2_2', 29 | 'bn3_1','bn3_2','bn3_3', 30 | 'bn4_1','bn4_2','bn4_3', 31 | 'bn5_1','bn5_2','bn5_3', 32 | 'bnfc6','bnfc7', 33 | 'fc6-deconv-bn', 34 | 'debn5_1','debn5_2','debn5_3', 35 | 'debn4_1','debn4_2','debn4_3', 36 | 'debn3_1','debn3_2','debn3_3', 37 | 'debn2_1','debn2_2', 38 | 'debn1_1','debn1_2'] 39 | bn_means= ['bn1_1-mean','bn1_2-mean', 40 | 'bn2_1-mean','bn2_2-mean', 41 | 'bn3_1-mean','bn3_2-mean','bn3_3-mean', 42 | 'bn4_1-mean','bn4_2-mean','bn4_3-mean', 43 | 'bn5_1-mean','bn5_2-mean','bn5_3-mean', 44 | 'bnfc6-mean','bnfc7-mean', 45 | 
'fc6-deconv-bn-mean', 46 | 'debn5_1-mean','debn5_2-mean','debn5_3-mean', 47 | 'debn4_1-mean','debn4_2-mean','debn4_3-mean', 48 | 'debn3_1-mean','debn3_2-mean','debn3_3-mean', 49 | 'debn2_1-mean','debn2_2-mean', 50 | 'debn1_1-mean','debn1_2-mean'] 51 | bn_vars = ['bn1_1-var','bn1_2-var', 52 | 'bn2_1-var','bn2_2-var', 53 | 'bn3_1-var','bn3_2-var','bn3_3-var', 54 | 'bn4_1-var','bn4_2-var','bn4_3-var', 55 | 'bn5_1-var','bn5_2-var','bn5_3-var', 56 | 'bnfc6-var','bnfc7-var', 57 | 'fc6-deconv-bn-var', 58 | 'debn5_1-var','debn5_2-var','debn5_3-var', 59 | 'debn4_1-var','debn4_2-var','debn4_3-var', 60 | 'debn3_1-var','debn3_2-var','debn3_3-var', 61 | 'debn2_1-var','debn2_2-var', 62 | 'debn1_1-var','debn1_2-var'] 63 | 64 | 65 | ### start generating caffemodel 66 | 67 | print 'start generating BN-testable caffemodel' 68 | print 'caffe_root: %s' % caffe_root 69 | print 'script_path: %s' % script_path 70 | print 'caffe_model: %s' % caffe_model 71 | print 'caffe_weight: %s' % caffe_weight 72 | print 'caffe_inference_weight: %s' % caffe_inference_weight 73 | 74 | import numpy as np 75 | 76 | import sys 77 | sys.path.append(caffe_root+'/python') 78 | import caffe 79 | from caffe.proto import caffe_pb2 80 | 81 | net = caffe.Net(caffe_model, caffe_weight) 82 | #net.set_mode_cpu() 83 | net.set_mode_gpu() 84 | net.set_device(gpu_id_num) 85 | net.set_phase_test() 86 | 87 | def forward_once(net): 88 | start_ind = 0 89 | end_ind = len(net.layers) - 1 90 | net._forward(start_ind, end_ind) 91 | return {out: net.blobs[out].data for out in net.outputs} 92 | 93 | print net.params.keys() 94 | 95 | res = forward_once(net) 96 | 97 | bn_avg_mean = {bn_mean: np.squeeze(res[bn_mean]).copy() for bn_mean in bn_means} 98 | bn_avg_var = {bn_var: np.squeeze(res[bn_var]).copy() for bn_var in bn_vars} 99 | 100 | cnt = 1 101 | 102 | for i in range(0, iteration): 103 | res = forward_once(net) 104 | for bn_mean in bn_means: 105 | bn_avg_mean[bn_mean] = bn_avg_mean[bn_mean] + np.squeeze(res[bn_mean]) 106 | for 
bn_var in bn_vars: 107 | bn_avg_var[bn_var] = bn_avg_var[bn_var] + np.squeeze(res[bn_var]) 108 | 109 | cnt += 1 110 | print 'progress: %d/%d' % (i + 1, iteration) 111 | 112 | ## compute average 113 | for bn_mean in bn_means: 114 | bn_avg_mean[bn_mean] /= cnt 115 | for bn_var in bn_vars: 116 | bn_avg_var[bn_var] /= cnt 117 | 118 | for i in range(0, len(bn_vars)): 119 | m = np.prod(net.blobs[bn_blobs[i]].data.shape) / np.prod(bn_avg_var[bn_vars[i]].shape) 120 | bn_avg_var[bn_vars[i]] *= (float(m)/(m-1)) 121 | 122 | scale_data = {bn_layer: np.squeeze(net.params[bn_layer][0].data) for bn_layer in bn_layers} 123 | shift_data = {bn_layer: np.squeeze(net.params[bn_layer][1].data) for bn_layer in bn_layers} 124 | 125 | var_eps = 1e-9 126 | 127 | new_scale_data = {} 128 | new_shift_data = {} 129 | for i in range(0, len(bn_layers)): 130 | gamma = scale_data[bn_layers[i]] 131 | beta = shift_data[bn_layers[i]] 132 | Ex = bn_avg_mean[bn_means[i]] 133 | Varx = bn_avg_var[bn_vars[i]] 134 | new_gamma = gamma / np.sqrt(Varx + var_eps) 135 | new_beta = beta - (gamma * Ex / np.sqrt(Varx + var_eps)) 136 | 137 | new_scale_data[bn_layers[i]] = new_gamma 138 | new_shift_data[bn_layers[i]] = new_beta 139 | 140 | print new_scale_data.keys() 141 | print new_shift_data.keys() 142 | 143 | ## assign computed new scale and shift values to net.params 144 | for i in range(0, len(bn_layers)): 145 | net.params[bn_layers[i]][0].data[...] = new_scale_data[bn_layers[i]].reshape(net.params[bn_layers[i]][0].data.shape) 146 | net.params[bn_layers[i]][1].data[...] 
= new_shift_data[bn_layers[i]].reshape(net.params[bn_layers[i]][1].data.shape) 147 | 148 | print 'start saving model' 149 | 150 | net.save(caffe_inference_weight) 151 | 152 | print 'done' 153 | 154 | 155 | 156 | -------------------------------------------------------------------------------- /training/003_make_bn_layer_testable/DeconvNet_inference_deploy.prototxt: -------------------------------------------------------------------------------- 1 | name: "DeconvNet_inference_deploy" 2 | 3 | input: "data" 4 | input_dim: 1 5 | input_dim: 3 6 | input_dim: 224 7 | input_dim: 224 8 | 9 | # 224 x 224 10 | # conv1_1 11 | layers { bottom: "data" top: "conv1_1" name: "conv1_1" type: CONVOLUTION 12 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 13 | convolution_param { num_output: 64 pad: 1 kernel_size: 3 }} 14 | layers { bottom: 'conv1_1' top: 'conv1_1' name: 'bn1_1' type: BN 15 | bn_param { scale_filler { type: 'constant' value: 1 } 16 | shift_filler { type: 'constant' value: 0.001 } 17 | bn_mode: INFERENCE } } 18 | layers { bottom: "conv1_1" top: "conv1_1" name: "relu1_1" type: RELU} 19 | # conv1_2 20 | layers { bottom: "conv1_1" top: "conv1_2" name: "conv1_2" type: CONVOLUTION 21 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 22 | convolution_param { num_output: 64 pad: 1 kernel_size: 3 }} 23 | layers { bottom: 'conv1_2' top: 'conv1_2' name: 'bn1_2' type: BN 24 | bn_param { scale_filler { type: 'constant' value: 1 } 25 | shift_filler { type: 'constant' value: 0.001 } 26 | bn_mode: INFERENCE } } 27 | layers { bottom: "conv1_2" top: "conv1_2" name: "relu1_2" type: RELU} 28 | 29 | # pool1 30 | layers { 31 | bottom: "conv1_2" top: "pool1" top:"pool1_mask" name: "pool1" type: POOLING 32 | pooling_param { pool: MAX kernel_size: 2 stride: 2 } 33 | } 34 | 35 | # 112 x 112 36 | # conv2_1 37 | layers { bottom: "pool1" top: "conv2_1" name: "conv2_1" type: CONVOLUTION 38 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 39 | convolution_param { 
num_output: 128 pad: 1 kernel_size: 3 }} 40 | layers { bottom: 'conv2_1' top: 'conv2_1' name: 'bn2_1' type: BN 41 | bn_param { scale_filler { type: 'constant' value: 1 } 42 | shift_filler { type: 'constant' value: 0.001 } 43 | bn_mode: INFERENCE } } 44 | layers { bottom: "conv2_1" top: "conv2_1" name: "relu2_1" type: RELU} 45 | # conv2_2 46 | layers { bottom: "conv2_1" top: "conv2_2" name: "conv2_2" type: CONVOLUTION 47 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 48 | convolution_param { num_output: 128 pad: 1 kernel_size: 3 }} 49 | layers { bottom: 'conv2_2' top: 'conv2_2' name: 'bn2_2' type: BN 50 | bn_param { scale_filler { type: 'constant' value: 1 } 51 | shift_filler { type: 'constant' value: 0.001 } 52 | bn_mode: INFERENCE } } 53 | layers { bottom: "conv2_2" top: "conv2_2" name: "relu2_2" type: RELU} 54 | 55 | # pool2 56 | layers { 57 | bottom: "conv2_2" top: "pool2" top: "pool2_mask" name: "pool2" type: POOLING 58 | pooling_param { pool: MAX kernel_size: 2 stride: 2 } 59 | } 60 | 61 | # 56 x 56 62 | # conv3_1 63 | layers { bottom: "pool2" top: "conv3_1" name: "conv3_1" type: CONVOLUTION 64 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 65 | convolution_param { num_output: 256 pad: 1 kernel_size: 3 }} 66 | layers { bottom: 'conv3_1' top: 'conv3_1' name: 'bn3_1' type: BN 67 | bn_param { scale_filler { type: 'constant' value: 1 } 68 | shift_filler { type: 'constant' value: 0.001 } 69 | bn_mode: INFERENCE } } 70 | layers { bottom: "conv3_1" top: "conv3_1" name: "relu3_1" type: RELU} 71 | # conv3_2 72 | layers { bottom: "conv3_1" top: "conv3_2" name: "conv3_2" type: CONVOLUTION 73 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 74 | convolution_param { num_output: 256 pad: 1 kernel_size: 3 }} 75 | layers { bottom: 'conv3_2' top: 'conv3_2' name: 'bn3_2' type: BN 76 | bn_param { scale_filler { type: 'constant' value: 1 } 77 | shift_filler { type: 'constant' value: 0.001 } 78 | bn_mode: INFERENCE } } 79 | layers { bottom: "conv3_2" 
top: "conv3_2" name: "relu3_2" type: RELU} 80 | # conv3_3 81 | layers { bottom: "conv3_2" top: "conv3_3" name: "conv3_3" type: CONVOLUTION 82 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 83 | convolution_param { num_output: 256 pad: 1 kernel_size: 3 }} 84 | layers { bottom: 'conv3_3' top: 'conv3_3' name: 'bn3_3' type: BN 85 | bn_param { scale_filler { type: 'constant' value: 1 } 86 | shift_filler { type: 'constant' value: 0.001 } 87 | bn_mode: INFERENCE } } 88 | layers { bottom: "conv3_3" top: "conv3_3" name: "relu3_3" type: RELU} 89 | 90 | # pool3 91 | layers { 92 | bottom: "conv3_3" top: "pool3" top: "pool3_mask" name: "pool3" type: POOLING 93 | pooling_param { pool: MAX kernel_size: 2 stride: 2 } 94 | } 95 | 96 | # 28 x 28 97 | # conv4_1 98 | layers { bottom: "pool3" top: "conv4_1" name: "conv4_1" type: CONVOLUTION 99 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 100 | convolution_param { num_output: 512 pad: 1 kernel_size: 3 }} 101 | layers { bottom: 'conv4_1' top: 'conv4_1' name: 'bn4_1' type: BN 102 | bn_param { scale_filler { type: 'constant' value: 1 } 103 | shift_filler { type: 'constant' value: 0.001 } 104 | bn_mode: INFERENCE } } 105 | layers { bottom: "conv4_1" top: "conv4_1" name: "relu4_1" type: RELU} 106 | # conv4_2 107 | layers { bottom: "conv4_1" top: "conv4_2" name: "conv4_2" type: CONVOLUTION 108 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 109 | convolution_param { num_output: 512 pad: 1 kernel_size: 3 }} 110 | layers { bottom: 'conv4_2' top: 'conv4_2' name: 'bn4_2' type: BN 111 | bn_param { scale_filler { type: 'constant' value: 1 } 112 | shift_filler { type: 'constant' value: 0.001 } 113 | bn_mode: INFERENCE } } 114 | layers { bottom: "conv4_2" top: "conv4_2" name: "relu4_2" type: RELU} 115 | # conv4_3 116 | layers { bottom: "conv4_2" top: "conv4_3" name: "conv4_3" type: CONVOLUTION 117 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 118 | convolution_param { num_output: 512 pad: 1 kernel_size: 3 
}} 119 | layers { bottom: 'conv4_3' top: 'conv4_3' name: 'bn4_3' type: BN 120 | bn_param { scale_filler { type: 'constant' value: 1 } 121 | shift_filler { type: 'constant' value: 0.001 } 122 | bn_mode: INFERENCE } } 123 | layers { bottom: "conv4_3" top: "conv4_3" name: "relu4_3" type: RELU} 124 | 125 | # pool4 126 | layers { 127 | bottom: "conv4_3" top: "pool4" top: "pool4_mask" name: "pool4" type: POOLING 128 | pooling_param { pool: MAX kernel_size: 2 stride: 2 } 129 | } 130 | 131 | # 14 x 14 132 | # conv5_1 133 | layers { bottom: "pool4" top: "conv5_1" name: "conv5_1" type: CONVOLUTION 134 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 135 | convolution_param { num_output: 512 pad: 1 kernel_size: 3 }} 136 | layers { bottom: 'conv5_1' top: 'conv5_1' name: 'bn5_1' type: BN 137 | bn_param { scale_filler { type: 'constant' value: 1 } 138 | shift_filler { type: 'constant' value: 0.001 } 139 | bn_mode: INFERENCE } } 140 | layers { bottom: "conv5_1" top: "conv5_1" name: "relu5_1" type: RELU} 141 | # conv5_2 142 | layers { bottom: "conv5_1" top: "conv5_2" name: "conv5_2" type: CONVOLUTION 143 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 144 | convolution_param { num_output: 512 pad: 1 kernel_size: 3 }} 145 | layers { bottom: 'conv5_2' top: 'conv5_2' name: 'bn5_2' type: BN 146 | bn_param { scale_filler { type: 'constant' value: 1 } 147 | shift_filler { type: 'constant' value: 0.001 } 148 | bn_mode: INFERENCE } } 149 | layers { bottom: "conv5_2" top: "conv5_2" name: "relu5_2" type: RELU} 150 | # conv5_3 151 | layers { bottom: "conv5_2" top: "conv5_3" name: "conv5_3" type: CONVOLUTION 152 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 153 | convolution_param { num_output: 512 pad: 1 kernel_size: 3 }} 154 | layers { bottom: 'conv5_3' top: 'conv5_3' name: 'bn5_3' type: BN 155 | bn_param { scale_filler { type: 'constant' value: 1 } 156 | shift_filler { type: 'constant' value: 0.001 } 157 | bn_mode: INFERENCE } } 158 | layers { bottom: 
"conv5_3" top: "conv5_3" name: "relu5_3" type: RELU} 159 | 160 | # pool5 161 | layers { 162 | bottom: "conv5_3" top: "pool5" top: "pool5_mask" name: "pool5" type: POOLING 163 | pooling_param { pool: MAX kernel_size: 2 stride: 2 } 164 | } 165 | 166 | # 7 x 7 167 | # fc6 168 | layers { bottom: 'pool5' top: 'fc6' name: 'fc6' type: CONVOLUTION 169 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 170 | convolution_param { kernel_size: 7 num_output: 4096 } } 171 | layers { bottom: 'fc6' top: 'fc6' name: 'bnfc6' type: BN 172 | bn_param { scale_filler { type: 'constant' value: 1 } 173 | shift_filler { type: 'constant' value: 0.001 } 174 | bn_mode: INFERENCE } } 175 | layers { bottom: "fc6" top: "fc6" name: "relu6" type: RELU} 176 | 177 | # 1 x 1 178 | # fc7 179 | layers { bottom: 'fc6' top: 'fc7' name: 'fc7' type: CONVOLUTION 180 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 181 | convolution_param { kernel_size: 1 num_output: 4096 } } 182 | layers { bottom: 'fc7' top: 'fc7' name: 'bnfc7' type: BN 183 | bn_param { scale_filler { type: 'constant' value: 1 } 184 | shift_filler { type: 'constant' value: 0.001 } 185 | bn_mode: INFERENCE } } 186 | layers { bottom: "fc7" top: "fc7" name: "relu7" type: RELU} 187 | 188 | # fc6-deconv 189 | layers { bottom: 'fc7' top: 'fc6-deconv' name: 'fc6-deconv' type: DECONVOLUTION 190 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 191 | convolution_param { num_output: 512 kernel_size: 7 192 | weight_filler { type: "gaussian" std: 0.01 } 193 | bias_filler { type: "constant" value: 0 }} } 194 | layers { bottom: 'fc6-deconv' top: 'fc6-deconv' name: 'fc6-deconv-bn' type: BN 195 | bn_param { scale_filler { type: 'constant' value: 1 } 196 | shift_filler { type: 'constant' value: 0.001 } 197 | bn_mode: INFERENCE } } 198 | layers { bottom: 'fc6-deconv' top: 'fc6-deconv' name: 'fc6-deconv-relu' type: RELU } 199 | 200 | # 7 x 7 201 | # unpool5 202 | layers { type: UNPOOLING bottom: "fc6-deconv" bottom: "pool5_mask" top: 
"unpool5" name: "unpool5" 203 | unpooling_param { unpool: MAX kernel_size: 2 stride: 2 unpool_size: 14 } 204 | } 205 | 206 | # 14 x 14 207 | # deconv5_1 208 | layers { bottom: 'unpool5' top: 'deconv5_1' name: 'deconv5_1' type: DECONVOLUTION 209 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 210 | convolution_param { num_output: 512 pad: 1 kernel_size: 3 211 | weight_filler { type: "gaussian" std: 0.01 } 212 | bias_filler { type: "constant" value: 0 }} } 213 | layers { bottom: 'deconv5_1' top: 'deconv5_1' name: 'debn5_1' type: BN 214 | bn_param { scale_filler { type: 'constant' value: 1 } 215 | shift_filler { type: 'constant' value: 0.001 } 216 | bn_mode: INFERENCE } } 217 | layers { bottom: 'deconv5_1' top: 'deconv5_1' name: 'derelu5_1' type: RELU } 218 | # deconv5_2 219 | layers { bottom: 'deconv5_1' top: 'deconv5_2' name: 'deconv5_2' type: DECONVOLUTION 220 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 221 | convolution_param { num_output: 512 pad: 1 kernel_size: 3 222 | weight_filler { type: "gaussian" std: 0.01 } 223 | bias_filler { type: "constant" value: 0 }} } 224 | layers { bottom: 'deconv5_2' top: 'deconv5_2' name: 'debn5_2' type: BN 225 | bn_param { scale_filler { type: 'constant' value: 1 } 226 | shift_filler { type: 'constant' value: 0.001 } 227 | bn_mode: INFERENCE } } 228 | layers { bottom: 'deconv5_2' top: 'deconv5_2' name: 'derelu5_2' type: RELU } 229 | # deconv5_3 230 | layers { bottom: 'deconv5_2' top: 'deconv5_3' name: 'deconv5_3' type: DECONVOLUTION 231 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 232 | convolution_param { num_output: 512 pad: 1 kernel_size: 3 233 | weight_filler { type: "gaussian" std: 0.01 } 234 | bias_filler { type: "constant" value: 0 }} } 235 | layers { bottom: 'deconv5_3' top: 'deconv5_3' name: 'debn5_3' type: BN 236 | bn_param { scale_filler { type: 'constant' value: 1 } 237 | shift_filler { type: 'constant' value: 0.001 } 238 | bn_mode: INFERENCE } } 239 | layers { bottom: 'deconv5_3' 
top: 'deconv5_3' name: 'derelu5_3' type: RELU } 240 | 241 | # unpool4 242 | layers { type: UNPOOLING bottom: "deconv5_3" bottom: "pool4_mask" top: "unpool4" name: "unpool4" 243 | unpooling_param { unpool: MAX kernel_size: 2 stride: 2 unpool_size: 28 } 244 | } 245 | 246 | # 28 x 28 247 | # deconv4_1 248 | layers { bottom: 'unpool4' top: 'deconv4_1' name: 'deconv4_1' type: DECONVOLUTION 249 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 250 | convolution_param { num_output: 512 pad: 1 kernel_size: 3 251 | weight_filler { type: "gaussian" std: 0.01 } 252 | bias_filler { type: "constant" value: 0 }} } 253 | layers { bottom: 'deconv4_1' top: 'deconv4_1' name: 'debn4_1' type: BN 254 | bn_param { scale_filler { type: 'constant' value: 1 } 255 | shift_filler { type: 'constant' value: 0.001 } 256 | bn_mode: INFERENCE } } 257 | layers { bottom: 'deconv4_1' top: 'deconv4_1' name: 'derelu4_1' type: RELU } 258 | # deconv 4_2 259 | layers { bottom: 'deconv4_1' top: 'deconv4_2' name: 'deconv4_2' type: DECONVOLUTION 260 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 261 | convolution_param { num_output: 512 pad: 1 kernel_size: 3 262 | weight_filler { type: "gaussian" std: 0.01 } 263 | bias_filler { type: "constant" value: 0 }} } 264 | layers { bottom: 'deconv4_2' top: 'deconv4_2' name: 'debn4_2' type: BN 265 | bn_param { scale_filler { type: 'constant' value: 1 } 266 | shift_filler { type: 'constant' value: 0.001 } 267 | bn_mode: INFERENCE } } 268 | layers { bottom: 'deconv4_2' top: 'deconv4_2' name: 'derelu4_2' type: RELU } 269 | # deconv 4_3 270 | layers { bottom: 'deconv4_2' top: 'deconv4_3' name: 'deconv4_3' type: DECONVOLUTION 271 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 272 | convolution_param { num_output: 256 pad: 1 kernel_size: 3 273 | weight_filler { type: "gaussian" std: 0.01 } 274 | bias_filler { type: "constant" value: 0 }} } 275 | layers { bottom: 'deconv4_3' top: 'deconv4_3' name: 'debn4_3' type: BN 276 | bn_param { 
scale_filler { type: 'constant' value: 1 } 277 | shift_filler { type: 'constant' value: 0.001 } 278 | bn_mode: INFERENCE } } 279 | layers { bottom: 'deconv4_3' top: 'deconv4_3' name: 'derelu4_3' type: RELU } 280 | 281 | # unpool3 282 | layers { type: UNPOOLING bottom: "deconv4_3" bottom: "pool3_mask" top: "unpool3" name: "unpool3" 283 | unpooling_param { unpool: MAX kernel_size: 2 stride: 2 unpool_size: 56 } 284 | } 285 | 286 | # 56 x 56 287 | # deconv3_1 288 | layers { bottom: 'unpool3' top: 'deconv3_1' name: 'deconv3_1' type: DECONVOLUTION 289 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 290 | convolution_param { num_output:256 pad:1 kernel_size: 3 291 | weight_filler { type: "gaussian" std: 0.01 } 292 | bias_filler { type: "constant" value: 0 }} } 293 | layers { bottom: 'deconv3_1' top: 'deconv3_1' name: 'debn3_1' type: BN 294 | bn_param { scale_filler { type: 'constant' value: 1 } 295 | shift_filler { type: 'constant' value: 0.001 } 296 | bn_mode: INFERENCE } } 297 | layers { bottom: 'deconv3_1' top: 'deconv3_1' name: 'derelu3_1' type: RELU } 298 | # deconv3_2 299 | layers { bottom: 'deconv3_1' top: 'deconv3_2' name: 'deconv3_2' type: DECONVOLUTION 300 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 301 | convolution_param { num_output:256 pad:1 kernel_size: 3 302 | weight_filler { type: "gaussian" std: 0.01 } 303 | bias_filler { type: "constant" value: 0 }} } 304 | layers { bottom: 'deconv3_2' top: 'deconv3_2' name: 'debn3_2' type: BN 305 | bn_param { scale_filler { type: 'constant' value: 1 } 306 | shift_filler { type: 'constant' value: 0.001 } 307 | bn_mode: INFERENCE } } 308 | layers { bottom: 'deconv3_2' top: 'deconv3_2' name: 'derelu3_2' type: RELU } 309 | # deconv3_3 310 | layers { bottom: 'deconv3_2' top: 'deconv3_3' name: 'deconv3_3' type: DECONVOLUTION 311 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 312 | convolution_param { num_output:128 pad:1 kernel_size: 3 313 | weight_filler { type: "gaussian" std: 0.01 } 314 
| bias_filler { type: "constant" value: 0 }} } 315 | layers { bottom: 'deconv3_3' top: 'deconv3_3' name: 'debn3_3' type: BN 316 | bn_param { scale_filler { type: 'constant' value: 1 } 317 | shift_filler { type: 'constant' value: 0.001 } 318 | bn_mode: INFERENCE } } 319 | layers { bottom: 'deconv3_3' top: 'deconv3_3' name: 'derelu3_3' type: RELU } 320 | 321 | # unpool2 322 | layers { type: UNPOOLING bottom: "deconv3_3" bottom: "pool2_mask" top: "unpool2" name: "unpool2" 323 | unpooling_param { unpool: MAX kernel_size: 2 stride: 2 unpool_size: 112 } 324 | } 325 | 326 | # 112 x 112 327 | # deconv2_1 328 | layers { bottom: 'unpool2' top: 'deconv2_1' name: 'deconv2_1' type: DECONVOLUTION 329 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 330 | convolution_param { num_output:128 pad:1 kernel_size: 3 331 | weight_filler { type: "gaussian" std: 0.01 } 332 | bias_filler { type: "constant" value: 0 }} } 333 | layers { bottom: 'deconv2_1' top: 'deconv2_1' name: 'debn2_1' type: BN 334 | bn_param { scale_filler { type: 'constant' value: 1 } 335 | shift_filler { type: 'constant' value: 0.001 } 336 | bn_mode: INFERENCE } } 337 | layers { bottom: 'deconv2_1' top: 'deconv2_1' name: 'derelu2_1' type: RELU } 338 | # deconv2_2 339 | layers { bottom: 'deconv2_1' top: 'deconv2_2' name: 'deconv2_2' type: DECONVOLUTION 340 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 341 | convolution_param { num_output:64 pad:1 kernel_size: 3 342 | weight_filler { type: "gaussian" std: 0.01 } 343 | bias_filler { type: "constant" value: 0 }} } 344 | layers { bottom: 'deconv2_2' top: 'deconv2_2' name: 'debn2_2' type: BN 345 | bn_param { scale_filler { type: 'constant' value: 1 } 346 | shift_filler { type: 'constant' value: 0.001 } 347 | bn_mode: INFERENCE } } 348 | layers { bottom: 'deconv2_2' top: 'deconv2_2' name: 'derelu2_2' type: RELU } 349 | 350 | # unpool1 351 | layers { type: UNPOOLING bottom: "deconv2_2" bottom: "pool1_mask" top: "unpool1" name: "unpool1" 352 | 
unpooling_param { unpool: MAX kernel_size: 2 stride: 2 unpool_size: 224 } 353 | } 354 | 355 | # deconv1_1 356 | layers { bottom: 'unpool1' top: 'deconv1_1' name: 'deconv1_1' type: DECONVOLUTION 357 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 358 | convolution_param { num_output:64 pad:1 kernel_size: 3 359 | weight_filler { type: "gaussian" std: 0.01 } 360 | bias_filler { type: "constant" value: 0 }} } 361 | layers { bottom: 'deconv1_1' top: 'deconv1_1' name: 'debn1_1' type: BN 362 | bn_param { scale_filler { type: 'constant' value: 1 } 363 | shift_filler { type: 'constant' value: 0.001 } 364 | bn_mode: INFERENCE } } 365 | layers { bottom: 'deconv1_1' top: 'deconv1_1' name: 'derelu1_1' type: RELU } 366 | 367 | # deconv1_2 368 | layers { bottom: 'deconv1_1' top: 'deconv1_2' name: 'deconv1_2' type: DECONVOLUTION 369 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 370 | convolution_param { num_output:64 pad:1 kernel_size: 3 371 | weight_filler { type: "gaussian" std: 0.01 } 372 | bias_filler { type: "constant" value: 0 }} } 373 | layers { bottom: 'deconv1_2' top: 'deconv1_2' name: 'debn1_2' type: BN 374 | bn_param { scale_filler { type: 'constant' value: 1 } 375 | shift_filler { type: 'constant' value: 0.001 } 376 | bn_mode: INFERENCE } } 377 | layers { bottom: 'deconv1_2' top: 'deconv1_2' name: 'derelu1_2' type: RELU } 378 | 379 | # seg-score 380 | layers { name: 'seg-score-voc' type: CONVOLUTION bottom: 'deconv1_2' top: 'seg-score' 381 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 382 | convolution_param { num_output: 21 kernel_size: 1 383 | weight_filler { type: "gaussian" std: 0.01 } 384 | bias_filler { type: "constant" value: 0 }} } 385 | 386 | -------------------------------------------------------------------------------- /training/003_make_bn_layer_testable/DeconvNet_inference_test.prototxt: -------------------------------------------------------------------------------- 1 | name: "DeconvNet_inference_test.prototxt" 2 | 3 | 
layers { 4 | name: "data" 5 | type: WINDOW_SEG_DATA 6 | top: "data" 7 | top: "seg-label" 8 | image_data_param { 9 | root_folder: "./VOC2012" 10 | source: "./stage_2_train_imgset/val.txt" 11 | label_type: PIXEL 12 | batch_size: 32 13 | shuffle: true 14 | new_width: 224 15 | new_height: 224 16 | } 17 | transform_param { 18 | mean_value: 104.008 19 | mean_value: 116.669 20 | mean_value: 122.675 21 | crop_size: 224 22 | mirror: true 23 | } 24 | } 25 | 26 | # 224 x 224 27 | # conv1_1 28 | layers { bottom: "data" top: "conv1_1" name: "conv1_1" type: CONVOLUTION 29 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 30 | convolution_param { num_output: 64 pad: 1 kernel_size: 3 }} 31 | layers { bottom: 'conv1_1' top: 'conv1_1' name: 'bn1_1' type: BN 32 | bn_param { scale_filler { type: 'constant' value: 1 } 33 | shift_filler { type: 'constant' value: 0.001 } 34 | bn_mode: INFERENCE } } 35 | layers { bottom: "conv1_1" top: "conv1_1" name: "relu1_1" type: RELU} 36 | # conv1_2 37 | layers { bottom: "conv1_1" top: "conv1_2" name: "conv1_2" type: CONVOLUTION 38 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 39 | convolution_param { num_output: 64 pad: 1 kernel_size: 3 }} 40 | layers { bottom: 'conv1_2' top: 'conv1_2' name: 'bn1_2' type: BN 41 | bn_param { scale_filler { type: 'constant' value: 1 } 42 | shift_filler { type: 'constant' value: 0.001 } 43 | bn_mode: INFERENCE } } 44 | layers { bottom: "conv1_2" top: "conv1_2" name: "relu1_2" type: RELU} 45 | 46 | # pool1 47 | layers { 48 | bottom: "conv1_2" top: "pool1" top:"pool1_mask" name: "pool1" type: POOLING 49 | pooling_param { pool: MAX kernel_size: 2 stride: 2 } 50 | } 51 | 52 | # 112 x 112 53 | # conv2_1 54 | layers { bottom: "pool1" top: "conv2_1" name: "conv2_1" type: CONVOLUTION 55 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 56 | convolution_param { num_output: 128 pad: 1 kernel_size: 3 }} 57 | layers { bottom: 'conv2_1' top: 'conv2_1' name: 'bn2_1' type: BN 58 | bn_param { scale_filler { 
type: 'constant' value: 1 } 59 | shift_filler { type: 'constant' value: 0.001 } 60 | bn_mode: INFERENCE } } 61 | layers { bottom: "conv2_1" top: "conv2_1" name: "relu2_1" type: RELU} 62 | # conv2_2 63 | layers { bottom: "conv2_1" top: "conv2_2" name: "conv2_2" type: CONVOLUTION 64 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 65 | convolution_param { num_output: 128 pad: 1 kernel_size: 3 }} 66 | layers { bottom: 'conv2_2' top: 'conv2_2' name: 'bn2_2' type: BN 67 | bn_param { scale_filler { type: 'constant' value: 1 } 68 | shift_filler { type: 'constant' value: 0.001 } 69 | bn_mode: INFERENCE } } 70 | layers { bottom: "conv2_2" top: "conv2_2" name: "relu2_2" type: RELU} 71 | 72 | # pool2 73 | layers { 74 | bottom: "conv2_2" top: "pool2" top: "pool2_mask" name: "pool2" type: POOLING 75 | pooling_param { pool: MAX kernel_size: 2 stride: 2 } 76 | } 77 | 78 | # 56 x 56 79 | # conv3_1 80 | layers { bottom: "pool2" top: "conv3_1" name: "conv3_1" type: CONVOLUTION 81 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 82 | convolution_param { num_output: 256 pad: 1 kernel_size: 3 }} 83 | layers { bottom: 'conv3_1' top: 'conv3_1' name: 'bn3_1' type: BN 84 | bn_param { scale_filler { type: 'constant' value: 1 } 85 | shift_filler { type: 'constant' value: 0.001 } 86 | bn_mode: INFERENCE } } 87 | layers { bottom: "conv3_1" top: "conv3_1" name: "relu3_1" type: RELU} 88 | # conv3_2 89 | layers { bottom: "conv3_1" top: "conv3_2" name: "conv3_2" type: CONVOLUTION 90 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 91 | convolution_param { num_output: 256 pad: 1 kernel_size: 3 }} 92 | layers { bottom: 'conv3_2' top: 'conv3_2' name: 'bn3_2' type: BN 93 | bn_param { scale_filler { type: 'constant' value: 1 } 94 | shift_filler { type: 'constant' value: 0.001 } 95 | bn_mode: INFERENCE } } 96 | layers { bottom: "conv3_2" top: "conv3_2" name: "relu3_2" type: RELU} 97 | # conv3_3 98 | layers { bottom: "conv3_2" top: "conv3_3" name: "conv3_3" type: CONVOLUTION 99 
| blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 100 | convolution_param { num_output: 256 pad: 1 kernel_size: 3 }} 101 | layers { bottom: 'conv3_3' top: 'conv3_3' name: 'bn3_3' type: BN 102 | bn_param { scale_filler { type: 'constant' value: 1 } 103 | shift_filler { type: 'constant' value: 0.001 } 104 | bn_mode: INFERENCE } } 105 | layers { bottom: "conv3_3" top: "conv3_3" name: "relu3_3" type: RELU} 106 | 107 | # pool3 108 | layers { 109 | bottom: "conv3_3" top: "pool3" top: "pool3_mask" name: "pool3" type: POOLING 110 | pooling_param { pool: MAX kernel_size: 2 stride: 2 } 111 | } 112 | 113 | # 28 x 28 114 | # conv4_1 115 | layers { bottom: "pool3" top: "conv4_1" name: "conv4_1" type: CONVOLUTION 116 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 117 | convolution_param { num_output: 512 pad: 1 kernel_size: 3 }} 118 | layers { bottom: 'conv4_1' top: 'conv4_1' name: 'bn4_1' type: BN 119 | bn_param { scale_filler { type: 'constant' value: 1 } 120 | shift_filler { type: 'constant' value: 0.001 } 121 | bn_mode: INFERENCE } } 122 | layers { bottom: "conv4_1" top: "conv4_1" name: "relu4_1" type: RELU} 123 | # conv4_2 124 | layers { bottom: "conv4_1" top: "conv4_2" name: "conv4_2" type: CONVOLUTION 125 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 126 | convolution_param { num_output: 512 pad: 1 kernel_size: 3 }} 127 | layers { bottom: 'conv4_2' top: 'conv4_2' name: 'bn4_2' type: BN 128 | bn_param { scale_filler { type: 'constant' value: 1 } 129 | shift_filler { type: 'constant' value: 0.001 } 130 | bn_mode: INFERENCE } } 131 | layers { bottom: "conv4_2" top: "conv4_2" name: "relu4_2" type: RELU} 132 | # conv4_3 133 | layers { bottom: "conv4_2" top: "conv4_3" name: "conv4_3" type: CONVOLUTION 134 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 135 | convolution_param { num_output: 512 pad: 1 kernel_size: 3 }} 136 | layers { bottom: 'conv4_3' top: 'conv4_3' name: 'bn4_3' type: BN 137 | bn_param { scale_filler { type: 'constant' 
value: 1 } 138 | shift_filler { type: 'constant' value: 0.001 } 139 | bn_mode: INFERENCE } } 140 | layers { bottom: "conv4_3" top: "conv4_3" name: "relu4_3" type: RELU} 141 | 142 | # pool4 143 | layers { 144 | bottom: "conv4_3" top: "pool4" top: "pool4_mask" name: "pool4" type: POOLING 145 | pooling_param { pool: MAX kernel_size: 2 stride: 2 } 146 | } 147 | 148 | # 14 x 14 149 | # conv5_1 150 | layers { bottom: "pool4" top: "conv5_1" name: "conv5_1" type: CONVOLUTION 151 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 152 | convolution_param { num_output: 512 pad: 1 kernel_size: 3 }} 153 | layers { bottom: 'conv5_1' top: 'conv5_1' name: 'bn5_1' type: BN 154 | bn_param { scale_filler { type: 'constant' value: 1 } 155 | shift_filler { type: 'constant' value: 0.001 } 156 | bn_mode: INFERENCE } } 157 | layers { bottom: "conv5_1" top: "conv5_1" name: "relu5_1" type: RELU} 158 | # conv5_2 159 | layers { bottom: "conv5_1" top: "conv5_2" name: "conv5_2" type: CONVOLUTION 160 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 161 | convolution_param { num_output: 512 pad: 1 kernel_size: 3 }} 162 | layers { bottom: 'conv5_2' top: 'conv5_2' name: 'bn5_2' type: BN 163 | bn_param { scale_filler { type: 'constant' value: 1 } 164 | shift_filler { type: 'constant' value: 0.001 } 165 | bn_mode: INFERENCE } } 166 | layers { bottom: "conv5_2" top: "conv5_2" name: "relu5_2" type: RELU} 167 | # conv5_3 168 | layers { bottom: "conv5_2" top: "conv5_3" name: "conv5_3" type: CONVOLUTION 169 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 170 | convolution_param { num_output: 512 pad: 1 kernel_size: 3 }} 171 | layers { bottom: 'conv5_3' top: 'conv5_3' name: 'bn5_3' type: BN 172 | bn_param { scale_filler { type: 'constant' value: 1 } 173 | shift_filler { type: 'constant' value: 0.001 } 174 | bn_mode: INFERENCE } } 175 | layers { bottom: "conv5_3" top: "conv5_3" name: "relu5_3" type: RELU} 176 | 177 | # pool5 178 | layers { 179 | bottom: "conv5_3" top: "pool5" top: 
"pool5_mask" name: "pool5" type: POOLING 180 | pooling_param { pool: MAX kernel_size: 2 stride: 2 } 181 | } 182 | 183 | # 7 x 7 184 | # fc6 185 | layers { bottom: 'pool5' top: 'fc6' name: 'fc6' type: CONVOLUTION 186 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 187 | convolution_param { kernel_size: 7 num_output: 4096 } } 188 | layers { bottom: 'fc6' top: 'fc6' name: 'bnfc6' type: BN 189 | bn_param { scale_filler { type: 'constant' value: 1 } 190 | shift_filler { type: 'constant' value: 0.001 } 191 | bn_mode: INFERENCE } } 192 | layers { bottom: "fc6" top: "fc6" name: "relu6" type: RELU} 193 | 194 | # 1 x 1 195 | # fc7 196 | layers { bottom: 'fc6' top: 'fc7' name: 'fc7' type: CONVOLUTION 197 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 198 | convolution_param { kernel_size: 1 num_output: 4096 } } 199 | layers { bottom: 'fc7' top: 'fc7' name: 'bnfc7' type: BN 200 | bn_param { scale_filler { type: 'constant' value: 1 } 201 | shift_filler { type: 'constant' value: 0.001 } 202 | bn_mode: INFERENCE } } 203 | layers { bottom: "fc7" top: "fc7" name: "relu7" type: RELU} 204 | 205 | # fc6-deconv 206 | layers { bottom: 'fc7' top: 'fc6-deconv' name: 'fc6-deconv' type: DECONVOLUTION 207 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 208 | convolution_param { num_output: 512 kernel_size: 7 209 | weight_filler { type: "gaussian" std: 0.01 } 210 | bias_filler { type: "constant" value: 0 }} } 211 | layers { bottom: 'fc6-deconv' top: 'fc6-deconv' name: 'fc6-deconv-bn' type: BN 212 | bn_param { scale_filler { type: 'constant' value: 1 } 213 | shift_filler { type: 'constant' value: 0.001 } 214 | bn_mode: INFERENCE } } 215 | layers { bottom: 'fc6-deconv' top: 'fc6-deconv' name: 'fc6-deconv-relu' type: RELU } 216 | 217 | # 7 x 7 218 | # unpool5 219 | layers { type: UNPOOLING bottom: "fc6-deconv" bottom: "pool5_mask" top: "unpool5" name: "unpool5" 220 | unpooling_param { unpool: MAX kernel_size: 2 stride: 2 unpool_size: 14 } 221 | } 222 | 223 | # 14 x 
14 224 | # deconv5_1 225 | layers { bottom: 'unpool5' top: 'deconv5_1' name: 'deconv5_1' type: DECONVOLUTION 226 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 227 | convolution_param { num_output: 512 pad: 1 kernel_size: 3 228 | weight_filler { type: "gaussian" std: 0.01 } 229 | bias_filler { type: "constant" value: 0 }} } 230 | layers { bottom: 'deconv5_1' top: 'deconv5_1' name: 'debn5_1' type: BN 231 | bn_param { scale_filler { type: 'constant' value: 1 } 232 | shift_filler { type: 'constant' value: 0.001 } 233 | bn_mode: INFERENCE } } 234 | layers { bottom: 'deconv5_1' top: 'deconv5_1' name: 'derelu5_1' type: RELU } 235 | # deconv5_2 236 | layers { bottom: 'deconv5_1' top: 'deconv5_2' name: 'deconv5_2' type: DECONVOLUTION 237 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 238 | convolution_param { num_output: 512 pad: 1 kernel_size: 3 239 | weight_filler { type: "gaussian" std: 0.01 } 240 | bias_filler { type: "constant" value: 0 }} } 241 | layers { bottom: 'deconv5_2' top: 'deconv5_2' name: 'debn5_2' type: BN 242 | bn_param { scale_filler { type: 'constant' value: 1 } 243 | shift_filler { type: 'constant' value: 0.001 } 244 | bn_mode: INFERENCE } } 245 | layers { bottom: 'deconv5_2' top: 'deconv5_2' name: 'derelu5_2' type: RELU } 246 | # deconv5_3 247 | layers { bottom: 'deconv5_2' top: 'deconv5_3' name: 'deconv5_3' type: DECONVOLUTION 248 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 249 | convolution_param { num_output: 512 pad: 1 kernel_size: 3 250 | weight_filler { type: "gaussian" std: 0.01 } 251 | bias_filler { type: "constant" value: 0 }} } 252 | layers { bottom: 'deconv5_3' top: 'deconv5_3' name: 'debn5_3' type: BN 253 | bn_param { scale_filler { type: 'constant' value: 1 } 254 | shift_filler { type: 'constant' value: 0.001 } 255 | bn_mode: INFERENCE } } 256 | layers { bottom: 'deconv5_3' top: 'deconv5_3' name: 'derelu5_3' type: RELU } 257 | 258 | # unpool4 259 | layers { type: UNPOOLING bottom: "deconv5_3" bottom: 
"pool4_mask" top: "unpool4" name: "unpool4" 260 | unpooling_param { unpool: MAX kernel_size: 2 stride: 2 unpool_size: 28 } 261 | } 262 | 263 | # 28 x 28 264 | # deconv4_1 265 | layers { bottom: 'unpool4' top: 'deconv4_1' name: 'deconv4_1' type: DECONVOLUTION 266 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 267 | convolution_param { num_output: 512 pad: 1 kernel_size: 3 268 | weight_filler { type: "gaussian" std: 0.01 } 269 | bias_filler { type: "constant" value: 0 }} } 270 | layers { bottom: 'deconv4_1' top: 'deconv4_1' name: 'debn4_1' type: BN 271 | bn_param { scale_filler { type: 'constant' value: 1 } 272 | shift_filler { type: 'constant' value: 0.001 } 273 | bn_mode: INFERENCE } } 274 | layers { bottom: 'deconv4_1' top: 'deconv4_1' name: 'derelu4_1' type: RELU } 275 | # deconv 4_2 276 | layers { bottom: 'deconv4_1' top: 'deconv4_2' name: 'deconv4_2' type: DECONVOLUTION 277 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 278 | convolution_param { num_output: 512 pad: 1 kernel_size: 3 279 | weight_filler { type: "gaussian" std: 0.01 } 280 | bias_filler { type: "constant" value: 0 }} } 281 | layers { bottom: 'deconv4_2' top: 'deconv4_2' name: 'debn4_2' type: BN 282 | bn_param { scale_filler { type: 'constant' value: 1 } 283 | shift_filler { type: 'constant' value: 0.001 } 284 | bn_mode: INFERENCE } } 285 | layers { bottom: 'deconv4_2' top: 'deconv4_2' name: 'derelu4_2' type: RELU } 286 | # deconv 4_3 287 | layers { bottom: 'deconv4_2' top: 'deconv4_3' name: 'deconv4_3' type: DECONVOLUTION 288 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 289 | convolution_param { num_output: 256 pad: 1 kernel_size: 3 290 | weight_filler { type: "gaussian" std: 0.01 } 291 | bias_filler { type: "constant" value: 0 }} } 292 | layers { bottom: 'deconv4_3' top: 'deconv4_3' name: 'debn4_3' type: BN 293 | bn_param { scale_filler { type: 'constant' value: 1 } 294 | shift_filler { type: 'constant' value: 0.001 } 295 | bn_mode: INFERENCE } } 296 | layers { 
bottom: 'deconv4_3' top: 'deconv4_3' name: 'derelu4_3' type: RELU } 297 | 298 | # unpool3 299 | layers { type: UNPOOLING bottom: "deconv4_3" bottom: "pool3_mask" top: "unpool3" name: "unpool3" 300 | unpooling_param { unpool: MAX kernel_size: 2 stride: 2 unpool_size: 56 } 301 | } 302 | 303 | # 56 x 56 304 | # deconv3_1 305 | layers { bottom: 'unpool3' top: 'deconv3_1' name: 'deconv3_1' type: DECONVOLUTION 306 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 307 | convolution_param { num_output:256 pad:1 kernel_size: 3 308 | weight_filler { type: "gaussian" std: 0.01 } 309 | bias_filler { type: "constant" value: 0 }} } 310 | layers { bottom: 'deconv3_1' top: 'deconv3_1' name: 'debn3_1' type: BN 311 | bn_param { scale_filler { type: 'constant' value: 1 } 312 | shift_filler { type: 'constant' value: 0.001 } 313 | bn_mode: INFERENCE } } 314 | layers { bottom: 'deconv3_1' top: 'deconv3_1' name: 'derelu3_1' type: RELU } 315 | # deconv3_2 316 | layers { bottom: 'deconv3_1' top: 'deconv3_2' name: 'deconv3_2' type: DECONVOLUTION 317 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 318 | convolution_param { num_output:256 pad:1 kernel_size: 3 319 | weight_filler { type: "gaussian" std: 0.01 } 320 | bias_filler { type: "constant" value: 0 }} } 321 | layers { bottom: 'deconv3_2' top: 'deconv3_2' name: 'debn3_2' type: BN 322 | bn_param { scale_filler { type: 'constant' value: 1 } 323 | shift_filler { type: 'constant' value: 0.001 } 324 | bn_mode: INFERENCE } } 325 | layers { bottom: 'deconv3_2' top: 'deconv3_2' name: 'derelu3_2' type: RELU } 326 | # deconv3_3 327 | layers { bottom: 'deconv3_2' top: 'deconv3_3' name: 'deconv3_3' type: DECONVOLUTION 328 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 329 | convolution_param { num_output:128 pad:1 kernel_size: 3 330 | weight_filler { type: "gaussian" std: 0.01 } 331 | bias_filler { type: "constant" value: 0 }} } 332 | layers { bottom: 'deconv3_3' top: 'deconv3_3' name: 'debn3_3' type: BN 333 | bn_param { 
scale_filler { type: 'constant' value: 1 } 334 | shift_filler { type: 'constant' value: 0.001 } 335 | bn_mode: INFERENCE } } 336 | layers { bottom: 'deconv3_3' top: 'deconv3_3' name: 'derelu3_3' type: RELU } 337 | 338 | # unpool2 339 | layers { type: UNPOOLING bottom: "deconv3_3" bottom: "pool2_mask" top: "unpool2" name: "unpool2" 340 | unpooling_param { unpool: MAX kernel_size: 2 stride: 2 unpool_size: 112 } 341 | } 342 | 343 | # 112 x 112 344 | # deconv2_1 345 | layers { bottom: 'unpool2' top: 'deconv2_1' name: 'deconv2_1' type: DECONVOLUTION 346 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 347 | convolution_param { num_output:128 pad:1 kernel_size: 3 348 | weight_filler { type: "gaussian" std: 0.01 } 349 | bias_filler { type: "constant" value: 0 }} } 350 | layers { bottom: 'deconv2_1' top: 'deconv2_1' name: 'debn2_1' type: BN 351 | bn_param { scale_filler { type: 'constant' value: 1 } 352 | shift_filler { type: 'constant' value: 0.001 } 353 | bn_mode: INFERENCE } } 354 | layers { bottom: 'deconv2_1' top: 'deconv2_1' name: 'derelu2_1' type: RELU } 355 | # deconv2_2 356 | layers { bottom: 'deconv2_1' top: 'deconv2_2' name: 'deconv2_2' type: DECONVOLUTION 357 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 358 | convolution_param { num_output:64 pad:1 kernel_size: 3 359 | weight_filler { type: "gaussian" std: 0.01 } 360 | bias_filler { type: "constant" value: 0 }} } 361 | layers { bottom: 'deconv2_2' top: 'deconv2_2' name: 'debn2_2' type: BN 362 | bn_param { scale_filler { type: 'constant' value: 1 } 363 | shift_filler { type: 'constant' value: 0.001 } 364 | bn_mode: INFERENCE } } 365 | layers { bottom: 'deconv2_2' top: 'deconv2_2' name: 'derelu2_2' type: RELU } 366 | 367 | # unpool1 368 | layers { type: UNPOOLING bottom: "deconv2_2" bottom: "pool1_mask" top: "unpool1" name: "unpool1" 369 | unpooling_param { unpool: MAX kernel_size: 2 stride: 2 unpool_size: 224 } 370 | } 371 | 372 | # deconv1_1 373 | layers { bottom: 'unpool1' top: 
'deconv1_1' name: 'deconv1_1' type: DECONVOLUTION 374 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 375 | convolution_param { num_output:64 pad:1 kernel_size: 3 376 | weight_filler { type: "gaussian" std: 0.01 } 377 | bias_filler { type: "constant" value: 0 }} } 378 | layers { bottom: 'deconv1_1' top: 'deconv1_1' name: 'debn1_1' type: BN 379 | bn_param { scale_filler { type: 'constant' value: 1 } 380 | shift_filler { type: 'constant' value: 0.001 } 381 | bn_mode: INFERENCE } } 382 | layers { bottom: 'deconv1_1' top: 'deconv1_1' name: 'derelu1_1' type: RELU } 383 | 384 | # deconv1_2 385 | layers { bottom: 'deconv1_1' top: 'deconv1_2' name: 'deconv1_2' type: DECONVOLUTION 386 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 387 | convolution_param { num_output:64 pad:1 kernel_size: 3 388 | weight_filler { type: "gaussian" std: 0.01 } 389 | bias_filler { type: "constant" value: 0 }} } 390 | layers { bottom: 'deconv1_2' top: 'deconv1_2' name: 'debn1_2' type: BN 391 | bn_param { scale_filler { type: 'constant' value: 1 } 392 | shift_filler { type: 'constant' value: 0.001 } 393 | bn_mode: INFERENCE } } 394 | layers { bottom: 'deconv1_2' top: 'deconv1_2' name: 'derelu1_2' type: RELU } 395 | 396 | # seg-score 397 | layers { name: 'seg-score-voc' type: CONVOLUTION bottom: 'deconv1_2' top: 'seg-score' 398 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 399 | convolution_param { num_output: 21 kernel_size: 1 400 | weight_filler { type: "gaussian" std: 0.01 } 401 | bias_filler { type: "constant" value: 0 }} } 402 | 403 | 404 | # score and accuracy 405 | layers { 406 | name: "seg-accuracy" 407 | type: ELTWISE_ACCURACY 408 | bottom: "seg-score" 409 | bottom: "seg-label" 410 | top: "seg-accuracy" 411 | eltwise_accuracy_param { 412 | ignore_label: 255 413 | } 414 | } 415 | 416 | 417 | -------------------------------------------------------------------------------- /training/003_make_bn_layer_testable/DeconvNet_make_bn_layer_testable.prototxt: 
-------------------------------------------------------------------------------- 1 | name: "DeconvNet_make_bn_layer_testable" 2 | 3 | layers { 4 | name: "data" 5 | type: WINDOW_SEG_DATA 6 | top: "data" 7 | top: "seg-label" 8 | image_data_param { 9 | root_folder: "./VOC2012" 10 | source: "./stage_2_train_imgset/train.txt" 11 | label_type: PIXEL 12 | batch_size: 28 13 | shuffle: true 14 | new_width: 250 15 | new_height: 250 16 | } 17 | transform_param { 18 | mean_value: 104.008 19 | mean_value: 116.669 20 | mean_value: 122.675 21 | crop_size: 224 22 | mirror: true 23 | } 24 | } 25 | 26 | # 224 x 224 27 | # conv1_1 28 | layers { bottom: "data" top: "conv1_1" name: "conv1_1" type: CONVOLUTION 29 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 30 | convolution_param { num_output: 64 pad: 1 kernel_size: 3 }} 31 | layers { bottom: 'conv1_1' top: 'conv1_1' top:'bn1_1-mean' top: 'bn1_1-var' name: 'bn1_1' type: BN 32 | bn_param { scale_filler { type: 'constant' value: 1 } 33 | shift_filler { type: 'constant' value: 0.001 } } } 34 | layers { bottom: "conv1_1" top: "conv1_1" name: "relu1_1" type: RELU} 35 | # conv1_2 36 | layers { bottom: "conv1_1" top: "conv1_2" name: "conv1_2" type: CONVOLUTION 37 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 38 | convolution_param { num_output: 64 pad: 1 kernel_size: 3 }} 39 | layers { bottom: 'conv1_2' top: 'conv1_2' top:'bn1_2-mean' top:'bn1_2-var' name: 'bn1_2' type: BN 40 | bn_param { scale_filler { type: 'constant' value: 1 } 41 | shift_filler { type: 'constant' value: 0.001 } } } 42 | layers { bottom: "conv1_2" top: "conv1_2" name: "relu1_2" type: RELU} 43 | 44 | # pool1 45 | layers { 46 | bottom: "conv1_2" top: "pool1" top:"pool1_mask" name: "pool1" type: POOLING 47 | pooling_param { pool: MAX kernel_size: 2 stride: 2 } 48 | } 49 | 50 | # 112 x 112 51 | # conv2_1 52 | layers { bottom: "pool1" top: "conv2_1" name: "conv2_1" type: CONVOLUTION 53 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 54 | 
convolution_param { num_output: 128 pad: 1 kernel_size: 3 }} 55 | layers { bottom: 'conv2_1' top: 'conv2_1' top: 'bn2_1-mean' top: 'bn2_1-var' name: 'bn2_1' type: BN 56 | bn_param { scale_filler { type: 'constant' value: 1 } 57 | shift_filler { type: 'constant' value: 0.001 } } } 58 | layers { bottom: "conv2_1" top: "conv2_1" name: "relu2_1" type: RELU} 59 | # conv2_2 60 | layers { bottom: "conv2_1" top: "conv2_2" name: "conv2_2" type: CONVOLUTION 61 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 62 | convolution_param { num_output: 128 pad: 1 kernel_size: 3 }} 63 | layers { bottom: 'conv2_2' top: 'conv2_2' top:'bn2_2-mean' top:'bn2_2-var' name: 'bn2_2' type: BN 64 | bn_param { scale_filler { type: 'constant' value: 1 } 65 | shift_filler { type: 'constant' value: 0.001 } } } 66 | layers { bottom: "conv2_2" top: "conv2_2" name: "relu2_2" type: RELU} 67 | 68 | # pool2 69 | layers { 70 | bottom: "conv2_2" top: "pool2" top: "pool2_mask" name: "pool2" type: POOLING 71 | pooling_param { pool: MAX kernel_size: 2 stride: 2 } 72 | } 73 | 74 | # 56 x 56 75 | # conv3_1 76 | layers { bottom: "pool2" top: "conv3_1" name: "conv3_1" type: CONVOLUTION 77 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 78 | convolution_param { num_output: 256 pad: 1 kernel_size: 3 }} 79 | layers { bottom: 'conv3_1' top: 'conv3_1' top:'bn3_1-mean' top: 'bn3_1-var' name: 'bn3_1' type: BN 80 | bn_param { scale_filler { type: 'constant' value: 1 } 81 | shift_filler { type: 'constant' value: 0.001 } } } 82 | layers { bottom: "conv3_1" top: "conv3_1" name: "relu3_1" type: RELU} 83 | # conv3_2 84 | layers { bottom: "conv3_1" top: "conv3_2" name: "conv3_2" type: CONVOLUTION 85 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 86 | convolution_param { num_output: 256 pad: 1 kernel_size: 3 }} 87 | layers { bottom: 'conv3_2' top: 'conv3_2' top:'bn3_2-mean' top: 'bn3_2-var' name: 'bn3_2' type: BN 88 | bn_param { scale_filler { type: 'constant' value: 1 } 89 | shift_filler { type: 
'constant' value: 0.001 } } } 90 | layers { bottom: "conv3_2" top: "conv3_2" name: "relu3_2" type: RELU} 91 | # conv3_3 92 | layers { bottom: "conv3_2" top: "conv3_3" name: "conv3_3" type: CONVOLUTION 93 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 94 | convolution_param { num_output: 256 pad: 1 kernel_size: 3 }} 95 | layers { bottom: 'conv3_3' top: 'conv3_3' top: 'bn3_3-mean' top: 'bn3_3-var' name: 'bn3_3' type: BN 96 | bn_param { scale_filler { type: 'constant' value: 1 } 97 | shift_filler { type: 'constant' value: 0.001 } } } 98 | layers { bottom: "conv3_3" top: "conv3_3" name: "relu3_3" type: RELU} 99 | 100 | # pool3 101 | layers { 102 | bottom: "conv3_3" top: "pool3" top: "pool3_mask" name: "pool3" type: POOLING 103 | pooling_param { pool: MAX kernel_size: 2 stride: 2 } 104 | } 105 | 106 | # 28 x 28 107 | # conv4_1 108 | layers { bottom: "pool3" top: "conv4_1" name: "conv4_1" type: CONVOLUTION 109 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 110 | convolution_param { num_output: 512 pad: 1 kernel_size: 3 }} 111 | layers { bottom: 'conv4_1' top: 'conv4_1' top: 'bn4_1-mean' top: 'bn4_1-var' name: 'bn4_1' type: BN 112 | bn_param { scale_filler { type: 'constant' value: 1 } 113 | shift_filler { type: 'constant' value: 0.001 } } } 114 | layers { bottom: "conv4_1" top: "conv4_1" name: "relu4_1" type: RELU} 115 | # conv4_2 116 | layers { bottom: "conv4_1" top: "conv4_2" name: "conv4_2" type: CONVOLUTION 117 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 118 | convolution_param { num_output: 512 pad: 1 kernel_size: 3 }} 119 | layers { bottom: 'conv4_2' top: 'conv4_2' top: 'bn4_2-mean' top: 'bn4_2-var' name: 'bn4_2' type: BN 120 | bn_param { scale_filler { type: 'constant' value: 1 } 121 | shift_filler { type: 'constant' value: 0.001 } } } 122 | layers { bottom: "conv4_2" top: "conv4_2" name: "relu4_2" type: RELU} 123 | # conv4_3 124 | layers { bottom: "conv4_2" top: "conv4_3" name: "conv4_3" type: CONVOLUTION 125 | blobs_lr: 1 
blobs_lr: 2 weight_decay: 1 weight_decay: 0 126 | convolution_param { num_output: 512 pad: 1 kernel_size: 3 }} 127 | layers { bottom: 'conv4_3' top: 'conv4_3' top: 'bn4_3-mean' top: 'bn4_3-var' name: 'bn4_3' type: BN 128 | bn_param { scale_filler { type: 'constant' value: 1 } 129 | shift_filler { type: 'constant' value: 0.001 } } } 130 | layers { bottom: "conv4_3" top: "conv4_3" name: "relu4_3" type: RELU} 131 | 132 | # pool4 133 | layers { 134 | bottom: "conv4_3" top: "pool4" top: "pool4_mask" name: "pool4" type: POOLING 135 | pooling_param { pool: MAX kernel_size: 2 stride: 2 } 136 | } 137 | 138 | # 14 x 14 139 | # conv5_1 140 | layers { bottom: "pool4" top: "conv5_1" name: "conv5_1" type: CONVOLUTION 141 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 142 | convolution_param { num_output: 512 pad: 1 kernel_size: 3 }} 143 | layers { bottom: 'conv5_1' top: 'conv5_1' top: 'bn5_1-mean' top: 'bn5_1-var' name: 'bn5_1' type: BN 144 | bn_param { scale_filler { type: 'constant' value: 1 } 145 | shift_filler { type: 'constant' value: 0.001 } } } 146 | layers { bottom: "conv5_1" top: "conv5_1" name: "relu5_1" type: RELU} 147 | # conv5_2 148 | layers { bottom: "conv5_1" top: "conv5_2" name: "conv5_2" type: CONVOLUTION 149 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 150 | convolution_param { num_output: 512 pad: 1 kernel_size: 3 }} 151 | layers { bottom: 'conv5_2' top: 'conv5_2' top: 'bn5_2-mean' top: 'bn5_2-var' name: 'bn5_2' type: BN 152 | bn_param { scale_filler { type: 'constant' value: 1 } 153 | shift_filler { type: 'constant' value: 0.001 } } } 154 | layers { bottom: "conv5_2" top: "conv5_2" name: "relu5_2" type: RELU} 155 | # conv5_3 156 | layers { bottom: "conv5_2" top: "conv5_3" name: "conv5_3" type: CONVOLUTION 157 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 158 | convolution_param { num_output: 512 pad: 1 kernel_size: 3 }} 159 | layers { bottom: 'conv5_3' top: 'conv5_3' top: 'bn5_3-mean' top: 'bn5_3-var' name: 'bn5_3' type: BN 
160 | bn_param { scale_filler { type: 'constant' value: 1 } 161 | shift_filler { type: 'constant' value: 0.001 } } } 162 | layers { bottom: "conv5_3" top: "conv5_3" name: "relu5_3" type: RELU} 163 | 164 | # pool5 165 | layers { 166 | bottom: "conv5_3" top: "pool5" top: "pool5_mask" name: "pool5" type: POOLING 167 | pooling_param { pool: MAX kernel_size: 2 stride: 2 } 168 | } 169 | 170 | # 7 x 7 171 | # fc6 172 | layers { bottom: 'pool5' top: 'fc6' name: 'fc6' type: CONVOLUTION 173 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 174 | convolution_param { kernel_size: 7 num_output: 4096 } } 175 | layers { bottom: 'fc6' top: 'fc6' top: 'bnfc6-mean' top: 'bnfc6-var' name: 'bnfc6' type: BN 176 | bn_param { scale_filler { type: 'constant' value: 1 } 177 | shift_filler { type: 'constant' value: 0.001 } } } 178 | layers { bottom: "fc6" top: "fc6" name: "relu6" type: RELU} 179 | 180 | # 1 x 1 181 | # fc7 182 | layers { bottom: 'fc6' top: 'fc7' name: 'fc7' type: CONVOLUTION 183 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 184 | convolution_param { kernel_size: 1 num_output: 4096 } } 185 | layers { bottom: 'fc7' top: 'fc7' top: 'bnfc7-mean' top: 'bnfc7-var' name: 'bnfc7' type: BN 186 | bn_param { scale_filler { type: 'constant' value: 1 } 187 | shift_filler { type: 'constant' value: 0.001 } } } 188 | layers { bottom: "fc7" top: "fc7" name: "relu7" type: RELU} 189 | 190 | # fc6-deconv 191 | layers { bottom: 'fc7' top: 'fc6-deconv' name: 'fc6-deconv' type: DECONVOLUTION 192 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 193 | convolution_param { num_output: 512 kernel_size: 7 194 | weight_filler { type: "gaussian" std: 0.01 } 195 | bias_filler { type: "constant" value: 0 }} } 196 | layers { bottom: 'fc6-deconv' top: 'fc6-deconv' top: 'fc6-deconv-bn-mean' top: 'fc6-deconv-bn-var' 197 | name: 'fc6-deconv-bn' type: BN 198 | bn_param { scale_filler { type: 'constant' value: 1 } 199 | shift_filler { type: 'constant' value: 0.001 } } } 200 | layers { 
bottom: 'fc6-deconv' top: 'fc6-deconv' name: 'fc6-deconv-relu' type: RELU } 201 | 202 | # 7 x 7 203 | # unpool5 204 | layers { type: UNPOOLING bottom: "fc6-deconv" bottom: "pool5_mask" top: "unpool5" name: "unpool5" 205 | unpooling_param { unpool: MAX kernel_size: 2 stride: 2 unpool_size: 14 } 206 | } 207 | 208 | # 14 x 14 209 | # deconv5_1 210 | layers { bottom: 'unpool5' top: 'deconv5_1' name: 'deconv5_1' type: DECONVOLUTION 211 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 212 | convolution_param { num_output: 512 pad: 1 kernel_size: 3 213 | weight_filler { type: "gaussian" std: 0.01 } 214 | bias_filler { type: "constant" value: 0 }} } 215 | layers { bottom: 'deconv5_1' top: 'deconv5_1' top: 'debn5_1-mean' top: 'debn5_1-var' 216 | name: 'debn5_1' type: BN 217 | bn_param { scale_filler { type: 'constant' value: 1 } 218 | shift_filler { type: 'constant' value: 0.001 } } } 219 | layers { bottom: 'deconv5_1' top: 'deconv5_1' name: 'derelu5_1' type: RELU } 220 | # deconv5_2 221 | layers { bottom: 'deconv5_1' top: 'deconv5_2' name: 'deconv5_2' type: DECONVOLUTION 222 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 223 | convolution_param { num_output: 512 pad: 1 kernel_size: 3 224 | weight_filler { type: "gaussian" std: 0.01 } 225 | bias_filler { type: "constant" value: 0 }} } 226 | layers { bottom: 'deconv5_2' top: 'deconv5_2' top: 'debn5_2-mean' top: 'debn5_2-var' 227 | name: 'debn5_2' type: BN 228 | bn_param { scale_filler { type: 'constant' value: 1 } 229 | shift_filler { type: 'constant' value: 0.001 } } } 230 | layers { bottom: 'deconv5_2' top: 'deconv5_2' name: 'derelu5_2' type: RELU } 231 | # deconv5_3 232 | layers { bottom: 'deconv5_2' top: 'deconv5_3' name: 'deconv5_3' type: DECONVOLUTION 233 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 234 | convolution_param { num_output: 512 pad: 1 kernel_size: 3 235 | weight_filler { type: "gaussian" std: 0.01 } 236 | bias_filler { type: "constant" value: 0 }} } 237 | layers { bottom: 
'deconv5_3' top: 'deconv5_3' top: 'debn5_3-mean' top: 'debn5_3-var' 238 | name: 'debn5_3' type: BN 239 | bn_param { scale_filler { type: 'constant' value: 1 } 240 | shift_filler { type: 'constant' value: 0.001 } } } 241 | layers { bottom: 'deconv5_3' top: 'deconv5_3' name: 'derelu5_3' type: RELU } 242 | 243 | # unpool4 244 | layers { type: UNPOOLING bottom: "deconv5_3" bottom: "pool4_mask" top: "unpool4" name: "unpool4" 245 | unpooling_param { unpool: MAX kernel_size: 2 stride: 2 unpool_size: 28 } 246 | } 247 | 248 | # 28 x 28 249 | # deconv4_1 250 | layers { bottom: 'unpool4' top: 'deconv4_1' name: 'deconv4_1' type: DECONVOLUTION 251 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 252 | convolution_param { num_output: 512 pad: 1 kernel_size: 3 253 | weight_filler { type: "gaussian" std: 0.01 } 254 | bias_filler { type: "constant" value: 0 }} } 255 | layers { bottom: 'deconv4_1' top: 'deconv4_1' top: 'debn4_1-mean' top: 'debn4_1-var' 256 | name: 'debn4_1' type: BN 257 | bn_param { scale_filler { type: 'constant' value: 1 } 258 | shift_filler { type: 'constant' value: 0.001 } } } 259 | layers { bottom: 'deconv4_1' top: 'deconv4_1' name: 'derelu4_1' type: RELU } 260 | # deconv 4_2 261 | layers { bottom: 'deconv4_1' top: 'deconv4_2' name: 'deconv4_2' type: DECONVOLUTION 262 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 263 | convolution_param { num_output: 512 pad: 1 kernel_size: 3 264 | weight_filler { type: "gaussian" std: 0.01 } 265 | bias_filler { type: "constant" value: 0 }} } 266 | layers { bottom: 'deconv4_2' top: 'deconv4_2' top: 'debn4_2-mean' top: 'debn4_2-var' 267 | name: 'debn4_2' type: BN 268 | bn_param { scale_filler { type: 'constant' value: 1 } 269 | shift_filler { type: 'constant' value: 0.001 } } } 270 | layers { bottom: 'deconv4_2' top: 'deconv4_2' name: 'derelu4_2' type: RELU } 271 | # deconv 4_3 272 | layers { bottom: 'deconv4_2' top: 'deconv4_3' name: 'deconv4_3' type: DECONVOLUTION 273 | blobs_lr: 1 blobs_lr: 2 
weight_decay: 1 weight_decay: 0 274 | convolution_param { num_output: 256 pad: 1 kernel_size: 3 275 | weight_filler { type: "gaussian" std: 0.01 } 276 | bias_filler { type: "constant" value: 0 }} } 277 | layers { bottom: 'deconv4_3' top: 'deconv4_3' top: 'debn4_3-mean' top: 'debn4_3-var' 278 | name: 'debn4_3' type: BN 279 | bn_param { scale_filler { type: 'constant' value: 1 } 280 | shift_filler { type: 'constant' value: 0.001 } } } 281 | layers { bottom: 'deconv4_3' top: 'deconv4_3' name: 'derelu4_3' type: RELU } 282 | 283 | # unpool3 284 | layers { type: UNPOOLING bottom: "deconv4_3" bottom: "pool3_mask" top: "unpool3" name: "unpool3" 285 | unpooling_param { unpool: MAX kernel_size: 2 stride: 2 unpool_size: 56 } 286 | } 287 | 288 | # 56 x 56 289 | # deconv3_1 290 | layers { bottom: 'unpool3' top: 'deconv3_1' name: 'deconv3_1' type: DECONVOLUTION 291 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 292 | convolution_param { num_output:256 pad:1 kernel_size: 3 293 | weight_filler { type: "gaussian" std: 0.01 } 294 | bias_filler { type: "constant" value: 0 }} } 295 | layers { bottom: 'deconv3_1' top: 'deconv3_1' top: 'debn3_1-mean' top: 'debn3_1-var' 296 | name: 'debn3_1' type: BN 297 | bn_param { scale_filler { type: 'constant' value: 1 } 298 | shift_filler { type: 'constant' value: 0.001 } } } 299 | layers { bottom: 'deconv3_1' top: 'deconv3_1' name: 'derelu3_1' type: RELU } 300 | # deconv3_2 301 | layers { bottom: 'deconv3_1' top: 'deconv3_2' name: 'deconv3_2' type: DECONVOLUTION 302 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 303 | convolution_param { num_output:256 pad:1 kernel_size: 3 304 | weight_filler { type: "gaussian" std: 0.01 } 305 | bias_filler { type: "constant" value: 0 }} } 306 | layers { bottom: 'deconv3_2' top: 'deconv3_2' top: 'debn3_2-mean' top: 'debn3_2-var' 307 | name: 'debn3_2' type: BN 308 | bn_param { scale_filler { type: 'constant' value: 1 } 309 | shift_filler { type: 'constant' value: 0.001 } } } 310 | layers { 
bottom: 'deconv3_2' top: 'deconv3_2' name: 'derelu3_2' type: RELU } 311 | # deconv3_3 312 | layers { bottom: 'deconv3_2' top: 'deconv3_3' name: 'deconv3_3' type: DECONVOLUTION 313 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 314 | convolution_param { num_output:128 pad:1 kernel_size: 3 315 | weight_filler { type: "gaussian" std: 0.01 } 316 | bias_filler { type: "constant" value: 0 }} } 317 | layers { bottom: 'deconv3_3' top: 'deconv3_3' top: 'debn3_3-mean' top: 'debn3_3-var' 318 | name: 'debn3_3' type: BN 319 | bn_param { scale_filler { type: 'constant' value: 1 } 320 | shift_filler { type: 'constant' value: 0.001 } } } 321 | layers { bottom: 'deconv3_3' top: 'deconv3_3' name: 'derelu3_3' type: RELU } 322 | 323 | # unpool2 324 | layers { type: UNPOOLING bottom: "deconv3_3" bottom: "pool2_mask" top: "unpool2" name: "unpool2" 325 | unpooling_param { unpool: MAX kernel_size: 2 stride: 2 unpool_size: 112 } 326 | } 327 | 328 | # 112 x 112 329 | # deconv2_1 330 | layers { bottom: 'unpool2' top: 'deconv2_1' name: 'deconv2_1' type: DECONVOLUTION 331 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 332 | convolution_param { num_output:128 pad:1 kernel_size: 3 333 | weight_filler { type: "gaussian" std: 0.01 } 334 | bias_filler { type: "constant" value: 0 }} } 335 | layers { bottom: 'deconv2_1' top: 'deconv2_1' top: 'debn2_1-mean' top: 'debn2_1-var' 336 | name: 'debn2_1' type: BN 337 | bn_param { scale_filler { type: 'constant' value: 1 } 338 | shift_filler { type: 'constant' value: 0.001 } } } 339 | layers { bottom: 'deconv2_1' top: 'deconv2_1' name: 'derelu2_1' type: RELU } 340 | # deconv2_2 341 | layers { bottom: 'deconv2_1' top: 'deconv2_2' name: 'deconv2_2' type: DECONVOLUTION 342 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 343 | convolution_param { num_output:64 pad:1 kernel_size: 3 344 | weight_filler { type: "gaussian" std: 0.01 } 345 | bias_filler { type: "constant" value: 0 }} } 346 | layers { bottom: 'deconv2_2' top: 'deconv2_2' 
top: 'debn2_2-mean' top: 'debn2_2-var' 347 | name: 'debn2_2' type: BN 348 | bn_param { scale_filler { type: 'constant' value: 1 } 349 | shift_filler { type: 'constant' value: 0.001 } } } 350 | layers { bottom: 'deconv2_2' top: 'deconv2_2' name: 'derelu2_2' type: RELU } 351 | 352 | # unpool1 353 | layers { type: UNPOOLING bottom: "deconv2_2" bottom: "pool1_mask" top: "unpool1" name: "unpool1" 354 | unpooling_param { unpool: MAX kernel_size: 2 stride: 2 unpool_size: 224 } 355 | } 356 | 357 | # deconv1_1 358 | layers { bottom: 'unpool1' top: 'deconv1_1' name: 'deconv1_1' type: DECONVOLUTION 359 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 360 | convolution_param { num_output:64 pad:1 kernel_size: 3 361 | weight_filler { type: "gaussian" std: 0.01 } 362 | bias_filler { type: "constant" value: 0 }} } 363 | layers { bottom: 'deconv1_1' top: 'deconv1_1' top: 'debn1_1-mean' top: 'debn1_1-var' 364 | name: 'debn1_1' type: BN 365 | bn_param { scale_filler { type: 'constant' value: 1 } 366 | shift_filler { type: 'constant' value: 0.001 } } } 367 | layers { bottom: 'deconv1_1' top: 'deconv1_1' name: 'derelu1_1' type: RELU } 368 | 369 | # deconv1_2 370 | layers { bottom: 'deconv1_1' top: 'deconv1_2' name: 'deconv1_2' type: DECONVOLUTION 371 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 372 | convolution_param { num_output:64 pad:1 kernel_size: 3 373 | weight_filler { type: "gaussian" std: 0.01 } 374 | bias_filler { type: "constant" value: 0 }} } 375 | layers { bottom: 'deconv1_2' top: 'deconv1_2' top: 'debn1_2-mean' top: 'debn1_2-var' 376 | name: 'debn1_2' type: BN 377 | bn_param { scale_filler { type: 'constant' value: 1 } 378 | shift_filler { type: 'constant' value: 0.001 } } } 379 | layers { bottom: 'deconv1_2' top: 'deconv1_2' name: 'derelu1_2' type: RELU } 380 | 381 | # seg-score 382 | layers { name: 'seg-score-voc' type: CONVOLUTION bottom: 'deconv1_2' top: 'seg-score' 383 | blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 384 | 
convolution_param { num_output: 21 kernel_size: 1 385 | weight_filler { type: "gaussian" std: 0.01 } 386 | bias_filler { type: "constant" value: 0 }} } 387 | 388 | 389 | -------------------------------------------------------------------------------- /training/003_make_bn_layer_testable/start_test.sh: -------------------------------------------------------------------------------- 1 | LOGDIR=./testing_log 2 | CAFFE=./caffe/build/tools/caffe 3 | MODEL=./DeconvNet_inference_test.prototxt 4 | WEIGHTS=./DeconvNet_trainval_inference.caffemodel 5 | 6 | GLOG_log_dir=$LOGDIR $CAFFE test -model $MODEL -weights $WEIGHTS -gpu 0 7 | 8 | 9 | -------------------------------------------------------------------------------- /training/003_start_make_bn_layer_testable.sh: -------------------------------------------------------------------------------- 1 | ################################ 2 | # start make bn layer testable # 3 | ################################ 4 | cd 003_make_bn_layer_testable 5 | 6 | ## create symlinks 7 | 8 | # image data 9 | ln -s ../../data/VOC2012 10 | # training / validation imageset 11 | ln -s ../../data/imagesets/stage_2_train_imgset 12 | # caffe 13 | ln -s ../../caffe 14 | # pre-trained caffe model (trained model with stage 2 training) 15 | ln -s ../../model 16 | ln -s ../../model/DeconvNet 17 | 18 | ## create directories 19 | mkdir testing_log 20 | 21 | ## start make bn layer testable 22 | python BN_make_INFERENCE_script.py 23 | 24 | ## run test to check 25 | ./start_test.sh 26 | 27 | ## copy and rename trained model 28 | cp ./DeconvNet_trainval_inference.caffemodel ./DeconvNet/DeconvNet_trainval_inference.caffemodel 29 | 30 | --------------------------------------------------------------------------------
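Note on the step above: the wrapper script runs BN_make_INFERENCE_script.py, whose source is not shown in this section. DeconvNet_make_bn_layer_testable.prototxt exposes a `-mean` and a `-var` top for every BN layer, while the test prototxt sets `bn_mode: INFERENCE`, which suggests that per-batch statistics are collected and folded into fixed values. The following is only a minimal sketch of that kind of aggregation, assuming equal batch sizes and population (biased) variances; `pooled_stats` is a hypothetical helper and not a function from this repository or from Caffe.

```python
def pooled_stats(batch_means, batch_vars):
    """Fold per-batch means and variances into one fixed mean and variance.

    Assumption (not taken from the repository): every batch has the same
    size and each variance is the population variance of its batch.
    """
    n = len(batch_means)
    if n == 0 or n != len(batch_vars):
        raise ValueError("need the same non-zero number of means and variances")
    # With equal batch sizes, the overall mean is the mean of the batch means.
    mean = sum(batch_means) / n
    # E[x^2] over all data is the average of (var_i + mean_i^2) over batches.
    second_moment = sum(v + m * m for m, v in zip(batch_means, batch_vars)) / n
    # Var[x] = E[x^2] - E[x]^2; clamp tiny negative values caused by rounding.
    var = max(second_moment - mean * mean, 0.0)
    return mean, var


if __name__ == "__main__":
    # Two toy batches for one channel: [1, 3] and [5, 7].
    mean, var = pooled_stats([2.0, 6.0], [1.0, 1.0])
    print(mean, var)
```

For the toy batches, the result matches the mean and population variance of the combined values [1, 3, 5, 7]. A real script would apply this per channel to the `-mean` and `-var` blobs of each BN layer.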