├── 000084.jpg ├── LICENSE ├── README.md ├── cached_pyra.mat ├── data └── caffe_nets │ └── .dummy ├── deep_pyramid.m ├── deep_pyramid_add_padding.m ├── deep_pyramid_cache_wrapper.m ├── demo_deep_pyramid.m ├── green_colormap.mat ├── im_to_pyra_coords.m ├── init_cnn_model.m ├── model-defs ├── pyramid_cnn_output_conv5_scales_7_plane_1713.prototxt └── pyramid_cnn_output_max5_scales_7_plane_1713.prototxt └── pyra_to_im_coords.m /000084.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/rbgirshick/DeepPyramid/adb87bbb96fcaef87420028750cf2abb9b3d929c/000084.jpg -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | Copyright (c) 2014, The Regents of the University of California (Regents) 2 | All rights reserved. 3 | 4 | Redistribution and use in source and binary forms, with or without 5 | modification, are permitted provided that the following conditions are met: 6 | 7 | 1. Redistributions of source code must retain the above copyright notice, this 8 | list of conditions and the following disclaimer. 9 | 2. Redistributions in binary form must reproduce the above copyright notice, 10 | this list of conditions and the following disclaimer in the documentation 11 | and/or other materials provided with the distribution. 12 | 13 | THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND 14 | ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED 15 | WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE 16 | DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR 17 | ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES 18 | (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; 19 | LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND 20 | ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT 21 | (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS 22 | SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 23 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | ## DeepPyramid 2 | 3 | DeepPyramid is a simple toolkit for building feature pyramids from deep convolutional networks. 4 | The DeepPyramid data structure is nearly identical to the HOG feature pyramid created by the featpyramid.m function in the [voc-dpm](https://github.com/rbgirshick/voc-dpm) code. 5 | 6 | ### References 7 | 8 | This code was used in our [tech report](http://arxiv.org/pdf/1409.5403v2.pdf) about the relationship between deformable part models and convolutional networks. 9 | 10 | @article{girshick14dpdpm, 11 | author = {Ross Girshick and Forrest Iandola and Trevor Darrell and Jitendra Malik}, 12 | title = {Deformable Part Models are Convolutional Neural Networks}, 13 | journal = {CoRR}, 14 | year = {2014}, 15 | volume = {abs/1409.5403}, 16 | url = {http://arxiv.org/abs/1409.5403}, 17 | year = {2014} 18 | } 19 | 20 | 21 | ### Installation 22 | 23 | 0. **Prerequisites** 24 | 0. MATLAB (tested with 2014a on 64-bit Linux) 25 | 0. Caffe's [prerequisites](http://caffe.berkeleyvision.org/installation.html#prequequisites) 26 | 0. **Install Caffe** (this is the most complicated part) 27 | 0. Follow the [Caffe installation instructions](http://caffe.berkeleyvision.org/installation.html) 28 | 0. 
Let's call the place where you installed Caffe `$CAFFE_ROOT` (you can run `export CAFFE_ROOT=$(pwd)` from inside that directory) 29 | 0. **Important:** Make sure to compile the Caffe MATLAB wrapper, which is not built by default: `make matcaffe` 30 | 0. **Important:** Make sure to run `cd $CAFFE_ROOT/data/ilsvrc12 && ./get_ilsvrc_aux.sh` to download the ImageNet image mean 31 | 0. DeepPyramid has been tested with Caffe's master and dev branches at the time of this writing 32 | 0. **Get DeepPyramid** 33 | 0. `git clone https://github.com/rbgirshick/DeepPyramid.git` 34 | 0. If you haven't installed R-CNN, you'll need to download its [models](https://dl.dropboxusercontent.com/s/og7ghmiken2olzh/r-cnn-release1-data.tgz?dl=0) 35 | 1. Copy R-CNN's non-finetuned ImageNet network from `data/caffe_nets/ilsvrc_2012_train_iter_310k` under the R-CNN directory to `data/caffe_nets/ilsvrc_2012_train_iter_310k` under the DeepPyramid directory (or just create a symlink). 36 | 37 | ### Usage 38 | 39 | 1. Run MATLAB from inside the DeepPyramid code directory 40 | 2. Add the `matcaffe` mex function to your path (`addpath /path/to/caffe/matlab/caffe`) 41 | 3. Run the demo `demo_deep_pyramid` 42 | 43 | ### Uses 44 | 45 | DeepPyramid can be used for implementing DPMs on deep convolutional network features, rather than HOG features. It can also be used whenever you need a dense multiscale pyramid of image features. 46 | 47 | ### Caveats 48 | 49 | The implementation is designed to be simple and, as a result, is very inefficient. There are several ways to speed it up, and they are planned for the future. For now, it takes about 0.5 to 0.6 seconds to compute a feature pyramid on an NVIDIA Titan GPU, which is acceptable. 
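### Pyramid geometry (sketch)

To give a feel for the "dense multiscale pyramid" mentioned under Uses, the following is a minimal sketch of the scale arithmetic that `deep_pyramid.m` performs, written so that it runs without Caffe. The 375 x 500 image size is a made-up example, and the variable names `plane`, `base_scale`, and `level_px` are illustrative only. The remaining constants are the defaults set in `init_cnn_model.m`. The per-level sizes are approximate, because the real code takes them from the output of `imresize`, whose rounding may differ by a pixel.

```matlab
% Sketch only: mirrors the scale arithmetic in deep_pyramid.m without Caffe.
im_sz      = [375 500];   % hypothetical image size, [height width]
plane      = 1713;        % cnn.pyra.dimx == cnn.pyra.dimy
stride     = 16;          % cnn.pyra.stride
num_levels = 7;           % cnn.pyra.num_levels
alpha      = sqrt(2);     % cnn.pyra.scale_factor

% Level 1 scales the longer image side to fill the input plane;
% each following level is smaller by a factor of alpha.
base_scale = plane / max(im_sz);
scales = base_scale * (alpha .^ (-(0:num_levels-1)))';

% Approximate resized image size per level (assumption: plain rounding,
% whereas deep_pyramid.m uses the size that imresize actually returns).
level_px = round(bsxfun(@times, scales, im_sz));

% Feature-map size per level, same formula as in deep_pyramid.m.
level_sizes = ceil((level_px - 1) / stride + 1);

for i = 1:num_levels
  fprintf('level %d: scale %.3fx, about %d x %d cells\n', ...
          i, scales(i), level_sizes(i, 1), level_sizes(i, 2));
end
```

With the default scale factor of `sqrt(2)`, every second level halves the resolution, so seven levels span three octaves.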
50 | -------------------------------------------------------------------------------- /cached_pyra.mat: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/rbgirshick/DeepPyramid/adb87bbb96fcaef87420028750cf2abb9b3d929c/cached_pyra.mat -------------------------------------------------------------------------------- /data/caffe_nets/.dummy: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/rbgirshick/DeepPyramid/adb87bbb96fcaef87420028750cf2abb9b3d929c/data/caffe_nets/.dummy -------------------------------------------------------------------------------- /deep_pyramid.m: -------------------------------------------------------------------------------- 1 | function [pyra, im_pyra] = deep_pyramid(im, cnn_model) 2 | % [pyra, im_pyra] = deep_pyramid(im, cnn_model) 3 | % 4 | % im: a color image 5 | % cnn_model: a handle to the model loaded into caffe (see init_cnn_model.m) 6 | 7 | % AUTORIGHTS 8 | % --------------------------------------------------------- 9 | % Copyright (c) 2014, Ross Girshick 10 | % 11 | % This file is part of the DeepPyramid code and is available 12 | % under the terms of the Simplified BSD License provided in 13 | % LICENSE. Please retain this notice and LICENSE if you use 14 | % this file (or any portion of it) in your project. 
15 | % --------------------------------------------------------- 16 | 17 | [pyra, im_pyra] = feat_pyramid(im, cnn_model); 18 | 19 | imsize = size(im); 20 | pyra.imsize = imsize(1:2); 21 | pyra.num_levels = cnn_model.pyra.num_levels; 22 | pyra.stride = cnn_model.pyra.stride; 23 | 24 | pyra.valid_levels = true(pyra.num_levels, 1); 25 | pyra.padx = 0; 26 | pyra.pady = 0; 27 | 28 | 29 | % ------------------------------------------------------------------------ 30 | function [pyra, im_pyra] = feat_pyramid(im, cnn_model) 31 | % ------------------------------------------------------------------------ 32 | % Compute a feature pyramid with caffe 33 | 34 | if cnn_model.init_key ~= caffe('get_init_key') 35 | error('You probably need to load the cnn model into caffe.'); 36 | end 37 | 38 | % we're assuming the input height and width are the same 39 | assert(cnn_model.pyra.dimx == cnn_model.pyra.dimy); 40 | % compute output width and height 41 | sz_w = (cnn_model.pyra.dimx - 1) / cnn_model.pyra.stride + 1; 42 | sz_h = sz_w; 43 | sz_l = cnn_model.pyra.num_levels; 44 | sz_c = cnn_model.pyra.num_channels; 45 | 46 | %th = tic; 47 | [batch, scales, level_sizes] = image_pyramid(im, cnn_model); 48 | %fprintf('prep: %.3fs\n', toc(th)); 49 | 50 | %th = tic; 51 | feat = caffe('forward', {batch}); 52 | % standard song and dance of swapping width and height between caffe and matlab 53 | feat = permute(reshape(feat{1}, [sz_w sz_h sz_c sz_l]), [2 1 3 4]); 54 | %fprintf('fwd: %.3fs\n', toc(th)); 55 | 56 | pyra.feat = feat; 57 | pyra.scales = scales; 58 | pyra.level_sizes = level_sizes; 59 | 60 | im_pyra = batch; 61 | 62 | % ------------------------------------------------------------------------ 63 | function [batch, scales, level_sizes] = image_pyramid(im, cnn_model) 64 | % ------------------------------------------------------------------------ 65 | % Construct an image pyramid that's ready for feeding directly into caffe 66 | % forward 67 | 68 | batch_width = cnn_model.pyra.dimx; 69 | 
batch_height = cnn_model.pyra.dimy; 70 | num_levels = cnn_model.pyra.num_levels; 71 | stride = cnn_model.pyra.stride; 72 | 73 | im = single(im); 74 | % Convert to BGR 75 | im = im(:,:,[3 2 1]); 76 | % Subtract mean (mean of the image mean--one mean per channel) 77 | im = bsxfun(@minus, im, cnn_model.mu); 78 | 79 | % scale = base scale relative to input image 80 | im_sz = size(im); 81 | if im_sz(1) > im_sz(2) 82 | height = batch_height; 83 | width = NaN; 84 | scale = height / im_sz(1); 85 | else 86 | height = NaN; 87 | width = batch_width; 88 | scale = width / im_sz(2); 89 | end 90 | im_orig = im; 91 | 92 | batch = zeros(batch_width, batch_height, 3, num_levels, 'single'); 93 | alpha = cnn_model.pyra.scale_factor; 94 | scales = scale*(alpha.^-(0:num_levels-1))'; 95 | level_sizes = zeros(num_levels, 2); 96 | for i = 0:num_levels-1 97 | if i == 0 98 | im = imresize(im_orig, [height width], 'bilinear'); 99 | else 100 | im = imresize(im_orig, scales(i+1), 'bilinear'); 101 | end 102 | im_sz = size(im); 103 | im_sz = im_sz(1:2); 104 | level_sizes(i+1, :) = ceil((im_sz - 1) / stride + 1); 105 | % Make width the fastest dimension (for caffe) 106 | im = permute(im, [2 1 3]); 107 | batch(1:im_sz(2), 1:im_sz(1), :, i+1) = im; 108 | end 109 | -------------------------------------------------------------------------------- /deep_pyramid_add_padding.m: -------------------------------------------------------------------------------- 1 | function pyra = deep_pyramid_add_padding(pyra, padx, pady, uniform_output) 2 | % pyra = deep_pyramid_add_padding(pyra, padx, pady, uniform_output) 3 | % 4 | % pyra: a pyramid computed using deep_pyramid.m 5 | % padx, pady: amount of extra padding to add around each pyramid level 6 | % uniform_output: true => each pyramid level is embedded in the upper-left 7 | % corner of each 3D plane inside a 4D array; 8 | % false => each pyramid level is stored as a 3D array with the height 9 | % and width of that level (same as in the DPM HOG feature pyramid 
code) 10 | 11 | % AUTORIGHTS 12 | % --------------------------------------------------------- 13 | % Copyright (c) 2014, Ross Girshick 14 | % 15 | % This file is part of the DeepPyramid code and is available 16 | % under the terms of the Simplified BSD License provided in 17 | % LICENSE. Please retain this notice and LICENSE if you use 18 | % this file (or any portion of it) in your project. 19 | % --------------------------------------------------------- 20 | 21 | pyra.padx = padx; 22 | pyra.pady = pady; 23 | 24 | if ~exist('uniform_output', 'var') || isempty(uniform_output) 25 | uniform_output = false; 26 | end 27 | 28 | if uniform_output 29 | pyra.feat = padarray(pyra.feat, [pady padx 0 0], 0); 30 | else 31 | % This mimics the output of the DPM featpyramid.m code 32 | feat = pyra.feat; 33 | pyra.feat = cell(pyra.num_levels, 1); 34 | for i = 1:pyra.num_levels 35 | % crop out feat map levels 36 | pyra.feat{i} = feat(1:pyra.level_sizes(i, 1), ... 37 | 1:pyra.level_sizes(i, 2), :, i); 38 | % add padding 39 | pyra.feat{i} = padarray(pyra.feat{i}, [pady padx 0], 0); 40 | end 41 | end 42 | 43 | -------------------------------------------------------------------------------- /deep_pyramid_cache_wrapper.m: -------------------------------------------------------------------------------- 1 | function pyra = deep_pyramid_cache_wrapper(im, cnn_model, cache_opts) 2 | % pyra = deep_pyramid_cache_wrapper(im, cnn_model, cache_opts) 3 | % 4 | % Load a feature pyramid from a cache on disk, optionally saving if 5 | % there's a cache miss. 6 | % 7 | % cache_opts.cache_file 8 | % .debug 9 | % .write_on_miss 10 | % .exists 11 | 12 | % AUTORIGHTS 13 | % --------------------------------------------------------- 14 | % Copyright (c) 2014, Ross Girshick 15 | % 16 | % This file is part of the DeepPyramid code and is available 17 | % under the terms of the Simplified BSD License provided in 18 | % LICENSE. 
Please retain this notice and LICENSE if you use 19 | % this file (or any portion of it) in your project. 20 | % --------------------------------------------------------- 21 | 22 | debug = isfield(cache_opts, 'debug') && cache_opts.debug; 23 | write_on_miss = isfield(cache_opts, 'write_on_miss') && ... 24 | cache_opts.write_on_miss; 25 | 26 | cache_file = cache_opts.cache_file; 27 | if exist(cache_file, 'file') 28 | ld = load(cache_file); 29 | pyra = ld.pyra; 30 | if debug 31 | warning('Loaded cache file %s.', cache_file); 32 | end 33 | else 34 | if debug 35 | warning('Cache file %s not found.', cache_file); 36 | end 37 | if write_on_miss 38 | pyra = deep_pyramid(im, cnn_model); 39 | save(cache_opts.cache_file, 'pyra'); 40 | if debug 41 | warning('Cache file %s saved.', cache_file); 42 | end 43 | end 44 | end 45 | -------------------------------------------------------------------------------- /demo_deep_pyramid.m: -------------------------------------------------------------------------------- 1 | function pyra = demo_deep_pyramid(im) 2 | % Demonstrate basic usage and visualize features. 3 | % 4 | % 5 | % AUTORIGHTS 6 | % --------------------------------------------------------- 7 | % Copyright (c) 2014, Ross Girshick 8 | % 9 | % This file is part of the DeepPyramid code and is available 10 | % under the terms of the Simplified BSD License provided in 11 | % LICENSE. Please retain this notice and LICENSE if you use 12 | % this file (or any portion of it) in your project. 13 | % --------------------------------------------------------- 14 | 15 | if exist('caffe') ~= 3 16 | error('You must add matcaffe to your path.'); 17 | end 18 | 19 | if ~exist('data/caffe_nets/ilsvrc_2012_train_iter_310k') 20 | error(['You need the CNN model in %s. ' ... 21 | 'You can get this model by following ' ... 22 | 'the R-CNN installation instructions.'], ... 
23 | 'data/caffe_nets/ilsvrc_2012_train_iter_310k'); 24 | end 25 | 26 | if 1 27 | % real use settings (compute features using the GPU) 28 | USE_GPU = true; 29 | USE_CACHE = false; 30 | USE_CAFFE = true; 31 | else 32 | % fast demo settings 33 | USE_GPU = false; 34 | USE_CACHE = true; 35 | USE_CAFFE = false; 36 | end 37 | 38 | if ~exist('im', 'var') || isempty(im) 39 | im = imread('000084.jpg'); 40 | end 41 | bbox = [263 145 381 225]; 42 | 43 | cnn = init_cnn_model('use_gpu', USE_GPU, 'use_caffe', USE_CAFFE); 44 | 45 | if USE_CACHE 46 | cache_opts.cache_file = './cached_pyra.mat'; 47 | cache_opts.debug = true; 48 | cache_opts.write_on_miss = true; 49 | th = tic; 50 | pyra = deep_pyramid_cache_wrapper(im, cnn, cache_opts); 51 | fprintf('deep_pyramid_cache_wrapper took %.3fs\n', toc(th)); 52 | else 53 | th = tic; 54 | pyra = deep_pyramid(im, cnn); 55 | fprintf('deep_pyramid took %.3fs\n', toc(th)); 56 | end 57 | padx = 0; 58 | pady = 0; 59 | 60 | pyra = deep_pyramid_add_padding(pyra, padx, pady); 61 | 62 | fprintf(['Press almost any key (with fig focused) to loop through ' ... 
63 | 'feature channels (or esc to exit).\n']); 64 | for channel = 1:256 65 | vis_pyramid(im, pyra, bbox, channel); 66 | [~, ~, key_code] = ginput(1); 67 | if key_code == 27 68 | break; 69 | end 70 | end 71 | 72 | 73 | % ------------------------------------------------------------------------ 74 | function vis_pyramid(im, pyra, bbox, channel) 75 | % ------------------------------------------------------------------------ 76 | 77 | pyra_boxes = im_to_pyra_coords(pyra, bbox); 78 | 79 | clf; 80 | 81 | rows = 2; 82 | cols = 4; 83 | subplot(rows, cols, 1); 84 | imagesc(im); 85 | axis image; 86 | rectangle('Position', bbox_to_xywh(bbox), 'EdgeColor', 'g'); 87 | title(sprintf('input image feature %d', channel)); 88 | 89 | max_val = 0; 90 | for level = 1:pyra.num_levels 91 | f = pyra.feat{level}(:,:,channel); 92 | max_val = max(max_val, max(f(:))); 93 | end 94 | 95 | ld = load('green_colormap'); 96 | colormap(ld.map); clear ld; 97 | 98 | for level = 1:pyra.num_levels 99 | subplot(rows, cols, level+1); 100 | imagesc(pyra.feat{level}(:,:,channel), [0 max_val]); 101 | axis image; 102 | rectangle('Position', bbox_to_xywh(pyra_boxes{level}), 'EdgeColor', 'r'); 103 | title(sprintf('level %d; scale = %.2fx', level, pyra.scales(level))); 104 | 105 | % project pyramid box back to image and display as red 106 | im_bbox = pyra_to_im_coords(pyra, [pyra_boxes{level} level]); 107 | subplot(rows, cols, 1); 108 | rectangle('Position', bbox_to_xywh(im_bbox), 'EdgeColor', 'r'); 109 | %text(im_bbox(1), im_bbox(2), sprintf('%d', level)); 110 | end 111 | 112 | function xywh = bbox_to_xywh(bbox) 113 | xywh = [bbox(1) bbox(2) bbox(3)-bbox(1)+1 bbox(4)-bbox(2)+1]; 114 | -------------------------------------------------------------------------------- /green_colormap.mat: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/rbgirshick/DeepPyramid/adb87bbb96fcaef87420028750cf2abb9b3d929c/green_colormap.mat 
-------------------------------------------------------------------------------- /im_to_pyra_coords.m: -------------------------------------------------------------------------------- 1 | function pyra_boxes = im_to_pyra_coords(pyra, boxes) 2 | % boxes is N x 4 where each row is a box in the image specified 3 | % by [x1 y1 x2 y2]. 4 | % 5 | % Output is a cell array where cell i holds the image boxes 6 | % mapped into pyramid level i (one N x 4 array per level) 7 | 8 | % AUTORIGHTS 9 | % --------------------------------------------------------- 10 | % Copyright (c) 2014, Ross Girshick 11 | % 12 | % This file is part of the DeepPyramid code and is available 13 | % under the terms of the Simplified BSD License provided in 14 | % LICENSE. Please retain this notice and LICENSE if you use 15 | % this file (or any portion of it) in your project. 16 | % --------------------------------------------------------- 17 | 18 | boxes = boxes - 1; 19 | for level = 1:pyra.num_levels 20 | level_boxes = bsxfun(@times, boxes, pyra.scales(level)); 21 | level_boxes = round(level_boxes / pyra.stride); 22 | level_boxes = level_boxes + 1; 23 | level_boxes(:, [1 3]) = level_boxes(:, [1 3]) + pyra.padx; 24 | level_boxes(:, [2 4]) = level_boxes(:, [2 4]) + pyra.pady; 25 | pyra_boxes{level} = level_boxes; 26 | end 27 | -------------------------------------------------------------------------------- /init_cnn_model.m: -------------------------------------------------------------------------------- 1 | function cnn = init_cnn_model(varargin) 2 | % cnn = init_cnn_model 3 | % Initialize a CNN with caffe 4 | % 5 | % Optional arguments 6 | % net_file network binary file 7 | % def_file network prototxt file 8 | % use_gpu set to false to use CPU (default: true) 9 | % use_caffe set to false to avoid using caffe (default: true) 10 | % useful for running on the cluster (must use cached pyramids!) 
11 | 12 | % AUTORIGHTS 13 | % --------------------------------------------------------- 14 | % Copyright (c) 2014, Ross Girshick 15 | % 16 | % This file is part of the DeepPyramid code and is available 17 | % under the terms of the Simplified BSD License provided in 18 | % LICENSE. Please retain this notice and LICENSE if you use 19 | % this file (or any portion of it) in your project. 20 | % --------------------------------------------------------- 21 | 22 | % ------------------------------------------------------------------------ 23 | % Options 24 | ip = inputParser; 25 | 26 | % network binary file 27 | ip.addParamValue('net_file', ... 28 | './data/caffe_nets/ilsvrc_2012_train_iter_310k', ... 29 | @isstr); 30 | 31 | % network prototxt file 32 | ip.addParamValue('def_file', ... 33 | './model-defs/pyramid_cnn_output_conv5_scales_7_plane_1713.prototxt', ... 34 | @isstr); 35 | 36 | % Set use_gpu to false to use the CPU 37 | ip.addParamValue('use_gpu', true, @islogical); 38 | 39 | % Set use_caffe to false to avoid using caffe 40 | % (must be used in conjunction with cached features!) 41 | ip.addParamValue('use_caffe', true, @islogical); 42 | 43 | ip.parse(varargin{:}); 44 | opts = ip.Results; 45 | % ------------------------------------------------------------------------ 46 | 47 | cnn.binary_file = opts.net_file; 48 | cnn.definition_file = opts.def_file; 49 | cnn.init_key = -1; 50 | cnn.mu = get_channelwise_mean; 51 | cnn.pyra.dimx = 1713; 52 | cnn.pyra.dimy = 1713; 53 | cnn.pyra.stride = 16; 54 | cnn.pyra.num_levels = 7; 55 | cnn.pyra.num_channels = 256; 56 | cnn.pyra.scale_factor = sqrt(2); 57 | 58 | if opts.use_caffe 59 | cnn.init_key = ... 
60 | caffe('init', cnn.definition_file, cnn.binary_file); 61 | caffe('set_phase_test'); 62 | if opts.use_gpu 63 | caffe('set_mode_gpu'); 64 | else 65 | caffe('set_mode_cpu'); 66 | end 67 | end 68 | 69 | 70 | % ------------------------------------------------------------------------ 71 | function mu = get_channelwise_mean() 72 | % ------------------------------------------------------------------------ 73 | % load the ilsvrc image mean 74 | data_mean_file = 'ilsvrc_2012_mean.mat'; 75 | assert(exist(data_mean_file, 'file') ~= 0); 76 | % cropping the mean to the input size is likely unnecessary, but we do it 77 | % to be consistent with previous experiments 78 | ld = load(data_mean_file); 79 | mu = ld.image_mean; clear ld; 80 | input_size = 227; 81 | off = floor((size(mu,1) - input_size)/2)+1; 82 | mu = mu(off:off+input_size-1, off:off+input_size-1, :); 83 | mu = sum(sum(mu, 1), 2) / size(mu, 1) / size(mu, 2); 84 | -------------------------------------------------------------------------------- /model-defs/pyramid_cnn_output_conv5_scales_7_plane_1713.prototxt: -------------------------------------------------------------------------------- 1 | input: "data" 2 | input_dim: 7 3 | input_dim: 3 4 | input_dim: 1713 5 | input_dim: 1713 6 | layers { 7 | bottom: "data" 8 | top: "conv1" 9 | name: "conv1" 10 | type: CONVOLUTION 11 | convolution_param { 12 | num_output: 96 13 | pad: 5 14 | kernel_size: 11 15 | stride: 4 16 | } 17 | } 18 | layers { 19 | bottom: "conv1" 20 | top: "conv1" 21 | name: "relu1" 22 | type: RELU 23 | } 24 | layers { 25 | bottom: "conv1" 26 | top: "pool1" 27 | name: "pool1" 28 | type: POOLING 29 | pooling_param { 30 | pool: MAX 31 | kernel_size: 3 32 | stride: 2 33 | pad: 1 34 | } 35 | } 36 | layers { 37 | bottom: "pool1" 38 | top: "norm1" 39 | name: "norm1" 40 | type: LRN 41 | lrn_param { 42 | local_size: 5 43 | alpha: 0.0001 44 | beta: 0.75 45 | } 46 | } 47 | layers { 48 | bottom: "norm1" 49 | top: "conv2" 50 | name: "conv2" 51 | type: CONVOLUTION 52 | 
convolution_param { 53 | num_output: 256 54 | pad: 2 55 | kernel_size: 5 56 | group: 2 57 | } 58 | } 59 | layers { 60 | bottom: "conv2" 61 | top: "conv2" 62 | name: "relu2" 63 | type: RELU 64 | } 65 | layers { 66 | bottom: "conv2" 67 | top: "pool2" 68 | name: "pool2" 69 | type: POOLING 70 | pooling_param { 71 | pool: MAX 72 | kernel_size: 3 73 | stride: 2 74 | pad: 1 75 | } 76 | } 77 | layers { 78 | bottom: "pool2" 79 | top: "norm2" 80 | name: "norm2" 81 | type: LRN 82 | lrn_param { 83 | local_size: 5 84 | alpha: 0.0001 85 | beta: 0.75 86 | } 87 | } 88 | layers { 89 | bottom: "norm2" 90 | top: "conv3" 91 | name: "conv3" 92 | type: CONVOLUTION 93 | convolution_param { 94 | num_output: 384 95 | pad: 1 96 | kernel_size: 3 97 | } 98 | } 99 | layers { 100 | bottom: "conv3" 101 | top: "conv3" 102 | name: "relu3" 103 | type: RELU 104 | } 105 | layers { 106 | bottom: "conv3" 107 | top: "conv4" 108 | name: "conv4" 109 | type: CONVOLUTION 110 | convolution_param { 111 | num_output: 384 112 | pad: 1 113 | kernel_size: 3 114 | group: 2 115 | } 116 | } 117 | layers { 118 | bottom: "conv4" 119 | top: "conv4" 120 | name: "relu4" 121 | type: RELU 122 | } 123 | layers { 124 | bottom: "conv4" 125 | top: "conv5" 126 | name: "conv5" 127 | type: CONVOLUTION 128 | convolution_param { 129 | num_output: 256 130 | pad: 1 131 | kernel_size: 3 132 | group: 2 133 | } 134 | } 135 | layers { 136 | bottom: "conv5" 137 | top: "conv5" 138 | name: "relu5" 139 | type: RELU 140 | } 141 | -------------------------------------------------------------------------------- /model-defs/pyramid_cnn_output_max5_scales_7_plane_1713.prototxt: -------------------------------------------------------------------------------- 1 | input: "data" 2 | input_dim: 7 3 | input_dim: 3 4 | input_dim: 1713 5 | input_dim: 1713 6 | layers { 7 | bottom: "data" 8 | top: "conv1" 9 | name: "conv1" 10 | type: CONVOLUTION 11 | convolution_param { 12 | num_output: 96 13 | pad: 5 14 | kernel_size: 11 15 | stride: 4 16 | } 17 | } 18 | 
layers { 19 | bottom: "conv1" 20 | top: "conv1" 21 | name: "relu1" 22 | type: RELU 23 | } 24 | layers { 25 | bottom: "conv1" 26 | top: "pool1" 27 | name: "pool1" 28 | type: POOLING 29 | pooling_param { 30 | pool: MAX 31 | kernel_size: 3 32 | stride: 2 33 | pad: 1 34 | } 35 | } 36 | layers { 37 | bottom: "pool1" 38 | top: "norm1" 39 | name: "norm1" 40 | type: LRN 41 | lrn_param { 42 | local_size: 5 43 | alpha: 0.0001 44 | beta: 0.75 45 | } 46 | } 47 | layers { 48 | bottom: "norm1" 49 | top: "conv2" 50 | name: "conv2" 51 | type: CONVOLUTION 52 | convolution_param { 53 | num_output: 256 54 | pad: 2 55 | kernel_size: 5 56 | group: 2 57 | } 58 | } 59 | layers { 60 | bottom: "conv2" 61 | top: "conv2" 62 | name: "relu2" 63 | type: RELU 64 | } 65 | layers { 66 | bottom: "conv2" 67 | top: "pool2" 68 | name: "pool2" 69 | type: POOLING 70 | pooling_param { 71 | pool: MAX 72 | kernel_size: 3 73 | stride: 2 74 | pad: 1 75 | } 76 | } 77 | layers { 78 | bottom: "pool2" 79 | top: "norm2" 80 | name: "norm2" 81 | type: LRN 82 | lrn_param { 83 | local_size: 5 84 | alpha: 0.0001 85 | beta: 0.75 86 | } 87 | } 88 | layers { 89 | bottom: "norm2" 90 | top: "conv3" 91 | name: "conv3" 92 | type: CONVOLUTION 93 | convolution_param { 94 | num_output: 384 95 | pad: 1 96 | kernel_size: 3 97 | } 98 | } 99 | layers { 100 | bottom: "conv3" 101 | top: "conv3" 102 | name: "relu3" 103 | type: RELU 104 | } 105 | layers { 106 | bottom: "conv3" 107 | top: "conv4" 108 | name: "conv4" 109 | type: CONVOLUTION 110 | convolution_param { 111 | num_output: 384 112 | pad: 1 113 | kernel_size: 3 114 | group: 2 115 | } 116 | } 117 | layers { 118 | bottom: "conv4" 119 | top: "conv4" 120 | name: "relu4" 121 | type: RELU 122 | } 123 | layers { 124 | bottom: "conv4" 125 | top: "conv5" 126 | name: "conv5" 127 | type: CONVOLUTION 128 | convolution_param { 129 | num_output: 256 130 | pad: 1 131 | kernel_size: 3 132 | group: 2 133 | } 134 | } 135 | layers { 136 | bottom: "conv5" 137 | top: "conv5" 138 | name: "relu5" 139 
| type: RELU 140 | } 141 | layers { 142 | bottom: "conv5" 143 | top: "max5" 144 | name: "max5" 145 | type: POOLING 146 | pooling_param { 147 | pool: MAX 148 | kernel_size: 3 149 | stride: 1 150 | pad: 1 151 | } 152 | } 153 | -------------------------------------------------------------------------------- /pyra_to_im_coords.m: -------------------------------------------------------------------------------- 1 | function im_boxes = pyra_to_im_coords(pyra, boxes) 2 | % boxes is N x 5 where each row is a box in the format [x1 y1 x2 y2 pyra_level] 3 | % where (x1, y1) is the upper-left corner of the box in pyramid level pyra_level 4 | % and (x2, y2) is the lower-right corner of the box in pyramid level pyra_level 5 | % Assumes 1-based indexing. 6 | 7 | % AUTORIGHTS 8 | % --------------------------------------------------------- 9 | % Copyright (c) 2014, Ross Girshick 10 | % 11 | % This file is part of the DeepPyramid code and is available 12 | % under the terms of the Simplified BSD License provided in 13 | % LICENSE. Please retain this notice and LICENSE if you use 14 | % this file (or any portion of it) in your project. 15 | % --------------------------------------------------------- 16 | 17 | % pyramid to im scale factors for each scale 18 | scales = pyra.stride ./ pyra.scales; 19 | % pyramid to im scale factors for each pyra level in boxes 20 | scales = scales(boxes(:, end)); 21 | 22 | % Remove padding from pyramid boxes 23 | boxes(:, [1 3]) = boxes(:, [1 3]) - pyra.padx; 24 | boxes(:, [2 4]) = boxes(:, [2 4]) - pyra.pady; 25 | 26 | im_boxes = bsxfun(@times, (boxes(:, 1:4) - 1), scales) + 1; 27 | --------------------------------------------------------------------------------