├── 000084.jpg
├── LICENSE
├── README.md
├── cached_pyra.mat
├── data
│   └── caffe_nets
│       └── .dummy
├── deep_pyramid.m
├── deep_pyramid_add_padding.m
├── deep_pyramid_cache_wrapper.m
├── demo_deep_pyramid.m
├── green_colormap.mat
├── im_to_pyra_coords.m
├── init_cnn_model.m
├── model-defs
│   ├── pyramid_cnn_output_conv5_scales_7_plane_1713.prototxt
│   └── pyramid_cnn_output_max5_scales_7_plane_1713.prototxt
└── pyra_to_im_coords.m


--------------------------------------------------------------------------------
/000084.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/rbgirshick/DeepPyramid/adb87bbb96fcaef87420028750cf2abb9b3d929c/000084.jpg


--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
 1 | Copyright (c) 2014, The Regents of the University of California (Regents)
 2 | All rights reserved.
 3 | 
 4 | Redistribution and use in source and binary forms, with or without
 5 | modification, are permitted provided that the following conditions are met:
 6 | 
 7 | 1. Redistributions of source code must retain the above copyright notice, this
 8 |    list of conditions and the following disclaimer.
 9 | 2. Redistributions in binary form must reproduce the above copyright notice,
10 |    this list of conditions and the following disclaimer in the documentation
11 |    and/or other materials provided with the distribution.
12 | 
13 | THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND
14 | ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
15 | WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
16 | DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR
17 | ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
18 | (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
19 | LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND
20 | ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
21 | (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
22 | SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
23 | 


--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
 1 | ## DeepPyramid
 2 | 
 3 | DeepPyramid is a simple toolkit for building feature pyramids from deep convolutional networks.
 4 | The DeepPyramid data structure is nearly identical to the HOG feature pyramid created by the featpyramid.m function in the [voc-dpm](https://github.com/rbgirshick/voc-dpm) code.
 5 | 
 6 | ### References
 7 | 
 8 | This code was used in our [tech report](http://arxiv.org/pdf/1409.5403v2.pdf) about the relationship between deformable part models and convolutional networks.
 9 | 
10 |     @article{girshick14dpdpm,
11 |         author    = {Ross Girshick and Forrest Iandola and Trevor Darrell and Jitendra Malik},
12 |         title     = {Deformable Part Models are Convolutional Neural Networks},
13 |         journal   = {CoRR},
14 |         year      = {2014},
15 |         volume    = {abs/1409.5403},
 16 |         url       = {http://arxiv.org/abs/1409.5403}
18 |     }
19 | 
20 | 
21 | ### Installation
22 | 
23 | 0. **Prerequisites** 
24 |   0. MATLAB (tested with 2014a on 64-bit Linux)
25 |   0. Caffe's [prerequisites](http://caffe.berkeleyvision.org/installation.html#prequequisites)
26 | 0. **Install Caffe** (this is the most complicated part)
27 |   0. Follow the [Caffe installation instructions](http://caffe.berkeleyvision.org/installation.html)
28 |   0. Let's call the place where you installed caffe `$CAFFE_ROOT` (you can run `export CAFFE_ROOT=$(pwd)`)
29 |   0. **Important:** Make sure to compile the Caffe MATLAB wrapper, which is not built by default: `make matcaffe`
30 |   0. **Important:** Make sure to run `cd $CAFFE_ROOT/data/ilsvrc12 && ./get_ilsvrc_aux.sh` to download the ImageNet image mean
31 |   0. DeepPyramid has been tested with master and dev at the time of this writing
32 | 0. **Get DeepPyramid**
33 |   0. `git clone https://github.com/rbgirshick/DeepPyramid.git`
34 |   0. If you haven't installed R-CNN, you'll need to download its [models](https://dl.dropboxusercontent.com/s/og7ghmiken2olzh/r-cnn-release1-data.tgz?dl=0)
 35 |   0. Copy R-CNN's non-finetuned ImageNet network `<rcnnpath>/data/caffe_nets/ilsvrc_2012_train_iter_310k` to `<deeppyramidpath>/data/caffe_nets/ilsvrc_2012_train_iter_310k` (or just create a symlink).
36 | 
37 | ### Usage
38 | 
 39 | 1. Run MATLAB from inside the DeepPyramid code directory
40 | 2. Add the `matcaffe` mex function to your path (`addpath /path/to/caffe/matlab/caffe`)
41 | 3. Run the demo `demo_deep_pyramid`
42 | 
43 | ### Uses
44 | 
45 | DeepPyramid can be used for implementing DPMs on deep convolutional network features, rather than HOG features. It can also be used whenever you need a dense multiscale pyramid of image features.
46 | 
47 | ### Caveats
48 | 
 49 | The implementation is designed to be simple and, as a result, is very inefficient. There are several ways to speed it up, which are planned for the future. For now, computing a feature pyramid takes about 0.5 to 0.6 seconds on an NVIDIA Titan GPU, which is acceptable.
50 | 


--------------------------------------------------------------------------------
/cached_pyra.mat:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/rbgirshick/DeepPyramid/adb87bbb96fcaef87420028750cf2abb9b3d929c/cached_pyra.mat


--------------------------------------------------------------------------------
/data/caffe_nets/.dummy:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/rbgirshick/DeepPyramid/adb87bbb96fcaef87420028750cf2abb9b3d929c/data/caffe_nets/.dummy


--------------------------------------------------------------------------------
/deep_pyramid.m:
--------------------------------------------------------------------------------
  1 | function [pyra, im_pyra] = deep_pyramid(im, cnn_model)
  2 | % [pyra, im_pyra] = deep_pyramid(im, cnn_model)
  3 | %
  4 | % im: a color image
  5 | % cnn_model: a handle to the model loaded into caffe
  6 | 
  7 | % AUTORIGHTS
  8 | % ---------------------------------------------------------
  9 | % Copyright (c) 2014, Ross Girshick
 10 | %
 11 | % This file is part of the DeepPyramid code and is available
 12 | % under the terms of the Simplified BSD License provided in
 13 | % LICENSE. Please retain this notice and LICENSE if you use
 14 | % this file (or any portion of it) in your project.
 15 | % ---------------------------------------------------------
 16 | 
 17 | [pyra, im_pyra] = feat_pyramid(im, cnn_model);
 18 | 
 19 | imsize = size(im);
 20 | pyra.imsize = imsize(1:2);
 21 | pyra.num_levels = cnn_model.pyra.num_levels;
 22 | pyra.stride = cnn_model.pyra.stride;
 23 | 
 24 | pyra.valid_levels = true(pyra.num_levels, 1);
 25 | pyra.padx = 0;
 26 | pyra.pady = 0;
 27 | 
 28 | 
 29 | % ------------------------------------------------------------------------
 30 | function [pyra, im_pyra] = feat_pyramid(im, cnn_model)
 31 | % ------------------------------------------------------------------------
 32 | % Compute a feature pyramid with caffe
 33 | 
 34 | if cnn_model.init_key ~= caffe('get_init_key')
 35 |   error('You probably need to load the cnn model into caffe.');
 36 | end
 37 | 
 38 | % we're assuming the input height and width are the same
 39 | assert(cnn_model.pyra.dimx == cnn_model.pyra.dimy);
 40 | % compute output width and height
 41 | sz_w = (cnn_model.pyra.dimx - 1) / cnn_model.pyra.stride + 1;
 42 | sz_h = sz_w;
 43 | sz_l = cnn_model.pyra.num_levels;
 44 | sz_c = cnn_model.pyra.num_channels;
 45 | 
 46 | %th = tic;
 47 | [batch, scales, level_sizes] = image_pyramid(im, cnn_model);
 48 | %fprintf('prep: %.3fs\n', toc(th));
 49 | 
 50 | %th = tic;
 51 | feat = caffe('forward', {batch});
 52 | % standard song and dance of swapping width and height between caffe and matlab
 53 | feat = permute(reshape(feat{1}, [sz_w sz_h sz_c sz_l]), [2 1 3 4]);
 54 | %fprintf('fwd: %.3fs\n', toc(th));
 55 | 
 56 | pyra.feat = feat;
 57 | pyra.scales = scales;
 58 | pyra.level_sizes = level_sizes;
 59 | 
 60 | im_pyra = batch;
 61 | 
 62 | % ------------------------------------------------------------------------
 63 | function [batch, scales, level_sizes] = image_pyramid(im, cnn_model)
 64 | % ------------------------------------------------------------------------
 65 | % Construct an image pyramid that's ready for feeding directly into caffe
 66 | % forward
 67 | 
 68 | batch_width = cnn_model.pyra.dimx;
 69 | batch_height = cnn_model.pyra.dimy;
 70 | num_levels = cnn_model.pyra.num_levels;
 71 | stride = cnn_model.pyra.stride;
 72 | 
 73 | im = single(im);
 74 | % Convert to BGR
 75 | im = im(:,:,[3 2 1]);
 76 | % Subtract mean (mean of the image mean--one mean per channel)
 77 | im = bsxfun(@minus, im, cnn_model.mu);
 78 | 
 79 | % scale = base scale relative to input image
 80 | im_sz = size(im);
 81 | if im_sz(1) > im_sz(2)
 82 |   height = batch_height;
 83 |   width = NaN;
 84 |   scale = height / im_sz(1);
 85 | else
 86 |   height = NaN;
 87 |   width = batch_width;
 88 |   scale = width / im_sz(2);
 89 | end
 90 | im_orig = im;
 91 | 
 92 | batch = zeros(batch_width, batch_height, 3, num_levels, 'single');
 93 | alpha = cnn_model.pyra.scale_factor;
 94 | scales = scale*(alpha.^-(0:num_levels-1))';
 95 | level_sizes = zeros(num_levels, 2);
 96 | for i = 0:num_levels-1
 97 |   if i == 0
 98 |     im = imresize(im_orig, [height width], 'bilinear');
 99 |   else
100 |     im = imresize(im_orig, scales(i+1), 'bilinear');
101 |   end
102 |   im_sz = size(im);
103 |   im_sz = im_sz(1:2);
104 |   level_sizes(i+1, :) = ceil((im_sz - 1) / stride + 1);
105 |   % Make width the fastest dimension (for caffe)
106 |   im = permute(im, [2 1 3]);
107 |   batch(1:im_sz(2), 1:im_sz(1), :, i+1) = im;
108 | end
109 | 
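
A minimal sketch of the geometry that deep_pyramid.m sets up, assuming the constants from init_cnn_model.m (1713-pixel plane, stride 16, 7 levels, scale factor sqrt(2)). The 375 x 500 image size is a hypothetical example, and the resized size is approximated as ceil(scale * size), which is an assumption about how imresize rounds; no image data or caffe call is involved.

```matlab
% Sketch of the pyramid geometry used by deep_pyramid.m.
% The 375 x 500 image size is a hypothetical example.
dim = 1713;          % cnn_model.pyra.dimx (== dimy)
stride = 16;         % cnn_model.pyra.stride
num_levels = 7;      % cnn_model.pyra.num_levels
alpha = sqrt(2);     % cnn_model.pyra.scale_factor

% Side length of each square feature plane returned by caffe
sz_w = (dim - 1) / stride + 1;

% The longer image side is mapped onto the full plane at level 1
im_sz = [375 500];                    % [height width], assumed
scale = dim / max(im_sz);
scales = scale * (alpha .^ -(0:num_levels-1))';

% Assumption: the resized image has size ceil(scale * im_sz);
% the small epsilon guards against floating-point round-up.
level_sizes = zeros(num_levels, 2);
for i = 1:num_levels
  resized = ceil(scales(i) * im_sz - 1e-9);
  level_sizes(i, :) = ceil((resized - 1) / stride + 1);
end

% One row per level: [scale, feature height, feature width]
disp([scales level_sizes]);
```

With these constants sz_w is 108, so every level is embedded in the upper-left corner of a 108 x 108 plane, and each pair of levels halves the scale.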


--------------------------------------------------------------------------------
/deep_pyramid_add_padding.m:
--------------------------------------------------------------------------------
 1 | function pyra = deep_pyramid_add_padding(pyra, padx, pady, uniform_output)
 2 | % pyra = deep_pyramid_add_padding(pyra, padx, pady, uniform_output)
 3 | %
 4 | % pyra: a pyramid computed using deep_pyramid.m
 5 | % padx, pady: amount of extra padding to add around each pyramid level
 6 | % uniform_output: true => each pyramid level is embedded in the upper-left
 7 | %     corner of each 3D plane inside a 4D array;
 8 | %     false => each pyramid level is stored as a 3D array with the height
 9 | %     and width of that level (same as in the DPM HOG feature pyramid code)
10 | 
11 | % AUTORIGHTS
12 | % ---------------------------------------------------------
13 | % Copyright (c) 2014, Ross Girshick
14 | %
15 | % This file is part of the DeepPyramid code and is available
16 | % under the terms of the Simplified BSD License provided in
17 | % LICENSE. Please retain this notice and LICENSE if you use
18 | % this file (or any portion of it) in your project.
19 | % ---------------------------------------------------------
20 | 
21 | pyra.padx = padx;
22 | pyra.pady = pady;
23 | 
24 | if ~exist('uniform_output', 'var') || isempty(uniform_output)
25 |   uniform_output = false;
26 | end
27 | 
28 | if uniform_output
29 |   pyra.feat = padarray(pyra.feat, [pady padx 0 0], 0);
30 | else
31 |   % This mimics the output of the DPM featpyramid.m code
32 |   feat = pyra.feat;
33 |   pyra.feat = cell(pyra.num_levels, 1);
34 |   for i = 1:pyra.num_levels
35 |     % crop out feat map levels
36 |     pyra.feat{i} = feat(1:pyra.level_sizes(i, 1), ...
37 |                         1:pyra.level_sizes(i, 2), :, i);
38 |     % add padding
39 |     pyra.feat{i} = padarray(pyra.feat{i}, [pady padx 0], 0);
40 |   end
41 | end
42 | 
43 | 
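
The non-uniform branch above can be illustrated without the Image Processing Toolbox. This is a sketch under stated assumptions: the feature array, level sizes, and padding amounts are made-up values, and the zero padding is done with plain indexing as a stand-in for padarray.

```matlab
% Sketch: crop one level out of a uniform 4D feature array and zero-pad it.
% All sizes here are hypothetical; padding uses plain indexing
% rather than padarray.
feat = rand(108, 108, 8, 7, 'single');     % stand-in for pyra.feat (8 channels)
level_sizes = [82 108; 58 77];             % [height width] of two levels (assumed)
padx = 3;
pady = 2;

level = 2;
h = level_sizes(level, 1);
w = level_sizes(level, 2);

% crop the valid upper-left region of this level
f = feat(1:h, 1:w, :, level);

% embed it in a zero array with pady rows and padx columns on every side
padded = zeros(h + 2*pady, w + 2*padx, size(f, 3), 'single');
padded(pady+1:pady+h, padx+1:padx+w, :) = f;
```

Each padded level grows by 2*pady rows and 2*padx columns, which is why the coordinate helpers add and subtract pyra.padx and pyra.pady.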


--------------------------------------------------------------------------------
/deep_pyramid_cache_wrapper.m:
--------------------------------------------------------------------------------
 1 | function pyra = deep_pyramid_cache_wrapper(im, cnn_model, cache_opts)
 2 | % pyra = deep_pyramid_cache_wrapper(im, cnn_model, cache_opts)
 3 | %
 4 | % Load a feature pyramid from a cache on disk, optionally saving if
 5 | % there's a cache miss.
 6 | %
 7 | % cache_opts.cache_file
 8 | %           .debug
 9 | %           .write_on_miss
10 | %           .exists
11 | 
12 | % AUTORIGHTS
13 | % ---------------------------------------------------------
14 | % Copyright (c) 2014, Ross Girshick
15 | %
16 | % This file is part of the DeepPyramid code and is available
17 | % under the terms of the Simplified BSD License provided in
18 | % LICENSE. Please retain this notice and LICENSE if you use
19 | % this file (or any portion of it) in your project.
20 | % ---------------------------------------------------------
21 | 
22 | debug = isfield(cache_opts, 'debug') && cache_opts.debug;
23 | write_on_miss = isfield(cache_opts, 'write_on_miss') && ...
24 |                 cache_opts.write_on_miss;
25 | 
26 | cache_file = cache_opts.cache_file;
27 | if exist(cache_file, 'file')
28 |   ld = load(cache_file);
29 |   pyra = ld.pyra;
30 |   if debug
31 |     warning('Loaded cache file %s.', cache_file);
32 |   end
33 | else
34 |   if debug
35 |     warning('Cache file %s not found.', cache_file);
36 |   end
 37 |   pyra = deep_pyramid(im, cnn_model);  % compute on any miss so pyra is always assigned
 38 |   if write_on_miss
 39 |     save(cache_file, 'pyra');
40 |     if debug
 41 |       warning('Cache file %s saved.', cache_file);
42 |     end
43 |   end
44 | end
45 | 
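
The load-or-compute pattern used by the wrapper can be tried in isolation. In this sketch compute_fn is a hypothetical stand-in for deep_pyramid and the cache lives in a temporary file, so nothing here touches caffe or cached_pyra.mat.

```matlab
% Sketch of the load-or-compute pattern with a stand-in for deep_pyramid.
% compute_fn and the temporary file name are hypothetical.
compute_fn = @() struct('feat', magic(4), 'num_levels', 1);
cache_file = [tempname() '.mat'];

for attempt = 1:2
  if exist(cache_file, 'file')
    % cache hit: read the stored pyramid back
    ld = load(cache_file);
    pyra = ld.pyra;
    source = 'cache';
  else
    % cache miss: compute, then store for next time
    pyra = compute_fn();
    save(cache_file, 'pyra');
    source = 'computed';
  end
  fprintf('attempt %d: %s\n', attempt, source);
end

% remove the temporary cache file
delete(cache_file);
```

The first pass misses and writes the file; the second pass finds it and loads the same struct.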


--------------------------------------------------------------------------------
/demo_deep_pyramid.m:
--------------------------------------------------------------------------------
  1 | function pyra = demo_deep_pyramid(im)
  2 | % Demonstrate basic usage and visualize features.
  3 | %
  4 | %
  5 | % AUTORIGHTS
  6 | % ---------------------------------------------------------
  7 | % Copyright (c) 2014, Ross Girshick
  8 | %
  9 | % This file is part of the DeepPyramid code and is available
 10 | % under the terms of the Simplified BSD License provided in
 11 | % LICENSE. Please retain this notice and LICENSE if you use
 12 | % this file (or any portion of it) in your project.
 13 | % ---------------------------------------------------------
 14 | 
 15 | if exist('caffe') ~= 3
 16 |   error('You must add matcaffe to your path.');
 17 | end
 18 | 
 19 | if ~exist('data/caffe_nets/ilsvrc_2012_train_iter_310k')
 20 |   error(['You need the CNN model in %s. ' ...
 21 |          'You can get this model by following ' ...
 22 |          'the R-CNN installation instructions.'], ...
 23 |         'data/caffe_nets/ilsvrc_2012_train_iter_310k');
 24 | end
 25 | 
 26 | if 1
 27 |   % real use settings (compute features using the GPU)
 28 |   USE_GPU = true;
 29 |   USE_CACHE = false;
 30 |   USE_CAFFE = true;
 31 | else
 32 |   % fast demo settings
 33 |   USE_GPU = false;
 34 |   USE_CACHE = true;
 35 |   USE_CAFFE = false;
 36 | end
 37 | 
 38 | if ~exist('im', 'var') || isempty(im)
 39 |   im = imread('000084.jpg');
 40 | end
 41 | bbox = [263 145 381 225];
 42 | 
 43 | cnn = init_cnn_model('use_gpu', USE_GPU, 'use_caffe', USE_CAFFE);
 44 | 
 45 | if USE_CACHE
 46 |   cache_opts.cache_file = './cached_pyra.mat';
 47 |   cache_opts.debug = true;
 48 |   cache_opts.write_on_miss = true;
 49 |   th = tic;
 50 |   pyra = deep_pyramid_cache_wrapper(im, cnn, cache_opts);
 51 |   fprintf('deep_pyramid_cache_wrapper took %.3fs\n', toc(th));
 52 | else
 53 |   th = tic;
 54 |   pyra = deep_pyramid(im, cnn);
 55 |   fprintf('deep_pyramid took %.3fs\n', toc(th));
 56 | end
 57 | padx = 0;
 58 | pady = 0;
 59 | 
 60 | pyra = deep_pyramid_add_padding(pyra, padx, pady);
 61 | 
 62 | fprintf(['Press almost any key (with fig focused) to loop through ' ...
 63 |          'feature channels (or esc to exit).\n']);
 64 | for channel = 1:256
 65 |   vis_pyramid(im, pyra, bbox, channel);
 66 |   [~, ~, key_code] = ginput(1);
 67 |   if key_code == 27
 68 |     break;
 69 |   end
 70 | end
 71 | 
 72 | 
 73 | % ------------------------------------------------------------------------
 74 | function vis_pyramid(im, pyra, bbox, channel)
 75 | % ------------------------------------------------------------------------
 76 | 
 77 | pyra_boxes = im_to_pyra_coords(pyra, bbox);
 78 | 
 79 | clf;
 80 | 
 81 | rows = 2;
 82 | cols = 4;
 83 | subplot(rows, cols, 1);
 84 | imagesc(im);
 85 | axis image;
 86 | rectangle('Position', bbox_to_xywh(bbox), 'EdgeColor', 'g');
 87 | title(sprintf('input image feature %d', channel));
 88 | 
 89 | max_val = 0;
 90 | for level = 1:pyra.num_levels
 91 |   f = pyra.feat{level}(:,:,channel);
 92 |   max_val = max(max_val, max(f(:)));
 93 | end
 94 | 
 95 | ld = load('green_colormap');
 96 | colormap(ld.map); clear ld;
 97 | 
 98 | for level = 1:pyra.num_levels
 99 |   subplot(rows, cols, level+1);
100 |   imagesc(pyra.feat{level}(:,:,channel), [0 max_val]);
101 |   axis image;
102 |   rectangle('Position', bbox_to_xywh(pyra_boxes{level}), 'EdgeColor', 'r');
103 |   title(sprintf('level %d; scale = %.2fx', level, pyra.scales(level)));
104 | 
105 |   % project pyramid box back to image and display as red
106 |   im_bbox = pyra_to_im_coords(pyra, [pyra_boxes{level} level]);
107 |   subplot(rows, cols, 1);
108 |   rectangle('Position', bbox_to_xywh(im_bbox), 'EdgeColor', 'r');
109 |   %text(im_bbox(1), im_bbox(2), sprintf('%d', level));
110 | end
111 | 
112 | function xywh = bbox_to_xywh(bbox)
113 | xywh = [bbox(1) bbox(2) bbox(3)-bbox(1)+1 bbox(4)-bbox(2)+1];
114 | 


--------------------------------------------------------------------------------
/green_colormap.mat:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/rbgirshick/DeepPyramid/adb87bbb96fcaef87420028750cf2abb9b3d929c/green_colormap.mat


--------------------------------------------------------------------------------
/im_to_pyra_coords.m:
--------------------------------------------------------------------------------
 1 | function pyra_boxes = im_to_pyra_coords(pyra, boxes)
 2 | % boxes is N x 4 where each row is a box in the image specified
 3 | % by [x1 y1 x2 y2].
 4 | %
 5 | % Output is a cell array where cell i holds the N x 4 boxes
 6 | % projected into pyramid level i (including the padx/pady offsets)
 7 | 
 8 | % AUTORIGHTS
 9 | % ---------------------------------------------------------
10 | % Copyright (c) 2014, Ross Girshick
11 | %
12 | % This file is part of the DeepPyramid code and is available
13 | % under the terms of the Simplified BSD License provided in
14 | % LICENSE. Please retain this notice and LICENSE if you use
15 | % this file (or any portion of it) in your project.
16 | % ---------------------------------------------------------
17 | 
18 | boxes = boxes - 1;
19 | for level = 1:pyra.num_levels
20 |   level_boxes = bsxfun(@times, boxes, pyra.scales(level));
21 |   level_boxes = round(level_boxes / pyra.stride);
22 |   level_boxes = level_boxes + 1;
23 |   level_boxes(:, [1 3]) = level_boxes(:, [1 3]) + pyra.padx;
24 |   level_boxes(:, [2 4]) = level_boxes(:, [2 4]) + pyra.pady;
25 |   pyra_boxes{level} = level_boxes;
26 | end
27 | 


--------------------------------------------------------------------------------
/init_cnn_model.m:
--------------------------------------------------------------------------------
 1 | function cnn = init_cnn_model(varargin)
 2 | % cnn = init_cnn_model
 3 | % Initialize a CNN with caffe
 4 | %
 5 | % Optional arguments
 6 | % net_file   network binary file
 7 | % def_file   network prototxt file
 8 | % use_gpu    set to false to use CPU (default: true)
 9 | % use_caffe  set to false to avoid using caffe (default: true)
10 | %            useful for running on the cluster (must use cached pyramids!)
11 | 
12 | % AUTORIGHTS
13 | % ---------------------------------------------------------
14 | % Copyright (c) 2014, Ross Girshick
15 | %
16 | % This file is part of the DeepPyramid code and is available
17 | % under the terms of the Simplified BSD License provided in
18 | % LICENSE. Please retain this notice and LICENSE if you use
19 | % this file (or any portion of it) in your project.
20 | % ---------------------------------------------------------
21 | 
22 | % ------------------------------------------------------------------------
23 | % Options
24 | ip = inputParser;
25 | 
26 | % network binary file
27 | ip.addParamValue('net_file', ...
28 |     './data/caffe_nets/ilsvrc_2012_train_iter_310k', ...
 29 |     @ischar);
30 | 
31 | % network prototxt file
32 | ip.addParamValue('def_file', ...
33 |     './model-defs/pyramid_cnn_output_conv5_scales_7_plane_1713.prototxt', ...
 34 |     @ischar);
35 | 
36 | % Set use_gpu to false to use the CPU
37 | ip.addParamValue('use_gpu', true, @islogical);
38 | 
39 | % Set use_caffe to false to avoid using caffe
40 | % (must be used in conjunction with cached features!)
41 | ip.addParamValue('use_caffe', true, @islogical);
42 | 
43 | ip.parse(varargin{:});
44 | opts = ip.Results;
45 | % ------------------------------------------------------------------------
46 | 
47 | cnn.binary_file = opts.net_file;
48 | cnn.definition_file = opts.def_file;
49 | cnn.init_key = -1;
50 | cnn.mu = get_channelwise_mean;
51 | cnn.pyra.dimx = 1713;
52 | cnn.pyra.dimy = 1713;
53 | cnn.pyra.stride = 16;
54 | cnn.pyra.num_levels = 7;
55 | cnn.pyra.num_channels = 256;
56 | cnn.pyra.scale_factor = sqrt(2);
57 | 
58 | if opts.use_caffe
59 |   cnn.init_key = ...
60 |       caffe('init', cnn.definition_file, cnn.binary_file);
61 |   caffe('set_phase_test');
62 |   if opts.use_gpu
63 |     caffe('set_mode_gpu');
64 |   else
65 |     caffe('set_mode_cpu');
66 |   end
67 | end
68 | 
69 | 
70 | % ------------------------------------------------------------------------
71 | function mu = get_channelwise_mean()
72 | % ------------------------------------------------------------------------
73 | % load the ilsvrc image mean
74 | data_mean_file = 'ilsvrc_2012_mean.mat';
75 | assert(exist(data_mean_file, 'file') ~= 0);
 76 | % cropping to the input size likely isn't necessary, but we're doing it
 77 | % to be consistent with previous experiments
78 | ld = load(data_mean_file);
79 | mu = ld.image_mean; clear ld;
80 | input_size = 227;
81 | off = floor((size(mu,1) - input_size)/2)+1;
82 | mu = mu(off:off+input_size-1, off:off+input_size-1, :);
83 | mu = sum(sum(mu, 1), 2) / size(mu, 1) / size(mu, 2);
84 | 
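
A small sketch of what get_channelwise_mean computes, run on a synthetic mean image instead of ilsvrc_2012_mean.mat. The 256 x 256 size and the per-channel values 104, 117, 123 are made up for illustration; only the crop-and-average arithmetic mirrors the function above.

```matlab
% Sketch of get_channelwise_mean on a synthetic 256 x 256 x 3 mean image.
% The size and the per-channel values are made up for illustration.
mu_full = cat(3, 104 * ones(256), 117 * ones(256), 123 * ones(256));

% center crop to the network input size
input_size = 227;
off = floor((size(mu_full, 1) - input_size) / 2) + 1;
crop = mu_full(off:off+input_size-1, off:off+input_size-1, :);

% average over rows and columns, leaving a 1 x 1 x 3 array
mu = sum(sum(crop, 1), 2) / size(crop, 1) / size(crop, 2);

% subtracting it from an H x W x 3 image broadcasts over all pixels,
% as image_pyramid does in deep_pyramid.m
im = 200 * ones(4, 5, 3, 'single');
im_centered = bsxfun(@minus, im, single(mu));
```

Because the result is 1 x 1 x 3, the same three numbers are subtracted at every pixel regardless of the image size, which is what lets the pyramid use images of arbitrary dimensions.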


--------------------------------------------------------------------------------
/model-defs/pyramid_cnn_output_conv5_scales_7_plane_1713.prototxt:
--------------------------------------------------------------------------------
  1 | input: "data"
  2 | input_dim: 7
  3 | input_dim: 3
  4 | input_dim: 1713
  5 | input_dim: 1713
  6 | layers {
  7 |   bottom: "data"
  8 |   top: "conv1"
  9 |   name: "conv1"
 10 |   type: CONVOLUTION
 11 |   convolution_param {
 12 |     num_output: 96
 13 |     pad: 5
 14 |     kernel_size: 11
 15 |     stride: 4
 16 |   }
 17 | }
 18 | layers {
 19 |   bottom: "conv1"
 20 |   top: "conv1"
 21 |   name: "relu1"
 22 |   type: RELU
 23 | }
 24 | layers {
 25 |   bottom: "conv1"
 26 |   top: "pool1"
 27 |   name: "pool1"
 28 |   type: POOLING
 29 |   pooling_param {
 30 |     pool: MAX
 31 |     kernel_size: 3
 32 |     stride: 2
 33 |     pad: 1
 34 |   }
 35 | }
 36 | layers {
 37 |   bottom: "pool1"
 38 |   top: "norm1"
 39 |   name: "norm1"
 40 |   type: LRN
 41 |   lrn_param {
 42 |     local_size: 5
 43 |     alpha: 0.0001
 44 |     beta: 0.75
 45 |   }
 46 | }
 47 | layers {
 48 |   bottom: "norm1"
 49 |   top: "conv2"
 50 |   name: "conv2"
 51 |   type: CONVOLUTION
 52 |   convolution_param {
 53 |     num_output: 256
 54 |     pad: 2
 55 |     kernel_size: 5
 56 |     group: 2
 57 |   }
 58 | }
 59 | layers {
 60 |   bottom: "conv2"
 61 |   top: "conv2"
 62 |   name: "relu2"
 63 |   type: RELU
 64 | }
 65 | layers {
 66 |   bottom: "conv2"
 67 |   top: "pool2"
 68 |   name: "pool2"
 69 |   type: POOLING
 70 |   pooling_param {
 71 |     pool: MAX
 72 |     kernel_size: 3
 73 |     stride: 2
 74 |     pad: 1
 75 |   }
 76 | }
 77 | layers {
 78 |   bottom: "pool2"
 79 |   top: "norm2"
 80 |   name: "norm2"
 81 |   type: LRN
 82 |   lrn_param {
 83 |     local_size: 5
 84 |     alpha: 0.0001
 85 |     beta: 0.75
 86 |   }
 87 | }
 88 | layers {
 89 |   bottom: "norm2"
 90 |   top: "conv3"
 91 |   name: "conv3"
 92 |   type: CONVOLUTION
 93 |   convolution_param {
 94 |     num_output: 384
 95 |     pad: 1
 96 |     kernel_size: 3
 97 |   }
 98 | }
 99 | layers {
100 |   bottom: "conv3"
101 |   top: "conv3"
102 |   name: "relu3"
103 |   type: RELU
104 | }
105 | layers {
106 |   bottom: "conv3"
107 |   top: "conv4"
108 |   name: "conv4"
109 |   type: CONVOLUTION
110 |   convolution_param {
111 |     num_output: 384
112 |     pad: 1
113 |     kernel_size: 3
114 |     group: 2
115 |   }
116 | }
117 | layers {
118 |   bottom: "conv4"
119 |   top: "conv4"
120 |   name: "relu4"
121 |   type: RELU
122 | }
123 | layers {
124 |   bottom: "conv4"
125 |   top: "conv5"
126 |   name: "conv5"
127 |   type: CONVOLUTION
128 |   convolution_param {
129 |     num_output: 256
130 |     pad: 1
131 |     kernel_size: 3
132 |     group: 2
133 |   }
134 | }
135 | layers {
136 |   bottom: "conv5"
137 |   top: "conv5"
138 |   name: "relu5"
139 |   type: RELU
140 | }
141 | 
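
The stride of 16 and the 108 x 108 output plane assumed in init_cnn_model.m and deep_pyramid.m can be checked against this network definition. The sketch below assumes Caffe's usual output-size rules (round down for convolution, round up for pooling); the helper names conv_out and pool_out are hypothetical.

```matlab
% Sketch: trace the spatial size of a 1713-pixel input through the layers.
% Assumption: Caffe rounds down for convolution and up for pooling.
% Arguments are (input size, kernel size, pad, stride).
conv_out = @(n, k, p, s) floor((n + 2*p - k) / s) + 1;
pool_out = @(n, k, p, s) ceil((n + 2*p - k) / s) + 1;

n = 1713;
n = conv_out(n, 11, 5, 4);   % conv1
n = pool_out(n, 3, 1, 2);    % pool1
n = conv_out(n, 5, 2, 1);    % conv2
n = pool_out(n, 3, 1, 2);    % pool2
n = conv_out(n, 3, 1, 1);    % conv3
n = conv_out(n, 3, 1, 1);    % conv4
n = conv_out(n, 3, 1, 1);    % conv5

% only conv1, pool1 and pool2 have a stride greater than 1
total_stride = 4 * 2 * 2;
expected = (1713 - 1) / total_stride + 1;   % formula used in deep_pyramid.m
fprintf('conv5 plane: %d x %d (formula gives %d)\n', n, n, expected);
```

Under these assumptions the layer-by-layer trace and the closed-form expression agree, which is why an input side of 1713 (16 * 107 + 1) maps onto a whole number of feature cells.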


--------------------------------------------------------------------------------
/model-defs/pyramid_cnn_output_max5_scales_7_plane_1713.prototxt:
--------------------------------------------------------------------------------
  1 | input: "data"
  2 | input_dim: 7
  3 | input_dim: 3
  4 | input_dim: 1713
  5 | input_dim: 1713
  6 | layers {
  7 |   bottom: "data"
  8 |   top: "conv1"
  9 |   name: "conv1"
 10 |   type: CONVOLUTION
 11 |   convolution_param {
 12 |     num_output: 96
 13 |     pad: 5
 14 |     kernel_size: 11
 15 |     stride: 4
 16 |   }
 17 | }
 18 | layers {
 19 |   bottom: "conv1"
 20 |   top: "conv1"
 21 |   name: "relu1"
 22 |   type: RELU
 23 | }
 24 | layers {
 25 |   bottom: "conv1"
 26 |   top: "pool1"
 27 |   name: "pool1"
 28 |   type: POOLING
 29 |   pooling_param {
 30 |     pool: MAX
 31 |     kernel_size: 3
 32 |     stride: 2
 33 |     pad: 1
 34 |   }
 35 | }
 36 | layers {
 37 |   bottom: "pool1"
 38 |   top: "norm1"
 39 |   name: "norm1"
 40 |   type: LRN
 41 |   lrn_param {
 42 |     local_size: 5
 43 |     alpha: 0.0001
 44 |     beta: 0.75
 45 |   }
 46 | }
 47 | layers {
 48 |   bottom: "norm1"
 49 |   top: "conv2"
 50 |   name: "conv2"
 51 |   type: CONVOLUTION
 52 |   convolution_param {
 53 |     num_output: 256
 54 |     pad: 2
 55 |     kernel_size: 5
 56 |     group: 2
 57 |   }
 58 | }
 59 | layers {
 60 |   bottom: "conv2"
 61 |   top: "conv2"
 62 |   name: "relu2"
 63 |   type: RELU
 64 | }
 65 | layers {
 66 |   bottom: "conv2"
 67 |   top: "pool2"
 68 |   name: "pool2"
 69 |   type: POOLING
 70 |   pooling_param {
 71 |     pool: MAX
 72 |     kernel_size: 3
 73 |     stride: 2
 74 |     pad: 1
 75 |   }
 76 | }
 77 | layers {
 78 |   bottom: "pool2"
 79 |   top: "norm2"
 80 |   name: "norm2"
 81 |   type: LRN
 82 |   lrn_param {
 83 |     local_size: 5
 84 |     alpha: 0.0001
 85 |     beta: 0.75
 86 |   }
 87 | }
 88 | layers {
 89 |   bottom: "norm2"
 90 |   top: "conv3"
 91 |   name: "conv3"
 92 |   type: CONVOLUTION
 93 |   convolution_param {
 94 |     num_output: 384
 95 |     pad: 1
 96 |     kernel_size: 3
 97 |   }
 98 | }
 99 | layers {
100 |   bottom: "conv3"
101 |   top: "conv3"
102 |   name: "relu3"
103 |   type: RELU
104 | }
105 | layers {
106 |   bottom: "conv3"
107 |   top: "conv4"
108 |   name: "conv4"
109 |   type: CONVOLUTION
110 |   convolution_param {
111 |     num_output: 384
112 |     pad: 1
113 |     kernel_size: 3
114 |     group: 2
115 |   }
116 | }
117 | layers {
118 |   bottom: "conv4"
119 |   top: "conv4"
120 |   name: "relu4"
121 |   type: RELU
122 | }
123 | layers {
124 |   bottom: "conv4"
125 |   top: "conv5"
126 |   name: "conv5"
127 |   type: CONVOLUTION
128 |   convolution_param {
129 |     num_output: 256
130 |     pad: 1
131 |     kernel_size: 3
132 |     group: 2
133 |   }
134 | }
135 | layers {
136 |   bottom: "conv5"
137 |   top: "conv5"
138 |   name: "relu5"
139 |   type: RELU
140 | }
141 | layers {
142 |   bottom: "conv5"
143 |   top: "max5"
144 |   name: "max5"
145 |   type: POOLING
146 |   pooling_param {
147 |     pool: MAX
148 |     kernel_size: 3
149 |     stride: 1
150 |     pad: 1
151 |   }
152 | }
153 | 


--------------------------------------------------------------------------------
/pyra_to_im_coords.m:
--------------------------------------------------------------------------------
 1 | function im_boxes = pyra_to_im_coords(pyra, boxes)
 2 | % boxes is N x 5 where each row is a box in the format [x1 y1 x2 y2 pyra_level]
 3 | % where (x1, y1) is the upper-left corner of the box in pyramid level pyra_level
 4 | % and (x2, y2) is the lower-right corner of the box in pyramid level pyra_level
 5 | % Assumes 1-based indexing.
 6 | 
 7 | % AUTORIGHTS
 8 | % ---------------------------------------------------------
 9 | % Copyright (c) 2014, Ross Girshick
10 | %
11 | % This file is part of the DeepPyramid code and is available
12 | % under the terms of the Simplified BSD License provided in
13 | % LICENSE. Please retain this notice and LICENSE if you use
14 | % this file (or any portion of it) in your project.
15 | % ---------------------------------------------------------
16 | 
17 | % pyramid to im scale factors for each scale
18 | scales = pyra.stride ./ pyra.scales;
19 | % pyramid to im scale factors for each pyra level in boxes
20 | scales = scales(boxes(:, end));
21 | 
22 | % Remove padding from pyramid boxes
23 | boxes(:, [1 3]) = boxes(:, [1 3]) - pyra.padx;
24 | boxes(:, [2 4]) = boxes(:, [2 4]) - pyra.pady;
25 | 
26 | im_boxes = bsxfun(@times, (boxes(:, 1:4) - 1), scales) + 1;
27 | 
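
The two coordinate helpers are approximate inverses of each other. The sketch below inlines their arithmetic for a single box so it runs on its own; the pyra fields are hypothetical values for a 500-pixel-wide image, and the box is the one used in demo_deep_pyramid.m.

```matlab
% Sketch: map an image box into each pyramid level and back again,
% inlining the arithmetic of im_to_pyra_coords and pyra_to_im_coords.
% The pyra fields below are hypothetical values for a 500-pixel-wide image.
pyra.stride = 16;
pyra.padx = 0;
pyra.pady = 0;
pyra.num_levels = 7;
pyra.scales = (1713 / 500) * (sqrt(2) .^ -(0:6))';

bbox = [263 145 381 225];    % [x1 y1 x2 y2], the box used in the demo
max_err = zeros(pyra.num_levels, 1);
for level = 1:pyra.num_levels
  s = pyra.scales(level);
  % image -> pyramid (1-based, then shifted by the padding)
  pb = round((bbox - 1) * s / pyra.stride) + 1;
  pb([1 3]) = pb([1 3]) + pyra.padx;
  pb([2 4]) = pb([2 4]) + pyra.pady;
  % pyramid -> image (remove padding, then undo scale and stride)
  qb = pb;
  qb([1 3]) = qb([1 3]) - pyra.padx;
  qb([2 4]) = qb([2 4]) - pyra.pady;
  im_box = (qb - 1) * (pyra.stride / s) + 1;
  % largest coordinate error after the round trip, in image pixels
  max_err(level) = max(abs(im_box - bbox));
end
```

The only lossy step is the rounding to whole feature cells, so the round-trip error at each level is at most half a cell, that is 0.5 * stride / scale image pixels. Coarser levels therefore localize the box less precisely.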


--------------------------------------------------------------------------------