├── LICENSE
├── README.md
├── chnsCompute.m
└── params.mat


/LICENSE:
--------------------------------------------------------------------------------
  1 | 
  2 |                    GNU LESSER GENERAL PUBLIC LICENSE
  3 |                        Version 3, 29 June 2007
  4 | 
  5 |  Copyright (C) 2007 Free Software Foundation, Inc. <https://fsf.org/>
  6 |  Everyone is permitted to copy and distribute verbatim copies
  7 |  of this license document, but changing it is not allowed.
  8 | 
  9 | 
 10 |   This version of the GNU Lesser General Public License incorporates
 11 | the terms and conditions of version 3 of the GNU General Public
 12 | License, supplemented by the additional permissions listed below.
 13 | 
 14 |   0. Additional Definitions.
 15 | 
 16 |   As used herein, "this License" refers to version 3 of the GNU Lesser
 17 | General Public License, and the "GNU GPL" refers to version 3 of the GNU
 18 | General Public License.
 19 | 
 20 |   "The Library" refers to a covered work governed by this License,
 21 | other than an Application or a Combined Work as defined below.
 22 | 
 23 |   An "Application" is any work that makes use of an interface provided
 24 | by the Library, but which is not otherwise based on the Library.
 25 | Defining a subclass of a class defined by the Library is deemed a mode
 26 | of using an interface provided by the Library.
 27 | 
 28 |   A "Combined Work" is a work produced by combining or linking an
 29 | Application with the Library.  The particular version of the Library
 30 | with which the Combined Work was made is also called the "Linked
 31 | Version".
 32 | 
 33 |   The "Minimal Corresponding Source" for a Combined Work means the
 34 | Corresponding Source for the Combined Work, excluding any source code
 35 | for portions of the Combined Work that, considered in isolation, are
 36 | based on the Application, and not on the Linked Version.
 37 | 
 38 |   The "Corresponding Application Code" for a Combined Work means the
 39 | object code and/or source code for the Application, including any data
 40 | and utility programs needed for reproducing the Combined Work from the
 41 | Application, but excluding the System Libraries of the Combined Work.
 42 | 
 43 |   1. Exception to Section 3 of the GNU GPL.
 44 | 
 45 |   You may convey a covered work under sections 3 and 4 of this License
 46 | without being bound by section 3 of the GNU GPL.
 47 | 
 48 |   2. Conveying Modified Versions.
 49 | 
 50 |   If you modify a copy of the Library, and, in your modifications, a
 51 | facility refers to a function or data to be supplied by an Application
 52 | that uses the facility (other than as an argument passed when the
 53 | facility is invoked), then you may convey a copy of the modified
 54 | version:
 55 | 
 56 |    a) under this License, provided that you make a good faith effort to
 57 |    ensure that, in the event an Application does not supply the
 58 |    function or data, the facility still operates, and performs
 59 |    whatever part of its purpose remains meaningful, or
 60 | 
 61 |    b) under the GNU GPL, with none of the additional permissions of
 62 |    this License applicable to that copy.
 63 | 
 64 |   3. Object Code Incorporating Material from Library Header Files.
 65 | 
 66 |   The object code form of an Application may incorporate material from
 67 | a header file that is part of the Library.  You may convey such object
 68 | code under terms of your choice, provided that, if the incorporated
 69 | material is not limited to numerical parameters, data structure
 70 | layouts and accessors, or small macros, inline functions and templates
 71 | (ten or fewer lines in length), you do both of the following:
 72 | 
 73 |    a) Give prominent notice with each copy of the object code that the
 74 |    Library is used in it and that the Library and its use are
 75 |    covered by this License.
 76 | 
 77 |    b) Accompany the object code with a copy of the GNU GPL and this license
 78 |    document.
 79 | 
 80 |   4. Combined Works.
 81 | 
 82 |   You may convey a Combined Work under terms of your choice that,
 83 | taken together, effectively do not restrict modification of the
 84 | portions of the Library contained in the Combined Work and reverse
 85 | engineering for debugging such modifications, if you also do each of
 86 | the following:
 87 | 
 88 |    a) Give prominent notice with each copy of the Combined Work that
 89 |    the Library is used in it and that the Library and its use are
 90 |    covered by this License.
 91 | 
 92 |    b) Accompany the Combined Work with a copy of the GNU GPL and this license
 93 |    document.
 94 | 
 95 |    c) For a Combined Work that displays copyright notices during
 96 |    execution, include the copyright notice for the Library among
 97 |    these notices, as well as a reference directing the user to the
 98 |    copies of the GNU GPL and this license document.
 99 | 
100 |    d) Do one of the following:
101 | 
102 |        0) Convey the Minimal Corresponding Source under the terms of this
103 |        License, and the Corresponding Application Code in a form
104 |        suitable for, and under terms that permit, the user to
105 |        recombine or relink the Application with a modified version of
106 |        the Linked Version to produce a modified Combined Work, in the
107 |        manner specified by section 6 of the GNU GPL for conveying
108 |        Corresponding Source.
109 | 
110 |        1) Use a suitable shared library mechanism for linking with the
111 |        Library.  A suitable mechanism is one that (a) uses at run time
112 |        a copy of the Library already present on the user's computer
113 |        system, and (b) will operate properly with a modified version
114 |        of the Library that is interface-compatible with the Linked
115 |        Version.
116 | 
117 |    e) Provide Installation Information, but only if you would otherwise
118 |    be required to provide such information under section 6 of the
119 |    GNU GPL, and only to the extent that such information is
120 |    necessary to install and execute a modified version of the
121 |    Combined Work produced by recombining or relinking the
122 |    Application with a modified version of the Linked Version. (If
123 |    you use option 4d0, the Installation Information must accompany
124 |    the Minimal Corresponding Source and Corresponding Application
125 |    Code. If you use option 4d1, you must provide the Installation
126 |    Information in the manner specified by section 6 of the GNU GPL
127 |    for conveying Corresponding Source.)
128 | 
129 |   5. Combined Libraries.
130 | 
131 |   You may place library facilities that are a work based on the
132 | Library side by side in a single library together with other library
133 | facilities that are not Applications and are not covered by this
134 | License, and convey such a combined library under terms of your
135 | choice, if you do both of the following:
136 | 
137 |    a) Accompany the combined library with a copy of the same work based
138 |    on the Library, uncombined with any other library facilities,
139 |    conveyed under the terms of this License.
140 | 
141 |    b) Give prominent notice with the combined library that part of it
142 |    is a work based on the Library, and explaining where to find the
143 |    accompanying uncombined form of the same work.
144 | 
145 |   6. Revised Versions of the GNU Lesser General Public License.
146 | 
147 |   The Free Software Foundation may publish revised and/or new versions
148 | of the GNU Lesser General Public License from time to time. Such new
149 | versions will be similar in spirit to the present version, but may
150 | differ in detail to address new problems or concerns.
151 | 
152 |   Each version is given a distinguishing version number. If the
153 | Library as you received it specifies that a certain numbered version
154 | of the GNU Lesser General Public License "or any later version"
155 | applies to it, you have the option of following the terms and
156 | conditions either of that published version or of any later version
157 | published by the Free Software Foundation. If the Library as you
158 | received it does not specify a version number of the GNU Lesser
159 | General Public License, you may choose any version of the GNU Lesser
160 | General Public License ever published by the Free Software Foundation.
161 | 
162 |   If the Library as you received it specifies that a proxy can decide
163 | whether future versions of the GNU Lesser General Public License shall
164 | apply, that proxy's public statement of acceptance of any version is
165 | permanent authorization for you to choose that version for the
166 | Library.
167 | 


--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
 1 | 
 2 | ## DetectorWithFocus
 3 | Created by [Yonghyun Kim](http://imlab.postech.ac.kr/members.htm) and [Daijin Kim](http://imlab.postech.ac.kr/members_d.htm) at [POSTECH IM Lab](http://imlab.postech.ac.kr)
 4 | 
 5 | ### Overview
 6 | 
 7 | An image pyramid can extend many object detection algorithms to solve detection on multiple scales. However, interpolation during the resampling process of an image pyramid causes gradient variation, which is the difference of the gradients between the original image and the scaled images. Our key insight is that the increased variance of gradients makes it difficult for classifiers to assign categories correctly. We prove the existence of the gradient variation by formulating the ratio of gradient expectations between an original image and scaled images, then propose a simple and novel gradient normalization method to eliminate the effect of this variation. The proposed normalization method reduces the variance in an image pyramid and allows the classifier to focus on a smaller coverage. We show the improvement in three different visual recognition problems: pedestrian detection, pose estimation, and object detection. The method is generally applicable to many vision algorithms based on an image pyramid with gradients.
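
The ratio of gradient expectations mentioned above can be observed empirically. The following is a minimal sketch, not code from this repository: the synthetic test image, the scale list, and the helper names `meanGradMag` and `resampleBilinear` are assumptions made for illustration, and plain bilinear `interp2` stands in for whatever resampler a real pyramid uses. It compares the mean gradient magnitude of each resampled image with that of the original.

```matlab
% Sketch: how resampling changes the mean gradient magnitude of an image.
% Assumptions (not from the repository): the synthetic image, the scale list,
% and the helpers meanGradMag / resampleBilinear defined at the end.
[x, y] = meshgrid(linspace(0, 1, 128));
I = 0.5 + 0.25*sin(6*pi*x).*cos(4*pi*y) + 0.25*x.*y;   % smooth image in [0,1]

scales = [0.5 0.75 1 1.5 2];
ratio  = zeros(size(scales));
refMag = meanGradMag(I);                  % reference: the original image
for k = 1:numel(scales)
  Is = resampleBilinear(I, scales(k));    % one level of a toy pyramid
  ratio(k) = meanGradMag(Is) / refMag;    % ratio of mean gradient magnitudes
end
disp([scales; ratio]);                    % row 1: scale, row 2: ratio

function m = meanGradMag(I)
  % Mean gradient magnitude using central differences.
  [gx, gy] = gradient(I);
  m = mean(sqrt(gx(:).^2 + gy(:).^2));
end

function J = resampleBilinear(I, s)
  % Resample I by factor s with bilinear interpolation (base MATLAB only).
  [h, w] = size(I);
  [xq, yq] = meshgrid(linspace(1, w, round(w*s)), linspace(1, h, round(h*s)));
  J = interp2(I, xq, yq, 'linear');
end
```

Local functions in a script require MATLAB R2016b or later. A ratio that departs from 1 as the scale moves away from 1 is the kind of variation the normalization is meant to remove.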
 8 | 
 9 | 
10 | ## Citation
11 | If you use this code in a publication, please cite our paper.
12 | 
13 | 
14 |   @inproceedings{ykim2017detector,
15 |     title={DETECTOR WITH FOCUS: NORMALIZING GRADIENT IN IMAGE PYRAMID},
16 |     author={Yonghyun Kim and Bongnam Kang and Daijin Kim},
17 |     booktitle={IEEE International Conference on Image Processing (ICIP)},
18 |     year={2017},
19 |     organization={IEEE}
20 |   } 
21 | 
22 | 
23 | ### Performance
24 | We conduct experiments on three applications: pedestrian detection, pose estimation, and object detection.
25 | ### Overall
26 | We propose a simple and novel gradient normalization method by analyzing the gradient variation from the viewpoint of the classifier.
27 | The proposed method defines the original image as the reference and normalizes the gradients of the other resampled images to it.
28 | The normalized gradients, which are similar to the gradients of the original image, reduce the variance and improve classifier performance with a negligible increase in computing time.
29 | We show the effectiveness of gradient normalization on three applications: pedestrian detection, pose estimation, and object detection.
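
As a rough illustration of normalizing resampled gradients to the reference image, the sketch below fits a scale-dependent correction factor and multiplies the gradient magnitude by it. It mirrors only the branch structure visible in `chnsCompute.m` (no change at scale 1, a quadratic model below 1, a linear model above 1). Everything else is an assumption made for illustration: the synthetic image, the scale grids, fitting with `polyfit`, and the helpers `gradMag` and `resampleBilinear`. It does not reproduce the repository's `computeKernel` or the values stored in `params.mat`.

```matlab
% Sketch only: fit a scale-dependent correction factor and apply it.
% Not from the repository: the synthetic image, the scale grids, the use of
% polyfit, and the helper names gradMag and resampleBilinear.
[x, y] = meshgrid(linspace(0, 1, 128));
I = 0.5 + 0.25*sin(6*pi*x).*cos(4*pi*y) + 0.25*x.*y;   % smooth image in [0,1]

meanMag = @(A) mean(A(:));
refMag  = meanMag(gradMag(I));               % reference: the original image
factor  = @(s) refMag / meanMag(gradMag(resampleBilinear(I, s)));

down  = 0.5:0.1:0.9;                         % fitting scales below 1
up    = 1.2:0.2:2.0;                         % fitting scales above 1
cDown = polyfit(down, arrayfun(factor, down), 2);   % quadratic model, s < 1
cUp   = polyfit(up,   arrayfun(factor, up),   1);   % linear model, s > 1

s = 0.7;                                     % example pyramid scale
M = gradMag(resampleBilinear(I, s));
if s < 1
  gmc = polyval(cDown, s);
elseif s > 1
  gmc = polyval(cUp, s);
else
  gmc = 1;                                   % reference scale: no correction
end
Mn = M * gmc;                                % normalized gradient magnitude
fprintf('mean |grad|: reference %.4f, raw %.4f, normalized %.4f\n', ...
  refMag, meanMag(M), meanMag(Mn));

function M = gradMag(I)
  % Per-pixel gradient magnitude using central differences.
  [gx, gy] = gradient(I);
  M = sqrt(gx.^2 + gy.^2);
end

function J = resampleBilinear(I, s)
  % Resample I by factor s with bilinear interpolation (base MATLAB only).
  [h, w] = size(I);
  [xq, yq] = meshgrid(linspace(1, w, round(w*s)), linspace(1, h, round(h*s)));
  J = interp2(I, xq, yq, 'linear');
end
```

Because the correction is a single multiplication per scale, its cost is small compared with computing the gradients themselves.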
30 | ### Related Papers
31 | Yonghyun Kim, Bong-Nam Kang, Daijin Kim, "DETECTOR WITH FOCUS: NORMALIZING GRADIENT IN IMAGE PYRAMID," IEEE International Conference on Image Processing (ICIP), 2017.
32 | ### Acknowledgements
33 | This work was supported by Institute for Information & communications Technology Promotion (IITP) grant funded by the Korea government (MSIP)(2014-0-00059, Development of Predictive Visual Intelligence Technology), MSIP (Ministry of Science, ICT and Future Planning), Korea, under the ICT Consilience Creative Program (IITP-R0346-16-1007) supervised by the IITP, and MSIP(Ministry of Science, ICT and Future Planning), Korea, under the ITRC (Information Technology Research Center) support program (IITP-2017-2016-0-00464) supervised by the IITP.
34 | 


--------------------------------------------------------------------------------
/chnsCompute.m:
--------------------------------------------------------------------------------
  1 | function chns = chnsCompute( I, varargin )
  2 | % Compute channel features at a single scale given an input image.
  3 | %
  4 | % Compute the channel features as described in:
  5 | %  P. Dollár, Z. Tu, P. Perona and S. Belongie
  6 | %  "Integral Channel Features", BMVC 2009.
  7 | % Channel features have proven very effective in sliding window object
  8 | % detection, both in terms of *accuracy* and *speed*. Numerous feature
  9 | % types including histogram of gradients (hog) can be converted into
 10 | % channel features, and overall, channels are general and powerful.
 11 | %
 12 | % Given an input image I, a corresponding channel is a registered map of I,
 13 | % where the output pixels are computed from corresponding patches of input
 14 | % pixels (thus preserving overall image layout). A trivial channel is
 15 | % simply the input grayscale image, likewise for a color image each color
 16 | % channel can serve as a channel. Other channels can be computed using
 17 | % linear or non-linear transformations of I, various choices implemented
 18 | % here are described below. The only constraint is that channels must be
 19 | % translationally invariant (i.e. translating the input image or the
 20 | % resulting channels gives the same result). This allows for fast object
 21 | % detection, as the channels can be computed once on the entire image
 22 | % rather than separately for each overlapping detection window.
 23 | %
 24 | % Currently, three channel types are available by default (to date, these
 25 | % have proven the most effective for sliding window object detection):
 26 | %  (1) color channels (computed using rgbConvert.m)
 27 | %  (2) gradient magnitude (computed using gradientMag.m)
 28 | %  (3) quantized gradient channels (computed using gradientHist.m)
 29 | % For more information about each channel type, including the exact input
 30 | % parameters and their meanings, see the respective m-files which perform
 31 | % the actual computations (chnsCompute is essentially a wrapper function).
 32 | % The converted color channels serve as input to gradientMag/gradientHist.
 33 | %
 34 | % Additionally, custom channels can be specified via an optional struct
 35 | % array "pCustom" which may have 0 or more custom channel definitions. Each
 36 | % custom channel is generated via a call to "chns=feval(hFunc,I,pFunc{:})".
 37 | % The color space of I is determined by pColor.colorSpace, use the setting
 38 | % colorSpace='orig' if the input image is not an 'rgb' image and should be
 39 | % left unchanged (e.g. if I has multiple channels). The input I will have
 40 | % type single and the output of hFunc should also have type single.
 41 | %
 42 | % "shrink" (which should be an integer) determines the amount to subsample
 43 | % the computed channels (in applications such as detection subsampling does
 44 | % not affect performance). The params for each channel type are described
 45 | % in detail in the respective function. In addition, each channel type has
 46 | % a param "enabled" that determines if the channel is computed. If
 47 | % chnsCompute() is called with no inputs, the output is the complete
 48 | % default params (pChns). Otherwise the outputs are the computed channels
 49 | % and additional meta-data (see below). The channels are computed at a
 50 | % single scale, for (fast) multi-scale channel computation see chnsPyramid.
 51 | %
 52 | % An emphasis has been placed on speed, with the code undergoing heavy
 53 | % optimization. Computing the full set of channels used in the BMVC09 paper
 54 | % referenced above on a 480x640 image runs over *100 fps* on a single core
 55 | % of a machine from 2011 (although runtime depends on input parameters).
 56 | %
 57 | % USAGE
 58 | %  pChns = chnsCompute()
 59 | %  chns = chnsCompute( I, pChns )
 60 | %
 61 | % INPUTS
 62 | %  I           - [hxwx3] input image (uint8 or single/double in [0,1])
 63 | %  pChns       - parameters (struct or name/value pairs)
 64 | %   .shrink       - [4] integer downsampling amount for channels
 65 | %   .pColor       - parameters for color space:
 66 | %     .enabled      - [1] if true enable color channels
 67 | %     .smooth       - [1] radius for image smoothing (using convTri)
 68 | %     .colorSpace   - ['luv'] choices are: 'gray', 'rgb', 'luv', 'hsv', 'orig'
 69 | %   .pGradMag     - parameters for gradient magnitude:
 70 | %     .enabled      - [1] if true enable gradient magnitude channel
 71 | %     .colorChn     - [0] if>0 color channel to use for grad computation
 72 | %     .normRad      - [5] normalization radius for gradient
 73 | %     .normConst    - [.005] normalization constant for gradient
 74 | %     .full         - [0] if true compute angles in [0,2*pi) else in [0,pi)
 75 | %   .pGradHist    - parameters for gradient histograms:
 76 | %     .enabled      - [1] if true enable gradient histogram channels
 77 | %     .binSize      - [shrink] spatial bin size (defaults to shrink)
 78 | %     .nOrients     - [6] number of orientation channels
 79 | %     .softBin      - [0] if true use "soft" bilinear spatial binning
 80 | %     .useHog       - [0] if true perform 4-way hog normalization/clipping
 81 | %     .clipHog      - [.2] value at which to clip hog histogram bins
 82 | %   .pCustom      - parameters for custom channels (optional struct array):
 83 | %     .enabled      - [1] if true enable custom channel type
 84 | %     .name         - ['REQ'] custom channel type name
 85 | %     .hFunc        - ['REQ'] function handle for computing custom channels
 86 | %     .pFunc        - [{}] additional params for chns=hFunc(I,pFunc{:})
 87 | %     .padWith      - [0] how channel should be padded (e.g. 0,'replicate')
 88 | %   .complete     - [] if true does not check/set default vals in pChns
 89 | %
 90 | % OUTPUTS
 91 | %  chns       - output struct
 92 | %   .pChns      - exact input parameters used
 93 | %   .nTypes     - number of channel types
 94 | %   .data       - [nTypes x 1] cell [h/shrink x w/shrink x nChns] channels
 95 | %   .info       - [nTypes x 1] struct array
 96 | %     .name       - channel type name
 97 | %     .pChn       - exact input parameters for given channel type
 98 | %     .nChns      - number of channels for given channel type
 99 | %     .padWith    - how channel should be padded (0,'replicate')
100 | %
101 | % EXAMPLE - default channels
102 | %  I=imResample(imread('peppers.png'),[480 640]); pChns=chnsCompute();
103 | %  tic, for i=1:100, chns=chnsCompute(I,pChns); end; toc
104 | %  figure(1); montage2(cat(3,chns.data{:}));
105 | %
106 | % EXAMPLE - default + custom channels
107 | %  I=imResample(imread('peppers.png'),[480 640]); pChns=chnsCompute();
108 | %  hFunc=@(I) 5*sqrt(max(0,max(convBox(I.^2,2)-convBox(I,2).^2,[],3)));
109 | %  pChns.pCustom=struct('name','Std02','hFunc',hFunc); pChns.complete=0;
110 | %  tic, chns=chnsCompute(I,pChns); toc
111 | %  figure(1); im(chns.data{4});
112 | %
113 | % See also rgbConvert, gradientMag, gradientHist, chnsPyramid
114 | %
115 | % Piotr's Image&Video Toolbox      Version 3.23
116 | % Copyright 2013 Piotr Dollar & Ron Appel.  [pdollar-at-caltech.edu]
117 | % Please email me if you find bugs, or have suggestions or questions!
118 | % Licensed under the Simplified BSD License [see external/bsd.txt]
119 | 
120 | % get default parameters pChns
121 | if(nargin==2), pChns=varargin{1}; else pChns=[]; end
122 | if( ~isfield(pChns,'complete') || pChns.complete~=1 || isempty(I) )
123 |   p=struct('enabled',{},'name',{},'hFunc',{},'pFunc',{},'padWith',{});
124 |   pChns = getPrmDflt(varargin,{'shrink',4,'pColor',{},'pGradMag',{},...
125 |     'pGradHist',{},'pCustom',p,'complete',1,'s',1},1);
126 |   pChns.pColor = getPrmDflt( pChns.pColor, {'enabled',1,...
127 |     'smooth',1, 'colorSpace','luv'}, 1 );
128 |   pChns.pGradMag = getPrmDflt( pChns.pGradMag, {'enabled',1,...
129 |     'colorChn',0,'normRad',5,'normConst',.005,'full',0,'gmcMethod','','gmcParam',[]}, 1 );
130 |   pChns.pGradHist = getPrmDflt( pChns.pGradHist, {'enabled',1,...
131 |     'binSize',[],'nOrients',6,'softBin',0,'useHog',0,'clipHog',.2}, 1 );
132 |   nc=length(pChns.pCustom); pc=cell(1,nc);
133 |   for i=1:nc, pc{i} = getPrmDflt( pChns.pCustom(i), {'enabled',1,...
134 |       'name','REQ','hFunc','REQ','pFunc',{},'padWith',0}, 1 ); end
135 |   if( nc>0 ), pChns.pCustom=[pc{:}]; end
136 | end
137 | if(nargin==0), chns=pChns; return; end
138 | 
139 | % create output struct
140 | info=struct('name',{},'pChn',{},'nChns',{},'padWith',{});
141 | chns=struct('pChns',pChns,'nTypes',0,'data',{{}},'info',info);
142 | 
143 | % crop I so divisible by shrink and get target dimensions
144 | shrink=pChns.shrink; [h,w,~]=size(I); cr=mod([h w],shrink);
145 | if(any(cr)), h=h-cr(1); w=w-cr(2); I=I(1:h,1:w,:); end
146 | h=h/shrink; w=w/shrink;
147 | 
148 | % compute color channels
149 | p=pChns.pColor; nm='color channels';
150 | I=rgbConvert(I,p.colorSpace); I=convTri(I,p.smooth);
151 | if(p.enabled), chns=addChn(chns,I,nm,p,'replicate',h,w); end
152 | 
153 | % compute gradient magnitude channel
154 | p=pChns.pGradMag; nm='gradient magnitude';
155 | full=0; if(isfield(p,'full')), full=p.full; end
156 | if( pChns.pGradHist.enabled )
157 |   [M,O]=gradientMag(I,p.colorChn,p.normRad,p.normConst,full);
158 | elseif( p.enabled )
159 |   M=gradientMag(I,p.colorChn,p.normRad,p.normConst,full);
160 | end
161 | 
162 | % GMC procedure: scale M by a factor that depends on pChns.s
163 | if (~isempty(chns.pChns.pGradMag.gmcMethod) && ~isempty(chns.pChns.pGradMag.gmcParam))
164 |   if (pChns.s == 1) % reference scale: M is left unchanged
165 |   elseif (pChns.s < 1)
166 |     X = computeKernel(pChns.s, 'Quadratic');
167 |     GMC = X * pChns.pGradMag.gmcParam{1};
168 |     M=M*GMC;
169 |   else
170 |     X = computeKernel(pChns.s, 'Linear');
171 |     GMC = X * pChns.pGradMag.gmcParam{2};
172 |     M=M*GMC;
173 |   end
174 | end
175 | 
176 | if(p.enabled), chns=addChn(chns,M,nm,p,0,h,w); end
177 | 
178 | % compute gradient histogram channels
179 | p=pChns.pGradHist; nm='gradient histogram';
180 | if( p.enabled )
181 |   binSize=p.binSize; if(isempty(binSize)), binSize=shrink; end
182 |   H=gradientHist(M,O,binSize,p.nOrients,p.softBin,p.useHog,p.clipHog,full);
183 |   chns=addChn(chns,H,nm,pChns.pGradHist,0,h,w);
184 | end
185 | 
186 | % compute custom channels
187 | p=pChns.pCustom;
188 | for i=find( [p.enabled] )
189 |   C=feval(p(i).hFunc,I,p(i).pFunc{:});
190 |   chns=addChn(chns,C,p(i).name,p(i),p(i).padWith,h,w);
191 | end
192 | 
193 | % % compute difference channels (dACF), currently disabled
194 | % p=pChns.pGradHist; nm='dACF';
195 | % if( p.enabled )
196 | %     dACF = cat(3, chns.data{:});
197 | %     
198 | %     for offset = 2 : 4
199 | %     
200 | %         % 1st D
201 | %         sACF = zeros(size(dACF), 'single');
202 | %         sACF(1:end-offset+1,:,:)=(dACF(1:end-offset+1,:,:)-dACF(offset:end,:,:));
203 | %         chns=addChn(chns,sACF,nm,pChns.pGradHist,0,h,w);
204 | % 
205 | %         % 2nd D
206 | %         sACF = zeros(size(dACF), 'single');
207 | %         sACF(:,1:end-offset+1,:)=(dACF(:,1:end-offset+1,:)-dACF(:,offset:end,:));
208 | %         chns=addChn(chns,sACF,nm,pChns.pGradHist,0,h,w);
209 | % 
210 | %         % Diagonal 1 
211 | %         sACF = zeros(size(dACF), 'single');
212 | %         sACF(offset:end,1:end-offset+1,:)=(dACF(offset:end,1:end-offset+1,:)-dACF(1:end-offset+1,offset:end,:));
213 | %         chns=addChn(chns,sACF,nm,pChns.pGradHist,0,h,w);
214 | % 
215 | %         % Diagonal 2
216 | %         sACF = zeros(size(dACF), 'single');
217 | %         sACF(1:end-offset+1,1:end-offset+1,:)=(dACF(1:end-offset+1,1:end-offset+1,:)-dACF(offset:end,offset:end,:));
218 | %         chns=addChn(chns,sACF,nm,pChns.pGradHist,0,h,w);
219 | %     end
220 | % end
221 | 
222 | end
223 | 
224 | function chns = addChn( chns, data, name, pChn, padWith, h, w )
225 | % Helper function to add a channel to chns.
226 | [h1,w1,~]=size(data);
227 | if(h1~=h || w1~=w), data=imResampleMex(data,h,w,1);
228 |   assert(all(mod([h1 w1]./[h w],1)==0)); end
229 | chns.data{end+1}=data; chns.nTypes=chns.nTypes+1;
230 | chns.info(end+1)=struct('name',name,'pChn',pChn,...
231 |   'nChns',size(data,3),'padWith',padWith);
232 | end
233 | 


--------------------------------------------------------------------------------
/params.mat:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/POSTECH-IMLAB/DetectorWithFocus/c09a7d497d7148719c325be875df8ebdcf392565/params.mat


--------------------------------------------------------------------------------