├── LICENSE ├── README.md ├── chnsCompute.m └── params.mat /LICENSE: -------------------------------------------------------------------------------- 1 | 2 | GNU LESSER GENERAL PUBLIC LICENSE 3 | Version 3, 29 June 2007 4 | 5 | Copyright (C) 2007 Free Software Foundation, Inc. 6 | Everyone is permitted to copy and distribute verbatim copies 7 | of this license document, but changing it is not allowed. 8 | 9 | 10 | This version of the GNU Lesser General Public License incorporates 11 | the terms and conditions of version 3 of the GNU General Public 12 | License, supplemented by the additional permissions listed below. 13 | 14 | 0. Additional Definitions. 15 | 16 | As used herein, "this License" refers to version 3 of the GNU Lesser 17 | General Public License, and the "GNU GPL" refers to version 3 of the GNU 18 | General Public License. 19 | 20 | "The Library" refers to a covered work governed by this License, 21 | other than an Application or a Combined Work as defined below. 22 | 23 | An "Application" is any work that makes use of an interface provided 24 | by the Library, but which is not otherwise based on the Library. 25 | Defining a subclass of a class defined by the Library is deemed a mode 26 | of using an interface provided by the Library. 27 | 28 | A "Combined Work" is a work produced by combining or linking an 29 | Application with the Library. The particular version of the Library 30 | with which the Combined Work was made is also called the "Linked 31 | Version". 32 | 33 | The "Minimal Corresponding Source" for a Combined Work means the 34 | Corresponding Source for the Combined Work, excluding any source code 35 | for portions of the Combined Work that, considered in isolation, are 36 | based on the Application, and not on the Linked Version. 
37 | 38 | The "Corresponding Application Code" for a Combined Work means the 39 | object code and/or source code for the Application, including any data 40 | and utility programs needed for reproducing the Combined Work from the 41 | Application, but excluding the System Libraries of the Combined Work. 42 | 43 | 1. Exception to Section 3 of the GNU GPL. 44 | 45 | You may convey a covered work under sections 3 and 4 of this License 46 | without being bound by section 3 of the GNU GPL. 47 | 48 | 2. Conveying Modified Versions. 49 | 50 | If you modify a copy of the Library, and, in your modifications, a 51 | facility refers to a function or data to be supplied by an Application 52 | that uses the facility (other than as an argument passed when the 53 | facility is invoked), then you may convey a copy of the modified 54 | version: 55 | 56 | a) under this License, provided that you make a good faith effort to 57 | ensure that, in the event an Application does not supply the 58 | function or data, the facility still operates, and performs 59 | whatever part of its purpose remains meaningful, or 60 | 61 | b) under the GNU GPL, with none of the additional permissions of 62 | this License applicable to that copy. 63 | 64 | 3. Object Code Incorporating Material from Library Header Files. 65 | 66 | The object code form of an Application may incorporate material from 67 | a header file that is part of the Library. You may convey such object 68 | code under terms of your choice, provided that, if the incorporated 69 | material is not limited to numerical parameters, data structure 70 | layouts and accessors, or small macros, inline functions and templates 71 | (ten or fewer lines in length), you do both of the following: 72 | 73 | a) Give prominent notice with each copy of the object code that the 74 | Library is used in it and that the Library and its use are 75 | covered by this License. 
76 | 77 | b) Accompany the object code with a copy of the GNU GPL and this license 78 | document. 79 | 80 | 4. Combined Works. 81 | 82 | You may convey a Combined Work under terms of your choice that, 83 | taken together, effectively do not restrict modification of the 84 | portions of the Library contained in the Combined Work and reverse 85 | engineering for debugging such modifications, if you also do each of 86 | the following: 87 | 88 | a) Give prominent notice with each copy of the Combined Work that 89 | the Library is used in it and that the Library and its use are 90 | covered by this License. 91 | 92 | b) Accompany the Combined Work with a copy of the GNU GPL and this license 93 | document. 94 | 95 | c) For a Combined Work that displays copyright notices during 96 | execution, include the copyright notice for the Library among 97 | these notices, as well as a reference directing the user to the 98 | copies of the GNU GPL and this license document. 99 | 100 | d) Do one of the following: 101 | 102 | 0) Convey the Minimal Corresponding Source under the terms of this 103 | License, and the Corresponding Application Code in a form 104 | suitable for, and under terms that permit, the user to 105 | recombine or relink the Application with a modified version of 106 | the Linked Version to produce a modified Combined Work, in the 107 | manner specified by section 6 of the GNU GPL for conveying 108 | Corresponding Source. 109 | 110 | 1) Use a suitable shared library mechanism for linking with the 111 | Library. A suitable mechanism is one that (a) uses at run time 112 | a copy of the Library already present on the user's computer 113 | system, and (b) will operate properly with a modified version 114 | of the Library that is interface-compatible with the Linked 115 | Version. 
116 | 117 | e) Provide Installation Information, but only if you would otherwise 118 | be required to provide such information under section 6 of the 119 | GNU GPL, and only to the extent that such information is 120 | necessary to install and execute a modified version of the 121 | Combined Work produced by recombining or relinking the 122 | Application with a modified version of the Linked Version. (If 123 | you use option 4d0, the Installation Information must accompany 124 | the Minimal Corresponding Source and Corresponding Application 125 | Code. If you use option 4d1, you must provide the Installation 126 | Information in the manner specified by section 6 of the GNU GPL 127 | for conveying Corresponding Source.) 128 | 129 | 5. Combined Libraries. 130 | 131 | You may place library facilities that are a work based on the 132 | Library side by side in a single library together with other library 133 | facilities that are not Applications and are not covered by this 134 | License, and convey such a combined library under terms of your 135 | choice, if you do both of the following: 136 | 137 | a) Accompany the combined library with a copy of the same work based 138 | on the Library, uncombined with any other library facilities, 139 | conveyed under the terms of this License. 140 | 141 | b) Give prominent notice with the combined library that part of it 142 | is a work based on the Library, and explaining where to find the 143 | accompanying uncombined form of the same work. 144 | 145 | 6. Revised Versions of the GNU Lesser General Public License. 146 | 147 | The Free Software Foundation may publish revised and/or new versions 148 | of the GNU Lesser General Public License from time to time. Such new 149 | versions will be similar in spirit to the present version, but may 150 | differ in detail to address new problems or concerns. 151 | 152 | Each version is given a distinguishing version number. 
If the 153 | Library as you received it specifies that a certain numbered version 154 | of the GNU Lesser General Public License "or any later version" 155 | applies to it, you have the option of following the terms and 156 | conditions either of that published version or of any later version 157 | published by the Free Software Foundation. If the Library as you 158 | received it does not specify a version number of the GNU Lesser 159 | General Public License, you may choose any version of the GNU Lesser 160 | General Public License ever published by the Free Software Foundation. 161 | 162 | If the Library as you received it specifies that a proxy can decide 163 | whether future versions of the GNU Lesser General Public License shall 164 | apply, that proxy's public statement of acceptance of any version is 165 | permanent authorization for you to choose that version for the 166 | Library. 167 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | 2 | ## DetectorWithFocus 3 | Created by [Yonghyun Kim](http://imlab.postech.ac.kr/members.htm) and [Daijin Kim](http://imlab.postech.ac.kr/members_d.htm) at [POSTECH IM Lab](http://imlab.postech.ac.kr) 4 | 5 | ### Overview 6 | 7 | An image pyramid can extend many object detection algorithms to solve detection on multiple scales. However, interpolation during the resampling process of an image pyramid causes gradient variation, which is the difference of the gradients between the original image and the scaled images. Our key insight is that the increased variance of gradients makes the classifiers have difficulty in correctly assigning categories. We prove the existence of the gradient variation by formulating the ratio of gradient expectations between an original image and scaled images, then propose a simple and novel gradient normalization method to eliminate the effect of this variation. 
The proposed normalization method reduces the variance in an image pyramid and allows the classifier to focus on a smaller coverage. We show the improvement in three different visual recognition problems: pedestrian detection, pose estimation, and object detection. The method is generally applicable to many vision algorithms based on an image pyramid with gradients. 8 | 9 | 10 | ## Citation 11 | If you're using this code in a publication, please cite our paper. 12 | 13 | 14 | @inproceedings{ykim2017detector, 15 | title={DETECTOR WITH FOCUS: NORMALIZING GRADIENT IN IMAGE PYRAMID}, 16 | author={Yonghyun Kim and Bongnam Kang and Daijin Kim}, 17 | booktitle={IEEE International Conference on Image Processing (ICIP)}, 18 | year={2017}, 19 | organization={IEEE} 20 | } 21 | 22 | 23 | ### Performance 24 | We conduct experiments on three applications: pedestrian detection, pose estimation, and object detection. 25 | ### Overall 26 | We propose a simple and novel gradient normalization method by analyzing the gradient variation from the viewpoint of the classifier. 27 | The proposed method defines the original image as the reference, and normalizes gradients from other resampled images to the reference image. 28 | The normalized gradient, which is similar to the gradients of original images, reduces the variance and increases the performance of the classifiers with a negligible increase in computing time. 29 | We show the effectiveness of the gradient normalization on three applications: pedestrian detection, pose estimation, and object detection. 30 | ### Related Papers 31 | Yonghyun Kim, Bong-Nam Kang, Daijin Kim, "DETECTOR WITH FOCUS: NORMALIZING GRADIENT IN IMAGE PYRAMID," IEEE International Conference on Image Processing (ICIP), 2017. 
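The idea of normalizing the gradients of a resampled image to those of the original can be illustrated with a small, self-contained sketch. This is not the authors' implementation: `chnsCompute.m` multiplies the gradient magnitude by a factor built from `computeKernel` and `gmcParam`, whereas the sketch below simply estimates a single ratio of mean gradient magnitudes from one random test image. The image, the scale `s`, and all variable names are assumptions made for illustration, and only base MATLAB functions are used.

```matlab
% Illustrative sketch only; NOT the authors' gradient normalization code.
rng(0);                          % reproducible random test image
I = rand(128, 128);              % stand-in for a grayscale reference image
s = 0.5;                         % assumed pyramid scale relative to I

% Resample I to scale s with bilinear interpolation (base MATLAB interp2).
[h, w] = size(I);
[xq, yq] = meshgrid(linspace(1, w, round(w*s)), linspace(1, h, round(h*s)));
Is = interp2(I, xq, yq, 'linear');

% Gradient magnitude of the reference image and of the resampled image.
[gx, gy]   = gradient(I);
M  = sqrt(gx.^2 + gy.^2);
[gxs, gys] = gradient(Is);
Ms = sqrt(gxs.^2 + gys.^2);

% Ratio of mean gradient magnitudes (reference over resampled).
% Scaling Ms by this ratio makes its mean match the reference mean.
ratio  = mean(M(:)) / mean(Ms(:));
MsNorm = Ms * ratio;

fprintf('mean |grad|: reference %.4f, resampled %.4f, normalized %.4f\n', ...
    mean(M(:)), mean(Ms(:)), mean(MsNorm(:)));
```

The two mean magnitudes generally differ before normalization, which is the gradient variation the README describes. A per-image ratio like this one needs the reference image at hand, while a factor that depends only on the scale, as the GMC step in `chnsCompute.m` appears to use, can be applied without it.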
32 | ### Acknowledgements 33 | This work was supported by Institute for Information & communications Technology Promotion (IITP) grant funded by the Korea government (MSIP)(2014-0-00059, Development of Predictive Visual Intelligence Technology), MSIP (Ministry of Science, ICT and Future Planning), Korea, under the ICT Consilience Creative Program (IITP-R0346-16-1007) supervised by the IITP, and MSIP(Ministry of Science, ICT and Future Planning), Korea, under the ITRC (Information Technology Research Center) support program (IITP-2017-2016-0-00464) supervised by the IITP. 34 | -------------------------------------------------------------------------------- /chnsCompute.m: -------------------------------------------------------------------------------- 1 | function chns = chnsCompute( I, varargin ) 2 | % Compute channel features at a single scale given an input image. 3 | % 4 | % Compute the channel features as described in: 5 | % P. Dollár, Z. Tu, P. Perona and S. Belongie 6 | % "Integral Channel Features", BMVC 2009. 7 | % Channel features have proven very effective in sliding window object 8 | % detection, both in terms of *accuracy* and *speed*. Numerous feature 9 | % types including histogram of gradients (hog) can be converted into 10 | % channel features, and overall, channels are general and powerful. 11 | % 12 | % Given an input image I, a corresponding channel is a registered map of I, 13 | % where the output pixels are computed from corresponding patches of input 14 | % pixels (thus preserving overall image layout). A trivial channel is 15 | % simply the input grayscale image, likewise for a color image each color 16 | % channel can serve as a channel. Other channels can be computed using 17 | % linear or non-linear transformations of I, various choices implemented 18 | % here are described below. The only constraint is that channels must be 19 | % translationally invariant (i.e. 
translating the input image or the 20 | % resulting channels gives the same result). This allows for fast object 21 | % detection, as the channels can be computed once on the entire image 22 | % rather than separately for each overlapping detection window. 23 | % 24 | % Currently, three channel types are available by default (to date, these 25 | % have proven the most effective for sliding window object detection): 26 | % (1) color channels (computed using rgbConvert.m) 27 | % (2) gradient magnitude (computed using gradientMag.m) 28 | % (3) quantized gradient channels (computed using gradientHist.m) 29 | % For more information about each channel type, including the exact input 30 | % parameters and their meanings, see the respective m-files which perform 31 | % the actual computations (chnsCompute is essentially a wrapper function). 32 | % The converted color channels serve as input to gradientMag/gradientHist. 33 | % 34 | % Additionally, custom channels can be specified via an optional struct 35 | % array "pCustom" which may have 0 or more custom channel definitions. Each 36 | % custom channel is generated via a call to "chns=feval(hFunc,I,pFunc{:})". 37 | % The color space of I is determined by pColor.colorSpace, use the setting 38 | % colorSpace='orig' if the input image is not an 'rgb' image and should be 39 | % left unchanged (e.g. if I has multiple channels). The input I will have 40 | % type single and the output of hFunc should also have type single. 41 | % 42 | % "shrink" (which should be an integer) determines the amount to subsample 43 | % the computed channels (in applications such as detection subsampling does 44 | % not affect performance). The params for each channel type are described 45 | % in detail in the respective function. In addition, each channel type has 46 | % a param "enabled" that determines if the channel is computed. If 47 | % chnsCompute() is called with no inputs, the output is the complete 48 | % default params (pChns). 
Otherwise the outputs are the computed channels 49 | % and additional meta-data (see below). The channels are computed at a 50 | % single scale, for (fast) multi-scale channel computation see chnsPyramid. 51 | % 52 | % An emphasis has been placed on speed, with the code undergoing heavy 53 | % optimization. Computing the full set of channels used in the BMVC09 paper 54 | % referenced above on a 480x640 image runs over *100 fps* on a single core 55 | % of a machine from 2011 (although runtime depends on input parameters). 56 | % 57 | % USAGE 58 | % pChns = chnsCompute() 59 | % chns = chnsCompute( I, pChns ) 60 | % 61 | % INPUTS 62 | % I - [hxwx3] input image (uint8 or single/double in [0,1]) 63 | % pChns - parameters (struct or name/value pairs) 64 | % .shrink - [4] integer downsampling amount for channels 65 | % .pColor - parameters for color space: 66 | % .enabled - [1] if true enable color channels 67 | % .smooth - [1] radius for image smoothing (using convTri) 68 | % .colorSpace - ['luv'] choices are: 'gray', 'rgb', 'hsv', 'orig' 69 | % .pGradMag - parameters for gradient magnitude: 70 | % .enabled - [1] if true enable gradient magnitude channel 71 | % .colorChn - [0] if>0 color channel to use for grad computation 72 | % .normRad - [5] normalization radius for gradient 73 | % .normConst - [.005] normalization constant for gradient 74 | % .full - [0] if true compute angles in [0,2*pi) else in [0,pi) 75 | % .pGradHist - parameters for gradient histograms: 76 | % .enabled - [1] if true enable gradient histogram channels 77 | % .binSize - [shrink] spatial bin size (defaults to shrink) 78 | % .nOrients - [6] number of orientation channels 79 | % .softBin - [0] if true use "soft" bilinear spatial binning 80 | % .useHog - [0] if true perform 4-way hog normalization/clipping 81 | % .clipHog - [.2] value at which to clip hog histogram bins 82 | % .pCustom - parameters for custom channels (optional struct array): 83 | % .enabled - [1] if true enable custom channel type 84 
| % .name - ['REQ'] custom channel type name 85 | % .hFunc - ['REQ'] function handle for computing custom channels 86 | % .pFunc - [{}] additional params for chns=hFunc(I,pFunc{:}) 87 | % .padWith - [0] how channel should be padded (e.g. 0,'replicate') 88 | % .complete - [] if true does not check/set default vals in pChns 89 | % 90 | % OUTPUTS 91 | % chns - output struct 92 | % .pChns - exact input parameters used 93 | % .nTypes - number of channel types 94 | % .data - [nTypes x 1] cell [h/shrink x w/shrink x nChns] channels 95 | % .info - [nTypes x 1] struct array 96 | % .name - channel type name 97 | % .pChn - exact input parameters for given channel type 98 | % .nChns - number of channels for given channel type 99 | % .padWith - how channel should be padded (0,'replicate') 100 | % 101 | % EXAMPLE - default channels 102 | % I=imResample(imread('peppers.png'),[480 640]); pChns=chnsCompute(); 103 | % tic, for i=1:100, chns=chnsCompute(I,pChns); end; toc 104 | % figure(1); montage2(cat(3,chns.data{:})); 105 | % 106 | % EXAMPLE - default + custom channels 107 | % I=imResample(imread('peppers.png'),[480 640]); pChns=chnsCompute(); 108 | % hFunc=@(I) 5*sqrt(max(0,max(convBox(I.^2,2)-convBox(I,2).^2,[],3))); 109 | % pChns.pCustom=struct('name','Std02','hFunc',hFunc); pChns.complete=0; 110 | % tic, chns=chnsCompute(I,pChns); toc 111 | % figure(1); im(chns.data{4}); 112 | % 113 | % See also rgbConvert, gradientMag, gradientHist, chnsPyramid 114 | % 115 | % Piotr's Image&Video Toolbox Version 3.23 116 | % Copyright 2013 Piotr Dollar & Ron Appel. [pdollar-at-caltech.edu] 117 | % Please email me if you find bugs, or have suggestions or questions! 
118 | % Licensed under the Simplified BSD License [see external/bsd.txt] 119 | 120 | % get default parameters pChns 121 | if(nargin==2), pChns=varargin{1}; else pChns=[]; end 122 | if( ~isfield(pChns,'complete') || pChns.complete~=1 || isempty(I) ) 123 | p=struct('enabled',{},'name',{},'hFunc',{},'pFunc',{},'padWith',{}); 124 | pChns = getPrmDflt(varargin,{'shrink',4,'pColor',{},'pGradMag',{},... 125 | 'pGradHist',{},'pCustom',p,'complete',1,'s',1},1); 126 | pChns.pColor = getPrmDflt( pChns.pColor, {'enabled',1,... 127 | 'smooth',1, 'colorSpace','luv'}, 1 ); 128 | pChns.pGradMag = getPrmDflt( pChns.pGradMag, {'enabled',1,... 129 | 'colorChn',0,'normRad',5,'normConst',.005,'full',0,'gmcMethod','','gmcParam',[]}, 1 ); 130 | pChns.pGradHist = getPrmDflt( pChns.pGradHist, {'enabled',1,... 131 | 'binSize',[],'nOrients',6,'softBin',0,'useHog',0,'clipHog',.2}, 1 ); 132 | nc=length(pChns.pCustom); pc=cell(1,nc); 133 | for i=1:nc, pc{i} = getPrmDflt( pChns.pCustom(i), {'enabled',1,... 134 | 'name','REQ','hFunc','REQ','pFunc',{},'padWith',0}, 1 ); end 135 | if( nc>0 ), pChns.pCustom=[pc{:}]; end 136 | end 137 | if(nargin==0), chns=pChns; return; end 138 | 139 | % create output struct 140 | info=struct('name',{},'pChn',{},'nChns',{},'padWith',{}); 141 | chns=struct('pChns',pChns,'nTypes',0,'data',{{}},'info',info); 142 | 143 | % crop I so divisible by shrink and get target dimensions 144 | shrink=pChns.shrink; [h,w,~]=size(I); cr=mod([h w],shrink); 145 | if(any(cr)), h=h-cr(1); w=w-cr(2); I=I(1:h,1:w,:); end 146 | h=h/shrink; w=w/shrink; 147 | 148 | % compute color channels 149 | p=pChns.pColor; nm='color channels'; 150 | I=rgbConvert(I,p.colorSpace); I=convTri(I,p.smooth); 151 | if(p.enabled), chns=addChn(chns,I,nm,p,'replicate',h,w); end 152 | 153 | % compute gradient magnitude channel 154 | p=pChns.pGradMag; nm='gradient magnitude'; 155 | full=0; if(isfield(p,'full')), full=p.full; end 156 | if( pChns.pGradHist.enabled ) 157 | 
[M,O]=gradientMag(I,p.colorChn,p.normRad,p.normConst,full); 158 | elseif( p.enabled ) 159 | M=gradientMag(I,p.colorChn,p.normRad,p.normConst,full); 160 | end 161 | 162 | % GMC procedure: rescale the gradient magnitude M by a scale-dependent factor (pChns.s is presumably the pyramid scale relative to the original image) 163 | if (~isempty(chns.pChns.pGradMag.gmcMethod) && ~isempty(chns.pChns.pGradMag.gmcParam)) 164 | if (pChns.s == 1) % reference scale: M is left unchanged 165 | elseif (pChns.s < 1) % downsampled image: quadratic kernel, first parameter set 166 | X = computeKernel(pChns.s, 'Quadratic'); % computeKernel is not defined in this file 167 | GMC = X * pChns.pGradMag.gmcParam{1}; 168 | M=M*GMC; 169 | else % upsampled image: linear kernel, second parameter set 170 | X = computeKernel(pChns.s, 'Linear'); 171 | GMC = X * pChns.pGradMag.gmcParam{2}; 172 | M=M*GMC; 173 | end 174 | end 175 | 176 | if(p.enabled), chns=addChn(chns,M,nm,p,0,h,w); end 177 | 178 | % compute gradient histogram channels 179 | p=pChns.pGradHist; nm='gradient histogram'; 180 | if( p.enabled ) 181 | binSize=p.binSize; if(isempty(binSize)), binSize=shrink; end 182 | H=gradientHist(M,O,binSize,p.nOrients,p.softBin,p.useHog,p.clipHog,full); 183 | chns=addChn(chns,H,nm,pChns.pGradHist,0,h,w); 184 | end 185 | 186 | % compute custom channels 187 | p=pChns.pCustom; 188 | for i=find( [p.enabled] ) 189 | C=feval(p(i).hFunc,I,p(i).pFunc{:}); 190 | chns=addChn(chns,C,p(i).name,p(i),p(i).padWith,h,w); 191 | end 192 | 193 | % % compute gradient histogram channels 194 | % p=pChns.pGradHist; nm='dACF'; 195 | % if( p.enabled ) 196 | % dACF = cat(3, chns.data{:}); 197 | % 198 | % for offset = 2 : 4 199 | % 200 | % % 1st D 201 | % sACF = zeros(size(dACF), 'single'); 202 | % sACF(1:end-offset+1,:,:)=(dACF(1:end-offset+1,:,:)-dACF(offset:end,:,:)); 203 | % chns=addChn(chns,sACF,nm,pChns.pGradHist,0,h,w); 204 | % 205 | % % 2nd D 206 | % sACF = zeros(size(dACF), 'single'); 207 | % sACF(:,1:end-offset+1,:)=(dACF(:,1:end-offset+1,:)-dACF(:,offset:end,:)); 208 | % chns=addChn(chns,sACF,nm,pChns.pGradHist,0,h,w); 209 | % 210 | % % Diagonal 1 211 | % sACF = zeros(size(dACF), 'single'); 212 | % sACF(offset:end,1:end-offset+1,:)=(dACF(offset:end,1:end-offset+1,:)-dACF(1:end-offset+1,offset:end,:)); 213 | % 
chns=addChn(chns,sACF,nm,pChns.pGradHist,0,h,w); 214 | % 215 | % % Diagonal 2 216 | % sACF = zeros(size(dACF), 'single'); 217 | % sACF(1:end-offset+1,1:end-offset+1,:)=(dACF(1:end-offset+1,1:end-offset+1,:)-dACF(offset:end,offset:end,:)); 218 | % chns=addChn(chns,sACF,nm,pChns.pGradHist,0,h,w); 219 | % end 220 | % end 221 | 222 | end 223 | 224 | function chns = addChn( chns, data, name, pChn, padWith, h, w ) 225 | % Helper function to add a channel to chns. 226 | [h1,w1,~]=size(data); 227 | if(h1~=h || w1~=w), data=imResampleMex(data,h,w,1); 228 | assert(all(mod([h1 w1]./[h w],1)==0)); end 229 | chns.data{end+1}=data; chns.nTypes=chns.nTypes+1; 230 | chns.info(end+1)=struct('name',name,'pChn',pChn,... 231 | 'nChns',size(data,3),'padWith',padWith); 232 | end 233 | -------------------------------------------------------------------------------- /params.mat: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/POSTECH-IMLAB/DetectorWithFocus/c09a7d497d7148719c325be875df8ebdcf392565/params.mat --------------------------------------------------------------------------------