├── Makefile
├── README.md
├── binarizewolfjolion.cpp
├── illustration.png
└── sample.jpg


/Makefile:
--------------------------------------------------------------------------------
all:
	g++ -I/usr/include/opencv binarizewolfjolion.cpp -o binarizewolfjolion `pkg-config opencv --libs` -lstdc++

clean:
	rm -f binarizewolfjolion

test:
	./binarizewolfjolion -k 0.6 sample.jpg _result.jpg


package:	clean
	rm -f x.jpg
	tar cvfz binarizewolfjolionopencv.tgz *

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# Local Adaptive Binarization

![alt text](illustration.png)

This code uses an improved contrast maximization version of Niblack/Sauvola et al's method to binarize document images. It is also able to perform the more classical Niblack as well as Sauvola et al. methods. Details can be found in the ICPR 2002 paper cited below.


You need to cite the following paper when you use this code:

    Christian Wolf, Jean-Michel Jolion and Francoise Chassaing.
    Text Localization, Enhancement and Binarization in Multimedia Documents.
    International Conference on Pattern Recognition (ICPR),
    volume 4, pages 1037-1040, 2002.

## Usage:

The executable is called on the command line. Images are read and written with OpenCV's `imread` and `imwrite`, so PGM files work, as do the other formats OpenCV supports (the Makefile's `test` target uses a JPEG). Under Linux you can use for instance "convert" (part of the ImageMagick package) to convert to and from the PGM format. The optional `version` argument chooses between one of several methods; the two arguments after it specify, respectively, the input and the output file:

```
usage: binarize [ -x <winx> -y <winy> -k <parameter> ] [ version ] <inputimage> <outputimage>

version: n   Niblack (1986)         needs white text on black background
         s   Sauvola et al. (1997)  needs black text on white background
         w   Wolf et al. (2001)     needs black text on white background

Default version: w (Wolf et al. 2001)
Default value for "k": 0.5

example:
       binarize w in.pgm out.pgm
       binarize in.pgm out.pgm
       binarize s -x 50 -y 50 -k 0.6 in.pgm out.pgm
```

The best working method is 'w', which ranked 5th in the [DIBCO 2009 competition](http://www.cvc.uab.es/icdar2009/papers/3725b375.pdf).

If `-x` and `-y` are not both given, then the window sizes are estimated automatically from the image size. The argument `-k` sets the "k" parameter from all 3 papers.

> IMPORTANT! Note that the parameter should be set differently for each method. In particular, it might be necessary to set a different parameter for Niblack's method (he recommends -0.2 in his paper).

## Compilation

The code has been developed under Linux. The provided Makefile builds it with g++ and uses pkg-config to locate the OpenCV libraries. No project files are provided for Windows, although the code should in principle compile under Windows.

## Credits

This code was written by [Christian Wolf](http://liris.cnrs.fr/christian.wolf/).

Patched by Thibault Yohan for faster speed (uses OpenCV integral images).

## GIMP/Python version

A GIMP plugin in Python was written by Vincent Vansuyt (thanks!). He provides the plugin upon
request: http://www.vvpix.com.

## Licence

Permission is granted for anyone to copy, use, modify, or distribute this program and accompanying programs and documents for any purpose, provided this copyright notice is retained and prominently displayed, along with a note saying that the original programs are available from my web page.

The programs and documents are distributed without any warranty, express or implied. As the programs were written for research purposes only, they have not been tested to the degree that would be advisable in any important application.

All use of these programs is entirely at the user's own risk.

You need to cite the following paper when you use this code:

    Christian Wolf, Jean-Michel Jolion and Francoise Chassaing.
    Text Localization, Enhancement and Binarization in Multimedia Documents.
    International Conference on Pattern Recognition (ICPR),
    volume 4, pages 1037-1040, 2002.

--------------------------------------------------------------------------------
/binarizewolfjolion.cpp:
--------------------------------------------------------------------------------
/**************************************************************
 * Binarization with several methods
 * (0) Niblack's method
 * (1) Sauvola & Co.
 *     ICDAR 1997, pp 147-152
 * (2) by myself - Christian Wolf
 *     Research notebook 19.4.2001, page 129
 * (3) by myself - Christian Wolf
 *     20.4.2007
 *
 * See also:
 * Research notebook 24.4.2001, page 132 (Calculation of s)
 **************************************************************/

#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#include <iostream>
#include <unistd.h>   // getopt, optarg, optind
// #include <cv.h>
// #include <highgui.h>
#include <opencv2/opencv.hpp>

using namespace std;
using namespace cv;

enum NiblackVersion
{
    NIBLACK=0,
    SAUVOLA,
    WOLFJOLION,
};

#define BINARIZEWOLF_VERSION "2.4 (August 1st, 2014)"

// Pixel accessors taking (x,y); cv::Mat::at takes (row,col), hence the swap.
#define uget(x,y)    at<unsigned char>(y,x)
#define uset(x,y,v)  at<unsigned char>(y,x)=v;
#define fget(x,y)    at<float>(y,x)
#define fset(x,y,v)  at<float>(y,x)=v;

/**********************************************************
 * Usage
 **********************************************************/

static void usage (char *com) {
    cerr << "usage: " << com << " [ -x <winx> -y <winy> -k <parameter> ] [ version ] <inputimage> <outputimage>\n\n"
         << "version: n   Niblack (1986)         needs white text on black background\n"
         << "         s   Sauvola et al. (1997)  needs black text on white background\n"
         << "         w   Wolf et al. (2001)     needs black text on white background\n"
         << "\n"
         << "Default version: w (Wolf et al. 2001)\n"
         << "\n"
         << "example:\n"
         << "       " << com << " w in.pgm out.pgm\n"
         << "       " << com << " in.pgm out.pgm\n"
         << "       " << com << " s -x 50 -y 50 -k 0.6 in.pgm out.pgm\n";
}

// *************************************************************
// Glide a window across the image and create two maps:
// mean and standard deviation. Returns the largest local
// standard deviation found.
//
// Version patched by Thibault Yohan (using opencv integral images)
// *************************************************************

double calcLocalStats (Mat &im, Mat &map_m, Mat &map_s, int winx, int winy) {
    // Integral images of the pixel values and of their squares
    Mat im_sum, im_sum_sq;
    cv::integral(im,im_sum,im_sum_sq,CV_64F);

    double m,s,max_s,sum,sum_sq;
    int wxh = winx/2;
    int wyh = winy/2;
    int x_firstth= wxh;
    int y_firstth= wyh;
    int y_lastth = im.rows-wyh-1;
    double winarea = winx*winy;

    max_s = 0;
    for (int j = y_firstth ; j<=y_lastth; j++){
        sum = sum_sq = 0;

        // Corner pointers into the integral image of the values
        double *sum_top_left = im_sum.ptr<double>(j - wyh);
        double *sum_top_right = sum_top_left + winx;
        double *sum_bottom_left = im_sum.ptr<double>(j - wyh + winy);
        double *sum_bottom_right = sum_bottom_left + winx;

        // Corner pointers into the integral image of the squared values
        double *sum_eq_top_left = im_sum_sq.ptr<double>(j - wyh);
        double *sum_eq_top_right = sum_eq_top_left + winx;
        double *sum_eq_bottom_left = im_sum_sq.ptr<double>(j - wyh + winy);
        double *sum_eq_bottom_right = sum_eq_bottom_left + winx;

        sum = (*sum_bottom_right + *sum_top_left) - (*sum_top_right + *sum_bottom_left);
        sum_sq = (*sum_eq_bottom_right + *sum_eq_top_left) - (*sum_eq_top_right + *sum_eq_bottom_left);

        m = sum / winarea;
        s = sqrt ((sum_sq - m*sum)/winarea);
        if (s > max_s) max_s = s;

        float *map_m_data = map_m.ptr<float>(j) + x_firstth;
        float *map_s_data = map_s.ptr<float>(j) + x_firstth;
        *map_m_data++ = m;
        *map_s_data++ = s;

        // Shift the window one pixel to the right and recompute
        // both window sums from the integral images
        for (int i=1 ; i <= im.cols-winx; i++) {
            sum_top_left++, sum_top_right++, sum_bottom_left++, sum_bottom_right++;

            sum_eq_top_left++, sum_eq_top_right++, sum_eq_bottom_left++, sum_eq_bottom_right++;

            sum = (*sum_bottom_right + *sum_top_left) - (*sum_top_right + *sum_bottom_left);
            sum_sq = (*sum_eq_bottom_right + *sum_eq_top_left) - (*sum_eq_top_right + *sum_eq_bottom_left);

            m = sum / winarea;
            s = sqrt ((sum_sq - m*sum)/winarea);
            if (s > max_s) max_s = s;

            *map_m_data++ = m;
            *map_s_data++ = s;
        }
    }

    return max_s;
}



/**********************************************************
 * The binarization routine
 **********************************************************/


void NiblackSauvolaWolfJolion (Mat im, Mat output, NiblackVersion version,
    int winx, int winy, double k, double dR) {


    double m, s, max_s;
    double th=0;
    double min_I, max_I;
    int wxh = winx/2;
    int wyh = winy/2;
    int x_firstth= wxh;
    int x_lastth = im.cols-wxh-1;
    int y_lastth = im.rows-wyh-1;
    int y_firstth= wyh;
    // int mx, my;

    // Create local statistics and store them in float matrices
    Mat map_m = Mat::zeros (im.rows, im.cols, CV_32F);
    Mat map_s = Mat::zeros (im.rows, im.cols, CV_32F);
    max_s = calcLocalStats (im, map_m, map_s, winx, winy);

    minMaxLoc(im, &min_I, &max_I);

    Mat thsurf (im.rows, im.cols, CV_32F);

    // Create the threshold surface, including border processing
    // ----------------------------------------------------
    for (int j = y_firstth ; j<=y_lastth; j++) {

        float *th_surf_data = thsurf.ptr<float>(j) + wxh;
        float *map_m_data = map_m.ptr<float>(j) + wxh;
        float *map_s_data = map_s.ptr<float>(j) + wxh;

        // NORMAL, NON-BORDER AREA IN THE MIDDLE OF THE WINDOW:
        for (int i=0 ; i <= im.cols-winx; i++) {
            m = *map_m_data++;
            s = *map_s_data++;

            // Calculate the threshold
            switch (version) {

                case NIBLACK:
                    th = m + k*s;
                    break;

                case SAUVOLA:
                    th = m * (1 + k*(s/dR-1));
                    break;

                case WOLFJOLION:
                    th = m + k * (s/max_s-1) * (m-min_I);
                    break;

                default:
                    cerr << "Unknown threshold type in ImageThresholder::surfaceNiblackImproved()\n";
                    exit (1);
            }

            // thsurf.fset(i+wxh,j,th);
            *th_surf_data++ = th;


            if (i==0) {
                // LEFT BORDER
                float *th_surf_ptr = thsurf.ptr<float>(j);
                for (int i=0; i<=x_firstth; ++i)
                    *th_surf_ptr++ = th;

                // LEFT-UPPER CORNER
                if (j==y_firstth)
                {
                    for (int u=0; u<y_firstth; ++u)
                    {
                        float *th_surf_ptr = thsurf.ptr<float>(u);
                        for (int i=0; i<=x_firstth; ++i)
                            *th_surf_ptr++ = th;
                    }

                }

                // LEFT-LOWER CORNER
                if (j==y_lastth)
                {
                    for (int u=y_lastth+1; u<im.rows; ++u)
                    {
                        float *th_surf_ptr = thsurf.ptr<float>(u);
                        for (int i=0; i<=x_firstth; ++i)
                            *th_surf_ptr++ = th;
                    }
                }
            }

            // UPPER BORDER
            if (j==y_firstth)
                for (int u=0; u<y_firstth; ++u)
                    thsurf.fset(i+wxh,u,th);

            // LOWER BORDER
            if (j==y_lastth)
                for (int u=y_lastth+1; u<im.rows; ++u)
                    thsurf.fset(i+wxh,u,th);
        }

        // RIGHT BORDER
        float *th_surf_ptr = thsurf.ptr<float>(j) + x_lastth;
        for (int i=x_lastth; i<im.cols; ++i)
            // thsurf.fset(i,j,th);
            *th_surf_ptr++ = th;

        // RIGHT-UPPER CORNER
        if (j==y_firstth)
        {
            for (int u=0; u<y_firstth; ++u)
            {
                float *th_surf_ptr = thsurf.ptr<float>(u) + x_lastth;
                for (int i=x_lastth; i<im.cols; ++i)
                    *th_surf_ptr++ = th;
            }
        }

        // RIGHT-LOWER CORNER
        if (j==y_lastth)
        {
            for (int u=y_lastth+1; u<im.rows; ++u)
            {
                float *th_surf_ptr = thsurf.ptr<float>(u) + x_lastth;
                for (int i=x_lastth; i<im.cols; ++i)
                    *th_surf_ptr++ = th;
            }
        }
    }
    cerr << "surface created" << endl;

    // Compare every pixel with its threshold
    for (int y=0; y<im.rows; ++y)
    {
        unsigned char *im_data = im.ptr<unsigned char>(y);
        float *th_surf_data = thsurf.ptr<float>(y);
        unsigned char *output_data = output.ptr<unsigned char>(y);
        for (int x=0; x<im.cols; ++x)
        {
            *output_data = *im_data >= *th_surf_data ? 255 : 0;
            im_data++;
            th_surf_data++;
            output_data++;
        }
    }
}

/**********************************************************
 * The main function
 **********************************************************/

int main (int argc, char **argv)
{
    char version;
    int c;
    int winx=0, winy=0;
    float optK=0.5;
    bool didSpecifyK=false;
    NiblackVersion versionCode;
    char *inputname, *outputname, *versionstring;

    cerr << "===========================================================\n"
         << "Christian Wolf, LIRIS Laboratory, Lyon, France.\n"
         << "christian.wolf@liris.cnrs.fr\n"
         << "Version " << BINARIZEWOLF_VERSION << endl
         << "===========================================================\n";

    // Argument processing
    while ((c = getopt (argc, argv, "x:y:k:")) != EOF) {

        switch (c) {

            case 'x':
                winx = atoi(optarg);
                break;

            case 'y':
                winy = atoi(optarg);
                break;

            case 'k':
                optK = atof(optarg);
                didSpecifyK = true;
                break;

            case '?':
                usage (*argv);
                cerr << "\nProblem parsing the options!\n\n";
                exit (1);
        }
    }

    switch(argc-optind)
    {
        case 3:
            versionstring=argv[optind];
            inputname=argv[optind+1];
            outputname=argv[optind+2];
            break;

        case 2:
            versionstring=(char *) "w";
            inputname=argv[optind];
            outputname=argv[optind+1];
            break;

        default:
            usage (*argv);
            exit (1);
    }

    cerr << "Adaptive binarization\n"
         << "Threshold calculation: ";

    // Determine the method
    version = versionstring[0];
    switch (version)
    {
        case 'n':
            versionCode = NIBLACK;
            cerr << "Niblack (1986)\n";
            break;

        case 's':
            versionCode = SAUVOLA;
            cerr << "Sauvola et al. (1997)\n";
            break;

        case 'w':
            versionCode = WOLFJOLION;
            cerr << "Wolf and Jolion (2001)\n";
            break;

        default:
            usage (*argv);
            cerr << "\nInvalid version: '" << version << "'!\n";
            // Stop here: versionCode would otherwise be used uninitialized
            exit (1);
    }


    cerr << "parameter k=" << optK << endl;

    if (!didSpecifyK)
        cerr << "Setting k to default value " << optK << endl;


    // Load the image in grayscale mode
    Mat input = imread(inputname,CV_LOAD_IMAGE_GRAYSCALE);


    if ((input.rows<=0) || (input.cols<=0)) {
        cerr << "*** ERROR: Couldn't read input image " << inputname << endl;
        exit(1);
    }


    // Treat the window size
    if (winx==0||winy==0) {
        cerr << "Input size: " << input.cols << "x" << input.rows << endl;
        winy = (int) (2.0 * input.rows-1)/3;
        winx = (int) input.cols-1 < winy ? input.cols-1 : winy;
        // if the window is too big, then we assume that the image
        // is not a single text box, but a document page: set
        // the window size to a fixed constant.
        if (winx > 100)
            winx = winy = 40;
        cerr << "Setting window size to [" << winx
             << "," << winy << "].\n";
    }

    // Threshold
    Mat output (input.rows, input.cols, CV_8U);
    NiblackSauvolaWolfJolion (input, output, versionCode, winx, winy, optK, 128);

    // Write the thresholded file
    cerr << "Writing binarized image to file '" << outputname << "'.\n";
    imwrite (outputname, output);

    return 0;
}

--------------------------------------------------------------------------------
/illustration.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/chriswolfvision/local_adaptive_binarization/2eb51465a917297910f2795fc149abafc96e657f/illustration.png


--------------------------------------------------------------------------------
/sample.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/chriswolfvision/local_adaptive_binarization/2eb51465a917297910f2795fc149abafc96e657f/sample.jpg


--------------------------------------------------------------------------------