├── .gitignore ├── .DS_Store ├── package.lisp ├── cl-tesseract.asd ├── LICENSE ├── library.lisp ├── README.txt ├── cl-tesseract.lisp └── capi.lisp /.gitignore: -------------------------------------------------------------------------------- 1 | *.FASL 2 | *.fasl 3 | *.lisp-temp 4 | -------------------------------------------------------------------------------- /.DS_Store: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/GOFAI/cl-tesseract/HEAD/.DS_Store -------------------------------------------------------------------------------- /package.lisp: -------------------------------------------------------------------------------- 1 | ;;;; package.lisp 2 | 3 | (defpackage #:cl-tesseract 4 | (:use #:cl #:cffi) 5 | (:nicknames #:tesseract #:tess) 6 | (:export 7 | #:*tessdata-directory* 8 | #:tesseract-version 9 | #:with-base-api 10 | #:init-tess-api 11 | #:process-pages 12 | #:image-to-text 13 | #:image-to-hocr)) 14 | 15 | -------------------------------------------------------------------------------- /cl-tesseract.asd: -------------------------------------------------------------------------------- 1 | ;;;; cl-tesseract.asd 2 | 3 | (asdf:defsystem #:cl-tesseract 4 | :description "CFFI bindings to the Tesseract OCR library." 
5 | :author "Edward Geist" 6 | :license "MIT" 7 | :depends-on (#:cffi) 8 | :serial t 9 | :components ((:file "package") 10 | (:file "library") 11 | (:file "capi") 12 | (:file "cl-tesseract"))) 13 | 14 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | The MIT License (MIT) 2 | 3 | Copyright (c) 2015 GOFAI 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE. 22 | 23 | -------------------------------------------------------------------------------- /library.lisp: -------------------------------------------------------------------------------- 1 | (in-package :cl-tesseract) 2 | 3 | (cffi:define-foreign-library tesseract 4 | (:darwin (:or "libtesseract.3.dylib" "libtesseract.dylib")) 5 | (:linux (:or "libtesseract.3.so" "libtesseract.so")) 6 | (t (:default "libtesseract"))) ; Will this work on Windows? 
7 | 8 | (cffi:use-foreign-library tesseract) 9 | 10 | (defparameter *tessdata-directory* 11 | #+unix 12 | (let ((dir (or (probe-file "/usr/local/share/tessdata") ; Homebrew 13 | (probe-file "/opt/homebrew/share/tessdata") 14 | (probe-file "/usr/local/tessdata")))) (and dir (namestring dir))) ; NIL when no directory is found 15 | #+windows 16 | (let ((dir (probe-file "C:\\Program Files\\Tesseract OCR\\tessdata"))) (and dir (namestring dir))) 17 | "*tessdata-directory* should point to the location of the directory containing 18 | .traineddata files for use by Tesseract. The value must be a string representing the full 19 | path to the directory. 20 | 21 | This searches in a common default location for each platform and will follow symlinks to 22 | find the true location of the tessdata directory if one exists. 23 | 24 | If your .traineddata files are in a non-standard location, the variable can be rebound, e.g. 25 | (let ((*tessdata-directory* \"/path/to/tessdata\")) 26 | (image-to-text #P\"~/eurotext.jpg\"))") 27 | -------------------------------------------------------------------------------- /README.txt: -------------------------------------------------------------------------------- 1 | CL-TESSERACT is a set of CFFI bindings for the Tesseract OCR library v. 3.04: 2 | https://github.com/tesseract-ocr/tesseract 3 | 4 | On OS X, Tesseract can be conveniently installed using Homebrew: 5 | brew install tesseract 6 | 7 | As Tesseract OCR’s capi changed in the update to v. 3.04, earlier versions such as 3.02 8 | will not work with these bindings. 9 | 10 | CL-TESSERACT also provides convenient Lisp functions to retrieve text from images, 11 | IMAGE-TO-TEXT and IMAGE-TO-HOCR. 12 | 13 | IMAGE-TO-TEXT accepts a Lisp pathname and an optional language parameter and returns a 14 | Unicode string: 15 | 16 | * (image-to-text #P"~/eurotext.tif") 17 | "The (quick) [brown] {fox} jumps! 18 | Over the $43,456.78 <lazy> #90 dog 19 | & duck/goose, as 12.5% of E-mail 20 | from aspammer@website.com is spam. 21 | Der ,,schnelle” braune Fuchs springt 22 | fiber den faulen Hund.
Le renard brun 23 | «rapide» saute par-dessus le chien 24 | paresseux. La volpe marrone rapida 25 | salta sopra i] cane pigro. El zorro 26 | marrén répido salta sobre el perro 27 | perezoso. A raposa marrom répida 28 | salta sobre 0 C50 preguieoso. 29 | 30 | " 31 | 32 | * (image-to-text #P"~/eurotext.tif" :lang "rus") 33 | "ТЬе (чиісК) [Ьгошп] {Гох} ]итрз! 34 | Очег [пе $43‚456.78 <1а2у> #90 603 35 | & ‹1исК/3005е, аз 12.5% ог Е-таіі 36 | Ггот азраттег@шеЬ5і[е.сош із зрат. 37 | Бег ‚,5с11пе11е” Ьгаипе Риспз зргіпві 38 | ііЬег ‹!еп Тапіеп Нипа. Ье гепага Ьгип 39 | «гарісіе» заше раг-сіеззиз 1е сЬіеп 40 | рагеззеих. Ьа уоіре тапопе гаріаа 41 | зама зорга і] сапе рівго. Е1 гогго 42 | таггбп гёріао зама воЬге е1 репо 43 | регегозо. А гароза шапот гйріаа 44 | зака воЬге о еде ргевиісозо. 45 | 46 | " 47 | 48 | The available languages depend on the Tesseract OCR .traineddata files located in the directory denoted by *TESSDATA-DIRECTORY*. CL-TESSERACT attempts to set this variable to 49 | a reasonable default for your platform. 50 | 51 | IMAGE-TO-HOCR accepts a Lisp pathname, the optional language parameter, and an optional 52 | page number (default 0) and returns hOCR XML describing not just the recognized text, but 53 | its location on the page: 54 | 55 | * (image-to-hocr #P"~/python-tesseract/eurotext.jpg") 56 | "
57 |
58 | . . . 59 | word_2_65' title='bbox 391 621 456 651; x_wconf 72' lang='eng' dir='ltr'>C50 preguieoso. 60 | 61 |

62 |
63 |
64 | " 65 | 66 | This can be parsed using Common Lisp libraries such as Closure-XML and plump. 67 | 68 | Tested on CCL and SBCL. 69 | 70 | License: 71 | 72 | MIT 73 | 74 | Author: 75 | Edward Geist (egeist@stanford.edu) 76 | -------------------------------------------------------------------------------- /cl-tesseract.lisp: -------------------------------------------------------------------------------- 1 | ;;;; cl-tesseract.lisp 2 | 3 | (in-package #:cl-tesseract) 4 | 5 | ;;; "cl-tesseract" goes here. Hacks and glory await! 6 | 7 | (defun tesseract-version () 8 | "Returns Tesseract version as string." 9 | (tessversion)) 10 | 11 | #| On SBCL, libtesseract 3.04.00 signals a SIGFPE while carrying out text recognition. While 12 | this has no effect using CCL, it requires the use of a SBCL-specific macro on that 13 | implementation invoking SB-INT:WITH-FLOAT-TRAPS-MASKED. Under SBCL, all calls to libtesseract 14 | that carry out character recognition (process-pages, tessbaseapigetutf8text, etc.) must be 15 | executed with the :DIVIDE-BY-ZERO flag masked. |# 16 | 17 | (defmacro mask-sigfpe (&body body) 18 | "Under SBCL, this macro wraps body with SB-INT:WITH-FLOAT-TRAPS-MASKED to prevent a SIGFPE 19 | generated by libtesseract from causing a non-recoverable error. On other implementations, 20 | simply wraps body in a PROGN." 21 | #+sbcl 22 | `(sb-int:with-float-traps-masked (:invalid :divide-by-zero) 23 | ,@body) 24 | #-sbcl 25 | `(progn ,@body)) 26 | 27 | (defmacro with-base-api (api-name &body body &aux (handle (gensym))) 28 | "This macro creates a TessBaseAPI using TessBaseAPICreate and binds it to api-name. It 29 | wraps body with an UNWIND-PROTECT form to ensure that TessBaseAPIEnd and TessBaseAPIDelete 30 | are called on api-name upon exit." 
31 | `(let (,handle) 32 | (unwind-protect 33 | (let ((,api-name (setq ,handle (tessbaseapicreate)))) 34 | ,@body) 35 | (when ,handle (tessbaseapiend ,handle) ; skip cleanup if creation never succeeded 36 | (tessbaseapidelete ,handle))))) 37 | 38 | (defun init-tess-api (api lang) 39 | "Calls TessBaseAPIInit3 on api and sets lang, which should be a string naming a 40 | .traineddata file in *tessdata-directory*. Attempting to set a language that is not available 41 | will raise an error." 42 | (let ((return-code (tessbaseapiinit3 api *tessdata-directory* lang))) 43 | (case return-code 44 | (0 t) 45 | (t (error "Failure to initialize TessBaseAPI (return code ~a)." return-code))))) 46 | 47 | (defun process-pages (api truename) 48 | "This function calls TessBaseAPIProcessPages on api, which should be an initialized 49 | TessBaseAPI, and then uses Leptonica to load the image found at truename (which must be a 50 | string containing the complete path to the file). It then runs OCR on the image. On SBCL, 51 | the division-by-zero float trap is masked to prevent it from causing a non-recoverable error. 52 | 53 | The details of your Leptonica installation will determine exactly what image formats this 54 | function can process successfully. 55 | 56 | Returns T on success, otherwise signals error." 57 | (mask-sigfpe 58 | (let ((return-code (tessbaseapiprocesspages api truename "" 0 (null-pointer)))) 59 | (case return-code 60 | (1 t) 61 | (t (error "TessBaseAPIProcessPages returned failure.")))))) 62 | 63 | (defun get-utf8-text (api) 64 | "This function calls TessBaseAPIGetUTF8Text on api, which should be an initialized 65 | TessBaseAPI with its image set. If called before another function such as PROCESS-PAGES has 66 | carried out OCR on the image, it will be carried out. On SBCL, the division-by-zero float 67 | trap is masked to prevent it from causing a non-recoverable error. 68 | 69 | Returns a lisp string of the UTF-8 text produced by Tesseract."
70 | (mask-sigfpe (tessbaseapigetutf8text api))) 71 | 72 | (defun get-hocr-text (api page) 73 | "This function calls TessBaseAPIGetHOCRText on api, which should be an initialized 74 | TessBaseAPI with its image set. If called before another function such as PROCESS-PAGES has 75 | carried out OCR on the image, it will be carried out. On SBCL, the division-by-zero float 76 | trap is masked to prevent it from causing a non-recoverable error. 77 | 78 | Returns a lisp string of the HOCR XML produced by Tesseract. This can be parsed using 79 | Common Lisp XML packages such as CLOSURE-XML and plump." 80 | (mask-sigfpe (tessbaseapigethocrtext api page))) 81 | 82 | (defun image-to-text (filepath &key (lang "eng")) 83 | "Runs OCR on the file found at filepath for language lang (English by default). Returns 84 | text string." 85 | (with-base-api api 86 | (init-tess-api api lang) 87 | (process-pages api (namestring (truename filepath))) ; protected from SIGFPE on sbcl 88 | (tessbaseapigetutf8text api))) ; no need to mask float traps after process-pages 89 | 90 | (defun image-to-hocr (filepath &key (lang "eng") (page 0)) 91 | "Runs OCR on the file found at filepath for language lang (English by default). Returns 92 | HOCR xml string." 
93 | (with-base-api api 94 | (init-tess-api api lang) 95 | (process-pages api (namestring (truename filepath))) ; protected from SIGFPE on sbcl 96 | (tessbaseapigethocrtext api page))) ; no need to mask float traps after process-pages 97 | -------------------------------------------------------------------------------- /capi.lisp: -------------------------------------------------------------------------------- 1 | (in-package :cl-tesseract) 2 | 3 | ;;; Bindings for Tesseract OCR v.3.04.00 capi generated using SWIG 4 | 5 | (cffi:defcenum TessOcrEngineMode 6 | :OEM_TESSERACT_ONLY 7 | :OEM_CUBE_ONLY 8 | :OEM_TESSERACT_CUBE_COMBINED 9 | :OEM_DEFAULT) 10 | 11 | (cffi:defcenum TessPageSegMode 12 | :PSM_OSD_ONLY 13 | :PSM_AUTO_OSD 14 | :PSM_AUTO_ONLY 15 | :PSM_AUTO 16 | :PSM_SINGLE_COLUMN 17 | :PSM_SINGLE_BLOCK_VERT_TEXT 18 | :PSM_SINGLE_BLOCK 19 | :PSM_SINGLE_LINE 20 | :PSM_SINGLE_WORD 21 | :PSM_CIRCLE_WORD 22 | :PSM_SINGLE_CHAR 23 | :PSM_SPARSE_TEXT 24 | :PSM_SPARSE_TEXT_OSD 25 | :PSM_COUNT) 26 | 27 | (cffi:defcenum TessPageIteratorLevel 28 | :RIL_BLOCK 29 | :RIL_PARA 30 | :RIL_TEXTLINE 31 | :RIL_WORD 32 | :RIL_SYMBOL) 33 | 34 | (cffi:defcenum TessPolyBlockType 35 | :PT_UNKNOWN 36 | :PT_FLOWING_TEXT 37 | :PT_HEADING_TEXT 38 | :PT_PULLOUT_TEXT 39 | :PT_EQUATION 40 | :PT_INLINE_EQUATION 41 | :PT_TABLE 42 | :PT_VERTICAL_TEXT 43 | :PT_CAPTION_TEXT 44 | :PT_FLOWING_IMAGE 45 | :PT_HEADING_IMAGE 46 | :PT_PULLOUT_IMAGE 47 | :PT_HORZ_LINE 48 | :PT_VERT_LINE 49 | :PT_NOISE 50 | :PT_COUNT) 51 | 52 | (cffi:defcenum TessOrientation 53 | :ORIENTATION_PAGE_UP 54 | :ORIENTATION_PAGE_RIGHT 55 | :ORIENTATION_PAGE_DOWN 56 | :ORIENTATION_PAGE_LEFT) 57 | 58 | (cffi:defcenum TessParagraphJustification 59 | :JUSTIFICATION_UNKNOWN 60 | :JUSTIFICATION_LEFT 61 | :JUSTIFICATION_CENTER 62 | :JUSTIFICATION_RIGHT) 63 | 64 | (cffi:defcenum TessWritingDirection 65 | :WRITING_DIRECTION_LEFT_TO_RIGHT 66 | :WRITING_DIRECTION_RIGHT_TO_LEFT 67 | :WRITING_DIRECTION_TOP_TO_BOTTOM) 68 | 69 | (cffi:defcenum 
TessTextlineOrder 70 | :TEXTLINE_ORDER_LEFT_TO_RIGHT 71 | :TEXTLINE_ORDER_RIGHT_TO_LEFT 72 | :TEXTLINE_ORDER_TOP_TO_BOTTOM) 73 | 74 | (cl:defconstant TRUE 1) 75 | 76 | (cl:defconstant FALSE 0) 77 | 78 | (cffi:defcfun ("TessVersion" TessVersion) :string) 79 | 80 | (cffi:defcfun ("TessDeleteText" TessDeleteText) :void 81 | (text :string)) 82 | 83 | (cffi:defcfun ("TessDeleteTextArray" TessDeleteTextArray) :void 84 | (arr :pointer)) 85 | 86 | (cffi:defcfun ("TessDeleteIntArray" TessDeleteIntArray) :void 87 | (arr :pointer)) 88 | 89 | (cffi:defcfun ("TessDeleteBlockList" TessDeleteBlockList) :void 90 | (block_list :pointer)) 91 | 92 | (cffi:defcfun ("TessTextRendererCreate" TessTextRendererCreate) :pointer 93 | (outputbase :string)) 94 | 95 | (cffi:defcfun ("TessHOcrRendererCreate" TessHOcrRendererCreate) :pointer 96 | (outputbase :string)) 97 | 98 | (cffi:defcfun ("TessHOcrRendererCreate2" TessHOcrRendererCreate2) :pointer 99 | (outputbase :string) 100 | (font_info :int)) 101 | 102 | (cffi:defcfun ("TessPDFRendererCreate" TessPDFRendererCreate) :pointer 103 | (outputbase :string) 104 | (datadir :string)) 105 | 106 | (cffi:defcfun ("TessUnlvRendererCreate" TessUnlvRendererCreate) :pointer 107 | (outputbase :string)) 108 | 109 | (cffi:defcfun ("TessBoxTextRendererCreate" TessBoxTextRendererCreate) :pointer 110 | (outputbase :string)) 111 | 112 | (cffi:defcfun ("TessDeleteResultRenderer" TessDeleteResultRenderer) :void 113 | (renderer :pointer)) 114 | 115 | (cffi:defcfun ("TessResultRendererInsert" TessResultRendererInsert) :void 116 | (renderer :pointer) 117 | (next :pointer)) 118 | 119 | (cffi:defcfun ("TessResultRendererNext" TessResultRendererNext) :pointer 120 | (renderer :pointer)) 121 | 122 | (cffi:defcfun ("TessResultRendererBeginDocument" TessResultRendererBeginDocument) :int 123 | (renderer :pointer) 124 | (title :string)) 125 | 126 | (cffi:defcfun ("TessResultRendererAddImage" TessResultRendererAddImage) :int 127 | (renderer :pointer) 128 | (api :pointer)) 129 
| 130 | (cffi:defcfun ("TessResultRendererEndDocument" TessResultRendererEndDocument) :int 131 | (renderer :pointer)) 132 | 133 | (cffi:defcfun ("TessResultRendererExtention" TessResultRendererExtention) :string 134 | (renderer :pointer)) 135 | 136 | (cffi:defcfun ("TessResultRendererTitle" TessResultRendererTitle) :string 137 | (renderer :pointer)) 138 | 139 | (cffi:defcfun ("TessResultRendererImageNum" TessResultRendererImageNum) :int 140 | (renderer :pointer)) 141 | 142 | (cffi:defcfun ("TessBaseAPICreate" TessBaseAPICreate) :pointer) 143 | 144 | (cffi:defcfun ("TessBaseAPIDelete" TessBaseAPIDelete) :void 145 | (handle :pointer)) 146 | 147 | (cffi:defcfun ("TessBaseAPIGetOpenCLDevice" TessBaseAPIGetOpenCLDevice) :pointer 148 | (handle :pointer) 149 | (device :pointer)) 150 | 151 | (cffi:defcfun ("TessBaseAPISetInputName" TessBaseAPISetInputName) :void 152 | (handle :pointer) 153 | (name :string)) 154 | 155 | (cffi:defcfun ("TessBaseAPIGetInputName" TessBaseAPIGetInputName) :string 156 | (handle :pointer)) 157 | 158 | (cffi:defcfun ("TessBaseAPISetInputImage" TessBaseAPISetInputImage) :void 159 | (handle :pointer) 160 | (pix :pointer)) 161 | 162 | (cffi:defcfun ("TessBaseAPIGetInputImage" TessBaseAPIGetInputImage) :pointer 163 | (handle :pointer)) 164 | 165 | (cffi:defcfun ("TessBaseAPIGetSourceYResolution" TessBaseAPIGetSourceYResolution) :int 166 | (handle :pointer)) 167 | 168 | (cffi:defcfun ("TessBaseAPIGetDatapath" TessBaseAPIGetDatapath) :string 169 | (handle :pointer)) 170 | 171 | (cffi:defcfun ("TessBaseAPISetOutputName" TessBaseAPISetOutputName) :void 172 | (handle :pointer) 173 | (name :string)) 174 | 175 | (cffi:defcfun ("TessBaseAPISetVariable" TessBaseAPISetVariable) :int 176 | (handle :pointer) 177 | (name :string) 178 | (value :string)) 179 | 180 | (cffi:defcfun ("TessBaseAPISetDebugVariable" TessBaseAPISetDebugVariable) :int 181 | (handle :pointer) 182 | (name :string) 183 | (value :string)) 184 | 185 | (cffi:defcfun ("TessBaseAPIGetIntVariable" 
TessBaseAPIGetIntVariable) :int 186 | (handle :pointer) 187 | (name :string) 188 | (value :pointer)) 189 | 190 | (cffi:defcfun ("TessBaseAPIGetBoolVariable" TessBaseAPIGetBoolVariable) :int 191 | (handle :pointer) 192 | (name :string) 193 | (value :pointer)) 194 | 195 | (cffi:defcfun ("TessBaseAPIGetDoubleVariable" TessBaseAPIGetDoubleVariable) :int 196 | (handle :pointer) 197 | (name :string) 198 | (value :pointer)) 199 | 200 | (cffi:defcfun ("TessBaseAPIGetStringVariable" TessBaseAPIGetStringVariable) :string 201 | (handle :pointer) 202 | (name :string)) 203 | 204 | (cffi:defcfun ("TessBaseAPIPrintVariables" TessBaseAPIPrintVariables) :void 205 | (handle :pointer) 206 | (fp :pointer)) 207 | 208 | (cffi:defcfun ("TessBaseAPIPrintVariablesToFile" TessBaseAPIPrintVariablesToFile) :int 209 | (handle :pointer) 210 | (filename :string)) 211 | 212 | (cffi:defcfun ("TessBaseAPIGetVariableAsString" TessBaseAPIGetVariableAsString) :int 213 | (handle :pointer) 214 | (name :string) 215 | (val :pointer)) 216 | 217 | (cffi:defcfun ("TessBaseAPIInit" TessBaseAPIInit) :int 218 | (handle :pointer) 219 | (datapath :string) 220 | (language :string) 221 | (mode TessOcrEngineMode) 222 | (configs :pointer) 223 | (configs_size :int) 224 | (vars_vec :pointer) 225 | (vars_vec_size :pointer) 226 | (vars_values :pointer) 227 | (vars_values_size :pointer) 228 | (set_only_init_params :int)) 229 | 230 | (cffi:defcfun ("TessBaseAPIInit1" TessBaseAPIInit1) :int 231 | (handle :pointer) 232 | (datapath :string) 233 | (language :string) 234 | (oem TessOcrEngineMode) 235 | (configs :pointer) 236 | (configs_size :int)) 237 | 238 | (cffi:defcfun ("TessBaseAPIInit2" TessBaseAPIInit2) :int 239 | (handle :pointer) 240 | (datapath :string) 241 | (language :string) 242 | (oem TessOcrEngineMode)) 243 | 244 | (cffi:defcfun ("TessBaseAPIInit3" TessBaseAPIInit3) :int 245 | (handle :pointer) 246 | (datapath :string) 247 | (language :string)) 248 | 249 | (cffi:defcfun ("TessBaseAPIInit4" TessBaseAPIInit4) :int 
250 | (handle :pointer) 251 | (datapath :string) 252 | (language :string) 253 | (mode TessOcrEngineMode) 254 | (configs :pointer) 255 | (configs_size :int) 256 | (vars_vec :pointer) 257 | (vars_values :pointer) 258 | (vars_vec_size :pointer) 259 | (set_only_non_debug_params :int)) 260 | 261 | (cffi:defcfun ("TessBaseAPIGetInitLanguagesAsString" TessBaseAPIGetInitLanguagesAsString) :string 262 | (handle :pointer)) 263 | 264 | (cffi:defcfun ("TessBaseAPIGetLoadedLanguagesAsVector" TessBaseAPIGetLoadedLanguagesAsVector) :pointer 265 | (handle :pointer)) 266 | 267 | (cffi:defcfun ("TessBaseAPIGetAvailableLanguagesAsVector" TessBaseAPIGetAvailableLanguagesAsVector) :pointer 268 | (handle :pointer)) 269 | 270 | (cffi:defcfun ("TessBaseAPIInitLangMod" TessBaseAPIInitLangMod) :int 271 | (handle :pointer) 272 | (datapath :string) 273 | (language :string)) 274 | 275 | (cffi:defcfun ("TessBaseAPIInitForAnalysePage" TessBaseAPIInitForAnalysePage) :void 276 | (handle :pointer)) 277 | 278 | (cffi:defcfun ("TessBaseAPIReadConfigFile" TessBaseAPIReadConfigFile) :void 279 | (handle :pointer) 280 | (filename :string)) 281 | 282 | (cffi:defcfun ("TessBaseAPIReadDebugConfigFile" TessBaseAPIReadDebugConfigFile) :void 283 | (handle :pointer) 284 | (filename :string)) 285 | 286 | (cffi:defcfun ("TessBaseAPISetPageSegMode" TessBaseAPISetPageSegMode) :void 287 | (handle :pointer) 288 | (mode TessPageSegMode)) 289 | 290 | (cffi:defcfun ("TessBaseAPIGetPageSegMode" TessBaseAPIGetPageSegMode) TessPageSegMode 291 | (handle :pointer)) 292 | 293 | (cffi:defcfun ("TessBaseAPIRect" TessBaseAPIRect) :string 294 | (handle :pointer) 295 | (imagedata :pointer) 296 | (bytes_per_pixel :int) 297 | (bytes_per_line :int) 298 | (left :int) 299 | (top :int) 300 | (width :int) 301 | (height :int)) 302 | 303 | (cffi:defcfun ("TessBaseAPIClearAdaptiveClassifier" TessBaseAPIClearAdaptiveClassifier) :void 304 | (handle :pointer)) 305 | 306 | (cffi:defcfun ("TessBaseAPISetImage" TessBaseAPISetImage) :void 307 | 
(handle :pointer) 308 | (imagedata :pointer) 309 | (width :int) 310 | (height :int) 311 | (bytes_per_pixel :int) 312 | (bytes_per_line :int)) 313 | 314 | (cffi:defcfun ("TessBaseAPISetImage2" TessBaseAPISetImage2) :void 315 | (handle :pointer) 316 | (pix :pointer)) 317 | 318 | (cffi:defcfun ("TessBaseAPISetSourceResolution" TessBaseAPISetSourceResolution) :void 319 | (handle :pointer) 320 | (ppi :int)) 321 | 322 | (cffi:defcfun ("TessBaseAPISetRectangle" TessBaseAPISetRectangle) :void 323 | (handle :pointer) 324 | (left :int) 325 | (top :int) 326 | (width :int) 327 | (height :int)) 328 | 329 | (cffi:defcfun ("TessBaseAPISetThresholder" TessBaseAPISetThresholder) :void 330 | (handle :pointer) 331 | (thresholder :pointer)) 332 | 333 | (cffi:defcfun ("TessBaseAPIGetThresholdedImage" TessBaseAPIGetThresholdedImage) :pointer 334 | (handle :pointer)) 335 | 336 | (cffi:defcfun ("TessBaseAPIGetRegions" TessBaseAPIGetRegions) :pointer 337 | (handle :pointer) 338 | (pixa :pointer)) 339 | 340 | (cffi:defcfun ("TessBaseAPIGetTextlines" TessBaseAPIGetTextlines) :pointer 341 | (handle :pointer) 342 | (pixa :pointer) 343 | (blockids :pointer)) 344 | 345 | (cffi:defcfun ("TessBaseAPIGetTextlines1" TessBaseAPIGetTextlines1) :pointer 346 | (handle :pointer) 347 | (raw_image :int) 348 | (raw_padding :int) 349 | (pixa :pointer) 350 | (blockids :pointer) 351 | (paraids :pointer)) 352 | 353 | (cffi:defcfun ("TessBaseAPIGetStrips" TessBaseAPIGetStrips) :pointer 354 | (handle :pointer) 355 | (pixa :pointer) 356 | (blockids :pointer)) 357 | 358 | (cffi:defcfun ("TessBaseAPIGetWords" TessBaseAPIGetWords) :pointer 359 | (handle :pointer) 360 | (pixa :pointer)) 361 | 362 | (cffi:defcfun ("TessBaseAPIGetConnectedComponents" TessBaseAPIGetConnectedComponents) :pointer 363 | (handle :pointer) 364 | (cc :pointer)) 365 | 366 | (cffi:defcfun ("TessBaseAPIGetComponentImages" TessBaseAPIGetComponentImages) :pointer 367 | (handle :pointer) 368 | (level TessPageIteratorLevel) 369 | (text_only :int) 370 
| (pixa :pointer) 371 | (blockids :pointer)) 372 | 373 | (cffi:defcfun ("TessBaseAPIGetComponentImages1" TessBaseAPIGetComponentImages1) :pointer 374 | (handle :pointer) 375 | (level TessPageIteratorLevel) 376 | (text_only :int) 377 | (raw_image :int) 378 | (raw_padding :int) 379 | (pixa :pointer) 380 | (blockids :pointer) 381 | (paraids :pointer)) 382 | 383 | (cffi:defcfun ("TessBaseAPIGetThresholdedImageScaleFactor" TessBaseAPIGetThresholdedImageScaleFactor) :int 384 | (handle :pointer)) 385 | 386 | (cffi:defcfun ("TessBaseAPIDumpPGM" TessBaseAPIDumpPGM) :void 387 | (handle :pointer) 388 | (filename :string)) 389 | 390 | (cffi:defcfun ("TessBaseAPIAnalyseLayout" TessBaseAPIAnalyseLayout) :pointer 391 | (handle :pointer)) 392 | 393 | (cffi:defcfun ("TessBaseAPIRecognize" TessBaseAPIRecognize) :int 394 | (handle :pointer) 395 | (monitor :pointer)) 396 | 397 | (cffi:defcfun ("TessBaseAPIRecognizeForChopTest" TessBaseAPIRecognizeForChopTest) :int 398 | (handle :pointer) 399 | (monitor :pointer)) 400 | 401 | (cffi:defcfun ("TessBaseAPIProcessPages" TessBaseAPIProcessPages) :int 402 | (handle :pointer) 403 | (filename :string) 404 | (retry_config :string) 405 | (timeout_millisec :int) 406 | (renderer :pointer)) 407 | 408 | (cffi:defcfun ("TessBaseAPIProcessPage" TessBaseAPIProcessPage) :int 409 | (handle :pointer) 410 | (pix :pointer) 411 | (page_index :int) 412 | (filename :string) 413 | (retry_config :string) 414 | (timeout_millisec :int) 415 | (renderer :pointer)) 416 | 417 | (cffi:defcfun ("TessBaseAPIGetIterator" TessBaseAPIGetIterator) :pointer 418 | (handle :pointer)) 419 | 420 | (cffi:defcfun ("TessBaseAPIGetMutableIterator" TessBaseAPIGetMutableIterator) :pointer 421 | (handle :pointer)) 422 | 423 | (cffi:defcfun ("TessBaseAPIGetUTF8Text" TessBaseAPIGetUTF8Text) :string 424 | (handle :pointer)) 425 | 426 | (cffi:defcfun ("TessBaseAPIGetHOCRText" TessBaseAPIGetHOCRText) :string 427 | (handle :pointer) 428 | (page_number :int)) 429 | 430 | (cffi:defcfun 
("TessBaseAPIGetBoxText" TessBaseAPIGetBoxText) :string 431 | (handle :pointer) 432 | (page_number :int)) 433 | 434 | (cffi:defcfun ("TessBaseAPIGetUNLVText" TessBaseAPIGetUNLVText) :string 435 | (handle :pointer)) 436 | 437 | (cffi:defcfun ("TessBaseAPIMeanTextConf" TessBaseAPIMeanTextConf) :int 438 | (handle :pointer)) 439 | 440 | (cffi:defcfun ("TessBaseAPIAllWordConfidences" TessBaseAPIAllWordConfidences) :pointer 441 | (handle :pointer)) 442 | 443 | (cffi:defcfun ("TessBaseAPIAdaptToWordStr" TessBaseAPIAdaptToWordStr) :int 444 | (handle :pointer) 445 | (mode TessPageSegMode) 446 | (wordstr :string)) 447 | 448 | (cffi:defcfun ("TessBaseAPIClear" TessBaseAPIClear) :void 449 | (handle :pointer)) 450 | 451 | (cffi:defcfun ("TessBaseAPIEnd" TessBaseAPIEnd) :void 452 | (handle :pointer)) 453 | 454 | (cffi:defcfun ("TessBaseAPIIsValidWord" TessBaseAPIIsValidWord) :int 455 | (handle :pointer) 456 | (word :string)) 457 | 458 | (cffi:defcfun ("TessBaseAPIGetTextDirection" TessBaseAPIGetTextDirection) :int 459 | (handle :pointer) 460 | (out_offset :pointer) 461 | (out_slope :pointer)) 462 | 463 | (cffi:defcfun ("TessBaseAPISetDictFunc" TessBaseAPISetDictFunc) :void 464 | (handle :pointer) 465 | (f :pointer)) 466 | 467 | (cffi:defcfun ("TessBaseAPIClearPersistentCache" TessBaseAPIClearPersistentCache) :void 468 | (handle :pointer)) 469 | 470 | (cffi:defcfun ("TessBaseAPISetProbabilityInContextFunc" TessBaseAPISetProbabilityInContextFunc) :void 471 | (handle :pointer) 472 | (f :pointer)) 473 | 474 | (cffi:defcfun ("TessBaseAPISetFillLatticeFunc" TessBaseAPISetFillLatticeFunc) :void 475 | (handle :pointer) 476 | (f :pointer)) 477 | 478 | (cffi:defcfun ("TessBaseAPIDetectOS" TessBaseAPIDetectOS) :int 479 | (handle :pointer) 480 | (results :pointer)) 481 | 482 | (cffi:defcfun ("TessBaseAPIGetFeaturesForBlob" TessBaseAPIGetFeaturesForBlob) :void 483 | (handle :pointer) 484 | (blob :pointer) 485 | (int_features :pointer) 486 | (num_features :pointer) 487 | (FeatureOutlineIndex 
:pointer)) 488 | 489 | (cffi:defcfun ("TessFindRowForBox" TessFindRowForBox) :pointer 490 | (blocks :pointer) 491 | (left :int) 492 | (top :int) 493 | (right :int) 494 | (bottom :int)) 495 | 496 | (cffi:defcfun ("TessBaseAPIRunAdaptiveClassifier" TessBaseAPIRunAdaptiveClassifier) :void 497 | (handle :pointer) 498 | (blob :pointer) 499 | (num_max_matches :int) 500 | (unichar_ids :pointer) 501 | (ratings :pointer) 502 | (num_matches_returned :pointer)) 503 | 504 | (cffi:defcfun ("TessBaseAPIGetUnichar" TessBaseAPIGetUnichar) :string 505 | (handle :pointer) 506 | (unichar_id :int)) 507 | 508 | (cffi:defcfun ("TessBaseAPIGetDawg" TessBaseAPIGetDawg) :pointer 509 | (handle :pointer) 510 | (i :int)) 511 | 512 | (cffi:defcfun ("TessBaseAPINumDawgs" TessBaseAPINumDawgs) :int 513 | (handle :pointer)) 514 | 515 | (cffi:defcfun ("TessMakeTessOCRRow" TessMakeTessOCRRow) :pointer 516 | (baseline :float) 517 | (xheight :float) 518 | (descender :float) 519 | (ascender :float)) 520 | 521 | (cffi:defcfun ("TessMakeTBLOB" TessMakeTBLOB) :pointer 522 | (pix :pointer)) 523 | 524 | (cffi:defcfun ("TessNormalizeTBLOB" TessNormalizeTBLOB) :void 525 | (tblob :pointer) 526 | (row :pointer) 527 | (numeric_mode :int)) 528 | 529 | (cffi:defcfun ("TessBaseAPIOem" TessBaseAPIOem) TessOcrEngineMode 530 | (handle :pointer)) 531 | 532 | (cffi:defcfun ("TessBaseAPIInitTruthCallback" TessBaseAPIInitTruthCallback) :void 533 | (handle :pointer) 534 | (cb :pointer)) 535 | 536 | (cffi:defcfun ("TessBaseAPIGetCubeRecoContext" TessBaseAPIGetCubeRecoContext) :pointer 537 | (handle :pointer)) 538 | 539 | (cffi:defcfun ("TessBaseAPISetMinOrientationMargin" TessBaseAPISetMinOrientationMargin) :void 540 | (handle :pointer) 541 | (margin :double)) 542 | 543 | (cffi:defcfun ("TessBaseGetBlockTextOrientations" TessBaseGetBlockTextOrientations) :void 544 | (handle :pointer) 545 | (block_orientation :pointer) 546 | (vertical_writing :pointer)) 547 | 548 | (cffi:defcfun ("TessBaseAPIFindLinesCreateBlockList" 
TessBaseAPIFindLinesCreateBlockList) :pointer 549 | (handle :pointer)) 550 | 551 | (cffi:defcfun ("TessPageIteratorDelete" TessPageIteratorDelete) :void 552 | (handle :pointer)) 553 | 554 | (cffi:defcfun ("TessPageIteratorCopy" TessPageIteratorCopy) :pointer 555 | (handle :pointer)) 556 | 557 | (cffi:defcfun ("TessPageIteratorBegin" TessPageIteratorBegin) :void 558 | (handle :pointer)) 559 | 560 | (cffi:defcfun ("TessPageIteratorNext" TessPageIteratorNext) :int 561 | (handle :pointer) 562 | (level TessPageIteratorLevel)) 563 | 564 | (cffi:defcfun ("TessPageIteratorIsAtBeginningOf" TessPageIteratorIsAtBeginningOf) :int 565 | (handle :pointer) 566 | (level TessPageIteratorLevel)) 567 | 568 | (cffi:defcfun ("TessPageIteratorIsAtFinalElement" TessPageIteratorIsAtFinalElement) :int 569 | (handle :pointer) 570 | (level TessPageIteratorLevel) 571 | (element TessPageIteratorLevel)) 572 | 573 | (cffi:defcfun ("TessPageIteratorBoundingBox" TessPageIteratorBoundingBox) :int 574 | (handle :pointer) 575 | (level TessPageIteratorLevel) 576 | (left :pointer) 577 | (top :pointer) 578 | (right :pointer) 579 | (bottom :pointer)) 580 | 581 | (cffi:defcfun ("TessPageIteratorBlockType" TessPageIteratorBlockType) TessPolyBlockType 582 | (handle :pointer)) 583 | 584 | (cffi:defcfun ("TessPageIteratorGetBinaryImage" TessPageIteratorGetBinaryImage) :pointer 585 | (handle :pointer) 586 | (level TessPageIteratorLevel)) 587 | 588 | (cffi:defcfun ("TessPageIteratorGetImage" TessPageIteratorGetImage) :pointer 589 | (handle :pointer) 590 | (level TessPageIteratorLevel) 591 | (padding :int) 592 | (original_image :pointer) 593 | (left :pointer) 594 | (top :pointer)) 595 | 596 | (cffi:defcfun ("TessPageIteratorBaseline" TessPageIteratorBaseline) :int 597 | (handle :pointer) 598 | (level TessPageIteratorLevel) 599 | (x1 :pointer) 600 | (y1 :pointer) 601 | (x2 :pointer) 602 | (y2 :pointer)) 603 | 604 | (cffi:defcfun ("TessPageIteratorOrientation" TessPageIteratorOrientation) :void 605 | (handle 
:pointer) 606 | (orientation :pointer) 607 | (writing_direction :pointer) 608 | (textline_order :pointer) 609 | (deskew_angle :pointer)) 610 | 611 | (cffi:defcfun ("TessPageIteratorParagraphInfo" TessPageIteratorParagraphInfo) :void 612 | (handle :pointer) 613 | (justification :pointer) 614 | (is_list_item :pointer) 615 | (is_crown :pointer) 616 | (first_line_indent :pointer)) 617 | 618 | (cffi:defcfun ("TessResultIteratorDelete" TessResultIteratorDelete) :void 619 | (handle :pointer)) 620 | 621 | (cffi:defcfun ("TessResultIteratorCopy" TessResultIteratorCopy) :pointer 622 | (handle :pointer)) 623 | 624 | (cffi:defcfun ("TessResultIteratorGetPageIterator" TessResultIteratorGetPageIterator) :pointer 625 | (handle :pointer)) 626 | 627 | (cffi:defcfun ("TessResultIteratorGetPageIteratorConst" TessResultIteratorGetPageIteratorConst) :pointer 628 | (handle :pointer)) 629 | 630 | (cffi:defcfun ("TessResultIteratorGetChoiceIterator" TessResultIteratorGetChoiceIterator) :pointer 631 | (handle :pointer)) 632 | 633 | (cffi:defcfun ("TessResultIteratorNext" TessResultIteratorNext) :int 634 | (handle :pointer) 635 | (level TessPageIteratorLevel)) 636 | 637 | (cffi:defcfun ("TessResultIteratorGetUTF8Text" TessResultIteratorGetUTF8Text) :string 638 | (handle :pointer) 639 | (level TessPageIteratorLevel)) 640 | 641 | (cffi:defcfun ("TessResultIteratorConfidence" TessResultIteratorConfidence) :float 642 | (handle :pointer) 643 | (level TessPageIteratorLevel)) 644 | 645 | (cffi:defcfun ("TessResultIteratorWordRecognitionLanguage" TessResultIteratorWordRecognitionLanguage) :string 646 | (handle :pointer)) 647 | 648 | (cffi:defcfun ("TessResultIteratorWordFontAttributes" TessResultIteratorWordFontAttributes) :string 649 | (handle :pointer) 650 | (is_bold :pointer) 651 | (is_italic :pointer) 652 | (is_underlined :pointer) 653 | (is_monospace :pointer) 654 | (is_serif :pointer) 655 | (is_smallcaps :pointer) 656 | (pointsize :pointer) 657 | (font_id :pointer)) 658 | 659 | (cffi:defcfun 
("TessResultIteratorWordIsFromDictionary" TessResultIteratorWordIsFromDictionary) :int 660 | (handle :pointer)) 661 | 662 | (cffi:defcfun ("TessResultIteratorWordIsNumeric" TessResultIteratorWordIsNumeric) :int 663 | (handle :pointer)) 664 | 665 | (cffi:defcfun ("TessResultIteratorSymbolIsSuperscript" TessResultIteratorSymbolIsSuperscript) :int 666 | (handle :pointer)) 667 | 668 | (cffi:defcfun ("TessResultIteratorSymbolIsSubscript" TessResultIteratorSymbolIsSubscript) :int 669 | (handle :pointer)) 670 | 671 | (cffi:defcfun ("TessResultIteratorSymbolIsDropcap" TessResultIteratorSymbolIsDropcap) :int 672 | (handle :pointer)) 673 | 674 | (cffi:defcfun ("TessChoiceIteratorDelete" TessChoiceIteratorDelete) :void 675 | (handle :pointer)) 676 | 677 | (cffi:defcfun ("TessChoiceIteratorNext" TessChoiceIteratorNext) :int 678 | (handle :pointer)) 679 | 680 | (cffi:defcfun ("TessChoiceIteratorGetUTF8Text" TessChoiceIteratorGetUTF8Text) :string 681 | (handle :pointer)) 682 | 683 | (cffi:defcfun ("TessChoiceIteratorConfidence" TessChoiceIteratorConfidence) :float 684 | (handle :pointer)) 685 | --------------------------------------------------------------------------------