├── ocred ├── py.typed ├── version.pyi ├── __init__.py ├── preprocessing.py └── ocr.py ├── tests ├── __init__.py ├── test_preprocessing.py └── test_ocr.py ├── docs ├── index.md ├── changelog.md ├── conduct.md ├── contributing.md ├── reference.md ├── requirements.txt ├── mathjax-config.js ├── _overrides │ └── partial │ │ └── source.html ├── install.md └── stylesheets │ └── extra.css ├── .gitattributes ├── images ├── Page.png ├── CosmosOne.jpg ├── CosmosTwo.jpg ├── signboard.jpg ├── 1146-receipt.jpg ├── 1166-receipt.jpg └── 1174-receipt.jpg ├── .github ├── codecov.yml ├── ISSUE_TEMPLATE │ ├── config.yml │ ├── feature_request.yml │ └── bug_report.yml ├── PULL_REQUEST_TEMPLATE.md └── workflows │ ├── cd.yml │ └── ci.yml ├── .gitmodules ├── .git_archival.txt ├── .coveragerc ├── .flake8 ├── .gitignore ├── .readthedocs.yaml ├── LICENSE ├── .all-contributorsrc ├── noxfile.py ├── mkdocs.yml ├── .pre-commit-config.yaml ├── pyproject.toml ├── CHANGELOG.md ├── CODE_OF_CONDUCT.md ├── CONTRIBUTING.md └── README.md /ocred/py.typed: -------------------------------------------------------------------------------- 1 | -------------------------------------------------------------------------------- /tests/__init__.py: -------------------------------------------------------------------------------- 1 | -------------------------------------------------------------------------------- /docs/index.md: -------------------------------------------------------------------------------- 1 | --8<-- "README.md" 2 | -------------------------------------------------------------------------------- /docs/changelog.md: -------------------------------------------------------------------------------- 1 | --8<-- "CHANGELOG.md" 2 | -------------------------------------------------------------------------------- /docs/conduct.md: -------------------------------------------------------------------------------- 1 | --8<-- "CODE_OF_CONDUCT.md" 2 | 
-------------------------------------------------------------------------------- /.gitattributes: -------------------------------------------------------------------------------- 1 | .git_archival.txt export-subst 2 | -------------------------------------------------------------------------------- /docs/contributing.md: -------------------------------------------------------------------------------- 1 | --8<-- "CONTRIBUTING.md" 2 | -------------------------------------------------------------------------------- /images/Page.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Saransh-cpp/OCRed/HEAD/images/Page.png -------------------------------------------------------------------------------- /images/CosmosOne.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Saransh-cpp/OCRed/HEAD/images/CosmosOne.jpg -------------------------------------------------------------------------------- /images/CosmosTwo.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Saransh-cpp/OCRed/HEAD/images/CosmosTwo.jpg -------------------------------------------------------------------------------- /images/signboard.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Saransh-cpp/OCRed/HEAD/images/signboard.jpg -------------------------------------------------------------------------------- /.github/codecov.yml: -------------------------------------------------------------------------------- 1 | codecov: 2 | notify: 3 | after_n_builds: 6 # update after windows CI work 4 | -------------------------------------------------------------------------------- /images/1146-receipt.jpg: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/Saransh-cpp/OCRed/HEAD/images/1146-receipt.jpg -------------------------------------------------------------------------------- /images/1166-receipt.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Saransh-cpp/OCRed/HEAD/images/1166-receipt.jpg -------------------------------------------------------------------------------- /images/1174-receipt.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Saransh-cpp/OCRed/HEAD/images/1174-receipt.jpg -------------------------------------------------------------------------------- /.gitmodules: -------------------------------------------------------------------------------- 1 | [submodule "ocred_backend"] 2 | path = ocred_backend 3 | url = https://github.com/Saransh-cpp/ocred_backend 4 | -------------------------------------------------------------------------------- /docs/reference.md: -------------------------------------------------------------------------------- 1 | ## OCR class 2 | 3 | ::: ocred.ocr.OCR 4 | 5 | ## Preprocessor class 6 | 7 | ::: ocred.preprocessing.Preprocessor 8 | -------------------------------------------------------------------------------- /ocred/version.pyi: -------------------------------------------------------------------------------- 1 | from __future__ import annotations 2 | 3 | version: str 4 | version_tuple: tuple[int, int, int] | tuple[int, int, int, str, str] 5 | -------------------------------------------------------------------------------- /.git_archival.txt: -------------------------------------------------------------------------------- 1 | node: b286e251aa0120792932020bb22aacf2d46b4cc0 2 | node-date: 2024-10-07T12:59:54+01:00 3 | describe-name: v0.4.0-64-gb286e25 4 | ref-names: HEAD -> main 5 | -------------------------------------------------------------------------------- /.coveragerc: 
-------------------------------------------------------------------------------- 1 | [run] 2 | omit=*/test* 3 | 4 | [report] 5 | exclude_lines = 6 | if not self.testing: 7 | if not testing: 8 | pragma: no cover 9 | if __name__ == .__main__.: 10 | -------------------------------------------------------------------------------- /.flake8: -------------------------------------------------------------------------------- 1 | [flake8] 2 | extend-select = B9 3 | extend-ignore = E501, E203, D103, D102, D101, D100, D107, D105, D205, D400, D401, D104, D412, B006, B950 4 | per-file-ignores = 5 | tests/*: T 6 | noxfile.py: T 7 | -------------------------------------------------------------------------------- /ocred/__init__.py: -------------------------------------------------------------------------------- 1 | from __future__ import annotations 2 | 3 | from .ocr import OCR # noqa: F401 4 | from .preprocessing import Preprocessor # noqa: F401 5 | from .version import version as __version__ # noqa: F401 6 | -------------------------------------------------------------------------------- /docs/requirements.txt: -------------------------------------------------------------------------------- 1 | markdown-callouts>=0.2.0 2 | mkdocs>=1.3.1 3 | mkdocs-include-exclude-files>=0.0.1 4 | mkdocs-jupyter>=0.21.0 5 | mkdocs-material>=8.3.9 6 | mkdocstrings-python>=0.7.1 7 | mkdocstrings-python-legacy>=0.2.3 8 | pymdown-extensions>=9.5 9 | -------------------------------------------------------------------------------- /.github/ISSUE_TEMPLATE/config.yml: -------------------------------------------------------------------------------- 1 | blank_issues_enabled: true 2 | contact_links: 3 | - name: I'm unsure where to go 4 | url: https://github.com/Saransh-cpp/OCRed/discussions 5 | about: If you are unsure where to go, then joining our chat is recommended; Just ask! 
6 | -------------------------------------------------------------------------------- /.gitignore: -------------------------------------------------------------------------------- 1 | .vscode 2 | .idea 3 | __pycache__ 4 | build 5 | dist 6 | ocred.egg-info 7 | OCR.png 8 | output.txt 9 | audio.mp3 10 | preprocessed.png 11 | scanned.png 12 | thick_font.png 13 | noise_free.png 14 | .coverage 15 | .env 16 | ocred/version.py 17 | .pytest_cache 18 | site/ 19 | .nox/ 20 | .mypy_cache/ 21 | .DS_Store 22 | -------------------------------------------------------------------------------- /.readthedocs.yaml: -------------------------------------------------------------------------------- 1 | # .readthedocs.yaml 2 | # Read the Docs configuration file 3 | # See https://docs.readthedocs.io/en/stable/config-file/v2.html for details 4 | 5 | # Required 6 | version: 2 7 | 8 | build: 9 | os: "ubuntu-20.04" 10 | tools: 11 | python: "3.9" 12 | 13 | python: 14 | install: 15 | - requirements: docs/requirements.txt 16 | 17 | mkdocs: 18 | configuration: mkdocs.yml 19 | fail_on_warning: false 20 | -------------------------------------------------------------------------------- /.github/PULL_REQUEST_TEMPLATE.md: -------------------------------------------------------------------------------- 1 | # Description 2 | 3 | Please include a summary of the change and which issue is fixed. Please also include relevant motivation and context. List any dependencies that are required for this change. 
4 | 5 | Fixes # (issue) 6 | 7 | ## Type of change 8 | 9 | - [ ] New feature (non-breaking change which adds functionality) 10 | - [ ] Optimization (back-end change that speeds up the code) 11 | - [ ] Bug fix (non-breaking change which fixes an issue) 12 | 13 | # Key checklist: 14 | 15 | - [ ] No style issues: `$ flake8` 16 | - [ ] All tests pass: `$ pytest tests/` 17 | 18 | ## Further checks: 19 | 20 | - [ ] Code is commented, particularly in hard-to-understand areas 21 | - [ ] Tests added that prove fix is effective or that feature works 22 | -------------------------------------------------------------------------------- /docs/mathjax-config.js: -------------------------------------------------------------------------------- 1 | /* mathjax-loader.js file */ 2 | /* ref: http://facelessuser.github.io/pymdown-extensions/extensions/arithmatex/ */ 3 | (function (win, doc) { 4 | win.MathJax = { 5 | config: ["MMLorHTML.js"], 6 | extensions: ["tex2jax.js"], 7 | jax: ["input/TeX"], 8 | tex2jax: { 9 | inlineMath: [["\\(", "\\)"]], 10 | displayMath: [["\\[", "\\]"]], 11 | }, 12 | TeX: { 13 | TagSide: "right", 14 | TagIndent: ".8em", 15 | MultLineWidth: "85%", 16 | equationNumbers: { 17 | autoNumber: "AMS", 18 | }, 19 | unicode: { 20 | fonts: "STIXGeneral,'Arial Unicode MS'", 21 | }, 22 | }, 23 | displayAlign: "center", 24 | showProcessingMessages: false, 25 | messageStyle: "none", 26 | }; 27 | })(window, document); 28 | -------------------------------------------------------------------------------- /docs/_overrides/partial/source.html: -------------------------------------------------------------------------------- 1 | {% import "partials/language.html" as lang with context %} 2 | 8 |
9 | {% set icon = config.theme.icon.repo or "fontawesome/brands/git-alt" %} {% 10 | include ".icons/" ~ icon ~ ".svg" %} 11 |
12 |
{{ config.repo_name }}
13 |
14 | {% if config.theme.twitter_name %} 15 | 16 |
17 | {% include ".icons/fontawesome/brands/twitter.svg" %} 18 |
19 |
{{ config.theme.twitter_name }}
20 |
21 | {% endif %} 22 | -------------------------------------------------------------------------------- /.github/ISSUE_TEMPLATE/feature_request.yml: -------------------------------------------------------------------------------- 1 | name: Feature Request 2 | description: Suggest an idea for this project 3 | labels: ["feature"] 4 | body: 5 | - type: markdown 6 | attributes: 7 | value: | 8 | Thanks for taking the time to fill out this form! 9 | - type: textarea 10 | id: description 11 | attributes: 12 | label: Description 13 | description: Explanation of the feature. 14 | validations: 15 | required: true 16 | - type: textarea 17 | id: motivation 18 | attributes: 19 | label: Motivation 20 | description: Why are we doing this? What use cases does it support? What is the expected outcome? 21 | - type: textarea 22 | id: implementation 23 | attributes: 24 | label: Possible Implementation 25 | description: Suggest an idea for implementing the addition or change. 26 | - type: textarea 27 | id: additional-context 28 | attributes: 29 | label: Additional context 30 | description: Add any other context or screenshots about the feature request here. 31 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | Copyright 2021 - 2022 Saransh Chopra 2 | 3 | Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: 4 | 5 | The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. 
6 | 7 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 8 | -------------------------------------------------------------------------------- /.all-contributorsrc: -------------------------------------------------------------------------------- 1 | { 2 | "files": [ 3 | "README.md" 4 | ], 5 | "imageSize": 100, 6 | "commit": false, 7 | "contributors": [ 8 | { 9 | "login": "Saransh-cpp", 10 | "name": "Saransh", 11 | "avatar_url": "https://avatars.githubusercontent.com/u/74055102?v=4", 12 | "profile": "https://saransh-cpp.github.io/", 13 | "contributions": [ 14 | "code", 15 | "bug", 16 | "content", 17 | "doc", 18 | "design", 19 | "example", 20 | "ideas", 21 | "infra", 22 | "maintenance", 23 | "platform", 24 | "review", 25 | "test", 26 | "tutorial", 27 | "mentoring" 28 | ] 29 | }, 30 | { 31 | "login": "priyanshi-git", 32 | "name": "Priyanshi Goel", 33 | "avatar_url": "https://avatars.githubusercontent.com/u/82112540?v=4", 34 | "profile": "https://github.com/priyanshi-git", 35 | "contributions": [ 36 | "bug" 37 | ] 38 | } 39 | ], 40 | "contributorsPerLine": 7, 41 | "projectName": "OCRed", 42 | "projectOwner": "Saransh-cpp", 43 | "repoType": "github", 44 | "repoHost": "https://github.com", 45 | "skipCi": true 46 | } 47 | -------------------------------------------------------------------------------- /docs/install.md: -------------------------------------------------------------------------------- 1 | # Installation 2 | 3 | Follow the steps below to install `ocred` locally. 
4 | 5 | ## Create a virtual environment 6 | 7 | Create and activate a virtual environment: 8 | 9 | ```bash 10 | python -m venv env 11 | 12 | . env/bin/activate 13 | ``` 14 | 15 | ## Install OCRed 16 | 17 | - Install Tesseract for your OS and add it to your PATH 18 | 19 | The installation guide is available in the [Tesseract documentation](https://tesseract-ocr.github.io/tessdoc/Installation.html). 20 | 21 | - Install with `pip` 22 | 23 | `OCRed` uses modern Python packaging and can be installed with `pip`: 24 | 25 | ```bash 26 | python -m pip install ocred 27 | ``` 28 | 29 | ## Build OCRed from source 30 | 31 | If you want to develop `OCRed`, or use its latest commit (which may be unstable), install it from source: 32 | 33 | - Install Tesseract for your OS and add it to your PATH 34 | 35 | The installation guide is available in the [Tesseract documentation](https://tesseract-ocr.github.io/tessdoc/Installation.html). 36 | 37 | - Clone this repository 38 | 39 | ```bash 40 | git clone https://github.com/Saransh-cpp/OCRed 41 | ``` 42 | 43 | - Change directory 44 | 45 | ```bash 46 | cd OCRed 47 | ``` 48 | 49 | - Install the package in editable mode with the "dev" dependencies 50 | 51 | ```bash 52 | python -m pip install -e ".[dev]" 53 | ``` 54 | 55 | Feel free to read our [Contributing Guide](https://github.com/Saransh-cpp/OCRed/blob/main/CONTRIBUTING.md) for more information on developing `OCRed`. 
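The installation steps have two prerequisites that are easy to miss: the Tesseract executable must be on PATH, and the `ocred` package must be importable from the active environment. The following is a minimal sketch of a post-install check using only the standard library. It is not part of `OCRed`; the helper names (`tesseract_on_path`, `package_installed`, `check_install`) are hypothetical, and the executable name `tesseract` is an assumption that may differ on some systems.

```python
from __future__ import annotations

import importlib.util
import shutil


def tesseract_on_path(executable: str = "tesseract") -> bool:
    """Return True if the given executable can be found on PATH."""
    # Assumption: the Tesseract binary is named "tesseract" on this system.
    return shutil.which(executable) is not None


def package_installed(name: str = "ocred") -> bool:
    """Return True if the named top-level package is importable here."""
    return importlib.util.find_spec(name) is not None


def check_install() -> dict[str, bool]:
    """Collect both checks into a single report (hypothetical helper)."""
    return {"tesseract": tesseract_on_path(), "ocred": package_installed()}


if __name__ == "__main__":
    for component, ok in check_install().items():
        print(f"{component}: {'found' if ok else 'MISSING'}")
```

Running the script inside the activated virtual environment should report both components as found. If `tesseract` is reported missing, revisit the PATH step above before trying any OCR calls.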
56 | -------------------------------------------------------------------------------- /noxfile.py: -------------------------------------------------------------------------------- 1 | from __future__ import annotations 2 | 3 | import nox 4 | 5 | ALL_PYTHONS = ["3.7", "3.8", "3.9", "3.10", "3.11"] 6 | 7 | nox.options.sessions = ["lint", "tests", "doctests"] 8 | 9 | 10 | @nox.session(reuse_venv=True) 11 | def lint(session): 12 | """Run the linter.""" 13 | session.install("pre-commit") 14 | session.run("pre-commit", "run", "--all-files", *session.posargs) 15 | 16 | 17 | @nox.session(python=ALL_PYTHONS, reuse_venv=True) 18 | def tests(session): 19 | """Run the unit and regular tests.""" 20 | session.install(".[dev]") 21 | session.run("pytest", *session.posargs) 22 | 23 | 24 | @nox.session(reuse_venv=True) 25 | def doctests(session): 26 | """Run the doctests.""" 27 | session.install(".[dev]") 28 | session.run("xdoctest", "./ocred/", *session.posargs) 29 | 30 | 31 | @nox.session(reuse_venv=True) 32 | def docs(session): 33 | """Build the docs. Pass "serve" to serve.""" 34 | session.install("-e", ".[docs]") 35 | 36 | if session.posargs: 37 | if "serve" in session.posargs: 38 | print("Launching docs at http://localhost:8000/ - use Ctrl-C to quit") 39 | session.run("mkdocs", "serve") 40 | else: 41 | print("Unsupported argument to docs") 42 | else: 43 | session.run("mkdocs", "build") 44 | 45 | 46 | @nox.session 47 | def build(session): 48 | """Build an SDist and wheel.""" 49 | session.install("build") 50 | session.run("python", "-m", "build") 51 | -------------------------------------------------------------------------------- /.github/ISSUE_TEMPLATE/bug_report.yml: -------------------------------------------------------------------------------- 1 | name: Bug Report 2 | description: File a bug report 3 | title: "[Bug]: " 4 | labels: ["bug"] 5 | body: 6 | - type: markdown 7 | attributes: 8 | value: | 9 | Thanks for taking the time to fill out this bug report! 
10 | - type: input 11 | id: OCRed-version 12 | attributes: 13 | label: OCRed Version 14 | description: What version of OCRed are you running? 15 | placeholder: OCRed version 16 | validations: 17 | required: true 18 | - type: input 19 | id: python-version 20 | attributes: 21 | label: Python Version 22 | description: What version of Python are you running? 23 | placeholder: Python version 24 | validations: 25 | required: true 26 | - type: textarea 27 | id: what-happened 28 | attributes: 29 | label: Describe the bug 30 | description: A clear and concise description of what the bug is. 31 | validations: 32 | required: true 33 | - type: textarea 34 | id: reproduce 35 | attributes: 36 | label: Steps to Reproduce 37 | description: Tell us how to reproduce this behaviour. Ideally, this should take the form of a [Minimal Reproducible Example](https://stackoverflow.com/help/minimal-reproducible-example). 38 | validations: 39 | required: true 40 | - type: textarea 41 | id: logs 42 | attributes: 43 | label: Relevant log output 44 | description: Please copy and paste any relevant log output. This will be automatically formatted into code, so no need for backticks. 45 | render: shell 46 | -------------------------------------------------------------------------------- /.github/workflows/cd.yml: -------------------------------------------------------------------------------- 1 | name: CD 2 | 3 | on: 4 | workflow_dispatch: 5 | inputs: 6 | target: 7 | description: 'Deployment target. 
Can be "pypi" or "testpypi"' 8 | default: "testpypi" 9 | release: 10 | types: 11 | - published 12 | 13 | jobs: 14 | dist: 15 | runs-on: ubuntu-latest 16 | steps: 17 | - uses: actions/checkout@v4 18 | with: 19 | fetch-depth: 0 20 | 21 | - name: Build SDist and wheel 22 | run: pipx run build 23 | 24 | - uses: actions/upload-artifact@v4 25 | with: 26 | path: dist/* 27 | 28 | - name: Check metadata 29 | run: pipx run twine check dist/* 30 | 31 | publish: 32 | needs: dist 33 | runs-on: ubuntu-latest 34 | 35 | steps: 36 | - uses: actions/download-artifact@v4 37 | with: 38 | name: artifact 39 | path: dist 40 | 41 | - name: Publish on PyPI 42 | if: github.event.inputs.target == 'pypi' || (github.event_name == 'release' && github.event.action == 'published') 43 | uses: pypa/gh-action-pypi-publish@v1.10.3 44 | with: 45 | user: __token__ 46 | password: ${{ secrets.PYPI_API_TOKEN }} 47 | 48 | - name: Publish on TestPyPI 49 | if: github.event.inputs.target == 'testpypi' || (github.event_name == 'release' && github.event.action == 'published') 50 | uses: pypa/gh-action-pypi-publish@v1.10.3 51 | with: 52 | user: __token__ 53 | password: ${{ secrets.TEST_PYPI_API_TOKEN }} 54 | repository_url: https://test.pypi.org/legacy/ 55 | -------------------------------------------------------------------------------- /mkdocs.yml: -------------------------------------------------------------------------------- 1 | # inspired from https://github.com/avik-pal/Lux.jl/blob/main/docs/mkdocs.yml 2 | theme: 3 | name: material 4 | features: 5 | - navigation.sections 6 | palette: 7 | - scheme: default 8 | primary: white 9 | accent: amber 10 | toggle: 11 | icon: material/weather-night 12 | name: Switch to dark mode 13 | - scheme: slate 14 | primary: black 15 | accent: amber 16 | toggle: 17 | icon: material/weather-sunny 18 | name: Switch to light mode 19 | font: 20 | text: Lato 21 | icon: 22 | repo: fontawesome/brands/github 23 | custom_dir: "docs/_overrides/" # Overriding part of the HTML 24 | 25 | # 
TODO: look into this 26 | # twitter_name: "@saranshchopra7" 27 | # twitter_url: "https://twitter.com/saranshchopra7" 28 | 29 | site_name: OCRed 30 | site_description: Documentation for OCRed 31 | site_author: Saransh Chopra 32 | site_url: https://ocred.readthedocs.io/ 33 | 34 | repo_url: https://github.com/Saransh-cpp/OCRed 35 | repo_name: Saransh-cpp/OCRed 36 | edit_uri: ./edit/main/docs 37 | 38 | extra_css: 39 | - stylesheets/extra.css 40 | 41 | strict: true 42 | 43 | plugins: 44 | - search 45 | - mkdocstrings 46 | - autorefs # Cross-links to headings 47 | - include_exclude_files: 48 | exclude: 49 | - "_overrides" 50 | 51 | markdown_extensions: 52 | - callouts 53 | - pymdownx.arithmatex 54 | - pymdownx.magiclink 55 | - pymdownx.details # Allowing hidden expandable regions denoted by ??? 56 | - pymdownx.highlight 57 | - pymdownx.inlinehilite 58 | - pymdownx.superfences # Seems to enable syntax highlighting when used with the Material theme. 59 | - pymdownx.tasklist: 60 | custom_checkbox: true 61 | - pymdownx.tabbed: 62 | alternate_style: true 63 | - pymdownx.snippets: 64 | check_paths: true 65 | - toc: 66 | permalink: "¤" # Adds a clickable permalink to each section heading 67 | toc_depth: 4 68 | 69 | extra_javascript: 70 | - mathjax-config.js 71 | - https://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML 72 | 73 | docs_dir: docs 74 | 75 | nav: 76 | - Home: "index.md" 77 | - Installation: "install.md" 78 | # - Examples: 79 | - Reference: "reference.md" 80 | - Contributing: "contributing.md" 81 | - Changelog: "changelog.md" 82 | - Code of Conduct: "conduct.md" 83 | -------------------------------------------------------------------------------- /.pre-commit-config.yaml: -------------------------------------------------------------------------------- 1 | ci: 2 | autoupdate_commit_msg: "chore: update pre-commit hooks" 3 | autofix_commit_msg: "style: pre-commit fixes" 4 | 5 | repos: 6 | - repo: https://github.com/psf/black-pre-commit-mirror 7 | 
rev: 24.4.2 8 | hooks: 9 | - id: black-jupyter 10 | 11 | - repo: https://github.com/pre-commit/pre-commit-hooks 12 | rev: v4.6.0 13 | hooks: 14 | - id: check-added-large-files 15 | - id: check-case-conflict 16 | - id: check-merge-conflict 17 | - id: check-symlinks 18 | - id: check-yaml 19 | - id: debug-statements 20 | - id: end-of-file-fixer 21 | exclude: ^docs 22 | - id: mixed-line-ending 23 | - id: requirements-txt-fixer 24 | - id: trailing-whitespace 25 | 26 | - repo: https://github.com/astral-sh/ruff-pre-commit 27 | rev: "v0.4.10" 28 | hooks: 29 | - id: ruff 30 | args: ["--fix", "--show-fixes"] 31 | 32 | - repo: https://github.com/tox-dev/pyproject-fmt 33 | rev: "2.1.3" 34 | hooks: 35 | - id: pyproject-fmt 36 | 37 | - repo: https://github.com/pre-commit/mirrors-mypy 38 | rev: v1.10.0 39 | hooks: 40 | - id: mypy 41 | files: ^ocred/ 42 | args: [] 43 | additional_dependencies: 44 | - numpy 45 | - packaging 46 | 47 | - repo: https://github.com/codespell-project/codespell 48 | rev: v2.3.0 49 | hooks: 50 | - id: codespell 51 | 52 | - repo: https://github.com/pre-commit/mirrors-prettier 53 | rev: "v4.0.0-alpha.8" 54 | hooks: 55 | - id: prettier 56 | types_or: [yaml, markdown, html, css, scss, javascript, json] 57 | exclude: assets/js/webapp\.js 58 | 59 | - repo: https://github.com/asottile/blacken-docs 60 | rev: 1.16.0 61 | hooks: 62 | - id: blacken-docs 63 | args: ["-E"] 64 | additional_dependencies: [black==23.1.0] 65 | 66 | - repo: https://github.com/pre-commit/pygrep-hooks 67 | rev: v1.10.0 68 | hooks: 69 | - id: python-check-blanket-type-ignore 70 | - id: rst-backticks 71 | - id: rst-directive-colons 72 | - id: rst-inline-touching-normal 73 | 74 | - repo: https://github.com/nbQA-dev/nbQA 75 | rev: 1.8.5 76 | hooks: 77 | - id: nbqa-pyupgrade 78 | args: ["--py37-plus"] 79 | - id: nbqa-isort 80 | args: ["--float-to-top"] 81 | -------------------------------------------------------------------------------- /tests/test_preprocessing.py: 
-------------------------------------------------------------------------------- 1 | from __future__ import annotations 2 | 3 | import os 4 | 5 | import cv2 6 | import numpy as np 7 | import pytest 8 | 9 | from ocred.preprocessing import Preprocessor 10 | 11 | path = "images/CosmosOne.jpg" 12 | 13 | 14 | def test_deprecations_and_errors(): 15 | pre = Preprocessor(path) 16 | img = cv2.imread(path) 17 | 18 | with pytest.raises(DeprecationWarning): 19 | pre.scan(inplace=True) 20 | with pytest.raises(DeprecationWarning): 21 | pre.scan(overriden_image=img) 22 | 23 | with pytest.raises(DeprecationWarning): 24 | pre.rotate(inplace=True) 25 | with pytest.raises(DeprecationWarning): 26 | pre.rotate(overriden_image=img) 27 | 28 | with pytest.raises(DeprecationWarning): 29 | pre.remove_noise(inplace=True) 30 | with pytest.raises(DeprecationWarning): 31 | pre.remove_noise(overriden_image=img) 32 | 33 | with pytest.raises(DeprecationWarning): 34 | pre.thicken_font(inplace=True) 35 | with pytest.raises(DeprecationWarning): 36 | pre.thicken_font(overriden_image=img) 37 | 38 | 39 | def test_scan(): 40 | pre = Preprocessor(path) 41 | assert isinstance(pre.img, np.ndarray) 42 | 43 | scanned = pre.scan() 44 | assert isinstance(scanned, np.ndarray) 45 | assert isinstance(pre.img, np.ndarray) 46 | assert (scanned == pre.img).all() 47 | 48 | img = cv2.imread(path) 49 | pre = Preprocessor(img) 50 | scanned = pre.scan(save=True) 51 | assert isinstance(scanned, np.ndarray) 52 | assert isinstance(pre.img, np.ndarray) 53 | assert (scanned == pre.img).all() 54 | 55 | assert os.path.exists("scanned.png") 56 | os.remove("scanned.png") 57 | 58 | 59 | def test_rotate(): 60 | pre = Preprocessor(path) 61 | assert isinstance(pre.img, np.ndarray) 62 | 63 | rotated, median_angle = pre.rotate(save=True) 64 | assert isinstance(median_angle, float) 65 | assert isinstance(rotated, np.ndarray) 66 | assert isinstance(pre.img, np.ndarray) 67 | assert (rotated == pre.img).all() 68 | assert 
os.path.exists("rotated.png") 69 | 70 | os.remove("rotated.png") 71 | 72 | 73 | def test_remove_noise(): 74 | pre = Preprocessor(path) 75 | assert isinstance(pre.img, np.ndarray) 76 | 77 | noiseless = pre.remove_noise(save=True) 78 | assert isinstance(noiseless, np.ndarray) 79 | assert isinstance(pre.img, np.ndarray) 80 | assert (noiseless == pre.img).all() 81 | assert os.path.exists("noise_free.png") 82 | 83 | os.remove("noise_free.png") 84 | 85 | 86 | def test_thicken_font(): 87 | pre = Preprocessor(path) 88 | assert isinstance(pre.img, np.ndarray) 89 | 90 | thickened = pre.thicken_font(save=True) 91 | assert isinstance(thickened, np.ndarray) 92 | assert isinstance(pre.img, np.ndarray) 93 | assert (thickened == pre.img).all() 94 | assert os.path.exists("thick_font.png") 95 | 96 | os.remove("thick_font.png") 97 | -------------------------------------------------------------------------------- /docs/stylesheets/extra.css: -------------------------------------------------------------------------------- 1 | /* Fix /page#foo going to the top of the viewport and being hidden by the navbar */ 2 | html { 3 | scroll-padding-top: 50px; 4 | } 5 | 6 | /* Fit the Twitter handle alongside the GitHub one in the top right. 
*/ 7 | 8 | div.md-header__source { 9 | width: revert; 10 | max-width: revert; 11 | } 12 | 13 | a.md-source { 14 | display: inline-block; 15 | } 16 | 17 | .md-source__repository { 18 | max-width: 100%; 19 | } 20 | 21 | /* Emphasise sections of nav on left hand side */ 22 | 23 | nav.md-nav { 24 | padding-left: 5px; 25 | } 26 | 27 | nav.md-nav--secondary { 28 | border-left: revert !important; 29 | } 30 | 31 | .md-nav__title { 32 | font-size: 0.9rem; 33 | } 34 | 35 | .md-nav__item--section > .md-nav__link { 36 | font-size: 0.9rem; 37 | } 38 | 39 | /* Indent autogenerated documentation */ 40 | 41 | div.doc-contents { 42 | padding-left: 25px; 43 | border-left: 4px solid rgba(230, 230, 230); 44 | } 45 | 46 | /* Increase visibility of splitters "---" */ 47 | 48 | [data-md-color-scheme="default"] .md-typeset hr { 49 | border-bottom-color: rgb(0, 0, 0); 50 | border-bottom-width: 1pt; 51 | } 52 | 53 | [data-md-color-scheme="slate"] .md-typeset hr { 54 | border-bottom-color: rgb(230, 230, 230); 55 | } 56 | 57 | /* More space at the bottom of the page */ 58 | 59 | .md-main__inner { 60 | margin-bottom: 1.5rem; 61 | } 62 | 63 | /* Remove prev/next footer buttons */ 64 | 65 | .md-footer__inner { 66 | display: none; 67 | } 68 | 69 | /* Bugfix: remove the superfluous parts generated when doing: 70 | 71 | ??? Blah 72 | 73 | ::: library.something 74 | */ 75 | 76 | .md-typeset details .mkdocstrings > h4 { 77 | display: none; 78 | } 79 | 80 | .md-typeset details .mkdocstrings > h5 { 81 | display: none; 82 | } 83 | 84 | /* Change default colours for tags */ 85 | 86 | [data-md-color-scheme="default"] { 87 | --md-typeset-a-color: rgb(0, 189, 164) !important; 88 | } 89 | [data-md-color-scheme="slate"] { 90 | --md-typeset-a-color: rgb(0, 189, 164) !important; 91 | } 92 | 93 | /* Highlight functions, classes etc. type signatures. Really helps to make clear where 94 | one item ends and another begins. 
*/ 95 | 96 | [data-md-color-scheme="default"] { 97 | --doc-heading-color: #ddd; 98 | --doc-heading-border-color: #ccc; 99 | --doc-heading-color-alt: #f0f0f0; 100 | } 101 | [data-md-color-scheme="slate"] { 102 | --doc-heading-color: rgb(25, 25, 33); 103 | --doc-heading-border-color: rgb(25, 25, 33); 104 | --doc-heading-color-alt: rgb(33, 33, 44); 105 | --md-code-bg-color: rgb(38, 38, 50); 106 | } 107 | 108 | h4.doc-heading { 109 | /* NOT var(--md-code-bg-color) as that's not visually distinct from other code blocks.*/ 110 | background-color: var(--doc-heading-color); 111 | border: solid var(--doc-heading-border-color); 112 | border-width: 1.5pt; 113 | border-radius: 2pt; 114 | padding: 0pt 5pt 2pt 5pt; 115 | } 116 | h5.doc-heading, 117 | h6.heading { 118 | background-color: var(--doc-heading-color-alt); 119 | border-radius: 2pt; 120 | padding: 0pt 5pt 2pt 5pt; 121 | } 122 | -------------------------------------------------------------------------------- /pyproject.toml: -------------------------------------------------------------------------------- 1 | [build-system] 2 | build-backend = "hatchling.build" 3 | requires = [ 4 | "hatch-vcs", 5 | "hatchling", 6 | ] 7 | 8 | [project] 9 | name = "ocred" 10 | description = "Clever, simple, and intuitive wrapper functionalities for OCRing specific textual materials" 11 | readme = "README.md" 12 | keywords = [ 13 | "Computer Vision", 14 | "Intended for direct users", 15 | "OCR", 16 | ] 17 | license = "MIT" 18 | authors = [ 19 | { name = "Saransh Chopra", email = "saransh0701@gmail.com" }, 20 | ] 21 | requires-python = ">=3.7" 22 | classifiers = [ 23 | "Development Status :: 5 - Production/Stable", 24 | "Intended Audience :: Customer Service", 25 | "Intended Audience :: Developers", 26 | "License :: OSI Approved :: MIT License", 27 | "Operating System :: OS Independent", 28 | "Programming Language :: Python", 29 | "Programming Language :: Python :: 3 :: Only", 30 | "Programming Language :: Python :: 3.7", 31 | "Programming 
Language :: Python :: 3.8", 32 | "Programming Language :: Python :: 3.9", 33 | "Programming Language :: Python :: 3.10", 34 | "Programming Language :: Python :: 3.11", 35 | "Programming Language :: Python :: 3.12", 36 | "Topic :: Scientific/Engineering", 37 | "Typing :: Typed", 38 | ] 39 | dynamic = [ 40 | "version", 41 | ] 42 | dependencies = [ 43 | "easyocr>=1.4.1", 44 | "numpy>=1.19.3", 45 | "opencv-python>=4.5.3.56", 46 | "packaging", 47 | "pillow<10", 48 | "pytesseract>=0.3.8", 49 | "scikit-image>=0.18.3", 50 | "scipy>=1.5.4", 51 | ] 52 | optional-dependencies.dev = [ 53 | "nltk>=3.5", 54 | "pytest>=6", 55 | "pytest-cov>=3", 56 | "xdoctest>=1", 57 | ] 58 | optional-dependencies.docs = [ 59 | "markdown-callouts>=0.2", 60 | "mkdocs>=1.3.1", 61 | "mkdocs-include-exclude-files>=0.0.1", 62 | "mkdocs-jupyter>=0.21", 63 | "mkdocs-material>=8.3.9", 64 | "mkdocstrings-python>=0.7.1", 65 | "mkdocstrings-python-legacy>=0.2.3", 66 | "pymdown-extensions>=9.5", 67 | ] 68 | optional-dependencies.nltk = [ 69 | "nltk>=3.5", 70 | ] 71 | optional-dependencies.test = [ 72 | "pytest>=6", 73 | "pytest-cov>=3", 74 | "xdoctest>=1", 75 | ] 76 | urls."Bug Tracker" = "https://github.com/Saransh-cpp/OCRed/issues" 77 | urls.Changelog = "https://ocred.readthedocs.io/en/latest/changelog/" 78 | urls.Discussions = "https://github.com/Saransh-cpp/OCRed/discussions" 79 | urls.Documentation = "https://ocred.readthedocs.io/" 80 | urls.Homepage = "https://github.com/Saransh-cpp/OCRed" 81 | 82 | [tool.hatch] 83 | version.source = "vcs" 84 | build.hooks.vcs.version-file = "ocred/version.py" 85 | 86 | [tool.isort] 87 | profile = "black" 88 | 89 | [tool.pytest.ini_options] 90 | minversion = "6.0" 91 | xfail_strict = true 92 | addopts = [ 93 | "-ra", 94 | "--strict-markers", 95 | "--strict-config", 96 | ] 97 | testpaths = [ 98 | "tests", 99 | ] 100 | log_cli_level = "DEBUG" 101 | filterwarnings = [ 102 | "error", 103 | "ignore::DeprecationWarning", 104 | "ignore::UserWarning", 105 | ] 106 | 107 | 
[tool.mypy] 108 | files = [ 109 | "./ocred/", 110 | ] 111 | python_version = "3.8" 112 | strict = true 113 | warn_return_any = false 114 | show_error_codes = true 115 | warn_unreachable = true 116 | enable_error_code = [ 117 | "ignore-without-code", 118 | "truthy-bool", 119 | "redundant-expr", 120 | ] 121 | ignore_missing_imports = true 122 | -------------------------------------------------------------------------------- /.github/workflows/ci.yml: -------------------------------------------------------------------------------- 1 | # This workflow will install Python dependencies, run tests and lint with a single version of Python 2 | # For more information see: https://help.github.com/actions/language-and-framework-guides/using-python-with-github-actions 3 | 4 | name: CI 5 | 6 | on: 7 | push: 8 | branches: ["main"] 9 | pull_request: 10 | branches: ["main"] 11 | 12 | concurrency: 13 | # Skip intermediate builds: always. 14 | # Cancel intermediate builds: always. 15 | group: ${{ github.workflow }}-${{ github.ref }} 16 | cancel-in-progress: true 17 | 18 | jobs: 19 | pre-commit: 20 | runs-on: ubuntu-latest 21 | name: Check SDist 22 | steps: 23 | - uses: actions/checkout@v4 24 | - uses: actions/setup-python@v5 25 | with: 26 | # TODO: Change '3.10.6' to '3.x' after updating mypy version 27 | # See issue: https://github.com/python/mypy/issues/13627 28 | # "(might be included in mypy 0.980, if not it will be in mypy 0.990)" 29 | python-version: 3.10.6 30 | - uses: pre-commit/action@v3.0.1 31 | 32 | build: 33 | needs: pre-commit 34 | runs-on: ${{ matrix.os }} 35 | name: Build and test package 36 | strategy: 37 | fail-fast: false 38 | matrix: 39 | os: [ubuntu-latest, macos-latest] 40 | python-version: ["3.7", "3.8", "3.9", "3.10", "3.11"] 41 | 42 | steps: 43 | - uses: actions/checkout@v4 44 | 45 | - name: Set up Python ${{ matrix.python-version }} 46 | uses: actions/setup-python@v5 47 | with: 48 | python-version: ${{ matrix.python-version }} 49 | 50 | - name: Install OCRed 
51 | run: | 52 | python -m pip install .[dev] 53 | 54 | - name: Install Tesseract OCR Engine on Ubuntu 55 | if: matrix.os == 'ubuntu-latest' 56 | run: | 57 | sudo apt-get update 58 | sudo apt-get install tesseract-ocr 59 | 60 | - name: Install Tesseract OCR Engine on MacOS 61 | if: matrix.os == 'macos-latest' 62 | run: | 63 | brew update 64 | rm -f /usr/local/bin/2to3* 65 | rm -f /usr/local/bin/idle3* 66 | rm -f /usr/local/bin/pydoc3* 67 | rm -f /usr/local/bin/python3* 68 | brew install tesseract-lang 69 | 70 | - name: Run unit tests and generate coverage report 71 | run: | 72 | python -m pytest -ra --cov=ocred tests/ 73 | 74 | - name: Upload coverage report 75 | uses: codecov/codecov-action@v4.6.0 76 | env: 77 | CODECOV_TOKEN: ${{ secrets.CODECOV_TOKEN }} 78 | 79 | docs: 80 | needs: pre-commit 81 | runs-on: ubuntu-latest 82 | name: Build and test documentation 83 | strategy: 84 | fail-fast: false 85 | 86 | steps: 87 | - uses: actions/checkout@v4 88 | 89 | - name: Set up Python 3.9 90 | uses: actions/setup-python@v5 91 | with: 92 | python-version: 3.9 93 | 94 | - name: Install Tesseract OCR Engine 95 | run: | 96 | sudo apt-get update 97 | sudo apt-get install tesseract-ocr 98 | 99 | - name: Install ocred with doc dependencies 100 | run: python -m pip install -e .[dev,docs] 101 | 102 | - name: Run doctests 103 | run: xdoctest ./ocred/ 104 | 105 | - name: Build docs 106 | run: mkdocs build 107 | -------------------------------------------------------------------------------- /CHANGELOG.md: -------------------------------------------------------------------------------- 1 | # [Unreleased](https://github.com/Saransh-cpp/OCRed) 2 | 3 | # [v0.4.0](https://github.com/Saransh-cpp/OCRed/tree/v0.4.0) 4 | 5 | ## Features 6 | 7 | - Add support for Python `3.10` and `3.11` ([#137](https://github.com/Saransh-cpp/OCRed/pull/137)) 8 | 9 | ## Bug fixes 10 | 11 | - Fix pillow errors ([#132](https://github.com/Saransh-cpp/OCRed/pull/132)) 12 | 13 | ## CI 14 | 15 | - 3.7 again by 
([#127](https://github.com/Saransh-cpp/OCRed/pull/127)) 16 | - Fix macOS CI ([#97](https://github.com/Saransh-cpp/OCRed/pull/97)) 17 | 18 | ## Docs 19 | 20 | - Add pull request template ([#81](https://github.com/Saransh-cpp/OCRed/pull/81)) 21 | - Issue template ([#80](https://github.com/Saransh-cpp/OCRed/pull/80)) 22 | 23 | ## Maintenance 24 | 25 | - Revamp pre-commit configuration (https://github.com/Saransh-cpp/OCRed/commit/b9390834f5950b36fe873eaccf76f666c6bcbf4f) 26 | - Update pre-commit configuration ([#75](https://github.com/Saransh-cpp/OCRed/pull/75)) 27 | - Back to the `__future__` ([#74](https://github.com/Saransh-cpp/OCRed/pull/74)) 28 | 29 | # [v0.3.0](https://github.com/Saransh-cpp/OCRed/tree/v0.3.0) 30 | 31 | ## Breaking changes 32 | 33 | - The arguments `inplace` and `overriden_image` have been deprecated and removed ([#71](https://github.com/Saransh-cpp/OCRed/pull/71)) 34 | - `Preprocessor` now alters the `img` attribute in each method, which can be accessed via `self.img` ([#71](https://github.com/Saransh-cpp/OCRed/pull/71)) 35 | 36 | ## CI 37 | 38 | - Added a separate CI pipeline for documentation ([#67](https://github.com/Saransh-cpp/OCRed/pull/67)) 39 | 40 | ## Docs 41 | 42 | - Revamped the UI and fixed minor UI bugs ([#67](https://github.com/Saransh-cpp/OCRed/pull/67)) 43 | 44 | ## Tests 45 | 46 | - Simplify and speed up tests ([#71](https://github.com/Saransh-cpp/OCRed/pull/71)) 47 | 48 | # [v0.2.0](https://github.com/Saransh-cpp/OCRed/tree/v0.2.0) 49 | 50 | ## Features 51 | 52 | - Introduced `tesseract_config` argument to pass down configuration for Tesseract OCR Engine in `ocr_meaningful_text` ([#61](https://github.com/Saransh-cpp/OCRed/pull/61)) 53 | - Introduced `preserve_orientation` argument to preserve the orientation of OCRed text in `ocr_meaningful_text` ([#61](https://github.com/Saransh-cpp/OCRed/pull/61)) 54 | - `OCRed` can now be built from archive ([#56](https://github.com/Saransh-cpp/OCRed/pull/56)) 55 | 56 | ## Breaking changes 57
| 58 | - `ocr_sparse_text` now returns the output of `easyocr.Reader.readtext()` too ([#61](https://github.com/Saransh-cpp/OCRed/pull/61)) 59 | - `text_to_speech` is deprecated and removed ([#58](https://github.com/Saransh-cpp/OCRed/pull/58)) 60 | 61 | ## Bug fixes 62 | 63 | - Fixed the return value of `Preprocessor.remove_noise` ([#62](https://github.com/Saransh-cpp/OCRed/pull/62)) 64 | 65 | ## Misc 66 | 67 | - Added custom and more informative errors in the `OCR` class ([#61](https://github.com/Saransh-cpp/OCRed/pull/61)) 68 | 69 | ## Maintenance 70 | 71 | - Added a check for docs in the `CI` ([#59](https://github.com/Saransh-cpp/OCRed/pull/59)) 72 | - Fixed failing doc deployment ([#59](https://github.com/Saransh-cpp/OCRed/pull/59)) 73 | - Added `pyproject-fmt` pre-commit hook ([#57](https://github.com/Saransh-cpp/OCRed/pull/57)) 74 | - Fixed building from archive (tarballs) ([#56](https://github.com/Saransh-cpp/OCRed/pull/56)) 75 | 76 | # [v0.1.2](https://github.com/Saransh-cpp/OCRed/tree/v0.1.2) 77 | 78 | ## Maintenance 79 | 80 | - Removed capitalisation from `PyPI`'s name (`OCRed` -> `ocred`) ([#52](https://github.com/Saransh-cpp/OCRed/pull/52)) 81 | 82 | ## Bug fixes 83 | 84 | - Fixed the `DeprecationWarning` in `text_to_speech` ([#52](https://github.com/Saransh-cpp/OCRed/pull/52)) 85 | - Removed capitalisation from `PyPI`'s name (`OCRed` -> `ocred`) ([#52](https://github.com/Saransh-cpp/OCRed/pull/52)) 86 | 87 | # [v0.1.1](https://github.com/Saransh-cpp/OCRed/tree/v0.1.1) 88 | 89 | ## Maintenance 90 | 91 | - Updated classifiers and links in `pyproject.toml` ([#49](https://github.com/Saransh-cpp/OCRed/pull/49)) 92 | - `nltk` is now not a part of the default dependencies ([#49](https://github.com/Saransh-cpp/OCRed/pull/49)) 93 | - Added `__version__` to `OCRed`'s namespace ([#50](https://github.com/Saransh-cpp/OCRed/pull/50)) 94 | 95 | ## Deprecations 96 | 97 | - `text_to_speech` is deprecated and will be removed in `v0.2.0`, use `gTTS` manually
([#50](https://github.com/Saransh-cpp/OCRed/pull/50)) 98 | 99 | # [v0.1.0](https://github.com/Saransh-cpp/OCRed/tree/v0.1.0) 100 | 101 | - Added ability to `OCR` various textual mediums. 102 | - Added ability to `Preprocess` images. 103 | - Infrastructure built with `GitHub Actions`, `hatch`, `Codecov`, and `readthedocs`. 104 | - Optimised algorithms with `inplace` edits. 105 | - Added documentation with `mkdocstrings`. 106 | - Other chore work like `pre-commit`, `nox` support, etc. 107 | - Tests with `pytest` and coverage with `pytest-cov`. 108 | -------------------------------------------------------------------------------- /CODE_OF_CONDUCT.md: -------------------------------------------------------------------------------- 1 | # Contributor Covenant Code of Conduct 2 | 3 | ## Our Pledge 4 | 5 | We as members, contributors, and leaders pledge to make participation in our 6 | community a harassment-free experience for everyone, regardless of age, body 7 | size, visible or invisible disability, ethnicity, sex characteristics, gender 8 | identity and expression, level of experience, education, socioeconomic status, 9 | nationality, personal appearance, race, religion, or sexual identity 10 | and orientation. 11 | 12 | We pledge to act and interact in ways that contribute to an open, welcoming, 13 | diverse, inclusive, and healthy community. 
14 | 15 | ## Our Standards 16 | 17 | Examples of behavior that contributes to a positive environment for our 18 | community include: 19 | 20 | - Demonstrating empathy and kindness toward other people 21 | - Being respectful of differing opinions, viewpoints, and experiences 22 | - Giving and gracefully accepting constructive feedback 23 | - Accepting responsibility and apologizing to those affected by our mistakes, 24 | and learning from the experience 25 | - Focusing on what is best not just for us as individuals, but for the 26 | overall community 27 | 28 | Examples of unacceptable behavior include: 29 | 30 | - The use of sexualized language or imagery, and sexual attention or 31 | advances of any kind 32 | - Trolling, insulting or derogatory comments, and personal or political attacks 33 | - Public or private harassment 34 | - Publishing others' private information, such as a physical or email 35 | address, without their explicit permission 36 | - Other conduct which could reasonably be considered inappropriate in a 37 | professional setting 38 | 39 | ## Enforcement Responsibilities 40 | 41 | Community leaders are responsible for clarifying and enforcing our standards of 42 | acceptable behavior and will take appropriate and fair corrective action in 43 | response to any behavior that they deem inappropriate, threatening, offensive, 44 | or harmful. 45 | 46 | Community leaders have the right and responsibility to remove, edit, or reject 47 | comments, commits, code, wiki edits, issues, and other contributions that are 48 | not aligned to this Code of Conduct, and will communicate reasons for moderation 49 | decisions when appropriate. 50 | 51 | ## Scope 52 | 53 | This Code of Conduct applies within all community spaces, and also applies when 54 | an individual is officially representing the community in public spaces. 
55 | Examples of representing our community include using an official e-mail address, 56 | posting via an official social media account, or acting as an appointed 57 | representative at an online or offline event. 58 | 59 | ## Enforcement 60 | 61 | Instances of abusive, harassing, or otherwise unacceptable behavior may be 62 | reported to the community leaders responsible for enforcement at 63 | saransh0701@gmail.com. 64 | All complaints will be reviewed and investigated promptly and fairly. 65 | 66 | All community leaders are obligated to respect the privacy and security of the 67 | reporter of any incident. 68 | 69 | ## Enforcement Guidelines 70 | 71 | Community leaders will follow these Community Impact Guidelines in determining 72 | the consequences for any action they deem in violation of this Code of Conduct: 73 | 74 | ### 1. Correction 75 | 76 | **Community Impact**: Use of inappropriate language or other behavior deemed 77 | unprofessional or unwelcome in the community. 78 | 79 | **Consequence**: A private, written warning from community leaders, providing 80 | clarity around the nature of the violation and an explanation of why the 81 | behavior was inappropriate. A public apology may be requested. 82 | 83 | ### 2. Warning 84 | 85 | **Community Impact**: A violation through a single incident or series 86 | of actions. 87 | 88 | **Consequence**: A warning with consequences for continued behavior. No 89 | interaction with the people involved, including unsolicited interaction with 90 | those enforcing the Code of Conduct, for a specified period of time. This 91 | includes avoiding interactions in community spaces as well as external channels 92 | like social media. Violating these terms may lead to a temporary or 93 | permanent ban. 94 | 95 | ### 3. Temporary Ban 96 | 97 | **Community Impact**: A serious violation of community standards, including 98 | sustained inappropriate behavior. 
99 | 100 | **Consequence**: A temporary ban from any sort of interaction or public 101 | communication with the community for a specified period of time. No public or 102 | private interaction with the people involved, including unsolicited interaction 103 | with those enforcing the Code of Conduct, is allowed during this period. 104 | Violating these terms may lead to a permanent ban. 105 | 106 | ### 4. Permanent Ban 107 | 108 | **Community Impact**: Demonstrating a pattern of violation of community 109 | standards, including sustained inappropriate behavior, harassment of an 110 | individual, or aggression toward or disparagement of classes of individuals. 111 | 112 | **Consequence**: A permanent ban from any sort of public interaction within 113 | the community. 114 | 115 | ## Attribution 116 | 117 | This Code of Conduct is adapted from the [Contributor Covenant][homepage], 118 | version 2.0, available at 119 | https://www.contributor-covenant.org/version/2/0/code_of_conduct.html. 120 | 121 | Community Impact Guidelines were inspired by [Mozilla's code of conduct 122 | enforcement ladder](https://github.com/mozilla/diversity). 123 | 124 | [homepage]: https://www.contributor-covenant.org 125 | 126 | For answers to common questions about this code of conduct, see the FAQ at 127 | https://www.contributor-covenant.org/faq. Translations are available at 128 | https://www.contributor-covenant.org/translations. 
129 | -------------------------------------------------------------------------------- /tests/test_ocr.py: -------------------------------------------------------------------------------- 1 | from __future__ import annotations 2 | 3 | import os 4 | 5 | import pytest 6 | 7 | from ocred.ocr import OCR 8 | 9 | path_scanned = "images/Page.png" 10 | path_real = "images/CosmosOne.jpg" 11 | path_sign_board = "images/signboard.jpg" 12 | path_invoice = "images/1146-receipt.jpg" 13 | 14 | 15 | def test_deprecations_and_errors(): 16 | ocr = OCR( 17 | False, 18 | path_scanned, 19 | ) 20 | with pytest.raises(DeprecationWarning): 21 | ocr.text_to_speech() 22 | 23 | ocr = OCR( 24 | False, 25 | path_scanned, 26 | ) 27 | with pytest.raises(ValueError): 28 | ocr.save_output() 29 | 30 | ocr = OCR( 31 | False, 32 | path_scanned, 33 | ) 34 | with pytest.raises(ValueError): 35 | ocr.process_extracted_text_from_invoice() 36 | 37 | 38 | def test_ocr_with_scanned_image(): 39 | ocr = OCR( 40 | False, 41 | path_scanned, 42 | ) 43 | 44 | assert ocr.path == path_scanned 45 | assert ocr.preprocess is False 46 | 47 | text = ocr.ocr_meaningful_text(save_output=True) 48 | 49 | assert isinstance(ocr.text, str) 50 | assert isinstance(text, str) 51 | assert text == ocr.text 52 | assert os.path.exists("OCR.png") 53 | assert os.path.exists("output.txt") 54 | assert not os.path.exists("preprocessed.png") 55 | 56 | os.remove("OCR.png") 57 | os.remove("output.txt") 58 | 59 | 60 | def test_ocr_with_real_image(): 61 | ocr = OCR( 62 | True, 63 | path_real, 64 | ) 65 | 66 | assert ocr.path == "preprocessed.png" 67 | assert ocr.preprocess is True 68 | 69 | text = ocr.ocr_meaningful_text(preserve_orientation=True) 70 | 71 | assert isinstance(ocr.text, str) 72 | assert isinstance(text, str) 73 | assert text == ocr.text 74 | assert os.path.exists("OCR.png") 75 | assert os.path.exists("preprocessed.png") 76 | 77 | os.remove("OCR.png") 78 | os.remove("preprocessed.png") 79 | 80 | 81 | def test_ocr_sign_board(): 82 
| ocr = OCR( 83 | False, 84 | path_sign_board, 85 | ) 86 | 87 | assert ocr.path == path_sign_board 88 | assert ocr.preprocess is False 89 | 90 | text, detailed_text = ocr.ocr_sparse_text(save_output=True) 91 | 92 | assert isinstance(ocr.text, str) 93 | assert isinstance(text, str) 94 | assert isinstance(ocr.detailed_text, list) 95 | assert isinstance(detailed_text, list) 96 | assert detailed_text == ocr.detailed_text 97 | assert text == ocr.text 98 | assert os.path.exists("OCR.png") 99 | assert os.path.exists("output.txt") 100 | assert not os.path.exists("preprocessed.png") 101 | 102 | os.remove("OCR.png") 103 | os.remove("output.txt") 104 | 105 | 106 | def test_ocr_invoices(): 107 | global path_invoice 108 | ocr = OCR( 109 | False, 110 | path_invoice, 111 | ) 112 | 113 | assert ocr.path == path_invoice 114 | assert ocr.preprocess is False 115 | 116 | text, detailed_text = ocr.ocr_sparse_text() 117 | 118 | assert isinstance(ocr.text, str) 119 | assert isinstance(text, str) 120 | assert isinstance(ocr.detailed_text, list) 121 | assert isinstance(detailed_text, list) 122 | assert detailed_text == ocr.detailed_text 123 | assert text == ocr.text 124 | assert os.path.exists("OCR.png") 125 | assert not os.path.exists("preprocessed.png") 126 | 127 | extracted_info = ocr.process_extracted_text_from_invoice() 128 | 129 | assert isinstance(extracted_info, dict) 130 | assert isinstance(ocr.extracted_info, dict) 131 | assert ocr.extracted_info == extracted_info 132 | assert ( 133 | "price" in extracted_info 134 | and "date" in extracted_info 135 | and "place" in extracted_info 136 | and "order_number" in extracted_info 137 | and "phone_number" in extracted_info 138 | and "post_processed_word_list" in extracted_info 139 | ) is True 140 | assert isinstance(extracted_info["price"], str) 141 | assert isinstance(extracted_info["date"], list) 142 | assert len(extracted_info["date"]) == 0 143 | assert isinstance(extracted_info["place"], str) 144 | assert 
isinstance(extracted_info["phone_number"], list) 145 | assert len(extracted_info["phone_number"]) == 1 146 | assert isinstance(extracted_info["order_number"], int) 147 | assert isinstance(extracted_info["post_processed_word_list"], list) 148 | 149 | path_invoice = "images/1166-receipt.jpg" 150 | ocr = OCR( 151 | False, 152 | path_invoice, 153 | ) 154 | 155 | assert ocr.path == path_invoice 156 | assert ocr.preprocess is False 157 | 158 | text, _ = ocr.ocr_sparse_text() 159 | 160 | assert isinstance(ocr.text, str) 161 | assert isinstance(text, str) 162 | assert text == ocr.text 163 | assert os.path.exists("OCR.png") 164 | assert not os.path.exists("preprocessed.png") 165 | 166 | extracted_info = ocr.process_extracted_text_from_invoice() 167 | 168 | assert isinstance(extracted_info, dict) 169 | assert isinstance(ocr.extracted_info, dict) 170 | assert ocr.extracted_info == extracted_info 171 | assert ( 172 | "price" in extracted_info 173 | and "date" in extracted_info 174 | and "place" in extracted_info 175 | and "order_number" in extracted_info 176 | and "phone_number" in extracted_info 177 | and "post_processed_word_list" in extracted_info 178 | ) is True 179 | assert isinstance(extracted_info["price"], str) 180 | assert isinstance(extracted_info["date"], list) 181 | assert len(extracted_info["date"]) == 1 182 | assert isinstance(extracted_info["place"], str) 183 | assert isinstance(extracted_info["phone_number"], list) 184 | assert len(extracted_info["phone_number"]) == 0 185 | assert isinstance(extracted_info["order_number"], str) 186 | assert isinstance(extracted_info["post_processed_word_list"], list) 187 | 188 | os.remove("OCR.png") 189 | -------------------------------------------------------------------------------- /CONTRIBUTING.md: -------------------------------------------------------------------------------- 1 | # Contributing guide 2 | 3 | If you are planning to develop `OCRed`, or want to use the latest commit of `OCRed` on your local machine, you 
might want to install it from the source. This installation is not recommended for users who want to use the stable version of `OCRed`. The steps below describe the installation process of `OCRed`'s latest commit. They also describe how to test `OCRed`'s codebase and build `OCRed`'s documentation. 4 | 5 | **Note**: `OCRed` uses [Scikit-HEP's developer information](https://scikit-hep.org/developer) as a reference for all the development work. The guide is a general and far more detailed collection of documentation for developing `Scikit-HEP` packages. `OCRed` is not a `Scikit-HEP` package, but it still loosely follows this developer guide as it is absolutely amazing! 6 | 7 | ## Installing OCRed 8 | 9 | We recommend using a virtual environment to install `OCRed`. This isolates the library from your global `Python` environment, which helps with reproducing bugs and with the overall development of `OCRed`. The first step is to clone `OCRed` - 10 | 11 | ``` 12 | git clone https://github.com/Saransh-cpp/OCRed.git 13 | ``` 14 | 15 | and then we can change the current working directory and enter `OCRed` - 16 | 17 | ``` 18 | cd OCRed 19 | ``` 20 | 21 | ### Creating a virtual environment 22 | 23 | A virtual environment can be set up and activated using `venv` in both `UNIX` and `Windows` systems. 24 | 25 | **UNIX**: 26 | 27 | ``` 28 | python3 -m venv .env 29 | .
.env/bin/activate 30 | ``` 31 | 32 | **Windows**: 33 | 34 | ``` 35 | python -m venv .env 36 | .env\Scripts\activate 37 | ``` 38 | 39 | ### Installation 40 | 41 | The developer installation of `OCRed` comes with a lot of options - 42 | 43 | - `test`: the test dependencies 44 | - `docs`: extra dependencies to build and develop `OCRed`'s documentation 45 | - `dev`: installs the `test` dependencies and `nltk` 46 | - `nltk`: installs `nltk` 47 | 48 | These options can be used with `pip` with the editable (`-e`) mode of installation in the following ways - 49 | 50 | ``` 51 | pip install -e .[dev,test] 52 | ``` 53 | 54 | For example, if you want to install the `docs` dependencies along with the dependencies included above, use - 55 | 56 | ``` 57 | pip install -e .[dev,test,docs] 58 | ``` 59 | 60 | ### Adding OCRed for notebooks 61 | 62 | `OCRed` can be added to the notebooks using the following command - 63 | 64 | ``` 65 | python -m ipykernel install --user --name ocred 66 | ``` 67 | 68 | ## Activating pre-commit 69 | 70 | `OCRed` uses a set of `pre-commit` hooks and the `pre-commit` bot to format, type-check, and prettify the codebase. The hooks can be installed locally using - 71 | 72 | ``` 73 | pre-commit install 74 | ``` 75 | 76 | This would run the checks every time a commit is created locally. The checks will only run on the files modified by that commit, but the checks can be triggered for all the files using - 77 | 78 | ``` 79 | pre-commit run --all-files 80 | ``` 81 | 82 | If you would like to skip the failing checks and push the code for further discussion, use the `--no-verify` option with `git commit`. 83 | 84 | ## Testing OCRed 85 | 86 | `OCRed` is tested with `pytest` and `xdoctest`.
`pytest` tests the code, with its configuration available in [pyproject.toml](https://github.com/Saransh-cpp/OCRed/blob/main/pyproject.toml), while `xdoctest` tests the examples available in every docstring, which prevents them from going stale. Additionally, `OCRed` uses `pytest-cov` to calculate the coverage of these unit tests. 87 | 88 | ### Running tests locally 89 | 90 | The tests can be executed using the `test` dependencies of `OCRed` in the following way - 91 | 92 | ``` 93 | python -m pytest -ra 94 | ``` 95 | 96 | ### Running tests with coverage locally 97 | 98 | The coverage value can be obtained while running the tests using `pytest-cov` in the following way - 99 | 100 | ``` 101 | python -m pytest -ra --cov=ocred tests/ 102 | ``` 103 | 104 | ### Running doctests 105 | 106 | The doctests can be executed using the `test` dependencies of `OCRed` in the following way - 107 | 108 | ``` 109 | xdoctest ./ocred/ 110 | ``` 111 | 112 | A much more detailed guide on testing with `pytest` is available [here](https://scikit-hep.org/developer/pytest). 113 | 114 | ## Documenting OCRed 115 | 116 | `OCRed`'s documentation is mainly written in the form of [docstrings](https://peps.python.org/pep-0257/) and [Markdown](https://en.wikipedia.org/wiki/Markdown). The docstrings include the description, arguments, examples, return values, and attributes of a class or a function, and the `.md` files enable us to render this documentation on `OCRed`'s documentation website. 117 | 118 | `OCRed` primarily uses [MkDocs](https://www.mkdocs.org/) and [mkdocstrings](https://mkdocstrings.github.io/) for rendering documentation on its website. The configuration file (`mkdocs.yml`) for `MkDocs` can be found [here](https://github.com/Saransh-cpp/OCRed/blob/main/mkdocs.yml). The documentation is deployed [here](https://ocred.readthedocs.io/en/latest/).
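
As a rough illustration of the docstring layout described above (description, `Args`, `Returns`, and an `Examples` block that a doctest runner can execute), here is a minimal sketch. The function `count_words` is hypothetical and is not part of `OCRed`; it only mirrors the section style used in the package's docstrings.

```python
from __future__ import annotations


def count_words(text: str, *, lowercase: bool = True) -> dict[str, int]:
    """
    Counts how often each word occurs in a piece of text.

    Args:
        text:
            The text to analyse.
        lowercase:
            Folds the text to lower case before counting.

    Returns:
        counts:
            A mapping from each word to its number of occurrences.

    Examples:
        >>> count_words("Cosmos cosmos book")
        {'cosmos': 2, 'book': 1}
    """
    # split on whitespace, optionally ignoring case
    words = text.lower().split() if lowercase else text.split()
    counts: dict[str, int] = {}
    for word in words:
        counts[word] = counts.get(word, 0) + 1
    return counts
```

Because the example lives inside the docstring, a doctest run over the module would re-execute it and flag it if the behaviour ever drifts.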
119 | 120 | Ideally, with the addition of every new feature to `OCRed`, documentation should be added using comments, docstrings, and `.md` files. 121 | 122 | ### Building documentation locally 123 | 124 | The documentation is located in the `docs` folder of the main repository. This documentation can be generated using the `docs` dependencies of `OCRed` in the following way - 125 | 126 | ``` 127 | mkdocs serve 128 | ``` 129 | 130 | The commands executed above will clean any existing documentation build, create a new build (in `./site/`), and serve it on your `localhost`. To just build the documentation, use - 131 | 132 | ``` 133 | mkdocs build 134 | ``` 135 | 136 | ## Nox 137 | 138 | `OCRed` supports running various critical commands using [nox](https://github.com/wntrblm/nox) to make them less intimidating for new developers. All of these commands (or sessions in the language of `nox`) - `lint`, `tests`, `doctests`, `docs`, and `build` - are defined in [noxfile.py](https://github.com/Saransh-cpp/OCRed/blob/main/noxfile.py). 
139 | 140 | `nox` can be installed via `pip` using - 141 | 142 | ``` 143 | pip install nox 144 | ``` 145 | 146 | The default sessions (`lint`, `tests`, and `doctests`) can be executed 147 | using - 148 | 149 | ``` 150 | nox 151 | ``` 152 | 153 | ### Running pre-commit with nox 154 | 155 | The `pre-commit` hooks can be run with `nox` in the following way - 156 | 157 | ``` 158 | nox -s lint 159 | ``` 160 | 161 | ### Running tests with nox 162 | 163 | Tests can be run with `nox` in the following way - 164 | 165 | ``` 166 | nox -s tests 167 | ``` 168 | 169 | ### Building documentation with nox 170 | 171 | Docs can be built with `nox` in the following way - 172 | 173 | ``` 174 | nox -s docs 175 | ``` 176 | 177 | Use the following command if you want to deploy the docs on `localhost` - 178 | 179 | ``` 180 | nox -s docs -- serve 181 | ``` 182 | -------------------------------------------------------------------------------- /ocred/preprocessing.py: -------------------------------------------------------------------------------- 1 | from __future__ import annotations 2 | 3 | import math 4 | 5 | import cv2 6 | import numpy as np 7 | import numpy.typing as npt 8 | from scipy import ndimage 9 | from skimage.filters import threshold_local 10 | 11 | _dep_warn_inplace = "inplace is deprecated and was removed in v0.3.0; Preprocessor now alters self.img directly" # noqa: E501 12 | _dep_warn_overriden_image = "overriden_image is deprecated and was removed in v0.3.0; Preprocessor now only alters self.img" # noqa: E501 13 | 14 | 15 | class Preprocessor: 16 | """ 17 | Preprocesses an image and makes it ready for OCR. 18 | 19 | Args: 20 | image: 21 | Path of the image or a numpy array. 
22 | 23 | Examples: 24 | >>> import sys 25 | >>> sys.displayhook = lambda x: None 26 | >>> import cv2 27 | >>> from scipy import ndimage 28 | >>> from ocred import Preprocessor 29 | >>> # load the image 30 | >>> preprocessed = Preprocessor("images/CosmosTwo.jpg") 31 | >>> # scan the image and copy the scanned image 32 | >>> preprocessed.scan() 33 | >>> orig = preprocessed.img.copy() 34 | >>> # remove noise 35 | >>> preprocessed.remove_noise() 36 | >>> # thicken the ink to draw Hough lines better 37 | >>> preprocessed.thicken_font() 38 | >>> # calculate the median angle of all the Hough lines 39 | >>> _, median_angle = preprocessed.rotate() 40 | >>> # rotate the original scanned image 41 | >>> rotated = ndimage.rotate(orig, median_angle) 42 | >>> # remove noise again 43 | >>> preprocessed = Preprocessor(rotated) 44 | >>> preprocessed.remove_noise() 45 | >>> cv2.imwrite("preprocessed.png", preprocessed.img) 46 | True 47 | """ 48 | 49 | def __init__( 50 | self, 51 | image: str | npt.NDArray[np.int64] | npt.NDArray[np.float64], 52 | ) -> None: 53 | if isinstance(image, str): 54 | self.img = cv2.imread(image) 55 | else: 56 | self.img = image 57 | 58 | def remove_noise( 59 | self, 60 | *, 61 | save: bool | None = False, 62 | inplace: bool | None = None, 63 | iterations: int | None = 1, 64 | overriden_image: npt.NDArray[np.int64] | npt.NDArray[np.float64] | None = None, 65 | ) -> npt.NDArray[np.int64] | npt.NDArray[np.float64]: 66 | """ 67 | Removes noise from an image. 68 | 69 | Args: 70 | save: 71 | Saves the resultant image. 72 | iterations: 73 | Number of times the image is processed. 74 | inplace: 75 | DANGER: Deprecated since version v0.3.0. 76 | Was intended to edit the image inplace, but never actually worked. 77 | overriden_image: 78 | DANGER: Deprecated since version v0.3.0. 79 | Was used to pass a new image to the method but was redundant and buggy. 80 | 81 | Returns: 82 | noise_free_image: 83 | The noise free image.
84 | """ 85 | if inplace is not None: 86 | raise DeprecationWarning(_dep_warn_inplace) 87 | if overriden_image is not None: 88 | raise DeprecationWarning(_dep_warn_overriden_image) 89 | 90 | kernel: npt.NDArray[np.int64] = np.ones((1, 1), np.uint8) 91 | self.img = cv2.dilate(self.img, kernel, iterations=iterations) 92 | kernel = np.ones((1, 1), np.uint8) 93 | self.img = cv2.erode(self.img, kernel, iterations=iterations) 94 | self.img = cv2.morphologyEx(self.img, cv2.MORPH_CLOSE, kernel) 95 | self.img = cv2.medianBlur(self.img, 3) 96 | 97 | if save: 98 | cv2.imwrite("noise_free.png", self.img) 99 | 100 | return self.img 101 | 102 | def thicken_font( 103 | self, 104 | *, 105 | save: bool | None = False, 106 | inplace: bool | None | None = None, 107 | iterations: int | None = 2, 108 | overriden_image: npt.NDArray[np.int64] | npt.NDArray[np.float64] | None = None, 109 | ) -> npt.NDArray[np.int64] | npt.NDArray[np.float64]: 110 | """ 111 | Thickens the ink of an image. 112 | 113 | Args: 114 | save: 115 | Saves the resultant image. 116 | iterations: 117 | Number of times the image is processed. 118 | inplace: 119 | DANGER: Deprecated since version v0.3.0. 120 | Was intended to edit the image inplace, but never actually worked. 121 | overriden_image: 122 | DANGER: Deprecated since version v0.3.0. 123 | Was used to pass a new image to the method but was redundant and buggy. 124 | 125 | Returns: 126 | thickened_image: 127 | The thickened image. 
128 | """ 129 | if inplace is not None: 130 | raise DeprecationWarning(_dep_warn_inplace) 131 | if overriden_image is not None: 132 | raise DeprecationWarning(_dep_warn_overriden_image) 133 | 134 | self.img = cv2.bitwise_not(self.img) 135 | kernel: npt.NDArray[np.int64] = np.ones((2, 2), np.uint8) 136 | self.img = cv2.dilate(self.img, kernel, iterations=iterations) 137 | self.img = cv2.bitwise_not(self.img) 138 | 139 | if save: 140 | cv2.imwrite("thick_font.png", self.img) 141 | 142 | return self.img 143 | 144 | def scan( 145 | self, 146 | *, 147 | save: bool | None = False, 148 | inplace: bool | None | None = None, 149 | overriden_image: npt.NDArray[np.int64] | npt.NDArray[np.float64] | None = None, 150 | ) -> npt.NDArray[np.int64] | npt.NDArray[np.float64]: 151 | """ 152 | Transforms an image/document view into B&W view (proper scanned colour scheme). 153 | 154 | Args: 155 | save: 156 | Saves the resultant image. 157 | inplace: 158 | DANGER: Deprecated since version v0.3.0. 159 | Was intended to edit the image inplace, but never actually worked. 160 | overriden_image: 161 | DANGER: Deprecated since version v0.3.0. 162 | Was used to pass a new image to the method but was redundant and buggy. 163 | 164 | Returns: 165 | scanned_image: 166 | The scanned image. 
167 | """ 168 | if inplace is not None: 169 | raise DeprecationWarning(_dep_warn_inplace) 170 | if overriden_image is not None: 171 | raise DeprecationWarning(_dep_warn_overriden_image) 172 | 173 | self.img = cv2.cvtColor(self.img, cv2.COLOR_BGR2GRAY) 174 | thr = threshold_local(self.img, 11, offset=10, method="gaussian") 175 | self.img = (self.img > thr).astype("uint8") * 255 176 | 177 | if save: 178 | cv2.imwrite("scanned.png", self.img) 179 | 180 | return self.img 181 | 182 | def rotate( 183 | self, 184 | *, 185 | save: bool | None = False, 186 | inplace: bool | None | None = None, 187 | overriden_image: npt.NDArray[np.int64] | npt.NDArray[np.float64] | None = None, 188 | ) -> tuple[npt.NDArray[np.int64] | npt.NDArray[np.float64], float]: 189 | """ 190 | Rotates an image for a face-on view (view from the top). 191 | 192 | Args: 193 | save: 194 | Saves the resultant image. 195 | inplace: 196 | DANGER: Deprecated since version v0.3.0. 197 | Was intended to edit the image inplace, but never actually worked. 198 | overriden_image: 199 | DANGER: Deprecated since version v0.3.0. 200 | Was used to pass a new image to the method but was redundant and buggy. 201 | 202 | Returns: 203 | rotated_image: 204 | The rotated image. 205 | median_angle: 206 | The angle by which it is rotated.
207 | """ 208 | if inplace is not None: 209 | raise DeprecationWarning(_dep_warn_inplace) 210 | if overriden_image is not None: 211 | raise DeprecationWarning(_dep_warn_overriden_image) 212 | 213 | img_edges = cv2.Canny(self.img, 100, 100, apertureSize=3) 214 | lines = cv2.HoughLinesP( 215 | img_edges, 216 | rho=1, 217 | theta=np.pi / 180.0, 218 | threshold=160, 219 | minLineLength=100, 220 | maxLineGap=10, 221 | ) 222 | 223 | angles = [] 224 | for [[x1, y1, x2, y2]] in lines: 225 | angle = math.degrees(math.atan2(y2 - y1, x2 - x1)) 226 | angles.append(angle) 227 | 228 | median_angle = float(np.median(angles)) 229 | self.img = ndimage.rotate(self.img, median_angle) 230 | 231 | if save: 232 | cv2.imwrite("rotated.png", self.img) 233 | 234 | return self.img, median_angle 235 | -------------------------------------------------------------------------------- /ocred/ocr.py: -------------------------------------------------------------------------------- 1 | from __future__ import annotations 2 | 3 | import re 4 | import typing 5 | 6 | import cv2 7 | import easyocr 8 | import pytesseract 9 | from scipy import ndimage 10 | 11 | from ocred.preprocessing import Preprocessor 12 | 13 | 14 | class OCR: 15 | """ 16 | Performs OCR on a given image, saves an image with boxes around the words, and 17 | converts the extracted text to an MP3 file. 18 | 19 | Add Tesseract OCR's installation location to PATH for functions using it to work. 20 | 21 | Args: 22 | 23 | preprocess: 24 | Set True if the image is a real life photo of some large meaningful text (a 25 | page of a book). Usually set to True when OCRing such photos using 26 | `ocr_meaningful_text`, so that the image is preprocessed first. 27 | Set False if the image is a scanned photo (an e-book). It will not be 28 | pre-processed before OCRing. 29 | Use the `Preprocessor` class manually to have more control! 30 | path: 31 | Path of the image to be used.
32 | 33 | Examples: 34 | >>> import sys 35 | >>> sys.displayhook = lambda x: None 36 | >>> import ocred 37 | >>> ocr = ocred.OCR( 38 | ... False, # preprocess -> False skips preprocessing 39 | ... "./images/Page.png" 40 | ... ) 41 | >>> ocr.ocr_meaningful_text(save_output=True) 42 | """ 43 | 44 | def __init__(self, preprocess: bool, path: str) -> None: 45 | self.path = path 46 | self.preprocess = preprocess 47 | 48 | if self.preprocess: 49 | preprocessed = Preprocessor(self.path) 50 | 51 | # scan the image and copy the scanned image 52 | preprocessed.scan() 53 | orig = preprocessed.img.copy() 54 | 55 | # remove noise 56 | preprocessed.remove_noise() 57 | 58 | # thicken the ink to draw Hough lines better 59 | preprocessed.thicken_font() 60 | 61 | # calculate the median angle of all the Hough lines 62 | _, median_angle = preprocessed.rotate() 63 | 64 | # rotate the original scanned image 65 | rotated = ndimage.rotate(orig, median_angle) 66 | 67 | # remove noise again 68 | preprocessed = Preprocessor(rotated) 69 | preprocessed.remove_noise() 70 | 71 | cv2.imwrite("preprocessed.png", preprocessed.img) 72 | self.path = "preprocessed.png" 73 | 74 | def ocr_meaningful_text( 75 | self, 76 | *, 77 | tesseract_config: str | None = "-l eng --oem 1", 78 | preserve_orientation: bool | None = False, 79 | save_output: bool | None = False, 80 | ) -> str: 81 | """ 82 | Performs OCR on long meaningful text documents and saves the image with boxes 83 | around the words. For example - books, PDFs etc. 84 | 85 | Args: 86 | tesseract_config: 87 | Configuration passed down to the Tesseract OCR Engine. 88 | preserve_orientation: 89 | Preserves the orientation of OCRed text. 90 | save_output: 91 | Saves the text to `output.txt` file. 92 | 93 | Returns: 94 | text: 95 | The extracted text.
96 | """ 97 | # reading the image 98 | img = cv2.imread(self.path) 99 | 100 | # extracting the text 101 | self.text = pytesseract.image_to_string(img, config=tesseract_config) 102 | if not preserve_orientation: 103 | self.text = self.text.replace("-\n", "").replace("\n", " ") 104 | 105 | # adding boxes around the words 106 | boxes = pytesseract.image_to_data(img) 107 | for z, box in enumerate(boxes.splitlines()): 108 | if z != 0: 109 | box = box.split() 110 | 111 | # if the data has a word 112 | if len(box) == 12: 113 | x, y = int(box[6]), int(box[7]) 114 | w, h = int(box[8]), int(box[9]) 115 | 116 | cv2.rectangle(img, (x, y), (x + w, y + h), (0, 0, 255), 1) 117 | 118 | cv2.imwrite("OCR.png", img) 119 | 120 | if save_output: 121 | self.save_output() 122 | 123 | return self.text 124 | 125 | def ocr_sparse_text( 126 | self, 127 | *, 128 | languages: list[str] | None = ["en", "hi"], 129 | decoder: str | None = "greedy", 130 | save_output: bool | None = False, 131 | ) -> tuple[str, typing.Any]: 132 | """ 133 | Performs OCR on sparse text and saves the image with boxes around the words. 134 | This method can be used to OCR documents in which the characters don't form 135 | any proper/meaningful sentences, or if there are very few meaningful sentences, 136 | for example - bills, sign-boards etc. 137 | 138 | Args: 139 | languages: 140 | A list of languages that the signboard possibly has. 141 | Note: Provide only the languages that are present in the image, adding 142 | additional languages misguides the model. 143 | decoder: 144 | If the document has a larger number of meaningful sentences then use 145 | "beamsearch". For most of the cases "greedy" works very well. 146 | save_output: 147 | Saves the text to `output.txt` file. 148 | 149 | Returns: 150 | text: 151 | The extracted text. 152 | detailed_text: 153 | Text with extra information (returned by easyocr.Reader.readtext()).
154 | """ 155 | self.text = "" 156 | 157 | # reading the image using open-cv and easyocr 158 | img = cv2.imread(self.path) 159 | reader = easyocr.Reader( 160 | languages 161 | ) # slow for the first time (also depends upon CPU/GPU) 162 | self.detailed_text: typing.Any = reader.readtext( 163 | self.path, decoder=decoder, batch_size=5 164 | ) 165 | 166 | for text in self.detailed_text: 167 | # extracting the coordinates to highlight the text 168 | coords_lower = text[0][:2] 169 | coords_upper = text[0][2:4] 170 | 171 | coords_lower.sort(key=lambda x: x[0]) 172 | pt1 = [int(x) for x in coords_upper[-1]] 173 | 174 | coords_lower.sort(key=lambda x: x[0]) 175 | pt2 = [int(x) for x in coords_lower[-1]] 176 | 177 | # highlighting the text 178 | cv2.rectangle(img, pt1, pt2, (0, 0, 255), 1) 179 | 180 | self.text = self.text + " " + text[-2] 181 | 182 | cv2.imwrite("OCR.png", img) 183 | 184 | if save_output: 185 | self.save_output() 186 | 187 | return self.text, self.detailed_text 188 | 189 | def process_extracted_text_from_invoice(self) -> dict[str, typing.Any]: 190 | """ 191 | This method processes the extracted text from invoices, and returns some useful 192 | information. 193 | 194 | Returns: 195 | extracted_info: 196 | The extracted information. 
197 | """ 198 | if not hasattr(self, "detailed_text"): 199 | raise ValueError("no invoice OCRed; OCR an invoice first") 200 | 201 | import nltk 202 | 203 | nltk.download("punkt") 204 | nltk.download("wordnet") 205 | nltk.download("stopwords") 206 | 207 | self.extracted_info = {} 208 | self.text_list = self.text.split(" ") 209 | 210 | # find date 211 | date_re = re.compile( 212 | r"^([1-9]|0[1-9]|1[0-9]|2[0-9]|3[0-1])(\.|-|\/)([1-9]|0[1-9]|1[0-2])(\.|-|\/)([0-9][0-9]|19[0-9][0-9]|20[0-9][0-9])$", 213 | ) 214 | date = list(filter(date_re.match, self.text_list)) 215 | 216 | # find phone number 217 | phone_number_re = re.compile( 218 | r"((\+*)((0[ -]*)*|((91 )*))((\d{12})+|(\d{10})+))|\d{5}([- ]*)\d{6}", 219 | ) 220 | phone_number = list(filter(phone_number_re.match, self.text_list)) 221 | 222 | # find place 223 | place = self.detailed_text[0][-2] 224 | 225 | # remove punctuation and redundant words 226 | tokenizer = nltk.RegexpTokenizer(r"\w+") 227 | removed_punctuation = tokenizer.tokenize(self.text) 228 | 229 | stop_words = set(nltk.corpus.stopwords.words("english")) 230 | post_processed_word_list = [ 231 | w for w in removed_punctuation if w not in stop_words 232 | ] 233 | 234 | # find order number 235 | order_number: str | int = "" 236 | for i in range(len(post_processed_word_list)): 237 | if post_processed_word_list[i].lower() == "order": 238 | try: 239 | order_number = int(post_processed_word_list[i + 1]) 240 | except Exception: 241 | order_number = post_processed_word_list[i + 2] 242 | break 243 | 244 | # find total price 245 | price: list[typing.Any] | str = "" 246 | 247 | # try finding a number with Rs, INR, ₹ or रे in front of it or Rs, INR at the end 248 | # of it 249 | try: 250 | price = re.findall( 251 | r"(?:Rs\.?|INR|₹\.?|रे\.?)\s*(\d+(?:[.,]\d+)*)|(\d+(?:[.,]\d+)*)\s*(?:Rs\.?|INR)", 252 | self.text, 253 | ) 254 | price = [float(a or b) for a, b in price] # findall returns one (prefix, suffix) group tuple per match 255 | price = max(price) 256 | # try finding numbers with "grand total" or "total" written in front of them
257 | except ValueError: 258 | lowered_list = [x.lower() for x in post_processed_word_list] 259 | if "grand" in lowered_list: 260 | indices = [i for i, x in enumerate(lowered_list) if x == "grand"] 261 | i = indices[-1] 262 | price = post_processed_word_list[i + 2] 263 | elif "total" in lowered_list: 264 | indices = [i for i, x in enumerate(lowered_list) if x == "total"] 265 | i = indices[-1] 266 | price = post_processed_word_list[i + 1] 267 | 268 | self.extracted_info.update( 269 | { 270 | "price": price, 271 | "date": date, 272 | "place": place, 273 | "order_number": order_number, 274 | "phone_number": phone_number, 275 | "post_processed_word_list": post_processed_word_list, 276 | } 277 | ) 278 | 279 | return self.extracted_info 280 | 281 | def save_output(self) -> None: 282 | """Saves the extracted text in the `output.txt` file.""" 283 | if not hasattr(self, "text"): 284 | raise ValueError("no text OCRed; OCR a document first") 285 | f = open("output.txt", "w", encoding="utf-8") 286 | f.write(self.text) 287 | f.close() 288 | 289 | def text_to_speech(self, *, lang: str | None = "en") -> None: 290 | """ 291 | DANGER: Deprecated since version v0.2.0. 292 | Instead, use gTTS manually. 293 | 294 | Converts the extracted text to speech and saves it as an MP3 file. 295 | 296 | Args: 297 | lang: 298 | Language of the processed text.
299 | """ 300 | raise DeprecationWarning( 301 | "text_to_speech is deprecated and was removed in v0.2.0; use gTTS manually", 302 | ) 303 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # OCRed 2 | 3 | ![ocred](https://user-images.githubusercontent.com/74055102/190708403-0e29eca6-eb55-40f5-9c8b-787284f90f54.png) 4 | 5 | [![CI](https://github.com/Saransh-cpp/OCRed/actions/workflows/ci.yml/badge.svg)](https://github.com/Saransh-cpp/OCRed/actions/workflows/ci.yml) 6 | [![Documentation Status](https://readthedocs.org/projects/ocred/badge/?version=latest)](https://ocred.readthedocs.io/en/latest/?badge=latest) 7 | [![pre-commit.ci status](https://results.pre-commit.ci/badge/github/Saransh-cpp/OCRed/main.svg)](https://results.pre-commit.ci/latest/github/Saransh-cpp/OCRed/main) 8 | [![codecov](https://codecov.io/gh/Saransh-cpp/OCRed/branch/main/graph/badge.svg?token=L6ObHKhaZ7)](https://codecov.io/gh/Saransh-cpp/OCRed) 9 | [![discussion](https://img.shields.io/static/v1?label=Discussions&message=Ask&color=blue&logo=github)](https://github.com/Saransh-cpp/OCRed/discussions) 10 | 11 | [![Python Versions](https://img.shields.io/pypi/pyversions/ocred)](https://pypi.org/project/ocred/) 12 | [![Package Version](https://badge.fury.io/py/ocred.svg)](https://pypi.org/project/ocred/) 13 | [![Downloads](https://static.pepy.tech/badge/ocred)](https://pepy.tech/project/ocred) 14 | ![License](https://img.shields.io/github/license/Saransh-cpp/OCRed?color=blue) 15 | [![black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black) 16 | 17 | 18 | 19 | 20 | [![All Contributors](https://img.shields.io/badge/all_contributors-2-orange.svg?style=flat-square)](#contributors-) 21 | 22 | 23 | 24 | 25 | 26 | `OCRed` (pronounced as _OCR'd_) provides clever, simple, and intuitive wrapper functionalities for OCRing specific text material. 
You don't want to learn `OCR` or the libraries that will help you perform `OCR`, but you need to `OCR` something? This friendly neighborhood library hides all of that stuff under simple functions like `ocr_meaningful_text()`. 27 | 28 | In other words, rather than preprocessing manually, looking for an OCR library, learning the library, and only then getting what you were looking for, use `OCRed`. 29 | 30 | On the other hand, if you want to learn `OCR` and use the famous `OCR` libraries by yourself, then this library is not for you. It can still be a good start for your journey, though! 31 | 32 | ## Structure 33 | 34 | `OCR` is performed using the [`OCR`](https://github.com/Saransh-cpp/OCRed/blob/main/ocred/ocr.py) class and preprocessing of an image is performed using the [`Preprocessor`](https://github.com/Saransh-cpp/OCRed/blob/main/ocred/preprocessing.py) class. All the details are available in the [documentation](https://ocred.readthedocs.io/en/latest/). 35 | 36 | ## Installation 37 | 38 | 1. Install Tesseract for your OS and add it to PATH 39 | 40 | The installation guide is available [here](https://tesseract-ocr.github.io/tessdoc/Installation.html) 41 | 42 | 2.
Use `pip` magic 43 | 44 | `OCRed` uses modern `Python` packaging and can be installed using `pip` - 45 | 46 | ``` 47 | python -m pip install ocred 48 | ``` 49 | 50 | ## Usage example 51 | 52 | ```py 53 | # OCRing a book 54 | import ocred 55 | 56 | ocr = ocred.OCR( 57 | True, # preprocess -> preprocess the image before OCRing 58 | "path/to/an/image", # path 59 | ) 60 | ocr.ocr_meaningful_text(save_output=True) 61 | ``` 62 | 63 | ```py 64 | # OCRing a signboard 65 | import ocred 66 | 67 | ocr = ocred.OCR( 68 | False, # preprocess -> sign boards don't need to be preprocessed 69 | "path/to/an/image", # path 70 | ) 71 | extracted_text = ocr.ocr_sparse_text() 72 | print(extracted_text) 73 | ``` 74 | 75 | ```py 76 | # OCRing an invoice 77 | import ocred 78 | 79 | ocr = ocred.OCR( 80 | False, # preprocess -> invoices don't need to be preprocessed 81 | "path/to/an/image", # path 82 | ) 83 | extracted_text = ocr.ocr_sparse_text() 84 | print(extracted_text) 85 | 86 | extracted_info = ocr.process_extracted_text_from_invoice() 87 | print(extracted_info) 88 | ``` 89 | 90 | ```py 91 | # manually preprocessing an image 92 | import cv2 93 | from scipy import ndimage 94 | from ocred import Preprocessor 95 | 96 | preprocessed = Preprocessor("path/to/img.jpg") 97 | 98 | # scan the image and copy the scanned image 99 | preprocessed.scan() 100 | orig = preprocessed.img.copy() 101 | 102 | # remove noise 103 | preprocessed.remove_noise() 104 | 105 | # thicken the ink to draw Hough lines better 106 | preprocessed.thicken_font() 107 | 108 | # calculate the median angle of all the Hough lines 109 | _, median_angle = preprocessed.rotate() 110 | 111 | # rotate the original scanned image 112 | rotated = ndimage.rotate(orig, median_angle) 113 | 114 | # remove noise again 115 | preprocessed = Preprocessor(rotated) 116 | preprocessed.remove_noise() 117 | 118 | cv2.imwrite("preprocessed.png", preprocessed.img) 119 | ``` 120 | 121 | ## Testing 122 | 123 | The tests are present in the `tests` directory.
New tests must be added with any additional features. 124 | 125 | To run the tests - 126 | 127 | ``` 128 | pytest 129 | ``` 130 | 131 | ## Some examples 132 | 133 | ![roof-500x500](https://user-images.githubusercontent.com/74055102/135721441-7516bbf1-da6f-498b-a30b-d381c66b187e.jpg) 134 | ![OCR](https://user-images.githubusercontent.com/74055102/135721446-5ea2e3f9-7cab-41f9-a1b0-52ff6707b0c2.png) 135 | 136 | ``` 137 | जयपुर JAIPUR 321 आगरा AGRA 554 श्री गगांनगर 242 SRIGANGANAGAR JODHPUR 261 जोधपुर 138 | ``` 139 | 140 | ![Page](https://user-images.githubusercontent.com/74055102/133644506-3dcf08fc-36f9-404a-b1b7-65117a3f9869.png) 141 | ![OCR](https://user-images.githubusercontent.com/74055102/133644598-89551323-df51-45cc-8210-871b2c4dd756.png) 142 | 143 | ``` 144 | Preface This book deals with computer architecture as well as computer organization and design. Computer architecture is concerned with the structure and behavior of the various functional modules of the computer and how they interact to provide the processing needs of the user. Computer organization is concerned with the way the hardware components are connected together to form a computer system. Computer design is concerned with the development of the hardware for the computer taking into consideration a given set of specifications. The book provides the basic knowledge necessary to understand the hardware operation of digital computers and covers the three subjects associated with computer hardware. Chapters 1 through 4 present the various digital components used in the organization and design of digital computers. Chapters 5 through 7 show the detailed steps that a designer must go through in order to design an elementary basic computer. Chapters 8 through 10 deal with the organization and architecture of the central processing unit. Chapters 11 and 12 present the organization and architecture of input-output and memory. Chapter 13 introduces the concept of multiprocessing. 
The plan of the book is to present the simpler material first and introduce the more advanced subjects later, Thus, the first seven chapters cover material needed for the basic understanding of computer organization, design, and programming of a simple digital computer. The last six chapters present the organization and architecture of the separate functional units of the digital computer with an emphasis ‘on more advanced topics. ‘The material in the third edition is organized in the same manner as in the second edition and many of the features remain the same. The third edition, however, offers several improvements over the second edition. All chapters ‘two (6 and 10) have been completely revised to bring the material up to date and to clarify the presentation. Two new chapters were added: chapter 9 on pipeline and vector processing, and chapter 13 on multiprocessors. Two sections deal with the reduced instruction set computer (RISC). Chapter 5 has been revised completely to simplify and clarify the design of the basic computer. New problems have been formulated for eleven of the thirteen chapters. ‘The physical organization of a particular computer including its registers 145 | ``` 146 | 147 | ![CosmosOne](https://user-images.githubusercontent.com/74055102/133640550-eba241af-db0a-46e3-9b24-b4219dd74cfd.jpg) 148 | ![preprocessed](https://user-images.githubusercontent.com/74055102/136529402-eb42d8fa-d987-4b09-bb36-8d5a477ed391.png) 149 | ![OCR](https://user-images.githubusercontent.com/74055102/136529362-9c82a1f2-ffde-4edc-a154-0692a3b219a8.png) 150 | 151 | ``` 152 | organisms of our globe, including hydrogen, sodiurn, magnesiuia, and iron. May it not be thai, at least, the brighter stars are like our Sun, the upholding and energizing centres of systems of worlds, adapted to be the abode of living beings? — William Hugeins, 1865 All my life I have wondered about the possibility of life elsewhere. What would it be like? Of what would it be made? 
All living things on our planet are constructed of organic molecules ~ complex microscopic architectures in which the carbon atom plays a central role. There was once a time before life, when the Earth was barren and utterly desolate. Our world is now overflowing with life. How did it come about? How, in the absence of life, were carbon-based organic molecules made? How did the first living things arise? How did life evolve to produce beings as elaborate and complex as we, able to explore the mystery of Our Own origins? And on ihe countless other planets that many circle other suns, is there life also? Is extraterrestrial life, if it exists, based on the same organic molecules as life on Earth? Do the beings of other worlds look much like life on Earth? Or are they stunningly different — other adaptations to other environments? What else is possible? The nature of life on Earth and the search for life elsewhere are two sides of the sarne question — the search for who we are. In the great dark between the stars there are clouds of gas and dust and organic matter. Dozens of different kinds of organic molecules have been found there by radio telescopes. The abundance of these molecules suggests that the stuff of life is everywhere. Perhaps the origin and evolution of life is, given enough time, a cosmic inevitability. On some of the billions of planets in the Milky Way Galaxy, life may never arise. On others, it May arise and die out, or never evolve beyond its simplest forms. And on some small fraction of worlds there may 35 153 | ``` 154 | 155 | ## Contributing 156 | 157 | If you want to contribute to `OCRed` (thanks!), please have a look at our [Contributing Guide](https://github.com/Saransh-cpp/OCRed/blob/main/CONTRIBUTING.md). 158 | 159 | ## Contributors ✨ 160 | 161 | Thanks goes to these wonderful people ([emoji key](https://allcontributors.org/docs/en/emoji-key)): 162 | 163 | 164 | 165 | 166 | 167 | 168 | 169 | 170 | 171 |

- Saransh: 💻 🐛 🖋 📖 🎨 💡 🤔 🚇 🚧 📦 👀 ⚠️ 🧑‍🏫
- Priyanshi Goel: 🐛
172 | 173 | 174 | 175 | 176 | 177 | 178 | This project follows the [all-contributors](https://github.com/all-contributors/all-contributors) specification. Contributions of any kind welcome! 179 | --------------------------------------------------------------------------------