
### Alternative installation methods

* [Snap Store](https://snapcraft.io/djpdf)
* Manual:
  * Dependencies: [ImageMagick](http://www.imagemagick.org/), [QPDF](https://github.com/qpdf/qpdf),
    [jbig2enc](https://github.com/agl/jbig2enc), [Tesseract](https://github.com/tesseract-ocr/tesseract)
  * Install library and CLI: `pip3 install .`
  * Install GUI: `meson builddir && meson install -C builddir`

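Before a manual install, it can help to confirm that the external tools are reachable. The following is a minimal sketch, not part of djpdf: the function `find_missing_tools` and the table `REQUIRED_TOOLS` are hypothetical, and the executable names are assumptions based on each project's usual command name (ImageMagick installs `magick` in version 7 and `convert` in version 6).

```python
import shutil

# Assumed executable names; adjust them if your distribution uses others.
REQUIRED_TOOLS = {
    "ImageMagick": ("magick", "convert"),
    "QPDF": ("qpdf",),
    "jbig2enc": ("jbig2",),
    "Tesseract": ("tesseract",),
}


def find_missing_tools(required=REQUIRED_TOOLS, which=shutil.which):
    """Return the tools for which no candidate executable is on PATH."""
    return [
        name
        for name, candidates in required.items()
        if not any(which(candidate) for candidate in candidates)
    ]


if __name__ == "__main__":
    missing = find_missing_tools()
    if missing:
        print("Missing dependencies:", ", ".join(missing))
    else:
        print("All external dependencies were found on PATH.")
```

The `which` parameter exists only so the lookup can be swapped out in tests; by default the check uses `shutil.which`, which searches the current `PATH`.
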
## Translation

We use [Weblate](https://hosted.weblate.org/engage/djpdf/) to translate the UI. Feel free to contribute translations there.

## Screenshots

--------------------------------------------------------------------------------
/flatpak/tesseract-wrapper.py:
--------------------------------------------------------------------------------
#!/usr/bin/env python3

import os
import subprocess
import sys
import tempfile

PREFIX = '/app'
TESSERACT_RELPATH = 'bin/tesseract.real'
TESSDATA_RELPATH = 'share/tessdata'
EXTENSIONS_RELPATH = 'extensions/ocr'


def merge_directories(target, sources):
    # Mirror each source tree into target: real directories, symlinked files.
    for source in sources:
        for dirpath, dirnames, filenames in os.walk(source):
            rel_dirpath = os.path.relpath(dirpath, start=source)
            for name in dirnames:
                os.makedirs(os.path.join(target, rel_dirpath, name),
                            exist_ok=True)
            for name in filenames:
                os.symlink(os.path.join(dirpath, name),
                           os.path.join(target, rel_dirpath, name))


def exec_tesseract(tessdata_path=None):
    # Run the real binary with this script's argv and exit with its status.
    env = os.environ.copy()
    if tessdata_path is not None:
        env['TESSDATA_PREFIX'] = tessdata_path
    tesseract = os.path.join(PREFIX, TESSERACT_RELPATH)
    result = subprocess.run(sys.argv, executable=tesseract, env=env)
    # sys.exit is always available; the bare exit() helper comes from `site`.
    sys.exit(result.returncode)


def main():
    # Respect a TESSDATA_PREFIX that the caller has already set.
    if 'TESSDATA_PREFIX' in os.environ:
        exec_tesseract()
    # Otherwise merge the bundled tessdata with each OCR extension's tessdata.
    tessdata_paths = [os.path.join(PREFIX, TESSDATA_RELPATH)]
    for entry in os.scandir(os.path.join(PREFIX, EXTENSIONS_RELPATH)):
        tessdata_paths.append(os.path.join(entry.path, TESSDATA_RELPATH))
    with tempfile.TemporaryDirectory(prefix='tessdata-') as tempdir:
        merge_directories(tempdir, tessdata_paths)
        exec_tesseract(tempdir)


if __name__ == '__main__':
    main()

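The wrapper's `merge_directories` calls `os.symlink` for every file, and `os.symlink` raises `FileExistsError` when the link path already exists, which would happen if two tessdata sources shipped a file with the same relative path. Whether that can occur in the Flatpak layout is not stated here. The sketch below is a hypothetical variant (the name `merge_directories_first_wins` is not from the repository) that keeps the file from the earliest source instead of failing:

```python
import os
import tempfile


def merge_directories_first_wins(target, sources):
    """Symlink every file under each source into target; earlier sources win."""
    for source in sources:
        for dirpath, dirnames, filenames in os.walk(source):
            rel_dirpath = os.path.relpath(dirpath, start=source)
            for name in dirnames:
                os.makedirs(os.path.join(target, rel_dirpath, name),
                            exist_ok=True)
            for name in filenames:
                link = os.path.join(target, rel_dirpath, name)
                try:
                    os.symlink(os.path.join(dirpath, name), link)
                except FileExistsError:
                    # An earlier source already provided this file; keep it.
                    pass


def demo():
    """Merge two throwaway directories and return the merged file names."""
    with tempfile.TemporaryDirectory() as base:
        first = os.path.join(base, "first")
        second = os.path.join(base, "second")
        merged = os.path.join(base, "merged")
        for directory in (first, second, merged):
            os.makedirs(directory)
        # Both sources provide eng.traineddata; only the second has deu.
        for directory, names in ((first, ["eng.traineddata"]),
                                 (second, ["eng.traineddata",
                                           "deu.traineddata"])):
            for name in names:
                with open(os.path.join(directory, name), "w") as handle:
                    handle.write(directory)
        merge_directories_first_wins(merged, [first, second])
        return sorted(os.listdir(merged))


if __name__ == "__main__":
    print(demo())
```

Letting the earliest source win means the bundled tessdata directory, which the wrapper lists first, would take precedence over extensions. Reversing the source list would give extensions priority instead.
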
--------------------------------------------------------------------------------
/desktop/com.github.unrud.djpdf.metainfo.xml.in:
--------------------------------------------------------------------------------
Create small, searchable PDFs from scanned documents.
The program divides images into bitonal foreground images (text)
and a color background image, then compresses them separately.
An invisible OCR text layer is added, making the PDF searchable.

Color and grayscale scans need some preparation for good results.
Recommended tools are Scan Tailor or GIMP.

A GUI and command line interface are included.