├── .github
    └── workflows
    │   ├── cd.yml
    │   └── ci.yml
├── .gitignore
├── README.md
├── config.yml
├── manifest.yml
├── pdf_segmenter.py
├── requirements.txt
└── tests
    ├── __init__.py
    ├── conftest.py
    ├── data
        ├── cats_are_awesome.pdf
        ├── cats_are_awesome_img.pdf
        ├── cats_are_awesome_text.pdf
        ├── test_img_0.jpg
        └── test_img_1.jpg
    ├── integration
        ├── __init__.py
        └── test_exec.py
    ├── requirements.txt
    └── unit
        ├── __init__.py
        └── test_exec.py

/.github/workflows/cd.yml:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/jina-ai/executor-pdfsegmenter/HEAD/.github/workflows/cd.yml

--------------------------------------------------------------------------------
/.github/workflows/ci.yml:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/jina-ai/executor-pdfsegmenter/HEAD/.github/workflows/ci.yml

--------------------------------------------------------------------------------
/.gitignore:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/jina-ai/executor-pdfsegmenter/HEAD/.gitignore

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/jina-ai/executor-pdfsegmenter/HEAD/README.md

--------------------------------------------------------------------------------
/config.yml:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/jina-ai/executor-pdfsegmenter/HEAD/config.yml

--------------------------------------------------------------------------------
/manifest.yml:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/jina-ai/executor-pdfsegmenter/HEAD/manifest.yml

--------------------------------------------------------------------------------
/pdf_segmenter.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/jina-ai/executor-pdfsegmenter/HEAD/pdf_segmenter.py

--------------------------------------------------------------------------------
/requirements.txt:
--------------------------------------------------------------------------------
PyMuPDF==1.21.1
pdfplumber==0.8.0

--------------------------------------------------------------------------------
/tests/__init__.py:
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
/tests/conftest.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/jina-ai/executor-pdfsegmenter/HEAD/tests/conftest.py

--------------------------------------------------------------------------------
/tests/data/cats_are_awesome.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/jina-ai/executor-pdfsegmenter/HEAD/tests/data/cats_are_awesome.pdf

--------------------------------------------------------------------------------
/tests/data/cats_are_awesome_img.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/jina-ai/executor-pdfsegmenter/HEAD/tests/data/cats_are_awesome_img.pdf

--------------------------------------------------------------------------------
/tests/data/cats_are_awesome_text.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/jina-ai/executor-pdfsegmenter/HEAD/tests/data/cats_are_awesome_text.pdf

--------------------------------------------------------------------------------
/tests/data/test_img_0.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/jina-ai/executor-pdfsegmenter/HEAD/tests/data/test_img_0.jpg

--------------------------------------------------------------------------------
/tests/data/test_img_1.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/jina-ai/executor-pdfsegmenter/HEAD/tests/data/test_img_1.jpg

--------------------------------------------------------------------------------
/tests/integration/__init__.py:
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
/tests/integration/test_exec.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/jina-ai/executor-pdfsegmenter/HEAD/tests/integration/test_exec.py

--------------------------------------------------------------------------------
/tests/requirements.txt:
--------------------------------------------------------------------------------
pytest

--------------------------------------------------------------------------------
/tests/unit/__init__.py:
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
/tests/unit/test_exec.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/jina-ai/executor-pdfsegmenter/HEAD/tests/unit/test_exec.py

--------------------------------------------------------------------------------