├── .dockerignore
├── .gitignore
├── Dockerfile
├── LICENSE.md
├── README.md
├── data
│   └── codigos_de_comunas.csv
├── doc
│   ├── csv.png
│   ├── data_stream_example.txt
│   ├── image2022.png
│   ├── pdf_structure.png
│   └── stream_explanation.png
├── out
│   └── README.md
├── requirements.txt
├── servel_scraper
│   ├── data_extractor
│   │   ├── __init__.py
│   │   ├── csv_writer.py
│   │   ├── extractor.py
│   │   ├── person.py
│   │   ├── servel_pdf_stream_page.py
│   │   └── test_extractor.py
│   ├── downloader
│   │   ├── cut.py
│   │   ├── cut_csv_repo.py
│   │   ├── file_downloader.py
│   │   ├── servel_file_repo.py
│   │   └── server_file_repo_example.py
│   ├── main.py
│   └── servel_pipeline
│       └── servel_pipeline.py
└── test
    └── testdata
        └── tiny.pdf


/.dockerignore:
--------------------------------------------------------------------------------
out/
--------------------------------------------------------------------------------
/.gitignore:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Eitol/servel_scraper/HEAD/.gitignore
--------------------------------------------------------------------------------
/Dockerfile:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Eitol/servel_scraper/HEAD/Dockerfile
--------------------------------------------------------------------------------
/LICENSE.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Eitol/servel_scraper/HEAD/LICENSE.md
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Eitol/servel_scraper/HEAD/README.md
--------------------------------------------------------------------------------
/data/codigos_de_comunas.csv:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Eitol/servel_scraper/HEAD/data/codigos_de_comunas.csv
--------------------------------------------------------------------------------
/doc/csv.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Eitol/servel_scraper/HEAD/doc/csv.png
--------------------------------------------------------------------------------
/doc/data_stream_example.txt:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Eitol/servel_scraper/HEAD/doc/data_stream_example.txt
--------------------------------------------------------------------------------
/doc/image2022.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Eitol/servel_scraper/HEAD/doc/image2022.png
--------------------------------------------------------------------------------
/doc/pdf_structure.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Eitol/servel_scraper/HEAD/doc/pdf_structure.png
--------------------------------------------------------------------------------
/doc/stream_explanation.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Eitol/servel_scraper/HEAD/doc/stream_explanation.png
--------------------------------------------------------------------------------
/out/README.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Eitol/servel_scraper/HEAD/out/README.md
--------------------------------------------------------------------------------
/requirements.txt:
--------------------------------------------------------------------------------
tqdm
pikepdf
--------------------------------------------------------------------------------
/servel_scraper/data_extractor/__init__.py:
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
/servel_scraper/data_extractor/csv_writer.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Eitol/servel_scraper/HEAD/servel_scraper/data_extractor/csv_writer.py
--------------------------------------------------------------------------------
/servel_scraper/data_extractor/extractor.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Eitol/servel_scraper/HEAD/servel_scraper/data_extractor/extractor.py
--------------------------------------------------------------------------------
/servel_scraper/data_extractor/person.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Eitol/servel_scraper/HEAD/servel_scraper/data_extractor/person.py
--------------------------------------------------------------------------------
/servel_scraper/data_extractor/servel_pdf_stream_page.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Eitol/servel_scraper/HEAD/servel_scraper/data_extractor/servel_pdf_stream_page.py
--------------------------------------------------------------------------------
/servel_scraper/data_extractor/test_extractor.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Eitol/servel_scraper/HEAD/servel_scraper/data_extractor/test_extractor.py
--------------------------------------------------------------------------------
/servel_scraper/downloader/cut.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Eitol/servel_scraper/HEAD/servel_scraper/downloader/cut.py
--------------------------------------------------------------------------------
/servel_scraper/downloader/cut_csv_repo.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Eitol/servel_scraper/HEAD/servel_scraper/downloader/cut_csv_repo.py
--------------------------------------------------------------------------------
/servel_scraper/downloader/file_downloader.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Eitol/servel_scraper/HEAD/servel_scraper/downloader/file_downloader.py
--------------------------------------------------------------------------------
/servel_scraper/downloader/servel_file_repo.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Eitol/servel_scraper/HEAD/servel_scraper/downloader/servel_file_repo.py
--------------------------------------------------------------------------------
/servel_scraper/downloader/server_file_repo_example.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Eitol/servel_scraper/HEAD/servel_scraper/downloader/server_file_repo_example.py
--------------------------------------------------------------------------------
/servel_scraper/main.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Eitol/servel_scraper/HEAD/servel_scraper/main.py
--------------------------------------------------------------------------------
/servel_scraper/servel_pipeline/servel_pipeline.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Eitol/servel_scraper/HEAD/servel_scraper/servel_pipeline/servel_pipeline.py
--------------------------------------------------------------------------------
/test/testdata/tiny.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Eitol/servel_scraper/HEAD/test/testdata/tiny.pdf
--------------------------------------------------------------------------------