├── .gitignore
├── .pyup.yml
├── .travis.yml
├── AUTHORS.rst
├── CONTRIBUTING.rst
├── HISTORY.rst
├── LICENSE.md
├── MANIFEST.in
├── Makefile
├── README.rst
├── config.yaml
├── docs
    ├── Makefile
    ├── authors.rst
    ├── conf.py
    ├── contributing.rst
    ├── history.rst
    ├── index.rst
    ├── installation.rst
    ├── make.bat
    ├── readme.rst
    └── usage.rst
├── pgdedupe
    ├── __init__.py
    ├── cli.py
    ├── exact_matches.py
    ├── run.py
    └── utils.py
├── requirements.txt
├── requirements_dev.txt
├── setup.cfg
├── setup.py
├── tests
    ├── Deduplication validation.ipynb
    ├── __init__.py
    ├── dedup_postgres_training.json
    ├── generate_fake_dataset.py
    ├── initialize_db.py
    ├── nicknames.csv
    ├── test_integration.py
    ├── test_pgdedupe.py
    └── test_reproducibility.py
└── tox.ini

/.gitignore:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dssg/pgdedupe/HEAD/.gitignore
--------------------------------------------------------------------------------
/.pyup.yml:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dssg/pgdedupe/HEAD/.pyup.yml
--------------------------------------------------------------------------------
/.travis.yml:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dssg/pgdedupe/HEAD/.travis.yml
--------------------------------------------------------------------------------
/AUTHORS.rst:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dssg/pgdedupe/HEAD/AUTHORS.rst
--------------------------------------------------------------------------------
/CONTRIBUTING.rst:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dssg/pgdedupe/HEAD/CONTRIBUTING.rst
--------------------------------------------------------------------------------
/HISTORY.rst:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dssg/pgdedupe/HEAD/HISTORY.rst
--------------------------------------------------------------------------------
/LICENSE.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dssg/pgdedupe/HEAD/LICENSE.md
--------------------------------------------------------------------------------
/MANIFEST.in:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dssg/pgdedupe/HEAD/MANIFEST.in
--------------------------------------------------------------------------------
/Makefile:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dssg/pgdedupe/HEAD/Makefile
--------------------------------------------------------------------------------
/README.rst:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dssg/pgdedupe/HEAD/README.rst
--------------------------------------------------------------------------------
/config.yaml:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dssg/pgdedupe/HEAD/config.yaml
--------------------------------------------------------------------------------
/docs/Makefile:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dssg/pgdedupe/HEAD/docs/Makefile
--------------------------------------------------------------------------------
/docs/authors.rst:
--------------------------------------------------------------------------------
.. include:: ../AUTHORS.rst
--------------------------------------------------------------------------------
/docs/conf.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dssg/pgdedupe/HEAD/docs/conf.py
--------------------------------------------------------------------------------
/docs/contributing.rst:
--------------------------------------------------------------------------------
.. include:: ../CONTRIBUTING.rst
--------------------------------------------------------------------------------
/docs/history.rst:
--------------------------------------------------------------------------------
.. include:: ../HISTORY.rst
--------------------------------------------------------------------------------
/docs/index.rst:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dssg/pgdedupe/HEAD/docs/index.rst
--------------------------------------------------------------------------------
/docs/installation.rst:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dssg/pgdedupe/HEAD/docs/installation.rst
--------------------------------------------------------------------------------
/docs/make.bat:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dssg/pgdedupe/HEAD/docs/make.bat
--------------------------------------------------------------------------------
/docs/readme.rst:
--------------------------------------------------------------------------------
.. include:: ../README.rst
--------------------------------------------------------------------------------
/docs/usage.rst:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dssg/pgdedupe/HEAD/docs/usage.rst
--------------------------------------------------------------------------------
/pgdedupe/__init__.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dssg/pgdedupe/HEAD/pgdedupe/__init__.py
--------------------------------------------------------------------------------
/pgdedupe/cli.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dssg/pgdedupe/HEAD/pgdedupe/cli.py
--------------------------------------------------------------------------------
/pgdedupe/exact_matches.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dssg/pgdedupe/HEAD/pgdedupe/exact_matches.py
--------------------------------------------------------------------------------
/pgdedupe/run.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dssg/pgdedupe/HEAD/pgdedupe/run.py
--------------------------------------------------------------------------------
/pgdedupe/utils.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dssg/pgdedupe/HEAD/pgdedupe/utils.py
--------------------------------------------------------------------------------
/requirements.txt:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dssg/pgdedupe/HEAD/requirements.txt
--------------------------------------------------------------------------------
/requirements_dev.txt:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dssg/pgdedupe/HEAD/requirements_dev.txt
--------------------------------------------------------------------------------
/setup.cfg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dssg/pgdedupe/HEAD/setup.cfg
--------------------------------------------------------------------------------
/setup.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dssg/pgdedupe/HEAD/setup.py
--------------------------------------------------------------------------------
/tests/Deduplication validation.ipynb:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dssg/pgdedupe/HEAD/tests/Deduplication validation.ipynb
--------------------------------------------------------------------------------
/tests/__init__.py:
--------------------------------------------------------------------------------
# -*- coding: utf-8 -*-
--------------------------------------------------------------------------------
/tests/dedup_postgres_training.json:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dssg/pgdedupe/HEAD/tests/dedup_postgres_training.json
--------------------------------------------------------------------------------
/tests/generate_fake_dataset.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dssg/pgdedupe/HEAD/tests/generate_fake_dataset.py
--------------------------------------------------------------------------------
/tests/initialize_db.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dssg/pgdedupe/HEAD/tests/initialize_db.py
--------------------------------------------------------------------------------
/tests/nicknames.csv:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dssg/pgdedupe/HEAD/tests/nicknames.csv
--------------------------------------------------------------------------------
/tests/test_integration.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dssg/pgdedupe/HEAD/tests/test_integration.py
--------------------------------------------------------------------------------
/tests/test_pgdedupe.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dssg/pgdedupe/HEAD/tests/test_pgdedupe.py
--------------------------------------------------------------------------------
/tests/test_reproducibility.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dssg/pgdedupe/HEAD/tests/test_reproducibility.py
--------------------------------------------------------------------------------
/tox.ini:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dssg/pgdedupe/HEAD/tox.ini
--------------------------------------------------------------------------------