├── .gitignore
├── Dockerfile
├── README.md
├── address_to_geocoding.json
├── binder
    └── requirements.txt
├── dedupe-settings.pickle
├── dedupe-simple-settings.pickle
├── dedupe-simple-training.json
├── dedupe-slides-training.json
├── dedupe
    └── variables
    │   ├── __init__.py
    │   └── custom_variables.py
├── full-indexing.png
├── graph_utils.py
├── index.html
├── requirements.in
├── requirements.txt
├── restaurant-training.csv
├── restaurant.csv
├── restaurant.original.csv
├── rise.css
├── slides-pt-br.ipynb
├── slides-pycon-us-2020.ipynb
├── slides-reduced.ipynb
├── slides.ipynb
├── sorted-neighbourhood.png
├── standard-blocking.png
├── svm_dedupe.py
├── training-input-output.txt
├── training-simple-input-output.txt
└── vinta.png

/.gitignore:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/vintasoftware/deduplication-slides/HEAD/.gitignore
--------------------------------------------------------------------------------

/Dockerfile:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/vintasoftware/deduplication-slides/HEAD/Dockerfile
--------------------------------------------------------------------------------

/README.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/vintasoftware/deduplication-slides/HEAD/README.md
--------------------------------------------------------------------------------

/address_to_geocoding.json:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/vintasoftware/deduplication-slides/HEAD/address_to_geocoding.json
--------------------------------------------------------------------------------

/binder/requirements.txt:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/vintasoftware/deduplication-slides/HEAD/binder/requirements.txt
--------------------------------------------------------------------------------

/dedupe-settings.pickle:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/vintasoftware/deduplication-slides/HEAD/dedupe-settings.pickle
--------------------------------------------------------------------------------

/dedupe-simple-settings.pickle:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/vintasoftware/deduplication-slides/HEAD/dedupe-simple-settings.pickle
--------------------------------------------------------------------------------

/dedupe-simple-training.json:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/vintasoftware/deduplication-slides/HEAD/dedupe-simple-training.json
--------------------------------------------------------------------------------

/dedupe-slides-training.json:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/vintasoftware/deduplication-slides/HEAD/dedupe-slides-training.json
--------------------------------------------------------------------------------

/dedupe/variables/__init__.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/vintasoftware/deduplication-slides/HEAD/dedupe/variables/__init__.py
--------------------------------------------------------------------------------

/dedupe/variables/custom_variables.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/vintasoftware/deduplication-slides/HEAD/dedupe/variables/custom_variables.py
--------------------------------------------------------------------------------

/full-indexing.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/vintasoftware/deduplication-slides/HEAD/full-indexing.png
--------------------------------------------------------------------------------

/graph_utils.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/vintasoftware/deduplication-slides/HEAD/graph_utils.py
--------------------------------------------------------------------------------

/index.html:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/vintasoftware/deduplication-slides/HEAD/index.html
--------------------------------------------------------------------------------

/requirements.in:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/vintasoftware/deduplication-slides/HEAD/requirements.in
--------------------------------------------------------------------------------

/requirements.txt:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/vintasoftware/deduplication-slides/HEAD/requirements.txt
--------------------------------------------------------------------------------

/restaurant-training.csv:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/vintasoftware/deduplication-slides/HEAD/restaurant-training.csv
--------------------------------------------------------------------------------

/restaurant.csv:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/vintasoftware/deduplication-slides/HEAD/restaurant.csv
--------------------------------------------------------------------------------

/restaurant.original.csv:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/vintasoftware/deduplication-slides/HEAD/restaurant.original.csv
--------------------------------------------------------------------------------

/rise.css:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/vintasoftware/deduplication-slides/HEAD/rise.css
--------------------------------------------------------------------------------

/slides-pt-br.ipynb:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/vintasoftware/deduplication-slides/HEAD/slides-pt-br.ipynb
--------------------------------------------------------------------------------

/slides-pycon-us-2020.ipynb:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/vintasoftware/deduplication-slides/HEAD/slides-pycon-us-2020.ipynb
--------------------------------------------------------------------------------

/slides-reduced.ipynb:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/vintasoftware/deduplication-slides/HEAD/slides-reduced.ipynb
--------------------------------------------------------------------------------

/slides.ipynb:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/vintasoftware/deduplication-slides/HEAD/slides.ipynb
--------------------------------------------------------------------------------

/sorted-neighbourhood.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/vintasoftware/deduplication-slides/HEAD/sorted-neighbourhood.png
--------------------------------------------------------------------------------

/standard-blocking.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/vintasoftware/deduplication-slides/HEAD/standard-blocking.png
--------------------------------------------------------------------------------

/svm_dedupe.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/vintasoftware/deduplication-slides/HEAD/svm_dedupe.py
--------------------------------------------------------------------------------

/training-input-output.txt:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/vintasoftware/deduplication-slides/HEAD/training-input-output.txt
--------------------------------------------------------------------------------

/training-simple-input-output.txt:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/vintasoftware/deduplication-slides/HEAD/training-simple-input-output.txt
--------------------------------------------------------------------------------

/vinta.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/vintasoftware/deduplication-slides/HEAD/vinta.png
--------------------------------------------------------------------------------