├── .gitignore
├── MANIFEST.in
├── Makefile
├── README.md
├── readysetdata
    ├── __init__.py
    ├── arrow.py
    ├── download.py
    ├── duckdb.py
    ├── http_unzip.py
    ├── inputs.py
    ├── jsonl.py
    ├── output.py
    ├── parquet.py
    ├── sqlite.py
    ├── utils.py
    └── wikipedia.py
├── requirements.txt
├── scripts
    ├── download.py
    ├── fakedata.py
    ├── geonames-nonus.py
    ├── geonames-us.py
    ├── imdb.py
    ├── movielens.py
    ├── parse-wikidata.py
    ├── parse-wikipedia.py
    ├── remote-unzip.py
    ├── tpch.py
    ├── vdinfobox.py
    ├── wikidata.sh
    ├── wikipages.sh
    └── xml2jsonl.py
├── setup.py
└── wd_properties.jsonl

/.gitignore:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/saulpw/readysetdata/HEAD/.gitignore
--------------------------------------------------------------------------------
/MANIFEST.in:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/saulpw/readysetdata/HEAD/MANIFEST.in
--------------------------------------------------------------------------------
/Makefile:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/saulpw/readysetdata/HEAD/Makefile
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/saulpw/readysetdata/HEAD/README.md
--------------------------------------------------------------------------------
/readysetdata/__init__.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/saulpw/readysetdata/HEAD/readysetdata/__init__.py
--------------------------------------------------------------------------------
/readysetdata/arrow.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/saulpw/readysetdata/HEAD/readysetdata/arrow.py
--------------------------------------------------------------------------------
/readysetdata/download.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/saulpw/readysetdata/HEAD/readysetdata/download.py
--------------------------------------------------------------------------------
/readysetdata/duckdb.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/saulpw/readysetdata/HEAD/readysetdata/duckdb.py
--------------------------------------------------------------------------------
/readysetdata/http_unzip.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/saulpw/readysetdata/HEAD/readysetdata/http_unzip.py
--------------------------------------------------------------------------------
/readysetdata/inputs.py:
--------------------------------------------------------------------------------
1 | 
--------------------------------------------------------------------------------
/readysetdata/jsonl.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/saulpw/readysetdata/HEAD/readysetdata/jsonl.py
--------------------------------------------------------------------------------
/readysetdata/output.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/saulpw/readysetdata/HEAD/readysetdata/output.py
--------------------------------------------------------------------------------
/readysetdata/parquet.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/saulpw/readysetdata/HEAD/readysetdata/parquet.py
--------------------------------------------------------------------------------
/readysetdata/sqlite.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/saulpw/readysetdata/HEAD/readysetdata/sqlite.py
--------------------------------------------------------------------------------
/readysetdata/utils.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/saulpw/readysetdata/HEAD/readysetdata/utils.py
--------------------------------------------------------------------------------
/readysetdata/wikipedia.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/saulpw/readysetdata/HEAD/readysetdata/wikipedia.py
--------------------------------------------------------------------------------
/requirements.txt:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/saulpw/readysetdata/HEAD/requirements.txt
--------------------------------------------------------------------------------
/scripts/download.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/saulpw/readysetdata/HEAD/scripts/download.py
--------------------------------------------------------------------------------
/scripts/fakedata.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/saulpw/readysetdata/HEAD/scripts/fakedata.py
--------------------------------------------------------------------------------
/scripts/geonames-nonus.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/saulpw/readysetdata/HEAD/scripts/geonames-nonus.py
--------------------------------------------------------------------------------
/scripts/geonames-us.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/saulpw/readysetdata/HEAD/scripts/geonames-us.py
--------------------------------------------------------------------------------
/scripts/imdb.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/saulpw/readysetdata/HEAD/scripts/imdb.py
--------------------------------------------------------------------------------
/scripts/movielens.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/saulpw/readysetdata/HEAD/scripts/movielens.py
--------------------------------------------------------------------------------
/scripts/parse-wikidata.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/saulpw/readysetdata/HEAD/scripts/parse-wikidata.py
--------------------------------------------------------------------------------
/scripts/parse-wikipedia.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/saulpw/readysetdata/HEAD/scripts/parse-wikipedia.py
--------------------------------------------------------------------------------
/scripts/remote-unzip.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/saulpw/readysetdata/HEAD/scripts/remote-unzip.py
--------------------------------------------------------------------------------
/scripts/tpch.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/saulpw/readysetdata/HEAD/scripts/tpch.py
--------------------------------------------------------------------------------
/scripts/vdinfobox.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/saulpw/readysetdata/HEAD/scripts/vdinfobox.py
--------------------------------------------------------------------------------
/scripts/wikidata.sh:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/saulpw/readysetdata/HEAD/scripts/wikidata.sh
--------------------------------------------------------------------------------
/scripts/wikipages.sh:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/saulpw/readysetdata/HEAD/scripts/wikipages.sh
--------------------------------------------------------------------------------
/scripts/xml2jsonl.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/saulpw/readysetdata/HEAD/scripts/xml2jsonl.py
--------------------------------------------------------------------------------
/setup.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/saulpw/readysetdata/HEAD/setup.py
--------------------------------------------------------------------------------
/wd_properties.jsonl:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/saulpw/readysetdata/HEAD/wd_properties.jsonl
--------------------------------------------------------------------------------