├── .gitignore
├── README.md
├── data
│   └── .keep
├── generators
│   ├── arxiv
│   │   ├── README.md
│   │   ├── generate_arxiv_queries.py
│   │   └── generate_filters.py
│   ├── clothes_images
│   │   ├── __init__.py
│   │   └── generate_hnm_queries.py
│   ├── cohere_wiki
│   │   ├── hf.py
│   │   ├── prepare_data.py
│   │   └── search_exact.py
│   ├── config.py
│   ├── dbpedia_openai_1M
│   │   ├── __init__.py
│   │   └── generate_dbpedia_openai_1M.py
│   ├── generate.py
│   ├── laion
│   │   ├── __init__.py
│   │   ├── generate_laion_small.py
│   │   └── generate_laion_small_no_filters.py
│   ├── random_data
│   │   ├── __init__.py
│   │   ├── generate_random_float_datasets.py
│   │   ├── generate_random_geo_datasets.py
│   │   ├── generate_random_int_datasets.py
│   │   └── generate_random_keyword_datasets.py
│   ├── search_generator
│   │   ├── __init__.py
│   │   └── qdrant_generator.py
│   └── yandex_1B
│       ├── __init__.py
│       ├── generate_deep_10k_gt.py
│       └── generate_t2i_100k_gt.py
├── poetry.lock
├── pyproject.toml
└── sync.sh

/.gitignore:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/qdrant/ann-filtering-benchmark-datasets/HEAD/.gitignore
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/qdrant/ann-filtering-benchmark-datasets/HEAD/README.md
--------------------------------------------------------------------------------
/data/.keep:
--------------------------------------------------------------------------------
1 |
--------------------------------------------------------------------------------
/generators/arxiv/README.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/qdrant/ann-filtering-benchmark-datasets/HEAD/generators/arxiv/README.md
--------------------------------------------------------------------------------
/generators/arxiv/generate_arxiv_queries.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/qdrant/ann-filtering-benchmark-datasets/HEAD/generators/arxiv/generate_arxiv_queries.py
--------------------------------------------------------------------------------
/generators/arxiv/generate_filters.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/qdrant/ann-filtering-benchmark-datasets/HEAD/generators/arxiv/generate_filters.py
--------------------------------------------------------------------------------
/generators/clothes_images/__init__.py:
--------------------------------------------------------------------------------
1 |
--------------------------------------------------------------------------------
/generators/clothes_images/generate_hnm_queries.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/qdrant/ann-filtering-benchmark-datasets/HEAD/generators/clothes_images/generate_hnm_queries.py
--------------------------------------------------------------------------------
/generators/cohere_wiki/hf.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/qdrant/ann-filtering-benchmark-datasets/HEAD/generators/cohere_wiki/hf.py
--------------------------------------------------------------------------------
/generators/cohere_wiki/prepare_data.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/qdrant/ann-filtering-benchmark-datasets/HEAD/generators/cohere_wiki/prepare_data.py
--------------------------------------------------------------------------------
/generators/cohere_wiki/search_exact.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/qdrant/ann-filtering-benchmark-datasets/HEAD/generators/cohere_wiki/search_exact.py
--------------------------------------------------------------------------------
/generators/config.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/qdrant/ann-filtering-benchmark-datasets/HEAD/generators/config.py
--------------------------------------------------------------------------------
/generators/dbpedia_openai_1M/__init__.py:
--------------------------------------------------------------------------------
1 |
--------------------------------------------------------------------------------
/generators/dbpedia_openai_1M/generate_dbpedia_openai_1M.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/qdrant/ann-filtering-benchmark-datasets/HEAD/generators/dbpedia_openai_1M/generate_dbpedia_openai_1M.py
--------------------------------------------------------------------------------
/generators/generate.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/qdrant/ann-filtering-benchmark-datasets/HEAD/generators/generate.py
--------------------------------------------------------------------------------
/generators/laion/__init__.py:
--------------------------------------------------------------------------------
1 |
--------------------------------------------------------------------------------
/generators/laion/generate_laion_small.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/qdrant/ann-filtering-benchmark-datasets/HEAD/generators/laion/generate_laion_small.py
--------------------------------------------------------------------------------
/generators/laion/generate_laion_small_no_filters.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/qdrant/ann-filtering-benchmark-datasets/HEAD/generators/laion/generate_laion_small_no_filters.py
--------------------------------------------------------------------------------
/generators/random_data/__init__.py:
--------------------------------------------------------------------------------
1 |
--------------------------------------------------------------------------------
/generators/random_data/generate_random_float_datasets.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/qdrant/ann-filtering-benchmark-datasets/HEAD/generators/random_data/generate_random_float_datasets.py
--------------------------------------------------------------------------------
/generators/random_data/generate_random_geo_datasets.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/qdrant/ann-filtering-benchmark-datasets/HEAD/generators/random_data/generate_random_geo_datasets.py
--------------------------------------------------------------------------------
/generators/random_data/generate_random_int_datasets.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/qdrant/ann-filtering-benchmark-datasets/HEAD/generators/random_data/generate_random_int_datasets.py
--------------------------------------------------------------------------------
/generators/random_data/generate_random_keyword_datasets.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/qdrant/ann-filtering-benchmark-datasets/HEAD/generators/random_data/generate_random_keyword_datasets.py
--------------------------------------------------------------------------------
/generators/search_generator/__init__.py:
--------------------------------------------------------------------------------
1 |
--------------------------------------------------------------------------------
/generators/search_generator/qdrant_generator.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/qdrant/ann-filtering-benchmark-datasets/HEAD/generators/search_generator/qdrant_generator.py
--------------------------------------------------------------------------------
/generators/yandex_1B/__init__.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/qdrant/ann-filtering-benchmark-datasets/HEAD/generators/yandex_1B/__init__.py
--------------------------------------------------------------------------------
/generators/yandex_1B/generate_deep_10k_gt.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/qdrant/ann-filtering-benchmark-datasets/HEAD/generators/yandex_1B/generate_deep_10k_gt.py
--------------------------------------------------------------------------------
/generators/yandex_1B/generate_t2i_100k_gt.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/qdrant/ann-filtering-benchmark-datasets/HEAD/generators/yandex_1B/generate_t2i_100k_gt.py
--------------------------------------------------------------------------------
/poetry.lock:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/qdrant/ann-filtering-benchmark-datasets/HEAD/poetry.lock
--------------------------------------------------------------------------------
/pyproject.toml:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/qdrant/ann-filtering-benchmark-datasets/HEAD/pyproject.toml
--------------------------------------------------------------------------------
/sync.sh:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/qdrant/ann-filtering-benchmark-datasets/HEAD/sync.sh
--------------------------------------------------------------------------------