├── .gitignore
├── README.md
├── analysis-of-submissions
│   ├── all_evaluations.json
│   ├── benchmarks-in-pilot-study.json
│   ├── corpus-to-benchmark.json
│   ├── leaderboards-in-the-ir-experiment-platform.ipynb
│   ├── retrieval-softwares-in-pilot-study.json
│   ├── tira-ir-replicability.ipynb
│   └── type-of-retrieval-softwares.json
├── ir-datasets
│   ├── README.md
│   └── tutorial
│       ├── .gitignore
│       ├── Dockerfile
│       ├── README.md
│       ├── pangram-documents.jsonl
│       ├── pangram-qrels.txt
│       ├── pangram-topics.xml
│       └── pangrams.py
├── ir-measures
│   ├── .devcontainer.json
│   ├── .gitignore
│   ├── .vscode
│   │   └── settings.json
│   ├── Dockerfile
│   ├── Dockerfile.dev
│   ├── Makefile
│   ├── Pipfile
│   ├── Pipfile.lock
│   ├── README.md
│   ├── __init__.py
│   ├── ir-measures.iml
│   ├── ir_measures_evaluator.py
│   ├── output
│   │   ├── evaluation-per-query.prototext
│   │   └── evaluation.prototext
│   └── tests
│       ├── __init__.py
│       ├── __pycache__
│       │   └── test_with_approvals.cpython-310-pytest-7.2.0.pyc
│       ├── approvaltests_config.json
│       ├── approved_files
│       │   ├── test_with_approvals.test_all_valid.approved.txt
│       │   ├── test_with_approvals.test_all_valid_with_rendering.approved.txt
│       │   ├── test_with_approvals.test_all_valid_with_rendering_wrong_qrels_and_queries.approved.txt
│       │   ├── test_with_approvals.test_document_ids_inconsistent_run_qrels.approved.txt
│       │   ├── test_with_approvals.test_measure_invalid.approved.txt
│       │   ├── test_with_approvals.test_measure_unknown.approved.txt
│       │   ├── test_with_approvals.test_measure_valid.approved.txt
│       │   ├── test_with_approvals.test_measure_valid_no_qrels.approved.txt
│       │   ├── test_with_approvals.test_measure_valid_no_topics.approved.txt
│       │   ├── test_with_approvals.test_measures_valid.approved.txt
│       │   ├── test_with_approvals.test_output_dir_not_empty.approved.txt
│       │   ├── test_with_approvals.test_output_path_is_file.approved.txt
│       │   ├── test_with_approvals.test_output_path_not_found.approved.txt
│       │   ├── test_with_approvals.test_output_valid_no_measures.approved.txt
│       │   ├── test_with_approvals.test_output_valid_no_qrels.approved.txt
│       │   ├── test_with_approvals.test_output_valid_no_topics.approved.txt
│       │   ├── test_with_approvals.test_qrels_file_empty.approved.txt
│       │   ├── test_with_approvals.test_qrels_path_is_dir.approved.txt
│       │   ├── test_with_approvals.test_qrels_path_not_found.approved.txt
│       │   ├── test_with_approvals.test_qrels_topics_valid.approved.txt
│       │   ├── test_with_approvals.test_qrels_valid_no_topics.approved.txt
│       │   ├── test_with_approvals.test_query_ids_inconsistent_run_qrels.approved.txt
│       │   ├── test_with_approvals.test_query_ids_inconsistent_topics_qrels.approved.txt
│       │   ├── test_with_approvals.test_query_ids_inconsistent_topics_run.approved.txt
│       │   ├── test_with_approvals.test_run_document_id_special_chars.approved.txt
│       │   ├── test_with_approvals.test_run_fewer_columns.approved.txt
│       │   ├── test_with_approvals.test_run_file_empty.approved.txt
│       │   ├── test_with_approvals.test_run_first_rank_not_zero.approved.txt
│       │   ├── test_with_approvals.test_run_ignored_column_not_default.approved.txt
│       │   ├── test_with_approvals.test_run_more_columns.approved.txt
│       │   ├── test_with_approvals.test_run_multiple_tags.approved.txt
│       │   ├── test_with_approvals.test_run_path_is_dir.approved.txt
│       │   ├── test_with_approvals.test_run_path_not_found.approved.txt
│       │   ├── test_with_approvals.test_run_query_id_not_ascending.approved.txt
│       │   ├── test_with_approvals.test_run_query_id_special_chars.approved.txt
│       │   ├── test_with_approvals.test_run_rank_not_ascending.approved.txt
│       │   ├── test_with_approvals.test_run_rank_not_consecutive.approved.txt
│       │   ├── test_with_approvals.test_run_rank_not_integer.approved.txt
│       │   ├── test_with_approvals.test_run_rank_not_numeric.approved.txt
│       │   ├── test_with_approvals.test_run_rank_ties.approved.txt
│       │   ├── test_with_approvals.test_run_score_not_descending.approved.txt
│       │   ├── test_with_approvals.test_run_score_not_numeric.approved.txt
│       │   ├── test_with_approvals.test_run_score_rank_inconsistent.approved.txt
│       │   ├── test_with_approvals.test_run_score_scientific_notation.approved.txt
│       │   ├── test_with_approvals.test_run_score_ties.approved.txt
│       │   ├── test_with_approvals.test_run_tag_special_chars.approved.txt
│       │   ├── test_with_approvals.test_run_valid.approved.txt
│       │   ├── test_with_approvals.test_topics_file_empty.approved.txt
│       │   ├── test_with_approvals.test_topics_path_is_dir.approved.txt
│       │   ├── test_with_approvals.test_topics_path_not_found.approved.txt
│       │   └── test_with_approvals.test_topics_valid_no_qrels.approved.txt
│       ├── end-to-end-test
│       │   ├── document-components
│       │   │   └── documents.jsonl
│       │   ├── output-of-run
│       │   │   └── run.txt
│       │   ├── query-components
│       │   │   └── queries.jsonl
│       │   └── truth-data
│       │       ├── metadata.json
│       │       ├── qrels.txt
│       │       └── queries.jsonl
│       ├── test-io
│       │   ├── test-input
│       │   │   ├── empty_file.txt
│       │   │   ├── qrels_sample_valid.txt
│       │   │   ├── run_sample_invalid_less_columns.txt
│       │   │   ├── run_sample_invalid_more_columns.txt
│       │   │   ├── run_sample_invalid_multiple_tags.txt
│       │   │   ├── run_sample_valid.txt
│       │   │   ├── run_sample_warning_consistency.txt
│       │   │   ├── run_sample_warning_docid_not_in_qrels.txt
│       │   │   ├── run_sample_warning_docid_special_chars.txt
│       │   │   ├── run_sample_warning_ignored_column_wrong.txt
│       │   │   ├── run_sample_warning_qid_not_asc.txt
│       │   │   ├── run_sample_warning_qid_not_in_qrels.txt
│       │   │   ├── run_sample_warning_qid_special_chars.txt
│       │   │   ├── run_sample_warning_rank_not_asc.txt
│       │   │   ├── run_sample_warning_rank_not_consecutive.txt
│       │   │   ├── run_sample_warning_rank_not_int.txt
│       │   │   ├── run_sample_warning_rank_not_num.txt
│       │   │   ├── run_sample_warning_rank_not_start_at_0.txt
│       │   │   ├── run_sample_warning_rank_ties.txt
│       │   │   ├── run_sample_warning_score_not_desc.txt
│       │   │   ├── run_sample_warning_score_not_num.txt
│       │   │   ├── run_sample_warning_score_scientific.txt
│       │   │   ├── run_sample_warning_score_ties.txt
│       │   │   ├── run_sample_warning_tag_special_chars.txt
│       │   │   ├── test-input-cranfield
│       │   │   │   ├── metadata.json
│       │   │   │   ├── qrels.txt
│       │   │   │   ├── queries.jsonl
│       │   │   │   └── run.txt
│       │   │   ├── topics_sample_valid.jsonl
│       │   │   ├── topics_sample_warning_qid_not_in_qrels.jsonl
│       │   │   └── topics_sample_warning_qid_not_in_run.jsonl
│       │   └── test-output-not-empty
│       │       └── file.txt
│       └── test_with_approvals.py
├── reproducibility-experiments
│   ├── README.md
│   ├── full-rank-retriever-reproducibility.ipynb
│   ├── interoparability-tutorial.ipynb
│   └── re-rank-reproducibility.ipynb
├── serps
│   └── create-serps.py
└── tira-ir-starters
    ├── .gitignore
    ├── Makefile
    ├── README.md
    ├── beir
    │   ├── Dockerfile.base
    │   ├── Dockerfile.dres
    │   ├── Dockerfile.sbert
    │   ├── README.md
    │   ├── full_ranking.py
    │   ├── reranking.py
    │   ├── sample-input-full-rank
    │   │   ├── documents.jsonl
    │   │   ├── metadata.json
    │   │   ├── queries.jsonl
    │   │   └── queries.xml
    │   └── sample-input
    │       └── rerank.jsonl.gz
    ├── chatnoir
    │   ├── Dockerfile
    │   ├── README.md
    │   ├── chatnoir_pipelines.py
    │   ├── retrieve-with-chatnoir.sh
    │   └── sample-input
    │       ├── chatnoir-credentials.json
    │       ├── queries.jsonl
    │       └── queries.xml
    ├── duo-t5
    │   ├── Dockerfile
    │   ├── README.md
    │   ├── Untitled.ipynb
    │   ├── __pycache__
    │   │   └── duo-t5-preferences.cpython-37.pyc
    │   └── duo-t5-preferences.py
    ├── pygaggle
    │   ├── Dockerfile.base
    │   ├── Dockerfile.transformer
    │   ├── README.md
    │   ├── reranking.py
    │   └── sample-input
    │       └── rerank.jsonl.gz
    ├── pyserini
    │   ├── Dockerfile.base
    │   ├── Makefile
    │   ├── full-rank-bm25-rm3.ipynb
    │   ├── full-rank-bm25.ipynb
    │   ├── full-rank-qld-rm3.ipynb
    │   ├── full-rank-qld.ipynb
    │   ├── re-rank-bm25-rm3.ipynb
    │   ├── re-rank-bm25.ipynb
    │   ├── re-rank-qld-rm3.ipynb
    │   ├── re-rank-qld.ipynb
    │   ├── sample-input-full-rank
    │   │   ├── documents.jsonl
    │   │   ├── metadata.json
    │   │   ├── queries.jsonl
    │   │   └── queries.xml
    │   └── sample-input
    │       └── rerank.jsonl.gz
    ├── pyterrier-ciff
    │   ├── example-ciff
    │   │   └── index.ciff
    │   └── example-input
    │       ├── documents.jsonl
    │       ├── metadata.json
    │       ├── queries.jsonl
    │       └── queries.xml
    ├── pyterrier-colbert
    │   ├── Dockerfile
    │   ├── README.md
    │   ├── bm25-colbert.ipynb
    │   ├── reranking.py
    │   ├── sample-input-full-rank
    │   │   ├── documents.jsonl
    │   │   ├── metadata.json
    │   │   ├── queries.jsonl
    │   │   └── queries.xml
    │   └── sample-input
    │       └── rerank.jsonl.gz
    ├── pyterrier-duot5
    │   ├── Dockerfile
    │   ├── README.md
    │   ├── __pycache__
    │   │   ├── bla.cpython-310.pyc
    │   │   └── reranking.cpython-310.pyc
    │   ├── reranking.py
    │   └── sample-input
    │       └── rerank.jsonl.gz
    ├── pyterrier-t5
    │   ├── Dockerfile
    │   ├── README.md
    │   ├── bm25-monot5.ipynb
    │   └── sample-input-full-rank
    │       ├── documents.jsonl
    │       ├── metadata.json
    │       ├── queries.jsonl
    │       └── queries.xml
    └── pyterrier
        ├── Dockerfile.base
        ├── README.md
        ├── __pycache__
        │   └── tira_utils.cpython-310.pyc
        ├── default_pipelines.py
        ├── full-rank-pipeline.ipynb
        ├── pyterrier_cli.py
        ├── retrieval-pipeline.ipynb
        ├── run-pyterrier-notebook.py
        ├── sample-input-full-rank-gz
        │   ├── documents.jsonl.gz
        │   ├── metadata.json
        │   ├── queries.jsonl
        │   └── queries.xml
        ├── sample-input-full-rank
        │   ├── documents.jsonl
        │   ├── metadata.json
        │   ├── queries.jsonl
        │   └── queries.xml
        └── sample-input
            └── rerank.jsonl.gz
/.gitignore: -------------------------------------------------------------------------------- 1 | __pycache__ 2 | -------------------------------------------------------------------------------- /analysis-of-submissions/benchmarks-in-pilot-study.json: -------------------------------------------------------------------------------- 1 | {"cranfield-20230107-training": "/mnt/ceph/tira/data/runs/cranfield-20230107-training/tira-ir-starter/2023-02-07-00-25-01/", 2 | "antique-test-20230107-training": "/mnt/ceph/tira/data/runs/antique-test-20230107-training/tira-ir-starter/2023-02-07-00-34-07/", 3 | "vaswani-20230107-training": "/mnt/ceph/tira/data/runs/vaswani-20230107-training/tira-ir-starter/2023-02-07-00-34-11/", 4 | "msmarco-passage-trec-dl-2019-judged-20230107-training": "/mnt/ceph/tira/data/runs/msmarco-passage-trec-dl-2019-judged-20230107-training/tira-ir-starter/2023-02-07-00-34-15/", 5 | "medline-2004-trec-genomics-2004-20230107-training": "/mnt/ceph/tira/data/runs/medline-2004-trec-genomics-2004-20230107-training/tira-ir-starter/2023-02-07-00-34-20/", 6 | "wapo-v2-trec-core-2018-20230107-training": "/mnt/ceph/tira/data/runs/wapo-v2-trec-core-2018-20230107-training/tira-ir-starter/2023-02-06-13-44-49/", 7 | "disks45-nocr-trec7-20230209-training": "/mnt/ceph/tira/data/runs/disks45-nocr-trec7-20230209-training/tira-ir-starter/2023-02-11-09-29-35/", 8 | "disks45-nocr-trec8-20230209-training": "/mnt/ceph/tira/data/runs/disks45-nocr-trec8-20230209-training/tira-ir-starter/2023-02-11-09-39-56/", 9 | "disks45-nocr-trec-robust-2004-20230209-training": "/mnt/ceph/tira/data/runs/disks45-nocr-trec-robust-2004-20230209-training/tira-ir-starter/2023-02-11-09-40-00/", 10 | "nfcorpus-test-20230107-training": "/mnt/ceph/tira/data/runs/nfcorpus-test-20230107-training/tira-ir-starter/2023-02-11-09-40-04/", 11 | "argsme-touche-2020-task-1-20230209-training": "/mnt/ceph/tira/data/runs/argsme-touche-2020-task-1-20230209-training/tira-ir-starter/2023-02-11-20-33-58/", 12 | 
"argsme-touche-2021-task-1-20230209-training": "/mnt/ceph/tira/data/runs/argsme-touche-2021-task-1-20230209-training/tira-ir-starter/2023-02-11-20-34-21/", 13 | "msmarco-passage-trec-dl-2020-judged-20230107-training": "/mnt/ceph/tira/data/runs/msmarco-passage-trec-dl-2020-judged-20230107-training/tira-ir-starter/2023-02-11-10-07-16/", 14 | "medline-2004-trec-genomics-2005-20230107-training": "/mnt/ceph/tira/data/runs/medline-2004-trec-genomics-2005-20230107-training/tira-ir-starter/2023-02-11-10-07-22/", 15 | "gov-trec-web-2002-20230209-training": "/mnt/ceph/tira/data/runs/gov-trec-web-2002-20230209-training/tira-ir-starter/2023-02-11-10-07-29/", 16 | "gov-trec-web-2003-20230209-training": "/mnt/ceph/tira/data/runs/gov-trec-web-2003-20230209-training/tira-ir-starter/2023-02-11-10-07-37/", 17 | "gov-trec-web-2004-20230209-training": "/mnt/ceph/tira/data/runs/gov-trec-web-2004-20230209-training/tira-ir-starter/2023-02-11-15-13-43/", 18 | "gov2-trec-tb-2006-20230209-training": "/mnt/ceph/tira/data/runs/gov2-trec-tb-2006-20230209-training/tira-ir-starter/2023-02-12-09-10-26/", 19 | "gov2-trec-tb-2004-20230209-training": "/mnt/ceph/tira/data/runs/gov2-trec-tb-2004-20230209-training/tira-ir-starter/2023-02-12-09-10-30/", 20 | "gov2-trec-tb-2005-20230209-training": "/mnt/ceph/tira/data/runs/gov2-trec-tb-2005-20230209-training/tira-ir-starter/2023-02-12-09-10-35/", 21 | "medline-2017-trec-pm-2017-20230211-training": "/mnt/ceph/tira/data/runs/medline-2017-trec-pm-2017-20230211-training/tira-ir-starter/2023-02-12-15-24-18/", 22 | "medline-2017-trec-pm-2018-20230211-training": "/mnt/ceph/tira/data/runs/medline-2017-trec-pm-2018-20230211-training/tira-ir-starter/2023-02-12-15-35-35/", 23 | "cord19-fulltext-trec-covid-20230107-training": "/mnt/ceph/tira/data/runs/cord19-fulltext-trec-covid-20230107-training/tira-ir-starter/2023-02-06-13-44-59/", 24 | "clueweb09-en-trec-web-2009-20230107-training": "/mnt/ceph/tira/data/runs/clueweb09-en-trec-web-2009-20230107-training/tira-ir-starter/2023-02-07-00-34-25/", 25 | "clueweb09-en-trec-web-2010-20230107-training": "/mnt/ceph/tira/data/runs/clueweb09-en-trec-web-2010-20230107-training/tira-ir-starter/2023-02-06-11-08-01/", 26 | "clueweb09-en-trec-web-2011-20230107-training": "/mnt/ceph/tira/data/runs/clueweb09-en-trec-web-2011-20230107-training/tira-ir-starter/2023-02-06-11-08-06/", 27 | "clueweb09-en-trec-web-2012-20230107-training": "/mnt/ceph/tira/data/runs/clueweb09-en-trec-web-2012-20230107-training/tira-ir-starter/2023-02-06-11-08-10/", 28 | "clueweb12-trec-web-2013-20230107-training": "/mnt/ceph/tira/data/runs/clueweb12-trec-web-2013-20230107-training/tira-ir-starter/2023-02-06-11-08-15/", 29 | "clueweb12-trec-web-2014-20230107-training": "/mnt/ceph/tira/data/runs/clueweb12-trec-web-2014-20230107-training/tira-ir-starter/2023-02-06-11-08-21/", 30 | "clueweb12-touche-2020-task-2-20230209-training": "/mnt/ceph/tira/data/runs/clueweb12-touche-2020-task-2-20230209-training/tira-ir-starter/2023-02-11-22-49-29/", 31 | "clueweb12-touche-2021-task-2-20230209-training": "/mnt/ceph/tira/data/runs/clueweb12-touche-2021-task-2-20230209-training/tira-ir-starter/2023-02-11-22-49-42/"} 32 | -------------------------------------------------------------------------------- /analysis-of-submissions/corpus-to-benchmark.json: -------------------------------------------------------------------------------- 1 | { 2 | "Args.me": ["argsme-touche-2021-task-1-20230209-training", "argsme-touche-2020-task-1-20230209-training"], 3 | "Antique": ["antique-test-20230107-training"], 4 | 
"Cranfield": ["cranfield-20230107-training"], 5 | "ClueWeb09": ["clueweb09-en-trec-web-2009-20230107-training", "clueweb09-en-trec-web-2010-20230107-training", "clueweb09-en-trec-web-2011-20230107-training", "clueweb09-en-trec-web-2012-20230107-training"], 6 | "ClueWeb12": ["clueweb12-trec-web-2013-20230107-training", "clueweb12-trec-web-2014-20230107-training", "clueweb12-touche-2020-task-2-20230209-training", "clueweb12-touche-2021-task-2-20230209-training"], 7 | "CORD-19": ["cord19-fulltext-trec-covid-20230107-training"], 8 | "Disks4+5": ["disks45-nocr-trec7-20230209-training", "disks45-nocr-trec8-20230209-training", "disks45-nocr-trec-robust-2004-20230209-training"], 9 | "Gov": ["gov-trec-web-2002-20230209-training", "gov-trec-web-2003-20230209-training", "gov-trec-web-2004-20230209-training"], 10 | "Gov2": ["gov2-trec-tb-2006-20230209-training", "gov2-trec-tb-2004-20230209-training", "gov2-trec-tb-2005-20230209-training"], 11 | "Medline": ["medline-2004-trec-genomics-2004-20230107-training", "medline-2004-trec-genomics-2005-20230107-training", "medline-2017-trec-pm-2017-20230211-training", "medline-2017-trec-pm-2018-20230211-training"], 12 | "MARCO": ["msmarco-passage-trec-dl-2019-judged-20230107-training", "msmarco-passage-trec-dl-2020-judged-20230107-training"], 13 | "NFCorpus": ["nfcorpus-test-20230107-training"], 14 | "Vaswani": ["vaswani-20230107-training"], 15 | "WaPo": ["wapo-v2-trec-core-2018-20230107-training"] 16 | } 17 | -------------------------------------------------------------------------------- /analysis-of-submissions/retrieval-softwares-in-pilot-study.json: -------------------------------------------------------------------------------- 1 | { 2 | "claret-fortress": "small-resources-gpu", 3 | "rectilinear-credits": "small-resources-gpu", 4 | "senior-platform": "small-resources-gpu", 5 | "MonoBERT Large (tira-ir-starter-gygaggle)": "small-resources-gpu", 6 | "DuoT5 base-10k-ms-marco Top-25 (tira-ir-starter-pyterrier)": "small-resources-gpu", 7 | "DuoT5 Top-25 (tira-ir-starter-pyterrier)": "small-resources-gpu", 8 | "MonoT5 Base (tira-ir-starter-gygaggle)": "small-resources-gpu", 9 | "ColBERT Re-Rank (tira-ir-starter-pyterrier)": "small-resources-gpu", 10 | "MonoT5 Large (tira-ir-starter-gygaggle)": "small-resources-gpu", 11 | "ANCE Base Cosine (tira-ir-starter-beir)": "small-resources-gpu", 12 | "ANCE Base Dot (tira-ir-starter-beir)": "small-resources-gpu", 13 | "SBERT msmarco-distilbert-base-v3-cos (tira-ir-starter-beir)": "small-resources-gpu", 14 | "SBERT msmarco-distilbert-base-v3-dot (tira-ir-starter-beir)": "small-resources-gpu", 15 | "SBERT msmarco-MiniLM-L6-cos-v5 (tira-ir-starter-beir)": "small-resources-gpu", 16 | "SBERT msmarco-MiniLM-L12-cos-v5 (tira-ir-starter-beir)": "small-resources-gpu", 17 | "SBERT msmarco-distilbert-cos-v5 (tira-ir-starter-beir)": "small-resources-gpu", 18 | "SBERT msmarco-distilbert-dot-v5 (tira-ir-starter-beir)": "small-resources-gpu", 19 | "SBERT msmarco-bert-base-dot-v5 (tira-ir-starter-beir)": "small-resources-gpu", 20 | "TASB msmarco-distilbert-base-cos (tira-ir-starter-beir)": "small-resources-gpu", 21 | "TASB msmarco-distilbert-base-dot (tira-ir-starter-beir)": "small-resources-gpu", 22 | "SBERT multi-qa-MiniLM-L6-cos-v1 (tira-ir-starter-beir)": "small-resources-gpu", 23 | "SBERT multi-qa-distilbert-cos-v1 (tira-ir-starter-beir)": "small-resources-gpu", 24 | "SBERT multi-qa-mpnet-base-cos-v1 (tira-ir-starter-beir)": "small-resources-gpu", 25 | "SBERT multi-qa-MiniLM-L6-dot-v1 (tira-ir-starter-beir)": "small-resources-gpu", 26 
| "SBERT multi-qa-distilbert-dot-v1 (tira-ir-starter-beir)": "small-resources-gpu", 27 | "SBERT multi-qa-mpnet-base-dot-v1 (tira-ir-starter-beir)": "small-resources-gpu", 28 | "DuoT5 3b-ms-marco Top-25 (tira-ir-starter-pyterrier)": "small-resources-gpu", 29 | "nimble-bar": "small-resources-gpu", 30 | "obsolete-mart": "small-resources-gpu", 31 | "latent-wetland": "small-resources-gpu", 32 | "BM25 Re-Rank (tira-ir-starter-pyterrier)": "small-resources", 33 | "DFIC Re-Rank (tira-ir-starter-pyterrier)": "small-resources", 34 | "DFIZ Re-Rank (tira-ir-starter-pyterrier)": "small-resources", 35 | "DFR_BM25 Re-Rank (tira-ir-starter-pyterrier)": "small-resources", 36 | "DFRee Re-Rank (tira-ir-starter-pyterrier)": "small-resources", 37 | "DFReeKLIM Re-Rank (tira-ir-starter-pyterrier)": "small-resources", 38 | "DirichletLM Re-Rank (tira-ir-starter-pyterrier)": "small-resources", 39 | "DLH Re-Rank (tira-ir-starter-pyterrier)": "small-resources", 40 | "DPH Re-Rank (tira-ir-starter-pyterrier)": "small-resources", 41 | "Hiemstra_LM Re-Rank (tira-ir-starter-pyterrier)": "small-resources", 42 | "IFB2 Re-Rank (tira-ir-starter-pyterrier)": "small-resources", 43 | "In_expB2 Re-Rank (tira-ir-starter-pyterrier)": "small-resources", 44 | "In_expC2 Re-Rank (tira-ir-starter-pyterrier)": "small-resources", 45 | "InB2 Re-Rank (tira-ir-starter-pyterrier)": "small-resources", 46 | "InL2 Re-Rank (tira-ir-starter-pyterrier)": "small-resources", 47 | "Js_KLs Re-Rank (tira-ir-starter-pyterrier)": "small-resources", 48 | "LGD Re-Rank (tira-ir-starter-pyterrier)": "small-resources", 49 | "PL2 Re-Rank (tira-ir-starter-pyterrier)": "small-resources", 50 | "TF_IDF Re-Rank (tira-ir-starter-pyterrier)": "small-resources", 51 | "XSqrA_M Re-Rank (tira-ir-starter-pyterrier)": "small-resources", 52 | "ChatNoir": "small-resources" 53 | } 54 | -------------------------------------------------------------------------------- /analysis-of-submissions/type-of-retrieval-softwares.json: -------------------------------------------------------------------------------- 1 | { 2 | "Lexical": [ 3 | "BM25 Re-Rank (tira-ir-starter-pyterrier)", "DFIC Re-Rank (tira-ir-starter-pyterrier)", 4 | "DFIZ Re-Rank (tira-ir-starter-pyterrier)","DFR_BM25 Re-Rank (tira-ir-starter-pyterrier)", 5 | "DFRee Re-Rank (tira-ir-starter-pyterrier)", "DFReeKLIM Re-Rank (tira-ir-starter-pyterrier)", 6 | "DirichletLM Re-Rank (tira-ir-starter-pyterrier)", "DLH Re-Rank (tira-ir-starter-pyterrier)", 7 | "DPH Re-Rank (tira-ir-starter-pyterrier)", "Hiemstra_LM Re-Rank (tira-ir-starter-pyterrier)", 8 | "IFB2 Re-Rank (tira-ir-starter-pyterrier)", "In_expB2 Re-Rank (tira-ir-starter-pyterrier)", 9 | "In_expC2 Re-Rank (tira-ir-starter-pyterrier)", "InB2 Re-Rank (tira-ir-starter-pyterrier)", 10 | "InL2 Re-Rank (tira-ir-starter-pyterrier)", "Js_KLs Re-Rank (tira-ir-starter-pyterrier)", 11 | "LGD Re-Rank (tira-ir-starter-pyterrier)", "PL2 Re-Rank (tira-ir-starter-pyterrier)", 12 | "TF_IDF Re-Rank (tira-ir-starter-pyterrier)", "XSqrA_M Re-Rank (tira-ir-starter-pyterrier)" 13 | ], 14 | 15 | "Late Int.": ["ColBERT Re-Rank (tira-ir-starter-pyterrier)"], 16 | 17 | "Bi-Encoder": [ 18 | "ANCE Base Cosine (tira-ir-starter-beir)", 19 | "ANCE Base Dot (tira-ir-starter-beir)", 20 | "SBERT msmarco-distilbert-base-v3-cos (tira-ir-starter-beir)", 21 | "SBERT msmarco-distilbert-base-v3-dot (tira-ir-starter-beir)", 22 | "SBERT msmarco-MiniLM-L6-cos-v5 (tira-ir-starter-beir)", 23 | "SBERT msmarco-MiniLM-L12-cos-v5 (tira-ir-starter-beir)", 24 | "SBERT msmarco-distilbert-cos-v5 (tira-ir-starter-beir)", 
25 | "SBERT msmarco-distilbert-dot-v5 (tira-ir-starter-beir)", 26 | "SBERT msmarco-bert-base-dot-v5 (tira-ir-starter-beir)", 27 | "TASB msmarco-distilbert-base-cos (tira-ir-starter-beir)", 28 | "TASB msmarco-distilbert-base-dot (tira-ir-starter-beir)", 29 | "SBERT multi-qa-MiniLM-L6-cos-v1 (tira-ir-starter-beir)", 30 | "SBERT multi-qa-distilbert-cos-v1 (tira-ir-starter-beir)", 31 | "SBERT multi-qa-mpnet-base-cos-v1 (tira-ir-starter-beir)", 32 | "SBERT multi-qa-MiniLM-L6-dot-v1 (tira-ir-starter-beir)", 33 | "SBERT multi-qa-distilbert-dot-v1 (tira-ir-starter-beir)", 34 | "SBERT multi-qa-mpnet-base-dot-v1 (tira-ir-starter-beir)" 35 | ], 36 | 37 | "duoT5": [ 38 | "DuoT5 base-10k-ms-marco Top-25 (tira-ir-starter-pyterrier)", 39 | "DuoT5 Top-25 (tira-ir-starter-pyterrier)", 40 | "DuoT5 3b-ms-marco Top-25 (tira-ir-starter-pyterrier)" 41 | ], 42 | 43 | "PyGaggle": [ 44 | "claret-fortress", 45 | "rectilinear-credits", 46 | "senior-platform", 47 | "MonoBERT Large (tira-ir-starter-gygaggle)", 48 | "MonoT5 Base (tira-ir-starter-gygaggle)", 49 | "MonoT5 Large (tira-ir-starter-gygaggle)", 50 | "nimble-bar", 51 | "obsolete-mart", 52 | "latent-wetland" 53 | ] 54 | } 55 | -------------------------------------------------------------------------------- /ir-datasets/README.md: -------------------------------------------------------------------------------- 1 | # Import new Datasets 2 | 3 | All datasets from the main branch of `ir_datasets` are supported by default. 4 | We have a tutorial showing how new, potentially work-in-progress data can be imported at [ir-datasets/tutorial](ir-datasets/tutorial) 5 | 6 | # Integration of ir_datasets into the Information Retrieval Experiment Platform 7 | 8 | 9 | 10 | The integration of ir_datasets into the IR Experiment platform lives in the main branch of TIRA: [https://github.com/tira-io/tira/blob/main/application/src/tira/management/commands/ir_datasets_loader_cli.py](https://github.com/tira-io/tira/blob/main/application/src/tira/management/commands/ir_datasets_loader_cli.py). 11 | -------------------------------------------------------------------------------- /ir-datasets/tutorial/.gitignore: -------------------------------------------------------------------------------- 1 | pangram-dataset-tira 2 | tira-output 3 | -------------------------------------------------------------------------------- /ir-datasets/tutorial/Dockerfile: -------------------------------------------------------------------------------- 1 | FROM webis/tira-ir-datasets-starter:0.0.54 2 | 3 | # The ir_datasets integration in "pangrams.py" and the resources "pangram-qrels.txt", "pangram-topics.xml", and "pangram-documents.jsonl" need to be 4 | # located in a package "datasets_in_progress" on the pythonpath so that they can be found. The following copy command ensures this. 5 | # You can test that it is correctly in the python path by running "from ir_datasets.datasets_in_progress import pangrams" inside a python shell. 
6 | # I.e., if your Docker image has the name "pangram-ir-dataset" as in the tutorial, a command to test this would be: 7 | # docker run --rm -ti --entrypoint python3 pangram-ir-dataset -c 'from ir_datasets.datasets_in_progress import pangrams; help(pangrams)' 8 | COPY pangrams.py pangram-qrels.txt pangram-topics.xml pangram-documents.jsonl /usr/lib/python3.8/site-packages/ir_datasets/datasets_in_progress/ 9 | 10 | ENTRYPOINT [ "/irds_cli.sh" ] 11 | 12 | -------------------------------------------------------------------------------- /ir-datasets/tutorial/pangram-documents.jsonl: -------------------------------------------------------------------------------- 1 | {"doc_id": "pangram-01", "text": "How quickly daft jumping zebras vex.", "letters": 30} 2 | {"doc_id": "pangram-02", "text": "Quick fox jumps nightly above wizard.", "letters": 31} 3 | {"doc_id": "pangram-03", "text": "The jay, pig, fox, zebra and my wolves quack!", "letters": 33} 4 | {"doc_id": "pangram-04", "text": "The quick brown fox jumps over the lazy dog.", "letters": 35} 5 | {"doc_id": "pangram-05", "text": "As quirky joke, chefs won’t pay devil magic zebra tax.", "letters": 42} 6 | -------------------------------------------------------------------------------- /ir-datasets/tutorial/pangram-qrels.txt: -------------------------------------------------------------------------------- 1 | 1 0 pangram-01 0 2 | 1 0 pangram-02 0 3 | 1 0 pangram-03 0 4 | 1 0 pangram-04 1 5 | 1 0 pangram-05 0 6 | 2 0 pangram-01 0 7 | 2 0 pangram-02 0 8 | 2 0 pangram-03 1 9 | 2 0 pangram-04 0 10 | 2 0 pangram-05 0 11 | -------------------------------------------------------------------------------- /ir-datasets/tutorial/pangram-topics.xml: -------------------------------------------------------------------------------- 1 | 2 | 3 | fox jumps above animal 4 | What pangrams have a fox jumping above some animal? 5 | Relevant pangrams have a fox jumping over an animal (e.g., a dog). Pangrams containing a fox that is not jumping or jumps over something that is not an animal are not relevant. 6 | 7 | 8 | multiple animals including a zebra 9 | Which pangrams have multiple animals where one of the animals is a zebra? 10 | Relevant pangrams have at least two animals, one of the animals must be a zebra. Pangrams containing only a zebra are not relevant. 11 | 12 | 13 | -------------------------------------------------------------------------------- /ir-datasets/tutorial/pangrams.py: -------------------------------------------------------------------------------- 1 | """This Python file registers a new ir_datasets class 'pangrams'. 2 | You can find the ir_datasets documentation here: https://github.com/allenai/ir_datasets/. 3 | This file is intended to work inside the Docker image produced during this tutorial (the Dockerfile copies it and the other files loaded below to the correct locations).
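A minimal usage sketch (an illustration, assuming the Docker image from this tutorial): importing this module runs the registration below, after which the dataset can be loaded and iterated like any other ir_datasets dataset:

    from ir_datasets.datasets_in_progress import pangrams  # the import registers 'pangrams'
    import ir_datasets

    dataset = ir_datasets.load('pangrams')
    for doc in dataset.docs_iter():
        print(doc.doc_id, doc.text)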
4 | """ 5 | import ir_datasets 6 | from ir_datasets.formats import JsonlDocs, TrecXmlQueries, TrecQrels 7 | from typing import NamedTuple, Dict 8 | from ir_datasets.datasets.base import Dataset 9 | 10 | class PangramDocument(NamedTuple): 11 | doc_id: str 12 | text: str 13 | letters: int 14 | 15 | def default_text(self): 16 | return self.text 17 | 18 | ir_datasets.registry.register('pangrams', Dataset( 19 | JsonlDocs(ir_datasets.util.PackageDataFile(path='datasets_in_progress/pangram-documents.jsonl'), doc_cls=PangramDocument, lang='en'), 20 | TrecXmlQueries(ir_datasets.util.PackageDataFile(path='datasets_in_progress/pangram-topics.xml'), lang='en'), 21 | TrecQrels(ir_datasets.util.PackageDataFile(path='datasets_in_progress/pangram-qrels.txt'), {0: 'Not Relevant', 1: 'Relevant'}) 22 | )) 23 | -------------------------------------------------------------------------------- /ir-measures/.devcontainer.json: -------------------------------------------------------------------------------- 1 | { 2 | "image": "webis/ir_measures_evaluator:dev-1.0.7", 3 | "customizations": { 4 | "vscode": { 5 | "extensions": [ 6 | "ms-python.python", 7 | "ms-python.vscode-pylance", 8 | "ms-toolsai.jupyter" 9 | ] 10 | } 11 | } 12 | } 13 | -------------------------------------------------------------------------------- /ir-measures/.gitignore: -------------------------------------------------------------------------------- 1 | tests/end-to-end-test/evaluator-output/ 2 | -------------------------------------------------------------------------------- /ir-measures/.vscode/settings.json: -------------------------------------------------------------------------------- 1 | { 2 | "python.testing.pytestArgs": [ 3 | "tests" 4 | ], 5 | "python.testing.unittestEnabled": false, 6 | "python.testing.pytestEnabled": true 7 | } -------------------------------------------------------------------------------- /ir-measures/Dockerfile: -------------------------------------------------------------------------------- 1 | FROM webis/ir_measures_evaluator:dev-1.0.7 2 | 3 | COPY ir_measures_evaluator.py / 4 | 5 | COPY tests /tmp/tests/ 6 | 7 | RUN cd /tmp \ 8 | && find -iname __pycache__ -exec rm -Rf {} \; || echo "" \ 9 | && PYTHONPATH='../:.' pytest \ 10 | && cd / \ 11 | && rm -Rf /tmp/tests 12 | 13 | ENTRYPOINT [ "/ir_measures_evaluator.py" ] 14 | -------------------------------------------------------------------------------- /ir-measures/Dockerfile.dev: -------------------------------------------------------------------------------- 1 | # docker build -t webis/ir_measures_evaluator:dev-1.0.7 -f Dockerfile.dev . 2 | FROM webis/tira-application:0.0.94 3 | 4 | RUN apt-get install -y jq libffi-dev && pip3 install ir-datasets ir-measures==0.3.1 approvaltests exceptiongroup pytest jupyter 5 | 6 | #./ir_measures_evaluator.py --topics tests/end-to-end-test/truth-data/queries.jsonl --qrels tests/end-to-end-test/truth-data/qrels.txt --run tests/end-to-end-test/document-components/run.txt --output fff/ 7 | #./ir_measures_evaluator.py --topics tests/end-to-end-test/truth-data/queries.jsonl --qrels tests/end-to-end-test/truth-data/qrels.txt --run tests/end-to-end-test/query-components/run.txt --output fff/ 8 | -------------------------------------------------------------------------------- /ir-measures/Makefile: -------------------------------------------------------------------------------- 1 | IMAGE_VERSION=1.0.7 2 | 3 | build-docker-image: 4 | docker build -t webis/ir_measures_evaluator:dev-${IMAGE_VERSION} -f Dockerfile.dev . 
5 | docker build -t webis/ir_measures_evaluator:${IMAGE_VERSION} . 6 | 7 | .PHONY: tests 8 | tests: 9 | pytest 10 | 11 | example-execution: 12 | rm -Rf output 13 | docker run --rm -it -v ${PWD}/input:/input -v ${PWD}/output:/output webis/ir_measures_evaluator:${IMAGE_VERSION} --run /input/run.txt --qrels /input/qrels.txt --measures "AP(rel=2)" "P(rel=2)@10" --output_path /output/eval.prototext 14 | 15 | publish-docker-image: 16 | docker push webis/ir_measures_evaluator:${IMAGE_VERSION} 17 | -------------------------------------------------------------------------------- /ir-measures/Pipfile: -------------------------------------------------------------------------------- 1 | [[source]] 2 | url = "https://pypi.org/simple" 3 | verify_ssl = true 4 | name = "pypi" 5 | 6 | [packages] 7 | ir-measures = "*" 8 | pytest = "*" 9 | approvaltests = "*" 10 | exceptiongroup = "*" 11 | 12 | [dev-packages] 13 | 14 | [requires] 15 | python_version = "3.8" 16 | -------------------------------------------------------------------------------- /ir-measures/README.md: -------------------------------------------------------------------------------- 1 | # IR Measures Evaluator 2 | 3 | The IR Measures evaluator uses the truth data (e.g., qrels) and the outputs of a system (e.g., a run) as input to produce an evaluation. This evaluator produces both a quantitative evaluation with ir_measures (e.g., P@10, nDCG@10) and the basis for qualitative evaluations by rendering the runs into search engine result pages (SERPs). Both the rendered SERP and the effectiveness measures are stored in the evaluation directory so that participants can see them once a run and its evaluation are unblinded. 4 | 5 | To test the evaluator locally, please install `tira-run` (e.g., `pip3 install tira`), which executes a Docker image as it would be executed in TIRA, and use the following command from within this directory to try it on a small example: 6 | 7 | ``` 8 | tira-run \ 9 | --image webis/ir_measures_evaluator:1.0.3 \ 10 | --input-run ${PWD}/tests/end-to-end-test/output-of-run/ \ 11 | --input-directory ${PWD}/tests/end-to-end-test/truth-data/ \ 12 | --output-directory ${PWD}/tests/end-to-end-test/evaluator-output \ 13 | --allow-network true \ 14 | --command '/ir_measures_evaluator.py --run ${inputRun}/run.txt --topics ${inputDataset}/queries.jsonl --qrels ${inputDataset}/qrels.txt --output ${outputDir} --measures "P@10" "nDCG@10" "MRR"' 15 | ``` 16 | 17 | This creates a directory `tests/end-to-end-test/evaluator-output/` with the following content: 18 | 19 | - `evaluation-per-query.prototext`: Per-query evaluations (e.g., for significance tests) 20 | - `evaluation.prototext`: Aggregated evaluations over all queries 21 | - `serp.html`: The rendered SERP for all topics 22 | - `.data-top-10-for-rendering.jsonl`: Small export of all the data required for rendering the run. This is intended for more dynamic rendering.
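The per-query results are what you would feed into a significance test when comparing two runs. A minimal, hypothetical sketch of such a post-processing step (not part of the evaluator; the directories `run-a/` and `run-b/` are placeholders for two evaluator output directories, and `scipy` is an extra dependency):

```
import re
from scipy.stats import ttest_rel

# Matches one measure block of the prototext format produced by the evaluator.
MEASURE_PATTERN = re.compile(
    r'measure\s*\{\s*query_id:\s*"([^"]+)"\s*measure:\s*"([^"]+)"\s*value:\s*"([^"]+)"\s*\}'
)

def parse_per_query_prototext(path):
    """Parse an evaluation-per-query.prototext file into {measure: {query_id: value}}."""
    ret = {}
    with open(path) as f:
        for query_id, measure, value in MEASURE_PATTERN.findall(f.read()):
            ret.setdefault(measure, {})[query_id] = float(value)
    return ret

run_a = parse_per_query_prototext('run-a/evaluation-per-query.prototext')
run_b = parse_per_query_prototext('run-b/evaluation-per-query.prototext')

for measure in sorted(set(run_a) & set(run_b)):
    # Paired test over the queries that both runs were evaluated on.
    query_ids = sorted(set(run_a[measure]) & set(run_b[measure]))
    scores_a = [run_a[measure][i] for i in query_ids]
    scores_b = [run_b[measure][i] for i in query_ids]
    print(measure, ttest_rel(scores_a, scores_b).pvalue)
```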
23 | 24 | # Usage in TIRA 25 | 26 | Add the evaluator to tira with: 27 | 28 | Image: 29 | ``` 30 | webis/ir_measures_evaluator:1.0.2 31 | ``` 32 | 33 | Command (if no qrels are available): 34 | 35 | ``` 36 | /ir_measures_evaluator.py --run ${inputRun}/run.txt --output_path ${outputDir}/evaluation.prototext 37 | ``` 38 | 39 | 40 | Command (if qrels are available): 41 | 42 | ``` 43 | /ir_measures_evaluator.py --run ${inputRun}/run.txt --topics ${inputDataset}/queries.jsonl --qrels ${inputDataset}/qrels.txt --output_path ${outputDir} --measures "P@10" "nDCG@10" "MRR" 44 | ``` 45 | 46 | -------------------------------------------------------------------------------- /ir-measures/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/tira-io/ir-experiment-platform/9fded863a5fb06332ee73b2a6913d763eed3510d/ir-measures/__init__.py -------------------------------------------------------------------------------- /ir-measures/ir-measures.iml: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | -------------------------------------------------------------------------------- /ir-measures/output/evaluation.prototext: -------------------------------------------------------------------------------- 1 | measure { 2 | key: "AP" 3 | value: "0.0984479789876958" 4 | } 5 | measure { 6 | key: "ERR@20" 7 | value: "0.11547874999999996" 8 | } 9 | measure { 10 | key: "P@20" 11 | value: "0.3895833333333332" 12 | } 13 | measure { 14 | key: "nDCG@20" 15 | value: "0.2595661630864897" 16 | } 17 | -------------------------------------------------------------------------------- /ir-measures/tests/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/tira-io/ir-experiment-platform/9fded863a5fb06332ee73b2a6913d763eed3510d/ir-measures/tests/__init__.py -------------------------------------------------------------------------------- /ir-measures/tests/__pycache__/test_with_approvals.cpython-310-pytest-7.2.0.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/tira-io/ir-experiment-platform/9fded863a5fb06332ee73b2a6913d763eed3510d/ir-measures/tests/__pycache__/test_with_approvals.cpython-310-pytest-7.2.0.pyc -------------------------------------------------------------------------------- /ir-measures/tests/approvaltests_config.json: -------------------------------------------------------------------------------- 1 | { 2 | "subdirectory": "approved_files" 3 | } -------------------------------------------------------------------------------- /ir-measures/tests/approved_files/test_with_approvals.test_all_valid.approved.txt: -------------------------------------------------------------------------------- 1 | ℹ Check qrels path: test-input/qrels_sample_valid.txt 2 | ✓ Qrels path is valid. 3 | ℹ Load qrels with ir-measures. 4 | ✓ Qrels successfully loaded. 5 | ℹ Check topics path: test-input/topics_sample_valid.jsonl 6 | ✓ Topics path is valid. 7 | ℹ Load topics. 8 | ✓ Topics successfully loaded. 9 | ℹ Parse measures: P@2, nDCG@2 10 | ✓ Measures successfully parsed. 11 | ℹ Check output path: test-output 12 | ✓ Output path is valid. 13 | ℹ Check run path: test-input/run_sample_valid.txt 14 | ✓ Run path is valid. 15 | ℹ Check run file format. 16 | ✓ Run file format is valid. 17 | ℹ Load run with ir-measures. 18 | ✓ Run successfully loaded. 
19 | ℹ Check run, qrels, and topics consistency. 20 | ✓ Run, qrels, and topics are consistent. 21 | ℹ Evaluate run with measures: P@2, nDCG@2 22 | ✓ Run successfully evaluated. 23 | ℹ Export metrics. 24 | ✓ Metrics successfully exported. 25 | 26 | 27 | #### 28 | files: ['test-output/evaluation-per-query.prototext', 'test-output/evaluation.prototext'] 29 | 30 | 31 | ####test-output/evaluation-per-query.prototext 32 | measure { 33 | query_id: "1" 34 | measure: "P@2" 35 | value: "1.0" 36 | } 37 | measure { 38 | query_id: "1" 39 | measure: "nDCG@2" 40 | value: "0.6666666666666667" 41 | } 42 | 43 | 44 | ####test-output/evaluation.prototext 45 | measure { 46 | key: "P@2" 47 | value: "1.0" 48 | } 49 | measure { 50 | key: "nDCG@2" 51 | value: "0.6666666666666667" 52 | } 53 | -------------------------------------------------------------------------------- /ir-measures/tests/approved_files/test_with_approvals.test_all_valid_with_rendering.approved.txt: -------------------------------------------------------------------------------- 1 | ℹ Check qrels path: test-input/test-input-cranfield/qrels.txt 2 | ✓ Qrels path is valid. 3 | ℹ Load qrels with ir-measures. 4 | ✓ Qrels successfully loaded. 5 | ℹ Check topics path: test-input/test-input-cranfield/queries.jsonl 6 | ✓ Topics path is valid. 7 | ℹ Load topics. 8 | ✓ Topics successfully loaded. 9 | ℹ Parse measures: P@2, nDCG@2 10 | ✓ Measures successfully parsed. 11 | ℹ Check output path: test-output 12 | ✓ Output path is valid. 13 | ℹ Export metrics. 14 | ✓ Metrics successfully exported. 15 | ℹ Check run path: test-input/test-input-cranfield/run.txt 16 | ✓ Run path is valid. 17 | ℹ Check run file format. 18 | ⚠ Ranks do not start at 0. 19 | ⚠ Scores not in descending order at lines: 11<12 20 | ⚠ Ranks not in ascending order at lines: 11>12 21 | ⚠ Ranks not consecutive at lines: 11↛12 22 | ⚠ Run file format is valid: 4 warnings 23 | ℹ Load run with ir-measures. 24 | ✓ Run successfully loaded. 25 | ℹ Check run, qrels, and topics consistency. 26 | ⚠ Document IDs of run file not found in qrels file: 359, 486, 573, 663, 746 (+1 more) 27 | ⚠ Run, qrels, and topics are inconsistent: 6 warnings 28 | ℹ Evaluate run with measures: P@2, nDCG@2 29 | ✓ Run successfully evaluated. 30 | ℹ Export metrics. 31 | ✓ Metrics successfully exported. 32 | 33 | 34 | #### 35 | files: ['test-output/.data-top-10-for-rendering.json.gz', 'test-output/evaluation-per-query.prototext', 'test-output/evaluation.prototext'] 36 | 37 | 38 | ####test-output/evaluation-per-query.prototext 39 | measure { 40 | query_id: "1" 41 | measure: "P@2" 42 | value: "0.5" 43 | } 44 | measure { 45 | query_id: "1" 46 | measure: "nDCG@2" 47 | value: "0.4598603945740938" 48 | } 49 | 50 | 51 | ####test-output/evaluation.prototext 52 | measure { 53 | key: "P@2" 54 | value: "0.5" 55 | } 56 | measure { 57 | key: "intermediate_processed_queries_judged" 58 | value: "1" 59 | } 60 | measure { 61 | key: "nDCG@2" 62 | value: "0.4598603945740938" 63 | } 64 | -------------------------------------------------------------------------------- /ir-measures/tests/approved_files/test_with_approvals.test_all_valid_with_rendering_wrong_qrels_and_queries.approved.txt: -------------------------------------------------------------------------------- 1 | ℹ Check qrels path: test-input/test-input-cranfield/qrels.txt 2 | ✓ Qrels path is valid. 3 | ℹ Load qrels with ir-measures. 4 | ✓ Qrels successfully loaded. 5 | ℹ Check topics path: test-input/topics_sample_valid.jsonl 6 | ✓ Topics path is valid. 7 | ℹ Load topics. 
8 | ✓ Topics successfully loaded. 9 | ℹ Parse measures: P@2, nDCG@2 10 | ✓ Measures successfully parsed. 11 | ℹ Check output path: test-output 12 | ✓ Output path is valid. 13 | ℹ Export metrics. 14 | ✓ Metrics successfully exported. 15 | ℹ Check run path: test-input/test-input-cranfield/run.txt 16 | ✓ Run path is valid. 17 | ℹ Check run file format. 18 | ⚠ Ranks do not start at 0. 19 | ⚠ Scores not in descending order at lines: 11<12 20 | ⚠ Ranks not in ascending order at lines: 11>12 21 | ⚠ Ranks not consecutive at lines: 11↛12 22 | ⚠ Run file format is valid: 4 warnings 23 | ℹ Load run with ir-measures. 24 | ✓ Run successfully loaded. 25 | ℹ Check run, qrels, and topics consistency. 26 | ⚠ Document IDs of run file not found in qrels file: 359, 486, 573, 663, 746 (+1 more) 27 | ⚠ Run, qrels, and topics are inconsistent: 6 warnings 28 | ℹ Evaluate run with measures: P@2, nDCG@2 29 | ✓ Run successfully evaluated. 30 | ℹ Export metrics. 31 | ✓ Metrics successfully exported. 32 | 33 | 34 | #### 35 | files: ['test-output/.data-top-10-for-rendering.json.gz', 'test-output/evaluation-per-query.prototext', 'test-output/evaluation.prototext'] 36 | 37 | 38 | ####test-output/evaluation-per-query.prototext 39 | measure { 40 | query_id: "1" 41 | measure: "P@2" 42 | value: "0.5" 43 | } 44 | measure { 45 | query_id: "1" 46 | measure: "nDCG@2" 47 | value: "0.4598603945740938" 48 | } 49 | 50 | 51 | ####test-output/evaluation.prototext 52 | measure { 53 | key: "P@2" 54 | value: "0.5" 55 | } 56 | measure { 57 | key: "intermediate_processed_queries_judged" 58 | value: "1" 59 | } 60 | measure { 61 | key: "nDCG@2" 62 | value: "0.4598603945740938" 63 | } 64 | -------------------------------------------------------------------------------- /ir-measures/tests/approved_files/test_with_approvals.test_document_ids_inconsistent_run_qrels.approved.txt: -------------------------------------------------------------------------------- 1 | ℹ Check run path: test-input/run_sample_warning_docid_not_in_qrels.txt 2 | ✓ Run path is valid. 3 | ℹ Check run file format. 4 | ✓ Run file format is valid. 5 | ℹ Load run with ir-measures. 6 | ✓ Run successfully loaded. 7 | ℹ Check qrels path: test-input/qrels_sample_valid.txt 8 | ✓ Qrels path is valid. 9 | ℹ Load qrels with ir-measures. 10 | ✓ Qrels successfully loaded. 11 | ℹ Check topics path: test-input/topics_sample_valid.jsonl 12 | ✓ Topics path is valid. 13 | ℹ Load topics. 14 | ✓ Topics successfully loaded. 15 | ℹ Check run, qrels, and topics consistency. 16 | ⚠ Document IDs of run file not found in qrels file: 9 17 | ⚠ Run, qrels, and topics are inconsistent: 1 warnings 18 | 19 | 20 | #### 21 | files: [] 22 | -------------------------------------------------------------------------------- /ir-measures/tests/approved_files/test_with_approvals.test_measure_invalid.approved.txt: -------------------------------------------------------------------------------- 1 | ℹ Check qrels path: test-input/qrels_sample_valid.txt 2 | ✓ Qrels path is valid. 3 | ℹ Load qrels with ir-measures. 4 | ✓ Qrels successfully loaded. 5 | ℹ Check topics path: test-input/topics_sample_valid.jsonl 6 | ✓ Topics path is valid. 7 | ℹ Load topics. 8 | ✓ Topics successfully loaded. 
9 | ℹ Parse measures: P@X 10 | ✗ Measure is invalid: P@X 11 | ✗ Measures could not be parsed: 1 invalid 12 | 13 | 14 | #### 15 | files: [] 16 | -------------------------------------------------------------------------------- /ir-measures/tests/approved_files/test_with_approvals.test_measure_unknown.approved.txt: -------------------------------------------------------------------------------- 1 | ℹ Check qrels path: test-input/qrels_sample_valid.txt 2 | ✓ Qrels path is valid. 3 | ℹ Load qrels with ir-measures. 4 | ✓ Qrels successfully loaded. 5 | ℹ Check topics path: test-input/topics_sample_valid.jsonl 6 | ✓ Topics path is valid. 7 | ℹ Load topics. 8 | ✓ Topics successfully loaded. 9 | ℹ Parse measures: FOOBAR 10 | ✗ Measure is unknown: FOOBAR 11 | ✗ Measures could not be parsed: 1 unknown 12 | 13 | 14 | #### 15 | files: [] 16 | -------------------------------------------------------------------------------- /ir-measures/tests/approved_files/test_with_approvals.test_measure_valid.approved.txt: -------------------------------------------------------------------------------- 1 | ℹ Check qrels path: test-input/qrels_sample_valid.txt 2 | ✓ Qrels path is valid. 3 | ℹ Load qrels with ir-measures. 4 | ✓ Qrels successfully loaded. 5 | ℹ Check topics path: test-input/topics_sample_valid.jsonl 6 | ✓ Topics path is valid. 7 | ℹ Load topics. 8 | ✓ Topics successfully loaded. 9 | ℹ Parse measures: P@2 10 | ✓ Measures successfully parsed. 11 | 12 | 13 | #### 14 | files: [] 15 | -------------------------------------------------------------------------------- /ir-measures/tests/approved_files/test_with_approvals.test_measure_valid_no_qrels.approved.txt: -------------------------------------------------------------------------------- 1 | ℹ Check run path: test-input/run_sample_valid.txt 2 | ✓ Run path is valid. 3 | ℹ Check run file format. 4 | ✓ Run file format is valid. 5 | ℹ Load run with ir-measures. 6 | ✓ Run successfully loaded. 7 | ℹ Check topics path: test-input/topics_sample_valid.jsonl 8 | ✓ Topics path is valid. 9 | ℹ Load topics. 10 | ✓ Topics successfully loaded. 11 | ℹ Parse measures: P@2 12 | ✓ Measures successfully parsed. 13 | ✗ Consistency check without qrels file is not allowed. 14 | 15 | 16 | #### 17 | files: [] 18 | -------------------------------------------------------------------------------- /ir-measures/tests/approved_files/test_with_approvals.test_measure_valid_no_topics.approved.txt: -------------------------------------------------------------------------------- 1 | ℹ Check run path: test-input/run_sample_valid.txt 2 | ✓ Run path is valid. 3 | ℹ Check run file format. 4 | ✓ Run file format is valid. 5 | ℹ Load run with ir-measures. 6 | ✓ Run successfully loaded. 7 | ℹ Check qrels path: test-input/qrels_sample_valid.txt 8 | ✓ Qrels path is valid. 9 | ℹ Load qrels with ir-measures. 10 | ✓ Qrels successfully loaded. 11 | ℹ Parse measures: P@2 12 | ✓ Measures successfully parsed. 13 | ✗ Consistency check without topics file is not allowed. 14 | 15 | 16 | #### 17 | files: [] 18 | -------------------------------------------------------------------------------- /ir-measures/tests/approved_files/test_with_approvals.test_measures_valid.approved.txt: -------------------------------------------------------------------------------- 1 | ℹ Check qrels path: test-input/qrels_sample_valid.txt 2 | ✓ Qrels path is valid. 3 | ℹ Load qrels with ir-measures. 4 | ✓ Qrels successfully loaded. 5 | ℹ Check topics path: test-input/topics_sample_valid.jsonl 6 | ✓ Topics path is valid. 7 | ℹ Load topics. 
8 | ✓ Topics successfully loaded. 9 | ℹ Parse measures: P@2, nDCG@2 10 | ✓ Measures successfully parsed. 11 | 12 | 13 | #### 14 | files: [] 15 | -------------------------------------------------------------------------------- /ir-measures/tests/approved_files/test_with_approvals.test_output_dir_not_empty.approved.txt: -------------------------------------------------------------------------------- 1 | ℹ Check qrels path: test-input/qrels_sample_valid.txt 2 | ✓ Qrels path is valid. 3 | ℹ Load qrels with ir-measures. 4 | ✓ Qrels successfully loaded. 5 | ℹ Check topics path: test-input/topics_sample_valid.jsonl 6 | ✓ Topics path is valid. 7 | ℹ Load topics. 8 | ✓ Topics successfully loaded. 9 | ℹ Parse measures: P@2, nDCG@2 10 | ✓ Measures successfully parsed. 11 | ℹ Check output path: test-output-not-empty 12 | ⚠ Output directory is not empty. 13 | ✓ Output path is valid: 1 warning 14 | ℹ Check run path: test-input/run_sample_valid.txt 15 | ✓ Run path is valid. 16 | ℹ Check run file format. 17 | ✓ Run file format is valid. 18 | ℹ Load run with ir-measures. 19 | ✓ Run successfully loaded. 20 | ℹ Check run, qrels, and topics consistency. 21 | ✓ Run, qrels, and topics are consistent. 22 | ℹ Evaluate run with measures: P@2, nDCG@2 23 | ✓ Run successfully evaluated. 24 | ℹ Export metrics. 25 | ✓ Metrics successfully exported. 26 | 27 | 28 | #### 29 | files: ['test-output-not-empty/evaluation-per-query.prototext', 'test-output-not-empty/evaluation.prototext', 'test-output-not-empty/file.txt'] 30 | 31 | 32 | ####test-output-not-empty/evaluation-per-query.prototext 33 | measure { 34 | query_id: "1" 35 | measure: "P@2" 36 | value: "1.0" 37 | } 38 | measure { 39 | query_id: "1" 40 | measure: "nDCG@2" 41 | value: "0.6666666666666667" 42 | } 43 | 44 | 45 | ####test-output-not-empty/evaluation.prototext 46 | measure { 47 | key: "P@2" 48 | value: "1.0" 49 | } 50 | measure { 51 | key: "nDCG@2" 52 | value: "0.6666666666666667" 53 | } 54 | 55 | 56 | ####test-output-not-empty/file.txt 57 | -------------------------------------------------------------------------------- /ir-measures/tests/approved_files/test_with_approvals.test_output_path_is_file.approved.txt: -------------------------------------------------------------------------------- 1 | ℹ Check qrels path: test-input/qrels_sample_valid.txt 2 | ✓ Qrels path is valid. 3 | ℹ Load qrels with ir-measures. 4 | ✓ Qrels successfully loaded. 5 | ℹ Check topics path: test-input/topics_sample_valid.jsonl 6 | ✓ Topics path is valid. 7 | ℹ Load topics. 8 | ✓ Topics successfully loaded. 9 | ℹ Parse measures: P@2, nDCG@2 10 | ✓ Measures successfully parsed. 11 | ℹ Check output path: test-output-not-empty/file.txt 12 | ✗ Output path is not a directory. 13 | ✗ Output path is invalid: 1 error 14 | 15 | 16 | #### 17 | files: [] 18 | -------------------------------------------------------------------------------- /ir-measures/tests/approved_files/test_with_approvals.test_output_path_not_found.approved.txt: -------------------------------------------------------------------------------- 1 | ℹ Check qrels path: test-input/qrels_sample_valid.txt 2 | ✓ Qrels path is valid. 3 | ℹ Load qrels with ir-measures. 4 | ✓ Qrels successfully loaded. 5 | ℹ Check topics path: test-input/topics_sample_valid.jsonl 6 | ✓ Topics path is valid. 7 | ℹ Load topics. 8 | ✓ Topics successfully loaded. 9 | ℹ Parse measures: P@2, nDCG@2 10 | ✓ Measures successfully parsed. 11 | ℹ Check output path: 42 12 | ✗ Output path does not exist. 
13 | ✗ Output path is invalid: 1 error 14 | 15 | 16 | #### 17 | files: [] 18 | -------------------------------------------------------------------------------- /ir-measures/tests/approved_files/test_with_approvals.test_output_valid_no_measures.approved.txt: -------------------------------------------------------------------------------- 1 | ℹ Check qrels path: test-input/qrels_sample_valid.txt 2 | ✓ Qrels path is valid. 3 | ℹ Load qrels with ir-measures. 4 | ✓ Qrels successfully loaded. 5 | ℹ Check topics path: test-input/topics_sample_valid.jsonl 6 | ✓ Topics path is valid. 7 | ℹ Load topics. 8 | ✓ Topics successfully loaded. 9 | ℹ Check output path: test-output 10 | ✓ Output path is valid. 11 | ℹ Check run path: test-input/run_sample_valid.txt 12 | ✓ Run path is valid. 13 | ℹ Check run file format. 14 | ✓ Run file format is valid. 15 | ℹ Load run with ir-measures. 16 | ✓ Run successfully loaded. 17 | ℹ Check run, qrels, and topics consistency. 18 | ✓ Run, qrels, and topics are consistent. 19 | ✗ Exporting metrics without measures is not allowed. 20 | 21 | 22 | #### 23 | files: [] 24 | -------------------------------------------------------------------------------- /ir-measures/tests/approved_files/test_with_approvals.test_output_valid_no_qrels.approved.txt: -------------------------------------------------------------------------------- 1 | ℹ Check topics path: test-input/topics_sample_valid.jsonl 2 | ✓ Topics path is valid. 3 | ℹ Load topics. 4 | ✓ Topics successfully loaded. 5 | ℹ Parse measures: P@2, nDCG@2 6 | ✓ Measures successfully parsed. 7 | ℹ Check output path: test-output 8 | ✓ Output path is valid. 9 | ✗ Consistency check without qrels file is not allowed. 10 | 11 | 12 | #### 13 | files: [] 14 | -------------------------------------------------------------------------------- /ir-measures/tests/approved_files/test_with_approvals.test_output_valid_no_topics.approved.txt: -------------------------------------------------------------------------------- 1 | ℹ Check qrels path: test-input/qrels_sample_valid.txt 2 | ✓ Qrels path is valid. 3 | ℹ Load qrels with ir-measures. 4 | ✓ Qrels successfully loaded. 5 | ℹ Parse measures: P@2, nDCG@2 6 | ✓ Measures successfully parsed. 7 | ℹ Check output path: test-output 8 | ✓ Output path is valid. 9 | ✗ Consistency check without topics file is not allowed. 10 | 11 | 12 | #### 13 | files: [] 14 | -------------------------------------------------------------------------------- /ir-measures/tests/approved_files/test_with_approvals.test_qrels_file_empty.approved.txt: -------------------------------------------------------------------------------- 1 | ℹ Check qrels path: test-input/empty_file.txt 2 | ✗ Qrels file is empty. 3 | ✗ Qrels path is invalid: 1 error 4 | 5 | 6 | #### 7 | files: [] 8 | -------------------------------------------------------------------------------- /ir-measures/tests/approved_files/test_with_approvals.test_qrels_path_is_dir.approved.txt: -------------------------------------------------------------------------------- 1 | ℹ Check qrels path: test-input 2 | ✗ Qrels path is not a file. 3 | ✗ Qrels path is invalid: 1 error 4 | 5 | 6 | #### 7 | files: [] 8 | -------------------------------------------------------------------------------- /ir-measures/tests/approved_files/test_with_approvals.test_qrels_path_not_found.approved.txt: -------------------------------------------------------------------------------- 1 | ℹ Check qrels path: bar 2 | ✗ Qrels path does not exist. 
3 | ✗ Qrels path is invalid: 1 error 4 | 5 | 6 | #### 7 | files: [] 8 | -------------------------------------------------------------------------------- /ir-measures/tests/approved_files/test_with_approvals.test_qrels_topics_valid.approved.txt: -------------------------------------------------------------------------------- 1 | ℹ Check run path: test-input/run_sample_valid.txt 2 | ✓ Run path is valid. 3 | ℹ Check run file format. 4 | ✓ Run file format is valid. 5 | ℹ Load run with ir-measures. 6 | ✓ Run successfully loaded. 7 | ℹ Check qrels path: test-input/qrels_sample_valid.txt 8 | ✓ Qrels path is valid. 9 | ℹ Load qrels with ir-measures. 10 | ✓ Qrels successfully loaded. 11 | ℹ Check topics path: test-input/topics_sample_valid.jsonl 12 | ✓ Topics path is valid. 13 | ℹ Load topics. 14 | ✓ Topics successfully loaded. 15 | ℹ Check run, qrels, and topics consistency. 16 | ✓ Run, qrels, and topics are consistent. 17 | 18 | 19 | #### 20 | files: [] 21 | -------------------------------------------------------------------------------- /ir-measures/tests/approved_files/test_with_approvals.test_qrels_valid_no_topics.approved.txt: -------------------------------------------------------------------------------- 1 | ℹ Check qrels path: test-input/qrels_sample_valid.txt 2 | ✓ Qrels path is valid. 3 | ℹ Load qrels with ir-measures. 4 | ✓ Qrels successfully loaded. 5 | ℹ Check output path: /tmp 6 | ⚠ Output directory is not empty. 7 | ✓ Output path is valid: 1 warning 8 | ✗ Consistency check without topics file is not allowed. 9 | 10 | 11 | #### 12 | files: [] 13 | -------------------------------------------------------------------------------- /ir-measures/tests/approved_files/test_with_approvals.test_query_ids_inconsistent_run_qrels.approved.txt: -------------------------------------------------------------------------------- 1 | ℹ Check run path: test-input/run_sample_warning_qid_not_in_qrels.txt 2 | ✓ Run path is valid. 3 | ℹ Check run file format. 4 | ✓ Run file format is valid. 5 | ℹ Load run with ir-measures. 6 | ✓ Run successfully loaded. 7 | ℹ Check qrels path: test-input/qrels_sample_valid.txt 8 | ✓ Qrels path is valid. 9 | ℹ Load qrels with ir-measures. 10 | ✓ Qrels successfully loaded. 11 | ℹ Check topics path: test-input/topics_sample_valid.jsonl 12 | ✓ Topics path is valid. 13 | ℹ Load topics. 14 | ✓ Topics successfully loaded. 15 | ℹ Check run, qrels, and topics consistency. 16 | ⚠ Query IDs of run file not found in qrels file: 2 17 | ⚠ Run, qrels, and topics are inconsistent: 1 warnings 18 | 19 | 20 | #### 21 | files: [] 22 | -------------------------------------------------------------------------------- /ir-measures/tests/approved_files/test_with_approvals.test_query_ids_inconsistent_topics_qrels.approved.txt: -------------------------------------------------------------------------------- 1 | ℹ Check run path: test-input/run_sample_valid.txt 2 | ✓ Run path is valid. 3 | ℹ Check run file format. 4 | ✓ Run file format is valid. 5 | ℹ Load run with ir-measures. 6 | ✓ Run successfully loaded. 7 | ℹ Check qrels path: test-input/qrels_sample_valid.txt 8 | ✓ Qrels path is valid. 9 | ℹ Load qrels with ir-measures. 10 | ✓ Qrels successfully loaded. 11 | ℹ Check topics path: test-input/topics_sample_warning_qid_not_in_qrels.jsonl 12 | ✓ Topics path is valid. 13 | ℹ Load topics. 14 | ✓ Topics successfully loaded. 15 | ℹ Check run, qrels, and topics consistency. 
16 | ⚠ Query IDs of topics file not found in run file: 2 17 | ⚠ Query IDs of topics file not found in qrels file: 2 18 | ⚠ Run, qrels, and topics are inconsistent: 2 warnings 19 | 20 | 21 | #### 22 | files: [] 23 | -------------------------------------------------------------------------------- /ir-measures/tests/approved_files/test_with_approvals.test_query_ids_inconsistent_topics_run.approved.txt: -------------------------------------------------------------------------------- 1 | ℹ Check run path: test-input/run_sample_valid.txt 2 | ✓ Run path is valid. 3 | ℹ Check run file format. 4 | ✓ Run file format is valid. 5 | ℹ Load run with ir-measures. 6 | ✓ Run successfully loaded. 7 | ℹ Check qrels path: test-input/qrels_sample_valid.txt 8 | ✓ Qrels path is valid. 9 | ℹ Load qrels with ir-measures. 10 | ✓ Qrels successfully loaded. 11 | ℹ Check topics path: test-input/topics_sample_warning_qid_not_in_run.jsonl 12 | ✓ Topics path is valid. 13 | ℹ Load topics. 14 | ✓ Topics successfully loaded. 15 | ℹ Check run, qrels, and topics consistency. 16 | ⚠ Query IDs of topics file not found in run file: 2 17 | ⚠ Query IDs of topics file not found in qrels file: 2 18 | ⚠ Run, qrels, and topics are inconsistent: 2 warnings 19 | 20 | 21 | #### 22 | files: [] 23 | -------------------------------------------------------------------------------- /ir-measures/tests/approved_files/test_with_approvals.test_run_document_id_special_chars.approved.txt: -------------------------------------------------------------------------------- 1 | ℹ Check run path: test-input/run_sample_warning_docid_special_chars.txt 2 | ✓ Run path is valid. 3 | ℹ Check run file format. 4 | ⚠ Document IDs with special characters at lines: 2 5 | ⚠ Run file format is valid: 1 warnings 6 | ℹ Load run with ir-measures. 7 | ✓ Run successfully loaded. 8 | 9 | 10 | #### 11 | files: [] 12 | -------------------------------------------------------------------------------- /ir-measures/tests/approved_files/test_with_approvals.test_run_fewer_columns.approved.txt: -------------------------------------------------------------------------------- 1 | ℹ Check output path: /tmp 2 | ⚠ Output directory is not empty. 3 | ✓ Output path is valid: 1 warning 4 | ✗ Exporting metrics without qrels and topics files is not allowed. 5 | 6 | 7 | #### 8 | files: [] 9 | -------------------------------------------------------------------------------- /ir-measures/tests/approved_files/test_with_approvals.test_run_file_empty.approved.txt: -------------------------------------------------------------------------------- 1 | ℹ Check output path: . 2 | ⚠ Output directory is not empty. 3 | ✓ Output path is valid: 1 warning 4 | ✗ Exporting metrics without qrels and topics files is not allowed. 5 | 6 | 7 | #### 8 | files: [] 9 | -------------------------------------------------------------------------------- /ir-measures/tests/approved_files/test_with_approvals.test_run_first_rank_not_zero.approved.txt: -------------------------------------------------------------------------------- 1 | ℹ Check run path: test-input/run_sample_warning_rank_not_start_at_0.txt 2 | ✓ Run path is valid. 3 | ℹ Check run file format. 4 | ⚠ Ranks do not start at 0. 5 | ⚠ Run file format is valid: 1 warnings 6 | ℹ Load run with ir-measures. 7 | ✓ Run successfully loaded. 
8 | 9 | 10 | #### 11 | files: [] 12 | -------------------------------------------------------------------------------- /ir-measures/tests/approved_files/test_with_approvals.test_run_ignored_column_not_default.approved.txt: -------------------------------------------------------------------------------- 1 | ℹ Check run path: test-input/run_sample_warning_ignored_column_wrong.txt 2 | ✓ Run path is valid. 3 | ℹ Check run file format. 4 | ⚠ Ignored column is not "Q0" at lines: 2 5 | ⚠ Run file format is valid: 1 warnings 6 | ℹ Load run with ir-measures. 7 | ✓ Run successfully loaded. 8 | 9 | 10 | #### 11 | files: [] 12 | -------------------------------------------------------------------------------- /ir-measures/tests/approved_files/test_with_approvals.test_run_more_columns.approved.txt: -------------------------------------------------------------------------------- 1 | ℹ Check run path: test-input/run_sample_invalid_more_columns.txt 2 | ✓ Run path is valid. 3 | ℹ Check run file format. 4 | ✗ More than 6 columns at lines: 2 5 | ✗ Run file format is invalid: 1 errors 6 | 7 | 8 | #### 9 | files: [] 10 | -------------------------------------------------------------------------------- /ir-measures/tests/approved_files/test_with_approvals.test_run_multiple_tags.approved.txt: -------------------------------------------------------------------------------- 1 | ℹ Check run path: test-input/run_sample_invalid_multiple_tags.txt 2 | ✓ Run path is valid. 3 | ℹ Check run file format. 4 | ✗ Conflicting run tags at lines: 1≠2, 2≠3 5 | ✗ Run file format is invalid: 2 errors 6 | 7 | 8 | #### 9 | files: [] 10 | -------------------------------------------------------------------------------- /ir-measures/tests/approved_files/test_with_approvals.test_run_path_is_dir.approved.txt: -------------------------------------------------------------------------------- 1 | ℹ Check output path: . 2 | ⚠ Output directory is not empty. 3 | ✓ Output path is valid: 1 warning 4 | ✗ Exporting metrics without qrels and topics files is not allowed. 5 | 6 | 7 | #### 8 | files: [] 9 | -------------------------------------------------------------------------------- /ir-measures/tests/approved_files/test_with_approvals.test_run_path_not_found.approved.txt: -------------------------------------------------------------------------------- 1 | ℹ Check run path: foo 2 | ✗ Run path does not exist. 3 | ✗ Run path is invalid: 1 error 4 | 5 | 6 | #### 7 | files: [] 8 | -------------------------------------------------------------------------------- /ir-measures/tests/approved_files/test_with_approvals.test_run_query_id_not_ascending.approved.txt: -------------------------------------------------------------------------------- 1 | ℹ Check run path: test-input/run_sample_warning_qid_not_asc.txt 2 | ✓ Run path is valid. 3 | ℹ Check run file format. 4 | ⚠ Query IDs not in ascending order at lines: 2>3 5 | ⚠ Run file format is valid: 1 warnings 6 | ℹ Load run with ir-measures. 7 | ✓ Run successfully loaded. 8 | 9 | 10 | #### 11 | files: [] 12 | -------------------------------------------------------------------------------- /ir-measures/tests/approved_files/test_with_approvals.test_run_query_id_special_chars.approved.txt: -------------------------------------------------------------------------------- 1 | ℹ Check run path: test-input/run_sample_warning_qid_special_chars.txt 2 | ✓ Run path is valid. 3 | ℹ Check run file format.
4 | ⚠ Query IDs with special characters at lines: 1, 2, 3, 4, 5 5 | ⚠ Run file format is valid: 5 warnings 6 | ℹ Load run with ir-measures. 7 | ✓ Run successfully loaded. 8 | 9 | 10 | #### 11 | files: [] 12 | -------------------------------------------------------------------------------- /ir-measures/tests/approved_files/test_with_approvals.test_run_rank_not_ascending.approved.txt: -------------------------------------------------------------------------------- 1 | ℹ Check run path: test-input/run_sample_warning_rank_not_asc.txt 2 | ✓ Run path is valid. 3 | ℹ Check run file format. 4 | ⚠ Ranks not consecutive at lines: 3↛4, 4↛5 5 | ⚠ Ranks not in ascending order at lines: 4>5 6 | ⚠ Ranks and scores inconsistent at lines: 4≷5 7 | ⚠ Run file format is valid: 4 warnings 8 | ℹ Load run with ir-measures. 9 | ✓ Run successfully loaded. 10 | 11 | 12 | #### 13 | files: [] 14 | -------------------------------------------------------------------------------- /ir-measures/tests/approved_files/test_with_approvals.test_run_rank_not_consecutive.approved.txt: -------------------------------------------------------------------------------- 1 | ℹ Check run path: test-input/run_sample_warning_rank_not_consecutive.txt 2 | ✓ Run path is valid. 3 | ℹ Check run file format. 4 | ⚠ Ranks not consecutive at lines: 4↛5 5 | ⚠ Run file format is valid: 1 warnings 6 | ℹ Load run with ir-measures. 7 | ✓ Run successfully loaded. 8 | 9 | 10 | #### 11 | files: [] 12 | -------------------------------------------------------------------------------- /ir-measures/tests/approved_files/test_with_approvals.test_run_rank_not_integer.approved.txt: -------------------------------------------------------------------------------- 1 | ℹ Check run path: test-input/run_sample_warning_rank_not_int.txt 2 | ✓ Run path is valid. 3 | ℹ Check run file format. 4 | ⚠ Non-integer ranks at lines: 1, 2, 3, 4, 5 5 | ⚠ Run file format is valid: 5 warnings 6 | ℹ Load run with ir-measures. 7 | ✓ Run successfully loaded. 8 | 9 | 10 | #### 11 | files: [] 12 | -------------------------------------------------------------------------------- /ir-measures/tests/approved_files/test_with_approvals.test_run_rank_not_numeric.approved.txt: -------------------------------------------------------------------------------- 1 | ℹ Check run path: test-input/run_sample_warning_rank_not_num.txt 2 | ✓ Run path is valid. 3 | ℹ Check run file format. 4 | ✗ Non-numeric ranks at lines: 1, 2, 3, 4, 5 5 | ⚠ Non-integer ranks at lines: 1, 2, 3, 4, 5 6 | ✗ Run file format is invalid: 5 errors, 5 warnings 7 | 8 | 9 | #### 10 | files: [] 11 | -------------------------------------------------------------------------------- /ir-measures/tests/approved_files/test_with_approvals.test_run_rank_ties.approved.txt: -------------------------------------------------------------------------------- 1 | ℹ Check run path: test-input/run_sample_warning_rank_ties.txt 2 | ✓ Run path is valid. 3 | ℹ Check run file format. 4 | ⚠ Rank ties at lines: 4=5 5 | ⚠ Ranks not consecutive at lines: 4↛5 6 | ⚠ Run file format is valid: 2 warnings 7 | ℹ Load run with ir-measures. 8 | ✓ Run successfully loaded. 9 | 10 | 11 | #### 12 | files: [] 13 | -------------------------------------------------------------------------------- /ir-measures/tests/approved_files/test_with_approvals.test_run_score_not_descending.approved.txt: -------------------------------------------------------------------------------- 1 | ℹ Check run path: test-input/run_sample_warning_score_not_desc.txt 2 | ✓ Run path is valid. 
3 | ℹ Check run file format. 4 | ⚠ Scores not in descending order at lines: 2<3 5 | ⚠ Ranks and scores inconsistent at lines: 2≷3 6 | ⚠ Run file format is valid: 2 warnings 7 | ℹ Load run with ir-measures. 8 | ✓ Run successfully loaded. 9 | 10 | 11 | #### 12 | files: [] 13 | -------------------------------------------------------------------------------- /ir-measures/tests/approved_files/test_with_approvals.test_run_score_not_numeric.approved.txt: -------------------------------------------------------------------------------- 1 | ℹ Check run path: test-input/run_sample_warning_score_not_num.txt 2 | ✓ Run path is valid. 3 | ℹ Check run file format. 4 | ✗ Non-numeric scores at lines: 5 5 | ✗ Run file format is invalid: 1 errors 6 | 7 | 8 | #### 9 | files: [] 10 | -------------------------------------------------------------------------------- /ir-measures/tests/approved_files/test_with_approvals.test_run_score_rank_inconsistent.approved.txt: -------------------------------------------------------------------------------- 1 | ℹ Check run path: test-input/run_sample_warning_consistency.txt 2 | ✓ Run path is valid. 3 | ℹ Check run file format. 4 | ⚠ Ranks not consecutive at lines: 1↛2, 2↛3, 3↛4 5 | ⚠ Ranks not in ascending order at lines: 2>3 6 | ⚠ Ranks and scores inconsistent at lines: 2≷3, 4≷5 7 | ⚠ Scores not in descending order at lines: 4<5 8 | ⚠ Run file format is valid: 7 warnings 9 | ℹ Load run with ir-measures. 10 | ✓ Run successfully loaded. 11 | 12 | 13 | #### 14 | files: [] 15 | -------------------------------------------------------------------------------- /ir-measures/tests/approved_files/test_with_approvals.test_run_score_scientific_notation.approved.txt: -------------------------------------------------------------------------------- 1 | ℹ Check run path: test-input/run_sample_warning_score_scientific.txt 2 | ✓ Run path is valid. 3 | ℹ Check run file format. 4 | ⚠ Score in scientific notation at lines: 1 5 | ⚠ Run file format is valid: 1 warnings 6 | ℹ Load run with ir-measures. 7 | ✓ Run successfully loaded. 8 | 9 | 10 | #### 11 | files: [] 12 | -------------------------------------------------------------------------------- /ir-measures/tests/approved_files/test_with_approvals.test_run_score_ties.approved.txt: -------------------------------------------------------------------------------- 1 | ℹ Check run path: test-input/run_sample_warning_score_ties.txt 2 | ✓ Run path is valid. 3 | ℹ Check run file format. 4 | ⚠ Score ties at lines: 2=3 5 | ⚠ Run file format is valid: 1 warnings 6 | ℹ Load run with ir-measures. 7 | ✓ Run successfully loaded. 8 | 9 | 10 | #### 11 | files: [] 12 | -------------------------------------------------------------------------------- /ir-measures/tests/approved_files/test_with_approvals.test_run_tag_special_chars.approved.txt: -------------------------------------------------------------------------------- 1 | ℹ Check run path: test-input/run_sample_warning_tag_special_chars.txt 2 | ✓ Run path is valid. 3 | ℹ Check run file format. 4 | ⚠ Run tags with special characters at lines: 1, 2, 3, 4, 5 5 | ⚠ Run file format is valid: 5 warnings 6 | ℹ Load run with ir-measures. 7 | ✓ Run successfully loaded.
8 | 9 | 10 | #### 11 | files: [] 12 | -------------------------------------------------------------------------------- /ir-measures/tests/approved_files/test_with_approvals.test_run_valid.approved.txt: -------------------------------------------------------------------------------- 1 | ℹ Check run path: test-input/run_sample_valid.txt 2 | ✓ Run path is valid. 3 | ℹ Check run file format. 4 | ✓ Run file format is valid. 5 | ℹ Load run with ir-measures. 6 | ✓ Run successfully loaded. 7 | 8 | 9 | #### 10 | files: [] 11 | -------------------------------------------------------------------------------- /ir-measures/tests/approved_files/test_with_approvals.test_topics_file_empty.approved.txt: -------------------------------------------------------------------------------- 1 | ℹ Check qrels path: test-input/qrels_sample_valid.txt 2 | ✓ Qrels path is valid. 3 | ℹ Load qrels with ir-measures. 4 | ✓ Qrels successfully loaded. 5 | ℹ Check topics path: test-input/empty_file.jsonl 6 | ✗ Topics path does not exist. 7 | ✗ Topics path is invalid: 1 error 8 | 9 | 10 | #### 11 | files: [] 12 | -------------------------------------------------------------------------------- /ir-measures/tests/approved_files/test_with_approvals.test_topics_path_is_dir.approved.txt: -------------------------------------------------------------------------------- 1 | ℹ Check qrels path: test-input/qrels_sample_valid.txt 2 | ✓ Qrels path is valid. 3 | ℹ Load qrels with ir-measures. 4 | ✓ Qrels successfully loaded. 5 | ℹ Check topics path: test-input 6 | ✗ Topics path is not a file. 7 | ✗ Topics path is invalid: 1 error 8 | 9 | 10 | #### 11 | files: [] 12 | -------------------------------------------------------------------------------- /ir-measures/tests/approved_files/test_with_approvals.test_topics_path_not_found.approved.txt: -------------------------------------------------------------------------------- 1 | ℹ Check qrels path: test-input/qrels_sample_valid.txt 2 | ✓ Qrels path is valid. 3 | ℹ Load qrels with ir-measures. 4 | ✓ Qrels successfully loaded. 5 | ℹ Check topics path: baz 6 | ✗ Topics path does not exist. 7 | ✗ Topics path is invalid: 1 error 8 | 9 | 10 | #### 11 | files: [] 12 | -------------------------------------------------------------------------------- /ir-measures/tests/approved_files/test_with_approvals.test_topics_valid_no_qrels.approved.txt: -------------------------------------------------------------------------------- 1 | ℹ Check topics path: test-input/topics_sample_valid.jsonl 2 | ✓ Topics path is valid. 3 | ℹ Load topics. 4 | ✓ Topics successfully loaded. 5 | ℹ Check output path: /tmp 6 | ⚠ Output directory is not empty. 7 | ✓ Output path is valid: 1 warning 8 | ✗ Consistency check without qrels file is not allowed. 
9 | 10 | 11 | #### 12 | files: [] 13 | -------------------------------------------------------------------------------- /ir-measures/tests/end-to-end-test/document-components/documents.jsonl: -------------------------------------------------------------------------------- 1 | {"docno":"31", "dummy-annotation": 1} 2 | {"docno":"-1", "dummy-annotation": 1} 3 | {"docno":"184", "dummy-annotation": 1} 4 | -------------------------------------------------------------------------------- /ir-measures/tests/end-to-end-test/output-of-run/run.txt: -------------------------------------------------------------------------------- 1 | 1 Q0 51 1 29.46967923939181 pyterrier.default_pipelines.wmodel_batch_retrieve 2 | 1 Q0 486 2 28.934085921492994 pyterrier.default_pipelines.wmodel_batch_retrieve 3 | 1 Q0 12 3 25.145339558565592 pyterrier.default_pipelines.wmodel_batch_retrieve 4 | 1 Q0 184 4 25.06937671403007 pyterrier.default_pipelines.wmodel_batch_retrieve 5 | 1 Q0 878 5 22.940738273278733 pyterrier.default_pipelines.wmodel_batch_retrieve 6 | 1 Q0 665 6 19.897868683134224 pyterrier.default_pipelines.wmodel_batch_retrieve 7 | 1 Q0 746 7 19.217265854381456 pyterrier.default_pipelines.wmodel_batch_retrieve 8 | 1 Q0 573 8 19.21603156968554 pyterrier.default_pipelines.wmodel_batch_retrieve 9 | 1 Q0 141 9 17.521967719023912 pyterrier.default_pipelines.wmodel_batch_retrieve 10 | 1 Q0 78 10 17.26972824317063 pyterrier.default_pipelines.wmodel_batch_retrieve 11 | 1 Q0 944 11 17.07194664213376 pyterrier.default_pipelines.wmodel_batch_retrieve 12 | 1 Q0 13 12 16.412437985319805 pyterrier.default_pipelines.wmodel_batch_retrieve 13 | 1 Q0 747 13 16.062151020492635 pyterrier.default_pipelines.wmodel_batch_retrieve 14 | 1 Q0 14 14 15.959407819358246 pyterrier.default_pipelines.wmodel_batch_retrieve 15 | 1 Q0 875 15 15.666909988994039 pyterrier.default_pipelines.wmodel_batch_retrieve 16 | 1 Q0 879 16 15.665398405337946 pyterrier.default_pipelines.wmodel_batch_retrieve 17 | 1 Q0 453 17 15.15914484574456 pyterrier.default_pipelines.wmodel_batch_retrieve 18 | 1 Q0 663 18 14.88697537219847 pyterrier.default_pipelines.wmodel_batch_retrieve 19 | 1 Q0 435 19 14.799733366852877 pyterrier.default_pipelines.wmodel_batch_retrieve 20 | 1 Q0 329 20 14.783939663548521 pyterrier.default_pipelines.wmodel_batch_retrieve 21 | 1 Q0 359 21 14.375288383551364 pyterrier.default_pipelines.wmodel_batch_retrieve 22 | 1 Q0 252 22 13.854289116013891 pyterrier.default_pipelines.wmodel_batch_retrieve 23 | 1 Q0 792 23 13.708588567267952 pyterrier.default_pipelines.wmodel_batch_retrieve 24 | 1 Q0 876 24 13.656894966240834 pyterrier.default_pipelines.wmodel_batch_retrieve 25 | 1 Q0 685 25 13.634213293530383 pyterrier.default_pipelines.wmodel_batch_retrieve 26 | 12 Q0 492 1 45.5972876001579 pyterrier.default_pipelines.wmodel_batch_retrieve 27 | 12 Q0 973 2 26.24526223195076 pyterrier.default_pipelines.wmodel_batch_retrieve 28 | 12 Q0 434 3 24.796984094253354 pyterrier.default_pipelines.wmodel_batch_retrieve 29 | 12 Q0 57 4 22.692136402427153 pyterrier.default_pipelines.wmodel_batch_retrieve 30 | 12 Q0 56 5 22.112002114839196 pyterrier.default_pipelines.wmodel_batch_retrieve 31 | 12 Q0 122 6 19.0055243611944 pyterrier.default_pipelines.wmodel_batch_retrieve 32 | 12 Q0 124 7 18.52049113906165 pyterrier.default_pipelines.wmodel_batch_retrieve 33 | 12 Q0 232 8 18.175015176543358 pyterrier.default_pipelines.wmodel_batch_retrieve 34 | 12 Q0 1040 9 17.840188478743716 pyterrier.default_pipelines.wmodel_batch_retrieve 35 | 12 Q0 1381 10 17.55665202098471 
pyterrier.default_pipelines.wmodel_batch_retrieve 36 | 12 Q0 688 11 16.866350648439543 pyterrier.default_pipelines.wmodel_batch_retrieve 37 | 12 Q0 1231 12 16.25691078276826 pyterrier.default_pipelines.wmodel_batch_retrieve 38 | 12 Q0 373 13 16.147336805173225 pyterrier.default_pipelines.wmodel_batch_retrieve 39 | 12 Q0 248 14 15.87052689190861 pyterrier.default_pipelines.wmodel_batch_retrieve 40 | 12 Q0 234 15 15.320303933153959 pyterrier.default_pipelines.wmodel_batch_retrieve 41 | 12 Q0 469 16 14.273803865918213 pyterrier.default_pipelines.wmodel_batch_retrieve 42 | 12 Q0 567 17 14.044124721913432 pyterrier.default_pipelines.wmodel_batch_retrieve 43 | 12 Q0 225 18 13.611762694561683 pyterrier.default_pipelines.wmodel_batch_retrieve 44 | 12 Q0 1307 19 13.485825296462698 pyterrier.default_pipelines.wmodel_batch_retrieve 45 | 12 Q0 48 20 13.344387953598321 pyterrier.default_pipelines.wmodel_batch_retrieve 46 | 12 Q0 1347 21 13.240248173327196 pyterrier.default_pipelines.wmodel_batch_retrieve 47 | 12 Q0 354 22 12.831063710412947 pyterrier.default_pipelines.wmodel_batch_retrieve 48 | 12 Q0 988 23 12.772937842832519 pyterrier.default_pipelines.wmodel_batch_retrieve 49 | 12 Q0 717 24 12.549723018860162 pyterrier.default_pipelines.wmodel_batch_retrieve 50 | 12 Q0 1350 25 12.521372410947912 pyterrier.default_pipelines.wmodel_batch_retrieve 51 | 12 Q0 801 26 12.5128750268523 pyterrier.default_pipelines.wmodel_batch_retrieve 52 | 12 Q0 233 27 12.483865285029266 pyterrier.default_pipelines.wmodel_batch_retrieve 53 | 12 Q0 698 28 12.418208027333428 pyterrier.default_pipelines.wmodel_batch_retrieve 54 | 12 Q0 713 29 11.938272329258727 pyterrier.default_pipelines.wmodel_batch_retrieve 55 | 12 Q0 1006 30 11.917862028238057 pyterrier.default_pipelines.wmodel_batch_retrieve 56 | 12 Q0 58 31 11.90767934389369 pyterrier.default_pipelines.wmodel_batch_retrieve 57 | 12 Q0 947 32 11.837403960204785 pyterrier.default_pipelines.wmodel_batch_retrieve 58 | 12 Q0 1077 33 11.808355778119276 pyterrier.default_pipelines.wmodel_batch_retrieve 59 | 12 Q0 32 34 11.779450593849033 pyterrier.default_pipelines.wmodel_batch_retrieve 60 | 12 Q0 638 35 11.629506111201582 pyterrier.default_pipelines.wmodel_batch_retrieve 61 | 12 Q0 443 36 11.546388989070941 pyterrier.default_pipelines.wmodel_batch_retrieve 62 | 12 Q0 197 37 11.476786751423191 pyterrier.default_pipelines.wmodel_batch_retrieve 63 | 12 Q0 1114 38 11.366064512876534 pyterrier.default_pipelines.wmodel_batch_retrieve 64 | 167 Q0 895 1 28.327967773014592 pyterrier.default_pipelines.wmodel_batch_retrieve 65 | 167 Q0 712 2 19.412358996707027 pyterrier.default_pipelines.wmodel_batch_retrieve 66 | 167 Q0 919 3 16.887259476504976 pyterrier.default_pipelines.wmodel_batch_retrieve 67 | 167 Q0 1290 4 16.245347517134597 pyterrier.default_pipelines.wmodel_batch_retrieve 68 | 167 Q0 1266 5 15.957075401520786 pyterrier.default_pipelines.wmodel_batch_retrieve 69 | 167 Q0 704 6 15.767690314126524 pyterrier.default_pipelines.wmodel_batch_retrieve 70 | 167 Q0 918 7 15.517127803048707 pyterrier.default_pipelines.wmodel_batch_retrieve 71 | 167 Q0 917 8 15.371565310035024 pyterrier.default_pipelines.wmodel_batch_retrieve 72 | 167 Q0 1333 9 14.8907597434809 pyterrier.default_pipelines.wmodel_batch_retrieve 73 | 167 Q0 780 10 14.68654800461253 pyterrier.default_pipelines.wmodel_batch_retrieve 74 | 167 Q0 685 11 14.252924226229574 pyterrier.default_pipelines.wmodel_batch_retrieve 75 | 167 Q0 757 12 13.970313575661853 pyterrier.default_pipelines.wmodel_batch_retrieve 76 | 167 Q0 676 13 
13.617497907287422 pyterrier.default_pipelines.wmodel_batch_retrieve 77 | 167 Q0 783 14 13.405899703811238 pyterrier.default_pipelines.wmodel_batch_retrieve 78 | 167 Q0 1289 15 13.358544787547432 pyterrier.default_pipelines.wmodel_batch_retrieve 79 | 167 Q0 609 16 13.159414210028041 pyterrier.default_pipelines.wmodel_batch_retrieve 80 | 167 Q0 916 17 12.359439595764991 pyterrier.default_pipelines.wmodel_batch_retrieve 81 | 167 Q0 465 18 12.15977064445475 pyterrier.default_pipelines.wmodel_batch_retrieve 82 | 167 Q0 1246 19 11.99997745040418 pyterrier.default_pipelines.wmodel_batch_retrieve 83 | -------------------------------------------------------------------------------- /ir-measures/tests/end-to-end-test/query-components/queries.jsonl: -------------------------------------------------------------------------------- 1 | {"qid": 1, "dummy-annotation": 1} 2 | -------------------------------------------------------------------------------- /ir-measures/tests/end-to-end-test/truth-data/metadata.json: -------------------------------------------------------------------------------- 1 | {"ir_datasets_id": "cranfield"} 2 | -------------------------------------------------------------------------------- /ir-measures/tests/end-to-end-test/truth-data/qrels.txt: -------------------------------------------------------------------------------- 1 | 1 0 184 2 2 | 1 0 29 2 3 | 1 0 31 2 4 | 1 0 12 3 5 | 1 0 51 3 6 | 1 0 102 3 7 | 1 0 13 4 8 | 1 0 14 4 9 | 1 0 15 4 10 | 1 0 57 2 11 | 1 0 378 2 12 | 1 0 859 2 13 | 1 0 185 3 14 | 1 0 30 3 15 | 1 0 37 3 16 | 1 0 52 4 17 | 1 0 142 4 18 | 1 0 195 4 19 | 1 0 875 2 20 | 1 0 56 3 21 | 1 0 66 3 22 | 1 0 95 3 23 | 1 0 462 4 24 | 1 0 497 3 25 | 1 0 858 3 26 | 1 0 876 3 27 | 1 0 879 3 28 | 1 0 880 3 29 | 1 0 486 -1 30 | 12 0 86 2 31 | 12 0 194 2 32 | 12 0 650 2 33 | 12 0 649 4 34 | 12 0 652 2 35 | 12 0 624 -1 36 | 167 0 274 2 37 | 167 0 82 3 38 | 167 0 509 -1 39 | -------------------------------------------------------------------------------- /ir-measures/tests/end-to-end-test/truth-data/queries.jsonl: -------------------------------------------------------------------------------- 1 | {"qid": "1", "query": "what similarity laws must be obeyed when constructing aeroelastic models\nof heated high speed aircraft .", "original_query": {"query_id": "1", "text": "what similarity laws must be obeyed when constructing aeroelastic models\nof heated high speed aircraft ."}} 2 | {"qid": "12", "query": "is it possible to relate the available pressure distributions for an\nogive forebody at zero angle of attack to the lower surface pressures of\nan equivalent ogive forebody at angle of attack .", "original_query": {"query_id": "12", "text": "is it possible to relate the available pressure distributions for an\nogive forebody at zero angle of attack to the lower surface pressures of\nan equivalent ogive forebody at angle of attack ."}} 3 | {"qid": "167", "query": "it is not likely that the airforces on a wing of general planform\noscillating in transonic flow can be determined by purely analytical\nmethods . is it possible to determine the airforces on a single\nparticular planform, such as the rectangular one by such method .", "original_query": {"query_id": "167", "text": "it is not likely that the airforces on a wing of general planform\noscillating in transonic flow can be determined by purely analytical\nmethods . 
is it possible to determine the airforces on a single\nparticular planform, such as the rectangular one by such method ."}} 4 | -------------------------------------------------------------------------------- /ir-measures/tests/test-io/test-input/empty_file.txt: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/tira-io/ir-experiment-platform/9fded863a5fb06332ee73b2a6913d763eed3510d/ir-measures/tests/test-io/test-input/empty_file.txt -------------------------------------------------------------------------------- /ir-measures/tests/test-io/test-input/qrels_sample_valid.txt: -------------------------------------------------------------------------------- 1 | 1 0 1 2 2 | 1 0 2 2 3 | 1 0 3 2 4 | 1 0 4 3 5 | 1 0 5 3 -------------------------------------------------------------------------------- /ir-measures/tests/test-io/test-input/run_sample_invalid_less_columns.txt: -------------------------------------------------------------------------------- 1 | 1 Q0 1 0 3.446771712469712 pyterrier 2 | 1 Q0 2 1 3.352227701368739 3 | 1 Q0 3 2 3.292554298236954 pyterrier 4 | 1 Q0 4 3 3.260319364736074 pyterrier 5 | 1 Q0 5 4 3.251238969336898 pyterrier -------------------------------------------------------------------------------- /ir-measures/tests/test-io/test-input/run_sample_invalid_more_columns.txt: -------------------------------------------------------------------------------- 1 | 1 Q0 1 0 3.446771712469712 pyterrier 2 | 1 Q0 2 1 3.352227701368739 pyterrier 4815162342 3 | 1 Q0 3 2 3.292554298236954 pyterrier 4 | 1 Q0 4 3 3.260319364736074 pyterrier 5 | 1 Q0 5 4 3.251238969336898 pyterrier -------------------------------------------------------------------------------- /ir-measures/tests/test-io/test-input/run_sample_invalid_multiple_tags.txt: -------------------------------------------------------------------------------- 1 | 1 Q0 1 0 3.446771712469712 pyterrier 2 | 1 Q0 2 1 3.352227701368739 pydolphin 3 | 1 Q0 3 2 3.292554298236954 pyterrier 4 | 1 Q0 4 3 3.260319364736074 pyterrier 5 | 1 Q0 5 4 3.251238969336898 pyterrier -------------------------------------------------------------------------------- /ir-measures/tests/test-io/test-input/run_sample_valid.txt: -------------------------------------------------------------------------------- 1 | 1 Q0 1 0 3.446771712469712 pyterrier 2 | 1 Q0 2 1 3.352227701368739 pyterrier 3 | 1 Q0 3 2 3.292554298236954 pyterrier 4 | 1 Q0 4 3 3.260319364736074 pyterrier 5 | 1 Q0 5 4 3.251238969336898 pyterrier -------------------------------------------------------------------------------- /ir-measures/tests/test-io/test-input/run_sample_warning_consistency.txt: -------------------------------------------------------------------------------- 1 | 1 Q0 1 0 3.446771712469712 pyterrier 2 | 1 Q0 2 2 3.352227701368739 pyterrier 3 | 1 Q0 3 1 3.292554298236954 pyterrier 4 | 1 Q0 4 3 3.251238969336898 pyterrier 5 | 1 Q0 5 4 3.260319364736074 pyterrier -------------------------------------------------------------------------------- /ir-measures/tests/test-io/test-input/run_sample_warning_docid_not_in_qrels.txt: -------------------------------------------------------------------------------- 1 | 1 Q0 1 0 3.446771712469712 pyterrier 2 | 1 Q0 2 1 3.352227701368739 pyterrier 3 | 1 Q0 3 2 3.292554298236954 pyterrier 4 | 1 Q0 4 3 3.260319364736074 pyterrier 5 | 1 Q0 9 4 3.251238969336898 pyterrier -------------------------------------------------------------------------------- 
/ir-measures/tests/test-io/test-input/run_sample_warning_docid_special_chars.txt: -------------------------------------------------------------------------------- 1 | 1 Q0 1 0 3.446771712469712 pyterrier 2 | 1 Q0 2% 1 3.352227701368739 pyterrier 3 | 1 Q0 3 2 3.292554298236954 pyterrier 4 | 1 Q0 4 3 3.260319364736074 pyterrier 5 | 1 Q0 5 4 3.251238969336898 pyterrier -------------------------------------------------------------------------------- /ir-measures/tests/test-io/test-input/run_sample_warning_ignored_column_wrong.txt: -------------------------------------------------------------------------------- 1 | 1 Q0 1 0 3.446771712469712 pyterrier 2 | 1 Q1 2 1 3.352227701368739 pyterrier 3 | 1 Q0 3 2 3.292554298236954 pyterrier 4 | 1 Q0 4 3 3.260319364736074 pyterrier 5 | 1 Q0 5 4 3.251238969336898 pyterrier -------------------------------------------------------------------------------- /ir-measures/tests/test-io/test-input/run_sample_warning_qid_not_asc.txt: -------------------------------------------------------------------------------- 1 | 1 Q0 1 0 3.446771712469712 pyterrier 2 | 2 Q0 2 1 3.352227701368739 pyterrier 3 | 1 Q0 3 2 3.292554298236954 pyterrier 4 | 1 Q0 5 3 3.260319364736074 pyterrier 5 | 1 Q0 4 4 3.251238969336898 pyterrier -------------------------------------------------------------------------------- /ir-measures/tests/test-io/test-input/run_sample_warning_qid_not_in_qrels.txt: -------------------------------------------------------------------------------- 1 | 1 Q0 1 0 3.446771712469712 pyterrier 2 | 1 Q0 2 1 3.352227701368739 pyterrier 3 | 1 Q0 3 2 3.292554298236954 pyterrier 4 | 1 Q0 4 3 3.260319364736074 pyterrier 5 | 2 Q0 5 4 3.251238969336898 pyterrier -------------------------------------------------------------------------------- /ir-measures/tests/test-io/test-input/run_sample_warning_qid_special_chars.txt: -------------------------------------------------------------------------------- 1 | 1$ Q0 1 0 3.446771712469712 pyterrier 2 | 1$ Q0 2 1 3.352227701368739 pyterrier 3 | 1$ Q0 3 2 3.292554298236954 pyterrier 4 | 1$ Q0 4 3 3.260319364736074 pyterrier 5 | 1$ Q0 5 4 3.251238969336898 pyterrier -------------------------------------------------------------------------------- /ir-measures/tests/test-io/test-input/run_sample_warning_rank_not_asc.txt: -------------------------------------------------------------------------------- 1 | 1 Q0 1 0 3.446771712469712 pyterrier 2 | 1 Q0 2 1 3.352227701368739 pyterrier 3 | 1 Q0 3 2 3.292554298236954 pyterrier 4 | 1 Q0 4 4 3.260319364736074 pyterrier 5 | 1 Q0 5 3 3.251238969336898 pyterrier -------------------------------------------------------------------------------- /ir-measures/tests/test-io/test-input/run_sample_warning_rank_not_consecutive.txt: -------------------------------------------------------------------------------- 1 | 1 Q0 1 0 3.446771712469712 pyterrier 2 | 1 Q0 2 1 3.352227701368739 pyterrier 3 | 1 Q0 3 2 3.292554298236954 pyterrier 4 | 1 Q0 4 3 3.260319364736074 pyterrier 5 | 1 Q0 5 5 3.251238969336898 pyterrier -------------------------------------------------------------------------------- /ir-measures/tests/test-io/test-input/run_sample_warning_rank_not_int.txt: -------------------------------------------------------------------------------- 1 | 1 Q0 1 0.2 3.446771712469712 pyterrier 2 | 1 Q0 2 1.2 3.352227701368739 pyterrier 3 | 1 Q0 3 2.3 3.292554298236954 pyterrier 4 | 1 Q0 4 3.4 3.260319364736074 pyterrier 5 | 1 Q0 5 4.5 3.251238969336898 pyterrier 
-------------------------------------------------------------------------------- /ir-measures/tests/test-io/test-input/run_sample_warning_rank_not_num.txt: -------------------------------------------------------------------------------- 1 | 1 Q0 1 A 3.446771712469712 pyterrier 2 | 1 Q0 2 B 3.352227701368739 pyterrier 3 | 1 Q0 3 C 3.292554298236954 pyterrier 4 | 1 Q0 4 D 3.260319364736074 pyterrier 5 | 1 Q0 5 E 3.251238969336898 pyterrier -------------------------------------------------------------------------------- /ir-measures/tests/test-io/test-input/run_sample_warning_rank_not_start_at_0.txt: -------------------------------------------------------------------------------- 1 | 1 Q0 1 1 3.446771712469712 pyterrier 2 | 1 Q0 2 2 3.352227701368739 pyterrier 3 | 1 Q0 3 3 3.292554298236954 pyterrier 4 | 1 Q0 4 4 3.260319364736074 pyterrier 5 | 1 Q0 5 5 3.251238969336898 pyterrier -------------------------------------------------------------------------------- /ir-measures/tests/test-io/test-input/run_sample_warning_rank_ties.txt: -------------------------------------------------------------------------------- 1 | 1 Q0 1 0 3.446771712469712 pyterrier 2 | 1 Q0 2 1 3.352227701368739 pyterrier 3 | 1 Q0 3 2 3.292554298236954 pyterrier 4 | 1 Q0 4 3 3.260319364736074 pyterrier 5 | 1 Q0 5 3 3.251238969336898 pyterrier -------------------------------------------------------------------------------- /ir-measures/tests/test-io/test-input/run_sample_warning_score_not_desc.txt: -------------------------------------------------------------------------------- 1 | 1 Q0 1 0 3.446771712469712 pyterrier 2 | 1 Q0 2 1 3.292554298236954 pyterrier 3 | 1 Q0 3 2 3.352227701368739 pyterrier 4 | 1 Q0 4 3 3.260319364736074 pyterrier 5 | 1 Q0 5 4 3.251238969336898 pyterrier -------------------------------------------------------------------------------- /ir-measures/tests/test-io/test-input/run_sample_warning_score_not_num.txt: -------------------------------------------------------------------------------- 1 | 1 Q0 1 0 3.446771712469712 pyterrier 2 | 1 Q0 2 1 3.352227701368739 pyterrier 3 | 1 Q0 3 2 3.292554298236954 pyterrier 4 | 1 Q0 4 3 3.260319364736074 pyterrier 5 | 1 Q0 5 4 3.251238969336898a pyterrier -------------------------------------------------------------------------------- /ir-measures/tests/test-io/test-input/run_sample_warning_score_scientific.txt: -------------------------------------------------------------------------------- 1 | 1 Q0 1 0 3.446771712469712e10 pyterrier 2 | 1 Q0 2 1 3.352227701368739 pyterrier 3 | 1 Q0 3 2 3.292554298236954 pyterrier 4 | 1 Q0 4 3 3.260319364736074 pyterrier 5 | 1 Q0 5 4 3.251238969336898 pyterrier -------------------------------------------------------------------------------- /ir-measures/tests/test-io/test-input/run_sample_warning_score_ties.txt: -------------------------------------------------------------------------------- 1 | 1 Q0 1 0 3.446771712469712 pyterrier 2 | 1 Q0 2 1 3.352227701368739 pyterrier 3 | 1 Q0 3 2 3.352227701368739 pyterrier 4 | 1 Q0 4 3 3.260319364736074 pyterrier 5 | 1 Q0 5 4 3.251238969336898 pyterrier -------------------------------------------------------------------------------- /ir-measures/tests/test-io/test-input/run_sample_warning_tag_special_chars.txt: -------------------------------------------------------------------------------- 1 | 1 Q0 1 0 3.446771712469712 py+terrier 2 | 1 Q0 2 1 3.352227701368739 py+terrier 3 | 1 Q0 3 2 3.292554298236954 py+terrier 4 | 1 Q0 4 3 3.260319364736074 py+terrier 5 | 1 Q0 5 4 3.251238969336898 
py+terrier -------------------------------------------------------------------------------- /ir-measures/tests/test-io/test-input/test-input-cranfield/metadata.json: -------------------------------------------------------------------------------- 1 | {"ir_datasets_id": "cranfield"} 2 | -------------------------------------------------------------------------------- /ir-measures/tests/test-io/test-input/test-input-cranfield/qrels.txt: -------------------------------------------------------------------------------- 1 | 1 0 184 2 2 | 1 0 29 2 3 | 1 0 31 2 4 | 1 0 12 3 5 | 1 0 51 3 6 | 1 0 102 3 7 | 1 0 13 4 8 | 1 0 14 4 9 | 1 0 15 4 10 | 1 0 57 2 11 | -------------------------------------------------------------------------------- /ir-measures/tests/test-io/test-input/test-input-cranfield/queries.jsonl: -------------------------------------------------------------------------------- 1 | {"qid": "1", "query": "what similarity laws must be obeyed when constructing aeroelastic models\nof heated high speed aircraft .", "original_query": {"query_id": "1", "text": "what similarity laws must be obeyed when constructing aeroelastic models\nof heated high speed aircraft ."}} 2 | -------------------------------------------------------------------------------- /ir-measures/tests/test-io/test-input/test-input-cranfield/run.txt: -------------------------------------------------------------------------------- 1 | 1 Q0 486 2 62.79701451212168 test 2 | 1 Q0 13 3 56.803264456568286 test 3 | 1 Q0 184 4 56.51739566097967 test 4 | 1 Q0 663 5 55.39478197135031 test 5 | 1 Q0 12 6 43.839787644246826 test 6 | 1 Q0 746 7 37.43292477354407 test 7 | 1 Q0 876 8 32.51120171695948 test 8 | 1 Q0 359 9 26.834240611060522 test 9 | 1 Q0 573 10 25.4155055815354 test 10 | 1 Q0 102 11 24.4155055815354 test 11 | 1 Q0 57 12 23.4155055815354 test 12 | 1 Q0 51 1 75.0153524139896 test 13 | -------------------------------------------------------------------------------- /ir-measures/tests/test-io/test-input/topics_sample_valid.jsonl: -------------------------------------------------------------------------------- 1 | {"qid": "1", "query": "what similarity laws must be obeyed when constructing aeroelastic models\nof heated high speed aircraft ."} 2 | {"qid": "1", "query": "what are the structural and aeroelastic problems associated with flight\nof high speed aircraft ."} 3 | {"qid": "1", "query": "what problems of heat conduction in composite slabs have been solved so\nfar ."} 4 | {"qid": "1", "query": "can a criterion be developed to show empirically the validity of flow\nsolutions for chemically reacting gas mixtures based on the simplifying\nassumption of instantaneous local chemical equilibrium ."} 5 | {"qid": "1", "query": "what chemical kinetic system is applicable to hypersonic aerodynamic\nproblems ."} -------------------------------------------------------------------------------- /ir-measures/tests/test-io/test-input/topics_sample_warning_qid_not_in_qrels.jsonl: -------------------------------------------------------------------------------- 1 | {"qid": "1", "query": "what similarity laws must be obeyed when constructing aeroelastic models\nof heated high speed aircraft ."} 2 | {"qid": "1", "query": "what are the structural and aeroelastic problems associated with flight\nof high speed aircraft ."} 3 | {"qid": "1", "query": "what problems of heat conduction in composite slabs have been solved so\nfar ."} 4 | {"qid": "1", "query": "can a criterion be developed to show empirically the validity of flow\nsolutions for chemically 
reacting gas mixtures based on the simplifying\nassumption of instantaneous local chemical equilibrium ."} 5 | {"qid": "2", "query": "what chemical kinetic system is applicable to hypersonic aerodynamic\nproblems ."} -------------------------------------------------------------------------------- /ir-measures/tests/test-io/test-input/topics_sample_warning_qid_not_in_run.jsonl: -------------------------------------------------------------------------------- 1 | {"qid": "1", "query": "what similarity laws must be obeyed when constructing aeroelastic models\nof heated high speed aircraft ."} 2 | {"qid": "1", "query": "what are the structural and aeroelastic problems associated with flight\nof high speed aircraft ."} 3 | {"qid": "1", "query": "what problems of heat conduction in composite slabs have been solved so\nfar ."} 4 | {"qid": "1", "query": "can a criterion be developed to show empirically the validity of flow\nsolutions for chemically reacting gas mixtures based on the simplifying\nassumption of instantaneous local chemical equilibrium ."} 5 | {"qid": "2", "query": "what chemical kinetic system is applicable to hypersonic aerodynamic\nproblems ."} -------------------------------------------------------------------------------- /ir-measures/tests/test-io/test-output-not-empty/file.txt: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/tira-io/ir-experiment-platform/9fded863a5fb06332ee73b2a6913d763eed3510d/ir-measures/tests/test-io/test-output-not-empty/file.txt -------------------------------------------------------------------------------- /reproducibility-experiments/README.md: -------------------------------------------------------------------------------- 1 | # Examples of Reproducibility Experiments 2 | 3 | This directory shows what reproducibility experiments on archived shared task repositories of the IR Experiment Platform look like. 4 | It contains notebooks with detailed instructions and examples on how to run post-hoc experiments. 5 | All examples use the archived shared task repository of the 32 IR benchmarks on which we executed more than 50 retrieval approaches: [https://github.com/tira-io/ir-experiment-platform-benchmarks](https://github.com/tira-io/ir-experiment-platform-benchmarks). 6 | 7 | The reproducibility examples use Jupyter notebooks. 8 | To start them, please clone the archived shared task repository: 9 | 10 | ``` 11 | git clone git@github.com:tira-io/ir-experiment-platform-benchmarks.git 12 | ``` 13 | 14 | Inside the cloned repository, you can start a Jupyter notebook that automatically installs a minimal virtual environment: 15 | ``` 16 | make jupyterlab 17 | ``` 18 | 19 | 20 | Executing `make jupyterlab` installs the virtual environment (if not already present) and starts a Jupyter notebook that is ready to run all parts of the tutorial. 21 | 22 | For each software submitted to TIRA, the `tira` PyTerrier integration loads the submitted Docker image and executes it in PyTerrier pipelines (i.e., a first execution can take slightly longer); a sketch of such an experiment follows the list below. 23 | 24 | The following reproducibility notebooks are available: 25 | 26 | - [full-rank-retriever-reproducibility.ipynb](full-rank-retriever-reproducibility.ipynb): showcases how full-rankers can be reproduced/replicated. 27 | - [re-rank-reproducibility.ipynb](re-rank-reproducibility.ipynb): showcases how re-rankers can be reproduced/replicated. 28 | - [interoparability-tutorial.ipynb](interoparability-tutorial.ipynb): showcases how full-rankers and re-rankers submitted in TIRA can be combined in new ways in post-hoc experiments.
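29 | 
30 | The following sketch only illustrates the overall shape of such a post-hoc experiment; the approach name, the dataset, and the `from_submission` helper are illustrative placeholders here, and the notebooks above document the exact API:
31 | 
32 | ```python
33 | import pyterrier as pt
34 | from tira.rest_api_client import Client
35 | 
36 | if not pt.started():
37 |     pt.init()
38 | tira = Client()
39 | 
40 | # Load an archived software as a PyTerrier transformer (placeholder approach
41 | # name); its Docker image is fetched on first use, hence the slower first run.
42 | bm25 = tira.pt.from_submission('ir-benchmarks/tira-ir-starter/BM25 (tira-ir-starter-pyterrier)', 'vaswani')
43 | 
44 | # Evaluate the reproduced run post hoc against the qrels of the benchmark.
45 | dataset = pt.get_dataset('irds:vaswani')
46 | print(pt.Experiment([bm25], dataset.get_topics(), dataset.get_qrels(), eval_metrics=['ndcg_cut_10']))
47 | ```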
48 | -------------------------------------------------------------------------------- /serps/create-serps.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python3 2 | from tira.tirex import IRDS_TO_TIREX_DATASET 3 | import os 4 | os.environ['IR_DATASETS_HOME'] = '/mnt/ceph/tira/state/ir_datasets/' 5 | import ir_datasets 6 | from glob import glob 7 | from tqdm import tqdm 8 | from trectools import TrecRun 9 | from diffir import WeightBuilder 10 | from diffir.dynamic_ir_datasets_loader import GenericDocFromDict 11 | from pathlib import Path 12 | import json 13 | from diffir.run import MainTask 14 | TIREX_DATASET_TO_IRDS = {v:k for k,v in IRDS_TO_TIREX_DATASET.items()} 15 | DATASET_IDS = set(['msmarco-passage-trec-dl-2019-judged-20230107-training', 'msmarco-passage-trec-dl-2020-judged-20230107-training', 'disks45-nocr-trec-robust-2004-20230209-training', 'clueweb12-trec-web-2013-20230107-training', 'clueweb12-trec-web-2014-20230107-training', 'clueweb09-en-trec-web-2009-20230107-training', 'clueweb09-en-trec-web-2010-20230107-training', 'clueweb09-en-trec-web-2011-20230107-training', 'clueweb09-en-trec-web-2012-20230107-training']) 16 | diffir = MainTask(measure='qrel', weight={"weights_1": None, "weights_2": None}) 17 | 18 | def main(dataset_id): 19 | irds_id = TIREX_DATASET_TO_IRDS[dataset_id] 20 | runs = glob(f'/mnt/ceph/tira/data/runs/{dataset_id}/**/**/**/run.txt') 21 | print(dataset_id, ':', irds_id, len(runs)) 22 | dataset = ir_datasets.load(irds_id) 23 | docs_store = dataset.docs_store() 24 | qid_to_query = {str(i.query_id): i for i in dataset.queries_iter()} 25 | qid_to_docs = {} 26 | for run in tqdm(runs): 27 | run = TrecRun(run).run_data 28 | run = run[run['rank'] <= 11] 29 | for _, i in run.iterrows(): 30 | if i['query'] not in qid_to_docs: 31 | qid_to_docs[i['query']] = set() 32 | qid_to_docs[i['query']].add(i['docid']) 33 | 34 | for qrel in dataset.qrels_iter(): 35 | qid_to_docs[qrel.query_id].add(qrel.doc_id) 36 | 37 | for qid in tqdm(qid_to_docs): 38 | snippets = {} 39 | for doc_id in qid_to_docs[qid]: 40 | # from diffir: https://github.com/capreolus-ir/diffir/blob/master/diffir/run.py#L147C32-L147C38 41 | try: 42 | doc = docs_store.get(doc_id) 43 | except Exception: 44 | doc = None  # a missing document yields an empty snippet below 45 | 46 | if not doc: 47 | snippets[doc_id] = {'snippet': '', 'weights': {}} 48 | continue 49 | 50 | doc = GenericDocFromDict({'text': doc.default_text(), 'original_document': {'doc_id': doc.doc_id}}) 51 | 52 | weights = diffir.weight.score_document_regions(qid_to_query[qid], doc, 0) 53 | snippet = diffir.find_snippet(weights, doc) 54 | assert snippet['field'] == 'text' 55 | if snippet['start'] != 0: 56 | snippet['weights'] = [[i[0] + 3, i[1] + 3, i[2]] for i in snippet['weights']] 57 | 58 | text = ('' if snippet['start'] == 0 else '...') + doc.text[snippet['start']: snippet['stop']] + ('' if snippet['stop'] >= (len(doc.text) - 20) else '...') 59 | 60 | snippets[doc_id] = {'snippet': text, 'weights': snippet['weights']} 61 | output_file = Path(f'/mnt/ceph/tira/data/publicly-shared-datasets/tirex-snippets/{dataset_id}') 62 | output_file.mkdir(exist_ok=True) 63 | with open(output_file / qid, 'w') as f: 64 | f.write(json.dumps(snippets)) 65 | 66 | 67 | if
__name__ == '__main__': 68 | for tirex_dataset in IRDS_TO_TIREX_DATASET.values(): 69 | if tirex_dataset not in DATASET_IDS: 70 | continue 71 | main(tirex_dataset) 72 | -------------------------------------------------------------------------------- /tira-ir-starters/.gitignore: -------------------------------------------------------------------------------- 1 | .ipynb_checkpoints/ 2 | sample-output/ 3 | tira-output/ 4 | -------------------------------------------------------------------------------- /tira-ir-starters/Makefile: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | build: 5 | docker build -t webis/tira-ir-baselines-beir:0.0.1-base -f beir/Dockerfile.base . \ 6 | && docker build --build-arg T5_TOKENIZER=t5-base --build-arg T5_MODEL=castorini/duot5-base-msmarco -t webis/tira-ir-baselines-duo-t5-preferences:0.0.1-castorini-duot5-base-msmarco -f duo-t5/Dockerfile . \ 7 | && docker build --build-arg DRES_MODEL=sentence-transformers/msmarco-roberta-base-ance-firstp -t webis/tira-ir-baselines-beir:0.0.1-msmarco-roberta-base-ance-firstp -f beir/Dockerfile.dres . \ 8 | && docker build --build-arg DRES_MODEL=sentence-transformers/msmarco-distilbert-base-tas-b -t webis/tira-ir-baselines-beir:0.0.1-msmarco-distilbert-base-tas-b -f beir/Dockerfile.dres . \ 9 | && docker build --build-arg DRES_MODEL=sentence-transformers/msmarco-distilbert-dot-v5 -t webis/tira-ir-baselines-beir:0.0.1-msmarco-distilbert-dot-v5 -f beir/Dockerfile.dres . \ 10 | && docker build --build-arg DRES_MODEL=sentence-transformers/msmarco-bert-base-dot-v5 -t webis/tira-ir-baselines-beir:0.0.1-msmarco-bert-base-dot-v5 -f beir/Dockerfile.dres . \ 11 | && docker build -t webis/tira-ir-baselines-pygaggle:0.0.1-base -f pygaggle/Dockerfile.base . \ 12 | && docker build --build-arg MODEL_NAME=castorini/monot5-base-msmarco-10k --build-arg TOKENIZER_NAME=t5-base -t webis/tira-ir-baselines-pygaggle:0.0.1-monot5-base-msmarco-10k -f pygaggle/Dockerfile.transformer . \ 13 | && docker build -t webis/tira-ir-baselines-pyterrier:0.0.1-base -f pyterrier/Dockerfile.base . 
\ 14 | 15 | 16 | 17 | 18 | upload: 19 | docker tag webis/tira-ir-baselines-beir:0.0.1-msmarco-roberta-base-ance-firstp registry.webis.de/code-research/tira/tira-user-princess-knight/beir-dres-ance:0.0.1 \ 20 | && docker push registry.webis.de/code-research/tira/tira-user-princess-knight/beir-dres-ance:0.0.1 \ 21 | && docker tag webis/tira-ir-baselines-pygaggle:0.0.1-monot5-base-msmarco-10k registry.webis.de/code-research/tira/tira-user-princess-knight/pygaggle-monot5:0.0.1 \ 22 | && docker push registry.webis.de/code-research/tira/tira-user-princess-knight/pygaggle-monot5:0.0.1 23 | 24 | run: 25 | docker run --rm -ti --cpus=2 -p 8888:8888 -v ${PWD}:/workspace webis/tira-ir-baselines-duo-t5-preferences:0.0.1-castorini-duot5-base-msmarco \ 26 | jupyter-lab --allow-root --ip 0.0.0.0 27 | # docker run --rm -ti -v ${PWD}:/workspace webis/tira-ir-baselines-pygaggle:0.0.1-monot5-base-msmarco-10k \ 28 | # /reranking.py --input_file dummy-tiny-input.jsonl 29 | # docker run --rm -ti -v ${PWD}:/workspace webis/tira-ir-baselines-beir:0.0.1-msmarco-roberta-base-ance-firstp \ 30 | # /reranking.py --input_file dummy-tiny-input.jsonl --score_function dot 31 | 32 | # TODOS BEIR: 33 | #DPR: models.SentenceBERT(("facebook-dpr-question_encoder-multiset-base", "facebook-dpr-ctx_encoder-multiset-base", " [SEP] ") 34 | #UseQA: models.UseQA("https://tfhub.dev/google/universal-sentence-encoder-qa/3") 35 | #binary passage retriever: models.BinarySentenceBERT("msmarco-distilbert-base-tas-b") 36 | # Everything from: https://docs.google.com/spreadsheets/d/1L8aACyPaXrL8iEelJLGqlMqXKPX2oSP_R10pZoy77Ns/edit#gid=0 37 | 38 | # TODOS PyGaggle: 39 | 40 | # DuoT5: 'castorini/duot5-base-msmarco' + 't5-base' 41 | # MonoBERT: 'castorini/monobert-large-msmarco' + 'bert-large-uncased' 42 | 43 | 44 | 45 | -------------------------------------------------------------------------------- /tira-ir-starters/README.md: -------------------------------------------------------------------------------- 1 | # TIRA IR Starters 2 | 3 | We provide starters for 4 frequently used IR research frameworks that can be used as a basis for software submissions to the Information Retrieval Experiment Platform. The simplest starter implements BM25 retrieval using a few lines of declarative PyTerrier code in a Jupyter notebook. 4 | 5 | Retrieval systems submitted to the IR Experiment Platform have to be implemented in fully self-contained Docker images, i.e., the software must be able to run without an internet connection to improve reproducibility (e.g., preventing cases where an external dependency or API is no longer available in a few years). 6 | 7 | Our existing starters can be directly submitted to TIRA, as all of them have been extensively tested on 32 benchmarks in TIRA, and they might also serve as a starting point for custom development. 8 | 9 | ## Local Development 10 | 11 | Please use the `tira-run` command (it can be installed via `pip3 install tira`) to test that your retrieval approach is correctly installed inside the Docker image. 12 | Each tira-ir-starter comes with a dedicated `tira-run` example that shows how you can test your Docker image locally. 13 | 
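14 | For example, one of the BEIR starters can be tested on its sample input roughly as follows (an illustrative sketch; the README of each starter documents the exact `tira-run` invocation for its images):
15 | 
16 | ```
17 | tira-run \
18 | 	--input-directory ${PWD}/beir/sample-input-full-rank \
19 | 	--output-directory ${PWD}/tira-output \
20 | 	--image webis/tira-ir-baselines-beir:0.0.1-msmarco-roberta-base-ance-firstp \
21 | 	--command '/full_ranking.py --input $inputDataset --output $outputDir --score_function cos_sim'
22 | ```
23 | 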
24 | ## Available Starters 25 | 26 | The following starters are available: 27 | 28 | - [Dense Retrieval starters from BEIR](beir): 17 starters for modern bi-encoder approaches. 29 | - [ChatNoir](chatnoir): BM25F retrieval via a REST API from huge corpora, such as the ClueWeb09, the ClueWeb12, or the ClueWeb22. 30 | - [PyGaggle](pygaggle): 8 starters for cross-encoder models such as monoBERT or monoT5. 31 | - [PyTerrier](pyterrier): 20 starters for lexical models such as BM25 or PL2. 32 | - [DuoT5@PyTerrier](pyterrier-duot5): 3 starters using the DuoT5 approach implemented in the PyTerrier Plugin for DuoT5. 33 | - [ColBERT@PyTerrier](pyterrier-colbert): Implementation of ColBERT in the PyTerrier Plugin. 34 | 35 | 36 | ## Adding your Retrieval Software 37 | 38 | To import your retrieval approach to TIRA, please first upload your image (you can use one of the [available starters](#available-starters)) using your dedicated credentials. You can find a personalized tutorial on how to upload your image after you have clicked on "Docker Submission" -> "Upload Images": 39 | 40 | ![personalized-credentials](https://user-images.githubusercontent.com/10050886/221603400-4b0381f0-e743-4876-a455-e45792512e34.png) 41 | 42 | For instance, if you have an image `registry.webis.de/code-research/tira/tira-user-<your-team-name>/my-software:0.0.1`, you can upload it as described in your personalized tutorial via: 43 | 44 | ``` 45 | docker login -u tira-user-<your-team-name> -p <your-password> registry.webis.de 46 | docker push registry.webis.de/code-research/tira/tira-user-<your-team-name>/my-software:0.0.1 47 | ``` 48 | 49 | After you have uploaded your image, you can add a new Software by clicking on "Docker Submission" -> "Add Container": 50 | 51 | ![tira-define-job](https://user-images.githubusercontent.com/10050886/221604262-715013c3-843f-4393-9894-e842c4718f7d.png) 52 | 53 | ## Running your Retrieval Software 54 | 55 | After you have added the new software, you can run it on suitable datasets. 56 | 57 | ![software-already-configured](https://user-images.githubusercontent.com/10050886/221604580-3dffecd3-f774-44c7-9103-690f4c04a9b3.png) 58 | 59 | 60 | 61 | 62 | -------------------------------------------------------------------------------- /tira-ir-starters/beir/Dockerfile.base: -------------------------------------------------------------------------------- 1 | FROM pytorch/pytorch:1.12.0-cuda11.3-cudnn8-runtime 2 | 3 | RUN apt-get update \ 4 | && apt-get install -y build-essential \ 5 | && pip install pandas jupyterlab beir 6 | 7 | -------------------------------------------------------------------------------- /tira-ir-starters/beir/Dockerfile.dres: -------------------------------------------------------------------------------- 1 | FROM webis/tira-ir-baselines-beir:0.0.1-base 2 | 3 | ARG DRES_MODEL=local 4 | ENV DRES_MODEL ${DRES_MODEL} 5 | 6 | RUN pip3 install tira==0.0.9 \ 7 | && python -c "from beir.retrieval import models; from beir.retrieval.search.dense import DenseRetrievalExactSearch as DRES; DRES(models.SentenceBERT('${DRES_MODEL}'));" 8 | 9 | COPY beir/reranking.py /reranking.py 10 | 11 | COPY beir/full_ranking.py /full_ranking.py 12 | 13 | -------------------------------------------------------------------------------- /tira-ir-starters/beir/Dockerfile.sbert: -------------------------------------------------------------------------------- 1 | FROM webis/tira-ir-baselines-beir:0.0.1-base 2 | 3 | ARG DRES_MODEL=local 4 | ENV DRES_MODEL ${DRES_MODEL} 5 | 6 | RUN pip3 install tira==0.0.9 \ 7 | && python -c "from beir.retrieval import models; from beir.retrieval.search.dense import DenseRetrievalExactSearch as DRES; DRES(models.SentenceBERT('${DRES_MODEL}'));" 8 | 9 | COPY beir/reranking.py /reranking.py 10 | 11 | -------------------------------------------------------------------------------- /tira-ir-starters/beir/full_ranking.py:
-------------------------------------------------------------------------------- 1 | #!/usr/bin/env python3 2 | import os 3 | import argparse 4 | import pandas as pd 5 | from beir.retrieval import models 6 | from tqdm import tqdm 7 | from tira.third_party_integrations import load_rerank_data, persist_and_normalize_run 8 | from beir.retrieval.search.dense import DenseRetrievalExactSearch as DRES 9 | 10 | 11 | def parse_args(): 12 | parser = argparse.ArgumentParser(prog='Retrieve with DenseRetrievalExactSearch models of BEIR.') 13 | 14 | parser.add_argument('--model', default=os.environ['DRES_MODEL']) 15 | parser.add_argument('--input', required=True) 16 | parser.add_argument('--output', required=True) 17 | parser.add_argument('--score_function', choices=['cos_sim', 'dot'], required=True) 18 | parser.add_argument('--batch_size', default=128) 19 | parser.add_argument('--corpus_chunk_size', default=50000) 20 | 21 | return vars(parser.parse_args()) 22 | 23 | 24 | def rank(df_queries, df_docs, sbert_model, score_function, batch_size, corpus_chunk_size): 25 | print(f'Rank {len(df_docs)} documents for {len(df_queries)} queries.') 26 | model = DRES(sbert_model, batch_size=int(batch_size), corpus_chunk_size=int(corpus_chunk_size)) 27 | 28 | corpus = {i['docno']:{'text': i['text']} for _, i in df_docs.iterrows()} 29 | queries = {i['qid']: i['query'] for _, i in df_queries.iterrows()} 30 | 31 | scores = model.search(corpus=corpus, queries=queries, top_k=1000, score_function=score_function, return_sorted=True) 32 | ret = [] 33 | 34 | for qid in scores: 35 | for doc_id in scores[qid]: 36 | ret += [{'qid': qid, 'Q0': 0, 'docno': doc_id, 'score': scores[qid][doc_id]}] 37 | 38 | return ret 39 | 40 | 41 | def main(model, input, output, score_function, batch_size, corpus_chunk_size): 42 | df_docs = pd.read_json(f'{input}/documents.jsonl', lines=True) 43 | df_queries = pd.read_json(f'{input}/queries.jsonl', lines=True) 44 | sbert_model = models.SentenceBERT(model) 45 | 46 | ret = rank(df_queries, df_docs, sbert_model, score_function, batch_size, corpus_chunk_size) 47 | 48 | persist_and_normalize_run(pd.DataFrame(ret), model + '-' + score_function, output) 49 | 50 | 51 | if __name__ == '__main__': 52 | args = parse_args() 53 | main(**args) 54 | 55 | -------------------------------------------------------------------------------- /tira-ir-starters/beir/reranking.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python3 2 | import os 3 | import argparse 4 | import pandas as pd 5 | from beir.retrieval import models 6 | from tqdm import tqdm 7 | from tira.third_party_integrations import load_rerank_data, persist_and_normalize_run 8 | from beir.retrieval.search.dense import DenseRetrievalExactSearch as DRES 9 | 10 | 11 | def parse_args(): 12 | parser = argparse.ArgumentParser(prog='Re-rank with DenseRetrievalExactSearch models of BEIR.') 13 | 14 | parser.add_argument('--model', default=os.environ['DRES_MODEL']) 15 | parser.add_argument('--input', required=True) 16 | parser.add_argument('--output', required=True) 17 | parser.add_argument('--score_function', choices=['cos_sim', 'dot'], required=True) 18 | parser.add_argument('--batch_size', default=128) 19 | parser.add_argument('--corpus_chunk_size', default=50000) 20 | 21 | return vars(parser.parse_args()) 22 | 23 | 24 | def rerank(qid, query, df_docs, sbert_model, score_function, batch_size, corpus_chunk_size): 25 | print(f'Rerank for query "{query}" (qid={qid}).') 26 | model = DRES(sbert_model, 
batch_size=int(batch_size), corpus_chunk_size=int(corpus_chunk_size)) 27 | 28 | corpus = {i['docno']:{'text': i['text']} for _, i in df_docs.iterrows()} 29 | queries = {qid: query, str(qid) + '-duplicate': query} 30 | 31 | scores = model.search(corpus=corpus, queries=queries, top_k=2*len(corpus), score_function=score_function, return_sorted=True)[qid] 32 | ret = [] 33 | 34 | for _, i in df_docs.iterrows(): 35 | ret += [{'qid': qid, 'Q0': 0, 'docno': i['docno'], 'score': scores.get(i['docno'], 0)}] 36 | 37 | return ret 38 | 39 | 40 | def main(model, input, output, score_function, batch_size, corpus_chunk_size): 41 | df = load_rerank_data(input) 42 | sbert_model = models.SentenceBERT(model) 43 | qids = sorted(list(df['qid'].unique())) 44 | df_ret = [] 45 | 46 | for qid in tqdm(qids): 47 | df_qid = df[df['qid'] == qid] 48 | query = df_qid.iloc[0].to_dict()['query'] 49 | 50 | df_ret += rerank(qid, query, df_qid[['docno', 'text']], sbert_model, score_function, batch_size, corpus_chunk_size) 51 | 52 | persist_and_normalize_run(pd.DataFrame(df_ret), model + '-' + score_function, output) 53 | 54 | 55 | if __name__ == '__main__': 56 | args = parse_args() 57 | main(**args) 58 | 59 | -------------------------------------------------------------------------------- /tira-ir-starters/beir/sample-input-full-rank/documents.jsonl: -------------------------------------------------------------------------------- 1 | {"docno": "pangram-01", "text": "How quickly daft jumping zebras vex.", "original_document": {"doc_id": "pangram-01", "text": "How quickly daft jumping zebras vex.", "letters": 30}} 2 | {"docno": "pangram-02", "text": "Quick fox jumps nightly above wizard.", "original_document": {"doc_id": "pangram-02", "text": "Quick fox jumps nightly above wizard.", "letters": 31}} 3 | {"docno": "pangram-03", "text": "The jay, pig, fox, zebra and my wolves quack!", "original_document": {"doc_id": "pangram-03", "text": "The jay, pig, fox, zebra and my wolves quack!", "letters": 33}} 4 | {"docno": "pangram-04", "text": "The quick brown fox jumps over the lazy dog.", "original_document": {"doc_id": "pangram-04", "text": "The quick brown fox jumps over the lazy dog.", "letters": 35}} 5 | {"docno": "pangram-05", "text": "As quirky joke, chefs won\u2019t pay devil magic zebra tax.", "original_document": {"doc_id": "pangram-05", "text": "As quirky joke, chefs won\u2019t pay devil magic zebra tax.", "letters": 42}} 6 | -------------------------------------------------------------------------------- /tira-ir-starters/beir/sample-input-full-rank/metadata.json: -------------------------------------------------------------------------------- 1 | {"ir_datasets_id": "pangrams"} 2 | -------------------------------------------------------------------------------- /tira-ir-starters/beir/sample-input-full-rank/queries.jsonl: -------------------------------------------------------------------------------- 1 | {"qid": "1", "query": "fox jumps above animal", "original_query": {"query_id": "1", "title": "fox jumps above animal", "description": "What pangrams have a fox jumping above some animal?", "narrative": "Relevant pangrams have a fox jumping over an animal (e.g., an dog). 
Pangrams containing a fox that is not jumping or jumps over something that is not an animal are not relevant."}} 2 | {"qid": "2", "query": "multiple animals including a zebra", "original_query": {"query_id": "2", "title": "multiple animals including a zebra", "description": "Which pangrams have multiple animals where one of the animals is a zebra?", "narrative": "Relevant pangrams have at least two animals, one of the animals must be a Zebra. Pangrams containing only a Zebra are not relevant."}} 3 | -------------------------------------------------------------------------------- /tira-ir-starters/beir/sample-input-full-rank/queries.xml: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | fox jumps above animal 5 | 6 | 7 | 8 | 1 9 | 10 | 11 | fox jumps above animal 12 | 13 | 14 | What pangrams have a fox jumping above some animal? 15 | 16 | 17 | Relevant pangrams have a fox jumping over an animal (e.g., an dog). Pangrams containing a fox that is not jumping or jumps over something that is not an animal are not relevant. 18 | 19 | 20 | 21 | 22 | 23 | multiple animals including a zebra 24 | 25 | 26 | 27 | 2 28 | 29 | 30 | multiple animals including a zebra 31 | 32 | 33 | Which pangrams have multiple animals where one of the animals is a zebra? 34 | 35 | 36 | Relevant pangrams have at least two animals, one of the animals must be a Zebra. Pangrams containing only a Zebra are not relevant. 37 | 38 | 39 | 40 | -------------------------------------------------------------------------------- /tira-ir-starters/beir/sample-input/rerank.jsonl.gz: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/tira-io/ir-experiment-platform/9fded863a5fb06332ee73b2a6913d763eed3510d/tira-ir-starters/beir/sample-input/rerank.jsonl.gz -------------------------------------------------------------------------------- /tira-ir-starters/chatnoir/Dockerfile: -------------------------------------------------------------------------------- 1 | FROM webis/tira-ir-starter-pyterrier:0.0.1-base 2 | 3 | RUN pip3 install --no-deps chatnoir-pyterrier==2.0.5 && pip3 install --no-deps chatnoir_api && pip3 install dataclasses_json && pip3 uninstall -y tira && pip3 install tira==0.0.20 4 | 5 | COPY chatnoir/chatnoir_pipelines.py chatnoir/retrieve-with-chatnoir.sh /workspace/ 6 | 7 | -------------------------------------------------------------------------------- /tira-ir-starters/chatnoir/README.md: -------------------------------------------------------------------------------- 1 | # TIRA IR-Starter for ChatNoir 2 | 3 | This directory contains a retrieval system that uses the REST API of ChatNoir to retrieve documents from large corpora such as the ClueWeb09, the ClueWeb12, or the ClueWeb22. 4 | ChatNoir is reachable from within TIRA, and we keep ChatNoir's REST API backwards compatible to ensure reproducibility over the years. 5 | 6 | Overall, this starter (or other versions derived from the starter, maybe with custom query expansion or similar) can serve as a full-rank retriever against huge corpora. 7 | TIRA injects the credentials to be used (i.e., the API key and index) into a configuration file, so that an approach can run in multiple shared tasks. 8 | 9 | ## Local Development 10 | 11 | Please use the `tira-run` command (can be installed via `pip3 install tira`) to test that your retrieval approach is correctly installed inside the Docker image.
12 | For example, you can run the following command inside this directory to retrieve with the tira-ir-starter for ChatNoir on a small example using the first 3 queries of the TREC Web Track 2009: 13 | 14 | ``` 15 | tira-run \ 16 | --input-directory ${PWD}/sample-input \ 17 | --image webis/tira-ir-starter-chatnoir:0.0.2 \ 18 | --command '/workspace/retrieve-with-chatnoir.sh --input $inputDataset --output $outputDir' 19 | ``` 20 | 21 | In the example above, the command `/workspace/retrieve-with-chatnoir.sh --input $inputDataset --output $outputDir` is the command that you would enter in TIRA, and the `--input-directory` flag points to the inputs. 22 | 23 | This creates a run file `tira-output/run.txt`, with content like (`cat tira-output/run.txt | head -3`): 24 | 25 | ``` 26 | 1 Q0 clueweb09-en0044-22-32198 1 1811.187 pyterrier.chatnoir_pipelines.retrieve_by_default_text 27 | 1 Q0 clueweb09-en0059-35-02945 2 1809.0287 pyterrier.chatnoir_pipelines.retrieve_by_default_text 28 | 1 Q0 clueweb09-en0054-92-07350 3 1655.2092 pyterrier.chatnoir_pipelines.retrieve_by_default_text 29 | ``` 30 | 31 | ## Submit the Image to TIRA 32 | 33 | You need a team for your submission. In the following, I use `tira-ir-starter` as the team name; to resubmit the image, please just replace `tira-ir-starter` with your team name. 34 | 35 | First, you have to upload the image: 36 | 37 | ``` 38 | docker pull webis/tira-ir-starter-chatnoir:0.0.2 39 | docker tag webis/tira-ir-starter-chatnoir:0.0.2 registry.webis.de/code-research/tira/tira-user-tira-ir-starter/chatnoir:0.0.2 40 | docker push registry.webis.de/code-research/tira/tira-user-tira-ir-starter/chatnoir:0.0.2 41 | ``` 42 | 43 | After the image is uploaded, you should use the following command in TIRA: 44 | 45 | ``` 46 | /workspace/retrieve-with-chatnoir.sh --input $inputDataset --output $outputDir 47 | ``` 48 | 49 | Please refer to the general tutorial on [how to import your retrieval approach to TIRA](https://github.com/tira-io/ir-experiment-platform/tree/main/tira-ir-starters#adding-your-retrieval-software) and on [how to run your retrieval software in TIRA](https://github.com/tira-io/ir-experiment-platform/tree/main/tira-ir-starters#running-your-retrieval-software) for more detailed instructions on how to submit. 50 | 51 | 52 | 53 | ## Building the image: 54 | 55 | To build the image and to deploy it in TIRA, please run the following commands: 56 | 57 | ``` 58 | docker build -t webis/tira-ir-starter-chatnoir:0.0.2 -f chatnoir/Dockerfile .
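# optional sanity check: before pushing, you can smoke-test the freshly built image with the tira-run example from the "Local Development" section above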
59 | docker push webis/tira-ir-starter-chatnoir:0.0.2 60 | ``` 61 | 62 | 63 | -------------------------------------------------------------------------------- /tira-ir-starters/chatnoir/chatnoir_pipelines.py: -------------------------------------------------------------------------------- 1 | from tira.third_party_integrations import get_preconfigured_chatnoir_client 2 | 3 | def retrieve_by_default_text(index_ref, controls): 4 | return get_preconfigured_chatnoir_client(config_directory = controls['raw_passed_arguments']['input'], features = [], verbose = True, num_results=1000, page_size=1000) 5 | 6 | -------------------------------------------------------------------------------- /tira-ir-starters/chatnoir/retrieve-with-chatnoir.sh: -------------------------------------------------------------------------------- 1 | #!/bin/sh 2 | 3 | /workspace/pyterrier_cli.py --retrieval_pipeline chatnoir_pipelines.retrieve_by_default_text ${@} 4 | 5 | -------------------------------------------------------------------------------- /tira-ir-starters/chatnoir/sample-input/chatnoir-credentials.json: -------------------------------------------------------------------------------- 1 | {"index": "ClueWeb09", "apikey":"21695554-1a2b-4114-b57d-59ffb543cc6d"} 2 | -------------------------------------------------------------------------------- /tira-ir-starters/chatnoir/sample-input/queries.jsonl: -------------------------------------------------------------------------------- 1 | {"qid": "1", "query": "obama family tree", "original_query": {"query_id": "1", "query": "obama family tree", "description": "Find information on President Barack Obama's family\n history, including genealogy, national origins, places and dates of\n birth, etc.\n ", "type": "faceted", "subtopics": [["1", "\n Find the TIME magazine photo essay \"Barack Obama's Family Tree\".\n ", "nav"], ["2", "\n Where did Barack Obama's parents and grandparents come from?\n ", "inf"], ["3", "\n Find biographical information on Barack Obama's mother.\n ", "inf"]]}} 2 | {"qid": "2", "query": "french lick resort and casino", "original_query": {"query_id": "2", "query": "french lick resort and casino", "description": "Find information on French Lick Resort and Casino in\n Indiana.\n ", "type": "faceted", "subtopics": [["1", "\n Find the homepage for French Lick Resort and Casino.\n ", "nav"], ["2", "\n What casinos are located within a day's drive of French Lick\n Resort and Casino?\n ", "inf"], ["3", "\n What jobs are available at French Lick Casino and Resort?\n ", "inf"], ["4", "\n Are there discounted packages for staying at French Lick Resort\n and Casino?\n ", "inf"]]}} 3 | {"qid": "3", "query": "getting organized", "original_query": {"query_id": "3", "query": "getting organized", "description": "Find tips, resources, supplies for getting organized\n and reducing clutter.\n ", "type": "faceted", "subtopics": [["1", "\n Find tips on getting organized, both reducing clutter and managing time.\n ", "inf"], ["2", "\n Take me to the Container Store homepage.\n ", "nav"], ["3", "\n Find catalogs of office supplies for organization and decluttering.\n ", "inf"]]}} 4 | -------------------------------------------------------------------------------- /tira-ir-starters/chatnoir/sample-input/queries.xml: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | obama family tree 5 | 6 | 7 | 8 | 1 9 | 10 | 11 | obama family tree 12 | 13 | 14 | Find information on President Barack Obama's family 15 | history, including 
genealogy, national origins, places and dates of 16 | birth, etc. 17 | 18 | 19 | faceted 20 | 21 | 22 | (TrecSubtopic(number='1', text='\n Find the TIME magazine photo essay "Barack Obama\'s Family Tree".\n ', type='nav'), TrecSubtopic(number='2', text="\n Where did Barack Obama's parents and grandparents come from?\n ", type='inf'), TrecSubtopic(number='3', text="\n Find biographical information on Barack Obama's mother.\n ", type='inf')) 23 | 24 | 25 | 26 | 27 | 28 | french lick resort and casino 29 | 30 | 31 | 32 | 2 33 | 34 | 35 | french lick resort and casino 36 | 37 | 38 | Find information on French Lick Resort and Casino in 39 | Indiana. 40 | 41 | 42 | faceted 43 | 44 | 45 | (TrecSubtopic(number='1', text='\n Find the homepage for French Lick Resort and Casino.\n ', type='nav'), TrecSubtopic(number='2', text="\n What casinos are located within a day's drive of French Lick\n Resort and Casino?\n ", type='inf'), TrecSubtopic(number='3', text='\n What jobs are available at French Lick Casino and Resort?\n ', type='inf'), TrecSubtopic(number='4', text='\n Are there discounted packages for staying at French Lick Resort\n and Casino?\n ', type='inf')) 46 | 47 | 48 | 49 | 50 | 51 | getting organized 52 | 53 | 54 | 55 | 3 56 | 57 | 58 | getting organized 59 | 60 | 61 | Find tips, resources, supplies for getting organized 62 | and reducing clutter. 63 | 64 | 65 | faceted 66 | 67 | 68 | (TrecSubtopic(number='1', text='\n Find tips on getting organized, both reducing clutter and managing time.\n ', type='inf'), TrecSubtopic(number='2', text='\n Take me to the Container Store homepage.\n ', type='nav'), TrecSubtopic(number='3', text='\n Find catalogs of office supplies for organization and decluttering.\n ', type='inf')) 69 | 70 | 71 | 72 | 73 | -------------------------------------------------------------------------------- /tira-ir-starters/duo-t5/Dockerfile: -------------------------------------------------------------------------------- 1 | FROM webis/tira-ir-baselines-pyterrier:0.0.1-base 2 | 3 | ARG T5_TOKENIZER=local 4 | ENV T5_TOKENIZER ${T5_TOKENIZER} 5 | 6 | ARG T5_MODEL=local 7 | ENV T5_MODEL ${T5_MODEL} 8 | 9 | RUN pip install --upgrade git+https://github.com/terrierteam/pyterrier_t5.git \ 10 | && python3 -c "import pyterrier as pt; from pyterrier_t5 import MonoT5ReRanker, DuoT5ReRanker; DuoT5ReRanker(tok_model='${T5_TOKENIZER}', model='${T5_MODEL}');" 11 | 12 | COPY pyterrier/full-rank-pipeline.ipynb pyterrier/retrieval-pipeline.ipynb pyterrier/run-pyterrier-notebook.py /workspace/ 13 | 14 | -------------------------------------------------------------------------------- /tira-ir-starters/duo-t5/README.md: -------------------------------------------------------------------------------- 1 | # TIRA IR-Starter for DuoT5 2 | 3 | This is deprecated, please use the updated version here: [https://github.com/tira-io/ir-experiment-platform/tree/main/tira-ir-starters/pyterrier-duot5](https://github.com/tira-io/ir-experiment-platform/tree/main/tira-ir-starters/pyterrier-duot5) 4 | -------------------------------------------------------------------------------- /tira-ir-starters/duo-t5/Untitled.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "code", 5 | "execution_count": null, 6 | "id": "68b19e01-c24e-466d-b331-50f4d543fef3", 7 | "metadata": {}, 8 | "outputs": [ 9 | { 10 | "name": "stderr", 11 | "output_type": "stream", 12 | "text": [ 13 | "/opt/conda/lib/python3.7/site-packages/tqdm/auto.py:22: TqdmWarning: 
IProgress not found. Please update jupyter and ipywidgets. See https://ipywidgets.readthedocs.io/en/stable/user_install.html\n", 14 | " from .autonotebook import tqdm as notebook_tqdm\n", 15 | "PyTerrier 0.9.1 has loaded Terrier 5.7 (built by craigm on 2022-11-10 18:30) and terrier-helper 0.0.7\n", 16 | "\n", 17 | "No etc/terrier.properties, using terrier.default.properties for bootstrap configuration.\n", 18 | "duoT5: 0%| | 0/3 [00:00 512). Running this sequence through the model will result in indexing errors\n" 19 | ] 20 | } 21 | ], 22 | "source": [ 23 | "import importlib\n", 24 | "duo_t5_preferences = importlib.import_module('duo-t5-preferences')\n", 25 | "\n", 26 | "duo_t5_preferences.main('../sample-input/re-rank-default-text', '/tmp/dasa')" 27 | ] 28 | }, 29 | { 30 | "cell_type": "code", 31 | "execution_count": 8, 32 | "id": "56b63cd2-25ee-4cdc-a800-4339eb2602d8", 33 | "metadata": {}, 34 | "outputs": [ 35 | { 36 | "name": "stdout", 37 | "output_type": "stream", 38 | "text": [ 39 | "rerank.jsonl\n" 40 | ] 41 | } 42 | ], 43 | "source": [ 44 | "!ls ../sample-input/re-rank-default-text\n" 45 | ] 46 | } 47 | ], 48 | "metadata": { 49 | "kernelspec": { 50 | "display_name": "Python 3 (ipykernel)", 51 | "language": "python", 52 | "name": "python3" 53 | }, 54 | "language_info": { 55 | "codemirror_mode": { 56 | "name": "ipython", 57 | "version": 3 58 | }, 59 | "file_extension": ".py", 60 | "mimetype": "text/x-python", 61 | "name": "python", 62 | "nbconvert_exporter": "python", 63 | "pygments_lexer": "ipython3", 64 | "version": "3.7.13" 65 | } 66 | }, 67 | "nbformat": 4, 68 | "nbformat_minor": 5 69 | } 70 | -------------------------------------------------------------------------------- /tira-ir-starters/duo-t5/__pycache__/duo-t5-preferences.cpython-37.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/tira-io/ir-experiment-platform/9fded863a5fb06332ee73b2a6913d763eed3510d/tira-ir-starters/duo-t5/__pycache__/duo-t5-preferences.cpython-37.pyc -------------------------------------------------------------------------------- /tira-ir-starters/duo-t5/duo-t5-preferences.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python3 2 | import argparse 3 | import math 4 | import pandas as pd 5 | import pyterrier as pt 6 | from pyterrier.model import add_ranks 7 | import torch 8 | from torch.nn import functional as F 9 | from transformers import T5Config, T5Tokenizer, T5ForConditionalGeneration 10 | from pyterrier.transformer import TransformerBase 11 | import os 12 | import itertools 13 | from pathlib import Path 14 | 15 | 16 | class DuoT5Preferences(TransformerBase): 17 | def __init__(self, tok_model: str = os.environ['T5_TOKENIZER'], model: str = os.environ['T5_MODEL'], batch_size: int = 4, text_field: str = 'text', verbose=True): 18 | self.verbose = verbose 19 | self.sampler = PairwiseFullSampler() 20 | self.batch_size = batch_size 21 | self.device = torch.device('cuda' if torch.cuda.is_available() else 'cpu') 22 | self.tokenizer = T5Tokenizer.from_pretrained(tok_model, truncation=True, model_max_length=512) 23 | self.model_name = model 24 | self.model = T5ForConditionalGeneration.from_pretrained(model) 25 | self.model.to(self.device) 26 | self.model.eval() 27 | self.text_field = text_field 28 | self.REL = self.tokenizer.encode('true')[0] 29 | self.NREL = self.tokenizer.encode('false')[0] 30 | 31 | def __str__(self): 32 | return f"DuoT5({self.model_name})" 33 | 34 | def 
transform(self, run): 35 | queries, texts = run['query'], run[self.text_field] 36 | pairwise_scores = [] 37 | all_queries = set(queries.unique()) 38 | prompts = self.tokenizer.batch_encode_plus([f'Relevant:' for _ in range(self.batch_size)], return_tensors='pt', 39 | padding='longest') 40 | max_vlen = self.model.config.n_positions - prompts['input_ids'].shape[1] 41 | batches = 0 42 | for batch in self._iter_duo_batches(run): 43 | batches += 1 44 | enc_query = self.tokenizer.batch_encode_plus([f'Query: {q}' for q in batch['query']], return_tensors='pt', 45 | padding='longest') 46 | enc_text0 = self.tokenizer.batch_encode_plus([f'Document0: {t}' for t in batch['text0']], return_tensors='pt',  # duoT5 input format: 'Query: q Document0: d0 Document1: d1 Relevant:' 47 | padding='longest') 48 | enc_text1 = self.tokenizer.batch_encode_plus([f'Document1: {t}' for t in batch['text1']], return_tensors='pt', 49 | padding='longest') 50 | enc = {} 51 | for key in enc_query: 52 | query = enc_query[key][:, :-1] # chop off end of sequence token-- this will be added with the prompt 53 | text0 = enc_text0[key][:, :-1] # chop off end of sequence token-- this will be added with the prompt 54 | text1 = enc_text1[key][:, :-1] # chop off end of sequence token-- this will be added with the prompt 55 | # Do we need to truncate? If so, how many tokens per document? 56 | if query.shape[1] + text0.shape[1] + text1.shape[1] > max_vlen: 57 | tokens_to_truncate = query.shape[1] + text0.shape[1] + text1.shape[1] - max_vlen 58 | tokens_to_truncate_per_doc = math.ceil(tokens_to_truncate / 2) 59 | text0 = text0[:, :-tokens_to_truncate_per_doc] 60 | text1 = text1[:, :-tokens_to_truncate_per_doc] 61 | # Combine the components: 62 | enc[key] = torch.cat([query, text0, text1, prompts[key][:query.shape[0]]], dim=1) 63 | enc['decoder_input_ids'] = torch.full( 64 | (len(batch['ids']), 1), 65 | self.model.config.decoder_start_token_id, 66 | dtype=torch.long 67 | ) 68 | enc = {k: v.to(self.device) for k, v in enc.items()} 69 | with torch.no_grad(): 70 | result = self.model(**enc).logits 71 | result = result[:, 0, (self.REL, self.NREL)] 72 | result = F.log_softmax(result, dim=1)[:, 0].cpu().detach().tolist() 73 | 74 | for (qid, did1, did2), score in zip(batch['ids'], result): 75 | pairwise_scores.append((qid, did1, did2, score)) 76 | 77 | return pd.DataFrame(pairwise_scores, columns=["qid", "id_a", "id_b", "score"]).sort_values(['qid', 'id_a', 'id_b']) 78 | 79 | def _iter_duo_pairs(self, run): 80 | groups = run.groupby('qid') 81 | 82 | if self.verbose: 83 | groups = pt.tqdm(groups, desc='duoT5', unit='queries') 84 | 85 | for qid, group in groups: 86 | for _, (id_a, id_b) in self.sampler(group).iterrows(): 87 | row1 = group[group["docno"] == id_a].iloc[0, :] 88 | row2 = group[group["docno"] == id_b].iloc[0, :] 89 | yield ( 90 | qid, 91 | row1.query, 92 | getattr(row1, self.text_field), 93 | getattr(row2, self.text_field), 94 | row1.docno, 95 | row2.docno 96 | ) 97 | 98 | def _iter_duo_batches(self, run): 99 | batch = {'ids': [], 'query': [], 'text0': [], 'text1': []} 100 | for qid, query, text0, text1, did0, did1 in self._iter_duo_pairs(run): 101 | batch['ids'].append((qid, did0, did1)) 102 | batch['query'].append(query) 103 | batch['text0'].append(text0) 104 | batch['text1'].append(text1) 105 | if len(batch['ids']) == self.batch_size: 106 | yield batch 107 | for v in batch.values(): 108 | v.clear() 109 | if len(batch['ids']) > 0: 110 | yield batch # yield the final, potentially smaller batch (a bare `yield` here would emit None) 111 | 112 | 113 | class PairwiseFullSampler(): 114 | def __init__(self, method="product"): 115 | """ 116 | Constructor 117 | :param method: which full set to produce, can
be "combinations", or "product" 118 | """ 119 | if method == "product": 120 | self.sample_func = lambda x: itertools.product(x, repeat=2) 121 | elif method == "combinations": 122 | self.sample_func = lambda x: itertools.combinations(x, 2) 123 | else: 124 | raise ValueError("method must be 'combinations' or 'product'") 125 | super(PairwiseFullSampler, self).__init__() 126 | 127 | def __call__(self, id_frame: pd.DataFrame) -> pd.DataFrame: 128 | """ 129 | Constructs a full comparison set 130 | :param id_frame: pointwise ranking output for sampling, column "docno" must be present 131 | """ 132 | ids = id_frame.sort_values("score").loc[:, "docno"].values.tolist() 133 | comparisons = list(self.sample_func(ids)) 134 | comparisons = pd.DataFrame(comparisons, columns=["id_a", "id_b"]).sort_values(["id_a", "id_b"]) 135 | comparisons = comparisons[comparisons["id_a"] != comparisons["id_b"]] 136 | return comparisons 137 | 138 | 139 | def parse_args(): 140 | parser = argparse.ArgumentParser(description='') 141 | 142 | parser.add_argument('--input', type=str, help='The directory that contains the input data (this directory is expected to contain a rerank.jsonl file).', required=True) 143 | parser.add_argument('--output', type=str, help='The resulting duo-t5 preferences will be stored in this directory as pairwise-preferences.jsonl.', required=True) 144 | 145 | return parser.parse_args() 146 | 147 | 148 | def main(input_directory, output_directory): 149 | if not pt.started(): 150 | # started pyterrier with this configuration to ensure that no internet connection is required (for reproducibility) 151 | pt.init(version=os.environ['PYTERRIER_VERSION'], helper_version=os.environ['PYTERRIER_HELPER_VERSION'], no_download=True) 152 | 153 | df = pd.read_json(str(Path(input_directory) / 'rerank.jsonl'), lines=True) 154 | 155 | if 'score' not in df: 156 | df['score'] = df.index + 1 157 | 158 | duo_t5_prederence_calculation = DuoT5Preferences(tok_model=os.environ['T5_TOKENIZER'], model=os.environ['T5_MODEL']) 159 | 160 | ret = duo_t5_prederence_calculation(df) 161 | 162 | return ret 163 | 164 | if __name__ == '__main__': 165 | args = parse_args() 166 | main(input_directory = args.input, output_directory=args.output) 167 | 168 | -------------------------------------------------------------------------------- /tira-ir-starters/pygaggle/Dockerfile.base: -------------------------------------------------------------------------------- 1 | FROM pytorch/pytorch:1.12.0-cuda11.3-cudnn8-runtime 2 | 3 | RUN apt-get update \ 4 | && apt-get install -y git openjdk-11-jdk \ 5 | && git clone --recursive https://github.com/castorini/pygaggle.git /pygaggle \ 6 | && ln -s /pygaggle/pygaggle /opt/conda/lib/python3.7/site-packages/pygaggle \ 7 | && cd /pygaggle \ 8 | && pip3 install -r requirements.txt \ 9 | && pip3 uninstall -y markupsafe \ 10 | && pip3 install protobuf==3.20.0 faiss-cpu markupsafe==2.0.1 11 | 12 | -------------------------------------------------------------------------------- /tira-ir-starters/pygaggle/Dockerfile.transformer: -------------------------------------------------------------------------------- 1 | FROM webis/tira-ir-baselines-pygaggle:0.0.1-base 2 | 3 | ARG MODEL_NAME=local 4 | ENV MODEL_NAME ${MODEL_NAME} 5 | 6 | ARG TOKENIZER_NAME=local 7 | ENV TOKENIZER_NAME ${TOKENIZER_NAME} 8 | 9 | RUN pip3 install tira==0.0.8 && \ 10 | python -c "from pygaggle.rerank.transformer import MonoT5, MonoBERT; print(str(MonoT5.get_model('${MODEL_NAME}')) + ' - ' + str(MonoT5.get_tokenizer('${TOKENIZER_NAME}'))) if 
'monot5' in '${MODEL_NAME}'.lower() else print(str(MonoBERT.get_model('${MODEL_NAME}')) + ' - ' + str(MonoBERT.get_tokenizer('${TOKENIZER_NAME}')))" 11 | 12 | COPY pygaggle/reranking.py /reranking.py 13 | 14 | -------------------------------------------------------------------------------- /tira-ir-starters/pygaggle/README.md: -------------------------------------------------------------------------------- 1 | # TIRA IR-Starter for PyGaggle 2 | 3 | This directory contains a retrieval system that uses [PyGaggle](https://github.com/castorini/pygaggle) models like monoBERT or monoT5 for retrieval. 4 | 5 | Overall, this starter (or other versions derived from the starter) can serve as a re-ranking retrieval approach following any previous retrieval stage. 6 | 7 | 8 | ## Local Development 9 | 10 | Please use the `tira-run` command (can be installed via `pip3 install tira`) to test that your retrieval approach is correctly installed inside the Docker image. 11 | For example, you can run the following command inside this directory to re-rank with a PyGaggle re-ranker from our tira-ir-starter on a small example (2 queries from the passage retrieval task of TREC DL 2019): 12 | 13 | ``` 14 | tira-run \ 15 | --input-directory ${PWD}/sample-input \ 16 | --image webis/tira-ir-starter-pygaggle:0.0.1-monot5-base-msmarco-10k \ 17 | --command '/reranking.py --input $inputDataset --output $outputDir' 18 | ``` 19 | 20 | In the example above, the command `/reranking.py --input $inputDataset --output $outputDir` is the command that you would enter in TIRA, and the `--input-directory` flag points to the inputs. 21 | 22 | 23 | This creates a run file `tira-output/run.txt`, with content like (`cat tira-output/run.txt | head -3`): 24 | 25 | ``` 26 | 19335 0 8412684 1 -0.08743388205766678 castorini/monot5-base-msmarco-10k 27 | 19335 0 7267248 2 -0.20035237073898315 castorini/monot5-base-msmarco-10k 28 | 19335 0 527689 3 -0.9691352844238281 castorini/monot5-base-msmarco-10k 29 | ``` 30 | 31 | ## Submit the Image to TIRA 32 | 33 | You need a team for your submission. In the following, I use `tira-ir-starter` as the team name; to resubmit the image, please just replace `tira-ir-starter` with your team name. 34 | 35 | First, you have to upload the image: 36 | 37 | ``` 38 | docker pull webis/tira-ir-starter-pygaggle:0.0.1-monot5-large-msmarco-10k 39 | docker tag webis/tira-ir-starter-pygaggle:0.0.1-monot5-large-msmarco-10k registry.webis.de/code-research/tira/tira-user-tira-ir-starter/pygaggle:0.0.1-monot5-large-msmarco-10k 40 | docker push registry.webis.de/code-research/tira/tira-user-tira-ir-starter/pygaggle:0.0.1-monot5-large-msmarco-10k 41 | ``` 42 | 43 | After the image is uploaded, you should use the following command in TIRA: 44 | 45 | ``` 46 | /reranking.py --input $inputDataset --output $outputDir 47 | ``` 48 | 49 | Please refer to the general tutorial on [how to import your retrieval approach to TIRA](https://github.com/tira-io/ir-experiment-platform/tree/main/tira-ir-starters#adding-your-retrieval-software) and on [how to run your retrieval software in TIRA](https://github.com/tira-io/ir-experiment-platform/tree/main/tira-ir-starters#running-your-retrieval-software) for more detailed instructions on how to submit. 50 | 51 | 52 | 53 | ## Building the images: 54 | 55 | There are many different variants of the image, depending on the included monoT5 or monoBERT model.
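Each variant below differs only in its `--build-arg` values and in the image tag. As a sketch, the generic pattern looks as follows (the placeholders in angle brackets are illustrations to be filled with one of the model/tokenizer pairs listed below):

```
docker build \
  --build-arg MODEL_NAME=<huggingface-model-id> \
  --build-arg TOKENIZER_NAME=<huggingface-tokenizer-id> \
  -t webis/tira-ir-starter-pygaggle:0.0.1-<variant-name> \
  -f pygaggle/Dockerfile.transformer .
```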
56 | All variants are: 57 | 58 | ``` 59 | docker build --build-arg MODEL_NAME=castorini/monot5-base-msmarco-10k --build-arg TOKENIZER_NAME=t5-base -t webis/tira-ir-starter-pygaggle:0.0.1-monot5-base-msmarco-10k -f pygaggle/Dockerfile.transformer . 60 | 61 | docker tag webis/tira-ir-starter-pygaggle:0.0.1-monot5-base-msmarco-10k registry.webis.de/code-research/tira/tira-user-tira-ir-starter/pygaggle:0.0.1-monot5-base-msmarco-10k 62 | ``` 63 | 64 | ``` 65 | docker build --build-arg MODEL_NAME=castorini/monot5-3b-msmarco --build-arg TOKENIZER_NAME=t5-3b -t webis/tira-ir-starter-pygaggle:0.0.1-monot5-3b-msmarco -f pygaggle/Dockerfile.transformer . 66 | 67 | docker tag webis/tira-ir-starter-pygaggle:0.0.1-monot5-3b-msmarco registry.webis.de/code-research/tira/tira-user-tira-ir-starter/pygaggle:0.0.1-monot5-3b-msmarco 68 | ``` 69 | 70 | ``` 71 | docker build --build-arg MODEL_NAME=castorini/monot5-large-msmarco --build-arg TOKENIZER_NAME=t5-large -t webis/tira-ir-starter-pygaggle:0.0.1-monot5-large-msmarco -f pygaggle/Dockerfile.transformer . 72 | 73 | docker tag webis/tira-ir-starter-pygaggle:0.0.1-monot5-large-msmarco registry.webis.de/code-research/tira/tira-user-tira-ir-starter/pygaggle:0.0.1-monot5-large-msmarco 74 | ``` 75 | 76 | ``` 77 | docker build --build-arg MODEL_NAME=castorini/monot5-large-msmarco-10k --build-arg TOKENIZER_NAME=t5-large -t webis/tira-ir-starter-pygaggle:0.0.1-monot5-large-msmarco-10k -f pygaggle/Dockerfile.transformer . 78 | 79 | docker tag webis/tira-ir-starter-pygaggle:0.0.1-monot5-large-msmarco-10k registry.webis.de/code-research/tira/tira-user-tira-ir-starter/pygaggle:0.0.1-monot5-large-msmarco-10k 80 | ``` 81 | 82 | ``` 83 | docker build --build-arg MODEL_NAME=castorini/monot5-base-med-msmarco --build-arg TOKENIZER_NAME=t5-base -t webis/tira-ir-starter-pygaggle:0.0.1-monot5-base-med-msmarco -f pygaggle/Dockerfile.transformer . 84 | 85 | docker tag webis/tira-ir-starter-pygaggle:0.0.1-monot5-base-med-msmarco registry.webis.de/code-research/tira/tira-user-tira-ir-starter/pygaggle:0.0.1-monot5-base-med-msmarco 86 | ``` 87 | 88 | ``` 89 | docker build --build-arg MODEL_NAME=castorini/monot5-small-msmarco-10k --build-arg TOKENIZER_NAME=t5-small -t webis/tira-ir-starter-pygaggle:0.0.1-monot5-small-msmarco-10k -f pygaggle/Dockerfile.transformer . 90 | 91 | docker tag webis/tira-ir-starter-pygaggle:0.0.1-monot5-small-msmarco-10k registry.webis.de/code-research/tira/tira-user-tira-ir-starter/pygaggle:0.0.1-monot5-small-msmarco-10k 92 | ``` 93 | 94 | ``` 95 | docker build --build-arg MODEL_NAME=castorini/monot5-small-msmarco-100k --build-arg TOKENIZER_NAME=t5-small -t webis/tira-ir-starter-pygaggle:0.0.1-monot5-small-msmarco-100k -f pygaggle/Dockerfile.transformer . 96 | 97 | docker tag webis/tira-ir-starter-pygaggle:0.0.1-monot5-small-msmarco-100k registry.webis.de/code-research/tira/tira-user-tira-ir-starter/pygaggle:0.0.1-monot5-small-msmarco-100k 98 | ``` 99 | 100 | ``` 101 | docker build --build-arg MODEL_NAME=castorini/monobert-large-msmarco-finetune-only --build-arg TOKENIZER_NAME=bert-large-uncased -t webis/tira-ir-starter-pygaggle:0.0.1-monobert-large-msmarco-finetune-only -f pygaggle/Dockerfile.transformer . 
102 | 103 | docker tag webis/tira-ir-starter-pygaggle:0.0.1-monobert-large-msmarco-finetune-only registry.webis.de/code-research/tira/tira-user-tira-ir-starter/pygaggle:0.0.1-monobert-large-msmarco-finetune-only 104 | ``` 105 | 106 | 107 | You can test it locally via: 108 | ``` 109 | docker run --rm -ti \ 110 | -v ${PWD}/tmp-out:/output \ 111 | -v ${PWD}/clueweb-rerank:/input:ro \ 112 | --entrypoint /reranking.py \ 113 | webis/tira-ir-starter-pygaggle:0.0.1-monot5-base-msmarco-10k \ 114 | --input /input --output /output 115 | ``` 116 | 117 | ``` 118 | docker build --build-arg MODEL_NAME=castorini/monobert-large-msmarco --build-arg TOKENIZER_NAME=bert-large-uncased -t webis/tira-ir-starter-pygaggle:0.0.1-monobert-large-msmarco -f pygaggle/Dockerfile.transformer . 119 | 120 | docker tag webis/tira-ir-starter-pygaggle:0.0.1-monobert-large-msmarco registry.webis.de/code-research/tira/tira-user-tira-ir-starter/pygaggle:0.0.1-monobert-large-msmarco 121 | ``` 122 | 123 | ``` 124 | docker run --rm -ti \ 125 | -v ${PWD}/tmp-out:/output \ 126 | -v ${PWD}/clueweb-rerank:/input:ro \ 127 | --entrypoint /reranking.py \ 128 | webis/tira-ir-starter-pygaggle:0.0.1-monobert-large-msmarco \ 129 | --input /input --output /output 130 | ``` 131 | -------------------------------------------------------------------------------- /tira-ir-starters/pygaggle/reranking.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python3 2 | import os 3 | import argparse 4 | import pandas as pd 5 | from tqdm import tqdm 6 | from tira.third_party_integrations import load_rerank_data, persist_and_normalize_run 7 | from pygaggle.rerank.base import Query, Text 8 | import importlib 9 | from pygaggle.rerank.transformer import MonoT5, MonoBERT 10 | 11 | 12 | def parse_args(): 13 | parser = argparse.ArgumentParser(prog='Re-rank with pygaggle.') 14 | 15 | parser.add_argument('--model_name', default=os.environ['MODEL_NAME']) 16 | parser.add_argument('--tokenizer_name', default=os.environ['TOKENIZER_NAME']) 17 | parser.add_argument('--input', help='The directory with the input data (i.e., a rerank.jsonl file).', required=True) 18 | parser.add_argument('--output', type=str, help='The output will be stored in this directory.', required=True) 19 | 20 | return parser.parse_args() 21 | 22 | 23 | def rerank(qid, query, df_docs, model): 24 | print(f'Rerank for query "{query}" (qid={qid}).') 25 | 26 | texts = [Text(i['text'], {'docid': i['docno']}, 0) for _, i in df_docs.iterrows()] 27 | 28 | scores = model.rerank(Query(query), texts) 29 | scores = {i.metadata["docid"]: i.score for i in scores} 30 | ret = [] 31 | 32 | for _, i in df_docs.iterrows(): 33 | ret += [{'qid': qid, 'Q0': 0, 'docno': i['docno'], 'score': scores[i['docno']], 'rank': 0}] 34 | 35 | return ret 36 | 37 | 38 | def main(model_name, tokenizer_name, input_directory, output_directory): 39 | df = load_rerank_data(input_directory) 40 | qids = sorted(list(df['qid'].unique())) 41 | df_ret = [] 42 | model = None 43 | 44 | if 'monot5' in model_name.lower(): 45 | model = MonoT5(model=MonoT5.get_model(model_name, local_files_only=True), tokenizer=MonoT5.get_tokenizer(tokenizer_name, local_files_only=True)) 46 | elif 'monobert' in model_name.lower(): 47 | model = MonoBERT(model=MonoBERT.get_model(model_name, local_files_only=True), tokenizer=MonoBERT.get_tokenizer(tokenizer_name, local_files_only=True)) 48 | 49 | for qid in tqdm(qids): 50 | df_qid = df[df['qid'] == qid] 51 | query = df_qid.iloc[0].to_dict()['query']
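# re-rank all candidate documents of this query in one call; rerank() (defined above) returns one row per document with the model's score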
52 | 53 | df_ret += rerank(qid, query, df_qid[['docno', 'text']], model) 54 | 55 | persist_and_normalize_run(pd.DataFrame(df_ret), model_name, output_directory) 56 | 57 | 58 | if __name__ == '__main__': 59 | args = parse_args() 60 | main(args.model_name, args.tokenizer_name, args.input, args.output) 61 | 62 | -------------------------------------------------------------------------------- /tira-ir-starters/pygaggle/sample-input/rerank.jsonl.gz: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/tira-io/ir-experiment-platform/9fded863a5fb06332ee73b2a6913d763eed3510d/tira-ir-starters/pygaggle/sample-input/rerank.jsonl.gz -------------------------------------------------------------------------------- /tira-ir-starters/pyserini/Dockerfile.base: -------------------------------------------------------------------------------- 1 | FROM pytorch/pytorch:1.12.0-cuda11.3-cudnn8-runtime 2 | 3 | RUN apt-get update \ 4 | && apt-get install -y git openjdk-11-jdk build-essential 5 | 6 | RUN pip3 install pyserini pandas jupyterlab runnb 7 | 8 | RUN pip3 install tira==0.0.22 9 | 10 | ENV PYTHONPATH=/workspace 11 | 12 | RUN jupyter trust /workspace/*.ipynb 13 | 14 | RUN pip3 install faiss-cpu 15 | 16 | ADD *.ipynb /workspace/ 17 | 18 | RUN jupyter trust /workspace/*.ipynb 19 | 20 | -------------------------------------------------------------------------------- /tira-ir-starters/pyserini/Makefile: -------------------------------------------------------------------------------- 1 | jupyter: 2 | docker run --rm -ti -p 8888:8888 -v ${HOME}/.tira:/root/.tira -v ${HOME}/.tira:/home/jovyan/.tira -v "${PWD}":/workspace webis/tira-ir-baselines-pyserini:0.0.1-base jupyter notebook --allow-root --ip 0.0.0.0 3 | 4 | -------------------------------------------------------------------------------- /tira-ir-starters/pyserini/full-rank-qld.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "id": "70566b0d", 6 | "metadata": {}, 7 | "source": [ 8 | "# QLD with PySerini" 9 | ] 10 | }, 11 | { 12 | "cell_type": "markdown", 13 | "id": "ba44b2e5", 14 | "metadata": {}, 15 | "source": [ 16 | "### Step 1: Import everything and load variables" 17 | ] 18 | }, 19 | { 20 | "cell_type": "code", 21 | "execution_count": 1, 22 | "id": "056a33fe", 23 | "metadata": {}, 24 | "outputs": [ 25 | { 26 | "name": "stderr", 27 | "output_type": "stream", 28 | "text": [ 29 | "/opt/conda/lib/python3.7/site-packages/tqdm/auto.py:22: TqdmWarning: IProgress not found. Please update jupyter and ipywidgets. 
See https://ipywidgets.readthedocs.io/en/stable/user_install.html\n", 30 | " from .autonotebook import tqdm as notebook_tqdm\n" 31 | ] 32 | }, 33 | { 34 | "name": "stdout", 35 | "output_type": "stream", 36 | "text": [ 37 | "I will use a small hardcoded example located in ./sample-input-full-rank.\n", 38 | "The output directory is /tmp/\n" 39 | ] 40 | } 41 | ], 42 | "source": [ 43 | "from pyserini.search.lucene import LuceneSearcher\n", 44 | "import pandas as pd\n", 45 | "from tira.third_party_integrations import get_input_directory_and_output_directory, persist_and_normalize_run\n", 46 | "import json\n", 47 | "from tqdm import tqdm\n", 48 | "\n", 49 | "input_directory, output_directory = get_input_directory_and_output_directory('./sample-input-full-rank')" 50 | ] 51 | }, 52 | { 53 | "cell_type": "markdown", 54 | "id": "963a9a84", 55 | "metadata": {}, 56 | "source": [ 57 | "### Step 2: Create Index and Searcher" 58 | ] 59 | }, 60 | { 61 | "cell_type": "code", 62 | "execution_count": 2, 63 | "id": "bd993ec8", 64 | "metadata": { 65 | "scrolled": true 66 | }, 67 | "outputs": [ 68 | { 69 | "name": "stderr", 70 | "output_type": "stream", 71 | "text": [ 72 | "5it [00:00, 7092.16it/s]\n" 73 | ] 74 | }, 75 | { 76 | "name": "stdout", 77 | "output_type": "stream", 78 | "text": [ 79 | "WARNING: sun.reflect.Reflection.getCallerClass is not supported. This will impact performance.\n", 80 | "2023-07-21 07:34:53,744 INFO [main] index.IndexCollection (IndexCollection.java:250) - Setting log level to INFO\n", 81 | "2023-07-21 07:34:53,746 INFO [main] index.IndexCollection (IndexCollection.java:253) - Starting indexer...\n", 82 | "2023-07-21 07:34:53,746 INFO [main] index.IndexCollection (IndexCollection.java:254) - ============ Loading Parameters ============\n", 83 | "2023-07-21 07:34:53,747 INFO [main] index.IndexCollection (IndexCollection.java:255) - DocumentCollection path: /tmp/anserini-docs\n", 84 | "2023-07-21 07:34:53,747 INFO [main] index.IndexCollection (IndexCollection.java:256) - CollectionClass: JsonCollection\n", 85 | "2023-07-21 07:34:53,748 INFO [main] index.IndexCollection (IndexCollection.java:257) - Generator: DefaultLuceneDocumentGenerator\n", 86 | "2023-07-21 07:34:53,748 INFO [main] index.IndexCollection (IndexCollection.java:258) - Threads: 1\n", 87 | "2023-07-21 07:34:53,748 INFO [main] index.IndexCollection (IndexCollection.java:259) - Language: en\n", 88 | "2023-07-21 07:34:53,749 INFO [main] index.IndexCollection (IndexCollection.java:260) - Stemmer: porter\n", 89 | "2023-07-21 07:34:53,749 INFO [main] index.IndexCollection (IndexCollection.java:261) - Keep stopwords? false\n", 90 | "2023-07-21 07:34:53,749 INFO [main] index.IndexCollection (IndexCollection.java:262) - Stopwords: null\n", 91 | "2023-07-21 07:34:53,750 INFO [main] index.IndexCollection (IndexCollection.java:263) - Store positions? true\n", 92 | "2023-07-21 07:34:53,750 INFO [main] index.IndexCollection (IndexCollection.java:264) - Store docvectors? true\n", 93 | "2023-07-21 07:34:53,751 INFO [main] index.IndexCollection (IndexCollection.java:265) - Store document \"contents\" field? false\n", 94 | "2023-07-21 07:34:53,751 INFO [main] index.IndexCollection (IndexCollection.java:266) - Store document \"raw\" field? false\n", 95 | "2023-07-21 07:34:53,751 INFO [main] index.IndexCollection (IndexCollection.java:267) - Additional fields to index: []\n", 96 | "2023-07-21 07:34:53,752 INFO [main] index.IndexCollection (IndexCollection.java:268) - Optimize (merge segments)? 
false\n", 97 | "2023-07-21 07:34:53,752 INFO [main] index.IndexCollection (IndexCollection.java:269) - Whitelist: null\n", 98 | "2023-07-21 07:34:53,752 INFO [main] index.IndexCollection (IndexCollection.java:270) - Pretokenized?: false\n", 99 | "2023-07-21 07:34:53,752 INFO [main] index.IndexCollection (IndexCollection.java:271) - Index path: /tmp/index\n", 100 | "2023-07-21 07:34:53,755 INFO [main] index.IndexCollection (IndexCollection.java:309) - ============ Indexing Collection ============\n", 101 | "2023-07-21 07:34:54,047 INFO [main] index.IndexCollection (IndexCollection.java:424) - Thread pool with 1 threads initialized.\n", 102 | "2023-07-21 07:34:54,047 INFO [main] index.IndexCollection (IndexCollection.java:426) - Initializing collection in /tmp/anserini-docs\n", 103 | "2023-07-21 07:34:54,048 INFO [main] index.IndexCollection (IndexCollection.java:435) - 1 file found\n", 104 | "2023-07-21 07:34:54,048 INFO [main] index.IndexCollection (IndexCollection.java:436) - Starting to index...\n", 105 | "2023-07-21 07:34:54,207 DEBUG [pool-2-thread-1] index.IndexCollection$LocalIndexerThread (IndexCollection.java:215) - anserini-docs/part-01.json: 5 docs added.\n", 106 | "2023-07-21 07:34:54,452 INFO [main] index.IndexCollection (IndexCollection.java:492) - Indexing Complete! 5 documents indexed\n", 107 | "2023-07-21 07:34:54,453 INFO [main] index.IndexCollection (IndexCollection.java:493) - ============ Final Counter Values ============\n", 108 | "2023-07-21 07:34:54,453 INFO [main] index.IndexCollection (IndexCollection.java:494) - indexed: 5\n", 109 | "2023-07-21 07:34:54,454 INFO [main] index.IndexCollection (IndexCollection.java:495) - unindexable: 0\n", 110 | "2023-07-21 07:34:54,455 INFO [main] index.IndexCollection (IndexCollection.java:496) - empty: 0\n", 111 | "2023-07-21 07:34:54,455 INFO [main] index.IndexCollection (IndexCollection.java:497) - skipped: 0\n", 112 | "2023-07-21 07:34:54,456 INFO [main] index.IndexCollection (IndexCollection.java:498) - errors: 0\n", 113 | "2023-07-21 07:34:54,468 INFO [main] index.IndexCollection (IndexCollection.java:501) - Total 5 documents indexed in 00:00:00\n" 114 | ] 115 | } 116 | ], 117 | "source": [ 118 | "!mkdir -p /tmp/anserini-docs\n", 119 | "\n", 120 | "with open(f'{input_directory}/documents.jsonl') as documents, open(f'/tmp/anserini-docs/part-01.json', 'w') as ans:\n", 121 | " for doc in tqdm(documents):\n", 122 | " doc = json.loads(doc)\n", 123 | " ans.write(json.dumps({\"id\": doc['docno'], \"contents\": doc['text']}) + '\\n')\n", 124 | "\n", 125 | "!python -m pyserini.index.lucene \\\n", 126 | " --collection JsonCollection \\\n", 127 | " --input /tmp/anserini-docs \\\n", 128 | " --index /tmp/index \\\n", 129 | " --generator DefaultLuceneDocumentGenerator \\\n", 130 | " --threads 1 \\\n", 131 | " --storePositions --storeDocvectors\n", 132 | "\n", 133 | "searcher = LuceneSearcher('/tmp/index')\n", 134 | "searcher.set_qld()" 135 | ] 136 | }, 137 | { 138 | "cell_type": "markdown", 139 | "id": "25653b1a", 140 | "metadata": {}, 141 | "source": [ 142 | "### Step 3: Create Run" 143 | ] 144 | }, 145 | { 146 | "cell_type": "code", 147 | "execution_count": 3, 148 | "id": "7ad73a92", 149 | "metadata": {}, 150 | "outputs": [], 151 | "source": [ 152 | "run = []\n", 153 | "\n", 154 | "with open(f'{input_directory}/queries.jsonl') as queries:\n", 155 | " for query in queries:\n", 156 | " query = json.loads(query)\n", 157 | " for doc in searcher.search(query['query'], 1000):\n", 158 | " run += [{\"qid\": query['qid'], \"score\": doc.score, 
\"docno\": doc.docid}]\n", 159 | "run = pd.DataFrame(run)" 160 | ] 161 | }, 162 | { 163 | "cell_type": "markdown", 164 | "id": "4d828d78", 165 | "metadata": {}, 166 | "source": [ 167 | "### Step 4: Persist Run" 168 | ] 169 | }, 170 | { 171 | "cell_type": "code", 172 | "execution_count": 4, 173 | "id": "6524fc70", 174 | "metadata": {}, 175 | "outputs": [], 176 | "source": [ 177 | "persist_and_normalize_run(run, output_file=output_directory, system_name='QLD', depth=1000)" 178 | ] 179 | } 180 | ], 181 | "metadata": { 182 | "kernelspec": { 183 | "display_name": "Python 3 (ipykernel)", 184 | "language": "python", 185 | "name": "python3" 186 | }, 187 | "language_info": { 188 | "codemirror_mode": { 189 | "name": "ipython", 190 | "version": 3 191 | }, 192 | "file_extension": ".py", 193 | "mimetype": "text/x-python", 194 | "name": "python", 195 | "nbconvert_exporter": "python", 196 | "pygments_lexer": "ipython3", 197 | "version": "3.7.13" 198 | } 199 | }, 200 | "nbformat": 4, 201 | "nbformat_minor": 5 202 | } 203 | -------------------------------------------------------------------------------- /tira-ir-starters/pyserini/sample-input-full-rank/documents.jsonl: -------------------------------------------------------------------------------- 1 | {"docno": "pangram-01", "text": "How quickly daft jumping zebras vex.", "original_document": {"doc_id": "pangram-01", "text": "How quickly daft jumping zebras vex.", "letters": 30}} 2 | {"docno": "pangram-02", "text": "Quick fox jumps nightly above wizard.", "original_document": {"doc_id": "pangram-02", "text": "Quick fox jumps nightly above wizard.", "letters": 31}} 3 | {"docno": "pangram-03", "text": "The jay, pig, fox, zebra and my wolves quack!", "original_document": {"doc_id": "pangram-03", "text": "The jay, pig, fox, zebra and my wolves quack!", "letters": 33}} 4 | {"docno": "pangram-04", "text": "The quick brown fox jumps over the lazy dog.", "original_document": {"doc_id": "pangram-04", "text": "The quick brown fox jumps over the lazy dog.", "letters": 35}} 5 | {"docno": "pangram-05", "text": "As quirky joke, chefs won\u2019t pay devil magic zebra tax.", "original_document": {"doc_id": "pangram-05", "text": "As quirky joke, chefs won\u2019t pay devil magic zebra tax.", "letters": 42}} 6 | -------------------------------------------------------------------------------- /tira-ir-starters/pyserini/sample-input-full-rank/metadata.json: -------------------------------------------------------------------------------- 1 | {"ir_datasets_id": "pangrams"} 2 | -------------------------------------------------------------------------------- /tira-ir-starters/pyserini/sample-input-full-rank/queries.jsonl: -------------------------------------------------------------------------------- 1 | {"qid": "1", "query": "fox jumps above animal", "original_query": {"query_id": "1", "title": "fox jumps above animal", "description": "What pangrams have a fox jumping above some animal?", "narrative": "Relevant pangrams have a fox jumping over an animal (e.g., an dog). Pangrams containing a fox that is not jumping or jumps over something that is not an animal are not relevant."}} 2 | {"qid": "2", "query": "multiple animals including a zebra", "original_query": {"query_id": "2", "title": "multiple animals including a zebra", "description": "Which pangrams have multiple animals where one of the animals is a zebra?", "narrative": "Relevant pangrams have at least two animals, one of the animals must be a Zebra. 
Pangrams containing only a Zebra are not relevant."}} 3 | -------------------------------------------------------------------------------- /tira-ir-starters/pyserini/sample-input-full-rank/queries.xml: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | fox jumps above animal 5 | 6 | 7 | 8 | 1 9 | 10 | 11 | fox jumps above animal 12 | 13 | 14 | What pangrams have a fox jumping above some animal? 15 | 16 | 17 | Relevant pangrams have a fox jumping over an animal (e.g., an dog). Pangrams containing a fox that is not jumping or jumps over something that is not an animal are not relevant. 18 | 19 | 20 | 21 | 22 | 23 | multiple animals including a zebra 24 | 25 | 26 | 27 | 2 28 | 29 | 30 | multiple animals including a zebra 31 | 32 | 33 | Which pangrams have multiple animals where one of the animals is a zebra? 34 | 35 | 36 | Relevant pangrams have at least two animals, one of the animals must be a Zebra. Pangrams containing only a Zebra are not relevant. 37 | 38 | 39 | 40 | -------------------------------------------------------------------------------- /tira-ir-starters/pyserini/sample-input/rerank.jsonl.gz: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/tira-io/ir-experiment-platform/9fded863a5fb06332ee73b2a6913d763eed3510d/tira-ir-starters/pyserini/sample-input/rerank.jsonl.gz -------------------------------------------------------------------------------- /tira-ir-starters/pyterrier-ciff/example-ciff/index.ciff: -------------------------------------------------------------------------------- 1 | a" "(0)9ffffff @BJThis is the first experimental output of (part of) the CommonCrawl in CIFF 2 | above" 3 | and" 4 | as" 5 | brown" 6 | chefs" 7 | daft" 8 | devil" 9 | dog" 10 | fox""" 11 | how" 12 | jay" 13 | joke" 14 | jumping" 15 | jumps"" 16 | lazy" 17 | magic" 18 | my" 19 | nightly" 20 | over" 21 | pay" 22 | pig" 23 | quack" 24 | quick"" 25 | quickly" 26 | quirky" 27 | t" 28 | tax" 29 | the"" 30 | vex" 31 | wizard" 32 | wolves" 33 | won" 34 | zebra"" 35 | zebras" 36 | pangram-01 37 | pangram-02 38 | pangram-03  39 | pangram-04  40 | pangram-05 -------------------------------------------------------------------------------- /tira-ir-starters/pyterrier-ciff/example-input/documents.jsonl: -------------------------------------------------------------------------------- 1 | {"docno": "pangram-01", "text": "How quickly daft jumping zebras vex.", "original_document": {"doc_id": "pangram-01", "text": "How quickly daft jumping zebras vex.", "letters": 30}} 2 | {"docno": "pangram-02", "text": "Quick fox jumps nightly above wizard.", "original_document": {"doc_id": "pangram-02", "text": "Quick fox jumps nightly above wizard.", "letters": 31}} 3 | {"docno": "pangram-03", "text": "The jay, pig, fox, zebra and my wolves quack!", "original_document": {"doc_id": "pangram-03", "text": "The jay, pig, fox, zebra and my wolves quack!", "letters": 33}} 4 | {"docno": "pangram-04", "text": "The quick brown fox jumps over the lazy dog.", "original_document": {"doc_id": "pangram-04", "text": "The quick brown fox jumps over the lazy dog.", "letters": 35}} 5 | {"docno": "pangram-05", "text": "As quirky joke, chefs won\u2019t pay devil magic zebra tax.", "original_document": {"doc_id": "pangram-05", "text": "As quirky joke, chefs won\u2019t pay devil magic zebra tax.", "letters": 42}} 6 | -------------------------------------------------------------------------------- 
/tira-ir-starters/pyterrier-ciff/example-input/metadata.json: -------------------------------------------------------------------------------- 1 | {"ir_datasets_id": "pangrams"} 2 | -------------------------------------------------------------------------------- /tira-ir-starters/pyterrier-ciff/example-input/queries.jsonl: -------------------------------------------------------------------------------- 1 | {"qid": "1", "query": "fox jumps above animal", "original_query": {"query_id": "1", "title": "fox jumps above animal", "description": "What pangrams have a fox jumping above some animal?", "narrative": "Relevant pangrams have a fox jumping over an animal (e.g., an dog). Pangrams containing a fox that is not jumping or jumps over something that is not an animal are not relevant."}} 2 | {"qid": "2", "query": "multiple animals including a zebra", "original_query": {"query_id": "2", "title": "multiple animals including a zebra", "description": "Which pangrams have multiple animals where one of the animals is a zebra?", "narrative": "Relevant pangrams have at least two animals, one of the animals must be a zebra. Pangrams containing only a zebra are not relevant."}} 3 | -------------------------------------------------------------------------------- /tira-ir-starters/pyterrier-ciff/example-input/queries.xml: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | fox jumps above animal 5 | 6 | 7 | 8 | 1 9 | 10 | 11 | fox jumps above animal 12 | 13 | 14 | What pangrams have a fox jumping above some animal? 15 | 16 | 17 | Relevant pangrams have a fox jumping over an animal (e.g., an dog). Pangrams containing a fox that is not jumping or jumps over something that is not an animal are not relevant. 18 | 19 | 20 | 21 | 22 | 23 | multiple animals including a zebra 24 | 25 | 26 | 27 | 2 28 | 29 | 30 | multiple animals including a zebra 31 | 32 | 33 | Which pangrams have multiple animals where one of the animals is a zebra? 34 | 35 | 36 | Relevant pangrams have at least two animals, one of the animals must be a zebra. Pangrams containing only a zebra are not relevant. 
37 | 38 | 39 | 40 | -------------------------------------------------------------------------------- /tira-ir-starters/pyterrier-colbert/Dockerfile: -------------------------------------------------------------------------------- 1 | FROM webis/tira-ir-starter-pyterrier:0.0.1-base 2 | 3 | RUN pip3 install --upgrade git+https://github.com/terrierteam/pyterrier_colbert.git \ 4 | && pip install faiss-gpu==1.6.3 \ 5 | && pip install tira==0.0.9 6 | 7 | ARG MODEL_NAME=local 8 | ENV MODEL_NAME ${MODEL_NAME} 9 | 10 | RUN python3 -c "import pandas as pd; from tira.third_party_integrations import ensure_pyterrier_is_loaded; ensure_pyterrier_is_loaded(); from pyterrier_colbert.ranking import ColBERTFactory; colbert = ColBERTFactory('${MODEL_NAME}', '/tmp/tmp-index', 'index'); print(colbert.text_scorer()(pd.DataFrame([{'qid': '1', 'query': 'foo', 'docno': '1', 'text': 'bar'}])))" 11 | 12 | COPY pyterrier-colbert/reranking.py /reranking.py 13 | 14 | COPY pyterrier-colbert/bm25-colbert.ipynb /workspace/ 15 | 16 | RUN jupyter trust /workspace/*.ipynb 17 | 18 | -------------------------------------------------------------------------------- /tira-ir-starters/pyterrier-colbert/README.md: -------------------------------------------------------------------------------- 1 | # TIRA IR-Starter for ColBERT@PyTerrier 2 | 3 | This directory contains a TIRA starter for the [ColBERT retrieval model implemented in PyTerrier](https://github.com/terrierteam/pyterrier_colbert). 4 | 5 | Overall, this starter (or other versions derived from the starter) can serve as a re-rank retriever and as a full-rank retriever (but we omit the full-rank variant, as the indices become huge). 6 | 7 | ## Local Development 8 | 9 | Please use the `tira-run` command (can be installed via `pip3 install tira`) to test that your retrieval approach is correctly installed inside the Docker image. 10 | **Attention: ColBERT requires a GPU, i.e., ensure that you have installed the nvidia-runtime for Docker.** 11 | For example, you can run the following command inside this directory to re-rank with the ColBERT re-ranker from our tira-ir-starter on a small example (2 queries from the passage retrieval task of TREC DL 2019): 12 | 13 | ``` 14 | tira-run \ 15 | --input-directory ${PWD}/sample-input \ 16 | --image webis/tira-ir-starter-pyterrier-colbert:0.0.1 \ 17 | --command '/reranking.py --input $inputDataset --output $outputDir' 18 | ``` 19 | 20 | In the example above, the command `/reranking.py --input $inputDataset --output $outputDir` is the command that you would enter in TIRA, and the `--input-directory` flag points to the inputs. 21 | 22 | 23 | This creates a run file `tira-output/run.txt`, with content like (`cat sample-output/run.txt |head -3`): 24 | 25 | ``` 26 | 19335 0 7267248 1 20.848087310791016 colbert 27 | 19335 0 8412684 2 20.28887939453125 colbert 28 | 19335 0 527689 3 19.020572662353516 colbert 29 | ``` 30 | 31 | ## Submit the Image to TIRA 32 | 33 | You need a team for your submission. In the following, I use `tira-ir-starter` as the team name; to resubmit the image, please just replace `tira-ir-starter` with your team name.
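Before uploading, it can help to sanity-check the run file that `tira-run` produced locally. The sketch below is not part of the starter (the helper name `check_run` and the `tira-output/run.txt` location are assumptions based on the example above); it only verifies the six-column run format shown there:

```
# Minimal sanity check for a run file in the six-column format shown above
# (qid, iteration, docno, rank, score, tag); assumes tira-output/run.txt.
from collections import defaultdict

def check_run(path='tira-output/run.txt', depth=1000):
    results = defaultdict(list)
    with open(path) as run_file:
        for line in run_file:
            qid, _iteration, docno, rank, score, tag = line.split()
            results[qid].append((int(rank), float(score)))
    for qid, entries in results.items():
        assert len(entries) <= depth, f'too many results for query {qid}'
        scores = [score for _rank, score in sorted(entries)]
        # scores must be non-increasing when the entries are sorted by rank
        assert scores == sorted(scores, reverse=True), f'scores not descending for query {qid}'
    print(f'{path} looks consistent for {len(results)} queries.')

check_run()
```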
34 | 35 | First, you have to upload the image: 36 | 37 | ``` 38 | docker pull webis/tira-ir-starter-pyterrier-colbert:0.0.1 39 | docker tag webis/tira-ir-starter-pyterrier-colbert:0.0.1 registry.webis.de/code-research/tira/tira-user-tira-ir-starter/pyterrier-colbert:0.0.1 40 | docker push registry.webis.de/code-research/tira/tira-user-tira-ir-starter/pyterrier-colbert:0.0.1 41 | ``` 42 | 43 | After the image is uploaded, you should use the following command in TIRA: 44 | 45 | ``` 46 | /reranking.py --input $inputDataset --output $outputDir 47 | ``` 48 | 49 | Please refer to the general tutorial on [how to import your retrieval approach to TIRA](https://github.com/tira-io/ir-experiment-platform/tree/main/tira-ir-starters#adding-your-retrieval-software) and on [how to run your retrieval software in TIRA](https://github.com/tira-io/ir-experiment-platform/tree/main/tira-ir-starters#running-your-retrieval-software) for more detailed instructions on how to submit. 50 | 51 | 52 | 53 | ## Building the image: 54 | 55 | ``` 56 | docker build -t webis/tira-ir-starter-pyterrier-colbert:0.0.2 --build-arg MODEL_NAME=http://www.dcs.gla.ac.uk/~craigm/colbert.dnn.zip -f pyterrier-colbert/Dockerfile . 57 | ``` 58 | 59 | You can test it locally via: 60 | ``` 61 | docker run --rm -ti \ 62 | -v ${PWD}/tmp-out:/output \ 63 | -v ${PWD}/clueweb-rerank:/input:ro \ 64 | --entrypoint /reranking.py \ 65 | webis/tira-ir-starter-pyterrier-colbert:0.0.1 \ 66 | --input $inputDataset --output $outputDir 67 | ``` 68 | -------------------------------------------------------------------------------- /tira-ir-starters/pyterrier-colbert/reranking.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python3 2 | import argparse 3 | import os 4 | 5 | from tira.third_party_integrations import ensure_pyterrier_is_loaded, load_rerank_data, persist_and_normalize_run 6 | 7 | 8 | def parse_args(): 9 | parser = argparse.ArgumentParser(description='') 10 | 11 | parser.add_argument('--input', type=str, help='The directory with the input data (i.e., a queries.jsonl and a documents.jsonl file).', required=True) 12 | parser.add_argument('--model', default=os.environ['MODEL_NAME']) 13 | parser.add_argument('--output', type=str, help='The output will be stored in this directory.', required=True) 14 | 15 | return parser.parse_args() 16 | 17 | 18 | def rerank(model, input_directory, output_directory): 19 | df = load_rerank_data(input_directory) 20 | from pyterrier_colbert.ranking import ColBERTFactory 21 | pytcolbert = ColBERTFactory(model, "/index", "index") 22 | ret = pytcolbert.text_scorer(verbose=True)(df) 23 | 24 | persist_and_normalize_run(ret, 'colbert', output_directory) 25 | 26 | 27 | if __name__ == '__main__': 28 | args = parse_args() 29 | ensure_pyterrier_is_loaded() 30 | 31 | rerank(args.model, args.input, args.output) 32 | 33 | -------------------------------------------------------------------------------- /tira-ir-starters/pyterrier-colbert/sample-input-full-rank/documents.jsonl: -------------------------------------------------------------------------------- 1 | {"docno": "pangram-01", "text": "How quickly daft jumping zebras vex.", "original_document": {"doc_id": "pangram-01", "text": "How quickly daft jumping zebras vex.", "letters": 30}} 2 | {"docno": "pangram-02", "text": "Quick fox jumps nightly above wizard.", "original_document": {"doc_id": "pangram-02", "text": "Quick fox jumps nightly above wizard.", "letters": 31}} 3 | {"docno": "pangram-03", "text": "The jay, pig, fox, 
zebra and my wolves quack!", "original_document": {"doc_id": "pangram-03", "text": "The jay, pig, fox, zebra and my wolves quack!", "letters": 33}} 4 | {"docno": "pangram-04", "text": "The quick brown fox jumps over the lazy dog.", "original_document": {"doc_id": "pangram-04", "text": "The quick brown fox jumps over the lazy dog.", "letters": 35}} 5 | {"docno": "pangram-05", "text": "As quirky joke, chefs won\u2019t pay devil magic zebra tax.", "original_document": {"doc_id": "pangram-05", "text": "As quirky joke, chefs won\u2019t pay devil magic zebra tax.", "letters": 42}} 6 | -------------------------------------------------------------------------------- /tira-ir-starters/pyterrier-colbert/sample-input-full-rank/metadata.json: -------------------------------------------------------------------------------- 1 | {"ir_datasets_id": "pangrams"} 2 | -------------------------------------------------------------------------------- /tira-ir-starters/pyterrier-colbert/sample-input-full-rank/queries.jsonl: -------------------------------------------------------------------------------- 1 | {"qid": "1", "query": "fox jumps above animal", "original_query": {"query_id": "1", "title": "fox jumps above animal", "description": "What pangrams have a fox jumping above some animal?", "narrative": "Relevant pangrams have a fox jumping over an animal (e.g., an dog). Pangrams containing a fox that is not jumping or jumps over something that is not an animal are not relevant."}} 2 | {"qid": "2", "query": "multiple animals including a zebra", "original_query": {"query_id": "2", "title": "multiple animals including a zebra", "description": "Which pangrams have multiple animals where one of the animals is a zebra?", "narrative": "Relevant pangrams have at least two animals, one of the animals must be a Zebra. Pangrams containing only a Zebra are not relevant."}} 3 | -------------------------------------------------------------------------------- /tira-ir-starters/pyterrier-colbert/sample-input-full-rank/queries.xml: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | fox jumps above animal 5 | 6 | 7 | 8 | 1 9 | 10 | 11 | fox jumps above animal 12 | 13 | 14 | What pangrams have a fox jumping above some animal? 15 | 16 | 17 | Relevant pangrams have a fox jumping over an animal (e.g., an dog). Pangrams containing a fox that is not jumping or jumps over something that is not an animal are not relevant. 18 | 19 | 20 | 21 | 22 | 23 | multiple animals including a zebra 24 | 25 | 26 | 27 | 2 28 | 29 | 30 | multiple animals including a zebra 31 | 32 | 33 | Which pangrams have multiple animals where one of the animals is a zebra? 34 | 35 | 36 | Relevant pangrams have at least two animals, one of the animals must be a Zebra. Pangrams containing only a Zebra are not relevant. 
37 | 38 | 39 | 40 | -------------------------------------------------------------------------------- /tira-ir-starters/pyterrier-colbert/sample-input/rerank.jsonl.gz: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/tira-io/ir-experiment-platform/9fded863a5fb06332ee73b2a6913d763eed3510d/tira-ir-starters/pyterrier-colbert/sample-input/rerank.jsonl.gz -------------------------------------------------------------------------------- /tira-ir-starters/pyterrier-duot5/Dockerfile: -------------------------------------------------------------------------------- 1 | FROM webis/tira-ir-starter-pyterrier:0.0.1-base 2 | 3 | RUN pip3 install --upgrade git+https://github.com/terrierteam/pyterrier_t5.git \ 4 | && pip install tira==0.0.9 5 | 6 | ARG MODEL_NAME=local 7 | ENV MODEL_NAME ${MODEL_NAME} 8 | 9 | ARG TOKENIZER_NAME=local 10 | ENV TOKENIZER_NAME ${TOKENIZER_NAME} 11 | 12 | RUN python3 -c "from tira.third_party_integrations import ensure_pyterrier_is_loaded; ensure_pyterrier_is_loaded(); from pyterrier_t5 import DuoT5ReRanker; duo_t5 = DuoT5ReRanker(model='${MODEL_NAME}', tok_model='${TOKENIZER_NAME}');" 13 | 14 | COPY pyterrier-duot5/reranking.py /reranking.py 15 | 16 | -------------------------------------------------------------------------------- /tira-ir-starters/pyterrier-duot5/README.md: -------------------------------------------------------------------------------- 1 | # TIRA IR-Starter for duoT5@PyTerrier 2 | 3 | This directory contains a TIRA starter for the [duoT5 retrieval model implemented in PyTerrier](https://github.com/terrierteam/pyterrier_t5). 4 | 5 | Overall, this starter (or other versions derived from the starter) can serve as a re-rank retriever, and we produce 3 different variants by switching out the embedded duoT5 model. 6 | 7 | 8 | ## Local Development 9 | 10 | Please use the `tira-run` command (can be installed via `pip3 install tira`) to test that your retrieval approach is correctly installed inside the Docker image. 11 | For example, you can run the following command inside this directory to re-rank with a duoT5 re-ranker from our tira-ir-starter on a small example (2 queries from the passage retrieval task of TREC DL 2019): 12 | 13 | ``` 14 | tira-run \ 15 | --input-directory ${PWD}/sample-input \ 16 | --image webis/tira-ir-starter-duot5:0.0.1-duot5-base-msmarco \ 17 | --command '/reranking.py --input $inputDataset --output $outputDir --top_k 3' 18 | ``` 19 | 20 | In the example above, the command `/reranking.py --input $inputDataset --output $outputDir --top_k 3` is the command that you would enter in TIRA, and the `--input-directory` flag points to the inputs. 21 | 22 | 23 | This creates a run file `tira-output/run.txt`, with content like (`cat sample-output/run.txt |head -3`): 24 | 25 | ``` 26 | 19335 0 8412684 1 2.7129419036209583 duoT5-additive 27 | 19335 0 7267248 2 1.6789115946739912 duoT5-additive 28 | 19335 0 8412687 3 1.6081465017050505 duoT5-additive 29 | ``` 30 | 31 | ## Submit the Image to TIRA 32 | 33 | You need a team for your submission. In the following, I use `tira-ir-starter` as the team name; to resubmit the image, please just replace `tira-ir-starter` with your team name.
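A note on the run tag `duoT5-additive` in the example output above: the starter first writes duoT5's pairwise preferences to a `pairwise-preferences.jsonl` file and then aggregates them into one score per document, adding each preference score to the first document of the pair and its complement to the second (see `pairwise_aggregation` in `reranking.py` further below). A toy sketch of this additive aggregation, with made-up preference values instead of real duoT5 output:

```
# Toy illustration of the additive aggregation of pairwise preferences.
from collections import defaultdict

# (qid, docno1, docno2) -> preference for docno1 over docno2 (made-up values)
preferences = {
    ('19335', 'd1', 'd2'): 0.9,
    ('19335', 'd2', 'd1'): 0.2,
}

scores = defaultdict(float)
for (qid, doc_a, doc_b), score in preferences.items():
    scores[(qid, doc_a)] += score      # evidence that doc_a beats doc_b
    scores[(qid, doc_b)] += 1 - score  # complementary evidence for doc_b

# d1: 0.9 + (1 - 0.2) = 1.7; d2: 0.2 + (1 - 0.9) = 0.3; so d1 outranks d2.
print(sorted(scores.items(), key=lambda item: item[1], reverse=True))
```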
34 | 35 | First, you have to upload the image: 36 | 37 | ``` 38 | docker pull webis/tira-ir-starter-duot5:0.0.1-duot5-base-msmarco 39 | docker tag webis/tira-ir-starter-duot5:0.0.1-duot5-base-msmarco registry.webis.de/code-research/tira/tira-user-tira-ir-starter-duot5:0.0.1-duot5-base-msmarco 40 | docker push registry.webis.de/code-research/tira/tira-user-tira-ir-starter-duot5:0.0.1-duot5-base-msmarco 41 | ``` 42 | 43 | After the image is uploaded, you should use the following command in TIRA: 44 | 45 | ``` 46 | /reranking.py --input $inputDataset --output $outputDir --top_k 25 47 | ``` 48 | 49 | Please refer to the general tutorial on [how to import your retrieval approach to TIRA](https://github.com/tira-io/ir-experiment-platform/tree/main/tira-ir-starters#adding-your-retrieval-software) and on [how to run your retrieval software in TIRA](https://github.com/tira-io/ir-experiment-platform/tree/main/tira-ir-starters#running-your-retrieval-software) for more detailed instructions on how to submit. 50 | 51 | 52 | 53 | ## Building the image: 54 | 55 | To build the image and to deploy it in TIRA, please run the following commands (we have 3 different variants; the full list comes below): 56 | 57 | ``` 58 | docker build --build-arg MODEL_NAME=castorini/duot5-base-msmarco --build-arg TOKENIZER_NAME=t5-base -t webis/tira-ir-starter-duot5:0.0.1-duot5-base-msmarco -f pyterrier-duot5/Dockerfile . 59 | docker push webis/tira-ir-starter-duot5:0.0.1-duot5-base-msmarco 60 | ``` 61 | 62 | 63 | 64 | ## Overview of all Images 65 | 66 | ``` 67 | docker build --build-arg MODEL_NAME=castorini/duot5-base-msmarco --build-arg TOKENIZER_NAME=t5-base -t webis/tira-ir-starter-duot5:0.0.1-duot5-base-msmarco -f pyterrier-duot5/Dockerfile . 68 | 69 | docker tag webis/tira-ir-starter-duot5:0.0.1-duot5-base-msmarco registry.webis.de/code-research/tira/tira-user-tira-ir-starter/duot5:0.0.1-duot5-base-msmarco 70 | ``` 71 | 72 | 73 | ``` 74 | docker build --build-arg MODEL_NAME=castorini/duot5-base-msmarco-10k --build-arg TOKENIZER_NAME=t5-base -t webis/tira-ir-starter-duot5:0.0.1-duot5-base-msmarco-10k -f pyterrier-duot5/Dockerfile . 75 | 76 | docker tag webis/tira-ir-starter-duot5:0.0.1-duot5-base-msmarco-10k registry.webis.de/code-research/tira/tira-user-tira-ir-starter/duot5:0.0.1-duot5-base-msmarco-10k 77 | ``` 78 | 79 | ``` 80 | docker build --build-arg MODEL_NAME=castorini/duot5-3b-msmarco --build-arg TOKENIZER_NAME=t5-3b -t webis/tira-ir-starter-duot5:0.0.1-duot5-3b-msmarco -f pyterrier-duot5/Dockerfile .
81 | 82 | docker tag webis/tira-ir-starter-duot5:0.0.1-duot5-3b-msmarco registry.webis.de/code-research/tira/tira-user-tira-ir-starter/duot5:0.0.1-3b-msmarco 83 | ``` 84 | 85 | Run everything locally via: 86 | 87 | ``` 88 | docker run --rm -ti \ 89 | -v ${PWD}/tmp-out:/output \ 90 | -v ${PWD}/clueweb-rerank:/input:ro \ 91 | --entrypoint /reranking.py \ 92 | webis/tira-ir-starter-duot5:0.0.1-duot5-base-msmarco \ 93 | --input $inputDataset --output $outputDir --top_k 3 94 | ``` 95 | -------------------------------------------------------------------------------- /tira-ir-starters/pyterrier-duot5/__pycache__/bla.cpython-310.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/tira-io/ir-experiment-platform/9fded863a5fb06332ee73b2a6913d763eed3510d/tira-ir-starters/pyterrier-duot5/__pycache__/bla.cpython-310.pyc -------------------------------------------------------------------------------- /tira-ir-starters/pyterrier-duot5/__pycache__/reranking.cpython-310.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/tira-io/ir-experiment-platform/9fded863a5fb06332ee73b2a6913d763eed3510d/tira-ir-starters/pyterrier-duot5/__pycache__/reranking.cpython-310.pyc -------------------------------------------------------------------------------- /tira-ir-starters/pyterrier-duot5/reranking.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python3 2 | import argparse 3 | import os 4 | import warnings 5 | from tira.third_party_integrations import ensure_pyterrier_is_loaded, load_rerank_data 6 | ensure_pyterrier_is_loaded() 7 | 8 | import math 9 | import pandas as pd 10 | import pyterrier as pt 11 | import torch 12 | from torch.nn import functional as F 13 | from transformers import T5Config, T5Tokenizer, T5ForConditionalGeneration 14 | from pyterrier.transformer import TransformerBase 15 | import itertools 16 | import json 17 | 18 | class DuoT5PairwisePreferences(TransformerBase): 19 | def __init__(self, tok_model: str = 't5-base', model: str = 'castorini/duot5-base-msmarco', 20 | batch_size: int = 4, text_field: str = 'text', verbose=True): 21 | self.verbose = verbose 22 | self.batch_size = batch_size 23 | self.device = torch.device('cuda' if torch.cuda.is_available() else 'cpu') 24 | self.tokenizer = T5Tokenizer.from_pretrained(tok_model, model_max_length=512) 25 | self.model_name = model 26 | self.model = T5ForConditionalGeneration.from_pretrained(model) 27 | self.model.to(self.device) 28 | self.model.eval() 29 | self.text_field = text_field 30 | self.REL = self.tokenizer.encode('true')[0] 31 | self.NREL = self.tokenizer.encode('false')[0] 32 | 33 | def transform(self, run): 34 | queries, texts = run['query'], run[self.text_field] 35 | pairwise_scores = [] 36 | all_queries = set(queries.unique()) 37 | prompts = self.tokenizer.batch_encode_plus([f'Relevant:' for _ in range(self.batch_size)], return_tensors='pt', 38 | padding='longest') 39 | max_vlen = self.model.config.n_positions - prompts['input_ids'].shape[1] 40 | batches = 0 41 | for batch in self._iter_duo_batches(run): 42 | batches += 1 43 | enc_query = self.tokenizer.batch_encode_plus([f'Query: {q}' for q in batch['query']], return_tensors='pt', 44 | padding='longest') 45 | enc_text0 = self.tokenizer.batch_encode_plus([f'Document0: {d}' for d in batch['text0']], return_tensors='pt', 46 | padding='longest') 47 | enc_text1 = self.tokenizer.batch_encode_plus([f'Document1: {d}' for d in
batch['text1']], return_tensors='pt', 48 | padding='longest') 49 | enc = {} 50 | for key in enc_query: 51 | query = enc_query[key][:, :-1] # chop off end of sequence token-- this will be added with the prompt 52 | text0 = enc_text0[key][:, :-1] # chop off end of sequence token-- this will be added with the prompt 53 | text1 = enc_text1[key][:, :-1] # chop off end of sequence token-- this will be added with the prompt 54 | # Do we need to truncate? If so, how many tokens per document? 55 | if query.shape[1] + text0.shape[1] + text1.shape[1] > max_vlen: 56 | tokens_to_truncate = query.shape[1] + text0.shape[1] + text1.shape[1] - max_vlen 57 | tokens_to_truncate_per_doc = math.ceil(tokens_to_truncate / 2) 58 | text0 = text0[:, :-tokens_to_truncate_per_doc] 59 | text1 = text1[:, :-tokens_to_truncate_per_doc] 60 | # Combine the components: 61 | enc[key] = torch.cat([query, text0, text1, prompts[key][:query.shape[0]]], dim=1) 62 | enc['decoder_input_ids'] = torch.full( 63 | (len(batch['ids']), 1), 64 | self.model.config.decoder_start_token_id, 65 | dtype=torch.long 66 | ) 67 | enc = {k: v.to(self.device) for k, v in enc.items()} 68 | with torch.no_grad(): 69 | result = self.model(**enc).logits 70 | result = result[:, 0, (self.REL, self.NREL)] 71 | result = F.log_softmax(result, dim=1)[:, 0].cpu().detach().tolist() 72 | 73 | for (qid, did1, did2), score in zip(batch['ids'], result): 74 | yield {'qid': qid, 'docno1': did1, 'docno2': did2, 'score': score} 75 | 76 | def _iter_duo_pairs(self, run): 77 | warned = False 78 | groups = run.groupby('qid') 79 | if self.verbose: 80 | groups = pt.tqdm(groups, desc='duoT5', unit='queries') 81 | for qid, group in groups: 82 | if not warned and len(group) > 50: 83 | warnings.warn(f'A large number of results per query was detected ({len(group)}). Since DuoT5 ' 84 | 'is an O(n^2) operation, this will take a considerable amount of time to process. 
' 85 | 'Consider first reducing the size of the results using the % operator.') 86 | warned = True 87 | for row1, row2 in itertools.permutations(group.itertuples(index=False), 2): 88 | yield row1.qid, row1.query, getattr(row1, self.text_field), getattr(row2, self.text_field), row1.docno, row2.docno 89 | 90 | def _iter_duo_batches(self, run): 91 | batch = {'ids': [], 'query': [], 'text0': [], 'text1': []} 92 | print('We shorten queries to the first 1000 characters and both documents each to the first 4000 characters....') 93 | for qid, query, text0, text1, did0, did1 in self._iter_duo_pairs(run): 94 | batch['ids'].append((qid, did0, did1)) 95 | batch['query'].append(query[:1000]) 96 | batch['text0'].append(text0[:4000]) 97 | batch['text1'].append(text1[:4000]) 98 | if len(batch['ids']) == self.batch_size: 99 | yield batch 100 | for v in batch.values(): 101 | v.clear() 102 | if len(batch['ids']) > 0: 103 | yield batch 104 | 105 | 106 | def parse_args(): 107 | parser = argparse.ArgumentParser(description='') 108 | 109 | parser.add_argument('--input', type=str, help='The directory with the input data (i.e., a queries.jsonl and a documents.jsonl file).', required=True) 110 | parser.add_argument('--model', default=os.environ['MODEL_NAME']) 111 | parser.add_argument('--tokenizer', default=os.environ['TOKENIZER_NAME']) 112 | parser.add_argument('--top_k', type=int, help="how many documents to rerank", required=True) 113 | parser.add_argument('--output', type=str, help='The output will be stored in this directory.', required=True) 114 | 115 | return parser.parse_args() 116 | 117 | 118 | def rerank(model, tok_model, top_k, input_directory, output_directory): 119 | 120 | df = load_rerank_data(input_directory) 121 | df = df[df['rank'] <= top_k] 122 | duot5 = DuoT5PairwisePreferences(model=model, tok_model=tok_model) 123 | 124 | with open(output_directory +'/pairwise-preferences.jsonl', 'w') as out_file: 125 | for pref in duot5(df): 126 | out_file.write(json.dumps(pref) + '\n') 127 | pairwise_aggregation(output_directory) 128 | 129 | 130 | def pairwise_aggregation(input_directory): 131 | import os 132 | import pandas as pd 133 | from tira.third_party_integrations import persist_and_normalize_run 134 | import json 135 | run_output = input_directory + '/run.txt' 136 | 137 | if os.path.isfile(run_output): 138 | return 139 | 140 | scores = {} 141 | 142 | with open(input_directory +'/pairwise-preferences.jsonl', 'r') as preferences: 143 | for l in preferences: 144 | l = json.loads(l) 145 | qid, id_a, id_b, score = l['qid'], l['docno1'], l['docno2'], l['score'] 146 | if qid not in scores: 147 | scores[qid] = {} 148 | 149 | for doc_id in [id_a, id_b]: 150 | if doc_id not in scores[qid]: 151 | scores[qid][doc_id] = 0 152 | 153 | scores[qid][id_a] += score 154 | scores[qid][id_b] += (1 - score) 155 | 156 | ret = [] 157 | 158 | for qid in scores: 159 | for doc_id in scores[qid].keys(): 160 | ret += [{'qid': qid, 'Q0': 0, 'docno': doc_id, 'score': scores[qid][doc_id], 'rank': 0}] 161 | 162 | persist_and_normalize_run(pd.DataFrame(ret), 'duoT5-additive', input_directory) 163 | 164 | 165 | if __name__ == '__main__': 166 | args = parse_args() 167 | 168 | rerank(args.model, args.tokenizer, args.top_k, args.input, args.output) 169 | 170 | -------------------------------------------------------------------------------- /tira-ir-starters/pyterrier-duot5/sample-input/rerank.jsonl.gz: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/tira-io/ir-experiment-platform/9fded863a5fb06332ee73b2a6913d763eed3510d/tira-ir-starters/pyterrier-duot5/sample-input/rerank.jsonl.gz -------------------------------------------------------------------------------- /tira-ir-starters/pyterrier-t5/Dockerfile: -------------------------------------------------------------------------------- 1 | FROM webis/tira-ir-starter-pyterrier:0.0.1-base 2 | 3 | RUN pip3 install --upgrade git+https://github.com/terrierteam/pyterrier_t5.git \ 4 | && pip install tira==0.0.29 5 | 6 | ARG MODEL_NAME=local 7 | ENV MODEL_NAME ${MODEL_NAME} 8 | 9 | ARG TOKENIZER_NAME=local 10 | ENV TOKENIZER_NAME ${TOKENIZER_NAME} 11 | 12 | RUN python3 -c "from tira.third_party_integrations import ensure_pyterrier_is_loaded; ensure_pyterrier_is_loaded(); from pyterrier_t5 import MonoT5ReRanker; mono_t5 = MonoT5ReRanker(model='${MODEL_NAME}', tok_model='${TOKENIZER_NAME}');" 13 | 14 | COPY pyterrier-t5/bm25-monot5.ipynb /workspace 15 | 16 | RUN jupyter trust /workspace/*.ipynb 17 | 18 | -------------------------------------------------------------------------------- /tira-ir-starters/pyterrier-t5/README.md: -------------------------------------------------------------------------------- 1 | # TIRA IR-Starter for MonoT5 in PyTerrier with Jupyter Notebooks 2 | 3 | This directory contains a retrieval system that uses a Jupyter notebook with PyTerrier to rerank the top-1000 results of BM25 with MonoT5. 4 | 5 | 6 | ## Local Development 7 | 8 | Please use the `tira-run` command (can be installed via `pip3 install tira`) to test that your retrieval approach is correctly installed inside the Docker image. 9 | For example, you can run the following command inside this directory to re-rank with a PyTerrier re-ranker from our tira-ir-starter with BM25 on a small example (2 queries from the passage retrieval task of TREC DL 2019): 10 | 11 | ``` 12 | tira-run \ 13 | --input-directory ${PWD}/sample-input \ 14 | --image webis/tira-ir-starter-pyterrier:0.0.1-base \ 15 | --command '/workspace/pyterrier_cli.py --input $inputDataset --output $outputDir --params wmodel=BM25 --rerank True --retrieval_pipeline default_pipelines.wmodel_text_scorer' 16 | ``` 17 | 18 | In the example above, the command `/workspace/pyterrier_cli.py --input $inputDataset --output $outputDir --params wmodel=BM25 --rerank True --retrieval_pipeline default_pipelines.wmodel_text_scorer` is the command that you would enter in TIRA, and the `--input-directory` flag points to the inputs. 19 | 20 | This creates a run file `tira-output/run.txt`, with content like (`cat sample-output/run.txt |head -3`): 21 | 22 | ``` 23 | 19335 Q0 8412684 1 2.0044117909904275 pyterrier.default_pipelines.wmodel_text_scorer 24 | 19335 Q0 8412687 2 1.6165480088144524 pyterrier.default_pipelines.wmodel_text_scorer 25 | 19335 Q0 527689 3 0.7777388572417481 pyterrier.default_pipelines.wmodel_text_scorer 26 | ``` 27 | 28 | Testing full-rank retrievers works analogously. 29 | 30 | ## Developing Retrieval Approaches in Declarative PyTerrier-Pipelines 31 | 32 | The notebook [full-rank-pipeline.ipynb](full-rank-pipeline.ipynb) exemplifies how to directly run Jupyter Notebooks in TIRA.
33 | 34 | You can run it locally via: 35 | 36 | ``` 37 | tira-run \ 38 | --input-directory ${PWD}/sample-input-full-rank \ 39 | --image webis/tira-ir-starter-pyterrier:0.0.1-base \ 40 | --command '/workspace/run-pyterrier-notebook.py --input $inputDataset --output $outputDir --notebook /workspace/full-rank-pipeline.ipynb' 41 | ``` 42 | 43 | This creates a run file `tira-output/run.txt`, with content like (`cat sample-output/run.txt |head -3`): 44 | 45 | ``` 46 | 1 0 pangram-03 1 -0.4919184192126373 BM25 47 | 1 0 pangram-01 2 -0.5271673505256447 BM25 48 | 1 0 pangram-04 3 -0.9838368384252746 BM25 49 | ``` 50 | 51 | ## Submit the Image to TIRA 52 | 53 | You need a team for your submission, in the following, we use `tira-ir-starter` as team name, to resubmit the image, please just replace `tira-ir-starter` with your team name. 54 | 55 | First, you have to upload the image: 56 | 57 | ``` 58 | docker pull webis/tira-ir-starter-pyterrier-monot5:0.0.1-monot5-base-msmarco-10k 59 | 60 | docker tag webis/tira-ir-starter-pyterrier-monot5:0.0.1-monot5-base-msmarco-10k registry.webis.de/code-research/tira/tira-user-tira-ir-starter/pyterrier-monot5:0.0.1 61 | docker push registry.webis.de/code-research/tira/tira-user-tira-ir-starter/pyterrier-monot5:0.0.1 62 | ``` 63 | 64 | # Build the image 65 | 66 | ``` 67 | docker build --build-arg MODEL_NAME=castorini/monot5-base-msmarco-10k --build-arg TOKENIZER_NAME=t5-base -t webis/tira-ir-starter-pyterrier-monot5:0.0.1-monot5-base-msmarco-10k -f pyterrier-t5/Dockerfile . 68 | ``` 69 | -------------------------------------------------------------------------------- /tira-ir-starters/pyterrier-t5/bm25-monot5.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "id": "8c3da078-f7fc-4d37-904c-532bb26d4321", 6 | "metadata": {}, 7 | "source": [ 8 | "# BM25 >> MonoT5 Pipeline" 9 | ] 10 | }, 11 | { 12 | "cell_type": "markdown", 13 | "id": "66fd2911-c97a-4f91-af28-8c7e381573b6", 14 | "metadata": {}, 15 | "source": [ 16 | "### Step 1: Import everything and load variables" 17 | ] 18 | }, 19 | { 20 | "cell_type": "code", 21 | "execution_count": 2, 22 | "id": "7ae3c54f-aba1-45bf-b074-e78a99f6405f", 23 | "metadata": {}, 24 | "outputs": [ 25 | { 26 | "name": "stdout", 27 | "output_type": "stream", 28 | "text": [ 29 | "I will use a small hardcoded example located in ./sample-input-full-rank.\n", 30 | "The output directory is /tmp/\n" 31 | ] 32 | } 33 | ], 34 | "source": [ 35 | "import pyterrier as pt\n", 36 | "import pandas as pd\n", 37 | "from tira.third_party_integrations import ensure_pyterrier_is_loaded, get_input_directory_and_output_directory, persist_and_normalize_run\n", 38 | "import json\n", 39 | "from tqdm import tqdm\n", 40 | "\n", 41 | "ensure_pyterrier_is_loaded()\n", 42 | "input_directory, output_directory = get_input_directory_and_output_directory('./sample-input-full-rank')\n" 43 | ] 44 | }, 45 | { 46 | "cell_type": "markdown", 47 | "id": "8c563b0e-97ac-44a2-ba2f-18858f1506bb", 48 | "metadata": {}, 49 | "source": [ 50 | "### Step 2: Load the Data" 51 | ] 52 | }, 53 | { 54 | "cell_type": "code", 55 | "execution_count": 3, 56 | "id": "e35230af-66ec-4607-a97b-127bd890fa59", 57 | "metadata": {}, 58 | "outputs": [ 59 | { 60 | "name": "stdout", 61 | "output_type": "stream", 62 | "text": [ 63 | "Step 2: Load the data.\n" 64 | ] 65 | } 66 | ], 67 | "source": [ 68 | "print('Step 2: Load the data.')\n", 69 | "\n", 70 | "queries = pt.io.read_topics(input_directory + 
'/queries.xml', format='trecxml')\n", 71 | "\n", 72 | "documents = (json.loads(i) for i in open(input_directory + '/documents.jsonl', 'r'))\n" 73 | ] 74 | }, 75 | { 76 | "cell_type": "markdown", 77 | "id": "72655916-07fe-4c58-82c1-2f9f93381e7f", 78 | "metadata": {}, 79 | "source": [ 80 | "### Step 3: Create the Index" 81 | ] 82 | }, 83 | { 84 | "cell_type": "code", 85 | "execution_count": 4, 86 | "id": "05ce062d-25e4-4c61-b6ce-9431b9f2bbd4", 87 | "metadata": {}, 88 | "outputs": [ 89 | { 90 | "name": "stdout", 91 | "output_type": "stream", 92 | "text": [ 93 | "Step 3: Create the Index.\n" 94 | ] 95 | }, 96 | { 97 | "name": "stderr", 98 | "output_type": "stream", 99 | "text": [ 100 | "5it [00:00, 48.24it/s]\n" 101 | ] 102 | } 103 | ], 104 | "source": [ 105 | "print('Step 3: Create the Index.')\n", 106 | "\n", 107 | "!rm -Rf ./index\n", 108 | "iter_indexer = pt.IterDictIndexer(\"./index\", meta={'docno' : 100, 'text': 10240})\n", 109 | "index_ref = iter_indexer.index(tqdm(documents))" 110 | ] 111 | }, 112 | { 113 | "cell_type": "markdown", 114 | "id": "806c4638-ccee-4470-a74c-2a85d9ee2cfc", 115 | "metadata": {}, 116 | "source": [ 117 | "### Step 4: Create Run" 118 | ] 119 | }, 120 | { 121 | "cell_type": "code", 122 | "execution_count": 7, 123 | "id": "a191f396-e896-4792-afaf-574e452640f5", 124 | "metadata": {}, 125 | "outputs": [], 126 | "source": [ 127 | "from pyterrier_t5 import MonoT5ReRanker\n", 128 | "import os\n", 129 | "\n", 130 | "bm25 = pt.BatchRetrieve(index_ref, wmodel=\"BM25\", metadata=['docno', 'text'])\n", 131 | "\n", 132 | "mono_t5 = MonoT5ReRanker(model=os.environ['MODEL_NAME'], tok_model=os.environ['TOKENIZER_NAME'])\n", 133 | "\n", 134 | "pipeline = bm25 % 1000 >> mono_t5" 135 | ] 136 | }, 137 | { 138 | "cell_type": "code", 139 | "execution_count": 8, 140 | "id": "c0e07fca-de98-4de2-b6a7-abfd516c652c", 141 | "metadata": {}, 142 | "outputs": [ 143 | { 144 | "name": "stderr", 145 | "output_type": "stream", 146 | "text": [ 147 | "monoT5: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████| 2/2 [00:00<00:00, 4.26batches/s]\n" 148 | ] 149 | } 150 | ], 151 | "source": [ 152 | "run = pipeline(queries)" 153 | ] 154 | }, 155 | { 156 | "cell_type": "markdown", 157 | "id": "28c40a2e-0f96-4ae8-aa5e-55a5e7ef9dee", 158 | "metadata": {}, 159 | "source": [ 160 | "### Step 5: Persist Run" 161 | ] 162 | }, 163 | { 164 | "cell_type": "code", 165 | "execution_count": 9, 166 | "id": "12e5bb42-ed1f-41ba-b7a5-cb43ebca96f6", 167 | "metadata": {}, 168 | "outputs": [ 169 | { 170 | "name": "stdout", 171 | "output_type": "stream", 172 | "text": [ 173 | "Step 5: Persist Run.\n" 174 | ] 175 | } 176 | ], 177 | "source": [ 178 | "print('Step 5: Persist Run.')\n", 179 | "\n", 180 | "persist_and_normalize_run(run, output_file=output_directory, system_name='MonoT5', depth=1000)" 181 | ] 182 | } 183 | ], 184 | "metadata": { 185 | "kernelspec": { 186 | "display_name": "Python 3 (ipykernel)", 187 | "language": "python", 188 | "name": "python3" 189 | }, 190 | "language_info": { 191 | "codemirror_mode": { 192 | "name": "ipython", 193 | "version": 3 194 | }, 195 | "file_extension": ".py", 196 | "mimetype": "text/x-python", 197 | "name": "python", 198 | "nbconvert_exporter": "python", 199 | "pygments_lexer": "ipython3", 200 | "version": "3.7.13" 201 | } 202 | }, 203 | "nbformat": 4, 204 | "nbformat_minor": 5 205 | } 206 | -------------------------------------------------------------------------------- 
/tira-ir-starters/pyterrier-t5/sample-input-full-rank/documents.jsonl: -------------------------------------------------------------------------------- 1 | {"docno": "pangram-01", "text": "How quickly daft jumping zebras vex.", "original_document": {"doc_id": "pangram-01", "text": "How quickly daft jumping zebras vex.", "letters": 30}} 2 | {"docno": "pangram-02", "text": "Quick fox jumps nightly above wizard.", "original_document": {"doc_id": "pangram-02", "text": "Quick fox jumps nightly above wizard.", "letters": 31}} 3 | {"docno": "pangram-03", "text": "The jay, pig, fox, zebra and my wolves quack!", "original_document": {"doc_id": "pangram-03", "text": "The jay, pig, fox, zebra and my wolves quack!", "letters": 33}} 4 | {"docno": "pangram-04", "text": "The quick brown fox jumps over the lazy dog.", "original_document": {"doc_id": "pangram-04", "text": "The quick brown fox jumps over the lazy dog.", "letters": 35}} 5 | {"docno": "pangram-05", "text": "As quirky joke, chefs won\u2019t pay devil magic zebra tax.", "original_document": {"doc_id": "pangram-05", "text": "As quirky joke, chefs won\u2019t pay devil magic zebra tax.", "letters": 42}} 6 | -------------------------------------------------------------------------------- /tira-ir-starters/pyterrier-t5/sample-input-full-rank/metadata.json: -------------------------------------------------------------------------------- 1 | {"ir_datasets_id": "pangrams"} 2 | -------------------------------------------------------------------------------- /tira-ir-starters/pyterrier-t5/sample-input-full-rank/queries.jsonl: -------------------------------------------------------------------------------- 1 | {"qid": "1", "query": "fox jumps above animal", "original_query": {"query_id": "1", "title": "fox jumps above animal", "description": "What pangrams have a fox jumping above some animal?", "narrative": "Relevant pangrams have a fox jumping over an animal (e.g., an dog). Pangrams containing a fox that is not jumping or jumps over something that is not an animal are not relevant."}} 2 | {"qid": "2", "query": "multiple animals including a zebra", "original_query": {"query_id": "2", "title": "multiple animals including a zebra", "description": "Which pangrams have multiple animals where one of the animals is a zebra?", "narrative": "Relevant pangrams have at least two animals, one of the animals must be a Zebra. Pangrams containing only a Zebra are not relevant."}} 3 | -------------------------------------------------------------------------------- /tira-ir-starters/pyterrier-t5/sample-input-full-rank/queries.xml: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | fox jumps above animal 5 | 6 | 7 | 8 | 1 9 | 10 | 11 | fox jumps above animal 12 | 13 | 14 | What pangrams have a fox jumping above some animal? 15 | 16 | 17 | Relevant pangrams have a fox jumping over an animal (e.g., an dog). Pangrams containing a fox that is not jumping or jumps over something that is not an animal are not relevant. 18 | 19 | 20 | 21 | 22 | 23 | multiple animals including a zebra 24 | 25 | 26 | 27 | 2 28 | 29 | 30 | multiple animals including a zebra 31 | 32 | 33 | Which pangrams have multiple animals where one of the animals is a zebra? 34 | 35 | 36 | Relevant pangrams have at least two animals, one of the animals must be a Zebra. Pangrams containing only a Zebra are not relevant. 
37 | 38 | 39 | 40 | -------------------------------------------------------------------------------- /tira-ir-starters/pyterrier/Dockerfile.base: -------------------------------------------------------------------------------- 1 | FROM pytorch/pytorch:1.12.0-cuda11.3-cudnn8-runtime 2 | 3 | ENV PYTERRIER_VERSION='5.7' 4 | ENV PYTERRIER_HELPER_VERSION='0.0.7' 5 | 6 | RUN apt-get update \ 7 | && apt-get install -y git openjdk-11-jdk \ 8 | && pip3 install python-terrier pandas jupyterlab runnb \ 9 | && python3 -c "import pyterrier as pt; pt.init(version='${PYTERRIER_VERSION}', helper_version='${PYTERRIER_HELPER_VERSION}');" \ 10 | && python3 -c "import pyterrier as pt; pt.init(version='${PYTERRIER_VERSION}', helper_version='${PYTERRIER_HELPER_VERSION}', boot_packages=['com.github.terrierteam:terrier-prf:-SNAPSHOT']);" 11 | 12 | RUN pip3 install tira==0.0.97 13 | 14 | COPY pyterrier/full-rank-pipeline.ipynb pyterrier/retrieval-pipeline.ipynb pyterrier/run-pyterrier-notebook.py pyterrier/default_pipelines.py pyterrier/pyterrier_cli.py /workspace/ 15 | 16 | ENV PYTHONPATH=/workspace 17 | 18 | RUN jupyter trust /workspace/full-rank-pipeline.ipynb \ 19 | && jupyter trust /workspace/retrieval-pipeline.ipynb \ 20 | && ln -s /workspace/run-pyterrier-notebook.py /workspace/run-notebook.py 21 | 22 | -------------------------------------------------------------------------------- /tira-ir-starters/pyterrier/__pycache__/tira_utils.cpython-310.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/tira-io/ir-experiment-platform/9fded863a5fb06332ee73b2a6913d763eed3510d/tira-ir-starters/pyterrier/__pycache__/tira_utils.cpython-310.pyc -------------------------------------------------------------------------------- /tira-ir-starters/pyterrier/default_pipelines.py: -------------------------------------------------------------------------------- 1 | import pyterrier as pt 2 | from copy import deepcopy 3 | 4 | 5 | ALLOWED_PARAMETERS = { 6 | 'RM3': ('fb_terms', 'fb_docs', 'fb_lambda'), 7 | } 8 | 9 | 10 | def add_params(controls, params, params_type): 11 | allowed_parameters = ALLOWED_PARAMETERS[params_type] 12 | 13 | for key, value in params.items(): 14 | if key in allowed_parameters: 15 | controls[key] = value 16 | 17 | return controls 18 | 19 | 20 | def wmodel_batch_retrieve(index_ref, params): 21 | return pt.BatchRetrieve(index_ref, params) 22 | 23 | 24 | def wmodel_text_scorer(index_ref, params): 25 | params = deepcopy(params) if params else {} 26 | default_params = {'verbose': True, 'body_attr': 'text'} 27 | 28 | for k, v in default_params.items(): 29 | if k not in params: 30 | params[k] = v 31 | 32 | return pt.text.scorer(**params) 33 | 34 | 35 | def wmodel_batch_retrieve_rm3(index_ref, params): 36 | wmodel_retrieve = wmodel_batch_retrieve(index_ref, params) 37 | rm3 = pt.rewrite.RM3(index_ref, **add_params({}, params, 'RM3')) 38 | 39 | return wmodel_retrieve >> rm3 >> wmodel_retrieve 40 | 41 | -------------------------------------------------------------------------------- /tira-ir-starters/pyterrier/pyterrier_cli.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python3 2 | 3 | import json 4 | import argparse 5 | from pathlib import Path 6 | import os 7 | 8 | from tira.third_party_integrations import ensure_pyterrier_is_loaded, load_rerank_data, normalize_run 9 | import pyterrier as pt 10 | from pyterrier import IndexRef 11 | import gzip 12 | import importlib 13 | from tqdm import tqdm 14 | 
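15 | # Two modes, dispatched in __main__ below: full-rank (index documents.jsonl, then retrieve for the topics in queries.xml) and re-rank (--rerank True: score an existing rerank.jsonl with the loaded pipeline).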
16 | def process_params_input(params: list): 17 | return { param.partition('=')[0]: param.partition('=')[2] for param in params } if params else {} 18 | 19 | 20 | def load_retrieval_pipeline(pipeline: str, indexref: IndexRef, controls: dict): 21 | pipeline = pipeline.split('.') 22 | 23 | module_name = '.'.join(pipeline[:-1]) 24 | module = importlib.import_module(module_name) 25 | 26 | return getattr(module, pipeline[-1])(indexref, controls) 27 | 28 | 29 | def parse_args(): 30 | parser = argparse.ArgumentParser(description='') 31 | 32 | parser.add_argument('--input', type=str, help='The directory with the input data (i.e., a queries.jsonl and a documents.jsonl file).', required=True) 33 | parser.add_argument('--output', type=str, help='The output will be stored in this directory.', required=True) 34 | parser.add_argument('--index_directory', type=str, help='The index is stored/expected in this directory.', required=False) 35 | parser.add_argument('--retrieval_pipeline', type=str, default=None, help='TBD.') 36 | parser.add_argument('--params', type=str, nargs='+', help='The controls of the retrieval methods as dictionaries.', required=False) 37 | parser.add_argument('--rerank', type=bool, default=False, help='Run a re-ranker. This assumes that the input directory contains a valid re-ranking input.', required=False) 38 | parser.add_argument('--blocks', type=bool, default=False, help='For indexing: should the pyterrier index add blocks?', required=False) 39 | 40 | return parser.parse_args() 41 | 42 | 43 | def index(documents, index_directory, blocks): 44 | if os.path.exists(f'{index_directory}/data.properties'): 45 | return pt.IndexRef.of(f'{index_directory}/data.properties') 46 | 47 | if Path(documents).exists(): 48 | documents = tqdm((json.loads(line) for line in Path(documents).open('rt')), 'Load Documents') 49 | elif Path(documents +'.gz').exists(): 50 | documents = tqdm((json.loads(line) for line in gzip.open(documents + '.gz', 'rt')), 'Load Documents') 51 | 52 | print(f'create new index at:\t{index_directory}') 53 | return pt.IterDictIndexer(index_directory, meta={'docno' : 100}, blocks=blocks).index(documents) 54 | 55 | 56 | def retrieve(queries, index_ref, args, retrieval_pipeline, output_directory): 57 | print(f'loading topics from:\t{queries}') 58 | queries = pt.io.read_topics(queries, 'trecxml') 59 | 60 | controls = process_params_input(args.params) 61 | controls['raw_passed_arguments'] = vars(args) 62 | pipeline = load_retrieval_pipeline(retrieval_pipeline, index_ref, controls) 63 | 64 | result = pipeline(queries) 65 | 66 | print(f'writing run file to:\t{output_directory}/run.txt') 67 | Path(output_directory).mkdir(parents=True, exist_ok=True) 68 | pt.io.write_results(normalize_run(result, 1000), f'{output_directory}/run.txt', run_name=f'pyterrier.{retrieval_pipeline}') 69 | 70 | 71 | def rerank(rerank_data, retrieval_pipeline, output_directory): 72 | pipeline = load_retrieval_pipeline(retrieval_pipeline, None, process_params_input(args.params)) 73 | 74 | rerank_data['query'] = rerank_data['query'].apply(lambda i: "".join([x if x.isalnum() else " " for x in i])) 75 | 76 | result = pipeline(rerank_data) 77 | 78 | print(f'writing run file to:\t{output_directory}/run.txt') 79 | Path(output_directory).mkdir(parents=True, exist_ok=True) 80 | pt.io.write_results(normalize_run(result, 1000), f'{output_directory}/run.txt', run_name=f'pyterrier.{retrieval_pipeline}') 81 | 82 | if __name__ == '__main__': 83 | args = parse_args() 84 | ensure_pyterrier_is_loaded() 85 | 86 | index_ref = 
index(args.input + '/documents.jsonl', os.path.abspath(Path(args.index_directory) / 'index'), args.blocks) if args.index_directory else None 87 | 88 | if args.retrieval_pipeline: 89 | if args.rerank: 90 | rerank(load_rerank_data(args.input), args.retrieval_pipeline, args.output) 91 | else: 92 | retrieve(args.input + '/queries.xml', index_ref, args, args.retrieval_pipeline, args.output) 93 | 94 | -------------------------------------------------------------------------------- /tira-ir-starters/pyterrier/retrieval-pipeline.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "id": "1996c3be-ed69-4674-8fe1-dc8df3e9543b", 6 | "metadata": {}, 7 | "source": [ 8 | "# This is my cool Pipeline" 9 | ] 10 | }, 11 | { 12 | "cell_type": "markdown", 13 | "id": "b3fc6cff-ffc7-4f0c-a93f-e35bdc388987", 14 | "metadata": {}, 15 | "source": [ 16 | "### Step 1: Import everything and load variables" 17 | ] 18 | }, 19 | { 20 | "cell_type": "code", 21 | "execution_count": 50, 22 | "id": "ba20977a-e280-49ed-9857-392a0067ccdc", 23 | "metadata": {}, 24 | "outputs": [ 25 | { 26 | "name": "stdout", 27 | "output_type": "stream", 28 | "text": [ 29 | "I will use a small hardcoded example.\n", 30 | "I will write the run file to /tmp/\n" 31 | ] 32 | } 33 | ], 34 | "source": [ 35 | "import pyterrier as pt\n", 36 | "import pandas as pd\n", 37 | "import os\n", 38 | "\n", 39 | "if not pt.started():\n", 40 | " pt.init(version='5.7', helper_version='0.0.7', no_download=True)\n", 41 | "\n", 42 | "input_data = os.environ.get('TIRA_INPUT_DIRECTORY', None)\n", 43 | "\n", 44 | "if input_data:\n", 45 | " input_data = input_data + '/rerank.jsonl'\n", 46 | " print(f'I will read the input data from {input_data}.')\n", 47 | "else:\n", 48 | " print('I will use a small hardcoded example.')\n", 49 | "\n", 50 | "output_file = os.environ.get('TIRA_OUTPUT_DIRECTORY', '/tmp/')\n", 51 | "\n", 52 | "print(f'I will write the run file to {output_file}')\n" 53 | ] 54 | }, 55 | { 56 | "cell_type": "markdown", 57 | "id": "7d17fffc-a409-4b40-a1f0-0463d365c853", 58 | "metadata": {}, 59 | "source": [ 60 | "### Step 2: Load the data" 61 | ] 62 | }, 63 | { 64 | "cell_type": "code", 65 | "execution_count": 51, 66 | "id": "2558e75c-84dd-46c4-81bb-3b4ca27a8701", 67 | "metadata": {}, 68 | "outputs": [ 69 | { 70 | "data": { 71 | "text/html": [ 72 | "
\n", 73 | "\n", 86 | "\n", 87 | " \n", 88 | " \n", 89 | " \n", 90 | " \n", 91 | " \n", 92 | " \n", 93 | " \n", 94 | " \n", 95 | " \n", 96 | " \n", 97 | " \n", 98 | " \n", 99 | " \n", 100 | " \n", 101 | " \n", 102 | " \n", 103 | " \n", 104 | " \n", 105 | " \n", 106 | " \n", 107 | " \n", 108 | " \n", 109 | " \n", 110 | " \n", 111 | " \n", 112 | "
qidquerydocnotext
0q1chemical reactionsd1professor protor poured the chemicals reaction
1q1chemical reactionsd2chemical brothers turned up the beats
\n", 113 | "
" 114 | ], 115 | "text/plain": [ 116 | " qid query docno \\\n", 117 | "0 q1 chemical reactions d1 \n", 118 | "1 q1 chemical reactions d2 \n", 119 | "\n", 120 | " text \n", 121 | "0 professor protor poured the chemicals reaction \n", 122 | "1 chemical brothers turned up the beats " 123 | ] 124 | }, 125 | "execution_count": 51, 126 | "metadata": {}, 127 | "output_type": "execute_result" 128 | } 129 | ], 130 | "source": [ 131 | "df = pd.DataFrame([\n", 132 | " {'qid': 'q1', 'query': 'chemical reactions', 'docno': 'd1', 'text': 'professor protor poured the chemicals reaction'},\n", 133 | " {'qid': 'q1', 'query': 'chemical reactions', 'docno': 'd2', 'text': 'chemical brothers turned up the beats'}\n", 134 | "])\n", 135 | "\n", 136 | "if input_data:\n", 137 | " print(f'Read input data from {input_data}.')\n", 138 | " df = pd.read_json(input_data, lines=True)\n", 139 | " print(f'Done...')\n", 140 | "\n", 141 | "df" 142 | ] 143 | }, 144 | { 145 | "cell_type": "markdown", 146 | "id": "169b2218-38ec-4c3d-80a7-b144ecd4cc02", 147 | "metadata": {}, 148 | "source": [ 149 | "### Step 3: Define the actual retrieval appraoch" 150 | ] 151 | }, 152 | { 153 | "cell_type": "code", 154 | "execution_count": 44, 155 | "id": "eb38488f-a6bb-4960-a1c2-680fead6465d", 156 | "metadata": {}, 157 | "outputs": [], 158 | "source": [ 159 | "bm25_scorer = pt.text.scorer(body_attr=\"text\", wmodel='BM25')" 160 | ] 161 | }, 162 | { 163 | "cell_type": "markdown", 164 | "id": "a7315abd-5fb1-478c-b8f6-5970dbfb7e5c", 165 | "metadata": {}, 166 | "source": [ 167 | "### Step 4: Run the pipeline" 168 | ] 169 | }, 170 | { 171 | "cell_type": "code", 172 | "execution_count": 46, 173 | "id": "1a7b340c-0001-4480-9466-7e34c5f83dcc", 174 | "metadata": {}, 175 | "outputs": [ 176 | { 177 | "name": "stdout", 178 | "output_type": "stream", 179 | "text": [ 180 | "18:43:29.314 [main] WARN org.terrier.querying.ApplyTermPipeline - The index has no termpipelines configuration, and no control configuration is found. Defaulting to global termpipelines configuration of 'Stopwords,PorterStemmer'. Set a termpipelines control to remove this warning.\n" 181 | ] 182 | }, 183 | { 184 | "data": { 185 | "text/html": [ 186 | "
\n", 187 | "\n", 200 | "\n", 201 | " \n", 202 | " \n", 203 | " \n", 204 | " \n", 205 | " \n", 206 | " \n", 207 | " \n", 208 | " \n", 209 | " \n", 210 | " \n", 211 | " \n", 212 | " \n", 213 | " \n", 214 | " \n", 215 | " \n", 216 | " \n", 217 | " \n", 218 | " \n", 219 | " \n", 220 | " \n", 221 | " \n", 222 | " \n", 223 | " \n", 224 | " \n", 225 | " \n", 226 | " \n", 227 | " \n", 228 | " \n", 229 | " \n", 230 | " \n", 231 | " \n", 232 | "
qiddocnotextrankscorequery
0q1d1professor protor poured the chemicals reaction0-2.220975chemical reactions
1q1d2chemical brothers turned up the beats1-2.432496chemical reactions
\n", 233 | "
" 234 | ], 235 | "text/plain": [ 236 | " qid docno text rank score \\\n", 237 | "0 q1 d1 professor protor poured the chemicals reaction 0 -2.220975 \n", 238 | "1 q1 d2 chemical brothers turned up the beats 1 -2.432496 \n", 239 | "\n", 240 | " query \n", 241 | "0 chemical reactions \n", 242 | "1 chemical reactions " 243 | ] 244 | }, 245 | "execution_count": 46, 246 | "metadata": {}, 247 | "output_type": "execute_result" 248 | } 249 | ], 250 | "source": [ 251 | "results = bm25_scorer(df)\n", 252 | "results" 253 | ] 254 | }, 255 | { 256 | "cell_type": "markdown", 257 | "id": "0fbff470-0907-4864-8795-6ea4ec8f9586", 258 | "metadata": {}, 259 | "source": [ 260 | "### Step 5: Persist results" 261 | ] 262 | } 263 | ], 264 | "metadata": { 265 | "kernelspec": { 266 | "display_name": "Python 3 (ipykernel)", 267 | "language": "python", 268 | "name": "python3" 269 | }, 270 | "language_info": { 271 | "codemirror_mode": { 272 | "name": "ipython", 273 | "version": 3 274 | }, 275 | "file_extension": ".py", 276 | "mimetype": "text/x-python", 277 | "name": "python", 278 | "nbconvert_exporter": "python", 279 | "pygments_lexer": "ipython3", 280 | "version": "3.7.13" 281 | } 282 | }, 283 | "nbformat": 4, 284 | "nbformat_minor": 5 285 | } 286 | -------------------------------------------------------------------------------- /tira-ir-starters/pyterrier/run-pyterrier-notebook.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python3 2 | import argparse 3 | import os 4 | import sys 5 | import subprocess 6 | 7 | 8 | def parse_args(): 9 | parser = argparse.ArgumentParser(description='') 10 | 11 | parser.add_argument('--input', type=str, help='The directory that contains the input data (this directory is expected to contain a queries.jsonl and a documents.jsonl file).', required=True) 12 | parser.add_argument('--notebook', type=str, help='The notebook to execute.', required=True) 13 | parser.add_argument('--output', type=str, help='The resulting run.txt will be stored in this directory.', required=True) 14 | parser.add_argument('--chdir', type=str, help='Change the directory before executing the notebook to allow for relative imports.', required=False, default=None) 15 | 16 | return parser.parse_args() 17 | 18 | 19 | def main(args): 20 | os.environ['TIRA_INPUT_DIRECTORY'] = os.path.abspath(args.input) 21 | os.environ['TIRA_INPUT_DATASET'] = os.path.abspath(args.input) 22 | os.environ['TIRA_OUTPUT_DIRECTORY'] = os.path.abspath(args.output) 23 | os.environ['TIRA_OUTPUT_DIR'] = os.path.abspath(args.output) 24 | 25 | if args.chdir: 26 | print(f'Change directory to allow relative imports to "{args.chdir}".', flush=True) 27 | os.chdir(args.chdir) 28 | 29 | command = f'runnb --allow-not-trusted {args.notebook}' 30 | subprocess.check_call(command, shell=True, stdout=sys.stdout, stderr=subprocess.STDOUT) 31 | 32 | 33 | if __name__ == '__main__': 34 | main(parse_args()) 35 | 36 | -------------------------------------------------------------------------------- /tira-ir-starters/pyterrier/sample-input-full-rank-gz/documents.jsonl.gz: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/tira-io/ir-experiment-platform/9fded863a5fb06332ee73b2a6913d763eed3510d/tira-ir-starters/pyterrier/sample-input-full-rank-gz/documents.jsonl.gz -------------------------------------------------------------------------------- /tira-ir-starters/pyterrier/sample-input-full-rank-gz/metadata.json: 
-------------------------------------------------------------------------------- 1 | {"ir_datasets_id": "pangrams"} 2 | -------------------------------------------------------------------------------- /tira-ir-starters/pyterrier/sample-input-full-rank-gz/queries.jsonl: -------------------------------------------------------------------------------- 1 | {"qid": "1", "query": "fox jumps above animal", "original_query": {"query_id": "1", "title": "fox jumps above animal", "description": "What pangrams have a fox jumping above some animal?", "narrative": "Relevant pangrams have a fox jumping over an animal (e.g., an dog). Pangrams containing a fox that is not jumping or jumps over something that is not an animal are not relevant."}} 2 | {"qid": "2", "query": "multiple animals including a zebra", "original_query": {"query_id": "2", "title": "multiple animals including a zebra", "description": "Which pangrams have multiple animals where one of the animals is a zebra?", "narrative": "Relevant pangrams have at least two animals, one of the animals must be a Zebra. Pangrams containing only a Zebra are not relevant."}} 3 | -------------------------------------------------------------------------------- /tira-ir-starters/pyterrier/sample-input-full-rank-gz/queries.xml: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | fox jumps above animal 5 | 6 | 7 | 8 | 1 9 | 10 | 11 | fox jumps above animal 12 | 13 | 14 | What pangrams have a fox jumping above some animal? 15 | 16 | 17 | Relevant pangrams have a fox jumping over an animal (e.g., an dog). Pangrams containing a fox that is not jumping or jumps over something that is not an animal are not relevant. 18 | 19 | 20 | 21 | 22 | 23 | multiple animals including a zebra 24 | 25 | 26 | 27 | 2 28 | 29 | 30 | multiple animals including a zebra 31 | 32 | 33 | Which pangrams have multiple animals where one of the animals is a zebra? 34 | 35 | 36 | Relevant pangrams have at least two animals, one of the animals must be a Zebra. Pangrams containing only a Zebra are not relevant. 
--------------------------------------------------------------------------------
/tira-ir-starters/pyterrier/sample-input-full-rank-gz/queries.xml:
--------------------------------------------------------------------------------
 1 | <topics>
 2 |   <topic>
 3 |     <query>
 4 |       fox jumps above animal
 5 |     </query>
 6 |     <original_query>
 7 |       <query_id>
 8 |         1
 9 |       </query_id>
10 |       <title>
11 |         fox jumps above animal
12 |       </title>
13 |       <description>
14 |         What pangrams have a fox jumping above some animal?
15 |       </description>
16 |       <narrative>
17 |         Relevant pangrams have a fox jumping over an animal (e.g., a dog). Pangrams containing a fox that is not jumping or jumps over something that is not an animal are not relevant.
18 |       </narrative>
19 |     </original_query>
20 |   </topic>
21 |   <topic>
22 |     <query>
23 |       multiple animals including a zebra
24 |     </query>
25 |     <original_query>
26 |       <query_id>
27 |         2
28 |       </query_id>
29 |       <title>
30 |         multiple animals including a zebra
31 |       </title>
32 |       <description>
33 |         Which pangrams have multiple animals where one of the animals is a zebra?
34 |       </description>
35 |       <narrative>
36 |         Relevant pangrams have at least two animals, one of the animals must be a Zebra. Pangrams containing only a Zebra are not relevant.
37 |       </narrative>
38 |     </original_query>
39 |   </topic>
40 | </topics>
--------------------------------------------------------------------------------
/tira-ir-starters/pyterrier/sample-input-full-rank/documents.jsonl:
--------------------------------------------------------------------------------
1 | {"docno": "pangram-01", "text": "How quickly daft jumping zebras vex.", "original_document": {"doc_id": "pangram-01", "text": "How quickly daft jumping zebras vex.", "letters": 30}}
2 | {"docno": "pangram-02", "text": "Quick fox jumps nightly above wizard.", "original_document": {"doc_id": "pangram-02", "text": "Quick fox jumps nightly above wizard.", "letters": 31}}
3 | {"docno": "pangram-03", "text": "The jay, pig, fox, zebra and my wolves quack!", "original_document": {"doc_id": "pangram-03", "text": "The jay, pig, fox, zebra and my wolves quack!", "letters": 33}}
4 | {"docno": "pangram-04", "text": "The quick brown fox jumps over the lazy dog.", "original_document": {"doc_id": "pangram-04", "text": "The quick brown fox jumps over the lazy dog.", "letters": 35}}
5 | {"docno": "pangram-05", "text": "As quirky joke, chefs won\u2019t pay devil magic zebra tax.", "original_document": {"doc_id": "pangram-05", "text": "As quirky joke, chefs won\u2019t pay devil magic zebra tax.", "letters": 42}}
6 | 
--------------------------------------------------------------------------------
/tira-ir-starters/pyterrier/sample-input-full-rank/metadata.json:
--------------------------------------------------------------------------------
1 | {"ir_datasets_id": "pangrams"}
2 | 
--------------------------------------------------------------------------------
/tira-ir-starters/pyterrier/sample-input-full-rank/queries.jsonl:
--------------------------------------------------------------------------------
1 | {"qid": "1", "query": "fox jumps above animal", "original_query": {"query_id": "1", "title": "fox jumps above animal", "description": "What pangrams have a fox jumping above some animal?", "narrative": "Relevant pangrams have a fox jumping over an animal (e.g., a dog). Pangrams containing a fox that is not jumping or jumps over something that is not an animal are not relevant."}}
2 | {"qid": "2", "query": "multiple animals including a zebra", "original_query": {"query_id": "2", "title": "multiple animals including a zebra", "description": "Which pangrams have multiple animals where one of the animals is a zebra?", "narrative": "Relevant pangrams have at least two animals, one of the animals must be a Zebra. Pangrams containing only a Zebra are not relevant."}}
3 | 
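To show how this full-rank sample fits together, a minimal sketch that indexes the five pangram documents and retrieves with BM25 (the DFIndexer-based pipeline and the index path are illustrative assumptions, not files of this repository):

import pandas as pd
import pyterrier as pt

if not pt.started():
    pt.init()

docs = pd.read_json('sample-input-full-rank/documents.jsonl', lines=True)
topics = pd.read_json('sample-input-full-rank/queries.jsonl', lines=True)
topics['qid'] = topics['qid'].astype(str)

# Index the text field, keeping docno as document metadata.
index_ref = pt.DFIndexer('./pangram-index', overwrite=True).index(docs['text'], docs['docno'])

bm25 = pt.BatchRetrieve(index_ref, wmodel='BM25')
run = bm25.transform(topics[['qid', 'query']])
print(run[['qid', 'docno', 'rank', 'score']])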
--------------------------------------------------------------------------------
/tira-ir-starters/pyterrier/sample-input-full-rank/queries.xml:
--------------------------------------------------------------------------------
 1 | <topics>
 2 |   <topic>
 3 |     <query>
 4 |       fox jumps above animal
 5 |     </query>
 6 |     <original_query>
 7 |       <query_id>
 8 |         1
 9 |       </query_id>
10 |       <title>
11 |         fox jumps above animal
12 |       </title>
13 |       <description>
14 |         What pangrams have a fox jumping above some animal?
15 |       </description>
16 |       <narrative>
17 |         Relevant pangrams have a fox jumping over an animal (e.g., a dog). Pangrams containing a fox that is not jumping or jumps over something that is not an animal are not relevant.
18 |       </narrative>
19 |     </original_query>
20 |   </topic>
21 |   <topic>
22 |     <query>
23 |       multiple animals including a zebra
24 |     </query>
25 |     <original_query>
26 |       <query_id>
27 |         2
28 |       </query_id>
29 |       <title>
30 |         multiple animals including a zebra
31 |       </title>
32 |       <description>
33 |         Which pangrams have multiple animals where one of the animals is a zebra?
34 |       </description>
35 |       <narrative>
36 |         Relevant pangrams have at least two animals, one of the animals must be a Zebra. Pangrams containing only a Zebra are not relevant.
37 |       </narrative>
38 |     </original_query>
39 |   </topic>
40 | </topics>
--------------------------------------------------------------------------------
/tira-ir-starters/pyterrier/sample-input/rerank.jsonl.gz:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/tira-io/ir-experiment-platform/9fded863a5fb06332ee73b2a6913d763eed3510d/tira-ir-starters/pyterrier/sample-input/rerank.jsonl.gz
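A sketch of consuming rerank.jsonl.gz with a text scorer in the spirit of the bm25_scorer output that opens this section; the field names qid, query, docno, and text are assumptions mirroring the full-rank samples:

import pandas as pd
import pyterrier as pt

if not pt.started():
    pt.init()

# pandas infers the gzip compression from the .gz suffix.
pairs = pd.read_json('sample-input/rerank.jsonl.gz', lines=True)
pairs['qid'] = pairs['qid'].astype(str)

# Score each pre-retrieved query-document pair on its text field.
bm25_scorer = pt.text.scorer(body_attr='text', wmodel='BM25')
results = bm25_scorer(pairs[['qid', 'query', 'docno', 'text']])
print(results.head())

--------------------------------------------------------------------------------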