├── .gitignore
├── CHANGELOG.md
├── CODE_OF_CONDUCT.md
├── CONTRIBUTING.md
├── Dockerfile
├── LICENSE
├── README.md
├── data
    ├── README.md
    ├── build_es_index_passages.py
    ├── convert_dataset_to_dpr_retriever_input_file.py
    ├── extract_paragraphs_from_page_htmls.py
    ├── get_all_page_ids_from_cirrussearch.py
    ├── get_page_htmls.py
    ├── make_dpr_qas_dataset.py
    ├── make_dpr_retriever_dataset.py
    ├── make_dpr_wikipedia_split_dataset.py
    ├── make_passages_from_paragraphs.py
    └── requirements.txt
├── dense_retriever.py
├── do_example_run.sh
├── dpr
    ├── __init__.py
    ├── data
    │   ├── __init__.py
    │   ├── qa_validation.py
    │   └── reader_data.py
    ├── indexer
    │   └── faiss_indexers.py
    ├── models
    │   ├── __init__.py
    │   ├── biencoder.py
    │   ├── fairseq_models.py
    │   ├── hf_models.py
    │   ├── pytext_models.py
    │   └── reader.py
    ├── options.py
    └── utils
    │   ├── __init__.py
    │   ├── data_utils.py
    │   ├── dist_utils.py
    │   ├── model_utils.py
    │   └── tokenizers.py
├── generate_dense_embeddings.py
├── imgs
    ├── aio.png
    ├── open-qa.png
    └── retriever.png
├── instruction_of_dirtree.md
├── model
    └── README.md
├── preprocess_reader_data.py
├── requirements.txt
├── scripts
    ├── configs
    │   ├── config.pth
    │   ├── reader_base.json
    │   └── retriever_base.json
    ├── download_data.sh
    ├── download_model.sh
    ├── reader
    │   ├── eval_reader.sh
    │   └── train_reader.sh
    └── retriever
    │   ├── encode_ctxs.sh
    │   ├── retrieve_passage.sh
    │   └── train_retriever.sh
├── setup.py
├── submission.sh
├── train_dense_encoder.py
└── train_reader.py

/.gitignore:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/cl-tohoku/AIO2_DPR_baseline/HEAD/.gitignore
--------------------------------------------------------------------------------
/CHANGELOG.md:
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
/CODE_OF_CONDUCT.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/cl-tohoku/AIO2_DPR_baseline/HEAD/CODE_OF_CONDUCT.md
--------------------------------------------------------------------------------
/CONTRIBUTING.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/cl-tohoku/AIO2_DPR_baseline/HEAD/CONTRIBUTING.md
--------------------------------------------------------------------------------
/Dockerfile:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/cl-tohoku/AIO2_DPR_baseline/HEAD/Dockerfile
--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/cl-tohoku/AIO2_DPR_baseline/HEAD/LICENSE
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/cl-tohoku/AIO2_DPR_baseline/HEAD/README.md
--------------------------------------------------------------------------------
/data/README.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/cl-tohoku/AIO2_DPR_baseline/HEAD/data/README.md
--------------------------------------------------------------------------------
/data/build_es_index_passages.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/cl-tohoku/AIO2_DPR_baseline/HEAD/data/build_es_index_passages.py
--------------------------------------------------------------------------------
/data/convert_dataset_to_dpr_retriever_input_file.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/cl-tohoku/AIO2_DPR_baseline/HEAD/data/convert_dataset_to_dpr_retriever_input_file.py
--------------------------------------------------------------------------------
/data/extract_paragraphs_from_page_htmls.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/cl-tohoku/AIO2_DPR_baseline/HEAD/data/extract_paragraphs_from_page_htmls.py
--------------------------------------------------------------------------------
/data/get_all_page_ids_from_cirrussearch.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/cl-tohoku/AIO2_DPR_baseline/HEAD/data/get_all_page_ids_from_cirrussearch.py
--------------------------------------------------------------------------------
/data/get_page_htmls.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/cl-tohoku/AIO2_DPR_baseline/HEAD/data/get_page_htmls.py
--------------------------------------------------------------------------------
/data/make_dpr_qas_dataset.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/cl-tohoku/AIO2_DPR_baseline/HEAD/data/make_dpr_qas_dataset.py
--------------------------------------------------------------------------------
/data/make_dpr_retriever_dataset.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/cl-tohoku/AIO2_DPR_baseline/HEAD/data/make_dpr_retriever_dataset.py
--------------------------------------------------------------------------------
/data/make_dpr_wikipedia_split_dataset.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/cl-tohoku/AIO2_DPR_baseline/HEAD/data/make_dpr_wikipedia_split_dataset.py
--------------------------------------------------------------------------------
/data/make_passages_from_paragraphs.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/cl-tohoku/AIO2_DPR_baseline/HEAD/data/make_passages_from_paragraphs.py
--------------------------------------------------------------------------------
/data/requirements.txt:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/cl-tohoku/AIO2_DPR_baseline/HEAD/data/requirements.txt
--------------------------------------------------------------------------------
/dense_retriever.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/cl-tohoku/AIO2_DPR_baseline/HEAD/dense_retriever.py
--------------------------------------------------------------------------------
/do_example_run.sh:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/cl-tohoku/AIO2_DPR_baseline/HEAD/do_example_run.sh
--------------------------------------------------------------------------------
/dpr/__init__.py:
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
/dpr/data/__init__.py:
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
/dpr/data/qa_validation.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/cl-tohoku/AIO2_DPR_baseline/HEAD/dpr/data/qa_validation.py
--------------------------------------------------------------------------------
/dpr/data/reader_data.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/cl-tohoku/AIO2_DPR_baseline/HEAD/dpr/data/reader_data.py
--------------------------------------------------------------------------------
/dpr/indexer/faiss_indexers.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/cl-tohoku/AIO2_DPR_baseline/HEAD/dpr/indexer/faiss_indexers.py
--------------------------------------------------------------------------------
/dpr/models/__init__.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/cl-tohoku/AIO2_DPR_baseline/HEAD/dpr/models/__init__.py
--------------------------------------------------------------------------------
/dpr/models/biencoder.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/cl-tohoku/AIO2_DPR_baseline/HEAD/dpr/models/biencoder.py
--------------------------------------------------------------------------------
/dpr/models/fairseq_models.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/cl-tohoku/AIO2_DPR_baseline/HEAD/dpr/models/fairseq_models.py
--------------------------------------------------------------------------------
/dpr/models/hf_models.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/cl-tohoku/AIO2_DPR_baseline/HEAD/dpr/models/hf_models.py
--------------------------------------------------------------------------------
/dpr/models/pytext_models.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/cl-tohoku/AIO2_DPR_baseline/HEAD/dpr/models/pytext_models.py
--------------------------------------------------------------------------------
/dpr/models/reader.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/cl-tohoku/AIO2_DPR_baseline/HEAD/dpr/models/reader.py
--------------------------------------------------------------------------------
/dpr/options.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/cl-tohoku/AIO2_DPR_baseline/HEAD/dpr/options.py
--------------------------------------------------------------------------------
/dpr/utils/__init__.py:
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
/dpr/utils/data_utils.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/cl-tohoku/AIO2_DPR_baseline/HEAD/dpr/utils/data_utils.py
--------------------------------------------------------------------------------
/dpr/utils/dist_utils.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/cl-tohoku/AIO2_DPR_baseline/HEAD/dpr/utils/dist_utils.py
--------------------------------------------------------------------------------
/dpr/utils/model_utils.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/cl-tohoku/AIO2_DPR_baseline/HEAD/dpr/utils/model_utils.py
--------------------------------------------------------------------------------
/dpr/utils/tokenizers.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/cl-tohoku/AIO2_DPR_baseline/HEAD/dpr/utils/tokenizers.py
--------------------------------------------------------------------------------
/generate_dense_embeddings.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/cl-tohoku/AIO2_DPR_baseline/HEAD/generate_dense_embeddings.py
--------------------------------------------------------------------------------
/imgs/aio.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/cl-tohoku/AIO2_DPR_baseline/HEAD/imgs/aio.png
--------------------------------------------------------------------------------
/imgs/open-qa.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/cl-tohoku/AIO2_DPR_baseline/HEAD/imgs/open-qa.png
--------------------------------------------------------------------------------
/imgs/retriever.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/cl-tohoku/AIO2_DPR_baseline/HEAD/imgs/retriever.png
--------------------------------------------------------------------------------
/instruction_of_dirtree.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/cl-tohoku/AIO2_DPR_baseline/HEAD/instruction_of_dirtree.md
--------------------------------------------------------------------------------
/model/README.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/cl-tohoku/AIO2_DPR_baseline/HEAD/model/README.md
--------------------------------------------------------------------------------
/preprocess_reader_data.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/cl-tohoku/AIO2_DPR_baseline/HEAD/preprocess_reader_data.py
--------------------------------------------------------------------------------
/requirements.txt:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/cl-tohoku/AIO2_DPR_baseline/HEAD/requirements.txt
--------------------------------------------------------------------------------
/scripts/configs/config.pth:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/cl-tohoku/AIO2_DPR_baseline/HEAD/scripts/configs/config.pth
--------------------------------------------------------------------------------
/scripts/configs/reader_base.json:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/cl-tohoku/AIO2_DPR_baseline/HEAD/scripts/configs/reader_base.json
--------------------------------------------------------------------------------
/scripts/configs/retriever_base.json:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/cl-tohoku/AIO2_DPR_baseline/HEAD/scripts/configs/retriever_base.json
--------------------------------------------------------------------------------
/scripts/download_data.sh:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/cl-tohoku/AIO2_DPR_baseline/HEAD/scripts/download_data.sh
--------------------------------------------------------------------------------
/scripts/download_model.sh:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/cl-tohoku/AIO2_DPR_baseline/HEAD/scripts/download_model.sh
--------------------------------------------------------------------------------
/scripts/reader/eval_reader.sh:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/cl-tohoku/AIO2_DPR_baseline/HEAD/scripts/reader/eval_reader.sh
--------------------------------------------------------------------------------
/scripts/reader/train_reader.sh:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/cl-tohoku/AIO2_DPR_baseline/HEAD/scripts/reader/train_reader.sh
--------------------------------------------------------------------------------
/scripts/retriever/encode_ctxs.sh:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/cl-tohoku/AIO2_DPR_baseline/HEAD/scripts/retriever/encode_ctxs.sh
--------------------------------------------------------------------------------
/scripts/retriever/retrieve_passage.sh:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/cl-tohoku/AIO2_DPR_baseline/HEAD/scripts/retriever/retrieve_passage.sh
--------------------------------------------------------------------------------
/scripts/retriever/train_retriever.sh:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/cl-tohoku/AIO2_DPR_baseline/HEAD/scripts/retriever/train_retriever.sh
--------------------------------------------------------------------------------
/setup.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/cl-tohoku/AIO2_DPR_baseline/HEAD/setup.py
--------------------------------------------------------------------------------
/submission.sh:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/cl-tohoku/AIO2_DPR_baseline/HEAD/submission.sh
--------------------------------------------------------------------------------
/train_dense_encoder.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/cl-tohoku/AIO2_DPR_baseline/HEAD/train_dense_encoder.py
--------------------------------------------------------------------------------
/train_reader.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/cl-tohoku/AIO2_DPR_baseline/HEAD/train_reader.py
--------------------------------------------------------------------------------