├── .gitignore
├── LICENSE
├── README.md
├── bprop
│   ├── README.md
│   ├── extract_bert_attention.py
│   ├── multiprocessing_generate_word_sets.py
│   └── transform_vanilla_attention_to_term_distribution.py
├── data
│   ├── inquery
│   ├── msmarco_info
│   │   └── msmarco_toy.data
│   ├── stopwords.txt
│   └── wiki_info
│       └── wiki_toy.data
├── prop
│   ├── multiprocessing_generate_pairwise_instances.py
│   ├── multiprocessing_generate_word_sets.py
│   └── preprocessing_data.py
├── pytorch_pretrain_bert
│   ├── __init__.py
│   ├── convert_tf_checkpoint_to_pytorch.py
│   ├── file_utils.py
│   ├── modeling.py
│   ├── optimization.py
│   └── tokenization.py
├── requirements.txt
├── run_pretraining.py
└── scripts
    ├── extract_bert_attention.sh
    ├── generate_pair_instances.sh
    ├── generate_word_sets.sh
    ├── preprocess.sh
    ├── run_pretrain.sh
    └── transform2term_distribution.sh

/.gitignore:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Albert-Ma/PROP/HEAD/.gitignore
--------------------------------------------------------------------------------

/LICENSE:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Albert-Ma/PROP/HEAD/LICENSE
--------------------------------------------------------------------------------

/README.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Albert-Ma/PROP/HEAD/README.md
--------------------------------------------------------------------------------

/bprop/README.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Albert-Ma/PROP/HEAD/bprop/README.md
--------------------------------------------------------------------------------

/bprop/extract_bert_attention.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Albert-Ma/PROP/HEAD/bprop/extract_bert_attention.py
--------------------------------------------------------------------------------

/bprop/multiprocessing_generate_word_sets.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Albert-Ma/PROP/HEAD/bprop/multiprocessing_generate_word_sets.py
--------------------------------------------------------------------------------

/bprop/transform_vanilla_attention_to_term_distribution.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Albert-Ma/PROP/HEAD/bprop/transform_vanilla_attention_to_term_distribution.py
--------------------------------------------------------------------------------

/data/inquery:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Albert-Ma/PROP/HEAD/data/inquery
--------------------------------------------------------------------------------

/data/msmarco_info/msmarco_toy.data:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Albert-Ma/PROP/HEAD/data/msmarco_info/msmarco_toy.data
--------------------------------------------------------------------------------

/data/stopwords.txt:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Albert-Ma/PROP/HEAD/data/stopwords.txt
--------------------------------------------------------------------------------

/data/wiki_info/wiki_toy.data:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Albert-Ma/PROP/HEAD/data/wiki_info/wiki_toy.data
--------------------------------------------------------------------------------

/prop/multiprocessing_generate_pairwise_instances.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Albert-Ma/PROP/HEAD/prop/multiprocessing_generate_pairwise_instances.py
--------------------------------------------------------------------------------

/prop/multiprocessing_generate_word_sets.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Albert-Ma/PROP/HEAD/prop/multiprocessing_generate_word_sets.py
--------------------------------------------------------------------------------

/prop/preprocessing_data.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Albert-Ma/PROP/HEAD/prop/preprocessing_data.py
--------------------------------------------------------------------------------

/pytorch_pretrain_bert/__init__.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Albert-Ma/PROP/HEAD/pytorch_pretrain_bert/__init__.py
--------------------------------------------------------------------------------

/pytorch_pretrain_bert/convert_tf_checkpoint_to_pytorch.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Albert-Ma/PROP/HEAD/pytorch_pretrain_bert/convert_tf_checkpoint_to_pytorch.py
--------------------------------------------------------------------------------

/pytorch_pretrain_bert/file_utils.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Albert-Ma/PROP/HEAD/pytorch_pretrain_bert/file_utils.py
--------------------------------------------------------------------------------

/pytorch_pretrain_bert/modeling.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Albert-Ma/PROP/HEAD/pytorch_pretrain_bert/modeling.py
--------------------------------------------------------------------------------

/pytorch_pretrain_bert/optimization.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Albert-Ma/PROP/HEAD/pytorch_pretrain_bert/optimization.py
--------------------------------------------------------------------------------

/pytorch_pretrain_bert/tokenization.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Albert-Ma/PROP/HEAD/pytorch_pretrain_bert/tokenization.py
--------------------------------------------------------------------------------

/requirements.txt:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Albert-Ma/PROP/HEAD/requirements.txt
--------------------------------------------------------------------------------

/run_pretraining.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Albert-Ma/PROP/HEAD/run_pretraining.py
--------------------------------------------------------------------------------

/scripts/extract_bert_attention.sh:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Albert-Ma/PROP/HEAD/scripts/extract_bert_attention.sh
--------------------------------------------------------------------------------

/scripts/generate_pair_instances.sh:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Albert-Ma/PROP/HEAD/scripts/generate_pair_instances.sh
--------------------------------------------------------------------------------

/scripts/generate_word_sets.sh:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Albert-Ma/PROP/HEAD/scripts/generate_word_sets.sh
--------------------------------------------------------------------------------

/scripts/preprocess.sh:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Albert-Ma/PROP/HEAD/scripts/preprocess.sh
--------------------------------------------------------------------------------

/scripts/run_pretrain.sh:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Albert-Ma/PROP/HEAD/scripts/run_pretrain.sh
--------------------------------------------------------------------------------

/scripts/transform2term_distribution.sh:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Albert-Ma/PROP/HEAD/scripts/transform2term_distribution.sh
--------------------------------------------------------------------------------