├── .gitignore
├── LICENSE
├── README.md
├── dataset_zoo
    ├── __init__.py
    ├── aro_datasets.py
    ├── constants.py
    ├── perturbations.py
    ├── retrieval.py
    └── utils.py
├── main_aro.py
├── main_retrieval.py
├── misc
    └── __init__.py
├── model_zoo
    ├── __init__.py
    ├── blip_models.py
    ├── blip_utils
    │   ├── README.md
    │   ├── blip.py
    │   ├── blip_itm.py
    │   ├── blip_pretrain.py
    │   ├── blip_retrieval.py
    │   ├── med.py
    │   ├── utils.py
    │   └── vit.py
    ├── clip_models.py
    ├── constants.py
    ├── flava.py
    ├── xvlm_models.py
    └── xvlm_utils
    │   ├── README.md
    │   ├── box_ops.py
    │   ├── clip_vit.py
    │   ├── swin_transformer.py
    │   ├── tokenization_bert.py
    │   ├── tokenization_roberta.py
    │   ├── vit.py
    │   ├── xbert.py
    │   ├── xroberta.py
    │   └── xvlm.py
├── notebooks
    └── Replicate ARO! VG-Relation, VG-Attribution.ipynb
├── scripts
    ├── create_environment.sh
    ├── reproduce_aro.sh
    └── reproduce_retrieval.sh
└── temp_data
    ├── train_neg_clip.tsv
    └── valid_neg_clip.tsv

/.gitignore:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mertyg/vision-language-models-are-bows/HEAD/.gitignore
--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mertyg/vision-language-models-are-bows/HEAD/LICENSE
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mertyg/vision-language-models-are-bows/HEAD/README.md
--------------------------------------------------------------------------------
/dataset_zoo/__init__.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mertyg/vision-language-models-are-bows/HEAD/dataset_zoo/__init__.py
--------------------------------------------------------------------------------
/dataset_zoo/aro_datasets.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mertyg/vision-language-models-are-bows/HEAD/dataset_zoo/aro_datasets.py
--------------------------------------------------------------------------------
/dataset_zoo/constants.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mertyg/vision-language-models-are-bows/HEAD/dataset_zoo/constants.py
--------------------------------------------------------------------------------
/dataset_zoo/perturbations.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mertyg/vision-language-models-are-bows/HEAD/dataset_zoo/perturbations.py
--------------------------------------------------------------------------------
/dataset_zoo/retrieval.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mertyg/vision-language-models-are-bows/HEAD/dataset_zoo/retrieval.py
--------------------------------------------------------------------------------
/dataset_zoo/utils.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mertyg/vision-language-models-are-bows/HEAD/dataset_zoo/utils.py
--------------------------------------------------------------------------------
/main_aro.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mertyg/vision-language-models-are-bows/HEAD/main_aro.py
--------------------------------------------------------------------------------
/main_retrieval.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mertyg/vision-language-models-are-bows/HEAD/main_retrieval.py
--------------------------------------------------------------------------------
/misc/__init__.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mertyg/vision-language-models-are-bows/HEAD/misc/__init__.py
--------------------------------------------------------------------------------
/model_zoo/__init__.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mertyg/vision-language-models-are-bows/HEAD/model_zoo/__init__.py
--------------------------------------------------------------------------------
/model_zoo/blip_models.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mertyg/vision-language-models-are-bows/HEAD/model_zoo/blip_models.py
--------------------------------------------------------------------------------
/model_zoo/blip_utils/README.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mertyg/vision-language-models-are-bows/HEAD/model_zoo/blip_utils/README.md
--------------------------------------------------------------------------------
/model_zoo/blip_utils/blip.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mertyg/vision-language-models-are-bows/HEAD/model_zoo/blip_utils/blip.py
--------------------------------------------------------------------------------
/model_zoo/blip_utils/blip_itm.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mertyg/vision-language-models-are-bows/HEAD/model_zoo/blip_utils/blip_itm.py
--------------------------------------------------------------------------------
/model_zoo/blip_utils/blip_pretrain.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mertyg/vision-language-models-are-bows/HEAD/model_zoo/blip_utils/blip_pretrain.py
--------------------------------------------------------------------------------
/model_zoo/blip_utils/blip_retrieval.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mertyg/vision-language-models-are-bows/HEAD/model_zoo/blip_utils/blip_retrieval.py
--------------------------------------------------------------------------------
/model_zoo/blip_utils/med.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mertyg/vision-language-models-are-bows/HEAD/model_zoo/blip_utils/med.py
--------------------------------------------------------------------------------
/model_zoo/blip_utils/utils.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mertyg/vision-language-models-are-bows/HEAD/model_zoo/blip_utils/utils.py
--------------------------------------------------------------------------------
/model_zoo/blip_utils/vit.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mertyg/vision-language-models-are-bows/HEAD/model_zoo/blip_utils/vit.py
--------------------------------------------------------------------------------
/model_zoo/clip_models.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mertyg/vision-language-models-are-bows/HEAD/model_zoo/clip_models.py
--------------------------------------------------------------------------------
/model_zoo/constants.py:
--------------------------------------------------------------------------------
CACHE_DIR="~/.cache"
--------------------------------------------------------------------------------
/model_zoo/flava.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mertyg/vision-language-models-are-bows/HEAD/model_zoo/flava.py
--------------------------------------------------------------------------------
/model_zoo/xvlm_models.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mertyg/vision-language-models-are-bows/HEAD/model_zoo/xvlm_models.py
--------------------------------------------------------------------------------
/model_zoo/xvlm_utils/README.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mertyg/vision-language-models-are-bows/HEAD/model_zoo/xvlm_utils/README.md
--------------------------------------------------------------------------------
/model_zoo/xvlm_utils/box_ops.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mertyg/vision-language-models-are-bows/HEAD/model_zoo/xvlm_utils/box_ops.py
--------------------------------------------------------------------------------
/model_zoo/xvlm_utils/clip_vit.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mertyg/vision-language-models-are-bows/HEAD/model_zoo/xvlm_utils/clip_vit.py
--------------------------------------------------------------------------------
/model_zoo/xvlm_utils/swin_transformer.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mertyg/vision-language-models-are-bows/HEAD/model_zoo/xvlm_utils/swin_transformer.py
--------------------------------------------------------------------------------
/model_zoo/xvlm_utils/tokenization_bert.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mertyg/vision-language-models-are-bows/HEAD/model_zoo/xvlm_utils/tokenization_bert.py
--------------------------------------------------------------------------------
/model_zoo/xvlm_utils/tokenization_roberta.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mertyg/vision-language-models-are-bows/HEAD/model_zoo/xvlm_utils/tokenization_roberta.py
--------------------------------------------------------------------------------
/model_zoo/xvlm_utils/vit.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mertyg/vision-language-models-are-bows/HEAD/model_zoo/xvlm_utils/vit.py
--------------------------------------------------------------------------------
/model_zoo/xvlm_utils/xbert.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mertyg/vision-language-models-are-bows/HEAD/model_zoo/xvlm_utils/xbert.py
--------------------------------------------------------------------------------
/model_zoo/xvlm_utils/xroberta.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mertyg/vision-language-models-are-bows/HEAD/model_zoo/xvlm_utils/xroberta.py
--------------------------------------------------------------------------------
/model_zoo/xvlm_utils/xvlm.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mertyg/vision-language-models-are-bows/HEAD/model_zoo/xvlm_utils/xvlm.py
--------------------------------------------------------------------------------
/notebooks/Replicate ARO! VG-Relation, VG-Attribution.ipynb:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mertyg/vision-language-models-are-bows/HEAD/notebooks/Replicate ARO! VG-Relation, VG-Attribution.ipynb
--------------------------------------------------------------------------------
/scripts/create_environment.sh:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mertyg/vision-language-models-are-bows/HEAD/scripts/create_environment.sh
--------------------------------------------------------------------------------
/scripts/reproduce_aro.sh:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mertyg/vision-language-models-are-bows/HEAD/scripts/reproduce_aro.sh
--------------------------------------------------------------------------------
/scripts/reproduce_retrieval.sh:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mertyg/vision-language-models-are-bows/HEAD/scripts/reproduce_retrieval.sh
--------------------------------------------------------------------------------
/temp_data/train_neg_clip.tsv:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mertyg/vision-language-models-are-bows/HEAD/temp_data/train_neg_clip.tsv
--------------------------------------------------------------------------------
/temp_data/valid_neg_clip.tsv:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mertyg/vision-language-models-are-bows/HEAD/temp_data/valid_neg_clip.tsv
--------------------------------------------------------------------------------