├── images
│   ├── vq2a_examples.png
│   └── gif_vq2a_approach.gif
├── LICENSE
└── README.md

/images/vq2a_examples.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/google-research-datasets/maverics/HEAD/images/vq2a_examples.png
--------------------------------------------------------------------------------

/images/gif_vq2a_approach.gif:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/google-research-datasets/maverics/HEAD/images/gif_vq2a_approach.gif
--------------------------------------------------------------------------------

/LICENSE:
--------------------------------------------------------------------------------
1 | The dataset may be freely used for any purpose, although acknowledgement of
2 | Google LLC ("Google") as the data source would be appreciated. The dataset is
3 | provided "AS IS" without any warranty, express or implied. Google disclaims all
4 | liability for any damages, direct or indirect, resulting from the use of the
5 | dataset.
6 |
--------------------------------------------------------------------------------

/README.md:
--------------------------------------------------------------------------------
1 | # MAVERICS
2 |
3 |
4 |
5 |
12 |
13 |
23 | dataset str: dataset name
24 | split str: dataset split
25 | annotations List of image-question-answers triplets, each of which is
26 | -- image_id str: image ID
27 | -- caption str: image caption
28 | -- qa_pairs List of question-answer pairs, each of which is
29 | ---- question_id str: question ID
30 | ---- raw_question str: raw question
31 | ---- question str: processed question
32 | ---- answers List of str: 10 ground-truth answers
33 |
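
The field listing above maps directly onto nested dictionaries and lists when the annotations are stored as JSON. As an illustration only, a split could be read as in the sketch below; the file name is a placeholder and the field access assumes the JSON mirrors the layout above exactly.

```python
import json

# Minimal sketch of reading one MAVERICS split, assuming a JSON file laid out
# as described above. "maverics_vqa_val.json" is a placeholder name, not an
# official path from this repository.
with open("maverics_vqa_val.json") as f:
    data = json.load(f)

print(data["dataset"], data["split"])  # dataset name and split

# "annotations" is a list of per-image entries, each holding the image ID,
# its caption, and a list of question-answer pairs.
for ann in data["annotations"]:
    image_id = ann["image_id"]
    caption = ann["caption"]
    for qa in ann["qa_pairs"]:
        question = qa["question"]   # processed question
        answers = qa["answers"]     # list of 10 ground-truth answers
        print(image_id, qa["question_id"], question, answers[0])
```
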
44 | @inproceedings{changpinyo2022vq2a,
45 | title = {All You May Need for VQA are Image Captions},
46 | author = {Changpinyo, Soravit and Kukliansky, Doron and Szpektor, Idan and Chen, Xi and Ding, Nan and Soricut, Radu},
47 | booktitle = {NAACL},
48 | year = {2022},
49 | }
50 |