├── .gitignore
├── LICENSE
├── README.md
├── base_main.py
├── datasets.py
├── descriptors
│   ├── descriptors_cars.json
│   ├── descriptors_cub.json
│   ├── descriptors_dtd.json
│   ├── descriptors_eurosat.json
│   ├── descriptors_fgvcaircraft.json
│   ├── descriptors_flowers102.json
│   ├── descriptors_food101.json
│   ├── descriptors_imagenet.json
│   ├── descriptors_pets.json
│   └── descriptors_places365.json
├── environment.yaml
├── evaluate_results.py
├── generate_concepts.py
├── generate_descriptors.py
├── images
│   ├── main.png
│   └── teaser.png
├── replicate_key_runs.sh
├── results
│   ├── baselines.csv
│   ├── baselines_concept.csv
│   ├── baselines_gpt.csv
│   ├── randomized_descriptions.csv
│   ├── randomized_descriptions_5xbudget.csv
│   ├── scrambled_descriptions.csv
│   ├── shared_randomized_descriptions.csv
│   ├── shared_randomized_descriptions_2xbudget.csv
│   ├── swapped_descriptions.csv
│   ├── waffleclip.csv
│   ├── waffleclip_concepts.csv
│   ├── waffleclip_gpt.csv
│   └── waffleclip_gpt_concepts.csv
├── setup.py
├── waffle_tools.py
└── word_list.pkl
/.gitignore:
--------------------------------------------------------------------------------
1 | *.pyc
2 | __pycache__
3 | precomputed_encs
4 | waffle.egg-info
5 |
--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
1 | MIT License
2 |
3 | Copyright (c) 2023 Karsten Roth
4 |
5 | Permission is hereby granted, free of charge, to any person obtaining a copy
6 | of this software and associated documentation files (the "Software"), to deal
7 | in the Software without restriction, including without limitation the rights
8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9 | copies of the Software, and to permit persons to whom the Software is
10 | furnished to do so, subject to the following conditions:
11 |
12 | The above copyright notice and this permission notice shall be included in all
13 | copies or substantial portions of the Software.
14 |
15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21 | SOFTWARE.
22 |
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # WaffleCLIP
2 |
3 | __Authors:__ [Karsten Roth](https://karroth.com/), [Jae Myung Kim](https://jaemyung-kim.github.io/), [Almut Sophia Koepke](https://www.eml-unitue.de/people/almut-sophia-koepke), [Oriol Vinyals](https://scholar.google.com/citations?user=NkzyCvUAAAAJ&hl=en), [Cordelia Schmid](https://scholar.google.com/citations?user=IvqCXP4AAAAJ&hl=en), [Zeynep Akata](https://www.eml-unitue.de/people/zeynep-akata)
4 |
5 | ---
6 |
7 | **Table of Contents**
8 |
9 | - [Waffling for Performance ](#waffling-for-performance-)
10 | - [Overview](#overview)
11 | - [Setting Up](#setting-up)
12 | - [Replicating](#replicating)
13 | - [Default Zero-Shot Visual Classification Performance of CLIP](#default-zero-shot-visual-classification-performance-of-clip)
14 | - [Utilizing GPT Descriptors](#utilizing-gpt-descriptors)
15 | - [Generating your own GPT Descriptors](#generating-your-own-gpt-descriptors)
16 | - [`Waffle`CLIP](#waffleclip)
17 | - [Utilizing High-Level Concept Guidance](#utilizing-high-level-concept-guidance)
18 | - [Extract your own High-Level Concepts](#extract-your-own-high-level-concepts)
19 | - [Putting Everything Together](#putting-everything-together)
20 | - [Repository Details](#repository-details)
21 | - [Citation](#citation)
22 |
23 | ---
24 |
25 | ## Overview
26 |
27 |
28 |
29 | This repository contains code to replicate key experiments from our paper [Waffling around for Performance: Visual Classification with Random Words and Broad Concepts](https://arxiv.org/abs/2306.07282).
30 | It should also provide a good starting point for any subsequent research looking to study improved (zero-shot) transfer performance of pretrained Vision Language Models (VLM), and extends the [great repository](https://github.com/sachit-menon/classify_by_description_release) associated with the original [Visual Classification via Description from Large Language Models](https://arxiv.org/abs/2210.07183) paper.
31 |
32 | If you find this repository useful or use it as part of your research, please consider [citing it](#citation).
33 |
34 |
35 |
36 | ---
37 |
38 | ## Setting Up
39 |
40 | **Set up environment:** To get started, simply set up the correct environment using `environment.yaml` by running
41 |
42 | ```bash
43 | conda env create -f environment.yaml
44 | ```
45 |
46 | and activate the environment via `conda activate waffle`.
47 |
48 | **Ensure `clip` is up-to-date:** The above command should install all relevant libraries. If you are not able to utilize the `ViT-L/14` backbone, it is likely because your version of `clip` is not up-to-date. In this case, consider re-installing it:
49 |
50 | ```bash
51 | pip install git+https://github.com/openai/CLIP.git
52 | ```
53 |
54 | **Downloading datasets:** The associated datasets should download automatically with the exception of `ImageNet`, which should follow the default `ImageNet2012`-structure. We also note that auto-downloading `Places365` sometimes fails, in which case the dataset may need to be downloaded by hand.
55 |
56 | ---
57 |
58 | ## Replicating
59 |
60 | In this section, we will detail how to run both baseline approaches (CLIP, CLIP + GPT Descriptors), as well as `Waffle`CLIP and its variants. We also showcase how to generate your own descriptions and extract your own high-level concepts to extend default prompts with.
61 |
62 | A large collection of sample runs to replicate baseline experiments is provided in `replicate_key_runs.sh`, which will create a selection of result csv-files in a `results`-folder. You should be able to extract the information in a more readable fashion by using `evaluate_results.py`.
63 |
64 | In the following part, we will provide a few more details on specific settings and how to run them.
65 |
66 | ### Default Zero-Shot Visual Classification Performance of CLIP
67 |
68 | To replicate the zero-shot classification performance of vanilla CLIP on e.g. the `ImageNet1K` test data, simply run
69 |
70 | ```bash
71 | python base_main.py --savename='baselines' --dataset=imagenet --mode=clip --model_size=ViT-B/32
72 | ```
73 |
74 | which will utilize the `ViT-B/32` backbone for `mode=clip` on `dataset=imagenet`. Generated results are then *appended* to a csv-file named `results/baselines.csv`. To get results for multiple datasets, simply run with respective changes in `--dataset`. A list of available datasets is provided in `waffle_tools.DATASETS`.
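As a rough illustration of what ends up in these csv-files, the following sketch parses rows in the column order that `base_main.py` writes (run description, then mean/std/max/min for top-1 and top-5). The column names and the `read_results` helper are hypothetical and only used here for readability; `evaluate_results.py` remains the intended tool, and the numbers in the example row are placeholders, not real results.

```python
import csv
import io

# Column order assumed from the row written at the end of base_main.py.
# The names themselves are hypothetical labels chosen for this sketch.
COLUMNS = ["run", "top1_mean", "top1_std", "top1_max", "top1_min",
           "top5_mean", "top5_std", "top5_max", "top5_min"]


def read_results(handle):
    """Parse rows of a results csv into dicts with float-valued metrics."""
    rows = []
    for raw in csv.reader(handle):
        if not raw:
            continue  # skip blank lines
        record = {"run": raw[0]}
        for name, value in zip(COLUMNS[1:], raw[1:]):
            record[name] = float(value)
        rows.append(record)
    return rows


# Placeholder row for illustration only; real files live in results/<savename>.csv.
example = io.StringIO(
    "savename=demo; dataset=imagenet; mode=clip,50.0,0.0,50.0,50.0,75.0,0.0,75.0,75.0\n"
)
for row in read_results(example):
    print(f"{row['run']}: top-1 {row['top1_mean']:.2f} +- {row['top1_std']:.2f}")
```

To read an actual file, pass an opened handle instead, e.g. `read_results(open('results/baselines.csv', newline=''))`.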
75 |
76 | ### Utilizing GPT Descriptors
77 |
78 | To extend the zero-shot classification of vanilla CLIP with GPT-3 generated descriptions following [Menon et al. 2023](https://arxiv.org/abs/2210.07183) on e.g. the `ImageNet1K` test data, simply run
79 |
80 | ```bash
81 | python base_main.py --savename='baselines_gpt' --dataset=imagenet --mode=gpt_descriptions --model_size=ViT-B/32
82 | ```
83 |
84 | Generated results are then *appended* to a csv-file named `results/baselines_gpt.csv`.
85 |
86 | ### Generating your own GPT Descriptors
87 |
88 | If you want to produce new GPT Descriptors for other datasets, simply utilize `generate_descriptors.py`, which is adapted from [Menon et al. 2023](https://arxiv.org/abs/2210.07183). Ensure that you have a valid OpenAI account.
89 |
90 | ### `Waffle`CLIP
91 |
92 | To perform zero-shot classification using default `Waffle`CLIP, simply run
93 |
94 | ```bash
95 | python base_main.py --savename='waffleclip' --dataset=imagenet --mode=waffle --waffle_count=15 --reps=7 --model_size=ViT-B/32
96 | ```
97 |
98 | which utilizes 15 pairs comprising a random word and a random character sequence descriptor (i.e. 30 descriptors in total) for `Waffle`CLIP. The results are computed over 7 different random initializations, and then averaged. Mean and standard deviations are then stored in `results/waffleclip.csv`.
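The pairing described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code in `waffle_tools.py`: the helper `make_waffle_descriptors`, the toy word pool, and the choice of 3 to 8 lowercase characters per random sequence are all hypothetical.

```python
import random
import string


def make_waffle_descriptors(word_pool, waffle_count, seed=0):
    """Return waffle_count pairs of (random word, random character sequence)."""
    rng = random.Random(seed)
    descriptors = []
    for _ in range(waffle_count):
        # One descriptor made of a randomly drawn word ...
        descriptors.append(rng.choice(word_pool))
        # ... and one made of random characters (length range is an assumption).
        length = rng.randint(3, 8)
        descriptors.append("".join(rng.choices(string.ascii_lowercase, k=length)))
    return descriptors


toy_pool = ["lamp", "river", "quiet", "marble"]  # stand-in for a real vocabulary
descriptors = make_waffle_descriptors(toy_pool, waffle_count=15)
print(len(descriptors))  # 30
```

Changing `seed` per repetition mirrors the idea of averaging over several random initializations via `--reps`.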
99 |
100 | ### Utilizing High-Level Concept Guidance
101 |
102 | Using high-level concept-guidance is as easy as using zero-shot vanilla CLIP. Given some high-level concept, e.g. `food` for the Food101 dataset, simply run
103 |
104 | ```bash
105 | python base_main.py --savename='baselines_concept' --dataset=food101 --mode=clip --model_size=ViT-B/32 --label_before_text='A photo of a food: a '
106 | ```
107 |
108 | which replaces the default prompt primer (`"A photo of a "`) with `"A photo of a food: a "`. This can similarly be applied to e.g. `Waffle`CLIP as shown above by also simply appending and changing the `--label_before_text` parameter.
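To make the role of the prompt primer concrete, here is a small sketch of how a prompt could be assembled from the command-line pieces. The `build_prompt` helper is hypothetical, and the assembly order is only inferred from the argument help strings in `base_main.py`; the actual construction in `waffle_tools.py` may differ.

```python
def build_prompt(classname, descriptor=None,
                 label_before_text="A photo of a ",
                 descriptor_separator=", ",
                 pre_descriptor_text="",
                 label_after_text="."):
    """Assemble one text prompt from primer, class name and optional descriptor."""
    prompt = label_before_text + classname
    if descriptor:
        # Assumed order: separator, optional pre-descriptor text, then the descriptor.
        prompt += descriptor_separator + pre_descriptor_text + descriptor
    return prompt + label_after_text


print(build_prompt("pizza"))
# A photo of a pizza.
print(build_prompt("pizza", label_before_text="A photo of a food: a "))
# A photo of a food: a pizza.
```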
109 |
110 | ### Extract your own High-Level Concepts
111 |
112 | Given a dataset with classnames, shared concepts can be extracted using `generate_concepts.py`, which selects a random subset of classnames and queries GPT-3 about commonalities.
113 |
114 | ### Putting Everything Together
115 |
116 | To run `Waffle`CLIP on top of GPT-Descriptors and with high-level concept guidance, one can simply combine the commands above and run
117 |
118 | ```bash
119 | python base_main.py --savename='waffleclip_gpt_concepts' --dataset=food101 --mode=waffle_and_gpt --waffle_count=15 --reps=7 --model_size=ViT-B/32 --label_before_text='A photo of a food: a '
120 | ```
121 |
122 | ---
123 |
124 | ## Repository Details
125 |
126 | In this section, we briefly detail the implemented CLIP and `Waffle`CLIP variants. Note that all of these methods, except for the notion of high-level concept guidance, are implemented in `waffle_tools.py > load_gpt_descriptions()`.
127 |
128 | As baseline methods (executable via `--mode=`), we have
129 |
130 | - `clip`: Default vanilla CLIP.
131 | - `gpt_descriptions`: Extends CLIP with GPT-descriptions per class available in the folder `descriptors`.
132 |
133 | For randomization studies, we have
134 |
135 | - `(shared_)random_descriptions`: Randomly shuffle and redistribute available descriptions to each class, either uniquely or shared between classes (`shared`). Setting `--randomization_budget=1` uses the same number of descriptors per class as the average for GPT-3 provided descriptors; note that the `argparse` default in `base_main.py` is `15`.
136 | - `swapped_descriptions`: Randomly interchange lists of descriptions between classes.
137 | - `scrambled_descriptions`: For lists of descriptions **per** class, we randomly shuffle words and word orders.
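The swapped and scrambled settings listed above can be sketched as follows. This is an illustrative approximation under stated assumptions, not the implementation in `waffle_tools.py`: both helper functions are hypothetical, and scrambling is interpreted here as shuffling all words within one class while keeping each descriptor's word count.

```python
import random


def swap_descriptions(descriptions, seed=0):
    """Reassign whole descriptor lists to randomly chosen classes."""
    rng = random.Random(seed)
    classes = list(descriptions)
    lists = [descriptions[c] for c in classes]
    rng.shuffle(lists)
    return dict(zip(classes, lists))


def scramble_descriptions(descriptions, seed=0):
    """Shuffle the words inside each class's own descriptors."""
    rng = random.Random(seed)
    scrambled = {}
    for classname, descs in descriptions.items():
        words = [w for d in descs for w in d.split()]
        rng.shuffle(words)
        # Refill the descriptors with shuffled words, keeping their word counts.
        out, pos = [], 0
        for d in descs:
            n = len(d.split())
            out.append(" ".join(words[pos:pos + n]))
            pos += n
        scrambled[classname] = out
    return scrambled


# Two entries taken from descriptors/descriptors_dtd.json.
dtd = {
    "braided": ["three or more strands of material woven together",
                "a tight, interlocking pattern"],
    "bumpy": ["an uneven surface", "raised or indented areas",
              "a rough or bumpy feel"],
}
print(swap_descriptions(dtd, seed=1))
print(scramble_descriptions(dtd, seed=1))
```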
138 |
139 | For `Waffle`CLIP variants, we have
140 |
141 | - `waffle`: Denotes the standard `Waffle`CLIP setup using pairs of random word and random character sequence descriptors. Use `--waffle_count` for the number of pairs.
142 | - `waffle_and_gpt`: Extends `Waffle`CLIP with associated GPT-3 descriptors. Note that the same set of random descriptors is used for each class, while the GPT-3 descriptors **per** class may differ. Use the same `--waffle_count` as in `--mode=waffle`; it is internally adapted to ensure the same number of descriptors and an equal balance between randomized and GPT-3 descriptors. Note that, due to the subsampling, the resulting random descriptor lists may vary slightly between classes.
143 |
144 | For additional CLIP uses, we have also included
145 |
146 | - `prompt_ensemble`: Takes `--waffle_count * 2` prompts randomly sampled from `waffle_tools.prompt_ensemble` and averages retrieval over this list of prompts instead.
147 | - `context_prompt_ensemble`: Extends prompt ensembling with high-level concept guidance. Note that the `--label_before_text` parameter still has to include the extracted high-level concepts.
148 |
149 | ---
150 |
151 | ## Citation
152 |
153 | ```bibtex
154 | @misc{roth2023waffling,
155 | title={Waffling around for Performance: Visual Classification with Random Words and Broad Concepts},
156 | author={Karsten Roth and Jae Myung Kim and A. Sophia Koepke and Oriol Vinyals and Cordelia Schmid and Zeynep Akata},
157 | year={2023},
158 | eprint={2306.07282},
159 | archivePrefix={arXiv},
160 | primaryClass={cs.CV}
161 | }
162 | ```
163 |
--------------------------------------------------------------------------------
/base_main.py:
--------------------------------------------------------------------------------
1 | #%%
2 | import os
3 | import warnings
4 | warnings.filterwarnings("ignore")
5 |
6 | import argparse
7 | import clip
8 | import numpy as np
9 | import pickle
10 | from termcolor import colored
11 | import torch
12 | import torch.nn.functional as F
13 | from torch.utils.data import DataLoader
14 | import torchmetrics
15 | import tqdm
16 |
17 | import waffle_tools
18 |
19 | #%%
20 | parser = argparse.ArgumentParser()
21 | ### Base arguments.
22 | parser.add_argument('--mode', type=str, default='clip', choices=waffle_tools.METHODS,
23 | help='VLM extension to use.')
24 | parser.add_argument('--seed', type=int, default=1,
25 | help='Replication seed.')
26 | parser.add_argument('--batch_size', type=int, default=640,
27 | help='Batchsize, mainly used to compute image embeddings.')
28 | parser.add_argument('--dataset', type=str, default='imagenetv2', choices=waffle_tools.DATASETS,
29 | help='Dataset to evaluate on.')
30 | parser.add_argument('--model_size', type=str, default='ViT-B/32', choices=waffle_tools.BACKBONES,
31 | help='Pretrained CLIP model to use.')
32 | parser.add_argument('--aggregate', type=str, default='mean', choices=('mean', 'max'),
33 | help='How to aggregate similarities of multiple language embeddings.')
34 | ### Text going before and after class names & descriptors.
35 | ### In the default case, this would be "A photo of a " ... "."
36 | parser.add_argument('--label_before_text', type=str, default='A photo of a ',
37 | help='Prompt-part going at the very beginning.')
38 | parser.add_argument('--label_after_text', type=str, default='.',
39 | help='Prompt-part going at the very end.')
40 | ###
41 | parser.add_argument('--pre_descriptor_text', type=str, default='',
42 | help='Text that goes right before the descriptor.')
43 | parser.add_argument('--descriptor_separator', type=str, default=', ',
44 | help='Text separating descriptor part and classname.')
45 | ###
46 | parser.add_argument('--dont_apply_descriptor_modification', action='store_true',
47 | help='Flag. If set, will not use "which (is/has/etc)" before descriptors.')
48 | parser.add_argument('--merge_predictions', action='store_true',
49 | help='Optional flag to merge generated embeddings before computing retrieval scores.')
50 | parser.add_argument('--save_model', type=str, default='',
51 | help='Set to a non-empty filename to store generated language embeddings & scores in a pickle file for all seed-repetitions.')
52 | parser.add_argument('--randomization_budget', type=int, default=15,
53 | help='Budget w.r.t. DCLIP for randomization ablations')
54 | parser.add_argument('--waffle_count', type=int, default=15,
55 | help='For WaffleCLIP: Number of randomized descriptor pairs to use')
56 | parser.add_argument('--reps', type=int, default=1,
57 | help='Number of repetitions to run a method for with changing randomization. Default value should be >7 for WaffleCLIP variants.')
58 | parser.add_argument('--savename', type=str, default='results',
59 | help='Name of csv-file in which results are stored.')
60 | ###
61 | parser.add_argument('--vmf_scale', type=float, default=1)
62 | opt = parser.parse_args()
63 | opt.apply_descriptor_modification = not opt.dont_apply_descriptor_modification
64 |
65 |
66 | #%% Get dataloader and load model.
67 | waffle_tools.seed_everything(opt.seed)
68 | opt, dataset = waffle_tools.setup(opt)
69 |
70 | print(colored(f"\nLoading model [{opt.model_size}] for dataset [{opt.dataset}] ...\n", "yellow", attrs=["bold"]))
71 |
72 | opt.device = device = torch.device('cuda')
73 | model, preprocess = clip.load(opt.model_size, device=device, jit=False)
74 | model.eval()
75 | model.requires_grad_(False)
76 |
77 |
78 | #%% Compute image embeddings if not already precomputed.
79 | precomputed_encs_folder = 'precomputed_encs'
80 | os.makedirs(precomputed_encs_folder, exist_ok=True)
81 | precomputed_encs_file = os.path.join(
82 | precomputed_encs_folder,
83 | f'{opt.dataset}_{opt.model_size.lower().replace("/", "")}.pkl'
84 | )
85 |
86 | if os.path.exists(precomputed_encs_file):
87 |     load_res = pickle.load(open(precomputed_encs_file, 'rb'))
88 | else:
89 |     dataloader = DataLoader(dataset, opt.batch_size, shuffle=False, num_workers=8, pin_memory=True)
90 |
91 |     enc_coll = []
92 |     label_coll = []
93 |     with torch.no_grad():
94 |         for batch_number, batch in enumerate(tqdm.tqdm(dataloader, desc='Precomputing image embeddings...')):
95 |             images, labels = batch
96 |             images = images.to(device)
97 |             labels = labels.to(device)
98 |             label_coll.append(labels)
99 |
100 |             image_encodings = F.normalize(model.encode_image(images))
101 |             enc_coll.append(image_encodings.cpu())
102 |     load_res = {'enc': enc_coll, 'labels': label_coll}
103 |     pickle.dump(load_res, open(precomputed_encs_file, 'wb'))
104 |
105 | encoding_coll = load_res['enc']
106 | label_coll = load_res['labels']
107 |
108 | #%% Generate Image Embeddings and compute scores.
109 | accs1 = []
110 | accs5 = []
111 | scores_1 = []
112 | scores_5 = []
113 | encodings = []
114 |
115 | for rep in range(opt.reps):
116 |
117 |     print(colored(f'----- Repetition {rep+1}/{opt.reps}', "green", attrs=["bold"]))
118 |     waffle_tools.seed_everything(rep)
119 |
120 |     accuracy_metric = torchmetrics.Accuracy().to(device)
121 |     accuracy_metric_top5 = torchmetrics.Accuracy(top_k=5).to(device)
122 |
123 |     description_encodings = waffle_tools.compute_description_encodings(opt, model, mode=opt.mode)
124 |
125 |     description_nums = [len(x) for x in description_encodings.values()]
126 |     print(f'Minimum and Maximum number of descriptions/class: {np.min(description_nums)} | {np.max(description_nums)}')
127 |
128 |     descr_means = torch.cat([x.mean(dim=0).reshape(1, -1) for x in description_encodings.values()])
129 |     descr_means /= descr_means.norm(dim=-1, keepdim=True)
130 |
131 |     for batch_number, (image_encodings, labels) in tqdm.tqdm(enumerate(zip(encoding_coll, label_coll)), total=len(encoding_coll), desc='Classifying image embeddings...'):
132 |         image_encodings = image_encodings.to(device)
133 |         labels = labels.to(device)
134 |
135 |         if opt.merge_predictions:
136 |             image_description_similarity = image_encodings @ descr_means.T
137 |         else:
138 |             image_description_similarity_t = [None] * opt.n_classes
139 |             image_description_similarity_cumulative = [None] * opt.n_classes
140 |
141 |             for i, (k, v) in enumerate(description_encodings.items()): # You can also vectorize this; it wasn't much faster for me
142 |                 image_description_similarity_t[i] = image_encodings @ v.T
143 |                 image_description_similarity_cumulative[i] = waffle_tools.aggregate_similarity(image_description_similarity_t[i], aggregation_method=opt.aggregate)
144 |
145 |             image_description_similarity = torch.stack(image_description_similarity_cumulative, dim=1)
146 |
147 |         acc = accuracy_metric(image_description_similarity.softmax(dim=-1), labels)
148 |         acc_top5 = accuracy_metric_top5(image_description_similarity.softmax(dim=-1), labels)
149 |
150 |     accuracy_logs = {}
151 |     accuracy_logs[f"[Mode = {opt.mode}] Top-1 Accuracy: "] = 100 * accuracy_metric.compute().item()
152 |     accuracy_logs[f"[Mode = {opt.mode}] Top-5 Accuracy: "] = 100 * accuracy_metric_top5.compute().item()
153 |     accs1.append(accuracy_logs[f"[Mode = {opt.mode}] Top-1 Accuracy: "])
154 |     accs5.append(accuracy_logs[f"[Mode = {opt.mode}] Top-5 Accuracy: "])
155 |
156 |     print("\n")
157 |     for key, value in accuracy_logs.items():
158 |         print(key, '{0:3.2f}'.format(value))
159 |     print("\n")
160 |
161 |     scores_1.append(accs1[-1])
162 |     scores_5.append(accs5[-1])
163 |     encodings.append(description_encodings)
164 |
165 | ### Print final results.
166 | print(colored("\nFinal results", "red", attrs=["bold"]))
167 | print(f'After {opt.reps} reps using mode = {opt.mode} with merge = {opt.merge_predictions}:')
168 | print(colored("Top-1 Accuracy", "white", attrs=["bold"]))
169 | print('Mean Top-1 Acc: {0:3.2f}% +- {1:3.2f}%'.format(np.mean(accs1), np.std(accs1)))
170 | print('Min and Max Top-1 Acc: {0:3.2f}% | {1:3.2f}%'.format(np.min(accs1), np.max(accs1)))
171 | print('All Top-1 Accs: {0}'.format(' | '.join('{0:3.2f}%'.format(x) for x in accs1)))
172 | print(colored("Top-5 Accuracy", "white", attrs=["bold"]))
173 | print('Mean Top-5 Acc: {0:3.2f}% +- {1:3.2f}%'.format(np.mean(accs5), np.std(accs5)))
174 | print('Min and Max Top-5 Acc: {0:3.2f}% | {1:3.2f}%'.format(np.min(accs5), np.max(accs5)))
175 | print('All Top-5 Accs: {0}'.format(' | '.join('{0:3.2f}%'.format(x) for x in accs5)))
176 |
177 | ### Save results as csv.
178 | import sys
179 | import csv
180 | os.makedirs('results', exist_ok=True)
181 | savename = '; '.join(x.replace('--','') for x in sys.argv[1:])
182 | with open(f'results/{opt.savename}.csv', 'a') as csv_file:
183 |     writer = csv.writer(csv_file)
184 |     writer.writerow([savename, np.mean(accs1), np.std(accs1), np.max(accs1), np.min(accs1), np.mean(accs5), np.std(accs5), np.max(accs5), np.min(accs5)])
185 | csv_file.close()
186 |
187 | ### Save model information as pkl.
188 | if opt.save_model != '':
189 |     os.makedirs('stored_models', exist_ok=True)
190 |     pickle.dump({'scores_1': scores_1, 'scores_5': scores_5, 'encodings': encodings}, open(f'stored_models/{opt.save_model}_{opt.dataset}.pkl', 'wb'))
--------------------------------------------------------------------------------
/datasets.py:
--------------------------------------------------------------------------------
1 | import os
2 |
3 | from PIL import Image
4 | import torch
5 | from torchvision import datasets
6 | import torchvision.transforms as transforms
7 |
8 | class CUBDataset(datasets.ImageFolder):
9 |     """
10 |     Wrapper for the CUB-200-2011 dataset.
11 |     Method CUBDataset.__getitem__() returns tuple of image and its corresponding label.
12 |     Dataset per https://github.com/slipnitskaya/caltech-birds-advanced-classification
13 |     """
14 |     def __init__(self,
15 |                  root,
16 |                  transform=None,
17 |                  target_transform=None,
18 |                  loader=datasets.folder.default_loader,
19 |                  is_valid_file=None,
20 |                  train=True,
21 |                  bboxes=False):
22 |
23 |         img_root = os.path.join(root, 'images')
24 |
25 |         super(CUBDataset, self).__init__(
26 |             root=img_root,
27 |             transform=None,
28 |             target_transform=None,
29 |             loader=loader,
30 |             is_valid_file=is_valid_file,
31 |         )
32 |
33 |         self.redefine_class_to_idx()
34 |
35 |         self.transform_ = transform
36 |         self.target_transform_ = target_transform
37 |         self.train = train
38 |
39 |         # obtain sample ids filtered by split
40 |         path_to_splits = os.path.join(root, 'train_test_split.txt')
41 |         indices_to_use = list()
42 |         with open(path_to_splits, 'r') as in_file:
43 |             for line in in_file:
44 |                 idx, use_train = line.strip('\n').split(' ', 2)
45 |                 if bool(int(use_train)) == self.train:
46 |                     indices_to_use.append(int(idx))
47 |
48 |         # obtain filenames of images
49 |         path_to_index = os.path.join(root, 'images.txt')
50 |         filenames_to_use = set()
51 |         with open(path_to_index, 'r') as in_file:
52 |             for line in in_file:
53 |                 idx, fn = line.strip('\n').split(' ', 2)
54 |                 if int(idx) in indices_to_use:
55 |                     filenames_to_use.add(fn)
56 |
57 |         img_paths_cut = {'/'.join(img_path.rsplit('/', 2)[-2:]): idx for idx, (img_path, lb) in enumerate(self.imgs)}
58 |         imgs_to_use = [self.imgs[img_paths_cut[fn]] for fn in filenames_to_use]
59 |
60 |         _, targets_to_use = list(zip(*imgs_to_use))
61 |
62 |         self.imgs = self.samples = imgs_to_use
63 |         self.targets = targets_to_use
64 |
65 |         if bboxes:
66 |             # get coordinates of a bounding box
67 |             path_to_bboxes = os.path.join(root, 'bounding_boxes.txt')
68 |             bounding_boxes = list()
69 |             with open(path_to_bboxes, 'r') as in_file:
70 |                 for line in in_file:
71 |                     idx, x, y, w, h = map(lambda x: float(x), line.strip('\n').split(' '))
72 |                     if int(idx) in indices_to_use:
73 |                         bounding_boxes.append((x, y, w, h))
74 |
75 |             self.bboxes = bounding_boxes
76 |         else:
77 |             self.bboxes = None
78 |
79 |     def __getitem__(self, index):
80 |         # generate one sample
81 |         sample, target = super(CUBDataset, self).__getitem__(index)
82 |
83 |         if self.bboxes is not None:
84 |             # squeeze coordinates of the bounding box to range [0, 1]
85 |             width, height = sample.width, sample.height
86 |             x, y, w, h = self.bboxes[index]
87 |
88 |             scale_resize = 500 / width
89 |             scale_resize_crop = scale_resize * (375 / 500)
90 |
91 |             x_rel = scale_resize_crop * x / 375
92 |             y_rel = scale_resize_crop * y / 375
93 |             w_rel = scale_resize_crop * w / 375
94 |             h_rel = scale_resize_crop * h / 375
95 |
96 |             target = torch.tensor([target, x_rel, y_rel, w_rel, h_rel])
97 |
98 |         if self.transform_ is not None:
99 |             sample = self.transform_(sample)
100 |         if self.target_transform_ is not None:
101 |             target = self.target_transform_(target)
102 |
103 |         return sample, target
104 |
105 |     def redefine_class_to_idx(self):
106 |         adjusted_dict = {}
107 |         for k, v in self.class_to_idx.items():
108 |             k = k.split('.')[-1].replace('_', ' ')
109 |             split_key = k.split(' ')
110 |             if len(split_key) > 2:
111 |                 k = '-'.join(split_key[:-1]) + " " + split_key[-1]
112 |             adjusted_dict[k] = v
113 |         self.class_to_idx = adjusted_dict
114 |
115 |
116 | def _transform(n_px):
117 |     return transforms.Compose([
118 |         transforms.Resize(n_px, interpolation=Image.BICUBIC),
119 |         transforms.CenterCrop(n_px),
120 |         lambda image: image.convert("RGB"),
121 |         transforms.ToTensor(),
122 |         transforms.Normalize((0.48145466, 0.4578275, 0.40821073), (0.26862954, 0.26130258, 0.27577711)),
123 |     ])
124 |
--------------------------------------------------------------------------------
/descriptors/descriptors_dtd.json:
--------------------------------------------------------------------------------
1 | {
2 | "banded": [
3 | "a repeating pattern of light and dark bands",
4 | "the bands are of different widths",
5 | "the bands may be of different colors",
6 | "the bands may be curved or straight",
7 | "the bands may be parallel or intersecting"
8 | ],
9 | "blotchy": [
10 | "an uneven or mottled surface",
11 | "a variety of colors or shades",
12 | "a raised or bumpy texture",
13 | "a matte finish"
14 | ],
15 | "braided": [
16 | "three or more strands of material woven together",
17 | "a tight, interlocking pattern"
18 | ],
19 | "bubbly": [
20 | "small, round, and raised bumps",
21 | "a smooth or glossy surface",
22 | "a three-dimensional appearance",
23 | "a light-reflecting quality"
24 | ],
25 | "bumpy": [
26 | "an uneven surface",
27 | "raised or indented areas",
28 | "a rough or bumpy feel"
29 | ],
30 | "chequered": [
31 | "a repeating pattern of squares or rectangles",
32 | "alternating light and dark colors",
33 | "sharp, defined lines between the squares or rectangles"
34 | ],
35 | "cobwebbed": [
36 | "a web-like pattern",
37 | "made of thin, silky strands",
38 | "often found in dark, damp places",
39 | "can be sticky to the touch",
40 | "can be difficult to remove once entangled"
41 | ],
42 | "cracked": [
43 | "a surface with cracks",
44 | "the cracks may be straight or curved",
45 | "the cracks may be of different sizes",
46 | "the cracks may be close together or far apart",
47 | "the cracks may be deep or shallow",
48 | "the cracks may be filled with dirt or debris"
49 | ],
50 | "crosshatched": [
51 | "a series of parallel lines that intersect to form a grid",
52 | "the lines may be of different thicknesses",
53 | "the lines may be of different colors",
54 | "the texture may be regular or irregular",
55 | "the texture may be applied to a surface or object"
56 | ],
57 | "crystalline": [
58 | "a repeating pattern of shapes",
59 | "sharp edges",
60 | "a glossy or shiny surface",
61 | "a transparent or translucent appearance",
62 | "a three-dimensional structure"
63 | ],
64 | "dotted": [
65 | "a series of small, round dots",
66 | "evenly spaced",
67 | "can be of any color",
68 | "may be on a background of any color",
69 | "may be in a regular or irregular pattern"
70 | ],
71 | "fibrous": [
72 | "closeup view",
73 | "many small, thin strands",
74 | "light reflecting off of strands",
75 | "strands of different colors",
76 | "a textured surface"
77 | ],
78 | "flecked": [
79 | "small, irregularly shaped pieces",
80 | "a variety of colors",
81 | "a rough or bumpy surface"
82 | ],
83 | "freckled": [
84 | "small, round spots",
85 | "evenly distributed",
86 | "same color as surrounding skin",
87 | "darker than surrounding skin"
88 | ],
89 | "frilly": [
90 | "intricate, lacy patterns",
91 | "delicate and often transparent",
92 | "light and airy",
93 | "can be made of various materials, including paper, fabric, or metal"
94 | ],
95 | "gauzy": [
96 | "a thin, translucent fabric",
97 | "a light, airy feel",
98 | "a delicate or lacy appearance"
99 | ],
100 | "grid": [
101 | "a repeating pattern of squares or rectangles",
102 | "straight lines forming a consistent grid",
103 | "evenly spaced lines and intersections",
104 | "a consistent color or tone throughout"
105 | ],
106 | "grooved": [
107 | "a series of parallel lines",
108 | "a repeating pattern",
109 | "a raised or indented surface"
110 | ],
111 | "honeycombed": [
112 | "a repeating pattern of hexagonal shapes",
113 | "a honey-like color",
114 | "a smooth, glossy surface"
115 | ],
116 | "interlaced": [
117 | "a repeating pattern of light and dark lines",
118 | "the lines are usually perpendicular to each other",
119 | "the lines may be of different thicknesses",
120 | "the lines may be of different colors"
121 | ],
122 | "knitted": [
123 | "small, interconnected loops",
124 | "a variety of colors",
125 | "a three-dimensional appearance"
126 | ],
127 | "lacelike": [
128 | "intricate, detailed pattern",
129 | "often made of lace or other delicate material",
130 | "can be white, cream, or other light colors",
131 | "often used for decoration",
132 | "green leaves",
133 | "stem",
134 | "roots",
135 | "flowers (optional)",
136 | "fruit (optional)"
137 | ],
138 | "lined": [
139 | "a series of parallel lines",
140 | "can be straight or curved",
141 | "may be of different colors",
142 | "may be of different widths",
143 | "may be of different thicknesses"
144 | ],
145 | "marbled": [
146 | "a swirl or whorl pattern",
147 | "two or more colors",
148 | "a glossy or shiny surface",
149 | "a smooth or textured surface"
150 | ],
151 | "matted": [
152 | "a surface with a raised or relief pattern",
153 | "a surface that is not smooth",
154 | "a surface with a lot of texture",
155 | "a surface that is not glossy or shiny"
156 | ],
157 | "meshed": [
158 | "a repeating pattern of interlocking shapes",
159 | "a regular, geometric shape",
160 | "a consistent size and spacing between shapes",
161 | "a symmetrical or nearly symmetrical design",
162 | "straight or curved lines",
163 | "a solid color or a limited color palette"
164 | ],
165 | "paisley": [
166 | "a repeating pattern of curved shapes",
167 | "usually brightly colored",
168 | "often with a floral or geometric design"
169 | ],
170 | "perforated": [
171 | "a series of small, evenly spaced holes",
172 | "a surface with a regular pattern of indentations",
173 | "a material with a porous or spongy appearance"
174 | ],
175 | "pitted": [
176 | "a surface with small, shallow depressions",
177 | "a rough or bumpy surface",
178 | "a matte or dull finish"
179 | ],
180 | "pleated": [
181 | "a series of parallel, evenly spaced folds or creases",
182 | "fabric that is gathered or puffed up in between the folds",
183 | "a smooth, flat surface on either side of the folds"
184 | ],
185 | "polka-dotted": [
186 | "a series of small, round dots",
187 | "evenly spaced",
188 | "usually on a light background",
189 | "can be any color",
190 | "may be on a variety of surfaces"
191 | ],
192 | "porous": [
193 | "small, irregular holes or pores",
194 | "a rough or spongy surface",
195 | "light and dark areas where the pores are more or less dense"
196 | ],
197 | "potholed": [
198 | "a road or street with many small holes",
199 | "the surface is bumpy and uneven",
200 | "the holes are usually circular or oval-shaped",
201 | "the holes may be filled with water or debris",
202 | "the surrounding area may be cracked or damaged"
203 | ],
204 | "scaly": [
205 | "a series of overlapping plates or scales",
206 | "a dry, rough, or bumpy surface",
207 | "a dull or matte finish",
208 | "a lack of luster or shine"
209 | ],
210 | "smeared": [
211 | "blurred or out of focus",
212 | "colors running together",
213 | "a smeared or streaked appearance"
214 | ],
215 | "spiralled": [
216 | "a repeating pattern of interlocking curves",
217 | "a three-dimensional appearance",
218 | "a raised or embossed surface",
219 | "a glossy or matte finish"
220 | ],
221 | "sprinkled": [
222 | "small, uniform dots",
223 | "evenly spaced",
224 | "same size and shape",
225 | "can be a variety of colors",
226 | "may be shimmery or iridescent"
227 | ],
228 | "stained": [
229 | "a discolored area",
230 | "a raised or bumpy surface",
231 | "a change in color or opacity",
232 | "a change in texture"
233 | ],
234 | "stratified": [
235 | "a series of layers",
236 | "each layer is of a different material",
237 | "the layers are parallel to each other",
238 | "the layers may be of different thicknesses",
239 | "the layers may be of different colors",
240 | "the layers may have different textures"
241 | ],
242 | "striped": [
243 | "alternating light and dark bands",
244 | "regular, repeating pattern",
245 | "can be horizontal, vertical, or diagonal",
246 | "may be of different widths",
247 | "may be of different colors"
248 | ],
249 | "studded": [
250 | "a series of raised bumps or ridges",
251 | "evenly spaced",
252 | "may be arranged in a pattern",
253 | "may be of different colors",
254 | "may be of different sizes"
255 | ],
256 | "swirly": [
257 | "a spiral or whorl pattern",
258 | "a smooth or glossy surface",
259 | "a light or dark color"
260 | ],
261 | "veined": [
262 | "a repeating pattern of lines or shapes",
263 | "a contrast between light and dark areas",
264 | "a three-dimensional appearance"
265 | ],
266 | "waffled": [
267 | "a series of raised, parallel lines",
268 | "a honeycomb or grid-like appearance",
269 | "a three-dimensional texture",
270 | "a rough or bumpy surface"
271 | ],
272 | "woven": [
273 | "a series of parallel, criss-crossing lines",
274 | "a tight, interlocking pattern",
275 | "a fabric or material with a textile weave"
276 | ],
277 | "wrinkled": [
278 | "a surface with raised bumps or ridges",
279 | "a surface with indentations or valleys",
280 | "a surface with a complex, irregular pattern",
281 | "a surface that appears dry or dehydrated"
282 | ],
283 | "zigzagged": [
284 | "a repeating pattern of sharp angles",
285 | "usually black and white",
286 | "can be created with lines, shapes, or colors",
287 | "often used to create a sense of movement or energy"
288 | ]
289 | }
--------------------------------------------------------------------------------
/descriptors/descriptors_eurosat.json:
--------------------------------------------------------------------------------
1 | {
2 | "annual crop land": [
3 | "large, open fields",
4 | "straight rows",
5 | "different colors for different crops"
6 | ],
7 | "forest": [
8 | "a large area of trees",
9 | "green leaves"
10 | ],
11 | "brushland or shrubland": [
12 | "an area of land with low-growing plants",
13 | "little or no grass",
14 | "dry conditions"
15 | ],
16 | "highway or road": [
17 | "a long, straight, or gently curved line",
18 | "typically has a smooth, paved surface"
19 | ],
20 | "industrial buildings or commercial buildings": [
21 | "evidence of human activity"
22 | ],
23 | "pasture land": [
24 | "green or brown vegetation",
25 | "uninterrupted spaces"
26 | ],
27 | "permanent crop land": [
28 | "large, rectangular fields",
29 | "straight rows of crops",
30 | "irrigation systems",
31 | "green vegetation"
32 | ],
33 | "residential buildings or homes or apartments": [
34 | "man-made structures",
35 | "typically have a rectangular or square shape",
36 | "may be clustered together in groups"
37 | ],
38 | "river": [
39 | "a long, thin line of water",
40 | "may be winding or straight",
41 | "may be bordered by vegetation"
42 | ],
43 | "lake or sea": [
44 | "large body of water",
45 | "typically blue or green in color"
46 | ]
47 | }
--------------------------------------------------------------------------------
/descriptors/descriptors_fgvcaircraft.json:
--------------------------------------------------------------------------------
1 | {
2 | "A300": [
3 | "black or silver color",
4 | "a rectangular body with rounded edges",
5 | "two lens ports",
6 | "a mode dial",
7 | "a top-mounted LCD screen",
8 | "a hand grip",
9 | "several buttons for adjusting settings",
10 | "a shutter release button"
11 | ],
12 | "A310": [
13 | "classic Airbus \u2018A\u2019 shape design",
14 | "two high-mounted, swept-back wings with four engines",
15 | "two long, slender fuselages connected by a single tail fin",
16 | "two under-wing mounted podded engines",
17 | "a swept-back tail and vertical stabilizer",
18 | "a round nose and cockpit window ",
19 | "narrow, rectangular windows"
20 | ],
21 | "A320": [
22 | "single-aisle, twin-engine commercial jet airliner",
23 | "curved tail and wings",
24 | "two jet engines mounted beneath the wings",
25 | "single nose wheel for landing and takeoff",
26 | "two main landing gear",
27 | "two sets of wing flaps",
28 | "two sets of ailerons"
29 | ],
30 | "A330": [
31 | "Two engines mounted on the wings",
32 | "Two sets of landing gear",
33 | "Sharklet wingtips",
34 | "Tailfin with a curved top",
35 | "Cockpit window with two windows",
36 | "Large curved windows on the passenger cabin",
37 | "Large cargo doors at the rear of the fuselage"
38 | ],
39 | "A340": [
40 | "four engines",
41 | "twin aisles",
42 | "a swept wing",
43 | "a T-tail",
44 | "two vertical stabilizers",
45 | "a pointed nose",
46 | "underwing pylons for mounting engines and landing gear"
47 | ],
48 | "A380": [
49 | "four high-bypass turbofan engines",
50 | "two large wings with winglets",
51 | "a large tail fin",
52 | "a rectangular fuselage",
53 | "large windows",
54 | "two vertical stabilizers",
55 | "a large nose cone"
56 | ],
57 | "ATR-42": [
58 | "Short-haul regional airliner",
59 | "Seating capacity of 40-50 passengers",
60 | "High-wing monoplane with two Pratt & Whitney Canada PW100 turboprop engines",
61 | "T-tail with a conventional empennage",
62 | "Two-bladed propellers",
63 | "Four- or six-bladed propellers",
64 | "Distinctive noise of the engines"
65 | ],
66 | "ATR-72": [
67 | "rectangular cockpit windows",
68 | "two large engines with propellers",
69 | "swept-back wings",
70 | "a high tailfin",
71 | "a distinctive nose shape",
72 | "a tricycle landing gear arrangement",
73 | "a large cargo door"
74 | ],
75 | "An-12": [
76 | "four-engine turboprop transport aircraft",
77 | "large cargo door on the rear side",
78 | "two tail fins",
79 | "swept wings",
80 | "large fuselage",
81 | "two main landing gears",
82 | "two turboprop engines mounted on the wings"
83 | ],
84 | "BAE 146": [
85 | "large, swept-back wings",
86 | "distinctively curved fuselage",
87 | "four engines mounted on the wings",
88 | "high-mounted tailplane",
89 | "large, curved tailfin",
90 | "two wheel main undercarriage with two nose wheels"
91 | ],
92 | "BAE-125": [
93 | "low-wing monoplane",
94 | "fixed tricycle undercarriage",
95 | "bubble canopy",
96 | "sweptback vertical stabilizer",
97 | "retractable landing gear",
98 | "sweptback wings with ailerons and flaps"
99 | ],
100 | "Beechcraft 1900": [
101 | "Twin-engine turboprop aircraft",
102 | "Low-wing monoplane configuration",
103 | "High-mounted tailplane",
104 | "Short, stubby fuselage",
105 | "Large oval windows",
106 | "Fixed tricycle landing gear",
107 | "Single vertical stabilizer with two horizontal stabilizers"
108 | ],
109 | "Boeing 707": [
110 | "swept-wing design",
111 | "two large, vertical stabilizers",
112 | "two large, horizontal stabilizers",
113 | "distinctive \u201cT-tail\u201d design",
114 | "two podded engines on each wing",
115 | "two large, round windows on each side of the plane",
116 | "a \u201cbulge\u201d on the fuselage below the cockpit"
117 | ],
118 | "Boeing 717": [
119 | "white, blue, or silver livery ",
120 | "swept-wing design",
121 | "two turbofan engines",
122 | "two vertical stabilizers",
123 | "two horizontal stabilizers",
124 | "two sets of landing gear (each with two main wheels and two nose wheels)",
125 | "two tail-mounted engines",
126 | "cockpit windows"
127 | ],
128 | "Boeing 727": [
129 | "mid-size, narrow-body, three-engined jet airliner",
130 | "distinctive swept-wing design",
131 | "two turbofan engines mounted on the rear of the fuselage",
132 | "two large, round engine intakes on the wing leading edge",
133 | "high-mounted tailplane with a single vertical stabilizer",
134 | "retractable tricycle landing gear with two wheels on the main gear and one wheel on the nose gear"
135 | ],
136 | "Boeing 737": [
137 | "a narrow-body, twin-engine jet airliner ",
138 | "two pairs of wings",
139 | "two turbofan engines",
140 | "a pointed nose",
141 | "two tail fins",
142 | "a two-wheel, retractable landing gear",
143 | "two large, rear mounted engines ",
144 | "a distinctive, curved \u201cwinglet\u201d at the tip of each wing"
145 | ],
146 | "Boeing 747": [
147 | "double decker",
148 | "four jet engines mounted on the wings",
149 | "two vertical stabilizers",
150 | "two sets of wing flaps",
151 | "recognizable body shape",
152 | "distinctive tail logo ",
153 | "distinctive nose design"
154 | ],
155 | "Boeing 757": [
156 | "pointed nose",
157 | "two engines mounted on the wings",
158 | "two vertical stabilizers on the tail",
159 | "slats along the leading edge of the wings",
160 | "ailerons at the trailing edge of the wings",
161 | "two rows of windows along the fuselage",
162 | "two overwing exits"
163 | ],
164 | "Boeing 767": [
165 | "two engines located near the tail section",
166 | "a distinctive T-shaped tail",
167 | "swept-wing design",
168 | "a long, slender fuselage",
169 | "a series of windows along the sides of the fuselage",
170 | "a pointed nose",
171 | "a large cockpit window",
172 | "two or more sets of landing gear"
173 | ],
174 | "Boeing 777": [
175 | "large, two-engine jet airliner",
176 | "distinctive fuselage shape",
177 | "swept-back wings with winglets",
178 | "two engine pods mounted below the wings",
179 | "six-wheeled landing gear",
180 | "an S-duct inlet on each engine",
181 | "two large vertical stabilizers at the rear of the aircraft"
182 | ],
183 | "C-130": [
184 | "wingspan of up to 132 feet",
185 | "characteristic double-humped fuselage",
186 | "mid-set, high-mounted wings",
187 | "large tail-mounted propellers and engine nacelles",
188 | "distinctive tail fin and rudder",
189 | "tall and broad cargo door on the left side of the fuselage"
190 | ],
191 | "C-47": [
192 | "twin-engine propeller-driven",
193 | "a large, round nose",
194 | "two wings with a total of four engines",
195 | "squared-off tailfin",
196 | "metal construction",
197 | "a line of windows along the side of the fuselage"
198 | ],
199 | "CRJ-200": [
200 | "commercial jet aircraft",
201 | "short, stubby fuselage",
202 | "two turbofan engines",
203 | "wings with distinctive swept-back design",
204 | "two sets of low-set horizontal stabilizers",
205 | "sloped nose cone",
206 | "distinctive two-pilot cockpit with large windows"
207 | ],
208 | "CRJ-700": [
209 | "commercial jet",
210 | "twin-engine turbofan",
211 | "narrow body",
212 | "swept-wing design",
213 | "long, sloping nose",
214 | "two rear-mounted engines",
215 | "T-tail configuration",
216 | "distinctive landing gear configuration"
217 | ],
218 | "Cessna 172": [
219 | "fixed-wing design",
220 | "high-wing design",
221 | "single tail with a vertical stabilizer",
222 | "four-seats in a straight-line configuration",
223 | "large windows in the cabin",
224 | "tail number with \u201cN\u201d prefix",
225 | "distinctive blue and white paint scheme"
226 | ],
227 | "Cessna 208": [
228 | "high wings",
229 | "single-engine turboprop",
230 | "sweptback vertical stabilizers",
231 | "rounded nose",
232 | "single tail fin",
233 | "bubble canopy",
234 | "fixed landing gear"
235 | ],
236 | "Cessna Citation": [
237 | "a small, single-engine aircraft ",
238 | "a low-winged design",
239 | "a rounded nose and tail",
240 | "two pilots seated side-by-side",
241 | "two jet engines mounted in the wings",
242 | "distinctive winglets on the wingtips",
243 | "two doors on the side of the fuselage",
244 | "a retractable landing gear"
245 | ],
246 | "Challenger 600": [
247 | "large, twin-engine jet aircraft",
248 | "swept wings",
249 | "a large, oval-shaped fuselage",
250 | "a tall, vertical tail fin",
251 | "a tall, curved cockpit canopy",
252 | "retractable landing gear",
253 | "a large engine intake at the front of the fuselage",
254 | "an exhaust port at the back of the fuselage"
255 | ],
256 | "DC-10": [
257 | "wide-body, three-engine commercial aircraft",
258 | "distinctive swept-wing design",
259 | "large tail fin",
260 | "two engine nacelles on each side of the fuselage",
261 | "long, slender fuselage with a rounded nose",
262 | "two turbofan engines mounted at the rear of the fuselage",
263 | "a T-shaped tail configuration"
264 | ],
265 | "DC-3": [
266 | "Propeller-driven aircraft",
267 | "Distinctive twin-engined, low-winged design",
268 | "Metal fuselage with fabric-covered wings",
269 | "Twin radial engines",
270 | "Three-blade propellers",
271 | "Fixed landing gear with two main wheels and a tail wheel",
272 | "Round windows along the fuselage"
273 | ],
274 | "DC-6": [
275 | "high-wing monoplane design",
276 | "rounded nose",
277 | "two sets of double-slotted flaps on its wings",
278 | "two vertical stabilizers on the tail",
279 | "four-bladed propellers ",
280 | "two sets of landing gear with six wheels each ",
281 | "two large engines on the wings and two smaller engines on the tail"
282 | ],
283 | "DC-8": [
284 | "swept wings",
285 | "two rows of windows along the fuselage",
286 | "pointed nose",
287 | "two sets of landing gear",
288 | "large tail fin",
289 | "distinct engine nacelles near the wings"
290 | ],
291 | "DC-9": [
292 | "twin-engine, narrow-body airliner",
293 | "high-winged aircraft with a T-tail and two turbofan engines",
294 | "swept back wings",
295 | "two large clamshell-type engine access doors",
296 | "two main landing gear with four wheels each",
297 | "large, rounded, single-piece windshield",
298 | "cabin windows arranged in three rows"
299 | ],
300 | "DH-82": [
301 | "low-wing monoplane",
302 | "fabric-covered fuselage",
303 | "fixed tailwheel undercarriage",
304 | "two-bladed fixed-pitch propeller",
305 | "straight wings with ailerons",
306 | "two open cockpits with dual control sticks"
307 | ],
308 | "DHC-1": [
309 | "High-wing monoplane",
310 | "Fixed tricycle landing gear",
311 | "Spacious cabin with side-by-side seating",
312 | "Large, single-piece windshield",
313 | "Slender fuselage",
314 | "Large, swept-back wings",
315 | "Vertical stabilizer and rudder"
316 | ],
317 | "DHC-6": [
318 | "High-wing monoplane with an extended nose",
319 | "Fixed tricycle landing gear",
320 | "A dorsal air scoop and a single exhaust outlet",
321 | "Large, double-slotted flaps",
322 | "A large, oval-shaped cabin window",
323 | "A high-mounted tailplane with a single fin and rudder"
324 | ],
325 | "DR-400": [
326 | "twin-engine propeller aircraft",
327 | "low wing monoplane",
328 | "four-seat cabin",
329 | "retractable landing gear",
330 | "high T-tail",
331 | "sweptback wings",
332 | "two Lycoming engines",
333 | "two-bladed propellers",
334 | "two fuel tanks"
335 | ],
336 | "Dash 8": [
337 | "twin-engine",
338 | "round nose",
339 | "high-winged design",
340 | "wide fuselage",
341 | "two fan-shaped engines mounted on the wings",
342 | "distinctive high-mounted horizontal stabilizers",
343 | "two-wheeled landing gear with a single wheel on the nose"
344 | ],
345 | "Dornier 328": [
346 | "Low-wing monoplane",
347 | "T-tail configuration",
348 | "Propeller engines mounted at the rear of the aircraft",
349 | "Short, slender fuselage",
350 | "Large, swept-back wings",
351 | "Slanted nose with two windows on each side",
352 | "Two pairs of large doors on the side of the fuselage",
353 | "Retractable landing gear"
354 | ],
355 | "EMB-120": [
356 | "a cabin with 30-33 seats",
357 | "a distinctive high-wing design",
358 | "two Pratt & Whitney PW118 turboprop engines",
359 | "a T-tail configuration",
360 | "two wings mounted on top of the fuselage",
361 | "a retractable landing gear",
362 | "a nose-mounted radar dome"
363 | ],
364 | "Embraer E-Jet": [
365 | "sleek and modern design",
366 | "mid-size business jet",
367 | "large, swept back wings",
368 | "two or three turbofan engines",
369 | "pointed nose",
370 | "small, oval-shaped windows ",
371 | "twin-tail arrangement",
372 | "retractable landing gear"
373 | ],
374 | "Embraer ERJ 145": [
375 | "Dual engines and a swept-back wing design",
376 | "White and light grey color wings and tail",
377 | "Narrow and pointed nose",
378 | "Slightly curved wing tips",
379 | "T-tail design with two vertical stabilizers",
380 | "Two rows of windows"
381 | ],
382 | "Embraer Legacy 600": [
383 | "a swept wing design with two turbofan engines",
384 | "twin vertical stabilizers",
385 | "a long, slender fuselage",
386 | "a T-tail configuration",
387 | "a nose-mounted radar antenna",
388 | "two sets of landing gear",
389 | "a rounded cockpit window"
390 | ],
391 | "Eurofighter Typhoon": [
392 | "Twin-engine, canard-delta wing, multi-role fighter aircraft",
393 | "Low-set, swept-back wings",
394 | "Tailplane with two vertical stabilizers",
395 | "Large air intake in the nose",
396 | "Two exhaust nozzles",
397 | "Canards near the nose",
398 | "Missiles and bombs stored on external hardpoints"
399 | ],
400 | "F-16": [
401 | "a delta-wing shape",
402 | "single engine",
403 | "two vertical stabilizers",
404 | "a single tail fin",
405 | "two sets of wings ",
406 | "two air intakes on the side ",
407 | "a single pilot canopy ",
408 | "a retractable landing gear ",
409 | "a nose-mounted gun"
410 | ],
411 | "F/A-18": [
412 | "military aircraft",
413 | "twin-tailed, delta-winged design",
414 | "body length of nearly 60 feet",
415 | "blue and grey camouflage paint",
416 | "two engines mounted on the wings",
417 | "two vertical stabilizers",
418 | "a nose-mounted gun",
419 | "two large air intakes on the sides of the fuselage"
420 | ],
421 | "Falcon 2000": [
422 | "Twin-engine private jet",
423 | "Long, sleek body",
424 | "Sleek and pointed nose",
425 | "Distinctive wing design",
426 | "Two engines mounted at the rear of the fuselage",
427 | "High-mounted tail fin",
428 | "Retractable landing gear",
429 | "Cockpit with tinted windows"
430 | ],
431 | "Falcon 900": [
432 | "Long, slim fuselage",
433 | "Three engines",
434 | "Slender wing with two distinct sections",
435 | "Horizontal stabilizers at the tail",
436 | "Stubby vertical stabilizers at the tail",
437 | "Highly swept leading edge of the wing",
438 | "Two sets of landing gear, one under the wings and one at the tail"
439 | ],
440 | "Fokker 100": [
441 | "medium-sized jet airliner ",
442 | "swept-back wings ",
443 | "two turbofan engines ",
444 | "two vertical stabilizers ",
445 | "a retractable nose gear ",
446 | "a T-tail ",
447 | "a forward-sloping cockpit ",
448 | "a two-section passenger cabin",
449 | "a distinctive black and white livery"
450 | ],
451 | "Fokker 50": [
452 | "Twin-turboprop regional airliner",
453 | "Short body and high-mounted wings",
454 | "Propeller engines mounted on the wings",
455 | "Fixed tricycle landing gear",
456 | "Distinctive fuselage shape with four circular windows per row ",
457 | "T-tail and sweptback fin",
458 | "Sliding passenger doors"
459 | ],
460 | "Fokker 70": [
461 | "rectangular fuselage with a distinct nose section",
462 | "two turbofan engines mounted on the wings",
463 | "two- or three-bogie main landing gear",
464 | "two- or three-segment tail section",
465 | "horizontal stabilizers on the tail",
466 | "large, swept-back wings",
467 | "a T-tail configuration"
468 | ],
469 | "Global Express": [
470 | "large business jet",
471 | "long, slender fuselage ",
472 | "two turbofan engines ",
473 | "two large, swept-back wings ",
474 | "a T-tail with an engine mounted on the tail",
475 | "a large cockpit window",
476 | "a pointed nose",
477 | "a pressurized cabin"
478 | ],
479 | "Gulfstream": [
480 | "long, slim fuselage",
481 | "swept wings",
482 | "tail with a horizontal stabilizer",
483 | "engines mounted on the rear of the aircraft",
484 | "large windows",
485 | "distinctive Gulfstream logo"
486 | ],
487 | "Hawk T1": [
488 | "characteristic forward swept wings",
489 | "four-bladed propeller",
490 | "cockpit with two-person capacity",
491 | "two-wheeled landing gear",
492 | "tricycle-style nose gear",
493 | "tail-mounted engine exhaust"
494 | ],
495 | "Il-76": [
496 | "four high-bypass turbofan engines",
497 | "swept wings",
498 | "T-tail",
499 | "two large cargo doors on either side of the fuselage",
500 | "large dorsal fin",
501 | "two vertical stabilizers",
502 | "swept-back windshield",
503 | "two sets of landing gear"
504 | ],
505 | "King Air": [
506 | "Twin-engine aircraft",
507 | "Propellers",
508 | "High wing design",
509 | "Tapered wingtips",
510 | "White, blue, or grey body",
511 | "T-shaped tail",
512 | "Windows on the fuselage",
513 | "Round engine nacelles"
514 | ],
515 | "L-1011": [
516 | "large, twin-engine jet airliner",
517 | "three vertical stabilizers",
518 | "one engine on each wing",
519 | "wing-mounted engines ",
520 | "a single, tall fin at the tail",
521 | "low-mounted tailplane",
522 | "curved cockpit windows",
523 | "a pointed nose"
524 | ],
525 | "MD-11": [
526 | "three-engine jet aircraft",
527 | "long, slender fuselage",
528 | "distinctive \"hump\" shape on the top of the fuselage",
529 | "swept wings with a high aspect ratio",
530 | "two engines mounted on the wings and one on the tail",
531 | "T-tail configuration",
532 | "large tailplane with two vertical fins",
533 | "large, square-shaped windows near the rear of the fuselage"
534 | ],
535 | "MD-80": [
536 | "bulbous nose and rounded fuselage",
537 | "two overwing emergency exits",
538 | "two high-bypass turbofan engines",
539 | "swept-back wings with four-section flaps",
540 | "tail fin with two small vertical stabilizers",
541 | "T-tail configuration",
542 | "tricycle landing gear with two main wheels and one nose wheel"
543 | ],
544 | "MD-90": [
545 | "a twin-engine, narrow-body jet airliner",
546 | "a single-aisle cabin",
547 | "distinctive tail fin with two vertical stabilizers",
548 | "two engines mounted on the rear of the fuselage",
549 | "two large, swept-back wings",
550 | "a two-piece main landing gear",
551 | "a T-shaped tailcone"
552 | ],
553 | "Metroliner": [
554 | "Passenger train",
555 | "Constructed of stainless steel",
556 | "Streamlined and aerodynamic design",
557 | "Long, thin shape",
558 | "Smooth, curved roof",
559 | "Multiple windows",
560 | "Five or six cars, each with their own motor",
561 | "Blue and silver livery"
562 | ],
563 | "PA-28": [
564 | "single-engine aircraft",
565 | "four-seat cabin",
566 | "low-wing monoplane",
567 | "tricycle landing gear",
568 | "retractable landing gear",
569 | "fixed-pitch propeller",
570 | "laminar flow wings",
571 | "swept-back tail",
572 | "distinctive \u2018chevron\u2019 shape"
573 | ],
574 | "SR-20": [
575 | "retractable landing gear",
576 | "six-cylinder piston engine",
577 | "composite airframe construction",
578 | "side-by-side seating for two",
579 | "swept-back wings",
580 | "T-shaped tail",
581 | "variable-pitch propeller"
582 | ],
583 | "Saab 2000": [
584 | "four-door sedan",
585 | "curved, aerodynamic look",
586 | "two-tone styling with the hood and front quarter panels in one colour and the roof and rear quarter panels in a second colour",
587 | "black grille with a chrome Saab logo",
588 | "wheel arches",
589 | "black alloy wheels",
590 | "rear spoiler",
591 | "sporty interior with leather seats and chrome accents"
592 | ],
593 | "Saab 340": [],
594 | "Spitfire": [
595 | "small, sleek, and aerodynamic design",
596 | "elliptical wings",
597 | "retractable main landing gear",
598 | "open-cockpit design",
599 | "distinctive round air intake at the front",
600 | "two tailfins",
601 | "distinctive propeller and spinner design"
602 | ],
603 | "Tornado": [
604 | "dark, rotating funnel-shaped cloud",
605 | "strong winds",
606 | "dark clouds",
607 | "heavy precipitation",
608 | "lightning",
609 | "strong gusts of wind",
610 | "debris being picked up by the wind",
611 | "a loud roar"
612 | ],
613 | "Tu-134": [
614 | "twin-engine turboprop aircraft",
615 | "swept wings",
616 | "two or four engines mounted on the fuselage",
617 | "T-tail configuration",
618 | "rectangular windows on each side of the fuselage",
619 | "two or three wheels on the main landing gear",
620 | "nose-mounted radar dome"
621 | ],
622 | "Tu-154": [
623 | "swept wings and a T-tail",
624 | "distinctive nose with a drooping profile ",
625 | "two pairs of engines mounted on the wings",
626 | "a distinctively shaped tail fin",
627 | "four overwing emergency exits ",
628 | "two under-fuselage cargo doors ",
629 | "two side-mounted auxiliary power units"
630 | ],
631 | "Yak-42": [
632 | "a distinct T-tail",
633 | "a low-mounted wing with four engines",
634 | "a large fuselage",
635 | "a single-aisle cabin",
636 | "two or three turbofan engines",
637 | "a pointed nose cone",
638 | "a long, slender fuselage"
639 | ]
640 | }
--------------------------------------------------------------------------------
/descriptors/descriptors_flowers102.json:
--------------------------------------------------------------------------------
1 | {"pink primrose": ["delicate flower", "five petals in a star shape", "pink in color", "often has yellow center", "green stem and leaves", "can be found in clusters"], "hard-leaved pocket orchid": ["small, conical yellow or green flower", "five petals arranged in a star shape", "a pocket-like pouch at the center of the flower", "thick, leathery leaves", "hairy stems", "roots that resemble thin strings"], "canterbury bells": ["five-petalled, bell-shaped flower", "petals are usually blue, pink, or white", "long, pointed leaves", "tall, slender stem", "multiple flowers growing in clusters", "dark, elliptical seed pods"], "sweet pea": ["Flower with five petals", "Clustered in a group of two to four", "Color varies from white to pink, purple and red", "Narrow, pointed leaves", "Long, thin, curved stems", "Papilionaceous (butterfly shaped) corolla", "Fragrant scent"], "english marigold": ["bright yellow, orange, or red flowers", "five petals on each flower", "a single stem", "a rounded, compact shape", "dark green, serrated leaves", "a strong, distinctive scent"], "tiger lily": ["an orange-red, trumpet-shaped flower", "a yellow center with black spots", "six petals", "long, lance-shaped leaves", "a tall, leafy stem", "a single, upright flower stalk"], "moon orchid": ["White, pink, or yellow flowers with a distinct crescent moon shape", "Long slender leaves", "Numerous thin stems", "Fragrant scent", "Small size, typically growing up to 12 inches in height"], "bird of paradise": ["brightly colored feathers, usually in shades of blue, green, yellow, and red", "long tail feathers with intricate patterns", "long, curved beak", "curved or pointed crest of feathers on the head", "long legs", "webbed feet"], "monkshood": ["distinctive purple, blue, white, or yellow flowers", "tall, spindly stems", "leaves with toothed edges ", "dark, glossy green foliage ", "pointed sepals with hood-like protrusions", "an upright flowering habit ", "a deep taproot"], "globe thistle": ["spiky, 
grey-green foliage", "deep purple, thistle-like flowers", "pointed, spiny seed heads", "branching stems", "a sturdy, upright habit", "a long, tapering root system"], "snapdragon": ["flowering plant with five-petaled blooms", "colorful blooms in shades of pink, yellow, purple, and white", "tall, green stems", "long leaves with jagged edges", "a single seed at the center of the bloom", "blooms lasting for up to two weeks"], "colt's foot": ["small white or yellow flowers", "fuzzy, woolly stems", "creeping or upright stems", "purple-tinted leaves", "shallowly lobed leaves", "short, dense hairs on the underside of the leaves"], "king protea": ["large, pink or orange flower with a cone-shaped center", "broad, wavy, green leaves", "long stem", "prickly seed capsule", "woody stem", "large, woody, brown seed heads at the base of the stem"], "spear thistle": ["biennial or annual plant ", "spiny, deeply-lobed leaves", "purplish, pinkish or whitish flowers ", "erect stems ", "spiny fruit ", "bluish-green foliage ", "sharp, spear-like prickles on leaves, stems, and bracts"], "yellow iris": ["yellow petals radiating outward from a central point", "a dark center with yellow-orange shades", "long, thin stamen in the center", "foliage with a deep green color", "thick, green stem", "broad, flat leaves"], "globe-flower": ["a globe-shaped flower head", "usually white, pink, yellow, or purple", "five petals radiating from the center", "several narrow, pointed petals at the center", "a stem that holds the flower upright", "several green leaves around the stem", "a single, round seed pod at the base of the stem"], "purple coneflower": ["tall, green stem", "large, purple petals", "a brown center or cone", "long, thin leaves", "a rough texture on the petals and leaves", "a sweet scent"], "peruvian lily": ["tall stem with multiple leaves", "bright, trumpet-shaped flowers", "petals that are yellow or pink with purple or brown spots", "long, narrow seed pods", "glossy green foliage"], 
"balloon flower": ["a delicate flower with five petals", "a unique balloon-like shape", "a star-shaped center in the middle of the flower", "vibrant colors such as pink, purple, blue, white, and yellow", "a long, thin stem", "leaves that are long, thin, and oval-shaped"], "giant white arum lily": ["white, funnel-shaped flower with a greenish throat", "long, slender leaves with a glossy surface", "tall flower stalk, up to 6 feet tall", "large, white bracts at the base of the flower", "cluster of bright-orange berries at the top of the flower stalk"], "fire lily": ["bright red or orange colors", "trumpet-shaped blooms", "heart-shaped leaves", "long stamens", "a thick stem with a swollen base", "a sweet, spicy fragrance"], "pincushion flower": ["small, rounded flower with a dense, center cluster of tiny flowers", "petals with a soft, velvety texture", "colors range from white to pink to purple", "pointed sepals that curl back", "long, thin, yellow stamens", "distinctive, spiky center"], "fritillary": ["orange, yellow, or black wings with silver spots ", "two antennae", "long proboscis (feeding tube)", "four legs", "slender body"], "red ginger": ["bright red flower", "long, thin stem", "large, oblong leaves with pointed tips", "oval shaped rhizomes", "yellow or white-colored interior when cut open", "distinct aroma when broken"], "grape hyacinth": ["blue, purple, or white flowers", "cluster of 6-15 bell-shaped flowers", "small, grass-like foliage", "sword-shaped leaves", "pointed flower buds", "six petals surrounding a star-like center"], "corn poppy": ["red, pink, or white petals with black or dark centers", "four petals arranged in a circular pattern", "deeply lobed, green leaves", "a long, thin stem", "a slender, pointed seed pod", "a yellow or green center"], "prince of wales feathers": ["three black, grey, or brown feathers arranged in a fan-like pattern", "long, curved quill", "barbs with soft, fluffy tips", "white eye markings on the tips of the feathers", "a 
distinct pattern of white, black and grey stripes on the feather\u2019s surface"], "stemless gentian": ["small, 5-petaled flowers", "delicate, star-shaped blooms", "deep blue to purple in color", "white or yellow center", "no stem", "leaves in opposite pairs"], "artichoke": ["dark green, conical shape", "thorny exterior", "rough, textured leaves", "fuzzy stem", "edible, heart-shaped base", "spiny, inedible choke in the center"], "sweet william": ["small, dense shrub", "hairy stems and leaves", "white, pink, or red flowers with four petals", "long, slender seed pods", "deep green leaves", "rounded shape", "fragrant scent"], "carnation": ["flowering plant", "bushy habit", "long, thin stems", "large, fragrant flowers", "five petals on each flower", "various colors, including pink, white, red, yellow, and purple", "dark green leaves"], "garden phlox": ["perennial flower", "clusters of small, five-petaled flowers in a variety of colors", "oval-shaped leaves", "thick, woody stems", "long, slender seedpods", "an upright or spreading form ", "fragrant blooms"], "love in the mist": ["a white, fluffy flower with a purple or blue center", "a ruffled appearance with many petals", "a tall stem with leaves and buds", "a misty or foggy background", "the flower appears to be floating in the air"], "mexican aster": ["perennial flower", "green stem", "daisy-like flowers in shades of pink, purple, yellow, or white", "yellow center disc", "pointed petals", "thick, hairy leaves", "grows in clumps", "grows up to 3 feet in height"], "alpine sea holly": ["silvery white leaves ", "distinctive star-shaped flowers", "purple-blue petals", "a yellow center", "a long, slender stem", "grows in rocky, alpine habitats"], "ruby-lipped cattleya": ["bright pink and yellow colors ", "large, showy flowers", "glossy leaves ", "tall stems ", "long, pointed petals ", "protruding, pointed lips on the lower petals ", "a yellow throat on the lower petals"], "cape flower": ["bright and colorful petals", "long 
and thin stems", "pointed and hairy leaves", "yellow stamens in the center", "a large, round center", "small, cup-shaped petals", "a bulbous stem base"], "great masterwort": ["perennial plant", "lacy leaves with toothed edges", "green and oval-shaped with jagged edges", "stems of pink and white flowers", "flowers can be either single or double", "blooms in the summer", "prefers moist, fertile soil in full sun or partial shade"], "siam tulip": ["bulbous, bell-like shape", "bright red, orange, yellow, or pink petals", "yellow stamens", "dark green leaves", "long, thin stem", "pointed flower bud"], "lenten rose": ["perennial plant", "leathery, evergreen, lobed leaves", "a single flower head composed of many small, tightly-packed cup-shaped flowers", "white, pink, or purple flowers", "a long, thin, central stem", "long, thin sepals surrounding the flower head", "a central, cup-shaped receptacle"], "barbeton daisy": ["daisy-like flower", "yellow or white petals", "yellow or orange center", "deep green foliage", "slender stems", "downy leaves", "grows in clusters"], "daffodil": ["yellow or white flower", "six-petaled flower", "long, slender stem", "green leaves", "a trumpet-shaped center", "a green or yellow cup-shaped structure at the base of the petals"], "sword lily": ["long, sword-like leaves", "large, bell-shaped flowers", "flower colors range from white to pink, purple, and yellow", "long, finger-like stamens protruding from the center of the flower", "a short stem with a tuft of leaves at the top", "a bulb-like root system"], "poinsettia": ["bright red or pink leaves", "yellow or white flowers in the center of the plant", "dark green leaves", "a red or white stem", "pointed, oval-shaped leaves", "a milky white sap or latex that can be seen when the leaves are broken"], "bolero deep blue": ["deep blue color ", "long-sleeved, hip-length garment", "open front design", "decorated with lace, embroidery, or buttons", "high-necked collar", "flared sleeves", "fitted 
waistline with gathered fabric"], "wallflower": ["small, yellow, or white-petaled flower", "long stem", "long, thin leaves", "thick, woody stem", "clustering of flowers", "fragrant smell", "found in walls, fences, and other areas around buildings"], "marigold": ["bright yellow, orange, or white petals ", "long, narrow stem", "dark green, spade-shaped leaves", "long, yellow-tipped stamens", "seed pod at the center", "shallow, cup-like calyx"], "buttercup": ["yellow, white, or pink-hued petals", "three round and overlapping petals", "a green center", "a stem with three leaves", "a glossy surface", "a star-shaped flower head with five petals"], "oxeye daisy": ["daisy-like flower", "single long stem", "white petals", "yellow center", "lobed, serrated leaves", "upper leaves with long stems", "lower leaves with short stems", "grows in clusters"], "common dandelion": ["yellow, daisy-like flowers", "a rosette of leaves", "a hollow, white, milky sap", "a thick, white taproot", "a downy, white, parachute-like seed head", "a rough, deeply-lobed leaf structure."], "petunia": ["small, trumpet-shaped flower", "white, pink, purple, or red", "five petals", "yellow stamen", "dark green stems", "heart-shaped leaves", "a single stem or trailing vine"], "wild pansy": ["small, herbaceous plant", "heart-shaped leaves", "five-petaled flowers", "petals are purple, blue, yellow, or white", "five yellow centers", "a long, thin stem with a taproot", "often found in damp, grassy meadows"], "primula": ["herbaceous perennial", "basal rosette of leaves", "scalloped leaves", "long flowering stem", "bell-shaped flowers", "flower colors include white, yellow, pink, purple, blue, or red", "long flowering period from spring to summer"], "sunflower": ["large, bright yellow petals", "a dark center surrounded by disk florets", "long stem", "a single, long, narrow leaves tapered to a point", "a large, rounded bud at the top of the stem."], "pelargonium": ["distinct veins on the leaves", "five-petaled 
flowers in a variety of colors including white, pink, red, and purple", "upright stems", "a mounding, shrub-like shape", "a strong, sweet scent"], "bishop of llandaff": ["wearing a traditional bishop's cassock", "purple stole draped over the shoulders ", "holding a crozier (bishop's staff)", "wearing a mitre (a pointed hat) ", "carrying a ceremonial Bible", "wearing a gold ring on the right hand"], "gaura": ["perennial flowering plant", "long, thin stems", "bright pink or white flowers", "lance-shaped leaves", "multiple flower heads per stem", "long seed pods"], "geranium": ["stems and leaves that are green or gray-green in color", "five petals on each flower", "petals that are pale pink, lilac, or white in color", "distinctive veining patterns on the petals", "five stamens in each flower", "a bright yellow center with a five-pointed star pattern"], "orange dahlia": ["bright orange petals", "a dark center with many small petals", "a long, thin stem", "a single flower with many petals", "a star-shaped appearance", "pointed, oval petals", "green sepals at the base of the flower"], "pink-yellow dahlia": ["flower with multiple petals", "colors of pink and yellow", "yellow center with several yellow stamens", "large, full bloom", "green foliage at the base of the stem", "long, thin stem", "papery texture"], "cautleya spicata": ["large, showy, bell-shaped flower", "color ranges from yellow to orange-red", "long floral tube with five to seven lobes", "leaves are long, thin, and sharply pointed", "dark green foliage", "can have a single flower or a cluster of flowers"], "japanese anemone": ["perennial flowering plant", "green, glossy leaves ", "large, white, pink, or purple flower heads", "yellow stamens in the centre of the flower", "a thick, woody stem", "a spreading root system"], "black-eyed susan": ["bright yellow petals with dark brown or black centers", "3-4 inch wide flower heads", "thin, pointed, green leaves", "thin, wiry stems", "grows in clusters", "grows in 
sunny, open areas"], "silverbush": ["silver-green foliage", "small, white flowers", "berries in shades of red, yellow, or black", "a woody stem with branches", "a rounded crown"], "californian poppy": ["four or five petals", "cup-shaped center", "long, thin leaves", "upright stems", "thick, leathery seed pods"], "osteospermum": ["perennial daisy-like flower", "blue, pink, purple, or white petals", "dark center with yellow or white stamens", "narrow, lanceolate leaves", "branching stems", "grows in a clump or mound"], "spring crocus": ["bulb-like flowers", "one or two large petals that are white, purple, yellow, or pink", "a long, thin stem", "narrow, dark green leaves", "a pointed bud", "a light green or yellow center"], "bearded iris": ["perennial flower", "pointed, sword-shaped leaves", "tall stems with three sepals and three petals", "large, showy petals with a tuft of hairs at the base", "a wide range of colors including purple, blue, white, yellow, pink, and red", "a sweet scent"], "windflower": ["thin, narrow petals", "star-shaped center", "white, pink, purple, or yellow petals", "long, pointed leaves", "hairy stem", "pointed seed heads"], "tree poppy": ["upright, branching stems", "bright green leaves", "white, pink, or yellow flowers", "large, round fruit", "a central stem", "a woody stem", "a long taproot"], "gazania": ["bright, daisy-like flower", "round petals in yellow, orange, cream, and red", "white and yellow center", "oval-shaped leaves", "thick, hairy stems", "low-growing, spreading habit"], "azalea": ["evergreen shrub", "many funnel-shaped flowers", "usually pink, white, or red", "fragrant", "thick dark green leaves", "woody stems", "delicate petals"], "water lily": ["Large, round, floating leaves", "Long stems for attaching to the bottom of the pond", "Showy flowers in shades of white, pink, yellow, or purple", "Pads of overlapping stems and root-like structures that anchor the lily to the pond\u2019s bottom", "Thick, waxy coating on the leaves", 
"Umbrella-like structure at the top of the flower stalk"], "rose": ["bush with thorny stems", "various shades of red, white, pink, and yellow", "five-petaled flowers", "fragrant aroma", "large, serrated leaves"], "thorn apple": ["a shrub or small tree", "purple, bell-shaped flowers", "white, yellow, or pink-tipped spines", "large, bright green leaves", "small, round, yellow-orange fruit", "bark with a gray or gray-brown color"], "morning glory": ["trumpet-shaped flowers", "a variety of colors, including white, blue, pink, purple, and yellow", "heart-shaped leaves", "tendrils for climbing ", "a woody stem"], "passion flower": ["a long, thin stem", "five heart-shaped petals", "a crown of five stamens", "a ring of five pistils", "an inner halo of colorful thread-like filaments", "a large, round seed pod", "a sweet fragrance"], "lotus lotus": ["aquatic plant", "large, rounded leaves", "large, showy flowers with petals in shades of pink, white, or yellow", "long, slender stems", "floating roots", "a thick, spongy stalk", "a hard, woody rhizome"], "toad lily": ["perennial flowering plant", "large leaves with mottled patterns", "funnel-shaped flowers in shades of pink, purple, or white", "long stems", "tall, upright habit", "seed capsules"], "anthurium": ["waxy and glossy leaves", "glossy and waxy flowers", "heart-shaped leaves", "leathery leaves", "dense clusters of flowers", "long, thin spadix with a fleshy white spike", "multiple colors such as red, pink, white, yellow, and purple"], "frangipani": ["a tropical flowering plant", "long leaves", "distinctive trumpet-shaped flowers", "five overlapping petals", "a range of pastel colours such as pink, white, yellow and orange", "a sweet scent"], "clematis": ["climbing vine", "large, showy flowers in colors such as purple, blue, pink, and white", "opposite leaves with 5-7 leaflets", "long, slender stems", "feather-like seed heads", "woody stems that can reach up to 20 feet in length"], "hibiscus": ["an oval shaped flower", 
"colorful petals in shades of pink, red, yellow, orange, or white", "a large, showy center with numerous stamens", "dark green leaves with serrated edges", "a woody stem with thorns"], "columbine": ["perennial flower", "bell-shaped petals in white, blue, purple, red, yellow, orange, or pink", "two long spurs on the back of the flower", "delicate, fern-like leaves", "long stem"], "desert-rose": ["a small, slow-growing succulent plant", "a cluster of grey-green leaves radiating from the center", "a large, cup-shaped pink or red flower", "a thick, woody stem", "long, spiky thorns growing along the stem", "a crown shaped like a five-pointed star"], "tree mallow": ["perennial shrub", "big and round leaves with a velvety texture", "white, pink, or purple flowers", "fruits that are usually small and black", "a main stem with multiple branches", "can reach heights of up to 6 feet"], "magnolia": ["white, pink, purple, or yellow", "smooth, leathery leaves", "cone-shaped seed pods", "pyramidal or rounded shape", "large, fleshy root system", "bark that is gray and furrowed"], "cyclamen": ["small, upright perennial plant", "dark green leaves with silvery-green markings", "pink, purple, white, or red flowers", "heart-shaped leaves", "dark, pointed central cone", "fleshy stems and roots", "long and narrow seed-like capsules"], "watercress": ["white, hollow stems", "small flowers", "pungent, peppery flavor", "long, thin stems", "thin, wispy leaves", "small, round seeds", "dark green leaves with a glossy sheen"], "canna lily": ["colorful flowers ranging from yellow, orange, red, and pink", "large, dark green buds", "a thick, sturdy stem", "a cluster of seedpods at the end of the stem", "the leaves are arranged in a fan-like shape"], "hippeastrum": ["bulbous root system", "sword-shaped leaves", "trumpet-shaped flowers in shades of red, orange, pink, yellow, or white", "six anthers at the top of the flower", "star-shaped stigma at the base of the flower", "three outer petals and 
three inner petals"], "bee balm": ["Leaves are usually opposite and oval-shaped", "Five-petaled flowers in shades of red, pink, or purple", "Stems are square or four-sided", "Pollinators are usually bumblebees, honeybees, and other bees"], "ball moss": ["a small, round, greenish-brown plant", "often grows in clusters", "velvety surface", "small, round, yellow flowers", "small, round, whitish fruit", "short, thick, fibrous roots"], "foxglove": ["pink, purple, white, or yellow-green flowers", "spotted inside each flower ", "long, thin leaves", "hairy, sticky stems", "small, bell-shaped seed pods"], "bougainvillea": ["dark green, waxy leaves", "a variety of brightly colored bracts (petals)", "thorny stems", "a woody vine-like structure", "a large, showy flower cluster", "tendrils for climbing support"], "camellia": ["evergreen shrub", "glossy, dark green leaves", "white, pink, or red single or double flowers", "fragrant, cup-shaped flowers", "thick, leathery leaves", "small, black seed pods", "waxy coating on petals and leaves"], "mallow": ["flowering plant", "round, fleshy leaves", "light green or grey-green in color", "small, pink, white, or yellow flowers", "small, hard fruits", "long, thick stems", "grows in moist, sunny areas"], "mexican petunia": ["perenial shrub", "bright purple, pink, or white flowers", "long, narrow leaves", "stem with white hairs", "thick, woody center", "a cluster of flower buds at the end of each stem"], "bromelia": ["bright, colorful foliage", "spiky or silver-gray leaves", "thick, fleshy stems", "a single, central flower spike", "white, yellow, pink, or orange flowers", "red, purple, or green fruit", "a rosette-like growth pattern"], "blanket flower": ["bell-shaped flower with five petals", "center of the flower is filled with yellow stamens", "stems, leaves, and bracts are hairy and gray-green", "blooms in summer and early fall", "grows in clusters of several flowers"], "trumpet creeper": ["a large, trumpet-shaped flower", "five bright 
yellow petals", "red-orange central pistil", "a long, curved stem", "long vines that twine around objects", "leaves that are deep green and heart-shaped"], "blackberry lily": ["a flowering plant", "a single stem with a basal rosette of leaves", "star-shaped, five-petaled yellow or orange flowers", "blackberry-like seed pods", "long, thin, bright green, glossy leaves", "a tall, upright growth habit"]}
--------------------------------------------------------------------------------
/descriptors/descriptors_food101.json:
--------------------------------------------------------------------------------
1 | {
2 | "apple pie": [
3 | "a pie dish",
4 | "a crust",
5 | "filling made of apples",
6 | "sugar",
7 | "cinnamon",
8 | "nutmeg",
9 | "butter",
10 | "eggs"
11 | ],
12 | "baby back ribs": [
13 | "a rack of ribs",
14 | "usually pork or beef",
15 | "each rib has a bone with meat attached",
16 | "the meat is usually covered in barbecue sauce"
17 | ],
18 | "baklava": [
19 | "a sweet pastry",
20 | "made of layers of filo dough",
21 | "filled with chopped nuts and sweetened with syrup or honey",
22 | "usually cut into diamond or square shapes",
23 | "served as a dessert or snack"
24 | ],
25 | "beef carpaccio": [
26 | "red meat",
27 | "thinly sliced",
28 | "served raw or rare",
29 | "garnished with herbs or spices",
30 | "olive oil or lemon juice"
31 | ],
32 | "beef tartare": [
33 | "red meat",
34 | "chopped or ground",
35 | "served raw",
36 | "often with egg, capers, and onions",
37 | "may be garnished with herbs or spices"
38 | ],
39 | "beet salad": [
40 | "a bowl or plate of food",
41 | "red and green vegetables",
42 | "lettuce",
43 | "other salad ingredients like croutons, cheese, or dressing",
44 | "a fork or spoon"
45 | ],
46 | "beignets": [
47 | "fried doughnut-shaped pastry",
48 | "coated with powdered sugar",
49 | "served hot"
50 | ],
51 | "bibimbap": [
52 | "a bowl of rice",
53 | "topped with vegetables, meat, and/or an egg",
54 | "often served with gochujang (red chili pepper paste)",
55 | "can be served with kimchi on the side"
56 | ],
57 | "bread pudding": [
58 | "a dessert or sweet dish",
59 | "made with bread or a bread-like product",
60 | "typically contains eggs, milk, sugar, and spices",
61 | "may be baked or steamed",
62 | "often served with a sauce or topping"
63 | ],
64 | "breakfast burrito": [
65 | "a flour tortilla",
66 | "eggs",
67 | "cheese",
68 | "meat",
69 | "vegetables",
70 | "salsa",
71 | "hot sauce"
72 | ],
73 | "bruschetta": [
74 | "a piece of bread",
75 | "topped with olive oil, garlic, and tomatoes",
76 | "often served as an appetizer"
77 | ],
78 | "caesar salad": [
79 | "a salad made with romaine lettuce, croutons, Parmesan cheese, and Caesar dressing"
80 | ],
81 | "cannoli": [
82 | "an Italian pastry",
83 | "a tube-shaped shell",
84 | "made of fried dough",
85 | "filled with sweetened ricotta",
86 | "chocolate chips",
87 | "candied fruit",
88 | "nuts"
89 | ],
90 | "caprese salad": [
91 | "a salad made of tomatoes, mozzarella, and basil",
92 | "red, white, and green colors",
93 | "sliced or diced tomatoes",
94 | "sliced or diced mozzarella",
95 | "fresh basil leaves",
96 | "olive oil",
97 | "balsamic vinegar"
98 | ],
99 | "carrot cake": [
100 | "cake with carrots",
101 | "cream cheese frosting",
102 | "often has nuts or raisins",
103 | "may have a green garnish"
104 | ],
105 | "ceviche": [
106 | "a dish of seafood",
107 | "typically includes fish, shrimp, and/or squid",
108 | "marinated in citrus juice",
109 | "served with onions, peppers, and cilantro",
110 | "may be garnished with avocado, lime, and/or chili peppers"
111 | ],
112 | "cheese plate": [
113 | "a plate with several different kinds of cheese",
114 | "a variety of crackers or bread",
115 | "a knife for cutting the cheese",
116 | "a spreader for spreading the cheese",
117 | "a napkin or paper towel"
118 | ],
119 | "cheesecake": [
120 | "a cake made with cheese",
121 | "usually has a graham cracker or cookie crust",
122 | "can be topped with fruit, whipped cream, or chocolate",
123 | "may have a design on top made with icing or chocolate"
124 | ],
125 | "chicken curry": [
126 | "a bowl or plate of food",
127 | "curry is usually a yellow, orange, or red color",
128 | "chunks of chicken",
129 | "rice or noodles",
130 | "vegetables",
131 | "spices"
132 | ],
133 | "chicken quesadilla": [
134 | "a flat, round tortilla",
135 | "chicken",
136 | "cheese",
137 | "salsa",
138 | "sour cream"
139 | ],
140 | "chicken wings": [
141 | "chicken wings",
142 | "fried",
143 | "sauced",
144 | "served with celery and blue cheese dressing"
145 | ],
146 | "chocolate cake": [
147 | "a cake",
148 | "chocolate flavor",
149 | "frosting",
150 | "decorations",
151 | "candles (optional)"
152 | ],
153 | "chocolate mousse": [
154 | "a chocolate mousse is a dessert made from chocolate, eggs, and cream,",
155 | "it is usually served in a glass or cup,",
156 | "it is smooth and creamy,",
157 | "it may have a whipped cream topping,",
158 | "it may be garnished with chocolate shavings or a cherry."
159 | ],
160 | "churros": [
161 | "fried-dough pastry",
162 | "coated in sugar",
163 | "often served with chocolate sauce",
164 | "long and thin",
165 | "twisted or spiraled shape"
166 | ],
167 | "clam chowder": [
168 | "a thick, creamy soup",
169 | "usually contains potatoes, onions, and celery",
170 | "may be white or red",
171 | "typically contains clams, bacon, and/or seafood",
172 | "may be garnished with parsley, thyme, or other herbs"
173 | ],
174 | "club sandwich": [
175 | "a sandwich with multiple layers of meat, cheese, and vegetables",
176 | "bread that is toasted or grilled",
177 | "mayonnaise or another sauce",
178 | "lettuce, tomato, and onion"
179 | ],
180 | "crab cakes": [
181 | "a cake or patty made of crab meat",
182 | "usually fried or baked",
183 | "may be served with a sauce or dressing",
184 | "may be garnished with vegetables or herbs"
185 | ],
186 | "creme brulee": [
187 | "a dessert",
188 | "a round, shallow dish",
189 | "a smooth, custard-like filling",
190 | "a hard, caramelized topping",
191 | "a spoon"
192 | ],
193 | "croque madame": [
194 | "a sandwich made of bread, ham, and cheese",
195 | "the bread is usually toasted or grilled",
196 | "the cheese is melted",
197 | "a fried or poached egg is placed on top",
198 | "served hot",
199 | "garnished with parsley or other herbs"
200 | ],
201 | "cup cakes": [
202 | "small, round cake",
203 | "frosting on top",
204 | "sprinkles or other decorations on top",
205 | "a paper or plastic cupcake liner"
206 | ],
207 | "deviled eggs": [
208 | "a hard-boiled egg",
209 | "mayonnaise",
210 | "mustard",
211 | "vinegar",
212 | "paprika",
213 | "salt",
214 | "pepper"
215 | ],
216 | "donuts": [],
217 | "dumplings": [
218 | "small, round, and filled with meat or vegetables",
219 | "wrapped in a thin dough",
220 | "boiled, steamed, or fried",
221 | "served with a dipping sauce",
222 | "can be made with different fillings"
223 | ],
224 | "edamame": [
225 | "a pod with small, green beans inside",
226 | "a stem attached to the pod",
227 | "leaves attached to the stem",
228 | "a plant with a woody stem",
229 | "a plant that is a member of the pea family"
230 | ],
231 | "eggs benedict": [
232 | "a dish composed of an English muffin, ham, a poached egg, and hollandaise sauce",
233 | "the egg is usually cooked so that the yolk is runny",
234 | "may be garnished with parsley, paprika, or other herbs"
235 | ],
236 | "escargots": [
237 | "small, snail-like creature",
238 | "brown, grey, or white",
239 | "slimy body",
240 | "long, spiral shell",
241 | "two feelers on head",
242 | "two pairs of tentacles"
243 | ],
244 | "falafel": [
245 | "a deep-fried ball or patty",
246 | "made from ground chickpeas, fava beans, or both",
247 | "usually served in a pita or wrap",
248 | "topped with salads, pickles, and sauces"
249 | ],
250 | "filet mignon": [
251 | "a cut of beef",
252 | "usually served as a steak",
253 | "tender and lean",
254 | "can be grilled, pan-fried, or roasted",
255 | "typically served with a sauce or vegetables",
256 | "four-limbed animal",
257 | "fur",
258 | "whiskers",
259 | "a tail",
260 | "pointy ears"
261 | ],
262 | "fish and chips": [
263 | "a plate of food",
264 | "fried fish",
265 | "french fries",
266 | "ketchup or vinegar"
267 | ],
268 | "foie gras": [
269 | "a type of French cuisine",
270 | "made from the liver of a duck or goose",
271 | "can be served as a pate or mousse",
272 | "often served with a sweet or savory sauce",
273 | "may be garnished with fruit, nuts, or herbs"
274 | ],
275 | "french fries": [],
276 | "french onion soup": [
277 | "a bowl of soup",
278 | "onions in the soup",
279 | "a bread crumb topping",
280 | "melted cheese on top",
281 | "a spoon for eating the soup"
282 | ],
283 | "french toast": [
284 | "bread that has been soaked in egg and milk",
285 | "then fried in a pan",
286 | "often served with butter, syrup, or fruit",
287 | "four-limbed animal",
288 | "fur",
289 | "whiskers",
290 | "a tail",
291 | "pointy ears",
292 | "big, round eyes"
293 | ],
294 | "fried calamari": [],
295 | "fried rice": [
296 | "a dish made of rice that has been stir-fried in a wok or a frying pan",
297 | "can be made with various ingredients like meats, vegetables, and eggs",
298 | "usually has a brown or yellowish color",
299 | "can be served with soy sauce, chili sauce, or other condiments"
300 | ],
301 | "frozen yogurt": [
302 | "a cup or cone of frozen yogurt",
303 | "toppings such as fruits, nuts, or candy",
304 | "a spoon",
305 | "a napkin"
306 | ],
307 | "garlic bread": [
308 | "a loaf of bread",
309 | "with garlic cloves",
310 | "and butter or olive oil",
311 | "often with parsley or other herbs",
312 | "served hot or cold"
313 | ],
314 | "gnocchi": [
315 | "small, round, and thick",
316 | "made from potato, flour, and egg",
317 | "can be boiled, baked, or fried",
318 | "served with sauce or vegetables"
319 | ],
320 | "greek salad": [
321 | "a mix of greens, such as lettuce, spinach, and arugula",
322 | "cherry tomatoes",
323 | "Kalamata olives",
324 | "feta cheese",
325 | "red onion",
326 | "cucumber",
327 | "green bell pepper",
328 | "olive oil",
329 | "red wine vinegar",
330 | "oregano",
331 | "salt",
332 | "pepper"
333 | ],
334 | "grilled cheese sandwich": [
335 | "two slices of bread",
336 | "cheese melted in between the slices of bread",
337 | "butter on the bread",
338 | "grill marks on the bread"
339 | ],
340 | "grilled salmon": [
341 | "a fish",
342 | "pink or red-orange",
343 | "fleshy with scales",
344 | "a grilled or cooked appearance",
345 | "may be served with lemon, butter, or other sauces"
346 | ],
347 | "guacamole": [
348 | "a green, brown, or black color",
349 | "a smooth or textured surface",
350 | "a dip or spread",
351 | "made from avocados, onions, tomatoes, and other ingredients",
352 | "often served with chips, tacos, or burritos"
353 | ],
354 | "gyoza": [
355 | "a small, round, flat dumpling",
356 | "made of wheat flour dough and filled with meat and vegetables",
357 | "usually served steamed, boiled, or fried",
358 | "can be served with dipping sauce"
359 | ],
360 | "hamburger": [
361 | "a round, flat bun",
362 | "a cooked beef patty",
363 | "lettuce, tomato, onion, pickles, cheese, and/or other condiments",
364 | "ketchup, mustard, and/or other sauces"
365 | ],
366 | "hot and sour soup": [
367 | "a bowl of soup",
368 | "steam coming off the soup",
369 | "chunks of vegetables or meat in the soup",
370 | "a sour and spicy smell",
371 | "a spoon for eating the soup"
372 | ],
373 | "hot dog": [
374 | "a long, thin, cylindrical shape",
375 | "a smooth surface",
376 | "a bun around it",
377 | "mustard or ketchup on it"
378 | ],
379 | "huevos rancheros": [
380 | "a dish typically consisting of a fried egg, a tortilla, and beans",
381 | "can also include cheese, avocado, salsa, and sour cream",
382 | "the egg is usually sunny side up or over easy",
383 | "served with rice and beans"
384 | ],
385 | "hummus": [
386 | "a smooth, thick paste",
387 | "made from chickpeas, tahini, olive oil, and lemon juice",
388 | "can be flavored with garlic, cumin, paprika, or other spices",
389 | "typically served with pita bread, vegetables, or crackers",
390 | "can be used as a dip or spread"
391 | ],
392 | "ice cream": [
393 | "a sweet, creamy food",
394 | "made from milk, cream, and sugar",
395 | "can be flavored with fruit, nuts, chocolate, or other ingredients",
396 | "served cold",
397 | "can be soft or hard"
398 | ],
399 | "lasagna": [
400 | "a large, rectangular dish",
401 | "multiple layers of pasta, sauce, and cheese",
402 | "typically red or brown in color",
403 | "may have meat or vegetables",
404 | "served hot"
405 | ],
406 | "lobster bisque": [
407 | "a creamy, orange soup",
408 | "chunks of lobster meat",
409 | "a garnish of green onions or parsley",
410 | "a dollop of sour cream or cr\u00e8me fraiche",
411 | "a drizzle of olive oil",
412 | "a sprinkle of paprika"
413 | ],
414 | "lobster roll sandwich": [
415 | "a sandwich with a lobster tail and claw meat filling",
416 | "served on a toasted bun",
417 | "mayonnaise, celery, and lemon juice often added to the filling",
418 | "can be served cold or hot",
419 | "found primarily in New England"
420 | ],
421 | "macaroni and cheese": [
422 | "a bowl or dish of food",
423 | "yellow, orange, or white",
424 | "noodles and cheese",
425 | "a fork or spoon",
426 | "a napkin"
427 | ],
428 | "macarons": [
429 | "a small, round, flat cookie",
430 | "made of two discs of almond meringue with a filling in between",
431 | "the filling is usually flavored with chocolate, coffee, lemon, or raspberry",
432 | "the cookies are often brightly colored",
433 | "they are usually served in pairs"
434 | ],
435 | "miso soup": [
436 | "a bowl of soup",
437 | "a light brown or tan color",
438 | "chunks of tofu",
439 | "green onions",
440 | "seaweed",
441 | "a spoon"
442 | ],
443 | "mussels": [],
444 | "nachos": [
445 | "a dish of tortilla chips",
446 | "covered in cheese",
447 | "with various toppings such as meat, vegetables, and sauce"
448 | ],
449 | "omelette": [
450 | "a cooked egg dish",
451 | "usually made with eggs, milk, and butter",
452 | "can include other ingredients like cheese, ham, or vegetables",
453 | "typically served for breakfast or brunch",
454 | "can be made in a variety of shapes and sizes",
455 | "can be served with a variety of toppings"
456 | ],
457 | "onion rings": [
458 | "a deep-fried food",
459 | "a ring-shaped",
460 | "battered or breaded",
461 | "onion as the main ingredient",
462 | "served with a dipping sauce"
463 | ],
464 | "oysters": [],
465 | "pad thai": [
466 | "a type of Asian noodle dish",
467 | "noodles are usually made from rice, wheat, or egg",
468 | "can be stir-fried, boiled, or served in a soup",
469 | "often contains vegetables, protein, and a sauce",
470 | "may be garnished with peanuts, cilantro, lime, and bean sprouts"
471 | ],
472 | "paella": [
473 | "a large, flat, round dish",
474 | "typically made of metal or ceramic",
475 | "food is cooked and served in the same dish",
476 | "typically has rice, seafood, and vegetables",
477 | "can be made in different colors, depending on the ingredients used"
478 | ],
479 | "pancakes": [
480 | "a stack of pancakes",
481 | "each pancake is round and flat",
482 | "pancakes are usually served with syrup, butter, and/or fruit"
483 | ],
484 | "panna cotta": [
485 | "a dessert made with cream and gelatin",
486 | "typically served in a small, round dish",
487 | "can be decorated with fruit, sauce, or other toppings",
488 | "has a smooth, creamy texture",
489 | "can be made in a variety of flavors"
490 | ],
491 | "peking duck": [
492 | "a waterfowl with a long neck and small head",
493 | "a reddish-brown plumage with a white chest and belly",
494 | "webbed feet",
495 | "a flat bill",
496 | "a long, broad tail",
497 | "a black cap on the head"
498 | ],
499 | "pho": [
500 | "a type of Vietnamese noodle soup",
501 | "typically made with beef or chicken broth",
502 | "rice noodles",
503 | "herbs and spices",
504 | "bean sprouts",
505 | "lime wedges",
506 | "chili peppers",
507 | "protein (beef, chicken, shrimp, etc.)",
508 | "served with a side of hoisin and Sriracha sauce"
509 | ],
510 | "pizza": [
511 | "a round, flatbread crust",
512 | "a sauce, typically red",
513 | "toppings, such as cheese, vegetables, and meat",
514 | "a circumference of crust that is thicker than the rest of the pizza",
515 | "a diameter of about 10-12 inches"
516 | ],
517 | "pork chop": [
518 | "a piece of meat",
519 | "usually pink or white",
520 | "a bone in the center",
521 | "fat around the edges",
522 | "may be grilled, baked, or fried",
523 | "served with a side of vegetables or rice"
524 | ],
525 | "poutine": [
526 | "a dish of french fries",
527 | "covered in cheese curds",
528 | "topped with gravy"
529 | ],
530 | "prime rib": [
531 | "a large, roasted piece of meat",
532 | "typically served on a platter with vegetables",
533 | "the meat is usually pink or red in the center, with a layer of fat on top",
534 | "the meat is surrounded by a bone",
535 | "four-limbed animal",
536 | "typically covered in fur"
537 | ],
538 | "pulled pork sandwich": [
539 | "a sandwich with a filling of pulled pork",
540 | "the pork is usually cooked with barbecue sauce",
541 | "the sandwich may also include coleslaw or other toppings",
542 | "the sandwich is usually served on a bun or roll"
543 | ],
544 | "ramen": [
545 | "a bowl of noodles",
546 | "a broth",
547 | "vegetables",
548 | "meat",
549 | "spices"
550 | ],
551 | "ravioli": [
552 | "a small, round, pasta noodle",
553 | "filled with meat, cheese, or vegetables",
554 | "covered in sauce",
555 | "served with a fork and knife"
556 | ],
557 | "red velvet cake": [
558 | "cake",
559 | "red",
560 | "velvet",
561 | "frosting",
562 | "decorations"
563 | ],
564 | "risotto": [
565 | "a creamy, starchy dish",
566 | "made with rice",
567 | "can be made with various ingredients like vegetables, meats, or seafood",
568 | "often served as a main course",
569 | "can be garnished with cheese, herbs, or lemon"
570 | ],
571 | "samosa": [
572 | "a triangular or cone-shaped pastry",
573 | "filled with savory ingredients",
574 | "fried or baked",
575 | "served hot or cold",
576 | "often eaten as a snack",
577 | "four-limbed animal",
578 | "fur-covered body",
579 | "pointed ears",
580 | "whiskers",
581 | "a long tail",
582 | "sharp"
583 | ],
584 | "sashimi": [
585 | "raw, sliced fish",
586 | "often served with soy sauce and wasabi",
587 | "can be made with salmon, tuna, or other fish",
588 | "may be garnished with ginger, daikon, or other vegetables",
589 | "can be served with rice"
590 | ],
591 | "scallops": [
592 | "a marine invertebrate",
593 | "bivalve mollusc",
594 | "two hinged, convex shells",
595 | "a soft body with frills around the edge",
596 | "a muscular foot used for locomotion and feeding",
597 | "siphons for drawing in water"
598 | ],
599 | "seaweed salad": [
600 | "a mix of green, brown, and red seaweed",
601 | "shredded carrots",
602 | "sesame seeds",
603 | "served in a bowl or on a plate",
604 | "often with soy sauce or vinegar"
605 | ],
606 | "shrimp and grits": [
607 | "a bowl or plate of food",
608 | "shrimp",
609 | "grits",
610 | "a sauce or gravy",
611 | "vegetables",
612 | "spices"
613 | ],
614 | "spaghetti bolognese": [
615 | "a bowl or plate of spaghetti noodles",
616 | "a red or brown meat sauce",
617 | "shredded or grated cheese on top",
618 | "green herbs"
619 | ],
620 | "spaghetti carbonara": [
621 | "a type of pasta dish",
622 | "usually made with spaghetti",
623 | "can be made with other types of pasta",
624 | "a sauce made with eggs, cheese, and bacon",
625 | "can also include onion, garlic, and black pepper",
626 | "served with a sprinkle of parsley"
627 | ],
628 | "spring rolls": [
629 | "cylindrical shape",
630 | "light brown",
631 | "translucent",
632 | "filled with vegetables",
633 | "served with dipping sauce"
634 | ],
635 | "steak": [
636 | "red or brown",
637 | "meat",
638 | "a grilled or cooked surface",
639 | "a juicy or fatty appearance",
640 | "a bone or gristle",
641 | "a knife and fork"
642 | ],
643 | "strawberry shortcake": [
644 | "a cake or biscuit with strawberries and cream",
645 | "red, white, and green",
646 | "a shortcake or biscuit base",
647 | "whipped cream or icing",
648 | "fresh or frozen strawberries"
649 | ],
650 | "sushi": [
651 | "Japanese dish",
652 | "sushi is typically made with raw fish, seafood, or vegetables",
653 | "sushi is often served with soy sauce, wasabi, and pickled ginger",
654 | "sushi is typically rolled in nori (seaweed)",
655 | "sushi can be served in a variety of ways, including nigiri (sliced fish on top of rice), maki (rolled sushi), and temaki (hand-rolled sushi)"
656 | ],
657 | "tacos": [
658 | "a soft or hard taco shell",
659 | "ground beef, chicken, or pork",
660 | "shredded lettuce",
661 | "diced tomatoes",
662 | "grated cheese",
663 | "sour cream",
664 | "salsa"
665 | ],
666 | "takoyaki": [
667 | "a small, round ball of batter",
668 | "filled with octopus, vegetables, and/or other ingredients",
669 | "grilled or fried",
670 | "served with a sauce and/or other toppings",
671 | "on a skewer or in a small, round container"
672 | ],
673 | "tiramisu": [
674 | "a dessert",
675 | "coffee-flavored",
676 | "layered",
677 | "made with ladyfingers",
678 | "mascarpone cheese",
679 | "cocoa powder"
680 | ],
681 | "tuna tartare": [
682 | "a small, round dish",
683 | "pink or red tuna",
684 | "avocado",
685 | "cucumber",
686 | "sesame seeds",
687 | "scallions",
688 | "soy sauce",
689 | "wasabi"
690 | ],
691 | "waffles": [
692 | "a breakfast food",
693 | "a round or square shape",
694 | "a grid pattern",
695 | "syrup or fruit toppings",
696 | "whipped cream"
697 | ]
698 | }
--------------------------------------------------------------------------------
/descriptors/descriptors_pets.json:
--------------------------------------------------------------------------------
1 | {
2 | "Abyssinian": [
3 | "black, grey, or brown fur",
4 | "long, slender legs",
5 | "pointed ears",
6 | "a long, bushy tail",
7 | "a ruff of fur around the neck"
8 | ],
9 | "Bengal": [
10 | "a large, muscular body",
11 | "short fur that is orange with black spots",
12 | "long, black stripes on the face, back, and tail",
13 | "black spots on the legs, belly, and chest",
14 | "green or blue eyes"
15 | ],
16 | "Birman": [
17 | "four-limbed animal",
18 | "long, fluffy fur",
19 | "pointed ears",
20 | "blue eyes",
21 | "white paws"
22 | ],
23 | "Bombay": [
24 | "black or brown fur",
25 | "short ears",
26 | "long tail",
27 | "muscular body",
28 | "short legs",
29 | "long claws"
30 | ],
31 | "British Shorthair": [
32 | "short, dense fur",
33 | "large, round eyes",
34 | "a short, broad muzzle",
35 | "small, rounded ears",
36 | "a thick, muscular body",
37 | "short legs",
38 | "a broad, round head"
39 | ],
40 | "Egyptian Mau": [
41 | "a medium-sized, short-haired cat",
42 | "a sleek, muscular body",
43 | "large ears",
44 | "long legs",
45 | "a long, slender tail",
46 | "a spotted or striped coat in shades of brown, black, silver, or blue"
47 | ],
48 | "Maine Coon": [
49 | "large, muscular body",
50 | "long, thick fur",
51 | "large, bushy tail",
52 | "large, tufted ears",
53 | "large, expressive eyes"
54 | ],
55 | "Persian": [
56 | "long, thick fur",
57 | "round face",
58 | "short nose",
59 | "large, round eyes",
60 | "small ears",
61 | "short legs",
62 | "long body"
63 | ],
64 | "Ragdoll": [
65 | "a large, muscular body",
66 | "a long, thick coat that is usually white with darker markings",
67 | "blue eyes",
68 | "a fluffy tail",
69 | "large, pointy ears"
70 | ],
71 | "Russian Blue": [
72 | "blue-grey fur",
73 | "green or yellow eyes",
74 | "long, fluffy tail",
75 | "triangular head shape",
76 | "pointed ears"
77 | ],
78 | "Siamese": [
79 | "blue eyes",
80 | "point coloration",
81 | "long, slender body",
82 | "triangular head",
83 | "large ears",
84 | "short tail"
85 | ],
86 | "Sphynx": [
87 | "a breed of cat",
88 | "hairless",
89 | "large ears",
90 | "long legs",
91 | "a long, slender body",
92 | "a short tail",
93 | "almond-shaped eyes"
94 | ],
95 | "american bulldog": [
96 | "four-legged mammal",
97 | "short, stocky body",
98 | "large head",
99 | "square jaw",
100 | "short snout",
101 | "black, brown, or white fur",
102 | "short tail",
103 | "muscular build"
104 | ],
105 | "american pit bull terrier": [
106 | "muscular body",
107 | "short, stiff hair",
108 | "short snout",
109 | "wide head",
110 | "large, powerful jaws",
111 | "strong, powerful legs",
112 | "thick tail",
113 | "black, brown, or white coat"
114 | ],
115 | "basset hound": [
116 | "short-legged dog breed",
117 | "long, droopy ears",
118 | "long, low-slung body",
119 | "short, smooth coat",
120 | "typically black, tan, or white in color",
121 | "often has a \"sad\" expression"
122 | ],
123 | "beagle": [
124 | "short-haired dog",
125 | "brown, black, and white",
126 | "floppy ears",
127 | "short snout",
128 | "long, droopy tail",
129 | "small to medium size"
130 | ],
131 | "boxer": [
132 | "short-haired dog",
133 | "brindle, fawn, or black coat",
134 | "square-shaped head",
135 | "floppy ears",
136 | "muscular body",
137 | "stubby tail"
138 | ],
139 | "chihuahua": [
140 | "small size",
141 | "short legs",
142 | "long body",
143 | "large ears",
144 | "round head",
145 | "short snout",
146 | "dark eyes"
147 | ],
148 | "english cocker spaniel": [
149 | "a medium-sized dog",
150 | "a long, silky coat",
151 | "a feathered tail",
152 | "long, floppy ears",
153 | "a round head",
154 | "dark, expressive eyes",
155 | "a muscular body",
156 | "short legs"
157 | ],
158 | "english setter": [
159 | "four-limbed animal",
160 | "black, white, or liver-colored",
161 | "long, silky fur",
162 | "long ears",
163 | "long tail",
164 | "webbed feet",
165 | "large, dark eyes"
166 | ],
167 | "german shorthaired": [
168 | "short, smooth coat",
169 | "liver and white or black and white",
170 | "large, round eyes",
171 | "long ears",
172 | "muscular body",
173 | "long, docked tail",
174 | "webbed feet"
175 | ],
176 | "great pyrenees": [
177 | "a large, white, fluffy dog",
178 | "often has a black or brown face",
179 | "large, triangular ears",
180 | "a long tail",
181 | "four black paws"
182 | ],
183 | "havanese": [
184 | "small to medium size",
185 | "long, silky coat",
186 | "may be any color or combination of colors",
187 | "plumed tail that curls over the back",
188 | "long, floppy ears",
189 | "dark, expressive eyes"
190 | ],
191 | "japanese chin": [
192 | "small, compact body",
193 | "short legs",
194 | "large, round head",
195 | "black, white, or black and white coat",
196 | "long, silky fur",
197 | "small, black eyes",
198 | "black nose",
199 | "small, black ears"
200 | ],
201 | "keeshond": [
202 | "a thick, double coat of fur that is black and silver or black and cream in color",
203 | "a bushy tail that is carried over the back",
204 | "a wedge-shaped head",
205 | "small, triangular ears",
206 | "dark brown eyes",
207 | "a black nose"
208 | ],
209 | "leonberger": [
210 | "four-limbed animal",
211 | "black, brown, or yellow",
212 | "wet nose",
213 | "long tail",
214 | "large eyes",
215 | "furry body",
216 | "clawed feet"
217 | ],
218 | "miniature pinscher": [
219 | "small size",
220 | "short coat",
221 | "black, brown, or red coloration",
222 | "pointed ears",
223 | "long, slender muzzle",
224 | "docked tail",
225 | "muscular body",
226 | "high energy level"
227 | ],
228 | "newfoundland": [
229 | "large, stocky body",
230 | "thick, waterproof coat",
231 | "webbed feet",
232 | "large head",
233 | "droopy ears",
234 | "soulful eyes"
235 | ],
236 | "pomeranian": [
237 | "small size",
238 | "round head",
239 | "large, pointy ears",
240 | "thick, fluffy coat",
241 | "black, brown, white, or cream-colored fur"
242 | ],
243 | "pug": [
244 | "short-snouted dog breed",
245 | "wrinkled face",
246 | "black, fawn, or silver coat",
247 | "small, round eyes",
248 | "curled tail",
249 | "compact body",
250 | "short legs"
251 | ],
252 | "saint bernard": [
253 | "four-legged animal",
254 | "large size",
255 | "long, thick fur",
256 | "dark brown or black fur",
257 | "white markings on the chest and face",
258 | "a short tail",
259 | "a muscular build",
260 | "a large head",
261 | "droopy ears",
262 | "a long snout"
263 | ],
264 | "samoyed": [
265 | "a white, fluffy coat",
266 | "a thick, double coat",
267 | "a long, bushy tail",
268 | "small, black eyes",
269 | "small, black ears",
270 | "a black nose"
271 | ],
272 | "scottish terrier": [
273 | "small to medium-sized dog",
274 | "short, thick coat",
275 | "black, grey, or brindle",
276 | "bushy eyebrows and beard",
277 | "pointed ears",
278 | "long, curved tail",
279 | "strong, muscular build"
280 | ],
281 | "shiba inu": [
282 | "a small to medium sized dog",
283 | "a short, double-coated fur",
284 | "a triangular head",
285 | "small, pointed ears",
286 | "dark, almond shaped eyes",
287 | "a black nose",
288 | "a compact body",
289 | "short legs",
290 | "a long, curled tail"
291 | ],
292 | "staffordshire bull terrier": [
293 | "short, stocky body",
294 | "short, thick legs",
295 | "short, wide head",
296 | "large, square jaw",
297 | "small, dark eyes",
298 | "short, stiff tail",
299 | "short, stiff coat",
300 | "brindle, black, or blue coat",
301 | "white markings on the chest, feet, and face"
302 | ],
303 | "wheaten terrier": [
304 | "a medium-sized, solidly-built dog",
305 | "a square head with a short muzzle",
306 | "dark, almond-shaped eyes",
307 | "V-shaped ears that fold forward",
308 | "a long neck that is slightly arched",
309 | "a deep chest",
310 | "a long, straight back",
311 | "a tail that is carried low",
312 | "a dense, wavy coat that is wheaten in color"
313 | ],
314 | "yorkshire terrier": [
315 | "small size",
316 | "long, silky hair",
317 | "triangular ears",
318 | "black, brown, or blue and tan coat",
319 | "dark eyes",
320 | "short muzzle"
321 | ]
322 | }
--------------------------------------------------------------------------------
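Note on the descriptor files above: each one maps a class name to a list of descriptor strings, and some lists are empty (for example `"oysters": []` in the food101 file). A minimal sketch of a sanity check over that structure follows. `summarize_descriptors` is a hypothetical helper, not part of the repository, and the inline sample is a small excerpt in the same shape rather than a full file.

```python
import json

# Small excerpt in the same shape as the descriptor files
# (class name -> list of descriptor strings).
sample = json.loads("""
{
    "oysters": [],
    "poutine": ["a dish of french fries", "covered in cheese curds", "topped with gravy"],
    "Birman": ["four-limbed animal", "long, fluffy fur", "pointed ears", "blue eyes", "white paws"]
}
""")


def summarize_descriptors(descriptors):
    """Return (number of classes, classes without descriptors, mean descriptors per class)."""
    empty = [name for name, descs in descriptors.items() if len(descs) == 0]
    total = sum(len(descs) for descs in descriptors.values())
    mean_len = total / len(descriptors) if descriptors else 0.0
    return len(descriptors), empty, mean_len


n_classes, empty_classes, mean_len = summarize_descriptors(sample)
print(n_classes, empty_classes, round(mean_len, 2))
```

To run the same check on a real file, the sample could be swapped for `json.load` on one of the `descriptors/descriptors_*.json` paths.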
/environment.yaml:
--------------------------------------------------------------------------------
1 | name: waffle
2 | channels:
3 |   - pytorch
4 |   - defaults
5 | dependencies:
6 |   - python=3.8.5
7 |   - pip=20.3
8 |   - cudatoolkit=11.3
9 |   - pytorch=1.12.1
10 |   - torchvision=0.13.1
11 |   - numpy=1.23.1
12 |   - scipy=1.10.0
13 |   - tqdm
14 |   - pip:
15 |     - webdataset==0.2.5
16 |     - torchmetrics==0.6.0
17 |     - matplotlib
18 |     - termcolor
19 |     - git+https://github.com/openai/CLIP.git
20 |     - git+https://github.com/modestyachts/ImageNetV2_pytorch
21 |     - -e .
--------------------------------------------------------------------------------
/evaluate_results.py:
--------------------------------------------------------------------------------
1 | #%% Select relevant files.
2 | import numpy as np
3 | import pandas as pd
4 |
5 | files = [
6 |     'results/baselines.csv',
7 |     'results/baselines_concept.csv',
8 |     'results/baselines_gpt.csv',
9 |     'results/shared_randomized_descriptions.csv',
10 |     'results/shared_randomized_descriptions_2xbudget.csv',
11 |     'results/swapped_descriptions.csv',
12 |     'results/scrambled_descriptions.csv',
13 |     'results/randomized_descriptions.csv',
14 |     'results/randomized_descriptions_5xbudget.csv',
15 |     'results/waffleclip.csv',
16 |     'results/waffleclip_concepts.csv',
17 |     'results/waffleclip_gpt.csv',
18 |     'results/waffleclip_gpt_concepts.csv',
19 | ]
20 |
21 | #%% Extract classification accuracies from files.
22 | table_data = []
23 | for file in files:
24 |     out = pd.read_csv(file, header=None)
25 |     out = np.array(out)
26 |
27 |     # Per Dataset Context
28 |     # datasets = np.array(['flowers102', 'fgvcaircraft', 'cars'])
29 |     datasets = np.array(['imagenetv2', 'imagenet', 'cub', 'eurosat', 'places365', 'food101', 'pets', 'dtd'])
30 |     fixed_cols = []
31 |     resp_dataset = []
32 |     resp_dataset_row_idcs = []
33 |     resp_dataset_col_idcs = []
34 |     for row in out:
35 |         coln = row[0]
36 |         for dataset in datasets:
37 |             coln = coln.replace(f'dataset={dataset}; ', '')
38 |         if 'label_before_text' in coln:
39 |             coln = coln.split('label_before_text')[0] + ''.join(''.join(coln.split('label_before_text')[1:]).split(';')[1:])
40 |         fixed_cols.append(coln)
41 |     unique_cols, idcs = np.unique(fixed_cols, return_index=True)
42 |     unique_cols = np.array(unique_cols)[np.argsort(idcs)]
43 |
44 |     for i, row in enumerate(out):
45 |         dataset = row[0].split('dataset=')[-1].split(';')[0]
46 |         resp_dataset_col_idcs.append(np.where(datasets == dataset)[0][0])
47 |         resp_dataset_row_idcs.append(np.where(unique_cols == fixed_cols[i])[0][0])
48 |         resp_dataset.append(dataset)
49 |
50 |
51 |     # Per Dataset Context
52 |     mean_results_top1 = np.ones((len(unique_cols), len(datasets))) * np.nan
53 |     std_results_top1 = np.ones((len(unique_cols), len(datasets))) * np.nan
54 |     for k in range(len(out)):
55 |         i = resp_dataset_row_idcs[k]
56 |         j = resp_dataset_col_idcs[k]
57 |         mean_results_top1[i, j] = np.round(out[k][1], 2)
58 |         std_results_top1[i, j] = np.round(out[k][2], 2)
59 |
60 |     mean_avgs_top1 = np.nanmean(mean_results_top1, axis=-1)
61 |     norm = np.sum(~np.isnan(mean_results_top1), axis=-1).reshape(-1, 1)
62 |     std_avgs_top1 = np.sqrt(np.nansum(std_results_top1 ** 2 / norm, axis=-1))  # nansum: skip datasets missing from a file, matching nanmean above
63 |
64 |     result_str = np.ones((len(unique_cols)+1, len(datasets) + 1)).astype(str)
65 |     result_str[0, :] = [str(x) for x in datasets] + ['Avg']
66 |     for i in range(mean_results_top1.shape[0]):
67 |         for j in range(mean_results_top1.shape[1]):
68 |             result_str[i+1, j] = '{0:2.2f} ({1:2.2f})'.format(mean_results_top1[i, j], std_results_top1[i, j])
69 |         result_str[i+1, -1] = '{0:2.2f} ({1:2.2f})'.format(mean_avgs_top1[i], std_avgs_top1[i])
70 |
71 |
72 |     # Per Dataset Context
73 |     table_data.append(file)
74 |     for i in range(len(unique_cols)):
75 |         subcoll = []
76 |         for sub in result_str[i+1]:
77 |             res = sub.split(' ')
78 |             subcoll.append('{0} \\small$\\pm{1}$'.format(res[0], res[1].replace('(', '').replace(')', '')))
79 |         table_data.append(' & '.join(subcoll) + '\\\\')
80 |     table_data.append('--------')
81 |
82 |
83 |
84 | #%% Print final table.
85 | print('\n'.join(table_data))
86 |
87 |
--------------------------------------------------------------------------------
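Note on `evaluate_results.py` above: for each method row, the script averages the per-dataset top-1 means with `np.nanmean` and combines the per-dataset standard deviations as `sqrt(sum(std^2) / n)`, where `n` counts the datasets that actually have a result. A minimal sketch of that aggregation on made-up numbers follows. The `aggregate` helper and the input values are hypothetical, and using `np.nansum` so that missing datasets are skipped is an assumption about the intended behaviour.

```python
import numpy as np

# Hypothetical per-dataset results for two methods over three datasets.
# NaN marks a dataset that was not evaluated for that method.
means = np.array([[60.0, 70.0, 80.0],
                  [50.0, np.nan, 70.0]])
stds = np.array([[0.3, 0.4, 0.0],
                 [0.6, np.nan, 0.8]])


def aggregate(means, stds):
    """Average means over available datasets; combine stds as sqrt(sum(std^2) / n)."""
    n = np.sum(~np.isnan(means), axis=-1)          # datasets with a result, per row
    mean_avg = np.nanmean(means, axis=-1)          # ignores missing datasets
    std_avg = np.sqrt(np.nansum(stds ** 2, axis=-1) / n)
    return mean_avg, std_avg


mean_avg, std_avg = aggregate(means, stds)
for m, s in zip(mean_avg, std_avg):
    print('{0:2.2f} ({1:2.2f})'.format(m, s))
```

Note that this quantity is the root mean square of the per-dataset standard deviations, not the standard error of the averaged mean.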
/generate_concepts.py:
--------------------------------------------------------------------------------
1 | #%%
2 | import numpy as np
3 | import openai
4 | import json
5 |
6 | openai.api_key = 'your_key'
7 |
8 | #%% Create GPT-3 prompts with classname lists.
9 | # Seed NumPy so the sampled class names are reproducible (numpy is already imported above).
10 | np.random.seed(0)
11 | datasets = ['imagenet', 'cub', 'eurosat', 'places365', 'food101', 'pets', 'dtd', 'fgvcaircraft', 'cars', 'flowers102', ]
12 | single_prompts = []
13 | long_prompts = []
14 | for dataset in datasets:
15 |     out = json.load(open(f'descriptors/descriptors_{dataset}.json'))
16 |     lab_list = np.random.choice(list(out.keys()), np.min([25, len(out)]), replace=False)
17 |     lab_list = [x.replace('_', ' ') for x in lab_list]  # keep a list here; joining first would make the next step iterate over characters
18 |     if dataset == 'places365':
19 |         lab_list = [x.replace('-', ' ') for x in lab_list]
20 |     long_prompt = "Q: Tell me in five words or less what " + ', '.join(lab_list) + " have in common. It may be nothing. A: They are all "
21 |     long_prompts.append(long_prompt)
22 |
23 |
24 | #%% Query GPT-3
25 | out = openai.Completion.create(
26 |     model="text-davinci-003", prompt=long_prompts,
27 |     temperature=0., max_tokens=300
28 | )
29 |
30 | #%% Generated Concepts:
31 | classes = []
32 | for elem in out['choices']:
33 |     concept = elem['text'].replace('\n', '').replace('.', '').replace('?', '').lower()
34 |     if concept.endswith('s'):  # endswith also copes with an empty completion
35 |         concept = concept[:-1]
36 |     classes.append(concept)
37 |
38 | print('Predicted Classes for each dataset: ')
39 | for dataset, classname in zip(datasets, classes):
40 |     print(f'Dataset: {dataset} - {classname}')
--------------------------------------------------------------------------------
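Note on `generate_concepts.py` above: the post-processing step lowercases each completion, strips newlines, periods, and question marks, and drops one trailing "s" to get a singular concept. The result files store prefixes such as `label_before_text=A photo of a bird: a `, which suggests how a concept is turned into a prompt prefix. A minimal sketch of both steps follows. `clean_concept` and `label_prefix` are hypothetical names, the example completions are made up, and the prefix template is inferred from the result files rather than taken from the repository code.

```python
def clean_concept(text):
    """Normalize a raw completion into a lowercase, singular concept string."""
    concept = text.replace('\n', '').replace('.', '').replace('?', '').lower()
    if concept.endswith('s'):  # safe even when the completion is empty
        concept = concept[:-1]
    return concept


def label_prefix(concept):
    """Build a prompt prefix in the style seen in the result files (assumed template)."""
    return f'A photo of a {concept}: a '


# Made-up completions, one per dataset.
raw_completions = ['Birds.\n', 'land uses', '']
concepts = [clean_concept(t) for t in raw_completions]
print(concepts)
print(repr(label_prefix(concepts[1])))
```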
/generate_descriptors.py:
--------------------------------------------------------------------------------
1 | #%%
2 | import openai
3 | import json
4 | import pathlib
5 | import numpy as np
6 | import time
7 | openai.api_key = 'your_key'
8 |
9 | #%% Example Description Generation for FGVCAircraft
10 | from torchvision.datasets import FGVCAircraft
11 |
12 | AIRCRAFT_DIR = 'your_path/fgvcaircraft'
13 | data_dir = pathlib.Path(AIRCRAFT_DIR)
14 | dataset = FGVCAircraft(data_dir, split='test', annotation_level='family', download=True)
15 | classnames = dataset.classes
16 |
17 | #%% Generate Prompts.
18 | def generate_prompt(category_name: str):
19 |     # you can replace the examples with whatever you want; these were random and worked, could be improved
20 |     return f"""Q: What are useful visual features for distinguishing a lemur in a photo?
21 | A: There are several useful visual features to tell there is a lemur in a photo:
22 | - four-limbed primate
23 | - black, grey, white, brown, or red-brown
24 | - wet and hairless nose with curved nostrils
25 | - long tail
26 | - large eyes
27 | - furry bodies
28 | - clawed hands and feet
29 | Q: What are useful visual features for distinguishing a television in a photo?
30 | A: There are several useful visual features to tell there is a television in a photo:
31 | - electronic device
32 | - black or grey
33 | - a large, rectangular screen
34 | - a stand or mount to support the screen
35 | - one or more speakers
36 | - a power cord
37 | - input ports for connecting to other devices
38 | - a remote control
39 | Q: What are useful features for distinguishing a {category_name} in a photo?
40 | A: There are several useful visual features to tell there is a {category_name} in a photo:
41 | -
42 | """
43 |
44 | prompts = [generate_prompt(_c) for _c in classnames]
45 |
46 | #%% Query GPT-3.
47 | def stringtolist(description):
48 |     return [descriptor[2:] for descriptor in description.split('\n') if (descriptor != '') and (descriptor.startswith('- '))]
49 |
50 | descriptions = []
51 | for i in np.arange(0, len(prompts), 20):
52 |     st = time.time()
53 |     response = openai.Completion.create(
54 |         model="text-davinci-003",
55 |         prompt=prompts[i:i + 20],
56 |         temperature=0.7,
57 |         max_tokens=100,
58 |     )
59 |     print(i, ":", time.time() - st)
60 |     descriptions += [stringtolist(_r["text"]) for _r in response["choices"]]
61 |
62 | #%% Write generated descriptions to JSON.
63 | descriptions_dict = {_c: _d for _c, _d in zip(dataset.classes, descriptions)}
64 | with open('descriptors/placeholder_name.json', 'w') as outfile:
65 |     outfile.write(json.dumps(descriptions_dict, indent=4))
66 |
--------------------------------------------------------------------------------
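Note on `generate_descriptors.py` above: `stringtolist` keeps only the lines of a completion that start with `- ` and strips that prefix. A minimal sketch of the parsing step on a made-up completion follows. The completion text is hypothetical and no API call is made.

```python
def stringtolist(description):
    """Keep lines that start with '- ' and strip that bullet prefix."""
    return [line[2:] for line in description.split('\n') if line.startswith('- ')]


# Made-up completion text: one line without a bullet, two bullets,
# a blank line, and a trailing fragment cut off by the token limit.
hypothetical_completion = (
    " long fuselage\n"
    "- two wing-mounted engines\n"
    "\n"
    "- swept-back wings\n"
    "Q: What are"
)

descriptors = stringtolist(hypothetical_completion)
print(descriptors)
```

Lines without the `- ` prefix are discarded, including any text that continues the prompt's final bullet. A completion made up only of such lines yields an empty list, which is one possible explanation for empty entries in the descriptor files.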
/images/main.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ExplainableML/WaffleCLIP/7a1b8ee48e31285f62ecd839fecb6b89cbef81f1/images/main.png
--------------------------------------------------------------------------------
/images/teaser.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ExplainableML/WaffleCLIP/7a1b8ee48e31285f62ecd839fecb6b89cbef81f1/images/teaser.png
--------------------------------------------------------------------------------
/results/baselines.csv:
--------------------------------------------------------------------------------
1 | savename=baselines; dataset=imagenetv2; mode=clip; model_size=ViT-L/14,67.90000200271606,0.0,67.90000200271606,67.90000200271606,89.52999711036682,0.0,89.52999711036682,89.52999711036682
2 | savename=baselines; dataset=imagenet; mode=clip; model_size=ViT-L/14,73.3739972114563,0.0,73.3739972114563,73.3739972114563,93.35799813270569,0.0,93.35799813270569,93.35799813270569
3 | savename=baselines; dataset=cub; mode=clip; model_size=ViT-L/14,62.27131485939026,0.0,62.27131485939026,62.27131485939026,92.07801222801208,0.0,92.07801222801208,92.07801222801208
4 | savename=baselines; dataset=eurosat; mode=clip; model_size=ViT-L/14,56.029629707336426,0.0,56.029629707336426,56.029629707336426,92.51852035522461,0.0,92.51852035522461,92.51852035522461
5 | savename=baselines; dataset=places365; mode=clip; model_size=ViT-L/14,40.46301245689392,0.0,40.46301245689392,40.46301245689392,68.18355917930603,0.0,68.18355917930603,68.18355917930603
6 | savename=baselines; dataset=food101; mode=clip; model_size=ViT-L/14,92.55049228668213,0.0,92.55049228668213,92.55049228668213,99.24356341362,0.0,99.24356341362,99.24356341362
7 | savename=baselines; dataset=pets; mode=clip; model_size=ViT-L/14,93.29517483711243,0.0,93.29517483711243,93.29517483711243,99.80921149253845,0.0,99.80921149253845,99.80921149253845
8 | savename=baselines; dataset=dtd; mode=clip; model_size=ViT-L/14,52.8723418712616,0.0,52.8723418712616,52.8723418712616,79.2553186416626,0.0,79.2553186416626,79.2553186416626
9 | savename=baselines; dataset=imagenetv2; mode=clip; model_size=ViT-B/32,54.739999771118164,0.0,54.739999771118164,54.739999771118164,81.99999928474426,0.0,81.99999928474426,81.99999928474426
10 | savename=baselines; dataset=imagenet; mode=clip; model_size=ViT-B/32,61.99600100517273,0.0,61.99600100517273,61.99600100517273,87.6579999923706,0.0,87.6579999923706,87.6579999923706
11 | savename=baselines; dataset=cub; mode=clip; model_size=ViT-B/32,51.27718448638916,0.0,51.27718448638916,51.27718448638916,83.44839215278625,0.0,83.44839215278625,83.44839215278625
12 | savename=baselines; dataset=eurosat; mode=clip; model_size=ViT-B/32,40.76296389102936,0.0,40.76296389102936,40.76296389102936,90.7444417476654,0.0,90.7444417476654,90.7444417476654
13 | savename=baselines; dataset=places365; mode=clip; model_size=ViT-B/32,39.12328779697418,0.0,39.12328779697418,39.12328779697418,69.72876787185669,0.0,69.72876787185669,69.72876787185669
14 | savename=baselines; dataset=food101; mode=clip; model_size=ViT-B/32,82.59009718894958,0.0,82.59009718894958,82.59009718894958,96.87920808792114,0.0,96.87920808792114,96.87920808792114
15 | savename=baselines; dataset=pets; mode=clip; model_size=ViT-B/32,85.06404757499695,0.0,85.06404757499695,85.06404757499695,97.35623002052307,0.0,97.35623002052307,97.35623002052307
16 | savename=baselines; dataset=dtd; mode=clip; model_size=ViT-B/32,43.13829839229584,0.0,43.13829839229584,43.13829839229584,73.56383204460144,0.0,73.56383204460144,73.56383204460144
17 | savename=baselines; dataset=imagenetv2; mode=clip; model_size=RN50,51.3700008392334,0.0,51.3700008392334,51.3700008392334,79.8200011253357,0.0,79.8200011253357,79.8200011253357
18 | savename=baselines; dataset=imagenet; mode=clip; model_size=RN50,58.14800262451172,0.0,58.14800262451172,58.14800262451172,85.20200252532959,0.0,85.20200252532959,85.20200252532959
19 | savename=baselines; dataset=cub; mode=clip; model_size=RN50,45.184674859046936,0.0,45.184674859046936,45.184674859046936,80.47980666160583,0.0,80.47980666160583,80.47980666160583
20 | savename=baselines; dataset=eurosat; mode=clip; model_size=RN50,28.081482648849487,0.0,28.081482648849487,28.081482648849487,75.28148293495178,0.0,75.28148293495178,75.28148293495178
21 | savename=baselines; dataset=places365; mode=clip; model_size=RN50,36.6602748632431,0.0,36.6602748632431,36.6602748632431,67.51232743263245,0.0,67.51232743263245,67.51232743263245
22 | savename=baselines; dataset=food101; mode=clip; model_size=RN50,78.37227582931519,0.0,78.37227582931519,78.37227582931519,95.73861360549927,0.0,95.73861360549927,95.73861360549927
23 | savename=baselines; dataset=pets; mode=clip; model_size=RN50,83.75579118728638,0.0,83.75579118728638,83.75579118728638,97.43799567222595,0.0,97.43799567222595,97.43799567222595
24 | savename=baselines; dataset=dtd; mode=clip; model_size=RN50,38.51063847541809,0.0,38.51063847541809,38.51063847541809,70.37234306335449,0.0,70.37234306335449,70.37234306335449
25 |
--------------------------------------------------------------------------------
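Note on the result files: each row starts with a run key of `name=value` pairs separated by `; `, followed by numeric columns. `evaluate_results.py` reads column 1 as the top-1 mean and column 2 as the top-1 standard deviation. A minimal sketch of parsing one row follows. `parse_run_key` is a hypothetical helper, and the meaning of the remaining columns is not stated in this listing, so they are left as raw floats.

```python
import csv


def parse_run_key(key):
    """Split 'name=value; name=value' into a dict, splitting each pair on the first '='."""
    fields = {}
    for part in key.split('; '):
        name, _, value = part.partition('=')
        fields[name] = value
    return fields


# One row copied from results/baselines.csv.
row = ("savename=baselines; dataset=cub; mode=clip; model_size=ViT-L/14,"
       "62.27131485939026,0.0,62.27131485939026,62.27131485939026,"
       "92.07801222801208,0.0,92.07801222801208,92.07801222801208")

cells = next(csv.reader([row]))
run = parse_run_key(cells[0])
values = [float(v) for v in cells[1:]]
top1_mean, top1_std = values[0], values[1]  # columns used by evaluate_results.py
print(run['dataset'], run['model_size'], round(top1_mean, 2), top1_std)
```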
/results/baselines_concept.csv:
--------------------------------------------------------------------------------
1 | savename=baselines_concept; dataset=cub; mode=clip; model_size=ViT-L/14; label_before_text=A photo of a bird: a ,63.0134642124176,0.0,63.0134642124176,63.0134642124176,92.50949025154114,0.0,92.50949025154114,92.50949025154114
2 | savename=baselines_concept; dataset=eurosat; mode=clip; model_size=ViT-L/14; label_before_text=A photo of a land use: a ,61.22962832450867,0.0,61.22962832450867,61.22962832450867,95.39999961853027,0.0,95.39999961853027,95.39999961853027
3 | savename=baselines_concept; dataset=places365; mode=clip; model_size=ViT-L/14; label_before_text=A photo of a place: a ,41.0712331533432,0.0,41.0712331533432,41.0712331533432,68.53424906730652,0.0,68.53424906730652,68.53424906730652
4 | savename=baselines_concept; dataset=food101; mode=clip; model_size=ViT-L/14; label_before_text=A photo of a food: a ,93.52079033851624,0.0,93.52079033851624,93.52079033851624,99.33465123176575,0.0,99.33465123176575,99.33465123176575
5 | savename=baselines_concept; dataset=pets; mode=clip; model_size=ViT-L/14; label_before_text=A photo of a breed: a ,93.64949464797974,0.0,93.64949464797974,93.64949464797974,99.89097714424133,0.0,99.89097714424133,99.89097714424133
6 | savename=baselines_concept; dataset=cub; mode=clip; model_size=ViT-B/32; label_before_text=A photo of a bird: a ,52.2091805934906,0.0,52.2091805934906,52.2091805934906,84.93269085884094,0.0,84.93269085884094,84.93269085884094
7 | savename=baselines_concept; dataset=eurosat; mode=clip; model_size=ViT-B/32; label_before_text=A photo of a land use: a ,48.859259486198425,0.0,48.859259486198425,48.859259486198425,89.32222127914429,0.0,89.32222127914429,89.32222127914429
8 | savename=baselines_concept; dataset=places365; mode=clip; model_size=ViT-B/32; label_before_text=A photo of a place: a ,39.306849241256714,0.0,39.306849241256714,39.306849241256714,69.03561353683472,0.0,69.03561353683472,69.03561353683472
9 | savename=baselines_concept; dataset=food101; mode=clip; model_size=ViT-B/32; label_before_text=A photo of a food: a ,84.6613883972168,0.0,84.6613883972168,84.6613883972168,97.5960373878479,0.0,97.5960373878479,97.5960373878479
10 | savename=baselines_concept; dataset=pets; mode=clip; model_size=ViT-B/32; label_before_text=A photo of a breed: a ,86.7266297340393,0.0,86.7266297340393,86.7266297340393,98.06486964225769,0.0,98.06486964225769,98.06486964225769
11 | savename=baselines_concept; dataset=cub; mode=clip; model_size=RN50; label_before_text=A photo of a bird: a ,46.617189049720764,0.0,46.617189049720764,46.617189049720764,81.22195601463318,0.0,81.22195601463318,81.22195601463318
12 | savename=baselines_concept; dataset=eurosat; mode=clip; model_size=RN50; label_before_text=A photo of a land use: a ,34.04814898967743,0.0,34.04814898967743,34.04814898967743,81.55555725097656,0.0,81.55555725097656,81.55555725097656
13 | savename=baselines_concept; dataset=places365; mode=clip; model_size=RN50; label_before_text=A photo of a place: a ,37.44109570980072,0.0,37.44109570980072,37.44109570980072,67.66027212142944,0.0,67.66027212142944,67.66027212142944
14 | savename=baselines_concept; dataset=food101; mode=clip; model_size=RN50; label_before_text=A photo of a food: a ,80.87524771690369,0.0,80.87524771690369,80.87524771690369,96.70494794845581,0.0,96.70494794845581,96.70494794845581
15 | savename=baselines_concept; dataset=pets; mode=clip; model_size=RN50; label_before_text=A photo of a breed: a ,83.34696292877197,0.0,83.34696292877197,83.34696292877197,97.98310399055481,0.0,97.98310399055481,97.98310399055481
16 |
--------------------------------------------------------------------------------
/results/baselines_gpt.csv:
--------------------------------------------------------------------------------
1 | savename=baselines_gpt; dataset=imagenetv2; mode=gpt_descriptions; model_size=ViT-L/14,69.72000002861023,0.0,69.72000002861023,69.72000002861023,90.82000255584717,0.0,90.82000255584717,90.82000255584717
2 | savename=baselines_gpt; dataset=imagenet; mode=gpt_descriptions; model_size=ViT-L/14,75.2560019493103,0.0,75.2560019493103,75.2560019493103,94.60999965667725,0.0,94.60999965667725,94.60999965667725
3 | savename=baselines_gpt; dataset=cub; mode=gpt_descriptions; model_size=ViT-L/14,63.531237840652466,0.0,63.531237840652466,63.531237840652466,92.49223470687866,0.0,92.49223470687866,92.49223470687866
4 | savename=baselines_gpt; dataset=eurosat; mode=gpt_descriptions; model_size=ViT-L/14,58.718520402908325,0.0,58.718520402908325,58.718520402908325,95.34074068069458,0.0,95.34074068069458,95.34074068069458
5 | savename=baselines_gpt; dataset=places365; mode=gpt_descriptions; model_size=ViT-L/14,42.59726107120514,0.0,42.59726107120514,42.59726107120514,70.99999785423279,0.0,70.99999785423279,70.99999785423279
6 | savename=baselines_gpt; dataset=food101; mode=gpt_descriptions; model_size=ViT-L/14,92.80791878700256,0.0,92.80791878700256,92.80791878700256,99.25148487091064,0.0,99.25148487091064,99.25148487091064
7 | savename=baselines_gpt; dataset=pets; mode=gpt_descriptions; model_size=ViT-L/14,93.89479160308838,0.0,93.89479160308838,93.89479160308838,99.86372590065002,0.0,99.86372590065002,99.86372590065002
8 | savename=baselines_gpt; dataset=dtd; mode=gpt_descriptions; model_size=ViT-L/14,56.59574270248413,0.0,56.59574270248413,56.59574270248413,84.8936140537262,0.0,84.8936140537262,84.8936140537262
9 | savename=baselines_gpt; dataset=imagenetv2; mode=gpt_descriptions; model_size=ViT-B/32,55.7699978351593,0.0,55.7699978351593,55.7699978351593,82.78999924659729,0.0,82.78999924659729,82.78999924659729
10 | savename=baselines_gpt; dataset=imagenet; mode=gpt_descriptions; model_size=ViT-B/32,63.11600208282471,0.0,63.11600208282471,63.11600208282471,88.4660005569458,0.0,88.4660005569458,88.4660005569458
11 | savename=baselines_gpt; dataset=cub; mode=gpt_descriptions; model_size=ViT-B/32,52.46807336807251,0.0,52.46807336807251,52.46807336807251,84.6392810344696,0.0,84.6392810344696,84.6392810344696
12 | savename=baselines_gpt; dataset=eurosat; mode=gpt_descriptions; model_size=ViT-B/32,43.296295404434204,0.0,43.296295404434204,43.296295404434204,85.58889031410217,0.0,85.58889031410217,85.58889031410217
13 | savename=baselines_gpt; dataset=places365; mode=gpt_descriptions; model_size=ViT-B/32,40.45479595661163,0.0,40.45479595661163,40.45479595661163,71.68493270874023,0.0,71.68493270874023,71.68493270874023
14 | savename=baselines_gpt; dataset=food101; mode=gpt_descriptions; model_size=ViT-B/32,82.79207944869995,0.0,82.79207944869995,82.79207944869995,96.81980013847351,0.0,96.81980013847351,96.81980013847351
15 | savename=baselines_gpt; dataset=pets; mode=gpt_descriptions; model_size=ViT-B/32,86.53584122657776,0.0,86.53584122657776,86.53584122657776,99.42763447761536,0.0,99.42763447761536,99.42763447761536
16 | savename=baselines_gpt; dataset=dtd; mode=gpt_descriptions; model_size=ViT-B/32,43.989360332489014,0.0,43.989360332489014,43.989360332489014,76.80851221084595,0.0,76.80851221084595,76.80851221084595
17 | savename=baselines_gpt; dataset=imagenetv2; mode=gpt_descriptions; model_size=RN50,52.64999866485596,0.0,52.64999866485596,52.64999866485596,81.04000091552734,0.0,81.04000091552734,81.04000091552734
18 | savename=baselines_gpt; dataset=imagenet; mode=gpt_descriptions; model_size=RN50,59.67000126838684,0.0,59.67000126838684,59.67000126838684,86.2119972705841,0.0,86.2119972705841,86.2119972705841
19 | savename=baselines_gpt; dataset=cub; mode=gpt_descriptions; model_size=RN50,47.7908194065094,0.0,47.7908194065094,47.7908194065094,81.91232085227966,0.0,81.91232085227966,81.91232085227966
20 | savename=baselines_gpt; dataset=eurosat; mode=gpt_descriptions; model_size=RN50,34.340742230415344,0.0,34.340742230415344,34.340742230415344,78.9222240447998,0.0,78.9222240447998,78.9222240447998
21 | savename=baselines_gpt; dataset=places365; mode=gpt_descriptions; model_size=RN50,38.394519686698914,0.0,38.394519686698914,38.394519686698914,69.87397074699402,0.0,69.87397074699402,69.87397074699402
22 | savename=baselines_gpt; dataset=food101; mode=gpt_descriptions; model_size=RN50,78.57029438018799,0.0,78.57029438018799,78.57029438018799,95.51287293434143,0.0,95.51287293434143,95.51287293434143
23 | savename=baselines_gpt; dataset=pets; mode=gpt_descriptions; model_size=RN50,85.71817874908447,0.0,85.71817874908447,85.71817874908447,99.45489168167114,0.0,99.45489168167114,99.45489168167114
24 | savename=baselines_gpt; dataset=dtd; mode=gpt_descriptions; model_size=RN50,41.0638302564621,0.0,41.0638302564621,41.0638302564621,73.1382966041565,0.0,73.1382966041565,73.1382966041565
25 |
--------------------------------------------------------------------------------
/results/randomized_descriptions.csv:
--------------------------------------------------------------------------------
1 | savename=randomized_descriptions; dataset=imagenetv2; mode=random_descriptions; randomization_budget=1; reps=7; model_size=ViT-L/14,68.01285658563886,0.2193963268489544,68.3899998664856,67.71000027656555,90.32857077462333,0.09493327231840444,90.47999978065491,90.17999768257141
2 | savename=randomized_descriptions; dataset=imagenet; mode=random_descriptions; randomization_budget=1; reps=7; model_size=ViT-L/14,73.88914227485657,0.07857203555022833,74.02600049972534,73.76599907875061,93.95914162908282,0.05285970342949365,94.05199885368347,93.90599727630615
3 | savename=randomized_descriptions; dataset=cub; mode=random_descriptions; randomization_budget=1; reps=7; model_size=ViT-L/14,63.80492108208792,0.2185929834409417,64.10079598426819,63.444942235946655,92.48237098966327,0.2328134540709102,92.88919568061829,92.1643078327179
4 | savename=randomized_descriptions; dataset=eurosat; mode=random_descriptions; randomization_budget=1; reps=7; model_size=ViT-L/14,55.72486775262015,2.011813421650593,59.84814763069153,53.36666703224182,92.08783081599644,1.52697489626368,94.9222207069397,89.99629616737366
5 | savename=randomized_descriptions; dataset=places365; mode=random_descriptions; randomization_budget=1; reps=7; model_size=ViT-L/14,40.317025780677795,0.29408955281163046,40.68219065666199,39.89863097667694,68.69275995663234,0.3009473253902193,68.98903846740723,68.04657578468323
6 | savename=randomized_descriptions; dataset=food101; mode=random_descriptions; randomization_budget=1; reps=7; model_size=ViT-L/14,92.37397398267474,0.3065294934006108,92.68910884857178,91.79009795188904,99.22885383878436,0.024662279235043542,99.26732778549194,99.19207692146301
7 | savename=randomized_descriptions; dataset=pets; mode=random_descriptions; randomization_budget=1; reps=7; model_size=ViT-L/14,93.59887838363647,0.18794648230810387,93.78577470779419,93.21340918540955,99.83646869659424,0.025235267378448712,99.86372590065002,99.80921149253845
8 | savename=randomized_descriptions; dataset=dtd; mode=random_descriptions; randomization_budget=1; reps=7; model_size=ViT-L/14,52.83434731619699,0.46171695188412815,53.67021560668945,52.1276593208313,80.55471096720014,0.6050470722772221,81.17021322250366,79.41489219665527
9 | savename=randomized_descriptions; dataset=imagenetv2; mode=random_descriptions; randomization_budget=1; reps=7; model_size=ViT-B/32,54.11285758018494,0.25443075157742867,54.57000136375427,53.860002756118774,81.80571453911918,0.10795106592896152,81.95000290870667,81.62999749183655
10 | savename=randomized_descriptions; dataset=imagenet; mode=random_descriptions; randomization_budget=1; reps=7; model_size=ViT-B/32,61.37514284678868,0.1815740671374845,61.580002307891846,61.02399826049805,87.59257112230573,0.08156938606804462,87.68200278282166,87.44199872016907
11 | savename=randomized_descriptions; dataset=cub; mode=random_descriptions; randomization_budget=1; reps=7; model_size=ViT-B/32,52.418759039470125,0.18735341298914965,52.67518162727356,52.105629444122314,84.31875279971531,0.3232676351339853,84.88091230392456,83.93165469169617
12 | savename=randomized_descriptions; dataset=eurosat; mode=random_descriptions; randomization_budget=1; reps=7; model_size=ViT-B/32,36.825925537518096,4.263402585377881,43.655556440353394,31.437036395072937,85.45026438576835,2.7148114617949304,87.36666440963745,80.67407608032227
13 | savename=randomized_descriptions; dataset=places365; mode=random_descriptions; randomization_budget=1; reps=7; model_size=ViT-B/32,38.80117450441633,0.2550864336073194,39.260274171829224,38.37534189224243,69.27906019347054,0.2853654529515133,69.75616216659546,68.88766884803772
14 | savename=randomized_descriptions; dataset=food101; mode=random_descriptions; randomization_budget=1; reps=7; model_size=ViT-B/32,82.86393114498684,0.2260587673820239,83.18019509315491,82.47524499893188,96.97595408984593,0.04509906444966285,97.04158306121826,96.89108729362488
15 | savename=randomized_descriptions; dataset=pets; mode=random_descriptions; randomization_budget=1; reps=7; model_size=ViT-B/32,85.947904416493,0.6132421395387265,86.7811381816864,84.87326502799988,98.36078371320453,0.3808603899183583,98.88253211975098,97.76505827903748
16 | savename=randomized_descriptions; dataset=dtd; mode=random_descriptions; randomization_budget=1; reps=7; model_size=ViT-B/32,42.19604858330318,0.8528255901137699,43.191489577293396,40.95744788646698,72.5379935332707,1.4938916491554117,74.84042644500732,70.79787254333496
17 | savename=randomized_descriptions; dataset=imagenetv2; mode=random_descriptions; randomization_budget=1; reps=7; model_size=RN50,51.60714217594692,0.28277859802448757,52.00999975204468,51.179999113082886,79.83714171818325,0.1271337780463577,79.97999787330627,79.60000038146973
18 | savename=randomized_descriptions; dataset=imagenet; mode=random_descriptions; randomization_budget=1; reps=7; model_size=RN50,58.29628535679409,0.1581974838562844,58.50200057029724,58.01200270652771,85.2817143712725,0.12616139627969633,85.46599745750427,85.11800169944763
19 | savename=randomized_descriptions; dataset=cub; mode=random_descriptions; randomization_budget=1; reps=7; model_size=RN50,47.39632138184139,0.24796192958302954,47.7217823266983,47.048670053482056,81.47098081452506,0.35763986009488546,82.08491802215576,80.99758625030518
20 | savename=randomized_descriptions; dataset=eurosat; mode=random_descriptions; randomization_budget=1; reps=7; model_size=RN50,30.18095280442919,4.158255407977126,34.559258818626404,24.040740728378296,76.57248633248466,4.768356818304938,80.97407221794128,68.59999895095825
21 | savename=randomized_descriptions; dataset=places365; mode=random_descriptions; randomization_budget=1; reps=7; model_size=RN50,36.815655657223296,0.27056919892080233,37.34520673751831,36.45205497741699,67.53072312899998,0.32043130188758867,68.0054783821106,66.92602634429932
22 | savename=randomized_descriptions; dataset=food101; mode=random_descriptions; randomization_budget=1; reps=7; model_size=RN50,78.86449779782977,0.24264050377480897,79.33465242385864,78.5663366317749,96.06393149920872,0.10084628190565298,96.27722501754761,95.96039652824402
23 | savename=randomized_descriptions; dataset=pets; mode=random_descriptions; randomization_budget=1; reps=7; model_size=RN50,84.53841039112636,0.16846447589818303,84.76424217224121,84.21913385391235,98.76961367470878,0.4082928501497086,99.2641031742096,97.84682393074036
24 | savename=randomized_descriptions; dataset=dtd; mode=random_descriptions; randomization_budget=1; reps=7; model_size=RN50,38.90577554702759,0.860442425432337,40.21276533603668,37.76595890522003,69.54407266208104,1.8098186248092942,72.34042286872864,67.34042763710022
25 |
--------------------------------------------------------------------------------
/results/randomized_descriptions_5xbudget.csv:
--------------------------------------------------------------------------------
1 | savename=randomized_descriptions_5xbudget; dataset=imagenetv2; mode=random_descriptions; randomization_budget=5; reps=7; model_size=ViT-L/14,69.27428671291896,0.1724489880875345,69.55000162124634,69.0500020980835,91.01857117244175,0.06854123210689575,91.10999703407288,90.88000059127808
2 | savename=randomized_descriptions_5xbudget; dataset=imagenet; mode=random_descriptions; randomization_budget=5; reps=7; model_size=ViT-L/14,75.10514259338379,0.0828608422934236,75.19599795341492,74.96600151062012,94.58400181361607,0.06628082057783422,94.70800161361694,94.49800252914429
3 | savename=randomized_descriptions_5xbudget; dataset=cub; mode=random_descriptions; randomization_budget=5; reps=7; model_size=ViT-L/14,64.25366231373378,0.16280405518555277,64.53227400779724,64.03175592422485,92.59578841073173,0.07663359405860674,92.68208742141724,92.47497320175171
4 | savename=randomized_descriptions_5xbudget; dataset=eurosat; mode=random_descriptions; randomization_budget=5; reps=7; model_size=ViT-L/14,58.335449014391216,1.5457922302046065,60.66666841506958,56.44444227218628,95.00052843775067,0.45911249248036506,95.68148255348206,94.47407126426697
5 | savename=randomized_descriptions_5xbudget; dataset=places365; mode=random_descriptions; randomization_budget=5; reps=7; model_size=ViT-L/14,42.106849380901885,0.14323106982643002,42.30958819389343,41.83835685253143,70.65479414803642,0.1289967586589501,70.8164393901825,70.46301364898682
6 | savename=randomized_descriptions_5xbudget; dataset=food101; mode=random_descriptions; randomization_budget=5; reps=7; model_size=ViT-L/14,93.22093384606498,0.11549666556572105,93.37029457092285,93.04158687591553,99.34653384344918,0.010370646804275073,99.36237335205078,99.33069348335266
7 | savename=randomized_descriptions_5xbudget; dataset=pets; mode=random_descriptions; randomization_budget=5; reps=7; model_size=ViT-L/14,93.87921946389335,0.08844692056946543,94.05832886695862,93.78577470779419,99.83257481030056,0.009538034535942309,99.83646869659424,99.80921149253845
8 | savename=randomized_descriptions_5xbudget; dataset=dtd; mode=random_descriptions; randomization_budget=5; reps=7; model_size=ViT-L/14,55.281154598508564,0.23048121746170872,55.58510422706604,54.89361882209778,81.85410244124276,0.4362522186373555,82.4999988079071,81.22340440750122
9 | savename=randomized_descriptions_5xbudget; dataset=imagenetv2; mode=random_descriptions; randomization_budget=5; reps=7; model_size=ViT-B/32,55.38857153483799,0.11993254343291275,55.61000108718872,55.16999959945679,82.6914301940373,0.11897548172857365,82.84000158309937,82.4400007724762
10 | savename=randomized_descriptions_5xbudget; dataset=imagenet; mode=random_descriptions; randomization_budget=5; reps=7; model_size=ViT-B/32,62.81428507396153,0.05037540513383202,62.9040002822876,62.74399757385254,88.37828465870449,0.03661969836303258,88.42399716377258,88.3080005645752
11 | savename=randomized_descriptions_5xbudget; dataset=cub; mode=random_descriptions; randomization_budget=5; reps=7; model_size=ViT-B/32,52.665319613048005,0.16831250765303213,52.88228988647461,52.34725475311279,84.5825731754303,0.14193826455483155,84.82913374900818,84.4149112701416
12 | savename=randomized_descriptions_5xbudget; dataset=eurosat; mode=random_descriptions; randomization_budget=5; reps=7; model_size=ViT-B/32,38.56984121458871,1.5176437145538342,40.33703804016113,36.040741205215454,88.68042315755572,0.8014705232328314,90.21852016448975,87.84444332122803
13 | savename=randomized_descriptions_5xbudget; dataset=places365; mode=random_descriptions; randomization_budget=5; reps=7; model_size=ViT-B/32,40.539335353033884,0.053656374275673616,40.63013792037964,40.43287634849548,71.1698625768934,0.07455830412870065,71.28493189811707,71.09314799308777
14 | savename=randomized_descriptions_5xbudget; dataset=food101; mode=random_descriptions; randomization_budget=5; reps=7; model_size=ViT-B/32,84.02828914778573,0.11371385176099083,84.16237831115723,83.83762240409851,97.28090592793056,0.03610217966879127,97.31088876724243,97.19604253768921
15 | savename=randomized_descriptions_5xbudget; dataset=pets; mode=random_descriptions; randomization_budget=5; reps=7; model_size=ViT-B/32,86.7733529635838,0.22836716668256388,87.1082067489624,86.45407557487488,98.34520816802979,0.20013348777553192,98.63722920417786,98.0376124382019
16 | savename=randomized_descriptions_5xbudget; dataset=dtd; mode=random_descriptions; randomization_budget=5; reps=7; model_size=ViT-B/32,43.41185390949249,0.7367293254849496,44.946807622909546,42.65957474708557,74.55167259488788,0.9251809378000958,76.17021203041077,73.51064085960388
17 | savename=randomized_descriptions_5xbudget; dataset=imagenetv2; mode=random_descriptions; randomization_budget=5; reps=7; model_size=RN50,52.810001373291016,0.07837591543386833,52.890002727508545,52.640002965927124,80.66714235714504,0.08101421242399832,80.77999949455261,80.51999807357788
18 | savename=randomized_descriptions_5xbudget; dataset=imagenet; mode=random_descriptions; randomization_budget=5; reps=7; model_size=RN50,59.73114201000759,0.05407840142882488,59.7819983959198,59.62799787521362,86.09971404075623,0.041684907206984714,86.1739993095398,86.03799939155579
19 | savename=randomized_descriptions_5xbudget; dataset=cub; mode=random_descriptions; randomization_budget=5; reps=7; model_size=RN50,47.71438453878675,0.11736911271197402,47.859856486320496,47.53192961215973,81.92465049879891,0.09395232565838038,82.08491802215576,81.756991147995
20 | savename=randomized_descriptions_5xbudget; dataset=eurosat; mode=random_descriptions; randomization_budget=5; reps=7; model_size=RN50,34.54232769353049,0.7371544130267534,35.592591762542725,33.39629769325256,77.2391540663583,1.0977556637645023,79.5037031173706,76.17407441139221
21 | savename=randomized_descriptions_5xbudget; dataset=places365; mode=random_descriptions; randomization_budget=5; reps=7; model_size=RN50,38.62309200423105,0.14336536583569118,38.87397348880768,38.44383656978607,69.57769223621914,0.07334659000091259,69.67671513557434,69.46301460266113
22 | savename=randomized_descriptions_5xbudget; dataset=food101; mode=random_descriptions; randomization_budget=5; reps=7; model_size=RN50,80.18953204154968,0.1262617790996535,80.3801953792572,79.94059324264526,96.48147055080959,0.024215995313715127,96.5227723121643,96.44356369972229
23 | savename=randomized_descriptions_5xbudget; dataset=pets; mode=random_descriptions; randomization_budget=5; reps=7; model_size=RN50,85.3132426738739,0.15719902852264572,85.50013899803162,85.00953912734985,99.13561514445713,0.18640704093629037,99.40038323402405,98.90978336334229
24 | savename=randomized_descriptions_5xbudget; dataset=dtd; mode=random_descriptions; randomization_budget=5; reps=7; model_size=RN50,40.33434646470206,0.45996186754810503,41.01063907146454,39.52127695083618,71.07142806053162,0.33931707595236893,71.54255509376526,70.53191661834717
25 |
--------------------------------------------------------------------------------
/results/scrambled_descriptions.csv:
--------------------------------------------------------------------------------
1 | savename=scrambled_descriptions; dataset=imagenetv2; mode=scrambled_descriptions; reps=7; model_size=ViT-L/14,68.68000030517578,0.2139419453681329,69.01999711990356,68.29000115394592,90.47857267516,0.17771833362417314,90.89000225067139,90.31000137329102
2 | savename=scrambled_descriptions; dataset=imagenet; mode=scrambled_descriptions; reps=7; model_size=ViT-L/14,74.4740000792912,0.10755962861809022,74.63200092315674,74.34599995613098,94.14742844445365,0.06248786771703957,94.23199892044067,94.01400089263916
3 | savename=scrambled_descriptions; dataset=cub; mode=scrambled_descriptions; reps=7; model_size=ViT-L/14,63.78026604652405,0.13204101566783522,63.94546031951904,63.6002779006958,92.42319805281502,0.08505422216449562,92.57853031158447,92.31964349746704
4 | savename=scrambled_descriptions; dataset=eurosat; mode=scrambled_descriptions; reps=7; model_size=ViT-L/14,55.97566195896694,2.011426742360334,58.46666693687439,52.57037281990051,93.62116370882306,1.3564206629567974,95.06666660308838,91.20000004768372
5 | savename=scrambled_descriptions; dataset=places365; mode=scrambled_descriptions; reps=7; model_size=ViT-L/14,41.28649703093937,0.22633565416982424,41.56986176967621,40.8109575510025,70.03718188830784,0.14577471371356018,70.3068494796753,69.87671256065369
6 | savename=scrambled_descriptions; dataset=food101; mode=scrambled_descriptions; reps=7; model_size=ViT-L/14,92.29419997760228,0.20453941413805343,92.59405732154846,91.88514947891235,99.1949064391,0.03785143688434753,99.25544261932373,99.13663268089294
7 | savename=scrambled_descriptions; dataset=pets; mode=scrambled_descriptions; reps=7; model_size=ViT-L/14,93.52489965302604,0.18477316918208822,93.86754035949707,93.21340918540955,99.86372334616524,0.03568545604383744,99.91823434829712,99.80921149253845
8 | savename=scrambled_descriptions; dataset=dtd; mode=scrambled_descriptions; reps=7; model_size=ViT-L/14,53.28267386981419,1.117822875748091,55.15957474708557,51.48935914039612,82.735561473029,0.6951963678004311,83.61701965332031,81.7553162574768
9 | savename=scrambled_descriptions; dataset=imagenetv2; mode=scrambled_descriptions; reps=7; model_size=ViT-B/32,55.12999892234802,0.12011956877485727,55.29000163078308,54.909998178482056,82.49999965940204,0.060946390830177534,82.59000182151794,82.41999745368958
10 | savename=scrambled_descriptions; dataset=imagenet; mode=scrambled_descriptions; reps=7; model_size=ViT-B/32,62.57371391568865,0.11963942982846638,62.74399757385254,62.414002418518066,88.11542817524501,0.06371143859368061,88.17999958992004,87.98199892044067
11 | savename=scrambled_descriptions; dataset=cub; mode=scrambled_descriptions; reps=7; model_size=ViT-B/32,52.18205877712795,0.2794951558003448,52.657920122146606,51.846736669540405,84.34587376458305,0.16708068065852358,84.65654253959656,84.17328000068665
12 | savename=scrambled_descriptions; dataset=eurosat; mode=scrambled_descriptions; reps=7; model_size=ViT-B/32,40.483068568365915,2.518662615491737,44.7518527507782,37.35185265541077,84.51164194515773,2.122127836272097,86.65555715560913,79.88518476486206
13 | savename=scrambled_descriptions; dataset=places365; mode=scrambled_descriptions; reps=7; model_size=ViT-B/32,39.90724044186728,0.08226208093002332,40.005478262901306,39.76438343524933,70.93268207141331,0.09968743292911104,71.12602591514587,70.80274224281311
14 | savename=scrambled_descriptions; dataset=food101; mode=scrambled_descriptions; reps=7; model_size=ViT-B/32,82.45657852717808,0.12516208035253049,82.69307017326355,82.26534724235535,96.72362123216901,0.07387221852391418,96.78019881248474,96.55049443244934
15 | savename=scrambled_descriptions; dataset=pets; mode=scrambled_descriptions; reps=7; model_size=ViT-B/32,86.11143571989876,0.3730975169674172,86.56309843063354,85.30935049057007,98.55157051767621,0.7381880446835757,99.50940012931824,97.38348126411438
16 | savename=scrambled_descriptions; dataset=dtd; mode=scrambled_descriptions; reps=7; model_size=ViT-B/32,41.58054717949459,0.3136742666356182,42.180851101875305,41.170212626457214,74.3844985961914,0.788372271865912,75.63830018043518,72.97872304916382
17 | savename=scrambled_descriptions; dataset=imagenetv2; mode=scrambled_descriptions; reps=7; model_size=RN50,52.187143904822214,0.2078265522068517,52.45000123977661,51.840001344680786,80.58285628046308,0.0718564294137104,80.65999746322632,80.44999837875366
18 | savename=scrambled_descriptions; dataset=imagenet; mode=scrambled_descriptions; reps=7; model_size=RN50,59.207143102373394,0.05382343956447141,59.28400158882141,59.126001596450806,85.8137139252254,0.0711169703003708,85.93000173568726,85.70600152015686
19 | savename=scrambled_descriptions; dataset=cub; mode=scrambled_descriptions; reps=7; model_size=RN50,47.62315664972578,0.37425174896040936,48.29133450984955,47.221264243125916,81.84821775981358,0.1631392195957239,82.10217356681824,81.6189169883728
20 | savename=scrambled_descriptions; dataset=eurosat; mode=scrambled_descriptions; reps=7; model_size=RN50,34.979365127427236,2.002439745245657,37.929630279541016,32.53703713417053,79.76296373776027,3.1365458739318086,86.15925908088684,75.01111030578613
21 | savename=scrambled_descriptions; dataset=places365; mode=scrambled_descriptions; reps=7; model_size=RN50,37.90528391088758,0.17339811587539491,38.20821940898895,37.69862949848175,69.16399257523673,0.11308090781237148,69.35616731643677,69.04657483100891
22 | savename=scrambled_descriptions; dataset=food101; mode=scrambled_descriptions; reps=7; model_size=RN50,78.34851486342293,0.13582952854568578,78.5504937171936,78.14653515815735,95.42857153075082,0.07177941125856926,95.5524742603302,95.36237716674805
23 | savename=scrambled_descriptions; dataset=pets; mode=scrambled_descriptions; reps=7; model_size=RN50,85.08351785796029,0.3440533379493897,85.55464744567871,84.62796211242676,99.37701906476703,0.08423189355321621,99.45489168167114,99.18233752250671
24 | savename=scrambled_descriptions; dataset=dtd; mode=scrambled_descriptions; reps=7; model_size=RN50,39.24012226717813,0.9786215348243714,40.3723418712616,37.87234127521515,70.98024317196437,1.201184520745825,72.60638475418091,69.57446932792664
25 |
--------------------------------------------------------------------------------
/results/shared_randomized_descriptions.csv:
--------------------------------------------------------------------------------
1 | savename=shared_randomized_descriptions; dataset=imagenetv2; mode=shared_random_descriptions; randomization_budget=1; reps=7; model_size=ViT-L/14,69.26714267049518,0.22764070216131602,69.6399986743927,68.98999810218811,90.95714347703117,0.14800680326984098,91.15999937057495,90.72999954223633
2 | savename=shared_randomized_descriptions; dataset=imagenet; mode=shared_random_descriptions; randomization_budget=1; reps=7; model_size=ViT-L/14,75.04571420805794,0.15066491398832618,75.2340018749237,74.83400106430054,94.53657184328351,0.11099420072289347,94.67800259590149,94.38400268554688
3 | savename=shared_randomized_descriptions; dataset=cub; mode=shared_random_descriptions; randomization_budget=1; reps=7; model_size=ViT-L/14,64.21174577304295,0.3613089768757362,64.5495355129242,63.3759081363678,92.61797836848667,0.09212312223238966,92.7511215209961,92.47497320175171
4 | savename=shared_randomized_descriptions; dataset=eurosat; mode=shared_random_descriptions; randomization_budget=1; reps=7; model_size=ViT-L/14,57.5941801071167,1.7229024822358987,59.32962894439697,54.64444160461426,93.23227490697589,0.7761561498162296,94.41481232643127,91.98889136314392
5 | savename=shared_randomized_descriptions; dataset=places365; mode=shared_random_descriptions; randomization_budget=1; reps=7; model_size=ViT-L/14,42.01330712863377,0.22863905806047535,42.33972728252411,41.67945086956024,70.59726119041443,0.32543360550816686,71.0301399230957,70.06301283836365
6 | savename=shared_randomized_descriptions; dataset=food101; mode=shared_random_descriptions; randomization_budget=1; reps=7; model_size=ViT-L/14,93.15247535705566,0.12641469294304794,93.29109191894531,92.9663360118866,99.32164038930621,0.013000219624615129,99.34653639793396,99.30297136306763
7 | savename=shared_randomized_descriptions; dataset=pets; mode=shared_random_descriptions; randomization_budget=1; reps=7; model_size=ViT-L/14,93.9687728881836,0.2188756169804374,94.38539147377014,93.7585175037384,99.8442564691816,0.012313549637751919,99.86372590065002,99.83646869659424
8 | savename=shared_randomized_descriptions; dataset=dtd; mode=shared_random_descriptions; randomization_budget=1; reps=7; model_size=ViT-L/14,55.15957474708557,0.4740583550130511,55.79787492752075,54.57446575164795,81.75531881196159,0.6344874716625984,82.4999988079071,80.3723394870758
9 | savename=shared_randomized_descriptions; dataset=imagenetv2; mode=shared_random_descriptions; randomization_budget=1; reps=7; model_size=ViT-B/32,55.450000933238435,0.2441897628052917,55.809998512268066,55.16999959945679,82.7171436377934,0.145377320130409,82.9800009727478,82.51000046730042
10 | savename=shared_randomized_descriptions; dataset=imagenet; mode=shared_random_descriptions; randomization_budget=1; reps=7; model_size=ViT-B/32,62.89257236889431,0.18634347568198825,63.13599944114685,62.50200271606445,88.41971414429801,0.1557928088649277,88.63199949264526,88.13999891281128
11 | savename=shared_randomized_descriptions; dataset=cub; mode=shared_random_descriptions; randomization_budget=1; reps=7; model_size=ViT-B/32,52.64066202299936,0.2719545739772088,52.96858549118042,52.226442098617554,84.3951872416905,0.16820403857250946,84.60476398468018,84.01795029640198
12 | savename=shared_randomized_descriptions; dataset=eurosat; mode=shared_random_descriptions; randomization_budget=1; reps=7; model_size=ViT-B/32,39.74497360842569,2.694690191067821,44.26666796207428,36.06666624546051,88.38095324380058,1.8215421785223604,90.10000228881836,84.68518257141113
13 | savename=shared_randomized_descriptions; dataset=places365; mode=shared_random_descriptions; randomization_budget=1; reps=7; model_size=ViT-B/32,40.293934089796885,0.4749337134595038,40.95890522003174,39.5068496465683,71.01839525359017,0.5656094831308465,71.8246579170227,70.07123231887817
14 | savename=shared_randomized_descriptions; dataset=food101; mode=shared_random_descriptions; randomization_budget=1; reps=7; model_size=ViT-B/32,83.82008501461574,0.4843360015559043,84.22574400901794,82.72475004196167,97.20339519636971,0.18738555866369913,97.40198254585266,96.79999947547913
15 | savename=shared_randomized_descriptions; dataset=pets; mode=shared_random_descriptions; randomization_budget=1; reps=7; model_size=ViT-B/32,87.03811849866595,0.296734215420932,87.43526935577393,86.53584122657776,98.80076135907855,0.4772195703271103,99.34586882591248,98.092120885849
16 | savename=shared_randomized_descriptions; dataset=dtd; mode=shared_random_descriptions; randomization_budget=1; reps=7; model_size=ViT-B/32,43.35106355803354,0.4080766395829798,43.936169147491455,42.81914830207825,74.96960418564933,0.5720831503469302,76.11702084541321,74.36169981956482
17 | savename=shared_randomized_descriptions; dataset=imagenetv2; mode=shared_random_descriptions; randomization_budget=1; reps=7; model_size=RN50,52.637141942977905,0.297547549778035,52.92999744415283,52.0799994468689,80.62714253153119,0.3352483299392842,81.11000061035156,80.19000291824341
18 | savename=shared_randomized_descriptions; dataset=imagenet; mode=shared_random_descriptions; randomization_budget=1; reps=7; model_size=RN50,59.688572372709004,0.3033937974284858,60.19200086593628,59.14199948310852,86.06542859758649,0.2725200692091533,86.44000291824341,85.61000227928162
19 | savename=shared_randomized_descriptions; dataset=cub; mode=shared_random_descriptions; randomization_budget=1; reps=7; model_size=RN50,47.766161816460745,0.3616633212878823,48.42940866947174,47.48015105724335,82.0158771106175,0.2539934733396611,82.41283893585205,81.68795108795166
20 | savename=shared_randomized_descriptions; dataset=eurosat; mode=shared_random_descriptions; randomization_budget=1; reps=7; model_size=RN50,32.73862430027553,1.4975630283538255,34.54814851284027,30.69629669189453,78.10264485222953,3.5007160997438995,82.84444212913513,72.4740743637085
21 | savename=shared_randomized_descriptions; dataset=places365; mode=shared_random_descriptions; randomization_budget=1; reps=7; model_size=RN50,38.62857094832829,0.21875145526497958,38.91232907772064,38.19451928138733,69.43874784878322,0.3829387927606918,69.85205411911011,68.68219375610352
22 | savename=shared_randomized_descriptions; dataset=food101; mode=shared_random_descriptions; randomization_budget=1; reps=7; model_size=RN50,80.0718537398747,0.5837078454731763,80.71287274360657,78.92277240753174,96.36605381965637,0.21560344711455676,96.51881456375122,95.85742354393005
23 | savename=shared_randomized_descriptions; dataset=pets; mode=shared_random_descriptions; randomization_budget=1; reps=7; model_size=RN50,85.32881651605878,0.5292467495890827,85.96347570419312,84.30089950561523,99.33029583522251,0.13018970530867255,99.45489168167114,99.07331466674805
24 | savename=shared_randomized_descriptions; dataset=dtd; mode=shared_random_descriptions; randomization_budget=1; reps=7; model_size=RN50,40.767477239881245,0.6401045361726367,41.436171531677246,39.46808576583862,71.24619994844709,0.6606150391562774,72.50000238418579,70.2659547328949
25 |
--------------------------------------------------------------------------------
/results/shared_randomized_descriptions_2xbudget.csv:
--------------------------------------------------------------------------------
1 | savename=shared_randomized_descriptions_2xbudget; dataset=imagenetv2; mode=shared_random_descriptions; randomization_budget=2; reps=7; model_size=ViT-L/14,69.58142944744655,0.21155309864778946,69.9999988079071,69.34000253677368,91.15571464811053,0.1283813450882656,91.36000275611877,90.97999930381775
2 | savename=shared_randomized_descriptions_2xbudget; dataset=imagenet; mode=shared_random_descriptions; randomization_budget=2; reps=7; model_size=ViT-L/14,75.30228580747333,0.15774630110498036,75.51000118255615,75.05999803543091,94.68285696847099,0.09192558837648286,94.82200145721436,94.57399845123291
3 | savename=shared_randomized_descriptions_2xbudget; dataset=cub; mode=shared_random_descriptions; randomization_budget=2; reps=7; model_size=ViT-L/14,64.30297323635646,0.26235171061034906,64.79116082191467,63.89368176460266,92.65003204345703,0.05088948280675932,92.7511215209961,92.57853031158447
4 | savename=shared_randomized_descriptions_2xbudget; dataset=eurosat; mode=shared_random_descriptions; randomization_budget=2; reps=7; model_size=ViT-L/14,59.31957619530814,1.6265101425204909,61.72592639923096,57.08518624305725,94.07090033803668,1.2136918404015073,95.70000171661377,92.55926012992859
5 | savename=shared_randomized_descriptions_2xbudget; dataset=places365; mode=shared_random_descriptions; randomization_budget=2; reps=7; model_size=ViT-L/14,42.28375724383763,0.17006440559033592,42.45205521583557,41.98904037475586,70.84853393690926,0.22143526805386562,71.0986316204071,70.35616636276245
6 | savename=shared_randomized_descriptions_2xbudget; dataset=food101; mode=shared_random_descriptions; randomization_budget=2; reps=7; model_size=ViT-L/14,93.30862930842808,0.05458419344984198,93.37425827980042,93.21980476379395,99.3431397846767,0.016860465057457634,99.36633706092834,99.31485056877136
7 | savename=shared_randomized_descriptions_2xbudget; dataset=pets; mode=shared_random_descriptions; randomization_budget=2; reps=7; model_size=ViT-L/14,94.03885858399528,0.10776203548975999,94.24911141395569,93.92204880714417,99.84036258288792,0.00953803453594231,99.86372590065002,99.83646869659424
8 | savename=shared_randomized_descriptions_2xbudget; dataset=dtd; mode=shared_random_descriptions; randomization_budget=2; reps=7; model_size=ViT-L/14,55.31154956136431,0.5017515223163613,56.010639667510986,54.680848121643066,82.21124580928257,0.5210576842658505,82.81915187835693,81.22340440750122
9 | savename=shared_randomized_descriptions_2xbudget; dataset=imagenetv2; mode=shared_random_descriptions; randomization_budget=2; reps=7; model_size=ViT-B/32,55.73142766952515,0.2357014562953907,56.16999864578247,55.44999837875366,82.78999924659729,0.2048696262962945,83.10999870300293,82.4899971485138
10 | savename=shared_randomized_descriptions_2xbudget; dataset=imagenet; mode=shared_random_descriptions; randomization_budget=2; reps=7; model_size=ViT-B/32,63.11114345278059,0.1940561014077392,63.47000002861023,62.84000277519226,88.53514279638019,0.10982496558589953,88.75399827957153,88.39799761772156
11 | savename=shared_randomized_descriptions_2xbudget; dataset=cub; mode=shared_random_descriptions; randomization_budget=2; reps=7; model_size=ViT-B/32,52.731889486312866,0.23333504732558966,53.158438205718994,52.485328912734985,84.41491212163653,0.13014025107657543,84.5529854297638,84.20780301094055
12 | savename=shared_randomized_descriptions_2xbudget; dataset=eurosat; mode=shared_random_descriptions; randomization_budget=2; reps=7; model_size=ViT-B/32,39.730688078062876,1.660775840777876,42.540740966796875,37.81111240386963,88.94708922931126,0.9380394543095514,89.89259004592896,87.54814863204956
13 | savename=shared_randomized_descriptions_2xbudget; dataset=places365; mode=shared_random_descriptions; randomization_budget=2; reps=7; model_size=ViT-B/32,40.60978463717869,0.2201583657989829,40.81917703151703,40.12876749038696,71.34168233190265,0.29961615784697093,71.60547971725464,70.65205574035645
14 | savename=shared_randomized_descriptions_2xbudget; dataset=food101; mode=shared_random_descriptions; randomization_budget=2; reps=7; model_size=ViT-B/32,84.01470950671604,0.23173875765215182,84.25742387771606,83.48515033721924,97.24130204745701,0.08745774963239765,97.35049605369568,97.05346822738647
15 | savename=shared_randomized_descriptions_2xbudget; dataset=pets; mode=shared_random_descriptions; randomization_budget=2; reps=7; model_size=ViT-B/32,87.10820419447762,0.12616805941982762,87.32624650001526,86.97192668914795,98.78908225468227,0.33280877918803026,99.31861758232117,98.39193224906921
16 | savename=shared_randomized_descriptions_2xbudget; dataset=dtd; mode=shared_random_descriptions; randomization_budget=2; reps=7; model_size=ViT-B/32,43.2902740580695,0.21733026574455688,43.563830852508545,43.03191602230072,75.02279707363674,0.4815483325991365,75.90425610542297,74.57447052001953
17 | savename=shared_randomized_descriptions_2xbudget; dataset=imagenetv2; mode=shared_random_descriptions; randomization_budget=2; reps=7; model_size=RN50,52.918572085244314,0.2222889128020536,53.380000591278076,52.71000266075134,80.74142762592861,0.13881523375871038,80.97000122070312,80.54999709129333
18 | savename=shared_randomized_descriptions_2xbudget; dataset=imagenet; mode=shared_random_descriptions; randomization_budget=2; reps=7; model_size=RN50,59.891713517052786,0.2561480404202325,60.41399836540222,59.564000368118286,86.22114317757743,0.14651078402877882,86.48800253868103,86.04999780654907
19 | savename=shared_randomized_descriptions_2xbudget; dataset=cub; mode=shared_random_descriptions; randomization_budget=2; reps=7; model_size=RN50,47.66753784247807,0.3310166426349245,48.239558935165405,47.20400273799896,81.99861815997532,0.18588735429942213,82.36106038093567,81.68795108795166
20 | savename=shared_randomized_descriptions_2xbudget; dataset=eurosat; mode=shared_random_descriptions; randomization_budget=2; reps=7; model_size=RN50,34.3624336378915,1.2872968884160392,36.14444434642792,32.19999969005585,77.68359695162091,1.882666713199164,80.3074061870575,74.85555410385132
21 | savename=shared_randomized_descriptions_2xbudget; dataset=places365; mode=shared_random_descriptions; randomization_budget=2; reps=7; model_size=RN50,38.93307277134487,0.2103442066201785,39.24109637737274,38.671234250068665,69.76203492709568,0.23599540288367132,70.0465738773346,69.28766965866089
22 | savename=shared_randomized_descriptions_2xbudget; dataset=food101; mode=shared_random_descriptions; randomization_budget=2; reps=7; model_size=RN50,80.1120230129787,0.30431999117190267,80.47524690628052,79.44950461387634,96.44978727613177,0.12852047017460874,96.53465151786804,96.13861441612244
23 | savename=shared_randomized_descriptions_2xbudget; dataset=pets; mode=shared_random_descriptions; randomization_budget=2; reps=7; model_size=RN50,85.34049817493984,0.3265541461896925,85.71817874908447,84.73698496818542,99.29525426455906,0.14380205956458872,99.40038323402405,98.99154901504517
24 | savename=shared_randomized_descriptions_2xbudget; dataset=dtd; mode=shared_random_descriptions; randomization_budget=2; reps=7; model_size=RN50,40.8814583505903,0.779607325859842,42.07446873188019,39.68085050582886,71.32218905857631,0.44269050959064726,71.91489338874817,70.69149017333984
25 |
--------------------------------------------------------------------------------
/results/swapped_descriptions.csv:
--------------------------------------------------------------------------------
1 | savename=swapped_descriptions; dataset=imagenetv2; mode=swapped_descriptions; reps=7; model_size=ViT-L/14,66.43857104437691,0.11800976178912019,66.68999791145325,66.3100004196167,88.79285710198539,0.17506895144015433,89.06000256538391,88.52999806404114
2 | savename=swapped_descriptions; dataset=imagenet; mode=swapped_descriptions; reps=7; model_size=ViT-L/14,72.06685628209796,0.15301498846643496,72.29200005531311,71.89599871635437,92.72200039454869,0.11203157418771292,92.94400215148926,92.56399869918823
3 | savename=swapped_descriptions; dataset=cub; mode=swapped_descriptions; reps=7; model_size=ViT-L/14,63.615070070539204,0.4386148057893564,64.30790424346924,63.03071975708008,92.47004304613385,0.1739945096061416,92.78563857078552,92.18156933784485
4 | savename=swapped_descriptions; dataset=eurosat; mode=swapped_descriptions; reps=7; model_size=ViT-L/14,51.49153470993042,4.887823443308111,57.770371437072754,42.28518605232239,94.22539728028434,1.8522766637554173,97.83703684806824,92.01111197471619
5 | savename=swapped_descriptions; dataset=places365; mode=swapped_descriptions; reps=7; model_size=ViT-L/14,37.05675091062273,0.40801508396155134,37.60274052619934,36.18082106113434,64.35655610901969,0.5223380152817926,65.44931530952454,63.695889711380005
6 | savename=swapped_descriptions; dataset=food101; mode=swapped_descriptions; reps=7; model_size=ViT-L/14,91.30410041127887,0.2983485370803681,91.56039357185364,90.70494771003723,99.09080522400993,0.02617117108934712,99.12078976631165,99.04554486274719
7 | savename=swapped_descriptions; dataset=pets; mode=swapped_descriptions; reps=7; model_size=ViT-L/14,93.73904892376491,0.2831394697107482,94.03107166290283,93.13164353370667,99.85204339027405,0.03210805358400602,99.91823434829712,99.80921149253845
8 | savename=swapped_descriptions; dataset=dtd; mode=swapped_descriptions; reps=7; model_size=ViT-L/14,49.840426445007324,0.7750017855483331,51.01063847541809,48.457446694374084,76.93009035927909,1.0075207474266752,79.04255390167236,75.47872066497803
9 | savename=swapped_descriptions; dataset=imagenetv2; mode=swapped_descriptions; reps=7; model_size=ViT-B/32,52.47571383203779,0.387550610760457,52.920001745224,51.920002698898315,80.13857177325657,0.3092581597707888,80.61000108718872,79.6500027179718
10 | savename=swapped_descriptions; dataset=imagenet; mode=swapped_descriptions; reps=7; model_size=ViT-B/32,59.62657247270857,0.11859909257106276,59.79200005531311,59.43800210952759,86.09428576060704,0.12877696537510436,86.31200194358826,85.88399887084961
11 | savename=swapped_descriptions; dataset=cub; mode=swapped_descriptions; reps=7; model_size=ViT-B/32,52.51984851700919,0.4059193179871099,53.037625551223755,51.91577672958374,84.24232006072998,0.1449904443599959,84.48395133018494,84.01795029640198
12 | savename=swapped_descriptions; dataset=eurosat; mode=swapped_descriptions; reps=7; model_size=ViT-B/32,33.634391852787566,4.157798575882041,40.69259166717529,25.966668128967285,85.87407299450466,3.973810804452285,92.21110939979553,80.45555353164673
13 | savename=swapped_descriptions; dataset=places365; mode=swapped_descriptions; reps=7; model_size=ViT-B/32,35.518981729234966,0.31883790530983347,35.8958899974823,34.920549392700195,64.88140906606402,0.4260319104739242,65.65479636192322,64.4657552242279
14 | savename=swapped_descriptions; dataset=food101; mode=swapped_descriptions; reps=7; model_size=ViT-B/32,81.70919418334961,0.3509998320954821,82.22178220748901,81.17623925209045,96.73550299235752,0.06760060868579906,96.83960676193237,96.64554595947266
15 | savename=swapped_descriptions; dataset=pets; mode=swapped_descriptions; reps=7; model_size=ViT-B/32,86.24381933893476,0.5300478120942672,86.80839538574219,85.36385893821716,98.38025144168309,0.4729440235726055,99.31861758232117,97.92858958244324
16 | savename=swapped_descriptions; dataset=dtd; mode=swapped_descriptions; reps=7; model_size=ViT-B/32,38.419452735355925,1.1435599439436421,40.797871351242065,37.12765872478485,68.2446803365435,1.772162674788676,70.69149017333984,65.69148898124695
17 | savename=swapped_descriptions; dataset=imagenetv2; mode=swapped_descriptions; reps=7; model_size=RN50,49.781428064618794,0.20024434122245688,50.09999871253967,49.43999946117401,78.09428572654724,0.24755064280236058,78.40999960899353,77.5600016117096
18 | savename=swapped_descriptions; dataset=imagenet; mode=swapped_descriptions; reps=7; model_size=RN50,56.35028651782444,0.05767240123334678,56.42399787902832,56.238001585006714,83.70028478758675,0.09651946952088938,83.79999995231628,83.54799747467041
19 | savename=swapped_descriptions; dataset=cub; mode=swapped_descriptions; reps=7; model_size=RN50,47.67000377178192,0.34146671924963806,48.153263330459595,47.03141152858734,81.67315806661334,0.17343469006595674,81.99861645698547,81.46358132362366
20 | savename=swapped_descriptions; dataset=eurosat; mode=swapped_descriptions; reps=7; model_size=RN50,28.16560800586428,4.443546063115817,35.92592477798462,21.788889169692993,77.41957817758832,4.674054365586377,82.89629817008972,70.12222409248352
21 | savename=swapped_descriptions; dataset=places365; mode=swapped_descriptions; reps=7; model_size=RN50,33.769080468586516,0.3425036501325262,34.315067529678345,33.24109613895416,63.270058802195955,0.40941375350193754,63.98082375526428,62.62739896774292
22 | savename=swapped_descriptions; dataset=food101; mode=swapped_descriptions; reps=7; model_size=RN50,77.58076531546456,0.2949095429181133,78.26138734817505,77.30296850204468,95.55247596332005,0.12731549404159742,95.6990122795105,95.33465504646301
23 | savename=swapped_descriptions; dataset=pets; mode=swapped_descriptions; reps=7; model_size=RN50,84.6007091658456,0.640025313190826,85.55464744567871,83.53775143623352,99.14729595184326,0.15873417240414361,99.40038323402405,98.90978336334229
24 | savename=swapped_descriptions; dataset=dtd; mode=swapped_descriptions; reps=7; model_size=RN50,35.828267676489695,1.1128004046387219,36.968085169792175,33.29787254333496,65.5775078705379,1.7955983875298727,67.76595711708069,61.702126264572144
25 |
--------------------------------------------------------------------------------
/results/waffleclip.csv:
--------------------------------------------------------------------------------
1 | savename=waffleclip; dataset=imagenetv2; mode=waffle; waffle_count=15; reps=7; model_size=ViT-L/14,69.48428494589669,0.08033205808711852,69.62000131607056,69.37999725341797,90.78571455819267,0.046246417658973756,90.85999727249146,90.72999954223633
2 | savename=waffleclip; dataset=imagenet; mode=waffle; waffle_count=15; reps=7; model_size=ViT-L/14,75.30342766216823,0.04338108945792614,75.37599802017212,75.24799704551697,94.48514240128654,0.008870004777099484,94.49399709701538,94.46600079536438
3 | savename=waffleclip; dataset=cub; mode=waffle; waffle_count=15; reps=7; model_size=ViT-L/14,64.1821597303663,0.12873121103498034,64.34242129325867,63.962721824645996,92.03116468020848,0.19057064013305197,92.23334193229675,91.69830679893494
4 | savename=waffleclip; dataset=eurosat; mode=waffle; waffle_count=15; reps=7; model_size=ViT-L/14,61.16772379193987,0.3532880403272239,61.6518497467041,60.56666374206543,96.1206351007734,0.355487621007228,96.58148288726807,95.47407627105713
5 | savename=waffleclip; dataset=places365; mode=waffle; waffle_count=15; reps=7; model_size=ViT-L/14,42.262230600629536,0.09535428249308145,42.40821897983551,42.09589064121246,70.55616463933673,0.06672406784044592,70.63561677932739,70.42739987373352
6 | savename=waffleclip; dataset=food101; mode=waffle; waffle_count=15; reps=7; model_size=ViT-L/14,93.31315415246146,0.09476387930831631,93.46930980682373,93.22376251220703,99.37934875488281,0.017160492952429628,99.41385984420776,99.3584156036377
7 | savename=waffleclip; dataset=pets; mode=waffle; waffle_count=15; reps=7; model_size=ViT-L/14,91.98302371161324,0.11230897164209556,92.20495820045471,91.8506383895874,99.2212746824537,0.21942981785300147,99.5911717414856,98.93704056739807
8 | savename=waffleclip; dataset=dtd; mode=waffle; waffle_count=15; reps=7; model_size=ViT-L/14,53.94376942089626,0.2905475444348964,54.41489219665527,53.5106360912323,80.26595711708069,0.3278924856334635,80.6382954120636,79.84042763710022
9 | savename=waffleclip; dataset=imagenetv2; mode=waffle; waffle_count=15; reps=7; model_size=ViT-B/32,55.90714301381792,0.11041701595629497,56.11000061035156,55.73999881744385,83.04714390209743,0.11335166342426342,83.27999711036682,82.9200029373169
10 | savename=waffleclip; dataset=imagenet; mode=waffle; waffle_count=15; reps=7; model_size=ViT-B/32,63.3102868284498,0.09105948009955912,63.42200040817261,63.134002685546875,88.5537156036922,0.04681181061919899,88.60200047492981,88.47200274467468
11 | savename=waffleclip; dataset=cub; mode=waffle; waffle_count=15; reps=7; model_size=ViT-B/32,52.386705364499775,0.11400587507754532,52.606141567230225,52.24369764328003,83.75412992068699,0.14665616286755107,83.94891023635864,83.51743221282959
12 | savename=waffleclip; dataset=eurosat; mode=waffle; waffle_count=15; reps=7; model_size=ViT-B/32,44.31164009230478,1.0674852848360612,45.70740759372711,42.15185046195984,91.93809458187648,0.689425976879227,92.8074061870575,90.7444417476654
13 | savename=waffleclip; dataset=places365; mode=waffle; waffle_count=15; reps=7; model_size=ViT-B/32,40.562425766672405,0.07365580008672698,40.619176626205444,40.38630127906799,71.21761185782296,0.06823265988970652,71.29862904548645,71.11232876777649
14 | savename=waffleclip; dataset=food101; mode=waffle; waffle_count=15; reps=7; model_size=ViT-B/32,83.25035401753017,0.20793027224502197,83.72673392295837,83.06534886360168,97.04215100833348,0.09140706989108636,97.2158432006836,96.90693020820618
15 | savename=waffleclip; dataset=pets; mode=waffle; waffle_count=15; reps=7; model_size=ViT-B/32,85.71039182799203,0.2479732014968492,85.9907329082489,85.36385893821716,97.37958822931562,0.009535948807856605,97.38348126411438,97.35623002052307
16 | savename=waffleclip; dataset=dtd; mode=waffle; waffle_count=15; reps=7; model_size=ViT-B/32,43.16109376294272,0.24599473916578185,43.510639667510986,42.872339487075806,73.70060852595738,0.25087473596993276,74.04255270957947,73.35106134414673
17 | savename=waffleclip; dataset=imagenetv2; mode=waffle; waffle_count=15; reps=7; model_size=RN50,52.88857136453901,0.17066577371348435,53.25000286102295,52.730000019073486,81.02999925613403,0.10515298883128955,81.20999932289124,80.9000015258789
18 | savename=waffleclip; dataset=imagenet; mode=waffle; waffle_count=15; reps=7; model_size=RN50,60.119142702647615,0.12391616061156967,60.38399934768677,59.97999906539917,86.18114250046867,0.08403300386800847,86.3319993019104,86.04999780654907
19 | savename=waffleclip; dataset=cub; mode=waffle; waffle_count=15; reps=7; model_size=RN50,47.7119186094829,0.18493689952251127,47.99792766571045,47.41111397743225,81.95917095456805,0.10547428419522877,82.17121362686157,81.86054825782776
20 | savename=waffleclip; dataset=eurosat; mode=waffle; waffle_count=15; reps=7; model_size=RN50,31.31851851940155,0.4654133652214447,31.999999284744263,30.662962794303894,80.53809489522662,0.7584383805443282,81.99999928474426,79.3666660785675
21 | savename=waffleclip; dataset=places365; mode=waffle; waffle_count=15; reps=7; model_size=RN50,38.30528387001583,0.09927866459137337,38.484930992126465,38.14520537853241,69.36360086713519,0.09633914892145473,69.5342481136322,69.24657821655273
22 | savename=waffleclip; dataset=food101; mode=waffle; waffle_count=15; reps=7; model_size=RN50,79.67751026153564,0.15451935002418155,80.03564476966858,79.51287031173706,96.30495139530727,0.060212162356299025,96.44752740859985,96.25346660614014
23 | savename=waffleclip; dataset=pets; mode=waffle; waffle_count=15; reps=7; model_size=RN50,84.33204804147992,0.15170184832827444,84.57345366477966,84.0283453464508,97.32897196497235,0.02913758385800476,97.38348126411438,97.3017156124115
24 | savename=waffleclip; dataset=dtd; mode=waffle; waffle_count=15; reps=7; model_size=RN50,39.285714285714285,0.30070480206219935,39.89361822605133,38.88297975063324,70.83586709839958,0.34788811066128894,71.59574627876282,70.42553424835205
25 |
--------------------------------------------------------------------------------
/results/waffleclip_concepts.csv:
--------------------------------------------------------------------------------
1 | savename=waffleclip_concepts; dataset=cub; mode=waffle; waffle_count=15; reps=7; model_size=ViT-L/14; label_before_text=A photo of a bird: a ,63.403027398245676,0.17230988707535372,63.58301639556885,63.09975981712341,93.0050790309906,0.10943469690713029,93.1998610496521,92.87193417549133
2 | savename=waffleclip_concepts; dataset=eurosat; mode=waffle; waffle_count=15; reps=7; model_size=ViT-L/14; label_before_text=A photo of a land use: a ,60.19576702799116,0.8708226889187901,61.96296215057373,59.155553579330444,95.43280346052987,0.46737672106425915,96.37036919593811,94.98518705368042
3 | savename=waffleclip_concepts; dataset=places365; mode=waffle; waffle_count=15; reps=7; model_size=ViT-L/14; label_before_text=A photo of a place: a ,42.57064546857561,0.088325732101645,42.73972511291504,42.43561625480652,70.26614461626325,0.11034316738342016,70.39178013801575,70.06575465202332
4 | savename=waffleclip_concepts; dataset=food101; mode=waffle; waffle_count=15; reps=7; model_size=ViT-L/14; label_before_text=A photo of a food: a ,93.65487950188773,0.054976672665656263,93.74257326126099,93.56435537338257,99.37255893434796,0.012318307896407187,99.39405918121338,99.3584156036377
5 | savename=waffleclip_concepts; dataset=pets; mode=waffle; waffle_count=15; reps=7; model_size=ViT-L/14; label_before_text=A photo of a breed: a ,94.38149758747646,0.07903109286790944,94.52166557312012,94.27636861801147,99.89487103053501,0.009538034535942309,99.91823434829712,99.89097714424133
6 | savename=waffleclip_concepts; dataset=cub; mode=waffle; waffle_count=15; reps=7; model_size=ViT-B/32; label_before_text=A photo of a bird: a ,52.83297981534685,0.18506865817411075,53.17569971084595,52.606141567230225,85.11021137237549,0.10587731727059595,85.26061177253723,84.94994640350342
7 | savename=waffleclip_concepts; dataset=eurosat; mode=waffle; waffle_count=15; reps=7; model_size=ViT-B/32; label_before_text=A photo of a land use: a ,48.50423336029053,0.7035456689791778,49.58148002624512,47.58518636226654,92.04761811665126,0.2597830102262772,92.34444499015808,91.67037010192871
8 | savename=waffleclip_concepts; dataset=places365; mode=waffle; waffle_count=15; reps=7; model_size=ViT-B/32; label_before_text=A photo of a place: a ,40.96908015864236,0.08089938782698373,41.063013672828674,40.84109663963318,71.08375770705086,0.05733456561464694,71.18903994560242,70.9972620010376
9 | savename=waffleclip_concepts; dataset=food101; mode=waffle; waffle_count=15; reps=7; model_size=ViT-B/32; label_before_text=A photo of a food: a ,85.2050917489188,0.05663381492329447,85.29108762741089,85.11286973953247,97.70693097795758,0.021897589369750864,97.7425754070282,97.67920970916748
10 | savename=waffleclip_concepts; dataset=pets; mode=waffle; waffle_count=15; reps=7; model_size=ViT-B/32; label_before_text=A photo of a breed: a ,87.51703415598188,0.06006847473709426,87.5715434551239,87.38075494766235,99.12004045077732,0.06468640733408089,99.2095947265625,99.01880621910095
11 | savename=waffleclip_concepts; dataset=cub; mode=waffle; waffle_count=15; reps=7; model_size=RN50; label_before_text=A photo of a bird: a ,48.37516673973629,0.12253873411506924,48.532965779304504,48.18778038024902,81.87533957617623,0.0785141854265144,81.981360912323,81.77424669265747
12 | savename=waffleclip_concepts; dataset=eurosat; mode=waffle; waffle_count=15; reps=7; model_size=RN50; label_before_text=A photo of a land use: a ,35.07301551955087,0.4221143021596114,35.77037155628204,34.54814851284027,80.34126843724933,0.5399554261090732,81.22962713241577,79.49629426002502
13 | savename=waffleclip_concepts; dataset=places365; mode=waffle; waffle_count=15; reps=7; model_size=RN50; label_before_text=A photo of a place: a ,39.02387406144823,0.07609063154839865,39.12602663040161,38.87671232223511,70.0363985129765,0.05368188755422505,70.13424634933472,69.967120885849
14 | savename=waffleclip_concepts; dataset=food101; mode=waffle; waffle_count=15; reps=7; model_size=RN50; label_before_text=A photo of a food: a ,81.38896737779889,0.08263262231410383,81.51286840438843,81.2831699848175,96.81697402681623,0.03325092847020097,96.88712954521179,96.77227735519409
15 | savename=waffleclip_concepts; dataset=pets; mode=waffle; waffle_count=15; reps=7; model_size=RN50; label_before_text=A photo of a breed: a ,85.84666933332171,0.1512003597480288,86.09975576400757,85.66367030143738,98.38414447648185,0.05395096961839577,98.4736979007721,98.31016659736633
16 |
--------------------------------------------------------------------------------
/results/waffleclip_gpt.csv:
--------------------------------------------------------------------------------
1 | savename=waffleclip_gpt; dataset=imagenetv2; mode=waffle_and_gpt; waffle_count=15; reps=7; model_size=ViT-L/14,69.79571495737348,0.12926740447306884,69.9999988079071,69.63000297546387,90.97000019890922,0.06141174959094123,91.06000065803528,90.85000157356262
2 | savename=waffleclip_gpt; dataset=imagenet; mode=waffle_and_gpt; waffle_count=15; reps=7; model_size=ViT-L/14,75.56685635021755,0.059231058197767326,75.66999793052673,75.5079984664917,94.72314289637974,0.02481536836391397,94.75799798965454,94.67599987983704
3 | savename=waffleclip_gpt; dataset=cub; mode=waffle_and_gpt; waffle_count=15; reps=7; model_size=ViT-L/14,64.31283695357186,0.2105176288826397,64.60131406784058,63.85916471481323,92.41826363972255,0.1287319286126968,92.69934296607971,92.2506034374237
4 | savename=waffleclip_gpt; dataset=eurosat; mode=waffle_and_gpt; waffle_count=15; reps=7; model_size=ViT-L/14,60.626983642578125,1.2255827505053347,62.12962865829468,58.585184812545776,96.19894112859454,0.2898816218897112,96.83333039283752,95.87777853012085
5 | savename=waffleclip_gpt; dataset=places365; mode=waffle_and_gpt; waffle_count=15; reps=7; model_size=ViT-L/14,42.96007880142757,0.12013691975412566,43.12602877616882,42.73424744606018,71.47945250783648,0.07159445916960167,71.56986594200134,71.3835597038269
6 | savename=waffleclip_gpt; dataset=food101; mode=waffle_and_gpt; waffle_count=15; reps=7; model_size=ViT-L/14,93.2758126940046,0.07607426457773844,93.3782160282135,93.12871098518372,99.34031111853463,0.01903462198551041,99.37029480934143,99.31089282035828
7 | savename=waffleclip_gpt; dataset=pets; mode=waffle_and_gpt; waffle_count=15; reps=7; model_size=ViT-L/14,93.35357972553798,0.21727695033205005,93.59498620033264,92.94085502624512,99.87151026725769,0.024000764602674817,99.89097714424133,99.83646869659424
8 | savename=waffleclip_gpt; dataset=dtd; mode=waffle_and_gpt; waffle_count=15; reps=7; model_size=ViT-L/14,56.32978677749634,0.4217156800016994,56.8617045879364,55.53191304206848,83.22948302541461,0.3287735428149408,83.56382846832275,82.55318999290466
9 | savename=waffleclip_gpt; dataset=imagenetv2; mode=waffle_and_gpt; waffle_count=15; reps=7; model_size=ViT-B/32,56.071427890232634,0.11703799629275702,56.269997358322144,55.889999866485596,83.18142890930176,0.10762743247941656,83.35999846458435,83.03999900817871
10 | savename=waffleclip_gpt; dataset=imagenet; mode=waffle_and_gpt; waffle_count=15; reps=7; model_size=ViT-B/32,63.443429129464285,0.06454929764837977,63.50200176239014,63.32600116729736,88.7311441557748,0.06024605179179673,88.84000182151794,88.63800168037415
11 | savename=waffleclip_gpt; dataset=cub; mode=waffle_and_gpt; waffle_count=15; reps=7; model_size=ViT-B/32,52.71709731646946,0.16291662217422562,53.003108501434326,52.46807336807251,84.42477413586208,0.1626556818396583,84.70832109451294,84.17328000068665
12 | savename=waffleclip_gpt; dataset=eurosat; mode=waffle_and_gpt; waffle_count=15; reps=7; model_size=ViT-B/32,44.96772459575108,0.8259557195958475,46.45926058292389,43.87407302856445,89.25132240567889,0.47184646438219974,89.65555429458618,88.34444284439087
13 | savename=waffleclip_gpt; dataset=places365; mode=waffle_and_gpt; waffle_count=15; reps=7; model_size=ViT-B/32,41.09863042831421,0.06586725423827011,41.178083419799805,40.989041328430176,72.12093983377729,0.10183334683455673,72.28767275810242,71.93424701690674
14 | savename=waffleclip_gpt; dataset=food101; mode=waffle_and_gpt; waffle_count=15; reps=7; model_size=ViT-B/32,83.30127426556179,0.0947387274515462,83.41386318206787,83.13663601875305,96.9753886972155,0.05963972599285075,97.04950451850891,96.87920808792114
15 | savename=waffleclip_gpt; dataset=pets; mode=waffle_and_gpt; waffle_count=15; reps=7; model_size=ViT-B/32,86.14647814205715,0.156036447433207,86.34505271911621,85.90896725654602,97.51197269984654,0.14103296905467513,97.73780107498169,97.35623002052307
16 | savename=waffleclip_gpt; dataset=dtd; mode=waffle_and_gpt; waffle_count=15; reps=7; model_size=ViT-B/32,43.662614481789724,0.35786976455770425,44.20212805271149,43.13829839229584,76.20060869625637,0.45920728487455403,76.75532102584839,75.53191781044006
17 | savename=waffleclip_gpt; dataset=imagenetv2; mode=waffle_and_gpt; waffle_count=15; reps=7; model_size=RN50,52.89285693849836,0.14389637874904107,53.109997510910034,52.64999866485596,81.1614283493587,0.08724787922898508,81.33999705314636,81.0699999332428
18 | savename=waffleclip_gpt; dataset=imagenet; mode=waffle_and_gpt; waffle_count=15; reps=7; model_size=RN50,60.09857143674578,0.03927583489960801,60.15999913215637,60.04199981689453,86.38257043702262,0.042838151709160326,86.42799854278564,86.30200028419495
19 | savename=waffleclip_gpt; dataset=cub; mode=waffle_and_gpt; waffle_count=15; reps=7; model_size=RN50,47.75876488004412,0.1855273338635194,47.99792766571045,47.428375482559204,82.0060167993818,0.11141682109646271,82.24024772644043,81.87780380249023
20 | savename=waffleclip_gpt; dataset=eurosat; mode=waffle_and_gpt; waffle_count=15; reps=7; model_size=RN50,35.922751256397795,0.41775851901937383,36.629629135131836,35.08518636226654,80.1037039075579,0.408564160921435,80.53333163261414,79.4703722000122
21 | savename=waffleclip_gpt; dataset=places365; mode=waffle_and_gpt; waffle_count=15; reps=7; model_size=RN50,38.893150857516694,0.10206044015742341,39.12328779697418,38.82191777229309,70.14872772353036,0.060232898021218804,70.2602744102478,70.08219361305237
22 | savename=waffleclip_gpt; dataset=food101; mode=waffle_and_gpt; waffle_count=15; reps=7; model_size=RN50,79.34936370168414,0.06965676327049432,79.44950461387634,79.25544381141663,96.0288541657584,0.061925622142267066,96.11881375312805,95.93663215637207
23 | savename=waffleclip_gpt; dataset=pets; mode=waffle_and_gpt; waffle_count=15; reps=7; model_size=RN50,85.10688117572239,0.17105368871135357,85.36385893821716,84.76424217224121,98.21282540048871,0.19264427707928136,98.52820634841919,97.90133833885193
24 | savename=waffleclip_gpt; dataset=dtd; mode=waffle_and_gpt; waffle_count=15; reps=7; model_size=RN50,40.85106338773455,0.28573880323077533,41.32978618144989,40.3191477060318,72.34802416392735,0.342873658782746,72.81914949417114,71.96808457374573
25 |
--------------------------------------------------------------------------------
/results/waffleclip_gpt_concepts.csv:
--------------------------------------------------------------------------------
1 | savename=waffleclip_gpt_concepts; dataset=cub; mode=waffle_and_gpt; waffle_count=15; reps=7; model_size=ViT-L/14; label_before_text=A photo of a bird: a ,63.136742796216694,0.16017043917192275,63.42768669128418,62.96168565750122,92.87440180778503,0.07631413457942725,93.01000833511353,92.78563857078552
2 | savename=waffleclip_gpt_concepts; dataset=eurosat; mode=waffle_and_gpt; waffle_count=15; reps=7; model_size=ViT-L/14; label_before_text=A photo of a land use: a ,61.82486755507333,1.0679728773544854,63.14444541931152,59.740740060806274,96.52433906282697,0.2163323865715947,96.8925952911377,96.20370268821716
3 | savename=waffleclip_gpt_concepts; dataset=places365; mode=waffle_and_gpt; waffle_count=15; reps=7; model_size=ViT-L/14; label_before_text=A photo of a place: a ,42.94833626065935,0.08600948191591506,43.05753409862518,42.80821979045868,71.18316973958697,0.101737746567651,71.32054567337036,71.01370096206665
4 | savename=waffleclip_gpt_concepts; dataset=food101; mode=waffle_and_gpt; waffle_count=15; reps=7; model_size=ViT-L/14; label_before_text=A photo of a food: a ,93.4947669506073,0.03838926388749175,93.53267550468445,93.4099018573761,99.34936250959124,0.012291350040940131,99.36237335205078,99.3267297744751
5 | savename=waffleclip_gpt_concepts; dataset=pets; mode=waffle_and_gpt; waffle_count=15; reps=7; model_size=ViT-L/14; label_before_text=A photo of a breed: a ,94.1167312008994,0.08671480981779288,94.27636861801147,94.00381445884705,99.89876576832363,0.019074504825709958,99.91823434829712,99.86372590065002
6 | savename=waffleclip_gpt_concepts; dataset=cub; mode=waffle_and_gpt; waffle_count=15; reps=7; model_size=ViT-B/32; label_before_text=A photo of a bird: a ,52.76394316128322,0.2525525911144023,53.17569971084595,52.46807336807251,85.47265614782062,0.08689394138822988,85.60580015182495,85.31239032745361
7 | savename=waffleclip_gpt_concepts; dataset=eurosat; mode=waffle_and_gpt; waffle_count=15; reps=7; model_size=ViT-B/32; label_before_text=A photo of a land use: a ,51.63809486797878,0.2446667157394112,51.97407603263855,51.366668939590454,90.85185272353036,0.40125175691294585,91.3777768611908,90.18148183822632
8 | savename=waffleclip_gpt_concepts; dataset=places365; mode=waffle_and_gpt; waffle_count=15; reps=7; model_size=ViT-B/32; label_before_text=A photo of a place: a ,41.34794431073325,0.09110232050718331,41.50137007236481,41.23835563659668,71.78121379443577,0.0900661249665932,71.92876935005188,71.6794490814209
9 | savename=waffleclip_gpt_concepts; dataset=food101; mode=waffle_and_gpt; waffle_count=15; reps=7; model_size=ViT-B/32; label_before_text=A photo of a food: a ,84.87298488616943,0.05057801373729727,84.95049476623535,84.80396270751953,97.61131490979876,0.030341635956194157,97.65148758888245,97.56831526756287
10 | savename=waffleclip_gpt_concepts; dataset=pets; mode=waffle_and_gpt; waffle_count=15; reps=7; model_size=ViT-B/32; label_before_text=A photo of a breed: a ,87.68835323197501,0.1483679680414914,87.87135481834412,87.38075494766235,99.36533910887582,0.049862484934450044,99.45489168167114,99.29136037826538
11 | savename=waffleclip_gpt_concepts; dataset=cub; mode=waffle_and_gpt; waffle_count=15; reps=7; model_size=RN50; label_before_text=A photo of a bird: a ,48.431875876017976,0.23200237659448072,48.912668228149414,48.153263330459595,81.8038352898189,0.1419396401403567,82.08491802215576,81.63617253303528
12 | savename=waffleclip_gpt_concepts; dataset=eurosat; mode=waffle_and_gpt; waffle_count=15; reps=7; model_size=RN50; label_before_text=A photo of a land use: a ,37.359258958271575,0.6005609636243774,38.059258460998535,36.407408118247986,83.68201170648847,0.9698436837832175,85.39259433746338,82.31852054595947
13 | savename=waffleclip_gpt_concepts; dataset=places365; mode=waffle_and_gpt; waffle_count=15; reps=7; model_size=RN50; label_before_text=A photo of a place: a ,39.43444234984262,0.06183399189869141,39.52876627445221,39.33972716331482,70.72054828916278,0.08135355634847152,70.8493173122406,70.62191963195801
14 | savename=waffleclip_gpt_concepts; dataset=food101; mode=waffle_and_gpt; waffle_count=15; reps=7; model_size=RN50; label_before_text=A photo of a food: a ,81.17058021681649,0.08239210718985283,81.29900693893433,81.02574348449707,96.73946244376046,0.02700371384345886,96.78019881248474,96.70494794845581
15 | savename=waffleclip_gpt_concepts; dataset=pets; mode=waffle_and_gpt; waffle_count=15; reps=7; model_size=RN50; label_before_text=A photo of a breed: a ,85.80383913857597,0.18337305438112456,86.01799011230469,85.47288179397583,98.9487213747842,0.11458018794127695,99.15508031845093,98.77350926399231
16 |
--------------------------------------------------------------------------------
/setup.py:
--------------------------------------------------------------------------------
1 | from setuptools import setup, find_packages
2 |
3 | setup(
4 |     name='waffle',
5 |     version='0.0.1',
6 |     description='',
7 |     packages=find_packages(),
8 |     install_requires=[
9 |         'torch',
10 |         'numpy',
11 |         'tqdm',
12 |     ],
13 | )
14 |
--------------------------------------------------------------------------------
/word_list.pkl:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ExplainableML/WaffleCLIP/7a1b8ee48e31285f62ecd839fecb6b89cbef81f1/word_list.pkl
--------------------------------------------------------------------------------