├── GenerateDescriptions.ipynb ├── README.md ├── dcc.py ├── dcc_transfer ├── __init__.py ├── transfer_weights.py └── w2vDist.py ├── extract_features.sh ├── generate_coco.sh ├── generate_imagenet.sh ├── images └── ipython_images │ ├── alpaca │ └── n02698473_518.JPEG │ ├── baobab │ └── n12189987_4309.JPEG │ ├── candelabra │ └── n02947818_15145.JPEG │ ├── coco │ ├── COCO_val2014_000000279846.jpg │ ├── COCO_val2014_000000356368.jpg │ ├── COCO_val2014_000000380868.jpg │ └── COCO_val2014_000000531563.jpg │ └── otter │ ├── n02444819_10502.JPEG │ └── n02444819_13167.JPEG ├── prototxts ├── dcc_coco_baseline_vgg.solver.prototxt ├── dcc_coco_baseline_vgg.train.prototxt ├── dcc_coco_rm1_vgg.solver.deltaLM.prototxt ├── dcc_coco_rm1_vgg.solver.freezeLM.prototxt ├── dcc_coco_rm1_vgg.solver.prototxt ├── dcc_coco_rm1_vgg.train.deltaLM.prototxt ├── dcc_coco_rm1_vgg.train.freezeLM.prototxt ├── dcc_coco_rm1_vgg.train.prototxt ├── dcc_imagenet_rm1_vgg.solver.prototxt ├── dcc_imagenet_rm1_vgg.train.prototxt ├── dcc_oodLM_rm1_vgg.im2txt.solver.prototxt ├── dcc_oodLM_rm1_vgg.surf.solver.prototxt ├── dcc_oodLM_rm1_vgg.train.prototxt ├── dcc_vgg.80k.deploy.prototxt ├── dcc_vgg.80k.wtd.imagenet.prototxt ├── dcc_vgg.80k.wtd.prototxt ├── dcc_vgg.delta.wtd.prototxt ├── dcc_vgg.deploy.prototxt ├── dcc_vgg.wtd.prototxt ├── train_classifiers_deploy.imagenet.prototxt └── train_classifiers_deploy.prototxt ├── run_dcc_coco_baseline_vgg.sh ├── run_dcc_coco_rm1_vgg.delta.sh ├── run_dcc_coco_rm1_vgg.sh ├── run_dcc_imagenet_rm1_vgg.im2txt.sh ├── run_dcc_imagenet_rm1_vgg.surf.sh ├── setup.sh ├── transfer.sh ├── transfer_delta.sh └── utils ├── __init__.py ├── config.example.py ├── download_tools.sh ├── extract_classifiers.py ├── image_list ├── coco2014_cocoid.train.txt ├── coco2014_cocoid.val_test.txt ├── coco2014_cocoid.val_val.txt ├── test_imagenet_images.txt └── train_imagenet_images.txt ├── lexicalList ├── lexicalList_471_rebuttalScale.txt ├── lexicalList_471_rebuttalScale_justImageNet.txt ├── 
lexicalList_JJ100_NN300_VB100_rmEightCoco1.txt └── lexicalList_parseCoco_JJ100_NN300_VB100.txt ├── python_data_layers.py ├── transfer_experiments ├── transfer_classifiers_coco1.txt ├── transfer_classifiers_imagenet.txt ├── transfer_words_coco1.txt └── transfer_words_imagenet.txt └── vocabulary ├── vocabulary.txt └── yt_coco_surface_80k_vocab.txt /README.md: -------------------------------------------------------------------------------- 1 | # Deep Compositional Captioning 2 | 3 | Hendricks, Lisa Anne, et al. "Deep Compositional Captioning: Describing Novel Object Categories without Paired Training Data." CVPR (2016). 4 | 5 | [Find the paper here.](https://arxiv.org/abs/1511.05284) 6 | 7 | ``` 8 | @inproceedings{hendricks16cvpr, 9 | title = {Deep Compositional Captioning: Describing Novel Object Categories without Paired Training Data}, 10 | author = {Hendricks, Lisa Anne and Venugopalan, Subhashini and Rohrbach, Marcus and Mooney, Raymond and Saenko, Kate and Darrell, Trevor}, 11 | booktitle = {Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)}, 12 | year = {2016} 13 | } 14 | ``` 15 | 16 | License: BSD 2-Clause license 17 | 18 | You should be able to replicate my results using this code. Please let me know if you have any questions. 19 | ## Setting Up 20 | 21 | To use my code, please make sure you have the following: 22 | 23 | 1. Lisa Anne Hendricks' branch of Caffe installed: "https://github.com/LisaAnne/lisa-caffe-public/tree/master". My code will probably work well with other Caffe versions, but I have only tested on this version. 24 | 2. ~~All data/models can be downloaded with setup.sh.~~ 25 | 3. After I graduated my website was deleted, so please download the data from a Drive folder [here](https://drive.google.com/drive/u/1/folders/1ct0KhDW8ZHW4D9pxu0IX1ntTaH-XOAVV). 26 | 4. Optional -- ImageNet dataset (http://image-net.org/download).
For the ImageNet experiments, some classes are outside the 1,000 classes chosen for the ILSVRC challenge. To see which images I used, look at "utils/all_imagenet_images.txt", which includes the path to each ImageNet image and the label I used during training. 27 | 28 | To begin, please run: ./setup.sh 29 | 30 | My script assumes that you have already downloaded the MSCOCO description annotations, images, and evaluation tools. If not, no worries! You can download them by using the following flags: 31 | 32 | --download_mscoco_annotations: downloads mscoco annotations to annotations. 33 | --download_mscoco_images: downloads mscoco images to images/coco_images. 34 | --download_mscoco_tools: downloads mscoco eval tools to utils/coco_tools. 35 | 36 | The script will also download my annotations used for my zero-shot splits and my models (before transfer). **Note -- to replicate my results you will need to run transfer.sh and transfer_delta.sh as described in the next few steps.** 37 | 38 | Next, copy "utils/config.example.py" to "utils/config.py" and make sure all paths match the paths on your machine. In particular, you will need to indicate the path to your Caffe directory, the MSCOCO dataset and evaluation toolbox (if you did not download these using setup.sh), and ImageNet images. 39 | 40 | Once you have set up your paths, run "transfer.sh" and "transfer_delta.sh" to run the transfer code. **You will not get the same results as me if you do not run the transfer code.** 41 | 42 | Now that everything is set up, we can evaluate the DCC model. 43 | 44 | Please look at "GenerateDescriptions.ipynb" for an example of how to caption an image. You do not need to retrain models, and can go directly to steps 5 and 6 if you would like to evaluate models. Some details follow: 45 | 46 | 1. The first step in DCC is to train lexical models which map images to a set of visual concepts (e.g., "sheep", "grass", "stand").
47 | - "attributes_JJ100_NN300_VB100_allObjects_coco_vgg_0111_iter_80000.caffemodel": image model trained with MSCOCO images 48 | - "attributes_JJ100_NN300_VB100_coco_471_eightCluster_0223_iter_80000.caffemodel": image model trained with MSCOCO images (Do not use multiple labels for held out classes. We mine MSCOCO labels from descriptions, and therefore images can have multiple labels. However, for the eight held out concepts, we just train with a single label corresponding to the held out class -- e.g., "bus" instead of "bus", "street", "building". We do this to ensure that the visual model does not exploit co-occurrences.) 49 | - "attributes_JJ100_NN300_VB100_clusterEight_imagenet_vgg_0112_iter_80000.caffemodel": image model trained with MSCOCO images EXCEPT for objects which are held out during paired training. These categories are trained with ImageNet data. 50 | - "vgg_multilabel_FT_iter_100000.caffemodel": image model trained on all MSCOCO images and over 600 ImageNet objects not in MSCOCO 51 | 52 | The code to train these models will be coming soon, but you can use all my pretrained models. Use "./extract_features.sh" to extract image features for MSCOCO. 53 | 54 | 2. The next step in DCC is to train language models. 55 | - "mrnn.direct_iter_110000.caffemodel": language model trained on MSCOCO text 56 | - "mrnn.lm.direct_surf_lr0.01_iter_120000.caffemodel": language model trained on WebCorpus text 57 | - "mrnn.lm.direct_imtextyt_lr0.01_iter_120000.caffemodel": language model trained on Caption text 58 | 59 | The code to train these models will be coming soon, but you can use all my pretrained models. 60 | 61 | 3. The final training step is to train the caption model. You can find the prototxts to train the caption models in "prototxts". To speed up training, I pre-extract image features. Please look at "extract_features.sh" to see how to extract features.
Train the caption models using one of the following bash scripts: 62 | - "run_dcc_coco_baseline_vgg.sh": model with paired supervision 63 | - "run_dcc_coco_rm1_vgg.sh": direct transfer model with in domain text pre-training and in domain image pre-training 64 | - "run_dcc_coco_rm1_vgg.delta.sh": delta transfer model with in domain text pre-training and in domain image pre-training 65 | - "run_dcc_imagenet_rm1_vgg.sh": direct transfer model with in domain text pre-training and out of domain image pre-training 66 | - "run_dcc_imagenet_rm1_vgg.im2txt.sh": direct transfer model with out of domain text pre-training with Caption text and out of domain image pre-training 67 | - "run_dcc_imagenet_rm1_vgg.surf.sh": direct transfer model with out of domain text pre-training with WebCorpus text and out of domain image pre-training 68 | - "run_dcc_imagenet_sentences_vgg.sh": direct transfer model for describing ImageNet objects 69 | 70 | Note that I include all my caption models in "snapshots", so you do not have to retrain these models yourself! 71 | 72 | 4. Novel word transfer. Please look at transfer.sh to see how to transfer weights for the direct transfer model and transfer_delta.sh to see how to transfer weights for the delta transfer model. The setup script will automatically do both direct transfer and delta transfer, so these models should be in snapshots as well. 73 | 74 | 5. Evaluation on MSCOCO. Look at generate_coco.sh. 75 | 76 | 6. Generating descriptions for ImageNet images. Look at generate_imagenet.sh. 77 | 78 | If you just want to compare to my descriptions, look in the "results/generated_sentences" folder. You will find: 79 | 80 | 1. dcc_coco_rm1_vgg.471.solver.prototxt_iter_110000.caffemodel_coco2014_cocoid.val_test.txt.json: DCC with in domain text and in domain images. 81 | 2.
dcc_oodLM_rm1_vgg.surf.471.solver_0409_iter_110000.transfer_words_coco1.txt_closeness_embedding.caffemodel_coco2014_cocoid.val_test.txt.json: DCC with out of domain text and out of domain images. 82 | 3. vgg_feats.vgg_multilabel_FT_iter_100000_imagenetSentences_iter_110000.transfer_words_imagenet.txt_closeness_embedding.caffemodel_test_imagenet_images.txt.json: DCC descriptions for ImageNet images. 83 | 84 | Finally, if you are working on integrating novel words into captions, I suggest you also check out the following papers: 85 | 86 | [Captioning Images with Diverse Objects](https://arxiv.org/abs/1606.07770) **Oral CVPR 2017** 87 | 88 | [Incorporating Copying Mechanism in Image Captioning for Learning Novel Objects](http://openaccess.thecvf.com/content_cvpr_2017/papers/Yao_Incorporating_Copying_Mechanism_CVPR_2017_paper.pdf) **CVPR 2017** 89 | 90 | [Guided Open Vocabulary Image Captioning with Constrained Beam Search](https://arxiv.org/abs/1612.00576) **EMNLP 2017** 91 | 92 | [Neural Baby Talk](https://arxiv.org/pdf/1803.09845.pdf) **Spotlight CVPR 2018** 93 | 94 | [Decoupled Novel Object Captioner](https://arxiv.org/pdf/1804.03803.pdf) **ACM MM 2018** 95 | 96 | [Partially Supervised Image Captioning](https://arxiv.org/pdf/1806.06004.pdf) **NIPS 2018** 97 | 98 | [Image Captioning with Unseen Objects](https://arxiv.org/pdf/1908.00047.pdf) **Spotlight BMVC 2019** 99 | 100 | If you have a paper in which you compare to DCC, let me know and I will add it to this list. 101 | 102 | Please contact lisa_anne@berkeley.edu if you have any issues. Happy captioning!
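For readers who want the gist of step 4 (novel word transfer) without opening `dcc_transfer/transfer_weights.py`: direct transfer copies the output-layer weights of a semantically close known word into the row for the novel word. Here is a minimal numpy sketch of that idea — the toy vocabulary, dimensions, and word pairing below are invented for illustration and are not taken from the released models:

```python
import numpy as np

# Toy vocabulary; 'zebra' stands in for a novel word with no paired captions.
vocab = ['<EOS>', 'horse', 'sheep', 'bus', 'zebra']
hidden_dim = 8

rng = np.random.RandomState(0)
predict_weights = rng.randn(len(vocab), hidden_dim)  # one output row per word
predict_bias = rng.randn(len(vocab))

def direct_transfer(novel_word, transfer_word):
    # Copy the close word's output weights and bias into the novel word's row,
    # so the caption model can emit the novel word at test time.
    w_idx, t_idx = vocab.index(novel_word), vocab.index(transfer_word)
    predict_weights[w_idx, :] = predict_weights[t_idx, :]
    predict_bias[w_idx] = predict_bias[t_idx]

# Assume word2vec ranks 'horse' as the closest legal word to 'zebra'.
direct_transfer('zebra', 'horse')
assert np.allclose(predict_weights[vocab.index('zebra')],
                   predict_weights[vocab.index('horse')])
```

In the real code the same copy is applied to both the language-model ('predict-lm') and image ('predict-im') pathways, the closest word is chosen by word2vec cosine similarity, and the cross terms between the novel word and its own image classifier are adjusted separately.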
104 | 105 | Updated 8/06/2019 106 | 107 | -------------------------------------------------------------------------------- /dcc.py: -------------------------------------------------------------------------------- 1 | # main function for DCC; use to extract features and evaluate results 2 | import sys 3 | from utils import extract_classifiers 4 | from eval.captioner import * 5 | from eval.coco_eval import * 6 | from eval import eval_sentences  # needed by eval_imagenet below 7 | from dcc_transfer import transfer_weights 8 | import argparse 9 | import pdb 10 | from utils.config import * 11 | import h5py 12 | 13 | def extract_features(args): 14 | extract_classifiers.extract_features(args.image_model, args.model_weights, args.imagenet_images, args.device, args.image_dim, args.lexical_feature, args.batch_size) 15 | 16 | def transfer(args): 17 | transfer_net = transfer_weights.transfer_net(args.language_model, args.model_weights, args.orig_attributes, args.all_attributes, args.vocab) 18 | getattr(transfer_net, args.transfer_type)(args.words, args.classifiers, args.closeness_metric, args.log, num_transfer=args.num_transfer, orig_net_weights=args.orig_model) 19 | 20 | def generate_coco(args): 21 | #args.model_weights, args.image_model, args.language_model, args.vocab, args.image_list, args.precomputed_features 22 | 23 | language_model = models_folder + args.language_model 24 | model_weights = weights_folder + args.model_weights 25 | vocab = vocab_root + args.vocab 26 | 27 | image_list = open_txt(image_list_root + args.image_list) 28 | 29 | captioner = Captioner(language_model, model_weights, 30 | sentence_generation_cont_in='cont_sentence', 31 | sentence_generation_sent_in='input_sentence', 32 | sentence_generation_feature_in=['image_features'], 33 | sentence_generation_out='predict', 34 | vocab_file=vocab, 35 | prev_word_restriction=True) 36 | 37 | if args.precomputed_features: 38 | precomputed_feats = lexical_features_root + args.precomputed_features 39 | features = 
h5py.File(precomputed_feats, 'r') 43 | descriptor_dict = {} 44 | for feature, im in zip(features['features'], features['ims']): 45 | descriptor_dict[im] = np.array(feature) 46 | features.close() 47 | else: 48 | #TODO add in code to compute features if not precomputed 49 | raise Exception("You must precompute features!") 50 | 51 | assert len(image_list) == len(descriptor_dict.keys()) 52 | 53 | final_captions = captioner.caption_images(descriptor_dict, descriptor_dict.keys(), batch_size=1000) 54 | save_caps = 'results/generated_sentences/%s_%s.json' %(args.model_weights, args.image_list) 55 | save_json_coco_format(final_captions, save_caps) 56 | 57 | gt_json = coco_annotations + 'captions_%s2014.json' %args.split 58 | gt_template_novel = coco_annotations + 'captions_split_set_%s_%s_novel2014.json' 59 | gt_template_train = coco_annotations + 'captions_split_set_%s_%s_train2014.json' 60 | 61 | print "Scores over entire dataset..." 62 | score_generation(gt_json, save_caps) 63 | 64 | print "Scores over word splits..." 
65 | new_words = ['bus', 'bottle', 'couch', 'microwave', 'pizza', 'racket', 'suitcase', 'zebra'] 66 | score_dcc(gt_template_novel, gt_template_train, save_caps, new_words, args.split) 67 | 68 | def generate_imagenet(args): 69 | #args.model_weights, args.image_model, args.language_model, args.vocab, args.image_list, args.precomputed_features 70 | 71 | model_weights = weights_folder + args.model_weights 72 | language_model = models_folder + args.language_model 73 | vocab = vocab_root + args.vocab 74 | 75 | image_list = open_txt(image_list_root + args.image_list) 76 | 77 | if args.precomputed_features: 78 | precomputed_feats = lexical_features_root + args.precomputed_features 79 | features = h5py.File(precomputed_feats, 'r') 80 | descriptor_dict = {} 81 | for feature, im in zip(features['features'], features['ims']): 82 | descriptor_dict[im] = np.array(feature) 83 | features.close() 84 | else: 85 | #TODO add in code to compute features if not precomputed 86 | raise Exception("You must precompute features!") 87 | 88 | captioner = Captioner(language_model, model_weights, 89 | sentence_generation_cont_in='cont_sentence', 90 | sentence_generation_sent_in='input_sentence', 91 | sentence_generation_feature_in=['image_features'], 92 | sentence_generation_out='predict', 93 | vocab_file=vocab, 94 | prev_word_restriction=True) 95 | 96 | final_captions = captioner.caption_images(descriptor_dict, descriptor_dict.keys(), batch_size=1000) 97 | save_caps = 'results/generated_sentences/%s_%s.json' %(args.model_weights, args.image_list) 98 | save_json_other_format(final_captions, save_caps) 99 | 100 | def eval_imagenet(args): 101 | result = eval_sentences.make_imagenet_result_dict(generated_sentences + args.caps) 102 | eval_sentences.find_successful_classes(result) 103 | 104 | if __name__ == "__main__": 105 | parser = argparse.ArgumentParser() 106 | parser.add_argument("--image_model",type=str) 107 | 
parser.add_argument("--language_model",type=str) 109 | parser.add_argument("--model_weights",type=str) 110 | parser.add_argument("--image_list", type=str) 111 | parser.add_argument("--imagenet_images",type=str, default=None) #extract_features 112 | parser.add_argument("--lexical_feature",type=str, default='probs') #name of layer to extract 113 | parser.add_argument("--orig_attributes",type=str, default='') 114 | parser.add_argument("--all_attributes",type=str, default='') 115 | parser.add_argument("--vocab", type=str, default='') 116 | parser.add_argument("--words", type=str, default='') 117 | parser.add_argument("--precomputed_features", type=str, default=None) #list of classifiers 118 | parser.add_argument("--classifiers", type=str, default='') #list of classifiers 119 | parser.add_argument("--closeness_metric", type=str, default='closeness_embedding') 120 | parser.add_argument("--transfer_type", type=str, default='direct_transfer') 121 | parser.add_argument("--split", type=str, default='val_val') 122 | parser.add_argument("--caps", type=str, default='') 123 | 124 | parser.add_argument("--orig_model", type=str, default='') 125 | parser.add_argument("--new_model", type=str, default='') 126 | parser.add_argument("--language_feature", type=str, default='predict') 127 | parser.add_argument("--image_feature", type=str, default='data') 128 | 129 | parser.add_argument("--device",type=int, default=0) 130 | parser.add_argument("--image_dim",type=int, default=227) 131 | parser.add_argument("--batch_size",type=int, default=10) 132 | parser.add_argument("--num_transfer",type=int, default=1) 133 | 134 | parser.add_argument('--extract_features', dest='extract_features', action='store_true') 135 | parser.set_defaults(extract_features=False) 136 | parser.add_argument('--generate_coco', dest='generate_coco', action='store_true') 137 | parser.set_defaults(generate_coco=False) 138 | parser.add_argument('--generate_imagenet', dest='generate_imagenet', action='store_true') 139 | 
parser.set_defaults(generate_imagenet=False) 140 | parser.add_argument('--eval_imagenet', dest='eval_imagenet', action='store_true') 141 | parser.set_defaults(eval_imagenet=False) 142 | parser.add_argument('--transfer', dest='transfer', action='store_true') 143 | parser.set_defaults(transfer=False) 144 | parser.add_argument('--log', dest='log', action='store_true') 145 | parser.set_defaults(log=False) 146 | 147 | args = parser.parse_args() 148 | 149 | if args.extract_features: 150 | extract_features(args) 151 | 152 | if args.transfer: 153 | transfer(args) 154 | 155 | if args.generate_coco: 156 | generate_coco(args) 157 | 158 | if args.generate_imagenet: 159 | generate_imagenet(args) 160 | 161 | if args.eval_imagenet: 162 | eval_imagenet(args) 163 | -------------------------------------------------------------------------------- /dcc_transfer/__init__.py: -------------------------------------------------------------------------------- 1 | #Train captions module 2 | -------------------------------------------------------------------------------- /dcc_transfer/transfer_weights.py: -------------------------------------------------------------------------------- 1 | import sys 2 | sys.path.append('utils/') 3 | sys.path.append('utils/tools/') 4 | from python_utils import * 5 | from w2vDist import * 6 | import caffe 7 | import numpy as np 8 | import copy 9 | import pickle as pkl 10 | import hickle as hkl 11 | from config import * 12 | import pdb 13 | 14 | class closeness_embedding(object): 15 | 16 | def __init__(self, attributes): 17 | self.W2V = w2v() 18 | self.W2V.readVectors() 19 | self.W2V.reduce_vectors(attributes, '-n') 20 | 21 | def __call__(self, word): 22 | return self.W2V.findClosestWords(word) 23 | 24 | class transfer_net(object): 25 | 26 | def __init__(self, model, model_weights, orig_attributes, all_attributes, vocab): 27 | self.model = model 28 | self.model_weights = model_weights 29 | self.net = caffe.Net(self.model, weights_folder + self.model_weights + 
'.caffemodel', caffe.TRAIN) 30 | self.orig_attributes = open_txt(orig_attributes) 31 | self.all_attributes = open_txt(all_attributes) 32 | self.new_attributes = list(set(self.all_attributes) - set(self.orig_attributes)) 33 | self.vocab = open_txt(vocab) 34 | self.vocab = [''] + self.vocab 35 | 36 | def direct_transfer(self, words, classifiers, closeness_metric, log=False, predict_lm = 'predict-lm', predict_im = 'predict-im', num_transfer=1, orig_net_weights=''): 37 | metric = eval(closeness_metric)(self.all_attributes) 38 | save_tag = '%s_%s' %(words.split('/')[-1], closeness_metric) 39 | if log: log_file = 'outfiles/transfer/%s_%s_direct.out' %(self.model_weights, words.split('/')[-1]) 40 | if log: log_write = open(log_file, 'w') 41 | 42 | words = open_txt(words) 43 | classifiers = open_txt(classifiers) 44 | 45 | illegal_words = classifiers + self.new_attributes 46 | illegal_idx = [self.all_attributes.index(illegal_word) for illegal_word in illegal_words] 47 | 48 | if len(self.net.params[predict_im]) > 1: 49 | im_bias = True 50 | else: 51 | im_bias = False 52 | 53 | close_words = {} 54 | for word, classifier in zip(words, classifiers): 55 | word_sims = metric(classifier) 56 | for illegal_id in illegal_idx: 57 | word_sims[illegal_id] = -100000 58 | 59 | close_words[word] = self.all_attributes[np.argsort(word_sims)[-1]] 60 | 61 | t_word_string = "Transfer word for %s is %s." 
%(word, close_words[word]) 62 | print t_word_string 63 | if log: log_write.writelines('%s\n' %t_word_string) 64 | 65 | predict_weights_lm = copy.deepcopy(self.net.params[predict_lm][0].data) 66 | predict_bias_lm = copy.deepcopy(self.net.params[predict_lm][1].data) 67 | predict_weights_im = copy.deepcopy(self.net.params[predict_im][0].data) 68 | if im_bias: 69 | predict_bias_im = copy.deepcopy(self.net.params[predict_im][1].data) 70 | 71 | for word, classifier in zip(words, classifiers): 72 | word_idx = self.vocab.index(word) 73 | attribute_idx = self.all_attributes.index(classifier) 74 | transfer_word_idx = self.vocab.index(close_words[word]) 75 | transfer_attribute_idx = self.all_attributes.index(close_words[word]) 76 | 77 | transfer_weights_lm = np.ones((predict_weights_lm.shape[1],))*0 78 | transfer_bias_lm = 0 79 | transfer_weights_im = np.ones((predict_weights_im.shape[1],))*0 80 | if im_bias: 81 | transfer_bias_im = 0 82 | 83 | transfer_weights_lm += predict_weights_lm[transfer_word_idx,:] 84 | transfer_bias_lm += predict_bias_lm[transfer_word_idx] 85 | transfer_weights_im += predict_weights_im[transfer_word_idx,:] 86 | if im_bias: 87 | transfer_bias_im += predict_bias_im[transfer_word_idx] 88 | 89 | #Take care of classifier cross terms 90 | transfer_weights_im[attribute_idx] = predict_weights_im[transfer_word_idx, transfer_attribute_idx] 91 | transfer_weights_im[transfer_attribute_idx] = 0 92 | 93 | predict_weights_lm[word_idx,:] = transfer_weights_lm 94 | predict_bias_lm[word_idx] = transfer_bias_lm 95 | predict_weights_im[word_idx,:] = transfer_weights_im 96 | if im_bias: 97 | predict_bias_im[word_idx] = transfer_bias_im 98 | 99 | predict_weights_im[transfer_word_idx,attribute_idx] = 0 100 | 101 | self.net.params[predict_lm][0].data[...] = predict_weights_lm 102 | self.net.params[predict_lm][1].data[...] = predict_bias_lm 103 | self.net.params[predict_im][0].data[...] = predict_weights_im 104 | if im_bias: 105 | self.net.params[predict_im][1].data[...] 
= predict_bias_im 106 | self.net.save('%s%s.%s.caffemodel' %(weights_folder, self.model_weights, save_tag)) 107 | 108 | save_string = 'Saved to: %s%s.%s.caffemodel' %(weights_folder, self.model_weights, save_tag) 109 | print save_string 110 | if log: log_write.writelines('%s\n' %(save_string)) 111 | if log: log_write.close() 112 | if log: print 'Log file saved to %s.' %log_file 113 | 114 | def delta_transfer(self, words, classifiers, closeness_metric, log=False, predict_lm = 'predict', predict_im = 'predict-im', num_transfer=3, orig_net_weights=''): 115 | 116 | metric = eval(closeness_metric)(self.all_attributes) 117 | save_tag = '%s_%s' %(words.split('/')[-1], closeness_metric) 118 | if log: log_file = 'outfiles/transfer/%s_%s_%d_delta.out' %(self.model_weights, words.split('/')[-1], num_transfer) 119 | if log: log_write = open(log_file, 'w') 120 | 121 | orig_net = caffe.Net(self.model, weights_folder + orig_net_weights + '.caffemodel', caffe.TRAIN) 122 | 123 | words = open_txt(words) 124 | classifiers = open_txt(classifiers) 125 | 126 | illegal_words = classifiers + self.new_attributes 127 | illegal_idx = [self.all_attributes.index(illegal_word) for illegal_word in illegal_words] 128 | 129 | if len(self.net.params[predict_im]) > 1: 130 | im_bias = True 131 | else: 132 | im_bias = False 133 | 134 | close_words = {} 135 | for word, classifier in zip(words, classifiers): 136 | word_sims = metric(classifier) 137 | for illegal_id in illegal_idx: 138 | word_sims[illegal_id] = -100000 139 | 140 | close_words[word] = [self.all_attributes[np.argsort(word_sims)[-1*i]] for i in range(1,num_transfer+1)] 141 | 142 | t_word_string = "Transfer words for %s are %s." 
%(word, close_words[word]) 143 | print t_word_string 144 | if log: log_write.writelines('%s\n' %t_word_string) 145 | 146 | ft_weights_lm = copy.deepcopy(self.net.params[predict_lm][0].data) 147 | orig_weights_lm = copy.deepcopy(orig_net.params[predict_lm][0].data) 148 | ft_bias_lm = copy.deepcopy(self.net.params[predict_lm][1].data) 149 | orig_bias_lm = copy.deepcopy(orig_net.params[predict_lm][1].data) 150 | ft_weights_im = copy.deepcopy(self.net.params[predict_im][0].data) 151 | if im_bias: 152 | ft_bias_im = copy.deepcopy(self.net.params[predict_im][1].data) 153 | 154 | for word, classifier in zip(words, classifiers): 155 | word_idx = self.vocab.index(word) 156 | attribute_idx = self.all_attributes.index(classifier) 157 | transfer_weights_lm = np.ones((ft_weights_lm.shape[1],))*0 158 | transfer_bias_lm = 0 159 | transfer_weights_im = np.ones((ft_weights_im.shape[1],))*0 160 | if im_bias: 161 | transfer_bias_im = 0 162 | 163 | for close_word in close_words[word]: 164 | transfer_word_idx = self.vocab.index(close_word) 165 | transfer_attribute_idx = self.all_attributes.index(close_word) 166 | 167 | transfer_weights_lm += ft_weights_lm[transfer_word_idx,:] - orig_weights_lm[transfer_word_idx,:] 168 | transfer_bias_lm += ft_bias_lm[transfer_word_idx] - orig_bias_lm[transfer_word_idx] 169 | 170 | #Take care of classifier cross terms 171 | transfer_word_idx = self.vocab.index(close_words[word][0]) 172 | transfer_attribute_idx = self.all_attributes.index(close_words[word][0]) 173 | transfer_weights_im += ft_weights_im[transfer_word_idx,:] 174 | if im_bias: 175 | transfer_bias_im += ft_bias_im[transfer_word_idx] 176 | transfer_weights_im[attribute_idx] = ft_weights_im[transfer_word_idx, transfer_attribute_idx] 177 | transfer_weights_im[transfer_attribute_idx] = 0 178 | 179 | ft_weights_lm[word_idx,:] += transfer_weights_lm/num_transfer 180 | ft_bias_lm[word_idx] += transfer_bias_lm/num_transfer 181 | ft_weights_im[word_idx,:] = transfer_weights_im 182 | if im_bias: 183 | 
ft_bias_im[word_idx] = transfer_bias_im 184 | 185 | ft_weights_im[transfer_word_idx,attribute_idx] = 0 186 | 187 | self.net.params[predict_lm][0].data[...] = ft_weights_lm 188 | self.net.params[predict_lm][1].data[...] = ft_bias_lm 189 | self.net.params[predict_im][0].data[...] = ft_weights_im 190 | if im_bias: 191 | self.net.params[predict_im][1].data[...] = ft_bias_im 192 | self.net.save('%s%s.%s_delta_%d.caffemodel' %(weights_folder, self.model_weights, save_tag, num_transfer)) 193 | 194 | save_string = 'Saved to: %s%s.%s_delta_%d.caffemodel' %(weights_folder, self.model_weights, save_tag, num_transfer) 195 | print save_string 196 | if log: log_write.writelines('%s\n' %(save_string)) 197 | if log: log_write.close() 198 | if log: print 'Log file saved to %s.' %log_file 199 | 200 | 201 | 202 | 203 | 204 | 205 | 206 | 207 | 208 | 209 | 210 | 211 | 212 | 213 | 214 | 215 | 216 | 217 | -------------------------------------------------------------------------------- /dcc_transfer/w2vDist.py: -------------------------------------------------------------------------------- 1 | import argparse 2 | import struct 3 | import numpy as np 4 | from sklearn.metrics import pairwise as skmetrics 5 | import sys 6 | import pdb 7 | 8 | FLOAT_SIZE = 4 9 | 10 | class w2v: 11 | def __init__(self): 12 | self.binFile="dcc_transfer/vectors-cbow-bnc+ukwac+wikipedia.bin" 13 | self.vocab=None 14 | self.matrix=None 15 | 16 | def readVectors(self): 17 | # vocab is a list containing the row info 18 | # matrix will have data 19 | infd=open(self.binFile,"r") 20 | header = infd.readline().rstrip() 21 | vocab_s, dims = map(int, header.split(" ")) 22 | self.vocab = [] 23 | # init matrix 24 | self.matrix = np.zeros((vocab_s, dims), dtype=np.float) 25 | i = 0 26 | while True: 27 | #while i<7: 28 | line = infd.readline() 29 | if not line: 30 | break 31 | sep = line.find(" ") 32 | word = line[:sep] 33 | data = line[sep+1:] 34 | if len(data) < FLOAT_SIZE * dims + 1: 35 | data += infd.read(FLOAT_SIZE * 
dims + 1 - len(data)) 36 | data = data[:-1] 37 | self.vocab.append(word) 38 | vector = (struct.unpack("%df" % dims, data)) 39 | self.matrix[i] = vector 40 | i += 1 41 | infd.close() 42 | 43 | def reduce_vectors(self, word_list, pos): 44 | #reduce the vectors so that they only include the vectors in the lexical list 45 | #kind of hacky, but assume a noun unless noun does not exist, then assume adjective, then assume verb, then skip 46 | vocab = self.vocab 47 | new_vocab = [] 48 | new_matrix = np.zeros((len(word_list), self.matrix.shape[1]), dtype=np.float) 49 | for new_idx, word in enumerate(word_list): 50 | if word + pos in self.vocab: 51 | old_idx = vocab.index(word+pos) 52 | new_vocab.append(word+pos) 53 | else: 54 | old_idx = None 55 | new_vocab.append(None) 56 | if old_idx is not None: #index 0 is a valid match, so test against None 57 | new_matrix[new_idx, :] = self.matrix[old_idx,:] 58 | else: 59 | print "Word %s%s not in word2vec.\n" %(word, pos) 60 | new_matrix[new_idx,:] = -1000000 #this should make this vector far from everything 61 | self.matrix = new_matrix 62 | self.vocab = new_vocab 63 | 64 | def printSample(self): 65 | for i in range(0,7): 66 | print self.vocab[i]+"\t" 67 | #print str(self.matrix[i])+"\n" 68 | 69 | def getCosDistanceN(self,noun1,noun2): 70 | if(not self.vocab.__contains__(noun1+'-n')): 71 | print "Vocab does not contain "+noun1+'-n' 72 | return 0 73 | if(not self.vocab.__contains__(noun2+'-n')): 74 | print "Vocab does not contain "+noun2+'-n' 75 | return 0 76 | ind1=self.vocab.index(noun1+'-n') 77 | ind2=self.vocab.index(noun2+'-n') 78 | dist=skmetrics.cosine_similarity(self.matrix[ind1],self.matrix[ind2]) 79 | return dist[0][0] 80 | 81 | def getCosDistanceV(self,verb1,verb2): 82 | if(not self.vocab.__contains__(verb1+'-v')): 83 | print "Vocab does not contain "+verb1+'-v' 84 | return 0 85 | if(not self.vocab.__contains__(verb2+'-v')): 86 | print "Vocab does not contain "+verb2+'-v' 87 | return 0 88 | ind1=self.vocab.index(verb1+'-v') 89 | ind2=self.vocab.index(verb2+'-v') 90 | 
dist=skmetrics.cosine_similarity(self.matrix[ind1],self.matrix[ind2]) 91 | return dist[0][0] 92 | 93 | def findClosestWords(self, word, pos='-n'): 94 | if(not self.vocab.__contains__(word + pos)): 95 | print "Vocab does not contain "+ word + pos 96 | return np.random.rand(self.matrix.shape[0]) 97 | ind1=self.vocab.index(word + pos) 98 | numerator = np.dot(self.matrix, self.matrix[ind1]) 99 | denominator = np.linalg.norm(self.matrix[ind1])*np.linalg.norm(self.matrix, axis=1) 100 | dist = numerator/denominator 101 | return dist 102 | # vocab_idx = np.argsort(dist)[-10:] 103 | # return [self.vocab[idx] for idx in vocab_idx], dist[vocab_idx] 104 | 105 | def demo1(): 106 | parser = argparse.ArgumentParser( 107 | description="Converts a Mikolov binary vector file into one compatible with Trento's COMPOSES.") 108 | parser.add_argument('--input', '-i', help='Input file') 109 | parser.add_argument('--output', '-o', type=argparse.FileType('w'), help='Output file') 110 | args = parser.parse_args() 111 | W2V=w2v() 112 | W2V.readVectors() 113 | W2V.printSample() 114 | w1='time' 115 | w2='year' 116 | dist=W2V.getCosDistanceN(w1,w2) 117 | print "cos sim b/w "+w1+" and "+w2+" = "+str(dist) 118 | 119 | def main(): 120 | if len(sys.argv) > 1: 121 | word = sys.argv[1] 122 | else: 123 | word = 'zebra' 124 | W2V=w2v() 125 | W2V.readVectors() 126 | similarity=W2V.findClosestWords(word) #returns a similarity score per vocab entry 127 | vocab_idx=np.argsort(similarity)[::-1][:10] 128 | print "Closest words are:" 129 | for idx in vocab_idx: 130 | print "%s:%f" %(W2V.vocab[idx], similarity[idx]) 131 | 132 | if __name__=="__main__": 133 | main() 134 | -------------------------------------------------------------------------------- /extract_features.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | 3 | #coco (in domain) 4 | model='prototxts/train_classifiers_deploy.prototxt' 5 | model_weights='snapshots/attributes_JJ100_NN300_VB100_coco_471_eightCluster_0223_iter_80000.caffemodel' 6 | 
image_dim=224 7 | 8 | python dcc.py --image_model $model \ 9 | --model_weights $model_weights \ 10 | --batch_size 16 \ 11 | --image_dim $image_dim \ 12 | --extract_features 13 | 14 | #coco (out of domain) 15 | model='prototxts/train_classifiers_deploy.prototxt' 16 | model_weights='snapshots/attributes_JJ100_NN300_VB100_clusterEight_imagenet_vgg_0112_iter_80000.caffemodel' 17 | image_dim=224 18 | 19 | python dcc.py --image_model $model \ 20 | --model_weights $model_weights \ 21 | --batch_size 16 \ 22 | --image_dim $image_dim \ 23 | --extract_features 24 | 25 | -------------------------------------------------------------------------------- /generate_coco.sh: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env bash 2 | 3 | #By default this script generates results using in-domain data for transfer with the direct transfer method. 4 | 5 | #IN DOMAIN DIRECT TRANSFER 6 | deploy_words=dcc_vgg.wtd.prototxt 7 | model_name=dcc_coco_rm1_vgg.471.solver.prototxt_iter_110000.transfer_words_coco1.txt_closeness_embedding.caffemodel 8 | vocab=vocabulary.txt 9 | precomputed_feats=vgg_feats.attributes_JJ100_NN300_VB100_coco_471_eightCluster_0223_iter_80000.caffemodel.val_test.h5 10 | 11 | #IN DOMAIN DELTA TRANSFER 12 | #deploy_words=dcc_vgg.delta.wtd.prototxt 13 | #model_name=dcc_coco_rm1_vgg.delta_freezeLM_iter_50000.transfer_words_coco1.txt_closeness_embedding_delta_1.caffemodel 14 | #vocab=vocabulary.txt 15 | #precomputed_feats=vgg_feats.attributes_JJ100_NN300_VB100_coco_471_eightCluster_0223_iter_80000.caffemodel.val_test.h5 16 | 17 | 18 | #To generate results using out-of-domain data for transfer: 19 | 20 | #OUT OF DOMAIN IMAGE, OUT OF DOMAIN LANGUAGE (IM2TXT) 21 | #deploy_words=dcc_vgg.80k.wtd.prototxt 22 | #vocab=yt_coco_surface_80k_vocab.txt 23 | #model_name=dcc_oodLM_rm1_vgg.im2txt.471.solver_0409_iter_110000.transfer_words_coco1.txt_closeness_embedding.caffemodel 24 | 
#precomputed_feats=vgg_feats.attributes_JJ100_NN300_VB100_clusterEight_imagenet_vgg_0112_iter_80000.val_test.h5 25 | 26 | #OUT OF DOMAIN IMAGE, OUT OF DOMAIN LANGUAGE (SURF) 27 | #deploy_words=dcc_vgg.80k.wtd.prototxt 28 | #vocab=yt_coco_surface_80k_vocab.txt 29 | #model_name=dcc_oodLM_rm1_vgg.surf.471.solver_0409_iter_110000.transfer_words_coco1.txt_closeness_embedding.caffemodel 30 | #precomputed_feats=vgg_feats.attributes_JJ100_NN300_VB100_clusterEight_imagenet_vgg_0112_iter_80000.val_test.h5 31 | 32 | #change to "val_val" to eval on the validation set 33 | image_list=coco2014_cocoid.val_test.txt 34 | split=val_test 35 | 36 | echo $deploy_words 37 | echo $model_name 38 | echo $vocab 39 | echo $precomputed_feats 40 | echo $image_list 41 | 42 | python dcc.py --language_model $deploy_words \ 43 | --model_weights $model_name \ 44 | --vocab $vocab \ 45 | --precomputed_features $precomputed_feats \ 46 | --image_list $image_list \ 47 | --split $split \ 48 | --generate_coco 49 | -------------------------------------------------------------------------------- /generate_imagenet.sh: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env bash 2 | 3 | deploy_words=dcc_vgg.80k.wtd.imagenet.prototxt 4 | model_name=vgg_feats.vgg_multilabel_FT_iter_100000_imagenetSentences_iter_110000.transfer_words_imagenet.txt_closeness_embedding.caffemodel 5 | vocab=yt_coco_surface_80k_vocab.txt 6 | precomputed_feats=vgg_feats.vgg_multilabel_FT_iter_100000.caffemodel.imagenet_ims_test.h5 7 | image_list=test_imagenet_images.txt 8 | #image_list=gecko_test_list.txt 9 | language_feature='predict' 10 | 11 | python dcc.py --language_model $deploy_words \ 12 | --model_weights $model_name \ 13 | --vocab $vocab \ 14 | --precomputed_features $precomputed_feats \ 15 | --image_list $image_list \ 16 | --language_feature $language_feature \ 17 | --generate_imagenet 18 | -------------------------------------------------------------------------------- 
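Every generation script above funnels the same handful of flags into dcc.py. As a sanity check before launching a long Caffe job, the invocation can be assembled and inspected in plain Python; this is only a sketch — the flag names are copied from generate_coco.sh, and `build_generate_command` is a hypothetical helper, not part of dcc.py:

```python
# Sketch of the dcc.py invocation that generate_coco.sh builds.
# Flag names come from the shell script; the helper itself is ours.
def build_generate_command(deploy_words, model_name, vocab,
                           precomputed_feats, image_list, split):
    """Return the argv list that generate_coco.sh passes to dcc.py."""
    return ["python", "dcc.py",
            "--language_model", deploy_words,
            "--model_weights", model_name,
            "--vocab", vocab,
            "--precomputed_features", precomputed_feats,
            "--image_list", image_list,
            "--split", split,
            "--generate_coco"]

# In-domain direct-transfer settings from the script above.
cmd = build_generate_command(
    "dcc_vgg.wtd.prototxt",
    "dcc_coco_rm1_vgg.471.solver.prototxt_iter_110000.transfer_words_coco1.txt_closeness_embedding.caffemodel",
    "vocabulary.txt",
    "vgg_feats.attributes_JJ100_NN300_VB100_coco_471_eightCluster_0223_iter_80000.caffemodel.val_test.h5",
    "coco2014_cocoid.val_test.txt",
    "val_test")
print(" ".join(cmd))
```

Passing the list to `subprocess.call(cmd)` would reproduce the in-domain direct-transfer run.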
/images/ipython_images/alpaca/n02698473_518.JPEG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/LisaAnne/DCC/a885b5ddc029f98c2cedbf1b055e5ad3dde5e352/images/ipython_images/alpaca/n02698473_518.JPEG -------------------------------------------------------------------------------- /images/ipython_images/baobab/n12189987_4309.JPEG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/LisaAnne/DCC/a885b5ddc029f98c2cedbf1b055e5ad3dde5e352/images/ipython_images/baobab/n12189987_4309.JPEG -------------------------------------------------------------------------------- /images/ipython_images/candelabra/n02947818_15145.JPEG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/LisaAnne/DCC/a885b5ddc029f98c2cedbf1b055e5ad3dde5e352/images/ipython_images/candelabra/n02947818_15145.JPEG -------------------------------------------------------------------------------- /images/ipython_images/coco/COCO_val2014_000000279846.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/LisaAnne/DCC/a885b5ddc029f98c2cedbf1b055e5ad3dde5e352/images/ipython_images/coco/COCO_val2014_000000279846.jpg -------------------------------------------------------------------------------- /images/ipython_images/coco/COCO_val2014_000000356368.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/LisaAnne/DCC/a885b5ddc029f98c2cedbf1b055e5ad3dde5e352/images/ipython_images/coco/COCO_val2014_000000356368.jpg -------------------------------------------------------------------------------- /images/ipython_images/coco/COCO_val2014_000000380868.jpg: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/LisaAnne/DCC/a885b5ddc029f98c2cedbf1b055e5ad3dde5e352/images/ipython_images/coco/COCO_val2014_000000380868.jpg -------------------------------------------------------------------------------- /images/ipython_images/coco/COCO_val2014_000000531563.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/LisaAnne/DCC/a885b5ddc029f98c2cedbf1b055e5ad3dde5e352/images/ipython_images/coco/COCO_val2014_000000531563.jpg -------------------------------------------------------------------------------- /images/ipython_images/otter/n02444819_10502.JPEG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/LisaAnne/DCC/a885b5ddc029f98c2cedbf1b055e5ad3dde5e352/images/ipython_images/otter/n02444819_10502.JPEG -------------------------------------------------------------------------------- /images/ipython_images/otter/n02444819_13167.JPEG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/LisaAnne/DCC/a885b5ddc029f98c2cedbf1b055e5ad3dde5e352/images/ipython_images/otter/n02444819_13167.JPEG -------------------------------------------------------------------------------- /prototxts/dcc_coco_baseline_vgg.solver.prototxt: -------------------------------------------------------------------------------- 1 | train_net: "prototxts/dcc_coco_baseline_vgg.train.prototxt" 2 | random_seed: 1701 3 | average_loss: 100 4 | snapshot_prefix: "snapshots/dcc_coco_baseline_vgg.solver.prototxt" 5 | clip_gradients: 10 6 | max_iter: 110000 7 | stepsize: 20000 8 | base_lr: 0.01 9 | snapshot: 5000 10 | momentum: 0.9 11 | solver_mode: GPU 12 | lr_policy: "step" 13 | weight_decay: 0.0 14 | display: 10 15 | gamma: 0.5 16 | -------------------------------------------------------------------------------- /prototxts/dcc_coco_baseline_vgg.train.prototxt: 
-------------------------------------------------------------------------------- 1 | layer { 2 | name: "data" 3 | type: "Python" 4 | top: "input_sentence" 5 | top: "target_sentence" 6 | top: "cont_sentence" 7 | top: "data" 8 | python_param { 9 | module: "python_data_layers" 10 | layer: "pairedCaptionData" 11 | param_str: "{'caption_json': 'annotations/captions_train2014.json', 'feature_file': 'lexical_features/vgg_feats.attributes_JJ100_NN300_VB100_allObjects_coco_vgg_0111_iter_80000.train.h5', 'vocabulary': 'utils/vocabulary/vocabulary.txt', 'batch_size': 100, 'stream_size': 20, 'top_names': ['input_sentence', 'target_sentence', 'cont_sentence', 'data']}" 12 | } 13 | } 14 | layer { 15 | name: "reshape-data" 16 | type: "Reshape" 17 | bottom: "data" 18 | top: "reshape-data" 19 | reshape_param { 20 | shape { 21 | dim: 1 22 | dim: -1 23 | dim: 471 24 | } 25 | } 26 | } 27 | layer { 28 | name: "embedding" 29 | type: "Embed" 30 | bottom: "input_sentence" 31 | top: "embedding" 32 | param { 33 | lr_mult: 0 34 | decay_mult: 0 35 | } 36 | embed_param { 37 | num_output: 512 38 | input_dim: 8801 39 | bias_term: false 40 | } 41 | } 42 | layer { 43 | name: "embedding2" 44 | type: "InnerProduct" 45 | bottom: "embedding" 46 | top: "embedding2" 47 | param { 48 | lr_mult: 0 49 | decay_mult: 0 50 | } 51 | inner_product_param { 52 | num_output: 512 53 | axis: 2 54 | } 55 | } 56 | layer { 57 | name: "lstm1" 58 | type: "LSTM" 59 | bottom: "embedding2" 60 | bottom: "cont_sentence" 61 | top: "lstm1" 62 | param { 63 | lr_mult: 0 64 | decay_mult: 0 65 | } 66 | recurrent_param { 67 | num_output: 512 68 | } 69 | } 70 | layer { 71 | name: "concat-lm" 72 | type: "Concat" 73 | bottom: "embedding2" 74 | bottom: "lstm1" 75 | top: "concat-lm" 76 | concat_param { 77 | axis: 2 78 | } 79 | } 80 | layer { 81 | name: "predict-lm" 82 | type: "InnerProduct" 83 | bottom: "concat-lm" 84 | top: "predict-lm" 85 | param { 86 | lr_mult: 1 87 | decay_mult: 1 88 | } 89 | param { 90 | lr_mult: 1 91 | decay_mult: 1 
92 | } 93 | inner_product_param { 94 | num_output: 8801 95 | weight_filler { 96 | type: "uniform" 97 | min: -0.08 98 | max: 0.08 99 | } 100 | bias_filler { 101 | type: "constant" 102 | value: 0 103 | } 104 | axis: 2 105 | } 106 | } 107 | layer { 108 | name: "tile-data" 109 | type: "Tile" 110 | bottom: "reshape-data" 111 | top: "tile-data" 112 | tile_param { 113 | axis: 0 114 | tiles: 20 115 | } 116 | } 117 | layer { 118 | name: "predict-im" 119 | type: "InnerProduct" 120 | bottom: "tile-data" 121 | top: "predict-im" 122 | param { 123 | lr_mult: 1 124 | decay_mult: 1 125 | } 126 | inner_product_param { 127 | num_output: 8801 128 | bias_term: false 129 | weight_filler { 130 | type: "uniform" 131 | min: -0.08 132 | max: 0.08 133 | } 134 | axis: 2 135 | } 136 | } 137 | layer { 138 | name: "predict-multimodal" 139 | type: "Eltwise" 140 | bottom: "predict-lm" 141 | bottom: "predict-im" 142 | top: "predict-multimodal" 143 | eltwise_param { 144 | operation: SUM 145 | } 146 | } 147 | layer { 148 | name: "cross-entropy-loss" 149 | type: "SoftmaxWithLoss" 150 | bottom: "predict-multimodal" 151 | bottom: "target_sentence" 152 | top: "cross-entropy-loss" 153 | loss_weight: 20 154 | loss_param { 155 | ignore_label: -1 156 | } 157 | softmax_param { 158 | axis: 2 159 | } 160 | } 161 | 162 | -------------------------------------------------------------------------------- /prototxts/dcc_coco_rm1_vgg.solver.deltaLM.prototxt: -------------------------------------------------------------------------------- 1 | train_net: "prototxts/dcc_coco_rm1_vgg.train.deltaLM.prototxt" 2 | random_seed: 1701 3 | average_loss: 100 4 | snapshot_prefix: "snapshots/dcc_coco_rm1_vgg.deltaLM.prototxt" 5 | clip_gradients: 10 6 | max_iter: 5000 7 | stepsize: 5000 8 | base_lr: 0.001 9 | snapshot: 5000 10 | momentum: 0.9 11 | solver_mode: GPU 12 | lr_policy: "step" 13 | weight_decay: 0.0 14 | display: 10 15 | gamma: 0.5 16 | -------------------------------------------------------------------------------- 
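All of the solvers here use Caffe's "step" learning-rate policy, which multiplies the rate by gamma every stepsize iterations: lr = base_lr * gamma^floor(iter / stepsize). A small sketch of the schedule this implies, using the values from the solver files above:

```python
def step_lr(base_lr, gamma, stepsize, it):
    # Caffe "step" policy: lr = base_lr * gamma ** floor(it / stepsize)
    return base_lr * gamma ** (it // stepsize)

# Values from dcc_coco_rm1_vgg.solver.prototxt.
lr_start = step_lr(0.01, 0.5, 20000, 0)       # 0.01 for the first 20k iterations
lr_final = step_lr(0.01, 0.5, 20000, 110000)  # halved five times by max_iter
```

So by max_iter 110000 the rate has decayed to 0.01 * 0.5^5 = 3.125e-4. The deltaLM solver above stops at 5000 iterations with stepsize 5000, so it never leaves its base rate of 0.001.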
/prototxts/dcc_coco_rm1_vgg.solver.freezeLM.prototxt: -------------------------------------------------------------------------------- 1 | train_net: "prototxts/dcc_coco_rm1_vgg.train.freezeLM.prototxt" 2 | random_seed: 1701 3 | average_loss: 100 4 | snapshot_prefix: "snapshots/dcc_coco_rm1_vgg.delta_freezeLM" 5 | clip_gradients: 10 6 | max_iter: 110000 7 | stepsize: 20000 8 | base_lr: 0.01 9 | snapshot: 5000 10 | momentum: 0.9 11 | solver_mode: GPU 12 | lr_policy: "step" 13 | weight_decay: 0.0 14 | display: 10 15 | gamma: 0.5 16 | -------------------------------------------------------------------------------- /prototxts/dcc_coco_rm1_vgg.solver.prototxt: -------------------------------------------------------------------------------- 1 | train_net: "prototxts/dcc_coco_rm1_vgg.train.prototxt" 2 | random_seed: 1701 3 | average_loss: 100 4 | snapshot_prefix: "snapshots/dcc_coco_rm1_vgg.solver.prototxt" 5 | clip_gradients: 10 6 | max_iter: 110000 7 | stepsize: 20000 8 | base_lr: 0.01 9 | snapshot: 5000 10 | momentum: 0.9 11 | solver_mode: GPU 12 | lr_policy: "step" 13 | weight_decay: 0.0 14 | display: 10 15 | gamma: 0.5 16 | -------------------------------------------------------------------------------- /prototxts/dcc_coco_rm1_vgg.train.deltaLM.prototxt: -------------------------------------------------------------------------------- 1 | layer { 2 | name: "data" 3 | type: "Python" 4 | top: "input_sentence" 5 | top: "target_sentence" 6 | top: "cont_sentence" 7 | top: "data" 8 | python_param { 9 | module: "python_data_layers" 10 | layer: "pairedCaptionData" 11 | param_str: "{'caption_json': 'annotations/captions_no_caption_rm_eightCluster_train2014.json', 'feature_file': 'lexical_features/vgg_feats.attributes_JJ100_NN300_VB100_coco_471_eightCluster_0223_iter_80000.caffemodel.train.h5', 'vocabulary': 'utils/vocabulary/vocabulary.txt', 'batch_size': 100, 'stream_size': 20, 'top_names': ['input_sentence', 'target_sentence', 'cont_sentence', 'data']}" 12 | } 13 | } 14 | 
layer { 15 | name: "embedding" 16 | type: "Embed" 17 | bottom: "input_sentence" 18 | top: "embedded_input_sentence_1" 19 | param { 20 | lr_mult: 1 21 | } 22 | embed_param { 23 | bias_term: false 24 | input_dim: 8801 25 | num_output: 512 26 | weight_filler { 27 | type: "uniform" 28 | min: -0.08 29 | max: 0.08 30 | } 31 | } 32 | } 33 | layer { 34 | name: "embedding2" 35 | type: "InnerProduct" 36 | bottom: "embedded_input_sentence_1" 37 | top: "embedded_input_sentence" 38 | param { 39 | lr_mult: 1 40 | decay_mult: 1 41 | } 42 | param { 43 | lr_mult: 1 44 | decay_mult: 2 45 | } 46 | inner_product_param { 47 | num_output: 512 48 | weight_filler { 49 | type: "uniform" 50 | min: -0.08 51 | max: 0.08 52 | } 53 | bias_filler { 54 | type: "constant" 55 | value: 0 56 | } 57 | axis: 2 58 | } 59 | } 60 | layer { 61 | name: "lstm1" 62 | type: "LSTM" 63 | bottom: "embedded_input_sentence" 64 | bottom: "cont_sentence" 65 | top: "lstm1" 66 | param { 67 | lr_mult: 1 68 | decay_mult: 1 69 | } 70 | param { 71 | lr_mult: 1 72 | decay_mult: 1 73 | } 74 | param { 75 | lr_mult: 1 76 | decay_mult: 1 77 | } 78 | recurrent_param { 79 | num_output: 512 80 | weight_filler { 81 | type: "uniform" 82 | min: -0.08 83 | max: 0.08 84 | } 85 | bias_filler { 86 | type: "constant" 87 | value: 0 88 | } 89 | } 90 | } 91 | layer { 92 | name: "reshape-fc8" 93 | type: "Reshape" 94 | bottom: "data" 95 | top: "reshape-fc8" 96 | reshape_param{ 97 | shape{ 98 | dim: 1 99 | dim: 100 100 | dim: -1 101 | } 102 | } 103 | } 104 | layer { 105 | name: "tile-fc8" 106 | type: "Tile" 107 | tile_param{ 108 | axis: 0 109 | tiles: 20 110 | } 111 | bottom: "reshape-fc8" 112 | top: "tile-fc8" 113 | } 114 | layer { 115 | name: "concat-lm" 116 | type: "Concat" 117 | concat_param{ 118 | axis: 2 119 | } 120 | bottom: "embedded_input_sentence" 121 | bottom: "lstm1" 122 | top: "concat_lm" 123 | } 124 | layer { 125 | name: "predict" 126 | type: "InnerProduct" 127 | bottom: "concat_lm" 128 | top: "predict-lm" 129 | param { 130 | 
lr_mult: 1 131 | decay_mult: 1 132 | } 133 | param { 134 | lr_mult: 1 135 | decay_mult: 2 136 | } 137 | inner_product_param { 138 | num_output: 8801 139 | weight_filler { 140 | type: "uniform" 141 | min: -0.08 142 | max: 0.08 143 | } 144 | bias_filler { 145 | type: "constant" 146 | value: 0 147 | } 148 | axis: 2 149 | } 150 | } 151 | layer { 152 | name: "predict-im" 153 | type: "InnerProduct" 154 | bottom: "tile-fc8" 155 | top: "predict-im" 156 | param { 157 | lr_mult: 1 158 | decay_mult: 1 159 | } 160 | inner_product_param { 161 | bias_term: false 162 | num_output: 8801 163 | weight_filler { 164 | type: "gaussian" 165 | std: 0.01 166 | } 167 | bias_filler { 168 | type: "constant" 169 | value: 0 170 | } 171 | axis: 2 172 | } 173 | } 174 | layer { 175 | name: "add-predictions" 176 | type: "Eltwise" 177 | bottom: "predict-lm" 178 | bottom: "predict-im" 179 | top: "predict-multimodal" 180 | eltwise_param { 181 | operation: SUM 182 | } 183 | } 184 | layer { 185 | name: "cross_entropy_loss" 186 | type: "SoftmaxWithLoss" 187 | bottom: "predict-multimodal" 188 | bottom: "target_sentence" 189 | top: "cross_entropy_loss" 190 | loss_weight: 20 191 | loss_param { 192 | ignore_label: -1 193 | } 194 | softmax_param { 195 | axis: 2 196 | } 197 | } 198 | layer { 199 | name: "accuracy" 200 | type: "Accuracy" 201 | bottom: "predict-multimodal" 202 | bottom: "target_sentence" 203 | top: "accuracy" 204 | include { phase: TEST } 205 | accuracy_param { 206 | axis: 2 207 | ignore_label: -1 208 | } 209 | } 210 | -------------------------------------------------------------------------------- /prototxts/dcc_coco_rm1_vgg.train.freezeLM.prototxt: -------------------------------------------------------------------------------- 1 | layer { 2 | name: "data" 3 | type: "Python" 4 | top: "input_sentence" 5 | top: "target_sentence" 6 | top: "cont_sentence" 7 | top: "data" 8 | python_param { 9 | module: "python_data_layers" 10 | layer: "pairedCaptionData" 11 | param_str: "{'caption_json': 
'annotations/captions_no_caption_rm_eightCluster_train2014.json', 'feature_file': 'lexical_features/vgg_feats.attributes_JJ100_NN300_VB100_coco_471_eightCluster_0223_iter_80000.caffemodel.train.h5', 'vocabulary': 'utils/vocabulary/vocabulary.txt', 'batch_size': 100, 'stream_size': 20, 'top_names': ['input_sentence', 'target_sentence', 'cont_sentence', 'data']}" 12 | } 13 | } 14 | layer { 15 | name: "reshape-data" 16 | type: "Reshape" 17 | bottom: "data" 18 | top: "reshape-data" 19 | reshape_param { 20 | shape { 21 | dim: 1 22 | dim: -1 23 | dim: 471 24 | } 25 | } 26 | } 27 | layer { 28 | name: "embedding" 29 | type: "Embed" 30 | bottom: "input_sentence" 31 | top: "embedding" 32 | param { 33 | lr_mult: 0 34 | decay_mult: 0 35 | } 36 | embed_param { 37 | num_output: 512 38 | input_dim: 8801 39 | bias_term: false 40 | } 41 | } 42 | layer { 43 | name: "embedding2" 44 | type: "InnerProduct" 45 | bottom: "embedding" 46 | top: "embedding2" 47 | param { 48 | lr_mult: 0 49 | decay_mult: 0 50 | } 51 | inner_product_param { 52 | num_output: 512 53 | axis: 2 54 | } 55 | } 56 | layer { 57 | name: "lstm1" 58 | type: "LSTM" 59 | bottom: "embedding2" 60 | bottom: "cont_sentence" 61 | top: "lstm1" 62 | param { 63 | lr_mult: 0 64 | decay_mult: 0 65 | } 66 | recurrent_param { 67 | num_output: 512 68 | } 69 | } 70 | layer { 71 | name: "concat-lm" 72 | type: "Concat" 73 | bottom: "embedding2" 74 | bottom: "lstm1" 75 | top: "concat-lm" 76 | concat_param { 77 | axis: 2 78 | } 79 | } 80 | layer { 81 | name: "predict" 82 | type: "InnerProduct" 83 | bottom: "concat-lm" 84 | top: "predict-lm" 85 | param { 86 | lr_mult: 0 87 | decay_mult: 0 88 | } 89 | param { 90 | lr_mult: 0 91 | decay_mult: 0 92 | } 93 | inner_product_param { 94 | num_output: 8801 95 | weight_filler { 96 | type: "uniform" 97 | min: -0.08 98 | max: 0.08 99 | } 100 | bias_filler { 101 | type: "constant" 102 | value: 0 103 | } 104 | axis: 2 105 | } 106 | } 107 | layer { 108 | name: "tile-data" 109 | type: "Tile" 110 | bottom: 
"reshape-data" 111 | top: "tile-data" 112 | tile_param { 113 | axis: 0 114 | tiles: 20 115 | } 116 | } 117 | layer { 118 | name: "predict-im" 119 | type: "InnerProduct" 120 | bottom: "tile-data" 121 | top: "predict-im" 122 | param { 123 | lr_mult: 1 124 | decay_mult: 1 125 | } 126 | inner_product_param { 127 | num_output: 8801 128 | bias_term: false 129 | weight_filler { 130 | type: "uniform" 131 | min: -0.08 132 | max: 0.08 133 | } 134 | axis: 2 135 | } 136 | } 137 | layer { 138 | name: "predict-multimodal" 139 | type: "Eltwise" 140 | bottom: "predict-lm" 141 | bottom: "predict-im" 142 | top: "predict-multimodal" 143 | eltwise_param { 144 | operation: SUM 145 | } 146 | } 147 | layer { 148 | name: "cross-entropy-loss" 149 | type: "SoftmaxWithLoss" 150 | bottom: "predict-multimodal" 151 | bottom: "target_sentence" 152 | top: "cross-entropy-loss" 153 | loss_weight: 20 154 | loss_param { 155 | ignore_label: -1 156 | } 157 | softmax_param { 158 | axis: 2 159 | } 160 | } 161 | 162 | -------------------------------------------------------------------------------- /prototxts/dcc_coco_rm1_vgg.train.prototxt: -------------------------------------------------------------------------------- 1 | layer { 2 | name: "data" 3 | type: "Python" 4 | top: "input_sentence" 5 | top: "target_sentence" 6 | top: "cont_sentence" 7 | top: "data" 8 | python_param { 9 | module: "python_data_layers" 10 | layer: "pairedCaptionData" 11 | param_str: "{'caption_json': 'annotations/captions_no_caption_rm_eightCluster_train2014.json', 'feature_file': 'lexical_features/vgg_feats.attributes_JJ100_NN300_VB100_coco_471_eightCluster_0223_iter_80000.caffemodel.train.h5', 'vocabulary': 'utils/vocabulary/vocabulary.txt', 'batch_size': 100, 'stream_size': 20, 'top_names': ['input_sentence', 'target_sentence', 'cont_sentence', 'data']}" 12 | } 13 | } 14 | layer { 15 | name: "reshape-data" 16 | type: "Reshape" 17 | bottom: "data" 18 | top: "reshape-data" 19 | reshape_param { 20 | shape { 21 | dim: 1 22 | dim: 
-1 23 | dim: 471 24 | } 25 | } 26 | } 27 | layer { 28 | name: "embedding" 29 | type: "Embed" 30 | bottom: "input_sentence" 31 | top: "embedding" 32 | param { 33 | lr_mult: 0 34 | decay_mult: 0 35 | } 36 | embed_param { 37 | num_output: 512 38 | input_dim: 8801 39 | bias_term: false 40 | } 41 | } 42 | layer { 43 | name: "embedding2" 44 | type: "InnerProduct" 45 | bottom: "embedding" 46 | top: "embedding2" 47 | param { 48 | lr_mult: 0 49 | decay_mult: 0 50 | } 51 | inner_product_param { 52 | num_output: 512 53 | axis: 2 54 | } 55 | } 56 | layer { 57 | name: "lstm1" 58 | type: "LSTM" 59 | bottom: "embedding2" 60 | bottom: "cont_sentence" 61 | top: "lstm1" 62 | param { 63 | lr_mult: 0 64 | decay_mult: 0 65 | } 66 | recurrent_param { 67 | num_output: 512 68 | } 69 | } 70 | layer { 71 | name: "concat-lm" 72 | type: "Concat" 73 | bottom: "embedding2" 74 | bottom: "lstm1" 75 | top: "concat-lm" 76 | concat_param { 77 | axis: 2 78 | } 79 | } 80 | layer { 81 | name: "predict-lm" 82 | type: "InnerProduct" 83 | bottom: "concat-lm" 84 | top: "predict-lm" 85 | param { 86 | lr_mult: 1 87 | decay_mult: 1 88 | } 89 | param { 90 | lr_mult: 1 91 | decay_mult: 1 92 | } 93 | inner_product_param { 94 | num_output: 8801 95 | weight_filler { 96 | type: "uniform" 97 | min: -0.08 98 | max: 0.08 99 | } 100 | bias_filler { 101 | type: "constant" 102 | value: 0 103 | } 104 | axis: 2 105 | } 106 | } 107 | layer { 108 | name: "tile-data" 109 | type: "Tile" 110 | bottom: "reshape-data" 111 | top: "tile-data" 112 | tile_param { 113 | axis: 0 114 | tiles: 20 115 | } 116 | } 117 | layer { 118 | name: "predict-im" 119 | type: "InnerProduct" 120 | bottom: "tile-data" 121 | top: "predict-im" 122 | param { 123 | lr_mult: 1 124 | decay_mult: 1 125 | } 126 | inner_product_param { 127 | num_output: 8801 128 | bias_term: false 129 | weight_filler { 130 | type: "uniform" 131 | min: -0.08 132 | max: 0.08 133 | } 134 | axis: 2 135 | } 136 | } 137 | layer { 138 | name: "predict-multimodal" 139 | type: "Eltwise" 
140 | bottom: "predict-lm" 141 | bottom: "predict-im" 142 | top: "predict-multimodal" 143 | eltwise_param { 144 | operation: SUM 145 | } 146 | } 147 | layer { 148 | name: "cross-entropy-loss" 149 | type: "SoftmaxWithLoss" 150 | bottom: "predict-multimodal" 151 | bottom: "target_sentence" 152 | top: "cross-entropy-loss" 153 | loss_weight: 20 154 | loss_param { 155 | ignore_label: -1 156 | } 157 | softmax_param { 158 | axis: 2 159 | } 160 | } 161 | 162 | -------------------------------------------------------------------------------- /prototxts/dcc_imagenet_rm1_vgg.solver.prototxt: -------------------------------------------------------------------------------- 1 | train_net: "prototxts/dcc_imagenet_rm1_vgg.train.prototxt" 2 | random_seed: 1701 3 | average_loss: 100 4 | snapshot_prefix: "snapshots/dcc_imagenet_rm1_vgg.solver.prototxt" 5 | clip_gradients: 10 6 | max_iter: 110000 7 | stepsize: 20000 8 | base_lr: 0.01 9 | snapshot: 5000 10 | momentum: 0.9 11 | solver_mode: GPU 12 | lr_policy: "step" 13 | weight_decay: 0.0 14 | display: 10 15 | gamma: 0.5 16 | -------------------------------------------------------------------------------- /prototxts/dcc_imagenet_rm1_vgg.train.prototxt: -------------------------------------------------------------------------------- 1 | layer { 2 | name: "data" 3 | type: "Python" 4 | top: "input_sentence" 5 | top: "target_sentence" 6 | top: "cont_sentence" 7 | top: "data" 8 | python_param { 9 | module: "python_data_layers" 10 | layer: "pairedCaptionData" 11 | param_str: "{'caption_json': 'annotations/captions_no_caption_rm_eightCluster_train2014.json', 'feature_file': 'lexical_features/vgg_feats.attributes_JJ100_NN300_VB100_clusterEight_imagenet_vgg_0112_iter_80000.caffemodel.train.h5', 'vocabulary': 'utils/vocabulary/vocabulary.txt', 'batch_size': 100, 'stream_size': 20, 'top_names': ['input_sentence', 'target_sentence', 'cont_sentence', 'data']}" 12 | } 13 | } 14 | layer { 15 | name: "reshape-data" 16 | type: "Reshape" 17 | bottom: 
"data" 18 | top: "reshape-data" 19 | reshape_param { 20 | shape { 21 | dim: 1 22 | dim: -1 23 | dim: 471 24 | } 25 | } 26 | } 27 | layer { 28 | name: "embedding" 29 | type: "Embed" 30 | bottom: "input_sentence" 31 | top: "embedding" 32 | param { 33 | lr_mult: 0 34 | decay_mult: 0 35 | } 36 | embed_param { 37 | num_output: 512 38 | input_dim: 8801 39 | bias_term: false 40 | } 41 | } 42 | layer { 43 | name: "embedding2" 44 | type: "InnerProduct" 45 | bottom: "embedding" 46 | top: "embedding2" 47 | param { 48 | lr_mult: 0 49 | decay_mult: 0 50 | } 51 | inner_product_param { 52 | num_output: 512 53 | axis: 2 54 | } 55 | } 56 | layer { 57 | name: "lstm1" 58 | type: "LSTM" 59 | bottom: "embedding2" 60 | bottom: "cont_sentence" 61 | top: "lstm1" 62 | param { 63 | lr_mult: 0 64 | decay_mult: 0 65 | } 66 | recurrent_param { 67 | num_output: 512 68 | } 69 | } 70 | layer { 71 | name: "concat-lm" 72 | type: "Concat" 73 | bottom: "embedding2" 74 | bottom: "lstm1" 75 | top: "concat-lm" 76 | concat_param { 77 | axis: 2 78 | } 79 | } 80 | layer { 81 | name: "predict-lm" 82 | type: "InnerProduct" 83 | bottom: "concat-lm" 84 | top: "predict-lm" 85 | param { 86 | lr_mult: 1 87 | decay_mult: 1 88 | } 89 | param { 90 | lr_mult: 1 91 | decay_mult: 1 92 | } 93 | inner_product_param { 94 | num_output: 8801 95 | weight_filler { 96 | type: "uniform" 97 | min: -0.08 98 | max: 0.08 99 | } 100 | bias_filler { 101 | type: "constant" 102 | value: 0 103 | } 104 | axis: 2 105 | } 106 | } 107 | layer { 108 | name: "tile-data" 109 | type: "Tile" 110 | bottom: "reshape-data" 111 | top: "tile-data" 112 | tile_param { 113 | axis: 0 114 | tiles: 20 115 | } 116 | } 117 | layer { 118 | name: "predict-im" 119 | type: "InnerProduct" 120 | bottom: "tile-data" 121 | top: "predict-im" 122 | param { 123 | lr_mult: 1 124 | decay_mult: 1 125 | } 126 | inner_product_param { 127 | num_output: 8801 128 | bias_term: false 129 | weight_filler { 130 | type: "uniform" 131 | min: -0.08 132 | max: 0.08 133 | } 134 | axis: 
2 135 | } 136 | } 137 | layer { 138 | name: "predict-multimodal" 139 | type: "Eltwise" 140 | bottom: "predict-lm" 141 | bottom: "predict-im" 142 | top: "predict-multimodal" 143 | eltwise_param { 144 | operation: SUM 145 | } 146 | } 147 | layer { 148 | name: "cross-entropy-loss" 149 | type: "SoftmaxWithLoss" 150 | bottom: "predict-multimodal" 151 | bottom: "target_sentence" 152 | top: "cross-entropy-loss" 153 | loss_weight: 20 154 | loss_param { 155 | ignore_label: -1 156 | } 157 | softmax_param { 158 | axis: 2 159 | } 160 | } 161 | 162 | -------------------------------------------------------------------------------- /prototxts/dcc_oodLM_rm1_vgg.im2txt.solver.prototxt: -------------------------------------------------------------------------------- 1 | train_net: "prototxts/dcc_oodLM_rm1_vgg.train.prototxt" 2 | random_seed: 1701 3 | average_loss: 100 4 | snapshot_prefix: "snapshots/dcc_oodLM_rm1_vgg.im2txt.solver_0409" 5 | clip_gradients: 10 6 | max_iter: 110000 7 | stepsize: 20000 8 | base_lr: 0.01 9 | snapshot: 5000 10 | momentum: 0.9 11 | solver_mode: GPU 12 | lr_policy: "step" 13 | weight_decay: 0.0 14 | display: 10 15 | gamma: 0.5 16 | -------------------------------------------------------------------------------- /prototxts/dcc_oodLM_rm1_vgg.surf.solver.prototxt: -------------------------------------------------------------------------------- 1 | train_net: "prototxts/dcc_oodLM_rm1_vgg.train.prototxt" 2 | random_seed: 1701 3 | average_loss: 100 4 | snapshot_prefix: "snapshots/dcc_oodLM_rm1_vgg.surf.solver" 5 | clip_gradients: 10 6 | max_iter: 110000 7 | stepsize: 20000 8 | base_lr: 0.01 9 | snapshot: 5000 10 | momentum: 0.9 11 | solver_mode: GPU 12 | lr_policy: "step" 13 | weight_decay: 0.0 14 | display: 10 15 | gamma: 0.5 16 | -------------------------------------------------------------------------------- /prototxts/dcc_oodLM_rm1_vgg.train.prototxt: -------------------------------------------------------------------------------- 1 | layer { 2 | name: 
"data" 3 | type: "Python" 4 | top: "input_sentence" 5 | top: "target_sentence" 6 | top: "cont_sentence" 7 | top: "data" 8 | python_param { 9 | module: "python_data_layers" 10 | layer: "pairedCaptionData" 11 | param_str: "{'caption_json': 'annotations/captions_no_caption_rm_eightCluster_train2014.json', 'feature_file': 'lexical_features/vgg_feats.attributes_JJ100_NN300_VB100_clusterEight_imagenet_vgg_0112_iter_80000.caffemodel.train.h5', 'vocabulary': 'utils/vocabulary/yt_coco_surface_80k_vocab.txt', 'batch_size': 100, 'stream_size': 20, 'top_names': ['input_sentence', 'target_sentence', 'cont_sentence', 'data']}" 12 | } 13 | } 14 | layer { 15 | name: "reshape-data" 16 | type: "Reshape" 17 | bottom: "data" 18 | top: "reshape-data" 19 | reshape_param { 20 | shape { 21 | dim: 1 22 | dim: -1 23 | dim: 471 24 | } 25 | } 26 | } 27 | layer { 28 | name: "embedding" 29 | type: "Embed" 30 | bottom: "input_sentence" 31 | top: "embedding" 32 | param { 33 | lr_mult: 0 34 | decay_mult: 0 35 | } 36 | embed_param { 37 | num_output: 512 38 | input_dim: 80002 39 | bias_term: false 40 | } 41 | } 42 | layer { 43 | name: "embedding2" 44 | type: "InnerProduct" 45 | bottom: "embedding" 46 | top: "embedding2" 47 | param { 48 | lr_mult: 0 49 | decay_mult: 0 50 | } 51 | inner_product_param { 52 | num_output: 512 53 | axis: 2 54 | } 55 | } 56 | layer { 57 | name: "lstm1" 58 | type: "LSTM" 59 | bottom: "embedding2" 60 | bottom: "cont_sentence" 61 | top: "lstm1" 62 | param { 63 | lr_mult: 0 64 | decay_mult: 0 65 | } 66 | recurrent_param { 67 | num_output: 512 68 | } 69 | } 70 | layer { 71 | name: "concat-lm" 72 | type: "Concat" 73 | bottom: "embedding2" 74 | bottom: "lstm1" 75 | top: "concat-lm" 76 | concat_param { 77 | axis: 2 78 | } 79 | } 80 | layer { 81 | name: "predict-lm" 82 | type: "InnerProduct" 83 | bottom: "concat-lm" 84 | top: "predict-lm" 85 | param { 86 | lr_mult: 1 87 | decay_mult: 1 88 | } 89 | param { 90 | lr_mult: 1 91 | decay_mult: 1 92 | } 93 | inner_product_param { 94 | 
num_output: 80002 95 | weight_filler { 96 | type: "uniform" 97 | min: -0.08 98 | max: 0.08 99 | } 100 | bias_filler { 101 | type: "constant" 102 | value: 0 103 | } 104 | axis: 2 105 | } 106 | } 107 | layer { 108 | name: "tile-data" 109 | type: "Tile" 110 | bottom: "reshape-data" 111 | top: "tile-data" 112 | tile_param { 113 | axis: 0 114 | tiles: 20 115 | } 116 | } 117 | layer { 118 | name: "predict-im" 119 | type: "InnerProduct" 120 | bottom: "tile-data" 121 | top: "predict-im" 122 | param { 123 | lr_mult: 1 124 | decay_mult: 1 125 | } 126 | inner_product_param { 127 | num_output: 80002 128 | bias_term: false 129 | weight_filler { 130 | type: "uniform" 131 | min: -0.08 132 | max: 0.08 133 | } 134 | axis: 2 135 | } 136 | } 137 | layer { 138 | name: "predict-multimodal" 139 | type: "Eltwise" 140 | bottom: "predict-lm" 141 | bottom: "predict-im" 142 | top: "predict-multimodal" 143 | eltwise_param { 144 | operation: SUM 145 | } 146 | } 147 | layer { 148 | name: "cross-entropy-loss" 149 | type: "SoftmaxWithLoss" 150 | bottom: "predict-multimodal" 151 | bottom: "target_sentence" 152 | top: "cross-entropy-loss" 153 | loss_weight: 20 154 | loss_param { 155 | ignore_label: -1 156 | } 157 | softmax_param { 158 | axis: 2 159 | } 160 | } 161 | 162 | -------------------------------------------------------------------------------- /prototxts/dcc_vgg.80k.deploy.prototxt: -------------------------------------------------------------------------------- 1 | layer { 2 | name: "data" 3 | type: "DummyData" 4 | top: "data" 5 | dummy_data_param { 6 | data_filler { 7 | type: "constant" 8 | value: 1 9 | } 10 | shape { 11 | dim: 10 12 | dim: 471 13 | } 14 | } 15 | } 16 | layer { 17 | name: "reshape-data" 18 | type: "Reshape" 19 | bottom: "data" 20 | top: "reshape-data" 21 | reshape_param { 22 | shape { 23 | dim: 1 24 | dim: -1 25 | dim: 471 26 | } 27 | } 28 | } 29 | 30 | -------------------------------------------------------------------------------- 
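Every train net above ends the same way: the language-model logits (predict-lm) and image logits (predict-im) are summed element-wise by the predict-multimodal Eltwise layer, then normalised by a softmax over the vocabulary axis. A NumPy sketch of that fusion step (shapes are illustrative — the real nets use a 20-step stream and an 8,801- or 80,002-word vocabulary):

```python
import numpy as np

def multimodal_softmax(lm_logits, im_logits):
    # Eltwise SUM of the two logit streams, then a numerically
    # stable softmax over the last (vocabulary) axis.
    logits = lm_logits + im_logits
    e = np.exp(logits - logits.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

# Toy shapes: 2 timesteps x 5 "words" instead of the real vocabulary.
probs = multimodal_softmax(np.zeros((2, 5)), np.ones((2, 5)))
```

Because the fusion happens in logit space, each word's score is simply the sum of one row from the predict-lm and predict-im inner products, which is what makes row-wise weight transfer for novel words possible.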
/prototxts/dcc_vgg.80k.wtd.imagenet.prototxt: -------------------------------------------------------------------------------- 1 | layer { 2 | name: "cont_sentence" 3 | type: "DummyData" 4 | top: "cont_sentence" 5 | dummy_data_param { 6 | data_filler { 7 | type: "constant" 8 | value: 1 9 | } 10 | shape { 11 | dim: 1 12 | dim: 50 13 | } 14 | } 15 | } 16 | layer { 17 | name: "input_sentence" 18 | type: "DummyData" 19 | top: "input_sentence" 20 | dummy_data_param { 21 | data_filler { 22 | type: "constant" 23 | value: 1 24 | } 25 | shape { 26 | dim: 1 27 | dim: 50 28 | dim: 1 29 | } 30 | } 31 | } 32 | layer { 33 | name: "image_features" 34 | type: "DummyData" 35 | top: "image_features" 36 | dummy_data_param { 37 | data_filler { 38 | type: "constant" 39 | value: 1 40 | } 41 | shape { 42 | dim: 50 43 | dim: 1117 44 | } 45 | } 46 | } 47 | layer { 48 | name: "embedding" 49 | type: "Embed" 50 | bottom: "input_sentence" 51 | top: "embedding" 52 | param { 53 | lr_mult: 0 54 | decay_mult: 0 55 | } 56 | embed_param { 57 | num_output: 512 58 | input_dim: 80002 59 | bias_term: false 60 | } 61 | } 62 | layer { 63 | name: "embedding2" 64 | type: "InnerProduct" 65 | bottom: "embedding" 66 | top: "embedding2" 67 | param { 68 | lr_mult: 0 69 | decay_mult: 0 70 | } 71 | inner_product_param { 72 | num_output: 512 73 | axis: 2 74 | } 75 | } 76 | layer { 77 | name: "lstm1" 78 | type: "LSTM" 79 | bottom: "embedding2" 80 | bottom: "cont_sentence" 81 | top: "lstm1" 82 | param { 83 | lr_mult: 0 84 | decay_mult: 0 85 | } 86 | recurrent_param { 87 | num_output: 512 88 | } 89 | } 90 | layer { 91 | name: "concat-lm" 92 | type: "Concat" 93 | bottom: "embedding2" 94 | bottom: "lstm1" 95 | top: "concat-lm" 96 | concat_param { 97 | axis: 2 98 | } 99 | } 100 | layer { 101 | name: "predict-lm" 102 | type: "InnerProduct" 103 | bottom: "concat-lm" 104 | top: "predict-lm" 105 | param { 106 | lr_mult: 1 107 | decay_mult: 1 108 | } 109 | param { 110 | lr_mult: 1 111 | decay_mult: 1 112 | } 113 | 
inner_product_param { 114 | num_output: 80002 115 | weight_filler { 116 | type: "uniform" 117 | min: -0.08 118 | max: 0.08 119 | } 120 | bias_filler { 121 | type: "constant" 122 | value: 0 123 | } 124 | axis: 2 125 | } 126 | } 127 | layer { 128 | name: "reshape-data" 129 | type: "Reshape" 130 | bottom: "image_features" 131 | top: "reshape-data" 132 | reshape_param { 133 | axis: 0 134 | num_axes: 0 135 | shape { 136 | dim: 1 137 | } 138 | } 139 | } 140 | layer { 141 | name: "predict-im" 142 | type: "InnerProduct" 143 | bottom: "reshape-data" 144 | top: "predict-im" 145 | param { 146 | lr_mult: 1 147 | decay_mult: 1 148 | } 149 | inner_product_param { 150 | num_output: 80002 151 | bias_term: false 152 | weight_filler { 153 | type: "uniform" 154 | min: -0.08 155 | max: 0.08 156 | } 157 | axis: 2 158 | } 159 | } 160 | layer { 161 | name: "predict-multimodal" 162 | type: "Eltwise" 163 | bottom: "predict-lm" 164 | bottom: "predict-im" 165 | top: "predict-multimodal" 166 | eltwise_param { 167 | operation: SUM 168 | } 169 | } 170 | layer { 171 | name: "predict" 172 | type: "Softmax" 173 | bottom: "predict-multimodal" 174 | top: "predict" 175 | softmax_param { 176 | axis: 2 177 | } 178 | } 179 | 180 | -------------------------------------------------------------------------------- /prototxts/dcc_vgg.80k.wtd.prototxt: -------------------------------------------------------------------------------- 1 | layer { 2 | name: "cont_sentence" 3 | type: "DummyData" 4 | top: "cont_sentence" 5 | dummy_data_param { 6 | data_filler { 7 | type: "constant" 8 | value: 1 9 | } 10 | shape { 11 | dim: 1 12 | dim: 50 13 | } 14 | } 15 | } 16 | layer { 17 | name: "input_sentence" 18 | type: "DummyData" 19 | top: "input_sentence" 20 | dummy_data_param { 21 | data_filler { 22 | type: "constant" 23 | value: 1 24 | } 25 | shape { 26 | dim: 1 27 | dim: 50 28 | dim: 1 29 | } 30 | } 31 | } 32 | layer { 33 | name: "image_features" 34 | type: "DummyData" 35 | top: "image_features" 36 | dummy_data_param { 
37 | data_filler { 38 | type: "constant" 39 | value: 1 40 | } 41 | shape { 42 | dim: 50 43 | dim: 471 44 | } 45 | } 46 | } 47 | layer { 48 | name: "embedding" 49 | type: "Embed" 50 | bottom: "input_sentence" 51 | top: "embedding" 52 | param { 53 | lr_mult: 0 54 | decay_mult: 0 55 | } 56 | embed_param { 57 | num_output: 512 58 | input_dim: 80002 59 | bias_term: false 60 | } 61 | } 62 | layer { 63 | name: "embedding2" 64 | type: "InnerProduct" 65 | bottom: "embedding" 66 | top: "embedding2" 67 | param { 68 | lr_mult: 0 69 | decay_mult: 0 70 | } 71 | inner_product_param { 72 | num_output: 512 73 | axis: 2 74 | } 75 | } 76 | layer { 77 | name: "lstm1" 78 | type: "LSTM" 79 | bottom: "embedding2" 80 | bottom: "cont_sentence" 81 | top: "lstm1" 82 | param { 83 | lr_mult: 0 84 | decay_mult: 0 85 | } 86 | recurrent_param { 87 | num_output: 512 88 | } 89 | } 90 | layer { 91 | name: "concat-lm" 92 | type: "Concat" 93 | bottom: "embedding2" 94 | bottom: "lstm1" 95 | top: "concat-lm" 96 | concat_param { 97 | axis: 2 98 | } 99 | } 100 | layer { 101 | name: "predict-lm" 102 | type: "InnerProduct" 103 | bottom: "concat-lm" 104 | top: "predict-lm" 105 | param { 106 | lr_mult: 1 107 | decay_mult: 1 108 | } 109 | param { 110 | lr_mult: 1 111 | decay_mult: 1 112 | } 113 | inner_product_param { 114 | num_output: 80002 115 | weight_filler { 116 | type: "uniform" 117 | min: -0.08 118 | max: 0.08 119 | } 120 | bias_filler { 121 | type: "constant" 122 | value: 0 123 | } 124 | axis: 2 125 | } 126 | } 127 | layer { 128 | name: "reshape-data" 129 | type: "Reshape" 130 | bottom: "image_features" 131 | top: "reshape-data" 132 | reshape_param { 133 | axis: 0 134 | num_axes: 0 135 | shape { 136 | dim: 1 137 | } 138 | } 139 | } 140 | layer { 141 | name: "predict-im" 142 | type: "InnerProduct" 143 | bottom: "reshape-data" 144 | top: "predict-im" 145 | param { 146 | lr_mult: 1 147 | decay_mult: 1 148 | } 149 | inner_product_param { 150 | num_output: 80002 151 | bias_term: false 152 | weight_filler { 
153 | type: "uniform" 154 | min: -0.08 155 | max: 0.08 156 | } 157 | axis: 2 158 | } 159 | } 160 | layer { 161 | name: "predict-multimodal" 162 | type: "Eltwise" 163 | bottom: "predict-lm" 164 | bottom: "predict-im" 165 | top: "predict-multimodal" 166 | eltwise_param { 167 | operation: SUM 168 | } 169 | } 170 | layer { 171 | name: "predict" 172 | type: "Softmax" 173 | bottom: "predict-multimodal" 174 | top: "predict" 175 | softmax_param { 176 | axis: 2 177 | } 178 | } 179 | 180 | -------------------------------------------------------------------------------- /prototxts/dcc_vgg.delta.wtd.prototxt: -------------------------------------------------------------------------------- 1 | layer { 2 | name: "cont_sentence" 3 | type: "DummyData" 4 | top: "cont_sentence" 5 | dummy_data_param { 6 | data_filler { 7 | type: "constant" 8 | value: 1 9 | } 10 | shape { 11 | dim: 1 12 | dim: 100 13 | } 14 | } 15 | } 16 | layer { 17 | name: "input_sentence" 18 | type: "DummyData" 19 | top: "input_sentence" 20 | dummy_data_param { 21 | data_filler { 22 | type: "constant" 23 | value: 1 24 | } 25 | shape { 26 | dim: 1 27 | dim: 100 28 | dim: 1 29 | } 30 | } 31 | } 32 | layer { 33 | name: "image_features" 34 | type: "DummyData" 35 | top: "image_features" 36 | dummy_data_param { 37 | data_filler { 38 | type: "constant" 39 | value: 1 40 | } 41 | shape { 42 | dim: 100 43 | dim: 471 44 | } 45 | } 46 | } 47 | layer { 48 | name: "embedding" 49 | type: "Embed" 50 | bottom: "input_sentence" 51 | top: "embedding" 52 | param { 53 | lr_mult: 0 54 | decay_mult: 0 55 | } 56 | embed_param { 57 | num_output: 512 58 | input_dim: 8801 59 | bias_term: false 60 | } 61 | } 62 | layer { 63 | name: "embedding2" 64 | type: "InnerProduct" 65 | bottom: "embedding" 66 | top: "embedding2" 67 | param { 68 | lr_mult: 0 69 | decay_mult: 0 70 | } 71 | inner_product_param { 72 | num_output: 512 73 | axis: 2 74 | } 75 | } 76 | layer { 77 | name: "lstm1" 78 | type: "LSTM" 79 | bottom: "embedding2" 80 | bottom: 
"cont_sentence" 81 | top: "lstm1" 82 | param { 83 | lr_mult: 0 84 | decay_mult: 0 85 | } 86 | recurrent_param { 87 | num_output: 512 88 | } 89 | } 90 | layer { 91 | name: "concat-lm" 92 | type: "Concat" 93 | bottom: "embedding2" 94 | bottom: "lstm1" 95 | top: "concat-lm" 96 | concat_param { 97 | axis: 2 98 | } 99 | } 100 | layer { 101 | name: "predict" 102 | type: "InnerProduct" 103 | bottom: "concat-lm" 104 | top: "predict-lm" 105 | param { 106 | lr_mult: 1 107 | decay_mult: 1 108 | } 109 | param { 110 | lr_mult: 1 111 | decay_mult: 1 112 | } 113 | inner_product_param { 114 | num_output: 8801 115 | weight_filler { 116 | type: "uniform" 117 | min: -0.08 118 | max: 0.08 119 | } 120 | bias_filler { 121 | type: "constant" 122 | value: 0 123 | } 124 | axis: 2 125 | } 126 | } 127 | layer { 128 | name: "reshape-data" 129 | type: "Reshape" 130 | bottom: "image_features" 131 | top: "reshape-data" 132 | reshape_param { 133 | axis: 0 134 | num_axes: 0 135 | shape { 136 | dim: 1 137 | } 138 | } 139 | } 140 | layer { 141 | name: "predict-im" 142 | type: "InnerProduct" 143 | bottom: "reshape-data" 144 | top: "predict-im" 145 | param { 146 | lr_mult: 1 147 | decay_mult: 1 148 | } 149 | inner_product_param { 150 | num_output: 8801 151 | bias_term: false 152 | weight_filler { 153 | type: "uniform" 154 | min: -0.08 155 | max: 0.08 156 | } 157 | axis: 2 158 | } 159 | } 160 | layer { 161 | name: "predict-multimodal" 162 | type: "Eltwise" 163 | bottom: "predict-lm" 164 | bottom: "predict-im" 165 | top: "predict-multimodal" 166 | eltwise_param { 167 | operation: SUM 168 | } 169 | } 170 | layer { 171 | name: "probs" 172 | type: "Softmax" 173 | bottom: "predict-multimodal" 174 | top: "predict" 175 | softmax_param { 176 | axis: 2 177 | } 178 | } 179 | 180 | -------------------------------------------------------------------------------- /prototxts/dcc_vgg.deploy.prototxt: -------------------------------------------------------------------------------- 1 | layer { 2 | name: "data" 3 | 
type: "DummyData" 4 | top: "data" 5 | dummy_data_param { 6 | data_filler { 7 | type: "constant" 8 | value: 1 9 | } 10 | shape { 11 | dim: 10 12 | dim: 471 13 | } 14 | } 15 | } 16 | layer { 17 | name: "reshape-data" 18 | type: "Reshape" 19 | bottom: "data" 20 | top: "reshape-data" 21 | reshape_param { 22 | shape { 23 | dim: 1 24 | dim: -1 25 | dim: 471 26 | } 27 | } 28 | } 29 | 30 | -------------------------------------------------------------------------------- /prototxts/dcc_vgg.wtd.prototxt: -------------------------------------------------------------------------------- 1 | layer { 2 | name: "cont_sentence" 3 | type: "DummyData" 4 | top: "cont_sentence" 5 | dummy_data_param { 6 | data_filler { 7 | type: "constant" 8 | value: 1 9 | } 10 | shape { 11 | dim: 1 12 | dim: 100 13 | } 14 | } 15 | } 16 | layer { 17 | name: "input_sentence" 18 | type: "DummyData" 19 | top: "input_sentence" 20 | dummy_data_param { 21 | data_filler { 22 | type: "constant" 23 | value: 1 24 | } 25 | shape { 26 | dim: 1 27 | dim: 100 28 | dim: 1 29 | } 30 | } 31 | } 32 | layer { 33 | name: "image_features" 34 | type: "DummyData" 35 | top: "image_features" 36 | dummy_data_param { 37 | data_filler { 38 | type: "constant" 39 | value: 1 40 | } 41 | shape { 42 | dim: 100 43 | dim: 471 44 | } 45 | } 46 | } 47 | layer { 48 | name: "embedding" 49 | type: "Embed" 50 | bottom: "input_sentence" 51 | top: "embedding" 52 | param { 53 | lr_mult: 0 54 | decay_mult: 0 55 | } 56 | embed_param { 57 | num_output: 512 58 | input_dim: 8801 59 | bias_term: false 60 | } 61 | } 62 | layer { 63 | name: "embedding2" 64 | type: "InnerProduct" 65 | bottom: "embedding" 66 | top: "embedding2" 67 | param { 68 | lr_mult: 0 69 | decay_mult: 0 70 | } 71 | inner_product_param { 72 | num_output: 512 73 | axis: 2 74 | } 75 | } 76 | layer { 77 | name: "lstm1" 78 | type: "LSTM" 79 | bottom: "embedding2" 80 | bottom: "cont_sentence" 81 | top: "lstm1" 82 | param { 83 | lr_mult: 0 84 | decay_mult: 0 85 | } 86 | recurrent_param { 87 | 
num_output: 512 88 | } 89 | } 90 | layer { 91 | name: "concat-lm" 92 | type: "Concat" 93 | bottom: "embedding2" 94 | bottom: "lstm1" 95 | top: "concat-lm" 96 | concat_param { 97 | axis: 2 98 | } 99 | } 100 | layer { 101 | name: "predict-lm" 102 | type: "InnerProduct" 103 | bottom: "concat-lm" 104 | top: "predict-lm" 105 | param { 106 | lr_mult: 1 107 | decay_mult: 1 108 | } 109 | param { 110 | lr_mult: 1 111 | decay_mult: 1 112 | } 113 | inner_product_param { 114 | num_output: 8801 115 | weight_filler { 116 | type: "uniform" 117 | min: -0.08 118 | max: 0.08 119 | } 120 | bias_filler { 121 | type: "constant" 122 | value: 0 123 | } 124 | axis: 2 125 | } 126 | } 127 | layer { 128 | name: "reshape-data" 129 | type: "Reshape" 130 | bottom: "image_features" 131 | top: "reshape-data" 132 | reshape_param { 133 | axis: 0 134 | num_axes: 0 135 | shape { 136 | dim: 1 137 | } 138 | } 139 | } 140 | layer { 141 | name: "predict-im" 142 | type: "InnerProduct" 143 | bottom: "reshape-data" 144 | top: "predict-im" 145 | param { 146 | lr_mult: 1 147 | decay_mult: 1 148 | } 149 | inner_product_param { 150 | num_output: 8801 151 | bias_term: false 152 | weight_filler { 153 | type: "uniform" 154 | min: -0.08 155 | max: 0.08 156 | } 157 | axis: 2 158 | } 159 | } 160 | layer { 161 | name: "predict-multimodal" 162 | type: "Eltwise" 163 | bottom: "predict-lm" 164 | bottom: "predict-im" 165 | top: "predict-multimodal" 166 | eltwise_param { 167 | operation: SUM 168 | } 169 | } 170 | layer { 171 | name: "predict" 172 | type: "Softmax" 173 | bottom: "predict-multimodal" 174 | top: "predict" 175 | softmax_param { 176 | axis: 2 177 | } 178 | } 179 | 180 | -------------------------------------------------------------------------------- /prototxts/train_classifiers_deploy.imagenet.prototxt: -------------------------------------------------------------------------------- 1 | name: "FinetuneVGG" 2 | input: "data" 3 | input_shape { 4 | dim: 1 5 | dim: 3 6 | dim: 224 7 | dim: 224 8 | } 9 | layer { 10 | 
name: "conv1_1" 11 | type: "Convolution" 12 | bottom: "data" 13 | top: "conv1_1" 14 | convolution_param { 15 | num_output: 64 16 | pad: 1 17 | kernel_size: 3 18 | } 19 | } 20 | layer { 21 | name: "relu1_1" 22 | type: "ReLU" 23 | bottom: "conv1_1" 24 | top: "conv1_1" 25 | } 26 | layer { 27 | name: "conv1_2" 28 | type: "Convolution" 29 | bottom: "conv1_1" 30 | top: "conv1_2" 31 | convolution_param { 32 | num_output: 64 33 | pad: 1 34 | kernel_size: 3 35 | } 36 | } 37 | layer { 38 | name: "relu1_2" 39 | type: "ReLU" 40 | bottom: "conv1_2" 41 | top: "conv1_2" 42 | } 43 | layer { 44 | name: "pool1" 45 | type: "Pooling" 46 | bottom: "conv1_2" 47 | top: "pool1" 48 | pooling_param { 49 | pool: MAX 50 | kernel_size: 2 51 | stride: 2 52 | } 53 | } 54 | layer { 55 | name: "conv2_1" 56 | type: "Convolution" 57 | bottom: "pool1" 58 | top: "conv2_1" 59 | convolution_param { 60 | num_output: 128 61 | pad: 1 62 | kernel_size: 3 63 | } 64 | } 65 | layer { 66 | name: "relu2_1" 67 | type: "ReLU" 68 | bottom: "conv2_1" 69 | top: "conv2_1" 70 | } 71 | layer { 72 | name: "conv2_2" 73 | type: "Convolution" 74 | bottom: "conv2_1" 75 | top: "conv2_2" 76 | convolution_param { 77 | num_output: 128 78 | pad: 1 79 | kernel_size: 3 80 | } 81 | } 82 | layer { 83 | name: "relu2_2" 84 | type: "ReLU" 85 | bottom: "conv2_2" 86 | top: "conv2_2" 87 | } 88 | layer { 89 | name: "pool2" 90 | type: "Pooling" 91 | bottom: "conv2_2" 92 | top: "pool2" 93 | pooling_param { 94 | pool: MAX 95 | kernel_size: 2 96 | stride: 2 97 | } 98 | } 99 | layer { 100 | name: "conv3_1" 101 | type: "Convolution" 102 | bottom: "pool2" 103 | top: "conv3_1" 104 | convolution_param { 105 | num_output: 256 106 | pad: 1 107 | kernel_size: 3 108 | } 109 | } 110 | layer { 111 | name: "relu3_1" 112 | type: "ReLU" 113 | bottom: "conv3_1" 114 | top: "conv3_1" 115 | } 116 | layer { 117 | name: "conv3_2" 118 | type: "Convolution" 119 | bottom: "conv3_1" 120 | top: "conv3_2" 121 | convolution_param { 122 | num_output: 256 123 | pad: 1 124 
| kernel_size: 3 125 | } 126 | } 127 | layer { 128 | name: "relu3_2" 129 | type: "ReLU" 130 | bottom: "conv3_2" 131 | top: "conv3_2" 132 | } 133 | layer { 134 | name: "conv3_3" 135 | type: "Convolution" 136 | bottom: "conv3_2" 137 | top: "conv3_3" 138 | convolution_param { 139 | num_output: 256 140 | pad: 1 141 | kernel_size: 3 142 | } 143 | } 144 | layer { 145 | name: "relu3_3" 146 | type: "ReLU" 147 | bottom: "conv3_3" 148 | top: "conv3_3" 149 | } 150 | layer { 151 | name: "pool3" 152 | type: "Pooling" 153 | bottom: "conv3_3" 154 | top: "pool3" 155 | pooling_param { 156 | pool: MAX 157 | kernel_size: 2 158 | stride: 2 159 | } 160 | } 161 | layer { 162 | name: "conv4_1" 163 | type: "Convolution" 164 | bottom: "pool3" 165 | top: "conv4_1" 166 | convolution_param { 167 | num_output: 512 168 | pad: 1 169 | kernel_size: 3 170 | } 171 | } 172 | layer { 173 | name: "relu4_1" 174 | type: "ReLU" 175 | bottom: "conv4_1" 176 | top: "conv4_1" 177 | } 178 | layer { 179 | name: "conv4_2" 180 | type: "Convolution" 181 | bottom: "conv4_1" 182 | top: "conv4_2" 183 | convolution_param { 184 | num_output: 512 185 | pad: 1 186 | kernel_size: 3 187 | } 188 | } 189 | layer { 190 | name: "relu4_2" 191 | type: "ReLU" 192 | bottom: "conv4_2" 193 | top: "conv4_2" 194 | } 195 | layer { 196 | name: "conv4_3" 197 | type: "Convolution" 198 | bottom: "conv4_2" 199 | top: "conv4_3" 200 | convolution_param { 201 | num_output: 512 202 | pad: 1 203 | kernel_size: 3 204 | } 205 | } 206 | layer { 207 | name: "relu4_3" 208 | type: "ReLU" 209 | bottom: "conv4_3" 210 | top: "conv4_3" 211 | } 212 | layer { 213 | name: "pool4" 214 | type: "Pooling" 215 | bottom: "conv4_3" 216 | top: "pool4" 217 | pooling_param { 218 | pool: MAX 219 | kernel_size: 2 220 | stride: 2 221 | } 222 | } 223 | layer { 224 | name: "conv5_1" 225 | type: "Convolution" 226 | bottom: "pool4" 227 | top: "conv5_1" 228 | convolution_param { 229 | num_output: 512 230 | pad: 1 231 | kernel_size: 3 232 | } 233 | } 234 | layer { 235 | name: 
"relu5_1" 236 | type: "ReLU" 237 | bottom: "conv5_1" 238 | top: "conv5_1" 239 | } 240 | layer { 241 | name: "conv5_2" 242 | type: "Convolution" 243 | bottom: "conv5_1" 244 | top: "conv5_2" 245 | convolution_param { 246 | num_output: 512 247 | pad: 1 248 | kernel_size: 3 249 | } 250 | } 251 | layer { 252 | name: "relu5_2" 253 | type: "ReLU" 254 | bottom: "conv5_2" 255 | top: "conv5_2" 256 | } 257 | layer { 258 | name: "conv5_3" 259 | type: "Convolution" 260 | bottom: "conv5_2" 261 | top: "conv5_3" 262 | convolution_param { 263 | num_output: 512 264 | pad: 1 265 | kernel_size: 3 266 | } 267 | } 268 | layer { 269 | name: "relu5_3" 270 | type: "ReLU" 271 | bottom: "conv5_3" 272 | top: "conv5_3" 273 | } 274 | layer { 275 | name: "pool5" 276 | type: "Pooling" 277 | bottom: "conv5_3" 278 | top: "pool5" 279 | pooling_param { 280 | pool: MAX 281 | kernel_size: 2 282 | stride: 2 283 | } 284 | } 285 | layer { 286 | name: "fc6" 287 | type: "InnerProduct" 288 | bottom: "pool5" 289 | top: "fc6" 290 | inner_product_param { 291 | num_output: 4096 292 | } 293 | } 294 | layer { 295 | name: "relu6" 296 | type: "ReLU" 297 | bottom: "fc6" 298 | top: "fc6" 299 | } 300 | layer { 301 | name: "drop6" 302 | type: "Dropout" 303 | bottom: "fc6" 304 | top: "fc6" 305 | dropout_param { 306 | dropout_ratio: 0.5 307 | } 308 | } 309 | layer { 310 | name: "fc7" 311 | type: "InnerProduct" 312 | bottom: "fc6" 313 | top: "fc7" 314 | inner_product_param { 315 | num_output: 4096 316 | } 317 | } 318 | layer { 319 | name: "relu7" 320 | type: "ReLU" 321 | bottom: "fc7" 322 | top: "fc7" 323 | } 324 | layer { 325 | name: "drop7" 326 | type: "Dropout" 327 | bottom: "fc7" 328 | top: "fc7" 329 | dropout_param { 330 | dropout_ratio: 0.5 331 | } 332 | } 333 | layer { 334 | name: "fc8-attributes" 335 | type: "InnerProduct" 336 | bottom: "fc7" 337 | top: "fc8-attributes" 338 | inner_product_param { 339 | num_output: 1117 340 | } 341 | } 342 | layer { 343 | name: "probs" 344 | type: "Sigmoid" 345 | bottom: 
"fc8-attributes" 346 | top: "probs" 347 | } 348 | -------------------------------------------------------------------------------- /prototxts/train_classifiers_deploy.prototxt: -------------------------------------------------------------------------------- 1 | name: "FinetuneVGG" 2 | input: "data" 3 | input_shape { 4 | dim: 1 5 | dim: 3 6 | dim: 224 7 | dim: 224 8 | } 9 | layer { 10 | name: "conv1_1" 11 | type: "Convolution" 12 | bottom: "data" 13 | top: "conv1_1" 14 | convolution_param { 15 | num_output: 64 16 | pad: 1 17 | kernel_size: 3 18 | } 19 | } 20 | layer { 21 | name: "relu1_1" 22 | type: "ReLU" 23 | bottom: "conv1_1" 24 | top: "conv1_1" 25 | } 26 | layer { 27 | name: "conv1_2" 28 | type: "Convolution" 29 | bottom: "conv1_1" 30 | top: "conv1_2" 31 | convolution_param { 32 | num_output: 64 33 | pad: 1 34 | kernel_size: 3 35 | } 36 | } 37 | layer { 38 | name: "relu1_2" 39 | type: "ReLU" 40 | bottom: "conv1_2" 41 | top: "conv1_2" 42 | } 43 | layer { 44 | name: "pool1" 45 | type: "Pooling" 46 | bottom: "conv1_2" 47 | top: "pool1" 48 | pooling_param { 49 | pool: MAX 50 | kernel_size: 2 51 | stride: 2 52 | } 53 | } 54 | layer { 55 | name: "conv2_1" 56 | type: "Convolution" 57 | bottom: "pool1" 58 | top: "conv2_1" 59 | convolution_param { 60 | num_output: 128 61 | pad: 1 62 | kernel_size: 3 63 | } 64 | } 65 | layer { 66 | name: "relu2_1" 67 | type: "ReLU" 68 | bottom: "conv2_1" 69 | top: "conv2_1" 70 | } 71 | layer { 72 | name: "conv2_2" 73 | type: "Convolution" 74 | bottom: "conv2_1" 75 | top: "conv2_2" 76 | convolution_param { 77 | num_output: 128 78 | pad: 1 79 | kernel_size: 3 80 | } 81 | } 82 | layer { 83 | name: "relu2_2" 84 | type: "ReLU" 85 | bottom: "conv2_2" 86 | top: "conv2_2" 87 | } 88 | layer { 89 | name: "pool2" 90 | type: "Pooling" 91 | bottom: "conv2_2" 92 | top: "pool2" 93 | pooling_param { 94 | pool: MAX 95 | kernel_size: 2 96 | stride: 2 97 | } 98 | } 99 | layer { 100 | name: "conv3_1" 101 | type: "Convolution" 102 | bottom: "pool2" 103 | top: 
"conv3_1" 104 | convolution_param { 105 | num_output: 256 106 | pad: 1 107 | kernel_size: 3 108 | } 109 | } 110 | layer { 111 | name: "relu3_1" 112 | type: "ReLU" 113 | bottom: "conv3_1" 114 | top: "conv3_1" 115 | } 116 | layer { 117 | name: "conv3_2" 118 | type: "Convolution" 119 | bottom: "conv3_1" 120 | top: "conv3_2" 121 | convolution_param { 122 | num_output: 256 123 | pad: 1 124 | kernel_size: 3 125 | } 126 | } 127 | layer { 128 | name: "relu3_2" 129 | type: "ReLU" 130 | bottom: "conv3_2" 131 | top: "conv3_2" 132 | } 133 | layer { 134 | name: "conv3_3" 135 | type: "Convolution" 136 | bottom: "conv3_2" 137 | top: "conv3_3" 138 | convolution_param { 139 | num_output: 256 140 | pad: 1 141 | kernel_size: 3 142 | } 143 | } 144 | layer { 145 | name: "relu3_3" 146 | type: "ReLU" 147 | bottom: "conv3_3" 148 | top: "conv3_3" 149 | } 150 | layer { 151 | name: "pool3" 152 | type: "Pooling" 153 | bottom: "conv3_3" 154 | top: "pool3" 155 | pooling_param { 156 | pool: MAX 157 | kernel_size: 2 158 | stride: 2 159 | } 160 | } 161 | layer { 162 | name: "conv4_1" 163 | type: "Convolution" 164 | bottom: "pool3" 165 | top: "conv4_1" 166 | convolution_param { 167 | num_output: 512 168 | pad: 1 169 | kernel_size: 3 170 | } 171 | } 172 | layer { 173 | name: "relu4_1" 174 | type: "ReLU" 175 | bottom: "conv4_1" 176 | top: "conv4_1" 177 | } 178 | layer { 179 | name: "conv4_2" 180 | type: "Convolution" 181 | bottom: "conv4_1" 182 | top: "conv4_2" 183 | convolution_param { 184 | num_output: 512 185 | pad: 1 186 | kernel_size: 3 187 | } 188 | } 189 | layer { 190 | name: "relu4_2" 191 | type: "ReLU" 192 | bottom: "conv4_2" 193 | top: "conv4_2" 194 | } 195 | layer { 196 | name: "conv4_3" 197 | type: "Convolution" 198 | bottom: "conv4_2" 199 | top: "conv4_3" 200 | convolution_param { 201 | num_output: 512 202 | pad: 1 203 | kernel_size: 3 204 | } 205 | } 206 | layer { 207 | name: "relu4_3" 208 | type: "ReLU" 209 | bottom: "conv4_3" 210 | top: "conv4_3" 211 | } 212 | layer { 213 | name: 
"pool4" 214 | type: "Pooling" 215 | bottom: "conv4_3" 216 | top: "pool4" 217 | pooling_param { 218 | pool: MAX 219 | kernel_size: 2 220 | stride: 2 221 | } 222 | } 223 | layer { 224 | name: "conv5_1" 225 | type: "Convolution" 226 | bottom: "pool4" 227 | top: "conv5_1" 228 | convolution_param { 229 | num_output: 512 230 | pad: 1 231 | kernel_size: 3 232 | } 233 | } 234 | layer { 235 | name: "relu5_1" 236 | type: "ReLU" 237 | bottom: "conv5_1" 238 | top: "conv5_1" 239 | } 240 | layer { 241 | name: "conv5_2" 242 | type: "Convolution" 243 | bottom: "conv5_1" 244 | top: "conv5_2" 245 | convolution_param { 246 | num_output: 512 247 | pad: 1 248 | kernel_size: 3 249 | } 250 | } 251 | layer { 252 | name: "relu5_2" 253 | type: "ReLU" 254 | bottom: "conv5_2" 255 | top: "conv5_2" 256 | } 257 | layer { 258 | name: "conv5_3" 259 | type: "Convolution" 260 | bottom: "conv5_2" 261 | top: "conv5_3" 262 | convolution_param { 263 | num_output: 512 264 | pad: 1 265 | kernel_size: 3 266 | } 267 | } 268 | layer { 269 | name: "relu5_3" 270 | type: "ReLU" 271 | bottom: "conv5_3" 272 | top: "conv5_3" 273 | } 274 | layer { 275 | name: "pool5" 276 | type: "Pooling" 277 | bottom: "conv5_3" 278 | top: "pool5" 279 | pooling_param { 280 | pool: MAX 281 | kernel_size: 2 282 | stride: 2 283 | } 284 | } 285 | layer { 286 | name: "fc6" 287 | type: "InnerProduct" 288 | bottom: "pool5" 289 | top: "fc6" 290 | inner_product_param { 291 | num_output: 4096 292 | } 293 | } 294 | layer { 295 | name: "relu6" 296 | type: "ReLU" 297 | bottom: "fc6" 298 | top: "fc6" 299 | } 300 | layer { 301 | name: "drop6" 302 | type: "Dropout" 303 | bottom: "fc6" 304 | top: "fc6" 305 | dropout_param { 306 | dropout_ratio: 0.5 307 | } 308 | } 309 | layer { 310 | name: "fc7" 311 | type: "InnerProduct" 312 | bottom: "fc6" 313 | top: "fc7" 314 | inner_product_param { 315 | num_output: 4096 316 | } 317 | } 318 | layer { 319 | name: "relu7" 320 | type: "ReLU" 321 | bottom: "fc7" 322 | top: "fc7" 323 | } 324 | layer { 325 | name: 
"drop7" 326 | type: "Dropout" 327 | bottom: "fc7" 328 | top: "fc7" 329 | dropout_param { 330 | dropout_ratio: 0.5 331 | } 332 | } 333 | layer { 334 | name: "fc8-attributes" 335 | type: "InnerProduct" 336 | bottom: "fc7" 337 | top: "fc8-attributes" 338 | inner_product_param { 339 | num_output: 471 340 | } 341 | } 342 | layer { 343 | name: "probs" 344 | type: "Sigmoid" 345 | bottom: "fc8-attributes" 346 | top: "probs" 347 | } 348 | -------------------------------------------------------------------------------- /run_dcc_coco_baseline_vgg.sh: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env bash 2 | 3 | export PYTHONPATH='utils/:$PYTHONPATH' 4 | 5 | caffe/python/train.py --solver prototxts/dcc_coco_baseline_vgg.solver.prototxt --weights snapshots/mrnn.direct_iter_110000.caffemodel --gpu 0 6 | -------------------------------------------------------------------------------- /run_dcc_coco_rm1_vgg.delta.sh: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env bash 2 | 3 | export PYTHONPATH='utils/:$PYTHONPATH' 4 | 5 | caffe/python/train.py --solver prototxts/dcc_coco_rm1_vgg.solver.freezeLM.prototxt --weights snapshots/mrnn.direct_iter_110000.caffemodel --gpu 0 6 | 7 | caffe/python/train.py --solver prototxts/dcc_coco_rm1_vgg.solver.deltaLM.prototxt --weights snapshots/dcc_coco_rm1_vgg.delta_freezeLM_iter_50000.caffemodel --gpu 0 8 | 9 | -------------------------------------------------------------------------------- /run_dcc_coco_rm1_vgg.sh: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env bash 2 | 3 | export PYTHONPATH='utils/:$PYTHONPATH' 4 | 5 | caffe/python/train.py --solver prototxts/dcc_coco_rm1_vgg.solver.prototxt --weights snapshots/mrnn.direct_iter_110000.caffemodel --gpu 2 6 | -------------------------------------------------------------------------------- /run_dcc_imagenet_rm1_vgg.im2txt.sh: 
-------------------------------------------------------------------------------- 1 | #!/usr/bin/env bash 2 | 3 | export PYTHONPATH="utils/:$PYTHONPATH" 4 | 5 | caffe/python/train.py --solver prototxts/dcc_oodLM_rm1_vgg.im2txt.solver.prototxt --weights snapshots/mrnn.lm.direct_imtextyt_lr0.01_iter_120000.caffemodel --gpu 0 6 | -------------------------------------------------------------------------------- /run_dcc_imagenet_rm1_vgg.surf.sh: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env bash 2 | 3 | export PYTHONPATH="utils/:$PYTHONPATH" 4 | 5 | caffe/python/train.py --solver prototxts/dcc_oodLM_rm1_vgg.surf.solver.prototxt --weights snapshots/mrnn.lm.direct_surf_lr0.01_iter_120000.caffemodel --gpu 0 6 | -------------------------------------------------------------------------------- /setup.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | # POSIX 3 | 4 | #This was tested on a Linux system.
You may run into issues if you try to do this on another system (e.g., macOS does not have "wget") 5 | 6 | #TODO: Download imagenet images 7 | 8 | home_dir=$(pwd) 9 | 10 | # Initialize variables: 11 | download_mscoco_annotations=0 12 | download_mscoco_images=0 13 | download_mscoco_tools=0 14 | 15 | annotation_folder="annotations" 16 | image_folder="images/coco_images" 17 | models_folder="snapshots" 18 | gen_sentences_folder="results/generated_sentences"; tools_folder="tools" # assumed destination for the MSCOCO eval tools 19 | 20 | dcc_data=( "captions_no_caption_rm_eightCluster_train2014.json" "captions_split_set_bottle_val_test_novel2014.json" "captions_split_set_bottle_val_test_train2014.json" "captions_split_set_bottle_val_val_novel2014.json" "captions_split_set_bottle_val_val_train2014.json" "captions_split_set_bus_val_test_novel2014.json" "captions_split_set_bus_val_test_train2014.json" "captions_split_set_bus_val_val_novel2014.json" "captions_split_set_bus_val_val_train2014.json" "captions_split_set_couch_val_test_novel2014.json" "captions_split_set_couch_val_test_train2014.json" "captions_split_set_couch_val_val_novel2014.json" "captions_split_set_couch_val_val_train2014.json" "captions_split_set_microwave_val_test_novel2014.json" "captions_split_set_microwave_val_test_train2014.json" "captions_split_set_microwave_val_val_novel2014.json" "captions_split_set_microwave_val_val_train2014.json" "captions_split_set_pizza_val_test_novel2014.json" "captions_split_set_pizza_val_test_train2014.json" "captions_split_set_pizza_val_val_novel2014.json" "captions_split_set_pizza_val_val_train2014.json" "captions_split_set_racket_val_test_novel2014.json" "captions_split_set_racket_val_test_train2014.json" "captions_split_set_racket_val_val_novel2014.json" "captions_split_set_racket_val_val_train2014.json" "captions_split_set_suitcase_val_test_novel2014.json" "captions_split_set_suitcase_val_test_train2014.json" "captions_split_set_suitcase_val_val_novel2014.json" "captions_split_set_suitcase_val_val_train2014.json" 
"captions_split_set_zebra_val_test_novel2014.json" "captions_split_set_zebra_val_test_train2014.json" "captions_split_set_zebra_val_val_novel2014.json" "captions_split_set_zebra_val_val_train2014.json" "captions_val_test2014.json" "captions_val_val2014.json" ) 21 | dcc_models=( "caption_models/attributes_JJ100_NN300_VB100_eightClusters_captions_cocoImages_1026_ftLM_1110_vgg_iter_5000.caffemodel" "caption_models/attributes_JJ100_NN300_VB100_eightClusters_imagenetImages_captions_freezeLMPretrain_vgg_iter_50000.caffemodel" "caption_models/dcc_coco_rm1_vgg.471.solver.prototxt_iter_110000.caffemodel" "caption_models/dcc_oodLM_rm1_vgg.im2txt.471.solver_0409_iter_110000.caffemodel" "caption_models/dcc_oodLM_rm1_vgg.surf.471.solver_0409_iter_110000.caffemodel" "caption_models/vgg_feats.vgg_multilabel_FT_iter_100000_imagenetSentences_iter_110000.caffemodel" "classifiers/attributes_JJ100_NN300_VB100_allObjects_coco_vgg_0111_iter_80000.caffemodel" "classifiers/attributes_JJ100_NN300_VB100_clusterEight_imagenet_vgg_0112_iter_80000.caffemodel" "classifiers/attributes_JJ100_NN300_VB100_coco_471_eightCluster_0223_iter_80000.caffemodel" "classifiers/vgg_multilabel_FT_iter_100000.caffemodel" "language_models/mrnn.direct_iter_110000.caffemodel" "language_models/mrnn.lm.direct_surf_lr0.01_iter_120000.caffemodel" "language_models/mrnn.lm.direct_imtextyt_lr0.01_iter_120000.caffemodel" "caption_models/dcc_coco_rm1_vgg.delta_freezeLM_iter_50000.caffemodel" "caption_models/dcc_coco_rm1_vgg.delta_iter_5000.caffemodel" ) 22 | dcc_sentences=( "dcc_coco_rm1_vgg.471.solver.prototxt_iter_110000.caffemodel_coco2014_cocoid.val_test.txt.json" "dcc_oodLM_rm1_vgg.surf.471.solver_0409_iter_110000.transfer_words_coco1.txt_closeness_embedding.caffemodel_coco2014_cocoid.val_test.txt.json" "vgg_feats.vgg_multilabel_FT_iter_100000_imagenetSentences_iter_110000.transfer_words_imagenet.txt_closeness_embedding.caffemodel_test_imagenet_images.txt.json" ) 23 | 24 | show_help () { 25 | echo 
"--download_mscoco_annotations: downloads mscoco annotations to $annotation_folder." 26 | echo "--download_mscoco_images: downloads mscoco images to $image_folder." 27 | echo "--download_mscoco_tools: downloads mscoco eval tools to $tools_folder." 28 | } 29 | 30 | while :; do 31 | case $1 in 32 | -h|-\?|--help) 33 | show_help 34 | exit 35 | ;; 36 | --download_mscoco_annotations) 37 | download_mscoco_annotations=$((download_mscoco_annotations + 1)) 38 | ;; 39 | --download_mscoco_images) 40 | download_mscoco_images=$((download_mscoco_images + 1)) 41 | ;; 42 | --download_mscoco_tools) 43 | download_mscoco_tools=$((download_mscoco_tools + 1)) 44 | ;; 45 | --) 46 | shift 47 | break 48 | ;; 49 | *) 50 | break 51 | esac 52 | shift 53 | done 54 | 55 | mkdir -p $annotation_folder 56 | mkdir -p $image_folder 57 | mkdir -p $tools_folder 58 | 59 | if [ $download_mscoco_annotations -eq 1 ] 60 | then 61 | echo "Downloading MSCOCO annotations to $annotation_folder" 62 | mscoco_annotation_file="annotations-1-0-3/captions_train-val2014.zip" 63 | wget http://msvocds.blob.core.windows.net/$mscoco_annotation_file 64 | unzip captions_train-val2014.zip 65 | mv annotations/* $annotation_folder 66 | rm captions_train-val2014.zip 67 | else 68 | echo "Not downloading MSCOCO annotations." 69 | fi 70 | 71 | if [ $download_mscoco_images -eq 1 ] 72 | then 73 | echo "Downloading MSCOCO images to $image_folder" 74 | mscoco_train_image_file="coco2014/train2014.zip" 75 | wget http://msvocds.blob.core.windows.net/$mscoco_train_image_file 76 | unzip train2014.zip 77 | mscoco_val_image_file="coco2014/val2014.zip" 78 | wget http://msvocds.blob.core.windows.net/$mscoco_val_image_file 79 | unzip val2014.zip 80 | mv train2014 $image_folder 81 | mv val2014 $image_folder 82 | rm train2014.zip 83 | rm val2014.zip 84 | else 85 | echo "Not downloading MSCOCO images." 
86 | fi 87 | 88 | if [ $download_mscoco_tools -eq 1 ] 89 | then 90 | echo "Downloading MSCOCO eval tools to $tools_folder" 91 | ./utils/download_tools.sh 92 | else 93 | echo "Not downloading MSCOCO eval tools." 94 | fi 95 | 96 | mkdir -p $models_folder 97 | mkdir -p results 98 | mkdir -p results/generated_sentences 99 | 100 | #get data for DCC 101 | echo "Downloading dcc data..." 102 | cd $annotation_folder 103 | for i in "${dcc_data[@]}" 104 | do 105 | echo "Downloading: " $i 106 | wget https://people.eecs.berkeley.edu/~lisa_anne/release_DCC/annotations_DCC/$i 107 | done 108 | cd $home_dir 109 | 110 | #get pretrained models for DCC 111 | echo "Downloading dcc models..." 112 | cd $models_folder 113 | for i in "${dcc_models[@]}" 114 | do 115 | echo "Downloading: " $i 116 | wget https://people.eecs.berkeley.edu/~lisa_anne/release_DCC/trained_models/$i 117 | done 118 | cd $home_dir 119 | 120 | #get word2vec 121 | echo "Downloading dcc word2vec..." 122 | cd dcc_transfer 123 | echo "Downloading: vectors-cbow-bnc+ukwac+wikipedia.bin" 124 | wget https://people.eecs.berkeley.edu/~lisa_anne/release_DCC/utils/vectors-cbow-bnc+ukwac+wikipedia.bin 125 | cd $home_dir 126 | 127 | mkdir -p results/generated_sentences 128 | cd results/generated_sentences 129 | 130 | #get generated sentences 131 | echo "Downloading generated sentences..." 132 | 133 | for i in "${dcc_sentences[@]}" 134 | do 135 | echo "Downloading: " $i 136 | wget https://people.eecs.berkeley.edu/~lisa_anne/release_DCC/generated_sentences/$i 137 | done 138 | cd $home_dir 139 | 140 | mkdir -p outfiles 141 | mkdir -p outfiles/transfer 142 | 143 | #clone utilities from other folders 144 | git clone https://github.com/LisaAnne/sentence_gen_tools.git eval 145 | git clone https://github.com/LisaAnne/python_tools utils/tools 146 | 147 | cd eval 148 | ln -s ../utils/tools . 
149 | cd $home_dir 150 | 151 | -------------------------------------------------------------------------------- /transfer.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | 3 | #coco 4 | 5 | #IN DOMAIN language 6 | 7 | model=prototxts/dcc_vgg.wtd.prototxt 8 | model_weights=dcc_coco_rm1_vgg.471.solver.prototxt_iter_110000 9 | orig_attributes='utils/lexicalList/lexicalList_JJ100_NN300_VB100_rmEightCoco1.txt' 10 | all_attributes='utils/lexicalList/lexicalList_parseCoco_JJ100_NN300_VB100.txt' 11 | vocab='utils/vocabulary/vocabulary.txt' 12 | words='utils/transfer_experiments/transfer_words_coco1.txt' 13 | classifiers='utils/transfer_experiments/transfer_classifiers_coco1.txt' 14 | closeness_metric='closeness_embedding' 15 | transfer_type='direct_transfer' 16 | 17 | python dcc.py --language_model $model \ 18 | --model_weights $model_weights \ 19 | --orig_attributes $orig_attributes \ 20 | --all_attributes $all_attributes \ 21 | --vocab $vocab \ 22 | --words $words \ 23 | --classifiers $classifiers \ 24 | --transfer_type $transfer_type \ 25 | --transfer \ 26 | --log 27 | 28 | #OUT OF DOMAIN language (im2txt) 29 | model=prototxts/dcc_vgg.80k.wtd.prototxt #prototxt used when using OUT OF DOMAIN language features 30 | model_weights=dcc_oodLM_rm1_vgg.im2txt.471.solver_0409_iter_110000 #language learned from im2txt LM 31 | orig_attributes='utils/lexicalList/lexicalList_JJ100_NN300_VB100_rmEightCoco1.txt' 32 | all_attributes='utils/lexicalList/lexicalList_parseCoco_JJ100_NN300_VB100.txt' 33 | vocab='utils/vocabulary/yt_coco_surface_80k_vocab.txt' #vocab used when training with OUT OF DOMAIN language features 34 | words='utils/transfer_experiments/transfer_words_coco1.txt' 35 | classifiers='utils/transfer_experiments/transfer_classifiers_coco1.txt' 36 | closeness_metric='closeness_embedding' 37 | transfer_type='direct_transfer' 38 | 39 | python dcc.py --language_model $model \ 40 | --model_weights $model_weights \ 41 | 
--orig_attributes $orig_attributes \ 42 | --all_attributes $all_attributes \ 43 | --vocab $vocab \ 44 | --words $words \ 45 | --classifiers $classifiers \ 46 | --transfer_type $transfer_type \ 47 | --transfer \ 48 | --log 49 | 50 | #OUT OF DOMAIN language (surf) 51 | model=prototxts/dcc_vgg.80k.wtd.prototxt #prototxt used when using OUT OF DOMAIN language features 52 | model_weights=dcc_oodLM_rm1_vgg.surf.471.solver_0409_iter_110000 #language learned from surf LM 53 | orig_attributes='utils/lexicalList/lexicalList_JJ100_NN300_VB100_rmEightCoco1.txt' 54 | all_attributes='utils/lexicalList/lexicalList_parseCoco_JJ100_NN300_VB100.txt' 55 | vocab='utils/vocabulary/yt_coco_surface_80k_vocab.txt' #vocab used when training with OUT OF DOMAIN language features 56 | words='utils/transfer_experiments/transfer_words_coco1.txt' 57 | classifiers='utils/transfer_experiments/transfer_classifiers_coco1.txt' 58 | closeness_metric='closeness_embedding' 59 | transfer_type='direct_transfer' 60 | 61 | python dcc.py --language_model $model \ 62 | --model_weights $model_weights \ 63 | --orig_attributes $orig_attributes \ 64 | --all_attributes $all_attributes \ 65 | --vocab $vocab \ 66 | --words $words \ 67 | --classifiers $classifiers \ 68 | --transfer_type $transfer_type \ 69 | --transfer \ 70 | --log 71 | 72 | #imagenet 73 | model='prototxts/dcc_vgg.80k.wtd.imagenet.prototxt' 74 | model_weights='vgg_feats.vgg_multilabel_FT_iter_100000_imagenetSentences_iter_110000' 75 | orig_attributes='utils/lexicalList/lexicalList_JJ100_NN300_VB100_rmEightCoco1.txt' 76 | all_attributes='utils/lexicalList/lexicalList_471_rebuttalScale.txt' 77 | vocab='utils/vocabulary/yt_coco_surface_80k_vocab.txt' 78 | words='utils/transfer_experiments/transfer_words_imagenet.txt' 79 | classifiers='utils/transfer_experiments/transfer_classifiers_imagenet.txt' 80 | closeness_metric='closeness_embedding' 81 | transfer_type='direct_transfer' 82 | 83 | python dcc.py --language_model $model \ 84 | --model_weights 
$model_weights \ 85 | --orig_attributes $orig_attributes \ 86 | --all_attributes $all_attributes \ 87 | --vocab $vocab \ 88 | --words $words \ 89 | --classifiers $classifiers \ 90 | --transfer_type $transfer_type \ 91 | --transfer \ 92 | --log 93 | 94 | 95 | 96 | -------------------------------------------------------------------------------- /transfer_delta.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | 3 | #coco 4 | model=prototxts/dcc_vgg.delta.wtd.prototxt 5 | model_weights=dcc_coco_rm1_vgg.delta_iter_5000 6 | orig_weights=dcc_coco_rm1_vgg.delta_freezeLM_iter_50000 7 | orig_attributes='utils/lexicalList/lexicalList_JJ100_NN300_VB100_rmEightCoco1.txt' 8 | all_attributes='utils/lexicalList/lexicalList_parseCoco_JJ100_NN300_VB100.txt' 9 | vocab='utils/vocabulary/vocabulary.txt' 10 | words='utils/transfer_experiments/transfer_words_coco1.txt' 11 | classifiers='utils/transfer_experiments/transfer_classifiers_coco1.txt' 12 | closeness_metric='closeness_embedding' 13 | transfer_type='delta_transfer' 14 | num_transfer=1 15 | 16 | python dcc.py --language_model $model \ 17 | --model_weights $model_weights \ 18 | --orig_attributes $orig_attributes \ 19 | --all_attributes $all_attributes \ 20 | --vocab $vocab \ 21 | --words $words \ 22 | --classifiers $classifiers \ 23 | --transfer_type $transfer_type \ 24 | --orig_model $orig_weights \ 25 | --num_transfer $num_transfer \ 26 | --transfer \ 27 | --log 28 | 29 | 30 | 31 | -------------------------------------------------------------------------------- /utils/__init__.py: -------------------------------------------------------------------------------- 1 | #init 2 | -------------------------------------------------------------------------------- /utils/config.example.py: -------------------------------------------------------------------------------- 1 | import os 2 | 3 | caffe_dir = 'caffe/' #path to your caffe directory 4 | pycaffe_dir = caffe_dir + 'python/' #path 
to your pycaffe directory 5 | lexical_features_root = 'lexical_features/' #path to store extracted features. You will need to extract these with "extract_features.sh" 6 | 7 | #All data in "setup.sh" will be downloaded to these folders by default. If you would like them to be downloaded somewhere else, you will need to update the paths here and in setup.sh! 8 | coco_annotations = 'annotations/' #path to MSCOCO annotations (will be downloaded by "setup.sh" if not already present) 9 | coco_images_root = 'images/coco_images/' #path to MSCOCO images (will be downloaded by "setup.sh" if not already present) 10 | imagenet_images_root = 'images/imagenet_images/' #subset of the ImageNet dataset collected for DCC. "setup.sh" does not download these images as of now. 11 | tools_folder = 'utils/coco_tools/' #path to MSCOCO eval tools (will be downloaded by "setup.sh" if not already present) 12 | models_folder = 'prototxts/' 13 | weights_folder = 'snapshots/' 14 | vocab_root = 'utils/vocabulary/' 15 | image_list_root = 'utils/image_list/' 16 | os.environ["COCO_EVAL_PATH"] = tools_folder 17 | -------------------------------------------------------------------------------- /utils/download_tools.sh: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env bash 2 | 3 | # change to directory $DIR where this script is stored 4 | pushd . 5 | DIR=$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd ) 6 | cd $DIR 7 | 8 | # $DIR is already utils/, so clone directly into utils/coco_tools 9 | git clone https://github.com/tylin/coco-caption.git coco_tools 10 | cd .. 
11 | echo "Finished downloading caption eval tools" 12 | 13 | -------------------------------------------------------------------------------- /utils/extract_classifiers.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | import sys 3 | import os 4 | import h5py 5 | sys.path.append('utils/') 6 | sys.path.append('utils/tools/') 7 | import caffe_utils 8 | from config import * 9 | sys.path.insert(0, pycaffe_dir) 10 | import caffe 11 | import pdb 12 | 13 | coco_template = '%s/%%s2014/COCO_%%s2014_%%012d.jpg' %coco_images_root 14 | 15 | class VisualFeatureExtractor(object): 16 | 17 | def __init__(self, model, model_weights, device, feature_extract='probs'): 18 | caffe.set_mode_gpu() 19 | caffe.set_device(device) 20 | self.net = caffe.Net(model, model_weights, caffe.TEST) 21 | self.feature_size = self.net.blobs[feature_extract].data.shape[1] 22 | self.feature_extract = feature_extract 23 | 24 | def build_image_processor(self, image_dim=224): 25 | self.image_dim = image_dim 26 | self.transformer = caffe_utils.build_transformer(image_dim) 27 | self.image_processor = lambda x: caffe_utils.image_processor(self.transformer, x, True) 28 | 29 | def extract_batch_features(self, images): 30 | net = self.net 31 | data = [] 32 | for im in images: 33 | data.extend(self.image_processor(im)) 34 | net.blobs['data'].reshape(len(data),3,self.image_dim,self.image_dim) 35 | net.blobs['data'].data[...] 
= data 36 | out = net.forward() 37 | features_tmp = net.blobs[self.feature_extract].data 38 | features_av = np.array([np.mean(features_tmp[i:i+10], axis=0) for i in range(0, len(data), 10)]) 39 | return features_av 40 | 41 | def extract_features(model, model_weights, imagenet_ims=None, device=0, image_dim=224, feature_extract='probs', batch_size=10): 42 | 43 | net_type = 'vgg' 44 | 45 | im_ids_train = open('utils/image_list/coco2014_cocoid.train.txt').readlines() 46 | im_ids_train = [int(im_id.strip()) for im_id in im_ids_train] 47 | im_ids_val = open('utils/image_list/coco2014_cocoid.val_val.txt').readlines() 48 | im_ids_val = [int(im_id.strip()) for im_id in im_ids_val] 49 | im_ids_test = open('utils/image_list/coco2014_cocoid.val_test.txt').readlines() 50 | im_ids_test = [int(im_id.strip()) for im_id in im_ids_test] 51 | 52 | train_ims = [coco_template %('train', 'train', im_id) for im_id in im_ids_train] 53 | val_ims = [coco_template %('val', 'val', im_id) for im_id in im_ids_val] 54 | test_ims = [coco_template %('val', 'val', im_id) for im_id in im_ids_test] 55 | 56 | sets = [train_ims, val_ims, test_ims] 57 | set_names = ['train', 'val_val', 'val_test'] 58 | if imagenet_ims: 59 | sets = [] 60 | set_names = [] 61 | imagenet_ims = open(imagenet_ims).readlines() 62 | test_imagenet = [imagenet_images_root + i.strip() for i in imagenet_ims] 63 | sets.append(test_imagenet) 64 | set_names.append('imagenet_test_ims') 65 | 66 | save_h5 = model_weights.split('/')[-1] 67 | 68 | #set up feature extractor 69 | extractor = VisualFeatureExtractor(model, model_weights, device, feature_extract) 70 | extractor.build_image_processor() 71 | feature_size = extractor.net.blobs['probs'].data.shape[1] 72 | 73 | for s, set_name in zip(sets, set_names): 74 | all_ims = s 75 | features = np.zeros((len(all_ims), feature_size)) 76 | for ix in range(0, len(all_ims), batch_size): 77 | sys.stdout.write('\rOn set %s. On image %d/%d.' 
%(set_name, ix, len(all_ims))) 78 | sys.stdout.flush() 79 | 80 | batch_end = min(ix+batch_size, len(all_ims)) 81 | batch_frames = all_ims[ix:batch_end] 82 | features_av = extractor.extract_batch_features(batch_frames) 83 | features[ix:ix+features_av.shape[0],:] = features_av 84 | 85 | h5_file = '%s/%s_feats.%s.%s.h5' %(lexical_features_root, net_type, save_h5, set_name) 86 | f = h5py.File(h5_file, "w") 87 | print "Printing to %s\n" %h5_file 88 | all_ims_short = [i.split('/')[-2] + '/' + i.split('/')[-1] for i in all_ims] 89 | assert len(all_ims_short) == len(features) 90 | dset = f.create_dataset("ims", data=all_ims_short) 91 | dset = f.create_dataset("features", data=features) 92 | f.close() 93 | 94 | 95 | -------------------------------------------------------------------------------- /utils/lexicalList/lexicalList_471_rebuttalScale.txt: -------------------------------------------------------------------------------- 1 | yellow 2 | skate 3 | asian 4 | hanging 5 | dish 6 | chair 7 | row 8 | tv 9 | young 10 | bike 11 | cluttered 12 | brown 13 | woman 14 | blanket 15 | vase 16 | fall 17 | cook 18 | drinking 19 | school 20 | wooden 21 | cloudy 22 | large 23 | sand 24 | small 25 | guy 26 | enjoy 27 | bicycle 28 | fence 29 | skiing 30 | sign 31 | jump 32 | go 33 | street 34 | video 35 | pass 36 | run 37 | blue 38 | clock 39 | sun 40 | uniform 41 | cell 42 | public 43 | body 44 | full 45 | commercial 46 | french 47 | surfer 48 | water 49 | baseball 50 | sink 51 | box 52 | boy 53 | colored 54 | luggage 55 | receive 56 | airport 57 | pick 58 | military 59 | climb 60 | apple 61 | family 62 | wii 63 | snowboard 64 | motor 65 | market 66 | standing 67 | plow 68 | next 69 | few 70 | camera 71 | vehicle 72 | strike 73 | tell 74 | flat 75 | door 76 | chocolate 77 | phone 78 | train 79 | adult 80 | baby 81 | hold 82 | fly 83 | room 84 | salad 85 | player 86 | car 87 | ride 88 | work 89 | cat 90 | decker 91 | donut 92 | male 93 | beautiful 94 | grassy 95 | give 96 | high 97 | 
something 98 | hit 99 | airplane 100 | dress 101 | pink 102 | huge 103 | end 104 | sit 105 | provide 106 | pine 107 | lamp 108 | animal 109 | elephant 110 | tile 111 | beach 112 | pizza 113 | plant 114 | sandwich 115 | flip 116 | stop 117 | plane 118 | court 119 | wave 120 | man 121 | surfing 122 | crowded 123 | light 124 | counter 125 | meat 126 | green 127 | block 128 | enter 129 | basket 130 | tall 131 | playing 132 | talk 133 | wine 134 | cute 135 | help 136 | office 137 | move 138 | meter 139 | paper 140 | motorcycle 141 | oven 142 | keyboard 143 | bunch 144 | style 145 | police 146 | monitor 147 | fix 148 | hot 149 | window 150 | orange 151 | covered 152 | soccer 153 | sauce 154 | coffee 155 | someone 156 | return 157 | food 158 | wooded 159 | scene 160 | giraffe 161 | half 162 | front 163 | silver 164 | bread 165 | rocky 166 | stir 167 | rock 168 | tray 169 | meal 170 | house 171 | zebra 172 | girl 173 | enclosure 174 | sandy 175 | living 176 | flower 177 | electronic 178 | hill 179 | red 180 | umbrella 181 | dirt 182 | hang 183 | horse 184 | cart 185 | graze 186 | base 187 | put 188 | sidewalk 189 | skateboard 190 | keep 191 | turn 192 | birthday 193 | swing 194 | bite 195 | feed 196 | cheese 197 | number 198 | carry 199 | open 200 | sheep 201 | city 202 | little 203 | toy 204 | hydrant 205 | jet 206 | top 207 | plastic 208 | station 209 | white 210 | banana 211 | store 212 | way 213 | shelf 214 | hotel 215 | park 216 | steel 217 | television 218 | double 219 | tree 220 | grey 221 | bed 222 | shower 223 | stall 224 | steer 225 | bridge 226 | modern 227 | have 228 | gear 229 | mountain 230 | person 231 | mid 232 | zoo 233 | dining 234 | take 235 | truck 236 | surf 237 | play 238 | multiple 239 | track 240 | serve 241 | reach 242 | leave 243 | pair 244 | refrigerator 245 | clear 246 | metal 247 | dog 248 | face 249 | clean 250 | professional 251 | slope 252 | walking 253 | shot 254 | kite 255 | show 256 | stove 257 | watch 258 | bright 259 | bedroom 260 | 
corner 261 | chicken 262 | ground 263 | giant 264 | busy 265 | wood 266 | black 267 | pretty 268 | rice 269 | plate 270 | handle 271 | colorful 272 | get 273 | sunny 274 | photograph 275 | bear 276 | batter 277 | tiled 278 | striped 279 | gray 280 | bat 281 | catcher 282 | bird 283 | bag 284 | grab 285 | river 286 | view 287 | set 288 | pitch 289 | seat 290 | see 291 | computer 292 | parking 293 | racket 294 | close 295 | stack 296 | outside 297 | various 298 | do 299 | screen 300 | deliver 301 | group 302 | tub 303 | herd 304 | come 305 | kitchen 306 | grass 307 | cow 308 | restaurant 309 | many 310 | let 311 | load 312 | toilet 313 | wall 314 | walk 315 | pole 316 | table 317 | grazing 318 | boat 319 | bathroom 320 | teddy 321 | brick 322 | empty 323 | engine 324 | dry 325 | assorted 326 | fire 327 | child 328 | catch 329 | look 330 | hat 331 | runway 332 | air 333 | trick 334 | country 335 | lush 336 | middle 337 | ready 338 | mouse 339 | helmet 340 | different 341 | shirt 342 | perform 343 | vintage 344 | make 345 | bowl 346 | cross 347 | same 348 | several 349 | pan 350 | ball 351 | drink 352 | rail 353 | hand 354 | fruit 355 | statue 356 | broccoli 357 | kid 358 | surfboard 359 | try 360 | floor 361 | ocean 362 | concrete 363 | edge 364 | bottle 365 | outdoor 366 | photo 367 | laptop 368 | snow 369 | touch 370 | couch 371 | railroad 372 | blow 373 | cut 374 | snowboarder 375 | cup 376 | sky 377 | bench 378 | other 379 | pile 380 | board 381 | wet 382 | ski 383 | kick 384 | skier 385 | read 386 | big 387 | couple 388 | dark 389 | game 390 | traffic 391 | know 392 | desk 393 | doughnut 394 | intersection 395 | lady 396 | furniture 397 | night 398 | tower 399 | old 400 | crowd 401 | picture 402 | back 403 | hair 404 | mirror 405 | home 406 | slice 407 | purple 408 | microwave 409 | prop 410 | skateboarder 411 | knife 412 | be 413 | eating 414 | pose 415 | use 416 | throw 417 | stone 418 | side 419 | dinner 420 | stand 421 | road 422 | image 423 | bath 424 | 
brush 425 | female 426 | racquet 427 | prepare 428 | area 429 | stainless 430 | tennis 431 | lone 432 | long 433 | start 434 | low 435 | lot 436 | suit 437 | fork 438 | head 439 | snowy 440 | bus 441 | pitcher 442 | line 443 | eat 444 | pull 445 | inside 446 | up 447 | carriage 448 | dirty 449 | suitcase 450 | cake 451 | piece 452 | display 453 | passenger 454 | single 455 | check 456 | variety 457 | field 458 | book 459 | holding 460 | tie 461 | nice 462 | brushing 463 | polar 464 | frisbee 465 | building 466 | land 467 | remote 468 | glass 469 | jacket 470 | time 471 | fresh 472 | aardvark 473 | abacus 474 | acorn 475 | acrobatics 476 | acrylic 477 | admiral 478 | albatross 479 | alp 480 | alpaca 481 | ambrosia 482 | amphitheater 483 | anaconda 484 | android 485 | anteater 486 | ape 487 | applesauce 488 | apricot 489 | aquatics 490 | ark 491 | armadillo 492 | armory 493 | armour 494 | arroyo 495 | artillery 496 | astronaut 497 | audiovisual 498 | aviary 499 | axe 500 | azalea 501 | backpacker 502 | badger 503 | bagpipe 504 | balaclava 505 | ballplayer 506 | banjo 507 | banquette 508 | bantam 509 | baobab 510 | baritone 511 | barnacle 512 | barometer 513 | barracuda 514 | barrette 515 | basilica 516 | bassoon 517 | bayonet 518 | beaker 519 | bearskin 520 | beaver 521 | bedchamber 522 | belltower 523 | birdbath 524 | birdie 525 | bishop 526 | bistro 527 | blackbird 528 | blackcurrant 529 | blackjack 530 | blimp 531 | bloodhound 532 | boa 533 | boathouse 534 | boatman 535 | boatswain 536 | bobcat 537 | bobsled 538 | bodyguard 539 | bolero 540 | bongo 541 | bookend 542 | botanist 543 | boudoir 544 | boutique 545 | bramble 546 | brasserie 547 | breadbasket 548 | brontosaurus 549 | brownstone 550 | bubblegum 551 | buckskin 552 | bugle 553 | bumblebee 554 | bungalow 555 | burka 556 | burro 557 | bushbaby 558 | butte 559 | cabaret 560 | cablecar 561 | cabriolet 562 | cadaver 563 | caddie 564 | cadet 565 | caldera 566 | camcorder 567 | canary 568 | candelabra 569 | 
candida 570 | candlestick 571 | cannonball 572 | canon 573 | caribou 574 | carp 575 | cashew 576 | cashmere 577 | cassava 578 | catnip 579 | cauldron 580 | caveman 581 | cedar 582 | cellist 583 | cello 584 | centrifuge 585 | chainsaw 586 | chameleon 587 | chamomile 588 | chapel 589 | chateau 590 | chemist 591 | chickadee 592 | chicory 593 | chiffon 594 | chile 595 | chime 596 | chimpanzee 597 | chinchilla 598 | chowder 599 | chrysalis 600 | chrysanthemum 601 | cinema 602 | circuitry 603 | citadel 604 | clam 605 | clamshell 606 | clarinet 607 | clavichord 608 | clergyman 609 | clinician 610 | clipper 611 | cloak 612 | clothespin 613 | clove 614 | cobweb 615 | cockatoo 616 | cockerel 617 | cockfighting 618 | cockroach 619 | coffin 620 | colander 621 | conch 622 | condominium 623 | conformation 624 | conifer 625 | copier 626 | coriander 627 | corkscrew 628 | cornet 629 | coronet 630 | corsage 631 | corset 632 | cosmopolitan 633 | cougar 634 | courgette 635 | couscous 636 | cowbell 637 | coyote 638 | crayfish 639 | crayon 640 | crocodile 641 | crossbow 642 | crucifix 643 | crutch 644 | cuttlefish 645 | daffodil 646 | dagger 647 | dahl 648 | dais 649 | damselfly 650 | delicatessen 651 | dinghy 652 | dingo 653 | discotheque 654 | discus 655 | divan 656 | dogfood 657 | doghouse 658 | dogwood 659 | doorbell 660 | doorknob 661 | dormitory 662 | doula 663 | dragonfly 664 | drumstick 665 | dumpling 666 | dungeon 667 | earthenware 668 | edifice 669 | eel 670 | emerald 671 | emu 672 | enchilada 673 | envoy 674 | equator 675 | eraser 676 | eucalyptus 677 | euphonium 678 | eyeliner 679 | eyeshade 680 | eyewitness 681 | falafel 682 | farmyard 683 | fauna 684 | fawn 685 | ferryboat 686 | fiddle 687 | fiddler 688 | fife 689 | fig 690 | finch 691 | firefly 692 | fishbowl 693 | flagpole 694 | floorboard 695 | flotsam 696 | flounder 697 | flowerbed 698 | flute 699 | fondue 700 | footbridge 701 | foothill 702 | forceps 703 | fortress 704 | fossil 705 | foxhunting 706 | freighter 707 | 
friar 708 | frittata 709 | fuji 710 | furnace 711 | gallows 712 | gargoyle 713 | gator 714 | gecko 715 | geranium 716 | gibbon 717 | glacier 718 | gladiator 719 | gong 720 | gooseberry 721 | goulash 722 | gramophone 723 | grandpa 724 | grandparent 725 | grasshopper 726 | grenade 727 | groundhog 728 | grouse 729 | guillotine 730 | guitarist 731 | gulag 732 | gunboat 733 | gymnast 734 | gymnastics 735 | gyroscope 736 | hacksaw 737 | haggis 738 | hairpiece 739 | halibut 740 | halo 741 | handcuff 742 | handgun 743 | hare 744 | harmonica 745 | harp 746 | harpoon 747 | harpsichord 748 | hazelnut 749 | headland 750 | headscarf 751 | headteacher 752 | hearse 753 | hedgehog 754 | highland 755 | hitchhiker 756 | hollandaise 757 | honeycomb 758 | honeysuckle 759 | hornet 760 | horseman 761 | horseracing 762 | horseshoe 763 | hostel 764 | hourglass 765 | huckleberry 766 | humpback 767 | humus 768 | huntress 769 | hyacinth 770 | hydra 771 | hydroplane 772 | hyena 773 | iceberg 774 | icemaker 775 | igloo 776 | iguana 777 | impala 778 | jackal 779 | jade 780 | jaguar 781 | jak 782 | jalapeno 783 | jasmine 784 | javelin 785 | jellyfish 786 | jester 787 | jigsaw 788 | jock 789 | juggler 790 | jukebox 791 | junkyard 792 | jurist 793 | juror 794 | keypad 795 | kiwifruit 796 | knitwear 797 | kola 798 | ladybird 799 | lama 800 | lapdog 801 | laurel 802 | lawnmower 803 | lectern 804 | leech 805 | lemur 806 | lentil 807 | leotard 808 | levee 809 | lichen 810 | lifeboat 811 | lifejacket 812 | lightbulb 813 | locust 814 | longbow 815 | lorikeet 816 | lory 817 | lychee 818 | lynx 819 | macaque 820 | macaw 821 | machete 822 | mackerel 823 | madras 824 | magician 825 | magnolia 826 | mammoth 827 | manatee 828 | mandolin 829 | mantis 830 | mantlepiece 831 | marigold 832 | marimba 833 | marshmallow 834 | marzipan 835 | mascara 836 | masquerade 837 | mastodon 838 | matador 839 | mayflower 840 | medusa 841 | megaphone 842 | menagerie 843 | mesa 844 | metronome 845 | microprocessor 846 | 
microscope 847 | mimosa 848 | minestrone 849 | minibar 850 | minicomputer 851 | miniskirt 852 | mink 853 | missile 854 | mockingbird 855 | mongoose 856 | morgue 857 | mosque 858 | moth 859 | mousetrap 860 | moussaka 861 | mousse 862 | mullet 863 | mussel 864 | mustang 865 | naan 866 | nanny 867 | nebula 868 | nectarine 869 | nematode 870 | neurologist 871 | newsstand 872 | nightingale 873 | nightshade 874 | nipple 875 | nutcracker 876 | nutmeg 877 | oboe 878 | okapi 879 | opera 880 | orangutan 881 | orca 882 | oregano 883 | organist 884 | osprey 885 | otter 886 | oyster 887 | pajama 888 | palmetto 889 | papaya 890 | parlour 891 | parsnip 892 | peafowl 893 | pecan 894 | penthouse 895 | peppercorn 896 | percussionist 897 | persimmon 898 | petticoat 899 | pharmacy 900 | pheasant 901 | phonograph 902 | photocopy 903 | piccolo 904 | piglet 905 | pilaf 906 | pinion 907 | piranha 908 | pistachio 909 | pitchfork 910 | pixie 911 | plateau 912 | platypus 913 | plum 914 | plumber 915 | pocketbook 916 | pocketknife 917 | poinsettia 918 | polarbear 919 | poppet 920 | porpoise 921 | possum 922 | potluck 923 | poultry 924 | powerboat 925 | precipice 926 | primrose 927 | prong 928 | prune 929 | pueblo 930 | puffin 931 | pumpernickel 932 | pupa 933 | puppeteer 934 | pushchair 935 | pussycat 936 | python 937 | quail 938 | quarterback 939 | quill 940 | quilting 941 | quiver 942 | rabbi 943 | raccoon 944 | radicchio 945 | raisin 946 | rake 947 | raptor 948 | rattan 949 | redhead 950 | regalia 951 | revolver 952 | rhubarb 953 | ricotta 954 | riverbed 955 | roach 956 | roadblock 957 | roadhouse 958 | roadrunner 959 | roaster 960 | rocker 961 | rollercoaster 962 | romaine 963 | rosemary 964 | roulette 965 | sake 966 | salamander 967 | sallow 968 | samba 969 | sander 970 | sari 971 | sashimi 972 | saucepan 973 | sauna 974 | saxophone 975 | schnauzer 976 | schnitzel 977 | schoolhouse 978 | schoolmaster 979 | schoolroom 980 | sconce 981 | scone 982 | scythe 983 | seabird 984 | seaboard 985 
| seafront 986 | semiautomatic 987 | serpent 988 | settler 989 | shackle 990 | shallot 991 | shipwreck 992 | shoebox 993 | shorebird 994 | showboat 995 | shrew 996 | sitar 997 | skunk 998 | sloth 999 | snapper 1000 | snorkel 1001 | snowbird 1002 | snowcap 1003 | snowdrift 1004 | snowflake 1005 | snowplow 1006 | snowshoe 1007 | sofabed 1008 | songbird 1009 | soybean 1010 | spaceship 1011 | spectacles 1012 | spectrometer 1013 | sphinx 1014 | sprinkler 1015 | spruce 1016 | stag 1017 | stallion 1018 | starfish 1019 | steamship 1020 | steed 1021 | stingray 1022 | stonework 1023 | stopwatch 1024 | streamer 1025 | streamliner 1026 | stretcher 1027 | subcontractor 1028 | summerhouse 1029 | sunfish 1030 | sunglass 1031 | supercomputer 1032 | superhighway 1033 | supernova 1034 | swallow 1035 | sweetcorn 1036 | swine 1037 | swordfish 1038 | sycamore 1039 | syringe 1040 | tabasco 1041 | tabernacle 1042 | tamale 1043 | tambourine 1044 | tammy 1045 | tankard 1046 | taper 1047 | tarantula 1048 | tearoom 1049 | teepee 1050 | telegraph 1051 | telegraphy 1052 | tempura 1053 | teriyaki 1054 | theremin 1055 | throne 1056 | timber 1057 | timpani 1058 | toad 1059 | toboggan 1060 | tombstone 1061 | topaz 1062 | topcoat 1063 | torch 1064 | torpedo 1065 | toucan 1066 | towboat 1067 | trackball 1068 | trampoline 1069 | tramway 1070 | trapeze 1071 | treadmill 1072 | trifle 1073 | truffle 1074 | trumpeter 1075 | tuba 1076 | tulip 1077 | turbojet 1078 | turnip 1079 | turnpike 1080 | tutu 1081 | tweed 1082 | tyrannosaurus 1083 | ukulele 1084 | unicycle 1085 | urchin 1086 | vat 1087 | verandah 1088 | vestibule 1089 | vial 1090 | viola 1091 | volcano 1092 | vole 1093 | voyager 1094 | vulture 1095 | waggon 1096 | wallaby 1097 | walrus 1098 | warbler 1099 | warship 1100 | warthog 1101 | wasp 1102 | watchtower 1103 | weevil 1104 | weld 1105 | wheelhouse 1106 | whippet 1107 | whisky 1108 | witch 1109 | wolverine 1110 | wombat 1111 | woodpecker 1112 | woollen 1113 | worm 1114 | wrangler 1115 | wren 
1116 | yam 1117 | zipper 1118 | -------------------------------------------------------------------------------- /utils/lexicalList/lexicalList_471_rebuttalScale_justImageNet.txt: -------------------------------------------------------------------------------- 1 | yellow 2 | skate 3 | asian 4 | hanging 5 | dish 6 | chair 7 | row 8 | tv 9 | young 10 | bike 11 | cluttered 12 | brown 13 | woman 14 | blanket 15 | vase 16 | fall 17 | cook 18 | drinking 19 | school 20 | wooden 21 | cloudy 22 | large 23 | sand 24 | small 25 | guy 26 | enjoy 27 | bicycle 28 | fence 29 | skiing 30 | sign 31 | jump 32 | go 33 | street 34 | video 35 | pass 36 | run 37 | blue 38 | clock 39 | sun 40 | uniform 41 | cell 42 | public 43 | body 44 | full 45 | commercial 46 | french 47 | surfer 48 | water 49 | baseball 50 | sink 51 | box 52 | boy 53 | colored 54 | luggage 55 | receive 56 | airport 57 | pick 58 | military 59 | climb 60 | apple 61 | family 62 | wii 63 | snowboard 64 | motor 65 | market 66 | standing 67 | plow 68 | next 69 | few 70 | camera 71 | vehicle 72 | strike 73 | tell 74 | flat 75 | door 76 | chocolate 77 | phone 78 | train 79 | adult 80 | baby 81 | hold 82 | fly 83 | room 84 | salad 85 | player 86 | car 87 | ride 88 | work 89 | cat 90 | decker 91 | donut 92 | male 93 | beautiful 94 | grassy 95 | give 96 | high 97 | something 98 | hit 99 | airplane 100 | dress 101 | pink 102 | huge 103 | end 104 | sit 105 | provide 106 | pine 107 | lamp 108 | animal 109 | elephant 110 | tile 111 | beach 112 | pizza 113 | plant 114 | sandwich 115 | flip 116 | stop 117 | plane 118 | court 119 | wave 120 | man 121 | surfing 122 | crowded 123 | light 124 | counter 125 | meat 126 | green 127 | block 128 | enter 129 | basket 130 | tall 131 | playing 132 | talk 133 | wine 134 | cute 135 | help 136 | office 137 | move 138 | meter 139 | paper 140 | motorcycle 141 | oven 142 | keyboard 143 | bunch 144 | style 145 | police 146 | monitor 147 | fix 148 | hot 149 | window 150 | orange 151 | covered 152 | 
soccer 153 | sauce 154 | coffee 155 | someone 156 | return 157 | food 158 | wooded 159 | scene 160 | giraffe 161 | half 162 | front 163 | silver 164 | bread 165 | rocky 166 | stir 167 | rock 168 | tray 169 | meal 170 | house 171 | zebra 172 | girl 173 | enclosure 174 | sandy 175 | living 176 | flower 177 | electronic 178 | hill 179 | red 180 | umbrella 181 | dirt 182 | hang 183 | horse 184 | cart 185 | graze 186 | base 187 | put 188 | sidewalk 189 | skateboard 190 | keep 191 | turn 192 | birthday 193 | swing 194 | bite 195 | feed 196 | cheese 197 | number 198 | carry 199 | open 200 | sheep 201 | city 202 | little 203 | toy 204 | hydrant 205 | jet 206 | top 207 | plastic 208 | station 209 | white 210 | banana 211 | store 212 | way 213 | shelf 214 | hotel 215 | park 216 | steel 217 | television 218 | double 219 | tree 220 | grey 221 | bed 222 | shower 223 | stall 224 | steer 225 | bridge 226 | modern 227 | have 228 | gear 229 | mountain 230 | person 231 | mid 232 | zoo 233 | dining 234 | take 235 | truck 236 | surf 237 | play 238 | multiple 239 | track 240 | serve 241 | reach 242 | leave 243 | pair 244 | refrigerator 245 | clear 246 | metal 247 | dog 248 | face 249 | clean 250 | professional 251 | slope 252 | walking 253 | shot 254 | kite 255 | show 256 | stove 257 | watch 258 | bright 259 | bedroom 260 | corner 261 | chicken 262 | ground 263 | giant 264 | busy 265 | wood 266 | black 267 | pretty 268 | rice 269 | plate 270 | handle 271 | colorful 272 | get 273 | sunny 274 | photograph 275 | bear 276 | batter 277 | tiled 278 | striped 279 | gray 280 | bat 281 | catcher 282 | bird 283 | bag 284 | grab 285 | river 286 | view 287 | set 288 | pitch 289 | seat 290 | see 291 | computer 292 | parking 293 | racket 294 | close 295 | stack 296 | outside 297 | various 298 | do 299 | screen 300 | deliver 301 | group 302 | tub 303 | herd 304 | come 305 | kitchen 306 | grass 307 | cow 308 | restaurant 309 | many 310 | let 311 | load 312 | toilet 313 | wall 314 | walk 315 | pole 316 
| table 317 | grazing 318 | boat 319 | bathroom 320 | teddy 321 | brick 322 | empty 323 | engine 324 | dry 325 | assorted 326 | fire 327 | child 328 | catch 329 | look 330 | hat 331 | runway 332 | air 333 | trick 334 | country 335 | lush 336 | middle 337 | ready 338 | mouse 339 | helmet 340 | different 341 | shirt 342 | perform 343 | vintage 344 | make 345 | bowl 346 | cross 347 | same 348 | several 349 | pan 350 | ball 351 | drink 352 | rail 353 | hand 354 | fruit 355 | statue 356 | broccoli 357 | kid 358 | surfboard 359 | try 360 | floor 361 | ocean 362 | concrete 363 | edge 364 | bottle 365 | outdoor 366 | photo 367 | laptop 368 | snow 369 | touch 370 | couch 371 | railroad 372 | blow 373 | cut 374 | snowboarder 375 | cup 376 | sky 377 | bench 378 | other 379 | pile 380 | board 381 | wet 382 | ski 383 | kick 384 | skier 385 | read 386 | big 387 | couple 388 | dark 389 | game 390 | traffic 391 | know 392 | desk 393 | doughnut 394 | intersection 395 | lady 396 | furniture 397 | night 398 | tower 399 | old 400 | crowd 401 | picture 402 | back 403 | hair 404 | mirror 405 | home 406 | slice 407 | purple 408 | microwave 409 | prop 410 | skateboarder 411 | knife 412 | be 413 | eating 414 | pose 415 | use 416 | throw 417 | stone 418 | side 419 | dinner 420 | stand 421 | road 422 | image 423 | bath 424 | brush 425 | female 426 | racquet 427 | prepare 428 | area 429 | stainless 430 | tennis 431 | lone 432 | long 433 | start 434 | low 435 | lot 436 | suit 437 | fork 438 | head 439 | snowy 440 | bus 441 | pitcher 442 | line 443 | eat 444 | pull 445 | inside 446 | up 447 | carriage 448 | dirty 449 | suitcase 450 | cake 451 | piece 452 | display 453 | passenger 454 | single 455 | check 456 | variety 457 | field 458 | book 459 | holding 460 | tie 461 | nice 462 | brushing 463 | polar 464 | frisbee 465 | building 466 | land 467 | remote 468 | glass 469 | jacket 470 | time 471 | fresh 472 | aardvark 473 | abacus 474 | acorn 475 | acrobatics 476 | acrylic 477 | admiral 478 | 
albatross 479 | alp 480 | alpaca 481 | ambrosia 482 | amphitheater 483 | anaconda 484 | android 485 | anteater 486 | ape 487 | applesauce 488 | apricot 489 | aquatics 490 | ark 491 | armadillo 492 | armory 493 | armour 494 | arroyo 495 | artillery 496 | astronaut 497 | audiovisual 498 | aviary 499 | axe 500 | azalea 501 | backpacker 502 | badger 503 | bagpipe 504 | balaclava 505 | ballplayer 506 | banjo 507 | banquette 508 | bantam 509 | baobab 510 | baritone 511 | barnacle 512 | barometer 513 | barracuda 514 | barrette 515 | basilica 516 | bassoon 517 | bayonet 518 | beaker 519 | bearskin 520 | beaver 521 | bedchamber 522 | belltower 523 | birdbath 524 | birdie 525 | bishop 526 | bistro 527 | blackbird 528 | blackcurrant 529 | blackjack 530 | blimp 531 | bloodhound 532 | boa 533 | boathouse 534 | boatman 535 | boatswain 536 | bobcat 537 | bobsled 538 | bodyguard 539 | bolero 540 | bongo 541 | bookend 542 | botanist 543 | boudoir 544 | boutique 545 | bramble 546 | brasserie 547 | breadbasket 548 | brontosaurus 549 | brownstone 550 | bubblegum 551 | buckskin 552 | bugle 553 | bumblebee 554 | bungalow 555 | burka 556 | burro 557 | bushbaby 558 | butte 559 | cabaret 560 | cablecar 561 | cabriolet 562 | cadaver 563 | caddie 564 | cadet 565 | caldera 566 | camcorder 567 | canary 568 | candelabra 569 | candida 570 | candlestick 571 | cannonball 572 | canon 573 | caribou 574 | carp 575 | cashew 576 | cashmere 577 | cassava 578 | catnip 579 | cauldron 580 | caveman 581 | cedar 582 | cellist 583 | cello 584 | centrifuge 585 | chainsaw 586 | chameleon 587 | chamomile 588 | chapel 589 | chateau 590 | chemist 591 | chickadee 592 | chicory 593 | chiffon 594 | chile 595 | chime 596 | chimpanzee 597 | chinchilla 598 | chowder 599 | chrysalis 600 | chrysanthemum 601 | cinema 602 | circuitry 603 | citadel 604 | clam 605 | clamshell 606 | clarinet 607 | clavichord 608 | clergyman 609 | clinician 610 | clipper 611 | cloak 612 | clothespin 613 | clove 614 | cobweb 615 | cockatoo 616 | 
cockerel 617 | cockfighting 618 | cockroach 619 | coffin 620 | colander 621 | conch 622 | condominium 623 | conformation 624 | conifer 625 | copier 626 | coriander 627 | corkscrew 628 | cornet 629 | coronet 630 | corsage 631 | corset 632 | cosmopolitan 633 | cougar 634 | courgette 635 | couscous 636 | cowbell 637 | coyote 638 | crayfish 639 | crayon 640 | crocodile 641 | crossbow 642 | crucifix 643 | crutch 644 | cuttlefish 645 | daffodil 646 | dagger 647 | dahl 648 | dais 649 | damselfly 650 | delicatessen 651 | dinghy 652 | dingo 653 | discotheque 654 | discus 655 | divan 656 | dogfood 657 | doghouse 658 | dogwood 659 | doorbell 660 | doorknob 661 | dormitory 662 | doula 663 | dragonfly 664 | drumstick 665 | dumpling 666 | dungeon 667 | earthenware 668 | edifice 669 | eel 670 | emerald 671 | emu 672 | enchilada 673 | envoy 674 | equator 675 | eraser 676 | eucalyptus 677 | euphonium 678 | eyeliner 679 | eyeshade 680 | eyewitness 681 | falafel 682 | farmyard 683 | fauna 684 | fawn 685 | ferryboat 686 | fiddle 687 | fiddler 688 | fife 689 | fig 690 | finch 691 | firefly 692 | fishbowl 693 | flagpole 694 | floorboard 695 | flotsam 696 | flounder 697 | flowerbed 698 | flute 699 | fondue 700 | footbridge 701 | foothill 702 | forceps 703 | fortress 704 | fossil 705 | foxhunting 706 | freighter 707 | friar 708 | frittata 709 | fuji 710 | furnace 711 | gallows 712 | gargoyle 713 | gator 714 | gecko 715 | geranium 716 | gibbon 717 | glacier 718 | gladiator 719 | gong 720 | gooseberry 721 | goulash 722 | gramophone 723 | grandpa 724 | grandparent 725 | grasshopper 726 | grenade 727 | groundhog 728 | grouse 729 | guillotine 730 | guitarist 731 | gulag 732 | gunboat 733 | gymnast 734 | gymnastics 735 | gyroscope 736 | hacksaw 737 | haggis 738 | hairpiece 739 | halibut 740 | halo 741 | handcuff 742 | handgun 743 | hare 744 | harmonica 745 | harp 746 | harpoon 747 | harpsichord 748 | hazelnut 749 | headland 750 | headscarf 751 | headteacher 752 | hearse 753 | hedgehog 754 | 
highland 755 | hitchhiker 756 | hollandaise 757 | honeycomb 758 | honeysuckle 759 | hornet 760 | horseman 761 | horseracing 762 | horseshoe 763 | hostel 764 | hourglass 765 | huckleberry 766 | humpback 767 | humus 768 | huntress 769 | hyacinth 770 | hydra 771 | hydroplane 772 | hyena 773 | iceberg 774 | icemaker 775 | igloo 776 | iguana 777 | impala 778 | jackal 779 | jade 780 | jaguar 781 | jak 782 | jalapeno 783 | jasmine 784 | javelin 785 | jellyfish 786 | jester 787 | jigsaw 788 | jock 789 | juggler 790 | jukebox 791 | junkyard 792 | jurist 793 | juror 794 | keypad 795 | kiwifruit 796 | knitwear 797 | kola 798 | ladybird 799 | lama 800 | lapdog 801 | laurel 802 | lawnmower 803 | lectern 804 | leech 805 | lemur 806 | lentil 807 | leotard 808 | levee 809 | lichen 810 | lifeboat 811 | lifejacket 812 | lightbulb 813 | locust 814 | longbow 815 | lorikeet 816 | lory 817 | lychee 818 | lynx 819 | macaque 820 | macaw 821 | machete 822 | mackerel 823 | madras 824 | magician 825 | magnolia 826 | mammoth 827 | manatee 828 | mandolin 829 | mantis 830 | mantlepiece 831 | marigold 832 | marimba 833 | marshmallow 834 | marzipan 835 | mascara 836 | masquerade 837 | mastodon 838 | matador 839 | mayflower 840 | medusa 841 | megaphone 842 | menagerie 843 | mesa 844 | metronome 845 | microprocessor 846 | microscope 847 | mimosa 848 | minestrone 849 | minibar 850 | minicomputer 851 | miniskirt 852 | mink 853 | missile 854 | mockingbird 855 | mongoose 856 | morgue 857 | mosque 858 | moth 859 | mousetrap 860 | moussaka 861 | mousse 862 | mullet 863 | mussel 864 | mustang 865 | naan 866 | nanny 867 | nebula 868 | nectarine 869 | nematode 870 | neurologist 871 | newsstand 872 | nightingale 873 | nightshade 874 | nipple 875 | nutcracker 876 | nutmeg 877 | oboe 878 | okapi 879 | opera 880 | orangutan 881 | orca 882 | oregano 883 | organist 884 | osprey 885 | otter 886 | oyster 887 | pajama 888 | palmetto 889 | papaya 890 | parlour 891 | parsnip 892 | peafowl 893 | pecan 894 | penthouse 
895 | peppercorn 896 | percussionist 897 | persimmon 898 | petticoat 899 | pharmacy 900 | pheasant 901 | phonograph 902 | photocopy 903 | piccolo 904 | piglet 905 | pilaf 906 | pinion 907 | piranha 908 | pistachio 909 | pitchfork 910 | pixie 911 | plateau 912 | platypus 913 | plum 914 | plumber 915 | pocketbook 916 | pocketknife 917 | poinsettia 918 | polarbear 919 | poppet 920 | porpoise 921 | possum 922 | potluck 923 | poultry 924 | powerboat 925 | precipice 926 | primrose 927 | prong 928 | prune 929 | pueblo 930 | puffin 931 | pumpernickel 932 | pupa 933 | puppeteer 934 | pushchair 935 | pussycat 936 | python 937 | quail 938 | quarterback 939 | quill 940 | quilting 941 | quiver 942 | rabbi 943 | raccoon 944 | radicchio 945 | raisin 946 | rake 947 | raptor 948 | rattan 949 | redhead 950 | regalia 951 | revolver 952 | rhubarb 953 | ricotta 954 | riverbed 955 | roach 956 | roadblock 957 | roadhouse 958 | roadrunner 959 | roaster 960 | rocker 961 | rollercoaster 962 | romaine 963 | rosemary 964 | roulette 965 | sake 966 | salamander 967 | sallow 968 | samba 969 | sander 970 | sari 971 | sashimi 972 | saucepan 973 | sauna 974 | saxophone 975 | schnauzer 976 | schnitzel 977 | schoolhouse 978 | schoolmaster 979 | schoolroom 980 | sconce 981 | scone 982 | scythe 983 | seabird 984 | seaboard 985 | seafront 986 | semiautomatic 987 | serpent 988 | settler 989 | shackle 990 | shallot 991 | shipwreck 992 | shoebox 993 | shorebird 994 | showboat 995 | shrew 996 | sitar 997 | skunk 998 | sloth 999 | snapper 1000 | snorkel 1001 | snowbird 1002 | snowcap 1003 | snowdrift 1004 | snowflake 1005 | snowplow 1006 | snowshoe 1007 | sofabed 1008 | songbird 1009 | soybean 1010 | spaceship 1011 | spectacles 1012 | spectrometer 1013 | sphinx 1014 | sprinkler 1015 | spruce 1016 | stag 1017 | stallion 1018 | starfish 1019 | steamship 1020 | steed 1021 | stingray 1022 | stonework 1023 | stopwatch 1024 | streamer 1025 | streamliner 1026 | stretcher 1027 | subcontractor 1028 | summerhouse 1029 
| sunfish 1030 | sunglass 1031 | supercomputer 1032 | superhighway 1033 | supernova 1034 | swallow 1035 | sweetcorn 1036 | swine 1037 | swordfish 1038 | sycamore 1039 | syringe 1040 | tabasco 1041 | tabernacle 1042 | tamale 1043 | tambourine 1044 | tammy 1045 | tankard 1046 | taper 1047 | tarantula 1048 | tearoom 1049 | teepee 1050 | telegraph 1051 | telegraphy 1052 | tempura 1053 | teriyaki 1054 | theremin 1055 | throne 1056 | timber 1057 | timpani 1058 | toad 1059 | toboggan 1060 | tombstone 1061 | topaz 1062 | topcoat 1063 | torch 1064 | torpedo 1065 | toucan 1066 | towboat 1067 | trackball 1068 | trampoline 1069 | tramway 1070 | trapeze 1071 | treadmill 1072 | trifle 1073 | truffle 1074 | trumpeter 1075 | tuba 1076 | tulip 1077 | turbojet 1078 | turnip 1079 | turnpike 1080 | tutu 1081 | tweed 1082 | tyrannosaurus 1083 | ukulele 1084 | unicycle 1085 | urchin 1086 | vat 1087 | verandah 1088 | vestibule 1089 | vial 1090 | viola 1091 | volcano 1092 | vole 1093 | voyager 1094 | vulture 1095 | waggon 1096 | wallaby 1097 | walrus 1098 | warbler 1099 | warship 1100 | warthog 1101 | wasp 1102 | watchtower 1103 | weevil 1104 | weld 1105 | wheelhouse 1106 | whippet 1107 | whisky 1108 | witch 1109 | wolverine 1110 | wombat 1111 | woodpecker 1112 | woollen 1113 | worm 1114 | wrangler 1115 | wren 1116 | yam 1117 | zipper 1118 | -------------------------------------------------------------------------------- /utils/lexicalList/lexicalList_JJ100_NN300_VB100_rmEightCoco1.txt: -------------------------------------------------------------------------------- 1 | yellow 2 | skate 3 | asian 4 | hanging 5 | dish 6 | chair 7 | row 8 | tv 9 | young 10 | bike 11 | cluttered 12 | brown 13 | woman 14 | blanket 15 | vase 16 | fall 17 | cook 18 | drinking 19 | school 20 | wooden 21 | cloudy 22 | large 23 | sand 24 | small 25 | guy 26 | enjoy 27 | bicycle 28 | fence 29 | skiing 30 | sign 31 | jump 32 | go 33 | street 34 | video 35 | pass 36 | run 37 | blue 38 | clock 39 | sun 40 | uniform 41 
| cell 42 | public 43 | body 44 | full 45 | commercial 46 | french 47 | surfer 48 | water 49 | baseball 50 | sink 51 | box 52 | boy 53 | colored 54 | receive 55 | airport 56 | pick 57 | military 58 | climb 59 | apple 60 | family 61 | wii 62 | snowboard 63 | motor 64 | market 65 | standing 66 | plow 67 | next 68 | few 69 | camera 70 | vehicle 71 | strike 72 | tell 73 | flat 74 | door 75 | chocolate 76 | phone 77 | train 78 | adult 79 | baby 80 | hold 81 | fly 82 | room 83 | salad 84 | player 85 | car 86 | ride 87 | work 88 | cat 89 | decker 90 | donut 91 | male 92 | beautiful 93 | grassy 94 | give 95 | high 96 | something 97 | hit 98 | airplane 99 | dress 100 | pink 101 | huge 102 | end 103 | sit 104 | provide 105 | pine 106 | lamp 107 | animal 108 | elephant 109 | tile 110 | beach 111 | plant 112 | sandwich 113 | flip 114 | stop 115 | plane 116 | court 117 | wave 118 | man 119 | surfing 120 | crowded 121 | light 122 | counter 123 | meat 124 | green 125 | block 126 | enter 127 | basket 128 | tall 129 | playing 130 | talk 131 | wine 132 | cute 133 | help 134 | office 135 | move 136 | meter 137 | paper 138 | motorcycle 139 | oven 140 | keyboard 141 | bunch 142 | style 143 | police 144 | monitor 145 | fix 146 | hot 147 | window 148 | orange 149 | covered 150 | soccer 151 | sauce 152 | coffee 153 | someone 154 | return 155 | food 156 | wooded 157 | scene 158 | giraffe 159 | half 160 | front 161 | silver 162 | bread 163 | rocky 164 | stir 165 | rock 166 | tray 167 | meal 168 | house 169 | girl 170 | enclosure 171 | sandy 172 | living 173 | flower 174 | electronic 175 | hill 176 | red 177 | umbrella 178 | dirt 179 | hang 180 | horse 181 | cart 182 | graze 183 | base 184 | put 185 | sidewalk 186 | skateboard 187 | keep 188 | turn 189 | birthday 190 | swing 191 | bite 192 | feed 193 | cheese 194 | number 195 | carry 196 | open 197 | sheep 198 | city 199 | little 200 | toy 201 | hydrant 202 | jet 203 | top 204 | plastic 205 | station 206 | white 207 | banana 208 | store 209 
| way 210 | shelf 211 | hotel 212 | park 213 | steel 214 | television 215 | double 216 | tree 217 | grey 218 | bed 219 | shower 220 | stall 221 | steer 222 | bridge 223 | modern 224 | have 225 | gear 226 | mountain 227 | person 228 | mid 229 | zoo 230 | dining 231 | take 232 | truck 233 | surf 234 | play 235 | multiple 236 | track 237 | serve 238 | reach 239 | leave 240 | pair 241 | refrigerator 242 | clear 243 | metal 244 | dog 245 | face 246 | clean 247 | professional 248 | slope 249 | walking 250 | shot 251 | kite 252 | show 253 | stove 254 | watch 255 | bright 256 | bedroom 257 | corner 258 | chicken 259 | ground 260 | giant 261 | busy 262 | wood 263 | black 264 | pretty 265 | rice 266 | plate 267 | handle 268 | colorful 269 | get 270 | sunny 271 | photograph 272 | bear 273 | batter 274 | tiled 275 | striped 276 | gray 277 | bat 278 | catcher 279 | bird 280 | bag 281 | grab 282 | river 283 | view 284 | set 285 | pitch 286 | seat 287 | see 288 | computer 289 | parking 290 | close 291 | stack 292 | outside 293 | various 294 | do 295 | screen 296 | deliver 297 | group 298 | tub 299 | herd 300 | come 301 | kitchen 302 | grass 303 | cow 304 | restaurant 305 | many 306 | let 307 | load 308 | toilet 309 | wall 310 | walk 311 | pole 312 | table 313 | grazing 314 | boat 315 | bathroom 316 | teddy 317 | brick 318 | empty 319 | engine 320 | dry 321 | assorted 322 | fire 323 | child 324 | catch 325 | look 326 | hat 327 | runway 328 | air 329 | trick 330 | country 331 | lush 332 | middle 333 | ready 334 | mouse 335 | helmet 336 | different 337 | shirt 338 | perform 339 | vintage 340 | make 341 | bowl 342 | cross 343 | same 344 | several 345 | pan 346 | ball 347 | drink 348 | rail 349 | hand 350 | fruit 351 | statue 352 | broccoli 353 | kid 354 | surfboard 355 | try 356 | floor 357 | ocean 358 | concrete 359 | edge 360 | outdoor 361 | photo 362 | laptop 363 | snow 364 | touch 365 | railroad 366 | blow 367 | cut 368 | snowboarder 369 | cup 370 | sky 371 | bench 372 | other 
373 | pile 374 | board 375 | wet 376 | ski 377 | kick 378 | skier 379 | read 380 | big 381 | couple 382 | dark 383 | game 384 | traffic 385 | know 386 | desk 387 | doughnut 388 | intersection 389 | lady 390 | furniture 391 | night 392 | tower 393 | old 394 | crowd 395 | picture 396 | back 397 | hair 398 | mirror 399 | home 400 | slice 401 | purple 402 | prop 403 | skateboarder 404 | knife 405 | be 406 | eating 407 | pose 408 | use 409 | throw 410 | stone 411 | side 412 | dinner 413 | stand 414 | road 415 | image 416 | bath 417 | brush 418 | female 419 | prepare 420 | area 421 | stainless 422 | tennis 423 | lone 424 | long 425 | start 426 | low 427 | lot 428 | suit 429 | fork 430 | head 431 | snowy 432 | pitcher 433 | line 434 | eat 435 | pull 436 | inside 437 | up 438 | carriage 439 | dirty 440 | cake 441 | piece 442 | display 443 | passenger 444 | single 445 | check 446 | variety 447 | field 448 | book 449 | holding 450 | tie 451 | nice 452 | brushing 453 | polar 454 | frisbee 455 | building 456 | land 457 | remote 458 | glass 459 | jacket 460 | time 461 | fresh 462 | -------------------------------------------------------------------------------- /utils/lexicalList/lexicalList_parseCoco_JJ100_NN300_VB100.txt: -------------------------------------------------------------------------------- 1 | yellow 2 | skate 3 | asian 4 | hanging 5 | dish 6 | chair 7 | row 8 | tv 9 | young 10 | bike 11 | cluttered 12 | brown 13 | woman 14 | blanket 15 | vase 16 | fall 17 | cook 18 | drinking 19 | school 20 | wooden 21 | cloudy 22 | large 23 | sand 24 | small 25 | guy 26 | enjoy 27 | bicycle 28 | fence 29 | skiing 30 | sign 31 | jump 32 | go 33 | street 34 | video 35 | pass 36 | run 37 | blue 38 | clock 39 | sun 40 | uniform 41 | cell 42 | public 43 | body 44 | full 45 | commercial 46 | french 47 | surfer 48 | water 49 | baseball 50 | sink 51 | box 52 | boy 53 | colored 54 | luggage 55 | receive 56 | airport 57 | pick 58 | military 59 | climb 60 | apple 61 | family 62 | wii 63 | 
snowboard 64 | motor 65 | market 66 | standing 67 | plow 68 | next 69 | few 70 | camera 71 | vehicle 72 | strike 73 | tell 74 | flat 75 | door 76 | chocolate 77 | phone 78 | train 79 | adult 80 | baby 81 | hold 82 | fly 83 | room 84 | salad 85 | player 86 | car 87 | ride 88 | work 89 | cat 90 | decker 91 | donut 92 | male 93 | beautiful 94 | grassy 95 | give 96 | high 97 | something 98 | hit 99 | airplane 100 | dress 101 | pink 102 | huge 103 | end 104 | sit 105 | provide 106 | pine 107 | lamp 108 | animal 109 | elephant 110 | tile 111 | beach 112 | pizza 113 | plant 114 | sandwich 115 | flip 116 | stop 117 | plane 118 | court 119 | wave 120 | man 121 | surfing 122 | crowded 123 | light 124 | counter 125 | meat 126 | green 127 | block 128 | enter 129 | basket 130 | tall 131 | playing 132 | talk 133 | wine 134 | cute 135 | help 136 | office 137 | move 138 | meter 139 | paper 140 | motorcycle 141 | oven 142 | keyboard 143 | bunch 144 | style 145 | police 146 | monitor 147 | fix 148 | hot 149 | window 150 | orange 151 | covered 152 | soccer 153 | sauce 154 | coffee 155 | someone 156 | return 157 | food 158 | wooded 159 | scene 160 | giraffe 161 | half 162 | front 163 | silver 164 | bread 165 | rocky 166 | stir 167 | rock 168 | tray 169 | meal 170 | house 171 | zebra 172 | girl 173 | enclosure 174 | sandy 175 | living 176 | flower 177 | electronic 178 | hill 179 | red 180 | umbrella 181 | dirt 182 | hang 183 | horse 184 | cart 185 | graze 186 | base 187 | put 188 | sidewalk 189 | skateboard 190 | keep 191 | turn 192 | birthday 193 | swing 194 | bite 195 | feed 196 | cheese 197 | number 198 | carry 199 | open 200 | sheep 201 | city 202 | little 203 | toy 204 | hydrant 205 | jet 206 | top 207 | plastic 208 | station 209 | white 210 | banana 211 | store 212 | way 213 | shelf 214 | hotel 215 | park 216 | steel 217 | television 218 | double 219 | tree 220 | grey 221 | bed 222 | shower 223 | stall 224 | steer 225 | bridge 226 | modern 227 | have 228 | gear 229 | mountain 230 
| person 231 | mid 232 | zoo 233 | dining 234 | take 235 | truck 236 | surf 237 | play 238 | multiple 239 | track 240 | serve 241 | reach 242 | leave 243 | pair 244 | refrigerator 245 | clear 246 | metal 247 | dog 248 | face 249 | clean 250 | professional 251 | slope 252 | walking 253 | shot 254 | kite 255 | show 256 | stove 257 | watch 258 | bright 259 | bedroom 260 | corner 261 | chicken 262 | ground 263 | giant 264 | busy 265 | wood 266 | black 267 | pretty 268 | rice 269 | plate 270 | handle 271 | colorful 272 | get 273 | sunny 274 | photograph 275 | bear 276 | batter 277 | tiled 278 | striped 279 | gray 280 | bat 281 | catcher 282 | bird 283 | bag 284 | grab 285 | river 286 | view 287 | set 288 | pitch 289 | seat 290 | see 291 | computer 292 | parking 293 | racket 294 | close 295 | stack 296 | outside 297 | various 298 | do 299 | screen 300 | deliver 301 | group 302 | tub 303 | herd 304 | come 305 | kitchen 306 | grass 307 | cow 308 | restaurant 309 | many 310 | let 311 | load 312 | toilet 313 | wall 314 | walk 315 | pole 316 | table 317 | grazing 318 | boat 319 | bathroom 320 | teddy 321 | brick 322 | empty 323 | engine 324 | dry 325 | assorted 326 | fire 327 | child 328 | catch 329 | look 330 | hat 331 | runway 332 | air 333 | trick 334 | country 335 | lush 336 | middle 337 | ready 338 | mouse 339 | helmet 340 | different 341 | shirt 342 | perform 343 | vintage 344 | make 345 | bowl 346 | cross 347 | same 348 | several 349 | pan 350 | ball 351 | drink 352 | rail 353 | hand 354 | fruit 355 | statue 356 | broccoli 357 | kid 358 | surfboard 359 | try 360 | floor 361 | ocean 362 | concrete 363 | edge 364 | bottle 365 | outdoor 366 | photo 367 | laptop 368 | snow 369 | touch 370 | couch 371 | railroad 372 | blow 373 | cut 374 | snowboarder 375 | cup 376 | sky 377 | bench 378 | other 379 | pile 380 | board 381 | wet 382 | ski 383 | kick 384 | skier 385 | read 386 | big 387 | couple 388 | dark 389 | game 390 | traffic 391 | know 392 | desk 393 | doughnut 394 | 
intersection 395 | lady 396 | furniture 397 | night 398 | tower 399 | old 400 | crowd 401 | picture 402 | back 403 | hair 404 | mirror 405 | home 406 | slice 407 | purple 408 | microwave 409 | prop 410 | skateboarder 411 | knife 412 | be 413 | eating 414 | pose 415 | use 416 | throw 417 | stone 418 | side 419 | dinner 420 | stand 421 | road 422 | image 423 | bath 424 | brush 425 | female 426 | racquet 427 | prepare 428 | area 429 | stainless 430 | tennis 431 | lone 432 | long 433 | start 434 | low 435 | lot 436 | suit 437 | fork 438 | head 439 | snowy 440 | bus 441 | pitcher 442 | line 443 | eat 444 | pull 445 | inside 446 | up 447 | carriage 448 | dirty 449 | suitcase 450 | cake 451 | piece 452 | display 453 | passenger 454 | single 455 | check 456 | variety 457 | field 458 | book 459 | holding 460 | tie 461 | nice 462 | brushing 463 | polar 464 | frisbee 465 | building 466 | land 467 | remote 468 | glass 469 | jacket 470 | time 471 | fresh 472 | -------------------------------------------------------------------------------- /utils/python_data_layers.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | 3 | import pdb 4 | import sys 5 | try: 6 | from config import * 7 | except: 8 | raise Exception("Please set up the config file (see instructions)") 9 | sys.path.append(pycaffe_dir) 10 | import caffe 11 | import io 12 | from PIL import Image 13 | import matplotlib.pyplot as plt 14 | import numpy as np 15 | import scipy.misc 16 | import time 17 | import glob 18 | import pickle as pkl 19 | import random 20 | import h5py 21 | from multiprocessing import Pool 22 | from threading import Thread 23 | import skimage.io 24 | import copy 25 | import json 26 | import time 27 | import re 28 | import math 29 | 30 | UNK_IDENTIFIER = '<unk>' 31 | SENTENCE_SPLIT_REGEX = re.compile(r'(\W+)') 32 | 33 | def read_json(t_file): 34 | j_file = open(t_file).read() 35 | return json.loads(j_file) 36 | 37 | def split_sentence(sentence): 38 |
# break sentence into a list of words and punctuation 39 | sentence = [s.lower() for s in SENTENCE_SPLIT_REGEX.split(sentence.strip()) if len(s.strip()) > 0] 40 | if sentence[-1] != '.': 41 | return sentence 42 | return sentence[:-1] 43 | 44 | def tokenize_text(sentence, vocabulary, leave_out_unks=False): 45 | sentence = split_sentence(sentence) 46 | token_sent = [] 47 | for w in sentence: 48 | try: 49 | token_sent.append(vocabulary[w]) 50 | except: 51 | if not leave_out_unks: 52 | try: 53 | token_sent.append(vocabulary[UNK_IDENTIFIER]) 54 | except: 55 | pass 56 | else: 57 | pass 58 | if not leave_out_unks: 59 | token_sent.append(vocabulary['EOS']) 60 | return token_sent 61 | 62 | def open_vocab(vocab_txt): 63 | vocab_list = open(vocab_txt).readlines() 64 | vocab_list = ['EOS'] + [v.strip() for v in vocab_list] 65 | vocab = {} 66 | for iv, v in enumerate(vocab_list): vocab[v] = iv 67 | return vocab 68 | 69 | def textPreprocessor(params): 70 | #input: 71 | # params['caption_json']: text json which contains text and a path to an image if the text is grounded in an image 72 | # params['vocabulary']: vocabulary txt to use 73 | #output: 74 | # processed_text: tokenized text with corresponding image path (if they exist) 75 | 76 | #make vocabulary dict 77 | vocab = open_vocab(params['vocabulary']) 78 | json_text = read_json(params['caption_json']) 79 | processed_text = {} 80 | 81 | t = time.time() 82 | for annotation in json_text['annotations']: 83 | processed_text[annotation['id']] = {} 84 | processed_text[annotation['id']]['text'] = tokenize_text(annotation['caption'], vocab) 85 | processed_text[annotation['id']]['image'] = annotation['image_id'] 86 | print "Setting up text dict: ", time.time()-t 87 | return processed_text 88 | 89 | class extractData(object): 90 | 91 | def increment(self): 92 | #uses iteration, batch_size, data_list, and num_data to extract next batch identifiers 93 | next_batch = [None]*self.batch_size 94 | if self.iteration + self.batch_size >= self.num_data: 95
| next_batch[:self.num_data-self.iteration] = self.data_list[self.iteration:] 96 | next_batch[self.num_data-self.iteration:] = self.data_list[:self.batch_size -(self.num_data-self.iteration)] 97 | random.shuffle(self.data_list) 98 | self.iteration = self.batch_size - (self.num_data - self.iteration) 99 | else: 100 | next_batch = self.data_list[self.iteration:self.iteration+self.batch_size] 101 | self.iteration += self.batch_size 102 | assert self.iteration > -1 103 | assert len(next_batch) == self.batch_size 104 | return next_batch 105 | 106 | def advanceBatch(self): 107 | next_batch = self.increment() 108 | self.get_data(next_batch) 109 | 110 | class extractFeatureText(extractData): 111 | 112 | def __init__(self, dataset, params, result): 113 | self.extractType = 'text' 114 | self.data_list = dataset.keys() 115 | self.num_data = len(self.data_list) 116 | print 'For extractor extractText, length of data is: ', self.num_data 117 | self.dataset = dataset 118 | self.iteration = 0 119 | self.batch_size = params['batch_size'] 120 | self.stream_size = params['stream_size'] 121 | 122 | #read h5 feature file 123 | extracted_features = h5py.File(params['feature_file'],'r') 124 | 125 | self.features = {} 126 | t = time.time() 127 | for ix, im in enumerate(extracted_features['ims']): 128 | im_key = int(im.split('_')[-1].split('.jpg')[0]) 129 | self.features[im_key] = extracted_features['features'][ix] 130 | print "Setting up image dict: ", time.time()-t 131 | 132 | #prep to process image 133 | self.feature_dim = self.features.values()[0].shape[0] 134 | feature_data_shape = (self.batch_size, self.feature_dim) 135 | 136 | #preparation to output top 137 | self.text_data_key = params['text_data_key'] 138 | self.text_label_key = params['text_label_key'] 139 | self.marker_key = params['text_marker_key'] 140 | self.feature_data_key = params['feature_data_key'] 141 | self.top_keys = [self.text_data_key, self.text_label_key, self.marker_key, self.feature_data_key] 142 | self.batch_size =
params['batch_size'] 143 | self.stream_size = params['stream_size'] 144 | self.top_shapes = [(self.stream_size, self.batch_size), (self.stream_size, self.batch_size), (self.stream_size, self.batch_size), feature_data_shape] 145 | self.result = result 146 | 147 | def get_data(self, next_batch): 148 | batch_images = [self.dataset[nb]['image'] for nb in next_batch] 149 | next_batch_input_sentences = np.zeros((self.stream_size, self.batch_size)) 150 | next_batch_target_sentences = np.ones((self.stream_size, self.batch_size))*-1 151 | next_batch_feature_data = np.ones((self.batch_size, self.feature_dim)) 152 | next_batch_markers = np.ones((self.stream_size, self.batch_size)) 153 | next_batch_markers[0,:] = 0 154 | for ni, nb in enumerate(next_batch): 155 | ns = self.dataset[nb]['text'] 156 | nf = self.dataset[nb]['image'] 157 | num_words = len(ns) 158 | ns_input = ns[:min(num_words, self.stream_size-1)] 159 | ns_target = ns[:min(num_words, self.stream_size)] 160 | next_batch_input_sentences[1:min(num_words+1, self.stream_size), ni] = ns_input 161 | next_batch_target_sentences[:min(num_words, self.stream_size), ni] = ns_target 162 | next_batch_feature_data[ni,...] 
= self.features[nf] 163 | 164 | self.result[self.text_data_key] = next_batch_input_sentences 165 | self.result[self.text_label_key] = next_batch_target_sentences 166 | self.result[self.marker_key] = next_batch_markers 167 | self.result[self.feature_data_key] = next_batch_feature_data 168 | 169 | class extractMulti(extractData): 170 | 171 | def __init__(self, dataset, params, result): 172 | #just need to set up parameters for "increment" 173 | self.extractors = params['extractors'] 174 | self.batch_size = params['batch_size'] 175 | self.data_list = dataset.keys() 176 | self.num_data = len(self.data_list) 177 | self.dataset = dataset 178 | self.iteration = 0 179 | self.batch_size = params['batch_size'] 180 | 181 | self.top_keys = [] 182 | self.top_shapes = [] 183 | for e in self.extractors: 184 | self.top_keys.extend(e.top_keys) 185 | self.top_shapes.extend(e.top_shapes) 186 | 187 | def get_data(self, next_batch): 188 | t = time.time() 189 | for e in self.extractors: 190 | e.get_data(next_batch) 191 | 192 | class batchAdvancer(object): 193 | 194 | def __init__(self, extractors): 195 | self.extractors = extractors 196 | 197 | def __call__(self): 198 | #The batch advancer just calls each extractor 199 | for e in self.extractors: 200 | e.advanceBatch() 201 | 202 | class python_data_layer(caffe.Layer): 203 | 204 | def setup(self, bottom, top): 205 | random.seed(10) 206 | 207 | self.params = eval(self.param_str) 208 | params = self.params 209 | 210 | #set up prefetching 211 | self.thread_result = {} 212 | self.thread = None 213 | 214 | self.setup_extractors() 215 | 216 | self.batch_advancer = batchAdvancer(self.data_extractors) 217 | self.top_names = [] 218 | self.top_shapes = [] 219 | for de in self.data_extractors: 220 | self.top_names.extend(de.top_keys) 221 | self.top_shapes.extend(de.top_shapes) 222 | 223 | self.dispatch_worker() 224 | 225 | if 'top_names' in params.keys(): 226 | #check top names equal to each other... 
227 | if not (set(params['top_names']) == set(self.top_names)): 228 | raise Exception("Input 'top_names' not the same as determined top names.") 229 | else: 230 | self.top_names = params['top_names'] 231 | 232 | 233 | print 'Outputs:', self.top_names 234 | if len(top) != len(self.top_names): 235 | raise Exception('Incorrect number of outputs (expected %d, got %d)' % 236 | (len(self.top_names), len(top))) 237 | self.join_worker() 238 | 239 | 240 | for top_index, name in enumerate(self.top_names): 241 | shape = self.top_shapes[top_index] 242 | print 'Top name %s has shape %s.' %(name, shape) 243 | top[top_index].reshape(*shape) 244 | 245 | def reshape(self, bottom, top): 246 | pass 247 | 248 | def forward(self, bottom, top): 249 | 250 | if self.thread is not None: 251 | self.join_worker() 252 | 253 | for top_index, name in enumerate(self.top_names): 254 | top[top_index].data[...] = self.thread_result[name] 255 | 256 | self.dispatch_worker() 257 | 258 | def dispatch_worker(self): 259 | assert self.thread is None 260 | self.thread = Thread(target=self.batch_advancer) 261 | self.thread.start() 262 | 263 | def join_worker(self): 264 | assert self.thread is not None 265 | self.thread.join() 266 | self.thread = None 267 | 268 | def backward(self, top, propagate_down, bottom): 269 | pass 270 | 271 | class pairedCaptionData(python_data_layer): 272 | 273 | def setup_extractors(self): 274 | 275 | params = self.params 276 | 277 | #check that all parameters are included and set default params 278 | assert 'caption_json' in self.params.keys() 279 | assert 'vocabulary' in self.params.keys() 280 | assert 'feature_file' in self.params.keys() 281 | if 'batch_size' not in params.keys(): params['batch_size'] = 100 282 | if 'stream_size' not in params.keys(): params['stream_size'] = 20 283 | 284 | params['text_data_key'] = 'input_sentence' 285 | params['text_label_key'] = 'target_sentence' 286 |
params['text_marker_key'] = 'cont_sentence' 287 | params['feature_data_key'] = 'data' 288 | 289 | data = textPreprocessor(params) 290 | data_extractor = extractFeatureText(data, params, self.thread_result) 291 | 292 | self.data_extractors = [data_extractor] 293 | -------------------------------------------------------------------------------- /utils/transfer_experiments/transfer_classifiers_coco1.txt: -------------------------------------------------------------------------------- 1 | zebra 2 | zebra 3 | pizza 4 | pizza 5 | suitcase 6 | suitcase 7 | luggage 8 | luggage 9 | bottle 10 | bottle 11 | bus 12 | bus 13 | couch 14 | couch 15 | microwave 16 | microwave 17 | racket 18 | racket 19 | racquet 20 | racquet 21 | -------------------------------------------------------------------------------- /utils/transfer_experiments/transfer_classifiers_imagenet.txt: -------------------------------------------------------------------------------- 1 | aardvark 2 | abacus 3 | acorn 4 | acrobatics 5 | acrylic 6 | admiral 7 | albatross 8 | alp 9 | alpaca 10 | ambrosia 11 | amphitheater 12 | anaconda 13 | android 14 | anteater 15 | ape 16 | applesauce 17 | apricot 18 | aquatics 19 | ark 20 | armadillo 21 | armory 22 | armour 23 | arroyo 24 | artillery 25 | astronaut 26 | audiovisual 27 | aviary 28 | axe 29 | azalea 30 | backpacker 31 | badger 32 | bagpipe 33 | balaclava 34 | ballplayer 35 | banjo 36 | banquette 37 | bantam 38 | baobab 39 | baritone 40 | barnacle 41 | barometer 42 | barracuda 43 | barrette 44 | basilica 45 | bassoon 46 | bayonet 47 | beaker 48 | bearskin 49 | beaver 50 | bedchamber 51 | belltower 52 | birdbath 53 | birdie 54 | bishop 55 | bistro 56 | blackbird 57 | blackcurrant 58 | blackjack 59 | blimp 60 | bloodhound 61 | boa 62 | boathouse 63 | boatman 64 | boatswain 65 | bobcat 66 | bobsled 67 | bodyguard 68 | bolero 69 | bongo 70 | bookend 71 | botanist 72 | boudoir 73 | boutique 74 | bramble 75 | brasserie 76 | breadbasket 77 | brontosaurus 78 | brownstone 79 
| bubblegum 80 | buckskin 81 | bugle 82 | bumblebee 83 | bungalow 84 | burka 85 | burro 86 | bushbaby 87 | butte 88 | cabaret 89 | cablecar 90 | cabriolet 91 | cadaver 92 | caddie 93 | cadet 94 | caldera 95 | camcorder 96 | canary 97 | candelabra 98 | candida 99 | candlestick 100 | cannonball 101 | canon 102 | caribou 103 | carp 104 | cashew 105 | cashmere 106 | cassava 107 | catnip 108 | cauldron 109 | caveman 110 | cedar 111 | cellist 112 | cello 113 | centrifuge 114 | chainsaw 115 | chameleon 116 | chamomile 117 | chapel 118 | chateau 119 | chemist 120 | chickadee 121 | chicory 122 | chiffon 123 | chile 124 | chime 125 | chimpanzee 126 | chinchilla 127 | chowder 128 | chrysalis 129 | chrysanthemum 130 | cinema 131 | circuitry 132 | citadel 133 | clam 134 | clamshell 135 | clarinet 136 | clavichord 137 | clergyman 138 | clinician 139 | clipper 140 | cloak 141 | clothespin 142 | clove 143 | cobweb 144 | cockatoo 145 | cockerel 146 | cockfighting 147 | cockroach 148 | coffin 149 | colander 150 | conch 151 | condominium 152 | conformation 153 | conifer 154 | copier 155 | coriander 156 | corkscrew 157 | cornet 158 | coronet 159 | corsage 160 | corset 161 | cosmopolitan 162 | cougar 163 | courgette 164 | couscous 165 | cowbell 166 | coyote 167 | crayfish 168 | crayon 169 | crocodile 170 | crossbow 171 | crucifix 172 | crutch 173 | cuttlefish 174 | daffodil 175 | dagger 176 | dahl 177 | dais 178 | damselfly 179 | delicatessen 180 | dinghy 181 | dingo 182 | discotheque 183 | discus 184 | divan 185 | dogfood 186 | doghouse 187 | dogwood 188 | doorbell 189 | doorknob 190 | dormitory 191 | doula 192 | dragonfly 193 | drumstick 194 | dumpling 195 | dungeon 196 | earthenware 197 | edifice 198 | eel 199 | emerald 200 | emu 201 | enchilada 202 | envoy 203 | equator 204 | eraser 205 | eucalyptus 206 | euphonium 207 | eyeliner 208 | eyeshade 209 | eyewitness 210 | falafel 211 | farmyard 212 | fauna 213 | fawn 214 | ferryboat 215 | fiddle 216 | fiddler 217 | fife 218 | fig 219 | 
finch 220 | firefly 221 | fishbowl 222 | flagpole 223 | floorboard 224 | flotsam 225 | flounder 226 | flowerbed 227 | flute 228 | fondue 229 | footbridge 230 | foothill 231 | forceps 232 | fortress 233 | fossil 234 | foxhunting 235 | freighter 236 | friar 237 | frittata 238 | fuji 239 | furnace 240 | gallows 241 | gargoyle 242 | gator 243 | gecko 244 | geranium 245 | gibbon 246 | glacier 247 | gladiator 248 | gong 249 | gooseberry 250 | goulash 251 | gramophone 252 | grandpa 253 | grandparent 254 | grasshopper 255 | grenade 256 | groundhog 257 | grouse 258 | guillotine 259 | guitarist 260 | gulag 261 | gunboat 262 | gymnast 263 | gymnastics 264 | gyroscope 265 | hacksaw 266 | haggis 267 | hairpiece 268 | halibut 269 | halo 270 | handcuff 271 | handgun 272 | hare 273 | harmonica 274 | harp 275 | harpoon 276 | harpsichord 277 | hazelnut 278 | headland 279 | headscarf 280 | headteacher 281 | hearse 282 | hedgehog 283 | highland 284 | hitchhiker 285 | hollandaise 286 | honeycomb 287 | honeysuckle 288 | hornet 289 | horseman 290 | horseracing 291 | horseshoe 292 | hostel 293 | hourglass 294 | huckleberry 295 | humpback 296 | humus 297 | huntress 298 | hyacinth 299 | hydra 300 | hydroplane 301 | hyena 302 | iceberg 303 | icemaker 304 | igloo 305 | iguana 306 | impala 307 | jackal 308 | jade 309 | jaguar 310 | jak 311 | jalapeno 312 | jasmine 313 | javelin 314 | jellyfish 315 | jester 316 | jigsaw 317 | jock 318 | juggler 319 | jukebox 320 | junkyard 321 | jurist 322 | juror 323 | keypad 324 | kiwifruit 325 | knitwear 326 | kola 327 | ladybird 328 | lama 329 | lapdog 330 | laurel 331 | lawnmower 332 | lectern 333 | leech 334 | lemur 335 | lentil 336 | leotard 337 | levee 338 | lichen 339 | lifeboat 340 | lifejacket 341 | lightbulb 342 | locust 343 | longbow 344 | lorikeet 345 | lory 346 | lychee 347 | lynx 348 | macaque 349 | macaw 350 | machete 351 | mackerel 352 | madras 353 | magician 354 | magnolia 355 | mammoth 356 | manatee 357 | mandolin 358 | mantis 359 | 
mantlepiece 360 | marigold 361 | marimba 362 | marshmallow 363 | marzipan 364 | mascara 365 | masquerade 366 | mastodon 367 | matador 368 | mayflower 369 | medusa 370 | megaphone 371 | menagerie 372 | mesa 373 | metronome 374 | microprocessor 375 | microscope 376 | mimosa 377 | minestrone 378 | minibar 379 | minicomputer 380 | miniskirt 381 | mink 382 | missile 383 | mockingbird 384 | mongoose 385 | morgue 386 | mosque 387 | moth 388 | mousetrap 389 | moussaka 390 | mousse 391 | mullet 392 | mussel 393 | mustang 394 | naan 395 | nanny 396 | nebula 397 | nectarine 398 | nematode 399 | neurologist 400 | newsstand 401 | nightingale 402 | nightshade 403 | nipple 404 | nutcracker 405 | nutmeg 406 | oboe 407 | okapi 408 | opera 409 | orangutan 410 | orca 411 | oregano 412 | organist 413 | osprey 414 | otter 415 | oyster 416 | pajama 417 | palmetto 418 | papaya 419 | parlour 420 | parsnip 421 | peafowl 422 | pecan 423 | penthouse 424 | peppercorn 425 | percussionist 426 | persimmon 427 | petticoat 428 | pharmacy 429 | pheasant 430 | phonograph 431 | photocopy 432 | piccolo 433 | piglet 434 | pilaf 435 | pinion 436 | piranha 437 | pistachio 438 | pitchfork 439 | pixie 440 | plateau 441 | platypus 442 | plum 443 | plumber 444 | pocketbook 445 | pocketknife 446 | poinsettia 447 | polarbear 448 | poppet 449 | porpoise 450 | possum 451 | potluck 452 | poultry 453 | powerboat 454 | precipice 455 | primrose 456 | prong 457 | prune 458 | pueblo 459 | puffin 460 | pumpernickel 461 | pupa 462 | puppeteer 463 | pushchair 464 | pussycat 465 | python 466 | quail 467 | quarterback 468 | quill 469 | quilting 470 | quiver 471 | rabbi 472 | raccoon 473 | radicchio 474 | raisin 475 | rake 476 | raptor 477 | rattan 478 | redhead 479 | regalia 480 | revolver 481 | rhubarb 482 | ricotta 483 | riverbed 484 | roach 485 | roadblock 486 | roadhouse 487 | roadrunner 488 | roaster 489 | rocker 490 | rollercoaster 491 | romaine 492 | rosemary 493 | roulette 494 | sake 495 | salamander 496 | sallow 
497 | samba 498 | sander 499 | sari 500 | sashimi 501 | saucepan 502 | sauna 503 | saxophone 504 | schnauzer 505 | schnitzel 506 | schoolhouse 507 | schoolmaster 508 | schoolroom 509 | sconce 510 | scone 511 | scythe 512 | seabird 513 | seaboard 514 | seafront 515 | semiautomatic 516 | serpent 517 | settler 518 | shackle 519 | shallot 520 | shipwreck 521 | shoebox 522 | shorebird 523 | showboat 524 | shrew 525 | sitar 526 | skunk 527 | sloth 528 | snapper 529 | snorkel 530 | snowbird 531 | snowcap 532 | snowdrift 533 | snowflake 534 | snowplow 535 | snowshoe 536 | sofabed 537 | songbird 538 | soybean 539 | spaceship 540 | spectacles 541 | spectrometer 542 | sphinx 543 | sprinkler 544 | spruce 545 | stag 546 | stallion 547 | starfish 548 | steamship 549 | steed 550 | stingray 551 | stonework 552 | stopwatch 553 | streamer 554 | streamliner 555 | stretcher 556 | subcontractor 557 | summerhouse 558 | sunfish 559 | sunglass 560 | supercomputer 561 | superhighway 562 | supernova 563 | swallow 564 | sweetcorn 565 | swine 566 | swordfish 567 | sycamore 568 | syringe 569 | tabasco 570 | tabernacle 571 | tamale 572 | tambourine 573 | tammy 574 | tankard 575 | taper 576 | tarantula 577 | tearoom 578 | teepee 579 | telegraph 580 | telegraphy 581 | tempura 582 | teriyaki 583 | theremin 584 | throne 585 | timber 586 | timpani 587 | toad 588 | toboggan 589 | tombstone 590 | topaz 591 | topcoat 592 | torch 593 | torpedo 594 | toucan 595 | towboat 596 | trackball 597 | trampoline 598 | tramway 599 | trapeze 600 | treadmill 601 | trifle 602 | truffle 603 | trumpeter 604 | tuba 605 | tulip 606 | turbojet 607 | turnip 608 | turnpike 609 | tutu 610 | tweed 611 | tyrannosaurus 612 | ukulele 613 | unicycle 614 | urchin 615 | vat 616 | verandah 617 | vestibule 618 | vial 619 | viola 620 | volcano 621 | vole 622 | voyager 623 | vulture 624 | waggon 625 | wallaby 626 | walrus 627 | warbler 628 | warship 629 | warthog 630 | wasp 631 | watchtower 632 | weevil 633 | weld 634 | wheelhouse 635 
| whippet 636 | whisky 637 | witch 638 | wolverine 639 | wombat 640 | woodpecker 641 | woollen 642 | worm 643 | wrangler 644 | wren 645 | yam 646 | zipper 647 | -------------------------------------------------------------------------------- /utils/transfer_experiments/transfer_words_coco1.txt: -------------------------------------------------------------------------------- 1 | zebra 2 | zebras 3 | pizza 4 | pizzas 5 | suitcase 6 | suitcases 7 | luggage 8 | luggages 9 | bottle 10 | bottles 11 | bus 12 | busses 13 | couch 14 | couches 15 | microwave 16 | microwaves 17 | racket 18 | rackets 19 | racquet 20 | racquets 21 | -------------------------------------------------------------------------------- /utils/transfer_experiments/transfer_words_imagenet.txt: -------------------------------------------------------------------------------- 1 | aardvark 2 | abacus 3 | acorn 4 | acrobatics 5 | acrylic 6 | admiral 7 | albatross 8 | alp 9 | alpaca 10 | ambrosia 11 | amphitheater 12 | anaconda 13 | android 14 | anteater 15 | ape 16 | applesauce 17 | apricot 18 | aquatics 19 | ark 20 | armadillo 21 | armory 22 | armour 23 | arroyo 24 | artillery 25 | astronaut 26 | audiovisual 27 | aviary 28 | axe 29 | azalea 30 | backpacker 31 | badger 32 | bagpipe 33 | balaclava 34 | ballplayer 35 | banjo 36 | banquette 37 | bantam 38 | baobab 39 | baritone 40 | barnacle 41 | barometer 42 | barracuda 43 | barrette 44 | basilica 45 | bassoon 46 | bayonet 47 | beaker 48 | bearskin 49 | beaver 50 | bedchamber 51 | belltower 52 | birdbath 53 | birdie 54 | bishop 55 | bistro 56 | blackbird 57 | blackcurrant 58 | blackjack 59 | blimp 60 | bloodhound 61 | boa 62 | boathouse 63 | boatman 64 | boatswain 65 | bobcat 66 | bobsled 67 | bodyguard 68 | bolero 69 | bongo 70 | bookend 71 | botanist 72 | boudoir 73 | boutique 74 | bramble 75 | brasserie 76 | breadbasket 77 | brontosaurus 78 | brownstone 79 | bubblegum 80 | buckskin 81 | bugle 82 | bumblebee 83 | bungalow 84 | burka 85 | burro 86 | bushbaby 
87 | butte 88 | cabaret 89 | cablecar 90 | cabriolet 91 | cadaver 92 | caddie 93 | cadet 94 | caldera 95 | camcorder 96 | canary 97 | candelabra 98 | candida 99 | candlestick 100 | cannonball 101 | canon 102 | caribou 103 | carp 104 | cashew 105 | cashmere 106 | cassava 107 | catnip 108 | cauldron 109 | caveman 110 | cedar 111 | cellist 112 | cello 113 | centrifuge 114 | chainsaw 115 | chameleon 116 | chamomile 117 | chapel 118 | chateau 119 | chemist 120 | chickadee 121 | chicory 122 | chiffon 123 | chile 124 | chime 125 | chimpanzee 126 | chinchilla 127 | chowder 128 | chrysalis 129 | chrysanthemum 130 | cinema 131 | circuitry 132 | citadel 133 | clam 134 | clamshell 135 | clarinet 136 | clavichord 137 | clergyman 138 | clinician 139 | clipper 140 | cloak 141 | clothespin 142 | clove 143 | cobweb 144 | cockatoo 145 | cockerel 146 | cockfighting 147 | cockroach 148 | coffin 149 | colander 150 | conch 151 | condominium 152 | conformation 153 | conifer 154 | copier 155 | coriander 156 | corkscrew 157 | cornet 158 | coronet 159 | corsage 160 | corset 161 | cosmopolitan 162 | cougar 163 | courgette 164 | couscous 165 | cowbell 166 | coyote 167 | crayfish 168 | crayon 169 | crocodile 170 | crossbow 171 | crucifix 172 | crutch 173 | cuttlefish 174 | daffodil 175 | dagger 176 | dahl 177 | dais 178 | damselfly 179 | delicatessen 180 | dinghy 181 | dingo 182 | discotheque 183 | discus 184 | divan 185 | dogfood 186 | doghouse 187 | dogwood 188 | doorbell 189 | doorknob 190 | dormitory 191 | doula 192 | dragonfly 193 | drumstick 194 | dumpling 195 | dungeon 196 | earthenware 197 | edifice 198 | eel 199 | emerald 200 | emu 201 | enchilada 202 | envoy 203 | equator 204 | eraser 205 | eucalyptus 206 | euphonium 207 | eyeliner 208 | eyeshade 209 | eyewitness 210 | falafel 211 | farmyard 212 | fauna 213 | fawn 214 | ferryboat 215 | fiddle 216 | fiddler 217 | fife 218 | fig 219 | finch 220 | firefly 221 | fishbowl 222 | flagpole 223 | floorboard 224 | flotsam 225 | flounder 226 | 
flowerbed 227 | flute 228 | fondue 229 | footbridge 230 | foothill 231 | forceps 232 | fortress 233 | fossil 234 | foxhunting 235 | freighter 236 | friar 237 | frittata 238 | fuji 239 | furnace 240 | gallows 241 | gargoyle 242 | gator 243 | gecko 244 | geranium 245 | gibbon 246 | glacier 247 | gladiator 248 | gong 249 | gooseberry 250 | goulash 251 | gramophone 252 | grandpa 253 | grandparent 254 | grasshopper 255 | grenade 256 | groundhog 257 | grouse 258 | guillotine 259 | guitarist 260 | gulag 261 | gunboat 262 | gymnast 263 | gymnastics 264 | gyroscope 265 | hacksaw 266 | haggis 267 | hairpiece 268 | halibut 269 | halo 270 | handcuff 271 | handgun 272 | hare 273 | harmonica 274 | harp 275 | harpoon 276 | harpsichord 277 | hazelnut 278 | headland 279 | headscarf 280 | headteacher 281 | hearse 282 | hedgehog 283 | highland 284 | hitchhiker 285 | hollandaise 286 | honeycomb 287 | honeysuckle 288 | hornet 289 | horseman 290 | horseracing 291 | horseshoe 292 | hostel 293 | hourglass 294 | huckleberry 295 | humpback 296 | humus 297 | huntress 298 | hyacinth 299 | hydra 300 | hydroplane 301 | hyena 302 | iceberg 303 | icemaker 304 | igloo 305 | iguana 306 | impala 307 | jackal 308 | jade 309 | jaguar 310 | jak 311 | jalapeno 312 | jasmine 313 | javelin 314 | jellyfish 315 | jester 316 | jigsaw 317 | jock 318 | juggler 319 | jukebox 320 | junkyard 321 | jurist 322 | juror 323 | keypad 324 | kiwifruit 325 | knitwear 326 | kola 327 | ladybird 328 | lama 329 | lapdog 330 | laurel 331 | lawnmower 332 | lectern 333 | leech 334 | lemur 335 | lentil 336 | leotard 337 | levee 338 | lichen 339 | lifeboat 340 | lifejacket 341 | lightbulb 342 | locust 343 | longbow 344 | lorikeet 345 | lory 346 | lychee 347 | lynx 348 | macaque 349 | macaw 350 | machete 351 | mackerel 352 | madras 353 | magician 354 | magnolia 355 | mammoth 356 | manatee 357 | mandolin 358 | mantis 359 | mantlepiece 360 | marigold 361 | marimba 362 | marshmallow 363 | marzipan 364 | mascara 365 | masquerade 366 | 
mastodon 367 | matador 368 | mayflower 369 | medusa 370 | megaphone 371 | menagerie 372 | mesa 373 | metronome 374 | microprocessor 375 | microscope 376 | mimosa 377 | minestrone 378 | minibar 379 | minicomputer 380 | miniskirt 381 | mink 382 | missile 383 | mockingbird 384 | mongoose 385 | morgue 386 | mosque 387 | moth 388 | mousetrap 389 | moussaka 390 | mousse 391 | mullet 392 | mussel 393 | mustang 394 | naan 395 | nanny 396 | nebula 397 | nectarine 398 | nematode 399 | neurologist 400 | newsstand 401 | nightingale 402 | nightshade 403 | nipple 404 | nutcracker 405 | nutmeg 406 | oboe 407 | okapi 408 | opera 409 | orangutan 410 | orca 411 | oregano 412 | organist 413 | osprey 414 | otter 415 | oyster 416 | pajama 417 | palmetto 418 | papaya 419 | parlour 420 | parsnip 421 | peafowl 422 | pecan 423 | penthouse 424 | peppercorn 425 | percussionist 426 | persimmon 427 | petticoat 428 | pharmacy 429 | pheasant 430 | phonograph 431 | photocopy 432 | piccolo 433 | piglet 434 | pilaf 435 | pinion 436 | piranha 437 | pistachio 438 | pitchfork 439 | pixie 440 | plateau 441 | platypus 442 | plum 443 | plumber 444 | pocketbook 445 | pocketknife 446 | poinsettia 447 | polarbear 448 | poppet 449 | porpoise 450 | possum 451 | potluck 452 | poultry 453 | powerboat 454 | precipice 455 | primrose 456 | prong 457 | prune 458 | pueblo 459 | puffin 460 | pumpernickel 461 | pupa 462 | puppeteer 463 | pushchair 464 | pussycat 465 | python 466 | quail 467 | quarterback 468 | quill 469 | quilting 470 | quiver 471 | rabbi 472 | raccoon 473 | radicchio 474 | raisin 475 | rake 476 | raptor 477 | rattan 478 | redhead 479 | regalia 480 | revolver 481 | rhubarb 482 | ricotta 483 | riverbed 484 | roach 485 | roadblock 486 | roadhouse 487 | roadrunner 488 | roaster 489 | rocker 490 | rollercoaster 491 | romaine 492 | rosemary 493 | roulette 494 | sake 495 | salamander 496 | sallow 497 | samba 498 | sander 499 | sari 500 | sashimi 501 | saucepan 502 | sauna 503 | saxophone 504 | schnauzer 505 
| schnitzel 506 | schoolhouse 507 | schoolmaster 508 | schoolroom 509 | sconce 510 | scone 511 | scythe 512 | seabird 513 | seaboard 514 | seafront 515 | semiautomatic 516 | serpent 517 | settler 518 | shackle 519 | shallot 520 | shipwreck 521 | shoebox 522 | shorebird 523 | showboat 524 | shrew 525 | sitar 526 | skunk 527 | sloth 528 | snapper 529 | snorkel 530 | snowbird 531 | snowcap 532 | snowdrift 533 | snowflake 534 | snowplow 535 | snowshoe 536 | sofabed 537 | songbird 538 | soybean 539 | spaceship 540 | spectacles 541 | spectrometer 542 | sphinx 543 | sprinkler 544 | spruce 545 | stag 546 | stallion 547 | starfish 548 | steamship 549 | steed 550 | stingray 551 | stonework 552 | stopwatch 553 | streamer 554 | streamliner 555 | stretcher 556 | subcontractor 557 | summerhouse 558 | sunfish 559 | sunglass 560 | supercomputer 561 | superhighway 562 | supernova 563 | swallow 564 | sweetcorn 565 | swine 566 | swordfish 567 | sycamore 568 | syringe 569 | tabasco 570 | tabernacle 571 | tamale 572 | tambourine 573 | tammy 574 | tankard 575 | taper 576 | tarantula 577 | tearoom 578 | teepee 579 | telegraph 580 | telegraphy 581 | tempura 582 | teriyaki 583 | theremin 584 | throne 585 | timber 586 | timpani 587 | toad 588 | toboggan 589 | tombstone 590 | topaz 591 | topcoat 592 | torch 593 | torpedo 594 | toucan 595 | towboat 596 | trackball 597 | trampoline 598 | tramway 599 | trapeze 600 | treadmill 601 | trifle 602 | truffle 603 | trumpeter 604 | tuba 605 | tulip 606 | turbojet 607 | turnip 608 | turnpike 609 | tutu 610 | tweed 611 | tyrannosaurus 612 | ukulele 613 | unicycle 614 | urchin 615 | vat 616 | verandah 617 | vestibule 618 | vial 619 | viola 620 | volcano 621 | vole 622 | voyager 623 | vulture 624 | waggon 625 | wallaby 626 | walrus 627 | warbler 628 | warship 629 | warthog 630 | wasp 631 | watchtower 632 | weevil 633 | weld 634 | wheelhouse 635 | whippet 636 | whisky 637 | witch 638 | wolverine 639 | wombat 640 | woodpecker 641 | woollen 642 | worm 643 | 
wrangler 644 | wren 645 | yam 646 | zipper 647 | --------------------------------------------------------------------------------
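The `get_data` method of `extractFeatureText` in `utils/python_data_layers.py` packs each caption batch into time-major `[stream_size x batch_size]` arrays: input words shifted right by one (slot 0 left for the begin-of-sentence position), targets padded with -1, and continuation markers zeroed at the first timestep. A minimal plain-Python sketch of that packing convention (`pack_caption_batch` is a hypothetical helper for illustration, not a function from the repo, which uses NumPy arrays instead of lists):

```python
def pack_caption_batch(captions, stream_size):
    #Pack integer-encoded captions into time-major [stream_size x batch_size]
    #lists, mirroring extractFeatureText.get_data.
    batch_size = len(captions)
    inputs = [[0] * batch_size for _ in range(stream_size)]
    targets = [[-1] * batch_size for _ in range(stream_size)]
    markers = [[1] * batch_size for _ in range(stream_size)]
    markers[0] = [0] * batch_size  #reset recurrent state at the start of each caption
    for i, cap in enumerate(captions):
        for t, w in enumerate(cap[:stream_size - 1]):
            inputs[t + 1][i] = w   #shifted right: slot 0 is the begin-of-sentence position
        for t, w in enumerate(cap[:stream_size]):
            targets[t][i] = w      #remaining -1 entries are ignored by the loss
    return inputs, targets, markers

inp, tgt, mk = pack_caption_batch([[5, 7, 9]], stream_size=6)
#column 0 of inp: [0, 5, 7, 9, 0, 0]
#column 0 of tgt: [5, 7, 9, -1, -1, -1]
#column 0 of mk:  [0, 1, 1, 1, 1, 1]
```

Captions longer than `stream_size` are truncated, matching the `min(num_words, self.stream_size)` slicing in the original code.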