├── README.md
├── build_vocab.py
├── coco_caption
│   ├── Bleu_1.pkl
│   ├── Bleu_2.pkl
│   ├── Bleu_3.pkl
│   ├── Bleu_4.pkl
│   ├── CIDEr.pkl
│   ├── METEOR.pkl
│   ├── captions_test2014_hitachi_results.json
│   ├── captions_val2014_hitachi_results.json
│   ├── draw.py
│   ├── eval_captions_results.py
│   ├── eval_image_caption.py
│   ├── eval_model.py
│   ├── gen_test_json.py
│   ├── gen_val_json.py
│   ├── model_evalution.png
│   ├── read_test_info.py
│   └── read_validation_info.py
├── create_train_val_all_reference.py
├── create_train_val_each_reference.py
├── data
│   ├── bias_init_vector.npy
│   ├── idx_to_word.pkl
│   ├── test.txt
│   ├── test2014_images_ids_to_names.pkl
│   ├── train_val_imageNames_to_imageIDs.pkl
│   ├── val2014_images_ids_to_names.pkl
│   ├── val_images_captions.pkl
│   └── word_to_idx.pkl
├── image
│   ├── 1.png
│   ├── 2.png
│   └── 3.png
├── image_caption.py
├── inception
│   ├── COCO_val2014_000000320612.jpg
│   ├── README.md
│   ├── check_NOT_JPEG_IMG.sh
│   ├── copy_train_val_feats.sh
│   ├── extract_inception_bottleneck_feature.py
│   ├── imageInfo.txt
│   ├── test_feats
│   │   └── README.md
│   ├── train_feats
│   │   └── README.md
│   ├── train_val_feats
│   │   └── README.md
│   └── val_feats
│       └── README.md
├── pre_train_json.py
├── pre_val_json.py
└── split_train_val_data.py

/README.md:
--------------------------------------------------------------------------------
# Optimization of image description metrics using policy gradient methods
This is a TensorFlow implementation of the paper: [Optimization of image description metrics using policy gradient methods](https://arxiv.org/abs/1612.00370).

## Note
This repository is not being actively maintained due to lack of time and interest. My sincerest apologies to the open source community for allowing this project to stagnate. I hope it was useful for some of you as a jumping-off point.

## Prerequisites
- TensorFlow 0.10

## Introduction
This code is a little rough. While working on this paper I also had some questions, but the authors did not reply to my e-mails, and I don't know why.

So please contact me anytime if you have any doubts.

My e-mail: jschenxinpeng@gmail.com

I would appreciate any advice you may have.

## How to run the code
### Step 1
Go into the `./inception` directory; the Python script used to extract features is `extract_inception_bottleneck_feature.py`.

In this script there are a few parameters you should modify:
- `image_path`: the MSCOCO image path, e.g. `/path/to/mscoco/train2014`, `/path/to/mscoco/val2014`, `/path/to/mscoco/test2014`
- `model_path`: the pre-trained **Inception-V3** TensorFlow model.
- `feats_save_path`: the directory where the extracted features will be saved.
I uploaded the pre-trained model to Google Drive: [tensorflow_inception_graph.pb](https://drive.google.com/open?id=0B65vBUruA6N4Y2dtVHBJMVhodjA)

After you have modified the parameters, you can extract the image features from the terminal:
```bash
$ CUDA_VISIBLE_DEVICES=3 python extract_inception_bottleneck_feature.py
```
You can also run the code without a GPU:
```bash
$ CUDA_VISIBLE_DEVICES="" python extract_inception_bottleneck_feature.py
```

In my experiments, I saved the `train2014` image features in `./inception/train_feats`, the `val2014` image features in `./inception/val_feats`, and the `test2014` image features in `./inception/test_feats`.
I also saved the combined `train2014` + `val2014` image features in `./inception/train_val_feats`.

### Step 2
Run the scripts:
```bash
$ python pre_train_json.py
$ python pre_val_json.py
$ python split_train_val_data.py
```

The script `pre_train_json.py` processes `./data/captions_train2014.json` and generates `./data/train_images_captions.pkl`, a dict that stores the captions of each image, like this:
![train_image_captions](https://github.com/chenxinpeng/Optimization-of-image-description-metrics-using-policy-gradient-methods/blob/master/image/1.png)
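If you want to check the generated file, a minimal snippet like the one below works (Python 2 with `cPickle`, as used throughout this repository); the image name that gets printed will of course depend on your data:

```python
# Minimal sketch: inspect the captions dict produced by pre_train_json.py.
# It maps an image file name to the list of its ground-truth captions.
import cPickle as pickle

with open('./data/train_images_captions.pkl', 'r') as f:
    train_images_captions = pickle.load(f)

print 'number of training images: %d' % len(train_images_captions)

# print one example entry
image_name, captions = train_images_captions.items()[0]
print image_name
for cap in captions:
    print '   ', cap
```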
The script `pre_val_json.py` processes `./data/captions_val2014.json` and generates `./data/val_images_captions.pkl`.

The script `split_train_val_data.py` splits the validation set: according to the paper, only 1665 validation images are used for validation, and the remaining validation images are used for training. So I split the validation images into two parts: the first 1665 images are used for validation, and the rest are used for training.

### Step 3
Run the scripts:
```bash
$ python create_train_val_all_reference.py
```
and
```bash
$ python create_train_val_each_reference.py
```

Let me explain the two scripts. The first one, `create_train_val_all_reference.py`, generates a JSON file named `train_val_all_reference.json` (about 70 MB) which stores the ground-truth captions of the training and validation images.

The second one, `create_train_val_each_reference.py`, generates one JSON file per training and validation image, and saves each JSON file in the folder `./train_val_reference_json/`.

### Step 4
Run the script:
```bash
$ python build_vocab.py
```

This script builds the vocabulary dict. It generates three files in the `data` folder:
- word_to_idx.pkl
- idx_to_word.pkl
- bias_init_vector.npy

By the way, words that occur fewer than 5 times are filtered out of the vocabulary; you can change this threshold (`word_count_threshold`) in the script.

### Step 5
In this step, we follow the algorithm in the paper:
![algorithm](https://github.com/chenxinpeng/Optimization-of-image-description-metrics-using-policy-gradient-methods/blob/master/image/2.png)
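To make the figure concrete: the policy-gradient update it describes weights the log-probability of each sampled word by its Monte-Carlo return estimate Q minus the learned baseline B_phi. The snippet below is only an illustrative sketch of that idea, not the actual `image_caption.py` code; the names and shapes are simplified assumptions:

```python
# Illustrative REINFORCE-with-baseline update (NOT the code from image_caption.py).
import numpy as np

def policy_gradient_loss(log_probs, q_values, baseline):
    # log_probs: log pi(g_t | g_1:t-1, x) of the sampled words, shape [T]
    # q_values:  Monte-Carlo estimates of Q for each time step, shape [T]
    # baseline:  value predicted by the B_phi network (scalar or shape [T])
    advantage = q_values - baseline            # how much better than expected
    return -np.sum(advantage * log_probs)      # minimizing this ascends the reward

# toy example with made-up numbers
log_probs = np.log(np.array([0.2, 0.5, 0.3]))
q_values = np.array([0.70, 0.75, 0.80])        # e.g. MC-estimated CIDEr returns
baseline = 0.6
print policy_gradient_loss(log_probs, q_values, baseline)
```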
First, we train the basic model with MLE (Maximum Likelihood Estimation):
```bash
$ CUDA_VISIBLE_DEVICES=0 ipython
>>> import image_caption
>>> image_caption.Train_with_MLE()
```

After training the basic model, you can test and validate it on the test and validation data:
```bash
>>> image_caption.Test_with_MLE()
>>> image_caption.Val_with_MLE()
```

Second, we train B_phi using MC estimates of Q_theta on a small dataset D (1665 images):
```bash
>>> image_caption.Sample_Q_with_MC()
>>> image_caption.Train_Bphi_Model()
```

After we get the B_phi model, we use policy-gradient (SGD) updates to optimize the caption generation:
```bash
>>> image_caption.Train_SGD_update()
```
I have run several epochs; here I compare the RL results with the non-RL results:
![results compared](https://github.com/chenxinpeng/Optimization-of-image-description-metrics-using-policy-gradient-methods/blob/master/image/3.png)

This shows that the policy gradient method is beneficial for image captioning.

### COCO evaluation
In the `./coco_caption/` folder, we can evaluate the generated captions and each trained model. Please see the Python scripts there.
--------------------------------------------------------------------------------
/build_vocab.py:
--------------------------------------------------------------------------------
# encoding: UTF-8

#-----------------------------------------------------------------------
# We preprocess the text data by lower-casing, and by replacing words
# which occur less than 5 times in the 82K training set with <unk>;
# this results in a vocabulary size of 10,622 (from 32,807 words).
#-----------------------------------------------------------------------

import os
import numpy as np
import cPickle as pickle
import time


train_images_captions_path = './data/train_images_captions.pkl'
with open(train_images_captions_path, 'r') as train_fr:
    train_images_captions = pickle.load(train_fr)

val_images_captions_path = './data/val_images_captions.pkl'
with open(val_images_captions_path, 'r') as val_fr:
    val_images_captions = pickle.load(val_fr)


#------------------------------------------------------------------------
# Borrowed this function from NeuralTalk:
# https://github.com/karpathy/neuraltalk/blob/master/driver.py#L16
#-----------------------------------------------------------------------
def preProBuildWordVocab(sentence_iterator, word_count_threshold=5):
    print 'Preprocessing word counts and creating vocab based on word count threshold %d' % (word_count_threshold, )

    t0 = time.time()
    word_counts = {}
    nsents = 0

    for sent in sentence_iterator:
        nsents += 1
        tmp_sent = sent.split(' ')
        # remove the empty string '' in the sentence
        tmp_sent = filter(None, tmp_sent)
        for w in tmp_sent:
            word_counts[w] = word_counts.get(w, 0) + 1
    vocab = [w for w in word_counts if word_counts[w] >= word_count_threshold]
    print 'Filtered words from %d to %d in %0.2fs' % (len(word_counts), len(vocab), time.time()-t0)

    # indices 0-3 are reserved for the special tokens (pad / begin / end / unknown)
    ixtoword = {}
    ixtoword[0] = '<pad>'
    ixtoword[1] = '<bos>'
    ixtoword[2] = '<eos>'
    ixtoword[3] = '<unk>'

    wordtoix = {}
    wordtoix['<pad>'] = 0
    wordtoix['<bos>'] = 1
    wordtoix['<eos>'] = 2
    wordtoix['<unk>'] = 3

    for idx, w in enumerate(vocab):
        wordtoix[w] = idx + 4
        ixtoword[idx+4] = w

    word_counts['<pad>'] = nsents
    word_counts['<bos>'] = nsents
    word_counts['<eos>'] = nsents
    word_counts['<unk>'] = nsents

    bias_init_vector = np.array([1.0 * word_counts[ ixtoword[i] ] for i in ixtoword])
    bias_init_vector /= np.sum(bias_init_vector)  # normalize to frequencies
    bias_init_vector = np.log(bias_init_vector)
    bias_init_vector -= np.max(bias_init_vector)  # shift to nice numeric range

    return wordtoix, ixtoword, bias_init_vector


# extract all sentences in captions
all_sents = []
for image, sents in train_images_captions.iteritems():
    for each_sent in sents:
        all_sents.append(each_sent)
#for image, sents in val_images_captions.iteritems():
#    for each_sent in sents:
#        all_sents.append(each_sent)

word_to_idx, idx_to_word, bias_init_vector = preProBuildWordVocab(all_sents, word_count_threshold=5)

with open('./data/idx_to_word.pkl', 'w') as fw_1:
    pickle.dump(idx_to_word, fw_1)

with open('./data/word_to_idx.pkl', 'w') as fw_2:
    pickle.dump(word_to_idx, fw_2)

np.save('./data/bias_init_vector.npy', bias_init_vector)
--------------------------------------------------------------------------------
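Once the three files exist, they can be sanity-checked as in the minimal sketch below. It also shows the typical role of `bias_init_vector`: initializing the decoder's output-layer bias with log word frequencies. The variable name `embed_word_b` and the example caption are assumptions for illustration only.

```python
# Minimal sketch (Python 2): load the vocabulary files produced by build_vocab.py.
import numpy as np
import cPickle as pickle
import tensorflow as tf

with open('./data/word_to_idx.pkl', 'r') as f:
    word_to_idx = pickle.load(f)
with open('./data/idx_to_word.pkl', 'r') as f:
    idx_to_word = pickle.load(f)
bias_init_vector = np.load('./data/bias_init_vector.npy')

print 'vocabulary size (including the 4 special tokens): %d' % len(word_to_idx)

# hypothetical decoder output bias, initialized from the log word frequencies
embed_word_b = tf.Variable(bias_init_vector.astype(np.float32), name='embed_word_b')

# encode one caption into word ids, mapping out-of-vocabulary words to <unk>
caption = 'a man riding a wave on a surfboard'
ids = [word_to_idx.get(w, word_to_idx['<unk>']) for w in caption.lower().split(' ')]
print ids
```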
aF0.70438543342455129 16 | aF0.71065515953693403 17 | aF0.70796259212616475 18 | aF0.7137309178045258 19 | aF0.70972130606858697 20 | aF0.71264320363297118 21 | aF0.71230150588813113 22 | aF0.71266183163274432 23 | aF0.71383000617409531 24 | aF0.71222052067379871 25 | aF0.71817035091708437 26 | aF0.71678112561610441 27 | aF0.71900126082551585 28 | aF0.72450404567296334 29 | aF0.72652732454870661 30 | aF0.71936356792608769 31 | aF0.72033242351406046 32 | aF0.72968607636874172 33 | aF0.72561088200897139 34 | aF0.7244390812590612 35 | aF0.72698497523275596 36 | aF0.73069298752786138 37 | aF0.73729693796998763 38 | aF0.73198069213775818 39 | aF0.73262131127298169 40 | aF0.73429693443261856 41 | aF0.73372853890175116 42 | aF0.73490602935423266 43 | aF0.73360538079808202 44 | aF0.73405409832707313 45 | aF0.73505874686919437 46 | aF0.73659015588558707 47 | aF0.73524962178515896 48 | aF0.74024590163932913 49 | aF0.74045426642110213 50 | aF0.74236570630058896 51 | aF0.74099218845853465 52 | aF0.74233917889878653 53 | aF0.74182765268035067 54 | aF0.7426081290217329 55 | aF0.74422245108134411 56 | aF0.74005520376683032 57 | aF0.74418414823278223 58 | aF0.73858778239285305 59 | aF0.74325937512862261 60 | aF0.74602720699373459 61 | aF0.74856333634347039 62 | aF0.74826556870816785 63 | aF0.75033971587398085 64 | aF0.74709100559341968 65 | aF0.74876745673204315 66 | aF0.75033740951288808 67 | aF0.75072200676621936 68 | aF0.75079507461467931 69 | aF0.75131406044676519 70 | aF0.75091575091573559 71 | aF0.7533601049008205 72 | aF0.75039924655008505 73 | aF0.75081011677908271 74 | aF0.75387249307026027 75 | aF0.75280507066520175 76 | aF0.75895135402089153 77 | aF0.7509718073570778 78 | aF0.75198312065081352 79 | aF0.75178513469651187 80 | aF0.75653279730892331 81 | aF0.75611004216643973 82 | aF0.75846715177999302 83 | aF0.75828324067043973 84 | aF0.75750477716820241 85 | aF0.75789989385154011 86 | aF0.75972493489581794 87 | aF0.75437812360325152 88 | aF0.75501730103804698 89 | aF0.76171867007671079 90 | aF0.75758749312079343 91 | aF0.75749841762458392 92 | aF0.75914832925833831 93 | aF0.75881129150687854 94 | aF0.75751186083287869 95 | aF0.75484145132240421 96 | aF0.75908663477654936 97 | aF0.75733703416464437 98 | aF0.76081097635671613 99 | aF0.75996548039778178 100 | aF0.75746828259898757 101 | aF0.76331747554973228 102 | aF0.76167146385774265 103 | aF0.75959328927298919 104 | aF0.76060692348369285 105 | aF0.7615271719189376 106 | aF0.76225570032571743 107 | aF0.75984967874892395 108 | aF0.75717563705163327 109 | aF0.76227174752936766 110 | aF0.75881628615836505 111 | aF0.76317765735044829 112 | aF0.75621208762417613 113 | aF0.75900304982729583 114 | aF0.76074306177258977 115 | aF0.76198528962326029 116 | aF0.76501348149357051 117 | aF0.76363267788362532 118 | aF0.76185546082012268 119 | aF0.76123584183157811 120 | aF0.76008633906235867 121 | aF0.76592733375117239 122 | aF0.76374915821478762 123 | aF0.76026841827837732 124 | aF0.76175517026116601 125 | aF0.76038009926157535 126 | aF0.76368359579522416 127 | aF0.75833484600954482 128 | aF0.76197571632803573 129 | aF0.7629299028616281 130 | aF0.76091773426797282 131 | aF0.76531982421873446 132 | aF0.76405016256384894 133 | aF0.7659885788607157 134 | aF0.76358106052923536 135 | aF0.76399991904635078 136 | aF0.76128875773131832 137 | aF0.76268313355126205 138 | aF0.76484217800449661 139 | aF0.76796994008325858 140 | aF0.76355805394910903 141 | aF0.76499747091551318 142 | aF0.76452938096579082 143 | aF0.76352628826580393 144 | aF0.76775447404646602 145 | 
aF0.76476284624849677 146 | aF0.76222562943833194 147 | aF0.76338147833473402 148 | aF0.76421753429051176 149 | aF0.76712827251590487 150 | aF0.76658684321733916 151 | aF0.76485198497635876 152 | aF0.7618884776311402 153 | aF0.76374823481943188 154 | aF0.76541313880411499 155 | aF0.76382010709826542 156 | aF0.76418767057311043 157 | aF0.76132075471696592 158 | aF0.76545898102782794 159 | aF0.76787125803115663 160 | aF0.76265898728100234 161 | aF0.76364074356605049 162 | aF0.76735406919538229 163 | aF0.76499618825982496 164 | aF0.76497955731000866 165 | aF0.76429861529197751 166 | aF0.7638679302227166 167 | aF0.76541870139264157 168 | aF0.76458651450613269 169 | aF0.7663183003496985 170 | aF0.76314153346301195 171 | aF0.76658610271901784 172 | aF0.76437242097670843 173 | aF0.76178136675905428 174 | aF0.76439947911447936 175 | aF0.75999279077217319 176 | aF0.76444729922317267 177 | aF0.76261997510138624 178 | aF0.75889880417638389 179 | aF0.76379881537996663 180 | aF0.76253897998187792 181 | aF0.76538973882854799 182 | aF0.76289754065527593 183 | aF0.76168346510550311 184 | aF0.76136567119633991 185 | aF0.75887144844027898 186 | aF0.76305356893836762 187 | aF0.76344193483908129 188 | aF0.76334680581527092 189 | aF0.76585101253614662 190 | aF0.76123191302260851 191 | aF0.7634585650890392 192 | aF0.76134217400038662 193 | aF0.7604760476047453 194 | aF0.76254760812776401 195 | aF0.76048083248608023 196 | aF0.76025974025972509 197 | aF0.76538307260327387 198 | aF0.76010312987167206 199 | aF0.76104285286183682 200 | aF0.76136592032799855 201 | aF0.76413942221063957 202 | aF0.76069486646408202 203 | aF0.76404426801393877 204 | aF0.76077328225998009 205 | aF0.76148381689745859 206 | aF0.76053659024925901 207 | aF0.76173206065797661 208 | aF0.75772879409025495 209 | aF0.76179131966686497 210 | aF0.7582728650027758 211 | aF0.75840286054825679 212 | aF0.76425427578027694 213 | aF0.7586440542856423 214 | aF0.76066111781102075 215 | aF0.76409670893173964 216 | aF0.76330972465381264 217 | aF0.7588560738735719 218 | aF0.75990113810765669 219 | aF0.7633820800829807 220 | aF0.7603335533025225 221 | aF0.76201052484141185 222 | aF0.76061405612855282 223 | aF0.75630568950803212 224 | aF0.757408546872814 225 | aF0.76159282197870493 226 | aF0.76209790209788697 227 | aF0.762037850355331 228 | aF0.75760592545395122 229 | aF0.76000319144690709 230 | aF0.76037472593181665 231 | aF0.76114156339369321 232 | aF0.76310264152451035 233 | aF0.75998571343530852 234 | aF0.75947565098758807 235 | aF0.76130983169536548 236 | aF0.76105455702241209 237 | aF0.76010427653179524 238 | aF0.75949947197479906 239 | aF0.76106000439006327 240 | aF0.76160169187181759 241 | aF0.76186211288908612 242 | aF0.75888582043588193 243 | aF0.76169278371093407 244 | aF0.76218829923272136 245 | aF0.76508881611589241 246 | aF0.76059954616026204 247 | aF0.75496076619007746 248 | aF0.75848861899940945 249 | aF0.75676319393444702 250 | aF0.7617587319287753 251 | aF0.76011012908244202 252 | a. 
-------------------------------------------------------------------------------- /coco_caption/Bleu_2.pkl: -------------------------------------------------------------------------------- 1 | (lp1 2 | F0.44701445730041445 3 | aF0.4830433062444468 4 | aF0.4939662856657035 5 | aF0.50050569478837414 6 | aF0.50175276072974118 7 | aF0.50897594637847177 8 | aF0.51316415541095439 9 | aF0.51448432825471457 10 | aF0.51853477147182192 11 | aF0.52498707790409072 12 | aF0.51841263036406193 13 | aF0.52040892167020669 14 | aF0.53524628148436371 15 | aF0.52934501579869575 16 | aF0.53453659106341056 17 | aF0.53270843698039461 18 | aF0.54089648298723925 19 | aF0.53803943586804548 20 | aF0.53975898466930994 21 | aF0.54063323725452739 22 | aF0.54217124704850073 23 | aF0.54411558783118907 24 | aF0.53994886567844735 25 | aF0.54709809187597069 26 | aF0.54544906649331237 27 | aF0.54868210119279937 28 | aF0.5530751557038055 29 | aF0.55552914989855995 30 | aF0.54887126568793954 31 | aF0.55155635654771573 32 | aF0.56095074945855317 33 | aF0.55732285864590903 34 | aF0.55810031416061867 35 | aF0.56031775646004767 36 | aF0.56165756096229902 37 | aF0.57194047706892182 38 | aF0.56457488849693882 39 | aF0.56605184239308481 40 | aF0.56889475603038908 41 | aF0.56868565740017174 42 | aF0.56971650083471992 43 | aF0.56792100434663617 44 | aF0.56998952080709786 45 | aF0.57025692681385454 46 | aF0.57170900009487069 47 | aF0.57194722703677359 48 | aF0.57787436758728994 49 | aF0.5753862229678891 50 | aF0.57916014431738894 51 | aF0.57912794779759935 52 | aF0.57929492407106742 53 | aF0.58078922987124637 54 | aF0.5796684664074081 55 | aF0.58005494492785126 56 | aF0.57911581348905161 57 | aF0.58278507841009286 58 | aF0.57745218667697251 59 | aF0.58203875926982884 60 | aF0.58495866143962083 61 | aF0.58706610649185209 62 | aF0.58951244632503708 63 | aF0.59026069155858862 64 | aF0.58883494292550931 65 | aF0.58831417421830079 66 | aF0.59102430979134868 67 | aF0.5911804143343059 68 | aF0.59126462484225051 69 | aF0.59358746429763165 70 | aF0.59187843140612617 71 | aF0.59412353717009891 72 | aF0.59127930606673385 73 | aF0.59285352042928452 74 | aF0.59559156222164167 75 | aF0.59379898303023559 76 | aF0.60248618395778653 77 | aF0.59384720141066505 78 | aF0.59376182375289477 79 | aF0.59483540612433117 80 | aF0.59954501808983662 81 | aF0.59990203267749709 82 | aF0.60332997368419095 83 | aF0.60215894841141249 84 | aF0.60352467419032307 85 | aF0.6013315578396784 86 | aF0.60494560460416125 87 | aF0.59908424648591796 88 | aF0.60016198505966334 89 | aF0.60801929345256411 90 | aF0.60205314238629593 91 | aF0.60094656337983576 92 | aF0.60569322731387798 93 | aF0.60137663725716728 94 | aF0.60447554787780311 95 | aF0.6023968738365173 96 | aF0.60445867473371473 97 | aF0.60402523782931206 98 | aF0.60584654496739965 99 | aF0.60650858956193743 100 | aF0.60272262198958249 101 | aF0.61085598334544078 102 | aF0.60684081983158378 103 | aF0.60587199343224851 104 | aF0.6064372797156411 105 | aF0.61030070725962771 106 | aF0.6093417010233797 107 | aF0.60805388440391828 108 | aF0.6020004209514298 109 | aF0.61043376214203138 110 | aF0.60671576045281872 111 | aF0.61002699552743667 112 | aF0.60464068114185543 113 | aF0.60580018038988437 114 | aF0.60981663264733621 115 | aF0.60877942352291792 116 | aF0.61291674770910931 117 | aF0.61309914625801043 118 | aF0.60996962332835791 119 | aF0.61201405304123013 120 | aF0.60841555061617714 121 | aF0.61573573757223254 122 | aF0.61318783301342683 123 | aF0.60849975642279663 124 | aF0.61252675557034453 125 | aF0.60898045692066782 126 
| aF0.61504865645318463 127 | aF0.60710668744610208 128 | aF0.61091470259637271 129 | aF0.61200473084388596 130 | aF0.6099081398355648 131 | aF0.61536424101384635 132 | aF0.61421700746185792 133 | aF0.61643166727267151 134 | aF0.61445942019481259 135 | aF0.61502708829375341 136 | aF0.61197734689484284 137 | aF0.61316103926709153 138 | aF0.61581535445513302 139 | aF0.61867682178305294 140 | aF0.61490103693948694 141 | aF0.6151495679583342 142 | aF0.61595917782039489 143 | aF0.61430860486926553 144 | aF0.61704711742905327 145 | aF0.61709909839235777 146 | aF0.61258774792155035 147 | aF0.61482858860084921 148 | aF0.61504181807003588 149 | aF0.61979197468898661 150 | aF0.61864272422277389 151 | aF0.61513322327342745 152 | aF0.61295431012147183 153 | aF0.61541712918827107 154 | aF0.61726156954047251 155 | aF0.61754589219786915 156 | aF0.61468076187160803 157 | aF0.61265300280943291 158 | aF0.6168941181595029 159 | aF0.61919901824762136 160 | aF0.61406508562514361 161 | aF0.61502761262377836 162 | aF0.61940977597170821 163 | aF0.61748437661042355 164 | aF0.61693903992142263 165 | aF0.61599967048221671 166 | aF0.61789053547752348 167 | aF0.6183240233733942 168 | aF0.61771838981498883 169 | aF0.61922813949850863 170 | aF0.61450439534501533 171 | aF0.61973030145040742 172 | aF0.61818458359418482 173 | aF0.6157284722486378 174 | aF0.61756890018497412 175 | aF0.61230920347519824 176 | aF0.61686833730123991 177 | aF0.61605411838141333 178 | aF0.61315340461893242 179 | aF0.61646795673946864 180 | aF0.61612208193854168 181 | aF0.61976107823057192 182 | aF0.61380989244511042 183 | aF0.61411410577631032 184 | aF0.61552782708664289 185 | aF0.61152383713225156 186 | aF0.61823178449047611 187 | aF0.61753378676799853 188 | aF0.61619993806514584 189 | aF0.62098948977751711 190 | aF0.61458008818578691 191 | aF0.61736647496168917 192 | aF0.61426458254321592 193 | aF0.61346741040548691 194 | aF0.61660466734297126 195 | aF0.61405728049556663 196 | aF0.61488851431290303 197 | aF0.61849654967711598 198 | aF0.61557906777518223 199 | aF0.61408542908871278 200 | aF0.61607072636777938 201 | aF0.61997841128965314 202 | aF0.61518709141594352 203 | aF0.62030651352035393 204 | aF0.61730140591390059 205 | aF0.61763072287861132 206 | aF0.61475577921228552 207 | aF0.61801016609292014 208 | aF0.61210736598709281 209 | aF0.61659054111000178 210 | aF0.61210331912518356 211 | aF0.61296183498816936 212 | aF0.61783967210400192 213 | aF0.61391822633306581 214 | aF0.61556511671279812 215 | aF0.61938574320350381 216 | aF0.62007090827931111 217 | aF0.61384994461502429 218 | aF0.61500100454818185 219 | aF0.61930892097903145 220 | aF0.61500605774310835 221 | aF0.61638415864859331 222 | aF0.6177303796053345 223 | aF0.61212374229375988 224 | aF0.61247389336872149 225 | aF0.61760359835095147 226 | aF0.61731902973593111 227 | aF0.61841611686862852 228 | aF0.61256405644606771 229 | aF0.61802499832904678 230 | aF0.61760266436194977 231 | aF0.61660486229631017 232 | aF0.62044394356209343 233 | aF0.61546395818335875 234 | aF0.6139598334367975 235 | aF0.61769434235330445 236 | aF0.61695063768578651 237 | aF0.61738899642248457 238 | aF0.61748494771049667 239 | aF0.61645394369516349 240 | aF0.61838848996504037 241 | aF0.62086840544332933 242 | aF0.61543881095760411 243 | aF0.61794125716157011 244 | aF0.61819097069341933 245 | aF0.62225844703921607 246 | aF0.61628904558898612 247 | aF0.6114189319235831 248 | aF0.61394621850444564 249 | aF0.61457811381292315 250 | aF0.62035333536351489 251 | aF0.61688896948080474 252 | a. 
-------------------------------------------------------------------------------- /coco_caption/Bleu_3.pkl: -------------------------------------------------------------------------------- 1 | (lp1 2 | F0.29707668433830187 3 | aF0.33579709089922882 4 | aF0.34468956821642355 5 | aF0.35316911577671461 6 | aF0.35567269818131536 7 | aF0.36026177660692654 8 | aF0.36471407910280479 9 | aF0.36844262557330587 10 | aF0.37220932639239973 11 | aF0.37655673987251204 12 | aF0.37035683680989817 13 | aF0.37383122750940773 14 | aF0.38748195100330884 15 | aF0.38170775143175978 16 | aF0.38555749119890365 17 | aF0.38731723488565623 18 | aF0.3946590294047751 19 | aF0.39315025222532585 20 | aF0.39256515373344114 21 | aF0.39467263589120877 22 | aF0.39696270707495923 23 | aF0.39833479893545831 24 | aF0.39487779046538279 25 | aF0.40155629980022378 26 | aF0.39885564800104872 27 | aF0.402607878910128 28 | aF0.40497159270308258 29 | aF0.40795938013325189 30 | aF0.40344632387585178 31 | aF0.40637259660635661 32 | aF0.41407510930333169 33 | aF0.4102150656455672 34 | aF0.41338795373053411 35 | aF0.41467297534600533 36 | aF0.41516443031850281 37 | aF0.42669972153917346 38 | aF0.41821149081195041 39 | aF0.42038574304194831 40 | aF0.42513526251520656 41 | aF0.42370485724669033 42 | aF0.42488531017056119 43 | aF0.42293690849431748 44 | aF0.42667059727500384 45 | aF0.42586863147294762 46 | aF0.42836885025141436 47 | aF0.4299897907257495 48 | aF0.43411806746351622 49 | aF0.43056178438544851 50 | aF0.43563695720324735 51 | aF0.4345403419284995 52 | aF0.43522214987795199 53 | aF0.43796801751982894 54 | aF0.43571154528320061 55 | aF0.43636970073462372 56 | aF0.43705544038470123 57 | aF0.44065071575633136 58 | aF0.435368739574028 59 | aF0.43961773731432013 60 | aF0.44183301787037665 61 | aF0.44367101901501016 62 | aF0.44872928677342272 63 | aF0.44816791129017247 64 | aF0.44742686100836998 65 | aF0.44632863652345256 66 | aF0.44999945503612349 67 | aF0.44874574716998011 68 | aF0.44921021562245445 69 | aF0.45245486710269617 70 | aF0.45080258927390493 71 | aF0.4517288056115108 72 | aF0.45044330018954776 73 | aF0.45315527170499748 74 | aF0.45515082040339 75 | aF0.45256258077690009 76 | aF0.46140213355011939 77 | aF0.45340587413910699 78 | aF0.45358279690486486 79 | aF0.45457779339247517 80 | aF0.45946683687786549 81 | aF0.46015771281470458 82 | aF0.46459354755904542 83 | aF0.46353604874142995 84 | aF0.46485315789905229 85 | aF0.46229484283542849 86 | aF0.46683625217591457 87 | aF0.4600210089069427 88 | aF0.46163411012099803 89 | aF0.46952465502189561 90 | aF0.46353144069095609 91 | aF0.46138700496076562 92 | aF0.46805364305382735 93 | aF0.46261957473479265 94 | aF0.46822507216697445 95 | aF0.46573614820349651 96 | aF0.46646351011947462 97 | aF0.46808312770932858 98 | aF0.4681009629140841 99 | aF0.46924266963178407 100 | aF0.46433432700446597 101 | aF0.47395576019098851 102 | aF0.46988376324318432 103 | aF0.46803415290875722 104 | aF0.46871674087641396 105 | aF0.47439412173129303 106 | aF0.47246776641391586 107 | aF0.47247615078372701 108 | aF0.46404701465608966 109 | aF0.47357505531118094 110 | aF0.47198794292863699 111 | aF0.47334742622035336 112 | aF0.46985141348151238 113 | aF0.46863596911374922 114 | aF0.47534943990324519 115 | aF0.47217255313131457 116 | aF0.47675171005429551 117 | aF0.4769935641462848 118 | aF0.47488847318383937 119 | aF0.47772451287728834 120 | aF0.47385830175620297 121 | aF0.48120754995513088 122 | aF0.47831364406089844 123 | aF0.47285442227793889 124 | aF0.47875806638914192 125 | aF0.47443960821323983 126 | 
aF0.48247941147377782 127 | aF0.47228211995405689 128 | aF0.4757772503862297 129 | aF0.47804594214004742 130 | aF0.47603058967586698 131 | aF0.48186828438554485 132 | aF0.48125505187767553 133 | aF0.48268839434006816 134 | aF0.48060548983626122 135 | aF0.48185027200671937 136 | aF0.47887654320083178 137 | aF0.47982017328949261 138 | aF0.48288759752518839 139 | aF0.48531943145948719 140 | aF0.4821206819945793 141 | aF0.48170729227927089 142 | aF0.48325775033737572 143 | aF0.48090517713054992 144 | aF0.48413361675400879 145 | aF0.4851792890835338 146 | aF0.47916535165014268 147 | aF0.48267193300728001 148 | aF0.48194793035054923 149 | aF0.48872452714090736 150 | aF0.48591065259487948 151 | aF0.48277714764935498 152 | aF0.48087170207199137 153 | aF0.4837870721360194 154 | aF0.48524386978756578 155 | aF0.48705775470024276 156 | aF0.48266689387815853 157 | aF0.48001752849599644 158 | aF0.48472744672697665 159 | aF0.48754804891825843 160 | aF0.48193539406653291 161 | aF0.48463268530603482 162 | aF0.48784859837816424 163 | aF0.48608657030900687 164 | aF0.48496775726120694 165 | aF0.48621079480351659 166 | aF0.48701075729393739 167 | aF0.48770161730704686 168 | aF0.48711711959918164 169 | aF0.48906596202391678 170 | aF0.48341538387693245 171 | aF0.48954999335676047 172 | aF0.48902007741894865 173 | aF0.48596501774342127 174 | aF0.48741199016181458 175 | aF0.48195482010735796 176 | aF0.48620208346808602 177 | aF0.48718594829361511 178 | aF0.48442536182635992 179 | aF0.48723060252053435 180 | aF0.4865840457623945 181 | aF0.49040598219548953 182 | aF0.48411732663292323 183 | aF0.48542463554923942 184 | aF0.48691735953048676 185 | aF0.48221630463602011 186 | aF0.48897344960543215 187 | aF0.48925494480211323 188 | aF0.48734175849587807 189 | aF0.49382841715209486 190 | aF0.48525781916637306 191 | aF0.48952731180161335 192 | aF0.48489412473669607 193 | aF0.48512672089773834 194 | aF0.48773814645547636 195 | aF0.48511301628691544 196 | aF0.48515441018364269 197 | aF0.4911020151164302 198 | aF0.48774254427619224 199 | aF0.48642747994994562 200 | aF0.48797363436906499 201 | aF0.49245763106373164 202 | aF0.48815789423151973 203 | aF0.49445600842164888 204 | aF0.49117836208991322 205 | aF0.49105485743189986 206 | aF0.48766770615777566 207 | aF0.49168562113874731 208 | aF0.48461555235548326 209 | aF0.49017626894557875 210 | aF0.4860620150849187 211 | aF0.48701753957782395 212 | aF0.49082798212201251 213 | aF0.4876528482555253 214 | aF0.48980980128093132 215 | aF0.4929782608313194 216 | aF0.49502570552137382 217 | aF0.48648368075893822 218 | aF0.48956480171272349 219 | aF0.49366981705435337 220 | aF0.48835059095870148 221 | aF0.49053050478489696 222 | aF0.49372569309352365 223 | aF0.48715792153404475 224 | aF0.48702718177864163 225 | aF0.49273637239595935 226 | aF0.49203436512730986 227 | aF0.49321264770944984 228 | aF0.48763071331968477 229 | aF0.49344111229212106 230 | aF0.49260384260787449 231 | aF0.4912120228436303 232 | aF0.49561553536202341 233 | aF0.49143099524151251 234 | aF0.48810871842778192 235 | aF0.4935120176770717 236 | aF0.49019144022485189 237 | aF0.49313654782706051 238 | aF0.49396248107002222 239 | aF0.49139490752246073 240 | aF0.49398297645157413 241 | aF0.49767769550512825 242 | aF0.49079368869714263 243 | aF0.49303917546688714 244 | aF0.49310343192174433 245 | aF0.49824166501019007 246 | aF0.49207240322236639 247 | aF0.48590100958044885 248 | aF0.48890793373520286 249 | aF0.49177203053584628 250 | aF0.49592014401867918 251 | aF0.49230301633292395 252 | a. 
-------------------------------------------------------------------------------- /coco_caption/Bleu_4.pkl: -------------------------------------------------------------------------------- 1 | (lp1 2 | F0.19103731907605284 3 | aF0.22683608421029841 4 | aF0.23542856254239006 5 | aF0.24459713551700563 6 | aF0.24886658872084744 7 | aF0.25220417267130463 8 | aF0.25457798072775245 9 | aF0.26031391053687997 10 | aF0.26420452058950622 11 | aF0.2659221673056883 12 | aF0.2609398704759891 13 | aF0.2653344724430638 14 | aF0.27584938913452517 15 | aF0.27237052723108879 16 | aF0.27479138411264281 17 | aF0.27797801774011011 18 | aF0.28434626362579979 19 | aF0.28405411752975479 20 | aF0.28204020906024119 21 | aF0.28593043526171896 22 | aF0.28787021905181986 23 | aF0.28805930784022998 24 | aF0.28614477285136153 25 | aF0.29181514811757781 26 | aF0.28874460804022045 27 | aF0.29247434454664928 28 | aF0.29326186082116695 29 | aF0.29634960178063929 30 | aF0.29345886529842169 31 | aF0.29650693146541662 32 | aF0.30227060494896868 33 | aF0.29855928246669844 34 | aF0.30284894566488907 35 | aF0.30260955963716535 36 | aF0.30392599304098794 37 | aF0.31370216448042576 38 | aF0.30521265145259086 39 | aF0.30993010433052742 40 | aF0.31441224952349478 41 | aF0.31192129088018899 42 | aF0.31409217087999597 43 | aF0.31076541060548568 44 | aF0.31587582541598569 45 | aF0.31537510823941756 46 | aF0.3182303877321383 47 | aF0.32096154055415882 48 | aF0.32356939088295256 49 | aF0.31907307143028468 50 | aF0.32403800046394565 51 | aF0.32287272311748239 52 | aF0.32315344964331982 53 | aF0.32640381048178768 54 | aF0.3241802927116072 55 | aF0.32610084694110908 56 | aF0.32659814021512662 57 | aF0.32997586663799006 58 | aF0.32608335039896919 59 | aF0.32858256719319212 60 | aF0.33162553092873226 61 | aF0.33210663715736538 62 | aF0.33869578316024507 63 | aF0.33722396656915149 64 | aF0.33708481990948791 65 | aF0.33567921379277449 66 | aF0.33876934289134653 67 | aF0.3371020407956643 68 | aF0.33760526283255243 69 | aF0.34166908601292334 70 | aF0.3394921246698111 71 | aF0.34054519042631148 72 | aF0.34001763209791031 73 | aF0.34277039194025344 74 | aF0.34451105787295644 75 | aF0.34200696303712547 76 | aF0.34973774603210861 77 | aF0.34230101070171876 78 | aF0.34382464607014346 79 | aF0.34457167675816985 80 | aF0.34921177041120893 81 | aF0.34986107855224013 82 | aF0.35394763736969825 83 | aF0.35372788977751429 84 | aF0.3545722007597692 85 | aF0.35297960676403084 86 | aF0.35785848228376343 87 | aF0.35115096945545998 88 | aF0.35240026852564793 89 | aF0.36015821930253972 90 | aF0.35419418837069244 91 | aF0.35222748106862894 92 | aF0.35900355377876059 93 | aF0.35270470018579997 94 | aF0.36017948097716079 95 | aF0.3571520904537403 96 | aF0.35784430443087445 97 | aF0.36033686539157744 98 | aF0.35936983184100374 99 | aF0.36103735701037065 100 | aF0.35468969311289089 101 | aF0.36481868731861383 102 | aF0.3610758660360579 103 | aF0.35833139721326762 104 | aF0.35935112724555651 105 | aF0.36681616517358284 106 | aF0.36423449763547083 107 | aF0.36436438155127382 108 | aF0.35466488745131275 109 | aF0.3638962331113289 110 | aF0.3648536348390829 111 | aF0.36538553017534126 112 | aF0.36289910709830586 113 | aF0.35994929215702953 114 | aF0.36733874497939828 115 | aF0.36397063140582475 116 | aF0.36835167441436145 117 | aF0.36867146530705597 118 | aF0.36726539384222662 119 | aF0.36981564654375382 120 | aF0.36681805279447088 121 | aF0.37403727775498979 122 | aF0.37082434601287384 123 | aF0.36561052287422757 124 | aF0.37239070490596776 125 | aF0.36741687886384988 126 
| aF0.37648565964725067 127 | aF0.36583814676680609 128 | aF0.36820048588383369 129 | aF0.37085865562520698 130 | aF0.36971634701005907 131 | aF0.37524245754728314 132 | aF0.37519371420589775 133 | aF0.37569237627230612 134 | aF0.37505862408692603 135 | aF0.37494140621179811 136 | aF0.37271055324782215 137 | aF0.37381762178049976 138 | aF0.37645900557431977 139 | aF0.37860386900412352 140 | aF0.37667114598782409 141 | aF0.37451087901501978 142 | aF0.37831955612685131 143 | aF0.37467437676980575 144 | aF0.37825443356479066 145 | aF0.37993701658177981 146 | aF0.37348682450987941 147 | aF0.37795797618624111 148 | aF0.37645504797177287 149 | aF0.38387768811695189 150 | aF0.3799090659899263 151 | aF0.37685270057549286 152 | aF0.37596702261272047 153 | aF0.37845321062600784 154 | aF0.3803725963060608 155 | aF0.3822563128330731 156 | aF0.3776679861024328 157 | aF0.37493680540871699 158 | aF0.37993441984854709 159 | aF0.38199833532514771 160 | aF0.3776239901111067 161 | aF0.38090435829260305 162 | aF0.38304336514433907 163 | aF0.38140001342240798 164 | aF0.37942568337606175 165 | aF0.3831993228670541 166 | aF0.38252106021586818 167 | aF0.38417860160467121 168 | aF0.38250562004420691 169 | aF0.38423758350699044 170 | aF0.37950946045216399 171 | aF0.38616452829602393 172 | aF0.3853229532735552 173 | aF0.38278297504880476 174 | aF0.38341932995367961 175 | aF0.37845852212585496 176 | aF0.38258576524342491 177 | aF0.38457802752250619 178 | aF0.38202628685584999 179 | aF0.38541258589721178 180 | aF0.38245170915044602 181 | aF0.38745939261986095 182 | aF0.38207188761124977 183 | aF0.38237760969105117 184 | aF0.38384320986974424 185 | aF0.38015615863298075 186 | aF0.38614121306192073 187 | aF0.38705960837141862 188 | aF0.38545673597270552 189 | aF0.39340474858364699 190 | aF0.38254301091323906 191 | aF0.38711020214855368 192 | aF0.382811340276414 193 | aF0.3837974756188956 194 | aF0.38568903486431172 195 | aF0.38317133098419892 196 | aF0.38212551706633319 197 | aF0.39003297824229449 198 | aF0.38564809086809521 199 | aF0.38560682479375591 200 | aF0.38661079526224212 201 | aF0.39081229235860743 202 | aF0.387753271729882 203 | aF0.39427041718921269 204 | aF0.39120592522865488 205 | aF0.39103004797818908 206 | aF0.38733975067270082 207 | aF0.39210506439882942 208 | aF0.38463518152520504 209 | aF0.39002497769459588 210 | aF0.38688767829011556 211 | aF0.38782503967563448 212 | aF0.39050733830750117 213 | aF0.38783629528341018 214 | aF0.39075195624970827 215 | aF0.39231135321734867 216 | aF0.39621080075968301 217 | aF0.38519496993855645 218 | aF0.39095855075647445 219 | aF0.39464032402579052 220 | aF0.38878908563787218 221 | aF0.39139149780827431 222 | aF0.39569559841695606 223 | aF0.38962369059152524 224 | aF0.38811829689914212 225 | aF0.39358296190106823 226 | aF0.39262015735556394 227 | aF0.39411030708781664 228 | aF0.3897808831776835 229 | aF0.39429106708871375 230 | aF0.39371099054405934 231 | aF0.39286280253296424 232 | aF0.39659726531556033 233 | aF0.39438239328877034 234 | aF0.38857711679154944 235 | aF0.39497591048562675 236 | aF0.3903709293200292 237 | aF0.39519066565046018 238 | aF0.39655001015540858 239 | aF0.39271192457138304 240 | aF0.39663870194427303 241 | aF0.39977504950447323 242 | aF0.3923138889326509 243 | aF0.39466442112198152 244 | aF0.39469625701331201 245 | aF0.4003041739901127 246 | aF0.39477252852653533 247 | aF0.38676718932616477 248 | aF0.39068639404569161 249 | aF0.39480118886047777 250 | aF0.39771579914878369 251 | aF0.39478737966100647 252 | a. 
-------------------------------------------------------------------------------- /coco_caption/CIDEr.pkl: -------------------------------------------------------------------------------- 1 | (lp1 2 | cnumpy.core.multiarray 3 | scalar 4 | p2 5 | (cnumpy 6 | dtype 7 | p3 8 | (S'f8' 9 | I0 10 | I1 11 | tRp4 12 | (I3 13 | S'<' 14 | NNNI-1 15 | I-1 16 | I0 17 | tbS'a\xb6#V6p\xe1?' 18 | tRp5 19 | ag2 20 | (g4 21 | S'sW\x1c\xb5S\xe5\xe5?' 22 | tRp6 23 | ag2 24 | (g4 25 | S'^oE\x18~\xaa\xe7?' 26 | tRp7 27 | ag2 28 | (g4 29 | S'\xc31\x0f\xf8)\xe9\xe8?' 30 | tRp8 31 | ag2 32 | (g4 33 | S'V\x03;\xcd\xda}\xe9?' 34 | tRp9 35 | ag2 36 | (g4 37 | S'\xf0g\xfa\xd1o\t\xea?' 38 | tRp10 39 | ag2 40 | (g4 41 | S'\x9c\xdb\xb9\xea\x94}\xea?' 42 | tRp11 43 | ag2 44 | (g4 45 | S'\x92\x94Y\x8e\x1c\xd9\xea?' 46 | tRp12 47 | ag2 48 | (g4 49 | S'\x05\r\xd8(\xd8\x10\xeb?' 50 | tRp13 51 | ag2 52 | (g4 53 | S'\xee\x9a(D$\xb6\xeb?' 54 | tRp14 55 | ag2 56 | (g4 57 | S'w\x86\x03\r6i\xeb?' 58 | tRp15 59 | ag2 60 | (g4 61 | S'V\xc2\x8b\x8e%\xd2\xeb?' 62 | tRp16 63 | ag2 64 | (g4 65 | S'\xa9Oo\xda\xc6\x9b\xec?' 66 | tRp17 67 | ag2 68 | (g4 69 | S'_\x99\xa0\xfd\xe0\xc4\xec?' 70 | tRp18 71 | ag2 72 | (g4 73 | S'\x15\x95$WEI\xed?' 74 | tRp19 75 | ag2 76 | (g4 77 | S'M\x1c\x13\xbc\x9a;\xed?' 78 | tRp20 79 | ag2 80 | (g4 81 | S'If\xea\x0f\xd3\xc5\xed?' 82 | tRp21 83 | ag2 84 | (g4 85 | S'\xb3:,Bf\x9c\xed?' 86 | tRp22 87 | ag2 88 | (g4 89 | S'b1Kp\xf4\xec\xed?' 90 | tRp23 91 | ag2 92 | (g4 93 | S'\x95qd<\xbaB\xee?' 94 | tRp24 95 | ag2 96 | (g4 97 | S'\x9chw\x1fE5\xee?' 98 | tRp25 99 | ag2 100 | (g4 101 | S'\xc0sN\xe7c\x9a\xee?' 102 | tRp26 103 | ag2 104 | (g4 105 | S'\x1c\x7f\x88\x0fQv\xee?' 106 | tRp27 107 | ag2 108 | (g4 109 | S'\xebK\xd8\xe2\x14\x17\xef?' 110 | tRp28 111 | ag2 112 | (g4 113 | S'\x9c<1\xce.\xd7\xee?' 114 | tRp29 115 | ag2 116 | (g4 117 | S'\xe7\\Y\xd3\xfb;\xef?' 118 | tRp30 119 | ag2 120 | (g4 121 | S'|E\x9c\xd9\xbf\x8d\xef?' 122 | tRp31 123 | ag2 124 | (g4 125 | S'_\xeb_\x06@\xfe\xef?' 126 | tRp32 127 | ag2 128 | (g4 129 | S'\xf7\x8e\xac\xd0\xdf\x83\xef?' 130 | tRp33 131 | ag2 132 | (g4 133 | S'[\xd3\xa5\xe3\xc2\xcc\xef?' 134 | tRp34 135 | ag2 136 | (g4 137 | S'\x18Z\x90L\x10\x1f\xf0?' 138 | tRp35 139 | ag2 140 | (g4 141 | S'\xbd\xadcl\x14\x04\xf0?' 142 | tRp36 143 | ag2 144 | (g4 145 | S'\x13\xb8\xd5wq-\xf0?' 146 | tRp37 147 | ag2 148 | (g4 149 | S'\x8a\x11\xdc\xb7\xa5P\xf0?' 150 | tRp38 151 | ag2 152 | (g4 153 | S'\x83(\x8ce8M\xf0?' 154 | tRp39 155 | ag2 156 | (g4 157 | S'J\xf3\xd5C\x01\x9d\xf0?' 158 | tRp40 159 | ag2 160 | (g4 161 | S'[\xff\n\xbcY\x82\xf0?' 162 | tRp41 163 | ag2 164 | (g4 165 | S'\xce\xc8\x99\xb7\x8dw\xf0?' 166 | tRp42 167 | ag2 168 | (g4 169 | S'f\x85\xd1jj\xb8\xf0?' 170 | tRp43 171 | ag2 172 | (g4 173 | S'M\xbbt#\xad\xa2\xf0?' 174 | tRp44 175 | ag2 176 | (g4 177 | S'B\xb2\x9bJ\xbe\xc3\xf0?' 178 | tRp45 179 | ag2 180 | (g4 181 | S'5\x17\xb7_\x91\xa5\xf0?' 182 | tRp46 183 | ag2 184 | (g4 185 | S'PC\xf3\xd8`\xdd\xf0?' 186 | tRp47 187 | ag2 188 | (g4 189 | S'\xe6\xec;\x99\x19\xfd\xf0?' 190 | tRp48 191 | ag2 192 | (g4 193 | S'\xe3\x85\xcc\xcbY\x10\xf1?' 194 | tRp49 195 | ag2 196 | (g4 197 | S'\x12\x93\xed\xb2\xa5\x16\xf1?' 198 | tRp50 199 | ag2 200 | (g4 201 | S'%A\x7f3\xf0Q\xf1?' 202 | tRp51 203 | ag2 204 | (g4 205 | S'\x17D,M54\xf1?' 206 | tRp52 207 | ag2 208 | (g4 209 | S'\x84}\xab\xb2\xe6C\xf1?' 210 | tRp53 211 | ag2 212 | (g4 213 | S'\x85\xaf\x81|\xde>\xf1?' 214 | tRp54 215 | ag2 216 | (g4 217 | S'6\xe7;J\xd0x\xf1?' 218 | tRp55 219 | ag2 220 | (g4 221 | S'z\xfb\r\xad\x8c|\xf1?' 
222 | tRp56 223 | ag2 224 | (g4 225 | S'\x9c@\xb2\x89>\x81\xf1?' 226 | tRp57 227 | ag2 228 | (g4 229 | S'\xdb4\xce\x97Fn\xf1?' 230 | tRp58 231 | ag2 232 | (g4 233 | S'_^\xa1Xc\x97\xf1?' 234 | tRp59 235 | ag2 236 | (g4 237 | S"V'\n\x05\x0c\xac\xf1?" 238 | tRp60 239 | ag2 240 | (g4 241 | S'\x10\x1d\xc8\xb5$t\xf1?' 242 | tRp61 243 | ag2 244 | (g4 245 | S'c\x90\xf5VV\x9e\xf1?' 246 | tRp62 247 | ag2 248 | (g4 249 | S'\xaaV\x86l\xcb\xb2\xf1?' 250 | tRp63 251 | ag2 252 | (g4 253 | S'\x7f\x11\xd4{\xdf\xec\xf1?' 254 | tRp64 255 | ag2 256 | (g4 257 | S'@\x15\x99&,\xf4\xf1?' 258 | tRp65 259 | ag2 260 | (g4 261 | S'\xf9\x93\xa1\xb0\xe8\xef\xf1?' 262 | tRp66 263 | ag2 264 | (g4 265 | S'\xd8| \xa8w\xef\xf1?' 266 | tRp67 267 | ag2 268 | (g4 269 | S's\xe4-N\xcb\xc3\xf1?' 270 | tRp68 271 | ag2 272 | (g4 273 | S's\xa1\x99\xda\x9c\x1e\xf2?' 274 | tRp69 275 | ag2 276 | (g4 277 | S'e\x84\xe9\x9a\x90\x00\xf2?' 278 | tRp70 279 | ag2 280 | (g4 281 | S'\xaeB7Js\x13\xf2?' 282 | tRp71 283 | ag2 284 | (g4 285 | S'\x08pj@5\x1a\xf2?' 286 | tRp72 287 | ag2 288 | (g4 289 | S'\x1f\x881v\xfd;\xf2?' 290 | tRp73 291 | ag2 292 | (g4 293 | S'\xab2D&\xd5+\xf2?' 294 | tRp74 295 | ag2 296 | (g4 297 | S'\xec\xddG\x85\xea\x16\xf2?' 298 | tRp75 299 | ag2 300 | (g4 301 | S"\x02\xba'=\x80M\xf2?" 302 | tRp76 303 | ag2 304 | (g4 305 | S'\xf2B\xb1H&z\xf2?' 306 | tRp77 307 | ag2 308 | (g4 309 | S'\xe7\x8c\xa0%\xcf+\xf2?' 310 | tRp78 311 | ag2 312 | (g4 313 | S'\xa1L\xe9|\x15\xa4\xf2?' 314 | tRp79 315 | ag2 316 | (g4 317 | S'w\xc6\xc6\xde\xf5P\xf2?' 318 | tRp80 319 | ag2 320 | (g4 321 | S'\t\xfa\xac\xc7\xadW\xf2?' 322 | tRp81 323 | ag2 324 | (g4 325 | S'\xce\xcc\x90\x19kM\xf2?' 326 | tRp82 327 | ag2 328 | (g4 329 | S'F\x80[\xf0\xcau\xf2?' 330 | tRp83 331 | ag2 332 | (g4 333 | S'\xcdL\xabx\xbc\x8a\xf2?' 334 | tRp84 335 | ag2 336 | (g4 337 | S'=\xd7?:3\xb9\xf2?' 338 | tRp85 339 | ag2 340 | (g4 341 | S'\x0b\xa3\x8e_\xeb\xbd\xf2?' 342 | tRp86 343 | ag2 344 | (g4 345 | S'\xa1\xe2\xd3:\xff\xd2\xf2?' 346 | tRp87 347 | ag2 348 | (g4 349 | S" '1\x84R\xab\xf2?" 350 | tRp88 351 | ag2 352 | (g4 353 | S'\xa9n\xee\x1c\xce\x03\xf3?' 354 | tRp89 355 | ag2 356 | (g4 357 | S'\xc7\x15\x00o.\xbd\xf2?' 358 | tRp90 359 | ag2 360 | (g4 361 | S'\xde\xb8\x1bO\x9c\xbf\xf2?' 362 | tRp91 363 | ag2 364 | (g4 365 | S'\xec\x91H\x8fB\x07\xf3?' 366 | tRp92 367 | ag2 368 | (g4 369 | S'\xffr\xd9\xd1\xad\xcc\xf2?' 370 | tRp93 371 | ag2 372 | (g4 373 | S'\xb7\x01Y\x04m\xb9\xf2?' 374 | tRp94 375 | ag2 376 | (g4 377 | S'\xf8ZK\x1a\x1f\xe5\xf2?' 378 | tRp95 379 | ag2 380 | (g4 381 | S'\x10\xaaT;(\xbd\xf2?' 382 | tRp96 383 | ag2 384 | (g4 385 | S'a\x95\x98\xac,1\xf3?' 386 | tRp97 387 | ag2 388 | (g4 389 | S'\xebn\xa7\xdeD\xc7\xf2?' 390 | tRp98 391 | ag2 392 | (g4 393 | S'\xa5\xe2\xf0\xd9\xe9\x01\xf3?' 394 | tRp99 395 | ag2 396 | (g4 397 | S'\xd3\xb2jBP\x05\xf3?' 398 | tRp100 399 | ag2 400 | (g4 401 | S'TS\x92.\xf2\x08\xf3?' 402 | tRp101 403 | ag2 404 | (g4 405 | S'5\xab\xfd\xff\xb8\xe9\xf2?' 406 | tRp102 407 | ag2 408 | (g4 409 | S'\xb8\xf2\xa7\x1c\t\xd5\xf2?' 410 | tRp103 411 | ag2 412 | (g4 413 | S'\xf2\xc5w\x8a\xc1E\xf3?' 414 | tRp104 415 | ag2 416 | (g4 417 | S'\xd7\xc8\x16\xa3U\x0e\xf3?' 418 | tRp105 419 | ag2 420 | (g4 421 | S'T\x1c\xa7,o\xef\xf2?' 422 | tRp106 423 | ag2 424 | (g4 425 | S'*\xf1 \x9f\xf4\xfa\xf2?' 426 | tRp107 427 | ag2 428 | (g4 429 | S'.\x97J\xa6\xa6\\\xf3?' 430 | tRp108 431 | ag2 432 | (g4 433 | S'\xe8\xc1P:\x90?\xf3?' 434 | tRp109 435 | ag2 436 | (g4 437 | S'\xf9A0\xdf\xaaI\xf3?' 438 | tRp110 439 | ag2 440 | (g4 441 | S'\xdc\xb8\xbf\x05,\xd2\xf2?' 
442 | tRp111 443 | ag2 444 | (g4 445 | S'T\xee\x13\x7f\xc84\xf3?' 446 | tRp112 447 | ag2 448 | (g4 449 | S'\xb2\x9b{\xc65M\xf3?' 450 | tRp113 451 | ag2 452 | (g4 453 | S'\xe8_\x19@?{\xf3?' 454 | tRp114 455 | ag2 456 | (g4 457 | S'\xb9\xb0\xc8L\xbb\x1d\xf3?' 458 | tRp115 459 | ag2 460 | (g4 461 | S'\xea\xf9\x03_\xedX\xf3?' 462 | tRp116 463 | ag2 464 | (g4 465 | S'\xd4&\x08H\xabH\xf3?' 466 | tRp117 467 | ag2 468 | (g4 469 | S'W\xa55\xea\x18>\xf3?' 470 | tRp118 471 | ag2 472 | (g4 473 | S'2i\x02\xee\xf7U\xf3?' 474 | tRp119 475 | ag2 476 | (g4 477 | S'\x13\x9a\x10\xb7\xe9\x81\xf3?' 478 | tRp120 479 | ag2 480 | (g4 481 | S'x\xfe\xfe\x0e\tQ\xf3?' 482 | tRp121 483 | ag2 484 | (g4 485 | S'\x9br\xa0qK~\xf3?' 486 | tRp122 487 | ag2 488 | (g4 489 | S'\xe8\xb1\x8a0\xfdO\xf3?' 490 | tRp123 491 | ag2 492 | (g4 493 | S'\xc0\xfc\x01\xe8\x85\xc1\xf3?' 494 | tRp124 495 | ag2 496 | (g4 497 | S'\xd09\xbde\xa6g\xf3?' 498 | tRp125 499 | ag2 500 | (g4 501 | S'\xa4n=\xd3\x1b9\xf3?' 502 | tRp126 503 | ag2 504 | (g4 505 | S'i\x9f.\xc0\xaf\x98\xf3?' 506 | tRp127 507 | ag2 508 | (g4 509 | S'E\xa0J\x1e\x16f\xf3?' 510 | tRp128 511 | ag2 512 | (g4 513 | S'{\xaf\xec\xddT\xc4\xf3?' 514 | tRp129 515 | ag2 516 | (g4 517 | S'1rIR\xf6\\\xf3?' 518 | tRp130 519 | ag2 520 | (g4 521 | S'\xa1\xd8Y!\x1e\x8c\xf3?' 522 | tRp131 523 | ag2 524 | (g4 525 | S'\x92\xd0\xba\xdd"\x80\xf3?' 526 | tRp132 527 | ag2 528 | (g4 529 | S'\xa3\x94\xc7\x8eV\x8f\xf3?' 530 | tRp133 531 | ag2 532 | (g4 533 | S'\xd11Xq\xb9\xbe\xf3?' 534 | tRp134 535 | ag2 536 | (g4 537 | S'\xe0\x9e\x0fn\xe3\xaf\xf3?' 538 | tRp135 539 | ag2 540 | (g4 541 | S'\x0f\xf7\x1eQ\xcc\xc4\xf3?' 542 | tRp136 543 | ag2 544 | (g4 545 | S'E[\x0f\x80m\xa7\xf3?' 546 | tRp137 547 | ag2 548 | (g4 549 | S'\xefA\x18\xc2\x01\xb7\xf3?' 550 | tRp138 551 | ag2 552 | (g4 553 | S'\x15)\xbd\x97\xcf\x97\xf3?' 554 | tRp139 555 | ag2 556 | (g4 557 | S'\x88|Xd\x03\xaa\xf3?' 558 | tRp140 559 | ag2 560 | (g4 561 | S'\xe4\nV\xd9\x99\xaa\xf3?' 562 | tRp141 563 | ag2 564 | (g4 565 | S'\x9f0\xf4i*\xd0\xf3?' 566 | tRp142 567 | ag2 568 | (g4 569 | S'\x1b\x1c\x86\x16?\xbf\xf3?' 570 | tRp143 571 | ag2 572 | (g4 573 | S'eqgj\xd9\xba\xf3?' 574 | tRp144 575 | ag2 576 | (g4 577 | S'\xbc\x88\x82\xa4f\xea\xf3?' 578 | tRp145 579 | ag2 580 | (g4 581 | S'a\xc4\xe0M\x1f\xbe\xf3?' 582 | tRp146 583 | ag2 584 | (g4 585 | S'\x99\xff\x19\xa3\xa2\xd3\xf3?' 586 | tRp147 587 | ag2 588 | (g4 589 | S'\x124\x96\x8a\xb4\xe4\xf3?' 590 | tRp148 591 | ag2 592 | (g4 593 | S'6#\xdan\x9d\x9b\xf3?' 594 | tRp149 595 | ag2 596 | (g4 597 | S'\xba\xd1\xd6\x8e\xb7\xcd\xf3?' 598 | tRp150 599 | ag2 600 | (g4 601 | S'BEz}H\xe5\xf3?' 602 | tRp151 603 | ag2 604 | (g4 605 | S'!\x82hfM\x02\xf4?' 606 | tRp152 607 | ag2 608 | (g4 609 | S'(\xd6\r\xbc\x8f\xfb\xf3?' 610 | tRp153 611 | ag2 612 | (g4 613 | S'=\x81/8n\xcb\xf3?' 614 | tRp154 615 | ag2 616 | (g4 617 | S'\x9c\xbe\xd8\xb0\xdc\xb6\xf3?' 618 | tRp155 619 | ag2 620 | (g4 621 | S'\xb6\x80\xb4\xae\xfe\xc3\xf3?' 622 | tRp156 623 | ag2 624 | (g4 625 | S'\xaegc\xb0\xdc\xd3\xf3?' 626 | tRp157 627 | ag2 628 | (g4 629 | S'\xa8\xa1\x9b\x94\xb6\x13\xf4?' 630 | tRp158 631 | ag2 632 | (g4 633 | S'\x8a"\x8dB;\xe8\xf3?' 634 | tRp159 635 | ag2 636 | (g4 637 | S'\x10s\xdf\xae]\xb9\xf3?' 638 | tRp160 639 | ag2 640 | (g4 641 | S'\xe0Z\xf2\x1a&\xdf\xf3?' 642 | tRp161 643 | ag2 644 | (g4 645 | S'y\xed0 \xb9\xf5\xf3?' 646 | tRp162 647 | ag2 648 | (g4 649 | S'\x1b\xa3\xbd\x14{\xea\xf3?' 650 | tRp163 651 | ag2 652 | (g4 653 | S'\x19\xfa\x1fUX\xec\xf3?' 654 | tRp164 655 | ag2 656 | (g4 657 | S'\xecU\x1f+C\x13\xf4?' 
658 | tRp165 659 | ag2 660 | (g4 661 | S'\x8c\x80\xf7\xe4(\xe9\xf3?' 662 | tRp166 663 | ag2 664 | (g4 665 | S'c1ia\xd7\xb6\xf3?' 666 | tRp167 667 | ag2 668 | (g4 669 | S'\x14\xf6\xe1Og\x07\xf4?' 670 | tRp168 671 | ag2 672 | (g4 673 | S'\xa20\xd9g7\t\xf4?' 674 | tRp169 675 | ag2 676 | (g4 677 | S'\xc4P$T\xcb\xf6\xf3?' 678 | tRp170 679 | ag2 680 | (g4 681 | S'\xc6V\xcd\xec\xa4\x11\xf4?' 682 | tRp171 683 | ag2 684 | (g4 685 | S'\x97\xdc\xe3\x93I\xf5\xf3?' 686 | tRp172 687 | ag2 688 | (g4 689 | S'\xff\xfcL\xa9Q\xc8\xf3?' 690 | tRp173 691 | ag2 692 | (g4 693 | S'\x86\x02\x81\xf5B\xfc\xf3?' 694 | tRp174 695 | ag2 696 | (g4 697 | S'\xf0\xac|\xd9\xc6\n\xf4?' 698 | tRp175 699 | ag2 700 | (g4 701 | S'\xbf-\xca\\\x91\xea\xf3?' 702 | tRp176 703 | ag2 704 | (g4 705 | S'\x83p\xcb\xd6]\xe9\xf3?' 706 | tRp177 707 | ag2 708 | (g4 709 | S"?\xf5'\xb2\x0b\xd3\xf3?" 710 | tRp178 711 | ag2 712 | (g4 713 | S'\xa8\x82\xe3p\xae\xfa\xf3?' 714 | tRp179 715 | ag2 716 | (g4 717 | S'2\x8b\xe7\xf4\xc7\xf7\xf3?' 718 | tRp180 719 | ag2 720 | (g4 721 | S'R\xef[\xecR\xf1\xf3?' 722 | tRp181 723 | ag2 724 | (g4 725 | S'\xb8;\xee\x81?\xe7\xf3?' 726 | tRp182 727 | ag2 728 | (g4 729 | S'\xe9\xa8\xad\xe1\xeb\xf6\xf3?' 730 | tRp183 731 | ag2 732 | (g4 733 | S'M J\x19\xd5\x17\xf4?' 734 | tRp184 735 | ag2 736 | (g4 737 | S'B\x83\r)\xd8\xe4\xf3?' 738 | tRp185 739 | ag2 740 | (g4 741 | S'\xfd\xa0\xc7\x13\x81\xec\xf3?' 742 | tRp186 743 | ag2 744 | (g4 745 | S'\xd5\xf5\x04\x19f\x05\xf4?' 746 | tRp187 747 | ag2 748 | (g4 749 | S'6\x18\xb9\xe2q\xd4\xf3?' 750 | tRp188 751 | ag2 752 | (g4 753 | S'\xb3\xb0\x8b\xb6\x8a:\xf4?' 754 | tRp189 755 | ag2 756 | (g4 757 | S'\x8f\x00u\x07G*\xf4?' 758 | tRp190 759 | ag2 760 | (g4 761 | S'\x04:c\x99\x9f\x06\xf4?' 762 | tRp191 763 | ag2 764 | (g4 765 | S'x\xe7\x7f\xaa\x8eh\xf4?' 766 | tRp192 767 | ag2 768 | (g4 769 | S'\xfb\x1aR\x13m\xdd\xf3?' 770 | tRp193 771 | ag2 772 | (g4 773 | S'y\xd4\xc8\n\xcb\x1e\xf4?' 774 | tRp194 775 | ag2 776 | (g4 777 | S'\x8e\xf9.\xc9\xb5\xfe\xf3?' 778 | tRp195 779 | ag2 780 | (g4 781 | S'j\x86r\n\x94\x03\xf4?' 782 | tRp196 783 | ag2 784 | (g4 785 | S'\x02N\x03\x0e4\x03\xf4?' 786 | tRp197 787 | ag2 788 | (g4 789 | S'\xa5%\x16\x9e\xbf\xe4\xf3?' 790 | tRp198 791 | ag2 792 | (g4 793 | S'Mu\xe9\xdc\xf5\xec\xf3?' 794 | tRp199 795 | ag2 796 | (g4 797 | S'=\xd6\x90Yv(\xf4?' 798 | tRp200 799 | ag2 800 | (g4 801 | S'\xecL$W\xe7\x12\xf4?' 802 | tRp201 803 | ag2 804 | (g4 805 | S'\xfff\x8def\x06\xf4?' 806 | tRp202 807 | ag2 808 | (g4 809 | S'\xbb\xd2E\x11\xca\x15\xf4?' 810 | tRp203 811 | ag2 812 | (g4 813 | S'\xce\xa1\xac\x14\xab<\xf4?' 814 | tRp204 815 | ag2 816 | (g4 817 | S'jl|\x99M\x14\xf4?' 818 | tRp205 819 | ag2 820 | (g4 821 | S'\xf3vr\xb6WS\xf4?' 822 | tRp206 823 | ag2 824 | (g4 825 | S'B\xc4!\x7f\x06\x1d\xf4?' 826 | tRp207 827 | ag2 828 | (g4 829 | S'ZNI\xeb1I\xf4?' 830 | tRp208 831 | ag2 832 | (g4 833 | S'r\t%6\x8f\x11\xf4?' 834 | tRp209 835 | ag2 836 | (g4 837 | S'\xa5\xe10\xbd\xf59\xf4?' 838 | tRp210 839 | ag2 840 | (g4 841 | S'\xbe51\xffS\t\xf4?' 842 | tRp211 843 | ag2 844 | (g4 845 | S'z\xa4 \x11\xf2+\xf4?' 846 | tRp212 847 | ag2 848 | (g4 849 | S'\xd5\xca\xb4\xdb\xba\xfd\xf3?' 850 | tRp213 851 | ag2 852 | (g4 853 | S'l\xf1\x85\xadC\x04\xf4?' 854 | tRp214 855 | ag2 856 | (g4 857 | S'0\xcbo\xab\x90"\xf4?' 858 | tRp215 859 | ag2 860 | (g4 861 | S'\x84R\xebEg\xfc\xf3?' 862 | tRp216 863 | ag2 864 | (g4 865 | S'K\x1d\n\xd0\xf2/\xf4?' 866 | tRp217 867 | ag2 868 | (g4 869 | S'\x1bZ\xd4\xee\xaa?\xf4?' 870 | tRp218 871 | ag2 872 | (g4 873 | S',\x12r\xadw_\xf4?' 
874 | tRp219 875 | ag2 876 | (g4 877 | S'L\xef\xb5\x00\xc7\x00\xf4?' 878 | tRp220 879 | ag2 880 | (g4 881 | S'\r\x8f\x92\x0f\x06:\xf4?' 882 | tRp221 883 | ag2 884 | (g4 885 | S'\xca:-\xfbf\\\xf4?' 886 | tRp222 887 | ag2 888 | (g4 889 | S"' \xa8\xa3\x91\xfd\xf3?" 890 | tRp223 891 | ag2 892 | (g4 893 | S'\xed\x02\xd7\xb1\xf6,\xf4?' 894 | tRp224 895 | ag2 896 | (g4 897 | S'\xa3\xa7u\xf5\x15!\xf4?' 898 | tRp225 899 | ag2 900 | (g4 901 | S'L\xe4\xd2\xd3G)\xf4?' 902 | tRp226 903 | ag2 904 | (g4 905 | S'\xf5A\xaa\xd7/\x0b\xf4?' 906 | tRp227 907 | ag2 908 | (g4 909 | S'\xcf\xac@>\xff@\xf4?' 910 | tRp228 911 | ag2 912 | (g4 913 | S'\x81x\x13Rm\x1c\xf4?' 914 | tRp229 915 | ag2 916 | (g4 917 | S'\t-n\x98"I\xf4?' 918 | tRp230 919 | ag2 920 | (g4 921 | S'\x9a\xd6|\xd0t\xfa\xf3?' 922 | tRp231 923 | ag2 924 | (g4 925 | S'\xe77\xaf$T0\xf4?' 926 | tRp232 927 | ag2 928 | (g4 929 | S')\r\x16\xb5<4\xf4?' 930 | tRp233 931 | ag2 932 | (g4 933 | S'\x1a\xce\xf3\xe0\x0b9\xf4?' 934 | tRp234 935 | ag2 936 | (g4 937 | S'i\xdf2AtC\xf4?' 938 | tRp235 939 | ag2 940 | (g4 941 | S'\xd9\xe6h\x0f\x8bC\xf4?' 942 | tRp236 943 | ag2 944 | (g4 945 | S'0 [(\x1a\x01\xf4?' 946 | tRp237 947 | ag2 948 | (g4 949 | S'\xae\x9e\x96\xcf\x80I\xf4?' 950 | tRp238 951 | ag2 952 | (g4 953 | S'\xb3\xdb\xb4g\xb5+\xf4?' 954 | tRp239 955 | ag2 956 | (g4 957 | S'\xdahk\xea\x1bJ\xf4?' 958 | tRp240 959 | ag2 960 | (g4 961 | S'\xf7i\xcdd\x93J\xf4?' 962 | tRp241 963 | ag2 964 | (g4 965 | S'.WXi\x9eE\xf4?' 966 | tRp242 967 | ag2 968 | (g4 969 | S'\x10\xa1=z\x80:\xf4?' 970 | tRp243 971 | ag2 972 | (g4 973 | S'\xd76\x7f1*o\xf4?' 974 | tRp244 975 | ag2 976 | (g4 977 | S'\x94\xdbd\x93\xab\x18\xf4?' 978 | tRp245 979 | ag2 980 | (g4 981 | S'\x11|r\xa2\x06-\xf4?' 982 | tRp246 983 | ag2 984 | (g4 985 | S'hYQ\x94\xc51\xf4?' 986 | tRp247 987 | ag2 988 | (g4 989 | S'\x84\xfd:KV~\xf4?' 990 | tRp248 991 | ag2 992 | (g4 993 | S'\xac\xaf\x01\x937;\xf4?' 994 | tRp249 995 | ag2 996 | (g4 997 | S'\xf7\x86\xf9\x9co\xfc\xf3?' 998 | tRp250 999 | ag2 1000 | (g4 1001 | S'q\xa5\x11H\xe4%\xf4?' 1002 | tRp251 1003 | ag2 1004 | (g4 1005 | S' \xbb\xf8\xc2\n,\xf4?' 1006 | tRp252 1007 | ag2 1008 | (g4 1009 | S'\x894\xae\xa1\xcb7\xf4?' 1010 | tRp253 1011 | ag2 1012 | (g4 1013 | S'\xc9\x8f\x919}=\xf4?' 1014 | tRp254 1015 | a. 
-------------------------------------------------------------------------------- /coco_caption/METEOR.pkl: -------------------------------------------------------------------------------- 1 | (lp1 2 | F0.18915644594642708 3 | aF0.20911916479687623 4 | aF0.21949224920874325 5 | aF0.22395937995175452 6 | aF0.22636076772912747 7 | aF0.23063332405294543 8 | aF0.23168066989877056 9 | aF0.23275986881940422 10 | aF0.23417117399601728 11 | aF0.2350631549515593 12 | aF0.23678958786720067 13 | aF0.23671099113335686 14 | aF0.23922871999875575 15 | aF0.23980766235480724 16 | aF0.24250715969661582 17 | aF0.24409128448606446 18 | aF0.24417673283127833 19 | aF0.24534410461950146 20 | aF0.24717353763019165 21 | aF0.24719646311209323 22 | aF0.24650820758422559 23 | aF0.250388748859595 24 | aF0.24883898414632688 25 | aF0.25163130016639429 26 | aF0.24963518611164856 27 | aF0.25137443983486046 28 | aF0.25280151086472102 29 | aF0.2539250468977573 30 | aF0.25330708644786121 31 | aF0.25590649687564843 32 | aF0.25485096265105195 33 | aF0.25472098886275546 34 | aF0.25725636982963301 35 | aF0.2577444676483735 36 | aF0.25687426372906574 37 | aF0.26009681088862618 38 | aF0.25996750081263781 39 | aF0.25851554345039679 40 | aF0.26152881766684505 41 | aF0.25893832149455559 42 | aF0.26152794170654592 43 | aF0.26078061911783385 44 | aF0.26246229451701153 45 | aF0.26393208384860578 46 | aF0.26333146434380406 47 | aF0.26329484085148308 48 | aF0.26402779454741354 49 | aF0.26393363716423013 50 | aF0.26562805312849591 51 | aF0.264572119645897 52 | aF0.26601627774989034 53 | aF0.26632920720759801 54 | aF0.26748625980022117 55 | aF0.26663498870215679 56 | aF0.26810417109650442 57 | aF0.26916412073025509 58 | aF0.2674172239826727 59 | aF0.26680873225560681 60 | aF0.26968550960583088 61 | aF0.27003636233769723 62 | aF0.27116095980912269 63 | aF0.27091837362020121 64 | aF0.27190187421724671 65 | aF0.26940392346200703 66 | aF0.27176744667814412 67 | aF0.27089726905540917 68 | aF0.27288518217035101 69 | aF0.27274628383276189 70 | aF0.2728966926367839 71 | aF0.27219854188079728 72 | aF0.27262760529799257 73 | aF0.27505080069905191 74 | aF0.27477331309378716 75 | aF0.27243363100838913 76 | aF0.27510896016998154 77 | aF0.27502131399790919 78 | aF0.27404601246014254 79 | aF0.27481159756735757 80 | aF0.27560350140208262 81 | aF0.27514962922677899 82 | aF0.27718844396314396 83 | aF0.27608789049323135 84 | aF0.27804083149494685 85 | aF0.27725020699438457 86 | aF0.28040352064129626 87 | aF0.2772757474518821 88 | aF0.27813172148719623 89 | aF0.27954703349661797 90 | aF0.27729982167090511 91 | aF0.27683894481501659 92 | aF0.28107214089699611 93 | aF0.27790217290818026 94 | aF0.28104024908926628 95 | aF0.28048263204639945 96 | aF0.28061577170556196 97 | aF0.28023214986579476 98 | aF0.27940242965930034 99 | aF0.27926507426135566 100 | aF0.28019376437748233 101 | aF0.28248412298998271 102 | aF0.28046655881029953 103 | aF0.28002215564340555 104 | aF0.28046864475212863 105 | aF0.28390852552172363 106 | aF0.28296736924131477 107 | aF0.28268677077151555 108 | aF0.2798149277913497 109 | aF0.28374749773137276 110 | aF0.28284055862777119 111 | aF0.28334547401709625 112 | aF0.28240138854903724 113 | aF0.28365043720208905 114 | aF0.28296323222134556 115 | aF0.28340865376711727 116 | aF0.28375944351922833 117 | aF0.28482920079101431 118 | aF0.28285094729853871 119 | aF0.28537332875124127 120 | aF0.28369409144691493 121 | aF0.28788448500736619 122 | aF0.28417719629010885 123 | aF0.28247767281542563 124 | aF0.28610696382647222 125 | aF0.28507581771364848 126 
| aF0.28774582385662278 127 | aF0.28284401083150934 128 | aF0.28526007497751038 129 | aF0.28584808770710396 130 | aF0.28676610806088709 131 | aF0.28640958412108453 132 | aF0.28593547723436086 133 | aF0.28608271026206111 134 | aF0.28661153862054706 135 | aF0.28883979698928797 136 | aF0.28681758599812696 137 | aF0.28730389532696099 138 | aF0.28662283193827115 139 | aF0.28892897219405839 140 | aF0.28759559638491083 141 | aF0.28666049624096002 142 | aF0.29011089745228869 143 | aF0.28900238335817979 144 | aF0.28860771778784133 145 | aF0.289670141090758 146 | aF0.28753610880743102 147 | aF0.28804254503597859 148 | aF0.29014480002922199 149 | aF0.29098093608245995 150 | aF0.29012750509699342 151 | aF0.28834429827938918 152 | aF0.2882065437807817 153 | aF0.28929992571507329 154 | aF0.28940006979015337 155 | aF0.29085633837764174 156 | aF0.29004617784801029 157 | aF0.28975265226997715 158 | aF0.28978342364090581 159 | aF0.28957382463597903 160 | aF0.28905715704535884 161 | aF0.29011513806956507 162 | aF0.29295667055631425 163 | aF0.29113327118771654 164 | aF0.28852864597199662 165 | aF0.29186213532220379 166 | aF0.2918781623550053 167 | aF0.29128146661360593 168 | aF0.29225980870223844 169 | aF0.29066672991543518 170 | aF0.29029890273104586 171 | aF0.29195084253980425 172 | aF0.29123553546819075 173 | aF0.29062106085118061 174 | aF0.29179804871193821 175 | aF0.28998709951209195 176 | aF0.29149185522962423 177 | aF0.29151798337369877 178 | aF0.29155177404501126 179 | aF0.29103521281804867 180 | aF0.29096331070421333 181 | aF0.29301306248005865 182 | aF0.29096500538456355 183 | aF0.28989199327667775 184 | aF0.29314262652344514 185 | aF0.29144324364037155 186 | aF0.29428756238948467 187 | aF0.2924259491287029 188 | aF0.29210513942607341 189 | aF0.29480797781844198 190 | aF0.29101766553274977 191 | aF0.29336269482609206 192 | aF0.29201278358565069 193 | aF0.29219909529639992 194 | aF0.2938912158574768 195 | aF0.2921506098701227 196 | aF0.29239299757508447 197 | aF0.29354609641852591 198 | aF0.29262711883114623 199 | aF0.29171791917485274 200 | aF0.29223746418306829 201 | aF0.29381567866483349 202 | aF0.29290370883214206 203 | aF0.29505681546520424 204 | aF0.29429962868558845 205 | aF0.29460775300456488 206 | aF0.29238086027195909 207 | aF0.29528543129374024 208 | aF0.29341402386920695 209 | aF0.29474568981359983 210 | aF0.29216358900268563 211 | aF0.2935913352931202 212 | aF0.29334145784065863 213 | aF0.29448787757565903 214 | aF0.29498898620849007 215 | aF0.29431393431643338 216 | aF0.29704984530391071 217 | aF0.2931446168207652 218 | aF0.29480617685982402 219 | aF0.29576975129065086 220 | aF0.29359336011535353 221 | aF0.29450203452857454 222 | aF0.29521354194283322 223 | aF0.29281448737344823 224 | aF0.29482487461515422 225 | aF0.29571226574885395 226 | aF0.29504088341902712 227 | aF0.29540440838582394 228 | aF0.29286989337211972 229 | aF0.29518663766845959 230 | aF0.2966833412001017 231 | aF0.295448470122119 232 | aF0.29711422648721347 233 | aF0.29711734749392577 234 | aF0.29627884382012593 235 | aF0.29639846161174005 236 | aF0.29497700766905133 237 | aF0.29605339757395865 238 | aF0.29650222850027647 239 | aF0.29612326748964929 240 | aF0.29621738509928525 241 | aF0.29881187108696727 242 | aF0.29557793809696425 243 | aF0.29560094518659574 244 | aF0.29561788163129465 245 | aF0.29949039672884398 246 | aF0.29586661778143331 247 | aF0.29518660763816906 248 | aF0.29576518080675901 249 | aF0.29505938092454825 250 | aF0.29744406928735179 251 | aF0.29636329553803575 252 | a. 
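The `Bleu_*.pkl`, `METEOR.pkl` and `CIDEr.pkl` files above are plain protocol-0 pickles, each holding a Python list with one score per evaluated checkpoint (`eval_model.py` appends to them after every evaluation and `draw.py` plots them). A minimal sketch, not part of the repository, for inspecting one of them from the `coco_caption/` directory:

```python
# Minimal sketch (not from the repository): peek at the serialized METEOR curve.
import cPickle as pickle

with open("METEOR.pkl", "r") as f:
    meteor = pickle.load(f)   # a plain list of floats, one entry per evaluated checkpoint

print "checkpoints evaluated:", len(meteor)
print "best METEOR so far: %.4f" % max(meteor)
```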
-------------------------------------------------------------------------------- /coco_caption/draw.py: -------------------------------------------------------------------------------- 1 | #! encoding: UTF-8 2 | 3 | import cPickle as pickle 4 | import matplotlib.pyplot as plt 5 | 6 | with open("Bleu_1.pkl", "r") as f: 7 | Bleu_1 = pickle.load(f) 8 | 9 | with open("Bleu_2.pkl", "r") as f: 10 | Bleu_2 = pickle.load(f) 11 | 12 | with open("Bleu_3.pkl", "r") as f: 13 | Bleu_3 = pickle.load(f) 14 | 15 | with open("Bleu_4.pkl", "r") as f: 16 | Bleu_4 = pickle.load(f) 17 | 18 | with open("METEOR.pkl", "r") as f: 19 | METEOR = pickle.load(f) 20 | 21 | with open("CIDEr.pkl", "r") as f: 22 | CIDEr = pickle.load(f) 23 | 24 | print len(Bleu_1) 25 | 26 | plt.plot(range(0, 2*len(Bleu_1), 2), Bleu_1, label="Bleu-1", color="g") 27 | plt.plot(range(0, 2*len(Bleu_2), 2), Bleu_2, label="Bleu-2", color="r") 28 | plt.plot(range(0, 2*len(Bleu_3), 2), Bleu_3, label="Bleu-3", color="b") 29 | plt.plot(range(0, 2*len(Bleu_4), 2), Bleu_4, label="Bleu-4", color="m") 30 | plt.plot(range(0, 2*len(METEOR), 2), METEOR, label="METEOR", color="k") 31 | plt.plot(range(0, 2*len(CIDEr), 2), CIDEr, label="CIDEr", color="y") 32 | 33 | plt.grid(True) 34 | #plt.legend(handles=[line_1, line_2]) 35 | #plt.legend(bbox_to_anchor=(1.05, 1), loc=2, borderaxespad=0.) 36 | #plt.legend(handles=[line1], loc=1) 37 | plt.legend(loc=2) 38 | plt.show() 39 | #plt.savefig("tmp.png") 40 | -------------------------------------------------------------------------------- /coco_caption/eval_captions_results.py: -------------------------------------------------------------------------------- 1 | #! encoding: UTF-8 2 | 3 | import os 4 | from pycocotools.coco import COCO 5 | from pycocoevalcap.eval import COCOEvalCap 6 | 7 | annFile = "../data/train_val_all_reference.json" 8 | resFile = "captions_val2014_results.json" 9 | 10 | # create coco object and cocoRes object 11 | coco = COCO(annFile) 12 | 13 | # after generating the captions_val2014_results.json file 14 | # we call the coco caption evaluation tools 15 | cocoRes = coco.loadRes(resFile) 16 | 17 | # create cocoEval object by taking coco and cocoRes 18 | cocoEval = COCOEvalCap(coco, cocoRes) 19 | 20 | # evaluate on a subset of images by setting 21 | # cocoEval.params['image_id'] = cocoRes.getImgIds() 22 | # please remove this line when evaluating the full validation set 23 | cocoEval.params['image_id'] = cocoRes.getImgIds() 24 | 25 | # evaluate results 26 | cocoEval.evaluate() 27 | 28 | # print output evaluation scores 29 | for metric, score in cocoEval.eval.items(): 30 | print '%s: %.3f'%(metric, score) -------------------------------------------------------------------------------- /coco_caption/eval_image_caption.py: -------------------------------------------------------------------------------- 1 | # encoding: UTF-8 2 | 3 | import os 4 | import sys 5 | import glob 6 | import random 7 | import time 8 | import json 9 | from json import encoder 10 | import numpy as np 11 | import cPickle as pickle 12 | import matplotlib.pyplot as plt 13 | 14 | import tensorflow as tf 15 | 16 | sys.path.append('./coco_caption/') 17 | from pycocotools.coco import COCO 18 | from pycocoevalcap.eval import COCOEvalCap 19 | 20 | import ipdb 21 | 22 | 23 | ############################################################################################################# 24 | # 25 | # Step 1: Input: D = {(x^n, y^n): n = 1:N} 26 | # Step 2:Train \Pi(g_{1:T} | x) using MLE on D, MLE: Maximum likehood eatimation 27 | # 28 | 
############################################################################################################ 29 | class CNN_LSTM(): 30 | def __init__(self, 31 | n_words, 32 | batch_size, 33 | feats_dim, 34 | project_dim, 35 | lstm_size, 36 | word_embed_dim, 37 | lstm_step, 38 | bias_init_vector=None): 39 | 40 | self.n_words = n_words 41 | self.batch_size = batch_size 42 | self.feats_dim = feats_dim 43 | self.project_dim = project_dim 44 | self.lstm_size = lstm_size 45 | self.word_embed_dim = word_embed_dim 46 | self.lstm_step = lstm_step 47 | 48 | # project the image feature vector of dimension 2048 to 512 dimension, with a linear layer 49 | # self.encode_img_W: 2048 x 512 50 | # self.encode_img_b: 512 51 | self.encode_img_W = tf.Variable(tf.random_uniform([feats_dim, project_dim], -0.1, 0.1), name="encode_img_W") 52 | self.encode_img_b = tf.zeros([project_dim], name="encode_img_b") 53 | 54 | with tf.device("/cpu:0"): 55 | self.Wemb = tf.Variable(tf.random_uniform([n_words, word_embed_dim], -0.1, 0.1), name="Wemb") 56 | 57 | self.lstm = tf.nn.rnn_cell.BasicLSTMCell(lstm_size, state_is_tuple=True) 58 | 59 | self.embed_word_W = tf.Variable(tf.random_uniform([lstm_size, n_words], -0.1, 0.1), name="embed_word_W") 60 | 61 | if bias_init_vector is not None: 62 | self.embed_word_b = tf.Variable(bias_init_vector.astype(np.float32), name="embed_word_b") 63 | else: 64 | self.embed_word_b = tf.Variable(tf.zeros([n_words]), name="embed_word_b") 65 | 66 | self.baseline_MLP_W = tf.Variable(tf.random_uniform([lstm_size, 1], -0.1, 0.1), name="baseline_MLP_W") 67 | self.baseline_MLP_b = tf.Variable(tf.zeros([1]), name="baseline_MLP_b") 68 | 69 | ############################################################################################################ 70 | # 71 | # Class function for step 2 72 | # 73 | ############################################################################################################ 74 | def build_model(self): 75 | images = tf.placeholder(tf.float32, [self.batch_size, self.feats_dim]) 76 | sentences = tf.placeholder(tf.int32, [self.batch_size, self.lstm_step]) 77 | masks = tf.placeholder(tf.float32, [self.batch_size, self.lstm_step]) 78 | 79 | images_embed = tf.matmul(images, self.encode_img_W) + self.encode_img_b 80 | 81 | state = self.lstm.zero_state(batch_size=self.batch_size, dtype=tf.float32) 82 | 83 | loss = 0.0 84 | with tf.variable_scope("LSTM"): 85 | for i in range(0, self.lstm_step): 86 | if i == 0: 87 | current_emb = images_embed 88 | else: 89 | with tf.device("/cpu:0"): 90 | current_emb = tf.nn.embedding_lookup(self.Wemb, sentences[:, i-1]) 91 | 92 | if i > 0: 93 | tf.get_variable_scope().reuse_variables() 94 | 95 | output, state = self.lstm(current_emb, state) 96 | 97 | if i > 0: 98 | labels = tf.expand_dims(sentences[:, i], 1) 99 | indices = tf.expand_dims(tf.range(0, self.batch_size, 1), 1) 100 | concated = tf.concat(1, [indices, labels]) 101 | onehot_labels = tf.sparse_to_dense( concated, tf.pack([self.batch_size, self.n_words]), 1.0, 0.0) 102 | 103 | logit_words = tf.matmul(output, self.embed_word_W) + self.embed_word_b 104 | cross_entropy = tf.nn.softmax_cross_entropy_with_logits(logit_words, onehot_labels) 105 | cross_entropy = cross_entropy * masks[:, i] 106 | current_loss = tf.reduce_sum(cross_entropy)/self.batch_size 107 | 108 | loss = loss + current_loss 109 | return loss, images, sentences, masks 110 | 111 | def generate_model(self): 112 | images = tf.placeholder(tf.float32, [1, self.feats_dim]) 113 | images_embed = tf.matmul(images, self.encode_img_W) + 
self.encode_img_b 114 | 115 | state = self.lstm.zero_state(batch_size=1, dtype=tf.float32) 116 | sentences = [] 117 | 118 | with tf.variable_scope("LSTM"): 119 | output, state = self.lstm(images_embed, state) 120 | 121 | with tf.device("/cpu:0"): 122 | current_emb = tf.nn.embedding_lookup(self.Wemb, tf.ones([1], dtype=tf.int64)) 123 | 124 | for i in range(0, self.lstm_step): 125 | tf.get_variable_scope().reuse_variables() 126 | 127 | output, state = self.lstm(current_emb, state) 128 | 129 | logit_words = tf.matmul(output, self.embed_word_W) + self.embed_word_b 130 | max_prob_word = tf.argmax(logit_words, 1)[0] 131 | 132 | with tf.device("/cpu:0"): 133 | current_emb = tf.nn.embedding_lookup(self.Wemb, max_prob_word) 134 | current_emb = tf.expand_dims(current_emb, 0) 135 | sentences.append(max_prob_word) 136 | 137 | return images, sentences 138 | 139 | 140 | ############################################################################## 141 | # 142 | # set parameters and path 143 | # 144 | ############################################################################## 145 | batch_size = 100 146 | feats_dim = 2048 147 | project_dim = 512 148 | lstm_size = 512 149 | word_embed_dim = 512 150 | lstm_step = 20 151 | 152 | idx_to_word_path = '../data/idx_to_word.pkl' 153 | word_to_idx_path = '../data/word_to_idx.pkl' 154 | bias_init_vector_path = '../data/bias_init_vector.npy' 155 | 156 | with open(idx_to_word_path, 'r') as fr_3: 157 | idx_to_word = pickle.load(fr_3) 158 | 159 | with open(word_to_idx_path, 'r') as fr_4: 160 | word_to_idx = pickle.load(fr_4) 161 | 162 | bias_init_vector = np.load(bias_init_vector_path) 163 | 164 | 165 | ########################################################################################## 166 | # 167 | # I move the generation model part out of the Val_with_MLE function 168 | # 169 | ########################################################################################## 170 | n_words = len(idx_to_word) 171 | 172 | val_feats_path = '../inception/val_feats' 173 | val_feats_names = glob.glob(val_feats_path + '/*.npy') 174 | val_images_names = map(lambda x: os.path.basename(x)[0:-4], val_feats_names) 175 | 176 | model = CNN_LSTM(n_words = n_words, 177 | batch_size = batch_size, 178 | feats_dim = feats_dim, 179 | project_dim = project_dim, 180 | lstm_size = lstm_size, 181 | word_embed_dim = word_embed_dim, 182 | lstm_step = lstm_step, 183 | bias_init_vector = None) 184 | tf_images, tf_sentences = model.generate_model() 185 | 186 | def Val_with_MLE(model_path): 187 | ''' 188 | n_words = len(idx_to_word) 189 | 190 | # version 1: test all validation images 191 | val_feats_path = '../inception/val_feats' 192 | val_feats_names = glob.glob(val_feats_path + '/*.npy') 193 | val_images_names = map(lambda x: os.path.basename(x)[0:-4], val_feats_names) 194 | 195 | model = CNN_LSTM(n_words = n_words, 196 | batch_size = batch_size, 197 | feats_dim = feats_dim, 198 | project_dim = project_dim, 199 | lstm_size = lstm_size, 200 | word_embed_dim = word_embed_dim, 201 | lstm_step = lstm_step, 202 | bias_init_vector = None) 203 | tf_images, tf_sentences = model.generate_model() 204 | ''' 205 | sess = tf.InteractiveSession() 206 | saver = tf.train.Saver() 207 | saver.restore(sess, model_path) 208 | 209 | fw_1 = open("val2014_results.txt", 'w') 210 | for idx, img_name in enumerate(val_images_names[0:5000]): 211 | print "{}, {}".format(idx, img_name) 212 | start_time = time.time() 213 | 214 | current_feats = np.load( os.path.join(val_feats_path, img_name+'.npy') ) 215 | current_feats = 
np.reshape(current_feats, [1, feats_dim]) 216 | 217 | sentences_index = sess.run(tf_sentences, feed_dict={tf_images: current_feats}) 218 | sentences = [] 219 | for idx_word in sentences_index: 220 | word = idx_to_word[idx_word] 221 | word = word.replace('\n', '') 222 | word = word.replace('\\', '') 223 | word = word.replace('"', '') 224 | sentences.append(word) 225 | 226 | punctuation = np.argmax(np.array(sentences) == '') + 1 227 | sentences = sentences[:punctuation] 228 | generated_sentence = ' '.join(sentences) 229 | generated_sentence = generated_sentence.replace(' ', '') 230 | generated_sentence = generated_sentence.replace(' ', '') 231 | 232 | print generated_sentence,'\n' 233 | fw_1.write(img_name + '\n') 234 | fw_1.write(generated_sentence + '\n') 235 | fw_1.close() 236 | 237 | -------------------------------------------------------------------------------- /coco_caption/eval_model.py: -------------------------------------------------------------------------------- 1 | #! encoding: UTF-8 2 | 3 | import os 4 | import ipdb 5 | import glob 6 | import time 7 | import subprocess 8 | import cPickle as pickle 9 | import matplotlib.pyplot as plt 10 | 11 | from pycocotools.coco import COCO 12 | from pycocoevalcap.eval import COCOEvalCap 13 | 14 | import eval_image_caption 15 | 16 | 17 | model_path = "../models" 18 | 19 | annFile = "../data/train_val_all_reference.json" 20 | resFile = "captions_val2014_results.json" 21 | 22 | # create coco object and cocoRes object 23 | coco = COCO(annFile) 24 | 25 | n_epochs = 500 26 | n_epochs += 2 27 | 28 | with open("Bleu_1.pkl", "r") as f: 29 | Bleu_1 = pickle.load(f) 30 | 31 | with open("Bleu_2.pkl", "r") as f: 32 | Bleu_2 = pickle.load(f) 33 | 34 | with open("Bleu_3.pkl", "r") as f: 35 | Bleu_3 = pickle.load(f) 36 | 37 | with open("Bleu_4.pkl", "r") as f: 38 | Bleu_4 = pickle.load(f) 39 | 40 | with open("METEOR.pkl", "r") as f: 41 | METEOR = pickle.load(f) 42 | 43 | with open("CIDEr.pkl", "r") as f: 44 | CIDEr = pickle.load(f) 45 | 46 | for idx_model in range(202, n_epochs, 2): 47 | model_name = os.path.join(model_path, "model_MLP-" + str(idx_model)) 48 | 49 | start_time = time.time() 50 | 51 | # generate the val2014_results.txt 52 | eval_image_caption.Val_with_MLE(model_name) 53 | 54 | # call the gen_val_json.py with subprocess 55 | # we will generate the captions_val2014_results.json file 56 | subprocess.call(["python", "gen_val_json.py"]) 57 | 58 | # after generating the captions_val2014_results.json file 59 | # we call the coco caption evaluation tools 60 | cocoRes = coco.loadRes(resFile) 61 | 62 | # create cocoEval object by taking coco and cocoRes 63 | cocoEval = COCOEvalCap(coco, cocoRes) 64 | 65 | # evaluate on a subset of images by setting 66 | # cocoEval.params['image_id'] = cocoRes.getImgIds() 67 | # please remove this line when evaluating the full validation set 68 | cocoEval.params['image_id'] = cocoRes.getImgIds() 69 | 70 | # evaluate results 71 | cocoEval.evaluate() 72 | 73 | # print output evaluation scores 74 | for metric, score in cocoEval.eval.items(): 75 | print '%s: %.3f'%(metric, score) 76 | if metric == "Bleu_1": 77 | Bleu_1.append(score) 78 | if metric == "Bleu_2": 79 | Bleu_2.append(score) 80 | if metric == "Bleu_3": 81 | Bleu_3.append(score) 82 | if metric == "Bleu_4": 83 | Bleu_4.append(score) 84 | if metric == "METEOR": 85 | METEOR.append(score) 86 | if metric == "CIDEr": 87 | CIDEr.append(score) 88 | # save the scores immediately 89 | with open("Bleu_1.pkl", "w") as fw1: 90 | pickle.dump(Bleu_1, fw1) 91 | with 
open("Bleu_2.pkl", "w") as fw2: 92 | pickle.dump(Bleu_2, fw2) 93 | with open("Bleu_3.pkl", "w") as fw3: 94 | pickle.dump(Bleu_3, fw3) 95 | with open("Bleu_4.pkl", "w") as fw4: 96 | pickle.dump(Bleu_4, fw4) 97 | with open("METEOR.pkl", "w") as fw5: 98 | pickle.dump(METEOR, fw5) 99 | with open("CIDEr.pkl", "w") as fw6: 100 | pickle.dump(CIDEr, fw6) 101 | 102 | print "Mdoel {} evaluation time cost: {}".format(model_name, time.time()-start_time) 103 | 104 | # draw the pictures 105 | plt.plot(range(len(Bleu_1)), Bleu_1, label="Bleu-1", color="g") 106 | plt.plot(range(len(Bleu_2)), Bleu_2, label="Bleu-2", color="r") 107 | plt.plot(range(len(Bleu_3)), Bleu_3, label="Bleu-3", color="b") 108 | plt.plot(range(len(Bleu_4)), Bleu_4, label="Bleu-4", color="m") 109 | plt.plot(range(len(METEOR)), METEOR, label="METEOR", color="k") 110 | plt.plot(range(len(CIDEr)), CIDEr, label="CIDEr", color="y") 111 | plt.grid(True) 112 | plt.legend(loc=2) 113 | plt.show() 114 | plt.savefig("evalution.png") -------------------------------------------------------------------------------- /coco_caption/gen_test_json.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | # coding=utf-8 3 | import os 4 | import json 5 | import cPickle as pickle 6 | 7 | test_results_save_path = '../test2014_results_model-486.txt' 8 | test_results = open(test_results_save_path).read().splitlines() 9 | 10 | images_captions = {} 11 | captions = [] 12 | names = [] 13 | for idx, item in enumerate(test_results): 14 | if idx % 2 == 0: 15 | names.append(item) 16 | if idx % 2 == 1: 17 | captions.append(item) 18 | 19 | for idx, name in enumerate(names): 20 | print idx, ' ', name 21 | images_captions[name] = captions[idx] 22 | 23 | with open('../data/test2014_images_ids_to_names.pkl', 'r') as fr_1: 24 | test2014_images_ids_to_names = pickle.load(fr_1) 25 | 26 | names_to_ids = {} 27 | for key, item in test2014_images_ids_to_names.iteritems(): 28 | names_to_ids[item] = key 29 | 30 | fw_1 = open('captions_test2014_results.json', 'w') 31 | fw_1.write('[') 32 | 33 | for idx, name in enumerate(names): 34 | print idx, ' ', name 35 | tmp_idx = names.index(name) 36 | caption = captions[tmp_idx] 37 | caption = caption.replace(' ,', ',') 38 | caption = caption.replace('"', '') 39 | caption = caption.replace('\n', '') 40 | if idx != len(names)-1: 41 | fw_1.write('{"image_id": ' + str(names_to_ids[name]) + ', "caption": "' + str(caption) + '"}, ') 42 | else: 43 | fw_1.write('{"image_id": ' + str(names_to_ids[name]) + ', "caption": "' + str(caption) + '"}]') 44 | 45 | fw_1.close() 46 | -------------------------------------------------------------------------------- /coco_caption/gen_val_json.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | # coding=utf-8 3 | import os 4 | import json 5 | import cPickle as pickle 6 | 7 | test_results_save_path = '../val2014_results_model_MLP-486.txt' 8 | test_results = open(test_results_save_path).read().splitlines() 9 | 10 | images_captions = {} 11 | captions = [] 12 | names = [] 13 | for idx, item in enumerate(test_results): 14 | if idx % 2 == 0: 15 | names.append(item) 16 | if idx % 2 == 1: 17 | captions.append(item) 18 | 19 | for idx, name in enumerate(names): 20 | print idx, ' ', name 21 | images_captions[name] = captions[idx] 22 | 23 | with open('../data/val2014_images_ids_to_names.pkl', 'r') as fr_1: 24 | test2014_images_ids_to_names = pickle.load(fr_1) 25 | 26 | names_to_ids = {} 27 | for key, item in 
test2014_images_ids_to_names.iteritems(): 28 | names_to_ids[item] = key 29 | 30 | fw_1 = open('captions_val2014_results.json', 'w') 31 | fw_1.write('[') 32 | 33 | for idx, name in enumerate(names): 34 | print idx, ' ', name 35 | tmp_idx = names.index(name) 36 | caption = captions[tmp_idx] 37 | caption = caption.replace(' ,', ',') 38 | caption = caption.replace('"', '') 39 | caption = caption.replace('\n', '') 40 | if idx != len(names)-1: 41 | fw_1.write('{"image_id": ' + str(names_to_ids[name]) + ', "caption": "' + str(caption) + '"}, ') 42 | else: 43 | fw_1.write('{"image_id": ' + str(names_to_ids[name]) + ', "caption": "' + str(caption) + '"}]') 44 | 45 | fw_1.close() 46 | -------------------------------------------------------------------------------- /coco_caption/model_evalution.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/chenxinpeng/Optimization_of_image_description_metrics_using_policy_gradient_methods/66089304b3dc78a1e27f90e262d0cb17c5bb4cf2/coco_caption/model_evalution.png -------------------------------------------------------------------------------- /coco_caption/read_test_info.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | # coding=utf-8 3 | 4 | import os 5 | import json 6 | import cPickle as pickle 7 | 8 | image_info_test2014_path = '../data/image_info_test2014.json' 9 | 10 | image_info_json = json.load(open(image_info_test2014_path, 'r')) 11 | 12 | images_info = image_info_json["images"] 13 | 14 | imageIds_to_imageNames = {} 15 | for image in images_info: 16 | id = int(image["id"]) 17 | name = image["file_name"] 18 | imageIds_to_imageNames[id] = name 19 | 20 | with open("./data/test2014_images_ids_to_names.pkl", 'w') as fw_1: 21 | pickle.dump(imageIds_to_imageNames, fw_1) 22 | -------------------------------------------------------------------------------- /coco_caption/read_validation_info.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | # coding=utf-8 3 | 4 | import os 5 | import json 6 | import cPickle as pickle 7 | 8 | image_info_test2014_path = '../data/captions_val2014.json' 9 | 10 | image_info_json = json.load(open(image_info_test2014_path, 'r')) 11 | 12 | images_info = image_info_json["images"] 13 | 14 | imageIds_to_imageNames = {} 15 | for image in images_info: 16 | id = int(image["id"]) 17 | name = image["file_name"] 18 | imageIds_to_imageNames[id] = name 19 | 20 | with open("val2014_images_ids_to_names.pkl", 'w') as fw_1: 21 | pickle.dump(imageIds_to_imageNames, fw_1) 22 | -------------------------------------------------------------------------------- /create_train_val_all_reference.py: -------------------------------------------------------------------------------- 1 | # encoding: UTF-8 2 | 3 | import os 4 | import glob 5 | import json 6 | import cPickle as pickle 7 | 8 | 9 | train_val_imageNames_to_imageIDs = {} 10 | train_val_Names_Captions = [] 11 | #train_imageNames_to_imageIDs = {} 12 | #val_imageNames_to_imageIDs = {} 13 | 14 | ################################################################ 15 | with open('./data/captions_train2014.json') as fr_1: 16 | train_captions = json.load(fr_1) 17 | 18 | for image in train_captions['images']: 19 | image_name = image['file_name'] 20 | image_id = image['id'] 21 | train_val_imageNames_to_imageIDs[image_name] = image_id 22 | 23 | for image in train_captions['annotations']: 24 | image_id = image['image_id'] 25 | 
image_caption = image['caption'] 26 | train_val_Names_Captions.append([image_id, image_caption]) 27 | 28 | ################################################################# 29 | with open('./data/captions_val2014.json') as fr_2: 30 | val_captions = json.load(fr_2) 31 | 32 | for image in val_captions['images']: 33 | image_name = image['file_name'] 34 | image_id = image['id'] 35 | train_val_imageNames_to_imageIDs[image_name] = image_id 36 | 37 | for image in val_captions['annotations']: 38 | image_id = image['image_id'] 39 | image_caption = image['caption'] 40 | train_val_Names_Captions.append([image_id, image_caption]) 41 | 42 | ################################################################# 43 | 44 | json_fw = open('./data/train_val_all_reference.json', 'w') 45 | json_fw.write('{"info": {"description": "Test", "url": "https://github.com/chenxinpeng", "version": "1.0", "year": 2017, "contributor": "Chen Xinpeng", "date_created": "2017"}, "images": [') 46 | 47 | count = 0 48 | for imageName, imageID in train_val_imageNames_to_imageIDs.iteritems(): 49 | if count != len(train_val_imageNames_to_imageIDs)-1: 50 | json_fw.write('{"license": 1, "file_name": "' + str(imageName) + '", "id": ' + str(imageID) + '}, ') 51 | else: 52 | json_fw.write('{"license": 1, "file_name": "' + str(imageName) + '", "id": ' + str(imageID) + '}]') 53 | count += 1 54 | 55 | json_fw.write(', "licenses": [{"url": "http://creativecommons.org/licenses/by-nc-sa/2.0/", "id": 1, "name": "Test"}], ') 56 | 57 | json_fw.write('"type": "captions", "annotations": [') 58 | 59 | flag_count = 0 60 | id_count = 0 61 | for imageName, imageID in train_val_imageNames_to_imageIDs.iteritems(): 62 | print "{}, {}, {}".format(flag_count, imageName, imageID) 63 | 64 | captions = [] 65 | for item in train_val_Names_Captions: 66 | if item[0] == imageID: 67 | captions.append(item[1]) 68 | 69 | count_captions = 0 70 | if flag_count != len(train_val_imageNames_to_imageIDs)-1: 71 | for idx, each_sent in enumerate(captions): 72 | if '\n' in each_sent: 73 | each_sent = each_sent.replace('\n', '') 74 | if '\\' in each_sent: 75 | each_sent = each_sent.replace('\\', '') 76 | if '"' in each_sent: 77 | each_sent = each_sent.replace('"', '') 78 | json_fw.write('{"image_id": ' + str(imageID) + ', "id": ' + str(id_count) + ', "caption": "' + each_sent + '"}, ') 79 | id_count += 1 80 | 81 | if flag_count == len(train_val_imageNames_to_imageIDs)-1: 82 | for idx, each_sent in enumerate(captions): 83 | if '\n' in each_sent: 84 | each_sent = each_sent.replace('\n', '') 85 | if '\\' in each_sent: 86 | each_sent = each_sent.replace('\\', '') 87 | if '"' in each_sent: 88 | each_sent = each_sent.replace('"', '') 89 | if idx != len(captions)-1: 90 | json_fw.write('{"image_id": ' + str(imageID) + ', "id": ' + str(id_count) + ', "caption": "' + each_sent + '"}, ') 91 | else: 92 | json_fw.write('{"image_id": ' + str(imageID) + ', "id": ' + str(id_count) + ', "caption": "' + each_sent + '"}]}') 93 | id_count += 1 94 | 95 | flag_count += 1 96 | 97 | json_fw.close() 98 | -------------------------------------------------------------------------------- /create_train_val_each_reference.py: -------------------------------------------------------------------------------- 1 | # encoding: UTF-8 2 | 3 | ############################################################### 4 | # 5 | # generate images captions into every json file one by one, 6 | # and get the dict that map the image IDs to image Names 7 | # 8 | ############################################################### 9 | 10 | 
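The script below writes one COCO-style reference JSON per image by concatenating strings by hand. Purely as an illustration of the target structure (a hypothetical helper, not the author's code), the same file could be produced with `json.dump`, which handles the quoting that the string version strips manually:

```python
# Hypothetical helper (illustration only): build one per-image reference file
# with the same fields the hand-written JSON below contains.
import json

def write_reference_json(image_name, image_id, captions, out_path):
    cleaned = [s.replace('\n', '').replace('\\', '').replace('"', '') for s in captions]
    reference = {
        "info": {"description": "CaptionEval", "url": "https://github.com/chenxinpeng/",
                 "version": "1.0", "year": 2017, "contributor": "Xinpeng Chen",
                 "date_created": "2017.01.26"},
        "images": [{"license": 1, "file_name": image_name, "id": image_id}],
        "licenses": [{"url": "test", "id": 1, "name": "test"}],
        "type": "captions",
        "annotations": [{"image_id": image_id, "id": idx, "caption": sent}
                        for idx, sent in enumerate(cleaned)],
    }
    with open(out_path, 'w') as fw:
        json.dump(reference, fw)
```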
import os 11 | import sys 12 | import json 13 | import cPickle as pickle 14 | 15 | train_val_imageNames_to_imageIDs = {} 16 | train_imageNames_to_imageIDs = {} 17 | val_imageNames_to_imageIDs = {} 18 | 19 | with open('./data/captions_train2014.json') as fr_1: 20 | train_captions = json.load(fr_1) 21 | 22 | for image in train_captions['images']: 23 | image_name = image['file_name'] 24 | image_id = image['id'] 25 | train_imageNames_to_imageIDs[image_name] = image_id 26 | 27 | train_Names_Captions = [] 28 | for image in train_captions['annotations']: 29 | image_id = image['image_id'] 30 | image_caption = image['caption'] 31 | train_Names_Captions.append([image_id, image_caption]) 32 | 33 | train_count = 0 34 | for imageName, imageID in train_imageNames_to_imageIDs.iteritems(): 35 | print "{}, {}, {}".format(train_count, imageName, imageID) 36 | train_count += 1 37 | 38 | captions = [] 39 | for item in train_Names_Captions: 40 | if item[0] == imageID: 41 | captions.append(item[1]) 42 | 43 | json_fw = open('./train_val_reference_json/'+imageName+'.json', 'w') 44 | json_fw.write('{"info": {"description": "CaptionEval", "url": "https://github.com/chenxinpeng/", "version": "1.0", "year": 2017, "contributor": "Xinpeng Chen", "date_created": "2017.01.26"}, "images": [{"license": 1, "file_name": "' + imageName + '", "id": ' + str(imageID) + '}]') 45 | 46 | json_fw.write(' ,"licenses": [{"url": "test", "id": 1, "name": "test"}], ') 47 | json_fw.write('"type": "captions", "annotations": [') 48 | 49 | id_count = 0 50 | for idx, each_sent in enumerate(captions): 51 | if idx != len(captions)-1: 52 | if '\n' in each_sent: 53 | each_sent = each_sent.replace('\n', '') 54 | if '\\' in each_sent: 55 | each_sent = each_sent.replace('\\', '') 56 | if '"' in each_sent: 57 | each_sent = each_sent.replace('"', '') 58 | json_fw.write('{"image_id": ' + str(imageID) + ', "id": ' + str(id_count) + ', "caption": "' + each_sent + '"}, ') 59 | else: 60 | if '\n' in each_sent: 61 | each_sent = each_sent.replace('\n', '') 62 | if '\\' in each_sent: 63 | each_sent = each_sent.replace('\\', '') 64 | if '"' in each_sent: 65 | each_sent = each_sent.replace('"', '') 66 | json_fw.write('{"image_id": ' + str(imageID) + ', "id": ' + str(id_count) + ', "caption": "' + each_sent + '"}]}') 67 | id_count += 1 68 | json_fw.close() 69 | 70 | # Validation json file 71 | with open('./data/captions_val2014.json') as fr_2: 72 | val_captions = json.load(fr_2) 73 | 74 | for image in val_captions['images']: 75 | image_name = image['file_name'] 76 | image_id = image['id'] 77 | val_imageNames_to_imageIDs[image_name] = image_id 78 | 79 | val_Names_Captions = [] 80 | for image in val_captions['annotations']: 81 | image_id = image['image_id'] 82 | image_caption = image['caption'] 83 | val_Names_Captions.append([image_id, image_caption]) 84 | 85 | val_count = 0 86 | for imageName, imageID in val_imageNames_to_imageIDs.iteritems(): 87 | print "{}, {}, {}".format(val_count, imageName, imageID) 88 | 89 | captions = [] 90 | for item in val_Names_Captions: 91 | if item[0] == imageID: 92 | captions.append(item[1]) 93 | 94 | json_fw = open('./train_val_reference_json/'+imageName+'.json', 'w') 95 | json_fw.write('{"info": {"description": "CaptionEval", "url": "https://github.com/chenxinpeng/", "version": "1.0", "year": 2017, "contributor": "Xinpeng Chen", "date_created": "2017.01.26"}, "images": [{"license": 1, "file_name": "' + imageName + '", "id": ' + str(imageID) + '}]') 96 | 97 | json_fw.write(' ,"licenses": [{"url": "test", "id": 1, "name": "test"}], 
') 98 | json_fw.write('"type": "captions", "annotations": [') 99 | 100 | id_count = 0 101 | for idx, each_sent in enumerate(captions): 102 | if idx != len(captions)-1: 103 | if '\n' in each_sent: 104 | each_sent = each_sent.replace('\n', '') 105 | if '\\' in each_sent: 106 | each_sent = each_sent.replace('\\', '') 107 | if '"' in each_sent: 108 | each_sent = each_sent.replace('"', '') 109 | json_fw.write('{"image_id": ' + str(imageID) + ', "id": ' + str(id_count) + ', "caption": "' + each_sent + '"}, ') 110 | else: 111 | if '\n' in each_sent: 112 | each_sent = each_sent.replace('\n', '') 113 | if '\\' in each_sent: 114 | each_sent = each_sent.replace('\\', '') 115 | if '"' in each_sent: 116 | each_sent = each_sent.replace('"', '') 117 | json_fw.write('{"image_id": ' + str(imageID) + ', "id": ' + str(id_count) + ', "caption": "' + each_sent + '"}]}') 118 | id_count += 1 119 | val_count += 1 120 | json_fw.close() 121 | 122 | for k, item in train_imageNames_to_imageIDs.iteritems(): 123 | train_val_imageNames_to_imageIDs[k] = item 124 | for k, item in val_imageNames_to_imageIDs.iteritems(): 125 | train_val_imageNames_to_imageIDs[k] = item 126 | 127 | with open('./data/train_val_imageNames_to_imageIDs.pkl', 'w') as fw_2: 128 | pickle.dump(train_val_imageNames_to_imageIDs, fw_2) 129 | 130 | 131 | -------------------------------------------------------------------------------- /data/bias_init_vector.npy: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/chenxinpeng/Optimization_of_image_description_metrics_using_policy_gradient_methods/66089304b3dc78a1e27f90e262d0cb17c5bb4cf2/data/bias_init_vector.npy -------------------------------------------------------------------------------- /data/test.txt: -------------------------------------------------------------------------------- 1 | test 2 | -------------------------------------------------------------------------------- /image/1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/chenxinpeng/Optimization_of_image_description_metrics_using_policy_gradient_methods/66089304b3dc78a1e27f90e262d0cb17c5bb4cf2/image/1.png -------------------------------------------------------------------------------- /image/2.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/chenxinpeng/Optimization_of_image_description_metrics_using_policy_gradient_methods/66089304b3dc78a1e27f90e262d0cb17c5bb4cf2/image/2.png -------------------------------------------------------------------------------- /image/3.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/chenxinpeng/Optimization_of_image_description_metrics_using_policy_gradient_methods/66089304b3dc78a1e27f90e262d0cb17c5bb4cf2/image/3.png -------------------------------------------------------------------------------- /image_caption.py: -------------------------------------------------------------------------------- 1 | # encoding: UTF-8 2 | 3 | import os 4 | import sys 5 | import glob 6 | import random 7 | import time 8 | import json 9 | from json import encoder 10 | import numpy as np 11 | import cPickle as pickle 12 | import matplotlib.pyplot as plt 13 | 14 | import tensorflow as tf 15 | 16 | sys.path.append('../') 17 | from pycocotools.coco import COCO 18 | from pycocoevalcap.eval import COCOEvalCap 19 | 20 | import ipdb 21 | 22 | 23 | 
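Before the model code, a note on how the `./data` pickles are consumed: the MLE stage below turns every reference caption into a fixed-length array of word indices via `word_to_idx.pkl`, leaving zeros in unused positions (the masks in `build_model` treat index 0 as padding). A minimal sketch of that conversion, assuming it is run from the repository root; the sentence-boundary tokens added during vocabulary building are left out here:

```python
# Minimal sketch (not from the repository): convert one caption to the fixed-length
# index array that Train_with_MLE() feeds to the model as `sentences`.
import numpy as np
import cPickle as pickle

with open('./data/word_to_idx.pkl', 'r') as f:
    word_to_idx = pickle.load(f)

lstm_step = 30                                   # same value the training code uses
caption = "a man riding a wave on top of a surfboard"

indices = np.zeros(lstm_step, dtype=np.int32)    # 0 = padding
for i, word in enumerate(caption.split()[:lstm_step]):
    if word in word_to_idx:
        indices[i] = word_to_idx[word]
```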
############################################################################################################# 24 | # 25 | # Step 1: Input: D = {(x^n, y^n): n = 1:N} 26 | # Step 2:Train \Pi(g_{1:T} | x) using MLE on D, MLE: Maximum likehood eatimation 27 | # 28 | ############################################################################################################ 29 | class CNN_LSTM(): 30 | def __init__(self, 31 | n_words, 32 | batch_size, 33 | feats_dim, 34 | project_dim, 35 | lstm_size, 36 | word_embed_dim, 37 | lstm_step, 38 | bias_init_vector=None): 39 | 40 | self.n_words = n_words 41 | self.batch_size = batch_size 42 | self.feats_dim = feats_dim 43 | self.project_dim = project_dim 44 | self.lstm_size = lstm_size 45 | self.word_embed_dim = word_embed_dim 46 | self.lstm_step = lstm_step 47 | 48 | # project the image feature vector of dimension 2048 to 512 dimension, with a linear layer 49 | # self.encode_img_W: 2048 x 512 50 | # self.encode_img_b: 512 51 | self.encode_img_W = tf.Variable(tf.random_uniform([feats_dim, project_dim], -0.1, 0.1), name="encode_img_W") 52 | self.encode_img_b = tf.zeros([project_dim], name="encode_img_b") 53 | 54 | with tf.device("/cpu:0"): 55 | self.Wemb = tf.Variable(tf.random_uniform([n_words, word_embed_dim], -0.1, 0.1), name="Wemb") 56 | 57 | self.lstm = tf.nn.rnn_cell.BasicLSTMCell(lstm_size, state_is_tuple=True) 58 | 59 | self.embed_word_W = tf.Variable(tf.random_uniform([lstm_size, n_words], -0.1, 0.1), name="embed_word_W") 60 | 61 | if bias_init_vector is not None: 62 | self.embed_word_b = tf.Variable(bias_init_vector.astype(np.float32), name="embed_word_b") 63 | else: 64 | self.embed_word_b = tf.Variable(tf.zeros([n_words]), name="embed_word_b") 65 | 66 | self.baseline_MLP_W = tf.Variable(tf.random_uniform([lstm_size, 1], -0.1, 0.1), name="baseline_MLP_W") 67 | self.baseline_MLP_b = tf.Variable(tf.zeros([1]), name="baseline_MLP_b") 68 | 69 | # At the beginning, I used two layers of MLP, but I think it's wrong 70 | #self.baseline_MLP2_W = tf.Variable(tf.random_uniform([lstm_size, 1], -0.1, 0.1), name="baseline_MLP2_W") 71 | #self.baseline_MLP2_b = tf.Variable(tf.zeros([1]), name="baseline_MLP2_b") 72 | 73 | ############################################################################################################ 74 | # 75 | # Class function for step 2 76 | # 77 | ############################################################################################################ 78 | def build_model(self): 79 | images = tf.placeholder(tf.float32, [self.batch_size, self.feats_dim]) 80 | sentences = tf.placeholder(tf.int32, [self.batch_size, self.lstm_step]) 81 | masks = tf.placeholder(tf.float32, [self.batch_size, self.lstm_step]) 82 | 83 | images_embed = tf.matmul(images, self.encode_img_W) + self.encode_img_b 84 | 85 | state = self.lstm.zero_state(batch_size=self.batch_size, dtype=tf.float32) 86 | 87 | loss = 0.0 88 | with tf.variable_scope("LSTM"): 89 | for i in range(0, self.lstm_step): 90 | if i == 0: 91 | current_emb = images_embed 92 | else: 93 | with tf.device("/cpu:0"): 94 | current_emb = tf.nn.embedding_lookup(self.Wemb, sentences[:, i-1]) 95 | 96 | if i > 0: 97 | tf.get_variable_scope().reuse_variables() 98 | 99 | output, state = self.lstm(current_emb, state) 100 | 101 | if i > 0: 102 | labels = tf.expand_dims(sentences[:, i], 1) 103 | indices = tf.expand_dims(tf.range(0, self.batch_size, 1), 1) 104 | concated = tf.concat(1, [indices, labels]) 105 | onehot_labels = tf.sparse_to_dense( concated, tf.pack([self.batch_size, self.n_words]), 1.0, 
0.0) 106 | 107 | logit_words = tf.matmul(output, self.embed_word_W) + self.embed_word_b 108 | cross_entropy = tf.nn.softmax_cross_entropy_with_logits(logit_words, onehot_labels) 109 | cross_entropy = cross_entropy * masks[:, i] 110 | current_loss = tf.reduce_sum(cross_entropy)/self.batch_size 111 | 112 | loss = loss + current_loss 113 | return loss, images, sentences, masks 114 | 115 | def generate_model(self): 116 | images = tf.placeholder(tf.float32, [1, self.feats_dim]) 117 | images_embed = tf.matmul(images, self.encode_img_W) + self.encode_img_b 118 | 119 | state = self.lstm.zero_state(batch_size=1, dtype=tf.float32) 120 | sentences = [] 121 | 122 | with tf.variable_scope("LSTM"): 123 | output, state = self.lstm(images_embed, state) 124 | 125 | with tf.device("/cpu:0"): 126 | current_emb = tf.nn.embedding_lookup(self.Wemb, tf.ones([1], dtype=tf.int64)) 127 | 128 | for i in range(0, self.lstm_step): 129 | tf.get_variable_scope().reuse_variables() 130 | 131 | output, state = self.lstm(current_emb, state) 132 | 133 | logit_words = tf.matmul(output, self.embed_word_W) + self.embed_word_b 134 | max_prob_word = tf.argmax(logit_words, 1)[0] 135 | 136 | with tf.device("/cpu:0"): 137 | current_emb = tf.nn.embedding_lookup(self.Wemb, max_prob_word) 138 | current_emb = tf.expand_dims(current_emb, 0) 139 | sentences.append(max_prob_word) 140 | 141 | return images, sentences 142 | 143 | #################################################################################### 144 | # 145 | # Class function for step 3 146 | # 147 | #################################################################################### 148 | def train_Bphi_model(self): 149 | encode_img_W = tf.stop_gradient(self.encode_img_W) 150 | encode_img_b = tf.stop_gradient(self.encode_img_b) 151 | Wemb = tf.stop_gradient(self.Wemb) 152 | 153 | images = tf.placeholder(tf.float32, [1, self.feats_dim]) 154 | images_embed = tf.matmul(images, encode_img_W) + encode_img_b 155 | 156 | Q_Bleu_1 = tf.placeholder(tf.float32, [1, self.lstm_step]) 157 | Q_Bleu_2 = tf.placeholder(tf.float32, [1, self.lstm_step]) 158 | Q_Bleu_3 = tf.placeholder(tf.float32, [1, self.lstm_step]) 159 | Q_Bleu_4 = tf.placeholder(tf.float32, [1, self.lstm_step]) 160 | 161 | weight_Bleu_1 = 0.5 162 | weight_Bleu_2 = 0.5 163 | weight_Bleu_3 = 1.0 164 | weight_Bleu_4 = 1.0 165 | 166 | state = self.lstm.zero_state(batch_size=1, dtype=tf.float32) 167 | 168 | # To avoid creating a feedback loop, we do not back-propagate 169 | # gradients through the hidden state from this loss 170 | c, h = state[0], state[1] 171 | c, h = tf.stop_gradient(c), tf.stop_gradient(h) 172 | state = tf.nn.rnn_cell.LSTMStateTuple(c, h) 173 | 174 | loss = 0.0 175 | 176 | with tf.variable_scope("LSTM"): 177 | with tf.device("/cpu:0"): 178 | current_embed = tf.nn.embedding_lookup(Wemb, tf.ones([1], dtype=tf.int64)) 179 | 180 | output, state = self.lstm(images_embed, state) 181 | c, h = state[0], state[1] 182 | c, h = tf.stop_gradient(c), tf.stop_gradient(h) 183 | state = tf.nn.rnn_cell.LSTMStateTuple(c, h) 184 | 185 | for i in range(0, self.lstm_step): 186 | tf.get_variable_scope().reuse_variables() 187 | 188 | output, state = self.lstm(current_embed, state) 189 | c, h = state[0], state[1] 190 | c, h = tf.stop_gradient(c), tf.stop_gradient(h) 191 | state = tf.nn.rnn_cell.LSTMStateTuple(c, h) 192 | 193 | # In our experiments, the baseline estimator is an MLP which takes as input the hidden state of the RNN at step t 194 | # To avoid creating a feedback loop, we do not back-propagate gradients through the 
hidden state from this loss 195 | #if i >= 1: 196 | baseline_estimator = tf.nn.relu(tf.matmul(state[1], self.baseline_MLP_W) + self.baseline_MLP_b) 197 | Q_current = weight_Bleu_1 * Q_Bleu_1[:, i] + weight_Bleu_2 * Q_Bleu_2[:, i] + \ 198 | weight_Bleu_3 * Q_Bleu_3[:, i] + weight_Bleu_4 * Q_Bleu_4[:, i] 199 | 200 | # Equation (8) in the paper 201 | loss = loss + tf.square(Q_current - baseline_estimator) 202 | 203 | return images, Q_Bleu_1, Q_Bleu_2, Q_Bleu_3, Q_Bleu_4, loss 204 | 205 | def Monte_Carlo_Rollout(self): 206 | images = tf.placeholder(tf.float32, [1, self.feats_dim]) 207 | images_embed = tf.matmul(images, self.encode_img_W) + self.encode_img_b 208 | 209 | state = self.lstm.zero_state(batch_size=1, dtype=tf.float32) 210 | 211 | gen_sentences = [] 212 | all_sample_sentences = [] 213 | 214 | with tf.variable_scope("LSTM"): 215 | output, state = self.lstm(images_embed, state) 216 | with tf.device("/cpu:0"): 217 | current_emb = tf.nn.embedding_lookup(self.Wemb, tf.ones([1], dtype=tf.int64)) 218 | 219 | for i in range(0, self.lstm_step): 220 | tf.get_variable_scope().reuse_variables() 221 | 222 | output, state = self.lstm(current_emb, state) 223 | logit_words = tf.matmul(output, self.embed_word_W) + self.embed_word_b 224 | max_prob_word = tf.argmax(logit_words, 1)[0] 225 | 226 | with tf.device("/cpu:0"): 227 | current_emb = tf.nn.embedding_lookup(self.Wemb, max_prob_word) 228 | current_emb = tf.expand_dims(current_emb, 0) 229 | gen_sentences.append(max_prob_word) 230 | 231 | if i < self.lstm_step-1: 232 | num_sample = self.lstm_step - 1 - i 233 | sample_sentences = [] 234 | for idx_sample in range(num_sample): 235 | sample = tf.multinomial(logit_words, 3) 236 | sample_sentences.append(sample[0]) 237 | all_sample_sentences.append(sample_sentences) 238 | 239 | return images, gen_sentences, all_sample_sentences 240 | 241 | ######################################################################## 242 | # 243 | # Class function for step 4 244 | # 245 | ######################################################################## 246 | def Monte_Carlo_and_Baseline(self): 247 | images = tf.placeholder(tf.float32, [self.batch_size, self.feats_dim]) 248 | images_embed = tf.matmul(images, self.encode_img_W) + self.encode_img_b 249 | 250 | state = self.lstm.zero_state(batch_size=self.batch_size, dtype=tf.float32) 251 | 252 | gen_sentences = [] 253 | all_sample_sentences = [] 254 | all_baselines = [] 255 | 256 | with tf.variable_scope("LSTM"): 257 | output, state = self.lstm(images_embed, state) 258 | with tf.device("/cpu:0"): 259 | current_emb = tf.nn.embedding_lookup(self.Wemb, tf.ones([self.batch_size], dtype=tf.int64)) 260 | 261 | for i in range(0, self.lstm_step): 262 | tf.get_variable_scope().reuse_variables() 263 | 264 | output, state = self.lstm(current_emb, state) 265 | logit_words = tf.matmul(output, self.embed_word_W) + self.embed_word_b 266 | max_prob_word = tf.argmax(logit_words, 1) 267 | with tf.device("/cpu:0"): 268 | current_emb = tf.nn.embedding_lookup(self.Wemb, max_prob_word) 269 | #current_emb = tf.expand_dims(current_emb, 0) 270 | gen_sentences.append(max_prob_word) 271 | 272 | # compute Q for gt with K Monte Carlo rollouts 273 | if i < self.lstm_step-1: 274 | num_sample = self.lstm_step - 1 - i 275 | sample_sentences = [] 276 | for idx_sample in range(num_sample): 277 | sample = tf.multinomial(logit_words, 3) 278 | sample_sentences.append(sample) 279 | all_sample_sentences.append(sample_sentences) 280 | # compute eatimated baseline 281 | baseline = tf.nn.relu(tf.matmul(state[1], 
self.baseline_MLP_W) + self.baseline_MLP_b) 282 | all_baselines.append(baseline) 283 | 284 | return images, gen_sentences, all_sample_sentences, all_baselines 285 | 286 | def SGD_update(self, batch_num_images=1000): 287 | images = tf.placeholder(tf.float32, [batch_num_images, self.feats_dim]) 288 | images_embed = tf.matmul(images, self.encode_img_W) + self.encode_img_b 289 | 290 | Q_rewards = tf.placeholder(tf.float32, [batch_num_images, self.lstm_step]) 291 | Baselines = tf.placeholder(tf.float32, [batch_num_images, self.lstm_step]) 292 | 293 | state = self.lstm.zero_state(batch_size=batch_num_images, dtype=tf.float32) 294 | 295 | loss = 0.0 296 | 297 | with tf.variable_scope("LSTM"): 298 | tf.get_variable_scope().reuse_variables() 299 | output, state = self.lstm(images_embed, state) 300 | 301 | with tf.device("/cpu:0"): 302 | current_emb = tf.nn.embedding_lookup(self.Wemb, tf.ones([batch_num_images], dtype=tf.int64)) 303 | 304 | for i in range(0, self.lstm_step): 305 | output, state = self.lstm(current_emb, state) 306 | 307 | logit_words = tf.matmul(output, self.embed_word_W) + self.embed_word_b 308 | logit_words_softmax = tf.nn.softmax(logit_words) 309 | max_prob_word = tf.argmax(logit_words_softmax, 1) 310 | max_prob = tf.reduce_max(logit_words_softmax, 1) 311 | 312 | current_rewards = Q_rewards[:, i] - Baselines[:, i] 313 | 314 | loss = loss + tf.reduce_sum(-tf.log(max_prob) * current_rewards) 315 | 316 | with tf.device("/cpu:0"): 317 | current_emb = tf.nn.embedding_lookup(self.Wemb, max_prob_word) 318 | #current_emb = tf.expand_dims(current_emb, 0) 319 | 320 | return images, Q_rewards, Baselines, loss, max_prob, current_rewards, logit_words 321 | 322 | 323 | ############################################################################## 324 | # 325 | # Step 1: set parameters and path 326 | # 327 | ############################################################################## 328 | batch_size = 100 329 | feats_dim = 2048 330 | project_dim = 512 331 | lstm_size = 512 332 | word_embed_dim = 512 333 | lstm_step = 30 334 | 335 | n_epochs = 500 336 | learning_rate = 0.0001 337 | 338 | # Features directory of training and validation images, and the other path 339 | train_val_feats_path = './inception/train_val_feats' 340 | val_feats_path = './inception/val_feats' 341 | 342 | loss_images_save_path = './loss_imgs' 343 | loss_file_save_path = 'loss.txt' 344 | model_path = './models' 345 | 346 | train_images_captions_path = './data/train_images_captions.pkl' 347 | val_images_captions_path = './data/val_images_captions.pkl' 348 | 349 | idx_to_word_path = './data/idx_to_word.pkl' 350 | word_to_idx_path = './data/word_to_idx.pkl' 351 | bias_init_vector_path = './data/bias_init_vector.npy' 352 | 353 | # Load pre-processed data 354 | with open(train_images_captions_path, 'r') as fr_1: 355 | train_images_captions = pickle.load(fr_1) 356 | 357 | with open(val_images_captions_path, 'r') as fr_2: 358 | val_images_captions = pickle.load(fr_2) 359 | 360 | with open(idx_to_word_path, 'r') as fr_3: 361 | idx_to_word = pickle.load(fr_3) 362 | 363 | with open(word_to_idx_path, 'r') as fr_4: 364 | word_to_idx = pickle.load(fr_4) 365 | 366 | bias_init_vector = np.load(bias_init_vector_path) 367 | 368 | 369 | ########################################################################## 370 | # 371 | # Step 2: Train, validation and test stage using MLE on Dataset 372 | # 373 | ########################################################################## 374 | def Train_with_MLE(): 375 | n_words = len(idx_to_word) 376 | 
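    # Outline of the rest of this function:
    # 1. every caption is converted to a fixed-length index array via word_to_idx and
    #    cached to ./data/train_images_captions_index.pkl; zeros mark padding, and the
    #    per-word masks are built from the count of non-zero entries,
    # 2. each epoch shuffles the image list, loads the 2048-d Inception features from
    #    train_val_feats and picks one reference caption per image at random,
    # 3. the masked cross-entropy loss from build_model() is minimized with Adam, the
    #    loss curve is saved as a PNG every epoch, and a model_MLP checkpoint is
    #    written every 2 epochs.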
train_images_names = train_images_captions.keys() 377 | 378 | # change the word of each image captions to index by word_to_idx 379 | train_images_captions_index = {} 380 | for each_img, sents in train_images_captions.iteritems(): 381 | sents_index = np.zeros([len(sents), lstm_step], dtype=np.int32) 382 | 383 | for idy, sent in enumerate(sents): 384 | sent = ' ' + sent + ' ' 385 | tmp_sent = sent.split(' ') 386 | tmp_sent = filter(None, tmp_sent) 387 | 388 | for idx, word in enumerate(tmp_sent): 389 | if idx == lstm_step-1: 390 | sents_index[idy, idx] = word_to_idx[''] 391 | break 392 | elif word in word_to_idx: 393 | sents_index[idy, idx] = word_to_idx[word] 394 | train_images_captions_index[each_img] = sents_index 395 | with open('./data/train_images_captions_index.pkl', 'w') as fw_1: 396 | pickle.dump(train_images_captions_index, fw_1) 397 | 398 | model = CNN_LSTM(n_words = n_words, 399 | batch_size = batch_size, 400 | feats_dim = feats_dim, 401 | project_dim = project_dim, 402 | lstm_size = lstm_size, 403 | word_embed_dim = word_embed_dim, 404 | lstm_step = lstm_step, 405 | bias_init_vector = bias_init_vector) 406 | 407 | tf_loss, tf_images, tf_sentences, tf_masks = model.build_model() 408 | 409 | sess = tf.InteractiveSession() 410 | saver = tf.train.Saver(max_to_keep=500, write_version=1) 411 | train_op = tf.train.AdamOptimizer(learning_rate).minimize(tf_loss) 412 | tf.initialize_all_variables().run() 413 | 414 | # when you want to train the model from the front model 415 | #new_saver = tf.train.Saver(max_to_keep=500) 416 | #new_saver = tf.train.import_meta_graph('./models/model-78.meta') 417 | #new_saver.restore(sess, tf.train.latest_checkpoint('./models/')) 418 | 419 | loss_fw = open(loss_file_save_path, 'w') 420 | loss_to_draw = [] 421 | for epoch in range(0, n_epochs): 422 | loss_to_draw_epoch = [] 423 | # disorder the training images 424 | random.shuffle(train_images_names) 425 | 426 | for start, end in zip(range(0, len(train_images_names), batch_size), 427 | range(batch_size, len(train_images_names), batch_size)): 428 | start_time = time.time() 429 | 430 | # current_feats: get the [start:end] features 431 | # current_captions: convert the word to the idx by the word_to_idx 432 | # current_masks: set the to zero, the other place to non-zero 433 | current_feats = [] 434 | current_captions = [] 435 | 436 | img_names = train_images_names[start:end] 437 | for each_img_name in img_names: 438 | # load this image's feats from the train_val_feats directory 439 | #each_img_name = each_img_name + '.npy' 440 | img_feat = np.load( os.path.join(train_val_feats_path, each_img_name+'.npy') ) 441 | current_feats.append(img_feat) 442 | 443 | img_caption_length = len(train_images_captions[each_img_name]) 444 | random_choice_index = random.randint(0, img_caption_length-1) 445 | img_caption = train_images_captions_index[each_img_name][random_choice_index] 446 | current_captions.append(img_caption) 447 | 448 | current_feats = np.asarray(current_feats) 449 | current_captions = np.asarray(current_captions) 450 | 451 | current_masks = np.zeros( (current_captions.shape[0], current_captions.shape[1]), dtype=np.int32 ) 452 | nonzeros = np.array( map(lambda x: (x != 0).sum(), current_captions) ) 453 | 454 | for ind, row in enumerate(current_masks): 455 | row[:nonzeros[ind]] = 1 456 | 457 | _, loss_val = sess.run( 458 | [train_op, tf_loss], 459 | feed_dict = { 460 | tf_images: current_feats, 461 | tf_sentences: current_captions, 462 | tf_masks: current_masks 463 | }) 464 | loss_to_draw_epoch.append(loss_val) 
465 | 466 | print "idx: {} epoch: {} loss: {} Time cost: {}".format(start, epoch, loss_val, time.time()-start_time) 467 | loss_fw.write('epoch ' + str(epoch) + ' loss ' + str(loss_val) + '\n') 468 | 469 | # draw loss curve every epoch 470 | loss_to_draw.append(np.mean(loss_to_draw_epoch)) 471 | plt_save_img_name = str(epoch) + '.png' 472 | plt.plot(range(len(loss_to_draw)), loss_to_draw, color='g') 473 | plt.grid(True) 474 | plt.savefig(os.path.join(loss_images_save_path, plt_save_img_name)) 475 | 476 | if np.mod(epoch, 2) == 0: 477 | print "Epoch ", epoch, " is done. Saving the model ..." 478 | saver.save(sess, os.path.join(model_path, 'model_MLP'), global_step=epoch) 479 | loss_fw.close() 480 | 481 | 482 | def Test_with_MLE(): 483 | model_path = os.path.join('./models', 'model_MLP-486') 484 | n_words = len(idx_to_word) 485 | 486 | test_feats_path = './inception/test_feats' 487 | test_feats_names = glob.glob(test_feats_path + '/*.npy') 488 | test_images_names = map(lambda x: os.path.basename(x)[0:-4], test_feats_names) 489 | 490 | model = CNN_LSTM(n_words = n_words, 491 | batch_size = batch_size, 492 | feats_dim = feats_dim, 493 | project_dim = project_dim, 494 | lstm_size = lstm_size, 495 | word_embed_dim = word_embed_dim, 496 | lstm_step = lstm_step, 497 | bias_init_vector = None) 498 | 499 | tf_images, tf_sentences = model.generate_model() 500 | sess = tf.InteractiveSession() 501 | saver = tf.train.Saver() 502 | saver.restore(sess, model_path) 503 | 504 | fw_1 = open("test2014_results_model-486.txt", 'w') 505 | for idx, img_name in enumerate(test_images_names): 506 | t0 = time.time() 507 | 508 | current_feats = np.load( os.path.join(test_feats_path, img_name+'.npy') ) 509 | current_feats = np.reshape(current_feats, [1, feats_dim]) 510 | 511 | sentences_index = sess.run(tf_sentences, feed_dict={tf_images: current_feats}) 512 | 513 | #sentences = map(lambda x: idx_to_word[x], sentences_index) 514 | sentences = [] 515 | for idx_word in sentences_index: 516 | word = idx_to_word[idx_word] 517 | word = word.replace('\n', '') 518 | word = word.replace('\\', '') 519 | word = word.replace('"', '') 520 | sentences.append(word) 521 | 522 | punctuation = np.argmax(np.array(sentences) == '') + 1 523 | sentences = sentences[:punctuation] 524 | generated_sentence = ' '.join(sentences) 525 | generated_sentence = generated_sentence.replace(' ', '') 526 | generated_sentence = generated_sentence.replace(' ', '') 527 | 528 | print generated_sentence,'\n' 529 | fw_1.write(img_name + '\n') 530 | fw_1.write(generated_sentence + '\n') 531 | 532 | print "{}, {}, Time cost: {}".format(idx, img_name, time.time()-t0) 533 | 534 | fw_1.close() 535 | 536 | 537 | def Val_with_MLE(): 538 | model_path = os.path.join('./models', 'model_MLP-486') 539 | n_words = len(idx_to_word) 540 | 541 | # version 1: test all validation images 542 | val_feats_path = './inception/val_feats' 543 | val_feats_names = glob.glob(val_feats_path + '/*.npy') 544 | val_images_names = map(lambda x: os.path.basename(x)[0:-4], val_feats_names) 545 | 546 | # version 2: test only in the 1665 validation images 547 | #val_feats_path = './inception/val_feats_v2' 548 | #with open('./data/val_images_captions.pkl', 'r') as fr_1: 549 | # val_images_names = pickle.load(fr_1).keys() 550 | 551 | model = CNN_LSTM(n_words = n_words, 552 | batch_size = batch_size, 553 | feats_dim = feats_dim, 554 | project_dim = project_dim, 555 | lstm_size = lstm_size, 556 | word_embed_dim = word_embed_dim, 557 | lstm_step = lstm_step, 558 | bias_init_vector = None) 559 | 
tf_images, tf_sentences = model.generate_model() 560 | sess = tf.InteractiveSession() 561 | saver = tf.train.Saver() 562 | saver.restore(sess, model_path) 563 | 564 | fw_1 = open("val2014_results_model_MLP-486.txt", 'w') 565 | for idx, img_name in enumerate(val_images_names): 566 | print "{}, {}".format(idx, img_name) 567 | start_time = time.time() 568 | 569 | current_feats = np.load( os.path.join(val_feats_path, img_name+'.npy') ) 570 | current_feats = np.reshape(current_feats, [1, feats_dim]) 571 | 572 | sentences_index = sess.run(tf_sentences, feed_dict={tf_images: current_feats}) 573 | #sentences = map(lambda x: idx_to_word[x], sentences_index) 574 | sentences = [] 575 | for idx_word in sentences_index: 576 | word = idx_to_word[idx_word] 577 | word = word.replace('\n', '') 578 | word = word.replace('\\', '') 579 | word = word.replace('"', '') 580 | sentences.append(word) 581 | 582 | punctuation = np.argmax(np.array(sentences) == '') + 1 583 | sentences = sentences[:punctuation] 584 | generated_sentence = ' '.join(sentences) 585 | generated_sentence = generated_sentence.replace(' ', '') 586 | generated_sentence = generated_sentence.replace(' ', '') 587 | 588 | print generated_sentence,'\n' 589 | fw_1.write(img_name + '\n') 590 | fw_1.write(generated_sentence + '\n') 591 | fw_1.close() 592 | 593 | 594 | ########################################################################################################## 595 | # 596 | # Step 3: Train B_phi using MC estimates of Q_\theta on a small subset of Dataset D 597 | # 598 | ########################################################################################################## 599 | #import create_json_reference 600 | 601 | #epochs_Bphi_with_MC = 1000 602 | 603 | # I select 1665 images in the val set which saved in ./data: "val_images_captions.pkl", 604 | # to train the B_phi, here is the reference json file path 605 | #refer_1665_save_path = './data/reference_1665.json' 606 | 607 | #eval_ids_to_imgNames_save_path = './data/eval_ids_to_imgNames.pkl' 608 | 609 | def Sample_Q_with_MC(): 610 | model_path = os.path.join('./models', 'model_MLP-200') 611 | 612 | n_words = len(idx_to_word) 613 | 614 | val_images_names = val_images_captions.keys() 615 | 616 | print "Begin compute Q rewards of {} images...".format(len(val_images_names)) 617 | 618 | # create_json_reference.py 619 | # create_refer(train_images_captions_path, train_images_names, refer_1665_save_path) 620 | #create_json_reference.create_refer(val_images_captions_path, val_images_names, refer_1665_save_path) 621 | 622 | #with open(eval_ids_to_imgNames_save_path, 'r') as fr_1: 623 | # eval_ids_to_imgNames = pickle.load(fr_1) 624 | #eval_imgNames_to_ids = {} 625 | #for key, val in eval_ids_to_imgNames.iteritems(): 626 | # eval_imgNames_to_ids[val] = key 627 | 628 | #with open('./data/train_images_captions_index.pkl', 'r') as fr_2: 629 | # train_images_captions_index = pickle.load(fr_2) 630 | 631 | # open the dict that map the image names to image ids 632 | with open('./data/train_val_imageNames_to_imageIDs.pkl', 'r') as fr: 633 | train_val_imageNames_to_imageIDs = pickle.load(fr) 634 | 635 | model = CNN_LSTM(n_words = n_words, 636 | batch_size = 1, 637 | feats_dim = feats_dim, 638 | project_dim = project_dim, 639 | lstm_size = lstm_size, 640 | word_embed_dim = word_embed_dim, 641 | lstm_step = lstm_step, 642 | bias_init_vector = bias_init_vector) 643 | 644 | tf_images, tf_gen_sentences, tf_all_sentences = model.Monte_Carlo_Rollout() 645 | sess = tf.Session() 646 | saver = tf.train.Saver() 
647 | saver.restore(sess, model_path) 648 | 649 | all_images_Q_rewards = {} 650 | for idx, img_name in enumerate(val_images_names): 651 | print("current image idx: {}, {}".format(idx, img_name)) 652 | start_time = time.time() 653 | 654 | # Load reference json file 655 | annFile = './train_val_reference_json/' + img_name + '.json' 656 | coco = COCO(annFile) 657 | 658 | all_images_Q_rewards[img_name] = {} 659 | current_image_rewards = all_images_Q_rewards[img_name] 660 | current_image_rewards['Bleu_4'] = [] 661 | current_image_rewards['Bleu_3'] = [] 662 | current_image_rewards['Bleu_2'] = [] 663 | current_image_rewards['Bleu_1'] = [] 664 | 665 | current_feats = np.load(os.path.join(val_feats_path, img_name+'.npy')) 666 | current_feats = np.reshape(current_feats, [1, feats_dim]) 667 | 668 | gen_sents_index, all_sample_sents = sess.run([tf_gen_sentences, tf_all_sentences], feed_dict={tf_images: current_feats}) 669 | gen_sents = [] 670 | for item in gen_sents_index: 671 | tmp_word = idx_to_word[item] 672 | tmp_word = tmp_word.replace('\\', '') 673 | tmp_word = tmp_word.replace('\n', '') 674 | tmp_word = tmp_word.replace('"', '') 675 | gen_sents.append(tmp_word) 676 | gen_sents_list = gen_sents 677 | punctuation = np.argmax(np.array(gen_sents) == '') + 1 678 | gen_sents = gen_sents[:punctuation] 679 | gen_sents = ' '.join(gen_sents) 680 | gen_sents = gen_sents.replace(' ', '') 681 | gen_sents = gen_sents.replace(' ,', ',') 682 | print "\ngenerated sentences: {}".format(gen_sents) 683 | 684 | for i_s, samples in enumerate(all_sample_sents): 685 | print "\n==========================================================================" 686 | print "{} / {}".format(i_s, len(all_sample_sents)) 687 | 688 | samples = np.asarray(samples) 689 | sample_sent_1 = []; sample_sent_2 = []; sample_sent_3 = [] 690 | 691 | for each_gen_sents_word in gen_sents_list[0: (i_s+1)]: 692 | sample_sent_1.append(each_gen_sents_word) 693 | sample_sent_2.append(each_gen_sents_word) 694 | sample_sent_3.append(each_gen_sents_word) 695 | 696 | for j_s in range(samples.shape[0]): 697 | word_1, word_2, word_3 = idx_to_word[samples[j_s, 0]], idx_to_word[samples[j_s, 1]], idx_to_word[samples[j_s, 2]] 698 | word_1, word_2, word_3 = word_1.replace('\n', ''), word_2.replace('\n', ''), word_3.replace('\n', '') 699 | word_1, word_2, word_3 = word_1.replace('"', ''), word_2.replace('"', ''), word_3.replace('"', '') 700 | word_1, word_2, word_3 = word_1.replace('\\', ''), word_2.replace('\\', ''), word_3.replace('\\', '') 701 | sample_sent_1.append(word_1) 702 | sample_sent_2.append(word_2) 703 | sample_sent_3.append(word_3) 704 | 705 | sample_sent_1.append('') 706 | sample_sent_2.append('') 707 | sample_sent_3.append('') 708 | 709 | three_sample_sents = [sample_sent_1, sample_sent_2, sample_sent_3] 710 | 711 | three_sample_rewards = {} 712 | three_sample_rewards['Bleu_1'] = 0.0 713 | three_sample_rewards['Bleu_2'] = 0.0 714 | three_sample_rewards['Bleu_3'] = 0.0 715 | three_sample_rewards['Bleu_4'] = 0.0 716 | 717 | for ii, each_sample_sent in enumerate(three_sample_sents): 718 | if ' ' in each_sample_sent: 719 | each_sample_sent.remove(' ') # remove the space element in a list! 
720 | 721 | print "sample sentence {}, {}".format(ii, each_sample_sent) 722 | 723 | punctuation = np.argmax(np.array(each_sample_sent) == '') + 1 724 | each_sample_sent = each_sample_sent[:punctuation] 725 | each_sample_sent = ' '.join(each_sample_sent) 726 | each_sample_sent = each_sample_sent.replace(' ', '') 727 | each_sample_sent = each_sample_sent.replace(' ,', ',') 728 | print each_sample_sent 729 | fw_1 = open("./data/results_MC.json", 'w') 730 | fw_1.write('[{"image_id": ' + str(train_val_imageNames_to_imageIDs[img_name]) + ', "caption": "' + each_sample_sent + '"}]') 731 | fw_1.close() 732 | 733 | #annFile = './data/reference_1665.json' 734 | resFile = './data/results_MC.json' 735 | #coco = COCO(annFile) 736 | cocoRes = coco.loadRes(resFile) 737 | cocoEval = COCOEvalCap(coco, cocoRes) 738 | cocoEval.params['image_id'] = cocoRes.getImgIds() 739 | cocoEval.evaluate() 740 | 741 | for metric, score in cocoEval.eval.items(): 742 | print '%s: %.3f'%(metric, score) 743 | if metric == 'Bleu_1': 744 | three_sample_rewards['Bleu_1'] += score 745 | if metric == 'Bleu_2': 746 | three_sample_rewards['Bleu_2'] += score 747 | if metric == 'Bleu_3': 748 | three_sample_rewards['Bleu_3'] += score 749 | if metric == 'Bleu_4': 750 | three_sample_rewards['Bleu_4'] += score 751 | 752 | current_image_rewards['Bleu_1'].append(three_sample_rewards['Bleu_1']/3.0) 753 | current_image_rewards['Bleu_2'].append(three_sample_rewards['Bleu_2']/3.0) 754 | current_image_rewards['Bleu_3'].append(three_sample_rewards['Bleu_3']/3.0) 755 | current_image_rewards['Bleu_4'].append(three_sample_rewards['Bleu_4']/3.0) 756 | 757 | # If be in a terminal state, we define Q(g_{1:T}, EOS) = R(g_{1:T}) 758 | fw_1 = open("./data/results_MC.json", 'w') 759 | fw_1.write('[{"image_id": ' + str(train_val_imageNames_to_imageIDs[img_name]) + ', "caption": "' + gen_sents + '"}]') 760 | fw_1.close() 761 | #annFile = './data/reference_1665.json' 762 | resFile = './data/results_MC.json' 763 | #coco = COCO(annFile) 764 | cocoRes = coco.loadRes(resFile) 765 | cocoEval = COCOEvalCap(coco, cocoRes) 766 | cocoEval.params['image_id'] = cocoRes.getImgIds() 767 | cocoEval.evaluate() 768 | for metric, score in cocoEval.eval.items(): 769 | print '%s: %.3f'%(metric, score) 770 | if metric == 'Bleu_1': 771 | current_image_rewards['Bleu_1'].append(score) 772 | if metric == 'Bleu_2': 773 | current_image_rewards['Bleu_2'].append(score) 774 | if metric == 'Bleu_3': 775 | current_image_rewards['Bleu_3'].append(score) 776 | if metric == 'Bleu_4': 777 | current_image_rewards['Bleu_4'].append(score) 778 | print "Time cost: {}".format(time.time()-start_time) 779 | 780 | with open('./data/all_images_Q_rewards.pkl', 'w') as fw_1: 781 | pickle.dump(all_images_Q_rewards, fw_1) 782 | 783 | def Train_Bphi_Model(): 784 | n_words = len(idx_to_word) 785 | 786 | with open('./data/all_images_Q_rewards.pkl', 'r') as fr_3: 787 | all_images_Q_rewards = pickle.load(fr_3) 788 | 789 | subset_images_names = all_images_Q_rewards.keys() 790 | 791 | model = CNN_LSTM(n_words = n_words, 792 | batch_size = 1, 793 | feats_dim = feats_dim, 794 | project_dim = project_dim, 795 | lstm_size = lstm_size, 796 | word_embed_dim = word_embed_dim, 797 | lstm_step = lstm_step, 798 | bias_init_vector = bias_init_vector) 799 | 800 | Bphi_tf_images, Bphi_tf_Bleu_1, Bphi_tf_Bleu_2, Bphi_tf_Bleu_3, Bphi_tf_Bleu_4, Bphi_tf_loss = model.train_Bphi_model() 801 | 802 | train_op = tf.train.GradientDescentOptimizer(learning_rate).minimize(Bphi_tf_loss) 803 | sess = tf.InteractiveSession() 804 | 
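# Training the baseline B_phi: starting from the MLE checkpoint './models/model-50',
# the network is asked to predict, from the image feature alone, the per-time-step
# Bleu_1..Bleu_4 rewards that Sample_Q_with_MC() estimated and saved in
# './data/all_images_Q_rewards.pkl'. train_Bphi_model() is defined earlier in this
# file; its loss is assumed here to be a squared-error fit of the predicted baselines
# to those Monte-Carlo Q estimates, roughly
#
#   loss ~ sum_t ( B_phi(g_{1:t-1}) - Q_hat(g_{1:t-1}, g_t) )^2
#
# minimized with plain gradient descent, one image per step (note the
# zip(range(0, N, 1), range(1, N, 1)) pattern below, which yields batches of size 1).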
#tf.initialize_all_variables().run() 805 | new_saver = tf.train.Saver(max_to_keep=500) 806 | #new_saver = tf.train.import_meta_graph('./models/model-32.meta') 807 | #new_saver.restore(sess, tf.train.latest_checkpoint('./models')) 808 | new_saver.restore(sess, './models/model-50') 809 | 810 | loss_to_draw = [] 811 | for epoch in range(0, epochs_Bphi_with_MC): 812 | loss_to_draw_epoch = [] 813 | random.shuffle(subset_images_names) 814 | 815 | for start, end in zip(range(0, len(subset_images_names), 1), 816 | range(1, len(subset_images_names), 1)): 817 | start_time_batch = time.time() 818 | 819 | current_feats = [] 820 | 821 | # Bleu_1, Bleu_2, Bleu_3, Bleu_4 822 | current_Bleu_1 = [] 823 | current_Bleu_2 = [] 824 | current_Bleu_3 = [] 825 | current_Bleu_4 = [] 826 | 827 | img_names = subset_images_names[start:end] 828 | for each_img_name in img_names: 829 | img_feat = np.load(os.path.join(train_val_feats_path, each_img_name+'.npy')) 830 | current_feats.append(img_feat) 831 | 832 | current_Bleu_1.append(all_images_Q_rewards[each_img_name]['Bleu_1']) 833 | current_Bleu_2.append(all_images_Q_rewards[each_img_name]['Bleu_2']) 834 | current_Bleu_3.append(all_images_Q_rewards[each_img_name]['Bleu_3']) 835 | current_Bleu_4.append(all_images_Q_rewards[each_img_name]['Bleu_4']) 836 | 837 | current_feats = np.asarray(current_feats, dtype=np.float32) 838 | current_Bleu_1 = np.asarray(current_Bleu_1, dtype=np.float32) 839 | current_Bleu_2 = np.asarray(current_Bleu_2, dtype=np.float32) 840 | current_Bleu_3 = np.asarray(current_Bleu_3, dtype=np.float32) 841 | current_Bleu_4 = np.asarray(current_Bleu_4, dtype=np.float32) 842 | 843 | _, loss_val = sess.run([train_op, Bphi_tf_loss], 844 | feed_dict = {Bphi_tf_images: current_feats, 845 | Bphi_tf_Bleu_1: current_Bleu_1, 846 | Bphi_tf_Bleu_2: current_Bleu_2, 847 | Bphi_tf_Bleu_3: current_Bleu_3, 848 | Bphi_tf_Bleu_4: current_Bleu_4 849 | }) 850 | 851 | loss_to_draw_epoch.append(loss_val[0,0]) 852 | print "idx: {} epoch: {} loss: {} Time cost: {}".format(start, epoch, loss_val[0,0], time.time() - start_time_batch) 853 | 854 | loss_to_draw.append(np.mean(loss_to_draw_epoch)) 855 | plt_save_img_name = 'Bphi_train_' + str(epoch) + '.png' 856 | plt.plot(range(len(loss_to_draw)), loss_to_draw, color='g') 857 | plt.grid(True) 858 | plt.savefig(os.path.join('./loss_imgs', plt_save_img_name)) 859 | 860 | if np.mod(epoch, 2) == 0: 861 | print "Epoch ", epoch, " is done. Saving the model ..." 
862 | new_saver.save(sess, os.path.join('./models', 'Bphi_train_model'), global_step=epoch) 863 | 864 | 865 | ############################################################################################################## 866 | # 867 | # Step 4: go through all the images in D, SGD update of \theta, \phi 868 | # 869 | ############################################################################################################## 870 | def Train_SGD_update(): 871 | model_path = os.path.join('./models', 'Bphi_train_model-84') 872 | batch_num_images = 100 # 100 873 | epoches = n_epochs # 500 874 | n_words = len(idx_to_word) 875 | train_images_names = train_images_captions.keys() 876 | 877 | # open the dict that map the image names to image ids 878 | with open('./data/train_val_imageNames_to_imageIDs.pkl', 'r') as fr: 879 | train_val_imageNames_to_imageIDs = pickle.load(fr) 880 | 881 | # Load COCO reference json file 882 | annFile = './data/train_val_all_reference.json' 883 | coco = COCO(annFile) 884 | 885 | # model initialization 886 | model = CNN_LSTM(n_words = n_words, 887 | batch_size = batch_num_images, 888 | feats_dim = feats_dim, 889 | project_dim = project_dim, 890 | lstm_size = lstm_size, 891 | word_embed_dim = word_embed_dim, 892 | lstm_step = lstm_step, 893 | bias_init_vector = bias_init_vector) 894 | 895 | # The first model is used to generate sample sentences and Baselines. 896 | # Then we use the sample sentences and coco caption API to compute the Q_rewards. 897 | # And the second model is used to transfer the Q_rewards, Baselines values, 898 | # the loss function is \sum(log(max_probability) * rewards) 899 | tf_images, tf_gen_sents_index, tf_all_sample_sents, tf_all_baselines = model.Monte_Carlo_and_Baseline() 900 | tf_images_2, tf_Q_rewards, tf_Baselines, tf_loss, tf_max_prob, tf_current_rewards, tf_logit_words = model.SGD_update(batch_num_images=1000) 901 | 902 | train_op = tf.train.GradientDescentOptimizer(learning_rate).minimize(tf_loss) 903 | sess = tf.InteractiveSession() 904 | saver = tf.train.Saver() 905 | saver.restore(sess, model_path) 906 | #tf.initialize_all_variables().run() 907 | 908 | # save every epoch loss value in loss_to_draw 909 | loss_to_draw = [] 910 | for epoch in range(0, epoches): 911 | # save every batch loss value in loss_to_draw_epoch 912 | loss_to_draw_epoch = [] 913 | 914 | # shuffle the order of images randomly 915 | random.shuffle(train_images_names) 916 | 917 | # store rewards of all the training images 918 | train_val_images_Q_rewards = {} 919 | 920 | for start, end in zip(range(0, len(train_images_names), batch_num_images), 921 | range(batch_num_images, len(train_images_names), batch_num_images)): 922 | start_time = time.time() 923 | 924 | img_names = train_images_names[start:end] 925 | current_feats = [] 926 | for img_name in img_names: 927 | tmp_feats = np.load(os.path.join(train_val_feats_path, img_name+'.npy')) 928 | current_feats.append(tmp_feats) 929 | current_feats = np.asarray(current_feats) 930 | 931 | # store rewards of all the training images 932 | #train_val_images_Q_rewards = {} 933 | #ONE IMAGE: for idx, img_name in enumerate(train_images_names): 934 | #ONE IMAGE: print "{}, {}".format(idx, img_name) 935 | #ONE IMAGE: start_time = time.time() 936 | current_batch_rewards = {} 937 | current_batch_rewards['Bleu_1'] = [] 938 | current_batch_rewards['Bleu_2'] = [] 939 | current_batch_rewards['Bleu_3'] = [] 940 | current_batch_rewards['Bleu_4'] = [] 941 | 942 | # weighted sum 943 | sum_image_rewards = [] 944 | Bleu_1_weight = 0.5 945 | 
Bleu_2_weight = 0.5 946 | Bleu_3_weight = 1.0 947 | Bleu_4_weight = 1.0 948 | 949 | #ONE IMAGE: current_feats = np.load(os.path.join(train_val_feats_path, img_name+'.npy')) 950 | #ONE IMAGE: current_feats = np.reshape(current_feats, [1, feats_dim]) 951 | 952 | 953 | ################################################################################################################################### 954 | # 955 | # Below, for the current 100 images, we compute Q(g1:t-1, gt) for gt with K Monte Carlo rollouts, using Equation (6) 956 | # Meanwhile, we compute estimated baseline B_phi(g1:t-1) 957 | # 958 | ################################################################################################################################### 959 | feed_dict = {tf_images: current_feats} 960 | gen_sents_index, all_sample_sents, all_baselines = sess.run([tf_gen_sents_index, tf_all_sample_sents, tf_all_baselines], feed_dict) 961 | 962 | # 100 sentences, every sentence has 30 words, thus its shape is 100 x 30 963 | batch_sentences = [] 964 | for tmp_i in range(0, batch_num_images): 965 | single_sentences = [] 966 | for tmp_j in range(0, len(gen_sents_index)): 967 | word_idx = gen_sents_index[tmp_j][tmp_i] 968 | word = idx_to_word[word_idx] 969 | word = word.replace('\n', '') 970 | word = word.replace('\\', '') 971 | word = word.replace('"', '') 972 | single_sentences.append(word) 973 | batch_sentences.append(single_sentences) 974 | 975 | #ONE IMAGE: tmp_sentences = map(lambda x: idx_to_word[x], gen_sents_index) 976 | #ONE IMAGE: print tmp_sentences 977 | #ONE IMAGE: sentences = [] 978 | #ONE IMAGE: for word in tmp_sentences: 979 | #ONE IMAGE: word = word.replace('\n', '') 980 | #ONE IMAGE: word = word.replace('\\', '') 981 | #ONE IMAGE: word = word.replace('"', '') 982 | #ONE IMAGE: sentences.append(word) 983 | 984 | batch_sentences_processed = [] 985 | #gen_sents_list = batch_sentences 986 | for tmp_i in range(0, batch_num_images): 987 | tmp_sentences = batch_sentences[tmp_i] 988 | punctuation = np.argmax(np.array(tmp_sentences) == '') + 1 989 | tmp_sentences = tmp_sentences[:punctuation] 990 | tmp_sentences = ' '.join(tmp_sentences) 991 | tmp_sentences = tmp_sentences.replace(' ', '') 992 | tmp_sentences = tmp_sentences.replace(' ', '') 993 | batch_sentences_processed.append(tmp_sentences) 994 | #print "Idx: {} Image Name: {} Gen Sentence: {}".format(tmp_i, img_names[tmp_i], generated_sentence) 995 | 996 | #ONE IMAGE: gen_sents_list = sentences 997 | #ONE IMAGE: punctuation = np.argmax(np.array(sentences) == '') + 1 998 | #ONE IMAGE: sentences = sentences[:punctuation] 999 | #ONE IMAGE: generated_sentence = ' '.join(sentences) 1000 | #ONE IMAGE: generated_sentence = generated_sentence.replace(' ', '') 1001 | #ONE IMAGE: generated_sentence = generated_sentence.replace(' ', '') 1002 | #ONE IMAGE: print "Generated sentences: {}".format(generated_sentence) 1003 | 1004 | # 0, 1, 2, ..., 28, the 30th is computed by the whole generated sentences 1005 | for time_step in range(0, lstm_step-1): 1006 | print "\n====================================================================================================" 1007 | print "Time step: {} \n".format(time_step) 1008 | batch_samples = all_sample_sents[time_step] 1009 | batch_samples = np.asarray(batch_samples) 1010 | 1011 | batch_sample_sents_1 = [] 1012 | batch_sample_sents_2 = [] 1013 | batch_sample_sents_3 = [] 1014 | # store the sample sentences, each sample list has 100 images' sentences 1015 | for img_idx in range(0, batch_num_images): 1016 | 
batch_sample_sents_1.append([]) 1017 | batch_sample_sents_2.append([]) 1018 | batch_sample_sents_3.append([]) 1019 | 1020 | # 0, 1, 2, ..., 99 1021 | for img_idx in range(0, batch_num_images): 1022 | for each_gen_sents_word in batch_sentences[img_idx][0:time_step+1]: 1023 | each_gen_sents_word = each_gen_sents_word.replace('\n', '') 1024 | each_gen_sents_word = each_gen_sents_word.replace('\\', '') 1025 | each_gen_sents_word = each_gen_sents_word.replace('"', '') 1026 | batch_sample_sents_1[img_idx].append(each_gen_sents_word) 1027 | batch_sample_sents_2[img_idx].append(each_gen_sents_word) 1028 | batch_sample_sents_3[img_idx].append(each_gen_sents_word) 1029 | 1030 | # 0, 1, 2, ..., 99 1031 | for img_idx in range(0, batch_num_images): 1032 | for tmp_i in range(0, batch_samples.shape[0]): 1033 | word_1 = idx_to_word[batch_samples[tmp_i, img_idx, 0]] 1034 | word_2 = idx_to_word[batch_samples[tmp_i, img_idx, 1]] 1035 | word_3 = idx_to_word[batch_samples[tmp_i, img_idx, 2]] 1036 | word_1, word_2, word_3 = word_1.replace('\n', ''), word_2.replace('\n', ''), word_3.replace('\n', '') 1037 | word_1, word_2, word_3 = word_1.replace('\\', ''), word_2.replace('\\', ''), word_3.replace('\\', '') 1038 | word_1, word_2, word_3 = word_1.replace('"', ''), word_2.replace('"', ''), word_3.replace('"', '') 1039 | 1040 | batch_sample_sents_1[img_idx].append(word_1) 1041 | batch_sample_sents_2[img_idx].append(word_2) 1042 | batch_sample_sents_3[img_idx].append(word_3) 1043 | batch_sample_sents_1[img_idx].append('') 1044 | batch_sample_sents_2[img_idx].append('') 1045 | batch_sample_sents_3[img_idx].append('') 1046 | 1047 | batch_three_sample_sents = [batch_sample_sents_1, batch_sample_sents_2, batch_sample_sents_3] 1048 | three_sample_rewards = {} 1049 | three_sample_rewards['Bleu_1'] = 0.0 1050 | three_sample_rewards['Bleu_2'] = 0.0 1051 | three_sample_rewards['Bleu_3'] = 0.0 1052 | three_sample_rewards['Bleu_4'] = 0.0 1053 | 1054 | for tmp_i, batch_sample_sents in enumerate(batch_three_sample_sents): 1055 | ###################################################################################### 1056 | # write the sample sentences of current 100 images 1057 | ###################################################################################### 1058 | fw_1 = open("./data/results_batch_sample_sents.json", 'w') 1059 | fw_1.write('[') 1060 | 1061 | for img_idx in range(0, batch_num_images): 1062 | if ' ' in batch_sample_sents[img_idx]: 1063 | batch_sample_sents[img_idx].remove(' ') 1064 | 1065 | punctuation = np.argmax(np.array(batch_sample_sents[img_idx]) == '') + 1 1066 | batch_sample_sents[img_idx] = batch_sample_sents[img_idx][:punctuation] 1067 | batch_sample_sents[img_idx] = ' '.join(batch_sample_sents[img_idx]) 1068 | batch_sample_sents[img_idx] = batch_sample_sents[img_idx].replace(' ', '') 1069 | batch_sample_sents[img_idx] = batch_sample_sents[img_idx].replace(' ,', ',') 1070 | 1071 | if img_idx != batch_num_images-1: 1072 | fw_1.write('{"image_id": ' + str(train_val_imageNames_to_imageIDs[img_names[img_idx]]) + ', "caption": "' + batch_sample_sents[img_idx] + '"}, ') 1073 | else: 1074 | fw_1.write('{"image_id": ' + str(train_val_imageNames_to_imageIDs[img_names[img_idx]]) + ', "caption": "' + batch_sample_sents[img_idx] + '"}]') 1075 | fw_1.close() 1076 | 1077 | ######################################################################################## 1078 | # compute the Bleu1,2,3,4 score using current 100 images 1079 | 
######################################################################################## 1080 | #annFile = './data/train_val_all_reference.json' 1081 | resFile = './data/results_batch_sample_sents.json' 1082 | #coco = COCO(annFile) 1083 | cocoRes = coco.loadRes(resFile) 1084 | cocoEval = COCOEvalCap(coco, cocoRes) 1085 | cocoEval.params['image_id'] = cocoRes.getImgIds() 1086 | cocoEval.evaluate() 1087 | for metric, score in cocoEval.eval.items(): 1088 | if metric == 'Bleu_1': 1089 | three_sample_rewards['Bleu_1'] += score 1090 | if metric == 'Bleu_2': 1091 | three_sample_rewards['Bleu_2'] += score 1092 | if metric == 'Bleu_3': 1093 | three_sample_rewards['Bleu_3'] += score 1094 | if metric == 'Bleu_4': 1095 | three_sample_rewards['Bleu_4'] += score 1096 | 1097 | current_batch_rewards['Bleu_1'].append(three_sample_rewards['Bleu_1']/3.0) 1098 | current_batch_rewards['Bleu_2'].append(three_sample_rewards['Bleu_2']/3.0) 1099 | current_batch_rewards['Bleu_3'].append(three_sample_rewards['Bleu_3']/3.0) 1100 | current_batch_rewards['Bleu_4'].append(three_sample_rewards['Bleu_4']/3.0) 1101 | 1102 | ##################################################################################################### 1103 | # compute the 30th rewards of the current 100 images 1104 | ##################################################################################################### 1105 | fw_2 = open("./data/results_batch_sample_sents.json", 'w') 1106 | fw_2.write('[') 1107 | for img_idx in range(0, batch_num_images): 1108 | if img_idx != batch_num_images-1: 1109 | fw_2.write('{"image_id": ' + str(train_val_imageNames_to_imageIDs[img_names[img_idx]]) + ', "caption": "' + batch_sentences_processed[img_idx] + '"}, ') 1110 | else: 1111 | fw_2.write('{"image_id": ' + str(train_val_imageNames_to_imageIDs[img_names[img_idx]]) + ', "caption": "' + batch_sentences_processed[img_idx] + '"}]') 1112 | fw_2.close() 1113 | #annFile = './data/train_val_all_reference.json' 1114 | resFile = './data/results_batch_sample_sents.json' 1115 | #coco = COCO(annFile) 1116 | cocoRes = coco.loadRes(resFile) 1117 | cocoEval = COCOEvalCap(coco, cocoRes) 1118 | cocoEval.params['image_id'] = cocoRes.getImgIds() 1119 | cocoEval.evaluate() 1120 | for metric, score in cocoEval.eval.items(): 1121 | if metric == 'Bleu_1': 1122 | current_batch_rewards['Bleu_1'].append(score) 1123 | if metric == 'Bleu_2': 1124 | current_batch_rewards['Bleu_2'].append(score) 1125 | if metric == 'Bleu_3': 1126 | current_batch_rewards['Bleu_3'].append(score) 1127 | if metric == 'Bleu_4': 1128 | current_batch_rewards['Bleu_4'].append(score) 1129 | 1130 | # compute the weight sum of Bleu value as rewards 1131 | for tmp_idx in range(0, lstm_step): 1132 | tmp_reward = current_batch_rewards['Bleu_1'][tmp_idx] * Bleu_1_weight + \ 1133 | current_batch_rewards['Bleu_2'][tmp_idx] * Bleu_2_weight + \ 1134 | current_batch_rewards['Bleu_3'][tmp_idx] * Bleu_3_weight + \ 1135 | current_batch_rewards['Bleu_4'][tmp_idx] * Bleu_4_weight 1136 | sum_image_rewards.append(tmp_reward) 1137 | sum_image_rewards = np.asarray(sum_image_rewards) 1138 | #sum_image_rewards = np.reshape(sum_image_rewards, [1, lstm_step]) 1139 | sum_image_rewards = np.array([sum_image_rewards, ] * batch_num_images) 1140 | 1141 | all_baselines = np.asarray(all_baselines) 1142 | all_baselines = np.reshape(all_baselines, [batch_num_images, lstm_step]) 1143 | #all_baselines_mean = np.mean(all_baselines, axis=0) 1144 | #all_baselines = np.array([all_baselines_mean,] * batch_num_images) 1145 | feed_dict = {tf_images_2: 
current_feats, tf_Q_rewards: sum_image_rewards, tf_Baselines: all_baselines} 1146 | _, loss_value, max_prob, current_rewards, logit_words = sess.run([train_op, tf_loss, tf_max_prob, tf_current_rewards, tf_logit_words], feed_dict) 1147 | #ipdb.set_trace() 1148 | loss_to_draw_epoch.append(loss_value) 1149 | print "idx: {} epoch: {} loss: {} Time cost: {}".format(start, epoch, loss_value, time.time()-start_time) 1150 | 1151 | # draw loss curve every epoch 1152 | loss_to_draw.append(np.mean(loss_to_draw_epoch)) 1153 | plt_save_img_name = 'SGD_update_' + str(epoch) + '.png' 1154 | plt.plot(range(len(loss_to_draw)), loss_to_draw, color='g') 1155 | plt.grid(True) 1156 | plt.savefig(os.path.join(loss_images_save_path, plt_save_img_name)) 1157 | 1158 | if np.mod(epoch, 1) == 0: 1159 | print "Epoch ", epoch, " is done. Saving the model ..." 1160 | saver.save(sess, os.path.join('./models', 'SGD_update_model'), global_step=epoch) 1161 | 1162 | #ONE IMAGE: # compute the 29 rewards using all_sample_sents 1163 | #ONE IMAGE: # the 30th reward is computed with gen_sents_list 1164 | #ONE IMAGE: for t in range(0, lstm_step-1): 1165 | #ONE IMAGE: samples = all_sample_sents[t] 1166 | #ONE IMAGE: samples = np.asarray(samples) 1167 | 1168 | #ONE IMAGE: sample_sent_1 = [] 1169 | #ONE IMAGE: sample_sent_2 = [] 1170 | #ONE IMAGE: sample_sent_3 = [] 1171 | #ONE IMAGE: for each_gen_sents_word in gen_sents_list[0:t+1]: 1172 | #ONE IMAGE: sample_sent_1.append(each_gen_sents_word) 1173 | #ONE IMAGE: sample_sent_2.append(each_gen_sents_word) 1174 | #ONE IMAGE: sample_sent_3.append(each_gen_sents_word) 1175 | 1176 | #ONE IMAGE: for i in range(samples.shape[0]): 1177 | #ONE IMAGE: word_1, word_2, word_3 = idx_to_word[samples[i, 0]], idx_to_word[samples[i, 1]], idx_to_word[samples[i, 2]] 1178 | 1179 | #ONE IMAGE: word_1, word_2, word_3 = word_1.replace('\n', ''), word_2.replace('\n', ''), word_3.replace('\n', '') 1180 | #ONE IMAGE: word_1, word_2, word_3 = word_1.replace('\\', ''), word_2.replace('\\', ''), word_3.replace('\\', '') 1181 | #ONE IMAGE: word_1, word_2, word_3 = word_1.replace('"', ''), word_2.replace('"', ''), word_3.replace('"', '') 1182 | 1183 | #ONE IMAGE: sample_sent_1.append(word_1) 1184 | #ONE IMAGE: sample_sent_2.append(word_2) 1185 | #ONE IMAGE: sample_sent_3.append(word_3) 1186 | 1187 | #ONE IMAGE: sample_sent_1.append('') 1188 | #ONE IMAGE: sample_sent_2.append('') 1189 | #ONE IMAGE: sample_sent_3.append('') 1190 | 1191 | #ONE IMAGE: three_sample_sents = [sample_sent_1, sample_sent_2, sample_sent_3] 1192 | #ONE IMAGE: three_sample_rewards = {} 1193 | #ONE IMAGE: three_sample_rewards['Bleu_1'] = 0.0 1194 | #ONE IMAGE: three_sample_rewards['Bleu_2'] = 0.0 1195 | #ONE IMAGE: three_sample_rewards['Bleu_3'] = 0.0 1196 | #ONE IMAGE: three_sample_rewards['Bleu_4'] = 0.0 1197 | 1198 | #ONE IMAGE: for i, each_sample_sent in enumerate(three_sample_sents): 1199 | #ONE IMAGE: # remove the space element in a list 1200 | #ONE IMAGE: if ' ' in each_sample_sent: 1201 | #ONE IMAGE: each_sample_sent.remove(' ') 1202 | 1203 | #ONE IMAGE: punctuation = np.argmax(np.array(each_sample_sent) == '') + 1 1204 | #ONE IMAGE: each_sample_sent = each_sample_sent[:punctuation] 1205 | #ONE IMAGE: each_sample_sent = ' '.join(each_sample_sent) 1206 | #ONE IMAGE: each_sample_sent = each_sample_sent.replace(' ', '') 1207 | #ONE IMAGE: each_sample_sent = each_sample_sent.replace(' ,', ',') 1208 | 1209 | #ONE IMAGE: fw_1 = open("./data/results_each_sample_sent.json", 'w') 1210 | #ONE IMAGE: fw_1.write('[{"image_id": ' + 
str(train_val_imageNames_to_imageIDs[img_name]) + ', "caption": "' + each_sample_sent + '"}]') 1211 | #ONE IMAGE: fw_1.close() 1212 | 1213 | #ONE IMAGE: annFile = './train_val_reference_json/' + img_name + '.json' 1214 | #ONE IMAGE: resFile = './data/results_each_sample_sent.json' 1215 | #ONE IMAGE: coco = COCO(annFile) 1216 | #ONE IMAGE: cocoRes = coco.loadRes(resFile) 1217 | #ONE IMAGE: cocoEval = COCOEvalCap(coco, cocoRes) 1218 | #ONE IMAGE: cocoEval.params['image_id'] = cocoRes.getImgIds() 1219 | #ONE IMAGE: cocoEval.evaluate() 1220 | #ONE IMAGE: for metric, score in cocoEval.eval.items(): 1221 | #ONE IMAGE: if metric == 'Bleu_1': 1222 | #ONE IMAGE: three_sample_rewards['Bleu_1'] += score 1223 | #ONE IMAGE: if metric == 'Bleu_2': 1224 | #ONE IMAGE: three_sample_rewards['Bleu_2'] += score 1225 | #ONE IMAGE: if metric == 'Bleu_3': 1226 | #ONE IMAGE: three_sample_rewards['Bleu_3'] += score 1227 | #ONE IMAGE: if metric == 'Bleu_4': 1228 | #ONE IMAGE: three_sample_rewards['Bleu_4'] += score 1229 | 1230 | #ONE IMAGE: current_image_rewards['Bleu_1'].append(three_sample_rewards['Bleu_1']/3.0) 1231 | #ONE IMAGE: current_image_rewards['Bleu_2'].append(three_sample_rewards['Bleu_2']/3.0) 1232 | #ONE IMAGE: current_image_rewards['Bleu_3'].append(three_sample_rewards['Bleu_3']/3.0) 1233 | #ONE IMAGE: current_image_rewards['Bleu_4'].append(three_sample_rewards['Bleu_4']/3.0) 1234 | 1235 | #ONE IMAGE: fw_1 = open("./data/results_each_sample_sent.json", 'w') 1236 | #ONE IMAGE: fw_1.write('[{"image_id": ' + str(train_val_imageNames_to_imageIDs[img_name]) + ', "caption": "' + generated_sentence + '"}]') 1237 | #ONE IMAGE: fw_1.close() 1238 | 1239 | #ONE IMAGE: annFile = './train_val_reference_json/' + img_name + '.json' 1240 | #ONE IMAGE: resFile = './data/results_each_sample_sent.json' 1241 | #ONE IMAGE: coco = COCO(annFile) 1242 | #ONE IMAGE: cocoRes = coco.loadRes(resFile) 1243 | #ONE IMAGE: cocoEval = COCOEvalCap(coco, cocoRes) 1244 | #ONE IMAGE: cocoEval.params['image_id'] = cocoRes.getImgIds() 1245 | #ONE IMAGE: cocoEval.evaluate() 1246 | #ONE IMAGE: for metric, score in cocoEval.eval.items(): 1247 | #ONE IMAGE: if metric == 'Bleu_1': 1248 | #ONE IMAGE: current_image_rewards['Bleu_1'].append(score) 1249 | #ONE IMAGE: if metric == 'Bleu_2': 1250 | #ONE IMAGE: current_image_rewards['Bleu_2'].append(score) 1251 | #ONE IMAGE: if metric == 'Bleu_3': 1252 | #ONE IMAGE: current_image_rewards['Bleu_3'].append(score) 1253 | #ONE IMAGE: if metric == 'Bleu_4': 1254 | #ONE IMAGE: current_image_rewards['Bleu_4'].append(score) 1255 | 1256 | #ONE IMAGE: # save the rewards immediately 1257 | #ONE IMAGE: train_val_images_Q_rewards[img_name] = current_image_rewards 1258 | #ONE IMAGE: with open('./data/train_val_images_Q_rewards.pkl', 'w') as fw_2: 1259 | #ONE IMAGE: pickle.dump(train_val_images_Q_rewards, fw_2) 1260 | 1261 | #ONE IMAGE: # compute the weight sum of Bleu value as rewards 1262 | #ONE IMAGE: for tmp_idx in range(0, lstm_step): 1263 | #ONE IMAGE: tmp_reward = current_image_rewards['Bleu_1'][tmp_idx] * Bleu_1_weight + \ 1264 | #ONE IMAGE: current_image_rewards['Bleu_2'][tmp_idx] * Bleu_2_weight + \ 1265 | #ONE IMAGE: current_image_rewards['Bleu_3'][tmp_idx] * Bleu_3_weight + \ 1266 | #ONE IMAGE: current_image_rewards['Bleu_4'][tmp_idx] * Bleu_4_weight 1267 | #ONE IMAGE: sum_image_rewards.append(tmp_reward) 1268 | 1269 | #ONE IMAGE: sum_image_rewards = np.asarray(sum_image_rewards) 1270 | #ONE IMAGE: sum_image_rewards = np.reshape(sum_image_rewards, [1, lstm_step]) 1271 | #ONE IMAGE: all_baselines = 
np.asarray(all_baselines) 1272 | #ONE IMAGE: all_baselines = np.reshape(all_baselines, [1, lstm_step]) 1273 | #ONE IMAGE: feed_dict = {tf_images_2: current_feats, tf_Q_rewards: sum_image_rewards, tf_Baselines: all_baselines} 1274 | #ONE IMAGE: _, loss_value = sess.run([train_op, tf_loss], feed_dict) 1275 | 1276 | #ONE IMAGE: loss_to_draw_epoch.append(loss_value) 1277 | 1278 | #ONE IMAGE: print "idx: {} epoch: {} loss: {} Time cost: {}".format(idx, epoch, loss_value, time.time()-start_time) 1279 | 1280 | #ONE IMAGE: # draw loss curve every epoch 1281 | #ONE IMAGE: loss_to_draw.append(np.mean(loss_to_draw_epoch)) 1282 | #ONE IMAGE: plt_save_img_name = str(epoch) + '.png' 1283 | #ONE IMAGE: plt.plot(range(len(loss_to_draw)), loss_to_draw, color='g') 1284 | #ONE IMAGE: plt.grid(True) 1285 | #ONE IMAGE: plt.savefig(os.path.join(loss_images_save_path, plt_save_img_name)) 1286 | 1287 | #ONE IMAGE: if np.mod(epoch, 2) == 0: 1288 | #ONE IMAGE: print "Epoch ", epoch, " is done. Saving the model ..." 1289 | #ONE IMAGE: saver.save(sess, os.path.join('./models', 'SGD_update_model'), global_step=epoch) 1290 | 1291 | 1292 | 1293 | -------------------------------------------------------------------------------- /inception/COCO_val2014_000000320612.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/chenxinpeng/Optimization_of_image_description_metrics_using_policy_gradient_methods/66089304b3dc78a1e27f90e262d0cb17c5bb4cf2/inception/COCO_val2014_000000320612.jpg -------------------------------------------------------------------------------- /inception/README.md: -------------------------------------------------------------------------------- 1 | Please attention: 2 | 3 | 1. The original MSCOCO images include one image that is actually in PNG format even though its filename ends in `.jpg` (**COCO_val2014_000000320612.jpg**). 4 | 5 | 2. When merging the training and validation features into one folder `train_val_feats`, 6 | there are too many files for a single `cp` command (the shell argument list becomes too long). 7 | So I use `copy_train_val_feats.sh` to copy the `train_feats` and `val_feats` features into `train_val_feats` one file at a time (adjust `DIR` in the script to point at each feature folder). 8 | 9 | -------------------------------------------------------------------------------- /inception/check_NOT_JPEG_IMG.sh: -------------------------------------------------------------------------------- 1 | 2 | DIR="/home/chenxp/data/mscoco/val2014/*.jpg" 3 | 4 | for img in $DIR 5 | do 6 | file $img >> imageInfo.txt 7 | done 8 | -------------------------------------------------------------------------------- /inception/copy_train_val_feats.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | 3 | DIR="./val_feats/*.npy" 4 | 5 | for feat in $DIR 6 | do 7 | cp $feat ./train_val_feats 8 | done 9 | -------------------------------------------------------------------------------- /inception/extract_inception_bottleneck_feature.py: -------------------------------------------------------------------------------- 1 | import os 2 | import glob 3 | import time 4 | 5 | import tensorflow as tf 6 | import tensorflow.python.platform 7 | from tensorflow.python.platform import gfile 8 | 9 | import numpy as np 10 | 11 | 12 | def create_graph(model_path): 13 | """ 14 | create_graph loads the inception model to memory, should be called before 15 | calling extract_features. 16 | 17 | model_path: path to inception model in protobuf form.
18 | """ 19 | with gfile.FastGFile(model_path, 'rb') as f: 20 | graph_def = tf.GraphDef() 21 | graph_def.ParseFromString(f.read()) 22 | _ = tf.import_graph_def(graph_def, name='') 23 | 24 | 25 | def extract_features(image_paths, feats_save_path, verbose=False): 26 | """ 27 | extract_features computes the Inception bottleneck feature for a list of images 28 | 29 | image_paths: list of image paths 30 | saves one 2048-d feature per image as a .npy file in feats_save_path 31 | """ 32 | #feature_dimension = 2048 33 | #features = np.empty((len(image_paths), feature_dimension)) 34 | 35 | with tf.Session() as sess: 36 | flattened_tensor = sess.graph.get_tensor_by_name('pool_3:0') 37 | 38 | for i, image_path in enumerate(image_paths): 39 | image_basename = os.path.basename(image_path) 40 | start_time = time.time() 41 | 42 | feat_save_path = os.path.join(feats_save_path, image_basename + '.npy') 43 | if os.path.isfile(feat_save_path): 44 | continue 45 | 46 | if not gfile.Exists(image_path): 47 | tf.logging.fatal('File does not exist %s', image_path) 48 | 49 | image_data = gfile.FastGFile(image_path, 'rb').read() 50 | feature = sess.run([flattened_tensor], {'DecodeJpeg/contents:0': image_data}) 51 | np.save(feat_save_path, np.squeeze(feature)) 52 | 53 | if verbose: 54 | print('idx: {} {} Time cost: {}'.format(i, image_basename, time.time()-start_time)) 55 | 56 | 57 | if __name__ == "__main__": 58 | images_path = '/home/chenxp/data/mscoco/test2014' 59 | feats_save_path = './test_feats' 60 | 61 | model_path = 'tensorflow_inception_graph.pb' 62 | 63 | images_lists = glob.glob(images_path + '/*.jpg') 64 | 65 | create_graph(model_path) 66 | extract_features(images_lists, feats_save_path, verbose=True) 67 | -------------------------------------------------------------------------------- /inception/test_feats/README.md: -------------------------------------------------------------------------------- 1 | 2 | This folder saves the features of test images. 3 | -------------------------------------------------------------------------------- /inception/train_feats/README.md: -------------------------------------------------------------------------------- 1 | 2 | This folder saves the features of training images. 3 | -------------------------------------------------------------------------------- /inception/train_val_feats/README.md: -------------------------------------------------------------------------------- 1 | 2 | This folder saves the features of training and validation images. 3 | -------------------------------------------------------------------------------- /inception/val_feats/README.md: -------------------------------------------------------------------------------- 1 | 2 | This folder saves the features of validation images. 3 | -------------------------------------------------------------------------------- /pre_train_json.py: -------------------------------------------------------------------------------- 1 | # encoding: UTF-8 2 | 3 | import os 4 | import json 5 | import numpy as np 6 | import cPickle as pickle 7 | 8 | import time 9 | import ipdb 10 | 11 | train_captions_path = './data/captions_train2014.json' 12 | save_images_captions_path = './data/train_images_captions.pkl' 13 | 14 | train_captions_fo = open(train_captions_path) 15 | train_captions = json.load(train_captions_fo) 16 | 17 | image_ids = [] 18 | for annotation in train_captions['annotations']: 19 | image_ids.append(annotation['image_id']) 20 | 21 | # [[filename1, id1], [filename2, id2], ...
] 22 | images_captions = {} 23 | for ii, image in enumerate(train_captions['images']): 24 | start_time = time.time() 25 | 26 | image_file_name = image['file_name'] 27 | image_id = image['id'] 28 | indices = [i for i, x in enumerate(image_ids) if x == image_id] 29 | 30 | caption = [] 31 | for idx in indices: 32 | each_cap = train_captions['annotations'][idx]['caption'] 33 | each_cap = each_cap.lower() 34 | each_cap = each_cap.replace('.', '') 35 | each_cap = each_cap.replace(',', ' ,') 36 | each_cap = each_cap.replace('?', ' ?') 37 | caption.append(each_cap) 38 | images_captions[image_file_name] = caption 39 | print "{} {} Each image cost: {}".format(ii, image_file_name, time.time()-start_time) 40 | 41 | with open(save_images_captions_path, 'w') as fw: 42 | pickle.dump(images_captions, fw) 43 | 44 | 45 | -------------------------------------------------------------------------------- /pre_val_json.py: -------------------------------------------------------------------------------- 1 | # encoding: UTF-8 2 | 3 | import os 4 | import json 5 | import numpy as np 6 | import cPickle as pickle 7 | 8 | import time 9 | import ipdb 10 | 11 | train_captions_path = './data/captions_val2014.json' 12 | save_images_captions_path = './data/val_images_captions.pkl' 13 | 14 | train_captions_fo = open(train_captions_path) 15 | train_captions = json.load(train_captions_fo) 16 | 17 | image_ids = [] 18 | for annotation in train_captions['annotations']: 19 | image_ids.append(annotation['image_id']) 20 | 21 | # [[filename1, id1], [filename2, id2], ... ] 22 | images_captions = {} 23 | for ii, image in enumerate(train_captions['images']): 24 | start_time = time.time() 25 | 26 | image_file_name = image['file_name'] 27 | image_id = image['id'] 28 | indices = [i for i, x in enumerate(image_ids) if x == image_id] 29 | 30 | caption = [] 31 | for idx in indices: 32 | each_cap = train_captions['annotations'][idx]['caption'] 33 | each_cap = each_cap.lower() 34 | each_cap = each_cap.replace('.', '') 35 | each_cap = each_cap.replace(',', ' ,') 36 | each_cap = each_cap.replace('?', ' ?') 37 | caption.append(each_cap) 38 | images_captions[image_file_name] = caption 39 | print "{} {} Each image cost: {}".format(ii, image_file_name, time.time()-start_time) 40 | 41 | with open(save_images_captions_path, 'w') as fw: 42 | pickle.dump(images_captions, fw) 43 | 44 | 45 | -------------------------------------------------------------------------------- /split_train_val_data.py: -------------------------------------------------------------------------------- 1 | # encoding: UTF-8 2 | 3 | # accoding the paper: we hold out a small subset of 1,665 validation images 4 | # for hyper-parameter tuning, and use the remaining combined training and 5 | # validation set for training 6 | 7 | import os 8 | import cPickle as pickle 9 | 10 | train_images_captions_path = './data/train_images_captions.pkl' 11 | val_images_captions_path = './data/val_images_captions.pkl' 12 | 13 | with open(train_images_captions_path, 'r') as fr1: 14 | train_images_captions = pickle.load(fr1) 15 | 16 | with open(val_images_captions_path, 'r') as fr2: 17 | val_images_captions = pickle.load(fr2) 18 | 19 | val_images_names = val_images_captions.keys() 20 | 21 | # val_images_names[0:1665] for validation 22 | # val_images_names[1665:] for training 23 | val_names_part_one = val_images_names[0:1665] 24 | val_names_part_two = val_images_names[1665:] 25 | 26 | # re-save the train_images_captions, val_images_captions 27 | val_images_captions_new = {} 28 | for img in 
val_names_part_one: 29 | val_images_captions_new[img] = val_images_captions[img] 30 | 31 | for img in val_names_part_two: 32 | train_images_captions[img] = val_images_captions[img] 33 | 34 | with open(train_images_captions_path, 'w') as fw1: 35 | pickle.dump(train_images_captions, fw1) 36 | 37 | with open(val_images_captions_path, 'w') as fw2: 38 | pickle.dump(val_images_captions_new, fw2) 39 | 40 | --------------------------------------------------------------------------------