├── README.md
├── build_vocab.py
├── coco_caption
    ├── Bleu_1.pkl
    ├── Bleu_2.pkl
    ├── Bleu_3.pkl
    ├── Bleu_4.pkl
    ├── CIDEr.pkl
    ├── METEOR.pkl
    ├── captions_test2014_hitachi_results.json
    ├── captions_val2014_hitachi_results.json
    ├── draw.py
    ├── eval_captions_results.py
    ├── eval_image_caption.py
    ├── eval_model.py
    ├── gen_test_json.py
    ├── gen_val_json.py
    ├── model_evalution.png
    ├── read_test_info.py
    └── read_validation_info.py
├── create_train_val_all_reference.py
├── create_train_val_each_reference.py
├── data
    ├── bias_init_vector.npy
    ├── idx_to_word.pkl
    ├── test.txt
    ├── test2014_images_ids_to_names.pkl
    ├── train_val_imageNames_to_imageIDs.pkl
    ├── val2014_images_ids_to_names.pkl
    ├── val_images_captions.pkl
    └── word_to_idx.pkl
├── image
    ├── 1.png
    ├── 2.png
    └── 3.png
├── image_caption.py
├── inception
    ├── COCO_val2014_000000320612.jpg
    ├── README.md
    ├── check_NOT_JPEG_IMG.sh
    ├── copy_train_val_feats.sh
    ├── extract_inception_bottleneck_feature.py
    ├── imageInfo.txt
    ├── test_feats
    │   └── README.md
    ├── train_feats
    │   └── README.md
    ├── train_val_feats
    │   └── README.md
    └── val_feats
    │   └── README.md
├── pre_train_json.py
├── pre_val_json.py
└── split_train_val_data.py


/README.md:
--------------------------------------------------------------------------------
  1 | # Optimization of image description metrics using policy gradient methods
  2 | This is Tensorflow implement of paper: [Optimization of image description metrics using policy gradient methods](https://arxiv.org/abs/1612.00370).
  3 | 
  4 | ## Note
  5 | This repository is not being actively maintained due to lack of time and interest. My sincerest apologies to the open source community for allowing this project to stagnate. I hope it was useful for some of you as a jumping-off point.
  6 | 
  7 | ## Prerequisites
  8 | - TensorFlow 0.10
  9 | 
 10 | ## Introduction
 11 | This codes are a little coarse. When I worked on this paper, I also have some questions. But the authors didn't reply me after I sent e-mails, and I don't know why. 
 12 | 
 13 | So, Please contact me anytime if you have any doubt.
 14 | 
 15 | My e-mail: jschenxinpeng@gmail.com
 16 | 
 17 | I will appreciate that if you have any advice.
 18 | 
 19 | ## How to run the code
 20 | ### Step 1
 21 | Go into the `./inception` directory, the python script which used to extract features is: `extract_inception_bottleneck_feature.py`.
 22 | 
 23 | In this python script, there are few parameters you should modified:
 24 |  - `image_path`: the MSCOCO image path, e.g. `/path/to/msococo/train2014`, `/path/to/msococo/val2014`, `/path/to/msococo/test2014`
 25 |  - `feats_save_path`: the feature directory which you want to saved.
 26 |  - `model_path`: the pre-trained **inception-V3** tensorflow model. And I uploaded this model on the Google Drive: [tensorflow_inception_graph.pb](https://drive.google.com/open?id=0B65vBUruA6N4Y2dtVHBJMVhodjA)
 27 |  
 28 | 
 29 | After you modified the parameters, we can extract image features, in the terminal:
 30 |  ```bash
 31 |  $ CUDA_VISIBLE_DEVICES=3 python extract_inception_bottleneck_feature.py
 32 |  ```
 33 | Also, you can run the code without GPU:
 34 |  ```bash
 35 |  $ CUDA_VISIBLE_DEVICES="" python extract_inception_bottleneck_feature.py
 36 |  ```
 37 | 
 38 | In my experiment, I save the `train2014` image feature in the folder: `./inception/train_feats`, `val2014` image feature are saved in the folder: `./inception/val_feats`, and the `test2014` image features are saved in the folder: `test_feats`
 39 | And at the same time, I saved the `train2014`+`val2014` image features in the folder: `./inception/train_val_feats`
 40 | 
 41 | ### Step 2
 42 | Run the scripts:
 43 | ```bash
 44 | $ python pre_train_json.py
 45 | $ python pre_val_json.py
 46 | $ python split_train_val_data.py
 47 | ```
 48 | 
 49 | The python script `pre_train_json.py`, it is used to process the `./data/captions_train2014.json`, it generated a file: `./data/train_images_captions.pkl`, it is a dict which save the captions of each image, like this:
 50 | <center>![train_image_captions](https://github.com/chenxinpeng/Optimization-of-image-description-metrics-using-policy-gradient-methods/blob/master/image/1.png)</center>
 51 | 
 52 | The script `pre_val_json.py`, it is used to process the `./data/captions_val2014.json`. it generated a file: `./data/val_images_captions.pkl`.
 53 | 
 54 | The script `split_train_val_data.py`, because according to the paper, it only use 1665 validation images, the other validation images are used to training. So, I split the validation images into two parts, the 0~1665 images are used to validation, the left are used to training.
 55 | 
 56 | ### Step 3
 57 | Run the scripts:
 58 | ```bash
 59 | $ python create_train_val_all_reference.py
 60 | ```
 61 | and
 62 | ```bash
 63 | $ create_train_val_each_reference.py
 64 | ```
 65 | 
 66 | Let me explain the two scripts, the first script `create_train_val_all_reference.py`, it will generate a JSON file named `train_val_all_reference.json`(about 70M), it saves the ground-truth captions of training and validation images.
 67 | 
 68 | The second script `create_train_val_each_reference.py`, it will generate JSON files of every training and validation images. And it saves every JSON file in the folder: `./train_val_reference_json/`
 69 | 
 70 | ### Step 4
 71 | Run the script:
 72 | ```bash
 73 | $ python build_vocab.py
 74 | ```
 75 | 
 76 | This script will build the vocabulary dict. In the data folder, it will generate three files:
 77 |  - word_to_idx.pkl
 78 |  - idx_to_word.pkl
 79 |  - bias_init_vector.npy
 80 |   
 81 | By the way, I filter the words more than 5 times, you can change this parameter in the script.
 82 | 
 83 | ### Step 5
 84 | In this step, we follow the algorithm in the paper:
 85 | <center>![algorithm](https://github.com/chenxinpeng/Optimization-of-image-description-metrics-using-policy-gradient-methods/blob/master/image/2.png)</center>
 86 | 
 87 | First, we train the the basic model with MLE(Maximum Likehood Estimation):
 88 | ```bash
 89 | $ CUDA_VISIBLE_DEVICES=0 ipython
 90 | >>> import image_caption
 91 | >>> image_caption.Train_with_MLE()
 92 | ```
 93 | 
 94 | After training the basic model, you can test and validate the model on test data and validation data:
 95 | ```bash
 96 | >>> image_caption.Test_with_MLE()
 97 | >>> image_caption.Val_with_MLE()
 98 | ```
 99 | 
100 | Second, we train B_phi using MC estimates of Q_theta on a small dataset D(1665 images):
101 | ```bash
102 | >>> image_caption.Sample_Q_with_MC()
103 | >>> image_caption.Train_Bphi_Model()
104 | ```
105 | 
106 | After we get the B_phi model, we use RG to optimize the generation:
107 | ```bash
108 | >>> image_caption.Train_SGD_update()
109 | ```
110 | I have runned several epochs, here I compared the RL results with the no RL results:
111 | ![results compared](https://github.com/chenxinpeng/Optimization-of-image-description-metrics-using-policy-gradient-methods/blob/master/image/3.png)
112 | 
113 | This shows that the policy gradient method is beneficial for image caption.
114 | 
115 | ### COCO evalution
116 | In the `./coco_caption/` folder, we can evaluate the generation results and our each trained model. Please see the python scripts.
117 | 


--------------------------------------------------------------------------------
/build_vocab.py:
--------------------------------------------------------------------------------
 1 | # encoding: UTF-8
 2 | 
 3 | #-----------------------------------------------------------------------
 4 | # We preprocess the text data by lower casing, and replacing words which
 5 | # occur less than 5 times in the 82K training set with <unk>;
 6 | # This results in a vocabulary size of 10,622 (from 32,807 words).
 7 | #-----------------------------------------------------------------------
 8 | 
 9 | import os
10 | import numpy as np
11 | import cPickle as pickle
12 | import time
13 | 
14 | 
15 | train_images_captions_path = './data/train_images_captions.pkl'
16 | with open(train_images_captions_path, 'r') as train_fr:
17 |     train_images_captions = pickle.load(train_fr)
18 | 
19 | val_images_captions_path = './data/val_images_captions.pkl'
20 | with open(val_images_captions_path, 'r') as val_fr:
21 |     val_images_captions = pickle.load(val_fr)
22 | 
23 | 
24 | #------------------------------------------------------------------------
25 | # Borrowed this function from NeuralTalk:
26 | # https://github.com/karpathy/neuraltalk/blob/master/driver.py#L16
27 | #-----------------------------------------------------------------------
28 | def preProBuildWordVocab(sentence_iterator, word_count_threshold=5):
29 |     print 'Preprocessing word counts and creating vocab based on word count threshold %d' % (word_count_threshold, )
30 | 
31 |     t0 = time.time()
32 |     word_counts = {}
33 |     nsents = 0
34 | 
35 |     for sent in sentence_iterator:
36 |         nsents += 1
37 |         tmp_sent = sent.split(' ')
38 |         # remove the empty string '' in the sentence
39 |         tmp_sent = filter(None, tmp_sent)
40 |         for w in tmp_sent:
41 |            word_counts[w] = word_counts.get(w, 0) + 1
42 |     vocab = [w for w in word_counts if word_counts[w] >= word_count_threshold]
43 |     print 'Filter words from %d to %d in %0.2fs' % (len(word_counts), len(vocab), time.time()-t0)
44 | 
45 |     ixtoword = {}
46 |     ixtoword[0] = '<pad>'
47 |     ixtoword[1] = '<bos>'
48 |     ixtoword[2] = '<eos>'
49 |     ixtoword[3] = '<unk>'
50 | 
51 |     wordtoix = {}
52 |     wordtoix['<pad>'] = 0
53 |     wordtoix['<bos>'] = 1
54 |     wordtoix['<eos>'] = 2
55 |     wordtoix['<unk>'] = 3
56 | 
57 |     for idx, w in enumerate(vocab):
58 |         wordtoix[w] = idx + 4
59 |         ixtoword[idx+4] = w
60 | 
61 |     word_counts['<eos>'] = nsents
62 |     word_counts['<bos>'] = nsents
63 |     word_counts['<pad>'] = nsents
64 |     word_counts['<unk>'] = nsents
65 | 
66 |     bias_init_vector = np.array([1.0 * word_counts[ ixtoword[i] ] for i in ixtoword])
67 |     bias_init_vector /= np.sum(bias_init_vector) # normalize to frequencies
68 |     bias_init_vector = np.log(bias_init_vector)
69 |     bias_init_vector -= np.max(bias_init_vector) # shift to nice numeric range
70 | 
71 |     return wordtoix, ixtoword, bias_init_vector
72 | 
73 | 
74 | # extract all sentences in captions
75 | all_sents = []
76 | for image, sents in train_images_captions.iteritems():
77 |     for each_sent in sents:
78 |         all_sents.append(each_sent)
79 | #for image, sents in val_images_captions.iteritems():
80 | #    for each_sent in sents:
81 | #        all_sents.append(each_sent)
82 | 
83 | word_to_idx, idx_to_word, bias_init_vector = preProBuildWordVocab(all_sents, word_count_threshold=5)
84 | 
85 | with open('./data/idx_to_word.pkl', 'w') as fw_1:
86 |     pickle.dump(idx_to_word, fw_1)
87 | 
88 | with open('./data/word_to_idx.pkl', 'w') as fw_2:
89 |     pickle.dump(word_to_idx, fw_2)
90 | 
91 | np.save('./data/bias_init_vector.npy', bias_init_vector)
92 | 
93 | 


--------------------------------------------------------------------------------
/coco_caption/Bleu_1.pkl:
--------------------------------------------------------------------------------
  1 | (lp1
  2 | F0.63260942443110402
  3 | aF0.66267368806380189
  4 | aF0.67396468036764079
  5 | aF0.67794791365717377
  6 | aF0.67931985672525264
  7 | aF0.68720906282181904
  8 | aF0.68912609777666678
  9 | aF0.69167747380553968
 10 | aF0.69363946633569773
 11 | aF0.70126672092487952
 12 | aF0.69540300212104966
 13 | aF0.69710219655562145
 14 | aF0.70975461281721475
 15 | aF0.70438543342455129
 16 | aF0.71065515953693403
 17 | aF0.70796259212616475
 18 | aF0.7137309178045258
 19 | aF0.70972130606858697
 20 | aF0.71264320363297118
 21 | aF0.71230150588813113
 22 | aF0.71266183163274432
 23 | aF0.71383000617409531
 24 | aF0.71222052067379871
 25 | aF0.71817035091708437
 26 | aF0.71678112561610441
 27 | aF0.71900126082551585
 28 | aF0.72450404567296334
 29 | aF0.72652732454870661
 30 | aF0.71936356792608769
 31 | aF0.72033242351406046
 32 | aF0.72968607636874172
 33 | aF0.72561088200897139
 34 | aF0.7244390812590612
 35 | aF0.72698497523275596
 36 | aF0.73069298752786138
 37 | aF0.73729693796998763
 38 | aF0.73198069213775818
 39 | aF0.73262131127298169
 40 | aF0.73429693443261856
 41 | aF0.73372853890175116
 42 | aF0.73490602935423266
 43 | aF0.73360538079808202
 44 | aF0.73405409832707313
 45 | aF0.73505874686919437
 46 | aF0.73659015588558707
 47 | aF0.73524962178515896
 48 | aF0.74024590163932913
 49 | aF0.74045426642110213
 50 | aF0.74236570630058896
 51 | aF0.74099218845853465
 52 | aF0.74233917889878653
 53 | aF0.74182765268035067
 54 | aF0.7426081290217329
 55 | aF0.74422245108134411
 56 | aF0.74005520376683032
 57 | aF0.74418414823278223
 58 | aF0.73858778239285305
 59 | aF0.74325937512862261
 60 | aF0.74602720699373459
 61 | aF0.74856333634347039
 62 | aF0.74826556870816785
 63 | aF0.75033971587398085
 64 | aF0.74709100559341968
 65 | aF0.74876745673204315
 66 | aF0.75033740951288808
 67 | aF0.75072200676621936
 68 | aF0.75079507461467931
 69 | aF0.75131406044676519
 70 | aF0.75091575091573559
 71 | aF0.7533601049008205
 72 | aF0.75039924655008505
 73 | aF0.75081011677908271
 74 | aF0.75387249307026027
 75 | aF0.75280507066520175
 76 | aF0.75895135402089153
 77 | aF0.7509718073570778
 78 | aF0.75198312065081352
 79 | aF0.75178513469651187
 80 | aF0.75653279730892331
 81 | aF0.75611004216643973
 82 | aF0.75846715177999302
 83 | aF0.75828324067043973
 84 | aF0.75750477716820241
 85 | aF0.75789989385154011
 86 | aF0.75972493489581794
 87 | aF0.75437812360325152
 88 | aF0.75501730103804698
 89 | aF0.76171867007671079
 90 | aF0.75758749312079343
 91 | aF0.75749841762458392
 92 | aF0.75914832925833831
 93 | aF0.75881129150687854
 94 | aF0.75751186083287869
 95 | aF0.75484145132240421
 96 | aF0.75908663477654936
 97 | aF0.75733703416464437
 98 | aF0.76081097635671613
 99 | aF0.75996548039778178
100 | aF0.75746828259898757
101 | aF0.76331747554973228
102 | aF0.76167146385774265
103 | aF0.75959328927298919
104 | aF0.76060692348369285
105 | aF0.7615271719189376
106 | aF0.76225570032571743
107 | aF0.75984967874892395
108 | aF0.75717563705163327
109 | aF0.76227174752936766
110 | aF0.75881628615836505
111 | aF0.76317765735044829
112 | aF0.75621208762417613
113 | aF0.75900304982729583
114 | aF0.76074306177258977
115 | aF0.76198528962326029
116 | aF0.76501348149357051
117 | aF0.76363267788362532
118 | aF0.76185546082012268
119 | aF0.76123584183157811
120 | aF0.76008633906235867
121 | aF0.76592733375117239
122 | aF0.76374915821478762
123 | aF0.76026841827837732
124 | aF0.76175517026116601
125 | aF0.76038009926157535
126 | aF0.76368359579522416
127 | aF0.75833484600954482
128 | aF0.76197571632803573
129 | aF0.7629299028616281
130 | aF0.76091773426797282
131 | aF0.76531982421873446
132 | aF0.76405016256384894
133 | aF0.7659885788607157
134 | aF0.76358106052923536
135 | aF0.76399991904635078
136 | aF0.76128875773131832
137 | aF0.76268313355126205
138 | aF0.76484217800449661
139 | aF0.76796994008325858
140 | aF0.76355805394910903
141 | aF0.76499747091551318
142 | aF0.76452938096579082
143 | aF0.76352628826580393
144 | aF0.76775447404646602
145 | aF0.76476284624849677
146 | aF0.76222562943833194
147 | aF0.76338147833473402
148 | aF0.76421753429051176
149 | aF0.76712827251590487
150 | aF0.76658684321733916
151 | aF0.76485198497635876
152 | aF0.7618884776311402
153 | aF0.76374823481943188
154 | aF0.76541313880411499
155 | aF0.76382010709826542
156 | aF0.76418767057311043
157 | aF0.76132075471696592
158 | aF0.76545898102782794
159 | aF0.76787125803115663
160 | aF0.76265898728100234
161 | aF0.76364074356605049
162 | aF0.76735406919538229
163 | aF0.76499618825982496
164 | aF0.76497955731000866
165 | aF0.76429861529197751
166 | aF0.7638679302227166
167 | aF0.76541870139264157
168 | aF0.76458651450613269
169 | aF0.7663183003496985
170 | aF0.76314153346301195
171 | aF0.76658610271901784
172 | aF0.76437242097670843
173 | aF0.76178136675905428
174 | aF0.76439947911447936
175 | aF0.75999279077217319
176 | aF0.76444729922317267
177 | aF0.76261997510138624
178 | aF0.75889880417638389
179 | aF0.76379881537996663
180 | aF0.76253897998187792
181 | aF0.76538973882854799
182 | aF0.76289754065527593
183 | aF0.76168346510550311
184 | aF0.76136567119633991
185 | aF0.75887144844027898
186 | aF0.76305356893836762
187 | aF0.76344193483908129
188 | aF0.76334680581527092
189 | aF0.76585101253614662
190 | aF0.76123191302260851
191 | aF0.7634585650890392
192 | aF0.76134217400038662
193 | aF0.7604760476047453
194 | aF0.76254760812776401
195 | aF0.76048083248608023
196 | aF0.76025974025972509
197 | aF0.76538307260327387
198 | aF0.76010312987167206
199 | aF0.76104285286183682
200 | aF0.76136592032799855
201 | aF0.76413942221063957
202 | aF0.76069486646408202
203 | aF0.76404426801393877
204 | aF0.76077328225998009
205 | aF0.76148381689745859
206 | aF0.76053659024925901
207 | aF0.76173206065797661
208 | aF0.75772879409025495
209 | aF0.76179131966686497
210 | aF0.7582728650027758
211 | aF0.75840286054825679
212 | aF0.76425427578027694
213 | aF0.7586440542856423
214 | aF0.76066111781102075
215 | aF0.76409670893173964
216 | aF0.76330972465381264
217 | aF0.7588560738735719
218 | aF0.75990113810765669
219 | aF0.7633820800829807
220 | aF0.7603335533025225
221 | aF0.76201052484141185
222 | aF0.76061405612855282
223 | aF0.75630568950803212
224 | aF0.757408546872814
225 | aF0.76159282197870493
226 | aF0.76209790209788697
227 | aF0.762037850355331
228 | aF0.75760592545395122
229 | aF0.76000319144690709
230 | aF0.76037472593181665
231 | aF0.76114156339369321
232 | aF0.76310264152451035
233 | aF0.75998571343530852
234 | aF0.75947565098758807
235 | aF0.76130983169536548
236 | aF0.76105455702241209
237 | aF0.76010427653179524
238 | aF0.75949947197479906
239 | aF0.76106000439006327
240 | aF0.76160169187181759
241 | aF0.76186211288908612
242 | aF0.75888582043588193
243 | aF0.76169278371093407
244 | aF0.76218829923272136
245 | aF0.76508881611589241
246 | aF0.76059954616026204
247 | aF0.75496076619007746
248 | aF0.75848861899940945
249 | aF0.75676319393444702
250 | aF0.7617587319287753
251 | aF0.76011012908244202
252 | a.


--------------------------------------------------------------------------------
/coco_caption/Bleu_2.pkl:
--------------------------------------------------------------------------------
  1 | (lp1
  2 | F0.44701445730041445
  3 | aF0.4830433062444468
  4 | aF0.4939662856657035
  5 | aF0.50050569478837414
  6 | aF0.50175276072974118
  7 | aF0.50897594637847177
  8 | aF0.51316415541095439
  9 | aF0.51448432825471457
 10 | aF0.51853477147182192
 11 | aF0.52498707790409072
 12 | aF0.51841263036406193
 13 | aF0.52040892167020669
 14 | aF0.53524628148436371
 15 | aF0.52934501579869575
 16 | aF0.53453659106341056
 17 | aF0.53270843698039461
 18 | aF0.54089648298723925
 19 | aF0.53803943586804548
 20 | aF0.53975898466930994
 21 | aF0.54063323725452739
 22 | aF0.54217124704850073
 23 | aF0.54411558783118907
 24 | aF0.53994886567844735
 25 | aF0.54709809187597069
 26 | aF0.54544906649331237
 27 | aF0.54868210119279937
 28 | aF0.5530751557038055
 29 | aF0.55552914989855995
 30 | aF0.54887126568793954
 31 | aF0.55155635654771573
 32 | aF0.56095074945855317
 33 | aF0.55732285864590903
 34 | aF0.55810031416061867
 35 | aF0.56031775646004767
 36 | aF0.56165756096229902
 37 | aF0.57194047706892182
 38 | aF0.56457488849693882
 39 | aF0.56605184239308481
 40 | aF0.56889475603038908
 41 | aF0.56868565740017174
 42 | aF0.56971650083471992
 43 | aF0.56792100434663617
 44 | aF0.56998952080709786
 45 | aF0.57025692681385454
 46 | aF0.57170900009487069
 47 | aF0.57194722703677359
 48 | aF0.57787436758728994
 49 | aF0.5753862229678891
 50 | aF0.57916014431738894
 51 | aF0.57912794779759935
 52 | aF0.57929492407106742
 53 | aF0.58078922987124637
 54 | aF0.5796684664074081
 55 | aF0.58005494492785126
 56 | aF0.57911581348905161
 57 | aF0.58278507841009286
 58 | aF0.57745218667697251
 59 | aF0.58203875926982884
 60 | aF0.58495866143962083
 61 | aF0.58706610649185209
 62 | aF0.58951244632503708
 63 | aF0.59026069155858862
 64 | aF0.58883494292550931
 65 | aF0.58831417421830079
 66 | aF0.59102430979134868
 67 | aF0.5911804143343059
 68 | aF0.59126462484225051
 69 | aF0.59358746429763165
 70 | aF0.59187843140612617
 71 | aF0.59412353717009891
 72 | aF0.59127930606673385
 73 | aF0.59285352042928452
 74 | aF0.59559156222164167
 75 | aF0.59379898303023559
 76 | aF0.60248618395778653
 77 | aF0.59384720141066505
 78 | aF0.59376182375289477
 79 | aF0.59483540612433117
 80 | aF0.59954501808983662
 81 | aF0.59990203267749709
 82 | aF0.60332997368419095
 83 | aF0.60215894841141249
 84 | aF0.60352467419032307
 85 | aF0.6013315578396784
 86 | aF0.60494560460416125
 87 | aF0.59908424648591796
 88 | aF0.60016198505966334
 89 | aF0.60801929345256411
 90 | aF0.60205314238629593
 91 | aF0.60094656337983576
 92 | aF0.60569322731387798
 93 | aF0.60137663725716728
 94 | aF0.60447554787780311
 95 | aF0.6023968738365173
 96 | aF0.60445867473371473
 97 | aF0.60402523782931206
 98 | aF0.60584654496739965
 99 | aF0.60650858956193743
100 | aF0.60272262198958249
101 | aF0.61085598334544078
102 | aF0.60684081983158378
103 | aF0.60587199343224851
104 | aF0.6064372797156411
105 | aF0.61030070725962771
106 | aF0.6093417010233797
107 | aF0.60805388440391828
108 | aF0.6020004209514298
109 | aF0.61043376214203138
110 | aF0.60671576045281872
111 | aF0.61002699552743667
112 | aF0.60464068114185543
113 | aF0.60580018038988437
114 | aF0.60981663264733621
115 | aF0.60877942352291792
116 | aF0.61291674770910931
117 | aF0.61309914625801043
118 | aF0.60996962332835791
119 | aF0.61201405304123013
120 | aF0.60841555061617714
121 | aF0.61573573757223254
122 | aF0.61318783301342683
123 | aF0.60849975642279663
124 | aF0.61252675557034453
125 | aF0.60898045692066782
126 | aF0.61504865645318463
127 | aF0.60710668744610208
128 | aF0.61091470259637271
129 | aF0.61200473084388596
130 | aF0.6099081398355648
131 | aF0.61536424101384635
132 | aF0.61421700746185792
133 | aF0.61643166727267151
134 | aF0.61445942019481259
135 | aF0.61502708829375341
136 | aF0.61197734689484284
137 | aF0.61316103926709153
138 | aF0.61581535445513302
139 | aF0.61867682178305294
140 | aF0.61490103693948694
141 | aF0.6151495679583342
142 | aF0.61595917782039489
143 | aF0.61430860486926553
144 | aF0.61704711742905327
145 | aF0.61709909839235777
146 | aF0.61258774792155035
147 | aF0.61482858860084921
148 | aF0.61504181807003588
149 | aF0.61979197468898661
150 | aF0.61864272422277389
151 | aF0.61513322327342745
152 | aF0.61295431012147183
153 | aF0.61541712918827107
154 | aF0.61726156954047251
155 | aF0.61754589219786915
156 | aF0.61468076187160803
157 | aF0.61265300280943291
158 | aF0.6168941181595029
159 | aF0.61919901824762136
160 | aF0.61406508562514361
161 | aF0.61502761262377836
162 | aF0.61940977597170821
163 | aF0.61748437661042355
164 | aF0.61693903992142263
165 | aF0.61599967048221671
166 | aF0.61789053547752348
167 | aF0.6183240233733942
168 | aF0.61771838981498883
169 | aF0.61922813949850863
170 | aF0.61450439534501533
171 | aF0.61973030145040742
172 | aF0.61818458359418482
173 | aF0.6157284722486378
174 | aF0.61756890018497412
175 | aF0.61230920347519824
176 | aF0.61686833730123991
177 | aF0.61605411838141333
178 | aF0.61315340461893242
179 | aF0.61646795673946864
180 | aF0.61612208193854168
181 | aF0.61976107823057192
182 | aF0.61380989244511042
183 | aF0.61411410577631032
184 | aF0.61552782708664289
185 | aF0.61152383713225156
186 | aF0.61823178449047611
187 | aF0.61753378676799853
188 | aF0.61619993806514584
189 | aF0.62098948977751711
190 | aF0.61458008818578691
191 | aF0.61736647496168917
192 | aF0.61426458254321592
193 | aF0.61346741040548691
194 | aF0.61660466734297126
195 | aF0.61405728049556663
196 | aF0.61488851431290303
197 | aF0.61849654967711598
198 | aF0.61557906777518223
199 | aF0.61408542908871278
200 | aF0.61607072636777938
201 | aF0.61997841128965314
202 | aF0.61518709141594352
203 | aF0.62030651352035393
204 | aF0.61730140591390059
205 | aF0.61763072287861132
206 | aF0.61475577921228552
207 | aF0.61801016609292014
208 | aF0.61210736598709281
209 | aF0.61659054111000178
210 | aF0.61210331912518356
211 | aF0.61296183498816936
212 | aF0.61783967210400192
213 | aF0.61391822633306581
214 | aF0.61556511671279812
215 | aF0.61938574320350381
216 | aF0.62007090827931111
217 | aF0.61384994461502429
218 | aF0.61500100454818185
219 | aF0.61930892097903145
220 | aF0.61500605774310835
221 | aF0.61638415864859331
222 | aF0.6177303796053345
223 | aF0.61212374229375988
224 | aF0.61247389336872149
225 | aF0.61760359835095147
226 | aF0.61731902973593111
227 | aF0.61841611686862852
228 | aF0.61256405644606771
229 | aF0.61802499832904678
230 | aF0.61760266436194977
231 | aF0.61660486229631017
232 | aF0.62044394356209343
233 | aF0.61546395818335875
234 | aF0.6139598334367975
235 | aF0.61769434235330445
236 | aF0.61695063768578651
237 | aF0.61738899642248457
238 | aF0.61748494771049667
239 | aF0.61645394369516349
240 | aF0.61838848996504037
241 | aF0.62086840544332933
242 | aF0.61543881095760411
243 | aF0.61794125716157011
244 | aF0.61819097069341933
245 | aF0.62225844703921607
246 | aF0.61628904558898612
247 | aF0.6114189319235831
248 | aF0.61394621850444564
249 | aF0.61457811381292315
250 | aF0.62035333536351489
251 | aF0.61688896948080474
252 | a.


--------------------------------------------------------------------------------
/coco_caption/Bleu_3.pkl:
--------------------------------------------------------------------------------
  1 | (lp1
  2 | F0.29707668433830187
  3 | aF0.33579709089922882
  4 | aF0.34468956821642355
  5 | aF0.35316911577671461
  6 | aF0.35567269818131536
  7 | aF0.36026177660692654
  8 | aF0.36471407910280479
  9 | aF0.36844262557330587
 10 | aF0.37220932639239973
 11 | aF0.37655673987251204
 12 | aF0.37035683680989817
 13 | aF0.37383122750940773
 14 | aF0.38748195100330884
 15 | aF0.38170775143175978
 16 | aF0.38555749119890365
 17 | aF0.38731723488565623
 18 | aF0.3946590294047751
 19 | aF0.39315025222532585
 20 | aF0.39256515373344114
 21 | aF0.39467263589120877
 22 | aF0.39696270707495923
 23 | aF0.39833479893545831
 24 | aF0.39487779046538279
 25 | aF0.40155629980022378
 26 | aF0.39885564800104872
 27 | aF0.402607878910128
 28 | aF0.40497159270308258
 29 | aF0.40795938013325189
 30 | aF0.40344632387585178
 31 | aF0.40637259660635661
 32 | aF0.41407510930333169
 33 | aF0.4102150656455672
 34 | aF0.41338795373053411
 35 | aF0.41467297534600533
 36 | aF0.41516443031850281
 37 | aF0.42669972153917346
 38 | aF0.41821149081195041
 39 | aF0.42038574304194831
 40 | aF0.42513526251520656
 41 | aF0.42370485724669033
 42 | aF0.42488531017056119
 43 | aF0.42293690849431748
 44 | aF0.42667059727500384
 45 | aF0.42586863147294762
 46 | aF0.42836885025141436
 47 | aF0.4299897907257495
 48 | aF0.43411806746351622
 49 | aF0.43056178438544851
 50 | aF0.43563695720324735
 51 | aF0.4345403419284995
 52 | aF0.43522214987795199
 53 | aF0.43796801751982894
 54 | aF0.43571154528320061
 55 | aF0.43636970073462372
 56 | aF0.43705544038470123
 57 | aF0.44065071575633136
 58 | aF0.435368739574028
 59 | aF0.43961773731432013
 60 | aF0.44183301787037665
 61 | aF0.44367101901501016
 62 | aF0.44872928677342272
 63 | aF0.44816791129017247
 64 | aF0.44742686100836998
 65 | aF0.44632863652345256
 66 | aF0.44999945503612349
 67 | aF0.44874574716998011
 68 | aF0.44921021562245445
 69 | aF0.45245486710269617
 70 | aF0.45080258927390493
 71 | aF0.4517288056115108
 72 | aF0.45044330018954776
 73 | aF0.45315527170499748
 74 | aF0.45515082040339
 75 | aF0.45256258077690009
 76 | aF0.46140213355011939
 77 | aF0.45340587413910699
 78 | aF0.45358279690486486
 79 | aF0.45457779339247517
 80 | aF0.45946683687786549
 81 | aF0.46015771281470458
 82 | aF0.46459354755904542
 83 | aF0.46353604874142995
 84 | aF0.46485315789905229
 85 | aF0.46229484283542849
 86 | aF0.46683625217591457
 87 | aF0.4600210089069427
 88 | aF0.46163411012099803
 89 | aF0.46952465502189561
 90 | aF0.46353144069095609
 91 | aF0.46138700496076562
 92 | aF0.46805364305382735
 93 | aF0.46261957473479265
 94 | aF0.46822507216697445
 95 | aF0.46573614820349651
 96 | aF0.46646351011947462
 97 | aF0.46808312770932858
 98 | aF0.4681009629140841
 99 | aF0.46924266963178407
100 | aF0.46433432700446597
101 | aF0.47395576019098851
102 | aF0.46988376324318432
103 | aF0.46803415290875722
104 | aF0.46871674087641396
105 | aF0.47439412173129303
106 | aF0.47246776641391586
107 | aF0.47247615078372701
108 | aF0.46404701465608966
109 | aF0.47357505531118094
110 | aF0.47198794292863699
111 | aF0.47334742622035336
112 | aF0.46985141348151238
113 | aF0.46863596911374922
114 | aF0.47534943990324519
115 | aF0.47217255313131457
116 | aF0.47675171005429551
117 | aF0.4769935641462848
118 | aF0.47488847318383937
119 | aF0.47772451287728834
120 | aF0.47385830175620297
121 | aF0.48120754995513088
122 | aF0.47831364406089844
123 | aF0.47285442227793889
124 | aF0.47875806638914192
125 | aF0.47443960821323983
126 | aF0.48247941147377782
127 | aF0.47228211995405689
128 | aF0.4757772503862297
129 | aF0.47804594214004742
130 | aF0.47603058967586698
131 | aF0.48186828438554485
132 | aF0.48125505187767553
133 | aF0.48268839434006816
134 | aF0.48060548983626122
135 | aF0.48185027200671937
136 | aF0.47887654320083178
137 | aF0.47982017328949261
138 | aF0.48288759752518839
139 | aF0.48531943145948719
140 | aF0.4821206819945793
141 | aF0.48170729227927089
142 | aF0.48325775033737572
143 | aF0.48090517713054992
144 | aF0.48413361675400879
145 | aF0.4851792890835338
146 | aF0.47916535165014268
147 | aF0.48267193300728001
148 | aF0.48194793035054923
149 | aF0.48872452714090736
150 | aF0.48591065259487948
151 | aF0.48277714764935498
152 | aF0.48087170207199137
153 | aF0.4837870721360194
154 | aF0.48524386978756578
155 | aF0.48705775470024276
156 | aF0.48266689387815853
157 | aF0.48001752849599644
158 | aF0.48472744672697665
159 | aF0.48754804891825843
160 | aF0.48193539406653291
161 | aF0.48463268530603482
162 | aF0.48784859837816424
163 | aF0.48608657030900687
164 | aF0.48496775726120694
165 | aF0.48621079480351659
166 | aF0.48701075729393739
167 | aF0.48770161730704686
168 | aF0.48711711959918164
169 | aF0.48906596202391678
170 | aF0.48341538387693245
171 | aF0.48954999335676047
172 | aF0.48902007741894865
173 | aF0.48596501774342127
174 | aF0.48741199016181458
175 | aF0.48195482010735796
176 | aF0.48620208346808602
177 | aF0.48718594829361511
178 | aF0.48442536182635992
179 | aF0.48723060252053435
180 | aF0.4865840457623945
181 | aF0.49040598219548953
182 | aF0.48411732663292323
183 | aF0.48542463554923942
184 | aF0.48691735953048676
185 | aF0.48221630463602011
186 | aF0.48897344960543215
187 | aF0.48925494480211323
188 | aF0.48734175849587807
189 | aF0.49382841715209486
190 | aF0.48525781916637306
191 | aF0.48952731180161335
192 | aF0.48489412473669607
193 | aF0.48512672089773834
194 | aF0.48773814645547636
195 | aF0.48511301628691544
196 | aF0.48515441018364269
197 | aF0.4911020151164302
198 | aF0.48774254427619224
199 | aF0.48642747994994562
200 | aF0.48797363436906499
201 | aF0.49245763106373164
202 | aF0.48815789423151973
203 | aF0.49445600842164888
204 | aF0.49117836208991322
205 | aF0.49105485743189986
206 | aF0.48766770615777566
207 | aF0.49168562113874731
208 | aF0.48461555235548326
209 | aF0.49017626894557875
210 | aF0.4860620150849187
211 | aF0.48701753957782395
212 | aF0.49082798212201251
213 | aF0.4876528482555253
214 | aF0.48980980128093132
215 | aF0.4929782608313194
216 | aF0.49502570552137382
217 | aF0.48648368075893822
218 | aF0.48956480171272349
219 | aF0.49366981705435337
220 | aF0.48835059095870148
221 | aF0.49053050478489696
222 | aF0.49372569309352365
223 | aF0.48715792153404475
224 | aF0.48702718177864163
225 | aF0.49273637239595935
226 | aF0.49203436512730986
227 | aF0.49321264770944984
228 | aF0.48763071331968477
229 | aF0.49344111229212106
230 | aF0.49260384260787449
231 | aF0.4912120228436303
232 | aF0.49561553536202341
233 | aF0.49143099524151251
234 | aF0.48810871842778192
235 | aF0.4935120176770717
236 | aF0.49019144022485189
237 | aF0.49313654782706051
238 | aF0.49396248107002222
239 | aF0.49139490752246073
240 | aF0.49398297645157413
241 | aF0.49767769550512825
242 | aF0.49079368869714263
243 | aF0.49303917546688714
244 | aF0.49310343192174433
245 | aF0.49824166501019007
246 | aF0.49207240322236639
247 | aF0.48590100958044885
248 | aF0.48890793373520286
249 | aF0.49177203053584628
250 | aF0.49592014401867918
251 | aF0.49230301633292395
252 | a.


--------------------------------------------------------------------------------
/coco_caption/Bleu_4.pkl:
--------------------------------------------------------------------------------
  1 | (lp1
  2 | F0.19103731907605284
  3 | aF0.22683608421029841
  4 | aF0.23542856254239006
  5 | aF0.24459713551700563
  6 | aF0.24886658872084744
  7 | aF0.25220417267130463
  8 | aF0.25457798072775245
  9 | aF0.26031391053687997
 10 | aF0.26420452058950622
 11 | aF0.2659221673056883
 12 | aF0.2609398704759891
 13 | aF0.2653344724430638
 14 | aF0.27584938913452517
 15 | aF0.27237052723108879
 16 | aF0.27479138411264281
 17 | aF0.27797801774011011
 18 | aF0.28434626362579979
 19 | aF0.28405411752975479
 20 | aF0.28204020906024119
 21 | aF0.28593043526171896
 22 | aF0.28787021905181986
 23 | aF0.28805930784022998
 24 | aF0.28614477285136153
 25 | aF0.29181514811757781
 26 | aF0.28874460804022045
 27 | aF0.29247434454664928
 28 | aF0.29326186082116695
 29 | aF0.29634960178063929
 30 | aF0.29345886529842169
 31 | aF0.29650693146541662
 32 | aF0.30227060494896868
 33 | aF0.29855928246669844
 34 | aF0.30284894566488907
 35 | aF0.30260955963716535
 36 | aF0.30392599304098794
 37 | aF0.31370216448042576
 38 | aF0.30521265145259086
 39 | aF0.30993010433052742
 40 | aF0.31441224952349478
 41 | aF0.31192129088018899
 42 | aF0.31409217087999597
 43 | aF0.31076541060548568
 44 | aF0.31587582541598569
 45 | aF0.31537510823941756
 46 | aF0.3182303877321383
 47 | aF0.32096154055415882
 48 | aF0.32356939088295256
 49 | aF0.31907307143028468
 50 | aF0.32403800046394565
 51 | aF0.32287272311748239
 52 | aF0.32315344964331982
 53 | aF0.32640381048178768
 54 | aF0.3241802927116072
 55 | aF0.32610084694110908
 56 | aF0.32659814021512662
 57 | aF0.32997586663799006
 58 | aF0.32608335039896919
 59 | aF0.32858256719319212
 60 | aF0.33162553092873226
 61 | aF0.33210663715736538
 62 | aF0.33869578316024507
 63 | aF0.33722396656915149
 64 | aF0.33708481990948791
 65 | aF0.33567921379277449
 66 | aF0.33876934289134653
 67 | aF0.3371020407956643
 68 | aF0.33760526283255243
 69 | aF0.34166908601292334
 70 | aF0.3394921246698111
 71 | aF0.34054519042631148
 72 | aF0.34001763209791031
 73 | aF0.34277039194025344
 74 | aF0.34451105787295644
 75 | aF0.34200696303712547
 76 | aF0.34973774603210861
 77 | aF0.34230101070171876
 78 | aF0.34382464607014346
 79 | aF0.34457167675816985
 80 | aF0.34921177041120893
 81 | aF0.34986107855224013
 82 | aF0.35394763736969825
 83 | aF0.35372788977751429
 84 | aF0.3545722007597692
 85 | aF0.35297960676403084
 86 | aF0.35785848228376343
 87 | aF0.35115096945545998
 88 | aF0.35240026852564793
 89 | aF0.36015821930253972
 90 | aF0.35419418837069244
 91 | aF0.35222748106862894
 92 | aF0.35900355377876059
 93 | aF0.35270470018579997
 94 | aF0.36017948097716079
 95 | aF0.3571520904537403
 96 | aF0.35784430443087445
 97 | aF0.36033686539157744
 98 | aF0.35936983184100374
 99 | aF0.36103735701037065
100 | aF0.35468969311289089
101 | aF0.36481868731861383
102 | aF0.3610758660360579
103 | aF0.35833139721326762
104 | aF0.35935112724555651
105 | aF0.36681616517358284
106 | aF0.36423449763547083
107 | aF0.36436438155127382
108 | aF0.35466488745131275
109 | aF0.3638962331113289
110 | aF0.3648536348390829
111 | aF0.36538553017534126
112 | aF0.36289910709830586
113 | aF0.35994929215702953
114 | aF0.36733874497939828
115 | aF0.36397063140582475
116 | aF0.36835167441436145
117 | aF0.36867146530705597
118 | aF0.36726539384222662
119 | aF0.36981564654375382
120 | aF0.36681805279447088
121 | aF0.37403727775498979
122 | aF0.37082434601287384
123 | aF0.36561052287422757
124 | aF0.37239070490596776
125 | aF0.36741687886384988
126 | aF0.37648565964725067
127 | aF0.36583814676680609
128 | aF0.36820048588383369
129 | aF0.37085865562520698
130 | aF0.36971634701005907
131 | aF0.37524245754728314
132 | aF0.37519371420589775
133 | aF0.37569237627230612
134 | aF0.37505862408692603
135 | aF0.37494140621179811
136 | aF0.37271055324782215
137 | aF0.37381762178049976
138 | aF0.37645900557431977
139 | aF0.37860386900412352
140 | aF0.37667114598782409
141 | aF0.37451087901501978
142 | aF0.37831955612685131
143 | aF0.37467437676980575
144 | aF0.37825443356479066
145 | aF0.37993701658177981
146 | aF0.37348682450987941
147 | aF0.37795797618624111
148 | aF0.37645504797177287
149 | aF0.38387768811695189
150 | aF0.3799090659899263
151 | aF0.37685270057549286
152 | aF0.37596702261272047
153 | aF0.37845321062600784
154 | aF0.3803725963060608
155 | aF0.3822563128330731
156 | aF0.3776679861024328
157 | aF0.37493680540871699
158 | aF0.37993441984854709
159 | aF0.38199833532514771
160 | aF0.3776239901111067
161 | aF0.38090435829260305
162 | aF0.38304336514433907
163 | aF0.38140001342240798
164 | aF0.37942568337606175
165 | aF0.3831993228670541
166 | aF0.38252106021586818
167 | aF0.38417860160467121
168 | aF0.38250562004420691
169 | aF0.38423758350699044
170 | aF0.37950946045216399
171 | aF0.38616452829602393
172 | aF0.3853229532735552
173 | aF0.38278297504880476
174 | aF0.38341932995367961
175 | aF0.37845852212585496
176 | aF0.38258576524342491
177 | aF0.38457802752250619
178 | aF0.38202628685584999
179 | aF0.38541258589721178
180 | aF0.38245170915044602
181 | aF0.38745939261986095
182 | aF0.38207188761124977
183 | aF0.38237760969105117
184 | aF0.38384320986974424
185 | aF0.38015615863298075
186 | aF0.38614121306192073
187 | aF0.38705960837141862
188 | aF0.38545673597270552
189 | aF0.39340474858364699
190 | aF0.38254301091323906
191 | aF0.38711020214855368
192 | aF0.382811340276414
193 | aF0.3837974756188956
194 | aF0.38568903486431172
195 | aF0.38317133098419892
196 | aF0.38212551706633319
197 | aF0.39003297824229449
198 | aF0.38564809086809521
199 | aF0.38560682479375591
200 | aF0.38661079526224212
201 | aF0.39081229235860743
202 | aF0.387753271729882
203 | aF0.39427041718921269
204 | aF0.39120592522865488
205 | aF0.39103004797818908
206 | aF0.38733975067270082
207 | aF0.39210506439882942
208 | aF0.38463518152520504
209 | aF0.39002497769459588
210 | aF0.38688767829011556
211 | aF0.38782503967563448
212 | aF0.39050733830750117
213 | aF0.38783629528341018
214 | aF0.39075195624970827
215 | aF0.39231135321734867
216 | aF0.39621080075968301
217 | aF0.38519496993855645
218 | aF0.39095855075647445
219 | aF0.39464032402579052
220 | aF0.38878908563787218
221 | aF0.39139149780827431
222 | aF0.39569559841695606
223 | aF0.38962369059152524
224 | aF0.38811829689914212
225 | aF0.39358296190106823
226 | aF0.39262015735556394
227 | aF0.39411030708781664
228 | aF0.3897808831776835
229 | aF0.39429106708871375
230 | aF0.39371099054405934
231 | aF0.39286280253296424
232 | aF0.39659726531556033
233 | aF0.39438239328877034
234 | aF0.38857711679154944
235 | aF0.39497591048562675
236 | aF0.3903709293200292
237 | aF0.39519066565046018
238 | aF0.39655001015540858
239 | aF0.39271192457138304
240 | aF0.39663870194427303
241 | aF0.39977504950447323
242 | aF0.3923138889326509
243 | aF0.39466442112198152
244 | aF0.39469625701331201
245 | aF0.4003041739901127
246 | aF0.39477252852653533
247 | aF0.38676718932616477
248 | aF0.39068639404569161
249 | aF0.39480118886047777
250 | aF0.39771579914878369
251 | aF0.39478737966100647
252 | a.


--------------------------------------------------------------------------------
/coco_caption/CIDEr.pkl:
--------------------------------------------------------------------------------
   1 | (lp1
   2 | cnumpy.core.multiarray
   3 | scalar
   4 | p2
   5 | (cnumpy
   6 | dtype
   7 | p3
   8 | (S'f8'
   9 | I0
  10 | I1
  11 | tRp4
  12 | (I3
  13 | S'<'
  14 | NNNI-1
  15 | I-1
  16 | I0
  17 | tbS'a\xb6#V6p\xe1?'
  18 | tRp5
  19 | ag2
  20 | (g4
  21 | S'sW\x1c\xb5S\xe5\xe5?'
  22 | tRp6
  23 | ag2
  24 | (g4
  25 | S'^oE\x18~\xaa\xe7?'
  26 | tRp7
  27 | ag2
  28 | (g4
  29 | S'\xc31\x0f\xf8)\xe9\xe8?'
  30 | tRp8
  31 | ag2
  32 | (g4
  33 | S'V\x03;\xcd\xda}\xe9?'
  34 | tRp9
  35 | ag2
  36 | (g4
  37 | S'\xf0g\xfa\xd1o\t\xea?'
  38 | tRp10
  39 | ag2
  40 | (g4
  41 | S'\x9c\xdb\xb9\xea\x94}\xea?'
  42 | tRp11
  43 | ag2
  44 | (g4
  45 | S'\x92\x94Y\x8e\x1c\xd9\xea?'
  46 | tRp12
  47 | ag2
  48 | (g4
  49 | S'\x05\r\xd8(\xd8\x10\xeb?'
  50 | tRp13
  51 | ag2
  52 | (g4
  53 | S'\xee\x9a(D$\xb6\xeb?'
  54 | tRp14
  55 | ag2
  56 | (g4
  57 | S'w\x86\x03\r6i\xeb?'
  58 | tRp15
  59 | ag2
  60 | (g4
  61 | S'V\xc2\x8b\x8e%\xd2\xeb?'
  62 | tRp16
  63 | ag2
  64 | (g4
  65 | S'\xa9Oo\xda\xc6\x9b\xec?'
  66 | tRp17
  67 | ag2
  68 | (g4
  69 | S'_\x99\xa0\xfd\xe0\xc4\xec?'
  70 | tRp18
  71 | ag2
  72 | (g4
  73 | S'\x15\x95$WEI\xed?'
  74 | tRp19
  75 | ag2
  76 | (g4
  77 | S'M\x1c\x13\xbc\x9a;\xed?'
  78 | tRp20
  79 | ag2
  80 | (g4
  81 | S'If\xea\x0f\xd3\xc5\xed?'
  82 | tRp21
  83 | ag2
  84 | (g4
  85 | S'\xb3:,Bf\x9c\xed?'
  86 | tRp22
  87 | ag2
  88 | (g4
  89 | S'b1Kp\xf4\xec\xed?'
  90 | tRp23
  91 | ag2
  92 | (g4
  93 | S'\x95qd<\xbaB\xee?'
  94 | tRp24
  95 | ag2
  96 | (g4
  97 | S'\x9chw\x1fE5\xee?'
  98 | tRp25
  99 | ag2
 100 | (g4
 101 | S'\xc0sN\xe7c\x9a\xee?'
 102 | tRp26
 103 | ag2
 104 | (g4
 105 | S'\x1c\x7f\x88\x0fQv\xee?'
 106 | tRp27
 107 | ag2
 108 | (g4
 109 | S'\xebK\xd8\xe2\x14\x17\xef?'
 110 | tRp28
 111 | ag2
 112 | (g4
 113 | S'\x9c<1\xce.\xd7\xee?'
 114 | tRp29
 115 | ag2
 116 | (g4
 117 | S'\xe7\\Y\xd3\xfb;\xef?'
 118 | tRp30
 119 | ag2
 120 | (g4
 121 | S'|E\x9c\xd9\xbf\x8d\xef?'
 122 | tRp31
 123 | ag2
 124 | (g4
 125 | S'_\xeb_\x06@\xfe\xef?'
 126 | tRp32
 127 | ag2
 128 | (g4
 129 | S'\xf7\x8e\xac\xd0\xdf\x83\xef?'
 130 | tRp33
 131 | ag2
 132 | (g4
 133 | S'[\xd3\xa5\xe3\xc2\xcc\xef?'
 134 | tRp34
 135 | ag2
 136 | (g4
 137 | S'\x18Z\x90L\x10\x1f\xf0?'
 138 | tRp35
 139 | ag2
 140 | (g4
 141 | S'\xbd\xadcl\x14\x04\xf0?'
 142 | tRp36
 143 | ag2
 144 | (g4
 145 | S'\x13\xb8\xd5wq-\xf0?'
 146 | tRp37
 147 | ag2
 148 | (g4
 149 | S'\x8a\x11\xdc\xb7\xa5P\xf0?'
 150 | tRp38
 151 | ag2
 152 | (g4
 153 | S'\x83(\x8ce8M\xf0?'
 154 | tRp39
 155 | ag2
 156 | (g4
 157 | S'J\xf3\xd5C\x01\x9d\xf0?'
 158 | tRp40
 159 | ag2
 160 | (g4
 161 | S'[\xff\n\xbcY\x82\xf0?'
 162 | tRp41
 163 | ag2
 164 | (g4
 165 | S'\xce\xc8\x99\xb7\x8dw\xf0?'
 166 | tRp42
 167 | ag2
 168 | (g4
 169 | S'f\x85\xd1jj\xb8\xf0?'
 170 | tRp43
 171 | ag2
 172 | (g4
 173 | S'M\xbbt#\xad\xa2\xf0?'
 174 | tRp44
 175 | ag2
 176 | (g4
 177 | S'B\xb2\x9bJ\xbe\xc3\xf0?'
 178 | tRp45
 179 | ag2
 180 | (g4
 181 | S'5\x17\xb7_\x91\xa5\xf0?'
 182 | tRp46
 183 | ag2
 184 | (g4
 185 | S'PC\xf3\xd8`\xdd\xf0?'
 186 | tRp47
 187 | ag2
 188 | (g4
 189 | S'\xe6\xec;\x99\x19\xfd\xf0?'
 190 | tRp48
 191 | ag2
 192 | (g4
 193 | S'\xe3\x85\xcc\xcbY\x10\xf1?'
 194 | tRp49
 195 | ag2
 196 | (g4
 197 | S'\x12\x93\xed\xb2\xa5\x16\xf1?'
 198 | tRp50
 199 | ag2
 200 | (g4
 201 | S'%A\x7f3\xf0Q\xf1?'
 202 | tRp51
 203 | ag2
 204 | (g4
 205 | S'\x17D,M54\xf1?'
 206 | tRp52
 207 | ag2
 208 | (g4
 209 | S'\x84}\xab\xb2\xe6C\xf1?'
 210 | tRp53
 211 | ag2
 212 | (g4
 213 | S'\x85\xaf\x81|\xde>\xf1?'
 214 | tRp54
 215 | ag2
 216 | (g4
 217 | S'6\xe7;J\xd0x\xf1?'
 218 | tRp55
 219 | ag2
 220 | (g4
 221 | S'z\xfb\r\xad\x8c|\xf1?'
 222 | tRp56
 223 | ag2
 224 | (g4
 225 | S'\x9c@\xb2\x89>\x81\xf1?'
 226 | tRp57
 227 | ag2
 228 | (g4
 229 | S'\xdb4\xce\x97Fn\xf1?'
 230 | tRp58
 231 | ag2
 232 | (g4
 233 | S'_^\xa1Xc\x97\xf1?'
 234 | tRp59
 235 | ag2
 236 | (g4
 237 | S"V'\n\x05\x0c\xac\xf1?"
 238 | tRp60
 239 | ag2
 240 | (g4
 241 | S'\x10\x1d\xc8\xb5$t\xf1?'
 242 | tRp61
 243 | ag2
 244 | (g4
 245 | S'c\x90\xf5VV\x9e\xf1?'
 246 | tRp62
 247 | ag2
 248 | (g4
 249 | S'\xaaV\x86l\xcb\xb2\xf1?'
 250 | tRp63
 251 | ag2
 252 | (g4
 253 | S'\x7f\x11\xd4{\xdf\xec\xf1?'
 254 | tRp64
 255 | ag2
 256 | (g4
 257 | S'@\x15\x99&,\xf4\xf1?'
 258 | tRp65
 259 | ag2
 260 | (g4
 261 | S'\xf9\x93\xa1\xb0\xe8\xef\xf1?'
 262 | tRp66
 263 | ag2
 264 | (g4
 265 | S'\xd8| \xa8w\xef\xf1?'
 266 | tRp67
 267 | ag2
 268 | (g4
 269 | S's\xe4-N\xcb\xc3\xf1?'
 270 | tRp68
 271 | ag2
 272 | (g4
 273 | S's\xa1\x99\xda\x9c\x1e\xf2?'
 274 | tRp69
 275 | ag2
 276 | (g4
 277 | S'e\x84\xe9\x9a\x90\x00\xf2?'
 278 | tRp70
 279 | ag2
 280 | (g4
 281 | S'\xaeB7Js\x13\xf2?'
 282 | tRp71
 283 | ag2
 284 | (g4
 285 | S'\x08pj@5\x1a\xf2?'
 286 | tRp72
 287 | ag2
 288 | (g4
 289 | S'\x1f\x881v\xfd;\xf2?'
 290 | tRp73
 291 | ag2
 292 | (g4
 293 | S'\xab2D&\xd5+\xf2?'
 294 | tRp74
 295 | ag2
 296 | (g4
 297 | S'\xec\xddG\x85\xea\x16\xf2?'
 298 | tRp75
 299 | ag2
 300 | (g4
 301 | S"\x02\xba'=\x80M\xf2?"
 302 | tRp76
 303 | ag2
 304 | (g4
 305 | S'\xf2B\xb1H&z\xf2?'
 306 | tRp77
 307 | ag2
 308 | (g4
 309 | S'\xe7\x8c\xa0%\xcf+\xf2?'
 310 | tRp78
 311 | ag2
 312 | (g4
 313 | S'\xa1L\xe9|\x15\xa4\xf2?'
 314 | tRp79
 315 | ag2
 316 | (g4
 317 | S'w\xc6\xc6\xde\xf5P\xf2?'
 318 | tRp80
 319 | ag2
 320 | (g4
 321 | S'\t\xfa\xac\xc7\xadW\xf2?'
 322 | tRp81
 323 | ag2
 324 | (g4
 325 | S'\xce\xcc\x90\x19kM\xf2?'
 326 | tRp82
 327 | ag2
 328 | (g4
 329 | S'F\x80[\xf0\xcau\xf2?'
 330 | tRp83
 331 | ag2
 332 | (g4
 333 | S'\xcdL\xabx\xbc\x8a\xf2?'
 334 | tRp84
 335 | ag2
 336 | (g4
 337 | S'=\xd7?:3\xb9\xf2?'
 338 | tRp85
 339 | ag2
 340 | (g4
 341 | S'\x0b\xa3\x8e_\xeb\xbd\xf2?'
 342 | tRp86
 343 | ag2
 344 | (g4
 345 | S'\xa1\xe2\xd3:\xff\xd2\xf2?'
 346 | tRp87
 347 | ag2
 348 | (g4
 349 | S" '1\x84R\xab\xf2?"
 350 | tRp88
 351 | ag2
 352 | (g4
 353 | S'\xa9n\xee\x1c\xce\x03\xf3?'
 354 | tRp89
 355 | ag2
 356 | (g4
 357 | S'\xc7\x15\x00o.\xbd\xf2?'
 358 | tRp90
 359 | ag2
 360 | (g4
 361 | S'\xde\xb8\x1bO\x9c\xbf\xf2?'
 362 | tRp91
 363 | ag2
 364 | (g4
 365 | S'\xec\x91H\x8fB\x07\xf3?'
 366 | tRp92
 367 | ag2
 368 | (g4
 369 | S'\xffr\xd9\xd1\xad\xcc\xf2?'
 370 | tRp93
 371 | ag2
 372 | (g4
 373 | S'\xb7\x01Y\x04m\xb9\xf2?'
 374 | tRp94
 375 | ag2
 376 | (g4
 377 | S'\xf8ZK\x1a\x1f\xe5\xf2?'
 378 | tRp95
 379 | ag2
 380 | (g4
 381 | S'\x10\xaaT;(\xbd\xf2?'
 382 | tRp96
 383 | ag2
 384 | (g4
 385 | S'a\x95\x98\xac,1\xf3?'
 386 | tRp97
 387 | ag2
 388 | (g4
 389 | S'\xebn\xa7\xdeD\xc7\xf2?'
 390 | tRp98
 391 | ag2
 392 | (g4
 393 | S'\xa5\xe2\xf0\xd9\xe9\x01\xf3?'
 394 | tRp99
 395 | ag2
 396 | (g4
 397 | S'\xd3\xb2jBP\x05\xf3?'
 398 | tRp100
 399 | ag2
 400 | (g4
 401 | S'TS\x92.\xf2\x08\xf3?'
 402 | tRp101
 403 | ag2
 404 | (g4
 405 | S'5\xab\xfd\xff\xb8\xe9\xf2?'
 406 | tRp102
 407 | ag2
 408 | (g4
 409 | S'\xb8\xf2\xa7\x1c\t\xd5\xf2?'
 410 | tRp103
 411 | ag2
 412 | (g4
 413 | S'\xf2\xc5w\x8a\xc1E\xf3?'
 414 | tRp104
 415 | ag2
 416 | (g4
 417 | S'\xd7\xc8\x16\xa3U\x0e\xf3?'
 418 | tRp105
 419 | ag2
 420 | (g4
 421 | S'T\x1c\xa7,o\xef\xf2?'
 422 | tRp106
 423 | ag2
 424 | (g4
 425 | S'*\xf1 \x9f\xf4\xfa\xf2?'
 426 | tRp107
 427 | ag2
 428 | (g4
 429 | S'.\x97J\xa6\xa6\\\xf3?'
 430 | tRp108
 431 | ag2
 432 | (g4
 433 | S'\xe8\xc1P:\x90?\xf3?'
 434 | tRp109
 435 | ag2
 436 | (g4
 437 | S'\xf9A0\xdf\xaaI\xf3?'
 438 | tRp110
 439 | ag2
 440 | (g4
 441 | S'\xdc\xb8\xbf\x05,\xd2\xf2?'
 442 | tRp111
 443 | ag2
 444 | (g4
 445 | S'T\xee\x13\x7f\xc84\xf3?'
 446 | tRp112
 447 | ag2
 448 | (g4
 449 | S'\xb2\x9b{\xc65M\xf3?'
 450 | tRp113
 451 | ag2
 452 | (g4
 453 | S'\xe8_\x19@?{\xf3?'
 454 | tRp114
 455 | ag2
 456 | (g4
 457 | S'\xb9\xb0\xc8L\xbb\x1d\xf3?'
 458 | tRp115
 459 | ag2
 460 | (g4
 461 | S'\xea\xf9\x03_\xedX\xf3?'
 462 | tRp116
 463 | ag2
 464 | (g4
 465 | S'\xd4&\x08H\xabH\xf3?'
 466 | tRp117
 467 | ag2
 468 | (g4
 469 | S'W\xa55\xea\x18>\xf3?'
 470 | tRp118
 471 | ag2
 472 | (g4
 473 | S'2i\x02\xee\xf7U\xf3?'
 474 | tRp119
 475 | ag2
 476 | (g4
 477 | S'\x13\x9a\x10\xb7\xe9\x81\xf3?'
 478 | tRp120
 479 | ag2
 480 | (g4
 481 | S'x\xfe\xfe\x0e\tQ\xf3?'
 482 | tRp121
 483 | ag2
 484 | (g4
 485 | S'\x9br\xa0qK~\xf3?'
 486 | tRp122
 487 | ag2
 488 | (g4
 489 | S'\xe8\xb1\x8a0\xfdO\xf3?'
 490 | tRp123
 491 | ag2
 492 | (g4
 493 | S'\xc0\xfc\x01\xe8\x85\xc1\xf3?'
 494 | tRp124
 495 | ag2
 496 | (g4
 497 | S'\xd09\xbde\xa6g\xf3?'
 498 | tRp125
 499 | ag2
 500 | (g4
 501 | S'\xa4n=\xd3\x1b9\xf3?'
 502 | tRp126
 503 | ag2
 504 | (g4
 505 | S'i\x9f.\xc0\xaf\x98\xf3?'
 506 | tRp127
 507 | ag2
 508 | (g4
 509 | S'E\xa0J\x1e\x16f\xf3?'
 510 | tRp128
 511 | ag2
 512 | (g4
 513 | S'{\xaf\xec\xddT\xc4\xf3?'
 514 | tRp129
 515 | ag2
 516 | (g4
 517 | S'1rIR\xf6\\\xf3?'
 518 | tRp130
 519 | ag2
 520 | (g4
 521 | S'\xa1\xd8Y!\x1e\x8c\xf3?'
 522 | tRp131
 523 | ag2
 524 | (g4
 525 | S'\x92\xd0\xba\xdd"\x80\xf3?'
 526 | tRp132
 527 | ag2
 528 | (g4
 529 | S'\xa3\x94\xc7\x8eV\x8f\xf3?'
 530 | tRp133
 531 | ag2
 532 | (g4
 533 | S'\xd11Xq\xb9\xbe\xf3?'
 534 | tRp134
 535 | ag2
 536 | (g4
 537 | S'\xe0\x9e\x0fn\xe3\xaf\xf3?'
 538 | tRp135
 539 | ag2
 540 | (g4
 541 | S'\x0f\xf7\x1eQ\xcc\xc4\xf3?'
 542 | tRp136
 543 | ag2
 544 | (g4
 545 | S'E[\x0f\x80m\xa7\xf3?'
 546 | tRp137
 547 | ag2
 548 | (g4
 549 | S'\xefA\x18\xc2\x01\xb7\xf3?'
 550 | tRp138
 551 | ag2
 552 | (g4
 553 | S'\x15)\xbd\x97\xcf\x97\xf3?'
 554 | tRp139
 555 | ag2
 556 | (g4
 557 | S'\x88|Xd\x03\xaa\xf3?'
 558 | tRp140
 559 | ag2
 560 | (g4
 561 | S'\xe4\nV\xd9\x99\xaa\xf3?'
 562 | tRp141
 563 | ag2
 564 | (g4
 565 | S'\x9f0\xf4i*\xd0\xf3?'
 566 | tRp142
 567 | ag2
 568 | (g4
 569 | S'\x1b\x1c\x86\x16?\xbf\xf3?'
 570 | tRp143
 571 | ag2
 572 | (g4
 573 | S'eqgj\xd9\xba\xf3?'
 574 | tRp144
 575 | ag2
 576 | (g4
 577 | S'\xbc\x88\x82\xa4f\xea\xf3?'
 578 | tRp145
 579 | ag2
 580 | (g4
 581 | S'a\xc4\xe0M\x1f\xbe\xf3?'
 582 | tRp146
 583 | ag2
 584 | (g4
 585 | S'\x99\xff\x19\xa3\xa2\xd3\xf3?'
 586 | tRp147
 587 | ag2
 588 | (g4
 589 | S'\x124\x96\x8a\xb4\xe4\xf3?'
 590 | tRp148
 591 | ag2
 592 | (g4
 593 | S'6#\xdan\x9d\x9b\xf3?'
 594 | tRp149
 595 | ag2
 596 | (g4
 597 | S'\xba\xd1\xd6\x8e\xb7\xcd\xf3?'
 598 | tRp150
 599 | ag2
 600 | (g4
 601 | S'BEz}H\xe5\xf3?'
 602 | tRp151
 603 | ag2
 604 | (g4
 605 | S'!\x82hfM\x02\xf4?'
 606 | tRp152
 607 | ag2
 608 | (g4
 609 | S'(\xd6\r\xbc\x8f\xfb\xf3?'
 610 | tRp153
 611 | ag2
 612 | (g4
 613 | S'=\x81/8n\xcb\xf3?'
 614 | tRp154
 615 | ag2
 616 | (g4
 617 | S'\x9c\xbe\xd8\xb0\xdc\xb6\xf3?'
 618 | tRp155
 619 | ag2
 620 | (g4
 621 | S'\xb6\x80\xb4\xae\xfe\xc3\xf3?'
 622 | tRp156
 623 | ag2
 624 | (g4
 625 | S'\xaegc\xb0\xdc\xd3\xf3?'
 626 | tRp157
 627 | ag2
 628 | (g4
 629 | S'\xa8\xa1\x9b\x94\xb6\x13\xf4?'
 630 | tRp158
 631 | ag2
 632 | (g4
 633 | S'\x8a"\x8dB;\xe8\xf3?'
 634 | tRp159
 635 | ag2
 636 | (g4
 637 | S'\x10s\xdf\xae]\xb9\xf3?'
 638 | tRp160
 639 | ag2
 640 | (g4
 641 | S'\xe0Z\xf2\x1a&\xdf\xf3?'
 642 | tRp161
 643 | ag2
 644 | (g4
 645 | S'y\xed0 \xb9\xf5\xf3?'
 646 | tRp162
 647 | ag2
 648 | (g4
 649 | S'\x1b\xa3\xbd\x14{\xea\xf3?'
 650 | tRp163
 651 | ag2
 652 | (g4
 653 | S'\x19\xfa\x1fUX\xec\xf3?'
 654 | tRp164
 655 | ag2
 656 | (g4
 657 | S'\xecU\x1f+C\x13\xf4?'
 658 | tRp165
 659 | ag2
 660 | (g4
 661 | S'\x8c\x80\xf7\xe4(\xe9\xf3?'
 662 | tRp166
 663 | ag2
 664 | (g4
 665 | S'c1ia\xd7\xb6\xf3?'
 666 | tRp167
 667 | ag2
 668 | (g4
 669 | S'\x14\xf6\xe1Og\x07\xf4?'
 670 | tRp168
 671 | ag2
 672 | (g4
 673 | S'\xa20\xd9g7\t\xf4?'
 674 | tRp169
 675 | ag2
 676 | (g4
 677 | S'\xc4P$T\xcb\xf6\xf3?'
 678 | tRp170
 679 | ag2
 680 | (g4
 681 | S'\xc6V\xcd\xec\xa4\x11\xf4?'
 682 | tRp171
 683 | ag2
 684 | (g4
 685 | S'\x97\xdc\xe3\x93I\xf5\xf3?'
 686 | tRp172
 687 | ag2
 688 | (g4
 689 | S'\xff\xfcL\xa9Q\xc8\xf3?'
 690 | tRp173
 691 | ag2
 692 | (g4
 693 | S'\x86\x02\x81\xf5B\xfc\xf3?'
 694 | tRp174
 695 | ag2
 696 | (g4
 697 | S'\xf0\xac|\xd9\xc6\n\xf4?'
 698 | tRp175
 699 | ag2
 700 | (g4
 701 | S'\xbf-\xca\\\x91\xea\xf3?'
 702 | tRp176
 703 | ag2
 704 | (g4
 705 | S'\x83p\xcb\xd6]\xe9\xf3?'
 706 | tRp177
 707 | ag2
 708 | (g4
 709 | S"?\xf5'\xb2\x0b\xd3\xf3?"
 710 | tRp178
 711 | ag2
 712 | (g4
 713 | S'\xa8\x82\xe3p\xae\xfa\xf3?'
 714 | tRp179
 715 | ag2
 716 | (g4
 717 | S'2\x8b\xe7\xf4\xc7\xf7\xf3?'
 718 | tRp180
 719 | ag2
 720 | (g4
 721 | S'R\xef[\xecR\xf1\xf3?'
 722 | tRp181
 723 | ag2
 724 | (g4
 725 | S'\xb8;\xee\x81?\xe7\xf3?'
 726 | tRp182
 727 | ag2
 728 | (g4
 729 | S'\xe9\xa8\xad\xe1\xeb\xf6\xf3?'
 730 | tRp183
 731 | ag2
 732 | (g4
 733 | S'M J\x19\xd5\x17\xf4?'
 734 | tRp184
 735 | ag2
 736 | (g4
 737 | S'B\x83\r)\xd8\xe4\xf3?'
 738 | tRp185
 739 | ag2
 740 | (g4
 741 | S'\xfd\xa0\xc7\x13\x81\xec\xf3?'
 742 | tRp186
 743 | ag2
 744 | (g4
 745 | S'\xd5\xf5\x04\x19f\x05\xf4?'
 746 | tRp187
 747 | ag2
 748 | (g4
 749 | S'6\x18\xb9\xe2q\xd4\xf3?'
 750 | tRp188
 751 | ag2
 752 | (g4
 753 | S'\xb3\xb0\x8b\xb6\x8a:\xf4?'
 754 | tRp189
 755 | ag2
 756 | (g4
 757 | S'\x8f\x00u\x07G*\xf4?'
 758 | tRp190
 759 | ag2
 760 | (g4
 761 | S'\x04:c\x99\x9f\x06\xf4?'
 762 | tRp191
 763 | ag2
 764 | (g4
 765 | S'x\xe7\x7f\xaa\x8eh\xf4?'
 766 | tRp192
 767 | ag2
 768 | (g4
 769 | S'\xfb\x1aR\x13m\xdd\xf3?'
 770 | tRp193
 771 | ag2
 772 | (g4
 773 | S'y\xd4\xc8\n\xcb\x1e\xf4?'
 774 | tRp194
 775 | ag2
 776 | (g4
 777 | S'\x8e\xf9.\xc9\xb5\xfe\xf3?'
 778 | tRp195
 779 | ag2
 780 | (g4
 781 | S'j\x86r\n\x94\x03\xf4?'
 782 | tRp196
 783 | ag2
 784 | (g4
 785 | S'\x02N\x03\x0e4\x03\xf4?'
 786 | tRp197
 787 | ag2
 788 | (g4
 789 | S'\xa5%\x16\x9e\xbf\xe4\xf3?'
 790 | tRp198
 791 | ag2
 792 | (g4
 793 | S'Mu\xe9\xdc\xf5\xec\xf3?'
 794 | tRp199
 795 | ag2
 796 | (g4
 797 | S'=\xd6\x90Yv(\xf4?'
 798 | tRp200
 799 | ag2
 800 | (g4
 801 | S'\xecL$W\xe7\x12\xf4?'
 802 | tRp201
 803 | ag2
 804 | (g4
 805 | S'\xfff\x8def\x06\xf4?'
 806 | tRp202
 807 | ag2
 808 | (g4
 809 | S'\xbb\xd2E\x11\xca\x15\xf4?'
 810 | tRp203
 811 | ag2
 812 | (g4
 813 | S'\xce\xa1\xac\x14\xab<\xf4?'
 814 | tRp204
 815 | ag2
 816 | (g4
 817 | S'jl|\x99M\x14\xf4?'
 818 | tRp205
 819 | ag2
 820 | (g4
 821 | S'\xf3vr\xb6WS\xf4?'
 822 | tRp206
 823 | ag2
 824 | (g4
 825 | S'B\xc4!\x7f\x06\x1d\xf4?'
 826 | tRp207
 827 | ag2
 828 | (g4
 829 | S'ZNI\xeb1I\xf4?'
 830 | tRp208
 831 | ag2
 832 | (g4
 833 | S'r\t%6\x8f\x11\xf4?'
 834 | tRp209
 835 | ag2
 836 | (g4
 837 | S'\xa5\xe10\xbd\xf59\xf4?'
 838 | tRp210
 839 | ag2
 840 | (g4
 841 | S'\xbe51\xffS\t\xf4?'
 842 | tRp211
 843 | ag2
 844 | (g4
 845 | S'z\xa4 \x11\xf2+\xf4?'
 846 | tRp212
 847 | ag2
 848 | (g4
 849 | S'\xd5\xca\xb4\xdb\xba\xfd\xf3?'
 850 | tRp213
 851 | ag2
 852 | (g4
 853 | S'l\xf1\x85\xadC\x04\xf4?'
 854 | tRp214
 855 | ag2
 856 | (g4
 857 | S'0\xcbo\xab\x90"\xf4?'
 858 | tRp215
 859 | ag2
 860 | (g4
 861 | S'\x84R\xebEg\xfc\xf3?'
 862 | tRp216
 863 | ag2
 864 | (g4
 865 | S'K\x1d\n\xd0\xf2/\xf4?'
 866 | tRp217
 867 | ag2
 868 | (g4
 869 | S'\x1bZ\xd4\xee\xaa?\xf4?'
 870 | tRp218
 871 | ag2
 872 | (g4
 873 | S',\x12r\xadw_\xf4?'
 874 | tRp219
 875 | ag2
 876 | (g4
 877 | S'L\xef\xb5\x00\xc7\x00\xf4?'
 878 | tRp220
 879 | ag2
 880 | (g4
 881 | S'\r\x8f\x92\x0f\x06:\xf4?'
 882 | tRp221
 883 | ag2
 884 | (g4
 885 | S'\xca:-\xfbf\\\xf4?'
 886 | tRp222
 887 | ag2
 888 | (g4
 889 | S"' \xa8\xa3\x91\xfd\xf3?"
 890 | tRp223
 891 | ag2
 892 | (g4
 893 | S'\xed\x02\xd7\xb1\xf6,\xf4?'
 894 | tRp224
 895 | ag2
 896 | (g4
 897 | S'\xa3\xa7u\xf5\x15!\xf4?'
 898 | tRp225
 899 | ag2
 900 | (g4
 901 | S'L\xe4\xd2\xd3G)\xf4?'
 902 | tRp226
 903 | ag2
 904 | (g4
 905 | S'\xf5A\xaa\xd7/\x0b\xf4?'
 906 | tRp227
 907 | ag2
 908 | (g4
 909 | S'\xcf\xac@>\xff@\xf4?'
 910 | tRp228
 911 | ag2
 912 | (g4
 913 | S'\x81x\x13Rm\x1c\xf4?'
 914 | tRp229
 915 | ag2
 916 | (g4
 917 | S'\t-n\x98"I\xf4?'
 918 | tRp230
 919 | ag2
 920 | (g4
 921 | S'\x9a\xd6|\xd0t\xfa\xf3?'
 922 | tRp231
 923 | ag2
 924 | (g4
 925 | S'\xe77\xaf$T0\xf4?'
 926 | tRp232
 927 | ag2
 928 | (g4
 929 | S')\r\x16\xb5<4\xf4?'
 930 | tRp233
 931 | ag2
 932 | (g4
 933 | S'\x1a\xce\xf3\xe0\x0b9\xf4?'
 934 | tRp234
 935 | ag2
 936 | (g4
 937 | S'i\xdf2AtC\xf4?'
 938 | tRp235
 939 | ag2
 940 | (g4
 941 | S'\xd9\xe6h\x0f\x8bC\xf4?'
 942 | tRp236
 943 | ag2
 944 | (g4
 945 | S'0 [(\x1a\x01\xf4?'
 946 | tRp237
 947 | ag2
 948 | (g4
 949 | S'\xae\x9e\x96\xcf\x80I\xf4?'
 950 | tRp238
 951 | ag2
 952 | (g4
 953 | S'\xb3\xdb\xb4g\xb5+\xf4?'
 954 | tRp239
 955 | ag2
 956 | (g4
 957 | S'\xdahk\xea\x1bJ\xf4?'
 958 | tRp240
 959 | ag2
 960 | (g4
 961 | S'\xf7i\xcdd\x93J\xf4?'
 962 | tRp241
 963 | ag2
 964 | (g4
 965 | S'.WXi\x9eE\xf4?'
 966 | tRp242
 967 | ag2
 968 | (g4
 969 | S'\x10\xa1=z\x80:\xf4?'
 970 | tRp243
 971 | ag2
 972 | (g4
 973 | S'\xd76\x7f1*o\xf4?'
 974 | tRp244
 975 | ag2
 976 | (g4
 977 | S'\x94\xdbd\x93\xab\x18\xf4?'
 978 | tRp245
 979 | ag2
 980 | (g4
 981 | S'\x11|r\xa2\x06-\xf4?'
 982 | tRp246
 983 | ag2
 984 | (g4
 985 | S'hYQ\x94\xc51\xf4?'
 986 | tRp247
 987 | ag2
 988 | (g4
 989 | S'\x84\xfd:KV~\xf4?'
 990 | tRp248
 991 | ag2
 992 | (g4
 993 | S'\xac\xaf\x01\x937;\xf4?'
 994 | tRp249
 995 | ag2
 996 | (g4
 997 | S'\xf7\x86\xf9\x9co\xfc\xf3?'
 998 | tRp250
 999 | ag2
1000 | (g4
1001 | S'q\xa5\x11H\xe4%\xf4?'
1002 | tRp251
1003 | ag2
1004 | (g4
1005 | S' \xbb\xf8\xc2\n,\xf4?'
1006 | tRp252
1007 | ag2
1008 | (g4
1009 | S'\x894\xae\xa1\xcb7\xf4?'
1010 | tRp253
1011 | ag2
1012 | (g4
1013 | S'\xc9\x8f\x919}=\xf4?'
1014 | tRp254
1015 | a.


--------------------------------------------------------------------------------
/coco_caption/METEOR.pkl:
--------------------------------------------------------------------------------
  1 | (lp1
  2 | F0.18915644594642708
  3 | aF0.20911916479687623
  4 | aF0.21949224920874325
  5 | aF0.22395937995175452
  6 | aF0.22636076772912747
  7 | aF0.23063332405294543
  8 | aF0.23168066989877056
  9 | aF0.23275986881940422
 10 | aF0.23417117399601728
 11 | aF0.2350631549515593
 12 | aF0.23678958786720067
 13 | aF0.23671099113335686
 14 | aF0.23922871999875575
 15 | aF0.23980766235480724
 16 | aF0.24250715969661582
 17 | aF0.24409128448606446
 18 | aF0.24417673283127833
 19 | aF0.24534410461950146
 20 | aF0.24717353763019165
 21 | aF0.24719646311209323
 22 | aF0.24650820758422559
 23 | aF0.250388748859595
 24 | aF0.24883898414632688
 25 | aF0.25163130016639429
 26 | aF0.24963518611164856
 27 | aF0.25137443983486046
 28 | aF0.25280151086472102
 29 | aF0.2539250468977573
 30 | aF0.25330708644786121
 31 | aF0.25590649687564843
 32 | aF0.25485096265105195
 33 | aF0.25472098886275546
 34 | aF0.25725636982963301
 35 | aF0.2577444676483735
 36 | aF0.25687426372906574
 37 | aF0.26009681088862618
 38 | aF0.25996750081263781
 39 | aF0.25851554345039679
 40 | aF0.26152881766684505
 41 | aF0.25893832149455559
 42 | aF0.26152794170654592
 43 | aF0.26078061911783385
 44 | aF0.26246229451701153
 45 | aF0.26393208384860578
 46 | aF0.26333146434380406
 47 | aF0.26329484085148308
 48 | aF0.26402779454741354
 49 | aF0.26393363716423013
 50 | aF0.26562805312849591
 51 | aF0.264572119645897
 52 | aF0.26601627774989034
 53 | aF0.26632920720759801
 54 | aF0.26748625980022117
 55 | aF0.26663498870215679
 56 | aF0.26810417109650442
 57 | aF0.26916412073025509
 58 | aF0.2674172239826727
 59 | aF0.26680873225560681
 60 | aF0.26968550960583088
 61 | aF0.27003636233769723
 62 | aF0.27116095980912269
 63 | aF0.27091837362020121
 64 | aF0.27190187421724671
 65 | aF0.26940392346200703
 66 | aF0.27176744667814412
 67 | aF0.27089726905540917
 68 | aF0.27288518217035101
 69 | aF0.27274628383276189
 70 | aF0.2728966926367839
 71 | aF0.27219854188079728
 72 | aF0.27262760529799257
 73 | aF0.27505080069905191
 74 | aF0.27477331309378716
 75 | aF0.27243363100838913
 76 | aF0.27510896016998154
 77 | aF0.27502131399790919
 78 | aF0.27404601246014254
 79 | aF0.27481159756735757
 80 | aF0.27560350140208262
 81 | aF0.27514962922677899
 82 | aF0.27718844396314396
 83 | aF0.27608789049323135
 84 | aF0.27804083149494685
 85 | aF0.27725020699438457
 86 | aF0.28040352064129626
 87 | aF0.2772757474518821
 88 | aF0.27813172148719623
 89 | aF0.27954703349661797
 90 | aF0.27729982167090511
 91 | aF0.27683894481501659
 92 | aF0.28107214089699611
 93 | aF0.27790217290818026
 94 | aF0.28104024908926628
 95 | aF0.28048263204639945
 96 | aF0.28061577170556196
 97 | aF0.28023214986579476
 98 | aF0.27940242965930034
 99 | aF0.27926507426135566
100 | aF0.28019376437748233
101 | aF0.28248412298998271
102 | aF0.28046655881029953
103 | aF0.28002215564340555
104 | aF0.28046864475212863
105 | aF0.28390852552172363
106 | aF0.28296736924131477
107 | aF0.28268677077151555
108 | aF0.2798149277913497
109 | aF0.28374749773137276
110 | aF0.28284055862777119
111 | aF0.28334547401709625
112 | aF0.28240138854903724
113 | aF0.28365043720208905
114 | aF0.28296323222134556
115 | aF0.28340865376711727
116 | aF0.28375944351922833
117 | aF0.28482920079101431
118 | aF0.28285094729853871
119 | aF0.28537332875124127
120 | aF0.28369409144691493
121 | aF0.28788448500736619
122 | aF0.28417719629010885
123 | aF0.28247767281542563
124 | aF0.28610696382647222
125 | aF0.28507581771364848
126 | aF0.28774582385662278
127 | aF0.28284401083150934
128 | aF0.28526007497751038
129 | aF0.28584808770710396
130 | aF0.28676610806088709
131 | aF0.28640958412108453
132 | aF0.28593547723436086
133 | aF0.28608271026206111
134 | aF0.28661153862054706
135 | aF0.28883979698928797
136 | aF0.28681758599812696
137 | aF0.28730389532696099
138 | aF0.28662283193827115
139 | aF0.28892897219405839
140 | aF0.28759559638491083
141 | aF0.28666049624096002
142 | aF0.29011089745228869
143 | aF0.28900238335817979
144 | aF0.28860771778784133
145 | aF0.289670141090758
146 | aF0.28753610880743102
147 | aF0.28804254503597859
148 | aF0.29014480002922199
149 | aF0.29098093608245995
150 | aF0.29012750509699342
151 | aF0.28834429827938918
152 | aF0.2882065437807817
153 | aF0.28929992571507329
154 | aF0.28940006979015337
155 | aF0.29085633837764174
156 | aF0.29004617784801029
157 | aF0.28975265226997715
158 | aF0.28978342364090581
159 | aF0.28957382463597903
160 | aF0.28905715704535884
161 | aF0.29011513806956507
162 | aF0.29295667055631425
163 | aF0.29113327118771654
164 | aF0.28852864597199662
165 | aF0.29186213532220379
166 | aF0.2918781623550053
167 | aF0.29128146661360593
168 | aF0.29225980870223844
169 | aF0.29066672991543518
170 | aF0.29029890273104586
171 | aF0.29195084253980425
172 | aF0.29123553546819075
173 | aF0.29062106085118061
174 | aF0.29179804871193821
175 | aF0.28998709951209195
176 | aF0.29149185522962423
177 | aF0.29151798337369877
178 | aF0.29155177404501126
179 | aF0.29103521281804867
180 | aF0.29096331070421333
181 | aF0.29301306248005865
182 | aF0.29096500538456355
183 | aF0.28989199327667775
184 | aF0.29314262652344514
185 | aF0.29144324364037155
186 | aF0.29428756238948467
187 | aF0.2924259491287029
188 | aF0.29210513942607341
189 | aF0.29480797781844198
190 | aF0.29101766553274977
191 | aF0.29336269482609206
192 | aF0.29201278358565069
193 | aF0.29219909529639992
194 | aF0.2938912158574768
195 | aF0.2921506098701227
196 | aF0.29239299757508447
197 | aF0.29354609641852591
198 | aF0.29262711883114623
199 | aF0.29171791917485274
200 | aF0.29223746418306829
201 | aF0.29381567866483349
202 | aF0.29290370883214206
203 | aF0.29505681546520424
204 | aF0.29429962868558845
205 | aF0.29460775300456488
206 | aF0.29238086027195909
207 | aF0.29528543129374024
208 | aF0.29341402386920695
209 | aF0.29474568981359983
210 | aF0.29216358900268563
211 | aF0.2935913352931202
212 | aF0.29334145784065863
213 | aF0.29448787757565903
214 | aF0.29498898620849007
215 | aF0.29431393431643338
216 | aF0.29704984530391071
217 | aF0.2931446168207652
218 | aF0.29480617685982402
219 | aF0.29576975129065086
220 | aF0.29359336011535353
221 | aF0.29450203452857454
222 | aF0.29521354194283322
223 | aF0.29281448737344823
224 | aF0.29482487461515422
225 | aF0.29571226574885395
226 | aF0.29504088341902712
227 | aF0.29540440838582394
228 | aF0.29286989337211972
229 | aF0.29518663766845959
230 | aF0.2966833412001017
231 | aF0.295448470122119
232 | aF0.29711422648721347
233 | aF0.29711734749392577
234 | aF0.29627884382012593
235 | aF0.29639846161174005
236 | aF0.29497700766905133
237 | aF0.29605339757395865
238 | aF0.29650222850027647
239 | aF0.29612326748964929
240 | aF0.29621738509928525
241 | aF0.29881187108696727
242 | aF0.29557793809696425
243 | aF0.29560094518659574
244 | aF0.29561788163129465
245 | aF0.29949039672884398
246 | aF0.29586661778143331
247 | aF0.29518660763816906
248 | aF0.29576518080675901
249 | aF0.29505938092454825
250 | aF0.29744406928735179
251 | aF0.29636329553803575
252 | a.


--------------------------------------------------------------------------------
/coco_caption/draw.py:
--------------------------------------------------------------------------------
 1 | #! encoding: UTF-8
 2 | 
 3 | import cPickle as pickle
 4 | import matplotlib.pyplot as plt
 5 | 
 6 | with open("Bleu_1.pkl", "r") as f:
 7 |     Bleu_1 = pickle.load(f)
 8 | 
 9 | with open("Bleu_2.pkl", "r") as f:
10 |     Bleu_2 = pickle.load(f)
11 | 
12 | with open("Bleu_3.pkl", "r") as f:
13 |     Bleu_3 = pickle.load(f)
14 | 
15 | with open("Bleu_4.pkl", "r") as f:
16 |     Bleu_4 = pickle.load(f)
17 | 
18 | with open("METEOR.pkl", "r") as f:
19 |     METEOR = pickle.load(f)
20 | 
21 | with open("CIDEr.pkl", "r") as f:
22 |     CIDEr = pickle.load(f)
23 | 
24 | print len(Bleu_1)
25 | 
26 | plt.plot(range(0, 2*len(Bleu_1), 2), Bleu_1, label="Bleu-1", color="g")
27 | plt.plot(range(0, 2*len(Bleu_2), 2), Bleu_2, label="Bleu-2", color="r")
28 | plt.plot(range(0, 2*len(Bleu_3), 2), Bleu_3, label="Bleu-3", color="b")
29 | plt.plot(range(0, 2*len(Bleu_4), 2), Bleu_4, label="Bleu-4", color="m")
30 | plt.plot(range(0, 2*len(METEOR), 2), METEOR, label="METEOR", color="k")
31 | plt.plot(range(0, 2*len(CIDEr), 2), CIDEr, label="CIDEr", color="y")
32 | 
33 | plt.grid(True)
34 | #plt.legend(handles=[line_1, line_2])
35 | #plt.legend(bbox_to_anchor=(1.05, 1), loc=2, borderaxespad=0.)
36 | #plt.legend(handles=[line1], loc=1)
37 | plt.legend(loc=2)
38 | plt.show()
39 | #plt.savefig("tmp.png")
40 | 


--------------------------------------------------------------------------------
/coco_caption/eval_captions_results.py:
--------------------------------------------------------------------------------
 1 | #！ encoding: UTF-8
 2 | 
 3 | import os
 4 | from pycocotools.coco import COCO
 5 | from pycocoevalcap.eval import COCOEvalCap
 6 | 
 7 | annFile = "../data/train_val_all_reference.json"
 8 | resFile = "captions_val2014_results.json"
 9 | 
10 | # create coco object and cocoRes object
11 | coco = COCO(annFile)
12 | 
13 | # after generating the captions_val2014_results.json file
14 | # we call the coco caption evaluation tools
15 | cocoRes = coco.loadRes(resFile)
16 | 
17 | # create cocoEval object by taking coco and cocoRes
18 | cocoEval = COCOEvalCap(coco, cocoRes)
19 | 
20 | # evaluate on a subset of images by setting
21 | # cocoEval.params['image_id'] = cocoRes.getImgIds()
22 | # please remove this line when evaluating the full validation set
23 | cocoEval.params['image_id'] = cocoRes.getImgIds()
24 | 
25 | # evaluate results
26 | cocoEval.evaluate() 
27 | 
28 | # print output evaluation scores
29 | for metric, score in cocoEval.eval.items():
30 |     print '%s: %.3f'%(metric, score)


--------------------------------------------------------------------------------
/coco_caption/eval_image_caption.py:
--------------------------------------------------------------------------------
  1 | # encoding: UTF-8
  2 | 
  3 | import os
  4 | import sys
  5 | import glob
  6 | import random
  7 | import time
  8 | import json
  9 | from json import encoder
 10 | import numpy as np
 11 | import cPickle as pickle
 12 | import matplotlib.pyplot as plt
 13 | 
 14 | import tensorflow as tf
 15 | 
 16 | sys.path.append('./coco_caption/')
 17 | from pycocotools.coco import COCO
 18 | from pycocoevalcap.eval import COCOEvalCap
 19 | 
 20 | import ipdb
 21 | 
 22 | 
 23 | #############################################################################################################
 24 | #
 25 | # Step 1: Input: D = {(x^n, y^n): n = 1:N}
 26 | # Step 2:Train \Pi(g_{1:T} | x) using MLE on D, MLE: Maximum likehood eatimation
 27 | #
 28 | ############################################################################################################
 29 | class CNN_LSTM():
 30 |     def __init__(self,
 31 |                  n_words,
 32 |                  batch_size,
 33 |                  feats_dim,
 34 |                  project_dim,
 35 |                  lstm_size,
 36 |                  word_embed_dim,
 37 |                  lstm_step,
 38 |                  bias_init_vector=None):
 39 | 
 40 |         self.n_words = n_words
 41 |         self.batch_size = batch_size
 42 |         self.feats_dim = feats_dim
 43 |         self.project_dim = project_dim
 44 |         self.lstm_size = lstm_size
 45 |         self.word_embed_dim = word_embed_dim
 46 |         self.lstm_step = lstm_step
 47 | 
 48 |         # project the image feature vector of dimension 2048 to 512 dimension, with a linear layer
 49 |         # self.encode_img_W: 2048 x 512
 50 |         # self.encode_img_b: 512
 51 |         self.encode_img_W = tf.Variable(tf.random_uniform([feats_dim, project_dim], -0.1, 0.1), name="encode_img_W")
 52 |         self.encode_img_b = tf.zeros([project_dim], name="encode_img_b")
 53 | 
 54 |         with tf.device("/cpu:0"):
 55 |             self.Wemb = tf.Variable(tf.random_uniform([n_words, word_embed_dim], -0.1, 0.1), name="Wemb")
 56 | 
 57 |         self.lstm = tf.nn.rnn_cell.BasicLSTMCell(lstm_size, state_is_tuple=True)
 58 | 
 59 |         self.embed_word_W = tf.Variable(tf.random_uniform([lstm_size, n_words], -0.1, 0.1), name="embed_word_W")
 60 | 
 61 |         if bias_init_vector is not None:
 62 |             self.embed_word_b = tf.Variable(bias_init_vector.astype(np.float32), name="embed_word_b")
 63 |         else:
 64 |             self.embed_word_b = tf.Variable(tf.zeros([n_words]), name="embed_word_b")
 65 | 
 66 |         self.baseline_MLP_W = tf.Variable(tf.random_uniform([lstm_size, 1], -0.1, 0.1), name="baseline_MLP_W")
 67 |         self.baseline_MLP_b = tf.Variable(tf.zeros([1]), name="baseline_MLP_b")
 68 | 
 69 |     ############################################################################################################
 70 |     #
 71 |     # Class function for step 2
 72 |     #
 73 |     ############################################################################################################
 74 |     def build_model(self):
 75 |         images = tf.placeholder(tf.float32, [self.batch_size, self.feats_dim])
 76 |         sentences = tf.placeholder(tf.int32, [self.batch_size, self.lstm_step])
 77 |         masks = tf.placeholder(tf.float32, [self.batch_size, self.lstm_step])
 78 | 
 79 |         images_embed = tf.matmul(images, self.encode_img_W) + self.encode_img_b
 80 | 
 81 |         state = self.lstm.zero_state(batch_size=self.batch_size, dtype=tf.float32)
 82 | 
 83 |         loss = 0.0
 84 |         with tf.variable_scope("LSTM"):
 85 |             for i in range(0, self.lstm_step):
 86 |                 if i == 0:
 87 |                     current_emb = images_embed
 88 |                 else:
 89 |                     with tf.device("/cpu:0"):
 90 |                         current_emb = tf.nn.embedding_lookup(self.Wemb, sentences[:, i-1])
 91 | 
 92 |                 if i > 0:
 93 |                     tf.get_variable_scope().reuse_variables()
 94 | 
 95 |                 output, state = self.lstm(current_emb, state)
 96 | 
 97 |                 if i > 0:
 98 |                     labels = tf.expand_dims(sentences[:, i], 1)
 99 |                     indices = tf.expand_dims(tf.range(0, self.batch_size, 1), 1)
100 |                     concated = tf.concat(1, [indices, labels])
101 |                     onehot_labels = tf.sparse_to_dense( concated, tf.pack([self.batch_size, self.n_words]), 1.0, 0.0)
102 | 
103 |                     logit_words = tf.matmul(output, self.embed_word_W) + self.embed_word_b
104 |                     cross_entropy = tf.nn.softmax_cross_entropy_with_logits(logit_words, onehot_labels)
105 |                     cross_entropy = cross_entropy * masks[:, i]
106 |                     current_loss = tf.reduce_sum(cross_entropy)/self.batch_size
107 | 
108 |                     loss = loss + current_loss
109 |         return loss, images, sentences, masks
110 | 
111 |     def generate_model(self):
112 |         images = tf.placeholder(tf.float32, [1, self.feats_dim])
113 |         images_embed = tf.matmul(images, self.encode_img_W) + self.encode_img_b
114 | 
115 |         state = self.lstm.zero_state(batch_size=1, dtype=tf.float32)
116 |         sentences = []
117 | 
118 |         with tf.variable_scope("LSTM"):
119 |             output, state = self.lstm(images_embed, state)
120 | 
121 |             with tf.device("/cpu:0"):
122 |                 current_emb = tf.nn.embedding_lookup(self.Wemb, tf.ones([1], dtype=tf.int64))
123 | 
124 |             for i in range(0, self.lstm_step):
125 |                 tf.get_variable_scope().reuse_variables()
126 | 
127 |                 output, state = self.lstm(current_emb, state)
128 | 
129 |                 logit_words = tf.matmul(output, self.embed_word_W) + self.embed_word_b
130 |                 max_prob_word = tf.argmax(logit_words, 1)[0]
131 | 
132 |                 with tf.device("/cpu:0"):
133 |                     current_emb = tf.nn.embedding_lookup(self.Wemb, max_prob_word)
134 |                     current_emb = tf.expand_dims(current_emb, 0)
135 |                 sentences.append(max_prob_word)
136 | 
137 |         return images, sentences
138 | 
139 | 
140 | ##############################################################################
141 | #
142 | # set parameters and path
143 | #
144 | ##############################################################################
145 | batch_size = 100
146 | feats_dim = 2048
147 | project_dim = 512
148 | lstm_size = 512
149 | word_embed_dim = 512
150 | lstm_step = 20
151 | 
152 | idx_to_word_path = '../data/idx_to_word.pkl'
153 | word_to_idx_path = '../data/word_to_idx.pkl'
154 | bias_init_vector_path = '../data/bias_init_vector.npy'
155 | 
156 | with open(idx_to_word_path, 'r') as fr_3:
157 |     idx_to_word = pickle.load(fr_3)
158 | 
159 | with open(word_to_idx_path, 'r') as fr_4:
160 |     word_to_idx = pickle.load(fr_4)
161 | 
162 | bias_init_vector = np.load(bias_init_vector_path)
163 | 
164 | 
165 | ##########################################################################################
166 | #
167 | # I move the generation model part out of the Val_with_MLE function
168 | #
169 | ##########################################################################################
170 | n_words = len(idx_to_word)
171 | 
172 | val_feats_path = '../inception/val_feats'
173 | val_feats_names = glob.glob(val_feats_path + '/*.npy')
174 | val_images_names = map(lambda x: os.path.basename(x)[0:-4], val_feats_names)
175 | 
176 | model = CNN_LSTM(n_words = n_words,
177 |                  batch_size = batch_size,
178 |                  feats_dim = feats_dim,
179 |                  project_dim = project_dim,
180 |                  lstm_size = lstm_size,
181 |                  word_embed_dim = word_embed_dim,
182 |                  lstm_step = lstm_step,
183 |                  bias_init_vector = None)
184 | tf_images, tf_sentences = model.generate_model()
185 | 
186 | def Val_with_MLE(model_path):
187 |     '''
188 |     n_words = len(idx_to_word)
189 | 
190 |     # version 1: test all validation images
191 |     val_feats_path = '../inception/val_feats'
192 |     val_feats_names = glob.glob(val_feats_path + '/*.npy')
193 |     val_images_names = map(lambda x: os.path.basename(x)[0:-4], val_feats_names)
194 | 
195 |     model = CNN_LSTM(n_words = n_words,
196 |                      batch_size = batch_size,
197 |                      feats_dim = feats_dim,
198 |                      project_dim = project_dim,
199 |                      lstm_size = lstm_size,
200 |                      word_embed_dim = word_embed_dim,
201 |                      lstm_step = lstm_step,
202 |                      bias_init_vector = None)
203 |     tf_images, tf_sentences = model.generate_model()
204 |     '''
205 |     sess = tf.InteractiveSession()
206 |     saver = tf.train.Saver()
207 |     saver.restore(sess, model_path)
208 | 
209 |     fw_1 = open("val2014_results.txt", 'w')
210 |     for idx, img_name in enumerate(val_images_names[0:5000]):
211 |         print "{},  {}".format(idx, img_name)
212 |         start_time = time.time()
213 | 
214 |         current_feats = np.load( os.path.join(val_feats_path, img_name+'.npy') )
215 |         current_feats = np.reshape(current_feats, [1, feats_dim])
216 | 
217 |         sentences_index = sess.run(tf_sentences, feed_dict={tf_images: current_feats})
218 |         sentences = []
219 |         for idx_word in sentences_index:
220 |             word = idx_to_word[idx_word]
221 |             word = word.replace('\n', '')
222 |             word = word.replace('\\', '')
223 |             word = word.replace('"', '')
224 |             sentences.append(word)
225 | 
226 |         punctuation = np.argmax(np.array(sentences) == '<eos>') + 1
227 |         sentences = sentences[:punctuation]
228 |         generated_sentence = ' '.join(sentences)
229 |         generated_sentence = generated_sentence.replace('<bos> ', '')
230 |         generated_sentence = generated_sentence.replace(' <eos>', '')
231 | 
232 |         print generated_sentence,'\n'
233 |         fw_1.write(img_name + '\n')
234 |         fw_1.write(generated_sentence + '\n')
235 |     fw_1.close()
236 | 
237 | 


--------------------------------------------------------------------------------
/coco_caption/eval_model.py:
--------------------------------------------------------------------------------
  1 | #! encoding: UTF-8
  2 | 
  3 | import os
  4 | import ipdb
  5 | import glob
  6 | import time
  7 | import subprocess
  8 | import cPickle as pickle
  9 | import matplotlib.pyplot as plt
 10 | 
 11 | from pycocotools.coco import COCO
 12 | from pycocoevalcap.eval import COCOEvalCap
 13 | 
 14 | import eval_image_caption
 15 | 
 16 | 
 17 | model_path = "../models"
 18 | 
 19 | annFile = "../data/train_val_all_reference.json"
 20 | resFile = "captions_val2014_results.json"
 21 | 
 22 | # create coco object and cocoRes object
 23 | coco = COCO(annFile)
 24 | 
 25 | n_epochs = 500
 26 | n_epochs += 2
 27 | 
 28 | with open("Bleu_1.pkl", "r") as f:
 29 |     Bleu_1 = pickle.load(f)
 30 | 
 31 | with open("Bleu_2.pkl", "r") as f:
 32 |     Bleu_2 = pickle.load(f)
 33 | 
 34 | with open("Bleu_3.pkl", "r") as f:
 35 |     Bleu_3 = pickle.load(f)
 36 | 
 37 | with open("Bleu_4.pkl", "r") as f:
 38 |     Bleu_4 = pickle.load(f)
 39 | 
 40 | with open("METEOR.pkl", "r") as f:
 41 |     METEOR = pickle.load(f)
 42 | 
 43 | with open("CIDEr.pkl", "r") as f:
 44 |     CIDEr = pickle.load(f)
 45 | 
 46 | for idx_model in range(202, n_epochs, 2):
 47 |     model_name = os.path.join(model_path, "model_MLP-" + str(idx_model))
 48 |     
 49 |     start_time = time.time()
 50 |     
 51 |     # generate the val2014_results.txt
 52 |     eval_image_caption.Val_with_MLE(model_name)
 53 | 
 54 |     # call the gen_val_json.py with subprocess
 55 |     # we will generate the captions_val2014_results.json file
 56 |     subprocess.call(["python", "gen_val_json.py"])
 57 | 
 58 |     # after generating the captions_val2014_results.json file
 59 |     # we call the coco caption evaluation tools
 60 |     cocoRes = coco.loadRes(resFile)
 61 | 
 62 |     # create cocoEval object by taking coco and cocoRes
 63 |     cocoEval = COCOEvalCap(coco, cocoRes)
 64 | 
 65 |     # evaluate on a subset of images by setting
 66 |     # cocoEval.params['image_id'] = cocoRes.getImgIds()
 67 |     # please remove this line when evaluating the full validation set
 68 |     cocoEval.params['image_id'] = cocoRes.getImgIds()
 69 | 
 70 |     # evaluate results
 71 |     cocoEval.evaluate() 
 72 | 
 73 |     # print output evaluation scores
 74 |     for metric, score in cocoEval.eval.items():
 75 |         print '%s: %.3f'%(metric, score)
 76 |         if metric == "Bleu_1":
 77 |             Bleu_1.append(score)
 78 |         if metric == "Bleu_2":
 79 |             Bleu_2.append(score)
 80 |         if metric == "Bleu_3":
 81 |             Bleu_3.append(score)
 82 |         if metric == "Bleu_4":
 83 |             Bleu_4.append(score)
 84 |         if metric == "METEOR":
 85 |             METEOR.append(score)
 86 |         if metric == "CIDEr":
 87 |             CIDEr.append(score)
 88 |     # save the scores immediately
 89 |     with open("Bleu_1.pkl", "w") as fw1:
 90 |         pickle.dump(Bleu_1, fw1)
 91 |     with open("Bleu_2.pkl", "w") as fw2:
 92 |         pickle.dump(Bleu_2, fw2)
 93 |     with open("Bleu_3.pkl", "w") as fw3:
 94 |         pickle.dump(Bleu_3, fw3)
 95 |     with open("Bleu_4.pkl", "w") as fw4:
 96 |         pickle.dump(Bleu_4, fw4)
 97 |     with open("METEOR.pkl", "w") as fw5:
 98 |         pickle.dump(METEOR, fw5)
 99 |     with open("CIDEr.pkl", "w") as fw6:
100 |         pickle.dump(CIDEr, fw6)
101 | 
102 |     print "Mdoel {} evaluation time cost: {}".format(model_name, time.time()-start_time)
103 | 
104 | # draw the pictures
105 | plt.plot(range(len(Bleu_1)), Bleu_1, label="Bleu-1", color="g")
106 | plt.plot(range(len(Bleu_2)), Bleu_2, label="Bleu-2", color="r")
107 | plt.plot(range(len(Bleu_3)), Bleu_3, label="Bleu-3", color="b")
108 | plt.plot(range(len(Bleu_4)), Bleu_4, label="Bleu-4", color="m")
109 | plt.plot(range(len(METEOR)), METEOR, label="METEOR", color="k")
110 | plt.plot(range(len(CIDEr)), CIDEr, label="CIDEr", color="y")
111 | plt.grid(True)
112 | plt.legend(loc=2)
113 | plt.show()
114 | plt.savefig("evalution.png") 


--------------------------------------------------------------------------------
/coco_caption/gen_test_json.py:
--------------------------------------------------------------------------------
 1 | #!/usr/bin/env python
 2 | # coding=utf-8
 3 | import os
 4 | import json
 5 | import cPickle as pickle
 6 | 
 7 | test_results_save_path = '../test2014_results_model-486.txt'
 8 | test_results = open(test_results_save_path).read().splitlines()
 9 | 
10 | images_captions = {}
11 | captions = []
12 | names = []
13 | for idx, item in enumerate(test_results):
14 |     if idx % 2 == 0:
15 |         names.append(item)
16 |     if idx % 2 == 1:
17 |         captions.append(item)
18 | 
19 | for idx, name in enumerate(names):
20 |     print idx, ' ', name
21 |     images_captions[name] = captions[idx]
22 | 
23 | with open('../data/test2014_images_ids_to_names.pkl', 'r') as fr_1:
24 |     test2014_images_ids_to_names = pickle.load(fr_1)
25 | 
26 | names_to_ids = {}
27 | for key, item in test2014_images_ids_to_names.iteritems():
28 |     names_to_ids[item] = key
29 | 
30 | fw_1 = open('captions_test2014_results.json', 'w')
31 | fw_1.write('[')
32 | 
33 | for idx, name in enumerate(names):
34 |     print idx, ' ', name
35 |     tmp_idx = names.index(name)
36 |     caption = captions[tmp_idx]
37 |     caption = caption.replace(' ,', ',')
38 |     caption = caption.replace('"', '')
39 |     caption = caption.replace('\n', '')
40 |     if idx != len(names)-1:
41 |         fw_1.write('{"image_id": ' + str(names_to_ids[name]) + ', "caption": "' + str(caption) + '"}, ')
42 |     else:
43 |         fw_1.write('{"image_id": ' + str(names_to_ids[name]) + ', "caption": "' + str(caption) + '"}]')
44 | 
45 | fw_1.close()
46 | 


--------------------------------------------------------------------------------
/coco_caption/gen_val_json.py:
--------------------------------------------------------------------------------
 1 | #!/usr/bin/env python
 2 | # coding=utf-8
 3 | import os
 4 | import json
 5 | import cPickle as pickle
 6 | 
 7 | test_results_save_path = '../val2014_results_model_MLP-486.txt'
 8 | test_results = open(test_results_save_path).read().splitlines()
 9 | 
10 | images_captions = {}
11 | captions = []
12 | names = []
13 | for idx, item in enumerate(test_results):
14 |     if idx % 2 == 0:
15 |         names.append(item)
16 |     if idx % 2 == 1:
17 |         captions.append(item)
18 | 
19 | for idx, name in enumerate(names):
20 |     print idx, ' ', name
21 |     images_captions[name] = captions[idx]
22 | 
23 | with open('../data/val2014_images_ids_to_names.pkl', 'r') as fr_1:
24 |     test2014_images_ids_to_names = pickle.load(fr_1)
25 | 
26 | names_to_ids = {}
27 | for key, item in test2014_images_ids_to_names.iteritems():
28 |     names_to_ids[item] = key
29 | 
30 | fw_1 = open('captions_val2014_results.json', 'w')
31 | fw_1.write('[')
32 | 
33 | for idx, name in enumerate(names):
34 |     print idx, ' ', name
35 |     tmp_idx = names.index(name)
36 |     caption = captions[tmp_idx]
37 |     caption = caption.replace(' ,', ',')
38 |     caption = caption.replace('"', '')
39 |     caption = caption.replace('\n', '')
40 |     if idx != len(names)-1:
41 |         fw_1.write('{"image_id": ' + str(names_to_ids[name]) + ', "caption": "' + str(caption) + '"}, ')
42 |     else:
43 |         fw_1.write('{"image_id": ' + str(names_to_ids[name]) + ', "caption": "' + str(caption) + '"}]')
44 | 
45 | fw_1.close()
46 | 


--------------------------------------------------------------------------------
/coco_caption/model_evalution.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/chenxinpeng/Optimization_of_image_description_metrics_using_policy_gradient_methods/66089304b3dc78a1e27f90e262d0cb17c5bb4cf2/coco_caption/model_evalution.png


--------------------------------------------------------------------------------
/coco_caption/read_test_info.py:
--------------------------------------------------------------------------------
 1 | #!/usr/bin/env python
 2 | # coding=utf-8
 3 | 
 4 | import os
 5 | import json
 6 | import cPickle as pickle
 7 | 
 8 | image_info_test2014_path = '../data/image_info_test2014.json'
 9 | 
10 | image_info_json = json.load(open(image_info_test2014_path, 'r'))
11 | 
12 | images_info = image_info_json["images"]
13 | 
14 | imageIds_to_imageNames = {}
15 | for image in images_info:
16 |     id = int(image["id"])
17 |     name = image["file_name"]
18 |     imageIds_to_imageNames[id] = name
19 | 
20 | with open("./data/test2014_images_ids_to_names.pkl", 'w') as fw_1:
21 |     pickle.dump(imageIds_to_imageNames, fw_1)
22 | 


--------------------------------------------------------------------------------
/coco_caption/read_validation_info.py:
--------------------------------------------------------------------------------
 1 | #!/usr/bin/env python
 2 | # coding=utf-8
 3 | 
 4 | import os
 5 | import json
 6 | import cPickle as pickle
 7 | 
 8 | image_info_test2014_path = '../data/captions_val2014.json'
 9 | 
10 | image_info_json = json.load(open(image_info_test2014_path, 'r'))
11 | 
12 | images_info = image_info_json["images"]
13 | 
14 | imageIds_to_imageNames = {}
15 | for image in images_info:
16 |     id = int(image["id"])
17 |     name = image["file_name"]
18 |     imageIds_to_imageNames[id] = name
19 | 
20 | with open("val2014_images_ids_to_names.pkl", 'w') as fw_1:
21 |     pickle.dump(imageIds_to_imageNames, fw_1)
22 | 


--------------------------------------------------------------------------------
/create_train_val_all_reference.py:
--------------------------------------------------------------------------------
 1 | # encoding: UTF-8
 2 | 
 3 | import os
 4 | import glob
 5 | import json
 6 | import cPickle as pickle
 7 | 
 8 | 
 9 | train_val_imageNames_to_imageIDs = {}
10 | train_val_Names_Captions = []
11 | #train_imageNames_to_imageIDs = {}
12 | #val_imageNames_to_imageIDs = {}
13 | 
14 | ################################################################
15 | with open('./data/captions_train2014.json') as fr_1:
16 |     train_captions = json.load(fr_1)
17 | 
18 | for image in train_captions['images']:
19 |     image_name = image['file_name']
20 |     image_id = image['id']
21 |     train_val_imageNames_to_imageIDs[image_name] = image_id
22 | 
23 | for image in train_captions['annotations']:
24 |     image_id = image['image_id']
25 |     image_caption = image['caption']
26 |     train_val_Names_Captions.append([image_id, image_caption])
27 | 
28 | #################################################################
29 | with open('./data/captions_val2014.json') as fr_2:
30 |     val_captions = json.load(fr_2)
31 | 
32 | for image in val_captions['images']:
33 |     image_name = image['file_name']
34 |     image_id = image['id']
35 |     train_val_imageNames_to_imageIDs[image_name] = image_id
36 | 
37 | for image in val_captions['annotations']:
38 |     image_id = image['image_id']
39 |     image_caption = image['caption']
40 |     train_val_Names_Captions.append([image_id, image_caption])
41 | 
42 | #################################################################
43 | 
44 | json_fw = open('./data/train_val_all_reference.json', 'w')
45 | json_fw.write('{"info": {"description": "Test", "url": "https://github.com/chenxinpeng", "version": "1.0", "year": 2017, "contributor": "Chen Xinpeng", "date_created": "2017"}, "images": [')
46 | 
47 | count = 0
48 | for imageName, imageID in train_val_imageNames_to_imageIDs.iteritems():
49 |     if count != len(train_val_imageNames_to_imageIDs)-1:
50 |         json_fw.write('{"license": 1, "file_name": "' + str(imageName) + '", "id": ' + str(imageID) + '}, ')
51 |     else:
52 |         json_fw.write('{"license": 1, "file_name": "' + str(imageName) + '", "id": ' + str(imageID) + '}]')
53 |     count += 1
54 | 
55 | json_fw.write(', "licenses": [{"url": "http://creativecommons.org/licenses/by-nc-sa/2.0/", "id": 1, "name": "Test"}], ')
56 | 
57 | json_fw.write('"type": "captions", "annotations": [')
58 | 
59 | flag_count = 0
60 | id_count = 0
61 | for imageName, imageID in train_val_imageNames_to_imageIDs.iteritems():
62 |     print "{},  {},  {}".format(flag_count, imageName, imageID)
63 | 
64 |     captions = []
65 |     for item in train_val_Names_Captions:
66 |         if item[0] == imageID:
67 |             captions.append(item[1])
68 | 
69 |     count_captions = 0
70 |     if flag_count != len(train_val_imageNames_to_imageIDs)-1:
71 |         for idx, each_sent in enumerate(captions):
72 |             if '\n' in each_sent:
73 |                 each_sent = each_sent.replace('\n', '')
74 |             if '\\' in each_sent:
75 |                 each_sent = each_sent.replace('\\', '')
76 |             if '"' in each_sent:
77 |                 each_sent = each_sent.replace('"', '')
78 |             json_fw.write('{"image_id": ' + str(imageID) + ', "id": ' + str(id_count) + ', "caption": "' + each_sent + '"}, ')
79 |             id_count += 1
80 | 
81 |     if flag_count == len(train_val_imageNames_to_imageIDs)-1:
82 |         for idx, each_sent in enumerate(captions):
83 |             if '\n' in each_sent:
84 |                 each_sent = each_sent.replace('\n', '')
85 |             if '\\' in each_sent:
86 |                 each_sent = each_sent.replace('\\', '')
87 |             if '"' in each_sent:
88 |                 each_sent = each_sent.replace('"', '')
89 |             if idx != len(captions)-1:
90 |                 json_fw.write('{"image_id": ' + str(imageID) + ', "id": ' + str(id_count) + ', "caption": "' + each_sent + '"}, ')
91 |             else:
92 |                 json_fw.write('{"image_id": ' + str(imageID) + ', "id": ' + str(id_count) + ', "caption": "' + each_sent + '"}]}')
93 |             id_count += 1
94 | 
95 |     flag_count += 1
96 | 
97 | json_fw.close()
98 | 


--------------------------------------------------------------------------------
/create_train_val_each_reference.py:
--------------------------------------------------------------------------------
  1 | # encoding: UTF-8
  2 | 
  3 | ###############################################################
  4 | #
  5 | # generate images captions into every json file one by one,
  6 | # and get the dict that map the image IDs to image Names
  7 | #
  8 | ###############################################################
  9 | 
 10 | import os
 11 | import sys
 12 | import json
 13 | import cPickle as pickle
 14 | 
 15 | train_val_imageNames_to_imageIDs = {}
 16 | train_imageNames_to_imageIDs = {}
 17 | val_imageNames_to_imageIDs = {}
 18 | 
 19 | with open('./data/captions_train2014.json') as fr_1:
 20 |     train_captions = json.load(fr_1)
 21 | 
 22 | for image in train_captions['images']:
 23 |     image_name = image['file_name']
 24 |     image_id = image['id']
 25 |     train_imageNames_to_imageIDs[image_name] = image_id
 26 | 
 27 | train_Names_Captions = []
 28 | for image in train_captions['annotations']:
 29 |     image_id = image['image_id']
 30 |     image_caption = image['caption']
 31 |     train_Names_Captions.append([image_id, image_caption])
 32 | 
 33 | train_count = 0
 34 | for imageName, imageID in train_imageNames_to_imageIDs.iteritems():
 35 |     print "{},  {},  {}".format(train_count, imageName, imageID)
 36 |     train_count += 1
 37 | 
 38 |     captions = []
 39 |     for item in train_Names_Captions:
 40 |         if item[0] == imageID:
 41 |             captions.append(item[1])
 42 | 
 43 |     json_fw = open('./train_val_reference_json/'+imageName+'.json', 'w')
 44 |     json_fw.write('{"info": {"description": "CaptionEval", "url": "https://github.com/chenxinpeng/", "version": "1.0", "year": 2017, "contributor": "Xinpeng Chen", "date_created": "2017.01.26"}, "images": [{"license": 1, "file_name": "' + imageName + '", "id": ' + str(imageID) + '}]')
 45 | 
 46 |     json_fw.write(' ,"licenses": [{"url": "test", "id": 1, "name": "test"}], ')
 47 |     json_fw.write('"type": "captions", "annotations": [')
 48 | 
 49 |     id_count = 0
 50 |     for idx, each_sent in enumerate(captions):
 51 |         if idx != len(captions)-1:
 52 |             if '\n' in each_sent:
 53 |                 each_sent = each_sent.replace('\n', '')
 54 |             if '\\' in each_sent:
 55 |                 each_sent = each_sent.replace('\\', '')
 56 |             if '"' in each_sent:
 57 |                 each_sent = each_sent.replace('"', '')
 58 |             json_fw.write('{"image_id": ' + str(imageID) + ', "id": ' + str(id_count) + ', "caption": "' + each_sent + '"}, ')
 59 |         else:
 60 |             if '\n' in each_sent:
 61 |                 each_sent = each_sent.replace('\n', '')
 62 |             if '\\' in each_sent:
 63 |                 each_sent = each_sent.replace('\\', '')
 64 |             if '"' in each_sent:
 65 |                 each_sent = each_sent.replace('"', '')
 66 |             json_fw.write('{"image_id": ' + str(imageID) + ', "id": ' + str(id_count) + ', "caption": "' + each_sent + '"}]}')
 67 |         id_count += 1
 68 |     json_fw.close()
 69 | 
 70 | # Validation json file
 71 | with open('./data/captions_val2014.json') as fr_2:
 72 |     val_captions = json.load(fr_2)
 73 | 
 74 | for image in val_captions['images']:
 75 |     image_name = image['file_name']
 76 |     image_id = image['id']
 77 |     val_imageNames_to_imageIDs[image_name] = image_id
 78 | 
 79 | val_Names_Captions = []
 80 | for image in val_captions['annotations']:
 81 |     image_id = image['image_id']
 82 |     image_caption = image['caption']
 83 |     val_Names_Captions.append([image_id, image_caption])
 84 | 
 85 | val_count = 0
 86 | for imageName, imageID in val_imageNames_to_imageIDs.iteritems():
 87 |     print "{},  {},  {}".format(val_count, imageName, imageID)
 88 | 
 89 |     captions = []
 90 |     for item in val_Names_Captions:
 91 |         if item[0] == imageID:
 92 |             captions.append(item[1])
 93 | 
 94 |     json_fw = open('./train_val_reference_json/'+imageName+'.json', 'w')
 95 |     json_fw.write('{"info": {"description": "CaptionEval", "url": "https://github.com/chenxinpeng/", "version": "1.0", "year": 2017, "contributor": "Xinpeng Chen", "date_created": "2017.01.26"}, "images": [{"license": 1, "file_name": "' + imageName + '", "id": ' + str(imageID) + '}]')
 96 | 
 97 |     json_fw.write(' ,"licenses": [{"url": "test", "id": 1, "name": "test"}], ')
 98 |     json_fw.write('"type": "captions", "annotations": [')
 99 | 
100 |     id_count = 0
101 |     for idx, each_sent in enumerate(captions):
102 |         if idx != len(captions)-1:
103 |             if '\n' in each_sent:
104 |                 each_sent = each_sent.replace('\n', '')
105 |             if '\\' in each_sent:
106 |                 each_sent = each_sent.replace('\\', '')
107 |             if '"' in each_sent:
108 |                 each_sent = each_sent.replace('"', '')
109 |             json_fw.write('{"image_id": ' + str(imageID) + ', "id": ' + str(id_count) + ', "caption": "' + each_sent + '"}, ')
110 |         else:
111 |             if '\n' in each_sent:
112 |                 each_sent = each_sent.replace('\n', '')
113 |             if '\\' in each_sent:
114 |                 each_sent = each_sent.replace('\\', '')
115 |             if '"' in each_sent:
116 |                 each_sent = each_sent.replace('"', '')
117 |             json_fw.write('{"image_id": ' + str(imageID) + ', "id": ' + str(id_count) + ', "caption": "' + each_sent + '"}]}')
118 |         id_count += 1
119 |     val_count += 1
120 |     json_fw.close()
121 | 
122 | for k, item in train_imageNames_to_imageIDs.iteritems():
123 |     train_val_imageNames_to_imageIDs[k] = item
124 | for k, item in val_imageNames_to_imageIDs.iteritems():
125 |     train_val_imageNames_to_imageIDs[k] = item
126 | 
127 | with open('./data/train_val_imageNames_to_imageIDs.pkl', 'w') as fw_2:
128 |     pickle.dump(train_val_imageNames_to_imageIDs, fw_2)
129 | 
130 | 
131 | 


--------------------------------------------------------------------------------
/data/bias_init_vector.npy:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/chenxinpeng/Optimization_of_image_description_metrics_using_policy_gradient_methods/66089304b3dc78a1e27f90e262d0cb17c5bb4cf2/data/bias_init_vector.npy


--------------------------------------------------------------------------------
/data/test.txt:
--------------------------------------------------------------------------------
1 | test
2 | 


--------------------------------------------------------------------------------
/image/1.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/chenxinpeng/Optimization_of_image_description_metrics_using_policy_gradient_methods/66089304b3dc78a1e27f90e262d0cb17c5bb4cf2/image/1.png


--------------------------------------------------------------------------------
/image/2.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/chenxinpeng/Optimization_of_image_description_metrics_using_policy_gradient_methods/66089304b3dc78a1e27f90e262d0cb17c5bb4cf2/image/2.png


--------------------------------------------------------------------------------
/image/3.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/chenxinpeng/Optimization_of_image_description_metrics_using_policy_gradient_methods/66089304b3dc78a1e27f90e262d0cb17c5bb4cf2/image/3.png


--------------------------------------------------------------------------------
/image_caption.py:
--------------------------------------------------------------------------------
   1 | # encoding: UTF-8
   2 | 
   3 | import os
   4 | import sys
   5 | import glob
   6 | import random
   7 | import time
   8 | import json
   9 | from json import encoder
  10 | import numpy as np
  11 | import cPickle as pickle
  12 | import matplotlib.pyplot as plt
  13 | 
  14 | import tensorflow as tf
  15 | 
  16 | sys.path.append('../')
  17 | from pycocotools.coco import COCO
  18 | from pycocoevalcap.eval import COCOEvalCap
  19 | 
  20 | import ipdb
  21 | 
  22 | 
  23 | #############################################################################################################
  24 | #
  25 | # Step 1: Input: D = {(x^n, y^n): n = 1:N}
  26 | # Step 2:Train \Pi(g_{1:T} | x) using MLE on D, MLE: Maximum likehood eatimation
  27 | #
  28 | ############################################################################################################
  29 | class CNN_LSTM():
  30 |     def __init__(self,
  31 |                  n_words,
  32 |                  batch_size,
  33 |                  feats_dim,
  34 |                  project_dim,
  35 |                  lstm_size,
  36 |                  word_embed_dim,
  37 |                  lstm_step,
  38 |                  bias_init_vector=None):
  39 | 
  40 |         self.n_words = n_words
  41 |         self.batch_size = batch_size
  42 |         self.feats_dim = feats_dim
  43 |         self.project_dim = project_dim
  44 |         self.lstm_size = lstm_size
  45 |         self.word_embed_dim = word_embed_dim
  46 |         self.lstm_step = lstm_step
  47 | 
  48 |         # project the image feature vector of dimension 2048 to 512 dimension, with a linear layer
  49 |         # self.encode_img_W: 2048 x 512
  50 |         # self.encode_img_b: 512
  51 |         self.encode_img_W = tf.Variable(tf.random_uniform([feats_dim, project_dim], -0.1, 0.1), name="encode_img_W")
  52 |         self.encode_img_b = tf.zeros([project_dim], name="encode_img_b")
  53 | 
  54 |         with tf.device("/cpu:0"):
  55 |             self.Wemb = tf.Variable(tf.random_uniform([n_words, word_embed_dim], -0.1, 0.1), name="Wemb")
  56 | 
  57 |         self.lstm = tf.nn.rnn_cell.BasicLSTMCell(lstm_size, state_is_tuple=True)
  58 | 
  59 |         self.embed_word_W = tf.Variable(tf.random_uniform([lstm_size, n_words], -0.1, 0.1), name="embed_word_W")
  60 | 
  61 |         if bias_init_vector is not None:
  62 |             self.embed_word_b = tf.Variable(bias_init_vector.astype(np.float32), name="embed_word_b")
  63 |         else:
  64 |             self.embed_word_b = tf.Variable(tf.zeros([n_words]), name="embed_word_b")
  65 | 
  66 |         self.baseline_MLP_W = tf.Variable(tf.random_uniform([lstm_size, 1], -0.1, 0.1), name="baseline_MLP_W")
  67 |         self.baseline_MLP_b = tf.Variable(tf.zeros([1]), name="baseline_MLP_b")
  68 | 
  69 |         # At the beginning, I used two layers of MLP, but I think it's wrong
  70 |         #self.baseline_MLP2_W = tf.Variable(tf.random_uniform([lstm_size, 1], -0.1, 0.1), name="baseline_MLP2_W")
  71 |         #self.baseline_MLP2_b = tf.Variable(tf.zeros([1]), name="baseline_MLP2_b")
  72 | 
  73 |     ############################################################################################################
  74 |     #
  75 |     # Class function for step 2
  76 |     #
  77 |     ############################################################################################################
  78 |     def build_model(self):
  79 |         images = tf.placeholder(tf.float32, [self.batch_size, self.feats_dim])
  80 |         sentences = tf.placeholder(tf.int32, [self.batch_size, self.lstm_step])
  81 |         masks = tf.placeholder(tf.float32, [self.batch_size, self.lstm_step])
  82 | 
  83 |         images_embed = tf.matmul(images, self.encode_img_W) + self.encode_img_b
  84 | 
  85 |         state = self.lstm.zero_state(batch_size=self.batch_size, dtype=tf.float32)
  86 | 
  87 |         loss = 0.0
  88 |         with tf.variable_scope("LSTM"):
  89 |             for i in range(0, self.lstm_step):
  90 |                 if i == 0:
  91 |                     current_emb = images_embed
  92 |                 else:
  93 |                     with tf.device("/cpu:0"):
  94 |                         current_emb = tf.nn.embedding_lookup(self.Wemb, sentences[:, i-1])
  95 | 
  96 |                 if i > 0:
  97 |                     tf.get_variable_scope().reuse_variables()
  98 | 
  99 |                 output, state = self.lstm(current_emb, state)
 100 | 
 101 |                 if i > 0:
 102 |                     labels = tf.expand_dims(sentences[:, i], 1)
 103 |                     indices = tf.expand_dims(tf.range(0, self.batch_size, 1), 1)
 104 |                     concated = tf.concat(1, [indices, labels])
 105 |                     onehot_labels = tf.sparse_to_dense( concated, tf.pack([self.batch_size, self.n_words]), 1.0, 0.0)
 106 | 
 107 |                     logit_words = tf.matmul(output, self.embed_word_W) + self.embed_word_b
 108 |                     cross_entropy = tf.nn.softmax_cross_entropy_with_logits(logit_words, onehot_labels)
 109 |                     cross_entropy = cross_entropy * masks[:, i]
 110 |                     current_loss = tf.reduce_sum(cross_entropy)/self.batch_size
 111 | 
 112 |                     loss = loss + current_loss
 113 |         return loss, images, sentences, masks
 114 | 
 115 |     def generate_model(self):
 116 |         images = tf.placeholder(tf.float32, [1, self.feats_dim])
 117 |         images_embed = tf.matmul(images, self.encode_img_W) + self.encode_img_b
 118 | 
 119 |         state = self.lstm.zero_state(batch_size=1, dtype=tf.float32)
 120 |         sentences = []
 121 | 
 122 |         with tf.variable_scope("LSTM"):
 123 |             output, state = self.lstm(images_embed, state)
 124 | 
 125 |             with tf.device("/cpu:0"):
 126 |                 current_emb = tf.nn.embedding_lookup(self.Wemb, tf.ones([1], dtype=tf.int64))
 127 | 
 128 |             for i in range(0, self.lstm_step):
 129 |                 tf.get_variable_scope().reuse_variables()
 130 | 
 131 |                 output, state = self.lstm(current_emb, state)
 132 | 
 133 |                 logit_words = tf.matmul(output, self.embed_word_W) + self.embed_word_b
 134 |                 max_prob_word = tf.argmax(logit_words, 1)[0]
 135 | 
 136 |                 with tf.device("/cpu:0"):
 137 |                     current_emb = tf.nn.embedding_lookup(self.Wemb, max_prob_word)
 138 |                     current_emb = tf.expand_dims(current_emb, 0)
 139 |                 sentences.append(max_prob_word)
 140 | 
 141 |         return images, sentences
 142 | 
 143 |     ####################################################################################
 144 |     #
 145 |     # Class function for step 3
 146 |     #
 147 |     ####################################################################################
 148 |     def train_Bphi_model(self):
 149 |         encode_img_W = tf.stop_gradient(self.encode_img_W)
 150 |         encode_img_b = tf.stop_gradient(self.encode_img_b)
 151 |         Wemb = tf.stop_gradient(self.Wemb)
 152 | 
 153 |         images = tf.placeholder(tf.float32, [1, self.feats_dim])
 154 |         images_embed = tf.matmul(images, encode_img_W) + encode_img_b
 155 | 
 156 |         Q_Bleu_1 = tf.placeholder(tf.float32, [1, self.lstm_step])
 157 |         Q_Bleu_2 = tf.placeholder(tf.float32, [1, self.lstm_step])
 158 |         Q_Bleu_3 = tf.placeholder(tf.float32, [1, self.lstm_step])
 159 |         Q_Bleu_4 = tf.placeholder(tf.float32, [1, self.lstm_step])
 160 | 
 161 |         weight_Bleu_1 = 0.5
 162 |         weight_Bleu_2 = 0.5
 163 |         weight_Bleu_3 = 1.0
 164 |         weight_Bleu_4 = 1.0
 165 | 
 166 |         state = self.lstm.zero_state(batch_size=1, dtype=tf.float32)
 167 | 
 168 |         # To avoid creating a feedback loop, we do not back-propagate
 169 |         # gradients through the hidden state from this loss
 170 |         c, h = state[0], state[1]
 171 |         c, h = tf.stop_gradient(c), tf.stop_gradient(h)
 172 |         state = tf.nn.rnn_cell.LSTMStateTuple(c, h)
 173 | 
 174 |         loss = 0.0
 175 | 
 176 |         with tf.variable_scope("LSTM"):
 177 |             with tf.device("/cpu:0"):
 178 |                 current_embed = tf.nn.embedding_lookup(Wemb, tf.ones([1], dtype=tf.int64))
 179 | 
 180 |             output, state = self.lstm(images_embed, state)
 181 |             c, h = state[0], state[1]
 182 |             c, h = tf.stop_gradient(c), tf.stop_gradient(h)
 183 |             state = tf.nn.rnn_cell.LSTMStateTuple(c, h)
 184 | 
 185 |             for i in range(0, self.lstm_step):
 186 |                 tf.get_variable_scope().reuse_variables()
 187 | 
 188 |                 output, state = self.lstm(current_embed, state)
 189 |                 c, h = state[0], state[1]
 190 |                 c, h = tf.stop_gradient(c), tf.stop_gradient(h)
 191 |                 state = tf.nn.rnn_cell.LSTMStateTuple(c, h)
 192 | 
 193 |                 # In our experiments, the baseline estimator is an MLP which takes as input the hidden state of the RNN at step t
 194 |                 # To avoid creating a feedback loop, we do not back-propagate gradients through the hidden state from this loss
 195 |                 #if i >= 1:
 196 |                 baseline_estimator = tf.nn.relu(tf.matmul(state[1], self.baseline_MLP_W) + self.baseline_MLP_b)
 197 |                 Q_current = weight_Bleu_1 * Q_Bleu_1[:, i] + weight_Bleu_2 * Q_Bleu_2[:, i] + \
 198 |                             weight_Bleu_3 * Q_Bleu_3[:, i] + weight_Bleu_4 * Q_Bleu_4[:, i]
 199 | 
 200 |                 # Equation (8) in the paper
 201 |                 loss = loss + tf.square(Q_current - baseline_estimator)
 202 | 
 203 |         return images, Q_Bleu_1, Q_Bleu_2, Q_Bleu_3, Q_Bleu_4, loss
 204 | 
 205 |     def Monte_Carlo_Rollout(self):
 206 |         images = tf.placeholder(tf.float32, [1, self.feats_dim])
 207 |         images_embed = tf.matmul(images, self.encode_img_W) + self.encode_img_b
 208 | 
 209 |         state = self.lstm.zero_state(batch_size=1, dtype=tf.float32)
 210 | 
 211 |         gen_sentences = []
 212 |         all_sample_sentences = []
 213 | 
 214 |         with tf.variable_scope("LSTM"):
 215 |             output, state = self.lstm(images_embed, state)
 216 |             with tf.device("/cpu:0"):
 217 |                 current_emb = tf.nn.embedding_lookup(self.Wemb, tf.ones([1], dtype=tf.int64))
 218 | 
 219 |             for i in range(0, self.lstm_step):
 220 |                 tf.get_variable_scope().reuse_variables()
 221 | 
 222 |                 output, state = self.lstm(current_emb, state)
 223 |                 logit_words = tf.matmul(output, self.embed_word_W) + self.embed_word_b
 224 |                 max_prob_word = tf.argmax(logit_words, 1)[0]
 225 | 
 226 |                 with tf.device("/cpu:0"):
 227 |                     current_emb = tf.nn.embedding_lookup(self.Wemb, max_prob_word)
 228 |                     current_emb = tf.expand_dims(current_emb, 0)
 229 |                 gen_sentences.append(max_prob_word)
 230 | 
 231 |                 if i < self.lstm_step-1:
 232 |                     num_sample = self.lstm_step - 1 - i
 233 |                     sample_sentences = []
 234 |                     for idx_sample in range(num_sample):
 235 |                         sample = tf.multinomial(logit_words, 3)
 236 |                         sample_sentences.append(sample[0])
 237 |                     all_sample_sentences.append(sample_sentences)
 238 | 
 239 |         return images, gen_sentences, all_sample_sentences
 240 | 
 241 |     ########################################################################
 242 |     #
 243 |     # Class function for step 4
 244 |     #
 245 |     ########################################################################
 246 |     def Monte_Carlo_and_Baseline(self):
 247 |         images = tf.placeholder(tf.float32, [self.batch_size, self.feats_dim])
 248 |         images_embed = tf.matmul(images, self.encode_img_W) + self.encode_img_b
 249 | 
 250 |         state = self.lstm.zero_state(batch_size=self.batch_size, dtype=tf.float32)
 251 | 
 252 |         gen_sentences = []
 253 |         all_sample_sentences = []
 254 |         all_baselines = []
 255 | 
 256 |         with tf.variable_scope("LSTM"):
 257 |             output, state = self.lstm(images_embed, state)
 258 |             with tf.device("/cpu:0"):
 259 |                 current_emb = tf.nn.embedding_lookup(self.Wemb, tf.ones([self.batch_size], dtype=tf.int64))
 260 | 
 261 |             for i in range(0, self.lstm_step):
 262 |                 tf.get_variable_scope().reuse_variables()
 263 | 
 264 |                 output, state = self.lstm(current_emb, state)
 265 |                 logit_words = tf.matmul(output, self.embed_word_W) + self.embed_word_b
 266 |                 max_prob_word = tf.argmax(logit_words, 1)
 267 |                 with tf.device("/cpu:0"):
 268 |                     current_emb = tf.nn.embedding_lookup(self.Wemb, max_prob_word)
 269 |                     #current_emb = tf.expand_dims(current_emb, 0)
 270 |                 gen_sentences.append(max_prob_word)
 271 | 
 272 |                 # compute Q for gt with K Monte Carlo rollouts
 273 |                 if i < self.lstm_step-1:
 274 |                     num_sample = self.lstm_step - 1 - i
 275 |                     sample_sentences = []
 276 |                     for idx_sample in range(num_sample):
 277 |                         sample = tf.multinomial(logit_words, 3)
 278 |                         sample_sentences.append(sample)
 279 |                     all_sample_sentences.append(sample_sentences)
 280 |                 # compute eatimated baseline
 281 |                 baseline = tf.nn.relu(tf.matmul(state[1], self.baseline_MLP_W) + self.baseline_MLP_b)
 282 |                 all_baselines.append(baseline)
 283 | 
 284 |         return images, gen_sentences, all_sample_sentences, all_baselines
 285 | 
 286 |     def SGD_update(self, batch_num_images=1000):
 287 |         images = tf.placeholder(tf.float32, [batch_num_images, self.feats_dim])
 288 |         images_embed = tf.matmul(images, self.encode_img_W) + self.encode_img_b
 289 | 
 290 |         Q_rewards = tf.placeholder(tf.float32, [batch_num_images, self.lstm_step])
 291 |         Baselines = tf.placeholder(tf.float32, [batch_num_images, self.lstm_step])
 292 | 
 293 |         state = self.lstm.zero_state(batch_size=batch_num_images, dtype=tf.float32)
 294 | 
 295 |         loss = 0.0
 296 | 
 297 |         with tf.variable_scope("LSTM"):
 298 |             tf.get_variable_scope().reuse_variables()
 299 |             output, state = self.lstm(images_embed, state)
 300 | 
 301 |             with tf.device("/cpu:0"):
 302 |                 current_emb = tf.nn.embedding_lookup(self.Wemb, tf.ones([batch_num_images], dtype=tf.int64))
 303 | 
 304 |             for i in range(0, self.lstm_step):
 305 |                 output, state = self.lstm(current_emb, state)
 306 | 
 307 |                 logit_words = tf.matmul(output, self.embed_word_W) + self.embed_word_b
 308 |                 logit_words_softmax = tf.nn.softmax(logit_words)
 309 |                 max_prob_word = tf.argmax(logit_words_softmax, 1)
 310 |                 max_prob = tf.reduce_max(logit_words_softmax, 1)
 311 | 
 312 |                 current_rewards = Q_rewards[:, i] - Baselines[:, i]
 313 |                 
 314 |                 loss = loss + tf.reduce_sum(-tf.log(max_prob) * current_rewards)
 315 |                 
 316 |                 with tf.device("/cpu:0"):
 317 |                     current_emb = tf.nn.embedding_lookup(self.Wemb, max_prob_word)
 318 |                     #current_emb = tf.expand_dims(current_emb, 0)
 319 | 
 320 |         return images, Q_rewards, Baselines, loss, max_prob, current_rewards, logit_words
 321 | 
 322 | 
 323 | ##############################################################################
 324 | #
 325 | # Step 1: set parameters and path
 326 | #
 327 | ##############################################################################
 328 | batch_size = 100
 329 | feats_dim = 2048
 330 | project_dim = 512
 331 | lstm_size = 512
 332 | word_embed_dim = 512
 333 | lstm_step = 30
 334 | 
 335 | n_epochs = 500
 336 | learning_rate = 0.0001
 337 | 
 338 | # Features directory of training and validation images, and the other path
 339 | train_val_feats_path = './inception/train_val_feats'
 340 | val_feats_path = './inception/val_feats'
 341 | 
 342 | loss_images_save_path = './loss_imgs'
 343 | loss_file_save_path = 'loss.txt'
 344 | model_path = './models'
 345 | 
 346 | train_images_captions_path = './data/train_images_captions.pkl'
 347 | val_images_captions_path = './data/val_images_captions.pkl'
 348 | 
 349 | idx_to_word_path = './data/idx_to_word.pkl'
 350 | word_to_idx_path = './data/word_to_idx.pkl'
 351 | bias_init_vector_path = './data/bias_init_vector.npy'
 352 | 
 353 | # Load pre-processed data
 354 | with open(train_images_captions_path, 'r') as fr_1:
 355 |     train_images_captions = pickle.load(fr_1)
 356 | 
 357 | with open(val_images_captions_path, 'r') as fr_2:
 358 |     val_images_captions = pickle.load(fr_2)
 359 | 
 360 | with open(idx_to_word_path, 'r') as fr_3:
 361 |     idx_to_word = pickle.load(fr_3)
 362 | 
 363 | with open(word_to_idx_path, 'r') as fr_4:
 364 |     word_to_idx = pickle.load(fr_4)
 365 | 
 366 | bias_init_vector = np.load(bias_init_vector_path)
 367 | 
 368 | 
 369 | ##########################################################################
 370 | #
 371 | # Step 2: Train, validation and test stage using MLE on Dataset
 372 | #
 373 | ##########################################################################
 374 | def Train_with_MLE():
 375 |     n_words = len(idx_to_word)
 376 |     train_images_names = train_images_captions.keys()
 377 | 
 378 |     # change the word of each image captions to index by word_to_idx
 379 |     train_images_captions_index = {}
 380 |     for each_img, sents in train_images_captions.iteritems():
 381 |         sents_index = np.zeros([len(sents), lstm_step], dtype=np.int32)
 382 | 
 383 |         for idy, sent in enumerate(sents):
 384 |             sent = '<bos> ' + sent + ' <eos>'
 385 |             tmp_sent = sent.split(' ')
 386 |             tmp_sent = filter(None, tmp_sent)
 387 | 
 388 |             for idx, word in enumerate(tmp_sent):
 389 |                 if idx == lstm_step-1:
 390 |                     sents_index[idy, idx] = word_to_idx['<eos>']
 391 |                     break
 392 |                 elif word in word_to_idx:
 393 |                     sents_index[idy, idx] = word_to_idx[word]
 394 |         train_images_captions_index[each_img] = sents_index
 395 |     with open('./data/train_images_captions_index.pkl', 'w') as fw_1:
 396 |         pickle.dump(train_images_captions_index, fw_1)
 397 | 
 398 |     model = CNN_LSTM(n_words = n_words,
 399 |                      batch_size = batch_size,
 400 |                      feats_dim = feats_dim,
 401 |                      project_dim = project_dim,
 402 |                      lstm_size = lstm_size,
 403 |                      word_embed_dim = word_embed_dim,
 404 |                      lstm_step = lstm_step,
 405 |                      bias_init_vector = bias_init_vector)
 406 | 
 407 |     tf_loss, tf_images, tf_sentences, tf_masks = model.build_model()
 408 | 
 409 |     sess = tf.InteractiveSession()
 410 |     saver = tf.train.Saver(max_to_keep=500, write_version=1)
 411 |     train_op = tf.train.AdamOptimizer(learning_rate).minimize(tf_loss)
 412 |     tf.initialize_all_variables().run()
 413 | 
 414 |     # when you want to train the model from the front model
 415 |     #new_saver = tf.train.Saver(max_to_keep=500)
 416 |     #new_saver = tf.train.import_meta_graph('./models/model-78.meta')
 417 |     #new_saver.restore(sess, tf.train.latest_checkpoint('./models/'))
 418 | 
 419 |     loss_fw = open(loss_file_save_path, 'w')
 420 |     loss_to_draw = []
 421 |     for epoch in range(0, n_epochs):
 422 |         loss_to_draw_epoch = []
 423 |         # disorder the training images
 424 |         random.shuffle(train_images_names)
 425 | 
 426 |         for start, end in zip(range(0, len(train_images_names), batch_size),
 427 |                               range(batch_size, len(train_images_names), batch_size)):
 428 |             start_time = time.time()
 429 | 
 430 |             # current_feats: get the [start:end] features
 431 |             # current_captions: convert the word to the idx by the word_to_idx
 432 |             # current_masks: set the <pad> to zero, the other place to non-zero
 433 |             current_feats = []
 434 |             current_captions = []
 435 | 
 436 |             img_names = train_images_names[start:end]
 437 |             for each_img_name in img_names:
 438 |                 # load this image's feats from the train_val_feats directory
 439 |                 #each_img_name = each_img_name + '.npy'
 440 |                 img_feat = np.load( os.path.join(train_val_feats_path, each_img_name+'.npy') )
 441 |                 current_feats.append(img_feat)
 442 | 
 443 |                 img_caption_length = len(train_images_captions[each_img_name])
 444 |                 random_choice_index = random.randint(0, img_caption_length-1)
 445 |                 img_caption = train_images_captions_index[each_img_name][random_choice_index]
 446 |                 current_captions.append(img_caption)
 447 | 
 448 |             current_feats = np.asarray(current_feats)
 449 |             current_captions = np.asarray(current_captions)
 450 | 
 451 |             current_masks = np.zeros( (current_captions.shape[0], current_captions.shape[1]), dtype=np.int32 )
 452 |             nonzeros = np.array( map(lambda x: (x != 0).sum(), current_captions) )
 453 | 
 454 |             for ind, row in enumerate(current_masks):
 455 |                 row[:nonzeros[ind]] = 1
 456 | 
 457 |             _, loss_val = sess.run(
 458 |                     [train_op, tf_loss],
 459 |                     feed_dict = {
 460 |                         tf_images: current_feats,
 461 |                         tf_sentences: current_captions,
 462 |                         tf_masks: current_masks
 463 |                         })
 464 |             loss_to_draw_epoch.append(loss_val)
 465 | 
 466 |             print "idx: {}  epoch: {}  loss: {}  Time cost: {}".format(start, epoch, loss_val, time.time()-start_time)
 467 |             loss_fw.write('epoch ' + str(epoch) + ' loss ' + str(loss_val) + '\n')
 468 | 
 469 |         # draw loss curve every epoch
 470 |         loss_to_draw.append(np.mean(loss_to_draw_epoch))
 471 |         plt_save_img_name = str(epoch) + '.png'
 472 |         plt.plot(range(len(loss_to_draw)), loss_to_draw, color='g')
 473 |         plt.grid(True)
 474 |         plt.savefig(os.path.join(loss_images_save_path, plt_save_img_name))
 475 | 
 476 |         if np.mod(epoch, 2) == 0:
 477 |             print "Epoch ", epoch, " is done. Saving the model ..."
 478 |             saver.save(sess, os.path.join(model_path, 'model_MLP'), global_step=epoch)
 479 |     loss_fw.close()
 480 | 
 481 | 
 482 | def Test_with_MLE():
 483 |     model_path = os.path.join('./models', 'model_MLP-486')
 484 |     n_words = len(idx_to_word)
 485 | 
 486 |     test_feats_path = './inception/test_feats'
 487 |     test_feats_names = glob.glob(test_feats_path + '/*.npy')
 488 |     test_images_names = map(lambda x: os.path.basename(x)[0:-4], test_feats_names)
 489 | 
 490 |     model = CNN_LSTM(n_words = n_words,
 491 |                      batch_size = batch_size,
 492 |                      feats_dim = feats_dim,
 493 |                      project_dim = project_dim,
 494 |                      lstm_size = lstm_size,
 495 |                      word_embed_dim = word_embed_dim,
 496 |                      lstm_step = lstm_step,
 497 |                      bias_init_vector = None)
 498 | 
 499 |     tf_images, tf_sentences = model.generate_model()
 500 |     sess = tf.InteractiveSession()
 501 |     saver = tf.train.Saver()
 502 |     saver.restore(sess, model_path)
 503 | 
 504 |     fw_1 = open("test2014_results_model-486.txt", 'w')
 505 |     for idx, img_name in enumerate(test_images_names):
 506 |         t0 = time.time()
 507 | 
 508 |         current_feats = np.load( os.path.join(test_feats_path, img_name+'.npy') )
 509 |         current_feats = np.reshape(current_feats, [1, feats_dim])
 510 | 
 511 |         sentences_index = sess.run(tf_sentences, feed_dict={tf_images: current_feats})
 512 | 
 513 |         #sentences = map(lambda x: idx_to_word[x], sentences_index)
 514 |         sentences = []
 515 |         for idx_word in sentences_index:
 516 |             word = idx_to_word[idx_word]
 517 |             word = word.replace('\n', '')
 518 |             word = word.replace('\\', '')
 519 |             word = word.replace('"', '')
 520 |             sentences.append(word)
 521 | 
 522 |         punctuation = np.argmax(np.array(sentences) == '<eos>') + 1
 523 |         sentences = sentences[:punctuation]
 524 |         generated_sentence = ' '.join(sentences)
 525 |         generated_sentence = generated_sentence.replace('<bos> ', '')
 526 |         generated_sentence = generated_sentence.replace(' <eos>', '')
 527 | 
 528 |         print generated_sentence,'\n'
 529 |         fw_1.write(img_name + '\n')
 530 |         fw_1.write(generated_sentence + '\n')
 531 | 
 532 |         print "{},  {},  Time cost: {}".format(idx, img_name, time.time()-t0)
 533 | 
 534 |     fw_1.close()
 535 | 
 536 | 
 537 | def Val_with_MLE():
 538 |     model_path = os.path.join('./models', 'model_MLP-486')
 539 |     n_words = len(idx_to_word)
 540 | 
 541 |     # version 1: test all validation images
 542 |     val_feats_path = './inception/val_feats'
 543 |     val_feats_names = glob.glob(val_feats_path + '/*.npy')
 544 |     val_images_names = map(lambda x: os.path.basename(x)[0:-4], val_feats_names)
 545 | 
 546 |     # version 2: test only in the 1665 validation images
 547 |     #val_feats_path = './inception/val_feats_v2'
 548 |     #with open('./data/val_images_captions.pkl', 'r') as fr_1:
 549 |     #    val_images_names = pickle.load(fr_1).keys()
 550 | 
 551 |     model = CNN_LSTM(n_words = n_words,
 552 |                      batch_size = batch_size,
 553 |                      feats_dim = feats_dim,
 554 |                      project_dim = project_dim,
 555 |                      lstm_size = lstm_size,
 556 |                      word_embed_dim = word_embed_dim,
 557 |                      lstm_step = lstm_step,
 558 |                      bias_init_vector = None)
 559 |     tf_images, tf_sentences = model.generate_model()
 560 |     sess = tf.InteractiveSession()
 561 |     saver = tf.train.Saver()
 562 |     saver.restore(sess, model_path)
 563 | 
 564 |     fw_1 = open("val2014_results_model_MLP-486.txt", 'w')
 565 |     for idx, img_name in enumerate(val_images_names):
 566 |         print "{},  {}".format(idx, img_name)
 567 |         start_time = time.time()
 568 | 
 569 |         current_feats = np.load( os.path.join(val_feats_path, img_name+'.npy') )
 570 |         current_feats = np.reshape(current_feats, [1, feats_dim])
 571 | 
 572 |         sentences_index = sess.run(tf_sentences, feed_dict={tf_images: current_feats})
 573 |         #sentences = map(lambda x: idx_to_word[x], sentences_index)
 574 |         sentences = []
 575 |         for idx_word in sentences_index:
 576 |             word = idx_to_word[idx_word]
 577 |             word = word.replace('\n', '')
 578 |             word = word.replace('\\', '')
 579 |             word = word.replace('"', '')
 580 |             sentences.append(word)
 581 | 
 582 |         punctuation = np.argmax(np.array(sentences) == '<eos>') + 1
 583 |         sentences = sentences[:punctuation]
 584 |         generated_sentence = ' '.join(sentences)
 585 |         generated_sentence = generated_sentence.replace('<bos> ', '')
 586 |         generated_sentence = generated_sentence.replace(' <eos>', '')
 587 | 
 588 |         print generated_sentence,'\n'
 589 |         fw_1.write(img_name + '\n')
 590 |         fw_1.write(generated_sentence + '\n')
 591 |     fw_1.close()
 592 | 
 593 | 
 594 | ##########################################################################################################
 595 | #
 596 | # Step 3: Train B_phi using MC estimates of Q_\theta on a small subset of Dataset D
 597 | #
 598 | ##########################################################################################################
 599 | #import create_json_reference
 600 | 
 601 | #epochs_Bphi_with_MC = 1000
 602 | 
 603 | # I select 1665 images in the val set which saved in ./data: "val_images_captions.pkl",
 604 | # to train the B_phi, here is the reference json file path
 605 | #refer_1665_save_path = './data/reference_1665.json'
 606 | 
 607 | #eval_ids_to_imgNames_save_path = './data/eval_ids_to_imgNames.pkl'
 608 | 
 609 | def Sample_Q_with_MC():
 610 |     model_path = os.path.join('./models', 'model_MLP-200')
 611 | 
 612 |     n_words = len(idx_to_word)
 613 | 
 614 |     val_images_names = val_images_captions.keys()
 615 | 
 616 |     print "Begin compute Q rewards of {} images...".format(len(val_images_names))
 617 | 
 618 |     # create_json_reference.py
 619 |     # create_refer(train_images_captions_path, train_images_names, refer_1665_save_path)
 620 |     #create_json_reference.create_refer(val_images_captions_path, val_images_names, refer_1665_save_path)
 621 | 
 622 |     #with open(eval_ids_to_imgNames_save_path, 'r') as fr_1:
 623 |     #    eval_ids_to_imgNames = pickle.load(fr_1)
 624 |     #eval_imgNames_to_ids = {}
 625 |     #for key, val in eval_ids_to_imgNames.iteritems():
 626 |     #    eval_imgNames_to_ids[val] = key
 627 | 
 628 |     #with open('./data/train_images_captions_index.pkl', 'r') as fr_2:
 629 |     #    train_images_captions_index = pickle.load(fr_2)
 630 | 
 631 |     # open the dict that map the image names to image ids
 632 |     with open('./data/train_val_imageNames_to_imageIDs.pkl', 'r') as fr:
 633 |         train_val_imageNames_to_imageIDs = pickle.load(fr)
 634 | 
 635 |     model = CNN_LSTM(n_words = n_words,
 636 |                      batch_size = 1,
 637 |                      feats_dim = feats_dim,
 638 |                      project_dim = project_dim,
 639 |                      lstm_size = lstm_size,
 640 |                      word_embed_dim = word_embed_dim,
 641 |                      lstm_step = lstm_step,
 642 |                      bias_init_vector = bias_init_vector)
 643 | 
 644 |     tf_images, tf_gen_sentences, tf_all_sentences = model.Monte_Carlo_Rollout()
 645 |     sess = tf.Session()
 646 |     saver = tf.train.Saver()
 647 |     saver.restore(sess, model_path)
 648 | 
 649 |     all_images_Q_rewards = {}
 650 |     for idx, img_name in enumerate(val_images_names):
 651 |         print("current image idx: {},  {}".format(idx, img_name))
 652 |         start_time = time.time()
 653 | 
 654 |         # Load reference json file
 655 |         annFile = './train_val_reference_json/' + img_name + '.json'
 656 |         coco = COCO(annFile)
 657 | 
 658 |         all_images_Q_rewards[img_name] = {}
 659 |         current_image_rewards = all_images_Q_rewards[img_name]
 660 |         current_image_rewards['Bleu_4'] = []
 661 |         current_image_rewards['Bleu_3'] = []
 662 |         current_image_rewards['Bleu_2'] = []
 663 |         current_image_rewards['Bleu_1'] = []
 664 | 
 665 |         current_feats = np.load(os.path.join(val_feats_path, img_name+'.npy'))
 666 |         current_feats = np.reshape(current_feats, [1, feats_dim])
 667 | 
 668 |         gen_sents_index, all_sample_sents = sess.run([tf_gen_sentences, tf_all_sentences], feed_dict={tf_images: current_feats})
 669 |         gen_sents = []
 670 |         for item in gen_sents_index:
 671 |             tmp_word = idx_to_word[item]
 672 |             tmp_word = tmp_word.replace('\\', '')
 673 |             tmp_word = tmp_word.replace('\n', '')
 674 |             tmp_word = tmp_word.replace('"', '')
 675 |             gen_sents.append(tmp_word)
 676 |         gen_sents_list = gen_sents
 677 |         punctuation = np.argmax(np.array(gen_sents) == '<eos>') + 1
 678 |         gen_sents = gen_sents[:punctuation]
 679 |         gen_sents = ' '.join(gen_sents)
 680 |         gen_sents = gen_sents.replace(' <eos>', '')
 681 |         gen_sents = gen_sents.replace(' ,', ',')
 682 |         print "\ngenerated sentences: {}".format(gen_sents)
 683 |         
 684 |         for i_s, samples in enumerate(all_sample_sents):
 685 |             print "\n=========================================================================="
 686 |             print "{} / {}".format(i_s, len(all_sample_sents))
 687 | 
 688 |             samples = np.asarray(samples)
 689 |             sample_sent_1 = []; sample_sent_2 = []; sample_sent_3 = []
 690 | 
 691 |             for each_gen_sents_word in gen_sents_list[0: (i_s+1)]:
 692 |                 sample_sent_1.append(each_gen_sents_word)
 693 |                 sample_sent_2.append(each_gen_sents_word)
 694 |                 sample_sent_3.append(each_gen_sents_word)
 695 | 
 696 |             for j_s in range(samples.shape[0]):
 697 |                 word_1, word_2, word_3 = idx_to_word[samples[j_s, 0]], idx_to_word[samples[j_s, 1]], idx_to_word[samples[j_s, 2]]
 698 |                 word_1, word_2, word_3 = word_1.replace('\n', ''), word_2.replace('\n', ''), word_3.replace('\n', '')
 699 |                 word_1, word_2, word_3 = word_1.replace('"', ''), word_2.replace('"', ''), word_3.replace('"', '')
 700 |                 word_1, word_2, word_3 = word_1.replace('\\', ''), word_2.replace('\\', ''), word_3.replace('\\', '')
 701 |                 sample_sent_1.append(word_1)
 702 |                 sample_sent_2.append(word_2)
 703 |                 sample_sent_3.append(word_3)
 704 | 
 705 |             sample_sent_1.append('<eos>')
 706 |             sample_sent_2.append('<eos>')
 707 |             sample_sent_3.append('<eos>')
 708 | 
 709 |             three_sample_sents = [sample_sent_1, sample_sent_2, sample_sent_3]
 710 | 
 711 |             three_sample_rewards = {}
 712 |             three_sample_rewards['Bleu_1'] = 0.0
 713 |             three_sample_rewards['Bleu_2'] = 0.0
 714 |             three_sample_rewards['Bleu_3'] = 0.0
 715 |             three_sample_rewards['Bleu_4'] = 0.0
 716 | 
 717 |             for ii, each_sample_sent in enumerate(three_sample_sents):
 718 |                 if ' ' in each_sample_sent:
 719 |                     each_sample_sent.remove(' ') # remove the space element in a list!
 720 | 
 721 |                 print "sample sentence {},  {}".format(ii, each_sample_sent)
 722 | 
 723 |                 punctuation = np.argmax(np.array(each_sample_sent) == '<eos>') + 1
 724 |                 each_sample_sent = each_sample_sent[:punctuation]
 725 |                 each_sample_sent = ' '.join(each_sample_sent)
 726 |                 each_sample_sent = each_sample_sent.replace(' <eos>', '')
 727 |                 each_sample_sent = each_sample_sent.replace(' ,', ',')
 728 |                 print each_sample_sent
 729 |                 fw_1 = open("./data/results_MC.json", 'w')
 730 |                 fw_1.write('[{"image_id": ' + str(train_val_imageNames_to_imageIDs[img_name]) + ', "caption": "' + each_sample_sent + '"}]')
 731 |                 fw_1.close()
 732 | 
 733 |                 #annFile = './data/reference_1665.json'
 734 |                 resFile = './data/results_MC.json'
 735 |                 #coco = COCO(annFile)
 736 |                 cocoRes = coco.loadRes(resFile)
 737 |                 cocoEval = COCOEvalCap(coco, cocoRes)
 738 |                 cocoEval.params['image_id'] = cocoRes.getImgIds()
 739 |                 cocoEval.evaluate()
 740 | 
 741 |                 for metric, score in cocoEval.eval.items():
 742 |                     print '%s: %.3f'%(metric, score)
 743 |                     if metric == 'Bleu_1':
 744 |                         three_sample_rewards['Bleu_1'] += score
 745 |                     if metric == 'Bleu_2':
 746 |                         three_sample_rewards['Bleu_2'] += score
 747 |                     if metric == 'Bleu_3':
 748 |                         three_sample_rewards['Bleu_3'] += score
 749 |                     if metric == 'Bleu_4':
 750 |                         three_sample_rewards['Bleu_4'] += score
 751 | 
 752 |             current_image_rewards['Bleu_1'].append(three_sample_rewards['Bleu_1']/3.0)
 753 |             current_image_rewards['Bleu_2'].append(three_sample_rewards['Bleu_2']/3.0)
 754 |             current_image_rewards['Bleu_3'].append(three_sample_rewards['Bleu_3']/3.0)
 755 |             current_image_rewards['Bleu_4'].append(three_sample_rewards['Bleu_4']/3.0)
 756 | 
 757 |         # If be in a terminal state, we define Q(g_{1:T}, EOS) = R(g_{1:T})
 758 |         fw_1 = open("./data/results_MC.json", 'w')
 759 |         fw_1.write('[{"image_id": ' + str(train_val_imageNames_to_imageIDs[img_name]) + ', "caption": "' + gen_sents + '"}]')
 760 |         fw_1.close()
 761 |         #annFile = './data/reference_1665.json'
 762 |         resFile = './data/results_MC.json'
 763 |         #coco = COCO(annFile)
 764 |         cocoRes = coco.loadRes(resFile)
 765 |         cocoEval = COCOEvalCap(coco, cocoRes)
 766 |         cocoEval.params['image_id'] = cocoRes.getImgIds()
 767 |         cocoEval.evaluate()
 768 |         for metric, score in cocoEval.eval.items():
 769 |             print '%s: %.3f'%(metric, score)
 770 |             if metric == 'Bleu_1':
 771 |                 current_image_rewards['Bleu_1'].append(score)
 772 |             if metric == 'Bleu_2':
 773 |                 current_image_rewards['Bleu_2'].append(score)
 774 |             if metric == 'Bleu_3':
 775 |                 current_image_rewards['Bleu_3'].append(score)
 776 |             if metric == 'Bleu_4':
 777 |                 current_image_rewards['Bleu_4'].append(score)
 778 |         print "Time cost: {}".format(time.time()-start_time)
 779 | 
 780 |     with open('./data/all_images_Q_rewards.pkl', 'w') as fw_1:
 781 |         pickle.dump(all_images_Q_rewards, fw_1)
 782 | 
 783 | def Train_Bphi_Model():
 784 |     n_words = len(idx_to_word)
 785 | 
 786 |     with open('./data/all_images_Q_rewards.pkl', 'r') as fr_3:
 787 |         all_images_Q_rewards = pickle.load(fr_3)
 788 | 
 789 |     subset_images_names = all_images_Q_rewards.keys()
 790 | 
 791 |     model = CNN_LSTM(n_words = n_words,
 792 |                      batch_size = 1,
 793 |                      feats_dim = feats_dim,
 794 |                      project_dim = project_dim,
 795 |                      lstm_size = lstm_size,
 796 |                      word_embed_dim = word_embed_dim,
 797 |                      lstm_step = lstm_step,
 798 |                      bias_init_vector = bias_init_vector)
 799 | 
 800 |     Bphi_tf_images, Bphi_tf_Bleu_1, Bphi_tf_Bleu_2, Bphi_tf_Bleu_3, Bphi_tf_Bleu_4, Bphi_tf_loss = model.train_Bphi_model()
 801 | 
 802 |     train_op = tf.train.GradientDescentOptimizer(learning_rate).minimize(Bphi_tf_loss)
 803 |     sess = tf.InteractiveSession()
 804 |     #tf.initialize_all_variables().run()
 805 |     new_saver = tf.train.Saver(max_to_keep=500)
 806 |     #new_saver = tf.train.import_meta_graph('./models/model-32.meta')
 807 |     #new_saver.restore(sess, tf.train.latest_checkpoint('./models'))
 808 |     new_saver.restore(sess, './models/model-50')
 809 | 
 810 |     loss_to_draw = []
 811 |     for epoch in range(0, epochs_Bphi_with_MC):
 812 |         loss_to_draw_epoch = []
 813 |         random.shuffle(subset_images_names)
 814 | 
 815 |         for start, end in zip(range(0, len(subset_images_names), 1),
 816 |                               range(1, len(subset_images_names), 1)):
 817 |             start_time_batch = time.time()
 818 | 
 819 |             current_feats = []
 820 | 
 821 |             # Bleu_1, Bleu_2, Bleu_3, Bleu_4
 822 |             current_Bleu_1 = []
 823 |             current_Bleu_2 = []
 824 |             current_Bleu_3 = []
 825 |             current_Bleu_4 = []
 826 | 
 827 |             img_names = subset_images_names[start:end]
 828 |             for each_img_name in img_names:
 829 |                 img_feat = np.load(os.path.join(train_val_feats_path, each_img_name+'.npy'))
 830 |                 current_feats.append(img_feat)
 831 | 
 832 |                 current_Bleu_1.append(all_images_Q_rewards[each_img_name]['Bleu_1'])
 833 |                 current_Bleu_2.append(all_images_Q_rewards[each_img_name]['Bleu_2'])
 834 |                 current_Bleu_3.append(all_images_Q_rewards[each_img_name]['Bleu_3'])
 835 |                 current_Bleu_4.append(all_images_Q_rewards[each_img_name]['Bleu_4'])
 836 | 
 837 |             current_feats = np.asarray(current_feats, dtype=np.float32)
 838 |             current_Bleu_1 = np.asarray(current_Bleu_1, dtype=np.float32)
 839 |             current_Bleu_2 = np.asarray(current_Bleu_2, dtype=np.float32)
 840 |             current_Bleu_3 = np.asarray(current_Bleu_3, dtype=np.float32)
 841 |             current_Bleu_4 = np.asarray(current_Bleu_4, dtype=np.float32)
 842 | 
 843 |             _, loss_val = sess.run([train_op, Bphi_tf_loss],
 844 |                                    feed_dict = {Bphi_tf_images: current_feats,
 845 |                                                 Bphi_tf_Bleu_1: current_Bleu_1,
 846 |                                                 Bphi_tf_Bleu_2: current_Bleu_2,
 847 |                                                 Bphi_tf_Bleu_3: current_Bleu_3,
 848 |                                                 Bphi_tf_Bleu_4: current_Bleu_4
 849 |                                        })
 850 | 
 851 |             loss_to_draw_epoch.append(loss_val[0,0])
 852 |             print "idx: {}  epoch: {}  loss: {}  Time cost: {}".format(start, epoch, loss_val[0,0], time.time() - start_time_batch)
 853 | 
 854 |         loss_to_draw.append(np.mean(loss_to_draw_epoch))
 855 |         plt_save_img_name = 'Bphi_train_' + str(epoch) + '.png'
 856 |         plt.plot(range(len(loss_to_draw)), loss_to_draw, color='g')
 857 |         plt.grid(True)
 858 |         plt.savefig(os.path.join('./loss_imgs', plt_save_img_name))
 859 | 
 860 |         if np.mod(epoch, 2) == 0:
 861 |             print "Epoch ", epoch, " is done. Saving the model ..."
 862 |             new_saver.save(sess, os.path.join('./models', 'Bphi_train_model'), global_step=epoch)
 863 | 
 864 | 
 865 | ##############################################################################################################
 866 | #
 867 | # Step 4:  go through all the images in D, SGD update of \theta, \phi
 868 | #
 869 | ##############################################################################################################
 870 | def Train_SGD_update():
 871 |     model_path = os.path.join('./models', 'Bphi_train_model-84')
 872 |     batch_num_images = 100 # 100
 873 |     epoches = n_epochs # 500
 874 |     n_words = len(idx_to_word)
 875 |     train_images_names = train_images_captions.keys()
 876 | 
 877 |     # open the dict that map the image names to image ids
 878 |     with open('./data/train_val_imageNames_to_imageIDs.pkl', 'r') as fr:
 879 |         train_val_imageNames_to_imageIDs = pickle.load(fr)
 880 | 
 881 |     # Load COCO reference json file
 882 |     annFile = './data/train_val_all_reference.json'
 883 |     coco = COCO(annFile)
 884 | 
 885 |     # model initialization
 886 |     model = CNN_LSTM(n_words = n_words,
 887 |                      batch_size = batch_num_images,
 888 |                      feats_dim = feats_dim,
 889 |                      project_dim = project_dim,
 890 |                      lstm_size = lstm_size,
 891 |                      word_embed_dim = word_embed_dim,
 892 |                      lstm_step = lstm_step,
 893 |                      bias_init_vector = bias_init_vector)
 894 | 
 895 |     # The first model is used to generate sample sentences and Baselines.
 896 |     # Then we use the sample sentences and coco caption API to compute the Q_rewards.
 897 |     # And the second model is used to transfer the Q_rewards, Baselines values,
 898 |     # the loss function is \sum(log(max_probability) * rewards)
 899 |     tf_images, tf_gen_sents_index, tf_all_sample_sents, tf_all_baselines = model.Monte_Carlo_and_Baseline()
 900 |     tf_images_2, tf_Q_rewards, tf_Baselines, tf_loss, tf_max_prob, tf_current_rewards, tf_logit_words = model.SGD_update(batch_num_images=1000)
 901 | 
 902 |     train_op = tf.train.GradientDescentOptimizer(learning_rate).minimize(tf_loss)
 903 |     sess = tf.InteractiveSession()
 904 |     saver = tf.train.Saver()
 905 |     saver.restore(sess, model_path)
 906 |     #tf.initialize_all_variables().run()
 907 | 
 908 |     # save every epoch loss value in loss_to_draw
 909 |     loss_to_draw = []
 910 |     for epoch in range(0, epoches):
 911 |         # save every batch loss value in loss_to_draw_epoch
 912 |         loss_to_draw_epoch = []
 913 | 
 914 |         # shuffle the order of images randomly
 915 |         random.shuffle(train_images_names)
 916 | 
 917 |         # store rewards of all the training images
 918 |         train_val_images_Q_rewards = {}
 919 | 
 920 |         for start, end in zip(range(0, len(train_images_names), batch_num_images),
 921 |                               range(batch_num_images, len(train_images_names), batch_num_images)):
 922 |             start_time = time.time()
 923 | 
 924 |             img_names = train_images_names[start:end]
 925 |             current_feats = []
 926 |             for img_name in img_names:
 927 |                 tmp_feats = np.load(os.path.join(train_val_feats_path, img_name+'.npy'))
 928 |                 current_feats.append(tmp_feats)
 929 |             current_feats = np.asarray(current_feats)
 930 | 
 931 |             # store rewards of all the training images
 932 |             #train_val_images_Q_rewards = {}
 933 |             #ONE IMAGE: for idx, img_name in enumerate(train_images_names):
 934 |             #ONE IMAGE: print "{},  {}".format(idx, img_name)
 935 |             #ONE IMAGE: start_time = time.time()
 936 |             current_batch_rewards = {}
 937 |             current_batch_rewards['Bleu_1'] = []
 938 |             current_batch_rewards['Bleu_2'] = []
 939 |             current_batch_rewards['Bleu_3'] = []
 940 |             current_batch_rewards['Bleu_4'] = []
 941 | 
 942 |             # weighted sum
 943 |             sum_image_rewards = []
 944 |             Bleu_1_weight = 0.5
 945 |             Bleu_2_weight = 0.5
 946 |             Bleu_3_weight = 1.0
 947 |             Bleu_4_weight = 1.0
 948 | 
 949 |             #ONE IMAGE: current_feats = np.load(os.path.join(train_val_feats_path, img_name+'.npy'))
 950 |             #ONE IMAGE: current_feats = np.reshape(current_feats, [1, feats_dim])
 951 |             
 952 |             
 953 |             ###################################################################################################################################
 954 |             # 
 955 |             # Below, for the current 100 images, we compute Q(g1:t-1, gt) for gt with K Monte Carlo rollouts, using Equation (6)
 956 |             # Meanwhile, we compute estimated baseline B_phi(g1:t-1)
 957 |             #
 958 |             ###################################################################################################################################
 959 |             feed_dict = {tf_images: current_feats}
 960 |             gen_sents_index, all_sample_sents, all_baselines = sess.run([tf_gen_sents_index, tf_all_sample_sents, tf_all_baselines], feed_dict)
 961 |             
 962 |             # 100 sentences, every sentence has 30 words, thus its shape is 100 x 30
 963 |             batch_sentences = []
 964 |             for tmp_i in range(0, batch_num_images):
 965 |                 single_sentences = []
 966 |                 for tmp_j in range(0, len(gen_sents_index)):
 967 |                     word_idx = gen_sents_index[tmp_j][tmp_i]
 968 |                     word = idx_to_word[word_idx]
 969 |                     word = word.replace('\n', '')
 970 |                     word = word.replace('\\', '')
 971 |                     word = word.replace('"', '')
 972 |                     single_sentences.append(word)
 973 |                 batch_sentences.append(single_sentences)
 974 | 
 975 |             #ONE IMAGE: tmp_sentences = map(lambda x: idx_to_word[x], gen_sents_index)
 976 |             #ONE IMAGE: print tmp_sentences
 977 |             #ONE IMAGE: sentences = []
 978 |             #ONE IMAGE: for word in tmp_sentences:
 979 |             #ONE IMAGE:     word = word.replace('\n', '')
 980 |             #ONE IMAGE:     word = word.replace('\\', '')
 981 |             #ONE IMAGE:     word = word.replace('"', '')
 982 |             #ONE IMAGE:     sentences.append(word)
 983 |             
 984 |             batch_sentences_processed = []
 985 |             #gen_sents_list = batch_sentences
 986 |             for tmp_i in range(0, batch_num_images):
 987 |                 tmp_sentences = batch_sentences[tmp_i]
 988 |                 punctuation = np.argmax(np.array(tmp_sentences) == '<eos>') + 1
 989 |                 tmp_sentences = tmp_sentences[:punctuation]
 990 |                 tmp_sentences = ' '.join(tmp_sentences)
 991 |                 tmp_sentences = tmp_sentences.replace('<bos> ', '')
 992 |                 tmp_sentences = tmp_sentences.replace(' <eos>', '')
 993 |                 batch_sentences_processed.append(tmp_sentences)
 994 |                 #print "Idx: {}  Image Name: {}  Gen Sentence: {}".format(tmp_i, img_names[tmp_i], generated_sentence)
 995 |             
 996 |             #ONE IMAGE: gen_sents_list = sentences
 997 |             #ONE IMAGE: punctuation = np.argmax(np.array(sentences) == '<eos>') + 1
 998 |             #ONE IMAGE: sentences = sentences[:punctuation]
 999 |             #ONE IMAGE: generated_sentence = ' '.join(sentences)
1000 |             #ONE IMAGE: generated_sentence = generated_sentence.replace('<bos> ', '')
1001 |             #ONE IMAGE: generated_sentence = generated_sentence.replace(' <eos>', '')
1002 |             #ONE IMAGE: print "Generated sentences: {}".format(generated_sentence)
1003 |             
1004 |             # 0, 1, 2, ..., 28, the 30th is computed by the whole generated sentences
1005 |             for time_step in range(0, lstm_step-1):
1006 |                 print "\n===================================================================================================="
1007 |                 print "Time step:  {} \n".format(time_step)
1008 |                 batch_samples = all_sample_sents[time_step]
1009 |                 batch_samples = np.asarray(batch_samples)
1010 | 
1011 |                 batch_sample_sents_1 = []
1012 |                 batch_sample_sents_2 = []
1013 |                 batch_sample_sents_3 = []
1014 |                 # store the sample sentences, each sample list has 100 images' sentences
1015 |                 for img_idx in range(0, batch_num_images):
1016 |                     batch_sample_sents_1.append([])
1017 |                     batch_sample_sents_2.append([])
1018 |                     batch_sample_sents_3.append([])
1019 | 
1020 |                 # 0, 1, 2, ..., 99
1021 |                 for img_idx in range(0, batch_num_images):
1022 |                     for each_gen_sents_word in batch_sentences[img_idx][0:time_step+1]:
1023 |                         each_gen_sents_word = each_gen_sents_word.replace('\n', '')
1024 |                         each_gen_sents_word = each_gen_sents_word.replace('\\', '')
1025 |                         each_gen_sents_word = each_gen_sents_word.replace('"', '')
1026 |                         batch_sample_sents_1[img_idx].append(each_gen_sents_word)
1027 |                         batch_sample_sents_2[img_idx].append(each_gen_sents_word)
1028 |                         batch_sample_sents_3[img_idx].append(each_gen_sents_word)
1029 | 
1030 |                 # 0, 1, 2, ..., 99
1031 |                 for img_idx in range(0, batch_num_images):
1032 |                     for tmp_i in range(0, batch_samples.shape[0]):
1033 |                         word_1 = idx_to_word[batch_samples[tmp_i, img_idx, 0]]
1034 |                         word_2 = idx_to_word[batch_samples[tmp_i, img_idx, 1]]
1035 |                         word_3 = idx_to_word[batch_samples[tmp_i, img_idx, 2]]
1036 |                         word_1, word_2, word_3 = word_1.replace('\n', ''), word_2.replace('\n', ''), word_3.replace('\n', '')
1037 |                         word_1, word_2, word_3 = word_1.replace('\\', ''), word_2.replace('\\', ''), word_3.replace('\\', '')
1038 |                         word_1, word_2, word_3 = word_1.replace('"', ''), word_2.replace('"', ''), word_3.replace('"', '')
1039 | 
1040 |                         batch_sample_sents_1[img_idx].append(word_1)
1041 |                         batch_sample_sents_2[img_idx].append(word_2)
1042 |                         batch_sample_sents_3[img_idx].append(word_3)
1043 |                     batch_sample_sents_1[img_idx].append('<eos>')
1044 |                     batch_sample_sents_2[img_idx].append('<eos>')
1045 |                     batch_sample_sents_3[img_idx].append('<eos>')
1046 | 
1047 |                 batch_three_sample_sents = [batch_sample_sents_1, batch_sample_sents_2, batch_sample_sents_3]
1048 |                 three_sample_rewards = {}
1049 |                 three_sample_rewards['Bleu_1'] = 0.0
1050 |                 three_sample_rewards['Bleu_2'] = 0.0
1051 |                 three_sample_rewards['Bleu_3'] = 0.0
1052 |                 three_sample_rewards['Bleu_4'] = 0.0
1053 |                 
1054 |                 for tmp_i, batch_sample_sents in enumerate(batch_three_sample_sents):
1055 |                     ######################################################################################
1056 |                     # write the sample sentences of current 100 images
1057 |                     ######################################################################################
1058 |                     fw_1 = open("./data/results_batch_sample_sents.json", 'w')
1059 |                     fw_1.write('[')
1060 |                     
1061 |                     for img_idx in range(0, batch_num_images):
1062 |                         if ' ' in batch_sample_sents[img_idx]:
1063 |                             batch_sample_sents[img_idx].remove(' ')
1064 |                         
1065 |                         punctuation = np.argmax(np.array(batch_sample_sents[img_idx]) == '<eos>') + 1
1066 |                         batch_sample_sents[img_idx] = batch_sample_sents[img_idx][:punctuation]
1067 |                         batch_sample_sents[img_idx] = ' '.join(batch_sample_sents[img_idx])
1068 |                         batch_sample_sents[img_idx] = batch_sample_sents[img_idx].replace(' <eos>', '')
1069 |                         batch_sample_sents[img_idx] = batch_sample_sents[img_idx].replace(' ,', ',')
1070 |                         
1071 |                         if img_idx != batch_num_images-1:
1072 |                             fw_1.write('{"image_id": ' + str(train_val_imageNames_to_imageIDs[img_names[img_idx]]) + ', "caption": "' + batch_sample_sents[img_idx] + '"}, ')
1073 |                         else:
1074 |                             fw_1.write('{"image_id": ' + str(train_val_imageNames_to_imageIDs[img_names[img_idx]]) + ', "caption": "' + batch_sample_sents[img_idx] + '"}]')
1075 |                     fw_1.close()
1076 |                     
1077 |                     ########################################################################################
1078 |                     # compute the Bleu1,2,3,4 score using current 100 images
1079 |                     ########################################################################################
1080 |                     #annFile = './data/train_val_all_reference.json'
1081 |                     resFile = './data/results_batch_sample_sents.json'
1082 |                     #coco = COCO(annFile)
1083 |                     cocoRes = coco.loadRes(resFile)
1084 |                     cocoEval = COCOEvalCap(coco, cocoRes)
1085 |                     cocoEval.params['image_id'] = cocoRes.getImgIds()
1086 |                     cocoEval.evaluate()
1087 |                     for metric, score in cocoEval.eval.items():
1088 |                         if metric == 'Bleu_1':
1089 |                             three_sample_rewards['Bleu_1'] += score
1090 |                         if metric == 'Bleu_2':
1091 |                             three_sample_rewards['Bleu_2'] += score
1092 |                         if metric == 'Bleu_3':
1093 |                             three_sample_rewards['Bleu_3'] += score
1094 |                         if metric == 'Bleu_4':
1095 |                             three_sample_rewards['Bleu_4'] += score
1096 |             
1097 |                 current_batch_rewards['Bleu_1'].append(three_sample_rewards['Bleu_1']/3.0)
1098 |                 current_batch_rewards['Bleu_2'].append(three_sample_rewards['Bleu_2']/3.0)
1099 |                 current_batch_rewards['Bleu_3'].append(three_sample_rewards['Bleu_3']/3.0)
1100 |                 current_batch_rewards['Bleu_4'].append(three_sample_rewards['Bleu_4']/3.0)
1101 | 
1102 |             #####################################################################################################
1103 |             # compute the 30th rewards of the current 100 images
1104 |             #####################################################################################################
1105 |             fw_2 = open("./data/results_batch_sample_sents.json", 'w')
1106 |             fw_2.write('[')
1107 |             for img_idx in range(0, batch_num_images):
1108 |                 if img_idx != batch_num_images-1:
1109 |                     fw_2.write('{"image_id": ' + str(train_val_imageNames_to_imageIDs[img_names[img_idx]]) + ', "caption": "' + batch_sentences_processed[img_idx] + '"}, ')
1110 |                 else:
1111 |                     fw_2.write('{"image_id": ' + str(train_val_imageNames_to_imageIDs[img_names[img_idx]]) + ', "caption": "' + batch_sentences_processed[img_idx] + '"}]')
1112 |             fw_2.close()
1113 |             #annFile = './data/train_val_all_reference.json'
1114 |             resFile = './data/results_batch_sample_sents.json'
1115 |             #coco = COCO(annFile)
1116 |             cocoRes = coco.loadRes(resFile)
1117 |             cocoEval = COCOEvalCap(coco, cocoRes)
1118 |             cocoEval.params['image_id'] = cocoRes.getImgIds()
1119 |             cocoEval.evaluate()
1120 |             for metric, score in cocoEval.eval.items():
1121 |                 if metric == 'Bleu_1':
1122 |                     current_batch_rewards['Bleu_1'].append(score)
1123 |                 if metric == 'Bleu_2':
1124 |                     current_batch_rewards['Bleu_2'].append(score)
1125 |                 if metric == 'Bleu_3':
1126 |                     current_batch_rewards['Bleu_3'].append(score)
1127 |                 if metric == 'Bleu_4':
1128 |                     current_batch_rewards['Bleu_4'].append(score)
1129 |            
1130 |             # compute the weight sum of Bleu value as rewards
1131 |             for tmp_idx in range(0, lstm_step):
1132 |                 tmp_reward = current_batch_rewards['Bleu_1'][tmp_idx] * Bleu_1_weight + \
1133 |                              current_batch_rewards['Bleu_2'][tmp_idx] * Bleu_2_weight + \
1134 |                              current_batch_rewards['Bleu_3'][tmp_idx] * Bleu_3_weight + \
1135 |                              current_batch_rewards['Bleu_4'][tmp_idx] * Bleu_4_weight
1136 |                 sum_image_rewards.append(tmp_reward)
1137 |             sum_image_rewards = np.asarray(sum_image_rewards)
1138 |             #sum_image_rewards = np.reshape(sum_image_rewards, [1, lstm_step])
1139 |             sum_image_rewards = np.array([sum_image_rewards, ] * batch_num_images)
1140 | 
1141 |             all_baselines = np.asarray(all_baselines)
1142 |             all_baselines = np.reshape(all_baselines, [batch_num_images, lstm_step])
1143 |             #all_baselines_mean = np.mean(all_baselines, axis=0)
1144 |             #all_baselines = np.array([all_baselines_mean,] * batch_num_images)
1145 |             feed_dict = {tf_images_2: current_feats, tf_Q_rewards: sum_image_rewards, tf_Baselines: all_baselines}
1146 |             _, loss_value, max_prob, current_rewards, logit_words = sess.run([train_op, tf_loss, tf_max_prob, tf_current_rewards, tf_logit_words], feed_dict)
1147 |             #ipdb.set_trace()
1148 |             loss_to_draw_epoch.append(loss_value)
1149 |             print "idx: {}  epoch: {}  loss: {}  Time cost: {}".format(start, epoch, loss_value, time.time()-start_time)
1150 | 
1151 |         # draw loss curve every epoch
1152 |         loss_to_draw.append(np.mean(loss_to_draw_epoch))
1153 |         plt_save_img_name = 'SGD_update_' + str(epoch) + '.png'
1154 |         plt.plot(range(len(loss_to_draw)), loss_to_draw, color='g')
1155 |         plt.grid(True)
1156 |         plt.savefig(os.path.join(loss_images_save_path, plt_save_img_name))
1157 | 
1158 |         if np.mod(epoch, 1) == 0:
1159 |             print "Epoch ", epoch, " is done. Saving the model ..."
1160 |             saver.save(sess, os.path.join('./models', 'SGD_update_model'), global_step=epoch)
1161 | 
1162 |             #ONE IMAGE: # compute the 29 rewards using all_sample_sents
1163 |             #ONE IMAGE: # the 30th reward is computed with gen_sents_list
1164 |             #ONE IMAGE: for t in range(0, lstm_step-1):
1165 |             #ONE IMAGE:     samples = all_sample_sents[t]
1166 |             #ONE IMAGE:     samples = np.asarray(samples)
1167 | 
1168 |             #ONE IMAGE:     sample_sent_1 = []
1169 |             #ONE IMAGE:     sample_sent_2 = []
1170 |             #ONE IMAGE:     sample_sent_3 = []
1171 |             #ONE IMAGE:     for each_gen_sents_word in gen_sents_list[0:t+1]:
1172 |             #ONE IMAGE:         sample_sent_1.append(each_gen_sents_word)
1173 |             #ONE IMAGE:         sample_sent_2.append(each_gen_sents_word)
1174 |             #ONE IMAGE:         sample_sent_3.append(each_gen_sents_word)
1175 | 
1176 |             #ONE IMAGE:     for i in range(samples.shape[0]):
1177 |             #ONE IMAGE:         word_1, word_2, word_3 = idx_to_word[samples[i, 0]], idx_to_word[samples[i, 1]], idx_to_word[samples[i, 2]]
1178 | 
1179 |             #ONE IMAGE:         word_1, word_2, word_3 = word_1.replace('\n', ''), word_2.replace('\n', ''), word_3.replace('\n', '')
1180 |             #ONE IMAGE:         word_1, word_2, word_3 = word_1.replace('\\', ''), word_2.replace('\\', ''), word_3.replace('\\', '')
1181 |             #ONE IMAGE:         word_1, word_2, word_3 = word_1.replace('"', ''), word_2.replace('"', ''), word_3.replace('"', '')
1182 | 
1183 |             #ONE IMAGE:         sample_sent_1.append(word_1)
1184 |             #ONE IMAGE:         sample_sent_2.append(word_2)
1185 |             #ONE IMAGE:         sample_sent_3.append(word_3)
1186 | 
1187 |             #ONE IMAGE:     sample_sent_1.append('<eos>')
1188 |             #ONE IMAGE:     sample_sent_2.append('<eos>')
1189 |             #ONE IMAGE:     sample_sent_3.append('<eos>')
1190 | 
1191 |             #ONE IMAGE:     three_sample_sents = [sample_sent_1, sample_sent_2, sample_sent_3]
1192 |             #ONE IMAGE:     three_sample_rewards = {}
1193 |             #ONE IMAGE:     three_sample_rewards['Bleu_1'] = 0.0
1194 |             #ONE IMAGE:     three_sample_rewards['Bleu_2'] = 0.0
1195 |             #ONE IMAGE:     three_sample_rewards['Bleu_3'] = 0.0
1196 |             #ONE IMAGE:     three_sample_rewards['Bleu_4'] = 0.0
1197 | 
1198 |             #ONE IMAGE:     for i, each_sample_sent in enumerate(three_sample_sents):
1199 |             #ONE IMAGE:         # remove the space element in a list
1200 |             #ONE IMAGE:         if ' ' in each_sample_sent:
1201 |             #ONE IMAGE:             each_sample_sent.remove(' ')
1202 | 
1203 |             #ONE IMAGE:         punctuation = np.argmax(np.array(each_sample_sent) == '<eos>') + 1
1204 |             #ONE IMAGE:         each_sample_sent = each_sample_sent[:punctuation]
1205 |             #ONE IMAGE:         each_sample_sent = ' '.join(each_sample_sent)
1206 |             #ONE IMAGE:         each_sample_sent = each_sample_sent.replace(' <eos>', '')
1207 |             #ONE IMAGE:         each_sample_sent = each_sample_sent.replace(' ,', ',')
1208 | 
1209 |             #ONE IMAGE:         fw_1 = open("./data/results_each_sample_sent.json", 'w')
1210 |             #ONE IMAGE:         fw_1.write('[{"image_id": ' + str(train_val_imageNames_to_imageIDs[img_name]) + ', "caption": "' + each_sample_sent + '"}]')
1211 |             #ONE IMAGE:         fw_1.close()
1212 | 
1213 |             #ONE IMAGE:         annFile = './train_val_reference_json/' + img_name + '.json'
1214 |             #ONE IMAGE:         resFile = './data/results_each_sample_sent.json'
1215 |             #ONE IMAGE:         coco = COCO(annFile)
1216 |             #ONE IMAGE:         cocoRes = coco.loadRes(resFile)
1217 |             #ONE IMAGE:         cocoEval = COCOEvalCap(coco, cocoRes)
1218 |             #ONE IMAGE:         cocoEval.params['image_id'] = cocoRes.getImgIds()
1219 |             #ONE IMAGE:         cocoEval.evaluate()
1220 |             #ONE IMAGE:         for metric, score in cocoEval.eval.items():
1221 |             #ONE IMAGE:             if metric == 'Bleu_1':
1222 |             #ONE IMAGE:                 three_sample_rewards['Bleu_1'] += score
1223 |             #ONE IMAGE:             if metric == 'Bleu_2':
1224 |             #ONE IMAGE:                 three_sample_rewards['Bleu_2'] += score
1225 |             #ONE IMAGE:             if metric == 'Bleu_3':
1226 |             #ONE IMAGE:                 three_sample_rewards['Bleu_3'] += score
1227 |             #ONE IMAGE:             if metric == 'Bleu_4':
1228 |             #ONE IMAGE:                 three_sample_rewards['Bleu_4'] += score
1229 | 
1230 |             #ONE IMAGE:     current_image_rewards['Bleu_1'].append(three_sample_rewards['Bleu_1']/3.0)
1231 |             #ONE IMAGE:     current_image_rewards['Bleu_2'].append(three_sample_rewards['Bleu_2']/3.0)
1232 |             #ONE IMAGE:     current_image_rewards['Bleu_3'].append(three_sample_rewards['Bleu_3']/3.0)
1233 |             #ONE IMAGE:     current_image_rewards['Bleu_4'].append(three_sample_rewards['Bleu_4']/3.0)
1234 | 
1235 |             #ONE IMAGE: fw_1 = open("./data/results_each_sample_sent.json", 'w')
1236 |             #ONE IMAGE: fw_1.write('[{"image_id": ' + str(train_val_imageNames_to_imageIDs[img_name]) + ', "caption": "' + generated_sentence + '"}]')
1237 |             #ONE IMAGE: fw_1.close()
1238 | 
1239 |             #ONE IMAGE: annFile = './train_val_reference_json/' + img_name + '.json'
1240 |             #ONE IMAGE: resFile = './data/results_each_sample_sent.json'
1241 |             #ONE IMAGE: coco = COCO(annFile)
1242 |             #ONE IMAGE: cocoRes = coco.loadRes(resFile)
1243 |             #ONE IMAGE: cocoEval = COCOEvalCap(coco, cocoRes)
1244 |             #ONE IMAGE: cocoEval.params['image_id'] = cocoRes.getImgIds()
1245 |             #ONE IMAGE: cocoEval.evaluate()
1246 |             #ONE IMAGE: for metric, score in cocoEval.eval.items():
1247 |             #ONE IMAGE:     if metric == 'Bleu_1':
1248 |             #ONE IMAGE:         current_image_rewards['Bleu_1'].append(score)
1249 |             #ONE IMAGE:     if metric == 'Bleu_2':
1250 |             #ONE IMAGE:         current_image_rewards['Bleu_2'].append(score)
1251 |             #ONE IMAGE:     if metric == 'Bleu_3':
1252 |             #ONE IMAGE:         current_image_rewards['Bleu_3'].append(score)
1253 |             #ONE IMAGE:     if metric == 'Bleu_4':
1254 |             #ONE IMAGE:         current_image_rewards['Bleu_4'].append(score)
1255 | 
1256 |             #ONE IMAGE: # save the rewards immediately
1257 |             #ONE IMAGE: train_val_images_Q_rewards[img_name] = current_image_rewards
1258 |             #ONE IMAGE: with open('./data/train_val_images_Q_rewards.pkl', 'w') as fw_2:
1259 |             #ONE IMAGE:     pickle.dump(train_val_images_Q_rewards, fw_2)
1260 | 
1261 |             #ONE IMAGE: # compute the weight sum of Bleu value as rewards
1262 |             #ONE IMAGE: for tmp_idx in range(0, lstm_step):
1263 |             #ONE IMAGE:     tmp_reward = current_image_rewards['Bleu_1'][tmp_idx] * Bleu_1_weight + \
1264 |             #ONE IMAGE:                  current_image_rewards['Bleu_2'][tmp_idx] * Bleu_2_weight + \
1265 |             #ONE IMAGE:                  current_image_rewards['Bleu_3'][tmp_idx] * Bleu_3_weight + \
1266 |             #ONE IMAGE:                  current_image_rewards['Bleu_4'][tmp_idx] * Bleu_4_weight
1267 |             #ONE IMAGE:     sum_image_rewards.append(tmp_reward)
1268 | 
1269 |             #ONE IMAGE: sum_image_rewards = np.asarray(sum_image_rewards)
1270 |             #ONE IMAGE: sum_image_rewards = np.reshape(sum_image_rewards, [1, lstm_step])
1271 |             #ONE IMAGE: all_baselines = np.asarray(all_baselines)
1272 |             #ONE IMAGE: all_baselines = np.reshape(all_baselines, [1, lstm_step])
1273 |             #ONE IMAGE: feed_dict = {tf_images_2: current_feats, tf_Q_rewards: sum_image_rewards, tf_Baselines: all_baselines}
1274 |             #ONE IMAGE: _, loss_value = sess.run([train_op, tf_loss], feed_dict)
1275 | 
1276 |             #ONE IMAGE: loss_to_draw_epoch.append(loss_value)
1277 | 
1278 |             #ONE IMAGE: print "idx: {}  epoch: {}  loss: {}  Time cost: {}".format(idx, epoch, loss_value, time.time()-start_time)
1279 | 
1280 |         #ONE IMAGE: # draw loss curve every epoch
1281 |         #ONE IMAGE: loss_to_draw.append(np.mean(loss_to_draw_epoch))
1282 |         #ONE IMAGE: plt_save_img_name = str(epoch) + '.png'
1283 |         #ONE IMAGE: plt.plot(range(len(loss_to_draw)), loss_to_draw, color='g')
1284 |         #ONE IMAGE: plt.grid(True)
1285 |         #ONE IMAGE: plt.savefig(os.path.join(loss_images_save_path, plt_save_img_name))
1286 | 
1287 |         #ONE IMAGE: if np.mod(epoch, 2) == 0:
1288 |         #ONE IMAGE:     print "Epoch ", epoch, " is done. Saving the model ..."
1289 |         #ONE IMAGE:     saver.save(sess, os.path.join('./models', 'SGD_update_model'), global_step=epoch)
1290 | 
1291 | 
1292 | 
1293 | 


--------------------------------------------------------------------------------
/inception/COCO_val2014_000000320612.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/chenxinpeng/Optimization_of_image_description_metrics_using_policy_gradient_methods/66089304b3dc78a1e27f90e262d0cb17c5bb4cf2/inception/COCO_val2014_000000320612.jpg


--------------------------------------------------------------------------------
/inception/README.md:
--------------------------------------------------------------------------------
1 | Please attetion:
2 | 
3 | 1. First, the original images in MSCOCO have an image which is PNG format (**COCO_val2014_000000320612.jpg**). 
4 | 
5 | 2. Second, when I want to put the training features and validation features into one folder: `train_val_feats`. 
6 |    The number of files is so many that the Linux system can't execute with `cp` command.
7 |    So I use the `copy_train_val_feats.sh` to put the `train_feats`, `val_feats` into one folder: `train_val_feats`.
8 |    
9 | 


--------------------------------------------------------------------------------
/inception/check_NOT_JPEG_IMG.sh:
--------------------------------------------------------------------------------
1 | 
2 | DIR="/home/chenxp/data/mscoco/val2014/*.jpg"
3 | 
4 | for img in $DIR
5 | do 
6 |     file $img >> imageInfo.txt
7 | done
8 | 


--------------------------------------------------------------------------------
/inception/copy_train_val_feats.sh:
--------------------------------------------------------------------------------
1 | #!/bin/bash
2 | 
3 | DIR="./val_feats/*.npy"
4 | 
5 | for feat in $DIR
6 | do
7 |     cp $feat ./train_val_feats
8 | done
9 | 


--------------------------------------------------------------------------------
/inception/extract_inception_bottleneck_feature.py:
--------------------------------------------------------------------------------
 1 | import os
 2 | import glob
 3 | import time
 4 | 
 5 | import tensorflow as tf
 6 | import tensorflow.python.platform
 7 | from tensorflow.python.platform import gfile
 8 | 
 9 | import numpy as np
10 | 
11 | 
12 | def create_graph(model_path):
13 |     """
14 |     create_graph loads the inception model to memory, should be called before
15 |     calling extract_features.
16 | 
17 |     model_path: path to inception model in protobuf form.
18 |     """
19 |     with gfile.FastGFile(model_path, 'rb') as f:
20 |         graph_def = tf.GraphDef()
21 |         graph_def.ParseFromString(f.read())
22 |         _ = tf.import_graph_def(graph_def, name='')
23 | 
24 | 
25 | def extract_features(image_paths, feats_save_path, verbose=False):
26 |     """
27 |     extract_features computed the inception bottleneck feature for a list of images
28 | 
29 |     image_paths: array of image path
30 |     return: 2-d array in the shape of (len(image_paths), 2048)
31 |     """
32 |     #feature_dimension = 2048
33 |     #features = np.empty((len(image_paths), feature_dimension))
34 | 
35 |     with tf.Session() as sess:
36 |         flattened_tensor = sess.graph.get_tensor_by_name('pool_3:0')
37 | 
38 |         for i, image_path in enumerate(image_paths):
39 |             image_basename = os.path.basename(image_path)
40 |             start_time = time.time()
41 | 
42 |             feat_save_path = os.path.join(feats_save_path, image_basename + '.npy')
43 |             if os.path.isfile(feat_save_path):
44 |                 continue
45 | 
46 |             if not gfile.Exists(image_path):
47 |                 tf.logging.fatal('File does not exist %s', image)
48 | 
49 |             image_data = gfile.FastGFile(image_path, 'rb').read()
50 |             feature = sess.run([flattened_tensor], {'DecodeJpeg/contents:0': image_data})
51 |             np.save(feat_save_path, np.squeeze(feature))
52 | 
53 |             if verbose:
54 |                 print('idx: {}  {}  Time cost: {}'.format(i, image_basename, time.time()-start_time))
55 | 
56 | 
57 | if __name__ == "__main__":
58 |     images_path = '/home/chenxp/data/mscoco/test2014'
59 |     feats_save_path = './test_feats'
60 | 
61 |     model_path = 'tensorflow_inception_graph.pb'
62 | 
63 |     images_lists = glob.glob(images_path + '/*.jpg')
64 | 
65 |     create_graph(model_path)
66 |     extract_features(images_lists, feats_save_path, verbose=True)
67 | 


--------------------------------------------------------------------------------
/inception/test_feats/README.md:
--------------------------------------------------------------------------------
1 | 
2 | This folder saves the features of test images.
3 | 


--------------------------------------------------------------------------------
/inception/train_feats/README.md:
--------------------------------------------------------------------------------
1 | 
2 | This folder saves the feature of train images.
3 | 


--------------------------------------------------------------------------------
/inception/train_val_feats/README.md:
--------------------------------------------------------------------------------
1 | 
2 | This folder saves the features of training and validation images.
3 | 


--------------------------------------------------------------------------------
/inception/val_feats/README.md:
--------------------------------------------------------------------------------
1 | 
2 | This folder saves the features of validation images.
3 | 


--------------------------------------------------------------------------------
/pre_train_json.py:
--------------------------------------------------------------------------------
 1 | # encoding: UTF-8
 2 | 
 3 | import os
 4 | import json
 5 | import numpy as np
 6 | import cPickle as pickle
 7 | 
 8 | import time
 9 | import ipdb
10 | 
11 | train_captions_path = './data/captions_train2014.json'
12 | save_images_captions_path = './data/train_images_captions.pkl'
13 | 
14 | train_captions_fo = open(train_captions_path)
15 | train_captions = json.load(train_captions_fo)
16 | 
17 | image_ids = []
18 | for annotation in train_captions['annotations']:
19 |     image_ids.append(annotation['image_id'])
20 | 
21 | # [[filename1, id1], [filename2, id2], ... ]
22 | images_captions = {}
23 | for ii, image in enumerate(train_captions['images']):
24 |     start_time = time.time()
25 | 
26 |     image_file_name = image['file_name']
27 |     image_id = image['id']
28 |     indices = [i for i, x in enumerate(image_ids) if x == image_id]
29 | 
30 |     caption = []
31 |     for idx in indices:
32 |         each_cap = train_captions['annotations'][idx]['caption']
33 |         each_cap = each_cap.lower()
34 |         each_cap = each_cap.replace('.', '')
35 |         each_cap = each_cap.replace(',', ' ,')
36 |         each_cap = each_cap.replace('?', ' ?')
37 |         caption.append(each_cap)
38 |     images_captions[image_file_name] = caption
39 |     print "{}  {}  Each image cost: {}".format(ii, image_file_name, time.time()-start_time)
40 | 
41 | with open(save_images_captions_path, 'w') as fw:
42 |     pickle.dump(images_captions, fw)
43 | 
44 | 
45 | 


--------------------------------------------------------------------------------
/pre_val_json.py:
--------------------------------------------------------------------------------
 1 | # encoding: UTF-8
 2 | 
 3 | import os
 4 | import json
 5 | import numpy as np
 6 | import cPickle as pickle
 7 | 
 8 | import time
 9 | import ipdb
10 | 
11 | train_captions_path = './data/captions_val2014.json'
12 | save_images_captions_path = './data/val_images_captions.pkl'
13 | 
14 | train_captions_fo = open(train_captions_path)
15 | train_captions = json.load(train_captions_fo)
16 | 
17 | image_ids = []
18 | for annotation in train_captions['annotations']:
19 |     image_ids.append(annotation['image_id'])
20 | 
21 | # [[filename1, id1], [filename2, id2], ... ]
22 | images_captions = {}
23 | for ii, image in enumerate(train_captions['images']):
24 |     start_time = time.time()
25 | 
26 |     image_file_name = image['file_name']
27 |     image_id = image['id']
28 |     indices = [i for i, x in enumerate(image_ids) if x == image_id]
29 | 
30 |     caption = []
31 |     for idx in indices:
32 |         each_cap = train_captions['annotations'][idx]['caption']
33 |         each_cap = each_cap.lower()
34 |         each_cap = each_cap.replace('.', '')
35 |         each_cap = each_cap.replace(',', ' ,')
36 |         each_cap = each_cap.replace('?', ' ?')
37 |         caption.append(each_cap)
38 |     images_captions[image_file_name] = caption
39 |     print "{}  {}  Each image cost: {}".format(ii, image_file_name, time.time()-start_time)
40 | 
41 | with open(save_images_captions_path, 'w') as fw:
42 |     pickle.dump(images_captions, fw)
43 | 
44 | 
45 | 


--------------------------------------------------------------------------------
/split_train_val_data.py:
--------------------------------------------------------------------------------
 1 | # encoding: UTF-8
 2 | 
 3 | # accoding the paper: we hold out a small subset of 1,665 validation images
 4 | # for hyper-parameter tuning, and use the remaining combined training and
 5 | # validation set for training
 6 | 
 7 | import os
 8 | import cPickle as pickle
 9 | 
10 | train_images_captions_path = './data/train_images_captions.pkl'
11 | val_images_captions_path = './data/val_images_captions.pkl'
12 | 
13 | with open(train_images_captions_path, 'r') as fr1:
14 |     train_images_captions = pickle.load(fr1)
15 | 
16 | with open(val_images_captions_path, 'r') as fr2:
17 |     val_images_captions = pickle.load(fr2)
18 | 
19 | val_images_names = val_images_captions.keys()
20 | 
21 | # val_images_names[0:1665] for validation
22 | # val_images_names[1665:] for training
23 | val_names_part_one = val_images_names[0:1665]
24 | val_names_part_two = val_images_names[1665:]
25 | 
26 | # re-save the train_images_captions, val_images_captions
27 | val_images_captions_new = {}
28 | for img in val_names_part_one:
29 |     val_images_captions_new[img] = val_images_captions[img]
30 | 
31 | for img in val_names_part_two:
32 |     train_images_captions[img] = val_images_captions[img]
33 | 
34 | with open(train_images_captions_path, 'w') as fw1:
35 |     pickle.dump(train_images_captions, fw1)
36 | 
37 | with open(val_images_captions_path, 'w') as fw2:
38 |     pickle.dump(val_images_captions_new, fw2)
39 | 
40 | 


--------------------------------------------------------------------------------