├── README.md
├── build_vocab.py
├── coco_caption
│   ├── Bleu_1.pkl
│   ├── Bleu_2.pkl
│   ├── Bleu_3.pkl
│   ├── Bleu_4.pkl
│   ├── CIDEr.pkl
│   ├── METEOR.pkl
│   ├── captions_test2014_hitachi_results.json
│   ├── captions_val2014_hitachi_results.json
│   ├── draw.py
│   ├── eval_captions_results.py
│   ├── eval_image_caption.py
│   ├── eval_model.py
│   ├── gen_test_json.py
│   ├── gen_val_json.py
│   ├── model_evalution.png
│   ├── read_test_info.py
│   └── read_validation_info.py
├── create_train_val_all_reference.py
├── create_train_val_each_reference.py
├── data
│   ├── bias_init_vector.npy
│   ├── idx_to_word.pkl
│   ├── test.txt
│   ├── test2014_images_ids_to_names.pkl
│   ├── train_val_imageNames_to_imageIDs.pkl
│   ├── val2014_images_ids_to_names.pkl
│   ├── val_images_captions.pkl
│   └── word_to_idx.pkl
├── image
│   ├── 1.png
│   ├── 2.png
│   └── 3.png
├── image_caption.py
├── inception
│   ├── COCO_val2014_000000320612.jpg
│   ├── README.md
│   ├── check_NOT_JPEG_IMG.sh
│   ├── copy_train_val_feats.sh
│   ├── extract_inception_bottleneck_feature.py
│   ├── imageInfo.txt
│   ├── test_feats
│   │   └── README.md
│   ├── train_feats
│   │   └── README.md
│   ├── train_val_feats
│   │   └── README.md
│   └── val_feats
│       └── README.md
├── pre_train_json.py
├── pre_val_json.py
└── split_train_val_data.py
/README.md:
--------------------------------------------------------------------------------
1 | # Optimization of image description metrics using policy gradient methods
2 | This is a TensorFlow implementation of the paper: [Optimization of image description metrics using policy gradient methods](https://arxiv.org/abs/1612.00370).
3 |
4 | ## Note
5 | This repository is not being actively maintained due to lack of time and interest. My sincerest apologies to the open source community for allowing this project to stagnate. I hope it was useful for some of you as a jumping-off point.
6 |
7 | ## Prerequisites
8 | - TensorFlow 0.10
9 |
10 | ## Introduction
11 | This code is a little rough. While working on this paper I also had some questions, but the authors did not reply to my e-mails, and I don't know why.
12 |
13 | So please contact me anytime if you have any doubts.
14 |
15 | My e-mail: jschenxinpeng@gmail.com
16 |
17 | I would appreciate any advice you may have.
18 |
19 | ## How to run the code
20 | ### Step 1
21 | Go into the `./inception` directory; the Python script used to extract features is `extract_inception_bottleneck_feature.py`.
22 |
23 | In this Python script, there are a few parameters you should modify:
24 | - `image_path`: the MSCOCO image path, e.g. `/path/to/mscoco/train2014`, `/path/to/mscoco/val2014`, `/path/to/mscoco/test2014`
25 | - `feats_save_path`: the directory where the extracted features should be saved.
26 | - `model_path`: the pre-trained **Inception-V3** TensorFlow model. I uploaded this model to Google Drive: [tensorflow_inception_graph.pb](https://drive.google.com/open?id=0B65vBUruA6N4Y2dtVHBJMVhodjA)
27 |
28 |
29 | After modifying the parameters, you can extract the image features from the terminal:
30 | ```bash
31 | $ CUDA_VISIBLE_DEVICES=3 python extract_inception_bottleneck_feature.py
32 | ```
33 | You can also run the code without a GPU:
34 | ```bash
35 | $ CUDA_VISIBLE_DEVICES="" python extract_inception_bottleneck_feature.py
36 | ```
37 |
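For reference, here is a minimal sketch of what this extraction step looks like. It is not the exact contents of `extract_inception_bottleneck_feature.py`; the `pool_3:0` and `DecodeJpeg/contents:0` tensor names are assumptions carried over from the standard frozen Inception-V3 graph, and the paths are the parameters described above:

```python
import os
import numpy as np
import tensorflow as tf

image_path = '/path/to/mscoco/train2014'            # folder of JPEG images
feats_save_path = './inception/train_feats'         # where the features go
model_path = './tensorflow_inception_graph.pb'      # frozen Inception-V3 graph

# load the frozen graph once
with tf.gfile.FastGFile(model_path, 'rb') as f:
    graph_def = tf.GraphDef()
    graph_def.ParseFromString(f.read())
    tf.import_graph_def(graph_def, name='')

with tf.Session() as sess:
    pool_3 = sess.graph.get_tensor_by_name('pool_3:0')   # 2048-d bottleneck feature
    for name in os.listdir(image_path):
        jpeg_data = tf.gfile.FastGFile(os.path.join(image_path, name), 'rb').read()
        feat = sess.run(pool_3, {'DecodeJpeg/contents:0': jpeg_data})
        np.save(os.path.join(feats_save_path, name + '.npy'), feat.squeeze())
```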
38 | In my experiments, the `train2014` image features are saved in `./inception/train_feats`, the `val2014` image features in `./inception/val_feats`, and the `test2014` image features in `./inception/test_feats`.
39 | At the same time, I saved the combined `train2014`+`val2014` image features in `./inception/train_val_feats`.
40 |
41 | ### Step 2
42 | Run the scripts:
43 | ```bash
44 | $ python pre_train_json.py
45 | $ python pre_val_json.py
46 | $ python split_train_val_data.py
47 | ```
48 |
49 | The script `pre_train_json.py` processes `./data/captions_train2014.json` and generates `./data/train_images_captions.pkl`, a dict that stores the captions of each image, like this:
50 |

51 |
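Roughly, the structure looks like this (the file name and captions below are placeholders, not real data):

```python
# illustrative structure only -- actual keys/values come from captions_train2014.json
train_images_captions = {
    'COCO_train2014_000000000001.jpg': [
        'first human caption for this image',
        'second human caption for this image',
        # ... usually five captions per image
    ],
    # ... one entry per training image
}
```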
52 | The script `pre_val_json.py` processes `./data/captions_val2014.json` and generates `./data/val_images_captions.pkl`.
53 |
54 | The script `split_train_val_data.py` handles the train/val split: according to the paper, only 1665 validation images are kept for validation, and the remaining validation images are used for training. So I split the validation images into two parts: images 0~1665 are used for validation, and the rest are used for training, as sketched below.
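A minimal sketch of that split (the real `split_train_val_data.py` may differ in details such as the ordering of the images):

```python
import cPickle as pickle

with open('./data/val_images_captions.pkl', 'r') as f:
    val_images_captions = pickle.load(f)

image_names = sorted(val_images_captions.keys())
val_part = image_names[:1665]     # kept for validation, as in the paper
train_part = image_names[1665:]   # folded back into the training set
```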
55 |
56 | ### Step 3
57 | Run the scripts:
58 | ```bash
59 | $ python create_train_val_all_reference.py
60 | ```
61 | and
62 | ```bash
63 | $ python create_train_val_each_reference.py
64 | ```
65 |
66 | Let me explain the two scripts. The first one, `create_train_val_all_reference.py`, generates a JSON file named `train_val_all_reference.json` (about 70 MB) that stores the ground-truth captions of the training and validation images.
67 |
68 | The second script, `create_train_val_each_reference.py`, generates one JSON file for every training and validation image, and saves each JSON file in the folder `./train_val_reference_json/`.
69 |
70 | ### Step 4
71 | Run the script:
72 | ```bash
73 | $ python build_vocab.py
74 | ```
75 |
76 | This script builds the vocabulary dict. In the data folder, it generates three files:
77 | - word_to_idx.pkl
78 | - idx_to_word.pkl
79 | - bias_init_vector.npy
80 |
81 | By the way, words that occur fewer than 5 times are filtered out; you can change this threshold in the script.
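As a quick sanity check (a usage sketch, not part of the repository), the generated files can be loaded like this:

```python
import cPickle as pickle
import numpy as np

with open('./data/word_to_idx.pkl', 'r') as f:
    word_to_idx = pickle.load(f)
with open('./data/idx_to_word.pkl', 'r') as f:
    idx_to_word = pickle.load(f)
bias_init_vector = np.load('./data/bias_init_vector.npy')

# the two vocab dicts and the bias vector should all have the same size
print len(word_to_idx), len(idx_to_word), bias_init_vector.shape
```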
82 |
83 | ### Step 5
84 | In this step, we follow the algorithm in the paper:
85 | 
86 |
87 | First, we train the basic model with MLE (Maximum Likelihood Estimation):
88 | ```bash
89 | $ CUDA_VISIBLE_DEVICES=0 ipython
90 | >>> import image_caption
91 | >>> image_caption.Train_with_MLE()
92 | ```
93 |
94 | After training the basic model, you can evaluate it on the test and validation data:
95 | ```bash
96 | >>> image_caption.Test_with_MLE()
97 | >>> image_caption.Val_with_MLE()
98 | ```
99 |
100 | Second, we train B_phi using MC estimates of Q_theta on a small dataset D (1665 images):
101 | ```bash
102 | >>> image_caption.Sample_Q_with_MC()
103 | >>> image_caption.Train_Bphi_Model()
104 | ```
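The idea, as I understand it from the paper, is that `Sample_Q_with_MC()` rolls out several complete captions from each partial caption and averages their rewards to estimate Q_theta, and `Train_Bphi_Model()` then regresses B_phi onto those estimates. A hedged pseudocode sketch of the MC estimate (the helper names are hypothetical, not functions of this repository):

```python
def mc_estimate_Q(prefix, sample_continuation, reward_fn, K=3):
    # roll out K full captions that continue the given prefix and
    # average their rewards (e.g. CIDEr against the reference captions)
    returns = [reward_fn(sample_continuation(prefix)) for _ in range(K)]
    return sum(returns) / float(K)
```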
105 |
106 | After we get the B_phi model, we use policy gradient (PG) updates to optimize the caption generator:
107 | ```bash
108 | >>> image_caption.Train_SGD_update()
109 | ```
110 | I have run several epochs; here I compare the RL results with the non-RL results:
111 | 
112 |
113 | This shows that the policy gradient method is beneficial for image captioning.
114 |
115 | ### COCO evaluation
116 | In the `./coco_caption/` folder, we can evaluate the generated captions and each trained model. Please see the Python scripts.
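For example, the `Bleu_*.pkl`, `METEOR.pkl` and `CIDEr.pkl` files in that folder are plain pickled lists of scores (apparently one entry per evaluation epoch), so they can be inspected or re-plotted (which is presumably what `draw.py` does) with something like:

```python
import cPickle as pickle
import matplotlib.pyplot as plt

for metric in ['Bleu_1', 'Bleu_2', 'Bleu_3', 'Bleu_4', 'METEOR', 'CIDEr']:
    with open('./coco_caption/%s.pkl' % metric, 'r') as f:
        scores = pickle.load(f)
    plt.plot(scores, label=metric)
plt.xlabel('evaluation epoch')
plt.ylabel('score')
plt.legend()
plt.savefig('metric_curves.png')   # hypothetical output file name
```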
117 |
--------------------------------------------------------------------------------
/build_vocab.py:
--------------------------------------------------------------------------------
1 | # encoding: UTF-8
2 |
3 | #-----------------------------------------------------------------------
4 | # We preprocess the text data by lower casing, and replacing words which
5 | # occur less than 5 times in the 82K training set with <unk>;
6 | # This results in a vocabulary size of 10,622 (from 32,807 words).
7 | #-----------------------------------------------------------------------
8 |
9 | import os
10 | import numpy as np
11 | import cPickle as pickle
12 | import time
13 |
14 |
15 | train_images_captions_path = './data/train_images_captions.pkl'
16 | with open(train_images_captions_path, 'r') as train_fr:
17 |     train_images_captions = pickle.load(train_fr)
18 |
19 | val_images_captions_path = './data/val_images_captions.pkl'
20 | with open(val_images_captions_path, 'r') as val_fr:
21 |     val_images_captions = pickle.load(val_fr)
22 |
23 |
24 | #------------------------------------------------------------------------
25 | # Borrowed this function from NeuralTalk:
26 | # https://github.com/karpathy/neuraltalk/blob/master/driver.py#L16
27 | #-----------------------------------------------------------------------
28 | def preProBuildWordVocab(sentence_iterator, word_count_threshold=5):
29 |     print 'Preprocessing word counts and creating vocab based on word count threshold %d' % (word_count_threshold, )
30 |
31 |     t0 = time.time()
32 |     word_counts = {}
33 |     nsents = 0
34 |
35 |     for sent in sentence_iterator:
36 |         nsents += 1
37 |         tmp_sent = sent.split(' ')
38 |         # remove the empty string '' in the sentence
39 |         tmp_sent = filter(None, tmp_sent)
40 |         for w in tmp_sent:
41 |             word_counts[w] = word_counts.get(w, 0) + 1
42 |     vocab = [w for w in word_counts if word_counts[w] >= word_count_threshold]
43 |     print 'Filter words from %d to %d in %0.2fs' % (len(word_counts), len(vocab), time.time()-t0)
44 |
45 |     ixtoword = {}
46 |     ixtoword[0] = '<pad>'
47 |     ixtoword[1] = '<bos>'
48 |     ixtoword[2] = '<eos>'
49 |     ixtoword[3] = '<unk>'
50 |
51 |     wordtoix = {}
52 |     wordtoix['<pad>'] = 0
53 |     wordtoix['<bos>'] = 1
54 |     wordtoix['<eos>'] = 2
55 |     wordtoix['<unk>'] = 3
56 |
57 |     for idx, w in enumerate(vocab):
58 |         wordtoix[w] = idx + 4
59 |         ixtoword[idx+4] = w
60 |
61 |     word_counts['<pad>'] = nsents
62 |     word_counts['<bos>'] = nsents
63 |     word_counts['<eos>'] = nsents
64 |     word_counts['<unk>'] = nsents
65 |
66 |     bias_init_vector = np.array([1.0 * word_counts[ ixtoword[i] ] for i in ixtoword])
67 |     bias_init_vector /= np.sum(bias_init_vector) # normalize to frequencies
68 |     bias_init_vector = np.log(bias_init_vector)
69 |     bias_init_vector -= np.max(bias_init_vector) # shift to nice numeric range
70 |
71 |     return wordtoix, ixtoword, bias_init_vector
72 |
73 |
74 | # extract all sentences in captions
75 | all_sents = []
76 | for image, sents in train_images_captions.iteritems():
77 |     for each_sent in sents:
78 |         all_sents.append(each_sent)
79 | #for image, sents in val_images_captions.iteritems():
80 | #    for each_sent in sents:
81 | #        all_sents.append(each_sent)
82 |
83 | word_to_idx, idx_to_word, bias_init_vector = preProBuildWordVocab(all_sents, word_count_threshold=5)
84 |
85 | with open('./data/idx_to_word.pkl', 'w') as fw_1:
86 |     pickle.dump(idx_to_word, fw_1)
87 |
88 | with open('./data/word_to_idx.pkl', 'w') as fw_2:
89 |     pickle.dump(word_to_idx, fw_2)
90 |
91 | np.save('./data/bias_init_vector.npy', bias_init_vector)
92 |
93 |
--------------------------------------------------------------------------------
/coco_caption/Bleu_1.pkl:
--------------------------------------------------------------------------------
1 | (lp1
2 | F0.63260942443110402
3 | aF0.66267368806380189
4 | aF0.67396468036764079
5 | aF0.67794791365717377
6 | aF0.67931985672525264
7 | aF0.68720906282181904
8 | aF0.68912609777666678
9 | aF0.69167747380553968
10 | aF0.69363946633569773
11 | aF0.70126672092487952
12 | aF0.69540300212104966
13 | aF0.69710219655562145
14 | aF0.70975461281721475
15 | aF0.70438543342455129
16 | aF0.71065515953693403
17 | aF0.70796259212616475
18 | aF0.7137309178045258
19 | aF0.70972130606858697
20 | aF0.71264320363297118
21 | aF0.71230150588813113
22 | aF0.71266183163274432
23 | aF0.71383000617409531
24 | aF0.71222052067379871
25 | aF0.71817035091708437
26 | aF0.71678112561610441
27 | aF0.71900126082551585
28 | aF0.72450404567296334
29 | aF0.72652732454870661
30 | aF0.71936356792608769
31 | aF0.72033242351406046
32 | aF0.72968607636874172
33 | aF0.72561088200897139
34 | aF0.7244390812590612
35 | aF0.72698497523275596
36 | aF0.73069298752786138
37 | aF0.73729693796998763
38 | aF0.73198069213775818
39 | aF0.73262131127298169
40 | aF0.73429693443261856
41 | aF0.73372853890175116
42 | aF0.73490602935423266
43 | aF0.73360538079808202
44 | aF0.73405409832707313
45 | aF0.73505874686919437
46 | aF0.73659015588558707
47 | aF0.73524962178515896
48 | aF0.74024590163932913
49 | aF0.74045426642110213
50 | aF0.74236570630058896
51 | aF0.74099218845853465
52 | aF0.74233917889878653
53 | aF0.74182765268035067
54 | aF0.7426081290217329
55 | aF0.74422245108134411
56 | aF0.74005520376683032
57 | aF0.74418414823278223
58 | aF0.73858778239285305
59 | aF0.74325937512862261
60 | aF0.74602720699373459
61 | aF0.74856333634347039
62 | aF0.74826556870816785
63 | aF0.75033971587398085
64 | aF0.74709100559341968
65 | aF0.74876745673204315
66 | aF0.75033740951288808
67 | aF0.75072200676621936
68 | aF0.75079507461467931
69 | aF0.75131406044676519
70 | aF0.75091575091573559
71 | aF0.7533601049008205
72 | aF0.75039924655008505
73 | aF0.75081011677908271
74 | aF0.75387249307026027
75 | aF0.75280507066520175
76 | aF0.75895135402089153
77 | aF0.7509718073570778
78 | aF0.75198312065081352
79 | aF0.75178513469651187
80 | aF0.75653279730892331
81 | aF0.75611004216643973
82 | aF0.75846715177999302
83 | aF0.75828324067043973
84 | aF0.75750477716820241
85 | aF0.75789989385154011
86 | aF0.75972493489581794
87 | aF0.75437812360325152
88 | aF0.75501730103804698
89 | aF0.76171867007671079
90 | aF0.75758749312079343
91 | aF0.75749841762458392
92 | aF0.75914832925833831
93 | aF0.75881129150687854
94 | aF0.75751186083287869
95 | aF0.75484145132240421
96 | aF0.75908663477654936
97 | aF0.75733703416464437
98 | aF0.76081097635671613
99 | aF0.75996548039778178
100 | aF0.75746828259898757
101 | aF0.76331747554973228
102 | aF0.76167146385774265
103 | aF0.75959328927298919
104 | aF0.76060692348369285
105 | aF0.7615271719189376
106 | aF0.76225570032571743
107 | aF0.75984967874892395
108 | aF0.75717563705163327
109 | aF0.76227174752936766
110 | aF0.75881628615836505
111 | aF0.76317765735044829
112 | aF0.75621208762417613
113 | aF0.75900304982729583
114 | aF0.76074306177258977
115 | aF0.76198528962326029
116 | aF0.76501348149357051
117 | aF0.76363267788362532
118 | aF0.76185546082012268
119 | aF0.76123584183157811
120 | aF0.76008633906235867
121 | aF0.76592733375117239
122 | aF0.76374915821478762
123 | aF0.76026841827837732
124 | aF0.76175517026116601
125 | aF0.76038009926157535
126 | aF0.76368359579522416
127 | aF0.75833484600954482
128 | aF0.76197571632803573
129 | aF0.7629299028616281
130 | aF0.76091773426797282
131 | aF0.76531982421873446
132 | aF0.76405016256384894
133 | aF0.7659885788607157
134 | aF0.76358106052923536
135 | aF0.76399991904635078
136 | aF0.76128875773131832
137 | aF0.76268313355126205
138 | aF0.76484217800449661
139 | aF0.76796994008325858
140 | aF0.76355805394910903
141 | aF0.76499747091551318
142 | aF0.76452938096579082
143 | aF0.76352628826580393
144 | aF0.76775447404646602
145 | aF0.76476284624849677
146 | aF0.76222562943833194
147 | aF0.76338147833473402
148 | aF0.76421753429051176
149 | aF0.76712827251590487
150 | aF0.76658684321733916
151 | aF0.76485198497635876
152 | aF0.7618884776311402
153 | aF0.76374823481943188
154 | aF0.76541313880411499
155 | aF0.76382010709826542
156 | aF0.76418767057311043
157 | aF0.76132075471696592
158 | aF0.76545898102782794
159 | aF0.76787125803115663
160 | aF0.76265898728100234
161 | aF0.76364074356605049
162 | aF0.76735406919538229
163 | aF0.76499618825982496
164 | aF0.76497955731000866
165 | aF0.76429861529197751
166 | aF0.7638679302227166
167 | aF0.76541870139264157
168 | aF0.76458651450613269
169 | aF0.7663183003496985
170 | aF0.76314153346301195
171 | aF0.76658610271901784
172 | aF0.76437242097670843
173 | aF0.76178136675905428
174 | aF0.76439947911447936
175 | aF0.75999279077217319
176 | aF0.76444729922317267
177 | aF0.76261997510138624
178 | aF0.75889880417638389
179 | aF0.76379881537996663
180 | aF0.76253897998187792
181 | aF0.76538973882854799
182 | aF0.76289754065527593
183 | aF0.76168346510550311
184 | aF0.76136567119633991
185 | aF0.75887144844027898
186 | aF0.76305356893836762
187 | aF0.76344193483908129
188 | aF0.76334680581527092
189 | aF0.76585101253614662
190 | aF0.76123191302260851
191 | aF0.7634585650890392
192 | aF0.76134217400038662
193 | aF0.7604760476047453
194 | aF0.76254760812776401
195 | aF0.76048083248608023
196 | aF0.76025974025972509
197 | aF0.76538307260327387
198 | aF0.76010312987167206
199 | aF0.76104285286183682
200 | aF0.76136592032799855
201 | aF0.76413942221063957
202 | aF0.76069486646408202
203 | aF0.76404426801393877
204 | aF0.76077328225998009
205 | aF0.76148381689745859
206 | aF0.76053659024925901
207 | aF0.76173206065797661
208 | aF0.75772879409025495
209 | aF0.76179131966686497
210 | aF0.7582728650027758
211 | aF0.75840286054825679
212 | aF0.76425427578027694
213 | aF0.7586440542856423
214 | aF0.76066111781102075
215 | aF0.76409670893173964
216 | aF0.76330972465381264
217 | aF0.7588560738735719
218 | aF0.75990113810765669
219 | aF0.7633820800829807
220 | aF0.7603335533025225
221 | aF0.76201052484141185
222 | aF0.76061405612855282
223 | aF0.75630568950803212
224 | aF0.757408546872814
225 | aF0.76159282197870493
226 | aF0.76209790209788697
227 | aF0.762037850355331
228 | aF0.75760592545395122
229 | aF0.76000319144690709
230 | aF0.76037472593181665
231 | aF0.76114156339369321
232 | aF0.76310264152451035
233 | aF0.75998571343530852
234 | aF0.75947565098758807
235 | aF0.76130983169536548
236 | aF0.76105455702241209
237 | aF0.76010427653179524
238 | aF0.75949947197479906
239 | aF0.76106000439006327
240 | aF0.76160169187181759
241 | aF0.76186211288908612
242 | aF0.75888582043588193
243 | aF0.76169278371093407
244 | aF0.76218829923272136
245 | aF0.76508881611589241
246 | aF0.76059954616026204
247 | aF0.75496076619007746
248 | aF0.75848861899940945
249 | aF0.75676319393444702
250 | aF0.7617587319287753
251 | aF0.76011012908244202
252 | a.
--------------------------------------------------------------------------------
/coco_caption/Bleu_2.pkl:
--------------------------------------------------------------------------------
1 | (lp1
2 | F0.44701445730041445
3 | aF0.4830433062444468
4 | aF0.4939662856657035
5 | aF0.50050569478837414
6 | aF0.50175276072974118
7 | aF0.50897594637847177
8 | aF0.51316415541095439
9 | aF0.51448432825471457
10 | aF0.51853477147182192
11 | aF0.52498707790409072
12 | aF0.51841263036406193
13 | aF0.52040892167020669
14 | aF0.53524628148436371
15 | aF0.52934501579869575
16 | aF0.53453659106341056
17 | aF0.53270843698039461
18 | aF0.54089648298723925
19 | aF0.53803943586804548
20 | aF0.53975898466930994
21 | aF0.54063323725452739
22 | aF0.54217124704850073
23 | aF0.54411558783118907
24 | aF0.53994886567844735
25 | aF0.54709809187597069
26 | aF0.54544906649331237
27 | aF0.54868210119279937
28 | aF0.5530751557038055
29 | aF0.55552914989855995
30 | aF0.54887126568793954
31 | aF0.55155635654771573
32 | aF0.56095074945855317
33 | aF0.55732285864590903
34 | aF0.55810031416061867
35 | aF0.56031775646004767
36 | aF0.56165756096229902
37 | aF0.57194047706892182
38 | aF0.56457488849693882
39 | aF0.56605184239308481
40 | aF0.56889475603038908
41 | aF0.56868565740017174
42 | aF0.56971650083471992
43 | aF0.56792100434663617
44 | aF0.56998952080709786
45 | aF0.57025692681385454
46 | aF0.57170900009487069
47 | aF0.57194722703677359
48 | aF0.57787436758728994
49 | aF0.5753862229678891
50 | aF0.57916014431738894
51 | aF0.57912794779759935
52 | aF0.57929492407106742
53 | aF0.58078922987124637
54 | aF0.5796684664074081
55 | aF0.58005494492785126
56 | aF0.57911581348905161
57 | aF0.58278507841009286
58 | aF0.57745218667697251
59 | aF0.58203875926982884
60 | aF0.58495866143962083
61 | aF0.58706610649185209
62 | aF0.58951244632503708
63 | aF0.59026069155858862
64 | aF0.58883494292550931
65 | aF0.58831417421830079
66 | aF0.59102430979134868
67 | aF0.5911804143343059
68 | aF0.59126462484225051
69 | aF0.59358746429763165
70 | aF0.59187843140612617
71 | aF0.59412353717009891
72 | aF0.59127930606673385
73 | aF0.59285352042928452
74 | aF0.59559156222164167
75 | aF0.59379898303023559
76 | aF0.60248618395778653
77 | aF0.59384720141066505
78 | aF0.59376182375289477
79 | aF0.59483540612433117
80 | aF0.59954501808983662
81 | aF0.59990203267749709
82 | aF0.60332997368419095
83 | aF0.60215894841141249
84 | aF0.60352467419032307
85 | aF0.6013315578396784
86 | aF0.60494560460416125
87 | aF0.59908424648591796
88 | aF0.60016198505966334
89 | aF0.60801929345256411
90 | aF0.60205314238629593
91 | aF0.60094656337983576
92 | aF0.60569322731387798
93 | aF0.60137663725716728
94 | aF0.60447554787780311
95 | aF0.6023968738365173
96 | aF0.60445867473371473
97 | aF0.60402523782931206
98 | aF0.60584654496739965
99 | aF0.60650858956193743
100 | aF0.60272262198958249
101 | aF0.61085598334544078
102 | aF0.60684081983158378
103 | aF0.60587199343224851
104 | aF0.6064372797156411
105 | aF0.61030070725962771
106 | aF0.6093417010233797
107 | aF0.60805388440391828
108 | aF0.6020004209514298
109 | aF0.61043376214203138
110 | aF0.60671576045281872
111 | aF0.61002699552743667
112 | aF0.60464068114185543
113 | aF0.60580018038988437
114 | aF0.60981663264733621
115 | aF0.60877942352291792
116 | aF0.61291674770910931
117 | aF0.61309914625801043
118 | aF0.60996962332835791
119 | aF0.61201405304123013
120 | aF0.60841555061617714
121 | aF0.61573573757223254
122 | aF0.61318783301342683
123 | aF0.60849975642279663
124 | aF0.61252675557034453
125 | aF0.60898045692066782
126 | aF0.61504865645318463
127 | aF0.60710668744610208
128 | aF0.61091470259637271
129 | aF0.61200473084388596
130 | aF0.6099081398355648
131 | aF0.61536424101384635
132 | aF0.61421700746185792
133 | aF0.61643166727267151
134 | aF0.61445942019481259
135 | aF0.61502708829375341
136 | aF0.61197734689484284
137 | aF0.61316103926709153
138 | aF0.61581535445513302
139 | aF0.61867682178305294
140 | aF0.61490103693948694
141 | aF0.6151495679583342
142 | aF0.61595917782039489
143 | aF0.61430860486926553
144 | aF0.61704711742905327
145 | aF0.61709909839235777
146 | aF0.61258774792155035
147 | aF0.61482858860084921
148 | aF0.61504181807003588
149 | aF0.61979197468898661
150 | aF0.61864272422277389
151 | aF0.61513322327342745
152 | aF0.61295431012147183
153 | aF0.61541712918827107
154 | aF0.61726156954047251
155 | aF0.61754589219786915
156 | aF0.61468076187160803
157 | aF0.61265300280943291
158 | aF0.6168941181595029
159 | aF0.61919901824762136
160 | aF0.61406508562514361
161 | aF0.61502761262377836
162 | aF0.61940977597170821
163 | aF0.61748437661042355
164 | aF0.61693903992142263
165 | aF0.61599967048221671
166 | aF0.61789053547752348
167 | aF0.6183240233733942
168 | aF0.61771838981498883
169 | aF0.61922813949850863
170 | aF0.61450439534501533
171 | aF0.61973030145040742
172 | aF0.61818458359418482
173 | aF0.6157284722486378
174 | aF0.61756890018497412
175 | aF0.61230920347519824
176 | aF0.61686833730123991
177 | aF0.61605411838141333
178 | aF0.61315340461893242
179 | aF0.61646795673946864
180 | aF0.61612208193854168
181 | aF0.61976107823057192
182 | aF0.61380989244511042
183 | aF0.61411410577631032
184 | aF0.61552782708664289
185 | aF0.61152383713225156
186 | aF0.61823178449047611
187 | aF0.61753378676799853
188 | aF0.61619993806514584
189 | aF0.62098948977751711
190 | aF0.61458008818578691
191 | aF0.61736647496168917
192 | aF0.61426458254321592
193 | aF0.61346741040548691
194 | aF0.61660466734297126
195 | aF0.61405728049556663
196 | aF0.61488851431290303
197 | aF0.61849654967711598
198 | aF0.61557906777518223
199 | aF0.61408542908871278
200 | aF0.61607072636777938
201 | aF0.61997841128965314
202 | aF0.61518709141594352
203 | aF0.62030651352035393
204 | aF0.61730140591390059
205 | aF0.61763072287861132
206 | aF0.61475577921228552
207 | aF0.61801016609292014
208 | aF0.61210736598709281
209 | aF0.61659054111000178
210 | aF0.61210331912518356
211 | aF0.61296183498816936
212 | aF0.61783967210400192
213 | aF0.61391822633306581
214 | aF0.61556511671279812
215 | aF0.61938574320350381
216 | aF0.62007090827931111
217 | aF0.61384994461502429
218 | aF0.61500100454818185
219 | aF0.61930892097903145
220 | aF0.61500605774310835
221 | aF0.61638415864859331
222 | aF0.6177303796053345
223 | aF0.61212374229375988
224 | aF0.61247389336872149
225 | aF0.61760359835095147
226 | aF0.61731902973593111
227 | aF0.61841611686862852
228 | aF0.61256405644606771
229 | aF0.61802499832904678
230 | aF0.61760266436194977
231 | aF0.61660486229631017
232 | aF0.62044394356209343
233 | aF0.61546395818335875
234 | aF0.6139598334367975
235 | aF0.61769434235330445
236 | aF0.61695063768578651
237 | aF0.61738899642248457
238 | aF0.61748494771049667
239 | aF0.61645394369516349
240 | aF0.61838848996504037
241 | aF0.62086840544332933
242 | aF0.61543881095760411
243 | aF0.61794125716157011
244 | aF0.61819097069341933
245 | aF0.62225844703921607
246 | aF0.61628904558898612
247 | aF0.6114189319235831
248 | aF0.61394621850444564
249 | aF0.61457811381292315
250 | aF0.62035333536351489
251 | aF0.61688896948080474
252 | a.
--------------------------------------------------------------------------------
/coco_caption/Bleu_3.pkl:
--------------------------------------------------------------------------------
1 | (lp1
2 | F0.29707668433830187
3 | aF0.33579709089922882
4 | aF0.34468956821642355
5 | aF0.35316911577671461
6 | aF0.35567269818131536
7 | aF0.36026177660692654
8 | aF0.36471407910280479
9 | aF0.36844262557330587
10 | aF0.37220932639239973
11 | aF0.37655673987251204
12 | aF0.37035683680989817
13 | aF0.37383122750940773
14 | aF0.38748195100330884
15 | aF0.38170775143175978
16 | aF0.38555749119890365
17 | aF0.38731723488565623
18 | aF0.3946590294047751
19 | aF0.39315025222532585
20 | aF0.39256515373344114
21 | aF0.39467263589120877
22 | aF0.39696270707495923
23 | aF0.39833479893545831
24 | aF0.39487779046538279
25 | aF0.40155629980022378
26 | aF0.39885564800104872
27 | aF0.402607878910128
28 | aF0.40497159270308258
29 | aF0.40795938013325189
30 | aF0.40344632387585178
31 | aF0.40637259660635661
32 | aF0.41407510930333169
33 | aF0.4102150656455672
34 | aF0.41338795373053411
35 | aF0.41467297534600533
36 | aF0.41516443031850281
37 | aF0.42669972153917346
38 | aF0.41821149081195041
39 | aF0.42038574304194831
40 | aF0.42513526251520656
41 | aF0.42370485724669033
42 | aF0.42488531017056119
43 | aF0.42293690849431748
44 | aF0.42667059727500384
45 | aF0.42586863147294762
46 | aF0.42836885025141436
47 | aF0.4299897907257495
48 | aF0.43411806746351622
49 | aF0.43056178438544851
50 | aF0.43563695720324735
51 | aF0.4345403419284995
52 | aF0.43522214987795199
53 | aF0.43796801751982894
54 | aF0.43571154528320061
55 | aF0.43636970073462372
56 | aF0.43705544038470123
57 | aF0.44065071575633136
58 | aF0.435368739574028
59 | aF0.43961773731432013
60 | aF0.44183301787037665
61 | aF0.44367101901501016
62 | aF0.44872928677342272
63 | aF0.44816791129017247
64 | aF0.44742686100836998
65 | aF0.44632863652345256
66 | aF0.44999945503612349
67 | aF0.44874574716998011
68 | aF0.44921021562245445
69 | aF0.45245486710269617
70 | aF0.45080258927390493
71 | aF0.4517288056115108
72 | aF0.45044330018954776
73 | aF0.45315527170499748
74 | aF0.45515082040339
75 | aF0.45256258077690009
76 | aF0.46140213355011939
77 | aF0.45340587413910699
78 | aF0.45358279690486486
79 | aF0.45457779339247517
80 | aF0.45946683687786549
81 | aF0.46015771281470458
82 | aF0.46459354755904542
83 | aF0.46353604874142995
84 | aF0.46485315789905229
85 | aF0.46229484283542849
86 | aF0.46683625217591457
87 | aF0.4600210089069427
88 | aF0.46163411012099803
89 | aF0.46952465502189561
90 | aF0.46353144069095609
91 | aF0.46138700496076562
92 | aF0.46805364305382735
93 | aF0.46261957473479265
94 | aF0.46822507216697445
95 | aF0.46573614820349651
96 | aF0.46646351011947462
97 | aF0.46808312770932858
98 | aF0.4681009629140841
99 | aF0.46924266963178407
100 | aF0.46433432700446597
101 | aF0.47395576019098851
102 | aF0.46988376324318432
103 | aF0.46803415290875722
104 | aF0.46871674087641396
105 | aF0.47439412173129303
106 | aF0.47246776641391586
107 | aF0.47247615078372701
108 | aF0.46404701465608966
109 | aF0.47357505531118094
110 | aF0.47198794292863699
111 | aF0.47334742622035336
112 | aF0.46985141348151238
113 | aF0.46863596911374922
114 | aF0.47534943990324519
115 | aF0.47217255313131457
116 | aF0.47675171005429551
117 | aF0.4769935641462848
118 | aF0.47488847318383937
119 | aF0.47772451287728834
120 | aF0.47385830175620297
121 | aF0.48120754995513088
122 | aF0.47831364406089844
123 | aF0.47285442227793889
124 | aF0.47875806638914192
125 | aF0.47443960821323983
126 | aF0.48247941147377782
127 | aF0.47228211995405689
128 | aF0.4757772503862297
129 | aF0.47804594214004742
130 | aF0.47603058967586698
131 | aF0.48186828438554485
132 | aF0.48125505187767553
133 | aF0.48268839434006816
134 | aF0.48060548983626122
135 | aF0.48185027200671937
136 | aF0.47887654320083178
137 | aF0.47982017328949261
138 | aF0.48288759752518839
139 | aF0.48531943145948719
140 | aF0.4821206819945793
141 | aF0.48170729227927089
142 | aF0.48325775033737572
143 | aF0.48090517713054992
144 | aF0.48413361675400879
145 | aF0.4851792890835338
146 | aF0.47916535165014268
147 | aF0.48267193300728001
148 | aF0.48194793035054923
149 | aF0.48872452714090736
150 | aF0.48591065259487948
151 | aF0.48277714764935498
152 | aF0.48087170207199137
153 | aF0.4837870721360194
154 | aF0.48524386978756578
155 | aF0.48705775470024276
156 | aF0.48266689387815853
157 | aF0.48001752849599644
158 | aF0.48472744672697665
159 | aF0.48754804891825843
160 | aF0.48193539406653291
161 | aF0.48463268530603482
162 | aF0.48784859837816424
163 | aF0.48608657030900687
164 | aF0.48496775726120694
165 | aF0.48621079480351659
166 | aF0.48701075729393739
167 | aF0.48770161730704686
168 | aF0.48711711959918164
169 | aF0.48906596202391678
170 | aF0.48341538387693245
171 | aF0.48954999335676047
172 | aF0.48902007741894865
173 | aF0.48596501774342127
174 | aF0.48741199016181458
175 | aF0.48195482010735796
176 | aF0.48620208346808602
177 | aF0.48718594829361511
178 | aF0.48442536182635992
179 | aF0.48723060252053435
180 | aF0.4865840457623945
181 | aF0.49040598219548953
182 | aF0.48411732663292323
183 | aF0.48542463554923942
184 | aF0.48691735953048676
185 | aF0.48221630463602011
186 | aF0.48897344960543215
187 | aF0.48925494480211323
188 | aF0.48734175849587807
189 | aF0.49382841715209486
190 | aF0.48525781916637306
191 | aF0.48952731180161335
192 | aF0.48489412473669607
193 | aF0.48512672089773834
194 | aF0.48773814645547636
195 | aF0.48511301628691544
196 | aF0.48515441018364269
197 | aF0.4911020151164302
198 | aF0.48774254427619224
199 | aF0.48642747994994562
200 | aF0.48797363436906499
201 | aF0.49245763106373164
202 | aF0.48815789423151973
203 | aF0.49445600842164888
204 | aF0.49117836208991322
205 | aF0.49105485743189986
206 | aF0.48766770615777566
207 | aF0.49168562113874731
208 | aF0.48461555235548326
209 | aF0.49017626894557875
210 | aF0.4860620150849187
211 | aF0.48701753957782395
212 | aF0.49082798212201251
213 | aF0.4876528482555253
214 | aF0.48980980128093132
215 | aF0.4929782608313194
216 | aF0.49502570552137382
217 | aF0.48648368075893822
218 | aF0.48956480171272349
219 | aF0.49366981705435337
220 | aF0.48835059095870148
221 | aF0.49053050478489696
222 | aF0.49372569309352365
223 | aF0.48715792153404475
224 | aF0.48702718177864163
225 | aF0.49273637239595935
226 | aF0.49203436512730986
227 | aF0.49321264770944984
228 | aF0.48763071331968477
229 | aF0.49344111229212106
230 | aF0.49260384260787449
231 | aF0.4912120228436303
232 | aF0.49561553536202341
233 | aF0.49143099524151251
234 | aF0.48810871842778192
235 | aF0.4935120176770717
236 | aF0.49019144022485189
237 | aF0.49313654782706051
238 | aF0.49396248107002222
239 | aF0.49139490752246073
240 | aF0.49398297645157413
241 | aF0.49767769550512825
242 | aF0.49079368869714263
243 | aF0.49303917546688714
244 | aF0.49310343192174433
245 | aF0.49824166501019007
246 | aF0.49207240322236639
247 | aF0.48590100958044885
248 | aF0.48890793373520286
249 | aF0.49177203053584628
250 | aF0.49592014401867918
251 | aF0.49230301633292395
252 | a.
--------------------------------------------------------------------------------
/coco_caption/Bleu_4.pkl:
--------------------------------------------------------------------------------
1 | (lp1
2 | F0.19103731907605284
3 | aF0.22683608421029841
4 | aF0.23542856254239006
5 | aF0.24459713551700563
6 | aF0.24886658872084744
7 | aF0.25220417267130463
8 | aF0.25457798072775245
9 | aF0.26031391053687997
10 | aF0.26420452058950622
11 | aF0.2659221673056883
12 | aF0.2609398704759891
13 | aF0.2653344724430638
14 | aF0.27584938913452517
15 | aF0.27237052723108879
16 | aF0.27479138411264281
17 | aF0.27797801774011011
18 | aF0.28434626362579979
19 | aF0.28405411752975479
20 | aF0.28204020906024119
21 | aF0.28593043526171896
22 | aF0.28787021905181986
23 | aF0.28805930784022998
24 | aF0.28614477285136153
25 | aF0.29181514811757781
26 | aF0.28874460804022045
27 | aF0.29247434454664928
28 | aF0.29326186082116695
29 | aF0.29634960178063929
30 | aF0.29345886529842169
31 | aF0.29650693146541662
32 | aF0.30227060494896868
33 | aF0.29855928246669844
34 | aF0.30284894566488907
35 | aF0.30260955963716535
36 | aF0.30392599304098794
37 | aF0.31370216448042576
38 | aF0.30521265145259086
39 | aF0.30993010433052742
40 | aF0.31441224952349478
41 | aF0.31192129088018899
42 | aF0.31409217087999597
43 | aF0.31076541060548568
44 | aF0.31587582541598569
45 | aF0.31537510823941756
46 | aF0.3182303877321383
47 | aF0.32096154055415882
48 | aF0.32356939088295256
49 | aF0.31907307143028468
50 | aF0.32403800046394565
51 | aF0.32287272311748239
52 | aF0.32315344964331982
53 | aF0.32640381048178768
54 | aF0.3241802927116072
55 | aF0.32610084694110908
56 | aF0.32659814021512662
57 | aF0.32997586663799006
58 | aF0.32608335039896919
59 | aF0.32858256719319212
60 | aF0.33162553092873226
61 | aF0.33210663715736538
62 | aF0.33869578316024507
63 | aF0.33722396656915149
64 | aF0.33708481990948791
65 | aF0.33567921379277449
66 | aF0.33876934289134653
67 | aF0.3371020407956643
68 | aF0.33760526283255243
69 | aF0.34166908601292334
70 | aF0.3394921246698111
71 | aF0.34054519042631148
72 | aF0.34001763209791031
73 | aF0.34277039194025344
74 | aF0.34451105787295644
75 | aF0.34200696303712547
76 | aF0.34973774603210861
77 | aF0.34230101070171876
78 | aF0.34382464607014346
79 | aF0.34457167675816985
80 | aF0.34921177041120893
81 | aF0.34986107855224013
82 | aF0.35394763736969825
83 | aF0.35372788977751429
84 | aF0.3545722007597692
85 | aF0.35297960676403084
86 | aF0.35785848228376343
87 | aF0.35115096945545998
88 | aF0.35240026852564793
89 | aF0.36015821930253972
90 | aF0.35419418837069244
91 | aF0.35222748106862894
92 | aF0.35900355377876059
93 | aF0.35270470018579997
94 | aF0.36017948097716079
95 | aF0.3571520904537403
96 | aF0.35784430443087445
97 | aF0.36033686539157744
98 | aF0.35936983184100374
99 | aF0.36103735701037065
100 | aF0.35468969311289089
101 | aF0.36481868731861383
102 | aF0.3610758660360579
103 | aF0.35833139721326762
104 | aF0.35935112724555651
105 | aF0.36681616517358284
106 | aF0.36423449763547083
107 | aF0.36436438155127382
108 | aF0.35466488745131275
109 | aF0.3638962331113289
110 | aF0.3648536348390829
111 | aF0.36538553017534126
112 | aF0.36289910709830586
113 | aF0.35994929215702953
114 | aF0.36733874497939828
115 | aF0.36397063140582475
116 | aF0.36835167441436145
117 | aF0.36867146530705597
118 | aF0.36726539384222662
119 | aF0.36981564654375382
120 | aF0.36681805279447088
121 | aF0.37403727775498979
122 | aF0.37082434601287384
123 | aF0.36561052287422757
124 | aF0.37239070490596776
125 | aF0.36741687886384988
126 | aF0.37648565964725067
127 | aF0.36583814676680609
128 | aF0.36820048588383369
129 | aF0.37085865562520698
130 | aF0.36971634701005907
131 | aF0.37524245754728314
132 | aF0.37519371420589775
133 | aF0.37569237627230612
134 | aF0.37505862408692603
135 | aF0.37494140621179811
136 | aF0.37271055324782215
137 | aF0.37381762178049976
138 | aF0.37645900557431977
139 | aF0.37860386900412352
140 | aF0.37667114598782409
141 | aF0.37451087901501978
142 | aF0.37831955612685131
143 | aF0.37467437676980575
144 | aF0.37825443356479066
145 | aF0.37993701658177981
146 | aF0.37348682450987941
147 | aF0.37795797618624111
148 | aF0.37645504797177287
149 | aF0.38387768811695189
150 | aF0.3799090659899263
151 | aF0.37685270057549286
152 | aF0.37596702261272047
153 | aF0.37845321062600784
154 | aF0.3803725963060608
155 | aF0.3822563128330731
156 | aF0.3776679861024328
157 | aF0.37493680540871699
158 | aF0.37993441984854709
159 | aF0.38199833532514771
160 | aF0.3776239901111067
161 | aF0.38090435829260305
162 | aF0.38304336514433907
163 | aF0.38140001342240798
164 | aF0.37942568337606175
165 | aF0.3831993228670541
166 | aF0.38252106021586818
167 | aF0.38417860160467121
168 | aF0.38250562004420691
169 | aF0.38423758350699044
170 | aF0.37950946045216399
171 | aF0.38616452829602393
172 | aF0.3853229532735552
173 | aF0.38278297504880476
174 | aF0.38341932995367961
175 | aF0.37845852212585496
176 | aF0.38258576524342491
177 | aF0.38457802752250619
178 | aF0.38202628685584999
179 | aF0.38541258589721178
180 | aF0.38245170915044602
181 | aF0.38745939261986095
182 | aF0.38207188761124977
183 | aF0.38237760969105117
184 | aF0.38384320986974424
185 | aF0.38015615863298075
186 | aF0.38614121306192073
187 | aF0.38705960837141862
188 | aF0.38545673597270552
189 | aF0.39340474858364699
190 | aF0.38254301091323906
191 | aF0.38711020214855368
192 | aF0.382811340276414
193 | aF0.3837974756188956
194 | aF0.38568903486431172
195 | aF0.38317133098419892
196 | aF0.38212551706633319
197 | aF0.39003297824229449
198 | aF0.38564809086809521
199 | aF0.38560682479375591
200 | aF0.38661079526224212
201 | aF0.39081229235860743
202 | aF0.387753271729882
203 | aF0.39427041718921269
204 | aF0.39120592522865488
205 | aF0.39103004797818908
206 | aF0.38733975067270082
207 | aF0.39210506439882942
208 | aF0.38463518152520504
209 | aF0.39002497769459588
210 | aF0.38688767829011556
211 | aF0.38782503967563448
212 | aF0.39050733830750117
213 | aF0.38783629528341018
214 | aF0.39075195624970827
215 | aF0.39231135321734867
216 | aF0.39621080075968301
217 | aF0.38519496993855645
218 | aF0.39095855075647445
219 | aF0.39464032402579052
220 | aF0.38878908563787218
221 | aF0.39139149780827431
222 | aF0.39569559841695606
223 | aF0.38962369059152524
224 | aF0.38811829689914212
225 | aF0.39358296190106823
226 | aF0.39262015735556394
227 | aF0.39411030708781664
228 | aF0.3897808831776835
229 | aF0.39429106708871375
230 | aF0.39371099054405934
231 | aF0.39286280253296424
232 | aF0.39659726531556033
233 | aF0.39438239328877034
234 | aF0.38857711679154944
235 | aF0.39497591048562675
236 | aF0.3903709293200292
237 | aF0.39519066565046018
238 | aF0.39655001015540858
239 | aF0.39271192457138304
240 | aF0.39663870194427303
241 | aF0.39977504950447323
242 | aF0.3923138889326509
243 | aF0.39466442112198152
244 | aF0.39469625701331201
245 | aF0.4003041739901127
246 | aF0.39477252852653533
247 | aF0.38676718932616477
248 | aF0.39068639404569161
249 | aF0.39480118886047777
250 | aF0.39771579914878369
251 | aF0.39478737966100647
252 | a.
--------------------------------------------------------------------------------
/coco_caption/CIDEr.pkl:
--------------------------------------------------------------------------------
1 | (lp1
2 | cnumpy.core.multiarray
3 | scalar
4 | p2
5 | (cnumpy
6 | dtype
7 | p3
8 | (S'f8'
9 | I0
10 | I1
11 | tRp4
12 | (I3
13 | S'<'
14 | NNNI-1
15 | I-1
16 | I0
17 | tbS'a\xb6#V6p\xe1?'
18 | tRp5
19 | ag2
20 | (g4
21 | S'sW\x1c\xb5S\xe5\xe5?'
22 | tRp6
23 | ag2
24 | (g4
25 | S'^oE\x18~\xaa\xe7?'
26 | tRp7
27 | ag2
28 | (g4
29 | S'\xc31\x0f\xf8)\xe9\xe8?'
30 | tRp8
31 | ag2
32 | (g4
33 | S'V\x03;\xcd\xda}\xe9?'
34 | tRp9
35 | ag2
36 | (g4
37 | S'\xf0g\xfa\xd1o\t\xea?'
38 | tRp10
39 | ag2
40 | (g4
41 | S'\x9c\xdb\xb9\xea\x94}\xea?'
42 | tRp11
43 | ag2
44 | (g4
45 | S'\x92\x94Y\x8e\x1c\xd9\xea?'
46 | tRp12
47 | ag2
48 | (g4
49 | S'\x05\r\xd8(\xd8\x10\xeb?'
50 | tRp13
51 | ag2
52 | (g4
53 | S'\xee\x9a(D$\xb6\xeb?'
54 | tRp14
55 | ag2
56 | (g4
57 | S'w\x86\x03\r6i\xeb?'
58 | tRp15
59 | ag2
60 | (g4
61 | S'V\xc2\x8b\x8e%\xd2\xeb?'
62 | tRp16
63 | ag2
64 | (g4
65 | S'\xa9Oo\xda\xc6\x9b\xec?'
66 | tRp17
67 | ag2
68 | (g4
69 | S'_\x99\xa0\xfd\xe0\xc4\xec?'
70 | tRp18
71 | ag2
72 | (g4
73 | S'\x15\x95$WEI\xed?'
74 | tRp19
75 | ag2
76 | (g4
77 | S'M\x1c\x13\xbc\x9a;\xed?'
78 | tRp20
79 | ag2
80 | (g4
81 | S'If\xea\x0f\xd3\xc5\xed?'
82 | tRp21
83 | ag2
84 | (g4
85 | S'\xb3:,Bf\x9c\xed?'
86 | tRp22
87 | ag2
88 | (g4
89 | S'b1Kp\xf4\xec\xed?'
90 | tRp23
91 | ag2
92 | (g4
93 | S'\x95qd<\xbaB\xee?'
94 | tRp24
95 | ag2
96 | (g4
97 | S'\x9chw\x1fE5\xee?'
98 | tRp25
99 | ag2
100 | (g4
101 | S'\xc0sN\xe7c\x9a\xee?'
102 | tRp26
103 | ag2
104 | (g4
105 | S'\x1c\x7f\x88\x0fQv\xee?'
106 | tRp27
107 | ag2
108 | (g4
109 | S'\xebK\xd8\xe2\x14\x17\xef?'
110 | tRp28
111 | ag2
112 | (g4
113 | S'\x9c<1\xce.\xd7\xee?'
114 | tRp29
115 | ag2
116 | (g4
117 | S'\xe7\\Y\xd3\xfb;\xef?'
118 | tRp30
119 | ag2
120 | (g4
121 | S'|E\x9c\xd9\xbf\x8d\xef?'
122 | tRp31
123 | ag2
124 | (g4
125 | S'_\xeb_\x06@\xfe\xef?'
126 | tRp32
127 | ag2
128 | (g4
129 | S'\xf7\x8e\xac\xd0\xdf\x83\xef?'
130 | tRp33
131 | ag2
132 | (g4
133 | S'[\xd3\xa5\xe3\xc2\xcc\xef?'
134 | tRp34
135 | ag2
136 | (g4
137 | S'\x18Z\x90L\x10\x1f\xf0?'
138 | tRp35
139 | ag2
140 | (g4
141 | S'\xbd\xadcl\x14\x04\xf0?'
142 | tRp36
143 | ag2
144 | (g4
145 | S'\x13\xb8\xd5wq-\xf0?'
146 | tRp37
147 | ag2
148 | (g4
149 | S'\x8a\x11\xdc\xb7\xa5P\xf0?'
150 | tRp38
151 | ag2
152 | (g4
153 | S'\x83(\x8ce8M\xf0?'
154 | tRp39
155 | ag2
156 | (g4
157 | S'J\xf3\xd5C\x01\x9d\xf0?'
158 | tRp40
159 | ag2
160 | (g4
161 | S'[\xff\n\xbcY\x82\xf0?'
162 | tRp41
163 | ag2
164 | (g4
165 | S'\xce\xc8\x99\xb7\x8dw\xf0?'
166 | tRp42
167 | ag2
168 | (g4
169 | S'f\x85\xd1jj\xb8\xf0?'
170 | tRp43
171 | ag2
172 | (g4
173 | S'M\xbbt#\xad\xa2\xf0?'
174 | tRp44
175 | ag2
176 | (g4
177 | S'B\xb2\x9bJ\xbe\xc3\xf0?'
178 | tRp45
179 | ag2
180 | (g4
181 | S'5\x17\xb7_\x91\xa5\xf0?'
182 | tRp46
183 | ag2
184 | (g4
185 | S'PC\xf3\xd8`\xdd\xf0?'
186 | tRp47
187 | ag2
188 | (g4
189 | S'\xe6\xec;\x99\x19\xfd\xf0?'
190 | tRp48
191 | ag2
192 | (g4
193 | S'\xe3\x85\xcc\xcbY\x10\xf1?'
194 | tRp49
195 | ag2
196 | (g4
197 | S'\x12\x93\xed\xb2\xa5\x16\xf1?'
198 | tRp50
199 | ag2
200 | (g4
201 | S'%A\x7f3\xf0Q\xf1?'
202 | tRp51
203 | ag2
204 | (g4
205 | S'\x17D,M54\xf1?'
206 | tRp52
207 | ag2
208 | (g4
209 | S'\x84}\xab\xb2\xe6C\xf1?'
210 | tRp53
211 | ag2
212 | (g4
213 | S'\x85\xaf\x81|\xde>\xf1?'
214 | tRp54
215 | ag2
216 | (g4
217 | S'6\xe7;J\xd0x\xf1?'
218 | tRp55
219 | ag2
220 | (g4
221 | S'z\xfb\r\xad\x8c|\xf1?'
222 | tRp56
223 | ag2
224 | (g4
225 | S'\x9c@\xb2\x89>\x81\xf1?'
226 | tRp57
227 | ag2
228 | (g4
229 | S'\xdb4\xce\x97Fn\xf1?'
230 | tRp58
231 | ag2
232 | (g4
233 | S'_^\xa1Xc\x97\xf1?'
234 | tRp59
235 | ag2
236 | (g4
237 | S"V'\n\x05\x0c\xac\xf1?"
238 | tRp60
239 | ag2
240 | (g4
241 | S'\x10\x1d\xc8\xb5$t\xf1?'
242 | tRp61
243 | ag2
244 | (g4
245 | S'c\x90\xf5VV\x9e\xf1?'
246 | tRp62
247 | ag2
248 | (g4
249 | S'\xaaV\x86l\xcb\xb2\xf1?'
250 | tRp63
251 | ag2
252 | (g4
253 | S'\x7f\x11\xd4{\xdf\xec\xf1?'
254 | tRp64
255 | ag2
256 | (g4
257 | S'@\x15\x99&,\xf4\xf1?'
258 | tRp65
259 | ag2
260 | (g4
261 | S'\xf9\x93\xa1\xb0\xe8\xef\xf1?'
262 | tRp66
263 | ag2
264 | (g4
265 | S'\xd8| \xa8w\xef\xf1?'
266 | tRp67
267 | ag2
268 | (g4
269 | S's\xe4-N\xcb\xc3\xf1?'
270 | tRp68
271 | ag2
272 | (g4
273 | S's\xa1\x99\xda\x9c\x1e\xf2?'
274 | tRp69
275 | ag2
276 | (g4
277 | S'e\x84\xe9\x9a\x90\x00\xf2?'
278 | tRp70
279 | ag2
280 | (g4
281 | S'\xaeB7Js\x13\xf2?'
282 | tRp71
283 | ag2
284 | (g4
285 | S'\x08pj@5\x1a\xf2?'
286 | tRp72
287 | ag2
288 | (g4
289 | S'\x1f\x881v\xfd;\xf2?'
290 | tRp73
291 | ag2
292 | (g4
293 | S'\xab2D&\xd5+\xf2?'
294 | tRp74
295 | ag2
296 | (g4
297 | S'\xec\xddG\x85\xea\x16\xf2?'
298 | tRp75
299 | ag2
300 | (g4
301 | S"\x02\xba'=\x80M\xf2?"
302 | tRp76
303 | ag2
304 | (g4
305 | S'\xf2B\xb1H&z\xf2?'
306 | tRp77
307 | ag2
308 | (g4
309 | S'\xe7\x8c\xa0%\xcf+\xf2?'
310 | tRp78
311 | ag2
312 | (g4
313 | S'\xa1L\xe9|\x15\xa4\xf2?'
314 | tRp79
315 | ag2
316 | (g4
317 | S'w\xc6\xc6\xde\xf5P\xf2?'
318 | tRp80
319 | ag2
320 | (g4
321 | S'\t\xfa\xac\xc7\xadW\xf2?'
322 | tRp81
323 | ag2
324 | (g4
325 | S'\xce\xcc\x90\x19kM\xf2?'
326 | tRp82
327 | ag2
328 | (g4
329 | S'F\x80[\xf0\xcau\xf2?'
330 | tRp83
331 | ag2
332 | (g4
333 | S'\xcdL\xabx\xbc\x8a\xf2?'
334 | tRp84
335 | ag2
336 | (g4
337 | S'=\xd7?:3\xb9\xf2?'
338 | tRp85
339 | ag2
340 | (g4
341 | S'\x0b\xa3\x8e_\xeb\xbd\xf2?'
342 | tRp86
343 | ag2
344 | (g4
345 | S'\xa1\xe2\xd3:\xff\xd2\xf2?'
346 | tRp87
347 | ag2
348 | (g4
349 | S" '1\x84R\xab\xf2?"
350 | tRp88
351 | ag2
352 | (g4
353 | S'\xa9n\xee\x1c\xce\x03\xf3?'
354 | tRp89
355 | ag2
356 | (g4
357 | S'\xc7\x15\x00o.\xbd\xf2?'
358 | tRp90
359 | ag2
360 | (g4
361 | S'\xde\xb8\x1bO\x9c\xbf\xf2?'
362 | tRp91
363 | ag2
364 | (g4
365 | S'\xec\x91H\x8fB\x07\xf3?'
366 | tRp92
367 | ag2
368 | (g4
369 | S'\xffr\xd9\xd1\xad\xcc\xf2?'
370 | tRp93
371 | ag2
372 | (g4
373 | S'\xb7\x01Y\x04m\xb9\xf2?'
374 | tRp94
375 | ag2
376 | (g4
377 | S'\xf8ZK\x1a\x1f\xe5\xf2?'
378 | tRp95
379 | ag2
380 | (g4
381 | S'\x10\xaaT;(\xbd\xf2?'
382 | tRp96
383 | ag2
384 | (g4
385 | S'a\x95\x98\xac,1\xf3?'
386 | tRp97
387 | ag2
388 | (g4
389 | S'\xebn\xa7\xdeD\xc7\xf2?'
390 | tRp98
391 | ag2
392 | (g4
393 | S'\xa5\xe2\xf0\xd9\xe9\x01\xf3?'
394 | tRp99
395 | ag2
396 | (g4
397 | S'\xd3\xb2jBP\x05\xf3?'
398 | tRp100
399 | ag2
400 | (g4
401 | S'TS\x92.\xf2\x08\xf3?'
402 | tRp101
403 | ag2
404 | (g4
405 | S'5\xab\xfd\xff\xb8\xe9\xf2?'
406 | tRp102
407 | ag2
408 | (g4
409 | S'\xb8\xf2\xa7\x1c\t\xd5\xf2?'
410 | tRp103
411 | ag2
412 | (g4
413 | S'\xf2\xc5w\x8a\xc1E\xf3?'
414 | tRp104
415 | ag2
416 | (g4
417 | S'\xd7\xc8\x16\xa3U\x0e\xf3?'
418 | tRp105
419 | ag2
420 | (g4
421 | S'T\x1c\xa7,o\xef\xf2?'
422 | tRp106
423 | ag2
424 | (g4
425 | S'*\xf1 \x9f\xf4\xfa\xf2?'
426 | tRp107
427 | ag2
428 | (g4
429 | S'.\x97J\xa6\xa6\\\xf3?'
430 | tRp108
431 | ag2
432 | (g4
433 | S'\xe8\xc1P:\x90?\xf3?'
434 | tRp109
435 | ag2
436 | (g4
437 | S'\xf9A0\xdf\xaaI\xf3?'
438 | tRp110
439 | ag2
440 | (g4
441 | S'\xdc\xb8\xbf\x05,\xd2\xf2?'
442 | tRp111
443 | ag2
444 | (g4
445 | S'T\xee\x13\x7f\xc84\xf3?'
446 | tRp112
447 | ag2
448 | (g4
449 | S'\xb2\x9b{\xc65M\xf3?'
450 | tRp113
451 | ag2
452 | (g4
453 | S'\xe8_\x19@?{\xf3?'
454 | tRp114
455 | ag2
456 | (g4
457 | S'\xb9\xb0\xc8L\xbb\x1d\xf3?'
458 | tRp115
459 | ag2
460 | (g4
461 | S'\xea\xf9\x03_\xedX\xf3?'
462 | tRp116
463 | ag2
464 | (g4
465 | S'\xd4&\x08H\xabH\xf3?'
466 | tRp117
467 | ag2
468 | (g4
469 | S'W\xa55\xea\x18>\xf3?'
470 | tRp118
471 | ag2
472 | (g4
473 | S'2i\x02\xee\xf7U\xf3?'
474 | tRp119
475 | ag2
476 | (g4
477 | S'\x13\x9a\x10\xb7\xe9\x81\xf3?'
478 | tRp120
479 | ag2
480 | (g4
481 | S'x\xfe\xfe\x0e\tQ\xf3?'
482 | tRp121
483 | ag2
484 | (g4
485 | S'\x9br\xa0qK~\xf3?'
486 | tRp122
487 | ag2
488 | (g4
489 | S'\xe8\xb1\x8a0\xfdO\xf3?'
490 | tRp123
491 | ag2
492 | (g4
493 | S'\xc0\xfc\x01\xe8\x85\xc1\xf3?'
494 | tRp124
495 | ag2
496 | (g4
497 | S'\xd09\xbde\xa6g\xf3?'
498 | tRp125
499 | ag2
500 | (g4
501 | S'\xa4n=\xd3\x1b9\xf3?'
502 | tRp126
503 | ag2
504 | (g4
505 | S'i\x9f.\xc0\xaf\x98\xf3?'
506 | tRp127
507 | ag2
508 | (g4
509 | S'E\xa0J\x1e\x16f\xf3?'
510 | tRp128
511 | ag2
512 | (g4
513 | S'{\xaf\xec\xddT\xc4\xf3?'
514 | tRp129
515 | ag2
516 | (g4
517 | S'1rIR\xf6\\\xf3?'
518 | tRp130
519 | ag2
520 | (g4
521 | S'\xa1\xd8Y!\x1e\x8c\xf3?'
522 | tRp131
523 | ag2
524 | (g4
525 | S'\x92\xd0\xba\xdd"\x80\xf3?'
526 | tRp132
527 | ag2
528 | (g4
529 | S'\xa3\x94\xc7\x8eV\x8f\xf3?'
530 | tRp133
531 | ag2
532 | (g4
533 | S'\xd11Xq\xb9\xbe\xf3?'
534 | tRp134
535 | ag2
536 | (g4
537 | S'\xe0\x9e\x0fn\xe3\xaf\xf3?'
538 | tRp135
539 | ag2
540 | (g4
541 | S'\x0f\xf7\x1eQ\xcc\xc4\xf3?'
542 | tRp136
543 | ag2
544 | (g4
545 | S'E[\x0f\x80m\xa7\xf3?'
546 | tRp137
547 | ag2
548 | (g4
549 | S'\xefA\x18\xc2\x01\xb7\xf3?'
550 | tRp138
551 | ag2
552 | (g4
553 | S'\x15)\xbd\x97\xcf\x97\xf3?'
554 | tRp139
555 | ag2
556 | (g4
557 | S'\x88|Xd\x03\xaa\xf3?'
558 | tRp140
559 | ag2
560 | (g4
561 | S'\xe4\nV\xd9\x99\xaa\xf3?'
562 | tRp141
563 | ag2
564 | (g4
565 | S'\x9f0\xf4i*\xd0\xf3?'
566 | tRp142
567 | ag2
568 | (g4
569 | S'\x1b\x1c\x86\x16?\xbf\xf3?'
570 | tRp143
571 | ag2
572 | (g4
573 | S'eqgj\xd9\xba\xf3?'
574 | tRp144
575 | ag2
576 | (g4
577 | S'\xbc\x88\x82\xa4f\xea\xf3?'
578 | tRp145
579 | ag2
580 | (g4
581 | S'a\xc4\xe0M\x1f\xbe\xf3?'
582 | tRp146
583 | ag2
584 | (g4
585 | S'\x99\xff\x19\xa3\xa2\xd3\xf3?'
586 | tRp147
587 | ag2
588 | (g4
589 | S'\x124\x96\x8a\xb4\xe4\xf3?'
590 | tRp148
591 | ag2
592 | (g4
593 | S'6#\xdan\x9d\x9b\xf3?'
594 | tRp149
595 | ag2
596 | (g4
597 | S'\xba\xd1\xd6\x8e\xb7\xcd\xf3?'
598 | tRp150
599 | ag2
600 | (g4
601 | S'BEz}H\xe5\xf3?'
602 | tRp151
603 | ag2
604 | (g4
605 | S'!\x82hfM\x02\xf4?'
606 | tRp152
607 | ag2
608 | (g4
609 | S'(\xd6\r\xbc\x8f\xfb\xf3?'
610 | tRp153
611 | ag2
612 | (g4
613 | S'=\x81/8n\xcb\xf3?'
614 | tRp154
615 | ag2
616 | (g4
617 | S'\x9c\xbe\xd8\xb0\xdc\xb6\xf3?'
618 | tRp155
619 | ag2
620 | (g4
621 | S'\xb6\x80\xb4\xae\xfe\xc3\xf3?'
622 | tRp156
623 | ag2
624 | (g4
625 | S'\xaegc\xb0\xdc\xd3\xf3?'
626 | tRp157
627 | ag2
628 | (g4
629 | S'\xa8\xa1\x9b\x94\xb6\x13\xf4?'
630 | tRp158
631 | ag2
632 | (g4
633 | S'\x8a"\x8dB;\xe8\xf3?'
634 | tRp159
635 | ag2
636 | (g4
637 | S'\x10s\xdf\xae]\xb9\xf3?'
638 | tRp160
639 | ag2
640 | (g4
641 | S'\xe0Z\xf2\x1a&\xdf\xf3?'
642 | tRp161
643 | ag2
644 | (g4
645 | S'y\xed0 \xb9\xf5\xf3?'
646 | tRp162
647 | ag2
648 | (g4
649 | S'\x1b\xa3\xbd\x14{\xea\xf3?'
650 | tRp163
651 | ag2
652 | (g4
653 | S'\x19\xfa\x1fUX\xec\xf3?'
654 | tRp164
655 | ag2
656 | (g4
657 | S'\xecU\x1f+C\x13\xf4?'
658 | tRp165
659 | ag2
660 | (g4
661 | S'\x8c\x80\xf7\xe4(\xe9\xf3?'
662 | tRp166
663 | ag2
664 | (g4
665 | S'c1ia\xd7\xb6\xf3?'
666 | tRp167
667 | ag2
668 | (g4
669 | S'\x14\xf6\xe1Og\x07\xf4?'
670 | tRp168
671 | ag2
672 | (g4
673 | S'\xa20\xd9g7\t\xf4?'
674 | tRp169
675 | ag2
676 | (g4
677 | S'\xc4P$T\xcb\xf6\xf3?'
678 | tRp170
679 | ag2
680 | (g4
681 | S'\xc6V\xcd\xec\xa4\x11\xf4?'
682 | tRp171
683 | ag2
684 | (g4
685 | S'\x97\xdc\xe3\x93I\xf5\xf3?'
686 | tRp172
687 | ag2
688 | (g4
689 | S'\xff\xfcL\xa9Q\xc8\xf3?'
690 | tRp173
691 | ag2
692 | (g4
693 | S'\x86\x02\x81\xf5B\xfc\xf3?'
694 | tRp174
695 | ag2
696 | (g4
697 | S'\xf0\xac|\xd9\xc6\n\xf4?'
698 | tRp175
699 | ag2
700 | (g4
701 | S'\xbf-\xca\\\x91\xea\xf3?'
702 | tRp176
703 | ag2
704 | (g4
705 | S'\x83p\xcb\xd6]\xe9\xf3?'
706 | tRp177
707 | ag2
708 | (g4
709 | S"?\xf5'\xb2\x0b\xd3\xf3?"
710 | tRp178
711 | ag2
712 | (g4
713 | S'\xa8\x82\xe3p\xae\xfa\xf3?'
714 | tRp179
715 | ag2
716 | (g4
717 | S'2\x8b\xe7\xf4\xc7\xf7\xf3?'
718 | tRp180
719 | ag2
720 | (g4
721 | S'R\xef[\xecR\xf1\xf3?'
722 | tRp181
723 | ag2
724 | (g4
725 | S'\xb8;\xee\x81?\xe7\xf3?'
726 | tRp182
727 | ag2
728 | (g4
729 | S'\xe9\xa8\xad\xe1\xeb\xf6\xf3?'
730 | tRp183
731 | ag2
732 | (g4
733 | S'M J\x19\xd5\x17\xf4?'
734 | tRp184
735 | ag2
736 | (g4
737 | S'B\x83\r)\xd8\xe4\xf3?'
738 | tRp185
739 | ag2
740 | (g4
741 | S'\xfd\xa0\xc7\x13\x81\xec\xf3?'
742 | tRp186
743 | ag2
744 | (g4
745 | S'\xd5\xf5\x04\x19f\x05\xf4?'
746 | tRp187
747 | ag2
748 | (g4
749 | S'6\x18\xb9\xe2q\xd4\xf3?'
750 | tRp188
751 | ag2
752 | (g4
753 | S'\xb3\xb0\x8b\xb6\x8a:\xf4?'
754 | tRp189
755 | ag2
756 | (g4
757 | S'\x8f\x00u\x07G*\xf4?'
758 | tRp190
759 | ag2
760 | (g4
761 | S'\x04:c\x99\x9f\x06\xf4?'
762 | tRp191
763 | ag2
764 | (g4
765 | S'x\xe7\x7f\xaa\x8eh\xf4?'
766 | tRp192
767 | ag2
768 | (g4
769 | S'\xfb\x1aR\x13m\xdd\xf3?'
770 | tRp193
771 | ag2
772 | (g4
773 | S'y\xd4\xc8\n\xcb\x1e\xf4?'
774 | tRp194
775 | ag2
776 | (g4
777 | S'\x8e\xf9.\xc9\xb5\xfe\xf3?'
778 | tRp195
779 | ag2
780 | (g4
781 | S'j\x86r\n\x94\x03\xf4?'
782 | tRp196
783 | ag2
784 | (g4
785 | S'\x02N\x03\x0e4\x03\xf4?'
786 | tRp197
787 | ag2
788 | (g4
789 | S'\xa5%\x16\x9e\xbf\xe4\xf3?'
790 | tRp198
791 | ag2
792 | (g4
793 | S'Mu\xe9\xdc\xf5\xec\xf3?'
794 | tRp199
795 | ag2
796 | (g4
797 | S'=\xd6\x90Yv(\xf4?'
798 | tRp200
799 | ag2
800 | (g4
801 | S'\xecL$W\xe7\x12\xf4?'
802 | tRp201
803 | ag2
804 | (g4
805 | S'\xfff\x8def\x06\xf4?'
806 | tRp202
807 | ag2
808 | (g4
809 | S'\xbb\xd2E\x11\xca\x15\xf4?'
810 | tRp203
811 | ag2
812 | (g4
813 | S'\xce\xa1\xac\x14\xab<\xf4?'
814 | tRp204
815 | ag2
816 | (g4
817 | S'jl|\x99M\x14\xf4?'
818 | tRp205
819 | ag2
820 | (g4
821 | S'\xf3vr\xb6WS\xf4?'
822 | tRp206
823 | ag2
824 | (g4
825 | S'B\xc4!\x7f\x06\x1d\xf4?'
826 | tRp207
827 | ag2
828 | (g4
829 | S'ZNI\xeb1I\xf4?'
830 | tRp208
831 | ag2
832 | (g4
833 | S'r\t%6\x8f\x11\xf4?'
834 | tRp209
835 | ag2
836 | (g4
837 | S'\xa5\xe10\xbd\xf59\xf4?'
838 | tRp210
839 | ag2
840 | (g4
841 | S'\xbe51\xffS\t\xf4?'
842 | tRp211
843 | ag2
844 | (g4
845 | S'z\xa4 \x11\xf2+\xf4?'
846 | tRp212
847 | ag2
848 | (g4
849 | S'\xd5\xca\xb4\xdb\xba\xfd\xf3?'
850 | tRp213
851 | ag2
852 | (g4
853 | S'l\xf1\x85\xadC\x04\xf4?'
854 | tRp214
855 | ag2
856 | (g4
857 | S'0\xcbo\xab\x90"\xf4?'
858 | tRp215
859 | ag2
860 | (g4
861 | S'\x84R\xebEg\xfc\xf3?'
862 | tRp216
863 | ag2
864 | (g4
865 | S'K\x1d\n\xd0\xf2/\xf4?'
866 | tRp217
867 | ag2
868 | (g4
869 | S'\x1bZ\xd4\xee\xaa?\xf4?'
870 | tRp218
871 | ag2
872 | (g4
873 | S',\x12r\xadw_\xf4?'
874 | tRp219
875 | ag2
876 | (g4
877 | S'L\xef\xb5\x00\xc7\x00\xf4?'
878 | tRp220
879 | ag2
880 | (g4
881 | S'\r\x8f\x92\x0f\x06:\xf4?'
882 | tRp221
883 | ag2
884 | (g4
885 | S'\xca:-\xfbf\\\xf4?'
886 | tRp222
887 | ag2
888 | (g4
889 | S"' \xa8\xa3\x91\xfd\xf3?"
890 | tRp223
891 | ag2
892 | (g4
893 | S'\xed\x02\xd7\xb1\xf6,\xf4?'
894 | tRp224
895 | ag2
896 | (g4
897 | S'\xa3\xa7u\xf5\x15!\xf4?'
898 | tRp225
899 | ag2
900 | (g4
901 | S'L\xe4\xd2\xd3G)\xf4?'
902 | tRp226
903 | ag2
904 | (g4
905 | S'\xf5A\xaa\xd7/\x0b\xf4?'
906 | tRp227
907 | ag2
908 | (g4
909 | S'\xcf\xac@>\xff@\xf4?'
910 | tRp228
911 | ag2
912 | (g4
913 | S'\x81x\x13Rm\x1c\xf4?'
914 | tRp229
915 | ag2
916 | (g4
917 | S'\t-n\x98"I\xf4?'
918 | tRp230
919 | ag2
920 | (g4
921 | S'\x9a\xd6|\xd0t\xfa\xf3?'
922 | tRp231
923 | ag2
924 | (g4
925 | S'\xe77\xaf$T0\xf4?'
926 | tRp232
927 | ag2
928 | (g4
929 | S')\r\x16\xb5<4\xf4?'
930 | tRp233
931 | ag2
932 | (g4
933 | S'\x1a\xce\xf3\xe0\x0b9\xf4?'
934 | tRp234
935 | ag2
936 | (g4
937 | S'i\xdf2AtC\xf4?'
938 | tRp235
939 | ag2
940 | (g4
941 | S'\xd9\xe6h\x0f\x8bC\xf4?'
942 | tRp236
943 | ag2
944 | (g4
945 | S'0 [(\x1a\x01\xf4?'
946 | tRp237
947 | ag2
948 | (g4
949 | S'\xae\x9e\x96\xcf\x80I\xf4?'
950 | tRp238
951 | ag2
952 | (g4
953 | S'\xb3\xdb\xb4g\xb5+\xf4?'
954 | tRp239
955 | ag2
956 | (g4
957 | S'\xdahk\xea\x1bJ\xf4?'
958 | tRp240
959 | ag2
960 | (g4
961 | S'\xf7i\xcdd\x93J\xf4?'
962 | tRp241
963 | ag2
964 | (g4
965 | S'.WXi\x9eE\xf4?'
966 | tRp242
967 | ag2
968 | (g4
969 | S'\x10\xa1=z\x80:\xf4?'
970 | tRp243
971 | ag2
972 | (g4
973 | S'\xd76\x7f1*o\xf4?'
974 | tRp244
975 | ag2
976 | (g4
977 | S'\x94\xdbd\x93\xab\x18\xf4?'
978 | tRp245
979 | ag2
980 | (g4
981 | S'\x11|r\xa2\x06-\xf4?'
982 | tRp246
983 | ag2
984 | (g4
985 | S'hYQ\x94\xc51\xf4?'
986 | tRp247
987 | ag2
988 | (g4
989 | S'\x84\xfd:KV~\xf4?'
990 | tRp248
991 | ag2
992 | (g4
993 | S'\xac\xaf\x01\x937;\xf4?'
994 | tRp249
995 | ag2
996 | (g4
997 | S'\xf7\x86\xf9\x9co\xfc\xf3?'
998 | tRp250
999 | ag2
1000 | (g4
1001 | S'q\xa5\x11H\xe4%\xf4?'
1002 | tRp251
1003 | ag2
1004 | (g4
1005 | S' \xbb\xf8\xc2\n,\xf4?'
1006 | tRp252
1007 | ag2
1008 | (g4
1009 | S'\x894\xae\xa1\xcb7\xf4?'
1010 | tRp253
1011 | ag2
1012 | (g4
1013 | S'\xc9\x8f\x919}=\xf4?'
1014 | tRp254
1015 | a.
--------------------------------------------------------------------------------
/coco_caption/METEOR.pkl:
--------------------------------------------------------------------------------
1 | (lp1
2 | F0.18915644594642708
3 | aF0.20911916479687623
4 | aF0.21949224920874325
5 | aF0.22395937995175452
6 | aF0.22636076772912747
7 | aF0.23063332405294543
8 | aF0.23168066989877056
9 | aF0.23275986881940422
10 | aF0.23417117399601728
11 | aF0.2350631549515593
12 | aF0.23678958786720067
13 | aF0.23671099113335686
14 | aF0.23922871999875575
15 | aF0.23980766235480724
16 | aF0.24250715969661582
17 | aF0.24409128448606446
18 | aF0.24417673283127833
19 | aF0.24534410461950146
20 | aF0.24717353763019165
21 | aF0.24719646311209323
22 | aF0.24650820758422559
23 | aF0.250388748859595
24 | aF0.24883898414632688
25 | aF0.25163130016639429
26 | aF0.24963518611164856
27 | aF0.25137443983486046
28 | aF0.25280151086472102
29 | aF0.2539250468977573
30 | aF0.25330708644786121
31 | aF0.25590649687564843
32 | aF0.25485096265105195
33 | aF0.25472098886275546
34 | aF0.25725636982963301
35 | aF0.2577444676483735
36 | aF0.25687426372906574
37 | aF0.26009681088862618
38 | aF0.25996750081263781
39 | aF0.25851554345039679
40 | aF0.26152881766684505
41 | aF0.25893832149455559
42 | aF0.26152794170654592
43 | aF0.26078061911783385
44 | aF0.26246229451701153
45 | aF0.26393208384860578
46 | aF0.26333146434380406
47 | aF0.26329484085148308
48 | aF0.26402779454741354
49 | aF0.26393363716423013
50 | aF0.26562805312849591
51 | aF0.264572119645897
52 | aF0.26601627774989034
53 | aF0.26632920720759801
54 | aF0.26748625980022117
55 | aF0.26663498870215679
56 | aF0.26810417109650442
57 | aF0.26916412073025509
58 | aF0.2674172239826727
59 | aF0.26680873225560681
60 | aF0.26968550960583088
61 | aF0.27003636233769723
62 | aF0.27116095980912269
63 | aF0.27091837362020121
64 | aF0.27190187421724671
65 | aF0.26940392346200703
66 | aF0.27176744667814412
67 | aF0.27089726905540917
68 | aF0.27288518217035101
69 | aF0.27274628383276189
70 | aF0.2728966926367839
71 | aF0.27219854188079728
72 | aF0.27262760529799257
73 | aF0.27505080069905191
74 | aF0.27477331309378716
75 | aF0.27243363100838913
76 | aF0.27510896016998154
77 | aF0.27502131399790919
78 | aF0.27404601246014254
79 | aF0.27481159756735757
80 | aF0.27560350140208262
81 | aF0.27514962922677899
82 | aF0.27718844396314396
83 | aF0.27608789049323135
84 | aF0.27804083149494685
85 | aF0.27725020699438457
86 | aF0.28040352064129626
87 | aF0.2772757474518821
88 | aF0.27813172148719623
89 | aF0.27954703349661797
90 | aF0.27729982167090511
91 | aF0.27683894481501659
92 | aF0.28107214089699611
93 | aF0.27790217290818026
94 | aF0.28104024908926628
95 | aF0.28048263204639945
96 | aF0.28061577170556196
97 | aF0.28023214986579476
98 | aF0.27940242965930034
99 | aF0.27926507426135566
100 | aF0.28019376437748233
101 | aF0.28248412298998271
102 | aF0.28046655881029953
103 | aF0.28002215564340555
104 | aF0.28046864475212863
105 | aF0.28390852552172363
106 | aF0.28296736924131477
107 | aF0.28268677077151555
108 | aF0.2798149277913497
109 | aF0.28374749773137276
110 | aF0.28284055862777119
111 | aF0.28334547401709625
112 | aF0.28240138854903724
113 | aF0.28365043720208905
114 | aF0.28296323222134556
115 | aF0.28340865376711727
116 | aF0.28375944351922833
117 | aF0.28482920079101431
118 | aF0.28285094729853871
119 | aF0.28537332875124127
120 | aF0.28369409144691493
121 | aF0.28788448500736619
122 | aF0.28417719629010885
123 | aF0.28247767281542563
124 | aF0.28610696382647222
125 | aF0.28507581771364848
126 | aF0.28774582385662278
127 | aF0.28284401083150934
128 | aF0.28526007497751038
129 | aF0.28584808770710396
130 | aF0.28676610806088709
131 | aF0.28640958412108453
132 | aF0.28593547723436086
133 | aF0.28608271026206111
134 | aF0.28661153862054706
135 | aF0.28883979698928797
136 | aF0.28681758599812696
137 | aF0.28730389532696099
138 | aF0.28662283193827115
139 | aF0.28892897219405839
140 | aF0.28759559638491083
141 | aF0.28666049624096002
142 | aF0.29011089745228869
143 | aF0.28900238335817979
144 | aF0.28860771778784133
145 | aF0.289670141090758
146 | aF0.28753610880743102
147 | aF0.28804254503597859
148 | aF0.29014480002922199
149 | aF0.29098093608245995
150 | aF0.29012750509699342
151 | aF0.28834429827938918
152 | aF0.2882065437807817
153 | aF0.28929992571507329
154 | aF0.28940006979015337
155 | aF0.29085633837764174
156 | aF0.29004617784801029
157 | aF0.28975265226997715
158 | aF0.28978342364090581
159 | aF0.28957382463597903
160 | aF0.28905715704535884
161 | aF0.29011513806956507
162 | aF0.29295667055631425
163 | aF0.29113327118771654
164 | aF0.28852864597199662
165 | aF0.29186213532220379
166 | aF0.2918781623550053
167 | aF0.29128146661360593
168 | aF0.29225980870223844
169 | aF0.29066672991543518
170 | aF0.29029890273104586
171 | aF0.29195084253980425
172 | aF0.29123553546819075
173 | aF0.29062106085118061
174 | aF0.29179804871193821
175 | aF0.28998709951209195
176 | aF0.29149185522962423
177 | aF0.29151798337369877
178 | aF0.29155177404501126
179 | aF0.29103521281804867
180 | aF0.29096331070421333
181 | aF0.29301306248005865
182 | aF0.29096500538456355
183 | aF0.28989199327667775
184 | aF0.29314262652344514
185 | aF0.29144324364037155
186 | aF0.29428756238948467
187 | aF0.2924259491287029
188 | aF0.29210513942607341
189 | aF0.29480797781844198
190 | aF0.29101766553274977
191 | aF0.29336269482609206
192 | aF0.29201278358565069
193 | aF0.29219909529639992
194 | aF0.2938912158574768
195 | aF0.2921506098701227
196 | aF0.29239299757508447
197 | aF0.29354609641852591
198 | aF0.29262711883114623
199 | aF0.29171791917485274
200 | aF0.29223746418306829
201 | aF0.29381567866483349
202 | aF0.29290370883214206
203 | aF0.29505681546520424
204 | aF0.29429962868558845
205 | aF0.29460775300456488
206 | aF0.29238086027195909
207 | aF0.29528543129374024
208 | aF0.29341402386920695
209 | aF0.29474568981359983
210 | aF0.29216358900268563
211 | aF0.2935913352931202
212 | aF0.29334145784065863
213 | aF0.29448787757565903
214 | aF0.29498898620849007
215 | aF0.29431393431643338
216 | aF0.29704984530391071
217 | aF0.2931446168207652
218 | aF0.29480617685982402
219 | aF0.29576975129065086
220 | aF0.29359336011535353
221 | aF0.29450203452857454
222 | aF0.29521354194283322
223 | aF0.29281448737344823
224 | aF0.29482487461515422
225 | aF0.29571226574885395
226 | aF0.29504088341902712
227 | aF0.29540440838582394
228 | aF0.29286989337211972
229 | aF0.29518663766845959
230 | aF0.2966833412001017
231 | aF0.295448470122119
232 | aF0.29711422648721347
233 | aF0.29711734749392577
234 | aF0.29627884382012593
235 | aF0.29639846161174005
236 | aF0.29497700766905133
237 | aF0.29605339757395865
238 | aF0.29650222850027647
239 | aF0.29612326748964929
240 | aF0.29621738509928525
241 | aF0.29881187108696727
242 | aF0.29557793809696425
243 | aF0.29560094518659574
244 | aF0.29561788163129465
245 | aF0.29949039672884398
246 | aF0.29586661778143331
247 | aF0.29518660763816906
248 | aF0.29576518080675901
249 | aF0.29505938092454825
250 | aF0.29744406928735179
251 | aF0.29636329553803575
252 | a.
--------------------------------------------------------------------------------
/coco_caption/draw.py:
--------------------------------------------------------------------------------
1 | #! encoding: UTF-8
2 |
3 | import cPickle as pickle
4 | import matplotlib.pyplot as plt
5 |
6 | with open("Bleu_1.pkl", "r") as f:
7 | Bleu_1 = pickle.load(f)
8 |
9 | with open("Bleu_2.pkl", "r") as f:
10 | Bleu_2 = pickle.load(f)
11 |
12 | with open("Bleu_3.pkl", "r") as f:
13 | Bleu_3 = pickle.load(f)
14 |
15 | with open("Bleu_4.pkl", "r") as f:
16 | Bleu_4 = pickle.load(f)
17 |
18 | with open("METEOR.pkl", "r") as f:
19 | METEOR = pickle.load(f)
20 |
21 | with open("CIDEr.pkl", "r") as f:
22 | CIDEr = pickle.load(f)
23 |
24 | print len(Bleu_1)
25 |
26 | plt.plot(range(0, 2*len(Bleu_1), 2), Bleu_1, label="Bleu-1", color="g")
27 | plt.plot(range(0, 2*len(Bleu_2), 2), Bleu_2, label="Bleu-2", color="r")
28 | plt.plot(range(0, 2*len(Bleu_3), 2), Bleu_3, label="Bleu-3", color="b")
29 | plt.plot(range(0, 2*len(Bleu_4), 2), Bleu_4, label="Bleu-4", color="m")
30 | plt.plot(range(0, 2*len(METEOR), 2), METEOR, label="METEOR", color="k")
31 | plt.plot(range(0, 2*len(CIDEr), 2), CIDEr, label="CIDEr", color="y")
32 |
33 | plt.grid(True)
34 | #plt.legend(handles=[line_1, line_2])
35 | #plt.legend(bbox_to_anchor=(1.05, 1), loc=2, borderaxespad=0.)
36 | #plt.legend(handles=[line1], loc=1)
37 | plt.legend(loc=2)
38 | plt.show()
39 | #plt.savefig("tmp.png")
40 |
--------------------------------------------------------------------------------
/coco_caption/eval_captions_results.py:
--------------------------------------------------------------------------------
1 | #! encoding: UTF-8
2 |
3 | import os
4 | from pycocotools.coco import COCO
5 | from pycocoevalcap.eval import COCOEvalCap
6 |
7 | annFile = "../data/train_val_all_reference.json"
8 | resFile = "captions_val2014_results.json"
9 |
10 | # create coco object and cocoRes object
11 | coco = COCO(annFile)
12 |
13 | # after generating the captions_val2014_results.json file
14 | # we call the coco caption evaluation tools
15 | cocoRes = coco.loadRes(resFile)
16 |
17 | # create cocoEval object by taking coco and cocoRes
18 | cocoEval = COCOEvalCap(coco, cocoRes)
19 |
20 | # evaluate on a subset of images by setting
21 | # cocoEval.params['image_id'] = cocoRes.getImgIds()
22 | # please remove this line when evaluating the full validation set
23 | cocoEval.params['image_id'] = cocoRes.getImgIds()
24 |
25 | # evaluate results
26 | cocoEval.evaluate()
27 |
28 | # print output evaluation scores
29 | for metric, score in cocoEval.eval.items():
30 | print '%s: %.3f'%(metric, score)
--------------------------------------------------------------------------------
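The script above only prints each metric to stdout. If you also want to keep the scores around (for example to compare them against the pickled curves in this directory), a small addition at the end of the script could dump `cocoEval.eval` (the `{metric: score}` dict that `evaluate()` fills in) to disk. This is an illustrative sketch, not part of the original file, and the output filename is made up:
```python
# illustrative addition to the end of eval_captions_results.py: persist the scores
import json

with open("captions_val2014_results_scores.json", "w") as fw:
    json.dump(cocoEval.eval, fw, indent=2)
```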
/coco_caption/eval_image_caption.py:
--------------------------------------------------------------------------------
1 | # encoding: UTF-8
2 |
3 | import os
4 | import sys
5 | import glob
6 | import random
7 | import time
8 | import json
9 | from json import encoder
10 | import numpy as np
11 | import cPickle as pickle
12 | import matplotlib.pyplot as plt
13 |
14 | import tensorflow as tf
15 |
16 | sys.path.append('./coco_caption/')
17 | from pycocotools.coco import COCO
18 | from pycocoevalcap.eval import COCOEvalCap
19 |
20 | import ipdb
21 |
22 |
23 | #############################################################################################################
24 | #
25 | # Step 1: Input: D = {(x^n, y^n): n = 1:N}
26 | # Step 2: Train \Pi(g_{1:T} | x) using MLE on D (MLE: maximum likelihood estimation)
27 | #
28 | ############################################################################################################
29 | class CNN_LSTM():
30 | def __init__(self,
31 | n_words,
32 | batch_size,
33 | feats_dim,
34 | project_dim,
35 | lstm_size,
36 | word_embed_dim,
37 | lstm_step,
38 | bias_init_vector=None):
39 |
40 | self.n_words = n_words
41 | self.batch_size = batch_size
42 | self.feats_dim = feats_dim
43 | self.project_dim = project_dim
44 | self.lstm_size = lstm_size
45 | self.word_embed_dim = word_embed_dim
46 | self.lstm_step = lstm_step
47 |
48 | # project the image feature vector of dimension 2048 to 512 dimension, with a linear layer
49 | # self.encode_img_W: 2048 x 512
50 | # self.encode_img_b: 512
51 | self.encode_img_W = tf.Variable(tf.random_uniform([feats_dim, project_dim], -0.1, 0.1), name="encode_img_W")
52 | self.encode_img_b = tf.zeros([project_dim], name="encode_img_b")
53 |
54 | with tf.device("/cpu:0"):
55 | self.Wemb = tf.Variable(tf.random_uniform([n_words, word_embed_dim], -0.1, 0.1), name="Wemb")
56 |
57 | self.lstm = tf.nn.rnn_cell.BasicLSTMCell(lstm_size, state_is_tuple=True)
58 |
59 | self.embed_word_W = tf.Variable(tf.random_uniform([lstm_size, n_words], -0.1, 0.1), name="embed_word_W")
60 |
61 | if bias_init_vector is not None:
62 | self.embed_word_b = tf.Variable(bias_init_vector.astype(np.float32), name="embed_word_b")
63 | else:
64 | self.embed_word_b = tf.Variable(tf.zeros([n_words]), name="embed_word_b")
65 |
66 | self.baseline_MLP_W = tf.Variable(tf.random_uniform([lstm_size, 1], -0.1, 0.1), name="baseline_MLP_W")
67 | self.baseline_MLP_b = tf.Variable(tf.zeros([1]), name="baseline_MLP_b")
68 |
69 | ############################################################################################################
70 | #
71 | # Class function for step 2
72 | #
73 | ############################################################################################################
74 | def build_model(self):
75 | images = tf.placeholder(tf.float32, [self.batch_size, self.feats_dim])
76 | sentences = tf.placeholder(tf.int32, [self.batch_size, self.lstm_step])
77 | masks = tf.placeholder(tf.float32, [self.batch_size, self.lstm_step])
78 |
79 | images_embed = tf.matmul(images, self.encode_img_W) + self.encode_img_b
80 |
81 | state = self.lstm.zero_state(batch_size=self.batch_size, dtype=tf.float32)
82 |
83 | loss = 0.0
84 | with tf.variable_scope("LSTM"):
85 | for i in range(0, self.lstm_step):
86 | if i == 0:
87 | current_emb = images_embed
88 | else:
89 | with tf.device("/cpu:0"):
90 | current_emb = tf.nn.embedding_lookup(self.Wemb, sentences[:, i-1])
91 |
92 | if i > 0:
93 | tf.get_variable_scope().reuse_variables()
94 |
95 | output, state = self.lstm(current_emb, state)
96 |
97 | if i > 0:
98 | labels = tf.expand_dims(sentences[:, i], 1)
99 | indices = tf.expand_dims(tf.range(0, self.batch_size, 1), 1)
100 | concated = tf.concat(1, [indices, labels])
101 | onehot_labels = tf.sparse_to_dense( concated, tf.pack([self.batch_size, self.n_words]), 1.0, 0.0)
102 |
103 | logit_words = tf.matmul(output, self.embed_word_W) + self.embed_word_b
104 | cross_entropy = tf.nn.softmax_cross_entropy_with_logits(logit_words, onehot_labels)
105 | cross_entropy = cross_entropy * masks[:, i]
106 | current_loss = tf.reduce_sum(cross_entropy)/self.batch_size
107 |
108 | loss = loss + current_loss
109 | return loss, images, sentences, masks
110 |
111 | def generate_model(self):
112 | images = tf.placeholder(tf.float32, [1, self.feats_dim])
113 | images_embed = tf.matmul(images, self.encode_img_W) + self.encode_img_b
114 |
115 | state = self.lstm.zero_state(batch_size=1, dtype=tf.float32)
116 | sentences = []
117 |
118 | with tf.variable_scope("LSTM"):
119 | output, state = self.lstm(images_embed, state)
120 |
121 | with tf.device("/cpu:0"):
122 | current_emb = tf.nn.embedding_lookup(self.Wemb, tf.ones([1], dtype=tf.int64))
123 |
124 | for i in range(0, self.lstm_step):
125 | tf.get_variable_scope().reuse_variables()
126 |
127 | output, state = self.lstm(current_emb, state)
128 |
129 | logit_words = tf.matmul(output, self.embed_word_W) + self.embed_word_b
130 | max_prob_word = tf.argmax(logit_words, 1)[0]
131 |
132 | with tf.device("/cpu:0"):
133 | current_emb = tf.nn.embedding_lookup(self.Wemb, max_prob_word)
134 | current_emb = tf.expand_dims(current_emb, 0)
135 | sentences.append(max_prob_word)
136 |
137 | return images, sentences
138 |
139 |
140 | ##############################################################################
141 | #
142 | # set parameters and path
143 | #
144 | ##############################################################################
145 | batch_size = 100
146 | feats_dim = 2048
147 | project_dim = 512
148 | lstm_size = 512
149 | word_embed_dim = 512
150 | lstm_step = 20
151 |
152 | idx_to_word_path = '../data/idx_to_word.pkl'
153 | word_to_idx_path = '../data/word_to_idx.pkl'
154 | bias_init_vector_path = '../data/bias_init_vector.npy'
155 |
156 | with open(idx_to_word_path, 'r') as fr_3:
157 | idx_to_word = pickle.load(fr_3)
158 |
159 | with open(word_to_idx_path, 'r') as fr_4:
160 | word_to_idx = pickle.load(fr_4)
161 |
162 | bias_init_vector = np.load(bias_init_vector_path)
163 |
164 |
165 | ##########################################################################################
166 | #
167 | # I moved the generation model part out of the Val_with_MLE function
168 | #
169 | ##########################################################################################
170 | n_words = len(idx_to_word)
171 |
172 | val_feats_path = '../inception/val_feats'
173 | val_feats_names = glob.glob(val_feats_path + '/*.npy')
174 | val_images_names = map(lambda x: os.path.basename(x)[0:-4], val_feats_names)
175 |
176 | model = CNN_LSTM(n_words = n_words,
177 | batch_size = batch_size,
178 | feats_dim = feats_dim,
179 | project_dim = project_dim,
180 | lstm_size = lstm_size,
181 | word_embed_dim = word_embed_dim,
182 | lstm_step = lstm_step,
183 | bias_init_vector = None)
184 | tf_images, tf_sentences = model.generate_model()
185 |
186 | def Val_with_MLE(model_path):
187 | '''
188 | n_words = len(idx_to_word)
189 |
190 | # version 1: test all validation images
191 | val_feats_path = '../inception/val_feats'
192 | val_feats_names = glob.glob(val_feats_path + '/*.npy')
193 | val_images_names = map(lambda x: os.path.basename(x)[0:-4], val_feats_names)
194 |
195 | model = CNN_LSTM(n_words = n_words,
196 | batch_size = batch_size,
197 | feats_dim = feats_dim,
198 | project_dim = project_dim,
199 | lstm_size = lstm_size,
200 | word_embed_dim = word_embed_dim,
201 | lstm_step = lstm_step,
202 | bias_init_vector = None)
203 | tf_images, tf_sentences = model.generate_model()
204 | '''
205 | sess = tf.InteractiveSession()
206 | saver = tf.train.Saver()
207 | saver.restore(sess, model_path)
208 |
209 | fw_1 = open("val2014_results.txt", 'w')
210 | for idx, img_name in enumerate(val_images_names[0:5000]):
211 | print "{}, {}".format(idx, img_name)
212 | start_time = time.time()
213 |
214 | current_feats = np.load( os.path.join(val_feats_path, img_name+'.npy') )
215 | current_feats = np.reshape(current_feats, [1, feats_dim])
216 |
217 | sentences_index = sess.run(tf_sentences, feed_dict={tf_images: current_feats})
218 | sentences = []
219 | for idx_word in sentences_index:
220 | word = idx_to_word[idx_word]
221 | word = word.replace('\n', '')
222 | word = word.replace('\\', '')
223 | word = word.replace('"', '')
224 | sentences.append(word)
225 |
226 | punctuation = np.argmax(np.array(sentences) == '<eos>') + 1
227 | sentences = sentences[:punctuation]
228 | generated_sentence = ' '.join(sentences)
229 | generated_sentence = generated_sentence.replace('<bos> ', '')
230 | generated_sentence = generated_sentence.replace(' <eos>', '')
231 |
232 | print generated_sentence,'\n'
233 | fw_1.write(img_name + '\n')
234 | fw_1.write(generated_sentence + '\n')
235 | fw_1.close()
236 |
237 |
--------------------------------------------------------------------------------
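For reference, the loss that `build_model` above accumulates is a masked cross-entropy: at every step after the image embedding, the logits are compared against the next ground-truth word, masked so that padding does not contribute, then summed over time and averaged over the batch. A minimal NumPy sketch of that computation (shapes and names are illustrative; the real graph produces the logits step by step with the LSTM):
```python
import numpy as np

def masked_xent_loss(logits, sentences, masks):
    # logits:    [batch, steps, n_words] unnormalized word scores
    # sentences: [batch, steps] integer word indices (targets)
    # masks:     [batch, steps] 1.0 at real words, 0.0 at padding
    batch, steps, n_words = logits.shape
    loss = 0.0
    for t in range(1, steps):  # step 0 consumes the image embedding, no word target
        x = logits[:, t, :] - logits[:, t, :].max(axis=1, keepdims=True)  # stability
        log_probs = x - np.log(np.exp(x).sum(axis=1, keepdims=True))
        xent = -log_probs[np.arange(batch), sentences[:, t]]
        loss += (xent * masks[:, t]).sum() / batch
    return loss
```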
/coco_caption/eval_model.py:
--------------------------------------------------------------------------------
1 | #! encoding: UTF-8
2 |
3 | import os
4 | import ipdb
5 | import glob
6 | import time
7 | import subprocess
8 | import cPickle as pickle
9 | import matplotlib.pyplot as plt
10 |
11 | from pycocotools.coco import COCO
12 | from pycocoevalcap.eval import COCOEvalCap
13 |
14 | import eval_image_caption
15 |
16 |
17 | model_path = "../models"
18 |
19 | annFile = "../data/train_val_all_reference.json"
20 | resFile = "captions_val2014_results.json"
21 |
22 | # create coco object and cocoRes object
23 | coco = COCO(annFile)
24 |
25 | n_epochs = 500
26 | n_epochs += 2
27 |
28 | with open("Bleu_1.pkl", "r") as f:
29 | Bleu_1 = pickle.load(f)
30 |
31 | with open("Bleu_2.pkl", "r") as f:
32 | Bleu_2 = pickle.load(f)
33 |
34 | with open("Bleu_3.pkl", "r") as f:
35 | Bleu_3 = pickle.load(f)
36 |
37 | with open("Bleu_4.pkl", "r") as f:
38 | Bleu_4 = pickle.load(f)
39 |
40 | with open("METEOR.pkl", "r") as f:
41 | METEOR = pickle.load(f)
42 |
43 | with open("CIDEr.pkl", "r") as f:
44 | CIDEr = pickle.load(f)
45 |
46 | for idx_model in range(202, n_epochs, 2):
47 | model_name = os.path.join(model_path, "model_MLP-" + str(idx_model))
48 |
49 | start_time = time.time()
50 |
51 | # generate the val2014_results.txt
52 | eval_image_caption.Val_with_MLE(model_name)
53 |
54 | # call the gen_val_json.py with subprocess
55 | # we will generate the captions_val2014_results.json file
56 | subprocess.call(["python", "gen_val_json.py"])
57 |
58 | # after generating the captions_val2014_results.json file
59 | # we call the coco caption evaluation tools
60 | cocoRes = coco.loadRes(resFile)
61 |
62 | # create cocoEval object by taking coco and cocoRes
63 | cocoEval = COCOEvalCap(coco, cocoRes)
64 |
65 | # evaluate on a subset of images by setting
66 | # cocoEval.params['image_id'] = cocoRes.getImgIds()
67 | # please remove this line when evaluating the full validation set
68 | cocoEval.params['image_id'] = cocoRes.getImgIds()
69 |
70 | # evaluate results
71 | cocoEval.evaluate()
72 |
73 | # print output evaluation scores
74 | for metric, score in cocoEval.eval.items():
75 | print '%s: %.3f'%(metric, score)
76 | if metric == "Bleu_1":
77 | Bleu_1.append(score)
78 | if metric == "Bleu_2":
79 | Bleu_2.append(score)
80 | if metric == "Bleu_3":
81 | Bleu_3.append(score)
82 | if metric == "Bleu_4":
83 | Bleu_4.append(score)
84 | if metric == "METEOR":
85 | METEOR.append(score)
86 | if metric == "CIDEr":
87 | CIDEr.append(score)
88 | # save the scores immediately
89 | with open("Bleu_1.pkl", "w") as fw1:
90 | pickle.dump(Bleu_1, fw1)
91 | with open("Bleu_2.pkl", "w") as fw2:
92 | pickle.dump(Bleu_2, fw2)
93 | with open("Bleu_3.pkl", "w") as fw3:
94 | pickle.dump(Bleu_3, fw3)
95 | with open("Bleu_4.pkl", "w") as fw4:
96 | pickle.dump(Bleu_4, fw4)
97 | with open("METEOR.pkl", "w") as fw5:
98 | pickle.dump(METEOR, fw5)
99 | with open("CIDEr.pkl", "w") as fw6:
100 | pickle.dump(CIDEr, fw6)
101 |
102 | print "Mdoel {} evaluation time cost: {}".format(model_name, time.time()-start_time)
103 |
104 | # draw the pictures
105 | plt.plot(range(len(Bleu_1)), Bleu_1, label="Bleu-1", color="g")
106 | plt.plot(range(len(Bleu_2)), Bleu_2, label="Bleu-2", color="r")
107 | plt.plot(range(len(Bleu_3)), Bleu_3, label="Bleu-3", color="b")
108 | plt.plot(range(len(Bleu_4)), Bleu_4, label="Bleu-4", color="m")
109 | plt.plot(range(len(METEOR)), METEOR, label="METEOR", color="k")
110 | plt.plot(range(len(CIDEr)), CIDEr, label="CIDEr", color="y")
111 | plt.grid(True)
112 | plt.legend(loc=2)
113 | plt.savefig("evalution.png")
114 | plt.show()
--------------------------------------------------------------------------------
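Because checkpoints are written every second epoch (see `np.mod(epoch, 2) == 0` in image_caption.py) and the pickled lists grow by one entry per evaluated checkpoint, list index `i` corresponds to checkpoint `model_MLP-(2*i)`; this is also why draw.py plots against `range(0, 2*len(...), 2)`. A small sketch, under that assumption, for locating the best checkpoint from a saved curve:
```python
import cPickle as pickle

with open("CIDEr.pkl", "r") as f:
    CIDEr = pickle.load(f)

# list index i <-> epoch 2*i, i.e. checkpoint "model_MLP-<2*i>"
best_idx = max(range(len(CIDEr)), key=lambda i: CIDEr[i])
print "best CIDEr %.4f at epoch %d (model_MLP-%d)" % (CIDEr[best_idx], 2*best_idx, 2*best_idx)
```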
/coco_caption/gen_test_json.py:
--------------------------------------------------------------------------------
1 | #!/usr/bin/env python
2 | # coding=utf-8
3 | import os
4 | import json
5 | import cPickle as pickle
6 |
7 | test_results_save_path = '../test2014_results_model-486.txt'
8 | test_results = open(test_results_save_path).read().splitlines()
9 |
10 | images_captions = {}
11 | captions = []
12 | names = []
13 | for idx, item in enumerate(test_results):
14 | if idx % 2 == 0:
15 | names.append(item)
16 | if idx % 2 == 1:
17 | captions.append(item)
18 |
19 | for idx, name in enumerate(names):
20 | print idx, ' ', name
21 | images_captions[name] = captions[idx]
22 |
23 | with open('../data/test2014_images_ids_to_names.pkl', 'r') as fr_1:
24 | test2014_images_ids_to_names = pickle.load(fr_1)
25 |
26 | names_to_ids = {}
27 | for key, item in test2014_images_ids_to_names.iteritems():
28 | names_to_ids[item] = key
29 |
30 | fw_1 = open('captions_test2014_results.json', 'w')
31 | fw_1.write('[')
32 |
33 | for idx, name in enumerate(names):
34 | print idx, ' ', name
35 | tmp_idx = names.index(name)
36 | caption = captions[tmp_idx]
37 | caption = caption.replace(' ,', ',')
38 | caption = caption.replace('"', '')
39 | caption = caption.replace('\n', '')
40 | if idx != len(names)-1:
41 | fw_1.write('{"image_id": ' + str(names_to_ids[name]) + ', "caption": "' + str(caption) + '"}, ')
42 | else:
43 | fw_1.write('{"image_id": ' + str(names_to_ids[name]) + ', "caption": "' + str(caption) + '"}]')
44 |
45 | fw_1.close()
46 |
--------------------------------------------------------------------------------
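The script writes the result JSON by concatenating strings, which is why it has to strip quotes and newlines from the captions by hand. An equivalent and more robust way to produce the same `[{"image_id": ..., "caption": ...}, ...]` layout is to build a list of dicts and let the json module handle the escaping; a sketch using the same inputs as the script above:
```python
#!/usr/bin/env python
# coding=utf-8
# sketch: same output as gen_test_json.py, built with the json module
import json
import cPickle as pickle

results = open('../test2014_results_model-486.txt').read().splitlines()
names, captions = results[0::2], results[1::2]

with open('../data/test2014_images_ids_to_names.pkl', 'r') as fr:
    ids_to_names = pickle.load(fr)
names_to_ids = {v: k for k, v in ids_to_names.iteritems()}

entries = [{"image_id": names_to_ids[name], "caption": caption.replace(' ,', ',')}
           for name, caption in zip(names, captions)]

with open('captions_test2014_results.json', 'w') as fw:
    json.dump(entries, fw)
```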
/coco_caption/gen_val_json.py:
--------------------------------------------------------------------------------
1 | #!/usr/bin/env python
2 | # coding=utf-8
3 | import os
4 | import json
5 | import cPickle as pickle
6 |
7 | test_results_save_path = '../val2014_results_model_MLP-486.txt'
8 | test_results = open(test_results_save_path).read().splitlines()
9 |
10 | images_captions = {}
11 | captions = []
12 | names = []
13 | for idx, item in enumerate(test_results):
14 | if idx % 2 == 0:
15 | names.append(item)
16 | if idx % 2 == 1:
17 | captions.append(item)
18 |
19 | for idx, name in enumerate(names):
20 | print idx, ' ', name
21 | images_captions[name] = captions[idx]
22 |
23 | with open('../data/val2014_images_ids_to_names.pkl', 'r') as fr_1:
24 | test2014_images_ids_to_names = pickle.load(fr_1)
25 |
26 | names_to_ids = {}
27 | for key, item in test2014_images_ids_to_names.iteritems():
28 | names_to_ids[item] = key
29 |
30 | fw_1 = open('captions_val2014_results.json', 'w')
31 | fw_1.write('[')
32 |
33 | for idx, name in enumerate(names):
34 | print idx, ' ', name
35 | tmp_idx = names.index(name)
36 | caption = captions[tmp_idx]
37 | caption = caption.replace(' ,', ',')
38 | caption = caption.replace('"', '')
39 | caption = caption.replace('\n', '')
40 | if idx != len(names)-1:
41 | fw_1.write('{"image_id": ' + str(names_to_ids[name]) + ', "caption": "' + str(caption) + '"}, ')
42 | else:
43 | fw_1.write('{"image_id": ' + str(names_to_ids[name]) + ', "caption": "' + str(caption) + '"}]')
44 |
45 | fw_1.close()
46 |
--------------------------------------------------------------------------------
/coco_caption/model_evalution.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/chenxinpeng/Optimization_of_image_description_metrics_using_policy_gradient_methods/66089304b3dc78a1e27f90e262d0cb17c5bb4cf2/coco_caption/model_evalution.png
--------------------------------------------------------------------------------
/coco_caption/read_test_info.py:
--------------------------------------------------------------------------------
1 | #!/usr/bin/env python
2 | # coding=utf-8
3 |
4 | import os
5 | import json
6 | import cPickle as pickle
7 |
8 | image_info_test2014_path = '../data/image_info_test2014.json'
9 |
10 | image_info_json = json.load(open(image_info_test2014_path, 'r'))
11 |
12 | images_info = image_info_json["images"]
13 |
14 | imageIds_to_imageNames = {}
15 | for image in images_info:
16 | id = int(image["id"])
17 | name = image["file_name"]
18 | imageIds_to_imageNames[id] = name
19 |
20 | with open("./data/test2014_images_ids_to_names.pkl", 'w') as fw_1:
21 | pickle.dump(imageIds_to_imageNames, fw_1)
22 |
--------------------------------------------------------------------------------
/coco_caption/read_validation_info.py:
--------------------------------------------------------------------------------
1 | #!/usr/bin/env python
2 | # coding=utf-8
3 |
4 | import os
5 | import json
6 | import cPickle as pickle
7 |
8 | image_info_test2014_path = '../data/captions_val2014.json'
9 |
10 | image_info_json = json.load(open(image_info_test2014_path, 'r'))
11 |
12 | images_info = image_info_json["images"]
13 |
14 | imageIds_to_imageNames = {}
15 | for image in images_info:
16 | id = int(image["id"])
17 | name = image["file_name"]
18 | imageIds_to_imageNames[id] = name
19 |
20 | with open("val2014_images_ids_to_names.pkl", 'w') as fw_1:
21 | pickle.dump(imageIds_to_imageNames, fw_1)
22 |
--------------------------------------------------------------------------------
/create_train_val_all_reference.py:
--------------------------------------------------------------------------------
1 | # encoding: UTF-8
2 |
3 | import os
4 | import glob
5 | import json
6 | import cPickle as pickle
7 |
8 |
9 | train_val_imageNames_to_imageIDs = {}
10 | train_val_Names_Captions = []
11 | #train_imageNames_to_imageIDs = {}
12 | #val_imageNames_to_imageIDs = {}
13 |
14 | ################################################################
15 | with open('./data/captions_train2014.json') as fr_1:
16 | train_captions = json.load(fr_1)
17 |
18 | for image in train_captions['images']:
19 | image_name = image['file_name']
20 | image_id = image['id']
21 | train_val_imageNames_to_imageIDs[image_name] = image_id
22 |
23 | for image in train_captions['annotations']:
24 | image_id = image['image_id']
25 | image_caption = image['caption']
26 | train_val_Names_Captions.append([image_id, image_caption])
27 |
28 | #################################################################
29 | with open('./data/captions_val2014.json') as fr_2:
30 | val_captions = json.load(fr_2)
31 |
32 | for image in val_captions['images']:
33 | image_name = image['file_name']
34 | image_id = image['id']
35 | train_val_imageNames_to_imageIDs[image_name] = image_id
36 |
37 | for image in val_captions['annotations']:
38 | image_id = image['image_id']
39 | image_caption = image['caption']
40 | train_val_Names_Captions.append([image_id, image_caption])
41 |
42 | #################################################################
43 |
44 | json_fw = open('./data/train_val_all_reference.json', 'w')
45 | json_fw.write('{"info": {"description": "Test", "url": "https://github.com/chenxinpeng", "version": "1.0", "year": 2017, "contributor": "Chen Xinpeng", "date_created": "2017"}, "images": [')
46 |
47 | count = 0
48 | for imageName, imageID in train_val_imageNames_to_imageIDs.iteritems():
49 | if count != len(train_val_imageNames_to_imageIDs)-1:
50 | json_fw.write('{"license": 1, "file_name": "' + str(imageName) + '", "id": ' + str(imageID) + '}, ')
51 | else:
52 | json_fw.write('{"license": 1, "file_name": "' + str(imageName) + '", "id": ' + str(imageID) + '}]')
53 | count += 1
54 |
55 | json_fw.write(', "licenses": [{"url": "http://creativecommons.org/licenses/by-nc-sa/2.0/", "id": 1, "name": "Test"}], ')
56 |
57 | json_fw.write('"type": "captions", "annotations": [')
58 |
59 | flag_count = 0
60 | id_count = 0
61 | for imageName, imageID in train_val_imageNames_to_imageIDs.iteritems():
62 | print "{}, {}, {}".format(flag_count, imageName, imageID)
63 |
64 | captions = []
65 | for item in train_val_Names_Captions:
66 | if item[0] == imageID:
67 | captions.append(item[1])
68 |
69 | count_captions = 0
70 | if flag_count != len(train_val_imageNames_to_imageIDs)-1:
71 | for idx, each_sent in enumerate(captions):
72 | if '\n' in each_sent:
73 | each_sent = each_sent.replace('\n', '')
74 | if '\\' in each_sent:
75 | each_sent = each_sent.replace('\\', '')
76 | if '"' in each_sent:
77 | each_sent = each_sent.replace('"', '')
78 | json_fw.write('{"image_id": ' + str(imageID) + ', "id": ' + str(id_count) + ', "caption": "' + each_sent + '"}, ')
79 | id_count += 1
80 |
81 | if flag_count == len(train_val_imageNames_to_imageIDs)-1:
82 | for idx, each_sent in enumerate(captions):
83 | if '\n' in each_sent:
84 | each_sent = each_sent.replace('\n', '')
85 | if '\\' in each_sent:
86 | each_sent = each_sent.replace('\\', '')
87 | if '"' in each_sent:
88 | each_sent = each_sent.replace('"', '')
89 | if idx != len(captions)-1:
90 | json_fw.write('{"image_id": ' + str(imageID) + ', "id": ' + str(id_count) + ', "caption": "' + each_sent + '"}, ')
91 | else:
92 | json_fw.write('{"image_id": ' + str(imageID) + ', "id": ' + str(id_count) + ', "caption": "' + each_sent + '"}]}')
93 | id_count += 1
94 |
95 | flag_count += 1
96 |
97 | json_fw.close()
98 |
--------------------------------------------------------------------------------
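Two things are worth noting about the script above: the annotations are assembled with nested scans (for every image it walks the full caption list, which is quadratic in the number of annotations), and the JSON is again emitted as hand-built strings. Grouping the captions by image id first and serializing with the json module produces the same COCO-style reference file in one pass; a hedged sketch, assuming the same two input files:
```python
# sketch: build data/train_val_all_reference.json with a dict + json.dump
import json
from collections import defaultdict

images, annotations = [], []
captions_by_id = defaultdict(list)

for split in ('./data/captions_train2014.json', './data/captions_val2014.json'):
    with open(split) as fr:
        coco = json.load(fr)
    for img in coco['images']:
        images.append({"license": 1, "file_name": img['file_name'], "id": img['id']})
    for ann in coco['annotations']:
        captions_by_id[ann['image_id']].append(ann['caption'])

ann_id = 0
for image_id, caps in captions_by_id.iteritems():
    for cap in caps:
        annotations.append({"image_id": image_id, "id": ann_id,
                            "caption": cap.replace('\n', '')})
        ann_id += 1

reference = {
    "info": {"description": "Test", "version": "1.0", "year": 2017},
    "licenses": [{"url": "http://creativecommons.org/licenses/by-nc-sa/2.0/",
                  "id": 1, "name": "Test"}],
    "type": "captions",
    "images": images,
    "annotations": annotations,
}
with open('./data/train_val_all_reference.json', 'w') as fw:
    json.dump(reference, fw)
```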
/create_train_val_each_reference.py:
--------------------------------------------------------------------------------
1 | # encoding: UTF-8
2 |
3 | ###############################################################
4 | #
5 | # generate each image's captions into its own json file, one by one,
6 | # and build the dict that maps image names to image IDs
7 | #
8 | ###############################################################
9 |
10 | import os
11 | import sys
12 | import json
13 | import cPickle as pickle
14 |
15 | train_val_imageNames_to_imageIDs = {}
16 | train_imageNames_to_imageIDs = {}
17 | val_imageNames_to_imageIDs = {}
18 |
19 | with open('./data/captions_train2014.json') as fr_1:
20 | train_captions = json.load(fr_1)
21 |
22 | for image in train_captions['images']:
23 | image_name = image['file_name']
24 | image_id = image['id']
25 | train_imageNames_to_imageIDs[image_name] = image_id
26 |
27 | train_Names_Captions = []
28 | for image in train_captions['annotations']:
29 | image_id = image['image_id']
30 | image_caption = image['caption']
31 | train_Names_Captions.append([image_id, image_caption])
32 |
33 | train_count = 0
34 | for imageName, imageID in train_imageNames_to_imageIDs.iteritems():
35 | print "{}, {}, {}".format(train_count, imageName, imageID)
36 | train_count += 1
37 |
38 | captions = []
39 | for item in train_Names_Captions:
40 | if item[0] == imageID:
41 | captions.append(item[1])
42 |
43 | json_fw = open('./train_val_reference_json/'+imageName+'.json', 'w')
44 | json_fw.write('{"info": {"description": "CaptionEval", "url": "https://github.com/chenxinpeng/", "version": "1.0", "year": 2017, "contributor": "Xinpeng Chen", "date_created": "2017.01.26"}, "images": [{"license": 1, "file_name": "' + imageName + '", "id": ' + str(imageID) + '}]')
45 |
46 | json_fw.write(' ,"licenses": [{"url": "test", "id": 1, "name": "test"}], ')
47 | json_fw.write('"type": "captions", "annotations": [')
48 |
49 | id_count = 0
50 | for idx, each_sent in enumerate(captions):
51 | if idx != len(captions)-1:
52 | if '\n' in each_sent:
53 | each_sent = each_sent.replace('\n', '')
54 | if '\\' in each_sent:
55 | each_sent = each_sent.replace('\\', '')
56 | if '"' in each_sent:
57 | each_sent = each_sent.replace('"', '')
58 | json_fw.write('{"image_id": ' + str(imageID) + ', "id": ' + str(id_count) + ', "caption": "' + each_sent + '"}, ')
59 | else:
60 | if '\n' in each_sent:
61 | each_sent = each_sent.replace('\n', '')
62 | if '\\' in each_sent:
63 | each_sent = each_sent.replace('\\', '')
64 | if '"' in each_sent:
65 | each_sent = each_sent.replace('"', '')
66 | json_fw.write('{"image_id": ' + str(imageID) + ', "id": ' + str(id_count) + ', "caption": "' + each_sent + '"}]}')
67 | id_count += 1
68 | json_fw.close()
69 |
70 | # Validation json file
71 | with open('./data/captions_val2014.json') as fr_2:
72 | val_captions = json.load(fr_2)
73 |
74 | for image in val_captions['images']:
75 | image_name = image['file_name']
76 | image_id = image['id']
77 | val_imageNames_to_imageIDs[image_name] = image_id
78 |
79 | val_Names_Captions = []
80 | for image in val_captions['annotations']:
81 | image_id = image['image_id']
82 | image_caption = image['caption']
83 | val_Names_Captions.append([image_id, image_caption])
84 |
85 | val_count = 0
86 | for imageName, imageID in val_imageNames_to_imageIDs.iteritems():
87 | print "{}, {}, {}".format(val_count, imageName, imageID)
88 |
89 | captions = []
90 | for item in val_Names_Captions:
91 | if item[0] == imageID:
92 | captions.append(item[1])
93 |
94 | json_fw = open('./train_val_reference_json/'+imageName+'.json', 'w')
95 | json_fw.write('{"info": {"description": "CaptionEval", "url": "https://github.com/chenxinpeng/", "version": "1.0", "year": 2017, "contributor": "Xinpeng Chen", "date_created": "2017.01.26"}, "images": [{"license": 1, "file_name": "' + imageName + '", "id": ' + str(imageID) + '}]')
96 |
97 | json_fw.write(' ,"licenses": [{"url": "test", "id": 1, "name": "test"}], ')
98 | json_fw.write('"type": "captions", "annotations": [')
99 |
100 | id_count = 0
101 | for idx, each_sent in enumerate(captions):
102 | if idx != len(captions)-1:
103 | if '\n' in each_sent:
104 | each_sent = each_sent.replace('\n', '')
105 | if '\\' in each_sent:
106 | each_sent = each_sent.replace('\\', '')
107 | if '"' in each_sent:
108 | each_sent = each_sent.replace('"', '')
109 | json_fw.write('{"image_id": ' + str(imageID) + ', "id": ' + str(id_count) + ', "caption": "' + each_sent + '"}, ')
110 | else:
111 | if '\n' in each_sent:
112 | each_sent = each_sent.replace('\n', '')
113 | if '\\' in each_sent:
114 | each_sent = each_sent.replace('\\', '')
115 | if '"' in each_sent:
116 | each_sent = each_sent.replace('"', '')
117 | json_fw.write('{"image_id": ' + str(imageID) + ', "id": ' + str(id_count) + ', "caption": "' + each_sent + '"}]}')
118 | id_count += 1
119 | val_count += 1
120 | json_fw.close()
121 |
122 | for k, item in train_imageNames_to_imageIDs.iteritems():
123 | train_val_imageNames_to_imageIDs[k] = item
124 | for k, item in val_imageNames_to_imageIDs.iteritems():
125 | train_val_imageNames_to_imageIDs[k] = item
126 |
127 | with open('./data/train_val_imageNames_to_imageIDs.pkl', 'w') as fw_2:
128 | pickle.dump(train_val_imageNames_to_imageIDs, fw_2)
129 |
130 |
131 |
--------------------------------------------------------------------------------
/data/bias_init_vector.npy:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/chenxinpeng/Optimization_of_image_description_metrics_using_policy_gradient_methods/66089304b3dc78a1e27f90e262d0cb17c5bb4cf2/data/bias_init_vector.npy
--------------------------------------------------------------------------------
/data/test.txt:
--------------------------------------------------------------------------------
1 | test
2 |
--------------------------------------------------------------------------------
/image/1.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/chenxinpeng/Optimization_of_image_description_metrics_using_policy_gradient_methods/66089304b3dc78a1e27f90e262d0cb17c5bb4cf2/image/1.png
--------------------------------------------------------------------------------
/image/2.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/chenxinpeng/Optimization_of_image_description_metrics_using_policy_gradient_methods/66089304b3dc78a1e27f90e262d0cb17c5bb4cf2/image/2.png
--------------------------------------------------------------------------------
/image/3.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/chenxinpeng/Optimization_of_image_description_metrics_using_policy_gradient_methods/66089304b3dc78a1e27f90e262d0cb17c5bb4cf2/image/3.png
--------------------------------------------------------------------------------
/image_caption.py:
--------------------------------------------------------------------------------
1 | # encoding: UTF-8
2 |
3 | import os
4 | import sys
5 | import glob
6 | import random
7 | import time
8 | import json
9 | from json import encoder
10 | import numpy as np
11 | import cPickle as pickle
12 | import matplotlib.pyplot as plt
13 |
14 | import tensorflow as tf
15 |
16 | sys.path.append('../')
17 | from pycocotools.coco import COCO
18 | from pycocoevalcap.eval import COCOEvalCap
19 |
20 | import ipdb
21 |
22 |
23 | #############################################################################################################
24 | #
25 | # Step 1: Input: D = {(x^n, y^n): n = 1:N}
26 | # Step 2: Train \Pi(g_{1:T} | x) using MLE on D (MLE: maximum likelihood estimation)
27 | #
28 | ############################################################################################################
29 | class CNN_LSTM():
30 | def __init__(self,
31 | n_words,
32 | batch_size,
33 | feats_dim,
34 | project_dim,
35 | lstm_size,
36 | word_embed_dim,
37 | lstm_step,
38 | bias_init_vector=None):
39 |
40 | self.n_words = n_words
41 | self.batch_size = batch_size
42 | self.feats_dim = feats_dim
43 | self.project_dim = project_dim
44 | self.lstm_size = lstm_size
45 | self.word_embed_dim = word_embed_dim
46 | self.lstm_step = lstm_step
47 |
48 | # project the image feature vector of dimension 2048 to 512 dimension, with a linear layer
49 | # self.encode_img_W: 2048 x 512
50 | # self.encode_img_b: 512
51 | self.encode_img_W = tf.Variable(tf.random_uniform([feats_dim, project_dim], -0.1, 0.1), name="encode_img_W")
52 | self.encode_img_b = tf.zeros([project_dim], name="encode_img_b")
53 |
54 | with tf.device("/cpu:0"):
55 | self.Wemb = tf.Variable(tf.random_uniform([n_words, word_embed_dim], -0.1, 0.1), name="Wemb")
56 |
57 | self.lstm = tf.nn.rnn_cell.BasicLSTMCell(lstm_size, state_is_tuple=True)
58 |
59 | self.embed_word_W = tf.Variable(tf.random_uniform([lstm_size, n_words], -0.1, 0.1), name="embed_word_W")
60 |
61 | if bias_init_vector is not None:
62 | self.embed_word_b = tf.Variable(bias_init_vector.astype(np.float32), name="embed_word_b")
63 | else:
64 | self.embed_word_b = tf.Variable(tf.zeros([n_words]), name="embed_word_b")
65 |
66 | self.baseline_MLP_W = tf.Variable(tf.random_uniform([lstm_size, 1], -0.1, 0.1), name="baseline_MLP_W")
67 | self.baseline_MLP_b = tf.Variable(tf.zeros([1]), name="baseline_MLP_b")
68 |
69 | # At the beginning, I used two layers of MLP, but I think it's wrong
70 | #self.baseline_MLP2_W = tf.Variable(tf.random_uniform([lstm_size, 1], -0.1, 0.1), name="baseline_MLP2_W")
71 | #self.baseline_MLP2_b = tf.Variable(tf.zeros([1]), name="baseline_MLP2_b")
72 |
73 | ############################################################################################################
74 | #
75 | # Class function for step 2
76 | #
77 | ############################################################################################################
78 | def build_model(self):
79 | images = tf.placeholder(tf.float32, [self.batch_size, self.feats_dim])
80 | sentences = tf.placeholder(tf.int32, [self.batch_size, self.lstm_step])
81 | masks = tf.placeholder(tf.float32, [self.batch_size, self.lstm_step])
82 |
83 | images_embed = tf.matmul(images, self.encode_img_W) + self.encode_img_b
84 |
85 | state = self.lstm.zero_state(batch_size=self.batch_size, dtype=tf.float32)
86 |
87 | loss = 0.0
88 | with tf.variable_scope("LSTM"):
89 | for i in range(0, self.lstm_step):
90 | if i == 0:
91 | current_emb = images_embed
92 | else:
93 | with tf.device("/cpu:0"):
94 | current_emb = tf.nn.embedding_lookup(self.Wemb, sentences[:, i-1])
95 |
96 | if i > 0:
97 | tf.get_variable_scope().reuse_variables()
98 |
99 | output, state = self.lstm(current_emb, state)
100 |
101 | if i > 0:
102 | labels = tf.expand_dims(sentences[:, i], 1)
103 | indices = tf.expand_dims(tf.range(0, self.batch_size, 1), 1)
104 | concated = tf.concat(1, [indices, labels])
105 | onehot_labels = tf.sparse_to_dense( concated, tf.pack([self.batch_size, self.n_words]), 1.0, 0.0)
106 |
107 | logit_words = tf.matmul(output, self.embed_word_W) + self.embed_word_b
108 | cross_entropy = tf.nn.softmax_cross_entropy_with_logits(logit_words, onehot_labels)
109 | cross_entropy = cross_entropy * masks[:, i]
110 | current_loss = tf.reduce_sum(cross_entropy)/self.batch_size
111 |
112 | loss = loss + current_loss
113 | return loss, images, sentences, masks
114 |
115 | def generate_model(self):
116 | images = tf.placeholder(tf.float32, [1, self.feats_dim])
117 | images_embed = tf.matmul(images, self.encode_img_W) + self.encode_img_b
118 |
119 | state = self.lstm.zero_state(batch_size=1, dtype=tf.float32)
120 | sentences = []
121 |
122 | with tf.variable_scope("LSTM"):
123 | output, state = self.lstm(images_embed, state)
124 |
125 | with tf.device("/cpu:0"):
126 | current_emb = tf.nn.embedding_lookup(self.Wemb, tf.ones([1], dtype=tf.int64))
127 |
128 | for i in range(0, self.lstm_step):
129 | tf.get_variable_scope().reuse_variables()
130 |
131 | output, state = self.lstm(current_emb, state)
132 |
133 | logit_words = tf.matmul(output, self.embed_word_W) + self.embed_word_b
134 | max_prob_word = tf.argmax(logit_words, 1)[0]
135 |
136 | with tf.device("/cpu:0"):
137 | current_emb = tf.nn.embedding_lookup(self.Wemb, max_prob_word)
138 | current_emb = tf.expand_dims(current_emb, 0)
139 | sentences.append(max_prob_word)
140 |
141 | return images, sentences
142 |
143 | ####################################################################################
144 | #
145 | # Class function for step 3
146 | #
147 | ####################################################################################
148 | def train_Bphi_model(self):
149 | encode_img_W = tf.stop_gradient(self.encode_img_W)
150 | encode_img_b = tf.stop_gradient(self.encode_img_b)
151 | Wemb = tf.stop_gradient(self.Wemb)
152 |
153 | images = tf.placeholder(tf.float32, [1, self.feats_dim])
154 | images_embed = tf.matmul(images, encode_img_W) + encode_img_b
155 |
156 | Q_Bleu_1 = tf.placeholder(tf.float32, [1, self.lstm_step])
157 | Q_Bleu_2 = tf.placeholder(tf.float32, [1, self.lstm_step])
158 | Q_Bleu_3 = tf.placeholder(tf.float32, [1, self.lstm_step])
159 | Q_Bleu_4 = tf.placeholder(tf.float32, [1, self.lstm_step])
160 |
161 | weight_Bleu_1 = 0.5
162 | weight_Bleu_2 = 0.5
163 | weight_Bleu_3 = 1.0
164 | weight_Bleu_4 = 1.0
165 |
166 | state = self.lstm.zero_state(batch_size=1, dtype=tf.float32)
167 |
168 | # To avoid creating a feedback loop, we do not back-propagate
169 | # gradients through the hidden state from this loss
170 | c, h = state[0], state[1]
171 | c, h = tf.stop_gradient(c), tf.stop_gradient(h)
172 | state = tf.nn.rnn_cell.LSTMStateTuple(c, h)
173 |
174 | loss = 0.0
175 |
176 | with tf.variable_scope("LSTM"):
177 | with tf.device("/cpu:0"):
178 | current_embed = tf.nn.embedding_lookup(Wemb, tf.ones([1], dtype=tf.int64))
179 |
180 | output, state = self.lstm(images_embed, state)
181 | c, h = state[0], state[1]
182 | c, h = tf.stop_gradient(c), tf.stop_gradient(h)
183 | state = tf.nn.rnn_cell.LSTMStateTuple(c, h)
184 |
185 | for i in range(0, self.lstm_step):
186 | tf.get_variable_scope().reuse_variables()
187 |
188 | output, state = self.lstm(current_embed, state)
189 | c, h = state[0], state[1]
190 | c, h = tf.stop_gradient(c), tf.stop_gradient(h)
191 | state = tf.nn.rnn_cell.LSTMStateTuple(c, h)
192 |
193 | # In our experiments, the baseline estimator is an MLP which takes as input the hidden state of the RNN at step t
194 | # To avoid creating a feedback loop, we do not back-propagate gradients through the hidden state from this loss
195 | #if i >= 1:
196 | baseline_estimator = tf.nn.relu(tf.matmul(state[1], self.baseline_MLP_W) + self.baseline_MLP_b)
197 | Q_current = weight_Bleu_1 * Q_Bleu_1[:, i] + weight_Bleu_2 * Q_Bleu_2[:, i] + \
198 | weight_Bleu_3 * Q_Bleu_3[:, i] + weight_Bleu_4 * Q_Bleu_4[:, i]
199 |
200 | # Equation (8) in the paper
201 | loss = loss + tf.square(Q_current - baseline_estimator)
202 |
203 | return images, Q_Bleu_1, Q_Bleu_2, Q_Bleu_3, Q_Bleu_4, loss
204 |
205 | def Monte_Carlo_Rollout(self):
206 | images = tf.placeholder(tf.float32, [1, self.feats_dim])
207 | images_embed = tf.matmul(images, self.encode_img_W) + self.encode_img_b
208 |
209 | state = self.lstm.zero_state(batch_size=1, dtype=tf.float32)
210 |
211 | gen_sentences = []
212 | all_sample_sentences = []
213 |
214 | with tf.variable_scope("LSTM"):
215 | output, state = self.lstm(images_embed, state)
216 | with tf.device("/cpu:0"):
217 | current_emb = tf.nn.embedding_lookup(self.Wemb, tf.ones([1], dtype=tf.int64))
218 |
219 | for i in range(0, self.lstm_step):
220 | tf.get_variable_scope().reuse_variables()
221 |
222 | output, state = self.lstm(current_emb, state)
223 | logit_words = tf.matmul(output, self.embed_word_W) + self.embed_word_b
224 | max_prob_word = tf.argmax(logit_words, 1)[0]
225 |
226 | with tf.device("/cpu:0"):
227 | current_emb = tf.nn.embedding_lookup(self.Wemb, max_prob_word)
228 | current_emb = tf.expand_dims(current_emb, 0)
229 | gen_sentences.append(max_prob_word)
230 |
231 | if i < self.lstm_step-1:
232 | num_sample = self.lstm_step - 1 - i
233 | sample_sentences = []
234 | for idx_sample in range(num_sample):
235 | sample = tf.multinomial(logit_words, 3)
236 | sample_sentences.append(sample[0])
237 | all_sample_sentences.append(sample_sentences)
238 |
239 | return images, gen_sentences, all_sample_sentences
240 |
241 | ########################################################################
242 | #
243 | # Class function for step 4
244 | #
245 | ########################################################################
246 | def Monte_Carlo_and_Baseline(self):
247 | images = tf.placeholder(tf.float32, [self.batch_size, self.feats_dim])
248 | images_embed = tf.matmul(images, self.encode_img_W) + self.encode_img_b
249 |
250 | state = self.lstm.zero_state(batch_size=self.batch_size, dtype=tf.float32)
251 |
252 | gen_sentences = []
253 | all_sample_sentences = []
254 | all_baselines = []
255 |
256 | with tf.variable_scope("LSTM"):
257 | output, state = self.lstm(images_embed, state)
258 | with tf.device("/cpu:0"):
259 | current_emb = tf.nn.embedding_lookup(self.Wemb, tf.ones([self.batch_size], dtype=tf.int64))
260 |
261 | for i in range(0, self.lstm_step):
262 | tf.get_variable_scope().reuse_variables()
263 |
264 | output, state = self.lstm(current_emb, state)
265 | logit_words = tf.matmul(output, self.embed_word_W) + self.embed_word_b
266 | max_prob_word = tf.argmax(logit_words, 1)
267 | with tf.device("/cpu:0"):
268 | current_emb = tf.nn.embedding_lookup(self.Wemb, max_prob_word)
269 | #current_emb = tf.expand_dims(current_emb, 0)
270 | gen_sentences.append(max_prob_word)
271 |
272 | # compute Q for gt with K Monte Carlo rollouts
273 | if i < self.lstm_step-1:
274 | num_sample = self.lstm_step - 1 - i
275 | sample_sentences = []
276 | for idx_sample in range(num_sample):
277 | sample = tf.multinomial(logit_words, 3)
278 | sample_sentences.append(sample)
279 | all_sample_sentences.append(sample_sentences)
280 | # compute estimated baseline
281 | baseline = tf.nn.relu(tf.matmul(state[1], self.baseline_MLP_W) + self.baseline_MLP_b)
282 | all_baselines.append(baseline)
283 |
284 | return images, gen_sentences, all_sample_sentences, all_baselines
285 |
286 | def SGD_update(self, batch_num_images=1000):
287 | images = tf.placeholder(tf.float32, [batch_num_images, self.feats_dim])
288 | images_embed = tf.matmul(images, self.encode_img_W) + self.encode_img_b
289 |
290 | Q_rewards = tf.placeholder(tf.float32, [batch_num_images, self.lstm_step])
291 | Baselines = tf.placeholder(tf.float32, [batch_num_images, self.lstm_step])
292 |
293 | state = self.lstm.zero_state(batch_size=batch_num_images, dtype=tf.float32)
294 |
295 | loss = 0.0
296 |
297 | with tf.variable_scope("LSTM"):
298 | tf.get_variable_scope().reuse_variables()
299 | output, state = self.lstm(images_embed, state)
300 |
301 | with tf.device("/cpu:0"):
302 | current_emb = tf.nn.embedding_lookup(self.Wemb, tf.ones([batch_num_images], dtype=tf.int64))
303 |
304 | for i in range(0, self.lstm_step):
305 | output, state = self.lstm(current_emb, state)
306 |
307 | logit_words = tf.matmul(output, self.embed_word_W) + self.embed_word_b
308 | logit_words_softmax = tf.nn.softmax(logit_words)
309 | max_prob_word = tf.argmax(logit_words_softmax, 1)
310 | max_prob = tf.reduce_max(logit_words_softmax, 1)
311 |
312 | current_rewards = Q_rewards[:, i] - Baselines[:, i]
313 |
314 | loss = loss + tf.reduce_sum(-tf.log(max_prob) * current_rewards)
315 |
316 | with tf.device("/cpu:0"):
317 | current_emb = tf.nn.embedding_lookup(self.Wemb, max_prob_word)
318 | #current_emb = tf.expand_dims(current_emb, 0)
319 |
320 | return images, Q_rewards, Baselines, loss, max_prob, current_rewards, logit_words
321 |
322 |
323 | ##############################################################################
324 | #
325 | # Step 1: set parameters and path
326 | #
327 | ##############################################################################
328 | batch_size = 100
329 | feats_dim = 2048
330 | project_dim = 512
331 | lstm_size = 512
332 | word_embed_dim = 512
333 | lstm_step = 30
334 |
335 | n_epochs = 500
336 | learning_rate = 0.0001
337 |
338 | # Feature directories for the training and validation images, and other paths
339 | train_val_feats_path = './inception/train_val_feats'
340 | val_feats_path = './inception/val_feats'
341 |
342 | loss_images_save_path = './loss_imgs'
343 | loss_file_save_path = 'loss.txt'
344 | model_path = './models'
345 |
346 | train_images_captions_path = './data/train_images_captions.pkl'
347 | val_images_captions_path = './data/val_images_captions.pkl'
348 |
349 | idx_to_word_path = './data/idx_to_word.pkl'
350 | word_to_idx_path = './data/word_to_idx.pkl'
351 | bias_init_vector_path = './data/bias_init_vector.npy'
352 |
353 | # Load pre-processed data
354 | with open(train_images_captions_path, 'r') as fr_1:
355 | train_images_captions = pickle.load(fr_1)
356 |
357 | with open(val_images_captions_path, 'r') as fr_2:
358 | val_images_captions = pickle.load(fr_2)
359 |
360 | with open(idx_to_word_path, 'r') as fr_3:
361 | idx_to_word = pickle.load(fr_3)
362 |
363 | with open(word_to_idx_path, 'r') as fr_4:
364 | word_to_idx = pickle.load(fr_4)
365 |
366 | bias_init_vector = np.load(bias_init_vector_path)
367 |
368 |
369 | ##########################################################################
370 | #
371 | # Step 2: Train, validation and test stage using MLE on Dataset
372 | #
373 | ##########################################################################
374 | def Train_with_MLE():
375 | n_words = len(idx_to_word)
376 | train_images_names = train_images_captions.keys()
377 |
378 | # change the word of each image captions to index by word_to_idx
379 | train_images_captions_index = {}
380 | for each_img, sents in train_images_captions.iteritems():
381 | sents_index = np.zeros([len(sents), lstm_step], dtype=np.int32)
382 |
383 | for idy, sent in enumerate(sents):
384 | sent = '<bos> ' + sent + ' <eos>'
385 | tmp_sent = sent.split(' ')
386 | tmp_sent = filter(None, tmp_sent)
387 |
388 | for idx, word in enumerate(tmp_sent):
389 | if idx == lstm_step-1:
390 | sents_index[idy, idx] = word_to_idx['<eos>']
391 | break
392 | elif word in word_to_idx:
393 | sents_index[idy, idx] = word_to_idx[word]
394 | train_images_captions_index[each_img] = sents_index
395 | with open('./data/train_images_captions_index.pkl', 'w') as fw_1:
396 | pickle.dump(train_images_captions_index, fw_1)
397 |
398 | model = CNN_LSTM(n_words = n_words,
399 | batch_size = batch_size,
400 | feats_dim = feats_dim,
401 | project_dim = project_dim,
402 | lstm_size = lstm_size,
403 | word_embed_dim = word_embed_dim,
404 | lstm_step = lstm_step,
405 | bias_init_vector = bias_init_vector)
406 |
407 | tf_loss, tf_images, tf_sentences, tf_masks = model.build_model()
408 |
409 | sess = tf.InteractiveSession()
410 | saver = tf.train.Saver(max_to_keep=500, write_version=1)
411 | train_op = tf.train.AdamOptimizer(learning_rate).minimize(tf_loss)
412 | tf.initialize_all_variables().run()
413 |
414 | # uncomment below to resume training from a previous checkpoint
415 | #new_saver = tf.train.Saver(max_to_keep=500)
416 | #new_saver = tf.train.import_meta_graph('./models/model-78.meta')
417 | #new_saver.restore(sess, tf.train.latest_checkpoint('./models/'))
418 |
419 | loss_fw = open(loss_file_save_path, 'w')
420 | loss_to_draw = []
421 | for epoch in range(0, n_epochs):
422 | loss_to_draw_epoch = []
423 | # shuffle the training images
424 | random.shuffle(train_images_names)
425 |
426 | for start, end in zip(range(0, len(train_images_names), batch_size),
427 | range(batch_size, len(train_images_names), batch_size)):
428 | start_time = time.time()
429 |
430 | # current_feats: get the [start:end] features
431 | # current_captions: convert each word to its index with word_to_idx
432 | # current_masks: set the padded positions to zero and the word positions to one
433 | current_feats = []
434 | current_captions = []
435 |
436 | img_names = train_images_names[start:end]
437 | for each_img_name in img_names:
438 | # load this image's feats from the train_val_feats directory
439 | #each_img_name = each_img_name + '.npy'
440 | img_feat = np.load( os.path.join(train_val_feats_path, each_img_name+'.npy') )
441 | current_feats.append(img_feat)
442 |
443 | img_caption_length = len(train_images_captions[each_img_name])
444 | random_choice_index = random.randint(0, img_caption_length-1)
445 | img_caption = train_images_captions_index[each_img_name][random_choice_index]
446 | current_captions.append(img_caption)
447 |
448 | current_feats = np.asarray(current_feats)
449 | current_captions = np.asarray(current_captions)
450 |
451 | current_masks = np.zeros( (current_captions.shape[0], current_captions.shape[1]), dtype=np.int32 )
452 | nonzeros = np.array( map(lambda x: (x != 0).sum(), current_captions) )
453 |
454 | for ind, row in enumerate(current_masks):
455 | row[:nonzeros[ind]] = 1
456 |
457 | _, loss_val = sess.run(
458 | [train_op, tf_loss],
459 | feed_dict = {
460 | tf_images: current_feats,
461 | tf_sentences: current_captions,
462 | tf_masks: current_masks
463 | })
464 | loss_to_draw_epoch.append(loss_val)
465 |
466 | print "idx: {} epoch: {} loss: {} Time cost: {}".format(start, epoch, loss_val, time.time()-start_time)
467 | loss_fw.write('epoch ' + str(epoch) + ' loss ' + str(loss_val) + '\n')
468 |
469 | # draw loss curve every epoch
470 | loss_to_draw.append(np.mean(loss_to_draw_epoch))
471 | plt_save_img_name = str(epoch) + '.png'
472 | plt.plot(range(len(loss_to_draw)), loss_to_draw, color='g')
473 | plt.grid(True)
474 | plt.savefig(os.path.join(loss_images_save_path, plt_save_img_name))
475 |
476 | if np.mod(epoch, 2) == 0:
477 | print "Epoch ", epoch, " is done. Saving the model ..."
478 | saver.save(sess, os.path.join(model_path, 'model_MLP'), global_step=epoch)
479 | loss_fw.close()
480 |
481 |
482 | def Test_with_MLE():
483 | model_path = os.path.join('./models', 'model_MLP-486')
484 | n_words = len(idx_to_word)
485 |
486 | test_feats_path = './inception/test_feats'
487 | test_feats_names = glob.glob(test_feats_path + '/*.npy')
488 | test_images_names = map(lambda x: os.path.basename(x)[0:-4], test_feats_names)
489 |
490 | model = CNN_LSTM(n_words = n_words,
491 | batch_size = batch_size,
492 | feats_dim = feats_dim,
493 | project_dim = project_dim,
494 | lstm_size = lstm_size,
495 | word_embed_dim = word_embed_dim,
496 | lstm_step = lstm_step,
497 | bias_init_vector = None)
498 |
499 | tf_images, tf_sentences = model.generate_model()
500 | sess = tf.InteractiveSession()
501 | saver = tf.train.Saver()
502 | saver.restore(sess, model_path)
503 |
504 | fw_1 = open("test2014_results_model-486.txt", 'w')
505 | for idx, img_name in enumerate(test_images_names):
506 | t0 = time.time()
507 |
508 | current_feats = np.load( os.path.join(test_feats_path, img_name+'.npy') )
509 | current_feats = np.reshape(current_feats, [1, feats_dim])
510 |
511 | sentences_index = sess.run(tf_sentences, feed_dict={tf_images: current_feats})
512 |
513 | #sentences = map(lambda x: idx_to_word[x], sentences_index)
514 | sentences = []
515 | for idx_word in sentences_index:
516 | word = idx_to_word[idx_word]
517 | word = word.replace('\n', '')
518 | word = word.replace('\\', '')
519 | word = word.replace('"', '')
520 | sentences.append(word)
521 |
522 | punctuation = np.argmax(np.array(sentences) == '') + 1
523 | sentences = sentences[:punctuation]
524 | generated_sentence = ' '.join(sentences)
525 | # collapse the double spaces left after joining the words
526 | generated_sentence = generated_sentence.replace('  ', ' ')
527 |
528 | print generated_sentence,'\n'
529 | fw_1.write(img_name + '\n')
530 | fw_1.write(generated_sentence + '\n')
531 |
532 | print "{}, {}, Time cost: {}".format(idx, img_name, time.time()-t0)
533 |
534 | fw_1.close()
535 |
536 |
537 | def Val_with_MLE():
538 | model_path = os.path.join('./models', 'model_MLP-486')
539 | n_words = len(idx_to_word)
540 |
541 | # version 1: test all validation images
542 | val_feats_path = './inception/val_feats'
543 | val_feats_names = glob.glob(val_feats_path + '/*.npy')
544 | val_images_names = map(lambda x: os.path.basename(x)[0:-4], val_feats_names)
545 |
546 | # version 2: test only on the 1,665 validation images
547 | #val_feats_path = './inception/val_feats_v2'
548 | #with open('./data/val_images_captions.pkl', 'r') as fr_1:
549 | # val_images_names = pickle.load(fr_1).keys()
550 |
551 | model = CNN_LSTM(n_words = n_words,
552 | batch_size = batch_size,
553 | feats_dim = feats_dim,
554 | project_dim = project_dim,
555 | lstm_size = lstm_size,
556 | word_embed_dim = word_embed_dim,
557 | lstm_step = lstm_step,
558 | bias_init_vector = None)
559 | tf_images, tf_sentences = model.generate_model()
560 | sess = tf.InteractiveSession()
561 | saver = tf.train.Saver()
562 | saver.restore(sess, model_path)
563 |
564 | fw_1 = open("val2014_results_model_MLP-486.txt", 'w')
565 | for idx, img_name in enumerate(val_images_names):
566 | print "{}, {}".format(idx, img_name)
567 | start_time = time.time()
568 |
569 | current_feats = np.load( os.path.join(val_feats_path, img_name+'.npy') )
570 | current_feats = np.reshape(current_feats, [1, feats_dim])
571 |
572 | sentences_index = sess.run(tf_sentences, feed_dict={tf_images: current_feats})
573 | #sentences = map(lambda x: idx_to_word[x], sentences_index)
574 | sentences = []
575 | for idx_word in sentences_index:
576 | word = idx_to_word[idx_word]
577 | word = word.replace('\n', '')
578 | word = word.replace('\\', '')
579 | word = word.replace('"', '')
580 | sentences.append(word)
581 |
582 | punctuation = np.argmax(np.array(sentences) == '') + 1
583 | sentences = sentences[:punctuation]
584 | generated_sentence = ' '.join(sentences)
585 | # collapse the double spaces left after joining the words
586 | generated_sentence = generated_sentence.replace('  ', ' ')
587 |
588 | print generated_sentence,'\n'
589 | fw_1.write(img_name + '\n')
590 | fw_1.write(generated_sentence + '\n')
591 | fw_1.close()
592 |
593 |
594 | ##########################################################################################################
595 | #
596 | # Step 3: Train B_phi using MC estimates of Q_\theta on a small subset of Dataset D
597 | #
598 | ##########################################################################################################
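# Rough flow of this step, as implemented below:
#   - Sample_Q_with_MC(): for every held-out validation image, roll out the MLE model with
#     Monte Carlo sampling, score the rollouts at each time step with the coco-caption API,
#     and dump the per-step BLEU rewards to ./data/all_images_Q_rewards.pkl
#   - Train_Bphi_Model(): reload those rewards and fit the baseline estimator B_phi on them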
599 | #import create_json_reference
600 |
601 | #epochs_Bphi_with_MC = 1000
602 |
603 | # I select 1,665 images from the val set (saved in ./data/val_images_captions.pkl)
604 | # to train B_phi; below is the path of the reference json file
605 | #refer_1665_save_path = './data/reference_1665.json'
606 |
607 | #eval_ids_to_imgNames_save_path = './data/eval_ids_to_imgNames.pkl'
608 |
609 | def Sample_Q_with_MC():
610 | model_path = os.path.join('./models', 'model_MLP-200')
611 |
612 | n_words = len(idx_to_word)
613 |
614 | val_images_names = val_images_captions.keys()
615 |
616 | print "Begin compute Q rewards of {} images...".format(len(val_images_names))
617 |
618 | # create_json_reference.py
619 | # create_refer(train_images_captions_path, train_images_names, refer_1665_save_path)
620 | #create_json_reference.create_refer(val_images_captions_path, val_images_names, refer_1665_save_path)
621 |
622 | #with open(eval_ids_to_imgNames_save_path, 'r') as fr_1:
623 | # eval_ids_to_imgNames = pickle.load(fr_1)
624 | #eval_imgNames_to_ids = {}
625 | #for key, val in eval_ids_to_imgNames.iteritems():
626 | # eval_imgNames_to_ids[val] = key
627 |
628 | #with open('./data/train_images_captions_index.pkl', 'r') as fr_2:
629 | # train_images_captions_index = pickle.load(fr_2)
630 |
631 | # open the dict that maps the image names to image ids
632 | with open('./data/train_val_imageNames_to_imageIDs.pkl', 'r') as fr:
633 | train_val_imageNames_to_imageIDs = pickle.load(fr)
634 |
635 | model = CNN_LSTM(n_words = n_words,
636 | batch_size = 1,
637 | feats_dim = feats_dim,
638 | project_dim = project_dim,
639 | lstm_size = lstm_size,
640 | word_embed_dim = word_embed_dim,
641 | lstm_step = lstm_step,
642 | bias_init_vector = bias_init_vector)
643 |
644 | tf_images, tf_gen_sentences, tf_all_sentences = model.Monte_Carlo_Rollout()
645 | sess = tf.Session()
646 | saver = tf.train.Saver()
647 | saver.restore(sess, model_path)
648 |
649 | all_images_Q_rewards = {}
650 | for idx, img_name in enumerate(val_images_names):
651 | print("current image idx: {}, {}".format(idx, img_name))
652 | start_time = time.time()
653 |
654 | # Load reference json file
655 | annFile = './train_val_reference_json/' + img_name + '.json'
656 | coco = COCO(annFile)
657 |
658 | all_images_Q_rewards[img_name] = {}
659 | current_image_rewards = all_images_Q_rewards[img_name]
660 | current_image_rewards['Bleu_4'] = []
661 | current_image_rewards['Bleu_3'] = []
662 | current_image_rewards['Bleu_2'] = []
663 | current_image_rewards['Bleu_1'] = []
664 |
665 | current_feats = np.load(os.path.join(val_feats_path, img_name+'.npy'))
666 | current_feats = np.reshape(current_feats, [1, feats_dim])
667 |
668 | gen_sents_index, all_sample_sents = sess.run([tf_gen_sentences, tf_all_sentences], feed_dict={tf_images: current_feats})
669 | gen_sents = []
670 | for item in gen_sents_index:
671 | tmp_word = idx_to_word[item]
672 | tmp_word = tmp_word.replace('\\', '')
673 | tmp_word = tmp_word.replace('\n', '')
674 | tmp_word = tmp_word.replace('"', '')
675 | gen_sents.append(tmp_word)
676 | gen_sents_list = gen_sents
677 | punctuation = np.argmax(np.array(gen_sents) == '') + 1
678 | gen_sents = gen_sents[:punctuation]
679 | gen_sents = ' '.join(gen_sents)
680 | gen_sents = gen_sents.replace('  ', ' ')
681 | gen_sents = gen_sents.replace(' ,', ',')
682 | print "\ngenerated sentences: {}".format(gen_sents)
683 |
684 | for i_s, samples in enumerate(all_sample_sents):
685 | print "\n=========================================================================="
686 | print "{} / {}".format(i_s, len(all_sample_sents))
687 |
688 | samples = np.asarray(samples)
689 | sample_sent_1 = []; sample_sent_2 = []; sample_sent_3 = []
690 |
691 | for each_gen_sents_word in gen_sents_list[0: (i_s+1)]:
692 | sample_sent_1.append(each_gen_sents_word)
693 | sample_sent_2.append(each_gen_sents_word)
694 | sample_sent_3.append(each_gen_sents_word)
695 |
696 | for j_s in range(samples.shape[0]):
697 | word_1, word_2, word_3 = idx_to_word[samples[j_s, 0]], idx_to_word[samples[j_s, 1]], idx_to_word[samples[j_s, 2]]
698 | word_1, word_2, word_3 = word_1.replace('\n', ''), word_2.replace('\n', ''), word_3.replace('\n', '')
699 | word_1, word_2, word_3 = word_1.replace('"', ''), word_2.replace('"', ''), word_3.replace('"', '')
700 | word_1, word_2, word_3 = word_1.replace('\\', ''), word_2.replace('\\', ''), word_3.replace('\\', '')
701 | sample_sent_1.append(word_1)
702 | sample_sent_2.append(word_2)
703 | sample_sent_3.append(word_3)
704 |
705 | sample_sent_1.append('')
706 | sample_sent_2.append('')
707 | sample_sent_3.append('')
708 |
709 | three_sample_sents = [sample_sent_1, sample_sent_2, sample_sent_3]
710 |
711 | three_sample_rewards = {}
712 | three_sample_rewards['Bleu_1'] = 0.0
713 | three_sample_rewards['Bleu_2'] = 0.0
714 | three_sample_rewards['Bleu_3'] = 0.0
715 | three_sample_rewards['Bleu_4'] = 0.0
716 |
717 | for ii, each_sample_sent in enumerate(three_sample_sents):
718 | if ' ' in each_sample_sent:
719 | each_sample_sent.remove(' ') # remove the space element in a list!
720 |
721 | print "sample sentence {}, {}".format(ii, each_sample_sent)
722 |
723 | punctuation = np.argmax(np.array(each_sample_sent) == '') + 1
724 | each_sample_sent = each_sample_sent[:punctuation]
725 | each_sample_sent = ' '.join(each_sample_sent)
726 | each_sample_sent = each_sample_sent.replace('  ', ' ')
727 | each_sample_sent = each_sample_sent.replace(' ,', ',')
728 | print each_sample_sent
729 | fw_1 = open("./data/results_MC.json", 'w')
730 | fw_1.write('[{"image_id": ' + str(train_val_imageNames_to_imageIDs[img_name]) + ', "caption": "' + each_sample_sent + '"}]')
731 | fw_1.close()
732 |
733 | #annFile = './data/reference_1665.json'
734 | resFile = './data/results_MC.json'
735 | #coco = COCO(annFile)
736 | cocoRes = coco.loadRes(resFile)
737 | cocoEval = COCOEvalCap(coco, cocoRes)
738 | cocoEval.params['image_id'] = cocoRes.getImgIds()
739 | cocoEval.evaluate()
740 |
741 | for metric, score in cocoEval.eval.items():
742 | print '%s: %.3f'%(metric, score)
743 | if metric == 'Bleu_1':
744 | three_sample_rewards['Bleu_1'] += score
745 | if metric == 'Bleu_2':
746 | three_sample_rewards['Bleu_2'] += score
747 | if metric == 'Bleu_3':
748 | three_sample_rewards['Bleu_3'] += score
749 | if metric == 'Bleu_4':
750 | three_sample_rewards['Bleu_4'] += score
751 |
752 | current_image_rewards['Bleu_1'].append(three_sample_rewards['Bleu_1']/3.0)
753 | current_image_rewards['Bleu_2'].append(three_sample_rewards['Bleu_2']/3.0)
754 | current_image_rewards['Bleu_3'].append(three_sample_rewards['Bleu_3']/3.0)
755 | current_image_rewards['Bleu_4'].append(three_sample_rewards['Bleu_4']/3.0)
756 |
757 | # At the terminal state, we define Q(g_{1:T}, EOS) = R(g_{1:T})
758 | fw_1 = open("./data/results_MC.json", 'w')
759 | fw_1.write('[{"image_id": ' + str(train_val_imageNames_to_imageIDs[img_name]) + ', "caption": "' + gen_sents + '"}]')
760 | fw_1.close()
761 | #annFile = './data/reference_1665.json'
762 | resFile = './data/results_MC.json'
763 | #coco = COCO(annFile)
764 | cocoRes = coco.loadRes(resFile)
765 | cocoEval = COCOEvalCap(coco, cocoRes)
766 | cocoEval.params['image_id'] = cocoRes.getImgIds()
767 | cocoEval.evaluate()
768 | for metric, score in cocoEval.eval.items():
769 | print '%s: %.3f'%(metric, score)
770 | if metric == 'Bleu_1':
771 | current_image_rewards['Bleu_1'].append(score)
772 | if metric == 'Bleu_2':
773 | current_image_rewards['Bleu_2'].append(score)
774 | if metric == 'Bleu_3':
775 | current_image_rewards['Bleu_3'].append(score)
776 | if metric == 'Bleu_4':
777 | current_image_rewards['Bleu_4'].append(score)
778 | print "Time cost: {}".format(time.time()-start_time)
779 |
780 | with open('./data/all_images_Q_rewards.pkl', 'w') as fw_1:
781 | pickle.dump(all_images_Q_rewards, fw_1)
782 |
783 | def Train_Bphi_Model():
784 | n_words = len(idx_to_word)
785 |
786 | with open('./data/all_images_Q_rewards.pkl', 'r') as fr_3:
787 | all_images_Q_rewards = pickle.load(fr_3)
788 |
789 | subset_images_names = all_images_Q_rewards.keys()
790 |
791 | model = CNN_LSTM(n_words = n_words,
792 | batch_size = 1,
793 | feats_dim = feats_dim,
794 | project_dim = project_dim,
795 | lstm_size = lstm_size,
796 | word_embed_dim = word_embed_dim,
797 | lstm_step = lstm_step,
798 | bias_init_vector = bias_init_vector)
799 |
800 | Bphi_tf_images, Bphi_tf_Bleu_1, Bphi_tf_Bleu_2, Bphi_tf_Bleu_3, Bphi_tf_Bleu_4, Bphi_tf_loss = model.train_Bphi_model()
801 |
802 | train_op = tf.train.GradientDescentOptimizer(learning_rate).minimize(Bphi_tf_loss)
803 | sess = tf.InteractiveSession()
804 | #tf.initialize_all_variables().run()
805 | new_saver = tf.train.Saver(max_to_keep=500)
806 | #new_saver = tf.train.import_meta_graph('./models/model-32.meta')
807 | #new_saver.restore(sess, tf.train.latest_checkpoint('./models'))
808 | new_saver.restore(sess, './models/model-50')
809 |
810 | loss_to_draw = []
811 | for epoch in range(0, epochs_Bphi_with_MC):
812 | loss_to_draw_epoch = []
813 | random.shuffle(subset_images_names)
814 |
815 | for start, end in zip(range(0, len(subset_images_names), 1),
816 | range(1, len(subset_images_names), 1)):
817 | start_time_batch = time.time()
818 |
819 | current_feats = []
820 |
821 | # Bleu_1, Bleu_2, Bleu_3, Bleu_4
822 | current_Bleu_1 = []
823 | current_Bleu_2 = []
824 | current_Bleu_3 = []
825 | current_Bleu_4 = []
826 |
827 | img_names = subset_images_names[start:end]
828 | for each_img_name in img_names:
829 | img_feat = np.load(os.path.join(train_val_feats_path, each_img_name+'.npy'))
830 | current_feats.append(img_feat)
831 |
832 | current_Bleu_1.append(all_images_Q_rewards[each_img_name]['Bleu_1'])
833 | current_Bleu_2.append(all_images_Q_rewards[each_img_name]['Bleu_2'])
834 | current_Bleu_3.append(all_images_Q_rewards[each_img_name]['Bleu_3'])
835 | current_Bleu_4.append(all_images_Q_rewards[each_img_name]['Bleu_4'])
836 |
837 | current_feats = np.asarray(current_feats, dtype=np.float32)
838 | current_Bleu_1 = np.asarray(current_Bleu_1, dtype=np.float32)
839 | current_Bleu_2 = np.asarray(current_Bleu_2, dtype=np.float32)
840 | current_Bleu_3 = np.asarray(current_Bleu_3, dtype=np.float32)
841 | current_Bleu_4 = np.asarray(current_Bleu_4, dtype=np.float32)
842 |
843 | _, loss_val = sess.run([train_op, Bphi_tf_loss],
844 | feed_dict = {Bphi_tf_images: current_feats,
845 | Bphi_tf_Bleu_1: current_Bleu_1,
846 | Bphi_tf_Bleu_2: current_Bleu_2,
847 | Bphi_tf_Bleu_3: current_Bleu_3,
848 | Bphi_tf_Bleu_4: current_Bleu_4
849 | })
850 |
851 | loss_to_draw_epoch.append(loss_val[0,0])
852 | print "idx: {} epoch: {} loss: {} Time cost: {}".format(start, epoch, loss_val[0,0], time.time() - start_time_batch)
853 |
854 | loss_to_draw.append(np.mean(loss_to_draw_epoch))
855 | plt_save_img_name = 'Bphi_train_' + str(epoch) + '.png'
856 | plt.plot(range(len(loss_to_draw)), loss_to_draw, color='g')
857 | plt.grid(True)
858 | plt.savefig(os.path.join('./loss_imgs', plt_save_img_name))
859 |
860 | if np.mod(epoch, 2) == 0:
861 | print "Epoch ", epoch, " is done. Saving the model ..."
862 | new_saver.save(sess, os.path.join('./models', 'Bphi_train_model'), global_step=epoch)
863 |
864 |
865 | ##############################################################################################################
866 | #
867 | # Step 4: go through all the images in D, SGD update of \theta, \phi
868 | #
869 | ##############################################################################################################
870 | def Train_SGD_update():
871 | model_path = os.path.join('./models', 'Bphi_train_model-84')
872 | batch_num_images = 100 # 100
873 | epoches = n_epochs # 500
874 | n_words = len(idx_to_word)
875 | train_images_names = train_images_captions.keys()
876 |
877 | # open the dict that maps the image names to image ids
878 | with open('./data/train_val_imageNames_to_imageIDs.pkl', 'r') as fr:
879 | train_val_imageNames_to_imageIDs = pickle.load(fr)
880 |
881 | # Load COCO reference json file
882 | annFile = './data/train_val_all_reference.json'
883 | coco = COCO(annFile)
884 |
885 | # model initialization
886 | model = CNN_LSTM(n_words = n_words,
887 | batch_size = batch_num_images,
888 | feats_dim = feats_dim,
889 | project_dim = project_dim,
890 | lstm_size = lstm_size,
891 | word_embed_dim = word_embed_dim,
892 | lstm_step = lstm_step,
893 | bias_init_vector = bias_init_vector)
894 |
895 | # The first graph is used to generate the sampled sentences and the baseline values.
896 | # The sampled sentences are scored with the coco-caption API to obtain the Q rewards.
897 | # The second graph is then fed the Q rewards and the baseline values;
898 | # its loss function is \sum(log(max_probability) * rewards)
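# (A rough sketch of the intended update, assuming the usual REINFORCE-with-baseline form
#  described in the paper; the exact computation lives in model.SGD_update(), which is
#  defined elsewhere in this file:
#      grad_theta ~= sum_t  d/dtheta log p(g_t | g_{1:t-1}, image) * (Q(g_{1:t-1}, g_t) - B_phi(g_{1:t-1}))
#  with Q estimated from the Monte Carlo rollouts and B_phi from the baseline branch.)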
899 | tf_images, tf_gen_sents_index, tf_all_sample_sents, tf_all_baselines = model.Monte_Carlo_and_Baseline()
900 | tf_images_2, tf_Q_rewards, tf_Baselines, tf_loss, tf_max_prob, tf_current_rewards, tf_logit_words = model.SGD_update(batch_num_images=1000)
901 |
902 | train_op = tf.train.GradientDescentOptimizer(learning_rate).minimize(tf_loss)
903 | sess = tf.InteractiveSession()
904 | saver = tf.train.Saver()
905 | saver.restore(sess, model_path)
906 | #tf.initialize_all_variables().run()
907 |
908 | # save every epoch loss value in loss_to_draw
909 | loss_to_draw = []
910 | for epoch in range(0, epoches):
911 | # save every batch loss value in loss_to_draw_epoch
912 | loss_to_draw_epoch = []
913 |
914 | # shuffle the order of images randomly
915 | random.shuffle(train_images_names)
916 |
917 | # store rewards of all the training images
918 | train_val_images_Q_rewards = {}
919 |
920 | for start, end in zip(range(0, len(train_images_names), batch_num_images),
921 | range(batch_num_images, len(train_images_names), batch_num_images)):
922 | start_time = time.time()
923 |
924 | img_names = train_images_names[start:end]
925 | current_feats = []
926 | for img_name in img_names:
927 | tmp_feats = np.load(os.path.join(train_val_feats_path, img_name+'.npy'))
928 | current_feats.append(tmp_feats)
929 | current_feats = np.asarray(current_feats)
930 |
931 | # store rewards of all the training images
932 | #train_val_images_Q_rewards = {}
933 | #ONE IMAGE: for idx, img_name in enumerate(train_images_names):
934 | #ONE IMAGE: print "{}, {}".format(idx, img_name)
935 | #ONE IMAGE: start_time = time.time()
936 | current_batch_rewards = {}
937 | current_batch_rewards['Bleu_1'] = []
938 | current_batch_rewards['Bleu_2'] = []
939 | current_batch_rewards['Bleu_3'] = []
940 | current_batch_rewards['Bleu_4'] = []
941 |
942 | # weighted sum
943 | sum_image_rewards = []
944 | Bleu_1_weight = 0.5
945 | Bleu_2_weight = 0.5
946 | Bleu_3_weight = 1.0
947 | Bleu_4_weight = 1.0
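# (these weights define the scalar reward used further down, i.e.
#  r_t = 0.5*Bleu_1 + 0.5*Bleu_2 + 1.0*Bleu_3 + 1.0*Bleu_4,
#  computed in the "weighted sum" loop after the evaluation calls)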
948 |
949 | #ONE IMAGE: current_feats = np.load(os.path.join(train_val_feats_path, img_name+'.npy'))
950 | #ONE IMAGE: current_feats = np.reshape(current_feats, [1, feats_dim])
951 |
952 |
953 | ###################################################################################################################################
954 | #
955 | # Below, for the current 100 images, we compute Q(g1:t-1, gt) for gt with K Monte Carlo rollouts, using Equation (6)
956 | # Meanwhile, we compute estimated baseline B_phi(g1:t-1)
957 | #
958 | ###################################################################################################################################
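# (Sketch of the estimate, assuming it follows Eq. (6) of the paper: K = 3 rollouts are
#  drawn at each time step, and Q(g_{1:t-1}, g_t) is approximated by the average score of
#  the K completed sentences, i.e. Q ~= (1/K) * sum_k R(rollout_k).  In the code below, the
#  three batch_sample_sents_* lists hold the K rollouts and the averaging is the division
#  by 3.0 when current_batch_rewards is filled.)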
959 | feed_dict = {tf_images: current_feats}
960 | gen_sents_index, all_sample_sents, all_baselines = sess.run([tf_gen_sents_index, tf_all_sample_sents, tf_all_baselines], feed_dict)
961 |
962 | # 100 sentences, every sentence has 30 words, thus its shape is 100 x 30
963 | batch_sentences = []
964 | for tmp_i in range(0, batch_num_images):
965 | single_sentences = []
966 | for tmp_j in range(0, len(gen_sents_index)):
967 | word_idx = gen_sents_index[tmp_j][tmp_i]
968 | word = idx_to_word[word_idx]
969 | word = word.replace('\n', '')
970 | word = word.replace('\\', '')
971 | word = word.replace('"', '')
972 | single_sentences.append(word)
973 | batch_sentences.append(single_sentences)
974 |
975 | #ONE IMAGE: tmp_sentences = map(lambda x: idx_to_word[x], gen_sents_index)
976 | #ONE IMAGE: print tmp_sentences
977 | #ONE IMAGE: sentences = []
978 | #ONE IMAGE: for word in tmp_sentences:
979 | #ONE IMAGE: word = word.replace('\n', '')
980 | #ONE IMAGE: word = word.replace('\\', '')
981 | #ONE IMAGE: word = word.replace('"', '')
982 | #ONE IMAGE: sentences.append(word)
983 |
984 | batch_sentences_processed = []
985 | #gen_sents_list = batch_sentences
986 | for tmp_i in range(0, batch_num_images):
987 | tmp_sentences = batch_sentences[tmp_i]
988 | punctuation = np.argmax(np.array(tmp_sentences) == '') + 1
989 | tmp_sentences = tmp_sentences[:punctuation]
990 | tmp_sentences = ' '.join(tmp_sentences)
991 | # collapse the double spaces left after joining the words
992 | tmp_sentences = tmp_sentences.replace('  ', ' ')
993 | batch_sentences_processed.append(tmp_sentences)
994 | #print "Idx: {} Image Name: {} Gen Sentence: {}".format(tmp_i, img_names[tmp_i], generated_sentence)
995 |
996 | #ONE IMAGE: gen_sents_list = sentences
997 | #ONE IMAGE: punctuation = np.argmax(np.array(sentences) == '') + 1
998 | #ONE IMAGE: sentences = sentences[:punctuation]
999 | #ONE IMAGE: generated_sentence = ' '.join(sentences)
1000 | #ONE IMAGE: generated_sentence = generated_sentence.replace(' ', '')
1001 | #ONE IMAGE: generated_sentence = generated_sentence.replace(' ', '')
1002 | #ONE IMAGE: print "Generated sentences: {}".format(generated_sentence)
1003 |
1004 | # time steps 0, 1, 2, ..., 28; the 30th reward is computed from the complete generated sentence
1005 | for time_step in range(0, lstm_step-1):
1006 | print "\n===================================================================================================="
1007 | print "Time step: {} \n".format(time_step)
1008 | batch_samples = all_sample_sents[time_step]
1009 | batch_samples = np.asarray(batch_samples)
1010 |
1011 | batch_sample_sents_1 = []
1012 | batch_sample_sents_2 = []
1013 | batch_sample_sents_3 = []
1014 | # store the sample sentences, each sample list has 100 images' sentences
1015 | for img_idx in range(0, batch_num_images):
1016 | batch_sample_sents_1.append([])
1017 | batch_sample_sents_2.append([])
1018 | batch_sample_sents_3.append([])
1019 |
1020 | # 0, 1, 2, ..., 99
1021 | for img_idx in range(0, batch_num_images):
1022 | for each_gen_sents_word in batch_sentences[img_idx][0:time_step+1]:
1023 | each_gen_sents_word = each_gen_sents_word.replace('\n', '')
1024 | each_gen_sents_word = each_gen_sents_word.replace('\\', '')
1025 | each_gen_sents_word = each_gen_sents_word.replace('"', '')
1026 | batch_sample_sents_1[img_idx].append(each_gen_sents_word)
1027 | batch_sample_sents_2[img_idx].append(each_gen_sents_word)
1028 | batch_sample_sents_3[img_idx].append(each_gen_sents_word)
1029 |
1030 | # 0, 1, 2, ..., 99
1031 | for img_idx in range(0, batch_num_images):
1032 | for tmp_i in range(0, batch_samples.shape[0]):
1033 | word_1 = idx_to_word[batch_samples[tmp_i, img_idx, 0]]
1034 | word_2 = idx_to_word[batch_samples[tmp_i, img_idx, 1]]
1035 | word_3 = idx_to_word[batch_samples[tmp_i, img_idx, 2]]
1036 | word_1, word_2, word_3 = word_1.replace('\n', ''), word_2.replace('\n', ''), word_3.replace('\n', '')
1037 | word_1, word_2, word_3 = word_1.replace('\\', ''), word_2.replace('\\', ''), word_3.replace('\\', '')
1038 | word_1, word_2, word_3 = word_1.replace('"', ''), word_2.replace('"', ''), word_3.replace('"', '')
1039 |
1040 | batch_sample_sents_1[img_idx].append(word_1)
1041 | batch_sample_sents_2[img_idx].append(word_2)
1042 | batch_sample_sents_3[img_idx].append(word_3)
1043 | batch_sample_sents_1[img_idx].append('')
1044 | batch_sample_sents_2[img_idx].append('')
1045 | batch_sample_sents_3[img_idx].append('')
1046 |
1047 | batch_three_sample_sents = [batch_sample_sents_1, batch_sample_sents_2, batch_sample_sents_3]
1048 | three_sample_rewards = {}
1049 | three_sample_rewards['Bleu_1'] = 0.0
1050 | three_sample_rewards['Bleu_2'] = 0.0
1051 | three_sample_rewards['Bleu_3'] = 0.0
1052 | three_sample_rewards['Bleu_4'] = 0.0
1053 |
1054 | for tmp_i, batch_sample_sents in enumerate(batch_three_sample_sents):
1055 | ######################################################################################
1056 | # write the sample sentences of current 100 images
1057 | ######################################################################################
1058 | fw_1 = open("./data/results_batch_sample_sents.json", 'w')
1059 | fw_1.write('[')
1060 |
1061 | for img_idx in range(0, batch_num_images):
1062 | if ' ' in batch_sample_sents[img_idx]:
1063 | batch_sample_sents[img_idx].remove(' ')
1064 |
1065 | punctuation = np.argmax(np.array(batch_sample_sents[img_idx]) == '') + 1
1066 | batch_sample_sents[img_idx] = batch_sample_sents[img_idx][:punctuation]
1067 | batch_sample_sents[img_idx] = ' '.join(batch_sample_sents[img_idx])
1068 | batch_sample_sents[img_idx] = batch_sample_sents[img_idx].replace('  ', ' ')
1069 | batch_sample_sents[img_idx] = batch_sample_sents[img_idx].replace(' ,', ',')
1070 |
1071 | if img_idx != batch_num_images-1:
1072 | fw_1.write('{"image_id": ' + str(train_val_imageNames_to_imageIDs[img_names[img_idx]]) + ', "caption": "' + batch_sample_sents[img_idx] + '"}, ')
1073 | else:
1074 | fw_1.write('{"image_id": ' + str(train_val_imageNames_to_imageIDs[img_names[img_idx]]) + ', "caption": "' + batch_sample_sents[img_idx] + '"}]')
1075 | fw_1.close()
1076 |
1077 | ########################################################################################
1078 | # compute the Bleu1,2,3,4 score using current 100 images
1079 | ########################################################################################
1080 | #annFile = './data/train_val_all_reference.json'
1081 | resFile = './data/results_batch_sample_sents.json'
1082 | #coco = COCO(annFile)
1083 | cocoRes = coco.loadRes(resFile)
1084 | cocoEval = COCOEvalCap(coco, cocoRes)
1085 | cocoEval.params['image_id'] = cocoRes.getImgIds()
1086 | cocoEval.evaluate()
1087 | for metric, score in cocoEval.eval.items():
1088 | if metric == 'Bleu_1':
1089 | three_sample_rewards['Bleu_1'] += score
1090 | if metric == 'Bleu_2':
1091 | three_sample_rewards['Bleu_2'] += score
1092 | if metric == 'Bleu_3':
1093 | three_sample_rewards['Bleu_3'] += score
1094 | if metric == 'Bleu_4':
1095 | three_sample_rewards['Bleu_4'] += score
1096 |
1097 | current_batch_rewards['Bleu_1'].append(three_sample_rewards['Bleu_1']/3.0)
1098 | current_batch_rewards['Bleu_2'].append(three_sample_rewards['Bleu_2']/3.0)
1099 | current_batch_rewards['Bleu_3'].append(three_sample_rewards['Bleu_3']/3.0)
1100 | current_batch_rewards['Bleu_4'].append(three_sample_rewards['Bleu_4']/3.0)
1101 |
1102 | #####################################################################################################
1103 | # compute the 30th rewards of the current 100 images
1104 | #####################################################################################################
1105 | fw_2 = open("./data/results_batch_sample_sents.json", 'w')
1106 | fw_2.write('[')
1107 | for img_idx in range(0, batch_num_images):
1108 | if img_idx != batch_num_images-1:
1109 | fw_2.write('{"image_id": ' + str(train_val_imageNames_to_imageIDs[img_names[img_idx]]) + ', "caption": "' + batch_sentences_processed[img_idx] + '"}, ')
1110 | else:
1111 | fw_2.write('{"image_id": ' + str(train_val_imageNames_to_imageIDs[img_names[img_idx]]) + ', "caption": "' + batch_sentences_processed[img_idx] + '"}]')
1112 | fw_2.close()
1113 | #annFile = './data/train_val_all_reference.json'
1114 | resFile = './data/results_batch_sample_sents.json'
1115 | #coco = COCO(annFile)
1116 | cocoRes = coco.loadRes(resFile)
1117 | cocoEval = COCOEvalCap(coco, cocoRes)
1118 | cocoEval.params['image_id'] = cocoRes.getImgIds()
1119 | cocoEval.evaluate()
1120 | for metric, score in cocoEval.eval.items():
1121 | if metric == 'Bleu_1':
1122 | current_batch_rewards['Bleu_1'].append(score)
1123 | if metric == 'Bleu_2':
1124 | current_batch_rewards['Bleu_2'].append(score)
1125 | if metric == 'Bleu_3':
1126 | current_batch_rewards['Bleu_3'].append(score)
1127 | if metric == 'Bleu_4':
1128 | current_batch_rewards['Bleu_4'].append(score)
1129 |
1130 | # compute the weighted sum of the Bleu scores and use it as the reward
1131 | for tmp_idx in range(0, lstm_step):
1132 | tmp_reward = current_batch_rewards['Bleu_1'][tmp_idx] * Bleu_1_weight + \
1133 | current_batch_rewards['Bleu_2'][tmp_idx] * Bleu_2_weight + \
1134 | current_batch_rewards['Bleu_3'][tmp_idx] * Bleu_3_weight + \
1135 | current_batch_rewards['Bleu_4'][tmp_idx] * Bleu_4_weight
1136 | sum_image_rewards.append(tmp_reward)
1137 | sum_image_rewards = np.asarray(sum_image_rewards)
1138 | #sum_image_rewards = np.reshape(sum_image_rewards, [1, lstm_step])
1139 | sum_image_rewards = np.array([sum_image_rewards, ] * batch_num_images)
1140 |
1141 | all_baselines = np.asarray(all_baselines)
1142 | all_baselines = np.reshape(all_baselines, [batch_num_images, lstm_step])
1143 | #all_baselines_mean = np.mean(all_baselines, axis=0)
1144 | #all_baselines = np.array([all_baselines_mean,] * batch_num_images)
1145 | feed_dict = {tf_images_2: current_feats, tf_Q_rewards: sum_image_rewards, tf_Baselines: all_baselines}
1146 | _, loss_value, max_prob, current_rewards, logit_words = sess.run([train_op, tf_loss, tf_max_prob, tf_current_rewards, tf_logit_words], feed_dict)
1147 | #ipdb.set_trace()
1148 | loss_to_draw_epoch.append(loss_value)
1149 | print "idx: {} epoch: {} loss: {} Time cost: {}".format(start, epoch, loss_value, time.time()-start_time)
1150 |
1151 | # draw loss curve every epoch
1152 | loss_to_draw.append(np.mean(loss_to_draw_epoch))
1153 | plt_save_img_name = 'SGD_update_' + str(epoch) + '.png'
1154 | plt.plot(range(len(loss_to_draw)), loss_to_draw, color='g')
1155 | plt.grid(True)
1156 | plt.savefig(os.path.join(loss_images_save_path, plt_save_img_name))
1157 |
1158 | if np.mod(epoch, 1) == 0:
1159 | print "Epoch ", epoch, " is done. Saving the model ..."
1160 | saver.save(sess, os.path.join('./models', 'SGD_update_model'), global_step=epoch)
1161 |
1162 | #ONE IMAGE: # compute the 29 rewards using all_sample_sents
1163 | #ONE IMAGE: # the 30th reward is computed with gen_sents_list
1164 | #ONE IMAGE: for t in range(0, lstm_step-1):
1165 | #ONE IMAGE: samples = all_sample_sents[t]
1166 | #ONE IMAGE: samples = np.asarray(samples)
1167 |
1168 | #ONE IMAGE: sample_sent_1 = []
1169 | #ONE IMAGE: sample_sent_2 = []
1170 | #ONE IMAGE: sample_sent_3 = []
1171 | #ONE IMAGE: for each_gen_sents_word in gen_sents_list[0:t+1]:
1172 | #ONE IMAGE: sample_sent_1.append(each_gen_sents_word)
1173 | #ONE IMAGE: sample_sent_2.append(each_gen_sents_word)
1174 | #ONE IMAGE: sample_sent_3.append(each_gen_sents_word)
1175 |
1176 | #ONE IMAGE: for i in range(samples.shape[0]):
1177 | #ONE IMAGE: word_1, word_2, word_3 = idx_to_word[samples[i, 0]], idx_to_word[samples[i, 1]], idx_to_word[samples[i, 2]]
1178 |
1179 | #ONE IMAGE: word_1, word_2, word_3 = word_1.replace('\n', ''), word_2.replace('\n', ''), word_3.replace('\n', '')
1180 | #ONE IMAGE: word_1, word_2, word_3 = word_1.replace('\\', ''), word_2.replace('\\', ''), word_3.replace('\\', '')
1181 | #ONE IMAGE: word_1, word_2, word_3 = word_1.replace('"', ''), word_2.replace('"', ''), word_3.replace('"', '')
1182 |
1183 | #ONE IMAGE: sample_sent_1.append(word_1)
1184 | #ONE IMAGE: sample_sent_2.append(word_2)
1185 | #ONE IMAGE: sample_sent_3.append(word_3)
1186 |
1187 | #ONE IMAGE: sample_sent_1.append('')
1188 | #ONE IMAGE: sample_sent_2.append('')
1189 | #ONE IMAGE: sample_sent_3.append('')
1190 |
1191 | #ONE IMAGE: three_sample_sents = [sample_sent_1, sample_sent_2, sample_sent_3]
1192 | #ONE IMAGE: three_sample_rewards = {}
1193 | #ONE IMAGE: three_sample_rewards['Bleu_1'] = 0.0
1194 | #ONE IMAGE: three_sample_rewards['Bleu_2'] = 0.0
1195 | #ONE IMAGE: three_sample_rewards['Bleu_3'] = 0.0
1196 | #ONE IMAGE: three_sample_rewards['Bleu_4'] = 0.0
1197 |
1198 | #ONE IMAGE: for i, each_sample_sent in enumerate(three_sample_sents):
1199 | #ONE IMAGE: # remove the space element in a list
1200 | #ONE IMAGE: if ' ' in each_sample_sent:
1201 | #ONE IMAGE: each_sample_sent.remove(' ')
1202 |
1203 | #ONE IMAGE: punctuation = np.argmax(np.array(each_sample_sent) == '') + 1
1204 | #ONE IMAGE: each_sample_sent = each_sample_sent[:punctuation]
1205 | #ONE IMAGE: each_sample_sent = ' '.join(each_sample_sent)
1206 | #ONE IMAGE: each_sample_sent = each_sample_sent.replace(' ', '')
1207 | #ONE IMAGE: each_sample_sent = each_sample_sent.replace(' ,', ',')
1208 |
1209 | #ONE IMAGE: fw_1 = open("./data/results_each_sample_sent.json", 'w')
1210 | #ONE IMAGE: fw_1.write('[{"image_id": ' + str(train_val_imageNames_to_imageIDs[img_name]) + ', "caption": "' + each_sample_sent + '"}]')
1211 | #ONE IMAGE: fw_1.close()
1212 |
1213 | #ONE IMAGE: annFile = './train_val_reference_json/' + img_name + '.json'
1214 | #ONE IMAGE: resFile = './data/results_each_sample_sent.json'
1215 | #ONE IMAGE: coco = COCO(annFile)
1216 | #ONE IMAGE: cocoRes = coco.loadRes(resFile)
1217 | #ONE IMAGE: cocoEval = COCOEvalCap(coco, cocoRes)
1218 | #ONE IMAGE: cocoEval.params['image_id'] = cocoRes.getImgIds()
1219 | #ONE IMAGE: cocoEval.evaluate()
1220 | #ONE IMAGE: for metric, score in cocoEval.eval.items():
1221 | #ONE IMAGE: if metric == 'Bleu_1':
1222 | #ONE IMAGE: three_sample_rewards['Bleu_1'] += score
1223 | #ONE IMAGE: if metric == 'Bleu_2':
1224 | #ONE IMAGE: three_sample_rewards['Bleu_2'] += score
1225 | #ONE IMAGE: if metric == 'Bleu_3':
1226 | #ONE IMAGE: three_sample_rewards['Bleu_3'] += score
1227 | #ONE IMAGE: if metric == 'Bleu_4':
1228 | #ONE IMAGE: three_sample_rewards['Bleu_4'] += score
1229 |
1230 | #ONE IMAGE: current_image_rewards['Bleu_1'].append(three_sample_rewards['Bleu_1']/3.0)
1231 | #ONE IMAGE: current_image_rewards['Bleu_2'].append(three_sample_rewards['Bleu_2']/3.0)
1232 | #ONE IMAGE: current_image_rewards['Bleu_3'].append(three_sample_rewards['Bleu_3']/3.0)
1233 | #ONE IMAGE: current_image_rewards['Bleu_4'].append(three_sample_rewards['Bleu_4']/3.0)
1234 |
1235 | #ONE IMAGE: fw_1 = open("./data/results_each_sample_sent.json", 'w')
1236 | #ONE IMAGE: fw_1.write('[{"image_id": ' + str(train_val_imageNames_to_imageIDs[img_name]) + ', "caption": "' + generated_sentence + '"}]')
1237 | #ONE IMAGE: fw_1.close()
1238 |
1239 | #ONE IMAGE: annFile = './train_val_reference_json/' + img_name + '.json'
1240 | #ONE IMAGE: resFile = './data/results_each_sample_sent.json'
1241 | #ONE IMAGE: coco = COCO(annFile)
1242 | #ONE IMAGE: cocoRes = coco.loadRes(resFile)
1243 | #ONE IMAGE: cocoEval = COCOEvalCap(coco, cocoRes)
1244 | #ONE IMAGE: cocoEval.params['image_id'] = cocoRes.getImgIds()
1245 | #ONE IMAGE: cocoEval.evaluate()
1246 | #ONE IMAGE: for metric, score in cocoEval.eval.items():
1247 | #ONE IMAGE: if metric == 'Bleu_1':
1248 | #ONE IMAGE: current_image_rewards['Bleu_1'].append(score)
1249 | #ONE IMAGE: if metric == 'Bleu_2':
1250 | #ONE IMAGE: current_image_rewards['Bleu_2'].append(score)
1251 | #ONE IMAGE: if metric == 'Bleu_3':
1252 | #ONE IMAGE: current_image_rewards['Bleu_3'].append(score)
1253 | #ONE IMAGE: if metric == 'Bleu_4':
1254 | #ONE IMAGE: current_image_rewards['Bleu_4'].append(score)
1255 |
1256 | #ONE IMAGE: # save the rewards immediately
1257 | #ONE IMAGE: train_val_images_Q_rewards[img_name] = current_image_rewards
1258 | #ONE IMAGE: with open('./data/train_val_images_Q_rewards.pkl', 'w') as fw_2:
1259 | #ONE IMAGE: pickle.dump(train_val_images_Q_rewards, fw_2)
1260 |
1261 | #ONE IMAGE: # compute the weight sum of Bleu value as rewards
1262 | #ONE IMAGE: for tmp_idx in range(0, lstm_step):
1263 | #ONE IMAGE: tmp_reward = current_image_rewards['Bleu_1'][tmp_idx] * Bleu_1_weight + \
1264 | #ONE IMAGE: current_image_rewards['Bleu_2'][tmp_idx] * Bleu_2_weight + \
1265 | #ONE IMAGE: current_image_rewards['Bleu_3'][tmp_idx] * Bleu_3_weight + \
1266 | #ONE IMAGE: current_image_rewards['Bleu_4'][tmp_idx] * Bleu_4_weight
1267 | #ONE IMAGE: sum_image_rewards.append(tmp_reward)
1268 |
1269 | #ONE IMAGE: sum_image_rewards = np.asarray(sum_image_rewards)
1270 | #ONE IMAGE: sum_image_rewards = np.reshape(sum_image_rewards, [1, lstm_step])
1271 | #ONE IMAGE: all_baselines = np.asarray(all_baselines)
1272 | #ONE IMAGE: all_baselines = np.reshape(all_baselines, [1, lstm_step])
1273 | #ONE IMAGE: feed_dict = {tf_images_2: current_feats, tf_Q_rewards: sum_image_rewards, tf_Baselines: all_baselines}
1274 | #ONE IMAGE: _, loss_value = sess.run([train_op, tf_loss], feed_dict)
1275 |
1276 | #ONE IMAGE: loss_to_draw_epoch.append(loss_value)
1277 |
1278 | #ONE IMAGE: print "idx: {} epoch: {} loss: {} Time cost: {}".format(idx, epoch, loss_value, time.time()-start_time)
1279 |
1280 | #ONE IMAGE: # draw loss curve every epoch
1281 | #ONE IMAGE: loss_to_draw.append(np.mean(loss_to_draw_epoch))
1282 | #ONE IMAGE: plt_save_img_name = str(epoch) + '.png'
1283 | #ONE IMAGE: plt.plot(range(len(loss_to_draw)), loss_to_draw, color='g')
1284 | #ONE IMAGE: plt.grid(True)
1285 | #ONE IMAGE: plt.savefig(os.path.join(loss_images_save_path, plt_save_img_name))
1286 |
1287 | #ONE IMAGE: if np.mod(epoch, 2) == 0:
1288 | #ONE IMAGE: print "Epoch ", epoch, " is done. Saving the model ..."
1289 | #ONE IMAGE: saver.save(sess, os.path.join('./models', 'SGD_update_model'), global_step=epoch)
1290 |
1291 |
1292 |
1293 |
--------------------------------------------------------------------------------
/inception/COCO_val2014_000000320612.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/chenxinpeng/Optimization_of_image_description_metrics_using_policy_gradient_methods/66089304b3dc78a1e27f90e262d0cb17c5bb4cf2/inception/COCO_val2014_000000320612.jpg
--------------------------------------------------------------------------------
/inception/README.md:
--------------------------------------------------------------------------------
1 | Please note:
2 |
3 | 1. Among the original MSCOCO images, one image (**COCO_val2014_000000320612.jpg**) is actually a PNG file despite its `.jpg` extension.
4 |
5 | 2. When putting the training features and the validation features into one folder, `train_val_feats`,
6 | there are so many files that a single `cp` command fails ("Argument list too long").
7 | So I use `copy_train_val_feats.sh` to copy the files from `train_feats` and `val_feats` into `train_val_feats`; an alternative is sketched below.
8 |
9 |
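A possible alternative (just a sketch, not what this repo uses) is to let `find` do the copying, which also avoids the shell's argument-list limit:

```bash
find ./train_feats -name '*.npy' -exec cp {} ./train_val_feats \;
find ./val_feats -name '*.npy' -exec cp {} ./train_val_feats \;
```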
--------------------------------------------------------------------------------
/inception/check_NOT_JPEG_IMG.sh:
--------------------------------------------------------------------------------
1 | #!/bin/bash
2 | DIR="/home/chenxp/data/mscoco/val2014/*.jpg"
3 |
4 | for img in $DIR
5 | do
6 | file $img >> imageInfo.txt
7 | done
8 |
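# After the loop finishes, the non-JPEG files can be picked out of imageInfo.txt with, e.g.:
#   grep -v 'JPEG image data' imageInfo.txt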
--------------------------------------------------------------------------------
/inception/copy_train_val_feats.sh:
--------------------------------------------------------------------------------
1 | #!/bin/bash
2 |
3 | DIR="./val_feats/*.npy"
4 |
5 | for feat in $DIR
6 | do
7 | cp $feat ./train_val_feats
8 | done
9 |
--------------------------------------------------------------------------------
/inception/extract_inception_bottleneck_feature.py:
--------------------------------------------------------------------------------
1 | import os
2 | import glob
3 | import time
4 |
5 | import tensorflow as tf
6 | import tensorflow.python.platform
7 | from tensorflow.python.platform import gfile
8 |
9 | import numpy as np
10 |
11 |
12 | def create_graph(model_path):
13 | """
14 | create_graph loads the inception model to memory, should be called before
15 | calling extract_features.
16 |
17 | model_path: path to inception model in protobuf form.
18 | """
19 | with gfile.FastGFile(model_path, 'rb') as f:
20 | graph_def = tf.GraphDef()
21 | graph_def.ParseFromString(f.read())
22 | _ = tf.import_graph_def(graph_def, name='')
23 |
24 |
25 | def extract_features(image_paths, feats_save_path, verbose=False):
26 | """
27 | extract_features computes the inception bottleneck feature for a list of images
28 | and saves each one to disk as a .npy file; nothing is returned.
29 | image_paths: list of image paths
30 | feats_save_path: directory where each feature is saved as a (2048,) array named <image_basename>.npy
31 | """
32 | #feature_dimension = 2048
33 | #features = np.empty((len(image_paths), feature_dimension))
34 |
35 | with tf.Session() as sess:
36 | flattened_tensor = sess.graph.get_tensor_by_name('pool_3:0')
37 |
38 | for i, image_path in enumerate(image_paths):
39 | image_basename = os.path.basename(image_path)
40 | start_time = time.time()
41 |
42 | feat_save_path = os.path.join(feats_save_path, image_basename + '.npy')
43 | if os.path.isfile(feat_save_path):
44 | continue
45 |
46 | if not gfile.Exists(image_path):
47 | tf.logging.fatal('File does not exist %s', image_path)
48 |
49 | image_data = gfile.FastGFile(image_path, 'rb').read()
50 | feature = sess.run([flattened_tensor], {'DecodeJpeg/contents:0': image_data})
51 | np.save(feat_save_path, np.squeeze(feature))
52 |
53 | if verbose:
54 | print('idx: {} {} Time cost: {}'.format(i, image_basename, time.time()-start_time))
55 |
56 |
57 | if __name__ == "__main__":
58 | images_path = '/home/chenxp/data/mscoco/test2014'
59 | feats_save_path = './test_feats'
60 |
61 | model_path = 'tensorflow_inception_graph.pb'
62 |
63 | images_lists = glob.glob(images_path + '/*.jpg')
64 |
65 | create_graph(model_path)
66 | extract_features(images_lists, feats_save_path, verbose=True)
67 |
--------------------------------------------------------------------------------
/inception/test_feats/README.md:
--------------------------------------------------------------------------------
1 |
2 | This folder saves the features of test images.
3 |
--------------------------------------------------------------------------------
/inception/train_feats/README.md:
--------------------------------------------------------------------------------
1 |
2 | This folder saves the feature of train images.
3 |
--------------------------------------------------------------------------------
/inception/train_val_feats/README.md:
--------------------------------------------------------------------------------
1 |
2 | This folder saves the features of training and validation images.
3 |
--------------------------------------------------------------------------------
/inception/val_feats/README.md:
--------------------------------------------------------------------------------
1 |
2 | This folder saves the features of validation images.
3 |
--------------------------------------------------------------------------------
/pre_train_json.py:
--------------------------------------------------------------------------------
1 | # encoding: UTF-8
2 |
3 | import os
4 | import json
5 | import numpy as np
6 | import cPickle as pickle
7 |
8 | import time
9 | import ipdb
10 |
11 | train_captions_path = './data/captions_train2014.json'
12 | save_images_captions_path = './data/train_images_captions.pkl'
13 |
14 | train_captions_fo = open(train_captions_path)
15 | train_captions = json.load(train_captions_fo)
16 |
17 | image_ids = []
18 | for annotation in train_captions['annotations']:
19 | image_ids.append(annotation['image_id'])
20 |
21 | # {image_file_name: [caption_1, caption_2, ...], ...}
22 | images_captions = {}
23 | for ii, image in enumerate(train_captions['images']):
24 | start_time = time.time()
25 |
26 | image_file_name = image['file_name']
27 | image_id = image['id']
28 | indices = [i for i, x in enumerate(image_ids) if x == image_id]
29 |
30 | caption = []
31 | for idx in indices:
32 | each_cap = train_captions['annotations'][idx]['caption']
33 | each_cap = each_cap.lower()
34 | each_cap = each_cap.replace('.', '')
35 | each_cap = each_cap.replace(',', ' ,')
36 | each_cap = each_cap.replace('?', ' ?')
37 | caption.append(each_cap)
38 | images_captions[image_file_name] = caption
39 | print "{} {} Each image cost: {}".format(ii, image_file_name, time.time()-start_time)
40 |
41 | with open(save_images_captions_path, 'w') as fw:
42 | pickle.dump(images_captions, fw)
43 |
44 |
45 |
--------------------------------------------------------------------------------
/pre_val_json.py:
--------------------------------------------------------------------------------
1 | # encoding: UTF-8
2 |
3 | import os
4 | import json
5 | import numpy as np
6 | import cPickle as pickle
7 |
8 | import time
9 | import ipdb
10 |
11 | train_captions_path = './data/captions_val2014.json'
12 | save_images_captions_path = './data/val_images_captions.pkl'
13 |
14 | train_captions_fo = open(train_captions_path)
15 | train_captions = json.load(train_captions_fo)
16 |
17 | image_ids = []
18 | for annotation in train_captions['annotations']:
19 | image_ids.append(annotation['image_id'])
20 |
21 | # {image_file_name: [caption_1, caption_2, ...], ...}
22 | images_captions = {}
23 | for ii, image in enumerate(train_captions['images']):
24 | start_time = time.time()
25 |
26 | image_file_name = image['file_name']
27 | image_id = image['id']
28 | indices = [i for i, x in enumerate(image_ids) if x == image_id]
29 |
30 | caption = []
31 | for idx in indices:
32 | each_cap = train_captions['annotations'][idx]['caption']
33 | each_cap = each_cap.lower()
34 | each_cap = each_cap.replace('.', '')
35 | each_cap = each_cap.replace(',', ' ,')
36 | each_cap = each_cap.replace('?', ' ?')
37 | caption.append(each_cap)
38 | images_captions[image_file_name] = caption
39 | print "{} {} Each image cost: {}".format(ii, image_file_name, time.time()-start_time)
40 |
41 | with open(save_images_captions_path, 'w') as fw:
42 | pickle.dump(images_captions, fw)
43 |
44 |
45 |
--------------------------------------------------------------------------------
/split_train_val_data.py:
--------------------------------------------------------------------------------
1 | # encoding: UTF-8
2 |
3 | # according to the paper: we hold out a small subset of 1,665 validation images
4 | # for hyper-parameter tuning, and use the remaining combined training and
5 | # validation set for training
6 |
7 | import os
8 | import cPickle as pickle
9 |
10 | train_images_captions_path = './data/train_images_captions.pkl'
11 | val_images_captions_path = './data/val_images_captions.pkl'
12 |
13 | with open(train_images_captions_path, 'r') as fr1:
14 | train_images_captions = pickle.load(fr1)
15 |
16 | with open(val_images_captions_path, 'r') as fr2:
17 | val_images_captions = pickle.load(fr2)
18 |
19 | val_images_names = val_images_captions.keys()
20 |
21 | # val_images_names[0:1665] for validation
22 | # val_images_names[1665:] for training
23 | val_names_part_one = val_images_names[0:1665]
24 | val_names_part_two = val_images_names[1665:]
25 |
26 | # re-save the train_images_captions, val_images_captions
27 | val_images_captions_new = {}
28 | for img in val_names_part_one:
29 | val_images_captions_new[img] = val_images_captions[img]
30 |
31 | for img in val_names_part_two:
32 | train_images_captions[img] = val_images_captions[img]
33 |
34 | with open(train_images_captions_path, 'w') as fw1:
35 | pickle.dump(train_images_captions, fw1)
36 |
37 | with open(val_images_captions_path, 'w') as fw2:
38 | pickle.dump(val_images_captions_new, fw2)
39 |
40 |
--------------------------------------------------------------------------------