├── LICENSE
├── README.md
├── configs
│   ├── __init__.py
│   ├── c04.py
│   ├── get_config.py
│   ├── json2tsv.py
│   ├── m05.json
│   ├── m05.tsv
│   └── tsv2json.py
├── create_fold.py
├── download.sh
├── download2.sh
├── main
│   ├── __init__.py
│   └── x05.py
├── models
│   ├── __init__.py
│   ├── bm05.py
│   └── m05.py
├── my
│   ├── __init__.py
│   ├── nn.py
│   ├── rnn_cell.py
│   └── tensorflow.py
├── prepro
│   ├── __init__.py
│   ├── __init__.pyc
│   └── p05.py
├── prepro_images.lua
├── read_data
│   ├── __init__.py
│   └── r05.py
├── requirements.txt
├── tmp
│   ├── __init__.py
│   ├── sim_test.py
│   └── simple.py
├── utils.py
└── vis
    ├── __init__.py
    ├── list_dqa_questions.py
    ├── list_facts.py
    ├── list_relations.py
    ├── list_results.py
    ├── list_vqa_questions.py
    └── templates
        ├── list_dqa_questions.html
        ├── list_facts.html
        ├── list_questions.html
        ├── list_relations.html
        └── list_results.html
/LICENSE:
--------------------------------------------------------------------------------
1 | Apache License
2 | Version 2.0, January 2004
3 | http://www.apache.org/licenses/
4 |
5 | TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
6 |
7 | 1. Definitions.
8 |
9 | "License" shall mean the terms and conditions for use, reproduction,
10 | and distribution as defined by Sections 1 through 9 of this document.
11 |
12 | "Licensor" shall mean the copyright owner or entity authorized by
13 | the copyright owner that is granting the License.
14 |
15 | "Legal Entity" shall mean the union of the acting entity and all
16 | other entities that control, are controlled by, or are under common
17 | control with that entity. For the purposes of this definition,
18 | "control" means (i) the power, direct or indirect, to cause the
19 | direction or management of such entity, whether by contract or
20 | otherwise, or (ii) ownership of fifty percent (50%) or more of the
21 | outstanding shares, or (iii) beneficial ownership of such entity.
22 |
23 | "You" (or "Your") shall mean an individual or Legal Entity
24 | exercising permissions granted by this License.
25 |
26 | "Source" form shall mean the preferred form for making modifications,
27 | including but not limited to software source code, documentation
28 | source, and configuration files.
29 |
30 | "Object" form shall mean any form resulting from mechanical
31 | transformation or translation of a Source form, including but
32 | not limited to compiled object code, generated documentation,
33 | and conversions to other media types.
34 |
35 | "Work" shall mean the work of authorship, whether in Source or
36 | Object form, made available under the License, as indicated by a
37 | copyright notice that is included in or attached to the work
38 | (an example is provided in the Appendix below).
39 |
40 | "Derivative Works" shall mean any work, whether in Source or Object
41 | form, that is based on (or derived from) the Work and for which the
42 | editorial revisions, annotations, elaborations, or other modifications
43 | represent, as a whole, an original work of authorship. For the purposes
44 | of this License, Derivative Works shall not include works that remain
45 | separable from, or merely link (or bind by name) to the interfaces of,
46 | the Work and Derivative Works thereof.
47 |
48 | "Contribution" shall mean any work of authorship, including
49 | the original version of the Work and any modifications or additions
50 | to that Work or Derivative Works thereof, that is intentionally
51 | submitted to Licensor for inclusion in the Work by the copyright owner
52 | or by an individual or Legal Entity authorized to submit on behalf of
53 | the copyright owner. For the purposes of this definition, "submitted"
54 | means any form of electronic, verbal, or written communication sent
55 | to the Licensor or its representatives, including but not limited to
56 | communication on electronic mailing lists, source code control systems,
57 | and issue tracking systems that are managed by, or on behalf of, the
58 | Licensor for the purpose of discussing and improving the Work, but
59 | excluding communication that is conspicuously marked or otherwise
60 | designated in writing by the copyright owner as "Not a Contribution."
61 |
62 | "Contributor" shall mean Licensor and any individual or Legal Entity
63 | on behalf of whom a Contribution has been received by Licensor and
64 | subsequently incorporated within the Work.
65 |
66 | 2. Grant of Copyright License. Subject to the terms and conditions of
67 | this License, each Contributor hereby grants to You a perpetual,
68 | worldwide, non-exclusive, no-charge, royalty-free, irrevocable
69 | copyright license to reproduce, prepare Derivative Works of,
70 | publicly display, publicly perform, sublicense, and distribute the
71 | Work and such Derivative Works in Source or Object form.
72 |
73 | 3. Grant of Patent License. Subject to the terms and conditions of
74 | this License, each Contributor hereby grants to You a perpetual,
75 | worldwide, non-exclusive, no-charge, royalty-free, irrevocable
76 | (except as stated in this section) patent license to make, have made,
77 | use, offer to sell, sell, import, and otherwise transfer the Work,
78 | where such license applies only to those patent claims licensable
79 | by such Contributor that are necessarily infringed by their
80 | Contribution(s) alone or by combination of their Contribution(s)
81 | with the Work to which such Contribution(s) was submitted. If You
82 | institute patent litigation against any entity (including a
83 | cross-claim or counterclaim in a lawsuit) alleging that the Work
84 | or a Contribution incorporated within the Work constitutes direct
85 | or contributory patent infringement, then any patent licenses
86 | granted to You under this License for that Work shall terminate
87 | as of the date such litigation is filed.
88 |
89 | 4. Redistribution. You may reproduce and distribute copies of the
90 | Work or Derivative Works thereof in any medium, with or without
91 | modifications, and in Source or Object form, provided that You
92 | meet the following conditions:
93 |
94 | (a) You must give any other recipients of the Work or
95 | Derivative Works a copy of this License; and
96 |
97 | (b) You must cause any modified files to carry prominent notices
98 | stating that You changed the files; and
99 |
100 | (c) You must retain, in the Source form of any Derivative Works
101 | that You distribute, all copyright, patent, trademark, and
102 | attribution notices from the Source form of the Work,
103 | excluding those notices that do not pertain to any part of
104 | the Derivative Works; and
105 |
106 | (d) If the Work includes a "NOTICE" text file as part of its
107 | distribution, then any Derivative Works that You distribute must
108 | include a readable copy of the attribution notices contained
109 | within such NOTICE file, excluding those notices that do not
110 | pertain to any part of the Derivative Works, in at least one
111 | of the following places: within a NOTICE text file distributed
112 | as part of the Derivative Works; within the Source form or
113 | documentation, if provided along with the Derivative Works; or,
114 | within a display generated by the Derivative Works, if and
115 | wherever such third-party notices normally appear. The contents
116 | of the NOTICE file are for informational purposes only and
117 | do not modify the License. You may add Your own attribution
118 | notices within Derivative Works that You distribute, alongside
119 | or as an addendum to the NOTICE text from the Work, provided
120 | that such additional attribution notices cannot be construed
121 | as modifying the License.
122 |
123 | You may add Your own copyright statement to Your modifications and
124 | may provide additional or different license terms and conditions
125 | for use, reproduction, or distribution of Your modifications, or
126 | for any such Derivative Works as a whole, provided Your use,
127 | reproduction, and distribution of the Work otherwise complies with
128 | the conditions stated in this License.
129 |
130 | 5. Submission of Contributions. Unless You explicitly state otherwise,
131 | any Contribution intentionally submitted for inclusion in the Work
132 | by You to the Licensor shall be under the terms and conditions of
133 | this License, without any additional terms or conditions.
134 | Notwithstanding the above, nothing herein shall supersede or modify
135 | the terms of any separate license agreement you may have executed
136 | with Licensor regarding such Contributions.
137 |
138 | 6. Trademarks. This License does not grant permission to use the trade
139 | names, trademarks, service marks, or product names of the Licensor,
140 | except as required for reasonable and customary use in describing the
141 | origin of the Work and reproducing the content of the NOTICE file.
142 |
143 | 7. Disclaimer of Warranty. Unless required by applicable law or
144 | agreed to in writing, Licensor provides the Work (and each
145 | Contributor provides its Contributions) on an "AS IS" BASIS,
146 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
147 | implied, including, without limitation, any warranties or conditions
148 | of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
149 | PARTICULAR PURPOSE. You are solely responsible for determining the
150 | appropriateness of using or redistributing the Work and assume any
151 | risks associated with Your exercise of permissions under this License.
152 |
153 | 8. Limitation of Liability. In no event and under no legal theory,
154 | whether in tort (including negligence), contract, or otherwise,
155 | unless required by applicable law (such as deliberate and grossly
156 | negligent acts) or agreed to in writing, shall any Contributor be
157 | liable to You for damages, including any direct, indirect, special,
158 | incidental, or consequential damages of any character arising as a
159 | result of this License or out of the use or inability to use the
160 | Work (including but not limited to damages for loss of goodwill,
161 | work stoppage, computer failure or malfunction, or any and all
162 | other commercial damages or losses), even if such Contributor
163 | has been advised of the possibility of such damages.
164 |
165 | 9. Accepting Warranty or Additional Liability. While redistributing
166 | the Work or Derivative Works thereof, You may choose to offer,
167 | and charge a fee for, acceptance of support, warranty, indemnity,
168 | or other liability obligations and/or rights consistent with this
169 | License. However, in accepting such obligations, You may act only
170 | on Your own behalf and on Your sole responsibility, not on behalf
171 | of any other Contributor, and only if You agree to indemnify,
172 | defend, and hold each Contributor harmless for any liability
173 | incurred by, or claims asserted against, such Contributor by reason
174 | of your accepting any such warranty or additional liability.
175 |
176 | END OF TERMS AND CONDITIONS
177 |
178 | APPENDIX: How to apply the Apache License to your work.
179 |
180 | To apply the Apache License to your work, attach the following
181 | boilerplate notice, with the fields enclosed by brackets "{}"
182 | replaced with your own identifying information. (Don't include
183 | the brackets!) The text should be enclosed in the appropriate
184 | comment syntax for the file format. We also recommend that a
185 | file or class name and description of purpose be included on the
186 | same "printed page" as the copyright notice for easier
187 | identification within third-party archives.
188 |
189 | Copyright {yyyy} {name of copyright owner}
190 |
191 | Licensed under the Apache License, Version 2.0 (the "License");
192 | you may not use this file except in compliance with the License.
193 | You may obtain a copy of the License at
194 |
195 | http://www.apache.org/licenses/LICENSE-2.0
196 |
197 | Unless required by applicable law or agreed to in writing, software
198 | distributed under the License is distributed on an "AS IS" BASIS,
199 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
200 | See the License for the specific language governing permissions and
201 | limitations under the License.
202 |
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # DQA-Net
2 | This is the original source code for DQA-Net, described in ["A Diagram is Worth a Dozen Images"](http://arxiv.org/abs/1603.07396).
3 |
4 | ## Quick Start
5 |
6 | ### 1. Install Requirements
7 | - Python (verified on 3.5.1)
8 | - Python packages: numpy, progressbar2, nltk, tensorflow, h5py, [qa2hypo](https://github.com/anglil/qa2hypo)
9 | - Torch (comes with Lua)
10 | - Lua packages: cunn, cudnn, cutorch, loadcaffe, hdf5
11 |
12 | Note that the most recent official release of tensorflow (0.7.1) probably won't be compatible with this.
13 | You will need to build from a recent commit (verified on [e39d8fe](https://github.com/tensorflow/tensorflow/tree/e39d8feebb9666a331345cd8d960f5ade4652bba)).
14 | DQA-Net does not use images (the VQA baseline does), so you can skip Lua/Torch if you just want to run DQA-Net. See details below.
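
If it helps, here is a minimal sketch of the Python-side setup (assuming the PyPI package names above; `qa2hypo` comes from its GitHub repo, and tensorflow must be built from source at the commit above):
```bash
pip install numpy progressbar2 nltk h5py
git clone https://github.com/anglil/qa2hypo.git   # see its README for setup
```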
15 |
16 | ### 2. Download Data and Models
17 | At the root folder, run:
18 | ```bash
19 | chmod +x download.sh
20 | ./download.sh
21 | ```
22 | to download the DQA data, folds, GloVe vectors, and the VGG-19 model.
23 | The VGG-19 model is used only for images; as mentioned above, DQA-Net does not use images, so you can comment the VGG lines out of `download.sh` if you only run DQA-Net.
24 |
25 | ### 3. Preprocess Data
26 | Run:
27 | ```bash
28 | python -m prepro.p05
29 | ```
30 | to preprocess data.
31 | You can just use the default directories unless you changed the download directories in `download.sh`.
32 |
33 | If you wish to skip image preprocessing (in case you only run DQA-Net), run with an additional flag:
34 | ```bash
35 | python -m prepro.p05 --prepro_images False
36 | ```
37 | Now you will see all the preprocessed json and h5 files in the `data/s3` folder under the source root.
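
For reference, `main/x05.py` later expects at least the following files in the data directory (a partial sketch based on the loading code; preprocessing produces other files as well):
```bash
data/s3/meta_data.json    # max_sent_size, max_fact_size, max_num_facts, num_choices, vocab_size, word_size
data/s3/init_emb_mat.h5   # initial word embedding matrix, read from its 'data' dataset
```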
38 |
39 | ### 4. Train and Test
40 | To train the default model, run:
41 | ```bash
42 | python -m main.x05 --train
43 | ```
44 |
45 | To test the trained model on test data, run:
46 | ```bash
47 | python -m main.x05
48 | ```
49 |
50 | To launch tensorboard, run:
51 | ```bash
52 | tensorboard --logdir logs/m05/None --port 6006
53 | ```
54 | Here, `m05` is the model name, and `None` is the default configuration. All tensorboard logs are stored in the `logs/` folder.
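
If you train with a named configuration instead (see `configs/m05.json`), the logs go to a subfolder named after it, e.g.:
```bash
tensorboard --logdir logs/m05/local --port 6006
```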
55 |
56 | To visualize the attention, run:
57 | ```bash
58 | python -m vis.list_results 5 None train 1 --port 8000 --open True
59 | ```
60 | Here, `5` is the model number (`m05`), `None` is the default configuration, `train` indicates the data type, and `1` is the epoch from which the result will be fetched.
61 | See the `evals/m05/None` folder for the available epochs (the saving frequency is controlled by the `save_period` flag in `main/x05.py`).
62 | After running the script, it hosts an HTML server at the specified port.
63 | The `--open True` flag opens a web browser at this address.
64 |
65 | In general, use the `-h` flag with the run scripts to see the available options.
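
For example, to train with the `local` configuration defined in `configs/m05.json` (a sketch; any id from that file works the same way):
```bash
python -m main.x05 --train --config local
```
Entries left as `null` in the config file do not override the corresponding command-line flags; see `configs/get_config.py`.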
66 |
--------------------------------------------------------------------------------
/configs/__init__.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/allenai/dqa-net/c497e950017e69952deae3ffd07920a5bfb9f468/configs/__init__.py
--------------------------------------------------------------------------------
/configs/c04.py:
--------------------------------------------------------------------------------
1 | # For mac local testing
2 | configs = {}
3 | configs[8] = {'batch_size': 10,
4 | 'hidden_size': 10,
5 | 'num_layers': 1,
6 | 'rnn_num_layers': 1,
7 | 'init_mean': 0,
8 | 'init_std': 0.1,
9 | 'init_lr': 0.01,
10 | 'init_nw': 0.9,
11 | 'anneal_period': 10,
12 | 'anneal_ratio': 0.7,
13 | 'num_epochs': 2,
14 | 'linear_start': False,
15 | 'max_grad_norm': 40,
16 | 'keep_prob': 1.0,
17 | 'val_period': 1,
18 | 'save_period': 1,
19 | 'fold_path': 'data/s3-100/fold.json',
20 | 'data_dir': 'data/s3-100',
21 | 'model_name': 'm04',
22 | 'mode': 'la',
23 | 'use_null': False,
24 | 'opt': 'adagrad',
25 | 'model': 'sim',
26 | 'lstm': 'basic',
27 | 'sim_func': 'dot',
28 | }
29 |
30 | # debug purpose
31 | configs[9] = {'batch_size': 100,
32 | 'hidden_size': 100,
33 | 'num_layers': 1,
34 | 'rnn_num_layers': 1,
35 | 'emb_num_layers': 1,
36 | 'init_mean': 0,
37 | 'init_std': 0.1,
38 | 'init_lr': 0.01,
39 | 'anneal_period': 100,
40 | 'anneal_ratio': 0.5,
41 | 'num_epochs': 100,
42 | 'linear_start': False,
43 | 'max_grad_norm': 40,
44 | 'keep_prob': 1.0,
45 | 'fold_path': 'data/s3/folds/fold11.json',
46 | 'data_dir': 'data/s3_04_debug',
47 | 'mode': 'la',
48 | 'model_name': 'm04',
49 | 'val_num_batches': 100,
50 | }
51 |
52 | # deploy configs start here
53 | configs[0] = {'batch_size': 100,
54 | 'hidden_size': 50,
55 | 'num_layers': 1,
56 | 'rnn_num_layers': 1,
57 | 'init_mean': 0,
58 | 'init_std': 0.1,
59 | 'init_lr': 0.01,
60 | 'init_nw': 0.0,
61 | 'anneal_period': 10,
62 | 'anneal_ratio': 0.8,
63 | 'num_epochs': 100,
64 | 'linear_start': False,
65 | 'max_grad_norm': 40,
66 | 'keep_prob': 1.0,
67 | 'val_period': 5,
68 | 'save_period': 10,
69 | 'fold_path': 'data/s3/folds/fold20.json',
70 | 'data_dir': 'data/s3',
71 | 'model_name': 'm04',
72 | 'mode': 'a',
73 | 'use_null': False,
74 | 'opt': 'basic',
75 | 'model': 'sim',
76 | 'lstm': 'basic',
77 | 'sim_func': 'dot',
78 | 'forget_bias': 2.5,
79 | }
80 |
81 | configs[1] = {'batch_size': 100,
82 | 'hidden_size': 50,
83 | 'num_layers': 1,
84 | 'rnn_num_layers': 1,
85 | 'init_mean': 0,
86 | 'init_std': 0.1,
87 | 'init_lr': 0.01,
88 | 'init_nw': 0.0,
89 | 'anneal_period': 10,
90 | 'anneal_ratio': 0.8,
91 | 'num_epochs': 100,
92 | 'linear_start': False,
93 | 'max_grad_norm': 40,
94 | 'keep_prob': 1.0,
95 | 'val_period': 5,
96 | 'save_period': 10,
97 | 'fold_path': 'data/s3/folds/fold24.json',
98 | 'data_dir': 'data/s3',
99 | 'model_name': 'm04',
100 | 'mode': 'a',
101 | 'use_null': False,
102 | 'opt': 'basic',
103 | 'model': 'sim',
104 | 'lstm': 'basic',
105 | 'sim_func': 'dot',
106 | 'forget_bias': 2.5,
107 | }
108 |
109 | configs[2] = {'batch_size': 100,
110 | 'hidden_size': 50,
111 | 'num_layers': 1,
112 | 'rnn_num_layers': 1,
113 | 'init_mean': 0,
114 | 'init_std': 0.1,
115 | 'init_lr': 0.01,
116 | 'init_nw': 0.0,
117 | 'anneal_period': 10,
118 | 'anneal_ratio': 0.8,
119 | 'num_epochs': 100,
120 | 'linear_start': False,
121 | 'max_grad_norm': 40,
122 | 'keep_prob': 1.0,
123 | 'val_period': 5,
124 | 'save_period': 10,
125 | 'fold_path': 'data/s3/folds/fold22.json',
126 | 'data_dir': 'data/s3',
127 | 'model_name': 'm04',
128 | 'mode': 'a',
129 | 'use_null': False,
130 | 'opt': 'basic',
131 | 'model': 'sim',
132 | 'lstm': 'basic',
133 | 'sim_func': 'dot',
134 | 'forget_bias': 2.5,
135 | }
136 |
137 | configs[3] = {'batch_size': 100,
138 | 'hidden_size': 50,
139 | 'num_layers': 1,
140 | 'rnn_num_layers': 1,
141 | 'init_mean': 0,
142 | 'init_std': 0.1,
143 | 'init_lr': 0.01,
144 | 'init_nw': 0.0,
145 | 'anneal_period': 10,
146 | 'anneal_ratio': 0.8,
147 | 'num_epochs': 100,
148 | 'linear_start': False,
149 | 'max_grad_norm': 40,
150 | 'keep_prob': 1.0,
151 | 'val_period': 5,
152 | 'save_period': 10,
153 | 'fold_path': 'data/s3/folds/fold23.json',
154 | 'data_dir': 'data/s3',
155 | 'model_name': 'm04',
156 | 'mode': 'a',
157 | 'use_null': False,
158 | 'opt': 'basic',
159 | 'model': 'sim',
160 | 'lstm': 'basic',
161 | 'sim_func': 'dot',
162 | 'forget_bias': 2.5,
163 | }
164 |
165 | configs[4] = {'batch_size': 100,
166 | 'hidden_size': 50,
167 | 'num_layers': 1,
168 | 'rnn_num_layers': 1,
169 | 'init_mean': 0,
170 | 'init_std': 0.1,
171 | 'init_lr': 0.01,
172 | 'init_nw': 0.0,
173 | 'anneal_period': 10,
174 | 'anneal_ratio': 0.8,
175 | 'num_epochs': 100,
176 | 'linear_start': False,
177 | 'max_grad_norm': 40,
178 | 'keep_prob': 1.0,
179 | 'val_period': 5,
180 | 'save_period': 10,
181 | 'fold_path': 'data/s3/folds/fold21.json',
182 | 'data_dir': 'data/s3-ours',
183 | 'model_name': 'm04',
184 | 'mode': 'a',
185 | 'use_null': False,
186 | 'opt': 'basic',
187 | 'model': 'sim',
188 | 'lstm': 'basic',
189 | 'sim_func': 'dot',
190 | 'forget_bias': 2.5,
191 | }
192 |
193 | configs[5] = {'batch_size': 100,
194 | 'hidden_size': 50,
195 | 'num_layers': 1,
196 | 'rnn_num_layers': 1,
197 | 'init_mean': 0,
198 | 'init_std': 0.1,
199 | 'init_lr': 0.01,
200 | 'init_nw': 0.0,
201 | 'anneal_period': 10,
202 | 'anneal_ratio': 0.8,
203 | 'num_epochs': 100,
204 | 'linear_start': False,
205 | 'max_grad_norm': 40,
206 | 'keep_prob': 1.0,
207 | 'val_period': 5,
208 | 'save_period': 10,
209 | 'fold_path': 'data/s3/folds/fold22.json',
210 | 'data_dir': 'data/s3-ours',
211 | 'model_name': 'm04',
212 | 'mode': 'a',
213 | 'use_null': False,
214 | 'opt': 'basic',
215 | 'model': 'sim',
216 | 'lstm': 'basic',
217 | 'sim_func': 'dot',
218 | 'forget_bias': 2.5,
219 | }
220 |
221 | configs[6] = {'batch_size': 100,
222 | 'hidden_size': 50,
223 | 'num_layers': 1,
224 | 'rnn_num_layers': 1,
225 | 'init_mean': 0,
226 | 'init_std': 0.1,
227 | 'init_lr': 0.01,
228 | 'init_nw': 0.0,
229 | 'anneal_period': 10,
230 | 'anneal_ratio': 0.8,
231 | 'num_epochs': 100,
232 | 'linear_start': False,
233 | 'max_grad_norm': 40,
234 | 'keep_prob': 1.0,
235 | 'val_period': 5,
236 | 'save_period': 10,
237 | 'fold_path': 'data/s3/folds/fold23.json',
238 | 'data_dir': 'data/s3-ours',
239 | 'model_name': 'm04',
240 | 'mode': 'a',
241 | 'use_null': False,
242 | 'opt': 'basic',
243 | 'model': 'sim',
244 | 'lstm': 'basic',
245 | 'sim_func': 'dot',
246 | 'forget_bias': 2.5,
247 | }
248 |
249 | configs[7] = {'batch_size': 100,
250 | 'hidden_size': 50,
251 | 'num_layers': 1,
252 | 'rnn_num_layers': 1,
253 | 'emb_num_layers': 1,
254 | 'init_mean': 0,
255 | 'init_std': 0.1,
256 | 'init_lr': 0.01,
257 | 'init_nw': 0.0,
258 | 'anneal_period': 10,
259 | 'anneal_ratio': 0.8,
260 | 'num_epochs': 100,
261 | 'linear_start': False,
262 | 'max_grad_norm': 40,
263 | 'keep_prob': 1.0,
264 | 'val_period': 5,
265 | 'save_period': 10,
266 | 'fold_path': 'data/s3/folds/fold24.json',
267 | 'data_dir': 'data/s3',
268 | 'model_name': 'm04',
269 | 'mode': 'a',
270 | 'use_null': False,
271 | 'opt': 'basic',
272 | 'model': 'sim',
273 | 'lstm': 'basic',
274 | 'sim_func': 'dot',
275 | 'forget_bias': 1.0,
276 | }
277 |
--------------------------------------------------------------------------------
/configs/get_config.py:
--------------------------------------------------------------------------------
1 | import json
2 | import os
3 | from collections import OrderedDict
4 | from copy import deepcopy
5 |
6 | from configs.tsv2json import tsv2dict
7 |
8 |
9 | class Config(object):
10 | def __init__(self, **entries):
11 | self.__dict__.update(entries)
12 |
13 |
14 | def get_config(d0, d1, priority=1):
15 | """
16 | d1 replaces d0. If priority = 0, then d0 replaces d1
17 | :param d0:
18 | :param d1:
19 | :param name:
20 | :param priority:
21 | :return:
22 | """
23 | if priority == 0:
24 | d0, d1 = d1, d0
25 | d = deepcopy(d0)
26 | for key, val in d1.items():
27 | if val is not None:
28 | d[key] = val
29 | return Config(**d)
30 |
31 |
32 | def get_config_from_file(d0, path, id_, priority=1):
33 | _, ext = os.path.splitext(path)
34 | if ext == '.json':
35 | configs = json.load(open(path, 'r'), object_pairs_hook=OrderedDict)
36 | elif ext == '.tsv':
37 | configs = tsv2dict(path)
38 | else:
39 | raise Exception("Extension %r is not supported." % ext)
40 | d1 = configs[id_]
41 | return get_config(d0, d1, priority=priority)
42 |
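43 | # Illustrative usage of get_config (hypothetical values, not from the original code):
44 | #   base = {'batch_size': 100, 'init_lr': 0.01, 'mode': None}
45 | #   config = get_config(base, {'init_lr': 0.1, 'mode': None})
46 | #   config.init_lr  # -> 0.1: non-None entries in d1 override d0
47 | #   config.mode     # -> None: None entries in d1 are ignored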
--------------------------------------------------------------------------------
/configs/json2tsv.py:
--------------------------------------------------------------------------------
1 | import csv
2 | import json
3 | from collections import OrderedDict
4 | import argparse
5 |
6 |
7 | def get_args():
8 | parser = argparse.ArgumentParser()
9 | parser.add_argument("json_path")
10 | parser.add_argument("tsv_path")
11 | return parser.parse_args()
12 |
13 |
14 | def json2tsv(json_path, tsv_path):
15 | configs = json.load(open(json_path, 'r'), object_pairs_hook=OrderedDict)
16 | type_dict = OrderedDict([('id', 'str')])
17 | for id_, config in configs.items():
18 | for key, val in config.items():
19 | if val is None:
20 | if key not in type_dict:
21 | type_dict[key] = 'none'
22 | continue
23 |
24 | type_name = type(val).__name__
25 | if key in type_dict and type_dict[key] != 'none':
26 | assert type_dict[key] == type_name, "inconsistent param type: %s" % key
27 | else:
28 | type_dict[key] = type_name
29 |
30 | with open(tsv_path, 'w', newline='') as file:
31 | writer = csv.writer(file, delimiter='\t')
32 | writer.writerow(type_dict.keys())
33 | writer.writerow(type_dict.values())
34 | for id_, config in configs.items():
35 | config["id"] = id_
36 | row = [config[key] if key in config and config[key] is not None else "None"
37 | for key in type_dict]
38 | writer.writerow(row)
39 |
40 |
41 | def main():
42 | args = get_args()
43 | json2tsv(args.json_path, args.tsv_path)
44 |
45 |
46 | if __name__ == "__main__":
47 | main()
48 |
--------------------------------------------------------------------------------
/configs/m05.json:
--------------------------------------------------------------------------------
1 | {
2 | "debug": {
3 | "anneal_period": 25,
4 | "anneal_ratio": 0.5,
5 | "batch_size": 100,
6 | "data_dir": "data/s3-debug",
7 | "device_type": "gpu",
8 | "emb_num_layers": 0,
9 | "fold_path": "data/s3-debug/folds/fold11.json",
10 | "hidden_size": 100,
11 | "init_lr": 0.1,
12 | "init_mean": 0,
13 | "init_std": 0.1,
14 | "keep_prob": 1.0,
15 | "lstm": null,
16 | "max_grad_norm": null,
17 | "mode": null,
18 | "model": null,
19 | "num_devices": 2,
20 | "num_epochs": 100,
21 | "opt": null,
22 | "rnn_num_layers": 1,
23 | "save_period": null,
24 | "sim_func": "dot",
25 | "val_num_batches": null,
26 | "val_period": null
27 | },
28 | "gpu": {
29 | "anneal_period": null,
30 | "anneal_ratio": null,
31 | "batch_size": null,
32 | "data_dir": null,
33 | "device_type": "gpu",
34 | "emb_num_layers": null,
35 | "fold_path": null,
36 | "hidden_size": null,
37 | "init_lr": null,
38 | "init_mean": null,
39 | "init_std": null,
40 | "keep_prob": null,
41 | "lstm": null,
42 | "max_grad_norm": null,
43 | "mode": null,
44 | "model": null,
45 | "num_devices": 4,
46 | "num_epochs": null,
47 | "opt": null,
48 | "rnn_num_layers": null,
49 | "save_period": null,
50 | "sim_func": null,
51 | "val_num_batches": null,
52 | "val_period": null
53 | },
54 | "local": {
55 | "anneal_period": 10,
56 | "anneal_ratio": 0.7,
57 | "batch_size": 10,
58 | "data_dir": "data/s3-100",
59 | "device_type": null,
60 | "emb_num_layers": 1,
61 | "fold_path": "data/s3-100/fold.json",
62 | "hidden_size": 10,
63 | "init_lr": 0.01,
64 | "init_mean": 0,
65 | "init_std": 0.1,
66 | "keep_prob": 1.0,
67 | "lstm": "basic",
68 | "max_grad_norm": 40,
69 | "model": "sim",
70 | "num_devices": null,
71 | "num_epochs": 2,
72 | "opt": "adagrad",
73 | "rnn_num_layers": 1,
74 | "save_period": 1,
75 | "sim_func": "dot",
76 | "val_num_batches": null,
77 | "val_period": 1
78 | }
79 | }
--------------------------------------------------------------------------------
/configs/m05.tsv:
--------------------------------------------------------------------------------
1 | id anneal_period anneal_ratio batch_size data_dir device_type emb_num_layers fold_path hidden_size init_lr init_mean init_std keep_prob lstm model num_devices num_epochs opt rnn_num_layers save_period sim_func val_num_batches val_period max_grad_norm mode
2 | str int float int str str int str int float int float float str str int int str int int str none int int str
3 | debug 25 0.5 100 data/s3-debug gpu 0 data/s3-debug/folds/fold11.json 100 0.1 0 0.1 1.0 None None 2 100 None 1 None dot None None None None
4 | local 10 0.7 10 data/s3-100 None 1 data/s3-100/fold.json 10 0.01 0 0.1 1.0 basic sim None 2 adagrad 1 1 dot None 1 40 la
5 | gpu None None None None gpu None None None None None None None None None 4 None None None None None None None None None
6 |
--------------------------------------------------------------------------------
/configs/tsv2json.py:
--------------------------------------------------------------------------------
1 | import argparse
2 | import csv
3 | import json
4 | from collections import OrderedDict
5 |
6 | from utils import json_pretty_dump
7 |
8 |
9 | def get_args():
10 | parser = argparse.ArgumentParser()
11 | parser.add_argument("tsv_path")
12 | parser.add_argument("json_path")
13 | return parser.parse_args()
14 |
15 |
16 | def tsv2json(tsv_path, json_path):
17 | d = tsv2dict(tsv_path)
18 | json_pretty_dump(d, open(json_path, 'w'))
19 |
20 |
21 | def tsv2dict(tsv_path):
22 | def bool(string):
23 | """
24 | shadows original bool, which maps 'False' to True
25 | """
26 | if string == 'True':
27 | return True
28 | elif string == 'False':
29 | return False
30 | else:
31 | raise Exception("Cannot convert %s to bool" % string)
32 |
33 | def none(val):
34 | return val
35 |
36 | with open(tsv_path, 'r', newline='') as file:
37 | reader = csv.reader(file, delimiter='\t')
38 | fields = next(reader)
39 | type_names = next(reader)
40 | casters = list(map(eval, type_names))
41 | out_dict = {}
42 | for row in reader:
43 | cur_dict = OrderedDict(
44 | (field, None if val == "None" else caster(val))
45 | for field, caster, val in zip(fields, casters, row))
46 | id_ = cur_dict['id']
47 | del cur_dict['id']
48 | out_dict[id_] = cur_dict
49 | return out_dict
50 |
51 |
52 | def main():
53 | args = get_args()
54 | tsv2json(args.tsv_path, args.json_path)
55 |
56 |
57 | if __name__ == "__main__":
58 | main()
59 |
--------------------------------------------------------------------------------
/create_fold.py:
--------------------------------------------------------------------------------
1 | import json
2 | import os
3 | import random
4 | import argparse
5 | from collections import defaultdict
6 |
7 |
8 | def create_linear_fold():
9 | parser = argparse.ArgumentParser()
10 | parser.add_argument("data_dir")
11 | parser.add_argument("fold_path")
12 | parser.add_argument("--ratio", type=float, default=0.8)
13 | parser.add_argument("--shuffle", default="False")
14 |
15 | args = parser.parse_args()
16 |
17 | data_dir = args.data_dir
18 | images_dir = os.path.join(data_dir, "images")
19 | annotations_dir = os.path.join(data_dir, "annotations")
20 | ratio = args.ratio
21 | shuffle = args.shuffle == 'True'
22 | fold_path = args.fold_path
23 | annotation_names = set(os.path.splitext(name)[0] for name in os.listdir(annotations_dir) if name.endswith(".json"))
24 | image_ids = list(sorted([os.path.splitext(name)[0]
25 | for name in os.listdir(images_dir) if name.endswith(".png") and name in annotation_names],
26 | key=lambda x: int(x)))
27 | if shuffle:
28 | random.shuffle(image_ids)
29 |
30 | mid = int(len(image_ids) * (1 - ratio))
31 | print("train={}, test={}".format(len(image_ids)-mid, mid))
32 | fold = {'train': image_ids[mid:], 'test': image_ids[:mid]}
33 | json.dump(fold, open(fold_path, 'w'))
34 |
35 |
36 | def create_randomly_categorized_fold():
37 | parser = argparse.ArgumentParser()
38 | parser.add_argument("cat_path")
39 | parser.add_argument("fold_path")
40 | parser.add_argument("--test_cats", nargs='*')
41 | parser.add_argument("--ratio", type=float)
42 | args = parser.parse_args()
43 | cats_path = args.cat_path
44 | test_cats = args.test_cats
45 | cat_dict = json.load(open(cats_path, 'r'))
46 | ids_dict = defaultdict(set)
47 | for image_name, cat in cat_dict.items():
48 | image_id, _ = os.path.splitext(image_name)
49 | ids_dict[cat].add(image_id)
50 |
51 | cats = list(ids_dict.keys())
52 | print(cats)
53 | if test_cats is None:
54 | random.shuffle(cats)
55 | mid = int(args.ratio * len(cats))
56 | train_cats = cats[:mid]
57 | test_cats = cats[mid:]
58 | else:
59 | for cat in test_cats:
60 |             assert cat in ids_dict, "%s is not a valid category." % cat
61 | train_cats = [cat for cat in cats if cat not in test_cats]
62 |
63 | print("train categories: %s" % ", ".join(train_cats))
64 | print("test categories: %s" % ", ".join(test_cats))
65 | train_ids = sorted(set.union(*[ids_dict[cat] for cat in train_cats]), key=lambda x: int(x))
66 | test_ids = sorted(set.union(*[ids_dict[cat] for cat in test_cats]), key=lambda x: int(x))
67 | fold = {'train': train_ids, 'test': test_ids, 'trainCats': train_cats, 'testCats': test_cats}
68 | json.dump(fold, open(args.fold_path, "w"))
69 |
70 |
71 | if __name__ == "__main__":
72 |     # create_linear_fold()
73 | create_randomly_categorized_fold()
74 |
--------------------------------------------------------------------------------
/download.sh:
--------------------------------------------------------------------------------
1 | #!/usr/bin/env bash
2 |
3 | DATA_DIR="$HOME/data"
4 | DQA_DATA_DIR="$DATA_DIR/dqa"
5 |
6 | MODELS_DIR="$HOME/models"
7 | GLOVE_DIR="$MODELS_DIR/glove"
8 | VGG_DIR="$MODELS_DIR/vgg"
9 |
10 | PREPRO_DIR="data"
11 | DQA_PREPRO_DIR="$PREPRO_DIR/s3"
12 |
13 | # DQA data download
14 | mkdir $DATA_DIR
15 | mkdir $DQA_DATA_DIR
16 | wget https://s3-us-west-2.amazonaws.com/dqa-data/shining3.zip -O $DQA_DATA_DIR/shining3.zip
17 | unzip -q $DQA_DATA_DIR/shining3.zip -d $DQA_DATA_DIR
18 |
19 | # Glove pre-trained vectors download
20 | mkdir $MODELS_DIR
21 | mkdir $GLOVE_DIR
22 | wget http://nlp.stanford.edu/data/glove.6B.zip -O $GLOVE_DIR/glove.6B.zip
23 | unzip -q $GLOVE_DIR/glove.6B.zip -d $GLOVE_DIR
24 |
25 | # VGG-19 models download
26 | mkdir $VGG_DIR
27 | wget http://www.robots.ox.ac.uk/~vgg/software/very_deep/caffe/VGG_ILSVRC_19_layers.caffemodel -O $VGG_DIR/vgg-19.caffemodel
28 | wget https://gist.githubusercontent.com/ksimonyan/3785162f95cd2d5fee77/raw/f02f8769e64494bcd3d7e97d5d747ac275825721/VGG_ILSVRC_19_layers_deploy.prototxt -O $VGG_DIR/vgg-19.prototxt
29 |
30 | # folds download
31 | mkdir $PREPRO_DIR
32 | mkdir $DQA_PREPRO_DIR
33 | wget https://s3-us-west-2.amazonaws.com/dqa-data/shining3-folds.zip -O $DQA_PREPRO_DIR/shining3-folds.zip
34 | unzip -q $DQA_PREPRO_DIR/shining3-folds.zip -d $DQA_PREPRO_DIR
--------------------------------------------------------------------------------
/download2.sh:
--------------------------------------------------------------------------------
1 | #!/usr/bin/env bash
2 |
3 | #PROJ_HOME="~/csehomedir/projects/dqa-net"
4 | DATA_DIR="$HOME/data"
5 | DQA_DATA_DIR="$DATA_DIR/dqa"
6 |
7 | MODELS_DIR="$HOME/models"
8 | GLOVE_DIR="$MODELS_DIR/glove"
9 | #VGG_DIR="$MODELS_DIR/vgg"
10 |
11 | PREPRO_DIR="data"
12 | DQA_PREPRO_DIR="$PREPRO_DIR/s3"
13 |
14 | # DQA data download
15 | #if [ ! -d "$DATA_DIR" ]; then
16 | # echo "making dir $DATA_DIR"
17 | # mkdir -p "$DATA_DIR"
18 | #fi
19 | #if [ ! -d "$DQA_DATA_DIR" ]; then
20 | # echo "making dir $DQA_DATA_DIR"
21 | # mkdir -p "$DQA_DATA_DIR"
22 | #fi
23 | #wget https://s3-us-west-2.amazonaws.com/dqa-data/shining3.zip -O $DQA_DATA_DIR/shining3.zip
24 | #unzip -q $DQA_DATA_DIR/shining3.zip -d $DQA_DATA_DIR
25 |
26 | # Glove pre-trained vectors download
27 | #if [ ! -d "$MODELS_DIR" ]; then
28 | # echo "making dir $MODELS_DIR"
29 | # mkdir -p "$MODELS_DIR"
30 | #fi
31 | #if [ ! -d "$GLOVE_DIR" ]; then
32 | # echo "making dir $GLOVE_DIR"
33 | # mkdir -p "$GLOVE_DIR"
34 | #fi
35 | #wget http://nlp.stanford.edu/data/glove.6B.zip -O $GLOVE_DIR/glove.6B.zip
36 | #unzip -q $GLOVE_DIR/glove.6B.zip -d $GLOVE_DIR
37 |
38 | # VGG-19 models download
39 | #mkdir $VGG_DIR
40 | #wget http://www.robots.ox.ac.uk/~vgg/software/very_deep/caffe/VGG_ILSVRC_19_layers.caffemodel -O $VGG_DIR/vgg-19.caffemodel
41 | #wget https://gist.githubusercontent.com/ksimonyan/3785162f95cd2d5fee77/raw/f02f8769e64494bcd3d7e97d5d747ac275825721/VGG_ILSVRC_19_layers_deploy.prototxt -O $VGG_DIR/vgg-19.prototxt
42 |
43 | # folds download
44 | if [ ! -d "$PREPRO_DIR" ]; then
45 | echo "making dir $PREPRO_DIR"
46 | mkdir -p "$PREPRO_DIR"
47 | fi
48 | if [ ! -d "$DQA_PREPRO_DIR" ]; then
49 | echo "making dir $DQA_PREPRO_DIR"
50 | mkdir -p "$DQA_PREPRO_DIR"
51 | fi
52 | wget https://s3-us-west-2.amazonaws.com/dqa-data/shining3-folds.zip -O $DQA_PREPRO_DIR/shining3-folds.zip
53 | unzip -q $DQA_PREPRO_DIR/shining3-folds.zip -d $DQA_PREPRO_DIR
54 |
--------------------------------------------------------------------------------
/main/__init__.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/allenai/dqa-net/c497e950017e69952deae3ffd07920a5bfb9f468/main/__init__.py
--------------------------------------------------------------------------------
/main/x05.py:
--------------------------------------------------------------------------------
1 | import json
2 | import os
3 | import shutil
4 | from pprint import pprint
5 |
6 | import h5py
7 | import tensorflow as tf
8 |
9 | from configs.get_config import get_config_from_file, get_config
10 | from models.m05 import Tower, Runner
11 | from read_data.r05 import read_data
12 |
13 | flags = tf.app.flags
14 |
15 | # File directories
16 | # TODO : Make sure these directories are correct!
17 | flags.DEFINE_string("model_name", "m05", "Model name. This will be used for save, log, and eval names. [m05]")
18 | flags.DEFINE_string("data_dir", "data/s3", "Data directory [data/s3]")
19 | flags.DEFINE_string("fold_path", "data/s3/folds/fold23.json", "fold json path [data/s3/folds/fold23.json]")
20 |
21 | # Training parameters
22 | flags.DEFINE_integer("batch_size", 100, "Batch size for each tower. [100]")
23 | flags.DEFINE_float("init_mean", 0, "Initial weight mean [0]")
24 | flags.DEFINE_float("init_std", 0.1, "Initial weight std [0.1]")
25 | flags.DEFINE_float("init_lr", 0.1, "Initial learning rate [0.1]")
26 | flags.DEFINE_integer("anneal_period", 20, "Anneal period [20]")
27 | flags.DEFINE_float("anneal_ratio", 0.5, "Anneal ratio [0.5]")
28 | flags.DEFINE_integer("num_epochs", 200, "Total number of epochs for training [200]")
29 | flags.DEFINE_string("opt", 'basic', 'Optimizer: basic | adagrad [basic]')
30 |
31 | # Training and testing options
32 | flags.DEFINE_boolean("train", False, "Train? Test if False [False]")
33 | flags.DEFINE_integer("val_num_batches", -1, "Val num batches. -1 for max possible. [-1]")
34 | flags.DEFINE_integer("train_num_batches", -1, "Train num batches. -1 for max possible [-1]")
35 | flags.DEFINE_integer("test_num_batches", -1, "Test num batches. -1 for max possible [-1]")
36 | flags.DEFINE_boolean("load", False, "Load from saved model? [False]")
37 | flags.DEFINE_boolean("progress", True, "Show progress bar? [True]")
38 | flags.DEFINE_string("device_type", 'cpu', "cpu | gpu [cpu]")
39 | flags.DEFINE_integer("num_devices", 1, "Number of devices to use. Only for multi-GPU. [1]")
40 | flags.DEFINE_integer("val_period", 5, "Validation period (for display purpose only) [5]")
41 | flags.DEFINE_integer("save_period", 10, "Save period [10]")
42 | flags.DEFINE_string("config", 'None', "Config name (e.g. local) to load. 'None' to use config here. [None]")
43 | flags.DEFINE_string("config_ext", ".json", "Config file extension: .json | .tsv [.json]")
44 |
45 | # Debugging
46 | flags.DEFINE_boolean("draft", False, "Draft? (quick initialize) [False]")
47 |
48 | # App-specific training parameters
49 | flags.DEFINE_integer("hidden_size", 50, "Hidden size [50]")
50 | flags.DEFINE_integer("image_size", 4096, "Image size [4096]")
51 | flags.DEFINE_integer("rnn_num_layers", 1, "Number of rnn layers [1]")
52 | flags.DEFINE_integer("emb_num_layers", 0, "Number of embedding layers [0]")
53 | flags.DEFINE_float("keep_prob", 1.0, "Keep probability of dropout [1.0]")
54 | flags.DEFINE_string("sim_func", 'dot', "Similarity function: man_sim | dot [dot]")
55 | flags.DEFINE_string("lstm", "basic", "LSTM cell type: regular | basic | GRU [basic]")
56 | flags.DEFINE_float("forget_bias", 2.5, "LSTM forget bias for basic cell [2.5]")
57 | flags.DEFINE_float("cell_clip", 40, "LSTM cell clipping for regular cell [40]")
58 | flags.DEFINE_float("rand_y", 1.0, "Rand y. [1.0]")
59 | flags.DEFINE_string("mode", "dqanet", "dqanet | vqa [dqanet]")
60 | flags.DEFINE_string("encoder", "lstm", "lstm | mean [lstm]")
61 |
62 | FLAGS = flags.FLAGS
63 |
64 |
65 | def mkdirs(config):
66 | evals_dir = "evals"
67 | logs_dir = "logs"
68 | saves_dir = "saves"
69 | if not os.path.exists(evals_dir):
70 | os.mkdir(evals_dir)
71 | if not os.path.exists(logs_dir):
72 | os.mkdir(logs_dir)
73 | if not os.path.exists(saves_dir):
74 | os.mkdir(saves_dir)
75 |
76 | eval_dir = os.path.join(evals_dir, config.model_name)
77 | eval_subdir = os.path.join(eval_dir, "%s" % str(config.config).zfill(2))
78 | log_dir = os.path.join(logs_dir, config.model_name)
79 | log_subdir = os.path.join(log_dir, "%s" % str(config.config).zfill(2))
80 | save_dir = os.path.join(saves_dir, config.model_name)
81 | save_subdir = os.path.join(save_dir, "%s" % str(config.config).zfill(2))
82 | config.eval_dir = eval_subdir
83 | config.log_dir = log_subdir
84 | config.save_dir = save_subdir
85 |
86 | if not os.path.exists(eval_dir):
87 | os.mkdir(eval_dir)
88 | if os.path.exists(eval_subdir):
89 | if config.train and not config.load:
90 | shutil.rmtree(eval_subdir)
91 | os.mkdir(eval_subdir)
92 | else:
93 | os.mkdir(eval_subdir)
94 | if not os.path.exists(log_dir):
95 | os.mkdir(log_dir)
96 | if os.path.exists(log_subdir):
97 | if config.train and not config.load:
98 | shutil.rmtree(log_subdir)
99 | os.mkdir(log_subdir)
100 | else:
101 | os.mkdir(log_subdir)
102 | if config.train:
103 | if not os.path.exists(save_dir):
104 | os.mkdir(save_dir)
105 | if os.path.exists(save_subdir):
106 | if not config.load:
107 | shutil.rmtree(save_subdir)
108 | os.mkdir(save_subdir)
109 | else:
110 | os.mkdir(save_subdir)
111 |
112 |
113 | def load_meta_data(config):
114 | meta_data_path = os.path.join(config.data_dir, "meta_data.json")
115 | meta_data = json.load(open(meta_data_path, "r"))
116 |
117 | # Other parameters
118 | config.max_sent_size = meta_data['max_sent_size']
119 | config.max_fact_size = meta_data['max_fact_size']
120 | config.max_num_facts = meta_data['max_num_facts']
121 | config.num_choices = meta_data['num_choices']
122 | config.vocab_size = meta_data['vocab_size']
123 | config.word_size = meta_data['word_size']
124 |
125 |
126 | def main(_):
127 | if FLAGS.config == "None":
128 | config = get_config(FLAGS.__flags, {})
129 | else:
130 | config_path = os.path.join("configs", "%s%s" % (FLAGS.model_name, FLAGS.config_ext))
131 | config = get_config_from_file(FLAGS.__flags, config_path, FLAGS.config)
132 |
133 | load_meta_data(config)
134 | mkdirs(config)
135 |
136 | # load other files
137 | init_emb_mat_path = os.path.join(config.data_dir, 'init_emb_mat.h5')
138 | config.init_emb_mat = h5py.File(init_emb_mat_path, 'r')['data'][:]
139 |
140 | if config.train:
141 | train_ds = read_data(config, 'train')
142 | val_ds = read_data(config, 'val')
143 | else:
144 | test_ds = read_data(config, 'test')
145 |
146 |     # For quick draft initialization (debugging).
147 | if config.draft:
148 | config.train_num_batches = 1
149 | config.val_num_batches = 1
150 | config.test_num_batches = 1
151 | config.num_epochs = 1
152 | config.val_period = 1
153 | config.save_period = 1
154 | config.num_layers = 1
155 | config.rnn_num_layers = 1
156 |
157 | pprint(config.__dict__)
158 |
159 | eval_tensor_names = ['yp', 'p']
160 |
161 | graph = tf.Graph()
162 | towers = [Tower(config) for _ in range(config.num_devices)]
163 | sess = tf.Session(graph=graph, config=tf.ConfigProto(allow_soft_placement=True))
164 | runner = Runner(config, sess, towers)
165 | with graph.as_default(), tf.device("/cpu:0"):
166 | runner.initialize()
167 | if config.train:
168 | if config.load:
169 | runner.load()
170 | runner.train(train_ds, val_ds, eval_tensor_names=eval_tensor_names)
171 | else:
172 | runner.load()
173 | runner.eval(test_ds, eval_tensor_names=eval_tensor_names)
174 |
175 |
176 | if __name__ == "__main__":
177 | tf.app.run()
178 |
--------------------------------------------------------------------------------
/models/__init__.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/allenai/dqa-net/c497e950017e69952deae3ffd07920a5bfb9f468/models/__init__.py
--------------------------------------------------------------------------------
/models/bm05.py:
--------------------------------------------------------------------------------
1 | import itertools
2 | import json
3 | import os
4 |
5 | import numpy as np
6 | import tensorflow as tf
7 |
8 | from my.tensorflow import average_gradients
9 | from read_data.r05 import DataSet
10 | from utils import get_pbar
11 |
12 |
13 | class BaseRunner(object):
14 | def __init__(self, params, sess, towers):
15 | assert isinstance(sess, tf.Session)
16 | self.sess = sess
17 | self.params = params
18 | self.towers = towers
19 | self.num_towers = len(towers)
20 | self.placeholders = {}
21 | self.tensors = {}
22 | self.saver = None
23 | self.writer = None
24 | self.initialized = False
25 |
26 | def initialize(self):
27 | params = self.params
28 | sess = self.sess
29 | device_type = params.device_type
30 | summaries = []
31 |
32 | global_step = tf.get_variable('global_step', shape=[], dtype='int32',
33 | initializer=tf.constant_initializer(0), trainable=False)
34 | self.tensors['global_step'] = global_step
35 |
36 | epoch = tf.get_variable('epoch', shape=[], dtype='int32',
37 | initializer=tf.constant_initializer(0), trainable=False)
38 | self.tensors['epoch'] = epoch
39 |
40 | learning_rate = tf.placeholder('float32', name='learning_rate')
41 | summaries.append(tf.scalar_summary("learning_rate", learning_rate))
42 | self.placeholders['learning_rate'] = learning_rate
43 |
44 | if params.opt == 'basic':
45 | opt = tf.train.GradientDescentOptimizer(learning_rate)
46 | elif params.opt == 'adagrad':
47 | opt = tf.train.AdagradOptimizer(learning_rate)
48 | else:
49 | raise Exception()
50 |
51 | grads_tensors = []
52 | correct_tensors = []
53 | loss_tensors = []
54 | for device_id, tower in enumerate(self.towers):
55 | with tf.device("/%s:%d" % (device_type, device_id)), tf.name_scope("%s_%d" % (device_type, device_id)) as scope:
56 | tower.initialize(scope)
57 | tf.get_variable_scope().reuse_variables()
58 | loss_tensor = tower.get_loss_tensor()
59 | loss_tensors.append(loss_tensor)
60 | correct_tensor = tower.get_correct_tensor()
61 | correct_tensors.append(correct_tensor)
62 | grads_tensor = opt.compute_gradients(loss_tensor)
63 | grads_tensors.append(grads_tensor)
64 |
65 | with tf.name_scope("gpu_sync"):
66 | loss_tensor = tf.reduce_mean(tf.pack(loss_tensors), 0, name='loss')
67 | correct_tensor = tf.concat(0, correct_tensors, name="correct")
68 | with tf.name_scope("average_gradients"):
69 | grads_tensor = average_gradients(grads_tensors)
70 |
71 | self.tensors['loss'] = loss_tensor
72 | self.tensors['correct'] = correct_tensor
73 | summaries.append(tf.scalar_summary(loss_tensor.op.name, loss_tensor))
74 |
75 | for grad, var in grads_tensor:
76 | if grad is not None:
77 | summaries.append(tf.histogram_summary(var.op.name+'/gradients', grad))
78 | self.tensors['grads'] = grads_tensor
79 |
80 | for var in tf.trainable_variables():
81 | summaries.append(tf.histogram_summary(var.op.name, var))
82 |
83 | apply_grads_op = opt.apply_gradients(grads_tensor, global_step=global_step)
84 |
85 | train_op = tf.group(apply_grads_op)
86 | self.tensors['train'] = train_op
87 |
88 | saver = tf.train.Saver(tf.all_variables())
89 | self.saver = saver
90 |
91 | summary_op = tf.merge_summary(summaries)
92 | self.tensors['summary'] = summary_op
93 |
94 | init_op = tf.initialize_all_variables()
95 | sess.run(init_op)
96 | self.writer = tf.train.SummaryWriter(params.log_dir, sess.graph)
97 | self.initialized = True
98 |
99 | def _get_feed_dict(self, batches, mode, **kwargs):
100 | placeholders = self.placeholders
101 | learning_rate_ph = placeholders['learning_rate']
102 | learning_rate = kwargs['learning_rate'] if mode == 'train' else 0.0
103 | feed_dict = {learning_rate_ph: learning_rate}
104 | for tower_idx, tower in enumerate(self.towers):
105 | batch = batches[tower_idx] if tower_idx < len(batches) else None
106 | cur_feed_dict = tower.get_feed_dict(batch, mode, **kwargs)
107 | feed_dict.update(cur_feed_dict)
108 | return feed_dict
109 |
110 | def _train_batches(self, batches, **kwargs):
111 | sess = self.sess
112 | tensors = self.tensors
113 | feed_dict = self._get_feed_dict(batches, 'train', **kwargs)
114 | ops = [tensors[name] for name in ['train', 'summary', 'global_step']]
115 | train, summary, global_step = sess.run(ops, feed_dict=feed_dict)
116 | return train, summary, global_step
117 |
118 | def _eval_batches(self, batches, eval_tensor_names=()):
119 | sess = self.sess
120 | tensors = self.tensors
121 | num_examples = sum(len(batch[0]) for batch in batches)
122 | feed_dict = self._get_feed_dict(batches, 'eval')
123 | ops = [tensors[name] for name in ['correct', 'loss', 'summary', 'global_step']]
124 | correct, loss, summary, global_step = sess.run(ops, feed_dict=feed_dict)
125 | num_corrects = np.sum(correct[:num_examples])
126 | if len(eval_tensor_names) > 0:
127 | valuess = [sess.run([tower.tensors[name] for name in eval_tensor_names], feed_dict=feed_dict)
128 | for tower in self.towers]
129 | else:
130 | valuess = [[]]
131 |
132 | return (num_corrects, loss, summary, global_step), valuess
133 |
134 | def train(self, train_data_set, val_data_set=None, eval_tensor_names=()):
135 | assert isinstance(train_data_set, DataSet)
136 | assert self.initialized, "Initialize tower before training."
137 | # TODO : allow partial batch
138 |
139 | sess = self.sess
140 | writer = self.writer
141 | params = self.params
142 | num_epochs = params.num_epochs
143 | num_batches = params.train_num_batches if params.train_num_batches >= 0 else train_data_set.get_num_batches(partial=False)
144 | num_iters_per_epoch = int(num_batches / self.num_towers)
145 | num_digits = int(np.log10(num_batches))
146 |
147 | epoch_op = self.tensors['epoch']
148 | epoch = sess.run(epoch_op)
149 | print("training %d epochs ... " % num_epochs)
150 | print("num iters per epoch: %d" % num_iters_per_epoch)
151 | print("starting from epoch %d." % (epoch+1))
152 | while epoch < num_epochs:
153 | train_args = self._get_train_args(epoch)
154 | pbar = get_pbar(num_iters_per_epoch, "epoch %s|" % str(epoch+1).zfill(num_digits)).start()
155 | for iter_idx in range(num_iters_per_epoch):
156 | batches = [train_data_set.get_next_labeled_batch() for _ in range(self.num_towers)]
157 | _, summary, global_step = self._train_batches(batches, **train_args)
158 | writer.add_summary(summary, global_step)
159 | pbar.update(iter_idx)
160 | pbar.finish()
161 | train_data_set.complete_epoch()
162 |
163 | assign_op = epoch_op.assign_add(1)
164 | _, epoch = sess.run([assign_op, epoch_op])
165 |
166 | if val_data_set and epoch % params.val_period == 0:
167 | self.eval(train_data_set, is_val=True, eval_tensor_names=eval_tensor_names)
168 | self.eval(val_data_set, is_val=True, eval_tensor_names=eval_tensor_names)
169 |
170 | if epoch % params.save_period == 0:
171 | self.save()
172 |
173 | def eval(self, data_set, is_val=False, eval_tensor_names=()):
174 | assert isinstance(data_set, DataSet)
175 | assert self.initialized, "Initialize tower before training."
176 |
177 | params = self.params
178 | sess = self.sess
179 | epoch_op = self.tensors['epoch']
180 | dn = data_set.get_num_batches(partial=True)
181 | if is_val:
182 | pn = params.val_num_batches
183 | num_batches = pn if 0 <= pn <= dn else dn
184 | else:
185 | pn = params.test_num_batches
186 | num_batches = pn if 0 <= pn <= dn else dn
187 | num_iters = int(np.ceil(num_batches / self.num_towers))
188 | num_corrects, total = 0, 0
189 | eval_values = []
190 | idxs = []
191 | losses = []
192 | N = data_set.batch_size * num_batches
193 | if N > data_set.num_examples:
194 | N = data_set.num_examples
195 | string = "eval on %s, N=%d|" % (data_set.name, N)
196 | pbar = get_pbar(num_iters, prefix=string).start()
197 | for iter_idx in range(num_iters):
198 | batches = []
199 | for _ in range(self.num_towers):
200 | if data_set.has_next_batch(partial=True):
201 | idxs.extend(data_set.get_batch_idxs(partial=True))
202 | batches.append(data_set.get_next_labeled_batch(partial=True))
203 | (cur_num_corrects, cur_loss, _, global_step), eval_value_batches = \
204 | self._eval_batches(batches, eval_tensor_names=eval_tensor_names)
205 | num_corrects += cur_num_corrects
206 | total += sum(len(batch[0]) for batch in batches)
207 | for eval_value_batch in eval_value_batches:
208 |                 eval_values.append([x.tolist() for x in eval_value_batch])  # numpy.ndarray.tolist()
209 | losses.append(cur_loss)
210 | pbar.update(iter_idx)
211 | pbar.finish()
212 | loss = np.mean(losses)
213 | data_set.reset()
214 |
215 | epoch = sess.run(epoch_op)
216 | print("at epoch %d: acc = %.2f%% = %d / %d, loss = %.4f" %
217 | (epoch, 100 * float(num_corrects)/total, num_corrects, total, loss))
218 |
219 | # For outputting eval json files
220 | ids = [data_set.idx2id[idx] for idx in idxs]
221 | zipped_eval_values = [list(itertools.chain(*each)) for each in zip(*eval_values)]
222 | values = {name: values for name, values in zip(eval_tensor_names, zipped_eval_values)}
223 | out = {'ids': ids, 'values': values}
224 | eval_path = os.path.join(params.eval_dir, "%s_%s.json" % (data_set.name, str(epoch).zfill(4)))
225 | json.dump(out, open(eval_path, 'w'))
226 |
227 | def _get_train_args(self, epoch_idx):
228 | params = self.params
229 | learning_rate = params.init_lr
230 |         # lr = init_lr * anneal_ratio ** (epoch // anneal_period)
231 |         anneal_period = params.anneal_period
232 |         anneal_ratio = params.anneal_ratio
233 | num_periods = int(epoch_idx / anneal_period)
234 | factor = anneal_ratio ** num_periods
235 | learning_rate *= factor
236 |
237 | train_args = {'learning_rate': learning_rate}
238 | return train_args
239 |
240 | def save(self):
241 | assert self.initialized, "Initialize tower before saving."
242 |
243 | sess = self.sess
244 | params = self.params
245 | save_dir = params.save_dir
246 | name = params.model_name
247 | global_step = self.tensors['global_step']
248 | print("saving model ...")
249 | save_path = os.path.join(save_dir, name)
250 | self.saver.save(sess, save_path, global_step)
251 | print("saving done.")
252 |
253 | def load(self):
254 | assert self.initialized, "Initialize tower before loading."
255 |
256 | sess = self.sess
257 | params = self.params
258 | save_dir = params.save_dir
259 | print("loading model ...")
260 | checkpoint = tf.train.get_checkpoint_state(save_dir)
261 | assert checkpoint is not None, "Cannot load checkpoint at %s" % save_dir
262 | self.saver.restore(sess, checkpoint.model_checkpoint_path)
263 | print("loading done.")
264 |
265 |
266 | class BaseTower(object):
267 | def __init__(self, params):
268 | self.params = params
269 | self.placeholders = {}
270 | self.tensors = {}
271 | self.default_initializer = tf.random_normal_initializer(params.init_mean, params.init_std)
272 |
273 | def initialize(self, scope):
274 | # Actual building
275 | # Separated so that GPU assignment can be done here.
276 | raise Exception("Implement this!")
277 |
278 | def get_correct_tensor(self):
279 | return self.tensors['correct']
280 |
281 | def get_loss_tensor(self):
282 | return self.tensors['loss']
283 |
284 | def get_feed_dict(self, batch, mode, **kwargs):
285 |         raise Exception("Implement this!")
286 |
--------------------------------------------------------------------------------
/models/m05.py:
--------------------------------------------------------------------------------
1 | from functools import reduce
2 | from operator import mul
3 |
4 | import numpy as np
5 | import tensorflow as tf
6 | from tensorflow.python.ops import rnn
7 |
8 | from models.bm05 import BaseTower, BaseRunner
9 | import my.rnn_cell
10 | import my.nn
11 |
12 |
13 |
14 | class Sentence(object):
15 | def __init__(self, shape, name='sentence'):
16 | self.name = name
17 | self.shape = shape
18 | self.x = tf.placeholder('int32', shape, name="%s" % name)
19 | self.x_mask = tf.placeholder('float', shape, name="%s_mask" % name)
20 | self.x_len = tf.placeholder('int16', shape[:-1], name="%s_len" % name)
21 | self.x_mask_aug = tf.expand_dims(self.x_mask, -1, name='%s_mask_aug' % name)
22 |
23 | def add(self, feed_dict, *batch):
24 | x, x_mask, x_len = batch
25 | feed_dict[self.x] = x
26 | feed_dict[self.x_mask] = x_mask
27 | feed_dict[self.x_len] = x_len
28 |
29 |
30 | class Memory(Sentence):
31 | def __init__(self, params, name='memory'):
32 | N, M, K = params.batch_size, params.max_num_facts, params.max_fact_size
33 | shape = [N, M, K]
34 | super(Memory, self).__init__(shape, name=name)
35 | self.m_mask = tf.placeholder('float', [N, M], name='m_mask')
36 |
37 | def add(self, feed_dict, *batch):
38 | x, x_mask, x_len, m_mask = batch
39 | super(Memory, self).add(feed_dict, x, x_mask, x_len)
40 | feed_dict[self.m_mask] = m_mask
41 |
42 |
43 | class PESentenceEncoder(object):
44 | def __init__(self, params, emb_mat):
45 | self.params = params
46 | V, d, L, e = params.vocab_size, params.hidden_size, params.rnn_num_layers, params.word_size
47 | # self.init_emb_mat = tf.get_variable("init_emb_mat", [self.V, self.d])
48 | emb_hidden_sizes = [d for _ in range(params.emb_num_layers)]
49 | prev_size = e
50 | for layer_idx in range(params.emb_num_layers):
51 | with tf.variable_scope("Ax_%d" % layer_idx):
52 | cur_size = emb_hidden_sizes[layer_idx]
53 | mat = tf.get_variable("mat_%d" % layer_idx, shape=[prev_size, cur_size])
54 | bias = tf.get_variable("bias_%d" % layer_idx, shape=[cur_size])
55 | emb_mat = tf.tanh(tf.matmul(emb_mat, mat) + bias)
56 | self.emb_mat = emb_mat # [V, d]
57 |
58 | def __call__(self, sentence, name='u'):
59 | assert isinstance(sentence, Sentence)
60 | params = self.params
61 | d, e = params.hidden_size, params.word_size
62 | J = sentence.shape[-1]
63 |
64 | def f(JJ, jj, dd, kk):
65 | return (1-float(jj)/JJ) - (float(kk)/dd)*(1-2.0*jj/JJ)
66 |
67 | def g(jj):
68 | return [f(J, jj, d, k) for k in range(d)]
69 |
70 | _l = [g(j) for j in range(J)]
71 | self.l = tf.constant(_l, shape=[J, d], name='l')
72 | assert isinstance(sentence, Sentence)
73 | Ax = tf.nn.embedding_lookup(self.emb_mat, sentence.x, name='Ax')
74 | # TODO : dimension transformation
75 | lAx = self.l * Ax
76 | lAx_masked = lAx * tf.expand_dims(sentence.x_mask, -1)
77 | m = tf.reduce_sum(lAx_masked, len(sentence.shape) - 1, name=name)
78 | return m
79 |
80 |
81 | class MeanEncoder(object):
82 | def __init__(self, params, emb_mat):
83 | self.params = params
84 | V, d, L, e = params.vocab_size, params.hidden_size, params.rnn_num_layers, params.word_size
85 | prev_size = e
86 | hidden_sizes = [d for _ in range(params.emb_num_layers)]
87 | for layer_idx in range(params.emb_num_layers):
88 | with tf.variable_scope("emb_%d" % layer_idx):
89 | cur_hidden_size = hidden_sizes[layer_idx]
90 | emb_mat = tf.tanh(my.nn.linear([V, prev_size], cur_hidden_size, emb_mat))
91 | prev_size = cur_hidden_size
92 | self.emb_mat = emb_mat
93 |
94 | def __call__(self, sentence, name='mean'):
95 | assert isinstance(sentence, Sentence)
96 | Ax = tf.nn.embedding_lookup(self.emb_mat, sentence.x) # [N, C, J, e]
97 | return tf.reduce_mean(Ax * sentence.x_mask_aug, len(sentence.shape)-1, name=name)
98 |
99 |
100 | class LSTMSentenceEncoder(object):
101 | def __init__(self, params, emb_mat):
102 | self.params = params
103 | V, d, L, e = params.vocab_size, params.hidden_size, params.rnn_num_layers, params.word_size
104 | prev_size = e
105 | hidden_sizes = [d for _ in range(params.emb_num_layers)]
106 | for layer_idx in range(params.emb_num_layers):
107 | with tf.variable_scope("emb_%d" % layer_idx):
108 | cur_hidden_size = hidden_sizes[layer_idx]
109 | emb_mat = tf.tanh(my.nn.linear([V, prev_size], cur_hidden_size, emb_mat))
110 | prev_size = cur_hidden_size
111 | self.emb_mat = emb_mat
112 |
113 | self.emb_hidden_sizes = [d for _ in range(params.emb_num_layers)]
114 | self.input_size = self.emb_hidden_sizes[-1] if self.emb_hidden_sizes else e
115 |
116 | if params.lstm == 'basic':
117 | self.first_cell = my.rnn_cell.BasicLSTMCell(d, input_size=self.input_size, forget_bias=params.forget_bias)
118 | self.second_cell = my.rnn_cell.BasicLSTMCell(d, forget_bias=params.forget_bias)
119 | elif params.lstm == 'regular':
120 | self.first_cell = tf.nn.rnn_cell.LSTMCell(d, self.input_size, cell_clip=params.cell_clip)
121 | self.second_cell = tf.nn.rnn_cell.LSTMCell(d, d, cell_clip=params.cell_clip)
122 | elif params.lstm == 'gru':
123 | self.first_cell = tf.nn.rnn_cell.GRUCell(d, input_size=self.input_size)
124 | self.second_cell = tf.nn.rnn_cell.GRUCell(d)
125 | else:
126 |             raise Exception("Invalid lstm: {}".format(params.lstm))
127 |
128 | if params.train and params.keep_prob < 1.0:
129 | self.first_cell = tf.nn.rnn_cell.DropoutWrapper(self.first_cell, input_keep_prob=params.keep_prob)
130 | self.cell = tf.nn.rnn_cell.MultiRNNCell([self.first_cell] + [self.second_cell] * (L-1))
131 | self.scope = tf.get_variable_scope()
132 | self.used = False
133 |
134 | def __call__(self, sentence, init_hidden_state=None, name='s'):
135 | params = self.params
136 | L, d = params.rnn_num_layers, params.hidden_size
137 | h_flat = self.get_last_hidden_state(sentence, init_hidden_state=init_hidden_state)
138 | if params.lstm in ['basic', 'regular']:
139 | h_last = tf.reshape(h_flat, sentence.shape[:-1] + [2*L*d])
140 | s = tf.identity(tf.split(2, 2*L, h_last)[2*L-1], name=name)
141 | elif params.lstm == 'gru':
142 | h_last = tf.reshape(h_flat, sentence.shape[:-1] + [L*d])
143 | s = tf.identity(tf.split(2, L, h_last)[L-1], name=name)
144 | else:
145 |             raise Exception("Invalid lstm: {}".format(params.lstm))
146 | return s
147 |
148 | def get_last_hidden_state(self, sentence, init_hidden_state=None):
149 | assert isinstance(sentence, Sentence)
150 | with tf.variable_scope(self.scope, reuse=self.used):
151 | J = sentence.shape[-1]
152 | Ax = tf.nn.embedding_lookup(self.emb_mat, sentence.x) # [N, C, J, e]
153 |
154 | F = reduce(mul, sentence.shape[:-1], 1)
155 | init_hidden_state = init_hidden_state or self.cell.zero_state(F, tf.float32)
156 | Ax_flat = tf.reshape(Ax, [F, J, self.input_size])
157 | x_len_flat = tf.reshape(sentence.x_len, [F])
158 |
159 | # Ax_flat_split = [tf.squeeze(x_flat_each, [1]) for x_flat_each in tf.split(1, J, Ax_flat)]
160 | o_flat, h_flat = rnn.dynamic_rnn(self.cell, Ax_flat, x_len_flat, initial_state=init_hidden_state)
161 | self.used = True
162 | return h_flat
163 |
164 |
165 | class Sim(object):
166 | def __init__(self, params, memory, encoder, u):
167 | N, C, R, d = params.batch_size, params.num_choices, params.max_num_facts, params.hidden_size
168 | f = encoder(memory, name='f')
169 | f_aug = tf.expand_dims(f, 1) # [N, 1, R, d]
170 | u_aug = tf.expand_dims(u, 2) # [N, C, 1, d]
171 | u_tiled = tf.tile(u_aug, [1, 1, R, 1])
172 | if params.sim_func == 'man_sim':
173 | uf = my.nn.man_sim([N, C, R, d], f_aug, u_tiled, name='uf') # [N, C, R]
174 | elif params.sim_func == 'dot':
175 | uf = tf.reduce_sum(u_tiled * f_aug, 3)
176 | else:
177 |             raise Exception("Invalid sim_func: {}".format(params.sim_func))
178 | logit = tf.reduce_max(uf, 2) # [N, C]
179 |
180 | f_mask_aug = tf.expand_dims(memory.m_mask, 1)
181 | p = my.nn.softmax_with_mask([N, C, R], uf, f_mask_aug, name='p')
182 | self.logit = logit
183 | self.p = p
184 |
185 |
186 | class Tower(BaseTower):
187 | def initialize(self, scope):
188 | params = self.params
189 | tensors = self.tensors
190 | placeholders = self.placeholders
191 |
192 | V, d, G = params.vocab_size, params.hidden_size, params.image_size
193 | N, C, J = params.batch_size, params.num_choices, params.max_sent_size
194 | e = params.word_size
195 |
196 | # initialize self
197 | # placeholders
198 | with tf.name_scope('ph'):
199 | s = Sentence([N, C, J], 's')
200 | f = Memory(params, 'f')
201 | image = tf.placeholder('float', [N, G], name='i')
202 | y = tf.placeholder('int8', [N, C], name='y')
203 | init_emb_mat = tf.placeholder('float', shape=[V, e], name='init_emb_mat')
204 | placeholders['s'] = s
205 | placeholders['f'] = f
206 | placeholders['image'] = image
207 | placeholders['y'] = y
208 | placeholders['init_emb_mat'] = init_emb_mat
209 |
210 | with tf.variable_scope('encoder'):
211 | if params.encoder == 'lstm':
212 | u_encoder = LSTMSentenceEncoder(params, init_emb_mat)
213 | elif params.encoder == 'mean':
214 | u_encoder = MeanEncoder(params, init_emb_mat)
215 | else:
216 | raise Exception("Invalid encoder: {}".format(params.encoder))
217 | # u_encoder = PESentenceEncoder(params, init_emb_mat)
218 | first_u = u_encoder(s, name='first_u')
219 |
220 | with tf.name_scope("main"):
221 | sim = Sim(params, f, u_encoder, first_u)
222 | tensors['p'] = sim.p
223 | if params.mode == 'dqanet':
224 | logit = sim.logit
225 | elif params.mode == 'vqa':
226 | image_trans_mat = tf.get_variable('I', shape=[G, d])
227 | image_trans_bias = tf.get_variable('bI', shape=[])
228 | g = tf.tanh(tf.matmul(image, image_trans_mat) + image_trans_bias, name='g') # [N, d]
229 | aug_g = tf.expand_dims(g, 2, name='aug_g') # [N, d, 1]
230 | logit = tf.squeeze(tf.batch_matmul(first_u, aug_g), [2]) # [N, C]
231 | else:
232 | raise Exception("Invalid mode: {}".format(params.mode))
233 | tensors['logit'] = logit
234 |
235 | with tf.variable_scope('yp'):
236 | yp = tf.nn.softmax(logit, name='yp') # [N, C]
237 | tensors['yp'] = yp
238 |
239 | with tf.name_scope('loss'):
240 | cross_entropy = tf.nn.softmax_cross_entropy_with_logits(logit, tf.cast(y, 'float'), name='cross_entropy')
241 | avg_cross_entropy = tf.reduce_mean(cross_entropy, 0, name='avg_cross_entropy')
242 | tf.add_to_collection('losses', avg_cross_entropy)
243 | loss = tf.add_n(tf.get_collection('losses', scope), name='loss')
244 | tensors['loss'] = loss
245 |
246 | with tf.name_scope('acc'):
247 | correct_vec = tf.equal(tf.argmax(yp, 1), tf.argmax(y, 1))
248 | num_corrects = tf.reduce_sum(tf.cast(correct_vec, 'float'), name='num_corrects')
249 | acc = tf.reduce_mean(tf.cast(correct_vec, 'float'), name='acc')
250 | tensors['correct'] = correct_vec
251 | tensors['num_corrects'] = num_corrects
252 | tensors['acc'] = acc
253 |
254 | def get_feed_dict(self, batch, mode, **kwargs):
255 | placeholders = self.placeholders
256 | if batch is None:
257 | assert mode != 'train', "Cannot pass empty batch during training, for now."
258 | sents_batch, facts_batch, images_batch, label_batch = None, None, None, None
259 | else:
260 | sents_batch, facts_batch, images_batch = batch[:-1]
261 | if len(batch) > 3:
262 | label_batch = batch[-1]
263 | else:
264 |                 label_batch = np.zeros([len(sents_batch)], dtype='int32')
265 | s = self._prepro_sents_batch(sents_batch) # [N, C, J], [N, C]
266 | f = self._prepro_facts_batch(facts_batch)
267 | g = self._prepro_images_batch(images_batch)
268 | feed_dict = {placeholders['image']: g, placeholders['init_emb_mat']: self.params.init_emb_mat}
269 | if mode == 'train':
270 | y_batch = self._prepro_label_batch(label_batch)
271 | elif mode == 'eval':
272 | y_batch = self._prepro_label_batch(label_batch)
273 | else:
274 |             raise Exception("Invalid mode: {}".format(mode))
275 | feed_dict[placeholders['y']] = y_batch
276 | placeholders['s'].add(feed_dict, *s)
277 | placeholders['f'].add(feed_dict, *f)
278 | return feed_dict
279 |
280 | def _prepro_images_batch(self, images_batch):
281 | params = self.params
282 | N, G = params.batch_size, params.image_size
283 | g = np.zeros([N, G])
284 | if images_batch is None:
285 | return g
286 | g[:len(images_batch)] = images_batch
287 | return g
288 |
289 | def _prepro_sents_batch(self, sents_batch):
290 | p = self.params
291 | N, C, J = p.batch_size, p.num_choices, p.max_sent_size
292 | s_batch = np.zeros([N, C, J], dtype='int32')
293 | s_mask_batch = np.zeros([N, C, J], dtype='float')
294 | s_len_batch = np.zeros([N, C], dtype='int16')
295 | out = s_batch, s_mask_batch, s_len_batch
296 | if sents_batch is None:
297 | return out
298 | for n, sents in enumerate(sents_batch):
299 | for c, sent in enumerate(sents):
300 | for j, idx in enumerate(sent):
301 | s_batch[n, c, j] = idx
302 | s_mask_batch[n, c, j] = 1.0
303 | s_len_batch[n, c] = len(sent)
304 |
305 | return out
306 |
307 | def _prepro_facts_batch(self, facts_batch):
308 | p = self.params
309 | N, M, K = p.batch_size, p.max_num_facts, p.max_fact_size
310 | s_batch = np.zeros([N, M, K], dtype='int32')
311 | s_mask_batch = np.zeros([N, M, K], dtype='float')
312 | s_len_batch = np.zeros([N, M], dtype='int16')
313 | m_mask_batch = np.zeros([N, M], dtype='float')
314 | out = s_batch, s_mask_batch, s_len_batch, m_mask_batch
315 | if facts_batch is None:
316 | return out
317 | for n, sents in enumerate(facts_batch):
318 | for m, sent in enumerate(sents):
319 | for k, idx in enumerate(sent):
320 | s_batch[n, m, k] = idx
321 | s_mask_batch[n, m, k] = 1.0
322 | s_len_batch[n, m] = len(sent)
323 | m_mask_batch[n, m] = 1.0
324 | return out
325 |
326 | def _prepro_label_batch(self, label_batch):
327 | p = self.params
328 | N, C = p.batch_size, p.num_choices
329 | y = np.zeros([N, C], dtype='float')
330 | if label_batch is None:
331 | return y
332 | for i, label in enumerate(label_batch):
333 | y[i, label] = np.random.rand() * self.params.rand_y
334 | rand_other = (1.0 - self.params.rand_y)/(C-1)
335 | for cur in range(C):
336 | if cur != label:
337 | y[i, cur] = np.random.rand() * rand_other
338 | y[i] = y[i] / sum(y[i])
339 |
340 | return y
341 |
342 |
343 | class Runner(BaseRunner):
344 | def _get_train_args(self, epoch_idx):
345 | params = self.params
346 | learning_rate = params.init_lr
347 |
348 | anneal_period = params.anneal_period
349 | anneal_ratio = params.anneal_ratio
350 | num_periods = int(epoch_idx / anneal_period)
351 | factor = anneal_ratio ** num_periods
352 |
353 | if params.opt == 'basic':
354 | learning_rate *= factor
355 |
356 | train_args = {'learning_rate': learning_rate}
357 | return train_args
358 |
--------------------------------------------------------------------------------
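The f/g helpers inside PESentenceEncoder.__call__ above build the position-encoding weights l[j, k] = (1 - j/J) - (k/d)(1 - 2j/J) of End-to-End Memory Networks. A NumPy sketch of that matrix with toy sizes (J and d are made-up values):

import numpy as np

# Position-encoding weights as in PESentenceEncoder.__call__: word position j
# modulates each embedding dimension k before the weighted bag-of-words sum.
J, d = 4, 3  # toy sentence length and hidden size
j = np.arange(J, dtype=float)[:, None]
k = np.arange(d, dtype=float)[None, :]
l = (1.0 - j / J) - (k / d) * (1.0 - 2.0 * j / J)
print(l.shape)  # (4, 3); multiplied elementwise with the [J, d] word embeddings
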
/my/__init__.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/allenai/dqa-net/c497e950017e69952deae3ffd07920a5bfb9f468/my/__init__.py
--------------------------------------------------------------------------------
/my/nn.py:
--------------------------------------------------------------------------------
1 | """
2 | useful neural net modules
3 | """
4 | import operator
5 | from operator import mul
6 | from functools import reduce
7 |
8 | import tensorflow as tf
9 |
10 | VERY_SMALL_NUMBER = -1e10  # a large negative constant: added to masked logits so their softmax weight is ~0
11 |
12 |
13 | def softmax_with_mask(shape, x, mask, name=None):
14 | if name is None:
15 | name = softmax_with_mask.__name__
16 | x_masked = x + VERY_SMALL_NUMBER * (1.0 - mask)
17 | x_flat = tf.reshape(x_masked, [reduce(mul, shape[:-1], 1), shape[-1]])
18 | p_flat = tf.nn.softmax(x_flat)
19 | p = tf.reshape(p_flat, shape, name=name)
20 | return p
21 |
22 |
23 | def softmax_with_base(shape, base_untiled, x, mask=None, name='sig'):
24 | if mask is not None:
25 | x += VERY_SMALL_NUMBER * (1.0 - mask)
26 | base_shape = shape[:-1] + [1]
27 | for _ in shape:
28 | base_untiled = tf.expand_dims(base_untiled, -1)
29 | base = tf.tile(base_untiled, base_shape)
30 |
31 | c_shape = shape[:-1] + [shape[-1] + 1]
32 | c = tf.concat(len(shape)-1, [base, x])
33 | c_flat = tf.reshape(c, [reduce(mul, shape[:-1], 1), c_shape[-1]])
34 | p_flat = tf.nn.softmax(c_flat)
35 | p_cat = tf.reshape(p_flat, c_shape)
36 | s_aug = tf.slice(p_cat, [0 for _ in shape], [i for i in shape[:-1]] + [1])
37 | s = tf.squeeze(s_aug, [len(shape)-1])
38 |     sig = tf.sub(1.0, s, name=name)
39 | p = tf.slice(p_cat, [0 for _ in shape[:-1]] + [1], shape)
40 | return sig, p
41 |
42 |
43 | def man_sim(shape, u, v, name='man_sim'):
44 | """
45 | Manhattan similarity
46 | https://pdfs.semanticscholar.org/6812/fb9ef1c2dad497684a9020d8292041a639ff.pdf
47 | :param shape:
48 | :param u:
49 | :param v:
50 | :param name:
51 | :return:
52 | """
53 | dist = tf.reduce_sum(tf.abs(u - v), len(shape)-1)
54 | sim = tf.sub(0.0, dist, name=name)
55 | return sim
56 |
57 |
58 | def linear(input_shape, output_dim, input_, name="linear"):
59 | a = input_shape[-1]
60 | b = output_dim
61 | input_flat = tf.reshape(input_, [reduce(operator.mul, input_shape[:-1], 1), a])
62 | with tf.variable_scope(name):
63 | mat = tf.get_variable("mat", shape=[a, b])
64 | bias = tf.get_variable("bias", shape=[b])
65 | out_flat = tf.matmul(input_flat, mat) + bias
66 | out = tf.reshape(out_flat, input_shape[:-1] + [b])
67 | return out
68 |
69 |
70 |
--------------------------------------------------------------------------------
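softmax_with_mask above forces padded positions toward probability zero by adding a large negative constant to their logits before the softmax. A NumPy sketch of the same trick on a single row:

import numpy as np

# The masking trick in my/nn.py: adding -1e10 to masked-out logits makes
# their softmax probability effectively zero.
x = np.array([2.0, 1.0, 3.0])
mask = np.array([1.0, 1.0, 0.0])        # third position is padding
x_masked = x + (-1e10) * (1.0 - mask)
e = np.exp(x_masked - x_masked.max())   # subtract the max for stability
p = e / e.sum()
print(p)                                # ~[0.731, 0.269, 0.0]
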
/my/rnn_cell.py:
--------------------------------------------------------------------------------
1 | from __future__ import absolute_import
2 | from __future__ import division
3 | from __future__ import print_function
4 |
5 | import tensorflow as tf
6 | from tensorflow.python.ops.rnn_cell import RNNCell
7 |
8 |
9 | def linear(args, output_size, bias, bias_start=0.0, scope=None, var_on_cpu=True, wd=0.0):
10 | """Linear map: sum_i(args[i] * W[i]), where W[i] is a variable.
11 |
12 | Args:
13 | args: a 2D Tensor or a list of 2D, batch x n, Tensors.
14 | output_size: int, second dimension of W[i].
15 | bias: boolean, whether to add a bias term or not.
16 | bias_start: starting value to initialize the bias; 0 by default.
17 | scope: VariableScope for the created subgraph; defaults to "Linear".
18 |     var_on_cpu: if True, put the variables on /cpu:0.
19 |     wd: if nonzero, add L2 weight decay of this magnitude on the matrix.
20 | Returns:
21 | A 2D Tensor with shape [batch x output_size] equal to
22 | sum_i(args[i] * W[i]), where W[i]s are newly created matrices.
23 |
24 | Raises:
25 | ValueError: if some of the arguments has unspecified or wrong shape.
26 | """
27 | assert args
28 | if not isinstance(args, (list, tuple)):
29 | args = [args]
30 |
31 | # Calculate the total size of arguments on dimension 1.
32 | total_arg_size = 0
33 | shapes = [a.get_shape().as_list() for a in args]
34 | for shape in shapes:
35 | if len(shape) != 2:
36 | raise ValueError("Linear is expecting 2D arguments: %s" % str(shapes))
37 | if not shape[1]:
38 | raise ValueError("Linear expects shape[1] of arguments: %s" % str(shapes))
39 | else:
40 | total_arg_size += shape[1]
41 |
42 | # Now the computation.
43 | with tf.variable_scope(scope or "Linear"):
44 | if var_on_cpu:
45 | with tf.device("/cpu:0"):
46 | matrix = tf.get_variable("Matrix", [total_arg_size, output_size])
47 | else:
48 | matrix = tf.get_variable("Matrix", [total_arg_size, output_size])
49 | if wd:
50 | weight_decay = tf.mul(tf.nn.l2_loss(matrix), wd, name='weight_loss')
51 | tf.add_to_collection('losses', weight_decay)
52 |
53 |
54 | if len(args) == 1:
55 | res = tf.matmul(args[0], matrix)
56 | else:
57 | res = tf.matmul(tf.concat(1, args), matrix)
58 | if not bias:
59 | return res
60 |
61 | if var_on_cpu:
62 | with tf.device("/cpu:0"):
63 | bias_term = tf.get_variable(
64 | "Bias", [output_size],
65 | initializer=tf.constant_initializer(bias_start))
66 | else:
67 | bias_term = tf.get_variable(
68 | "Bias", [output_size],
69 | initializer=tf.constant_initializer(bias_start))
70 | return res + bias_term
71 |
72 |
73 | class BasicLSTMCell(RNNCell):
74 | """Basic LSTM recurrent network cell.
75 |
76 | The implementation is based on: http://arxiv.org/abs/1409.2329.
77 |
78 | We add forget_bias (default: 1) to the biases of the forget gate in order to
79 | reduce the scale of forgetting in the beginning of the training.
80 |
81 | It does not allow cell clipping, a projection layer, and does not
82 | use peep-hole connections: it is the basic baseline.
83 |
84 | For advanced models, please use the full LSTMCell that follows.
85 | """
86 |
87 | def __init__(self, num_units, forget_bias=1.0, input_size=None, var_on_cpu=True, wd=0.0):
88 | """Initialize the basic LSTM cell.
89 |
90 | Args:
91 | num_units: int, The number of units in the LSTM cell.
92 | forget_bias: float, The bias added to forget gates (see above).
93 | input_size: int, The dimensionality of the inputs into the LSTM cell,
94 | by default equal to num_units.
95 | """
96 | self._num_units = num_units
97 | self._input_size = num_units if input_size is None else input_size
98 | self._forget_bias = forget_bias
99 | self.var_on_cpu = var_on_cpu
100 | self.wd = wd
101 |
102 | @property
103 | def input_size(self):
104 | return self._input_size
105 |
106 | @property
107 | def output_size(self):
108 | return self._num_units
109 |
110 | @property
111 | def state_size(self):
112 | return 2 * self._num_units
113 |
114 | def __call__(self, inputs, state, name_scope=None):
115 | """Long short-term memory cell (LSTM)."""
116 | with tf.variable_scope(name_scope or type(self).__name__): # "BasicLSTMCell"
117 | # Parameters of gates are concatenated into one multiply for efficiency.
118 | c, h = tf.split(1, 2, state)
119 | concat = linear([inputs, h], 4 * self._num_units, True, var_on_cpu=self.var_on_cpu, wd=self.wd)
120 |
121 | # i = input_gate, j = new_input, f = forget_gate, o = output_gate
122 | i, j, f, o = tf.split(1, 4, concat)
123 |
124 | new_c = c * tf.sigmoid(f + self._forget_bias) + tf.sigmoid(i) * tf.tanh(j)
125 | new_h = tf.tanh(new_c) * tf.sigmoid(o)
126 |
127 | return new_h, tf.concat(1, [new_c, new_h])
128 |
--------------------------------------------------------------------------------
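The update in BasicLSTMCell.__call__ above is the standard LSTM step, with forget_bias added inside the forget-gate sigmoid so the cell initially tends to retain its state. A NumPy sketch of one step for a single unit, with made-up gate pre-activations:

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# One LSTM step as in BasicLSTMCell.__call__. i, j, f, o are the gate
# pre-activations produced by linear(); the numbers here are made up.
i, j, f, o = 0.5, 1.0, -0.2, 0.3
c = 0.1                  # previous cell state
forget_bias = 1.0        # biases the forget gate toward remembering

new_c = c * sigmoid(f + forget_bias) + sigmoid(i) * np.tanh(j)
new_h = np.tanh(new_c) * sigmoid(o)
print(new_c, new_h)
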
/my/tensorflow.py:
--------------------------------------------------------------------------------
1 | import tensorflow as tf
2 |
3 |
4 | def _variable_on_cpu(name, shape, initializer):
5 | """Helper to create a Variable stored on CPU memory.
6 |
7 | Args:
8 | name: name of the variable
9 | shape: list of ints
10 | initializer: initializer for Variable
11 |
12 | Returns:
13 | Variable Tensor
14 | """
15 | with tf.device('/cpu:0'):
16 | var = tf.get_variable(name, shape, initializer=initializer)
17 | return var
18 |
19 |
20 | def _variable_with_weight_decay(name, shape, stddev, wd):
21 | """Helper to create an initialized Variable with weight decay.
22 |
23 | Note that the Variable is initialized with a truncated normal distribution.
24 | A weight decay is added only if one is specified.
25 |
26 | Args:
27 | name: name of the variable
28 | shape: list of ints
29 | stddev: standard deviation of a truncated Gaussian
30 | wd: add L2Loss weight decay multiplied by this float. If None, weight
31 | decay is not added for this Variable.
32 |
33 | Returns:
34 | Variable Tensor
35 | """
36 | var = _variable_on_cpu(name, shape,
37 | tf.truncated_normal_initializer(stddev=stddev))
38 | if wd:
39 | weight_decay = tf.mul(tf.nn.l2_loss(var), wd, name='weight_loss')
40 | tf.add_to_collection('losses', weight_decay)
41 | return var
42 |
43 |
44 | def average_gradients(tower_grads):
45 | """Calculate the average gradient for each shared variable across all towers.
46 |
47 | Note that this function provides a synchronization point across all towers.
48 |
49 | Args:
50 | tower_grads: List of lists of (gradient, variable) tuples. The outer list
51 | is over individual gradients. The inner list is over the gradient
52 | calculation for each tower.
53 | Returns:
54 | List of pairs of (gradient, variable) where the gradient has been averaged
55 | across all towers.
56 | """
57 | average_grads = []
58 | for grad_and_vars in zip(*tower_grads):
59 | # Note that each grad_and_vars looks like the following:
60 | # ((grad0_gpu0, var0_gpu0), ... , (grad0_gpuN, var0_gpuN))
61 | grads = []
62 | for g, _ in grad_and_vars:
63 | # Add 0 dimension to the gradients to represent the tower.
64 | assert g is not None
65 | expanded_g = tf.expand_dims(g, 0)
66 |
67 | # Append on a 'tower' dimension which we will average over below.
68 | grads.append(expanded_g)
69 |
70 | # Average over the 'tower' dimension.
71 | grad = tf.concat(0, grads)
72 | grad = tf.reduce_mean(grad, 0)
73 |
74 | # Keep in mind that the Variables are redundant because they are shared
75 | # across towers. So .. we will just return the first tower's pointer to
76 | # the Variable.
77 | v = grad_and_vars[0][1]
78 | grad_and_var = (grad, v)
79 | average_grads.append(grad_and_var)
80 | return average_grads
81 |
--------------------------------------------------------------------------------
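average_gradients above synchronizes the towers by stacking each shared variable's per-tower gradients along a new axis and averaging over it. A NumPy sketch of that reduction for one variable across two hypothetical towers:

import numpy as np

# Same reduction as average_gradients, for one 2x2 weight shared by two
# towers; the gradient values are made up.
grad_tower0 = np.array([[1.0, 2.0], [3.0, 4.0]])
grad_tower1 = np.array([[3.0, 2.0], [1.0, 0.0]])

stacked = np.stack([grad_tower0, grad_tower1], axis=0)  # the expand_dims + concat
avg_grad = stacked.mean(axis=0)                          # tf.reduce_mean(grad, 0)
print(avg_grad)  # [[2. 2.] [2. 2.]]
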
/prepro/__init__.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/allenai/dqa-net/c497e950017e69952deae3ffd07920a5bfb9f468/prepro/__init__.py
--------------------------------------------------------------------------------
/prepro/__init__.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/allenai/dqa-net/c497e950017e69952deae3ffd07920a5bfb9f468/prepro/__init__.pyc
--------------------------------------------------------------------------------
/prepro/p05.py:
--------------------------------------------------------------------------------
1 | import argparse
2 | import os
3 | import json
4 | import shutil
5 | from collections import defaultdict, namedtuple
6 | import re
7 | import sys
8 | import random
9 | from pprint import pprint
10 |
11 | import h5py
12 | import numpy as np
13 |
14 | from utils import get_pbar
15 |
16 |
17 | def get_args():
18 | parser = argparse.ArgumentParser()
19 | parser.add_argument("--data_dir", default="/home/anglil/data/dqa/shining3")
20 | parser.add_argument("--target_dir", default="data/s3")
21 | parser.add_argument("--glove_path", default="/home/anglil/models/glove/glove.6B.300d.txt")
22 | parser.add_argument("--min_count", type=int, default=5)
23 | parser.add_argument("--vgg_model_path", default="~/models/vgg/vgg-19.caffemodel")
24 | parser.add_argument("--vgg_proto_path", default="~/models/vgg/vgg-19.prototxt")
25 | parser.add_argument("--debug", default='False')
26 | parser.add_argument("--qa2hypo", default='True')
27 | parser.add_argument("--qa2hypo_path", default="../dqa/qa2hypo")
28 | parser.add_argument("--prepro_images", default='True')
29 | return parser.parse_args()
30 |
31 |
32 | def qa2hypo(question, answer, flag, qa2hypo_path):
33 | if flag == 'True':
34 | # add qa2hypo_path to the Python path at runtime
35 | sys.path.insert(0, qa2hypo_path)
36 | sys.path.insert(0, qa2hypo_path+'/stanford-corenlp-python')
37 | from qa2hypo import qa2hypo as f
38 | return f(question, answer, False, True)
39 | # attach the answer to the question
40 | return "%s %s" % (question, answer)
41 |
42 |
43 | def _tokenize(raw):
44 | tokens = tuple(re.findall(r"[\w]+", raw))
45 | return tokens
46 |
47 |
48 | def _vadd(vocab_counter, word):
49 | word = word.lower()
50 | vocab_counter[word] += 1
51 |
52 |
53 | def _vget(vocab_dict, word):
54 | word = word.lower()
55 | if word in vocab_dict:
56 | return vocab_dict[word]
57 | else:
58 | return 0
59 |
60 |
61 | def _vlup(vocab_dict, words):
62 | return tuple(_vget(vocab_dict, word) for word in words)
63 |
64 |
65 | def _get(id_map, key):
66 | return id_map[key] if key in id_map else None
67 |
68 |
69 | def rel2text(id_map, rel):
70 | """
71 | Obtain text facts from the relation class.
72 | :param id_map:
73 | :param rel:
74 | :return:
75 | """
76 | TEMPLATES = ["%s links to %s.",
77 | "there is %s.",
78 | "the title is %s.",
79 | "%s describes region.",
80 | "there are %s %s.",
81 | "arrows objects regions 0 1 2 3 4 5 6 7 8 9",
82 |                  "%s and %s are related."]
83 | MAX_LABEL_SIZE = 3
84 | tup = rel[:3]
85 | o_keys, d_keys = rel[3:]
86 | if tup == ('interObject', 'linkage', 'objectToObject'):
87 | template = TEMPLATES[0]
88 | o = _get(id_map, o_keys[0]) if len(o_keys) else None
89 | d = _get(id_map, d_keys[0]) if len(d_keys) else None
90 | if not (o and d):
91 | return None
92 | o_words = _tokenize(o)
93 | d_words = _tokenize(d)
94 | if len(o_words) > MAX_LABEL_SIZE:
95 | o = "object"
96 | if len(d_words) > MAX_LABEL_SIZE:
97 | d = "object"
98 | text = template % (o, d)
99 | return text
100 |
101 | elif tup == ('intraObject', 'linkage', 'regionDescription'):
102 | template = TEMPLATES[3]
103 | o = _get(id_map, o_keys[0]) if len(o_keys) else None
104 | o = o or "an object"
105 | o_words = _tokenize(o)
106 | if len(o_words) > MAX_LABEL_SIZE:
107 | o = "an object"
108 | text = template % o
109 | return text
110 |
111 | elif tup == ('unary', '', 'regionDescriptionNoArrow'):
112 | template = TEMPLATES[3]
113 | o = _get(id_map, o_keys[0]) if len(o_keys) else None
114 | o = o or "an object"
115 | o_words = _tokenize(o)
116 | if len(o_words) > MAX_LABEL_SIZE:
117 | o = "an object"
118 | text = template % o
119 | return text
120 |
121 | elif tup[0] == 'unary' and tup[2] in ['objectLabel', 'ownObject']:
122 | template = TEMPLATES[1]
123 |         val = _get(id_map, o_keys[0])
124 | if val is not None:
125 | words = _tokenize(val)
126 | if len(words) > MAX_LABEL_SIZE:
127 | return val
128 | else:
129 | return template % val
130 |
131 | elif tup == ('unary', '', 'regionLabel'):
132 | template = TEMPLATES[1]
133 |         val = _get(id_map, o_keys[0])
134 | if val is not None:
135 | words = _tokenize(val)
136 | if len(words) > MAX_LABEL_SIZE:
137 | return val
138 | else:
139 | return template % val
140 |
141 | elif tup == ('unary', '', 'imageTitle'):
142 | template = TEMPLATES[2]
143 | val = _get(id_map, o_keys[0])
144 | return template % val
145 |
146 | elif tup == ('unary', '', 'sectionTitle'):
147 | template = TEMPLATES[2]
148 | val = _get(id_map, o_keys[0])
149 | return template % val
150 |
151 | elif tup[0] == 'count':
152 | template = TEMPLATES[4]
153 | category = tup[2]
154 | num = str(o_keys)
155 | return template % (num, category)
156 |
157 | elif tup[0] == 'unary':
158 | val = _get(id_map, o_keys[0])
159 | return val
160 |
161 | return None
162 |
163 |
164 | Relation = namedtuple('Relation', 'type subtype category origin destination')
165 | categories = set()
166 |
167 |
168 | def anno2rels(anno):
169 | types = set()
170 | rels = []
171 | # Unary relations
172 | for text_id, d in anno['text'].items():
173 | category = d['category'] if 'category' in d else ''
174 | categories.add(category)
175 | rel = Relation('unary', '', category, [text_id], '')
176 | rels.append(rel)
177 |
178 | # Counting
179 | if 'arrows' in anno and len(anno['arrows']) > 0:
180 | rels.append(Relation('count', '', 'stages', len(anno['arrows']), ''))
181 | if 'objects' in anno and len(anno['objects']) > 0:
182 | rels.append(Relation('count', '', 'objects', len(anno['objects']), ''))
183 |
184 | if 'relationships' not in anno:
185 | return rels
186 | for type_, d in anno['relationships'].items():
187 | for subtype, dd in d.items():
188 | for rel_id, ddd in dd.items():
189 | category = ddd['category']
190 | origin = ddd['origin'] if 'origin' in ddd else ""
191 | destination = ddd['destination'] if 'destination' in ddd else ""
192 | rel = Relation(type_, subtype, category, origin, destination)
193 | rels.append(rel)
194 | types.add((type_, subtype, category))
195 | return rels
196 |
197 |
198 | def _get_id_map(anno):
199 | id_map = {}
200 | if 'text' in anno:
201 | for key, d in anno['text'].items():
202 | id_map[key] = d['value']
203 | if 'objects' in anno:
204 | for key, d in anno['objects'].items():
205 | if 'text' in d and len(d['text']) > 0:
206 | new_key = d['text'][0]
207 | id_map[key] = id_map[new_key]
208 | if 'relationships' in anno:
209 | d = anno['relationships']
210 |         if 'intraObject' in d:
211 |             d = d['intraObject']
212 | if 'label' in d:
213 | d = d['label']
214 | for _, dd in d.items():
215 | category = dd['category']
216 | if category in ['arrowHeadTail', 'arrowDescriptor']:
217 | continue
218 | origin = dd['origin'][0]
219 | dest = dd['destination'][0]
220 | if origin.startswith("CT") or origin.startswith("T"):
221 | id_map[dest] = id_map[origin]
222 | elif dest.startswith("CT") or dest.startswith("T"):
223 | id_map[origin] = id_map[dest]
224 | return id_map
225 |
226 |
227 | def prepro_annos(args):
228 | """
229 | Transform DQA annotation.json -> a list of tokenized fact sentences for each image in json file
230 | The facts are indexed by image id.
231 | :param args:
232 | :return:
233 | """
234 | data_dir = args.data_dir
235 | target_dir = args.target_dir
236 |
237 | # For debugging
238 | if args.debug == 'True':
239 |         sents_path = os.path.join(target_dir, "raw_sents.json")
240 |         answers_path = os.path.join(target_dir, "answers.json")
241 | sentss_dict = json.load(open(sents_path, 'r'))
242 | answers_dict = json.load(open(answers_path, 'r'))
243 |
244 | facts_path = os.path.join(target_dir, "raw_facts.json")
245 | meta_data_path = os.path.join(target_dir, "meta_data.json")
246 | meta_data = json.load(open(meta_data_path, "r"))
247 | facts_dict = {}
248 | annos_dir = os.path.join(data_dir, "annotations")
249 | anno_names = [name for name in os.listdir(annos_dir) if name.endswith(".json")]
250 | max_num_facts = 0
251 | max_fact_size = 0
252 | pbar = get_pbar(len(anno_names)).start()
253 | for i, anno_name in enumerate(anno_names):
254 | image_name, _ = os.path.splitext(anno_name)
255 | image_id, _ = os.path.splitext(image_name)
256 | anno_path = os.path.join(annos_dir, anno_name)
257 | anno = json.load(open(anno_path, 'r'))
258 | rels = anno2rels(anno)
259 | id_map = _get_id_map(anno)
260 | text_facts = [rel2text(id_map, rel) for rel in rels]
261 | text_facts = list(set(_tokenize(fact) for fact in text_facts if fact is not None))
262 | max_fact_size = max([max_fact_size] + [len(fact) for fact in text_facts])
263 | # For debugging only
264 | if args.debug == 'True':
265 | if image_id in sentss_dict:
266 | correct_sents = [sents[answer] for sents, answer in zip(sentss_dict[image_id], answers_dict[image_id])]
267 | # indexed_facts.extend(correct_sents)
268 |                 # FIXME : this is a very strong prior!
269 | text_facts = correct_sents
270 | else:
271 | text_facts = []
272 | facts_dict[image_id] = text_facts
273 | max_num_facts = max(max_num_facts, len(text_facts))
274 | pbar.update(i)
275 |
276 | pbar.finish()
277 |
278 | meta_data['max_num_facts'] = max_num_facts
279 | meta_data['max_fact_size'] = max_fact_size
280 | print("number of facts: %d" % sum(len(facts) for facts in facts_dict.values()))
281 |     print("max num facts per image: %d" % max_num_facts)
282 | print("max fact size: %d" % max_fact_size)
283 | print("dumping json files ... ")
284 | json.dump(meta_data, open(meta_data_path, 'w'))
285 | json.dump(facts_dict, open(facts_path, 'w'))
286 | print("done")
287 |
288 |
289 | def prepro_questions(args):
290 | """
291 | transform DQA questions.json files -> single statements json and single answers json.
292 | sentences and answers are doubly indexed by image id first and then question number within that image (0 indexed)
293 | :param args:
294 | :return:
295 | """
296 | data_dir = args.data_dir
297 | target_dir = args.target_dir
298 | questions_dir = os.path.join(data_dir, "questions")
299 | raw_sents_path = os.path.join(target_dir, "raw_sents.json")
300 | answers_path = os.path.join(target_dir, "answers.json")
301 | meta_data_path = os.path.join(target_dir, "meta_data.json")
302 | meta_data = json.load(open(meta_data_path, "r"))
303 |
304 | sentss_dict = {}
305 | answers_dict = {}
306 |
307 | ques_names = sorted([name for name in os.listdir(questions_dir) if os.path.splitext(name)[1].endswith(".json")],
308 | key=lambda x: int(os.path.splitext(os.path.splitext(x)[0])[0]))
309 | num_choices = 0
310 | num_questions = 0
311 | max_sent_size = 0
312 | pbar = get_pbar(len(ques_names)).start()
313 | for i, ques_name in enumerate(ques_names):
314 | image_name, _ = os.path.splitext(ques_name)
315 | image_id, _ = os.path.splitext(image_name)
316 | sentss = []
317 | answers = []
318 | ques_path = os.path.join(questions_dir, ques_name)
319 | ques = json.load(open(ques_path, "r"))
320 | for ques_id, (ques_text, d) in enumerate(ques['questions'].items()):
321 | if d['abcLabel']:
322 | continue
323 | sents = [_tokenize(qa2hypo(ques_text, choice, args.qa2hypo, args.qa2hypo_path)) for choice in d['answerTexts']]
324 | max_sent_size = max(max_sent_size, max(len(sent) for sent in sents))
325 |             assert not num_choices or num_choices == len(sents), "number of choices doesn't match: %s" % ques_name
326 | num_choices = len(sents)
327 | sentss.append(sents)
328 | answers.append(d['correctAnswer'])
329 | num_questions += 1
330 | sentss_dict[image_id] = sentss
331 | answers_dict[image_id] = answers
332 | pbar.update(i)
333 | pbar.finish()
334 | meta_data['num_choices'] = num_choices
335 | meta_data['max_sent_size'] = max_sent_size
336 |
337 | print("number of questions: %d" % num_questions)
338 | print("number of choices: %d" % num_choices)
339 | print("max sent size: %d" % max_sent_size)
340 | print("dumping json file ... ")
341 | json.dump(sentss_dict, open(raw_sents_path, "w"))
342 | json.dump(answers_dict, open(answers_path, "w"))
343 | json.dump(meta_data, open(meta_data_path, "w"))
344 | print("done")
345 |
346 |
347 | def build_vocab(args):
348 | target_dir = args.target_dir
349 | vocab_path = os.path.join(target_dir, "vocab.json")
350 | emb_mat_path = os.path.join(target_dir, "init_emb_mat.h5")
351 | raw_sents_path = os.path.join(target_dir, "raw_sents.json")
352 | raw_facts_path = os.path.join(target_dir, "raw_facts.json")
353 | raw_sentss_dict = json.load(open(raw_sents_path, 'r'))
354 | raw_facts_dict = json.load(open(raw_facts_path, 'r'))
355 |
356 | meta_data_path = os.path.join(target_dir, "meta_data.json")
357 | meta_data = json.load(open(meta_data_path, 'r'))
358 | glove_path = args.glove_path
359 |
360 | word_counter = defaultdict(int)
361 |
362 | for image_id, raw_sentss in raw_sentss_dict.items():
363 | for raw_sents in raw_sentss:
364 | for raw_sent in raw_sents:
365 | for word in raw_sent:
366 | _vadd(word_counter, word)
367 |
368 | for image_id, raw_facts in raw_facts_dict.items():
369 | for raw_fact in raw_facts:
370 | for word in raw_fact:
371 | _vadd(word_counter, word)
372 |
373 | word_list, counts = zip(*sorted([pair for pair in word_counter.items()], key=lambda x: -x[1]))
374 | freq = 5
375 | print("top %d frequent words:" % freq)
376 | for word, count in zip(word_list[:freq], counts[:freq]):
377 | print("%r: %d" % (word, count))
378 |
379 | features = {}
380 | word_size = 0
381 | print("reading %s ... " % glove_path)
382 | with open(glove_path, 'r') as fp:
383 | for line in fp:
384 | array = line.lstrip().rstrip().split(" ")
385 | word = array[0]
386 | if word in word_counter:
387 | vector = list(map(float, array[1:]))
388 | features[word] = vector
389 | word_size = len(vector)
390 | print("done")
391 | vocab_word_list = [word for word in word_list if word in features]
392 | unknown_word_list = [word for word in word_list if word not in features]
393 | vocab_size = len(features) + 1
394 |
395 | f = h5py.File(emb_mat_path, 'w')
396 | emb_mat = f.create_dataset('data', [vocab_size, word_size], dtype='float')
397 | vocab = {}
398 | pbar = get_pbar(len(vocab_word_list)).start()
399 | for i, word in enumerate(vocab_word_list):
400 | emb_mat[i+1, :] = features[word]
401 | vocab[word] = i + 1
402 | pbar.update(i)
403 | pbar.finish()
404 | vocab['UNK'] = 0
405 |
406 | meta_data['vocab_size'] = vocab_size
407 | meta_data['word_size'] = word_size
408 | print("num of distinct words: %d" % len(word_counter))
409 | print("vocab size: %d" % vocab_size)
410 | print("word size: %d" % word_size)
411 |
412 | print("dumping json file ... ")
413 | f.close()
414 | json.dump(vocab, open(vocab_path, "w"))
415 | json.dump(meta_data, open(meta_data_path, "w"))
416 | print("done")
417 |
418 |
419 | def indexing(args):
420 | target_dir = args.target_dir
421 | vocab_path = os.path.join(target_dir, "vocab.json")
422 | raw_sents_path = os.path.join(target_dir, "raw_sents.json")
423 | raw_facts_path = os.path.join(target_dir, "raw_facts.json")
424 | sents_path = os.path.join(target_dir, "sents.json")
425 | facts_path = os.path.join(target_dir, "facts.json")
426 | vocab = json.load(open(vocab_path, 'r'))
427 | raw_sentss_dict = json.load(open(raw_sents_path, 'r'))
428 | raw_facts_dict = json.load(open(raw_facts_path, 'r'))
429 |
430 | sentss_dict = {image_id: [[_vlup(vocab, sent) for sent in sents] for sents in sentss] for image_id, sentss in raw_sentss_dict.items()}
431 | facts_dict = {image_id: [_vlup(vocab, fact) for fact in facts] for image_id, facts in raw_facts_dict.items()}
432 |
433 | print("dumping json files ... ")
434 | json.dump(sentss_dict, open(sents_path, 'w'))
435 | json.dump(facts_dict, open(facts_path, 'w'))
436 | print("done")
437 |
438 |
439 | def create_meta_data(args):
440 | target_dir = args.target_dir
441 | if not os.path.exists(target_dir):
442 | os.mkdir(target_dir)
443 | meta_data_path = os.path.join(target_dir, "meta_data.json")
444 | meta_data = {'data_dir': args.data_dir}
445 | json.dump(meta_data, open(meta_data_path, "w"))
446 |
447 |
448 | def create_image_ids_and_paths(args):
449 | if args.prepro_images == 'False':
450 | print("Skipping image preprocessing.")
451 | return
452 | data_dir = args.data_dir
453 | target_dir = args.target_dir
454 | images_dir = os.path.join(data_dir, "images")
455 | image_ids_path = os.path.join(target_dir, "image_ids.json")
456 | image_paths_path = os.path.join(target_dir, "image_paths.json")
457 | image_names = [name for name in os.listdir(images_dir) if name.endswith(".png")]
458 | image_ids = [os.path.splitext(name)[0] for name in image_names]
459 | ordered_image_ids = sorted(image_ids, key=lambda x: int(x))
460 | ordered_image_names = ["%s.png" % id_ for id_ in ordered_image_ids]
461 | print("dumping json files ... ")
462 | image_paths = [os.path.join(images_dir, name) for name in ordered_image_names]
463 | json.dump(ordered_image_ids, open(image_ids_path, "w"))
464 | json.dump(image_paths, open(image_paths_path, "w"))
465 | print("done")
466 |
467 |
468 | def prepro_images(args):
469 | if args.prepro_images == 'False':
470 | print("Skipping image preprocessing.")
471 | return
472 | model_path = args.vgg_model_path
473 | proto_path = args.vgg_proto_path
474 | out_path = os.path.join(args.target_dir, "images.h5")
475 | image_paths_path = os.path.join(args.target_dir, "image_paths.json")
476 | os.system("th prepro_images.lua --image_path_json %s --cnn_proto %s --cnn_model %s --out_path %s"
477 | % (image_paths_path, proto_path, model_path, out_path))
478 |
479 |
480 | def copy_folds(args):
481 | data_dir = args.data_dir
482 | target_dir = args.target_dir
483 | for num in range(1,6):
484 | from_folds_path = os.path.join(data_dir, "fold%d.json" % num)
485 | to_folds_path = os.path.join(target_dir, "fold%d.json" % num)
486 | shutil.copy(from_folds_path, to_folds_path)
487 |
488 |
489 | if __name__ == "__main__":
490 | ARGS = get_args()
491 | create_meta_data(ARGS)
492 | create_image_ids_and_paths(ARGS)
493 | prepro_questions(ARGS)
494 | prepro_annos(ARGS)
495 | build_vocab(ARGS)
496 | indexing(ARGS)
497 | prepro_images(ARGS)
498 |
--------------------------------------------------------------------------------
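The _vadd/_vget/_vlup helpers in p05.py above implement a lowercased vocabulary in which index 0 is reserved for unknown words (vocab['UNK'] = 0 in build_vocab). A small sketch of the lookup behavior, with a hypothetical vocabulary:

# Lookup behavior of the vocab helpers in prepro/p05.py: words are lowercased,
# known words map to indices >= 1, and unseen words fall back to 0 (UNK).
def _vget(vocab_dict, word):
    return vocab_dict.get(word.lower(), 0)

def _vlup(vocab_dict, words):
    return tuple(_vget(vocab_dict, word) for word in words)

vocab = {'UNK': 0, 'the': 1, 'water': 2, 'cycle': 3}  # hypothetical vocab
print(_vlup(vocab, ('The', 'water', 'cycle', 'diagram')))  # (1, 2, 3, 0)
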
/prepro_images.lua:
--------------------------------------------------------------------------------
1 | require 'nn'
2 | require 'optim'
3 | require 'torch'
4 | require 'nn'
5 | require 'math'
6 | require 'cunn'
7 | require 'cutorch'
8 | require 'loadcaffe'
9 | require 'image'
10 | require 'hdf5'
11 | cjson=require('cjson')
12 | require 'xlua'
13 |
14 | -------------------------------------------------------------------------------
15 | -- Input arguments and options
16 | -------------------------------------------------------------------------------
17 | cmd = torch.CmdLine()
18 | cmd:text()
19 | cmd:text('Options')
20 | cmd:option('--image_path_json', '', 'json containing ordered list of image paths')
21 | cmd:option('--cnn_proto', '', 'path to the cnn prototxt')
22 | cmd:option('--cnn_model', '', 'path to the cnn model')
23 | cmd:option('--batch_size', 10, 'batch_size')
24 |
25 | cmd:option('--out_path', 'image_rep.h5', 'output path')
26 | cmd:option('--gpuid', 1, 'which gpu to use. -1 = use CPU')
27 | cmd:option('--backend', 'cudnn', 'nn|cudnn')
28 |
29 | opt = cmd:parse(arg)
30 | print(opt)
31 |
32 | cutorch.setDevice(opt.gpuid)
33 | net=loadcaffe.load(opt.cnn_proto, opt.cnn_model,opt.backend);
34 | net:evaluate()
35 | net=net:cuda()
36 |
37 | function loadim(imname)
38 | local im, im2
39 | im=image.load(imname)
40 | im=image.scale(im,224,224)
41 | if im:size(1)==1 then
42 | im2=torch.cat(im,im,1)
43 | im2=torch.cat(im2,im,1)
44 | im=im2
45 | elseif im:size(1)==4 then
46 | im=im[{{1,3},{},{}}]
47 | end
48 | im=im*255;
49 | im2=im:clone()
50 | im2[{{3},{},{}}]=im[{{1},{},{}}]-123.68
51 | im2[{{2},{},{}}]=im[{{2},{},{}}]-116.779
52 | im2[{{1},{},{}}]=im[{{3},{},{}}]-103.939
53 | return im2
54 | end
55 |
56 | local image_path_json_file = io.open(opt.image_path_json, 'r')
57 | local image_path_json = cjson.decode(image_path_json_file:read())
58 | image_path_json_file:close()
59 |
60 | local image_path_list = {}
61 | for i,image_path in pairs(image_path_json) do
62 | table.insert(image_path_list, image_path)
63 | end
64 |
65 | local ndims=4096
66 | local batch_size = opt.batch_size
67 |
68 | local sz = #image_path_list
69 | local feat = torch.CudaTensor(sz, ndims)
70 | print(string.format('processing %d images...', sz))
71 | for i = 1, sz, batch_size do
72 | xlua.progress(i, sz)
73 | local r = math.min(sz, i + batch_size - 1)
74 | local ims = torch.CudaTensor(r-i+1, 3, 224, 224)
75 | for j = 1, r-i+1 do
76 | ims[j] = loadim(image_path_list[i+j-1]):cuda()
77 | end
78 | net:forward(ims)
79 | feat[{{i,r},{}}] = net.modules[43].output:clone()
80 | collectgarbage()
81 | end
82 |
83 | local h5_file = hdf5.open(opt.out_path, 'w')
84 | h5_file:write('/data', feat:float())
85 | h5_file:close()
86 |
--------------------------------------------------------------------------------
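loadim above resizes each image to 224x224, rescales to [0, 255], reorders RGB to BGR, and subtracts the Caffe VGG channel means (123.68, 116.779, 103.939). A NumPy sketch of the equivalent preprocessing, assuming an RGB float array in [0, 1] that has already been resized:

import numpy as np

# NumPy equivalent of loadim in prepro_images.lua; `im` is an RGB float array
# of shape [3, 224, 224] in [0, 1] (resizing is omitted here).
def vgg_preprocess(im):
    im = im * 255.0
    means = np.array([123.68, 116.779, 103.939])  # R, G, B means for Caffe VGG
    out = np.empty_like(im)
    out[2] = im[0] - means[0]  # BGR channel 2 <- mean-subtracted R
    out[1] = im[1] - means[1]  # G stays in the middle
    out[0] = im[2] - means[2]  # BGR channel 0 <- mean-subtracted B
    return out

print(vgg_preprocess(np.random.rand(3, 224, 224)).shape)  # (3, 224, 224)
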
/read_data/__init__.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/allenai/dqa-net/c497e950017e69952deae3ffd07920a5bfb9f468/read_data/__init__.py
--------------------------------------------------------------------------------
/read_data/r05.py:
--------------------------------------------------------------------------------
1 | import json
2 | import os
3 | from pprint import pprint
4 |
5 | import h5py
6 | import numpy as np
7 | import sys
8 |
9 | from configs.get_config import Config
10 |
11 |
12 | class DataSet(object):
13 | def __init__(self, name, batch_size, data, idxs, idx2id):
14 | self.name = name
15 | self.num_epochs_completed = 0
16 | self.idx_in_epoch = 0
17 | self.batch_size = batch_size
18 | self.data = data
19 | self.idxs = idxs
20 | self.idx2id = idx2id
21 | self.num_examples = len(idxs)
22 | self.num_full_batches = int(self.num_examples / self.batch_size)
23 | self.num_all_batches = self.num_full_batches + int(self.num_examples % self.batch_size > 0)
24 | self.reset()
25 |
26 | def get_num_batches(self, partial=False):
27 | return self.num_all_batches if partial else self.num_full_batches
28 |
29 | def get_batch_idxs(self, partial=False):
30 | assert self.has_next_batch(partial=partial), "End of data, reset required."
31 | from_, to = self.idx_in_epoch, self.idx_in_epoch + self.batch_size
32 | if partial and to > self.num_examples:
33 | to = self.num_examples
34 | cur_idxs = self.idxs[from_:to]
35 | return cur_idxs
36 |
37 | def get_next_labeled_batch(self, partial=False):
38 | cur_idxs = self.get_batch_idxs(partial=partial)
39 | batch = [[each[i] for i in cur_idxs] for each in self.data]
40 | self.idx_in_epoch += len(cur_idxs)
41 | return batch
42 |
43 | def has_next_batch(self, partial=False):
44 | if partial:
45 | return self.idx_in_epoch < self.num_examples
46 | return self.idx_in_epoch + self.batch_size <= self.num_examples
47 |
48 | def complete_epoch(self):
49 | self.reset()
50 | self.num_epochs_completed += 1
51 |
52 | def reset(self):
53 | self.idx_in_epoch = 0
54 | np.random.shuffle(self.idxs)
55 |
56 |
57 | def read_data(params, mode):
58 | print("loading {} data ... ".format(mode))
59 | data_dir = params.data_dir
60 |
61 | fold_path = params.fold_path
62 | fold = json.load(open(fold_path, 'r'))
63 | if mode in ['train', 'test']:
64 | cur_image_ids = fold[mode]
65 | elif mode == 'val':
66 | cur_image_ids = fold['test']
67 | else:
68 |         raise Exception("Invalid mode: {}".format(mode))
69 |
70 | sents_path = os.path.join(data_dir, "sents.json")
71 | facts_path = os.path.join(data_dir, "facts.json")
72 | answers_path = os.path.join(data_dir, "answers.json")
73 | images_path = os.path.join(data_dir, "images.h5")
74 | image_ids_path = os.path.join(data_dir, "image_ids.json")
75 |
76 | sentss_dict = json.load(open(sents_path, "r"))
77 | facts_dict = json.load(open(facts_path, "r"))
78 | answers_dict = json.load(open(answers_path, "r"))
79 | images_h5 = h5py.File(images_path, 'r')
80 | all_image_ids = json.load(open(image_ids_path, 'r'))
81 | image_id2idx = {id_: idx for idx, id_ in enumerate(all_image_ids)}
82 |
83 | batch_size = params.batch_size
84 | sentss, answers, factss, images = [], [], [], []
85 | idx = 0
86 | idx2id = []
87 | for image_id in cur_image_ids:
88 | if image_id not in sentss_dict or image_id not in facts_dict:
89 | continue
90 | facts = facts_dict[image_id]
91 | image = images_h5['data'][image_id2idx[image_id]]
92 | for sent_id, (sents, answer) in enumerate(zip(sentss_dict[image_id], answers_dict[image_id])):
93 | sentss.append(sents)
94 | answers.append(answer)
95 | factss.append(facts)
96 | images.append(image)
97 | idx2id.append([image_id, sent_id])
98 | idx += 1
99 |
100 | data = [sentss, factss, images, answers]
101 | idxs = np.arange(len(answers))
102 | data_set = DataSet(mode, batch_size, data, idxs, idx2id)
103 | print("done")
104 | return data_set
105 |
106 |
107 | if __name__ == "__main__":
108 | params = Config()
109 | params.batch_size = 2
110 | params.train = True
111 |
--------------------------------------------------------------------------------
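DataSet above is a shuffling batch iterator: has_next_batch/get_next_labeled_batch walk one epoch and complete_epoch reshuffles for the next. A usage sketch with toy fields standing in for the real [sentss, factss, images, answers] data (assumes DataSet from r05.py is in scope):

import numpy as np

# Toy walk through one epoch of read_data.r05.DataSet.
data = [list(range(10)), list('abcdefghij')]   # two parallel fields, 10 examples
idxs = np.arange(10)
idx2id = [[str(i), 0] for i in range(10)]      # [image_id, sent_id] pairs
ds = DataSet('train', 4, data, idxs, idx2id)   # batch_size = 4

while ds.has_next_batch(partial=True):         # partial=True keeps the last short batch
    field0, field1 = ds.get_next_labeled_batch(partial=True)
    print(len(field0))                         # 4, 4, 2
ds.complete_epoch()                            # reset and reshuffle for the next epoch
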
/requirements.txt:
--------------------------------------------------------------------------------
1 | numpy
2 | progressbar2
3 | nltk
4 | tensorflow
5 | h5py
6 |
--------------------------------------------------------------------------------
/tmp/__init__.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/allenai/dqa-net/c497e950017e69952deae3ffd07920a5bfb9f468/tmp/__init__.py
--------------------------------------------------------------------------------
/tmp/sim_test.py:
--------------------------------------------------------------------------------
1 | import argparse
2 | import json
3 | import os
4 |
5 | import itertools
6 | from collections import defaultdict
7 |
8 | import numpy as np
9 | import matplotlib.pyplot as plt
10 |
11 | from utils import get_pbar
12 |
13 |
14 | def get_args():
15 | parser = argparse.ArgumentParser()
16 | parser.add_argument("first_dir")
17 | parser.add_argument("second_dir")
18 | return parser.parse_args()
19 |
20 | def sim_test(args):
21 | first_dir = args.first_dir
22 | second_dir = args.second_dir
23 | first_sents_path = os.path.join(first_dir, "sents.json")
24 | second_sents_path = os.path.join(second_dir, "sents.json")
25 | vocab_path = os.path.join(first_dir, "vocab.json")
26 | vocab = json.load(open(vocab_path, 'r'))
27 | inv_vocab = {idx: word for word, idx in vocab.items()}
28 | first_sents = json.load(open(first_sents_path, "r"))
29 | second_sents = json.load(open(second_sents_path, "r"))
30 | diff_dict = defaultdict(int)
31 | pbar = get_pbar(len(first_sents)).start()
32 | i = 0
33 | for first_id, sents1 in first_sents.items():
34 | text1 = sent_to_text(inv_vocab, sents1[0])
35 | min_second_id, diff = min([[second_id, cdiff(sents1, sents2, len(vocab))] for second_id, sents2 in second_sents.items()],
36 | key=lambda x: x[1])
37 | text2 = sent_to_text(inv_vocab, second_sents[min_second_id][0])
38 | diff_dict[diff] += 1
39 | """
40 | if diff <= 3:
41 | print("%s, %s, %d" % (text1, text2, diff))
42 | """
43 | pbar.update(i)
44 | i += 1
45 | pbar.finish()
46 | json.dump(diff_dict, open("diff_dict.json", "w"))
47 |
48 | def sent_to_text(vocab, sent):
49 | return " ".join(vocab[idx] for idx in sent)
50 |
51 | def sent_to_bow(sent, l):
52 | out = np.zeros([l])
53 | for idx in sent:
54 | out[idx] = 1.0
55 | return out
56 |
57 | def temp():
58 | a = {"0.0": 128, "1.0": 61, "2.0": 181, "3.0": 152, "4.0": 170, "5.0": 144, "6.0": 128, "7.0": 120, "8.0": 70, "9.0": 50, "10.0": 44, "11.0": 22, "12.0": 19, "13.0": 17, "14.0": 3, "15.0": 4, "16.0": 3, "18.0": 2, "22.0": 1, "24.0": 1, "27.0": 1}
59 |     keys = sorted(a.keys(), key=float)
60 |     plt.plot([float(key) for key in keys], [a[key] for key in keys])
61 |
62 |
63 |
64 | def diff(sent1, sent2, l):
65 | return np.sum(np.abs(sent_to_bow(sent1, l) - sent_to_bow(sent2, l)))
66 |
67 | def cdiff(sents1, sents2, l):
68 | return min(diff(sent1, sent2, l) for sent1, sent2 in itertools.product(sents1, sents2))
69 |
70 | if __name__ == "__main__":
71 | ARGS = get_args()
72 | sim_test(ARGS)
--------------------------------------------------------------------------------
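sim_test above scores sentence similarity as the L1 distance between binary bag-of-words vectors (diff), taking the best pairing across two answer-choice sets (cdiff). A worked sketch with a toy vocabulary of size 5:

import itertools
import numpy as np

# The bag-of-words distance from tmp/sim_test.py on toy index tuples.
def sent_to_bow(sent, l):
    out = np.zeros([l])
    for idx in sent:
        out[idx] = 1.0
    return out

def diff(sent1, sent2, l):
    return np.sum(np.abs(sent_to_bow(sent1, l) - sent_to_bow(sent2, l)))

print(diff((1, 2, 3), (1, 2, 4), 5))  # 2.0: words 3 and 4 each differ
# cdiff takes the minimum over all sentence pairs from the two choice sets:
sents1, sents2 = [(1, 2)], [(1, 2), (3, 4)]
print(min(diff(a, b, 5) for a, b in itertools.product(sents1, sents2)))  # 0.0
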
/tmp/simple.py:
--------------------------------------------------------------------------------
1 | import argparse
2 | import json
3 | from os import path, listdir
4 | from random import randint
5 |
6 | import networkx as nx
7 | import re
8 |
9 | from nltk.stem import PorterStemmer
10 |
11 | from utils import get_pbar
12 |
13 |
14 | def _get_args():
15 |     parser = argparse.ArgumentParser()
16 | parser.add_argument("data_dir")
17 | parser.add_argument("fold_path")
18 | return parser.parse_args()
19 |
20 |
21 | def _tokenize(raw):
22 | tokens = re.findall(r"[\w]+", raw)
23 | return tokens
24 |
25 | stem = True
26 | stemmer = PorterStemmer()
27 | def _normalize(word):
28 | word = word.lower()
29 | if stem:
30 | word = stemmer.stem(word)
31 | return word
32 |
33 | def load_all(data_dir):
34 | annos_dir = path.join(data_dir, 'annotations')
35 | images_dir = path.join(data_dir, 'images')
36 | questions_dir = path.join(data_dir, 'questions')
37 |
38 | anno_dict = {}
39 | questions_dict = {}
40 | choicess_dict = {}
41 | answers_dict = {}
42 |
43 | image_ids = sorted([path.splitext(name)[0] for name in listdir(images_dir) if name.endswith(".png")], key=lambda x: int(x))
44 | pbar = get_pbar(len(image_ids)).start()
45 | for i, image_id in enumerate(image_ids):
46 | json_name = "%s.png.json" % image_id
47 | anno_path = path.join(annos_dir, json_name)
48 | ques_path = path.join(questions_dir, json_name)
49 | if path.exists(anno_path) and path.exists(ques_path):
50 | anno = json.load(open(anno_path, "r"))
51 | ques = json.load(open(ques_path, "r"))
52 |
53 | questions = []
54 | choicess = []
55 | answers = []
56 | for question, d in ques['questions'].items():
57 | if not d['abcLabel']:
58 | choices = d['answerTexts']
59 | answer = d['correctAnswer']
60 | questions.append(question)
61 | choicess.append(choices)
62 | answers.append(answer)
63 |
64 | questions_dict[image_id] = questions
65 | choicess_dict[image_id] = choicess
66 | answers_dict[image_id] = answers
67 | anno_dict[image_id] = anno
68 | pbar.update(i)
69 | pbar.finish()
70 |
71 | return anno_dict, questions_dict, choicess_dict, answers_dict
72 |
73 |
74 | def _get_val(anno, key):
75 | first = key[0]
76 | if first == 'T':
77 | val = anno['text'][key]['value']
78 | val = _normalize(val)
79 | return val
80 | elif first == 'O':
81 | d = anno['objects'][key]
82 | if 'text' in d and len(d['text']) > 0:
83 | key = d['text'][0]
84 | return _get_val(anno, key)
85 | return None
86 | else:
87 | raise Exception(key)
88 |
89 |
90 | def create_graph(anno):
91 | graph = nx.Graph()
92 | try:
93 | d = anno['relationships']['interObject']['linkage']
94 |     except KeyError:
95 | return graph
96 | for dd in d.values():
97 | if dd['category'] == 'objectToObject':
98 | dest = _get_val(anno, dd['destination'][0])
99 | orig = _get_val(anno, dd['origin'][0])
100 | if dest and orig:
101 | graph.add_edge(dest, orig)
102 | return graph
103 |
104 |
105 | def find_node(graph, text):
106 | words = _tokenize(text)
107 | words = [_normalize(word) for word in words]
108 | for word in words:
109 | if word in graph.nodes():
110 | return word
111 | return None
112 |
113 |
114 | def guess(graph, question, choices):
115 | MAX = 9999
116 | SUBMAX = 999
117 | ques_node = find_node(graph, question)
118 | dists = []
119 | for choice in choices:
120 | choice_node = find_node(graph, choice)
121 | if ques_node is None and choice_node is None:
122 | dist = MAX
123 | elif ques_node is None and choice_node is not None:
124 | dist = SUBMAX
125 | elif ques_node is not None and choice_node is None:
126 | dist = MAX
127 | else:
128 | if nx.has_path(graph, ques_node, choice_node):
129 | pl = len(nx.shortest_path(graph, ques_node, choice_node))
130 | dist = pl
131 | else:
132 | dist = MAX
133 | dists.append(dist)
134 | answer, dist = min(enumerate(dists), key=lambda x: x[1])
135 | max_dist = max(dists)
136 | if dist == MAX:
137 | return None
138 | if dist == max_dist:
139 | return None
140 | return answer
141 |
142 |
143 | def evaluate(anno_dict, questions_dict, choicess_dict, answers_dict):
144 | total = 0
145 | correct = 0
146 | incorrect = 0
147 | guessed = 0
148 | pbar = get_pbar(len(anno_dict)).start()
149 | for i, (image_id, anno) in enumerate(anno_dict.items()):
150 | graph = create_graph(anno)
151 | questions = questions_dict[image_id]
152 |         choicess = choicess_dict[image_id]
153 | answers = answers_dict[image_id]
154 | for question, choices, answer in zip(questions, choicess, answers):
155 | total += 1
156 | a = guess(graph, question, choices)
157 | if a is None:
158 | guessed += 1
159 | elif answer == a:
160 | correct += 1
161 | else:
162 | incorrect += 1
163 | pbar.update(i)
164 | pbar.finish()
165 | print("expected accuracy: (0.25 * %d + %d)/%d = %.4f" % (guessed, correct, total, (0.25*guessed + correct)/total))
166 | print("precision: %d/%d = %.4f" % (correct, correct + incorrect, correct/(correct + incorrect)))
167 |
168 |
169 | def select(fold_path, *all_):
170 | fold = json.load(open(fold_path, 'r'))
171 | test_ids = fold['test']
172 | new_all = []
173 | for each in all_:
174 | new_each = {id_: each[id_] for id_ in test_ids if id_ in each}
175 | new_all.append(new_each)
176 | return new_all
177 |
178 |
179 | def main():
180 | args = _get_args()
181 | all_ = load_all(args.data_dir)
182 | selected = select(args.fold_path, *all_)
183 | evaluate(*selected)
184 |
185 | if __name__ == "__main__":
186 | main()
187 |
188 |
189 |
190 |
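191 | # Minimal sketch of how guess() behaves on a toy graph; the node names are
192 | # hypothetical and assumed to survive _normalize() unchanged:
193 | #
194 | #   g = nx.Graph()
195 | #   g.add_edge("hawk", "snake")
196 | #   g.add_edge("snake", "mouse")
197 | #   guess(g, "what does the hawk eat?", ["mouse", "grass"])  # -> 0
198 | #
199 | # "mouse" is reachable from "hawk" (shortest path of length 3) while "grass"
200 | # is absent (dist == MAX), so index 0 wins; if every choice tied, guess()
201 | # would abstain and return None.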
--------------------------------------------------------------------------------
/utils.py:
--------------------------------------------------------------------------------
1 | import json
2 |
3 | import progressbar as pb
4 |
5 |
6 | def get_pbar(num, prefix=""):
7 | assert isinstance(prefix, str)
8 | pbar = pb.ProgressBar(widgets=[prefix, pb.Percentage(), pb.Bar(), pb.ETA()], maxval=num)
9 | return pbar
10 |
11 |
12 | def json_pretty_dump(obj, fh):
13 | return json.dump(obj, fh, sort_keys=True, indent=2, separators=(',', ': '))
14 |
15 |
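16 | # Usage sketch for the two helpers above; `items` and `obj` are hypothetical:
17 | #
18 | #   pbar = get_pbar(len(items), prefix="prepro ").start()
19 | #   for i, item in enumerate(items):
20 | #       pbar.update(i)
21 | #   pbar.finish()
22 | #
23 | #   with open("out.json", "w") as fh:
24 | #       json_pretty_dump(obj, fh)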
--------------------------------------------------------------------------------
/vis/__init__.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/allenai/dqa-net/c497e950017e69952deae3ffd07920a5bfb9f468/vis/__init__.py
--------------------------------------------------------------------------------
/vis/list_dqa_questions.py:
--------------------------------------------------------------------------------
1 | import SimpleHTTPServer
2 | import SocketServer
3 | import argparse
4 | import json
5 | import os
6 | import shutil
7 |
8 | from jinja2 import Environment, FileSystemLoader
9 |
10 | from utils import get_pbar
11 |
12 |
13 | def get_args():
14 | parser = argparse.ArgumentParser()
15 | parser.add_argument("data_dir")
16 | parser.add_argument("--start", default=0, type=int)
17 | parser.add_argument("--stop", default=1500, type=int)
18 | parser.add_argument("--show_im", default='True')
19 | parser.add_argument("--im_width", type=int, default=200)
20 | parser.add_argument("--ext", type=str, default=".png")
21 | parser.add_argument("--template_name", type=str, default="list_dqa_questions.html")
22 | parser.add_argument("--port", type=int, default=8000)
23 | parser.add_argument("--host", type=str, default="0.0.0.0")
24 | parser.add_argument("--num_im", type=int, default=50)
25 | parser.add_argument("--open", type=str, default='False')
26 |
27 | return parser.parse_args()
28 |
29 |
30 | def list_dqa_questions(args):
31 | data_dir = args.data_dir
32 | images_dir = os.path.join(data_dir, "images")
33 | questions_dir = os.path.join(data_dir, "questions")
34 | annos_dir = os.path.join(data_dir, "annotations")
35 | _id = 0
36 | html_dir = "/tmp/list_dqa_questions_%d" % _id
37 | while os.path.exists(html_dir):
38 | _id += 1
39 | html_dir = "/tmp/list_dqa_questions_%d" % _id
40 |
41 | cur_dir = os.path.dirname(os.path.realpath(__file__))
42 | templates_dir = os.path.join(cur_dir, 'templates')
43 | env = Environment(loader=FileSystemLoader(templates_dir))
44 | template = env.get_template(args.template_name)
45 |
46 | if os.path.exists(html_dir):
47 | shutil.rmtree(html_dir)
48 | os.mkdir(html_dir)
49 |
50 | headers = ['image_id', 'question_id', 'image', 'question', 'choices', 'answer', 'annotations']
51 | rows = []
52 |     image_names = [name for name in os.listdir(images_dir) if name.endswith(args.ext)]
53 | image_names = sorted(image_names, key=lambda name: int(os.path.splitext(name)[0]))
54 | image_names = [name for name in image_names
55 | if name.endswith(args.ext) and args.start <= int(os.path.splitext(name)[0]) < args.stop]
56 | pbar = get_pbar(len(image_names)).start()
57 | for i, image_name in enumerate(image_names):
58 | image_id, _ = os.path.splitext(image_name)
59 | json_name = "%s.json" % image_name
60 | anno_path = os.path.join(annos_dir, json_name)
61 | question_path = os.path.join(questions_dir, json_name)
62 |         if os.path.exists(question_path) and os.path.exists(anno_path):
63 | question_dict = json.load(open(question_path, "rb"))
64 | anno_dict = json.load(open(anno_path, "rb"))
65 | for j, (question, d) in enumerate(question_dict['questions'].iteritems()):
66 | row = {'image_id': image_id,
67 | 'question_id': str(j),
68 | 'image_url': os.path.join("images" if not d['abcLabel'] else "imagesReplacedText", image_name),
69 | 'anno_url': os.path.join("annotations", json_name),
70 | 'question': question,
71 | 'choices': d['answerTexts'],
72 | 'answer': d['correctAnswer']}
73 | rows.append(row)
74 |
75 | if i % args.num_im == 0:
76 | html_path = os.path.join(html_dir, "%s.html" % str(image_id).zfill(8))
77 |
78 | if (i + 1) % args.num_im == 0 or (i + 1) == len(image_names):
79 | var_dict = {'title': "Question List",
80 | 'image_width': args.im_width,
81 | 'headers': headers,
82 | 'rows': rows,
83 | 'show_im': args.show_im}
84 | with open(html_path, "wb") as f:
85 | f.write(template.render(**var_dict).encode('UTF-8'))
86 | rows = []
87 | pbar.update(i)
88 | pbar.finish()
89 |
90 |
91 | os.system("ln -s %s/* %s" % (data_dir, html_dir))
92 | os.chdir(html_dir)
93 | port = args.port
94 | host = args.host
95 | # Overriding to suppress log message
96 | class MyHandler(SimpleHTTPServer.SimpleHTTPRequestHandler):
97 | def log_message(self, format, *args):
98 | pass
99 | handler = MyHandler
100 | httpd = SocketServer.TCPServer((host, port), handler)
101 | if args.open == 'True':
102 | os.system("open http://%s:%d" % (args.host, args.port))
103 | print("serving at %s:%d" % (host, port))
104 | httpd.serve_forever()
105 |
106 |
107 | if __name__ == "__main__":
108 | ARGS = get_args()
109 | list_dqa_questions(ARGS)
110 |
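111 | # This script targets Python 2 (SimpleHTTPServer, SocketServer, dict.iteritems).
112 | # Typical invocation, assuming data_dir contains the images/, questions/, and
113 | # annotations/ folders (the path below is hypothetical):
114 | #
115 | #   python vis/list_dqa_questions.py data/dqa --port 8000 --num_im 50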
--------------------------------------------------------------------------------
/vis/list_facts.py:
--------------------------------------------------------------------------------
1 | import shutil
2 |
3 | import SimpleHTTPServer
4 | import SocketServer
5 | import argparse
6 | import json
7 | import os
8 | from copy import deepcopy
9 |
10 | from jinja2 import Environment, FileSystemLoader
11 |
12 | from utils import get_pbar
13 |
14 |
15 | def get_args():
16 | parser = argparse.ArgumentParser()
17 | parser.add_argument("prepro_dir")
18 | parser.add_argument("--start", default=0, type=int)
19 | parser.add_argument("--stop", default=1500, type=int)
20 | parser.add_argument("--show_im", default='True')
21 | parser.add_argument("--im_width", type=int, default=200)
22 | parser.add_argument("--ext", type=str, default=".png")
23 | parser.add_argument("--template_name", type=str, default="list_facts.html")
24 | parser.add_argument("--num_im", type=int, default=50)
25 | parser.add_argument("--port", type=int, default=8000)
26 | parser.add_argument("--host", type=str, default="0.0.0.0")
27 | parser.add_argument("--open", type=str, default='False')
28 |
29 | args = parser.parse_args()
30 | return args
31 |
32 |
33 | def _decode_sent(decoder, sent):
34 | return " ".join(decoder[idx] for idx in sent)
35 |
36 |
37 |
38 | def list_facts(args):
39 | prepro_dir = args.prepro_dir
40 | meta_data_dir = os.path.join(prepro_dir, "meta_data.json")
41 | meta_data = json.load(open(meta_data_dir, "r"))
42 | data_dir = meta_data['data_dir']
43 | _id = 0
44 | html_dir = "/tmp/list_facts%d" % _id
45 | while os.path.exists(html_dir):
46 | _id += 1
47 | html_dir = "/tmp/list_facts%d" % _id
48 |
49 | images_dir = os.path.join(data_dir, 'images')
50 | annos_dir = os.path.join(data_dir, 'annotations')
51 |
52 | sents_path = os.path.join(prepro_dir, 'sents.json')
53 | facts_path = os.path.join(prepro_dir, 'facts.json')
54 | vocab_path = os.path.join(prepro_dir, 'vocab.json')
55 | answers_path = os.path.join(prepro_dir, 'answers.json')
56 | sentss_dict = json.load(open(sents_path, "r"))
57 | facts_dict = json.load(open(facts_path, "r"))
58 | vocab = json.load(open(vocab_path, "r"))
59 | answers_dict = json.load(open(answers_path, "r"))
60 | decoder = {idx: word for word, idx in vocab.items()}
61 |
62 | if os.path.exists(html_dir):
63 | shutil.rmtree(html_dir)
64 | os.mkdir(html_dir)
65 |
66 | cur_dir = os.path.dirname(os.path.realpath(__file__))
67 | templates_dir = os.path.join(cur_dir, 'templates')
68 | env = Environment(loader=FileSystemLoader(templates_dir))
69 | template = env.get_template(args.template_name)
70 |
71 | headers = ['iid', 'qid', 'image', 'sents', 'answer', 'annotations', 'relations']
72 | rows = []
73 | pbar = get_pbar(len(sentss_dict)).start()
74 | image_ids = sorted(sentss_dict.keys(), key=lambda x: int(x))
75 | for i, image_id in enumerate(image_ids):
76 | sentss = sentss_dict[image_id]
77 | answers = answers_dict[image_id]
78 | facts = facts_dict[image_id] if image_id in facts_dict else []
79 | decoded_facts = [_decode_sent(decoder, fact) for fact in facts]
80 | for question_id, (sents, answer) in enumerate(zip(sentss, answers)):
81 | image_name = "%s.png" % image_id
82 | json_name = "%s.json" % image_name
83 | image_url = os.path.join('images', image_name)
84 | anno_url = os.path.join('annotations', json_name)
85 | row = {'image_id': image_id,
86 | 'question_id': question_id,
87 | 'image_url': image_url,
88 | 'anno_url': anno_url,
89 | 'sents': [_decode_sent(decoder, sent) for sent in sents],
90 | 'answer': answer,
91 | 'facts': decoded_facts}
92 | rows.append(row)
93 |
94 | if i % args.num_im == 0:
95 | html_path = os.path.join(html_dir, "%s.html" % str(image_id).zfill(8))
96 |
97 | if (i + 1) % args.num_im == 0 or (i + 1) == len(image_ids):
98 | var_dict = {'title': "Question List",
99 | 'image_width': args.im_width,
100 | 'headers': headers,
101 | 'rows': rows,
102 |                         'show_im': args.show_im == 'True'}
103 | with open(html_path, "wb") as f:
104 | f.write(template.render(**var_dict).encode('UTF-8'))
105 | rows = []
106 | pbar.update(i)
107 | pbar.finish()
108 |
109 | os.system("ln -s %s/* %s" % (data_dir, html_dir))
110 | os.chdir(html_dir)
111 | port = args.port
112 | host = args.host
113 | # Overriding to suppress log message
114 | class MyHandler(SimpleHTTPServer.SimpleHTTPRequestHandler):
115 | def log_message(self, format, *args):
116 | pass
117 | handler = MyHandler
118 | httpd = SocketServer.TCPServer((host, port), handler)
119 | if args.open == 'True':
120 | os.system("open http://%s:%d" % (args.host, args.port))
121 | print("serving at %s:%d" % (host, port))
122 | httpd.serve_forever()
123 |
124 |
125 | if __name__ == "__main__":
126 | ARGS = get_args()
127 | list_facts(ARGS)
128 |
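129 | # Also a Python 2 script. Typical invocation, assuming prepro_dir holds the
130 | # meta_data.json, sents.json, facts.json, vocab.json, and answers.json files
131 | # produced by preprocessing (the path below is hypothetical):
132 | #
133 | #   python vis/list_facts.py data/prepro --show_im True --port 8000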
--------------------------------------------------------------------------------
/vis/list_relations.py:
--------------------------------------------------------------------------------
1 | import argparse
2 | import json
3 | import os
4 | from copy import deepcopy
5 |
6 | from jinja2 import Environment, FileSystemLoader
7 |
8 | from utils import get_pbar
9 |
10 |
11 | def get_args():
12 | parser = argparse.ArgumentParser()
13 | parser.add_argument("prepro_dir")
14 | parser.add_argument("--start", default=0, type=int)
15 | parser.add_argument("--stop", default=1500, type=int)
16 | parser.add_argument("--show_im", default='True')
17 | parser.add_argument("--im_width", type=int, default=200)
18 | parser.add_argument("--ext", type=str, default=".png")
19 | parser.add_argument("--html_path", type=str, default="/tmp/list_relations.html")
20 | parser.add_argument("--template_name", type=str, default="list_relations.html")
21 |
22 | args = parser.parse_args()
23 | return args
24 |
25 |
26 | def _decode_sent(decoder, sent):
27 | return " ".join(decoder[idx] for idx in sent)
28 |
29 |
30 | def _decode_relation(decoder, relation):
31 | new_relation = deepcopy(relation)
32 |     # Disabled: argument roles ('a1r', 'a2r') are not decoded for now.
33 |     # new_relation['a1r'] = _decode_sent(decoder, new_relation['a1r'])
34 |     # new_relation['a2r'] = _decode_sent(decoder, new_relation['a2r'])
35 |
36 | new_relation['a1'] = _decode_sent(decoder, new_relation['a1'])
37 | new_relation['a2'] = _decode_sent(decoder, new_relation['a2'])
38 | return new_relation
39 |
40 |
41 | def interpret_relations(args):
42 | prepro_dir = args.prepro_dir
43 | meta_data_dir = os.path.join(prepro_dir, "meta_data.json")
44 | meta_data = json.load(open(meta_data_dir, "r"))
45 | data_dir = meta_data['data_dir']
46 |
47 | images_dir = os.path.join(data_dir, 'images')
48 | annos_dir = os.path.join(data_dir, 'annotations')
49 | html_path = args.html_path
50 |
51 | sents_path = os.path.join(prepro_dir, 'sents.json')
52 | relations_path = os.path.join(prepro_dir, 'relations.json')
53 | vocab_path = os.path.join(prepro_dir, 'vocab.json')
54 | answers_path = os.path.join(prepro_dir, 'answers.json')
55 | sentss_dict = json.load(open(sents_path, "r"))
56 | relations_dict = json.load(open(relations_path, "r"))
57 | vocab = json.load(open(vocab_path, "r"))
58 | answers_dict = json.load(open(answers_path, "r"))
59 | decoder = {idx: word for word, idx in vocab.items()}
60 |
61 | headers = ['iid', 'qid', 'image', 'sents', 'answer', 'annotations', 'relations']
62 | rows = []
63 | pbar = get_pbar(len(sentss_dict)).start()
64 | image_ids = sorted(sentss_dict.keys(), key=lambda x: int(x))
65 | for i, image_id in enumerate(image_ids):
66 | sentss = sentss_dict[image_id]
67 | answers = answers_dict[image_id]
68 | relations = relations_dict[image_id]
69 | decoded_relations = [_decode_relation(decoder, relation) for relation in relations]
70 | for question_id, (sents, answer) in enumerate(zip(sentss, answers)):
71 | image_name = "%s.png" % image_id
72 | json_name = "%s.json" % image_name
73 | image_path = os.path.join(images_dir, image_name)
74 | anno_path = os.path.join(annos_dir, json_name)
75 | row = {'image_id': image_id,
76 | 'question_id': question_id,
77 | 'image_url': image_path,
78 | 'anno_url': anno_path,
79 | 'sents': [_decode_sent(decoder, sent) for sent in sents],
80 | 'answer': answer,
81 | 'relations': decoded_relations}
82 | rows.append(row)
83 | pbar.update(i)
84 | pbar.finish()
85 | var_dict = {'title': "Question List: %d - %d" % (args.start, args.stop - 1),
86 | 'image_width': args.im_width,
87 | 'headers': headers,
88 | 'rows': rows,
89 |                  'show_im': args.show_im == 'True'}
90 |
91 | cur_dir = os.path.dirname(os.path.realpath(__file__))
92 | templates_dir = os.path.join(cur_dir, 'templates')
93 | env = Environment(loader=FileSystemLoader(templates_dir))
94 | template = env.get_template(args.template_name)
95 | out = template.render(**var_dict)
96 | with open(html_path, "w") as f:
97 | f.write(out)
98 |
99 | os.system("open %s" % html_path)
100 |
101 |
102 | if __name__ == "__main__":
103 | ARGS = get_args()
104 | interpret_relations(ARGS)
105 |
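106 | # Unlike the server-based listers, this script renders one static HTML file and
107 | # opens it with the macOS `open` command. Typical invocation (hypothetical path):
108 | #
109 | #   python vis/list_relations.py data/prepro --html_path /tmp/list_relations.html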
--------------------------------------------------------------------------------
/vis/list_results.py:
--------------------------------------------------------------------------------
1 | import importlib
2 | import shutil
3 |
4 | import http.server
5 | import socketserver
6 | import argparse
7 | import json
8 | import os
9 | import numpy as np
10 | from copy import deepcopy
11 |
12 | from jinja2 import Environment, FileSystemLoader
13 |
14 | from utils import get_pbar
15 |
16 |
17 | def get_args():
18 | parser = argparse.ArgumentParser()
19 | parser.add_argument("model_num", type=int)
20 | parser.add_argument("config_name", type=str)
21 | parser.add_argument("data_type", type=str)
22 | parser.add_argument("epoch", type=int)
23 | parser.add_argument("--start", default=0, type=int)
24 | parser.add_argument("--stop", default=1500, type=int)
25 | parser.add_argument("--show_im", default='True')
26 | parser.add_argument("--im_width", type=int, default=200)
27 | parser.add_argument("--ext", type=str, default=".png")
28 | parser.add_argument("--template_name", type=str, default="list_results.html")
29 | parser.add_argument("--num_im", type=int, default=50)
30 | parser.add_argument("--port", type=int, default=8000)
31 | parser.add_argument("--host", type=str, default="0.0.0.0")
32 | parser.add_argument("--open", type=str, default='False')
33 |
34 | args = parser.parse_args()
35 | return args
36 |
37 |
38 | def _decode_sent(decoder, sent):
39 | return " ".join(decoder[idx] for idx in sent)
40 |
41 |
42 |
43 | def list_results(args):
44 | model_num = args.model_num
45 | config_name = args.config_name
46 | data_type = args.data_type
47 |     epoch = args.epoch
48 | configs_path = os.path.join("configs", "m%s.json" % str(model_num).zfill(2))
49 | configs = json.load(open(configs_path, 'r'))
50 | config = configs[config_name]
51 | evals_dir = os.path.join("evals", "m%s" % str(model_num).zfill(2), config_name)
52 | evals_name = "%s_%s.json" % (data_type, str(epoch).zfill(4))
53 | evals_path = os.path.join(evals_dir, evals_name)
54 | evals = json.load(open(evals_path, 'r'))
55 |
56 | fold_path = config['fold_path']
57 | fold = json.load(open(fold_path, 'r'))
58 | fold_data_type = 'test' if data_type == 'val' else data_type
59 | image_ids = sorted(fold[fold_data_type], key=lambda x: int(x))
60 |
61 | prepro_dir = config['data_dir']
62 | meta_data_dir = os.path.join(prepro_dir, "meta_data.json")
63 | meta_data = json.load(open(meta_data_dir, "r"))
64 | data_dir = meta_data['data_dir']
65 | _id = 0
66 | html_dir = "/tmp/list_results%d" % _id
67 | while os.path.exists(html_dir):
68 | _id += 1
69 | html_dir = "/tmp/list_results%d" % _id
70 |
71 | images_dir = os.path.join(data_dir, 'images')
72 | annos_dir = os.path.join(data_dir, 'annotations')
73 |
74 | sents_path = os.path.join(prepro_dir, 'sents.json')
75 | facts_path = os.path.join(prepro_dir, 'facts.json')
76 | vocab_path = os.path.join(prepro_dir, 'vocab.json')
77 | answers_path = os.path.join(prepro_dir, 'answers.json')
78 | sentss_dict = json.load(open(sents_path, "r"))
79 | facts_dict = json.load(open(facts_path, "r"))
80 | vocab = json.load(open(vocab_path, "r"))
81 | answers_dict = json.load(open(answers_path, "r"))
82 | decoder = {idx: word for word, idx in list(vocab.items())}
83 |
84 | if os.path.exists(html_dir):
85 | shutil.rmtree(html_dir)
86 | os.mkdir(html_dir)
87 |
88 | cur_dir = os.path.dirname(os.path.realpath(__file__))
89 | templates_dir = os.path.join(cur_dir, 'templates')
90 | env = Environment(loader=FileSystemLoader(templates_dir))
91 | template = env.get_template(args.template_name)
92 |
93 | eval_names = list(evals['values'].keys())
94 | eval_dd = {}
95 | for idx, id_ in enumerate(evals['ids']):
96 | eval_d = {}
97 | for name, d in list(evals['values'].items()):
98 | eval_d[name] = d[idx]
99 | eval_dd[tuple(id_)] = eval_d
100 |
101 | # headers = ['iid', 'qid', 'image', 'sents', 'answer', 'annotations', 'relations'] + eval_names
102 | headers = ['iid', 'qid', 'image', 'sents', 'annotations', 'relations', 'p', 'yp']
103 | rows = []
104 |     pbar = get_pbar(len(image_ids)).start()  # iterate over fold ids, not sentss_dict
105 | for i, image_id in enumerate(image_ids):
106 | if image_id not in sentss_dict:
107 | continue
108 | sentss = sentss_dict[image_id]
109 | answers = answers_dict[image_id]
110 | facts = facts_dict[image_id] if image_id in facts_dict else []
111 | decoded_facts = [_decode_sent(decoder, fact) for fact in facts]
112 | for question_id, (sents, answer) in enumerate(zip(sentss, answers)):
113 | eval_id = (image_id, question_id)
114 | eval_d = eval_dd[eval_id] if eval_id in eval_dd else None
115 |
116 | if eval_d:
117 | p_all = list(zip(*eval_d['p']))
118 | p = p_all[:len(decoded_facts)]
119 | p = [[float("%.3f" % x) for x in y] for y in p]
120 | yp = [float("%.3f" % x) for x in eval_d['yp']]
121 | else:
122 |                 p, yp = [], []
123 |
124 |             # evals = [eval_d[name] if eval_d else "" for name in eval_names]  # unused; also shadowed the evals dict loaded above
125 | image_name = "%s.png" % image_id
126 | json_name = "%s.json" % image_name
127 | image_url = os.path.join('images', image_name)
128 | anno_url = os.path.join('annotations', json_name)
129 | ap = np.argmax(yp) if len(yp) > 0 else 0
130 | correct = len(yp) > 0 and ap == answer
131 | row = {'image_id': image_id,
132 | 'question_id': question_id,
133 | 'image_url': image_url,
134 | 'anno_url': anno_url,
135 | 'sents': [_decode_sent(decoder, sent) for sent in sents],
136 | 'answer': answer,
137 | 'facts': decoded_facts,
138 | 'p': p,
139 | 'yp': yp,
140 |                    'ap': ap,  # reuse the argmax computed above
141 | 'correct': correct,
142 | }
143 |
144 | rows.append(row)
145 |
146 | if i % args.num_im == 0:
147 | html_path = os.path.join(html_dir, "%s.html" % str(image_id).zfill(8))
148 |
149 | if (i + 1) % args.num_im == 0 or (i + 1) == len(image_ids):
150 | var_dict = {'title': "Question List",
151 | 'image_width': args.im_width,
152 | 'headers': headers,
153 | 'rows': rows,
154 |                         'show_im': args.show_im == 'True'}
155 | with open(html_path, "wb") as f:
156 | f.write(template.render(**var_dict).encode('UTF-8'))
157 | rows = []
158 | pbar.update(i)
159 | pbar.finish()
160 |
161 | os.system("ln -s %s/* %s" % (data_dir, html_dir))
162 | os.chdir(html_dir)
163 | port = args.port
164 | host = args.host
165 | # Overriding to suppress log message
166 | class MyHandler(http.server.SimpleHTTPRequestHandler):
167 | def log_message(self, format, *args):
168 | pass
169 | handler = MyHandler
170 | httpd = socketserver.TCPServer((host, port), handler)
171 | if args.open == 'True':
172 | os.system("open http://%s:%d" % (args.host, args.port))
173 |     print("serving at %s:%d" % (host, port))
174 | httpd.serve_forever()
175 |
176 |
177 | if __name__ == "__main__":
178 | ARGS = get_args()
179 | list_results(ARGS)
180 |
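181 | # This lister is the Python 3 counterpart of the scripts above (http.server /
182 | # socketserver). Example invocation with hypothetical values: model number 5,
183 | # config name "default", the validation split, and epoch 50:
184 | #
185 | #   python vis/list_results.py 5 default val 50 --port 8000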
--------------------------------------------------------------------------------
/vis/list_vqa_questions.py:
--------------------------------------------------------------------------------
1 | import argparse
2 | import json
3 | import os
4 |
5 | from jinja2 import Environment, FileSystemLoader
6 |
7 | parser = argparse.ArgumentParser()
8 | parser.add_argument("root_dir")
9 | parser.add_argument("--images_dir", default='images')
10 | parser.add_argument("--questions_name", default='questions.json')
11 | parser.add_argument("--annotations_name", default="annotations.json")
12 | parser.add_argument("--start", default=0, type=int)
13 | parser.add_argument("--stop", default=-100, type=int)
14 | parser.add_argument("--html_path", default="/tmp/main.html")
15 | parser.add_argument("--image_width", default=200, type=int)
16 | parser.add_argument("--ext", default='.png')
17 | parser.add_argument("--prefix", default='')
18 | parser.add_argument("--zfill_width", default=12, type=int)
19 | parser.add_argument("--template_name", default="list_questions.html")
20 |
21 | ARGS = parser.parse_args()
22 |
23 | env = Environment(loader=FileSystemLoader(os.path.join(os.path.dirname(os.path.realpath(__file__)), 'templates')))
24 |
25 |
26 | def main(args):
27 | root_dir = args.root_dir
28 | images_dir = os.path.join(root_dir, args.images_dir)
29 | questions_path = os.path.join(root_dir, args.questions_name)
30 | annotations_path = os.path.join(root_dir, args.annotations_name)
31 | html_path = args.html_path
32 |
33 | def _get_image_url(image_id):
34 |         return os.path.join(images_dir, "%s%s%s" % (args.prefix, str(image_id).zfill(args.zfill_width), args.ext))  # str() guards against int ids
35 |
36 | questions_dict = json.load(open(questions_path, "rb"))
37 | annotations_dict = json.load(open(annotations_path, "rb"))
38 |
39 | headers = ['image_id', 'question_id', 'image', 'question', 'choices', 'answer']
40 | row_dict = {question['question_id']:
41 | {'image_id': question['image_id'],
42 | 'question_id': question['question_id'],
43 | 'image_url': _get_image_url(question['image_id']),
44 | 'question': question['question'],
45 | 'choices': question['multiple_choices'],
46 | 'answer': annotation['multiple_choice_answer']}
47 | for question, annotation in zip(questions_dict['questions'], annotations_dict['annotations'])}
48 |     idxs = range(args.start, args.stop)
49 |     rows = [row_dict[idx] for idx in idxs if idx in row_dict]  # question ids may be sparse
50 | template = env.get_template(args.template_name)
51 | vars_dict = {'title': "Question List: %d - %d" % (args.start, args.stop - 1),
52 | 'image_width': args.image_width,
53 | 'headers': headers,
54 |                  'rows': rows}
55 | out = template.render(**vars_dict)
56 | with open(html_path, "wb") as f:
57 | f.write(out.encode('UTF-8'))
58 |
59 | os.system("open %s" % html_path)
60 |
61 | if __name__ == "__main__":
62 | main(ARGS)
63 |
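64 | # Example invocation against a VQA-style directory containing images/,
65 | # questions.json, and annotations.json (the path below is hypothetical):
66 | #
67 | #   python vis/list_vqa_questions.py data/vqa --start 0 --stop 100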
--------------------------------------------------------------------------------
/vis/templates/list_dqa_questions.html:
--------------------------------------------------------------------------------
1 | <!DOCTYPE html>
2 | <html>
3 | <head>
4 | <meta charset="utf-8">
5 | <title>{{ title }}</title>
6 | </head>
7 | <body>
11 |
12 | <h1>{{ title }}</h1>
13 | <table border="1">
14 | <tr>
15 | {% for header in headers %}
16 | <th>{{ header }}</th>
17 | {% endfor %}
18 | </tr>
19 | {% for row in rows %}
20 | <tr>
21 | <td>{{ row.image_id }}</td>
22 | <td>{{ row.question_id }}</td>
23 | <td>
24 | {% if show_im == 'True' %}
25 | <img src="{{ row.image_url }}" width="{{ image_width }}">
26 | {% endif %}
27 | </td>
28 | <td>{{ row.question }}</td>
29 | <td>
30 | <ul>
31 | {% for choice in row.choices %}
32 | <li>{{ choice }}</li>
33 | {% endfor %}
34 | </ul>
35 | </td>
36 | <td>{{ row.answer }}</td>
37 | <td><a href="{{ row.anno_url }}">Open</a></td>
38 | </tr>
39 | {% endfor %}
40 | </table>
41 | </body>
42 | </html>
43 |
--------------------------------------------------------------------------------
/vis/templates/list_facts.html:
--------------------------------------------------------------------------------
1 | <!DOCTYPE html>
2 | <html>
3 | <head>
4 | <meta charset="utf-8">
5 | <title>{{ title }}</title>
6 | </head>
7 | <body>
11 |
12 | <h1>{{ title }}</h1>
13 | <table border="1">
14 | <tr>
15 | {% for header in headers %}
16 | <th>{{ header }}</th>
17 | {% endfor %}
18 | </tr>
19 | {% for row in rows %}
20 | <tr>
21 | <td>{{ row.image_id }}</td>
22 | <td>{{ row.question_id }}</td>
23 | <td>
24 | {% if show_im %}
25 | <img src="{{ row.image_url }}" width="{{ image_width }}">
26 | {% else %}
27 | Image hidden.
28 | {% endif %}
29 | </td>
30 | <td>
31 | <ul>
32 | {% for sent in row.sents %}
33 | <li>
34 | {{ sent }}
35 | </li>
36 | {% endfor %}
37 | </ul>
38 | </td>
39 | <td>{{ row.answer }}</td>
40 | <td><a href="{{ row.anno_url }}">Open</a></td>
41 | <td>
42 | <ul>
43 | {% for fact in row.facts %}
44 | <li>
45 | {{ fact }}
46 | </li>
47 | {% endfor %}
48 | </ul>
49 | </td>
50 | </tr>
51 | {% endfor %}
52 | </table>
53 | </body>
54 | </html>
55 |
--------------------------------------------------------------------------------
/vis/templates/list_questions.html:
--------------------------------------------------------------------------------
1 | <!DOCTYPE html>
2 | <html>
3 | <head>
4 | <meta charset="utf-8">
5 | <title>{{ title }}</title>
6 | </head>
7 | <body>
11 |
12 | <h1>{{ title }}</h1>
13 | <table border="1">
14 | <tr>
15 | {% for header in headers %}
16 | <th>{{ header }}</th>
17 | {% endfor %}
18 | </tr>
19 | {% for row in rows %}
20 | <tr>
21 | <td>{{ row.image_id }}</td>
22 | <td>{{ row.question_id }}</td>
23 | <td>
24 | <img src="{{ row.image_url }}" width="{{ image_width }}">
25 | </td>
26 | <td>{{ row.question }}</td>
27 | <td>
28 | <ul>
29 | {% for choice in row.choices %}
30 | <li>{{ choice }}</li>
31 | {% endfor %}
32 | </ul>
33 | </td>
34 | <td>{{ row.answer }}</td>
35 | </tr>
36 | {% endfor %}
37 | </table>
38 | </body>
39 | </html>
40 |
--------------------------------------------------------------------------------
/vis/templates/list_relations.html:
--------------------------------------------------------------------------------
1 | <!DOCTYPE html>
2 | <html>
3 | <head>
4 | <meta charset="utf-8">
5 | <title>{{ title }}</title>
6 | </head>
7 | <body>
11 |
12 | <h1>{{ title }}</h1>
13 | <table border="1">
14 | <tr>
15 | {% for header in headers %}
16 | <th>{{ header }}</th>
17 | {% endfor %}
18 | </tr>
19 | {% for row in rows %}
20 | <tr>
21 | <td>{{ row.image_id }}</td>
22 | <td>{{ row.question_id }}</td>
23 | <td>
24 | {% if show_im %}
25 | <img src="{{ row.image_url }}" width="{{ image_width }}">
26 | {% else %}
27 | Image hidden.
28 | {% endif %}
29 | </td>
30 | <td>
31 | <ul>
32 | {% for sent in row.sents %}
33 | <li>
34 | {{ sent }}
35 | </li>
36 | {% endfor %}
37 | </ul>
38 | </td>
39 | <td>{{ row.answer }}</td>
40 | <td><a href="{{ row.anno_url }}">Open</a></td>
41 | <td>
42 | <ul>
43 | {% for relation in row.relations %}
44 | <li>
45 | {{ relation.a1 }} -> {{ relation.a2 }}
46 | </li>
47 | {% endfor %}
48 | </ul>
49 | </td>
50 | </tr>
51 | {% endfor %}
52 | </table>
53 | </body>
54 | </html>
55 |
--------------------------------------------------------------------------------
/vis/templates/list_results.html:
--------------------------------------------------------------------------------
1 | <!DOCTYPE html>
2 | <html>
3 | <head>
4 | <meta charset="utf-8">
5 | <title>{{ title }}</title>
6 | </head>
7 | <body>
8 |
18 |
19 |
23 |
24 | <h1>{{ title }}</h1>
25 | <table border="1">
26 | <tr>
27 | {% for header in headers %}
28 | <th>{{ header }}</th>
29 | {% endfor %}
30 | </tr>
31 | {% for row in rows %}
32 | <tr>
33 | <td>{{ row.image_id }}</td>
34 | <td>{{ row.question_id }}</td>
35 | <td>
36 | {% if show_im %}
37 | <img src="{{ row.image_url }}" width="{{ image_width }}">
38 | {% else %}
39 | Image hidden.
40 | {% endif %}
41 | </td>
42 | <td>
43 | <ul>
44 | {% for sent in row.sents %}
45 | <li>
46 | {% if loop.index0 == row.answer %}
47 | <b>{{ sent }}</b>
48 | {% else %}
49 | {{ sent }}
50 | {% endif %}
51 | </li>
52 | {% endfor %}
53 | </ul>
54 | </td>
55 | <td><a href="{{ row.anno_url }}">Open</a></td>
56 | <td>
57 | <ul>
58 | {% for fact in row.facts %}
59 | <li>
60 | {{ fact }}
61 | </li>
62 | {% endfor %}
63 | </ul>
64 | </td>
65 | <td>
66 | <table>
67 | {% for a in row.p %}
68 | <tr>
69 | <td>{{ loop.index0 }}</td>
70 | {% for b in a %}
71 | <td>{{ b }}</td>
72 | {% endfor %}
73 | </tr>
74 | {% endfor %}
75 | </table>
76 | </td>
77 | {% if row.correct %}
78 | <td class="correct">
79 | {% else %}
80 | <td>
81 | {% endif %}
82 | <ul>
83 | {% for x in row.yp %}
84 | <li>
85 | {% if loop.index0 == row.ap %}
86 | <b>{{ x }}</b>
87 | {% else %}
88 | {{ x }}
89 | {% endif %}
90 | </li>
91 | {% endfor %}
92 | </ul>
93 | </td>
94 | </tr>
95 | {% endfor %}
96 | </table>
97 |
98 | </body>
99 | </html>
100 |
--------------------------------------------------------------------------------