├── LICENSE ├── README.md ├── configs ├── __init__.py ├── c04.py ├── get_config.py ├── json2tsv.py ├── m05.json ├── m05.tsv └── tsv2json.py ├── create_fold.py ├── download.sh ├── download2.sh ├── main ├── __init__.py └── x05.py ├── models ├── __init__.py ├── bm05.py └── m05.py ├── my ├── __init__.py ├── nn.py ├── rnn_cell.py └── tensorflow.py ├── prepro ├── __init__.py ├── __init__.pyc └── p05.py ├── prepro_images.lua ├── read_data ├── __init__.py └── r05.py ├── requirements.txt ├── tmp ├── __init__.py ├── sim_test.py └── simple.py ├── utils.py └── vis ├── __init__.py ├── list_dqa_questions.py ├── list_facts.py ├── list_relations.py ├── list_results.py ├── list_vqa_questions.py └── templates ├── list_dqa_questions.html ├── list_facts.html ├── list_questions.html ├── list_relations.html └── list_results.html /LICENSE: -------------------------------------------------------------------------------- 1 | Apache License 2 | Version 2.0, January 2004 3 | http://www.apache.org/licenses/ 4 | 5 | TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION 6 | 7 | 1. Definitions. 8 | 9 | "License" shall mean the terms and conditions for use, reproduction, 10 | and distribution as defined by Sections 1 through 9 of this document. 11 | 12 | "Licensor" shall mean the copyright owner or entity authorized by 13 | the copyright owner that is granting the License. 14 | 15 | "Legal Entity" shall mean the union of the acting entity and all 16 | other entities that control, are controlled by, or are under common 17 | control with that entity. For the purposes of this definition, 18 | "control" means (i) the power, direct or indirect, to cause the 19 | direction or management of such entity, whether by contract or 20 | otherwise, or (ii) ownership of fifty percent (50%) or more of the 21 | outstanding shares, or (iii) beneficial ownership of such entity. 22 | 23 | "You" (or "Your") shall mean an individual or Legal Entity 24 | exercising permissions granted by this License. 25 | 26 | "Source" form shall mean the preferred form for making modifications, 27 | including but not limited to software source code, documentation 28 | source, and configuration files. 29 | 30 | "Object" form shall mean any form resulting from mechanical 31 | transformation or translation of a Source form, including but 32 | not limited to compiled object code, generated documentation, 33 | and conversions to other media types. 34 | 35 | "Work" shall mean the work of authorship, whether in Source or 36 | Object form, made available under the License, as indicated by a 37 | copyright notice that is included in or attached to the work 38 | (an example is provided in the Appendix below). 39 | 40 | "Derivative Works" shall mean any work, whether in Source or Object 41 | form, that is based on (or derived from) the Work and for which the 42 | editorial revisions, annotations, elaborations, or other modifications 43 | represent, as a whole, an original work of authorship. For the purposes 44 | of this License, Derivative Works shall not include works that remain 45 | separable from, or merely link (or bind by name) to the interfaces of, 46 | the Work and Derivative Works thereof. 
47 | 48 | "Contribution" shall mean any work of authorship, including 49 | the original version of the Work and any modifications or additions 50 | to that Work or Derivative Works thereof, that is intentionally 51 | submitted to Licensor for inclusion in the Work by the copyright owner 52 | or by an individual or Legal Entity authorized to submit on behalf of 53 | the copyright owner. For the purposes of this definition, "submitted" 54 | means any form of electronic, verbal, or written communication sent 55 | to the Licensor or its representatives, including but not limited to 56 | communication on electronic mailing lists, source code control systems, 57 | and issue tracking systems that are managed by, or on behalf of, the 58 | Licensor for the purpose of discussing and improving the Work, but 59 | excluding communication that is conspicuously marked or otherwise 60 | designated in writing by the copyright owner as "Not a Contribution." 61 | 62 | "Contributor" shall mean Licensor and any individual or Legal Entity 63 | on behalf of whom a Contribution has been received by Licensor and 64 | subsequently incorporated within the Work. 65 | 66 | 2. Grant of Copyright License. Subject to the terms and conditions of 67 | this License, each Contributor hereby grants to You a perpetual, 68 | worldwide, non-exclusive, no-charge, royalty-free, irrevocable 69 | copyright license to reproduce, prepare Derivative Works of, 70 | publicly display, publicly perform, sublicense, and distribute the 71 | Work and such Derivative Works in Source or Object form. 72 | 73 | 3. Grant of Patent License. Subject to the terms and conditions of 74 | this License, each Contributor hereby grants to You a perpetual, 75 | worldwide, non-exclusive, no-charge, royalty-free, irrevocable 76 | (except as stated in this section) patent license to make, have made, 77 | use, offer to sell, sell, import, and otherwise transfer the Work, 78 | where such license applies only to those patent claims licensable 79 | by such Contributor that are necessarily infringed by their 80 | Contribution(s) alone or by combination of their Contribution(s) 81 | with the Work to which such Contribution(s) was submitted. If You 82 | institute patent litigation against any entity (including a 83 | cross-claim or counterclaim in a lawsuit) alleging that the Work 84 | or a Contribution incorporated within the Work constitutes direct 85 | or contributory patent infringement, then any patent licenses 86 | granted to You under this License for that Work shall terminate 87 | as of the date such litigation is filed. 88 | 89 | 4. Redistribution. 
You may reproduce and distribute copies of the 90 | Work or Derivative Works thereof in any medium, with or without 91 | modifications, and in Source or Object form, provided that You 92 | meet the following conditions: 93 | 94 | (a) You must give any other recipients of the Work or 95 | Derivative Works a copy of this License; and 96 | 97 | (b) You must cause any modified files to carry prominent notices 98 | stating that You changed the files; and 99 | 100 | (c) You must retain, in the Source form of any Derivative Works 101 | that You distribute, all copyright, patent, trademark, and 102 | attribution notices from the Source form of the Work, 103 | excluding those notices that do not pertain to any part of 104 | the Derivative Works; and 105 | 106 | (d) If the Work includes a "NOTICE" text file as part of its 107 | distribution, then any Derivative Works that You distribute must 108 | include a readable copy of the attribution notices contained 109 | within such NOTICE file, excluding those notices that do not 110 | pertain to any part of the Derivative Works, in at least one 111 | of the following places: within a NOTICE text file distributed 112 | as part of the Derivative Works; within the Source form or 113 | documentation, if provided along with the Derivative Works; or, 114 | within a display generated by the Derivative Works, if and 115 | wherever such third-party notices normally appear. The contents 116 | of the NOTICE file are for informational purposes only and 117 | do not modify the License. You may add Your own attribution 118 | notices within Derivative Works that You distribute, alongside 119 | or as an addendum to the NOTICE text from the Work, provided 120 | that such additional attribution notices cannot be construed 121 | as modifying the License. 122 | 123 | You may add Your own copyright statement to Your modifications and 124 | may provide additional or different license terms and conditions 125 | for use, reproduction, or distribution of Your modifications, or 126 | for any such Derivative Works as a whole, provided Your use, 127 | reproduction, and distribution of the Work otherwise complies with 128 | the conditions stated in this License. 129 | 130 | 5. Submission of Contributions. Unless You explicitly state otherwise, 131 | any Contribution intentionally submitted for inclusion in the Work 132 | by You to the Licensor shall be under the terms and conditions of 133 | this License, without any additional terms or conditions. 134 | Notwithstanding the above, nothing herein shall supersede or modify 135 | the terms of any separate license agreement you may have executed 136 | with Licensor regarding such Contributions. 137 | 138 | 6. Trademarks. This License does not grant permission to use the trade 139 | names, trademarks, service marks, or product names of the Licensor, 140 | except as required for reasonable and customary use in describing the 141 | origin of the Work and reproducing the content of the NOTICE file. 142 | 143 | 7. Disclaimer of Warranty. Unless required by applicable law or 144 | agreed to in writing, Licensor provides the Work (and each 145 | Contributor provides its Contributions) on an "AS IS" BASIS, 146 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or 147 | implied, including, without limitation, any warranties or conditions 148 | of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A 149 | PARTICULAR PURPOSE. 
You are solely responsible for determining the 150 | appropriateness of using or redistributing the Work and assume any 151 | risks associated with Your exercise of permissions under this License. 152 | 153 | 8. Limitation of Liability. In no event and under no legal theory, 154 | whether in tort (including negligence), contract, or otherwise, 155 | unless required by applicable law (such as deliberate and grossly 156 | negligent acts) or agreed to in writing, shall any Contributor be 157 | liable to You for damages, including any direct, indirect, special, 158 | incidental, or consequential damages of any character arising as a 159 | result of this License or out of the use or inability to use the 160 | Work (including but not limited to damages for loss of goodwill, 161 | work stoppage, computer failure or malfunction, or any and all 162 | other commercial damages or losses), even if such Contributor 163 | has been advised of the possibility of such damages. 164 | 165 | 9. Accepting Warranty or Additional Liability. While redistributing 166 | the Work or Derivative Works thereof, You may choose to offer, 167 | and charge a fee for, acceptance of support, warranty, indemnity, 168 | or other liability obligations and/or rights consistent with this 169 | License. However, in accepting such obligations, You may act only 170 | on Your own behalf and on Your sole responsibility, not on behalf 171 | of any other Contributor, and only if You agree to indemnify, 172 | defend, and hold each Contributor harmless for any liability 173 | incurred by, or claims asserted against, such Contributor by reason 174 | of your accepting any such warranty or additional liability. 175 | 176 | END OF TERMS AND CONDITIONS 177 | 178 | APPENDIX: How to apply the Apache License to your work. 179 | 180 | To apply the Apache License to your work, attach the following 181 | boilerplate notice, with the fields enclosed by brackets "{}" 182 | replaced with your own identifying information. (Don't include 183 | the brackets!) The text should be enclosed in the appropriate 184 | comment syntax for the file format. We also recommend that a 185 | file or class name and description of purpose be included on the 186 | same "printed page" as the copyright notice for easier 187 | identification within third-party archives. 188 | 189 | Copyright {yyyy} {name of copyright owner} 190 | 191 | Licensed under the Apache License, Version 2.0 (the "License"); 192 | you may not use this file except in compliance with the License. 193 | You may obtain a copy of the License at 194 | 195 | http://www.apache.org/licenses/LICENSE-2.0 196 | 197 | Unless required by applicable law or agreed to in writing, software 198 | distributed under the License is distributed on an "AS IS" BASIS, 199 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 200 | See the License for the specific language governing permissions and 201 | limitations under the License. 202 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # DQA-Net 2 | This is the original source code for DQA-Net described in ["A Diagram is Worth a Dozen Images"](http://arxiv.org/abs/1603.07396). 3 | 4 | ## Quick Start 5 | 6 | ### 1. 
Install Requirements 7 | - Python (verified on 3.5.1) 8 | - Python packages: numpy, progressbar2, nltk, tensorflow, h5py, [qa2hypo](https://github.com/anglil/qa2hypo) 9 | - Torch (comes with Lua) 10 | - Lua packages: cunn, cudnn, cutorch, loadcaffe, hdf5 11 | 12 | Note that the most recent official release of tensorflow (0.7.1) is likely not compatible with this code. 13 | You will need to build from a recent commit (verified on [e39d8fe](https://github.com/tensorflow/tensorflow/tree/e39d8feebb9666a331345cd8d960f5ade4652bba)). 14 | DQA-Net does not use images (the VQA baseline does), so you can skip Lua/Torch if you just want to run DQA-Net. See details below. 15 | 16 | ### 2. Download Data and Models 17 | At the root folder, run: 18 | ```bash 19 | chmod +x download.sh 20 | ./download.sh 21 | ``` 22 | to download the DQA data, folds, GloVe vectors, and the VGG-19 model. 23 | The VGG-19 model is only used for images; as mentioned above, DQA-Net does not use images, so you can comment that download out if you only run DQA-Net. 24 | 25 | ### 3. Preprocess Data 26 | Run: 27 | ```bash 28 | python -m prepro.p05 29 | ``` 30 | to preprocess the data. 31 | The default directories work unless you changed the download directories in `download.sh`. 32 | 33 | If you wish to skip image preprocessing (in case you only run DQA-Net), run with an additional flag: 34 | ```bash 35 | python -m prepro.p05 --prepro_images False 36 | ``` 37 | You will then find all preprocessed json and h5 files in the `data/s3` folder inside the source code's root folder. 38 | 39 | ### 4. Train and Test 40 | To train the default model, run: 41 | ```bash 42 | python -m main.x05 --train 43 | ``` 44 | 45 | To test the trained model on test data, run: 46 | ```bash 47 | python -m main.x05 48 | ``` 49 | 50 | To launch tensorboard, run: 51 | ```bash 52 | tensorboard --logdir logs/m05/None --port 6006 53 | ``` 54 | Here, `m05` is the model name, and `None` is the default configuration. All tensorboard logs are stored in the `logs/` folder. 55 | 56 | To visualize the attention, run: 57 | ```bash 58 | python -m vis.list_results 5 None train 1 --port 8000 --open True 59 | ``` 60 | Here, `5` is the model number (`m05`), `None` is the default configuration, `train` indicates the data type, and `1` is the epoch from which the results are fetched. 61 | See the `evals/m05/None` folder for available epochs (the result saving frequency is controlled by the `save_period` flag in `main/x05.py`). 62 | After running the script, it hosts an HTML server at the specified port. 63 | The `--open True` flag opens a web browser at this address. 64 | 65 | In general, use the `-h` flag with the run files to see the available options.
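For reference, the configuration system works as follows: `main/x05.py` collects the command-line flags into a dictionary and, if `--config` names an entry of `configs/m05.json` (or `m05.tsv`), merges that entry on top of the flags via `configs/get_config.py`. Below is a minimal sketch of the merge semantics; the dictionary values are made up for illustration.

```python
# Sketch of the merge performed by configs/get_config.py: entries from the
# named config (d1) override the flag values (d0), but None entries in the
# named config are skipped, so the flag value survives.
from configs.get_config import get_config

flag_values = {'batch_size': 100, 'hidden_size': 50, 'init_lr': 0.01}
named_config = {'hidden_size': 10, 'init_lr': None}  # shaped like an entry of m05.json

config = get_config(flag_values, named_config)
assert config.batch_size == 100  # not mentioned in the named config
assert config.hidden_size == 10  # overridden by the named config
assert config.init_lr == 0.01    # None in the named config is ignored
```

This is why the `null` entries in `configs/m05.json` are harmless: they simply defer to whatever the corresponding flag is set to.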
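The visualizer works off the json files that evaluation writes to `evals/m05/None`. Each file contains the question `ids` and, for every tensor listed in `eval_tensor_names` in `main/x05.py` (`yp` and `p`), one value per example. If you want to inspect these files directly, here is a minimal sketch; it assumes a result was saved for `train` at epoch 1 (check `evals/m05/None` for the epochs actually present, and see `models/m05.py` for the exact meaning of `yp` and `p`).

```python
# Sketch: load one evaluation dump written by BaseRunner.eval() in models/bm05.py.
# File names follow the "<data type>_<zero-padded epoch>.json" pattern.
import json

with open("evals/m05/None/train_0001.json") as f:
    out = json.load(f)

ids = out["ids"]          # question ids, aligned with the value lists below
yp = out["values"]["yp"]  # one entry per example
p = out["values"]["p"]
print("%d examples; first id: %s" % (len(ids), ids[0]))
```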
66 | -------------------------------------------------------------------------------- /configs/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/allenai/dqa-net/c497e950017e69952deae3ffd07920a5bfb9f468/configs/__init__.py -------------------------------------------------------------------------------- /configs/c04.py: -------------------------------------------------------------------------------- 1 | # For mac local testing 2 | configs = {} 3 | configs[8] = {'batch_size': 10, 4 | 'hidden_size': 10, 5 | 'num_layers': 1, 6 | 'rnn_num_layers': 1, 7 | 'init_mean': 0, 8 | 'init_std': 0.1, 9 | 'init_lr': 0.01, 10 | 'init_nw': 0.9, 11 | 'anneal_period': 10, 12 | 'anneal_ratio': 0.7, 13 | 'num_epochs': 2, 14 | 'linear_start': False, 15 | 'max_grad_norm': 40, 16 | 'keep_prob': 1.0, 17 | 'val_period': 1, 18 | 'save_period': 1, 19 | 'fold_path': 'data/s3-100/fold.json', 20 | 'data_dir': 'data/s3-100', 21 | 'model_name': 'm04', 22 | 'mode': 'la', 23 | 'use_null': False, 24 | 'opt': 'adagrad', 25 | 'model': 'sim', 26 | 'lstm': 'basic', 27 | 'sim_func': 'dot', 28 | } 29 | 30 | # debug purpose 31 | configs[9] = {'batch_size': 100, 32 | 'hidden_size': 100, 33 | 'num_layers': 1, 34 | 'rnn_num_layers': 1, 35 | 'emb_num_layers': 1, 36 | 'init_mean': 0, 37 | 'init_std': 0.1, 38 | 'init_lr': 0.01, 39 | 'anneal_period': 100, 40 | 'anneal_ratio': 0.5, 41 | 'num_epochs': 100, 42 | 'linear_start': False, 43 | 'max_grad_norm': 40, 44 | 'keep_prob': 1.0, 45 | 'fold_path': 'data/s3/folds/fold11.json', 46 | 'data_dir': 'data/s3_04_debug', 47 | 'mode': 'la', 48 | 'model_name': 'm04', 49 | 'val_num_batches': 100, 50 | } 51 | 52 | # deploy configs start here 53 | configs[0] = {'batch_size': 100, 54 | 'hidden_size': 50, 55 | 'num_layers': 1, 56 | 'rnn_num_layers': 1, 57 | 'init_mean': 0, 58 | 'init_std': 0.1, 59 | 'init_lr': 0.01, 60 | 'init_nw': 0.0, 61 | 'anneal_period': 10, 62 | 'anneal_ratio': 0.8, 63 | 'num_epochs': 100, 64 | 'linear_start': False, 65 | 'max_grad_norm': 40, 66 | 'keep_prob': 1.0, 67 | 'val_period': 5, 68 | 'save_period': 10, 69 | 'fold_path': 'data/s3/folds/fold20.json', 70 | 'data_dir': 'data/s3', 71 | 'model_name': 'm04', 72 | 'mode': 'a', 73 | 'use_null': False, 74 | 'opt': 'basic', 75 | 'model': 'sim', 76 | 'lstm': 'basic', 77 | 'sim_func': 'dot', 78 | 'forget_bias': 2.5, 79 | } 80 | 81 | configs[1] = {'batch_size': 100, 82 | 'hidden_size': 50, 83 | 'num_layers': 1, 84 | 'rnn_num_layers': 1, 85 | 'init_mean': 0, 86 | 'init_std': 0.1, 87 | 'init_lr': 0.01, 88 | 'init_nw': 0.0, 89 | 'anneal_period': 10, 90 | 'anneal_ratio': 0.8, 91 | 'num_epochs': 100, 92 | 'linear_start': False, 93 | 'max_grad_norm': 40, 94 | 'keep_prob': 1.0, 95 | 'val_period': 5, 96 | 'save_period': 10, 97 | 'fold_path': 'data/s3/folds/fold24.json', 98 | 'data_dir': 'data/s3', 99 | 'model_name': 'm04', 100 | 'mode': 'a', 101 | 'use_null': False, 102 | 'opt': 'basic', 103 | 'model': 'sim', 104 | 'lstm': 'basic', 105 | 'sim_func': 'dot', 106 | 'forget_bias': 2.5, 107 | } 108 | 109 | configs[2] = {'batch_size': 100, 110 | 'hidden_size': 50, 111 | 'num_layers': 1, 112 | 'rnn_num_layers': 1, 113 | 'init_mean': 0, 114 | 'init_std': 0.1, 115 | 'init_lr': 0.01, 116 | 'init_nw': 0.0, 117 | 'anneal_period': 10, 118 | 'anneal_ratio': 0.8, 119 | 'num_epochs': 100, 120 | 'linear_start': False, 121 | 'max_grad_norm': 40, 122 | 'keep_prob': 1.0, 123 | 'val_period': 5, 124 | 'save_period': 10, 125 | 'fold_path': 'data/s3/folds/fold22.json', 126 | 'data_dir': 
'data/s3', 127 | 'model_name': 'm04', 128 | 'mode': 'a', 129 | 'use_null': False, 130 | 'opt': 'basic', 131 | 'model': 'sim', 132 | 'lstm': 'basic', 133 | 'sim_func': 'dot', 134 | 'forget_bias': 2.5, 135 | } 136 | 137 | configs[3] = {'batch_size': 100, 138 | 'hidden_size': 50, 139 | 'num_layers': 1, 140 | 'rnn_num_layers': 1, 141 | 'init_mean': 0, 142 | 'init_std': 0.1, 143 | 'init_lr': 0.01, 144 | 'init_nw': 0.0, 145 | 'anneal_period': 10, 146 | 'anneal_ratio': 0.8, 147 | 'num_epochs': 100, 148 | 'linear_start': False, 149 | 'max_grad_norm': 40, 150 | 'keep_prob': 1.0, 151 | 'val_period': 5, 152 | 'save_period': 10, 153 | 'fold_path': 'data/s3/folds/fold23.json', 154 | 'data_dir': 'data/s3', 155 | 'model_name': 'm04', 156 | 'mode': 'a', 157 | 'use_null': False, 158 | 'opt': 'basic', 159 | 'model': 'sim', 160 | 'lstm': 'basic', 161 | 'sim_func': 'dot', 162 | 'forget_bias': 2.5, 163 | } 164 | 165 | configs[4] = {'batch_size': 100, 166 | 'hidden_size': 50, 167 | 'num_layers': 1, 168 | 'rnn_num_layers': 1, 169 | 'init_mean': 0, 170 | 'init_std': 0.1, 171 | 'init_lr': 0.01, 172 | 'init_nw': 0.0, 173 | 'anneal_period': 10, 174 | 'anneal_ratio': 0.8, 175 | 'num_epochs': 100, 176 | 'linear_start': False, 177 | 'max_grad_norm': 40, 178 | 'keep_prob': 1.0, 179 | 'val_period': 5, 180 | 'save_period': 10, 181 | 'fold_path': 'data/s3/folds/fold21.json', 182 | 'data_dir': 'data/s3-ours', 183 | 'model_name': 'm04', 184 | 'mode': 'a', 185 | 'use_null': False, 186 | 'opt': 'basic', 187 | 'model': 'sim', 188 | 'lstm': 'basic', 189 | 'sim_func': 'dot', 190 | 'forget_bias': 2.5, 191 | } 192 | 193 | configs[5] = {'batch_size': 100, 194 | 'hidden_size': 50, 195 | 'num_layers': 1, 196 | 'rnn_num_layers': 1, 197 | 'init_mean': 0, 198 | 'init_std': 0.1, 199 | 'init_lr': 0.01, 200 | 'init_nw': 0.0, 201 | 'anneal_period': 10, 202 | 'anneal_ratio': 0.8, 203 | 'num_epochs': 100, 204 | 'linear_start': False, 205 | 'max_grad_norm': 40, 206 | 'keep_prob': 1.0, 207 | 'val_period': 5, 208 | 'save_period': 10, 209 | 'fold_path': 'data/s3/folds/fold22.json', 210 | 'data_dir': 'data/s3-ours', 211 | 'model_name': 'm04', 212 | 'mode': 'a', 213 | 'use_null': False, 214 | 'opt': 'basic', 215 | 'model': 'sim', 216 | 'lstm': 'basic', 217 | 'sim_func': 'dot', 218 | 'forget_bias': 2.5, 219 | } 220 | 221 | configs[6] = {'batch_size': 100, 222 | 'hidden_size': 50, 223 | 'num_layers': 1, 224 | 'rnn_num_layers': 1, 225 | 'init_mean': 0, 226 | 'init_std': 0.1, 227 | 'init_lr': 0.01, 228 | 'init_nw': 0.0, 229 | 'anneal_period': 10, 230 | 'anneal_ratio': 0.8, 231 | 'num_epochs': 100, 232 | 'linear_start': False, 233 | 'max_grad_norm': 40, 234 | 'keep_prob': 1.0, 235 | 'val_period': 5, 236 | 'save_period': 10, 237 | 'fold_path': 'data/s3/folds/fold23.json', 238 | 'data_dir': 'data/s3-ours', 239 | 'model_name': 'm04', 240 | 'mode': 'a', 241 | 'use_null': False, 242 | 'opt': 'basic', 243 | 'model': 'sim', 244 | 'lstm': 'basic', 245 | 'sim_func': 'dot', 246 | 'forget_bias': 2.5, 247 | } 248 | 249 | configs[7] = {'batch_size': 100, 250 | 'hidden_size': 50, 251 | 'num_layers': 1, 252 | 'rnn_num_layers': 1, 253 | 'emb_num_layers': 1, 254 | 'init_mean': 0, 255 | 'init_std': 0.1, 256 | 'init_lr': 0.01, 257 | 'init_nw': 0.0, 258 | 'anneal_period': 10, 259 | 'anneal_ratio': 0.8, 260 | 'num_epochs': 100, 261 | 'linear_start': False, 262 | 'max_grad_norm': 40, 263 | 'keep_prob': 1.0, 264 | 'val_period': 5, 265 | 'save_period': 10, 266 | 'fold_path': 'data/s3/folds/fold24.json', 267 | 'data_dir': 'data/s3', 268 | 'model_name': 'm04', 269 | 'mode': 'a', 
270 | 'use_null': False, 271 | 'opt': 'basic', 272 | 'model': 'sim', 273 | 'lstm': 'basic', 274 | 'sim_func': 'dot', 275 | 'forget_bias': 1.0, 276 | } 277 | -------------------------------------------------------------------------------- /configs/get_config.py: -------------------------------------------------------------------------------- 1 | import json 2 | import os 3 | from collections import OrderedDict 4 | from copy import deepcopy 5 | 6 | from configs.tsv2json import tsv2dict 7 | 8 | 9 | class Config(object): 10 | def __init__(self, **entries): 11 | self.__dict__.update(entries) 12 | 13 | 14 | def get_config(d0, d1, priority=1): 15 | """ 16 | d1 replaces d0. If priority = 0, then d0 replaces d1 17 | :param d0: 18 | :param d1: 19 | :param name: 20 | :param priority: 21 | :return: 22 | """ 23 | if priority == 0: 24 | d0, d1 = d1, d0 25 | d = deepcopy(d0) 26 | for key, val in d1.items(): 27 | if val is not None: 28 | d[key] = val 29 | return Config(**d) 30 | 31 | 32 | def get_config_from_file(d0, path, id_, priority=1): 33 | _, ext = os.path.splitext(path) 34 | if ext == '.json': 35 | configs = json.load(open(path, 'r'), object_pairs_hook=OrderedDict) 36 | elif ext == '.tsv': 37 | configs = tsv2dict(path) 38 | else: 39 | raise Exception("Extension %r is not supported." % ext) 40 | d1 = configs[id_] 41 | return get_config(d0, d1, priority=priority) 42 | 43 | 44 | 45 | 46 | 47 | -------------------------------------------------------------------------------- /configs/json2tsv.py: -------------------------------------------------------------------------------- 1 | import csv 2 | import json 3 | from collections import OrderedDict 4 | import argparse 5 | 6 | 7 | def get_args(): 8 | parser = argparse.ArgumentParser() 9 | parser.add_argument("json_path") 10 | parser.add_argument("tsv_path") 11 | return parser.parse_args() 12 | 13 | 14 | def json2tsv(json_path, tsv_path): 15 | configs = json.load(open(json_path, 'r'), object_pairs_hook=OrderedDict) 16 | type_dict = OrderedDict([('id', 'str')]) 17 | for id_, config in configs.items(): 18 | for key, val in config.items(): 19 | if val is None: 20 | if key not in type_dict: 21 | type_dict[key] = 'none' 22 | continue 23 | 24 | type_name = type(val).__name__ 25 | if key in type_dict and type_dict[key] != 'none': 26 | assert type_dict[key] == type_name, "inconsistent param type: %s" % key 27 | else: 28 | type_dict[key] = type_name 29 | 30 | with open(tsv_path, 'w', newline='') as file: 31 | writer = csv.writer(file, delimiter='\t') 32 | writer.writerow(type_dict.keys()) 33 | writer.writerow(type_dict.values()) 34 | for id_, config in configs.items(): 35 | config["id"] = id_ 36 | row = [config[key] if key in config and config[key] is not None else "None" 37 | for key in type_dict] 38 | writer.writerow(row) 39 | 40 | 41 | def main(): 42 | args = get_args() 43 | json2tsv(args.json_path, args.tsv_path) 44 | 45 | 46 | if __name__ == "__main__": 47 | main() 48 | -------------------------------------------------------------------------------- /configs/m05.json: -------------------------------------------------------------------------------- 1 | { 2 | "debug": { 3 | "anneal_period": 25, 4 | "anneal_ratio": 0.5, 5 | "batch_size": 100, 6 | "data_dir": "data/s3-debug", 7 | "device_type": "gpu", 8 | "emb_num_layers": 0, 9 | "fold_path": "data/s3-debug/folds/fold11.json", 10 | "hidden_size": 100, 11 | "init_lr": 0.1, 12 | "init_mean": 0, 13 | "init_std": 0.1, 14 | "keep_prob": 1.0, 15 | "lstm": null, 16 | "max_grad_norm": null, 17 | "mode": null, 18 | "model": 
null, 19 | "num_devices": 2, 20 | "num_epochs": 100, 21 | "opt": null, 22 | "rnn_num_layers": 1, 23 | "save_period": null, 24 | "sim_func": "dot", 25 | "val_num_batches": null, 26 | "val_period": null 27 | }, 28 | "gpu": { 29 | "anneal_period": null, 30 | "anneal_ratio": null, 31 | "batch_size": null, 32 | "data_dir": null, 33 | "device_type": "gpu", 34 | "emb_num_layers": null, 35 | "fold_path": null, 36 | "hidden_size": null, 37 | "init_lr": null, 38 | "init_mean": null, 39 | "init_std": null, 40 | "keep_prob": null, 41 | "lstm": null, 42 | "max_grad_norm": null, 43 | "mode": null, 44 | "model": null, 45 | "num_devices": 4, 46 | "num_epochs": null, 47 | "opt": null, 48 | "rnn_num_layers": null, 49 | "save_period": null, 50 | "sim_func": null, 51 | "val_num_batches": null, 52 | "val_period": null 53 | }, 54 | "local": { 55 | "anneal_period": 10, 56 | "anneal_ratio": 0.7, 57 | "batch_size": 10, 58 | "data_dir": "data/s3-100", 59 | "device_type": null, 60 | "emb_num_layers": 1, 61 | "fold_path": "data/s3-100/fold.json", 62 | "hidden_size": 10, 63 | "init_lr": 0.01, 64 | "init_mean": 0, 65 | "init_std": 0.1, 66 | "keep_prob": 1.0, 67 | "lstm": "basic", 68 | "max_grad_norm": 40, 69 | "model": "sim", 70 | "num_devices": null, 71 | "num_epochs": 2, 72 | "opt": "adagrad", 73 | "rnn_num_layers": 1, 74 | "save_period": 1, 75 | "sim_func": "dot", 76 | "val_num_batches": null, 77 | "val_period": 1 78 | } 79 | } -------------------------------------------------------------------------------- /configs/m05.tsv: -------------------------------------------------------------------------------- 1 | id anneal_period anneal_ratio batch_size data_dir device_type emb_num_layers fold_path hidden_size init_lr init_mean init_std keep_prob lstm model num_devices num_epochs opt rnn_num_layers save_period sim_func val_num_batches val_period max_grad_norm mode 2 | str int float int str str int str int float int float float str str int int str int int str none int int str 3 | debug 25 0.5 100 data/s3-debug gpu 0 data/s3-debug/folds/fold11.json 100 0.1 0 0.1 1.0 None None 2 100 None 1 None dot None None None None 4 | local 10 0.7 10 data/s3-100 None 1 data/s3-100/fold.json 10 0.01 0 0.1 1.0 basic sim None 2 adagrad 1 1 dot None 1 40 la 5 | gpu None None None None gpu None None None None None None None None None 4 None None None None None None None None None 6 | -------------------------------------------------------------------------------- /configs/tsv2json.py: -------------------------------------------------------------------------------- 1 | import argparse 2 | import csv 3 | import json 4 | from collections import OrderedDict 5 | 6 | from utils import json_pretty_dump 7 | 8 | 9 | def get_args(): 10 | parser = argparse.ArgumentParser() 11 | parser.add_argument("tsv_path") 12 | parser.add_argument("json_path") 13 | return parser.parse_args() 14 | 15 | 16 | def tsv2json(tsv_path, json_path): 17 | d = tsv2dict(tsv_path) 18 | json_pretty_dump(d, open(json_path, 'w')) 19 | 20 | 21 | def tsv2dict(tsv_path): 22 | def bool(string): 23 | """ 24 | shadows original bool, which maps 'False' to True 25 | """ 26 | if string == 'True': 27 | return True 28 | elif string == 'False': 29 | return False 30 | else: 31 | raise Exception("Cannot convert %s to bool" % string) 32 | 33 | def none(val): 34 | return val 35 | 36 | with open(tsv_path, 'r', newline='') as file: 37 | reader = csv.reader(file, delimiter='\t') 38 | fields = next(reader) 39 | type_names = next(reader) 40 | casters = list(map(eval, type_names)) 41 | out_dict = {} 42 
| for row in reader: 43 | cur_dict = OrderedDict( 44 | (field, None if val == "None" else caster(val)) 45 | for field, caster, val in zip(fields, casters, row)) 46 | id_ = cur_dict['id'] 47 | del cur_dict['id'] 48 | out_dict[id_] = cur_dict 49 | return out_dict 50 | 51 | 52 | def main(): 53 | args = get_args() 54 | tsv2json(args.tsv_path, args.json_path) 55 | 56 | 57 | if __name__ == "__main__": 58 | main() 59 | -------------------------------------------------------------------------------- /create_fold.py: -------------------------------------------------------------------------------- 1 | import json 2 | import os 3 | import random 4 | import argparse 5 | from collections import defaultdict 6 | 7 | 8 | def create_linear_fold(): 9 | parser = argparse.ArgumentParser() 10 | parser.add_argument("data_dir") 11 | parser.add_argument("fold_path") 12 | parser.add_argument("--ratio", type=float, default=0.8) 13 | parser.add_argument("--shuffle", default="False") 14 | 15 | args = parser.parse_args() 16 | 17 | data_dir = args.data_dir 18 | images_dir = os.path.join(data_dir, "images") 19 | annotations_dir = os.path.join(data_dir, "annotations") 20 | ratio = args.ratio 21 | shuffle = args.shuffle == 'True' 22 | fold_path = args.fold_path 23 | annotation_names = set(os.path.splitext(name)[0] for name in os.listdir(annotations_dir) if name.endswith(".json")) 24 | image_ids = list(sorted([os.path.splitext(name)[0] 25 | for name in os.listdir(images_dir) if name.endswith(".png") and os.path.splitext(name)[0] in annotation_names], 26 | key=lambda x: int(x))) 27 | if shuffle: 28 | random.shuffle(image_ids) 29 | 30 | mid = int(len(image_ids) * (1 - ratio)) 31 | print("train={}, test={}".format(len(image_ids)-mid, mid)) 32 | fold = {'train': image_ids[mid:], 'test': image_ids[:mid]} 33 | json.dump(fold, open(fold_path, 'w')) 34 | 35 | 36 | def create_randomly_categorized_fold(): 37 | parser = argparse.ArgumentParser() 38 | parser.add_argument("cat_path") 39 | parser.add_argument("fold_path") 40 | parser.add_argument("--test_cats", nargs='*') 41 | parser.add_argument("--ratio", type=float) 42 | args = parser.parse_args() 43 | cats_path = args.cat_path 44 | test_cats = args.test_cats 45 | cat_dict = json.load(open(cats_path, 'r')) 46 | ids_dict = defaultdict(set) 47 | for image_name, cat in cat_dict.items(): 48 | image_id, _ = os.path.splitext(image_name) 49 | ids_dict[cat].add(image_id) 50 | 51 | cats = list(ids_dict.keys()) 52 | print(cats) 53 | if test_cats is None: 54 | random.shuffle(cats) 55 | mid = int(args.ratio * len(cats)) 56 | train_cats = cats[:mid] 57 | test_cats = cats[mid:] 58 | else: 59 | for cat in test_cats: 60 | assert cat in ids_dict, "%s is not a valid category." 
% cat 61 | train_cats = [cat for cat in cats if cat not in test_cats] 62 | 63 | print("train categories: %s" % ", ".join(train_cats)) 64 | print("test categories: %s" % ", ".join(test_cats)) 65 | train_ids = sorted(set.union(*[ids_dict[cat] for cat in train_cats]), key=lambda x: int(x)) 66 | test_ids = sorted(set.union(*[ids_dict[cat] for cat in test_cats]), key=lambda x: int(x)) 67 | fold = {'train': train_ids, 'test': test_ids, 'trainCats': train_cats, 'testCats': test_cats} 68 | json.dump(fold, open(args.fold_path, "w")) 69 | 70 | 71 | if __name__ == "__main__": 72 | # create_linear_fold(ARGS) 73 | create_randomly_categorized_fold() 74 | -------------------------------------------------------------------------------- /download.sh: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env bash 2 | 3 | DATA_DIR="$HOME/data" 4 | DQA_DATA_DIR="$DATA_DIR/dqa" 5 | 6 | MODELS_DIR="$HOME/models" 7 | GLOVE_DIR="$MODELS_DIR/glove" 8 | VGG_DIR="$MODELS_DIR/vgg" 9 | 10 | PREPRO_DIR="data" 11 | DQA_PREPRO_DIR="$PREPRO_DIR/s3" 12 | 13 | # DQA data download 14 | mkdir $DATA_DIR 15 | mkdir $DQA_DATA_DIR 16 | wget https://s3-us-west-2.amazonaws.com/dqa-data/shining3.zip -O $DQA_DATA_DIR/shining3.zip 17 | unzip -q $DQA_DATA_DIR/shining3.zip -d $DQA_DATA_DIR 18 | 19 | # Glove pre-trained vectors download 20 | mkdir $MODELS_DIR 21 | mkdir $GLOVE_DIR 22 | wget http://nlp.stanford.edu/data/glove.6B.zip -O $GLOVE_DIR/glove.6B.zip 23 | unzip -q $GLOVE_DIR/glove.6B.zip -d $GLOVE_DIR 24 | 25 | # VGG-19 models download 26 | mkdir $VGG_DIR 27 | wget http://www.robots.ox.ac.uk/~vgg/software/very_deep/caffe/VGG_ILSVRC_19_layers.caffemodel -O $VGG_DIR/vgg-19.caffemodel 28 | wget https://gist.githubusercontent.com/ksimonyan/3785162f95cd2d5fee77/raw/f02f8769e64494bcd3d7e97d5d747ac275825721/VGG_ILSVRC_19_layers_deploy.prototxt -O $VGG_DIR/vgg-19.prototxt 29 | 30 | # folds download 31 | mkdir $PREPRO_DIR 32 | mkdir $DQA_PREPRO_DIR 33 | wget https://s3-us-west-2.amazonaws.com/dqa-data/shining3-folds.zip -O $DQA_PREPRO_DIR/shining3-folds.zip 34 | unzip -q $DQA_PREPRO_DIR/shining3-folds.zip -d $DQA_PREPRO_DIR -------------------------------------------------------------------------------- /download2.sh: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env bash 2 | 3 | #PROJ_HOME="~/csehomedir/projects/dqa-net" 4 | DATA_DIR="$HOME/data" 5 | DQA_DATA_DIR="$DATA_DIR/dqa" 6 | 7 | MODELS_DIR="$HOME/models" 8 | GLOVE_DIR="$MODELS_DIR/glove" 9 | #VGG_DIR="$MODELS_DIR/vgg" 10 | 11 | PREPRO_DIR="data" 12 | DQA_PREPRO_DIR="$PREPRO_DIR/s3" 13 | 14 | # DQA data download 15 | #if [ ! -d "$DATA_DIR" ]; then 16 | # echo "making dir $DATA_DIR" 17 | # mkdir -p "$DATA_DIR" 18 | #fi 19 | #if [ ! -d "$DQA_DATA_DIR" ]; then 20 | # echo "making dir $DQA_DATA_DIR" 21 | # mkdir -p "$DQA_DATA_DIR" 22 | #fi 23 | #wget https://s3-us-west-2.amazonaws.com/dqa-data/shining3.zip -O $DQA_DATA_DIR/shining3.zip 24 | #unzip -q $DQA_DATA_DIR/shining3.zip -d $DQA_DATA_DIR 25 | 26 | # Glove pre-trained vectors download 27 | #if [ ! -d "$MODELS_DIR" ]; then 28 | # echo "making dir $MODELS_DIR" 29 | # mkdir -p "$MODELS_DIR" 30 | #fi 31 | #if [ ! 
-d "$GLOVE_DIR" ]; then 32 | # echo "making dir $GLOVE_DIR" 33 | # mkdir -p "$GLOVE_DIR" 34 | #fi 35 | #wget http://nlp.stanford.edu/data/glove.6B.zip -O $GLOVE_DIR/glove.6B.zip 36 | #unzip -q $GLOVE_DIR/glove.6B.zip -d $GLOVE_DIR 37 | 38 | # VGG-19 models download 39 | #mkdir $VGG_DIR 40 | #wget http://www.robots.ox.ac.uk/~vgg/software/very_deep/caffe/VGG_ILSVRC_19_layers.caffemodel -O $VGG_DIR/vgg-19.caffemodel 41 | #wget https://gist.githubusercontent.com/ksimonyan/3785162f95cd2d5fee77/raw/f02f8769e64494bcd3d7e97d5d747ac275825721/VGG_ILSVRC_19_layers_deploy.prototxt -O $VGG_DIR/vgg-19.prototxt 42 | 43 | # folds download 44 | if [ ! -d "$PREPRO_DIR" ]; then 45 | echo "making dir $PREPRO_DIR" 46 | mkdir -p "$PREPRO_DIR" 47 | fi 48 | if [ ! -d "$DQA_PREPRO_DIR" ]; then 49 | echo "making dir $DQA_PREPRO_DIR" 50 | mkdir -p "$DQA_PREPRO_DIR" 51 | fi 52 | wget https://s3-us-west-2.amazonaws.com/dqa-data/shining3-folds.zip -O $DQA_PREPRO_DIR/shining3-folds.zip 53 | unzip -q $DQA_PREPRO_DIR/shining3-folds.zip -d $DQA_PREPRO_DIR 54 | -------------------------------------------------------------------------------- /main/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/allenai/dqa-net/c497e950017e69952deae3ffd07920a5bfb9f468/main/__init__.py -------------------------------------------------------------------------------- /main/x05.py: -------------------------------------------------------------------------------- 1 | import json 2 | import os 3 | import shutil 4 | from pprint import pprint 5 | 6 | import h5py 7 | import tensorflow as tf 8 | 9 | from configs.get_config import get_config_from_file, get_config 10 | from models.m05 import Tower, Runner 11 | from read_data.r05 import read_data 12 | 13 | flags = tf.app.flags 14 | 15 | # File directories 16 | # TODO : Make sure these directories are correct! 17 | flags.DEFINE_string("model_name", "m05", "Model name. This will be used for save, log, and eval names. [m05]") 18 | flags.DEFINE_string("data_dir", "data/s3", "Data directory [data/s3]") 19 | flags.DEFINE_string("fold_path", "data/s3/folds/fold23.json", "fold json path [data/s3/folds/fold23.json]") 20 | 21 | # Training parameters 22 | flags.DEFINE_integer("batch_size", 100, "Batch size for each tower. [100]") 23 | flags.DEFINE_float("init_mean", 0, "Initial weight mean [0]") 24 | flags.DEFINE_float("init_std", 0.1, "Initial weight std [0.1]") 25 | flags.DEFINE_float("init_lr", 0.1, "Initial learning rate [0.01]") 26 | flags.DEFINE_integer("anneal_period", 20, "Anneal period [20]") 27 | flags.DEFINE_float("anneal_ratio", 0.5, "Anneal ratio [0.5") 28 | flags.DEFINE_integer("num_epochs", 200, "Total number of epochs for training [200]") 29 | flags.DEFINE_string("opt", 'basic', 'Optimizer: basic | adagrad [basic]') 30 | 31 | # Training and testing options 32 | flags.DEFINE_boolean("train", False, "Train? Test if False [False]") 33 | flags.DEFINE_integer("val_num_batches", -1, "Val num batches. -1 for max possible. [-1]") 34 | flags.DEFINE_integer("train_num_batches", -1, "Train num batches. -1 for max possible [-1]") 35 | flags.DEFINE_integer("test_num_batches", -1, "Test num batches. -1 for max possible [-1]") 36 | flags.DEFINE_boolean("load", False, "Load from saved model? [False]") 37 | flags.DEFINE_boolean("progress", True, "Show progress bar? [True]") 38 | flags.DEFINE_string("device_type", 'cpu', "cpu | gpu [cpu]") 39 | flags.DEFINE_integer("num_devices", 1, "Number of devices to use. Only for multi-GPU. 
[1]") 40 | flags.DEFINE_integer("val_period", 5, "Validation period (for display purpose only) [5]") 41 | flags.DEFINE_integer("save_period", 10, "Save period [10]") 42 | flags.DEFINE_string("config", 'None', "Config name (e.g. local) to load. 'None' to use config here. [None]") 43 | flags.DEFINE_string("config_ext", ".json", "Config file extension: .json | .tsv [.json]") 44 | 45 | # Debugging 46 | flags.DEFINE_boolean("draft", False, "Draft? (quick initialize) [False]") 47 | 48 | # App-specific training parameters 49 | flags.DEFINE_integer("hidden_size", 50, "Hidden size [50]") 50 | flags.DEFINE_integer("image_size", 4096, "Image size [4096]") 51 | flags.DEFINE_integer("rnn_num_layers", 1, "Number of rnn layers [2]") 52 | flags.DEFINE_integer("emb_num_layers", 0, "Number of embedding layers [0]") 53 | flags.DEFINE_float("keep_prob", 1.0, "Keep probability of dropout [1.0]") 54 | flags.DEFINE_string("sim_func", 'dot', "Similarity function: man_sim | dot [dot]") 55 | flags.DEFINE_string("lstm", "basic", "LSTM cell type: regular | basic | GRU [basic]") 56 | flags.DEFINE_float("forget_bias", 2.5, "LSTM forget bias for basic cell [2.5]") 57 | flags.DEFINE_float("cell_clip", 40, "LSTM cell clipping for regular cell [40]") 58 | flags.DEFINE_float("rand_y", 1.0, "Rand y. [1.0]") 59 | flags.DEFINE_string("mode", "dqanet", "dqanet | vqa [dqanet]") 60 | flags.DEFINE_string("encoder", "lstm", "lstm | mean [lstm]") 61 | 62 | FLAGS = flags.FLAGS 63 | 64 | 65 | def mkdirs(config): 66 | evals_dir = "evals" 67 | logs_dir = "logs" 68 | saves_dir = "saves" 69 | if not os.path.exists(evals_dir): 70 | os.mkdir(evals_dir) 71 | if not os.path.exists(logs_dir): 72 | os.mkdir(logs_dir) 73 | if not os.path.exists(saves_dir): 74 | os.mkdir(saves_dir) 75 | 76 | eval_dir = os.path.join(evals_dir, config.model_name) 77 | eval_subdir = os.path.join(eval_dir, "%s" % str(config.config).zfill(2)) 78 | log_dir = os.path.join(logs_dir, config.model_name) 79 | log_subdir = os.path.join(log_dir, "%s" % str(config.config).zfill(2)) 80 | save_dir = os.path.join(saves_dir, config.model_name) 81 | save_subdir = os.path.join(save_dir, "%s" % str(config.config).zfill(2)) 82 | config.eval_dir = eval_subdir 83 | config.log_dir = log_subdir 84 | config.save_dir = save_subdir 85 | 86 | if not os.path.exists(eval_dir): 87 | os.mkdir(eval_dir) 88 | if os.path.exists(eval_subdir): 89 | if config.train and not config.load: 90 | shutil.rmtree(eval_subdir) 91 | os.mkdir(eval_subdir) 92 | else: 93 | os.mkdir(eval_subdir) 94 | if not os.path.exists(log_dir): 95 | os.mkdir(log_dir) 96 | if os.path.exists(log_subdir): 97 | if config.train and not config.load: 98 | shutil.rmtree(log_subdir) 99 | os.mkdir(log_subdir) 100 | else: 101 | os.mkdir(log_subdir) 102 | if config.train: 103 | if not os.path.exists(save_dir): 104 | os.mkdir(save_dir) 105 | if os.path.exists(save_subdir): 106 | if not config.load: 107 | shutil.rmtree(save_subdir) 108 | os.mkdir(save_subdir) 109 | else: 110 | os.mkdir(save_subdir) 111 | 112 | 113 | def load_meta_data(config): 114 | meta_data_path = os.path.join(config.data_dir, "meta_data.json") 115 | meta_data = json.load(open(meta_data_path, "r")) 116 | 117 | # Other parameters 118 | config.max_sent_size = meta_data['max_sent_size'] 119 | config.max_fact_size = meta_data['max_fact_size'] 120 | config.max_num_facts = meta_data['max_num_facts'] 121 | config.num_choices = meta_data['num_choices'] 122 | config.vocab_size = meta_data['vocab_size'] 123 | config.word_size = meta_data['word_size'] 124 | 125 | 126 | def main(_): 127 
| if FLAGS.config == "None": 128 | config = get_config(FLAGS.__flags, {}) 129 | else: 130 | config_path = os.path.join("configs", "%s%s" % (FLAGS.model_name, FLAGS.config_ext)) 131 | config = get_config_from_file(FLAGS.__flags, config_path, FLAGS.config) 132 | 133 | load_meta_data(config) 134 | mkdirs(config) 135 | 136 | # load other files 137 | init_emb_mat_path = os.path.join(config.data_dir, 'init_emb_mat.h5') 138 | config.init_emb_mat = h5py.File(init_emb_mat_path, 'r')['data'][:] 139 | 140 | if config.train: 141 | train_ds = read_data(config, 'train') 142 | val_ds = read_data(config, 'val') 143 | else: 144 | test_ds = read_data(config, 'test') 145 | 146 | # For quick draft initialization (debugging). 147 | if config.draft: 148 | config.train_num_batches = 1 149 | config.val_num_batches = 1 150 | config.test_num_batches = 1 151 | config.num_epochs = 1 152 | config.val_period = 1 153 | config.save_period = 1 154 | config.num_layers = 1 155 | config.rnn_num_layers = 1 156 | 157 | pprint(config.__dict__) 158 | 159 | eval_tensor_names = ['yp', 'p'] 160 | 161 | graph = tf.Graph() 162 | towers = [Tower(config) for _ in range(config.num_devices)] 163 | sess = tf.Session(graph=graph, config=tf.ConfigProto(allow_soft_placement=True)) 164 | runner = Runner(config, sess, towers) 165 | with graph.as_default(), tf.device("/cpu:0"): 166 | runner.initialize() 167 | if config.train: 168 | if config.load: 169 | runner.load() 170 | runner.train(train_ds, val_ds, eval_tensor_names=eval_tensor_names) 171 | else: 172 | runner.load() 173 | runner.eval(test_ds, eval_tensor_names=eval_tensor_names) 174 | 175 | 176 | if __name__ == "__main__": 177 | tf.app.run() 178 | -------------------------------------------------------------------------------- /models/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/allenai/dqa-net/c497e950017e69952deae3ffd07920a5bfb9f468/models/__init__.py -------------------------------------------------------------------------------- /models/bm05.py: -------------------------------------------------------------------------------- 1 | import itertools 2 | import json 3 | import os 4 | 5 | import numpy as np 6 | import tensorflow as tf 7 | 8 | from my.tensorflow import average_gradients 9 | from read_data.r05 import DataSet 10 | from utils import get_pbar 11 | 12 | 13 | class BaseRunner(object): 14 | def __init__(self, params, sess, towers): 15 | assert isinstance(sess, tf.Session) 16 | self.sess = sess 17 | self.params = params 18 | self.towers = towers 19 | self.num_towers = len(towers) 20 | self.placeholders = {} 21 | self.tensors = {} 22 | self.saver = None 23 | self.writer = None 24 | self.initialized = False 25 | 26 | def initialize(self): 27 | params = self.params 28 | sess = self.sess 29 | device_type = params.device_type 30 | summaries = [] 31 | 32 | global_step = tf.get_variable('global_step', shape=[], dtype='int32', 33 | initializer=tf.constant_initializer(0), trainable=False) 34 | self.tensors['global_step'] = global_step 35 | 36 | epoch = tf.get_variable('epoch', shape=[], dtype='int32', 37 | initializer=tf.constant_initializer(0), trainable=False) 38 | self.tensors['epoch'] = epoch 39 | 40 | learning_rate = tf.placeholder('float32', name='learning_rate') 41 | summaries.append(tf.scalar_summary("learning_rate", learning_rate)) 42 | self.placeholders['learning_rate'] = learning_rate 43 | 44 | if params.opt == 'basic': 45 | opt = tf.train.GradientDescentOptimizer(learning_rate) 46 | elif params.opt == 
'adagrad': 47 | opt = tf.train.AdagradOptimizer(learning_rate) 48 | else: 49 | raise Exception() 50 | 51 | grads_tensors = [] 52 | correct_tensors = [] 53 | loss_tensors = [] 54 | for device_id, tower in enumerate(self.towers): 55 | with tf.device("/%s:%d" % (device_type, device_id)), tf.name_scope("%s_%d" % (device_type, device_id)) as scope: 56 | tower.initialize(scope) 57 | tf.get_variable_scope().reuse_variables() 58 | loss_tensor = tower.get_loss_tensor() 59 | loss_tensors.append(loss_tensor) 60 | correct_tensor = tower.get_correct_tensor() 61 | correct_tensors.append(correct_tensor) 62 | grads_tensor = opt.compute_gradients(loss_tensor) 63 | grads_tensors.append(grads_tensor) 64 | 65 | with tf.name_scope("gpu_sync"): 66 | loss_tensor = tf.reduce_mean(tf.pack(loss_tensors), 0, name='loss') 67 | correct_tensor = tf.concat(0, correct_tensors, name="correct") 68 | with tf.name_scope("average_gradients"): 69 | grads_tensor = average_gradients(grads_tensors) 70 | 71 | self.tensors['loss'] = loss_tensor 72 | self.tensors['correct'] = correct_tensor 73 | summaries.append(tf.scalar_summary(loss_tensor.op.name, loss_tensor)) 74 | 75 | for grad, var in grads_tensor: 76 | if grad is not None: 77 | summaries.append(tf.histogram_summary(var.op.name+'/gradients', grad)) 78 | self.tensors['grads'] = grads_tensor 79 | 80 | for var in tf.trainable_variables(): 81 | summaries.append(tf.histogram_summary(var.op.name, var)) 82 | 83 | apply_grads_op = opt.apply_gradients(grads_tensor, global_step=global_step) 84 | 85 | train_op = tf.group(apply_grads_op) 86 | self.tensors['train'] = train_op 87 | 88 | saver = tf.train.Saver(tf.all_variables()) 89 | self.saver = saver 90 | 91 | summary_op = tf.merge_summary(summaries) 92 | self.tensors['summary'] = summary_op 93 | 94 | init_op = tf.initialize_all_variables() 95 | sess.run(init_op) 96 | self.writer = tf.train.SummaryWriter(params.log_dir, sess.graph) 97 | self.initialized = True 98 | 99 | def _get_feed_dict(self, batches, mode, **kwargs): 100 | placeholders = self.placeholders 101 | learning_rate_ph = placeholders['learning_rate'] 102 | learning_rate = kwargs['learning_rate'] if mode == 'train' else 0.0 103 | feed_dict = {learning_rate_ph: learning_rate} 104 | for tower_idx, tower in enumerate(self.towers): 105 | batch = batches[tower_idx] if tower_idx < len(batches) else None 106 | cur_feed_dict = tower.get_feed_dict(batch, mode, **kwargs) 107 | feed_dict.update(cur_feed_dict) 108 | return feed_dict 109 | 110 | def _train_batches(self, batches, **kwargs): 111 | sess = self.sess 112 | tensors = self.tensors 113 | feed_dict = self._get_feed_dict(batches, 'train', **kwargs) 114 | ops = [tensors[name] for name in ['train', 'summary', 'global_step']] 115 | train, summary, global_step = sess.run(ops, feed_dict=feed_dict) 116 | return train, summary, global_step 117 | 118 | def _eval_batches(self, batches, eval_tensor_names=()): 119 | sess = self.sess 120 | tensors = self.tensors 121 | num_examples = sum(len(batch[0]) for batch in batches) 122 | feed_dict = self._get_feed_dict(batches, 'eval') 123 | ops = [tensors[name] for name in ['correct', 'loss', 'summary', 'global_step']] 124 | correct, loss, summary, global_step = sess.run(ops, feed_dict=feed_dict) 125 | num_corrects = np.sum(correct[:num_examples]) 126 | if len(eval_tensor_names) > 0: 127 | valuess = [sess.run([tower.tensors[name] for name in eval_tensor_names], feed_dict=feed_dict) 128 | for tower in self.towers] 129 | else: 130 | valuess = [[]] 131 | 132 | return (num_corrects, loss, summary, 
global_step), valuess 133 | 134 | def train(self, train_data_set, val_data_set=None, eval_tensor_names=()): 135 | assert isinstance(train_data_set, DataSet) 136 | assert self.initialized, "Initialize tower before training." 137 | # TODO : allow partial batch 138 | 139 | sess = self.sess 140 | writer = self.writer 141 | params = self.params 142 | num_epochs = params.num_epochs 143 | num_batches = params.train_num_batches if params.train_num_batches >= 0 else train_data_set.get_num_batches(partial=False) 144 | num_iters_per_epoch = int(num_batches / self.num_towers) 145 | num_digits = int(np.log10(num_batches)) 146 | 147 | epoch_op = self.tensors['epoch'] 148 | epoch = sess.run(epoch_op) 149 | print("training %d epochs ... " % num_epochs) 150 | print("num iters per epoch: %d" % num_iters_per_epoch) 151 | print("starting from epoch %d." % (epoch+1)) 152 | while epoch < num_epochs: 153 | train_args = self._get_train_args(epoch) 154 | pbar = get_pbar(num_iters_per_epoch, "epoch %s|" % str(epoch+1).zfill(num_digits)).start() 155 | for iter_idx in range(num_iters_per_epoch): 156 | batches = [train_data_set.get_next_labeled_batch() for _ in range(self.num_towers)] 157 | _, summary, global_step = self._train_batches(batches, **train_args) 158 | writer.add_summary(summary, global_step) 159 | pbar.update(iter_idx) 160 | pbar.finish() 161 | train_data_set.complete_epoch() 162 | 163 | assign_op = epoch_op.assign_add(1) 164 | _, epoch = sess.run([assign_op, epoch_op]) 165 | 166 | if val_data_set and epoch % params.val_period == 0: 167 | self.eval(train_data_set, is_val=True, eval_tensor_names=eval_tensor_names) 168 | self.eval(val_data_set, is_val=True, eval_tensor_names=eval_tensor_names) 169 | 170 | if epoch % params.save_period == 0: 171 | self.save() 172 | 173 | def eval(self, data_set, is_val=False, eval_tensor_names=()): 174 | assert isinstance(data_set, DataSet) 175 | assert self.initialized, "Initialize tower before training." 
176 | 177 | params = self.params 178 | sess = self.sess 179 | epoch_op = self.tensors['epoch'] 180 | dn = data_set.get_num_batches(partial=True) 181 | if is_val: 182 | pn = params.val_num_batches 183 | num_batches = pn if 0 <= pn <= dn else dn 184 | else: 185 | pn = params.test_num_batches 186 | num_batches = pn if 0 <= pn <= dn else dn 187 | num_iters = int(np.ceil(num_batches / self.num_towers)) 188 | num_corrects, total = 0, 0 189 | eval_values = [] 190 | idxs = [] 191 | losses = [] 192 | N = data_set.batch_size * num_batches 193 | if N > data_set.num_examples: 194 | N = data_set.num_examples 195 | string = "eval on %s, N=%d|" % (data_set.name, N) 196 | pbar = get_pbar(num_iters, prefix=string).start() 197 | for iter_idx in range(num_iters): 198 | batches = [] 199 | for _ in range(self.num_towers): 200 | if data_set.has_next_batch(partial=True): 201 | idxs.extend(data_set.get_batch_idxs(partial=True)) 202 | batches.append(data_set.get_next_labeled_batch(partial=True)) 203 | (cur_num_corrects, cur_loss, _, global_step), eval_value_batches = \ 204 | self._eval_batches(batches, eval_tensor_names=eval_tensor_names) 205 | num_corrects += cur_num_corrects 206 | total += sum(len(batch[0]) for batch in batches) 207 | for eval_value_batch in eval_value_batches: 208 | eval_values.append([x.tolist() for x in eval_value_batch]) # numpy.ndarray.tolist 209 | losses.append(cur_loss) 210 | pbar.update(iter_idx) 211 | pbar.finish() 212 | loss = np.mean(losses) 213 | data_set.reset() 214 | 215 | epoch = sess.run(epoch_op) 216 | print("at epoch %d: acc = %.2f%% = %d / %d, loss = %.4f" % 217 | (epoch, 100 * float(num_corrects)/total, num_corrects, total, loss)) 218 | 219 | # For outputting eval json files 220 | ids = [data_set.idx2id[idx] for idx in idxs] 221 | zipped_eval_values = [list(itertools.chain(*each)) for each in zip(*eval_values)] 222 | values = {name: values for name, values in zip(eval_tensor_names, zipped_eval_values)} 223 | out = {'ids': ids, 'values': values} 224 | eval_path = os.path.join(params.eval_dir, "%s_%s.json" % (data_set.name, str(epoch).zfill(4))) 225 | json.dump(out, open(eval_path, 'w')) 226 | 227 | def _get_train_args(self, epoch_idx): 228 | params = self.params 229 | learning_rate = params.init_lr 230 | 231 | anneal_period = params.anneal_period 232 | anneal_ratio = params.anneal_ratio 233 | num_periods = int(epoch_idx / anneal_period) 234 | factor = anneal_ratio ** num_periods 235 | learning_rate *= factor 236 | 237 | train_args = {'learning_rate': learning_rate} 238 | return train_args 239 | 240 | def save(self): 241 | assert self.initialized, "Initialize tower before saving." 242 | 243 | sess = self.sess 244 | params = self.params 245 | save_dir = params.save_dir 246 | name = params.model_name 247 | global_step = self.tensors['global_step'] 248 | print("saving model ...") 249 | save_path = os.path.join(save_dir, name) 250 | self.saver.save(sess, save_path, global_step) 251 | print("saving done.") 252 | 253 | def load(self): 254 | assert self.initialized, "Initialize tower before loading." 
255 | 256 | sess = self.sess 257 | params = self.params 258 | save_dir = params.save_dir 259 | print("loading model ...") 260 | checkpoint = tf.train.get_checkpoint_state(save_dir) 261 | assert checkpoint is not None, "Cannot load checkpoint at %s" % save_dir 262 | self.saver.restore(sess, checkpoint.model_checkpoint_path) 263 | print("loading done.") 264 | 265 | 266 | class BaseTower(object): 267 | def __init__(self, params): 268 | self.params = params 269 | self.placeholders = {} 270 | self.tensors = {} 271 | self.default_initializer = tf.random_normal_initializer(params.init_mean, params.init_std) 272 | 273 | def initialize(self, scope): 274 | # Actual building 275 | # Separated so that GPU assignment can be done here. 276 | raise Exception("Implement this!") 277 | 278 | def get_correct_tensor(self): 279 | return self.tensors['correct'] 280 | 281 | def get_loss_tensor(self): 282 | return self.tensors['loss'] 283 | 284 | def get_feed_dict(self, batch, mode, **kwargs): 285 | raise Exception("Implement this!") 286 | -------------------------------------------------------------------------------- /models/m05.py: -------------------------------------------------------------------------------- 1 | from functools import reduce 2 | from operator import mul 3 | 4 | import numpy as np 5 | import tensorflow as tf 6 | from tensorflow.python.ops import rnn 7 | 8 | from models.bm05 import BaseTower, BaseRunner 9 | import my.rnn_cell 10 | import my.nn 11 | 12 | 13 | 14 | class Sentence(object): 15 | def __init__(self, shape, name='sentence'): 16 | self.name = name 17 | self.shape = shape 18 | self.x = tf.placeholder('int32', shape, name="%s" % name) 19 | self.x_mask = tf.placeholder('float', shape, name="%s_mask" % name) 20 | self.x_len = tf.placeholder('int16', shape[:-1], name="%s_len" % name) 21 | self.x_mask_aug = tf.expand_dims(self.x_mask, -1, name='%s_mask_aug' % name) 22 | 23 | def add(self, feed_dict, *batch): 24 | x, x_mask, x_len = batch 25 | feed_dict[self.x] = x 26 | feed_dict[self.x_mask] = x_mask 27 | feed_dict[self.x_len] = x_len 28 | 29 | 30 | class Memory(Sentence): 31 | def __init__(self, params, name='memory'): 32 | N, M, K = params.batch_size, params.max_num_facts, params.max_fact_size 33 | shape = [N, M, K] 34 | super(Memory, self).__init__(shape, name=name) 35 | self.m_mask = tf.placeholder('float', [N, M], name='m_mask') 36 | 37 | def add(self, feed_dict, *batch): 38 | x, x_mask, x_len, m_mask = batch 39 | super(Memory, self).add(feed_dict, x, x_mask, x_len) 40 | feed_dict[self.m_mask] = m_mask 41 | 42 | 43 | class PESentenceEncoder(object): 44 | def __init__(self, params, emb_mat): 45 | self.params = params 46 | V, d, L, e = params.vocab_size, params.hidden_size, params.rnn_num_layers, params.word_size 47 | # self.init_emb_mat = tf.get_variable("init_emb_mat", [self.V, self.d]) 48 | emb_hidden_sizes = [d for _ in range(params.emb_num_layers)] 49 | prev_size = e 50 | for layer_idx in range(params.emb_num_layers): 51 | with tf.variable_scope("Ax_%d" % layer_idx): 52 | cur_size = emb_hidden_sizes[layer_idx] 53 | mat = tf.get_variable("mat_%d" % layer_idx, shape=[prev_size, cur_size]) 54 | bias = tf.get_variable("bias_%d" % layer_idx, shape=[cur_size]) 55 | emb_mat = tf.tanh(tf.matmul(emb_mat, mat) + bias) 56 | self.emb_mat = emb_mat # [V, d] 57 | 58 | def __call__(self, sentence, name='u'): 59 | assert isinstance(sentence, Sentence) 60 | params = self.params 61 | d, e = params.hidden_size, params.word_size 62 | J = sentence.shape[-1] 63 | 64 | def f(JJ, jj, dd, kk): 65 | return 
(1-float(jj)/JJ) - (float(kk)/dd)*(1-2.0*jj/JJ) 66 | 67 | def g(jj): 68 | return [f(J, jj, d, k) for k in range(d)] 69 | 70 | _l = [g(j) for j in range(J)] 71 | self.l = tf.constant(_l, shape=[J, d], name='l') 72 | assert isinstance(sentence, Sentence) 73 | Ax = tf.nn.embedding_lookup(self.emb_mat, sentence.x, name='Ax') 74 | # TODO : dimension transformation 75 | lAx = self.l * Ax 76 | lAx_masked = lAx * tf.expand_dims(sentence.x_mask, -1) 77 | m = tf.reduce_sum(lAx_masked, len(sentence.shape) - 1, name=name) 78 | return m 79 | 80 | 81 | class MeanEncoder(object): 82 | def __init__(self, params, emb_mat): 83 | self.params = params 84 | V, d, L, e = params.vocab_size, params.hidden_size, params.rnn_num_layers, params.word_size 85 | prev_size = e 86 | hidden_sizes = [d for _ in range(params.emb_num_layers)] 87 | for layer_idx in range(params.emb_num_layers): 88 | with tf.variable_scope("emb_%d" % layer_idx): 89 | cur_hidden_size = hidden_sizes[layer_idx] 90 | emb_mat = tf.tanh(my.nn.linear([V, prev_size], cur_hidden_size, emb_mat)) 91 | prev_size = cur_hidden_size 92 | self.emb_mat = emb_mat 93 | 94 | def __call__(self, sentence, name='mean'): 95 | assert isinstance(sentence, Sentence) 96 | Ax = tf.nn.embedding_lookup(self.emb_mat, sentence.x) # [N, C, J, e] 97 | return tf.reduce_mean(Ax * sentence.x_mask_aug, len(sentence.shape)-1, name=name) 98 | 99 | 100 | class LSTMSentenceEncoder(object): 101 | def __init__(self, params, emb_mat): 102 | self.params = params 103 | V, d, L, e = params.vocab_size, params.hidden_size, params.rnn_num_layers, params.word_size 104 | prev_size = e 105 | hidden_sizes = [d for _ in range(params.emb_num_layers)] 106 | for layer_idx in range(params.emb_num_layers): 107 | with tf.variable_scope("emb_%d" % layer_idx): 108 | cur_hidden_size = hidden_sizes[layer_idx] 109 | emb_mat = tf.tanh(my.nn.linear([V, prev_size], cur_hidden_size, emb_mat)) 110 | prev_size = cur_hidden_size 111 | self.emb_mat = emb_mat 112 | 113 | self.emb_hidden_sizes = [d for _ in range(params.emb_num_layers)] 114 | self.input_size = self.emb_hidden_sizes[-1] if self.emb_hidden_sizes else e 115 | 116 | if params.lstm == 'basic': 117 | self.first_cell = my.rnn_cell.BasicLSTMCell(d, input_size=self.input_size, forget_bias=params.forget_bias) 118 | self.second_cell = my.rnn_cell.BasicLSTMCell(d, forget_bias=params.forget_bias) 119 | elif params.lstm == 'regular': 120 | self.first_cell = tf.nn.rnn_cell.LSTMCell(d, self.input_size, cell_clip=params.cell_clip) 121 | self.second_cell = tf.nn.rnn_cell.LSTMCell(d, d, cell_clip=params.cell_clip) 122 | elif params.lstm == 'gru': 123 | self.first_cell = tf.nn.rnn_cell.GRUCell(d, input_size=self.input_size) 124 | self.second_cell = tf.nn.rnn_cell.GRUCell(d) 125 | else: 126 | raise Exception() 127 | 128 | if params.train and params.keep_prob < 1.0: 129 | self.first_cell = tf.nn.rnn_cell.DropoutWrapper(self.first_cell, input_keep_prob=params.keep_prob) 130 | self.cell = tf.nn.rnn_cell.MultiRNNCell([self.first_cell] + [self.second_cell] * (L-1)) 131 | self.scope = tf.get_variable_scope() 132 | self.used = False 133 | 134 | def __call__(self, sentence, init_hidden_state=None, name='s'): 135 | params = self.params 136 | L, d = params.rnn_num_layers, params.hidden_size 137 | h_flat = self.get_last_hidden_state(sentence, init_hidden_state=init_hidden_state) 138 | if params.lstm in ['basic', 'regular']: 139 | h_last = tf.reshape(h_flat, sentence.shape[:-1] + [2*L*d]) 140 | s = tf.identity(tf.split(2, 2*L, h_last)[2*L-1], name=name) 141 | elif params.lstm == 'gru': 
142 | h_last = tf.reshape(h_flat, sentence.shape[:-1] + [L*d]) 143 | s = tf.identity(tf.split(2, L, h_last)[L-1], name=name) 144 | else: 145 | raise Exception() 146 | return s 147 | 148 | def get_last_hidden_state(self, sentence, init_hidden_state=None): 149 | assert isinstance(sentence, Sentence) 150 | with tf.variable_scope(self.scope, reuse=self.used): 151 | J = sentence.shape[-1] 152 | Ax = tf.nn.embedding_lookup(self.emb_mat, sentence.x) # [N, C, J, e] 153 | 154 | F = reduce(mul, sentence.shape[:-1], 1) 155 | init_hidden_state = init_hidden_state or self.cell.zero_state(F, tf.float32) 156 | Ax_flat = tf.reshape(Ax, [F, J, self.input_size]) 157 | x_len_flat = tf.reshape(sentence.x_len, [F]) 158 | 159 | # Ax_flat_split = [tf.squeeze(x_flat_each, [1]) for x_flat_each in tf.split(1, J, Ax_flat)] 160 | o_flat, h_flat = rnn.dynamic_rnn(self.cell, Ax_flat, x_len_flat, initial_state=init_hidden_state) 161 | self.used = True 162 | return h_flat 163 | 164 | 165 | class Sim(object): 166 | def __init__(self, params, memory, encoder, u): 167 | N, C, R, d = params.batch_size, params.num_choices, params.max_num_facts, params.hidden_size 168 | f = encoder(memory, name='f') 169 | f_aug = tf.expand_dims(f, 1) # [N, 1, R, d] 170 | u_aug = tf.expand_dims(u, 2) # [N, C, 1, d] 171 | u_tiled = tf.tile(u_aug, [1, 1, R, 1]) 172 | if params.sim_func == 'man_sim': 173 | uf = my.nn.man_sim([N, C, R, d], f_aug, u_tiled, name='uf') # [N, C, R] 174 | elif params.sim_func == 'dot': 175 | uf = tf.reduce_sum(u_tiled * f_aug, 3) 176 | else: 177 | raise Exception() 178 | logit = tf.reduce_max(uf, 2) # [N, C] 179 | 180 | f_mask_aug = tf.expand_dims(memory.m_mask, 1) 181 | p = my.nn.softmax_with_mask([N, C, R], uf, f_mask_aug, name='p') 182 | self.logit = logit 183 | self.p = p 184 | 185 | 186 | class Tower(BaseTower): 187 | def initialize(self, scope): 188 | params = self.params 189 | tensors = self.tensors 190 | placeholders = self.placeholders 191 | 192 | V, d, G = params.vocab_size, params.hidden_size, params.image_size 193 | N, C, J = params.batch_size, params.num_choices, params.max_sent_size 194 | e = params.word_size 195 | 196 | # initialize self 197 | # placeholders 198 | with tf.name_scope('ph'): 199 | s = Sentence([N, C, J], 's') 200 | f = Memory(params, 'f') 201 | image = tf.placeholder('float', [N, G], name='i') 202 | y = tf.placeholder('int8', [N, C], name='y') 203 | init_emb_mat = tf.placeholder('float', shape=[V, e], name='init_emb_mat') 204 | placeholders['s'] = s 205 | placeholders['f'] = f 206 | placeholders['image'] = image 207 | placeholders['y'] = y 208 | placeholders['init_emb_mat'] = init_emb_mat 209 | 210 | with tf.variable_scope('encoder'): 211 | if params.encoder == 'lstm': 212 | u_encoder = LSTMSentenceEncoder(params, init_emb_mat) 213 | elif params.encoder == 'mean': 214 | u_encoder = MeanEncoder(params, init_emb_mat) 215 | else: 216 | raise Exception("Invalid encoder: {}".format(params.encoder)) 217 | # u_encoder = PESentenceEncoder(params, init_emb_mat) 218 | first_u = u_encoder(s, name='first_u') 219 | 220 | with tf.name_scope("main"): 221 | sim = Sim(params, f, u_encoder, first_u) 222 | tensors['p'] = sim.p 223 | if params.mode == 'dqanet': 224 | logit = sim.logit 225 | elif params.mode == 'vqa': 226 | image_trans_mat = tf.get_variable('I', shape=[G, d]) 227 | image_trans_bias = tf.get_variable('bI', shape=[]) 228 | g = tf.tanh(tf.matmul(image, image_trans_mat) + image_trans_bias, name='g') # [N, d] 229 | aug_g = tf.expand_dims(g, 2, name='aug_g') # [N, d, 1] 230 | logit = 
tf.squeeze(tf.batch_matmul(first_u, aug_g), [2]) # [N, C] 231 | else: 232 | raise Exception("Invalid mode: {}".format(params.mode)) 233 | tensors['logit'] = logit 234 | 235 | with tf.variable_scope('yp'): 236 | yp = tf.nn.softmax(logit, name='yp') # [N, C] 237 | tensors['yp'] = yp 238 | 239 | with tf.name_scope('loss'): 240 | cross_entropy = tf.nn.softmax_cross_entropy_with_logits(logit, tf.cast(y, 'float'), name='cross_entropy') 241 | avg_cross_entropy = tf.reduce_mean(cross_entropy, 0, name='avg_cross_entropy') 242 | tf.add_to_collection('losses', avg_cross_entropy) 243 | loss = tf.add_n(tf.get_collection('losses', scope), name='loss') 244 | tensors['loss'] = loss 245 | 246 | with tf.name_scope('acc'): 247 | correct_vec = tf.equal(tf.argmax(yp, 1), tf.argmax(y, 1)) 248 | num_corrects = tf.reduce_sum(tf.cast(correct_vec, 'float'), name='num_corrects') 249 | acc = tf.reduce_mean(tf.cast(correct_vec, 'float'), name='acc') 250 | tensors['correct'] = correct_vec 251 | tensors['num_corrects'] = num_corrects 252 | tensors['acc'] = acc 253 | 254 | def get_feed_dict(self, batch, mode, **kwargs): 255 | placeholders = self.placeholders 256 | if batch is None: 257 | assert mode != 'train', "Cannot pass empty batch during training, for now." 258 | sents_batch, facts_batch, images_batch, label_batch = None, None, None, None 259 | else: 260 | sents_batch, facts_batch, images_batch = batch[:-1] 261 | if len(batch) > 3: 262 | label_batch = batch[-1] 263 | else: 264 | label_batch = np.zeros([len(sents_batch)]) 265 | s = self._prepro_sents_batch(sents_batch) # [N, C, J], [N, C] 266 | f = self._prepro_facts_batch(facts_batch) 267 | g = self._prepro_images_batch(images_batch) 268 | feed_dict = {placeholders['image']: g, placeholders['init_emb_mat']: self.params.init_emb_mat} 269 | if mode == 'train': 270 | y_batch = self._prepro_label_batch(label_batch) 271 | elif mode == 'eval': 272 | y_batch = self._prepro_label_batch(label_batch) 273 | else: 274 | raise Exception() 275 | feed_dict[placeholders['y']] = y_batch 276 | placeholders['s'].add(feed_dict, *s) 277 | placeholders['f'].add(feed_dict, *f) 278 | return feed_dict 279 | 280 | def _prepro_images_batch(self, images_batch): 281 | params = self.params 282 | N, G = params.batch_size, params.image_size 283 | g = np.zeros([N, G]) 284 | if images_batch is None: 285 | return g 286 | g[:len(images_batch)] = images_batch 287 | return g 288 | 289 | def _prepro_sents_batch(self, sents_batch): 290 | p = self.params 291 | N, C, J = p.batch_size, p.num_choices, p.max_sent_size 292 | s_batch = np.zeros([N, C, J], dtype='int32') 293 | s_mask_batch = np.zeros([N, C, J], dtype='float') 294 | s_len_batch = np.zeros([N, C], dtype='int16') 295 | out = s_batch, s_mask_batch, s_len_batch 296 | if sents_batch is None: 297 | return out 298 | for n, sents in enumerate(sents_batch): 299 | for c, sent in enumerate(sents): 300 | for j, idx in enumerate(sent): 301 | s_batch[n, c, j] = idx 302 | s_mask_batch[n, c, j] = 1.0 303 | s_len_batch[n, c] = len(sent) 304 | 305 | return out 306 | 307 | def _prepro_facts_batch(self, facts_batch): 308 | p = self.params 309 | N, M, K = p.batch_size, p.max_num_facts, p.max_fact_size 310 | s_batch = np.zeros([N, M, K], dtype='int32') 311 | s_mask_batch = np.zeros([N, M, K], dtype='float') 312 | s_len_batch = np.zeros([N, M], dtype='int16') 313 | m_mask_batch = np.zeros([N, M], dtype='float') 314 | out = s_batch, s_mask_batch, s_len_batch, m_mask_batch 315 | if facts_batch is None: 316 | return out 317 | for n, sents in enumerate(facts_batch): 318 | 
for m, sent in enumerate(sents): 319 | for k, idx in enumerate(sent): 320 | s_batch[n, m, k] = idx 321 | s_mask_batch[n, m, k] = 1.0 322 | s_len_batch[n, m] = len(sent) 323 | m_mask_batch[n, m] = 1.0 324 | return out 325 | 326 | def _prepro_label_batch(self, label_batch): 327 | p = self.params 328 | N, C = p.batch_size, p.num_choices 329 | y = np.zeros([N, C], dtype='float') 330 | if label_batch is None: 331 | return y 332 | for i, label in enumerate(label_batch): 333 | y[i, label] = np.random.rand() * self.params.rand_y 334 | rand_other = (1.0 - self.params.rand_y)/(C-1) 335 | for cur in range(C): 336 | if cur != label: 337 | y[i, cur] = np.random.rand() * rand_other 338 | y[i] = y[i] / sum(y[i]) 339 | 340 | return y 341 | 342 | 343 | class Runner(BaseRunner): 344 | def _get_train_args(self, epoch_idx): 345 | params = self.params 346 | learning_rate = params.init_lr 347 | 348 | anneal_period = params.anneal_period 349 | anneal_ratio = params.anneal_ratio 350 | num_periods = int(epoch_idx / anneal_period) 351 | factor = anneal_ratio ** num_periods 352 | 353 | if params.opt == 'basic': 354 | learning_rate *= factor 355 | 356 | train_args = {'learning_rate': learning_rate} 357 | return train_args 358 | -------------------------------------------------------------------------------- /my/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/allenai/dqa-net/c497e950017e69952deae3ffd07920a5bfb9f468/my/__init__.py -------------------------------------------------------------------------------- /my/nn.py: -------------------------------------------------------------------------------- 1 | """ 2 | useful neural net modules 3 | """ 4 | import operator 5 | from operator import mul 6 | from functools import reduce 7 | 8 | import tensorflow as tf 9 | 10 | VERY_SMALL_NUMBER = -1e10 11 | 12 | 13 | def softmax_with_mask(shape, x, mask, name=None): 14 | if name is None: 15 | name = softmax_with_mask.__name__ 16 | x_masked = x + VERY_SMALL_NUMBER * (1.0 - mask) 17 | x_flat = tf.reshape(x_masked, [reduce(mul, shape[:-1], 1), shape[-1]]) 18 | p_flat = tf.nn.softmax(x_flat) 19 | p = tf.reshape(p_flat, shape, name=name) 20 | return p 21 | 22 | 23 | def softmax_with_base(shape, base_untiled, x, mask=None, name='sig'): 24 | if mask is not None: 25 | x += VERY_SMALL_NUMBER * (1.0 - mask) 26 | base_shape = shape[:-1] + [1] 27 | for _ in shape: 28 | base_untiled = tf.expand_dims(base_untiled, -1) 29 | base = tf.tile(base_untiled, base_shape) 30 | 31 | c_shape = shape[:-1] + [shape[-1] + 1] 32 | c = tf.concat(len(shape)-1, [base, x]) 33 | c_flat = tf.reshape(c, [reduce(mul, shape[:-1], 1), c_shape[-1]]) 34 | p_flat = tf.nn.softmax(c_flat) 35 | p_cat = tf.reshape(p_flat, c_shape) 36 | s_aug = tf.slice(p_cat, [0 for _ in shape], [i for i in shape[:-1]] + [1]) 37 | s = tf.squeeze(s_aug, [len(shape)-1]) 38 | sig = tf.sub(1.0, s, name="sig") 39 | p = tf.slice(p_cat, [0 for _ in shape[:-1]] + [1], shape) 40 | return sig, p 41 | 42 | 43 | def man_sim(shape, u, v, name='man_sim'): 44 | """ 45 | Manhattan similarity 46 | https://pdfs.semanticscholar.org/6812/fb9ef1c2dad497684a9020d8292041a639ff.pdf 47 | :param shape: 48 | :param u: 49 | :param v: 50 | :param name: 51 | :return: 52 | """ 53 | dist = tf.reduce_sum(tf.abs(u - v), len(shape)-1) 54 | sim = tf.sub(0.0, dist, name=name) 55 | return sim 56 | 57 | 58 | def linear(input_shape, output_dim, input_, name="linear"): 59 | a = input_shape[-1] 60 | b = output_dim 61 | input_flat = tf.reshape(input_, 
[reduce(operator.mul, input_shape[:-1], 1), a]) 62 | with tf.variable_scope(name): 63 | mat = tf.get_variable("mat", shape=[a, b]) 64 | bias = tf.get_variable("bias", shape=[b]) 65 | out_flat = tf.matmul(input_flat, mat) + bias 66 | out = tf.reshape(out_flat, input_shape[:-1] + [b]) 67 | return out 68 | 69 | 70 | -------------------------------------------------------------------------------- /my/rnn_cell.py: -------------------------------------------------------------------------------- 1 | from __future__ import absolute_import 2 | from __future__ import division 3 | from __future__ import print_function 4 | 5 | import tensorflow as tf 6 | from tensorflow.python.ops.rnn_cell import RNNCell 7 | 8 | 9 | def linear(args, output_size, bias, bias_start=0.0, scope=None, var_on_cpu=True, wd=0.0): 10 | """Linear map: sum_i(args[i] * W[i]), where W[i] is a variable. 11 | 12 | Args: 13 | args: a 2D Tensor or a list of 2D, batch x n, Tensors. 14 | output_size: int, second dimension of W[i]. 15 | bias: boolean, whether to add a bias term or not. 16 | bias_start: starting value to initialize the bias; 0 by default. 17 | scope: VariableScope for the created subgraph; defaults to "Linear". 18 | var_on_cpu: if True, put the variables on /cpu:0. 19 | 20 | Returns: 21 | A 2D Tensor with shape [batch x output_size] equal to 22 | sum_i(args[i] * W[i]), where W[i]s are newly created matrices. 23 | 24 | Raises: 25 | ValueError: if some of the arguments has unspecified or wrong shape. 26 | """ 27 | assert args 28 | if not isinstance(args, (list, tuple)): 29 | args = [args] 30 | 31 | # Calculate the total size of arguments on dimension 1. 32 | total_arg_size = 0 33 | shapes = [a.get_shape().as_list() for a in args] 34 | for shape in shapes: 35 | if len(shape) != 2: 36 | raise ValueError("Linear is expecting 2D arguments: %s" % str(shapes)) 37 | if not shape[1]: 38 | raise ValueError("Linear expects shape[1] of arguments: %s" % str(shapes)) 39 | else: 40 | total_arg_size += shape[1] 41 | 42 | # Now the computation. 43 | with tf.variable_scope(scope or "Linear"): 44 | if var_on_cpu: 45 | with tf.device("/cpu:0"): 46 | matrix = tf.get_variable("Matrix", [total_arg_size, output_size]) 47 | else: 48 | matrix = tf.get_variable("Matrix", [total_arg_size, output_size]) 49 | if wd: 50 | weight_decay = tf.mul(tf.nn.l2_loss(matrix), wd, name='weight_loss') 51 | tf.add_to_collection('losses', weight_decay) 52 | 53 | 54 | if len(args) == 1: 55 | res = tf.matmul(args[0], matrix) 56 | else: 57 | res = tf.matmul(tf.concat(1, args), matrix) 58 | if not bias: 59 | return res 60 | 61 | if var_on_cpu: 62 | with tf.device("/cpu:0"): 63 | bias_term = tf.get_variable( 64 | "Bias", [output_size], 65 | initializer=tf.constant_initializer(bias_start)) 66 | else: 67 | bias_term = tf.get_variable( 68 | "Bias", [output_size], 69 | initializer=tf.constant_initializer(bias_start)) 70 | return res + bias_term 71 | 72 | 73 | class BasicLSTMCell(RNNCell): 74 | """Basic LSTM recurrent network cell. 75 | 76 | The implementation is based on: http://arxiv.org/abs/1409.2329. 77 | 78 | We add forget_bias (default: 1) to the biases of the forget gate in order to 79 | reduce the scale of forgetting in the beginning of the training. 80 | 81 | It does not allow cell clipping, a projection layer, and does not 82 | use peep-hole connections: it is the basic baseline. 83 | 84 | For advanced models, please use the full LSTMCell that follows. 
85 | """ 86 | 87 | def __init__(self, num_units, forget_bias=1.0, input_size=None, var_on_cpu=True, wd=0.0): 88 | """Initialize the basic LSTM cell. 89 | 90 | Args: 91 | num_units: int, The number of units in the LSTM cell. 92 | forget_bias: float, The bias added to forget gates (see above). 93 | input_size: int, The dimensionality of the inputs into the LSTM cell, 94 | by default equal to num_units. 95 | """ 96 | self._num_units = num_units 97 | self._input_size = num_units if input_size is None else input_size 98 | self._forget_bias = forget_bias 99 | self.var_on_cpu = var_on_cpu 100 | self.wd = wd 101 | 102 | @property 103 | def input_size(self): 104 | return self._input_size 105 | 106 | @property 107 | def output_size(self): 108 | return self._num_units 109 | 110 | @property 111 | def state_size(self): 112 | return 2 * self._num_units 113 | 114 | def __call__(self, inputs, state, name_scope=None): 115 | """Long short-term memory cell (LSTM).""" 116 | with tf.variable_scope(name_scope or type(self).__name__): # "BasicLSTMCell" 117 | # Parameters of gates are concatenated into one multiply for efficiency. 118 | c, h = tf.split(1, 2, state) 119 | concat = linear([inputs, h], 4 * self._num_units, True, var_on_cpu=self.var_on_cpu, wd=self.wd) 120 | 121 | # i = input_gate, j = new_input, f = forget_gate, o = output_gate 122 | i, j, f, o = tf.split(1, 4, concat) 123 | 124 | new_c = c * tf.sigmoid(f + self._forget_bias) + tf.sigmoid(i) * tf.tanh(j) 125 | new_h = tf.tanh(new_c) * tf.sigmoid(o) 126 | 127 | return new_h, tf.concat(1, [new_c, new_h]) 128 | -------------------------------------------------------------------------------- /my/tensorflow.py: -------------------------------------------------------------------------------- 1 | import tensorflow as tf 2 | 3 | 4 | def _variable_on_cpu(name, shape, initializer): 5 | """Helper to create a Variable stored on CPU memory. 6 | 7 | Args: 8 | name: name of the variable 9 | shape: list of ints 10 | initializer: initializer for Variable 11 | 12 | Returns: 13 | Variable Tensor 14 | """ 15 | with tf.device('/cpu:0'): 16 | var = tf.get_variable(name, shape, initializer=initializer) 17 | return var 18 | 19 | 20 | def _variable_with_weight_decay(name, shape, stddev, wd): 21 | """Helper to create an initialized Variable with weight decay. 22 | 23 | Note that the Variable is initialized with a truncated normal distribution. 24 | A weight decay is added only if one is specified. 25 | 26 | Args: 27 | name: name of the variable 28 | shape: list of ints 29 | stddev: standard deviation of a truncated Gaussian 30 | wd: add L2Loss weight decay multiplied by this float. If None, weight 31 | decay is not added for this Variable. 32 | 33 | Returns: 34 | Variable Tensor 35 | """ 36 | var = _variable_on_cpu(name, shape, 37 | tf.truncated_normal_initializer(stddev=stddev)) 38 | if wd: 39 | weight_decay = tf.mul(tf.nn.l2_loss(var), wd, name='weight_loss') 40 | tf.add_to_collection('losses', weight_decay) 41 | return var 42 | 43 | 44 | def average_gradients(tower_grads): 45 | """Calculate the average gradient for each shared variable across all towers. 46 | 47 | Note that this function provides a synchronization point across all towers. 48 | 49 | Args: 50 | tower_grads: List of lists of (gradient, variable) tuples. The outer list 51 | is over individual gradients. The inner list is over the gradient 52 | calculation for each tower. 53 | Returns: 54 | List of pairs of (gradient, variable) where the gradient has been averaged 55 | across all towers. 
56 | """ 57 | average_grads = [] 58 | for grad_and_vars in zip(*tower_grads): 59 | # Note that each grad_and_vars looks like the following: 60 | # ((grad0_gpu0, var0_gpu0), ... , (grad0_gpuN, var0_gpuN)) 61 | grads = [] 62 | for g, _ in grad_and_vars: 63 | # Add 0 dimension to the gradients to represent the tower. 64 | assert g is not None 65 | expanded_g = tf.expand_dims(g, 0) 66 | 67 | # Append on a 'tower' dimension which we will average over below. 68 | grads.append(expanded_g) 69 | 70 | # Average over the 'tower' dimension. 71 | grad = tf.concat(0, grads) 72 | grad = tf.reduce_mean(grad, 0) 73 | 74 | # Keep in mind that the Variables are redundant because they are shared 75 | # across towers. So .. we will just return the first tower's pointer to 76 | # the Variable. 77 | v = grad_and_vars[0][1] 78 | grad_and_var = (grad, v) 79 | average_grads.append(grad_and_var) 80 | return average_grads 81 | -------------------------------------------------------------------------------- /prepro/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/allenai/dqa-net/c497e950017e69952deae3ffd07920a5bfb9f468/prepro/__init__.py -------------------------------------------------------------------------------- /prepro/__init__.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/allenai/dqa-net/c497e950017e69952deae3ffd07920a5bfb9f468/prepro/__init__.pyc -------------------------------------------------------------------------------- /prepro/p05.py: -------------------------------------------------------------------------------- 1 | import argparse 2 | import os 3 | import json 4 | import shutil 5 | from collections import defaultdict, namedtuple 6 | import re 7 | import sys 8 | import random 9 | from pprint import pprint 10 | 11 | import h5py 12 | import numpy as np 13 | 14 | from utils import get_pbar 15 | 16 | 17 | def get_args(): 18 | parser = argparse.ArgumentParser() 19 | parser.add_argument("--data_dir", default="/home/anglil/data/dqa/shining3") 20 | parser.add_argument("--target_dir", default="data/s3") 21 | parser.add_argument("--glove_path", default="/home/anglil/models/glove/glove.6B.300d.txt") 22 | parser.add_argument("--min_count", type=int, default=5) 23 | parser.add_argument("--vgg_model_path", default="~/models/vgg/vgg-19.caffemodel") 24 | parser.add_argument("--vgg_proto_path", default="~/models/vgg/vgg-19.prototxt") 25 | parser.add_argument("--debug", default='False') 26 | parser.add_argument("--qa2hypo", default='True') 27 | parser.add_argument("--qa2hypo_path", default="../dqa/qa2hypo") 28 | parser.add_argument("--prepro_images", default='True') 29 | return parser.parse_args() 30 | 31 | 32 | def qa2hypo(question, answer, flag, qa2hypo_path): 33 | if flag == 'True': 34 | # add qa2hypo_path to the Python path at runtime 35 | sys.path.insert(0, qa2hypo_path) 36 | sys.path.insert(0, qa2hypo_path+'/stanford-corenlp-python') 37 | from qa2hypo import qa2hypo as f 38 | return f(question, answer, False, True) 39 | # attach the answer to the question 40 | return "%s %s" % (question, answer) 41 | 42 | 43 | def _tokenize(raw): 44 | tokens = tuple(re.findall(r"[\w]+", raw)) 45 | return tokens 46 | 47 | 48 | def _vadd(vocab_counter, word): 49 | word = word.lower() 50 | vocab_counter[word] += 1 51 | 52 | 53 | def _vget(vocab_dict, word): 54 | word = word.lower() 55 | if word in vocab_dict: 56 | return vocab_dict[word] 57 | else: 58 | return 0 59 | 60 | 61 | def 
_vlup(vocab_dict, words): 62 | return tuple(_vget(vocab_dict, word) for word in words) 63 | 64 | 65 | def _get(id_map, key): 66 | return id_map[key] if key in id_map else None 67 | 68 | 69 | def rel2text(id_map, rel): 70 | """ 71 | Obtain text facts from the relation class. 72 | :param id_map: 73 | :param rel: 74 | :return: 75 | """ 76 | TEMPLATES = ["%s links to %s.", 77 | "there is %s.", 78 | "the title is %s.", 79 | "%s describes region.", 80 | "there are %s %s.", 81 | "arrows objects regions 0 1 2 3 4 5 6 7 8 9", 82 | "% s and %s are related."] 83 | MAX_LABEL_SIZE = 3 84 | tup = rel[:3] 85 | o_keys, d_keys = rel[3:] 86 | if tup == ('interObject', 'linkage', 'objectToObject'): 87 | template = TEMPLATES[0] 88 | o = _get(id_map, o_keys[0]) if len(o_keys) else None 89 | d = _get(id_map, d_keys[0]) if len(d_keys) else None 90 | if not (o and d): 91 | return None 92 | o_words = _tokenize(o) 93 | d_words = _tokenize(d) 94 | if len(o_words) > MAX_LABEL_SIZE: 95 | o = "object" 96 | if len(d_words) > MAX_LABEL_SIZE: 97 | d = "object" 98 | text = template % (o, d) 99 | return text 100 | 101 | elif tup == ('intraObject', 'linkage', 'regionDescription'): 102 | template = TEMPLATES[3] 103 | o = _get(id_map, o_keys[0]) if len(o_keys) else None 104 | o = o or "an object" 105 | o_words = _tokenize(o) 106 | if len(o_words) > MAX_LABEL_SIZE: 107 | o = "an object" 108 | text = template % o 109 | return text 110 | 111 | elif tup == ('unary', '', 'regionDescriptionNoArrow'): 112 | template = TEMPLATES[3] 113 | o = _get(id_map, o_keys[0]) if len(o_keys) else None 114 | o = o or "an object" 115 | o_words = _tokenize(o) 116 | if len(o_words) > MAX_LABEL_SIZE: 117 | o = "an object" 118 | text = template % o 119 | return text 120 | 121 | elif tup[0] == 'unary' and tup[2] in ['objectLabel', 'ownObject']: 122 | template = TEMPLATES[1] 123 | val =_get(id_map, o_keys[0]) 124 | if val is not None: 125 | words = _tokenize(val) 126 | if len(words) > MAX_LABEL_SIZE: 127 | return val 128 | else: 129 | return template % val 130 | 131 | elif tup == ('unary', '', 'regionLabel'): 132 | template = TEMPLATES[1] 133 | val =_get(id_map, o_keys[0]) 134 | if val is not None: 135 | words = _tokenize(val) 136 | if len(words) > MAX_LABEL_SIZE: 137 | return val 138 | else: 139 | return template % val 140 | 141 | elif tup == ('unary', '', 'imageTitle'): 142 | template = TEMPLATES[2] 143 | val = _get(id_map, o_keys[0]) 144 | return template % val 145 | 146 | elif tup == ('unary', '', 'sectionTitle'): 147 | template = TEMPLATES[2] 148 | val = _get(id_map, o_keys[0]) 149 | return template % val 150 | 151 | elif tup[0] == 'count': 152 | template = TEMPLATES[4] 153 | category = tup[2] 154 | num = str(o_keys) 155 | return template % (num, category) 156 | 157 | elif tup[0] == 'unary': 158 | val = _get(id_map, o_keys[0]) 159 | return val 160 | 161 | return None 162 | 163 | 164 | Relation = namedtuple('Relation', 'type subtype category origin destination') 165 | categories = set() 166 | 167 | 168 | def anno2rels(anno): 169 | types = set() 170 | rels = [] 171 | # Unary relations 172 | for text_id, d in anno['text'].items(): 173 | category = d['category'] if 'category' in d else '' 174 | categories.add(category) 175 | rel = Relation('unary', '', category, [text_id], '') 176 | rels.append(rel) 177 | 178 | # Counting 179 | if 'arrows' in anno and len(anno['arrows']) > 0: 180 | rels.append(Relation('count', '', 'stages', len(anno['arrows']), '')) 181 | if 'objects' in anno and len(anno['objects']) > 0: 182 | rels.append(Relation('count', '', 
'objects', len(anno['objects']), '')) 183 | 184 | if 'relationships' not in anno: 185 | return rels 186 | for type_, d in anno['relationships'].items(): 187 | for subtype, dd in d.items(): 188 | for rel_id, ddd in dd.items(): 189 | category = ddd['category'] 190 | origin = ddd['origin'] if 'origin' in ddd else "" 191 | destination = ddd['destination'] if 'destination' in ddd else "" 192 | rel = Relation(type_, subtype, category, origin, destination) 193 | rels.append(rel) 194 | types.add((type_, subtype, category)) 195 | return rels 196 | 197 | 198 | def _get_id_map(anno): 199 | id_map = {} 200 | if 'text' in anno: 201 | for key, d in anno['text'].items(): 202 | id_map[key] = d['value'] 203 | if 'objects' in anno: 204 | for key, d in anno['objects'].items(): 205 | if 'text' in d and len(d['text']) > 0: 206 | new_key = d['text'][0] 207 | id_map[key] = id_map[new_key] 208 | if 'relationships' in anno: 209 | d = anno['relationships'] 210 | if 'intraOjbect' in d: 211 | d = d['intraOjbect'] 212 | if 'label' in d: 213 | d = d['label'] 214 | for _, dd in d.items(): 215 | category = dd['category'] 216 | if category in ['arrowHeadTail', 'arrowDescriptor']: 217 | continue 218 | origin = dd['origin'][0] 219 | dest = dd['destination'][0] 220 | if origin.startswith("CT") or origin.startswith("T"): 221 | id_map[dest] = id_map[origin] 222 | elif dest.startswith("CT") or dest.startswith("T"): 223 | id_map[origin] = id_map[dest] 224 | return id_map 225 | 226 | 227 | def prepro_annos(args): 228 | """ 229 | Transform DQA annotation.json -> a list of tokenized fact sentences for each image in json file 230 | The facts are indexed by image id. 231 | :param args: 232 | :return: 233 | """ 234 | data_dir = args.data_dir 235 | target_dir = args.target_dir 236 | 237 | # For debugging 238 | if args.debug == 'True': 239 | sents_path =os.path.join(target_dir, "raw_sents.json") 240 | answers_path =os.path.join(target_dir, "answers.json") 241 | sentss_dict = json.load(open(sents_path, 'r')) 242 | answers_dict = json.load(open(answers_path, 'r')) 243 | 244 | facts_path = os.path.join(target_dir, "raw_facts.json") 245 | meta_data_path = os.path.join(target_dir, "meta_data.json") 246 | meta_data = json.load(open(meta_data_path, "r")) 247 | facts_dict = {} 248 | annos_dir = os.path.join(data_dir, "annotations") 249 | anno_names = [name for name in os.listdir(annos_dir) if name.endswith(".json")] 250 | max_num_facts = 0 251 | max_fact_size = 0 252 | pbar = get_pbar(len(anno_names)).start() 253 | for i, anno_name in enumerate(anno_names): 254 | image_name, _ = os.path.splitext(anno_name) 255 | image_id, _ = os.path.splitext(image_name) 256 | anno_path = os.path.join(annos_dir, anno_name) 257 | anno = json.load(open(anno_path, 'r')) 258 | rels = anno2rels(anno) 259 | id_map = _get_id_map(anno) 260 | text_facts = [rel2text(id_map, rel) for rel in rels] 261 | text_facts = list(set(_tokenize(fact) for fact in text_facts if fact is not None)) 262 | max_fact_size = max([max_fact_size] + [len(fact) for fact in text_facts]) 263 | # For debugging only 264 | if args.debug == 'True': 265 | if image_id in sentss_dict: 266 | correct_sents = [sents[answer] for sents, answer in zip(sentss_dict[image_id], answers_dict[image_id])] 267 | # indexed_facts.extend(correct_sents) 268 | # FIXME : this is very strong prior! 
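# (In debug mode the facts are replaced outright by the gold answer 
# sentences, so the "retrieved facts" leak the label; this is only a 
# sanity check of the pipeline, never a fair evaluation setting.) 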
269 | text_facts = correct_sents 270 | else: 271 | text_facts = [] 272 | facts_dict[image_id] = text_facts 273 | max_num_facts = max(max_num_facts, len(text_facts)) 274 | pbar.update(i) 275 | 276 | pbar.finish() 277 | 278 | meta_data['max_num_facts'] = max_num_facts 279 | meta_data['max_fact_size'] = max_fact_size 280 | print("number of facts: %d" % sum(len(facts) for facts in facts_dict.values())) 281 | print("max num facts per relation: %d" % max_num_facts) 282 | print("max fact size: %d" % max_fact_size) 283 | print("dumping json files ... ") 284 | json.dump(meta_data, open(meta_data_path, 'w')) 285 | json.dump(facts_dict, open(facts_path, 'w')) 286 | print("done") 287 | 288 | 289 | def prepro_questions(args): 290 | """ 291 | transform DQA questions.json files -> single statements json and single answers json. 292 | sentences and answers are doubly indexed by image id first and then question number within that image (0 indexed) 293 | :param args: 294 | :return: 295 | """ 296 | data_dir = args.data_dir 297 | target_dir = args.target_dir 298 | questions_dir = os.path.join(data_dir, "questions") 299 | raw_sents_path = os.path.join(target_dir, "raw_sents.json") 300 | answers_path = os.path.join(target_dir, "answers.json") 301 | meta_data_path = os.path.join(target_dir, "meta_data.json") 302 | meta_data = json.load(open(meta_data_path, "r")) 303 | 304 | sentss_dict = {} 305 | answers_dict = {} 306 | 307 | ques_names = sorted([name for name in os.listdir(questions_dir) if os.path.splitext(name)[1].endswith(".json")], 308 | key=lambda x: int(os.path.splitext(os.path.splitext(x)[0])[0])) 309 | num_choices = 0 310 | num_questions = 0 311 | max_sent_size = 0 312 | pbar = get_pbar(len(ques_names)).start() 313 | for i, ques_name in enumerate(ques_names): 314 | image_name, _ = os.path.splitext(ques_name) 315 | image_id, _ = os.path.splitext(image_name) 316 | sentss = [] 317 | answers = [] 318 | ques_path = os.path.join(questions_dir, ques_name) 319 | ques = json.load(open(ques_path, "r")) 320 | for ques_id, (ques_text, d) in enumerate(ques['questions'].items()): 321 | if d['abcLabel']: 322 | continue 323 | sents = [_tokenize(qa2hypo(ques_text, choice, args.qa2hypo, args.qa2hypo_path)) for choice in d['answerTexts']] 324 | max_sent_size = max(max_sent_size, max(len(sent) for sent in sents)) 325 | assert not num_choices or num_choices == len(sents), "number of choices don't match: %s" % ques_name 326 | num_choices = len(sents) 327 | sentss.append(sents) 328 | answers.append(d['correctAnswer']) 329 | num_questions += 1 330 | sentss_dict[image_id] = sentss 331 | answers_dict[image_id] = answers 332 | pbar.update(i) 333 | pbar.finish() 334 | meta_data['num_choices'] = num_choices 335 | meta_data['max_sent_size'] = max_sent_size 336 | 337 | print("number of questions: %d" % num_questions) 338 | print("number of choices: %d" % num_choices) 339 | print("max sent size: %d" % max_sent_size) 340 | print("dumping json file ... 
") 341 | json.dump(sentss_dict, open(raw_sents_path, "w")) 342 | json.dump(answers_dict, open(answers_path, "w")) 343 | json.dump(meta_data, open(meta_data_path, "w")) 344 | print("done") 345 | 346 | 347 | def build_vocab(args): 348 | target_dir = args.target_dir 349 | vocab_path = os.path.join(target_dir, "vocab.json") 350 | emb_mat_path = os.path.join(target_dir, "init_emb_mat.h5") 351 | raw_sents_path = os.path.join(target_dir, "raw_sents.json") 352 | raw_facts_path = os.path.join(target_dir, "raw_facts.json") 353 | raw_sentss_dict = json.load(open(raw_sents_path, 'r')) 354 | raw_facts_dict = json.load(open(raw_facts_path, 'r')) 355 | 356 | meta_data_path = os.path.join(target_dir, "meta_data.json") 357 | meta_data = json.load(open(meta_data_path, 'r')) 358 | glove_path = args.glove_path 359 | 360 | word_counter = defaultdict(int) 361 | 362 | for image_id, raw_sentss in raw_sentss_dict.items(): 363 | for raw_sents in raw_sentss: 364 | for raw_sent in raw_sents: 365 | for word in raw_sent: 366 | _vadd(word_counter, word) 367 | 368 | for image_id, raw_facts in raw_facts_dict.items(): 369 | for raw_fact in raw_facts: 370 | for word in raw_fact: 371 | _vadd(word_counter, word) 372 | 373 | word_list, counts = zip(*sorted([pair for pair in word_counter.items()], key=lambda x: -x[1])) 374 | freq = 5 375 | print("top %d frequent words:" % freq) 376 | for word, count in zip(word_list[:freq], counts[:freq]): 377 | print("%r: %d" % (word, count)) 378 | 379 | features = {} 380 | word_size = 0 381 | print("reading %s ... " % glove_path) 382 | with open(glove_path, 'r') as fp: 383 | for line in fp: 384 | array = line.lstrip().rstrip().split(" ") 385 | word = array[0] 386 | if word in word_counter: 387 | vector = list(map(float, array[1:])) 388 | features[word] = vector 389 | word_size = len(vector) 390 | print("done") 391 | vocab_word_list = [word for word in word_list if word in features] 392 | unknown_word_list = [word for word in word_list if word not in features] 393 | vocab_size = len(features) + 1 394 | 395 | f = h5py.File(emb_mat_path, 'w') 396 | emb_mat = f.create_dataset('data', [vocab_size, word_size], dtype='float') 397 | vocab = {} 398 | pbar = get_pbar(len(vocab_word_list)).start() 399 | for i, word in enumerate(vocab_word_list): 400 | emb_mat[i+1, :] = features[word] 401 | vocab[word] = i + 1 402 | pbar.update(i) 403 | pbar.finish() 404 | vocab['UNK'] = 0 405 | 406 | meta_data['vocab_size'] = vocab_size 407 | meta_data['word_size'] = word_size 408 | print("num of distinct words: %d" % len(word_counter)) 409 | print("vocab size: %d" % vocab_size) 410 | print("word size: %d" % word_size) 411 | 412 | print("dumping json file ... 
") 413 | f.close() 414 | json.dump(vocab, open(vocab_path, "w")) 415 | json.dump(meta_data, open(meta_data_path, "w")) 416 | print("done") 417 | 418 | 419 | def indexing(args): 420 | target_dir = args.target_dir 421 | vocab_path = os.path.join(target_dir, "vocab.json") 422 | raw_sents_path = os.path.join(target_dir, "raw_sents.json") 423 | raw_facts_path = os.path.join(target_dir, "raw_facts.json") 424 | sents_path = os.path.join(target_dir, "sents.json") 425 | facts_path = os.path.join(target_dir, "facts.json") 426 | vocab = json.load(open(vocab_path, 'r')) 427 | raw_sentss_dict = json.load(open(raw_sents_path, 'r')) 428 | raw_facts_dict = json.load(open(raw_facts_path, 'r')) 429 | 430 | sentss_dict = {image_id: [[_vlup(vocab, sent) for sent in sents] for sents in sentss] for image_id, sentss in raw_sentss_dict.items()} 431 | facts_dict = {image_id: [_vlup(vocab, fact) for fact in facts] for image_id, facts in raw_facts_dict.items()} 432 | 433 | print("dumping json files ... ") 434 | json.dump(sentss_dict, open(sents_path, 'w')) 435 | json.dump(facts_dict, open(facts_path, 'w')) 436 | print("done") 437 | 438 | 439 | def create_meta_data(args): 440 | target_dir = args.target_dir 441 | if not os.path.exists(target_dir): 442 | os.mkdir(target_dir) 443 | meta_data_path = os.path.join(target_dir, "meta_data.json") 444 | meta_data = {'data_dir': args.data_dir} 445 | json.dump(meta_data, open(meta_data_path, "w")) 446 | 447 | 448 | def create_image_ids_and_paths(args): 449 | if args.prepro_images == 'False': 450 | print("Skipping image preprocessing.") 451 | return 452 | data_dir = args.data_dir 453 | target_dir = args.target_dir 454 | images_dir = os.path.join(data_dir, "images") 455 | image_ids_path = os.path.join(target_dir, "image_ids.json") 456 | image_paths_path = os.path.join(target_dir, "image_paths.json") 457 | image_names = [name for name in os.listdir(images_dir) if name.endswith(".png")] 458 | image_ids = [os.path.splitext(name)[0] for name in image_names] 459 | ordered_image_ids = sorted(image_ids, key=lambda x: int(x)) 460 | ordered_image_names = ["%s.png" % id_ for id_ in ordered_image_ids] 461 | print("dumping json files ... 
") 462 | image_paths = [os.path.join(images_dir, name) for name in ordered_image_names] 463 | json.dump(ordered_image_ids, open(image_ids_path, "w")) 464 | json.dump(image_paths, open(image_paths_path, "w")) 465 | print("done") 466 | 467 | 468 | def prepro_images(args): 469 | if args.prepro_images == 'False': 470 | print("Skipping image preprocessing.") 471 | return 472 | model_path = args.vgg_model_path 473 | proto_path = args.vgg_proto_path 474 | out_path = os.path.join(args.target_dir, "images.h5") 475 | image_paths_path = os.path.join(args.target_dir, "image_paths.json") 476 | os.system("th prepro_images.lua --image_path_json %s --cnn_proto %s --cnn_model %s --out_path %s" 477 | % (image_paths_path, proto_path, model_path, out_path)) 478 | 479 | 480 | def copy_folds(args): 481 | data_dir = args.data_dir 482 | target_dir = args.target_dir 483 | for num in range(1,6): 484 | from_folds_path = os.path.join(data_dir, "fold%d.json" % num) 485 | to_folds_path = os.path.join(target_dir, "fold%d.json" % num) 486 | shutil.copy(from_folds_path, to_folds_path) 487 | 488 | 489 | if __name__ == "__main__": 490 | ARGS = get_args() 491 | create_meta_data(ARGS) 492 | create_image_ids_and_paths(ARGS) 493 | prepro_questions(ARGS) 494 | prepro_annos(ARGS) 495 | build_vocab(ARGS) 496 | indexing(ARGS) 497 | prepro_images(ARGS) 498 | -------------------------------------------------------------------------------- /prepro_images.lua: -------------------------------------------------------------------------------- 1 | require 'nn' 2 | require 'optim' 3 | require 'torch' 4 | require 'nn' 5 | require 'math' 6 | require 'cunn' 7 | require 'cutorch' 8 | require 'loadcaffe' 9 | require 'image' 10 | require 'hdf5' 11 | cjson=require('cjson') 12 | require 'xlua' 13 | 14 | ------------------------------------------------------------------------------- 15 | -- Input arguments and options 16 | ------------------------------------------------------------------------------- 17 | cmd = torch.CmdLine() 18 | cmd:text() 19 | cmd:text('Options') 20 | cmd:option('--image_path_json', '', 'json containing ordered list of image paths') 21 | cmd:option('--cnn_proto', '', 'path to the cnn prototxt') 22 | cmd:option('--cnn_model', '', 'path to the cnn model') 23 | cmd:option('--batch_size', 10, 'batch_size') 24 | 25 | cmd:option('--out_path', 'image_rep.h5', 'output path') 26 | cmd:option('--gpuid', 1, 'which gpu to use. 
-1 = use CPU') 
27 | cmd:option('--backend', 'cudnn', 'nn|cudnn') 
28 | 
29 | opt = cmd:parse(arg) 
30 | print(opt) 
31 | 
32 | cutorch.setDevice(opt.gpuid) 
33 | net=loadcaffe.load(opt.cnn_proto, opt.cnn_model,opt.backend); 
34 | net:evaluate() 
35 | net=net:cuda() 
36 | 
37 | function loadim(imname) 
38 | local im, im2 
39 | im=image.load(imname) 
40 | im=image.scale(im,224,224) 
41 | if im:size(1)==1 then 
42 | im2=torch.cat(im,im,1) 
43 | im2=torch.cat(im2,im,1) 
44 | im=im2 
45 | elseif im:size(1)==4 then 
46 | im=im[{{1,3},{},{}}] 
47 | end 
48 | im=im*255; 
49 | im2=im:clone() 
50 | im2[{{3},{},{}}]=im[{{1},{},{}}]-123.68 -- swap RGB to BGR and subtract the VGG mean pixel 
51 | im2[{{2},{},{}}]=im[{{2},{},{}}]-116.779 
52 | im2[{{1},{},{}}]=im[{{3},{},{}}]-103.939 
53 | return im2 
54 | end 
55 | 
56 | local image_path_json_file = io.open(opt.image_path_json, 'r') 
57 | local image_path_json = cjson.decode(image_path_json_file:read('*a')) 
58 | image_path_json_file:close() 
59 | 
60 | local image_path_list = {} 
61 | for i,image_path in pairs(image_path_json) do 
62 | table.insert(image_path_list, image_path) 
63 | end 
64 | 
65 | local ndims=4096 
66 | local batch_size = opt.batch_size 
67 | 
68 | local sz = #image_path_list 
69 | local feat = torch.CudaTensor(sz, ndims) 
70 | print(string.format('processing %d images...', sz)) 
71 | for i = 1, sz, batch_size do 
72 | xlua.progress(i, sz) 
73 | local r = math.min(sz, i + batch_size - 1) 
74 | local ims = torch.CudaTensor(r-i+1, 3, 224, 224) 
75 | for j = 1, r-i+1 do 
76 | ims[j] = loadim(image_path_list[i+j-1]):cuda() 
77 | end 
78 | net:forward(ims) 
79 | feat[{{i,r},{}}] = net.modules[43].output:clone() -- 4096-d fc7-layer activations 
80 | collectgarbage() 
81 | end 
82 | 
83 | local h5_file = hdf5.open(opt.out_path, 'w') 
84 | h5_file:write('/data', feat:float()) 
85 | h5_file:close() 
-------------------------------------------------------------------------------- 
/read_data/__init__.py: 
-------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/allenai/dqa-net/c497e950017e69952deae3ffd07920a5bfb9f468/read_data/__init__.py 
-------------------------------------------------------------------------------- 
/read_data/r05.py: 
-------------------------------------------------------------------------------- 
1 | import json 
2 | import os 
3 | from pprint import pprint 
4 | 
5 | import h5py 
6 | import numpy as np 
7 | import sys 
8 | 
9 | from configs.get_config import Config 
10 | 
11 | 
12 | class DataSet(object): 
13 | def __init__(self, name, batch_size, data, idxs, idx2id): 
14 | self.name = name 
15 | self.num_epochs_completed = 0 
16 | self.idx_in_epoch = 0 
17 | self.batch_size = batch_size 
18 | self.data = data 
19 | self.idxs = idxs 
20 | self.idx2id = idx2id 
21 | self.num_examples = len(idxs) 
22 | self.num_full_batches = int(self.num_examples / self.batch_size) 
23 | self.num_all_batches = self.num_full_batches + int(self.num_examples % self.batch_size > 0) 
24 | self.reset() 
25 | 
26 | def get_num_batches(self, partial=False): 
27 | return self.num_all_batches if partial else self.num_full_batches 
28 | 
29 | def get_batch_idxs(self, partial=False): 
30 | assert self.has_next_batch(partial=partial), "End of data, reset required."
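# The slice below covers [idx_in_epoch, idx_in_epoch + batch_size); with 
# partial=True it is clamped to num_examples, so the last batch may be short. 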
31 | from_, to = self.idx_in_epoch, self.idx_in_epoch + self.batch_size 32 | if partial and to > self.num_examples: 33 | to = self.num_examples 34 | cur_idxs = self.idxs[from_:to] 35 | return cur_idxs 36 | 37 | def get_next_labeled_batch(self, partial=False): 38 | cur_idxs = self.get_batch_idxs(partial=partial) 39 | batch = [[each[i] for i in cur_idxs] for each in self.data] 40 | self.idx_in_epoch += len(cur_idxs) 41 | return batch 42 | 43 | def has_next_batch(self, partial=False): 44 | if partial: 45 | return self.idx_in_epoch < self.num_examples 46 | return self.idx_in_epoch + self.batch_size <= self.num_examples 47 | 48 | def complete_epoch(self): 49 | self.reset() 50 | self.num_epochs_completed += 1 51 | 52 | def reset(self): 53 | self.idx_in_epoch = 0 54 | np.random.shuffle(self.idxs) 55 | 56 | 57 | def read_data(params, mode): 58 | print("loading {} data ... ".format(mode)) 59 | data_dir = params.data_dir 60 | 61 | fold_path = params.fold_path 62 | fold = json.load(open(fold_path, 'r')) 63 | if mode in ['train', 'test']: 64 | cur_image_ids = fold[mode] 65 | elif mode == 'val': 66 | cur_image_ids = fold['test'] 67 | else: 68 | raise Exception() 69 | 70 | sents_path = os.path.join(data_dir, "sents.json") 71 | facts_path = os.path.join(data_dir, "facts.json") 72 | answers_path = os.path.join(data_dir, "answers.json") 73 | images_path = os.path.join(data_dir, "images.h5") 74 | image_ids_path = os.path.join(data_dir, "image_ids.json") 75 | 76 | sentss_dict = json.load(open(sents_path, "r")) 77 | facts_dict = json.load(open(facts_path, "r")) 78 | answers_dict = json.load(open(answers_path, "r")) 79 | images_h5 = h5py.File(images_path, 'r') 80 | all_image_ids = json.load(open(image_ids_path, 'r')) 81 | image_id2idx = {id_: idx for idx, id_ in enumerate(all_image_ids)} 82 | 83 | batch_size = params.batch_size 84 | sentss, answers, factss, images = [], [], [], [] 85 | idx = 0 86 | idx2id = [] 87 | for image_id in cur_image_ids: 88 | if image_id not in sentss_dict or image_id not in facts_dict: 89 | continue 90 | facts = facts_dict[image_id] 91 | image = images_h5['data'][image_id2idx[image_id]] 92 | for sent_id, (sents, answer) in enumerate(zip(sentss_dict[image_id], answers_dict[image_id])): 93 | sentss.append(sents) 94 | answers.append(answer) 95 | factss.append(facts) 96 | images.append(image) 97 | idx2id.append([image_id, sent_id]) 98 | idx += 1 99 | 100 | data = [sentss, factss, images, answers] 101 | idxs = np.arange(len(answers)) 102 | data_set = DataSet(mode, batch_size, data, idxs, idx2id) 103 | print("done") 104 | return data_set 105 | 106 | 107 | if __name__ == "__main__": 108 | params = Config() 109 | params.batch_size = 2 110 | params.train = True 111 | -------------------------------------------------------------------------------- /requirements.txt: -------------------------------------------------------------------------------- 1 | numpy 2 | progressbar2 3 | nltk 4 | tensorflow 5 | h5py 6 | -------------------------------------------------------------------------------- /tmp/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/allenai/dqa-net/c497e950017e69952deae3ffd07920a5bfb9f468/tmp/__init__.py -------------------------------------------------------------------------------- /tmp/sim_test.py: -------------------------------------------------------------------------------- 1 | import argparse 2 | import json 3 | import os 4 | 5 | import itertools 6 | from collections import defaultdict 7 | 8 | import 
numpy as np 
9 | import matplotlib.pyplot as plt 
10 | 
11 | from utils import get_pbar 
12 | 
13 | 
14 | def get_args(): 
15 | parser = argparse.ArgumentParser() 
16 | parser.add_argument("first_dir") 
17 | parser.add_argument("second_dir") 
18 | return parser.parse_args() 
19 | 
20 | def sim_test(args): 
21 | first_dir = args.first_dir 
22 | second_dir = args.second_dir 
23 | first_sents_path = os.path.join(first_dir, "sents.json") 
24 | second_sents_path = os.path.join(second_dir, "sents.json") 
25 | vocab_path = os.path.join(first_dir, "vocab.json") 
26 | vocab = json.load(open(vocab_path, 'r')) 
27 | inv_vocab = {idx: word for word, idx in vocab.items()} 
28 | first_sents = json.load(open(first_sents_path, "r")) 
29 | second_sents = json.load(open(second_sents_path, "r")) 
30 | diff_dict = defaultdict(int) 
31 | pbar = get_pbar(len(first_sents)).start() 
32 | i = 0 
33 | for first_id, sents1 in first_sents.items(): 
34 | text1 = sent_to_text(inv_vocab, sents1[0]) 
35 | min_second_id, diff = min([[second_id, cdiff(sents1, sents2, len(vocab))] for second_id, sents2 in second_sents.items()], 
36 | key=lambda x: x[1]) 
37 | text2 = sent_to_text(inv_vocab, second_sents[min_second_id][0]) 
38 | diff_dict[diff] += 1 
39 | """ 
40 | if diff <= 3: 
41 | print("%s, %s, %d" % (text1, text2, diff)) 
42 | """ 
43 | pbar.update(i) 
44 | i += 1 
45 | pbar.finish() 
46 | json.dump(diff_dict, open("diff_dict.json", "w")) 
47 | 
48 | def sent_to_text(vocab, sent): 
49 | return " ".join(vocab[idx] for idx in sent) 
50 | 
51 | def sent_to_bow(sent, l): 
52 | out = np.zeros([l]) 
53 | for idx in sent: 
54 | out[idx] = 1.0 
55 | return out 
56 | 
57 | def temp(): # plot a saved diff_dict histogram (keys are stringified float diffs) 
58 | a = {"0.0": 128, "1.0": 61, "2.0": 181, "3.0": 152, "4.0": 170, "5.0": 144, "6.0": 128, "7.0": 120, "8.0": 70, "9.0": 50, "10.0": 44, "11.0": 22, "12.0": 19, "13.0": 17, "14.0": 3, "15.0": 4, "16.0": 3, "18.0": 2, "22.0": 1, "24.0": 1, "27.0": 1} 
59 | keys = sorted(a.keys(), key=float) 
60 | plt.plot([float(key) for key in keys], [a[key] for key in keys]) 
61 | plt.show() 
62 | 
63 | 
64 | def diff(sent1, sent2, l): 
65 | return np.sum(np.abs(sent_to_bow(sent1, l) - sent_to_bow(sent2, l))) 
66 | 
67 | def cdiff(sents1, sents2, l): 
68 | return min(diff(sent1, sent2, l) for sent1, sent2 in itertools.product(sents1, sents2)) 
69 | 
70 | if __name__ == "__main__": 
71 | ARGS = get_args() 
72 | sim_test(ARGS) 
-------------------------------------------------------------------------------- 
/tmp/simple.py: 
-------------------------------------------------------------------------------- 
1 | import argparse 
2 | import json 
3 | from os import path, listdir 
4 | from random import randint 
5 | 
6 | import networkx as nx 
7 | import re 
8 | 
9 | from nltk.stem import PorterStemmer 
10 | 
11 | from utils import get_pbar 
12 | 
13 | 
14 | def _get_args(): 
15 | parser = argparse.ArgumentParser() 
16 | parser.add_argument("data_dir") 
17 | parser.add_argument("fold_path") 
18 | return parser.parse_args() 
19 | 
20 | 
21 | def _tokenize(raw): 
22 | tokens = re.findall(r"[\w]+", raw) 
23 | return tokens 
24 | 
25 | stem = True 
26 | stemmer = PorterStemmer() 
27 | def _normalize(word): 
28 | word = word.lower() 
29 | if stem: 
30 | word = stemmer.stem(word) 
31 | return word 
32 | 
33 | def load_all(data_dir): 
34 | annos_dir = path.join(data_dir, 'annotations') 
35 | images_dir = path.join(data_dir, 'images') 
36 | questions_dir = path.join(data_dir, 'questions') 
37 | 
38 | anno_dict = {} 
39 | questions_dict = {} 
40 | choicess_dict = {} 
41 | answers_dict = {} 
42 | 
43 | image_ids = sorted([path.splitext(name)[0] for name in listdir(images_dir) if name.endswith(".png")], key=lambda x: int(x)) 
44 |
pbar = get_pbar(len(image_ids)).start() 45 | for i, image_id in enumerate(image_ids): 46 | json_name = "%s.png.json" % image_id 47 | anno_path = path.join(annos_dir, json_name) 48 | ques_path = path.join(questions_dir, json_name) 49 | if path.exists(anno_path) and path.exists(ques_path): 50 | anno = json.load(open(anno_path, "r")) 51 | ques = json.load(open(ques_path, "r")) 52 | 53 | questions = [] 54 | choicess = [] 55 | answers = [] 56 | for question, d in ques['questions'].items(): 57 | if not d['abcLabel']: 58 | choices = d['answerTexts'] 59 | answer = d['correctAnswer'] 60 | questions.append(question) 61 | choicess.append(choices) 62 | answers.append(answer) 63 | 64 | questions_dict[image_id] = questions 65 | choicess_dict[image_id] = choicess 66 | answers_dict[image_id] = answers 67 | anno_dict[image_id] = anno 68 | pbar.update(i) 69 | pbar.finish() 70 | 71 | return anno_dict, questions_dict, choicess_dict, answers_dict 72 | 73 | 74 | def _get_val(anno, key): 75 | first = key[0] 76 | if first == 'T': 77 | val = anno['text'][key]['value'] 78 | val = _normalize(val) 79 | return val 80 | elif first == 'O': 81 | d = anno['objects'][key] 82 | if 'text' in d and len(d['text']) > 0: 83 | key = d['text'][0] 84 | return _get_val(anno, key) 85 | return None 86 | else: 87 | raise Exception(key) 88 | 89 | 90 | def create_graph(anno): 91 | graph = nx.Graph() 92 | try: 93 | d = anno['relationships']['interObject']['linkage'] 94 | except: 95 | return graph 96 | for dd in d.values(): 97 | if dd['category'] == 'objectToObject': 98 | dest = _get_val(anno, dd['destination'][0]) 99 | orig = _get_val(anno, dd['origin'][0]) 100 | if dest and orig: 101 | graph.add_edge(dest, orig) 102 | return graph 103 | 104 | 105 | def find_node(graph, text): 106 | words = _tokenize(text) 107 | words = [_normalize(word) for word in words] 108 | for word in words: 109 | if word in graph.nodes(): 110 | return word 111 | return None 112 | 113 | 114 | def guess(graph, question, choices): 115 | MAX = 9999 116 | SUBMAX = 999 117 | ques_node = find_node(graph, question) 118 | dists = [] 119 | for choice in choices: 120 | choice_node = find_node(graph, choice) 121 | if ques_node is None and choice_node is None: 122 | dist = MAX 123 | elif ques_node is None and choice_node is not None: 124 | dist = SUBMAX 125 | elif ques_node is not None and choice_node is None: 126 | dist = MAX 127 | else: 128 | if nx.has_path(graph, ques_node, choice_node): 129 | pl = len(nx.shortest_path(graph, ques_node, choice_node)) 130 | dist = pl 131 | else: 132 | dist = MAX 133 | dists.append(dist) 134 | answer, dist = min(enumerate(dists), key=lambda x: x[1]) 135 | max_dist = max(dists) 136 | if dist == MAX: 137 | return None 138 | if dist == max_dist: 139 | return None 140 | return answer 141 | 142 | 143 | def evaluate(anno_dict, questions_dict, choicess_dict, answers_dict): 144 | total = 0 145 | correct = 0 146 | incorrect = 0 147 | guessed = 0 148 | pbar = get_pbar(len(anno_dict)).start() 149 | for i, (image_id, anno) in enumerate(anno_dict.items()): 150 | graph = create_graph(anno) 151 | questions = questions_dict[image_id] 152 | choicess =choicess_dict[image_id] 153 | answers = answers_dict[image_id] 154 | for question, choices, answer in zip(questions, choicess, answers): 155 | total += 1 156 | a = guess(graph, question, choices) 157 | if a is None: 158 | guessed += 1 159 | elif answer == a: 160 | correct += 1 161 | else: 162 | incorrect += 1 163 | pbar.update(i) 164 | pbar.finish() 165 | print("expected accuracy: (0.25 * %d + %d)/%d = %.4f" % 
(guessed, correct, total, (0.25*guessed + correct)/total)) 166 | print("precision: %d/%d = %.4f" % (correct, correct + incorrect, correct/(correct + incorrect))) 167 | 168 | 169 | def select(fold_path, *all_): 170 | fold = json.load(open(fold_path, 'r')) 171 | test_ids = fold['test'] 172 | new_all = [] 173 | for each in all_: 174 | new_each = {id_: each[id_] for id_ in test_ids if id_ in each} 175 | new_all.append(new_each) 176 | return new_all 177 | 178 | 179 | def main(): 180 | args = _get_args() 181 | all_ = load_all(args.data_dir) 182 | selected = select(args.fold_path, *all_) 183 | evaluate(*selected) 184 | 185 | if __name__ == "__main__": 186 | main() 187 | 188 | 189 | 190 | -------------------------------------------------------------------------------- /utils.py: -------------------------------------------------------------------------------- 1 | import json 2 | 3 | import progressbar as pb 4 | 5 | 6 | def get_pbar(num, prefix=""): 7 | assert isinstance(prefix, str) 8 | pbar = pb.ProgressBar(widgets=[prefix, pb.Percentage(), pb.Bar(), pb.ETA()], maxval=num) 9 | return pbar 10 | 11 | 12 | def json_pretty_dump(obj, fh): 13 | return json.dump(obj, fh, sort_keys=True, indent=2, separators=(',', ': ')) 14 | 15 | -------------------------------------------------------------------------------- /vis/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/allenai/dqa-net/c497e950017e69952deae3ffd07920a5bfb9f468/vis/__init__.py -------------------------------------------------------------------------------- /vis/list_dqa_questions.py: -------------------------------------------------------------------------------- 1 | import SimpleHTTPServer 2 | import SocketServer 3 | import argparse 4 | import json 5 | import os 6 | import shutil 7 | 8 | from jinja2 import Environment, FileSystemLoader 9 | 10 | from utils import get_pbar 11 | 12 | 13 | def get_args(): 14 | parser = argparse.ArgumentParser() 15 | parser.add_argument("data_dir") 16 | parser.add_argument("--start", default=0, type=int) 17 | parser.add_argument("--stop", default=1500, type=int) 18 | parser.add_argument("--show_im", default='True') 19 | parser.add_argument("--im_width", type=int, default=200) 20 | parser.add_argument("--ext", type=str, default=".png") 21 | parser.add_argument("--template_name", type=str, default="list_dqa_questions.html") 22 | parser.add_argument("--port", type=int, default=8000) 23 | parser.add_argument("--host", type=str, default="0.0.0.0") 24 | parser.add_argument("--num_im", type=int, default=50) 25 | parser.add_argument("--open", type=str, default='False') 26 | 27 | return parser.parse_args() 28 | 29 | 30 | def list_dqa_questions(args): 31 | data_dir = args.data_dir 32 | images_dir = os.path.join(data_dir, "images") 33 | questions_dir = os.path.join(data_dir, "questions") 34 | annos_dir = os.path.join(data_dir, "annotations") 35 | _id = 0 36 | html_dir = "/tmp/list_dqa_questions_%d" % _id 37 | while os.path.exists(html_dir): 38 | _id += 1 39 | html_dir = "/tmp/list_dqa_questions_%d" % _id 40 | 41 | cur_dir = os.path.dirname(os.path.realpath(__file__)) 42 | templates_dir = os.path.join(cur_dir, 'templates') 43 | env = Environment(loader=FileSystemLoader(templates_dir)) 44 | template = env.get_template(args.template_name) 45 | 46 | if os.path.exists(html_dir): 47 | shutil.rmtree(html_dir) 48 | os.mkdir(html_dir) 49 | 50 | headers = ['image_id', 'question_id', 'image', 'question', 'choices', 'answer', 'annotations'] 51 | rows = [] 52 | 
52 |     image_names = [name for name in os.listdir(images_dir) if name.endswith(args.ext)]
53 |     image_names = sorted(image_names, key=lambda name: int(os.path.splitext(name)[0]))
54 |     image_names = [name for name in image_names
55 |                    if args.start <= int(os.path.splitext(name)[0]) < args.stop]
56 |     pbar = get_pbar(len(image_names)).start()
57 |     for i, image_name in enumerate(image_names):
58 |         image_id, _ = os.path.splitext(image_name)
59 |         json_name = "%s.json" % image_name
60 |         anno_path = os.path.join(annos_dir, json_name)
61 |         question_path = os.path.join(questions_dir, json_name)
62 |         if os.path.exists(question_path):
63 |             question_dict = json.load(open(question_path, "r"))
64 |             anno_dict = json.load(open(anno_path, "r"))
65 |             for j, (question, d) in enumerate(question_dict['questions'].items()):
66 |                 row = {'image_id': image_id,
67 |                        'question_id': str(j),
68 |                        'image_url': os.path.join("images" if not d['abcLabel'] else "imagesReplacedText", image_name),
69 |                        'anno_url': os.path.join("annotations", json_name),
70 |                        'question': question,
71 |                        'choices': d['answerTexts'],
72 |                        'answer': d['correctAnswer']}
73 |                 rows.append(row)
74 | 
75 |         if i % args.num_im == 0:
76 |             html_path = os.path.join(html_dir, "%s.html" % str(image_id).zfill(8))
77 | 
78 |         if (i + 1) % args.num_im == 0 or (i + 1) == len(image_names):
79 |             var_dict = {'title': "Question List",
80 |                         'image_width': args.im_width,
81 |                         'headers': headers,
82 |                         'rows': rows,
83 |                         'show_im': args.show_im}
84 |             with open(html_path, "wb") as f:
85 |                 f.write(template.render(**var_dict).encode('UTF-8'))
86 |             rows = []
87 |         pbar.update(i)
88 |     pbar.finish()
89 | 
90 | 
91 |     os.system("ln -s %s/* %s" % (data_dir, html_dir))
92 |     os.chdir(html_dir)
93 |     port = args.port
94 |     host = args.host
95 |     # Overriding to suppress log messages
96 |     class MyHandler(http.server.SimpleHTTPRequestHandler):
97 |         def log_message(self, format, *args):
98 |             pass
99 |     handler = MyHandler
100 |     httpd = socketserver.TCPServer((host, port), handler)
101 |     if args.open == 'True':
102 |         os.system("open http://%s:%d" % (args.host, args.port))
103 |     print("serving at %s:%d" % (host, port))
104 |     httpd.serve_forever()
105 | 
106 | 
107 | if __name__ == "__main__":
108 |     ARGS = get_args()
109 |     list_dqa_questions(ARGS)
110 | 
--------------------------------------------------------------------------------
/vis/list_facts.py:
--------------------------------------------------------------------------------
1 | import shutil
2 | 
3 | import http.server
4 | import socketserver
5 | import argparse
6 | import json
7 | import os
8 | from copy import deepcopy
9 | 
10 | from jinja2 import Environment, FileSystemLoader
11 | 
12 | from utils import get_pbar
13 | 
14 | 
15 | def get_args():
16 |     parser = argparse.ArgumentParser()
17 |     parser.add_argument("prepro_dir")
18 |     parser.add_argument("--start", default=0, type=int)
19 |     parser.add_argument("--stop", default=1500, type=int)
20 |     parser.add_argument("--show_im", default='True')
21 |     parser.add_argument("--im_width", type=int, default=200)
22 |     parser.add_argument("--ext", type=str, default=".png")
23 |     parser.add_argument("--template_name", type=str, default="list_facts.html")
24 |     parser.add_argument("--num_im", type=int, default=50)
25 |     parser.add_argument("--port", type=int, default=8000)
26 |     parser.add_argument("--host", type=str, default="0.0.0.0")
27 |     parser.add_argument("--open", type=str, default='False')
28 | 
29 |     args = parser.parse_args()
30 |     return args
31 | 
32 | 
33 | def _decode_sent(decoder, sent):
34 |     return " ".join(decoder[idx] for idx in sent)
35 | 
36 | 
37 | 
38 | def list_facts(args):
39 |     prepro_dir = args.prepro_dir
40 |     meta_data_dir = os.path.join(prepro_dir, "meta_data.json")
41 |     meta_data = json.load(open(meta_data_dir, "r"))
42 |     data_dir = meta_data['data_dir']
43 |     _id = 0
44 |     html_dir = "/tmp/list_facts%d" % _id
45 |     while os.path.exists(html_dir):
46 |         _id += 1
47 |         html_dir = "/tmp/list_facts%d" % _id
48 | 
49 |     images_dir = os.path.join(data_dir, 'images')
50 |     annos_dir = os.path.join(data_dir, 'annotations')
51 | 
52 |     sents_path = os.path.join(prepro_dir, 'sents.json')
53 |     facts_path = os.path.join(prepro_dir, 'facts.json')
54 |     vocab_path = os.path.join(prepro_dir, 'vocab.json')
55 |     answers_path = os.path.join(prepro_dir, 'answers.json')
56 |     sentss_dict = json.load(open(sents_path, "r"))
57 |     facts_dict = json.load(open(facts_path, "r"))
58 |     vocab = json.load(open(vocab_path, "r"))
59 |     answers_dict = json.load(open(answers_path, "r"))
60 |     decoder = {idx: word for word, idx in vocab.items()}
61 | 
62 |     if os.path.exists(html_dir):
63 |         shutil.rmtree(html_dir)
64 |     os.mkdir(html_dir)
65 | 
66 |     cur_dir = os.path.dirname(os.path.realpath(__file__))
67 |     templates_dir = os.path.join(cur_dir, 'templates')
68 |     env = Environment(loader=FileSystemLoader(templates_dir))
69 |     template = env.get_template(args.template_name)
70 | 
71 |     headers = ['iid', 'qid', 'image', 'sents', 'answer', 'annotations', 'relations']
72 |     rows = []
73 |     pbar = get_pbar(len(sentss_dict)).start()
74 |     image_ids = sorted(sentss_dict.keys(), key=lambda x: int(x))
75 |     for i, image_id in enumerate(image_ids):
76 |         sentss = sentss_dict[image_id]
77 |         answers = answers_dict[image_id]
78 |         facts = facts_dict[image_id] if image_id in facts_dict else []
79 |         decoded_facts = [_decode_sent(decoder, fact) for fact in facts]
80 |         for question_id, (sents, answer) in enumerate(zip(sentss, answers)):
81 |             image_name = "%s.png" % image_id
82 |             json_name = "%s.json" % image_name
83 |             image_url = os.path.join('images', image_name)
84 |             anno_url = os.path.join('annotations', json_name)
85 |             row = {'image_id': image_id,
86 |                    'question_id': question_id,
87 |                    'image_url': image_url,
88 |                    'anno_url': anno_url,
89 |                    'sents': [_decode_sent(decoder, sent) for sent in sents],
90 |                    'answer': answer,
91 |                    'facts': decoded_facts}
92 |             rows.append(row)
93 | 
94 |         if i % args.num_im == 0:
95 |             html_path = os.path.join(html_dir, "%s.html" % str(image_id).zfill(8))
96 | 
97 |         if (i + 1) % args.num_im == 0 or (i + 1) == len(image_ids):
98 |             var_dict = {'title': "Question List",
99 |                         'image_width': args.im_width,
100 |                         'headers': headers,
101 |                         'rows': rows,
102 |                         'show_im': args.show_im == 'True'}
103 |             with open(html_path, "wb") as f:
104 |                 f.write(template.render(**var_dict).encode('UTF-8'))
105 |             rows = []
106 |         pbar.update(i)
107 |     pbar.finish()
108 | 
109 |     os.system("ln -s %s/* %s" % (data_dir, html_dir))
110 |     os.chdir(html_dir)
111 |     port = args.port
112 |     host = args.host
113 |     # Overriding to suppress log messages
114 |     class MyHandler(http.server.SimpleHTTPRequestHandler):
115 |         def log_message(self, format, *args):
116 |             pass
117 |     handler = MyHandler
118 |     httpd = socketserver.TCPServer((host, port), handler)
119 |     if args.open == 'True':
120 |         os.system("open http://%s:%d" % (args.host, args.port))
121 |     print("serving at %s:%d" % (host, port))
122 |     httpd.serve_forever()
123 | 
124 | 
125 | if __name__ == "__main__":
126 |     ARGS = get_args()
127 |     list_facts(ARGS)
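# A minimal usage sketch (hypothetical paths; the prepro directory layout is
# inferred from the json files loaded above). Run from the repository root so
# that utils.py is importable:
#
#     PYTHONPATH=. python vis/list_facts.py data/s3/prepro --port 8000
#
# The decoder is just the inverted vocab, so decoding is a lookup-and-join:
#
#     vocab = {"the": 1, "anther": 2}  # toy example
#     decoder = {idx: word for word, idx in vocab.items()}
#     assert _decode_sent(decoder, [1, 2]) == "the anther"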
128 | 
--------------------------------------------------------------------------------
/vis/list_relations.py:
--------------------------------------------------------------------------------
1 | import argparse
2 | import json
3 | import os
4 | from copy import deepcopy
5 | 
6 | from jinja2 import Environment, FileSystemLoader
7 | 
8 | from utils import get_pbar
9 | 
10 | 
11 | def get_args():
12 |     parser = argparse.ArgumentParser()
13 |     parser.add_argument("prepro_dir")
14 |     parser.add_argument("--start", default=0, type=int)
15 |     parser.add_argument("--stop", default=1500, type=int)
16 |     parser.add_argument("--show_im", default='True')
17 |     parser.add_argument("--im_width", type=int, default=200)
18 |     parser.add_argument("--ext", type=str, default=".png")
19 |     parser.add_argument("--html_path", type=str, default="/tmp/list_relations.html")
20 |     parser.add_argument("--template_name", type=str, default="list_relations.html")
21 | 
22 |     args = parser.parse_args()
23 |     return args
24 | 
25 | 
26 | def _decode_sent(decoder, sent):
27 |     return " ".join(decoder[idx] for idx in sent)
28 | 
29 | 
30 | def _decode_relation(decoder, relation):
31 |     new_relation = deepcopy(relation)
32 |     # Disabled decoding of the 'a1r'/'a2r' fields:
33 |     # new_relation['a1r'] = _decode_sent(decoder, new_relation['a1r'])
34 |     # new_relation['a2r'] = _decode_sent(decoder, new_relation['a2r'])
35 | 
36 |     new_relation['a1'] = _decode_sent(decoder, new_relation['a1'])
37 |     new_relation['a2'] = _decode_sent(decoder, new_relation['a2'])
38 |     return new_relation
39 | 
40 | 
41 | def interpret_relations(args):
42 |     prepro_dir = args.prepro_dir
43 |     meta_data_dir = os.path.join(prepro_dir, "meta_data.json")
44 |     meta_data = json.load(open(meta_data_dir, "r"))
45 |     data_dir = meta_data['data_dir']
46 | 
47 |     images_dir = os.path.join(data_dir, 'images')
48 |     annos_dir = os.path.join(data_dir, 'annotations')
49 |     html_path = args.html_path
50 | 
51 |     sents_path = os.path.join(prepro_dir, 'sents.json')
52 |     relations_path = os.path.join(prepro_dir, 'relations.json')
53 |     vocab_path = os.path.join(prepro_dir, 'vocab.json')
54 |     answers_path = os.path.join(prepro_dir, 'answers.json')
55 |     sentss_dict = json.load(open(sents_path, "r"))
56 |     relations_dict = json.load(open(relations_path, "r"))
57 |     vocab = json.load(open(vocab_path, "r"))
58 |     answers_dict = json.load(open(answers_path, "r"))
59 |     decoder = {idx: word for word, idx in vocab.items()}
60 | 
61 |     headers = ['iid', 'qid', 'image', 'sents', 'answer', 'annotations', 'relations']
62 |     rows = []
63 |     pbar = get_pbar(len(sentss_dict)).start()
64 |     image_ids = sorted(sentss_dict.keys(), key=lambda x: int(x))
65 |     for i, image_id in enumerate(image_ids):
66 |         sentss = sentss_dict[image_id]
67 |         answers = answers_dict[image_id]
68 |         relations = relations_dict[image_id]
69 |         decoded_relations = [_decode_relation(decoder, relation) for relation in relations]
70 |         for question_id, (sents, answer) in enumerate(zip(sentss, answers)):
71 |             image_name = "%s.png" % image_id
72 |             json_name = "%s.json" % image_name
73 |             image_path = os.path.join(images_dir, image_name)
74 |             anno_path = os.path.join(annos_dir, json_name)
75 |             row = {'image_id': image_id,
76 |                    'question_id': question_id,
77 |                    'image_url': image_path,
78 |                    'anno_url': anno_path,
79 |                    'sents': [_decode_sent(decoder, sent) for sent in sents],
80 |                    'answer': answer,
81 |                    'relations': decoded_relations}
82 |             rows.append(row)
83 |         pbar.update(i)
84 |     pbar.finish()
85 |     var_dict = {'title': "Question List: %d - %d" % (args.start, args.stop - 1),
86 |                 'image_width': args.im_width,
87 |                 'headers': headers,
88 |                 'rows': rows,
89 |                 'show_im': args.show_im == 'True'}
90 | 
91 |     cur_dir = os.path.dirname(os.path.realpath(__file__))
92 |     templates_dir = os.path.join(cur_dir, 'templates')
93 |     env = Environment(loader=FileSystemLoader(templates_dir))
94 |     template = env.get_template(args.template_name)
95 |     out = template.render(**var_dict)
96 |     with open(html_path, "w") as f:
97 |         f.write(out)
98 | 
99 |     os.system("open %s" % html_path)
100 | 
101 | 
102 | if __name__ == "__main__":
103 |     ARGS = get_args()
104 |     interpret_relations(ARGS)
105 | 
--------------------------------------------------------------------------------
/vis/list_results.py:
--------------------------------------------------------------------------------
1 | import importlib
2 | import shutil
3 | 
4 | import http.server
5 | import socketserver
6 | import argparse
7 | import json
8 | import os
9 | import numpy as np
10 | from copy import deepcopy
11 | 
12 | from jinja2 import Environment, FileSystemLoader
13 | 
14 | from utils import get_pbar
15 | 
16 | 
17 | def get_args():
18 |     parser = argparse.ArgumentParser()
19 |     parser.add_argument("model_num", type=int)
20 |     parser.add_argument("config_name", type=str)
21 |     parser.add_argument("data_type", type=str)
22 |     parser.add_argument("epoch", type=int)
23 |     parser.add_argument("--start", default=0, type=int)
24 |     parser.add_argument("--stop", default=1500, type=int)
25 |     parser.add_argument("--show_im", default='True')
26 |     parser.add_argument("--im_width", type=int, default=200)
27 |     parser.add_argument("--ext", type=str, default=".png")
28 |     parser.add_argument("--template_name", type=str, default="list_results.html")
29 |     parser.add_argument("--num_im", type=int, default=50)
30 |     parser.add_argument("--port", type=int, default=8000)
31 |     parser.add_argument("--host", type=str, default="0.0.0.0")
32 |     parser.add_argument("--open", type=str, default='False')
33 | 
34 |     args = parser.parse_args()
35 |     return args
36 | 
37 | 
38 | def _decode_sent(decoder, sent):
39 |     return " ".join(decoder[idx] for idx in sent)
40 | 
41 | 
42 | 
43 | def list_results(args):
44 |     model_num = args.model_num
45 |     config_name = args.config_name
46 |     data_type = args.data_type
47 |     epoch = args.epoch
48 |     configs_path = os.path.join("configs", "m%s.json" % str(model_num).zfill(2))
49 |     configs = json.load(open(configs_path, 'r'))
50 |     config = configs[config_name]
51 |     evals_dir = os.path.join("evals", "m%s" % str(model_num).zfill(2), config_name)
52 |     evals_name = "%s_%s.json" % (data_type, str(epoch).zfill(4))
53 |     evals_path = os.path.join(evals_dir, evals_name)
54 |     evals = json.load(open(evals_path, 'r'))
55 | 
56 |     fold_path = config['fold_path']
57 |     fold = json.load(open(fold_path, 'r'))
58 |     fold_data_type = 'test' if data_type == 'val' else data_type
59 |     image_ids = sorted(fold[fold_data_type], key=lambda x: int(x))
60 | 
61 |     prepro_dir = config['data_dir']
62 |     meta_data_dir = os.path.join(prepro_dir, "meta_data.json")
63 |     meta_data = json.load(open(meta_data_dir, "r"))
64 |     data_dir = meta_data['data_dir']
65 |     _id = 0
66 |     html_dir = "/tmp/list_results%d" % _id
67 |     while os.path.exists(html_dir):
68 |         _id += 1
69 |         html_dir = "/tmp/list_results%d" % _id
70 | 
71 |     images_dir = os.path.join(data_dir, 'images')
72 |     annos_dir = os.path.join(data_dir, 'annotations')
73 | 
74 |     sents_path = os.path.join(prepro_dir, 'sents.json')
75 |     facts_path = os.path.join(prepro_dir, 'facts.json')
76 |     vocab_path = os.path.join(prepro_dir, 'vocab.json')
77 |     answers_path = os.path.join(prepro_dir, 'answers.json')
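# Assumed layout of the files resolved above (inferred from how they are read
# in this script, not from a spec): configs/m05.json maps config_name to a
# config dict with at least 'fold_path' and 'data_dir', and the evals file
# evals/m05/<config_name>/<data_type>_<epoch>.json is assumed to look like
#
#     {"ids": [["1", 0], ["1", 1], ...],
#      "values": {"p": [...], "yp": [...]}}
#
# with one entry per (image_id, question_id) pair, which is how eval_dd is
# keyed below.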
78 |     sentss_dict = json.load(open(sents_path, "r"))
79 |     facts_dict = json.load(open(facts_path, "r"))
80 |     vocab = json.load(open(vocab_path, "r"))
81 |     answers_dict = json.load(open(answers_path, "r"))
82 |     decoder = {idx: word for word, idx in vocab.items()}
83 | 
84 |     if os.path.exists(html_dir):
85 |         shutil.rmtree(html_dir)
86 |     os.mkdir(html_dir)
87 | 
88 |     cur_dir = os.path.dirname(os.path.realpath(__file__))
89 |     templates_dir = os.path.join(cur_dir, 'templates')
90 |     env = Environment(loader=FileSystemLoader(templates_dir))
91 |     template = env.get_template(args.template_name)
92 | 
93 |     eval_names = list(evals['values'].keys())
94 |     eval_dd = {}
95 |     for idx, id_ in enumerate(evals['ids']):
96 |         eval_d = {}
97 |         for name, d in evals['values'].items():
98 |             eval_d[name] = d[idx]
99 |         eval_dd[tuple(id_)] = eval_d
100 | 
101 |     # headers = ['iid', 'qid', 'image', 'sents', 'answer', 'annotations', 'relations'] + eval_names
102 |     headers = ['iid', 'qid', 'image', 'sents', 'annotations', 'relations', 'p', 'yp']
103 |     rows = []
104 |     pbar = get_pbar(len(sentss_dict)).start()
105 |     for i, image_id in enumerate(image_ids):
106 |         if image_id not in sentss_dict:
107 |             continue
108 |         sentss = sentss_dict[image_id]
109 |         answers = answers_dict[image_id]
110 |         facts = facts_dict[image_id] if image_id in facts_dict else []
111 |         decoded_facts = [_decode_sent(decoder, fact) for fact in facts]
112 |         for question_id, (sents, answer) in enumerate(zip(sentss, answers)):
113 |             eval_id = (image_id, question_id)
114 |             eval_d = eval_dd[eval_id] if eval_id in eval_dd else None
115 | 
116 |             if eval_d:
117 |                 p_all = list(zip(*eval_d['p']))
118 |                 p = p_all[:len(decoded_facts)]
119 |                 p = [[float("%.3f" % x) for x in y] for y in p]
120 |                 yp = [float("%.3f" % x) for x in eval_d['yp']]
121 |             else:
122 |                 p, yp = [], []
123 | 
124 |             eval_row = [eval_d[name] if eval_d else "" for name in eval_names]  # unused with the current headers
125 |             image_name = "%s.png" % image_id
126 |             json_name = "%s.json" % image_name
127 |             image_url = os.path.join('images', image_name)
128 |             anno_url = os.path.join('annotations', json_name)
129 |             ap = np.argmax(yp) if len(yp) > 0 else 0
130 |             correct = len(yp) > 0 and ap == answer
131 |             row = {'image_id': image_id,
132 |                    'question_id': question_id,
133 |                    'image_url': image_url,
134 |                    'anno_url': anno_url,
135 |                    'sents': [_decode_sent(decoder, sent) for sent in sents],
136 |                    'answer': answer,
137 |                    'facts': decoded_facts,
138 |                    'p': p,
139 |                    'yp': yp,
140 |                    'ap': ap,
141 |                    'correct': correct,
142 |                    }
143 | 
144 |             rows.append(row)
145 | 
146 |         if i % args.num_im == 0:
147 |             html_path = os.path.join(html_dir, "%s.html" % str(image_id).zfill(8))
148 | 
149 |         if (i + 1) % args.num_im == 0 or (i + 1) == len(image_ids):
150 |             var_dict = {'title': "Question List",
151 |                         'image_width': args.im_width,
152 |                         'headers': headers,
153 |                         'rows': rows,
154 |                         'show_im': args.show_im == 'True'}
155 |             with open(html_path, "wb") as f:
156 |                 f.write(template.render(**var_dict).encode('UTF-8'))
157 |             rows = []
158 |         pbar.update(i)
159 |     pbar.finish()
160 | 
161 |     os.system("ln -s %s/* %s" % (data_dir, html_dir))
162 |     os.chdir(html_dir)
163 |     port = args.port
164 |     host = args.host
165 |     # Overriding to suppress log messages
166 |     class MyHandler(http.server.SimpleHTTPRequestHandler):
167 |         def log_message(self, format, *args):
168 |             pass
169 |     handler = MyHandler
170 |     httpd = socketserver.TCPServer((host, port), handler)
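# Usage sketch (hypothetical config/epoch values): for model 5 with a config
# named "default", rendering validation results from epoch 10, run from the
# repository root:
#
#     PYTHONPATH=. python vis/list_results.py 5 default val 10 --port 8000
#
# which reads configs/m05.json and evals/m05/default/val_0010.json, then
# serves the generated pages on the chosen port.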
171 |     if args.open == 'True':
172 |         os.system("open http://%s:%d" % (args.host, args.port))
173 |     print("serving at %s:%d" % (host, port))
174 |     httpd.serve_forever()
175 | 
176 | 
177 | if __name__ == "__main__":
178 |     ARGS = get_args()
179 |     list_results(ARGS)
180 | 
--------------------------------------------------------------------------------
/vis/list_vqa_questions.py:
--------------------------------------------------------------------------------
1 | import argparse
2 | import json
3 | import os
4 | 
5 | from jinja2 import Environment, FileSystemLoader
6 | 
7 | parser = argparse.ArgumentParser()
8 | parser.add_argument("root_dir")
9 | parser.add_argument("--images_dir", default='images')
10 | parser.add_argument("--questions_name", default='questions.json')
11 | parser.add_argument("--annotations_name", default="annotations.json")
12 | parser.add_argument("--start", default=0, type=int)
13 | parser.add_argument("--stop", default=-100, type=int)
14 | parser.add_argument("--html_path", default="/tmp/main.html")
15 | parser.add_argument("--image_width", default=200, type=int)
16 | parser.add_argument("--ext", default='.png')
17 | parser.add_argument("--prefix", default='')
18 | parser.add_argument("--zfill_width", default=12, type=int)
19 | parser.add_argument("--template_name", default="list_questions.html")
20 | 
21 | ARGS = parser.parse_args()
22 | 
23 | env = Environment(loader=FileSystemLoader(os.path.join(os.path.dirname(os.path.realpath(__file__)), 'templates')))  # resolve templates/ relative to this file, not the cwd
24 | 
25 | 
26 | def main(args):
27 |     root_dir = args.root_dir
28 |     images_dir = os.path.join(root_dir, args.images_dir)
29 |     questions_path = os.path.join(root_dir, args.questions_name)
30 |     annotations_path = os.path.join(root_dir, args.annotations_name)
31 |     html_path = args.html_path
32 | 
33 |     def _get_image_url(image_id):
34 |         return os.path.join(images_dir, "%s%s%s" % (args.prefix, image_id.zfill(args.zfill_width), args.ext))
35 | 
36 |     questions_dict = json.load(open(questions_path, "r"))
37 |     annotations_dict = json.load(open(annotations_path, "r"))
38 | 
39 |     headers = ['image_id', 'question_id', 'image', 'question', 'choices', 'answer']
40 |     row_dict = {question['question_id']:
41 |                     {'image_id': question['image_id'],
42 |                      'question_id': question['question_id'],
43 |                      'image_url': _get_image_url(question['image_id']),
44 |                      'question': question['question'],
45 |                      'choices': question['multiple_choices'],
46 |                      'answer': annotation['multiple_choice_answer']}
47 |                 for question, annotation in zip(questions_dict['questions'], annotations_dict['annotations'])}
48 |     idxs = sorted(row_dict)[args.start:args.stop]  # question ids need not be contiguous
49 |     rows = [row_dict[idx] for idx in idxs]
50 |     template = env.get_template(args.template_name)
51 |     vars_dict = {'title': "Question List: %d - %d" % (args.start, args.stop - 1),
52 |                  'image_width': args.image_width,
53 |                  'headers': headers,
54 |                  'rows': rows}
55 |     out = template.render(**vars_dict)
56 |     with open(html_path, "wb") as f:
57 |         f.write(out.encode('UTF-8'))
58 | 
59 |     os.system("open %s" % html_path)
60 | 
61 | if __name__ == "__main__":
62 |     main(ARGS)
63 | 
--------------------------------------------------------------------------------
/vis/templates/list_dqa_questions.html:
--------------------------------------------------------------------------------
1 | <!DOCTYPE html>
2 | <html>
3 | <head>
4 |     <meta charset="UTF-8">
5 |     <title>{{ title }}</title>
6 | </head>
7 | <style>
8 |     table, th, td { border: 1px solid black; }
9 | </style>
10 | 
11 | <body>
12 | <h1>{{ title }}</h1>
13 | <table>
14 |     <tr>
15 |         {% for header in headers %}
16 |         <th>{{ header }}</th>
17 |         {% endfor %}
18 |     </tr>
19 |     {% for row in rows %}
20 |     <tr>
21 |         <td>{{ row.image_id }}</td>
22 |         <td>{{ row.question_id }}</td>
23 |         <td>
24 |             {% if show_im == 'True' %}
25 |             <img src="{{ row.image_url }}" width="{{ image_width }}">
26 |             {% endif %}
27 |         </td>
28 |         <td>{{ row.question }}</td>
29 |         <td>
30 |             <ol>
31 |                 {% for choice in row.choices %}
32 |                 <li>{{ choice }}</li>
33 |                 {% endfor %}
34 |             </ol>
35 |         </td>
36 |         <td>{{ row.answer }}</td>
37 |         <td><a href="{{ row.anno_url }}">Open</a></td>
38 |     </tr>
39 |     {% endfor %}
40 | </table>
41 | </body>
42 | </html>
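<!-- Example of what one rendered row is assumed to look like (hypothetical
     data): image_id 103, question "What does the root absorb?", four choices
     rendered as an ordered list, answer index 2, and an "Open" link pointing
     at annotations/103.png.json. -->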
--------------------------------------------------------------------------------
/vis/templates/list_facts.html:
--------------------------------------------------------------------------------
1 | <!DOCTYPE html>
2 | <html>
3 | <head>
4 |     <meta charset="UTF-8">
5 |     <title>{{ title }}</title>
6 | </head>
7 | <style>
8 |     table, th, td { border: 1px solid black; }
9 | </style>
10 | 
11 | <body>
12 | <h1>{{ title }}</h1>
13 | <table>
14 |     <tr>
15 |         {% for header in headers %}
16 |         <th>{{ header }}</th>
17 |         {% endfor %}
18 |     </tr>
19 |     {% for row in rows %}
20 |     <tr>
21 |         <td>{{ row.image_id }}</td>
22 |         <td>{{ row.question_id }}</td>
23 |         <td>
24 |             {% if show_im %}
25 |             <img src="{{ row.image_url }}" width="{{ image_width }}">
26 |             {% else %}
27 |             Image hidden.
28 |             {% endif %}
29 |         </td>
30 |         <td>
31 |             <ol>
32 |                 {% for sent in row.sents %}
33 |                 <li>{{ sent }}</li>
34 |                 {% endfor %}
35 |             </ol>
36 |         </td>
37 |         <td>{{ row.answer }}</td>
38 |         <td><a href="{{ row.anno_url }}">Open</a></td>
39 |         <td>
40 |             <ol>
41 |                 {% for fact in row.facts %}
42 |                 <li>{{ fact }}</li>
43 |                 {% endfor %}
44 |             </ol>
45 |         </td>
46 |     </tr>
47 |     {% endfor %}
48 | </table>
49 | </body>
50 | </html>
--------------------------------------------------------------------------------
/vis/templates/list_questions.html:
--------------------------------------------------------------------------------
1 | <!DOCTYPE html>
2 | <html>
3 | <head>
4 |     <meta charset="UTF-8">
5 |     <title>{{ title }}</title>
6 | </head>
7 | <style>
8 |     table, th, td { border: 1px solid black; }
9 | </style>
10 | 
11 | <body>
12 | <h1>{{ title }}</h1>
13 | <table>
14 |     <tr>
15 |         {% for header in headers %}
16 |         <th>{{ header }}</th>
17 |         {% endfor %}
18 |     </tr>
19 |     {% for row in rows %}
20 |     <tr>
21 |         <td>{{ row.image_id }}</td>
22 |         <td>{{ row.question_id }}</td>
23 |         <td>
24 |             <img src="{{ row.image_url }}" width="{{ image_width }}">
25 |         </td>
26 |         <td>{{ row.question }}</td>
27 |         <td>
28 |             <ol>
29 |                 {% for choice in row.choices %}
30 |                 <li>{{ choice }}</li>
31 |                 {% endfor %}
32 |             </ol>
33 |         </td>
34 |         <td>{{ row.answer }}</td>
35 |     </tr>
36 |     {% endfor %}
37 | </table>
38 | </body>
39 | </html>
--------------------------------------------------------------------------------
/vis/templates/list_relations.html:
--------------------------------------------------------------------------------
1 | <!DOCTYPE html>
2 | <html>
3 | <head>
4 |     <meta charset="UTF-8">
5 |     <title>{{ title }}</title>
6 | </head>
7 | <style>
8 |     table, th, td { border: 1px solid black; }
9 | </style>
10 | 
11 | <body>
12 | <h1>{{ title }}</h1>
13 | <table>
14 |     <tr>
15 |         {% for header in headers %}
16 |         <th>{{ header }}</th>
17 |         {% endfor %}
18 |     </tr>
19 |     {% for row in rows %}
20 |     <tr>
21 |         <td>{{ row.image_id }}</td>
22 |         <td>{{ row.question_id }}</td>
23 |         <td>
24 |             {% if show_im %}
25 |             <img src="{{ row.image_url }}" width="{{ image_width }}">
26 |             {% else %}
27 |             Image hidden.
28 |             {% endif %}
29 |         </td>
30 |         <td>
31 |             <ol>
32 |                 {% for sent in row.sents %}
33 |                 <li>{{ sent }}</li>
34 |                 {% endfor %}
35 |             </ol>
36 |         </td>
37 |         <td>{{ row.answer }}</td>
38 |         <td><a href="{{ row.anno_url }}">Open</a></td>
39 |         <td>
40 |             <ol>
41 |                 {% for relation in row.relations %}
42 |                 <li>{{ relation.a1 }} -> {{ relation.a2 }}</li>
43 |                 {% endfor %}
44 |             </ol>
45 |         </td>
46 |     </tr>
47 |     {% endfor %}
48 | </table>
49 | </body>
50 | </html>
--------------------------------------------------------------------------------
/vis/templates/list_results.html:
--------------------------------------------------------------------------------
1 | <!DOCTYPE html>
2 | <html>
3 | <head>
4 |     <meta charset="UTF-8">
5 |     <title>{{ title }}</title>
6 | </head>
7 | 
8 | <style>
9 |     table, th, td { border: 1px solid black; }
10 |     td.correct { background-color: #ccffcc; }
11 |     td.wrong { background-color: #ffcccc; }
12 |     b { color: red; }
13 | </style>
14 | 
15 | <body>
16 | <h1>{{ title }}</h1>
17 | <table>
18 |     <tr>
19 |         {% for header in headers %}
20 |         <th>{{ header }}</th>
21 |         {% endfor %}
22 |     </tr>
23 |     {% for row in rows %}
24 |     <tr>
25 |         <td>{{ row.image_id }}</td>
26 |         <td>{{ row.question_id }}</td>
27 |         <td>
28 |             {% if show_im %}
29 |             <img src="{{ row.image_url }}" width="{{ image_width }}">
30 |             {% else %}
31 |             Image hidden.
32 |             {% endif %}
33 |         </td>
34 |         <td>
35 |             <ol>
36 |                 {% for sent in row.sents %}
37 |                 <li>
38 |                     {% if loop.index0 == row.answer %}
39 |                     <b>{{ sent }}</b>
40 |                     {% else %}
41 |                     {{ sent }}
42 |                     {% endif %}
43 |                 </li>
44 |                 {% endfor %}
45 |             </ol>
46 |         </td>
47 |         <td><a href="{{ row.anno_url }}">Open</a></td>
48 |         <td>
49 |             <ol>
50 |                 {% for fact in row.facts %}
51 |                 <li>{{ fact }}</li>
52 |                 {% endfor %}
53 |             </ol>
54 |         </td>
55 |         <td>
56 |             <table>
57 |                 {% for a in row.p %}
58 |                 <tr>
59 |                     <th>{{ loop.index0 }}</th>
60 |                     {% for b in a %}
61 |                     <td>{{ b }}</td>
62 |                     {% endfor %}
63 |                 </tr>
64 |                 {% endfor %}
65 |             </table>
66 |         </td>
67 |         {% if row.correct %}
68 |         <td class="correct">
69 |         {% else %}
70 |         <td class="wrong">
71 |         {% endif %}
72 |             <ol>
73 |                 {% for x in row.yp %}
74 |                 <li>
75 |                     {% if loop.index0 == row.ap %}
76 |                     <b>{{ x }}</b>
77 |                     {% else %}
78 |                     {{ x }}
79 |                     {% endif %}
80 |                 </li>
81 |                 {% endfor %}
82 |             </ol>
83 |         </td>
84 |     </tr>
85 |     {% endfor %}
86 | </table>
87 | </body>
88 | </html>
--------------------------------------------------------------------------------