├── Graph2Seq-master ├── LICENSE ├── README.md ├── batch_helper.py ├── data │ ├── .DS_Store │ ├── no_cycle │ │ ├── dev.data │ │ ├── test.data │ │ └── train.data │ └── word.idx ├── data_creator.py ├── main │ ├── __init__.py │ ├── __pycache__ │ │ ├── aggregators.cpython-36.pyc │ │ ├── configure.cpython-36.pyc │ │ ├── configure.cpython-37.pyc │ │ ├── data_collector.cpython-36.pyc │ │ ├── data_collector.cpython-37.pyc │ │ ├── evaluator.cpython-36.pyc │ │ ├── helpers.cpython-36.pyc │ │ ├── inits.cpython-36.pyc │ │ ├── layer_utils.cpython-36.pyc │ │ ├── layers.cpython-36.pyc │ │ ├── loaderAndwriter.cpython-36.pyc │ │ ├── loaderAndwriter.cpython-37.pyc │ │ ├── match_utils.cpython-36.pyc │ │ ├── model.cpython-36.pyc │ │ ├── model.cpython-37.pyc │ │ ├── neigh_samplers.cpython-36.pyc │ │ ├── pooling.cpython-36.pyc │ │ └── text_decoder.cpython-36.pyc │ ├── aggregators.py │ ├── batch_helper.py │ ├── configure.py │ ├── data_collector.py │ ├── data_creator.py │ ├── evaluator.py │ ├── helpers.py │ ├── inits.py │ ├── layer_utils.py │ ├── layers.py │ ├── loaderAndwriter.py │ ├── match_utils.py │ ├── model.py │ ├── neigh_samplers.py │ ├── pooling.py │ ├── run_model.py │ └── text_decoder.py └── saved_model │ ├── checkpoint │ ├── model-0.data-00000-of-00001 │ ├── model-0.index │ └── model-0.meta ├── LICENSE └── README.md /Graph2Seq-master/LICENSE: -------------------------------------------------------------------------------- 1 | Apache License 2 | Version 2.0, January 2004 3 | http://www.apache.org/licenses/ 4 | 5 | TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION 6 | 7 | 1. Definitions. 8 | 9 | "License" shall mean the terms and conditions for use, reproduction, 10 | and distribution as defined by Sections 1 through 9 of this document. 11 | 12 | "Licensor" shall mean the copyright owner or entity authorized by 13 | the copyright owner that is granting the License. 
14 | 15 | "Legal Entity" shall mean the union of the acting entity and all 16 | other entities that control, are controlled by, or are under common 17 | control with that entity. For the purposes of this definition, 18 | "control" means (i) the power, direct or indirect, to cause the 19 | direction or management of such entity, whether by contract or 20 | otherwise, or (ii) ownership of fifty percent (50%) or more of the 21 | outstanding shares, or (iii) beneficial ownership of such entity. 22 | 23 | "You" (or "Your") shall mean an individual or Legal Entity 24 | exercising permissions granted by this License. 25 | 26 | "Source" form shall mean the preferred form for making modifications, 27 | including but not limited to software source code, documentation 28 | source, and configuration files. 29 | 30 | "Object" form shall mean any form resulting from mechanical 31 | transformation or translation of a Source form, including but 32 | not limited to compiled object code, generated documentation, 33 | and conversions to other media types. 34 | 35 | "Work" shall mean the work of authorship, whether in Source or 36 | Object form, made available under the License, as indicated by a 37 | copyright notice that is included in or attached to the work 38 | (an example is provided in the Appendix below). 39 | 40 | "Derivative Works" shall mean any work, whether in Source or Object 41 | form, that is based on (or derived from) the Work and for which the 42 | editorial revisions, annotations, elaborations, or other modifications 43 | represent, as a whole, an original work of authorship. For the purposes 44 | of this License, Derivative Works shall not include works that remain 45 | separable from, or merely link (or bind by name) to the interfaces of, 46 | the Work and Derivative Works thereof. 
47 | 48 | "Contribution" shall mean any work of authorship, including 49 | the original version of the Work and any modifications or additions 50 | to that Work or Derivative Works thereof, that is intentionally 51 | submitted to Licensor for inclusion in the Work by the copyright owner 52 | or by an individual or Legal Entity authorized to submit on behalf of 53 | the copyright owner. For the purposes of this definition, "submitted" 54 | means any form of electronic, verbal, or written communication sent 55 | to the Licensor or its representatives, including but not limited to 56 | communication on electronic mailing lists, source code control systems, 57 | and issue tracking systems that are managed by, or on behalf of, the 58 | Licensor for the purpose of discussing and improving the Work, but 59 | excluding communication that is conspicuously marked or otherwise 60 | designated in writing by the copyright owner as "Not a Contribution." 61 | 62 | "Contributor" shall mean Licensor and any individual or Legal Entity 63 | on behalf of whom a Contribution has been received by Licensor and 64 | subsequently incorporated within the Work. 65 | 66 | 2. Grant of Copyright License. Subject to the terms and conditions of 67 | this License, each Contributor hereby grants to You a perpetual, 68 | worldwide, non-exclusive, no-charge, royalty-free, irrevocable 69 | copyright license to reproduce, prepare Derivative Works of, 70 | publicly display, publicly perform, sublicense, and distribute the 71 | Work and such Derivative Works in Source or Object form. 72 | 73 | 3. Grant of Patent License. 
Subject to the terms and conditions of 74 | this License, each Contributor hereby grants to You a perpetual, 75 | worldwide, non-exclusive, no-charge, royalty-free, irrevocable 76 | (except as stated in this section) patent license to make, have made, 77 | use, offer to sell, sell, import, and otherwise transfer the Work, 78 | where such license applies only to those patent claims licensable 79 | by such Contributor that are necessarily infringed by their 80 | Contribution(s) alone or by combination of their Contribution(s) 81 | with the Work to which such Contribution(s) was submitted. If You 82 | institute patent litigation against any entity (including a 83 | cross-claim or counterclaim in a lawsuit) alleging that the Work 84 | or a Contribution incorporated within the Work constitutes direct 85 | or contributory patent infringement, then any patent licenses 86 | granted to You under this License for that Work shall terminate 87 | as of the date such litigation is filed. 88 | 89 | 4. Redistribution. 
You may reproduce and distribute copies of the 90 | Work or Derivative Works thereof in any medium, with or without 91 | modifications, and in Source or Object form, provided that You 92 | meet the following conditions: 93 | 94 | (a) You must give any other recipients of the Work or 95 | Derivative Works a copy of this License; and 96 | 97 | (b) You must cause any modified files to carry prominent notices 98 | stating that You changed the files; and 99 | 100 | (c) You must retain, in the Source form of any Derivative Works 101 | that You distribute, all copyright, patent, trademark, and 102 | attribution notices from the Source form of the Work, 103 | excluding those notices that do not pertain to any part of 104 | the Derivative Works; and 105 | 106 | (d) If the Work includes a "NOTICE" text file as part of its 107 | distribution, then any Derivative Works that You distribute must 108 | include a readable copy of the attribution notices contained 109 | within such NOTICE file, excluding those notices that do not 110 | pertain to any part of the Derivative Works, in at least one 111 | of the following places: within a NOTICE text file distributed 112 | as part of the Derivative Works; within the Source form or 113 | documentation, if provided along with the Derivative Works; or, 114 | within a display generated by the Derivative Works, if and 115 | wherever such third-party notices normally appear. The contents 116 | of the NOTICE file are for informational purposes only and 117 | do not modify the License. You may add Your own attribution 118 | notices within Derivative Works that You distribute, alongside 119 | or as an addendum to the NOTICE text from the Work, provided 120 | that such additional attribution notices cannot be construed 121 | as modifying the License. 
122 | 123 | You may add Your own copyright statement to Your modifications and 124 | may provide additional or different license terms and conditions 125 | for use, reproduction, or distribution of Your modifications, or 126 | for any such Derivative Works as a whole, provided Your use, 127 | reproduction, and distribution of the Work otherwise complies with 128 | the conditions stated in this License. 129 | 130 | 5. Submission of Contributions. Unless You explicitly state otherwise, 131 | any Contribution intentionally submitted for inclusion in the Work 132 | by You to the Licensor shall be under the terms and conditions of 133 | this License, without any additional terms or conditions. 134 | Notwithstanding the above, nothing herein shall supersede or modify 135 | the terms of any separate license agreement you may have executed 136 | with Licensor regarding such Contributions. 137 | 138 | 6. Trademarks. This License does not grant permission to use the trade 139 | names, trademarks, service marks, or product names of the Licensor, 140 | except as required for reasonable and customary use in describing the 141 | origin of the Work and reproducing the content of the NOTICE file. 142 | 143 | 7. Disclaimer of Warranty. Unless required by applicable law or 144 | agreed to in writing, Licensor provides the Work (and each 145 | Contributor provides its Contributions) on an "AS IS" BASIS, 146 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or 147 | implied, including, without limitation, any warranties or conditions 148 | of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A 149 | PARTICULAR PURPOSE. You are solely responsible for determining the 150 | appropriateness of using or redistributing the Work and assume any 151 | risks associated with Your exercise of permissions under this License. 152 | 153 | 8. Limitation of Liability. 
In no event and under no legal theory, 154 | whether in tort (including negligence), contract, or otherwise, 155 | unless required by applicable law (such as deliberate and grossly 156 | negligent acts) or agreed to in writing, shall any Contributor be 157 | liable to You for damages, including any direct, indirect, special, 158 | incidental, or consequential damages of any character arising as a 159 | result of this License or out of the use or inability to use the 160 | Work (including but not limited to damages for loss of goodwill, 161 | work stoppage, computer failure or malfunction, or any and all 162 | other commercial damages or losses), even if such Contributor 163 | has been advised of the possibility of such damages. 164 | 165 | 9. Accepting Warranty or Additional Liability. While redistributing 166 | the Work or Derivative Works thereof, You may choose to offer, 167 | and charge a fee for, acceptance of support, warranty, indemnity, 168 | or other liability obligations and/or rights consistent with this 169 | License. However, in accepting such obligations, You may act only 170 | on Your own behalf and on Your sole responsibility, not on behalf 171 | of any other Contributor, and only if You agree to indemnify, 172 | defend, and hold each Contributor harmless for any liability 173 | incurred by, or claims asserted against, such Contributor by reason 174 | of your accepting any such warranty or additional liability. 175 | 176 | END OF TERMS AND CONDITIONS 177 | 178 | APPENDIX: How to apply the Apache License to your work. 179 | 180 | To apply the Apache License to your work, attach the following 181 | boilerplate notice, with the fields enclosed by brackets "[]" 182 | replaced with your own identifying information. (Don't include 183 | the brackets!) The text should be enclosed in the appropriate 184 | comment syntax for the file format. 
We also recommend that a 185 | file or class name and description of purpose be included on the 186 | same "printed page" as the copyright notice for easier 187 | identification within third-party archives. 188 | 189 | Copyright [yyyy] [name of copyright owner] 190 | 191 | Licensed under the Apache License, Version 2.0 (the "License"); 192 | you may not use this file except in compliance with the License. 193 | You may obtain a copy of the License at 194 | 195 | http://www.apache.org/licenses/LICENSE-2.0 196 | 197 | Unless required by applicable law or agreed to in writing, software 198 | distributed under the License is distributed on an "AS IS" BASIS, 199 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 200 | See the License for the specific language governing permissions and 201 | limitations under the License. 202 | -------------------------------------------------------------------------------- /Graph2Seq-master/README.md: -------------------------------------------------------------------------------- 1 | # Graph2Seq 2 | Graph2Seq is a simple codebase for building a graph encoder and sequence decoder for NLP and other AI/ML/DL tasks. 3 | 4 | # How To Run The Code 5 | To train your graph-to-sequence model, you need to: 6 | 7 | (1) Prepare your train/dev/test data in the following form: 8 | 9 | each line is a JSON object whose keys are "seq", "g_ids", "g_ids_features", "g_adj": 10 | "seq" is the text that the decoder is expected to output 11 | "g_ids" is a mapping from the node ID to its index in the graph 12 | "g_ids_features" is a mapping from the node ID to its text features 13 | "g_adj" is a mapping from the node ID to its adjacent nodes (represented as their IDs) 14 | 15 | See data/no_cycle/train.data for examples. 
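To make the expected format concrete, here is a minimal sketch of how one such line could be built. The graph, node features, and "seq" value below are made up for illustration; the key names match what data_creator.py actually writes (note it uses "g_ids_features"):

```python
import json

# A hypothetical 4-node example (all values are illustrative only).
# "seq" is the decoder target: the node features along the START -> END path.
record = {
    "seq": "START 7 END",
    "g_ids": {"0": 0, "1": 1, "2": 2, "3": 3},           # node ID -> index in the graph
    "g_ids_features": {"0": "START", "1": "7",           # node ID -> text feature
                       "2": "12", "3": "END"},
    "g_adj": {"0": [1, 2], "1": [3], "2": [3], "3": []}  # node ID -> adjacent node IDs
}

# Each line of train/dev/test.data is one JSON object like this.
line = json.dumps(record)
print(line)
```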
16 | 17 | 18 | (2) Modify the hyper-parameters in main/configure.py according to your task 19 | 20 | (3) Train the model by running the following command: 21 | "python run_model.py train -sample_size_per_layer=xxx -sample_layer_size=yyy" 22 | The model that performs best on the dev data will be saved in the dir "saved_model" 23 | 24 | (4) Test the model by running the following command: 25 | "python run_model.py test -sample_size_per_layer=xxx -sample_layer_size=yyy" 26 | The prediction result will be saved in saved_model/prediction.txt 27 | 28 | 29 | 30 | # How To Cite The Code 31 | Please cite our work if you like it or are using our code for your projects! 32 | 33 | Kun Xu, Lingfei Wu, Zhiguo Wang, Yansong Feng, Michael Witbrock, and Vadim Sheinin (first and second authors contributed equally), "Graph2Seq: Graph to Sequence Learning with Attention-based Neural Networks", arXiv preprint arXiv:1804.00823. 34 | 35 | @article{xu2018graph2seq,
36 | title={Graph2Seq: Graph to Sequence Learning with Attention-based Neural Networks},
37 | author={Xu, Kun and Wu, Lingfei and Wang, Zhiguo and Feng, Yansong and Witbrock, Michael and Sheinin, Vadim},
38 | journal={arXiv preprint arXiv:1804.00823},
39 | year={2018}
40 | }
41 | 42 | ------------------------------------------------------ 43 | Contributors: Kun Xu, Lingfei Wu
44 | Created date: November 4, 2018
45 | Last update: November 4, 2018
46 | 47 | -------------------------------------------------------------------------------- /Graph2Seq-master/batch_helper.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/IBM/Text-to-LogicForm/515e73db1795019007c1b3a3ed22b10c349f555a/Graph2Seq-master/batch_helper.py -------------------------------------------------------------------------------- /Graph2Seq-master/data/.DS_Store: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/IBM/Text-to-LogicForm/515e73db1795019007c1b3a3ed22b10c349f555a/Graph2Seq-master/data/.DS_Store -------------------------------------------------------------------------------- /Graph2Seq-master/data/word.idx: -------------------------------------------------------------------------------- 1 | 1 2 | 2 3 | 3 4 | START 4 5 | 6 5 6 | 5 6 7 | END 7 8 | 12 8 9 | 15 9 10 | 7 10 11 | 10 11 12 | 11 12 13 | 14 13 14 | 1 14 15 | 8 15 16 | 4 16 17 | 2 17 18 | 3 18 19 | 13 19 20 | 9 20 21 | -------------------------------------------------------------------------------- /Graph2Seq-master/data_creator.py: -------------------------------------------------------------------------------- 1 | """This python file is used to construct the fake data for the model.""" 2 | import random 3 | import json 4 | import numpy as np 5 | 6 | import networkx as nx  # nx alias is used throughout; this module relies on the NetworkX 1.x API (graph.edge, adjacency_list) 7 | import networkx.algorithms as nxalg 8 | 9 | def create_random_graph(type, filePath, numberOfCase, graph_scale): 10 | """ 11 | 12 | :param type: the graph type 13 | :param filePath: the output file path 14 | :param numberOfCase: the number of examples 15 | :return: 16 | """ 17 | with open(filePath, "w+") as f: 18 | degree = 0.0 19 | for _ in range(numberOfCase): 20 | info = {} 21 | graph_node_size = graph_scale 22 | edge_prob = 0.3 23 | 24 | while True: 25 | edge_count = 0.0 26 | if type == "random": 27 | graph = nx.gnp_random_graph(graph_node_size, edge_prob, directed=True) 28 
| for id in graph.edge: 29 | edge_count += len(graph.edge[id]) 30 | start = random.randint(0, graph_node_size - 1) 31 | adj = nx.shortest_path(graph, start) 32 | 33 | max_len = 0 34 | path = [] 35 | paths = [] 36 | for neighbor in adj: 37 | if len(adj[neighbor]) > max_len and neighbor != start: 38 | paths = [] 39 | max_len = len(adj[neighbor]) 40 | path = adj[neighbor] 41 | end = neighbor 42 | for p in nx.all_shortest_paths(graph, start, end): 43 | paths.append(p) 44 | 45 | if len(path) > 0 and path[0] == start and len(path) == 3 and len(paths) == 1: 46 | degree += edge_count / graph_node_size 47 | break 48 | 49 | elif type == "no-cycle": 50 | graph = nx.DiGraph() 51 | for i in range(graph_node_size): 52 | nodes = graph.nodes() 53 | if len(nodes) == 0: 54 | graph.add_node(i) 55 | else: 56 | size = random.randint(1, min(i, 2)); 57 | fathers = random.sample(range(0, i), size) 58 | for father in fathers: 59 | graph.add_edge(father, i) 60 | for id in graph.edge: 61 | edge_count += len(graph.edge[id]) 62 | start = 0 63 | end = graph_node_size-1 64 | path = nx.shortest_path(graph, 0, graph_node_size-1) 65 | paths = [p for p in nx.all_shortest_paths(graph, 0, graph_node_size-1)] 66 | if len(path) >= 4 and len(paths) == 1: 67 | degree += edge_count / graph_node_size 68 | break 69 | 70 | elif type == "baseline": 71 | num_nodes = graph_node_size 72 | graph = nx.random_graphs.connected_watts_strogatz_graph(num_nodes, 3, edge_prob) 73 | for id in graph.edge: 74 | edge_count += len(graph.edge[id]) 75 | start, end = np.random.randint(num_nodes, size=2) 76 | 77 | if start == end: 78 | continue # reject trivial paths 79 | 80 | paths = list(nxalg.all_shortest_paths(graph, source=start, target=end)) 81 | 82 | if len(paths) > 1: 83 | continue # reject when more than one shortest path 84 | 85 | path = paths[0] 86 | 87 | if len(path) != 4: 88 | continue 89 | degree += edge_count / graph_node_size 90 | break 91 | 92 | adj_list = graph.adjacency_list() 93 | 94 | 95 | g_ids = {} 96 | 
g_ids_features = {} 97 | g_adj = {} 98 | for i in range(graph_node_size): 99 | g_ids[i] = i 100 | if i == start: 101 | g_ids_features[i] = "START" 102 | elif i == end: 103 | g_ids_features[i] = "END" 104 | else: 105 | # g_ids_features[i] = str(i+10) 106 | g_ids_features[i] = str(random.randint(1, 15)) 107 | g_adj[i] = adj_list[i] 108 | 109 | # print start, end, path 110 | text = "" 111 | for id in path: 112 | text += g_ids_features[id] + " " 113 | 114 | info["seq"] = text.strip() 115 | info["g_ids"] = g_ids 116 | info['g_ids_features'] = g_ids_features 117 | info['g_adj'] = g_adj 118 | f.write(json.dumps(info)+"\n") 119 | 120 | print("average degree in the graph is :{}".format(degree/numberOfCase)) 121 | 122 | if __name__ == "__main__": 123 | create_random_graph("no-cycle", "data/no_cycle/train.data", 1000, graph_scale=100) 124 | create_random_graph("no-cycle", "data/no_cycle/dev.data", 1000, graph_scale=100) 125 | create_random_graph("no-cycle", "data/no_cycle/test.data", 1000, graph_scale=100) 126 | 127 | 128 | -------------------------------------------------------------------------------- /Graph2Seq-master/main/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/IBM/Text-to-LogicForm/515e73db1795019007c1b3a3ed22b10c349f555a/Graph2Seq-master/main/__init__.py -------------------------------------------------------------------------------- /Graph2Seq-master/main/__pycache__/aggregators.cpython-36.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/IBM/Text-to-LogicForm/515e73db1795019007c1b3a3ed22b10c349f555a/Graph2Seq-master/main/__pycache__/aggregators.cpython-36.pyc -------------------------------------------------------------------------------- /Graph2Seq-master/main/__pycache__/configure.cpython-36.pyc: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/IBM/Text-to-LogicForm/515e73db1795019007c1b3a3ed22b10c349f555a/Graph2Seq-master/main/__pycache__/configure.cpython-36.pyc -------------------------------------------------------------------------------- /Graph2Seq-master/main/__pycache__/configure.cpython-37.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/IBM/Text-to-LogicForm/515e73db1795019007c1b3a3ed22b10c349f555a/Graph2Seq-master/main/__pycache__/configure.cpython-37.pyc -------------------------------------------------------------------------------- /Graph2Seq-master/main/__pycache__/data_collector.cpython-36.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/IBM/Text-to-LogicForm/515e73db1795019007c1b3a3ed22b10c349f555a/Graph2Seq-master/main/__pycache__/data_collector.cpython-36.pyc -------------------------------------------------------------------------------- /Graph2Seq-master/main/__pycache__/data_collector.cpython-37.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/IBM/Text-to-LogicForm/515e73db1795019007c1b3a3ed22b10c349f555a/Graph2Seq-master/main/__pycache__/data_collector.cpython-37.pyc -------------------------------------------------------------------------------- /Graph2Seq-master/main/__pycache__/evaluator.cpython-36.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/IBM/Text-to-LogicForm/515e73db1795019007c1b3a3ed22b10c349f555a/Graph2Seq-master/main/__pycache__/evaluator.cpython-36.pyc -------------------------------------------------------------------------------- /Graph2Seq-master/main/__pycache__/helpers.cpython-36.pyc: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/IBM/Text-to-LogicForm/515e73db1795019007c1b3a3ed22b10c349f555a/Graph2Seq-master/main/__pycache__/helpers.cpython-36.pyc -------------------------------------------------------------------------------- /Graph2Seq-master/main/__pycache__/inits.cpython-36.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/IBM/Text-to-LogicForm/515e73db1795019007c1b3a3ed22b10c349f555a/Graph2Seq-master/main/__pycache__/inits.cpython-36.pyc -------------------------------------------------------------------------------- /Graph2Seq-master/main/__pycache__/layer_utils.cpython-36.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/IBM/Text-to-LogicForm/515e73db1795019007c1b3a3ed22b10c349f555a/Graph2Seq-master/main/__pycache__/layer_utils.cpython-36.pyc -------------------------------------------------------------------------------- /Graph2Seq-master/main/__pycache__/layers.cpython-36.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/IBM/Text-to-LogicForm/515e73db1795019007c1b3a3ed22b10c349f555a/Graph2Seq-master/main/__pycache__/layers.cpython-36.pyc -------------------------------------------------------------------------------- /Graph2Seq-master/main/__pycache__/loaderAndwriter.cpython-36.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/IBM/Text-to-LogicForm/515e73db1795019007c1b3a3ed22b10c349f555a/Graph2Seq-master/main/__pycache__/loaderAndwriter.cpython-36.pyc -------------------------------------------------------------------------------- /Graph2Seq-master/main/__pycache__/loaderAndwriter.cpython-37.pyc: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/IBM/Text-to-LogicForm/515e73db1795019007c1b3a3ed22b10c349f555a/Graph2Seq-master/main/__pycache__/loaderAndwriter.cpython-37.pyc -------------------------------------------------------------------------------- /Graph2Seq-master/main/__pycache__/match_utils.cpython-36.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/IBM/Text-to-LogicForm/515e73db1795019007c1b3a3ed22b10c349f555a/Graph2Seq-master/main/__pycache__/match_utils.cpython-36.pyc -------------------------------------------------------------------------------- /Graph2Seq-master/main/__pycache__/model.cpython-36.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/IBM/Text-to-LogicForm/515e73db1795019007c1b3a3ed22b10c349f555a/Graph2Seq-master/main/__pycache__/model.cpython-36.pyc -------------------------------------------------------------------------------- /Graph2Seq-master/main/__pycache__/model.cpython-37.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/IBM/Text-to-LogicForm/515e73db1795019007c1b3a3ed22b10c349f555a/Graph2Seq-master/main/__pycache__/model.cpython-37.pyc -------------------------------------------------------------------------------- /Graph2Seq-master/main/__pycache__/neigh_samplers.cpython-36.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/IBM/Text-to-LogicForm/515e73db1795019007c1b3a3ed22b10c349f555a/Graph2Seq-master/main/__pycache__/neigh_samplers.cpython-36.pyc -------------------------------------------------------------------------------- /Graph2Seq-master/main/__pycache__/pooling.cpython-36.pyc: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/IBM/Text-to-LogicForm/515e73db1795019007c1b3a3ed22b10c349f555a/Graph2Seq-master/main/__pycache__/pooling.cpython-36.pyc -------------------------------------------------------------------------------- /Graph2Seq-master/main/__pycache__/text_decoder.cpython-36.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/IBM/Text-to-LogicForm/515e73db1795019007c1b3a3ed22b10c349f555a/Graph2Seq-master/main/__pycache__/text_decoder.cpython-36.pyc -------------------------------------------------------------------------------- /Graph2Seq-master/main/aggregators.py: -------------------------------------------------------------------------------- 1 | import tensorflow as tf 2 | from layers import Layer, Dense 3 | from inits import glorot, zeros 4 | from pooling import mean_pool 5 | 6 | class GatedMeanAggregator(Layer): 7 | def __init__(self, input_dim, output_dim, neigh_input_dim=None, 8 | dropout=0, bias=True, act=tf.nn.relu, 9 | name=None, concat=False, **kwargs): 10 | super(GatedMeanAggregator, self).__init__(**kwargs) 11 | 12 | self.dropout = dropout 13 | self.bias = bias 14 | self.act = act 15 | self.concat = concat 16 | 17 | if name is not None: 18 | name = '/' + name 19 | else: 20 | name = '' 21 | 22 | if neigh_input_dim == None: 23 | neigh_input_dim = input_dim 24 | 25 | if concat: 26 | self.output_dim = 2 * output_dim 27 | 28 | with tf.variable_scope(self.name + name + '_vars'): 29 | self.vars['neigh_weights'] = glorot([neigh_input_dim, output_dim], 30 | name='neigh_weights') 31 | self.vars['self_weights'] = glorot([input_dim, output_dim], 32 | name='self_weights') 33 | if self.bias: 34 | self.vars['bias'] = zeros([self.output_dim], name='bias') 35 | 36 | self.vars['gate_weights'] = glorot([2*output_dim, 2*output_dim], 37 | name='gate_weights') 38 | self.vars['gate_bias'] = zeros([2*output_dim], name='bias') 39 | 40 | 41 | self.input_dim = input_dim 42 | self.output_dim = 
output_dim 43 | 44 | def _call(self, inputs): 45 | self_vecs, neigh_vecs = inputs 46 | 47 | neigh_vecs = tf.nn.dropout(neigh_vecs, 1-self.dropout) 48 | self_vecs = tf.nn.dropout(self_vecs, 1-self.dropout) 49 | 50 | neigh_means = tf.reduce_mean(neigh_vecs, axis=1) 51 | 52 | # [nodes] x [out_dim] 53 | from_neighs = tf.matmul(neigh_means, self.vars['neigh_weights']) 54 | 55 | from_self = tf.matmul(self_vecs, self.vars["self_weights"]) 56 | 57 | if not self.concat: 58 | output = tf.add_n([from_self, from_neighs]) 59 | else: 60 | output = tf.concat([from_self, from_neighs], axis=1) 61 | 62 | # bias 63 | if self.bias: 64 | output += self.vars['bias'] 65 | 66 | gate = tf.concat([from_self, from_neighs], axis=1) 67 | gate = tf.matmul(gate, self.vars["gate_weights"]) + self.vars["gate_bias"] 68 | gate = tf.nn.relu(gate) 69 | 70 | return gate*self.act(output) 71 | 72 | class MeanAggregator(Layer): 73 | """Aggregates via mean followed by matmul and non-linearity.""" 74 | 75 | def __init__(self, input_dim, output_dim, neigh_input_dim=None, 76 | dropout=0, bias=True, act=tf.nn.relu, 77 | name=None, concat=False, mode="train", **kwargs): 78 | super(MeanAggregator, self).__init__(**kwargs) 79 | 80 | self.dropout = dropout 81 | self.bias = bias 82 | self.act = act 83 | self.concat = concat 84 | self.mode = mode 85 | 86 | if name is not None: 87 | name = '/' + name 88 | else: 89 | name = '' 90 | 91 | if neigh_input_dim == None: 92 | neigh_input_dim = input_dim 93 | 94 | if concat: 95 | self.output_dim = 2 * output_dim 96 | 97 | with tf.variable_scope(self.name + name + '_vars'): 98 | self.vars['neigh_weights'] = glorot([neigh_input_dim, output_dim], 99 | name='neigh_weights') 100 | self.vars['self_weights'] = glorot([input_dim, output_dim], 101 | name='self_weights') 102 | if self.bias: 103 | self.vars['bias'] = zeros([self.output_dim], name='bias') 104 | 105 | self.input_dim = input_dim 106 | self.output_dim = output_dim 107 | 108 | def _call(self, inputs): 109 | self_vecs, 
neigh_vecs, neigh_len = inputs 110 | 111 | if self.mode == "train": 112 | neigh_vecs = tf.nn.dropout(neigh_vecs, 1-self.dropout) 113 | self_vecs = tf.nn.dropout(self_vecs, 1-self.dropout) 114 | 115 | # reduce_mean performs better than mean_pool 116 | neigh_means = tf.reduce_mean(neigh_vecs, axis=1) 117 | # neigh_means = mean_pool(neigh_vecs, neigh_len) 118 | 119 | # [nodes] x [out_dim] 120 | from_neighs = tf.matmul(neigh_means, self.vars['neigh_weights']) 121 | 122 | from_self = tf.matmul(self_vecs, self.vars["self_weights"]) 123 | 124 | if not self.concat: 125 | output = tf.add_n([from_self, from_neighs]) 126 | else: 127 | output = tf.concat([from_self, from_neighs], axis=1) 128 | 129 | # bias 130 | if self.bias: 131 | output += self.vars['bias'] 132 | 133 | return self.act(output) 134 | 135 | class MaxPoolingAggregator(Layer): 136 | """ Aggregates via max-pooling over MLP functions.""" 137 | def __init__(self, input_dim, output_dim, model_size="small", neigh_input_dim=None, 138 | dropout=0., bias=True, act=tf.nn.relu, name=None, concat=False, **kwargs): 139 | super(MaxPoolingAggregator, self).__init__(**kwargs) 140 | 141 | self.dropout = dropout 142 | self.bias = bias 143 | self.act = act 144 | self.concat = concat 145 | 146 | if name is not None: 147 | name = '/' + name 148 | else: 149 | name = '' 150 | 151 | if neigh_input_dim == None: 152 | neigh_input_dim = input_dim 153 | 154 | if concat: 155 | self.output_dim = 2 * output_dim 156 | 157 | if model_size == "small": 158 | hidden_dim = self.hidden_dim = 50 159 | elif model_size == "big": 160 | hidden_dim = self.hidden_dim = 50 161 | 162 | self.mlp_layers = [] 163 | self.mlp_layers.append(Dense(input_dim=neigh_input_dim, output_dim=hidden_dim, act=tf.nn.relu, 164 | dropout=dropout, sparse_inputs=False, logging=self.logging)) 165 | 166 | with tf.variable_scope(self.name + name + '_vars'): 167 | 168 | self.vars['neigh_weights'] = glorot([hidden_dim, output_dim], name='neigh_weights') 169 | 170 | 
self.vars['self_weights'] = glorot([input_dim, output_dim], name='self_weights') 171 | 172 | if self.bias: 173 | self.vars['bias'] = zeros([self.output_dim], name='bias') 174 | 175 | self.input_dim = input_dim 176 | self.output_dim = output_dim 177 | self.neigh_input_dim = neigh_input_dim 178 | 179 | def _call(self, inputs): 180 | self_vecs, neigh_vecs = inputs 181 | neigh_h = neigh_vecs 182 | 183 | dims = tf.shape(neigh_h) 184 | batch_size = dims[0] 185 | num_neighbors = dims[1] 186 | 187 | h_reshaped = tf.reshape(neigh_h, (batch_size * num_neighbors, self.neigh_input_dim)) 188 | 189 | for l in self.mlp_layers: 190 | h_reshaped = l(h_reshaped) 191 | neigh_h = tf.reshape(h_reshaped, (batch_size, num_neighbors, self.hidden_dim)) 192 | neigh_h = tf.reduce_max(neigh_h, axis=1) 193 | 194 | from_neighs = tf.matmul(neigh_h, self.vars['neigh_weights']) 195 | from_self = tf.matmul(self_vecs, self.vars["self_weights"]) 196 | 197 | if not self.concat: 198 | output = tf.add_n([from_self, from_neighs]) 199 | else: 200 | output = tf.concat([from_self, from_neighs], axis=1) 201 | 202 | # bias 203 | if self.bias: 204 | output += self.vars['bias'] 205 | return self.act(output) -------------------------------------------------------------------------------- /Graph2Seq-master/main/batch_helper.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/IBM/Text-to-LogicForm/515e73db1795019007c1b3a3ed22b10c349f555a/Graph2Seq-master/main/batch_helper.py -------------------------------------------------------------------------------- /Graph2Seq-master/main/configure.py: -------------------------------------------------------------------------------- 1 | train_data_path = "../data/no_cycle/train.data" 2 | dev_data_path = "../data/no_cycle/dev.data" 3 | test_data_path = "../data/no_cycle/test.data" 4 | 5 | word_idx_file_path = "../data/word.idx" 6 | 7 | word_embedding_dim = 100 8 | 9 | train_batch_size = 32 10 | dev_batch_size = 
500 11 | test_batch_size = 500 12 | 13 | l2_lambda = 0.000001 14 | learning_rate = 0.001 15 | epochs = 100 16 | encoder_hidden_dim = 200 17 | num_layers_decode = 1 18 | word_size_max = 1 19 | 20 | dropout = 0.0 21 | 22 | path_embed_method = "lstm" # cnn or lstm or bi-lstm 23 | 24 | unknown_word = "<unk>" 25 | PAD = "<pad>" 26 | GO = "<go>" 27 | EOS = "<eos>" 28 | deal_unknown_words = True 29 | 30 | seq_max_len = 11 31 | 32 | decoder_type = "greedy" # greedy, beam 33 | beam_width = 0 34 | attention = True 35 | num_layers = 1 # 1 or 2 36 | 37 | # the following are for the graph encoding method 38 | weight_decay = 0.0000 39 | sample_size_per_layer = 4 40 | sample_layer_size = 4 41 | hidden_layer_dim = 100 42 | feature_max_len = 1 43 | feature_encode_type = "uni" 44 | # graph_encode_method = "max-pooling" # "lstm" or "max-pooling" 45 | graph_encode_direction = "bi" # "single" or "bi" 46 | concat = True 47 | 48 | encoder = "gated_gcn" # "gated_gcn" "gcn" "seq" 49 | 50 | lstm_in_gcn = "none" # before, after, none 51 | -------------------------------------------------------------------------------- /Graph2Seq-master/main/data_collector.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | import json 3 | import configure as conf 4 | from collections import OrderedDict 5 | 6 | def read_data(input_path, word_idx, if_increase_dict): 7 | seqs = [] 8 | graphs = [] 9 | 10 | if if_increase_dict: 11 | word_idx[conf.GO] = 1 12 | word_idx[conf.EOS] = 2 13 | word_idx[conf.unknown_word] = 3 14 | 15 | with open(input_path, 'r') as f: 16 | lines = f.readlines() 17 | for line in lines: 18 | line = line.strip() 19 | jo = json.loads(line, object_pairs_hook=OrderedDict) 20 | seq = jo['seq'] 21 | seqs.append(seq) 22 | if if_increase_dict: 23 | for w in seq.split(): 24 | if w not in word_idx: 25 | word_idx[w] = len(word_idx) + 1 26 | 27 | for id in jo['g_ids_features']: 28 | features = jo['g_ids_features'][id] 29 | for w in features.split(): 30 | if w
not in word_idx: 31 | word_idx[w] = len(word_idx) + 1 32 | 33 | graph = {} 34 | graph['g_ids'] = jo['g_ids'] 35 | graph['g_ids_features'] = jo['g_ids_features'] 36 | graph['g_adj'] = jo['g_adj'] 37 | graphs.append(graph) 38 | 39 | return seqs, graphs 40 | 41 | def vectorize_data(word_idx, texts): 42 | tv = [] 43 | for text in texts: 44 | stv = [] 45 | for w in text.split(): 46 | if w not in word_idx: 47 | stv.append(word_idx[conf.unknown_word]) 48 | else: 49 | stv.append(word_idx[w]) 50 | tv.append(stv) 51 | return tv 52 | 53 | def cons_batch_graph(graphs): 54 | g_ids = {} 55 | g_ids_features = {} 56 | g_fw_adj = {} 57 | g_bw_adj = {} 58 | g_nodes = [] 59 | 60 | for g in graphs: 61 | ids = g['g_ids'] 62 | id_adj = g['g_adj'] 63 | features = g['g_ids_features'] 64 | 65 | nodes = [] 66 | 67 | # we first add all nodes into batch_graph and create a mapping from graph id to batch_graph id, this mapping will be 68 | # used in the creation of fw_adj and bw_adj 69 | 70 | id_gid_map = {} 71 | offset = len(g_ids.keys()) 72 | for id in ids: 73 | id = int(id) 74 | g_ids[offset + id] = len(g_ids.keys()) 75 | g_ids_features[offset + id] = features[str(id)] 76 | id_gid_map[id] = offset + id 77 | nodes.append(offset + id) 78 | g_nodes.append(nodes) 79 | 80 | for id in id_adj: 81 | adj = id_adj[id] 82 | id = int(id) 83 | g_id = id_gid_map[id] 84 | if g_id not in g_fw_adj: 85 | g_fw_adj[g_id] = [] 86 | for t in adj: 87 | t = int(t) 88 | g_t = id_gid_map[t] 89 | g_fw_adj[g_id].append(g_t) 90 | if g_t not in g_bw_adj: 91 | g_bw_adj[g_t] = [] 92 | g_bw_adj[g_t].append(g_id) 93 | 94 | node_size = len(g_ids.keys()) 95 | for id in range(node_size): 96 | if id not in g_fw_adj: 97 | g_fw_adj[id] = [] 98 | if id not in g_bw_adj: 99 | g_bw_adj[id] = [] 100 | 101 | graph = {} 102 | graph['g_ids'] = g_ids 103 | graph['g_ids_features'] = g_ids_features 104 | graph['g_nodes'] = g_nodes 105 | graph['g_fw_adj'] = g_fw_adj 106 | graph['g_bw_adj'] = g_bw_adj 107 | 108 | return graph 109 | 110 | def 
vectorize_batch_graph(graph, word_idx): 111 | # vectorize the graph feature and normalize the adj info 112 | id_features = graph['g_ids_features'] 113 | gv = {} 114 | nv = [] 115 | word_max_len = 0 116 | for id in id_features: 117 | feature = id_features[id] 118 | word_max_len = max(word_max_len, len(feature.split())) 119 | word_max_len = min(word_max_len, conf.word_size_max) 120 | 121 | for id in graph['g_ids_features']: 122 | feature = graph['g_ids_features'][id] 123 | fv = [] 124 | for token in feature.split(): 125 | if len(token) == 0: 126 | continue 127 | if token in word_idx: 128 | fv.append(word_idx[token]) 129 | else: 130 | fv.append(word_idx[conf.unknown_word]) 131 | 132 | for _ in range(word_max_len - len(fv)): 133 | fv.append(0) 134 | fv = fv[:word_max_len] 135 | nv.append(fv) 136 | 137 | nv.append([0 for temp in range(word_max_len)]) 138 | gv['g_ids_features'] = np.array(nv) 139 | 140 | g_fw_adj = graph['g_fw_adj'] 141 | g_fw_adj_v = [] 142 | 143 | degree_max_size = 0 144 | for id in g_fw_adj: 145 | degree_max_size = max(degree_max_size, len(g_fw_adj[id])) 146 | 147 | g_bw_adj = graph['g_bw_adj'] 148 | for id in g_bw_adj: 149 | degree_max_size = max(degree_max_size, len(g_bw_adj[id])) 150 | 151 | degree_max_size = min(degree_max_size, conf.sample_size_per_layer) 152 | 153 | for id in g_fw_adj: 154 | adj = g_fw_adj[id] 155 | for _ in range(degree_max_size - len(adj)): 156 | adj.append(len(g_fw_adj.keys())) 157 | adj = adj[:degree_max_size] 158 | g_fw_adj_v.append(adj) 159 | 160 | # PAD node directs to the PAD node 161 | g_fw_adj_v.append([len(g_fw_adj.keys()) for _ in range(degree_max_size)]) 162 | 163 | g_bw_adj_v = [] 164 | for id in g_bw_adj: 165 | adj = g_bw_adj[id] 166 | for _ in range(degree_max_size - len(adj)): 167 | adj.append(len(g_bw_adj.keys())) 168 | adj = adj[:degree_max_size] 169 | g_bw_adj_v.append(adj) 170 | 171 | # PAD node directs to the PAD node 172 | g_bw_adj_v.append([len(g_bw_adj.keys()) for _ in range(degree_max_size)]) 173 | 174 
| gv['g_ids'] = graph['g_ids'] 175 | gv['g_nodes'] =np.array(graph['g_nodes']) 176 | gv['g_bw_adj'] = np.array(g_bw_adj_v) 177 | gv['g_fw_adj'] = np.array(g_fw_adj_v) 178 | 179 | return gv 180 | -------------------------------------------------------------------------------- /Graph2Seq-master/main/data_creator.py: -------------------------------------------------------------------------------- 1 | """this python file is used to construct the fake data for the model""" 2 | import random 3 | import json 4 | import numpy as np 5 | 6 | from networkx import * 7 | import networkx.algorithms as nxalg 8 | 9 | def create_random_graph(type, filePath, numberOfCase, graph_scale): 10 | """ 11 | 12 | :param type: the graph type 13 | :param filePath: the output file path 14 | :param numberOfCase: the number of examples 15 | :return: 16 | """ 17 | with open(filePath, "w+") as f: 18 | degree = 0.0 19 | for _ in range(numberOfCase): 20 | info = {} 21 | graph_node_size = graph_scale 22 | edge_prob = 0.3 23 | 24 | while True: 25 | edge_count = 0.0 26 | if type == "random": 27 | graph = nx.gnp_random_graph(graph_node_size, edge_prob, directed=True) 28 | for id in graph.edge: 29 | edge_count += len(graph.edge[id]) 30 | start = random.randint(0, graph_node_size - 1) 31 | adj = nx.shortest_path(graph, start) 32 | 33 | max_len = 0 34 | path = [] 35 | paths = [] 36 | for neighbor in adj: 37 | if len(adj[neighbor]) > max_len and neighbor != start: 38 | paths = [] 39 | max_len = len(adj[neighbor]) 40 | path = adj[neighbor] 41 | end = neighbor 42 | for p in nx.all_shortest_paths(graph, start, end): 43 | paths.append(p) 44 | 45 | if len(path) > 0 and path[0] == start and len(path) == 3 and len(paths) == 1: 46 | degree += edge_count / graph_node_size 47 | break 48 | 49 | elif type == "no-cycle": 50 | graph = nx.DiGraph() 51 | for i in range(graph_node_size): 52 | nodes = graph.nodes() 53 | if len(nodes) == 0: 54 | graph.add_node(i) 55 | else: 56 | size = random.randint(1, min(i, 2)); 57 | 
fathers = random.sample(range(0, i), size) 58 | for father in fathers: 59 | graph.add_edge(father, i) 60 | for id in graph.edge: 61 | edge_count += len(graph.edge[id]) 62 | start = 0 63 | end = graph_node_size-1 64 | path = nx.shortest_path(graph, 0, graph_node_size-1) 65 | paths = [p for p in nx.all_shortest_paths(graph, 0, graph_node_size-1)] 66 | if len(path) >= 4 and len(paths) == 1: 67 | degree += edge_count / graph_node_size 68 | break 69 | 70 | elif type == "baseline": 71 | num_nodes = graph_node_size 72 | graph = nx.random_graphs.connected_watts_strogatz_graph(num_nodes, 3, edge_prob) 73 | for id in graph.edge: 74 | edge_count += len(graph.edge[id]) 75 | start, end = np.random.randint(num_nodes, size=2) 76 | 77 | if start == end: 78 | continue # reject trivial paths 79 | 80 | paths = list(nxalg.all_shortest_paths(graph, source=start, target=end)) 81 | 82 | if len(paths) > 1: 83 | continue # reject when more than one shortest path 84 | 85 | path = paths[0] 86 | 87 | if len(path) != 4: 88 | continue 89 | degree += edge_count / graph_node_size 90 | break 91 | 92 | adj_list = graph.adjacency_list() 93 | 94 | 95 | g_ids = {} 96 | g_ids_features = {} 97 | g_adj = {} 98 | for i in range(graph_node_size): 99 | g_ids[i] = i 100 | if i == start: 101 | g_ids_features[i] = "START" 102 | elif i == end: 103 | g_ids_features[i] = "END" 104 | else: 105 | # g_ids_features[i] = str(i+10) 106 | g_ids_features[i] = str(random.randint(1, 15)) 107 | g_adj[i] = adj_list[i] 108 | 109 | # print start, end, path 110 | text = "" 111 | for id in path: 112 | text += g_ids_features[id] + " " 113 | 114 | info["seq"] = text.strip() 115 | info["g_ids"] = g_ids 116 | info['g_ids_features'] = g_ids_features 117 | info['g_adj'] = g_adj 118 | f.write(json.dumps(info)+"\n") 119 | 120 | print("average degree in the graph is :{}".format(degree/numberOfCase)) 121 | 122 | if __name__ == "__main__": 123 | create_random_graph("no-cycle", "data/no_cycle/train.data", 1000, graph_scale=100) 124 | 
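The "no-cycle" branch above guarantees acyclicity by construction: node `i` only ever receives edges from parents sampled among nodes `0..i-1`, so every edge points from a smaller index to a larger one. A stdlib-only sketch of that construction (function names here are illustrative, not part of the repo), with a BFS shortest-path check in place of `nx.shortest_path`:

```python
import random
from collections import deque

def build_dag(num_nodes, seed=0):
    """Each node i >= 1 picks 1-2 parents among 0..i-1, so the result is a DAG."""
    rng = random.Random(seed)
    adj = {i: [] for i in range(num_nodes)}  # forward adjacency: parent -> children
    for i in range(1, num_nodes):
        size = rng.randint(1, min(i, 2))
        for father in rng.sample(range(i), size):
            adj[father].append(i)
    return adj

def shortest_path(adj, start, end):
    """Plain BFS; returns the node list from start to end, or None if unreachable."""
    prev = {start: None}
    q = deque([start])
    while q:
        u = q.popleft()
        if u == end:
            break
        for v in adj[u]:
            if v not in prev:
                prev[v] = u
                q.append(v)
    if end not in prev:
        return None
    path, node = [], end
    while node is not None:
        path.append(node)
        node = prev[node]
    return path[::-1]

adj = build_dag(10, seed=42)
path = shortest_path(adj, 0, 9)
```

Since every node except 0 has at least one parent with a strictly smaller index, following parents from the last node always terminates at node 0, so a path from node 0 to the last node is guaranteed to exist.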
create_random_graph("no-cycle", "data/no_cycle/dev.data", 1000, graph_scale=100) 125 | create_random_graph("no-cycle", "data/no_cycle/test.data", 1000, graph_scale=100) 126 | 127 | 128 | -------------------------------------------------------------------------------- /Graph2Seq-master/main/evaluator.py: -------------------------------------------------------------------------------- 1 | 2 | 3 | def evaluate(type, golds, preds): 4 | assert len(golds) == len(preds) 5 | if type == "acc": 6 | correct = 0.0 7 | for _ in range(len(golds)): 8 | gold = golds[_] 9 | gold_str = " ".join(gold).strip() 10 | 11 | pred = preds[_] 12 | pred_str = " ".join(pred).strip() 13 | 14 | if gold_str == pred_str: 15 | correct += 1.0 16 | return correct/len(preds) -------------------------------------------------------------------------------- /Graph2Seq-master/main/helpers.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | 3 | def batch(inputs, max_sequence_length=None): 4 | """ 5 | Args: 6 | inputs: 7 | list of sentences (integer lists) 8 | max_sequence_length: 9 | integer specifying how large should `max_time` dimension be. 
10 | If None, maximum sequence length would be used 11 | 12 | Outputs: 13 | inputs_time_major: 14 | input sentences transformed into time-major matrix 15 | (shape [max_time, batch_size]) padded with 0s 16 | sequence_lengths: 17 | batch-sized list of integers specifying amount of active 18 | time steps in each input sequence 19 | """ 20 | sequence_lengths = [len(seq) for seq in inputs] 21 | batch_size = len(inputs) 22 | 23 | if max_sequence_length is None: 24 | max_sequence_length = max(sequence_lengths) 25 | 26 | inputs_batch_major = np.zeros(shape=[batch_size, max_sequence_length], dtype=np.int32) # == PAD 27 | 28 | for i, seq in enumerate(inputs): 29 | for j, element in enumerate(seq): 30 | inputs_batch_major[i, j] = element 31 | 32 | loss_weights = [] 33 | for _ in range(len(inputs)): 34 | weights = [] 35 | for __ in range(len(inputs[_])+1): 36 | weights.append(1) 37 | for __ in range(max_sequence_length-len(inputs[_])): 38 | weights.append(0) 39 | loss_weights.append(weights) 40 | 41 | return inputs_batch_major, sequence_lengths, loss_weights -------------------------------------------------------------------------------- /Graph2Seq-master/main/inits.py: -------------------------------------------------------------------------------- 1 | import tensorflow as tf 2 | import numpy as np 3 | 4 | # DISCLAIMER: 5 | # Parts of this code file are derived from 6 | # https://github.com/tkipf/gcn 7 | # which is under an identical MIT license as GraphSAGE 8 | 9 | def uniform(shape, scale=0.05, name=None): 10 | """Uniform init.""" 11 | initial = tf.random_uniform(shape, minval=-scale, maxval=scale, dtype=tf.float32) 12 | return tf.Variable(initial, name=name) 13 | 14 | 15 | def glorot(shape, name=None): 16 | """Glorot & Bengio (AISTATS 2010) init.""" 17 | init_range = np.sqrt(6.0/(shape[0]+shape[1])) 18 | initial = tf.random_uniform(shape, minval=-init_range, maxval=init_range, dtype=tf.float32) 19 | return tf.Variable(initial, name=name) 20 | 21 | 22 | def zeros(shape, 
name=None): 23 | """All zeros.""" 24 | initial = tf.zeros(shape, dtype=tf.float32) 25 | return tf.Variable(initial, name=name) 26 | 27 | def ones(shape, name=None): 28 | """All ones.""" 29 | initial = tf.ones(shape, dtype=tf.float32) 30 | return tf.Variable(initial, name=name) 31 | -------------------------------------------------------------------------------- /Graph2Seq-master/main/layer_utils.py: -------------------------------------------------------------------------------- 1 | import tensorflow as tf 2 | from tensorflow.python.ops import nn_ops 3 | 4 | def my_lstm_layer(input_reps, lstm_dim, input_lengths=None, scope_name=None, reuse=False, is_training=True, 5 | dropout_rate=0.2, use_cudnn=True): 6 | ''' 7 | :param inputs: [batch_size, seq_len, feature_dim] 8 | :param lstm_dim: 9 | :param scope_name: 10 | :param reuse: 11 | :param is_training: 12 | :param dropout_rate: 13 | :return: 14 | ''' 15 | input_reps = dropout_layer(input_reps, dropout_rate, is_training=is_training) 16 | with tf.variable_scope(scope_name, reuse=reuse): 17 | if use_cudnn: 18 | inputs = tf.transpose(input_reps, [1, 0, 2]) 19 | lstm = tf.contrib.cudnn_rnn.CudnnLSTM(1, lstm_dim, direction="bidirectional", 20 | name="{}_cudnn_bi_lstm".format(scope_name), dropout=dropout_rate if is_training else 0) 21 | outputs, _ = lstm(inputs) 22 | outputs = tf.transpose(outputs, [1, 0, 2]) 23 | f_rep = outputs[:, :, 0:lstm_dim] 24 | b_rep = outputs[:, :, lstm_dim:2*lstm_dim] 25 | else: 26 | context_lstm_cell_fw = tf.nn.rnn_cell.BasicLSTMCell(lstm_dim) 27 | context_lstm_cell_bw = tf.nn.rnn_cell.BasicLSTMCell(lstm_dim) 28 | if is_training: 29 | context_lstm_cell_fw = tf.nn.rnn_cell.DropoutWrapper(context_lstm_cell_fw, output_keep_prob=(1 - dropout_rate)) 30 | context_lstm_cell_bw = tf.nn.rnn_cell.DropoutWrapper(context_lstm_cell_bw, output_keep_prob=(1 - dropout_rate)) 31 | context_lstm_cell_fw = tf.nn.rnn_cell.MultiRNNCell([context_lstm_cell_fw]) 32 | context_lstm_cell_bw = 
tf.nn.rnn_cell.MultiRNNCell([context_lstm_cell_bw]) 33 | 34 | (f_rep, b_rep), _ = tf.nn.bidirectional_dynamic_rnn( 35 | context_lstm_cell_fw, context_lstm_cell_bw, input_reps, dtype=tf.float32, 36 | sequence_length=input_lengths) # [batch_size, question_len, context_lstm_dim] 37 | outputs = tf.concat(axis=2, values=[f_rep, b_rep]) 38 | return (f_rep,b_rep, outputs) 39 | 40 | def dropout_layer(input_reps, dropout_rate, is_training=True): 41 | if is_training: 42 | output_repr = tf.nn.dropout(input_reps, (1 - dropout_rate)) 43 | else: 44 | output_repr = input_reps 45 | return output_repr 46 | 47 | def cosine_distance(y1,y2, cosine_norm=True, eps=1e-6): 48 | # cosine_norm = True 49 | # y1 [....,a, 1, d] 50 | # y2 [....,1, b, d] 51 | cosine_numerator = tf.reduce_sum(tf.multiply(y1, y2), axis=-1) 52 | if not cosine_norm: 53 | return tf.tanh(cosine_numerator) 54 | y1_norm = tf.sqrt(tf.maximum(tf.reduce_sum(tf.square(y1), axis=-1), eps)) 55 | y2_norm = tf.sqrt(tf.maximum(tf.reduce_sum(tf.square(y2), axis=-1), eps)) 56 | return cosine_numerator / y1_norm / y2_norm 57 | 58 | def euclidean_distance(y1, y2, eps=1e-6): 59 | distance = tf.sqrt(tf.maximum(tf.reduce_sum(tf.square(y1 - y2), axis=-1), eps)) 60 | return distance 61 | 62 | def cross_entropy(logits, truth, mask=None): 63 | # logits: [batch_size, passage_len] 64 | # truth: [batch_size, passage_len] 65 | # mask: [batch_size, passage_len] 66 | if mask is not None: logits = tf.multiply(logits, mask) 67 | xdev = tf.subtract(logits, tf.expand_dims(tf.reduce_max(logits, 1), -1)) 68 | log_predictions = tf.subtract(xdev, tf.expand_dims(tf.log(tf.reduce_sum(tf.exp(xdev),-1)),-1)) 69 | result = tf.multiply(truth, log_predictions) # [batch_size, passage_len] 70 | if mask is not None: result = tf.multiply(result, mask) # [batch_size, passage_len] 71 | return tf.multiply(-1.0,tf.reduce_sum(result, -1)) # [batch_size] 72 | 73 | def projection_layer(in_val, input_size, output_size, activation_func=tf.tanh, scope=None): 74 | # in_val: 
[batch_size, passage_len, dim] 75 | input_shape = tf.shape(in_val) 76 | batch_size = input_shape[0] 77 | passage_len = input_shape[1] 78 | # feat_dim = input_shape[2] 79 | in_val = tf.reshape(in_val, [batch_size * passage_len, input_size]) 80 | with tf.variable_scope(scope or "projection_layer"): 81 | full_w = tf.get_variable("full_w", [input_size, output_size], dtype=tf.float32) 82 | full_b = tf.get_variable("full_b", [output_size], dtype=tf.float32) 83 | outputs = activation_func(tf.nn.xw_plus_b(in_val, full_w, full_b)) 84 | outputs = tf.reshape(outputs, [batch_size, passage_len, output_size]) 85 | return outputs # [batch_size, passage_len, output_size] 86 | 87 | def highway_layer(in_val, output_size, activation_func=tf.tanh, scope=None): 88 | # in_val: [batch_size, passage_len, dim] 89 | input_shape = tf.shape(in_val) 90 | batch_size = input_shape[0] 91 | passage_len = input_shape[1] 92 | # feat_dim = input_shape[2] 93 | in_val = tf.reshape(in_val, [batch_size * passage_len, output_size]) 94 | with tf.variable_scope(scope or "highway_layer"): 95 | highway_w = tf.get_variable("highway_w", [output_size, output_size], dtype=tf.float32) 96 | highway_b = tf.get_variable("highway_b", [output_size], dtype=tf.float32) 97 | full_w = tf.get_variable("full_w", [output_size, output_size], dtype=tf.float32) 98 | full_b = tf.get_variable("full_b", [output_size], dtype=tf.float32) 99 | trans = activation_func(tf.nn.xw_plus_b(in_val, full_w, full_b)) 100 | gate = tf.nn.sigmoid(tf.nn.xw_plus_b(in_val, highway_w, highway_b)) 101 | outputs = tf.add(tf.multiply(trans, gate), tf.multiply(in_val, tf.subtract(1.0, gate)), "y") 102 | outputs = tf.reshape(outputs, [batch_size, passage_len, output_size]) 103 | return outputs 104 | 105 | def multi_highway_layer(in_val, output_size, num_layers, activation_func=tf.tanh, scope_name=None, reuse=False): 106 | with tf.variable_scope(scope_name, reuse=reuse): 107 | for i in range(num_layers): 108 | cur_scope_name = scope_name + "-{}".format(i)
109 | in_val = highway_layer(in_val, output_size,activation_func=activation_func, scope=cur_scope_name) 110 | return in_val 111 | 112 | def collect_representation(representation, positions): 113 | # representation: [batch_size, node_num, feature_dim] 114 | # positions: [batch_size, neigh_num] 115 | return collect_probs(representation, positions) 116 | 117 | def collect_final_step_of_lstm(lstm_representation, lengths): 118 | # lstm_representation: [batch_size, passsage_length, dim] 119 | # lengths: [batch_size] 120 | lengths = tf.maximum(lengths, tf.zeros_like(lengths, dtype=tf.int32)) 121 | 122 | batch_size = tf.shape(lengths)[0] 123 | batch_nums = tf.range(0, limit=batch_size) # shape (batch_size) 124 | indices = tf.stack((batch_nums, lengths), axis=1) # shape (batch_size, 2) 125 | result = tf.gather_nd(lstm_representation, indices, name='last-forwar-lstm') 126 | return result # [batch_size, dim] 127 | 128 | def collect_probs(probs, positions): 129 | # probs [batch_size, chunks_size] 130 | # positions [batch_size, pair_size] 131 | batch_size = tf.shape(probs)[0] 132 | pair_size = tf.shape(positions)[1] 133 | batch_nums = tf.range(0, limit=batch_size) # shape (batch_size) 134 | batch_nums = tf.reshape(batch_nums, shape=[-1, 1]) # [batch_size, 1] 135 | batch_nums = tf.tile(batch_nums, multiples=[1, pair_size]) # [batch_size, pair_size] 136 | 137 | indices = tf.stack((batch_nums, positions), axis=2) # shape (batch_size, pair_size, 2) 138 | pair_probs = tf.gather_nd(probs, indices) 139 | # pair_probs = tf.reshape(pair_probs, shape=[batch_size, pair_size]) 140 | return pair_probs 141 | 142 | 143 | def calcuate_attention(in_value_1, in_value_2, feature_dim1, feature_dim2, scope_name='att', 144 | att_type='symmetric', att_dim=20, remove_diagnoal=False, mask1=None, mask2=None, is_training=False, dropout_rate=0.2): 145 | input_shape = tf.shape(in_value_1) 146 | batch_size = input_shape[0] 147 | len_1 = input_shape[1] 148 | len_2 = tf.shape(in_value_2)[1] 149 | 150 | 
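The symmetric (default) branch of `calcuate_attention` below reduces to: project both inputs with a shared weight matrix, apply `tanh`, scale one side by learned per-dimension parameters, take a batched matmul, and normalize each row with a softmax. A NumPy sketch of just that arithmetic (a hedged sketch; all names and shapes here are illustrative, not the repo's API):

```python
import numpy as np

def symmetric_attention(x1, x2, w, diag):
    # x1: [b, len1, d], x2: [b, len2, d], w: [d, att_dim], diag: [att_dim]
    a1 = np.tanh(x1 @ w) * diag              # [b, len1, att_dim], scaled side
    a2 = np.tanh(x2 @ w)                      # [b, len2, att_dim]
    scores = a1 @ a2.transpose(0, 2, 1)       # [b, len1, len2]
    # numerically stable row-wise softmax
    e = np.exp(scores - scores.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

rng = np.random.default_rng(0)
att = symmetric_attention(rng.normal(size=(2, 3, 4)),
                          rng.normal(size=(2, 5, 4)),
                          rng.normal(size=(4, 6)),
                          rng.normal(size=(6,)))
```

Each row of `att` is a distribution over the second sequence's positions, which is what `weighted_sum` then uses to pool `in_values`.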
in_value_1 = dropout_layer(in_value_1, dropout_rate, is_training=is_training) 151 | in_value_2 = dropout_layer(in_value_2, dropout_rate, is_training=is_training) 152 | with tf.variable_scope(scope_name): 153 | # calculate attention ==> a: [batch_size, len_1, len_2] 154 | atten_w1 = tf.get_variable("atten_w1", [feature_dim1, att_dim], dtype=tf.float32) 155 | if feature_dim1 == feature_dim2: atten_w2 = atten_w1 156 | else: atten_w2 = tf.get_variable("atten_w2", [feature_dim2, att_dim], dtype=tf.float32) 157 | atten_value_1 = tf.matmul(tf.reshape(in_value_1, [batch_size * len_1, feature_dim1]), atten_w1) # [batch_size*len_1, feature_dim] 158 | atten_value_1 = tf.reshape(atten_value_1, [batch_size, len_1, att_dim]) 159 | atten_value_2 = tf.matmul(tf.reshape(in_value_2, [batch_size * len_2, feature_dim2]), atten_w2) # [batch_size*len_2, feature_dim] 160 | atten_value_2 = tf.reshape(atten_value_2, [batch_size, len_2, att_dim]) 161 | 162 | 163 | if att_type == 'additive': 164 | atten_b = tf.get_variable("atten_b", [att_dim], dtype=tf.float32) 165 | atten_v = tf.get_variable("atten_v", [1, att_dim], dtype=tf.float32) 166 | atten_value_1 = tf.expand_dims(atten_value_1, axis=2, name="atten_value_1") # [batch_size, len_1, 'x', feature_dim] 167 | atten_value_2 = tf.expand_dims(atten_value_2, axis=1, name="atten_value_2") # [batch_size, 'x', len_2, feature_dim] 168 | atten_value = atten_value_1 + atten_value_2 # + tf.expand_dims(tf.expand_dims(tf.expand_dims(atten_b, axis=0), axis=0), axis=0) 169 | atten_value = nn_ops.bias_add(atten_value, atten_b) 170 | atten_value = tf.tanh(atten_value) # [batch_size, len_1, len_2, feature_dim] 171 | atten_value = tf.reshape(atten_value, [-1, att_dim]) * atten_v # tf.expand_dims(atten_v, axis=0) # [batch_size*len_1*len_2, feature_dim] 172 | atten_value = tf.reduce_sum(atten_value, axis=-1) 173 | atten_value = tf.reshape(atten_value, [batch_size, len_1, len_2]) 174 | else: 175 | atten_value_1 = tf.tanh(atten_value_1) 176 | # atten_value_1 = 
tf.nn.relu(atten_value_1) 177 | atten_value_2 = tf.tanh(atten_value_2) 178 | # atten_value_2 = tf.nn.relu(atten_value_2) 179 | diagnoal_params = tf.get_variable("diagnoal_params", [1, 1, att_dim], dtype=tf.float32) 180 | atten_value_1 = atten_value_1 * diagnoal_params 181 | atten_value = tf.matmul(atten_value_1, atten_value_2, transpose_b=True) # [batch_size, len_1, len_2] 182 | 183 | # normalize 184 | if remove_diagnoal: 185 | diagnoal = tf.ones([len_1], tf.float32) # [len1] 186 | diagnoal = 1.0 - tf.diag(diagnoal) # [len1, len1] 187 | diagnoal = tf.expand_dims(diagnoal, axis=0) # ['x', len1, len1] 188 | atten_value = atten_value * diagnoal 189 | if mask1 is not None: atten_value = tf.multiply(atten_value, tf.expand_dims(mask1, axis=-1)) 190 | if mask2 is not None: atten_value = tf.multiply(atten_value, tf.expand_dims(mask2, axis=1)) 191 | atten_value = tf.nn.softmax(atten_value, name='atten_value') # [batch_size, len_1, len_2] 192 | if remove_diagnoal: atten_value = atten_value * diagnoal 193 | if mask1 is not None: atten_value = tf.multiply(atten_value, tf.expand_dims(mask1, axis=-1)) 194 | if mask2 is not None: atten_value = tf.multiply(atten_value, tf.expand_dims(mask2, axis=1)) 195 | 196 | return atten_value 197 | 198 | def weighted_sum(atten_scores, in_values): 199 | ''' 200 | 201 | :param atten_scores: # [batch_size, len1, len2] 202 | :param in_values: [batch_size, len2, dim] 203 | :return: 204 | ''' 205 | return tf.matmul(atten_scores, in_values) 206 | 207 | def cal_relevancy_matrix(in_question_repres, in_passage_repres): 208 | in_question_repres_tmp = tf.expand_dims(in_question_repres, 1) # [batch_size, 1, question_len, dim] 209 | in_passage_repres_tmp = tf.expand_dims(in_passage_repres, 2) # [batch_size, passage_len, 1, dim] 210 | relevancy_matrix = cosine_distance(in_question_repres_tmp,in_passage_repres_tmp) # [batch_size, passage_len, question_len] 211 | return relevancy_matrix 212 | 213 | def mask_relevancy_matrix(relevancy_matrix, question_mask, 
passage_mask): 214 | # relevancy_matrix: [batch_size, passage_len, question_len] 215 | # question_mask: [batch_size, question_len] 216 | # passage_mask: [batch_size, passsage_len] 217 | if question_mask is not None: 218 | relevancy_matrix = tf.multiply(relevancy_matrix, tf.expand_dims(question_mask, 1)) 219 | relevancy_matrix = tf.multiply(relevancy_matrix, tf.expand_dims(passage_mask, 2)) 220 | return relevancy_matrix 221 | 222 | def compute_gradients(tensor, var_list): 223 | grads = tf.gradients(tensor, var_list) 224 | return [grad if grad is not None else tf.zeros_like(var) for var, grad in zip(var_list, grads)] -------------------------------------------------------------------------------- /Graph2Seq-master/main/layers.py: -------------------------------------------------------------------------------- 1 | import tensorflow as tf 2 | from inits import zeros 3 | import configure as conf 4 | 5 | _LAYER_UIDS = {} 6 | 7 | def get_layer_uid(layer_name=''): 8 | """Helper function, assigns unique layer IDs.""" 9 | if layer_name not in _LAYER_UIDS: 10 | _LAYER_UIDS[layer_name] = 1 11 | return 1 12 | else: 13 | _LAYER_UIDS[layer_name] += 1 14 | return _LAYER_UIDS[layer_name] 15 | 16 | class Layer(object): 17 | """Base layer class. Defines basic API for all layer objects. 18 | Implementation inspired by keras (http://keras.io). 19 | # Properties 20 | name: String, defines the variable scope of the layer. 21 | logging: Boolean, switches Tensorflow histogram logging on/off 22 | 23 | # Methods 24 | _call(inputs): Defines computation graph of layer 25 | (i.e. 
takes input, returns output) 26 | __call__(inputs): Wrapper for _call() 27 | """ 28 | 29 | def __init__(self, **kwargs): 30 | allowed_kwargs = {'name', 'logging', 'model_size'} 31 | for kwarg in kwargs.keys(): 32 | assert kwarg in allowed_kwargs, 'Invalid keyword argument: ' + kwarg 33 | name = kwargs.get('name') 34 | if not name: 35 | layer = self.__class__.__name__.lower() 36 | name = layer + '_' + str(get_layer_uid(layer)) 37 | self.name = name 38 | self.vars = {} 39 | logging = kwargs.get('logging', False) 40 | self.logging = logging 41 | self.sparse_inputs = False 42 | 43 | def _call(self, inputs): 44 | return inputs 45 | 46 | def __call__(self, inputs): 47 | with tf.name_scope(self.name): 48 | outputs = self._call(inputs) 49 | return outputs 50 | 51 | class Dense(Layer): 52 | """Dense layer.""" 53 | def __init__(self, input_dim, output_dim, dropout=0., 54 | act=tf.nn.relu, placeholders=None, bias=True, featureless=False, 55 | sparse_inputs=False, **kwargs): 56 | super(Dense, self).__init__(**kwargs) 57 | 58 | self.dropout = dropout 59 | 60 | self.act = act 61 | self.featureless = featureless 62 | self.bias = bias 63 | self.input_dim = input_dim 64 | self.output_dim = output_dim 65 | 66 | # helper variable for sparse dropout 67 | self.sparse_inputs = sparse_inputs 68 | if sparse_inputs: 69 | self.num_features_nonzero = placeholders['num_features_nonzero'] 70 | 71 | with tf.variable_scope(self.name + '_vars'): 72 | self.vars['weights'] = tf.get_variable('weights', shape=(input_dim, output_dim), 73 | dtype=tf.float32, 74 | initializer=tf.contrib.layers.xavier_initializer(), 75 | regularizer=tf.contrib.layers.l2_regularizer(conf.weight_decay)) 76 | if self.bias: 77 | self.vars['bias'] = zeros([output_dim], name='bias') 78 | 79 | def _call(self, inputs): 80 | x = inputs 81 | 82 | # x = tf.nn.dropout(x, self.dropout) 83 | 84 | # transform 85 | output = tf.matmul(x, self.vars['weights']) 86 | 87 | # bias 88 | if self.bias: 89 | output += self.vars['bias'] 90 | 91 | 
return self.act(output) -------------------------------------------------------------------------------- /Graph2Seq-master/main/loaderAndwriter.py: -------------------------------------------------------------------------------- 1 | import codecs 2 | import numpy as np 3 | import os 4 | 5 | def load_word_embedding(embedding_path): 6 | with codecs.open(embedding_path, 'r', 'utf-8') as f: 7 | word_idx = {} 8 | vecs = [] 9 | for line in f: 10 | line = line.strip() 11 | if len(line.split(" ")) == 2: 12 | continue 13 | info = line.split(' ') 14 | word = info[0] 15 | vec = [float(v) for v in info[1:]] 16 | if len(vec) != 300: 17 | continue 18 | vecs.append(vec) 19 | word_idx[word] = len(word_idx.keys()) + 1 20 | 21 | return word_idx, np.array(vecs) 22 | 23 | def write_word_idx(word_idx, path): 24 | dir = path[:path.rfind('/')] 25 | if not os.path.exists(dir): 26 | os.makedirs(dir) 27 | 28 | with codecs.open(path, 'w', 'utf-8') as f: 29 | for word in word_idx: 30 | f.write(word+" "+str(word_idx[word])+'\n') 31 | 32 | def read_word_idx_from_file(path): 33 | word_idx = {} 34 | with codecs.open(path, 'r', 'utf-8') as f: 35 | lines = f.readlines() 36 | for line in lines: 37 | info = line.strip().split(" ") 38 | if len(info) != 2: 39 | word_idx[' '] = int(info[0]) 40 | else: 41 | word_idx[info[0]] = int(info[1]) 42 | return word_idx 43 | -------------------------------------------------------------------------------- /Graph2Seq-master/main/match_utils.py: -------------------------------------------------------------------------------- 1 | import tensorflow as tf 2 | import layer_utils 3 | 4 | eps = 1e-6 5 | 6 | 7 | def cosine_distance(y1, y2): 8 | # y1 [....,a, 1, d] 9 | # y2 [....,1, b, d] 10 | cosine_numerator = tf.reduce_sum(tf.multiply(y1, y2), axis=-1) 11 | y1_norm = tf.sqrt(tf.maximum(tf.reduce_sum(tf.square(y1), axis=-1), eps)) 12 | y2_norm = tf.sqrt(tf.maximum(tf.reduce_sum(tf.square(y2), axis=-1), eps)) 13 | return cosine_numerator / y1_norm / y2_norm 14 | 15 | 16 | 
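The `expand_dims` pattern used by `cal_relevancy_matrix` relies on broadcasting: giving the question a dummy passage axis and the passage a dummy question axis pairs every passage position with every question position in one vectorized cosine computation. A NumPy check of the same trick (a sketch with illustrative names, mirroring the TF code):

```python
import numpy as np

def cosine_distance(y1, y2, eps=1e-6):
    # broadcasts over all leading axes, reduces the trailing feature axis
    num = np.sum(y1 * y2, axis=-1)
    n1 = np.sqrt(np.maximum(np.sum(np.square(y1), axis=-1), eps))
    n2 = np.sqrt(np.maximum(np.sum(np.square(y2), axis=-1), eps))
    return num / n1 / n2

rng = np.random.default_rng(0)
question = rng.normal(size=(2, 4, 8))   # [batch, question_len, dim]
passage = rng.normal(size=(2, 7, 8))    # [batch, passage_len, dim]
# [batch, 1, question_len, dim] against [batch, passage_len, 1, dim]
rel = cosine_distance(question[:, None, :, :], passage[:, :, None, :])
```

The result has shape `[batch, passage_len, question_len]`, matching the comment on `cal_relevancy_matrix`; the `eps` floor keeps the norms away from zero for all-zero (padded) vectors.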
def cal_relevancy_matrix(in_question_repres, in_passage_repres):
17 |     in_question_repres_tmp = tf.expand_dims(in_question_repres, 1)  # [batch_size, 1, question_len, dim]
18 |     in_passage_repres_tmp = tf.expand_dims(in_passage_repres, 2)  # [batch_size, passage_len, 1, dim]
19 |     relevancy_matrix = cosine_distance(in_question_repres_tmp,
20 |                                        in_passage_repres_tmp)  # [batch_size, passage_len, question_len]
21 |     return relevancy_matrix
22 | 
23 | 
24 | def mask_relevancy_matrix(relevancy_matrix, question_mask, passage_mask):
25 |     # relevancy_matrix: [batch_size, passage_len, question_len]
26 |     # question_mask: [batch_size, question_len]
27 |     # passage_mask: [batch_size, passage_len]
28 |     relevancy_matrix = tf.multiply(relevancy_matrix, tf.expand_dims(question_mask, 1))
29 |     relevancy_matrix = tf.multiply(relevancy_matrix, tf.expand_dims(passage_mask, 2))
30 |     return relevancy_matrix
31 | 
32 | 
33 | def multi_perspective_expand_for_3D(in_tensor, decompose_params):
34 |     in_tensor = tf.expand_dims(in_tensor, axis=2)  # [batch_size, passage_len, 'x', dim]
35 |     decompose_params = tf.expand_dims(tf.expand_dims(decompose_params, axis=0), axis=0)  # [1, 1, decompose_dim, dim]
36 |     return tf.multiply(in_tensor, decompose_params)  # [batch_size, passage_len, decompose_dim, dim]
37 | 
38 | 
39 | def multi_perspective_expand_for_2D(in_tensor, decompose_params):
40 |     in_tensor = tf.expand_dims(in_tensor, axis=1)  # [batch_size, 'x', dim]
41 |     decompose_params = tf.expand_dims(decompose_params, axis=0)  # [1, decompose_dim, dim]
42 |     return tf.multiply(in_tensor, decompose_params)  # [batch_size, decompose_dim, dim]
43 | 
44 | 
45 | def cal_maxpooling_matching(passage_rep, question_rep, decompose_params):
46 |     # passage_rep: [batch_size, passage_len, dim]
47 |     # question_rep: [batch_size, question_len, dim]
48 |     # decompose_params: [decompose_dim, dim]
49 | 
50 |     def singel_instance(x):
51 |         p = x[0]
52 |         q = x[1]
53 |         # p: [passage_len, dim], q: [question_len, dim]
54 |         p = 
multi_perspective_expand_for_2D(p, decompose_params)  # [passage_len, decompose_dim, dim]
55 |         q = multi_perspective_expand_for_2D(q, decompose_params)  # [question_len, decompose_dim, dim]
56 |         p = tf.expand_dims(p, 1)  # [passage_len, 1, decompose_dim, dim]
57 |         q = tf.expand_dims(q, 0)  # [1, question_len, decompose_dim, dim]
58 |         return cosine_distance(p, q)  # [passage_len, question_len, decompose_dim]
59 | 
60 |     elems = (passage_rep, question_rep)
61 |     matching_matrix = tf.map_fn(singel_instance, elems,
62 |                                 dtype=tf.float32)  # [batch_size, passage_len, question_len, decompose_dim]
63 |     return tf.concat(axis=2, values=[tf.reduce_max(matching_matrix, axis=2), tf.reduce_mean(matching_matrix,
64 |                                                                                             axis=2)])  # [batch_size, passage_len, 2*decompose_dim]
65 | 
66 | 
67 | def cross_entropy(logits, truth, mask):
68 |     # logits: [batch_size, passage_len]
69 |     # truth: [batch_size, passage_len]
70 |     # mask: [batch_size, passage_len]
71 | 
72 |     # xdev = x - x.max()
73 |     # return xdev - T.log(T.sum(T.exp(xdev)))
74 |     logits = tf.multiply(logits, mask)
75 |     xdev = tf.subtract(logits, tf.expand_dims(tf.reduce_max(logits, 1), -1))  # tf.sub was renamed tf.subtract in TF 1.0
76 |     log_predictions = tf.subtract(xdev, tf.expand_dims(tf.log(tf.reduce_sum(tf.exp(xdev), -1)), -1))
77 |     # return -T.sum(targets * log_predictions)
78 |     result = tf.multiply(tf.multiply(truth, log_predictions), mask)  # [batch_size, passage_len]
79 |     return tf.multiply(-1.0, tf.reduce_sum(result, -1))  # [batch_size]
80 | 
81 | 
82 | def highway_layer(in_val, output_size, scope=None):
83 |     # in_val: [batch_size, passage_len, dim]
84 |     input_shape = tf.shape(in_val)
85 |     batch_size = input_shape[0]
86 |     passage_len = input_shape[1]
87 |     # feat_dim = input_shape[2]
88 |     in_val = tf.reshape(in_val, [batch_size * passage_len, output_size])
89 |     with tf.variable_scope(scope or "highway_layer"):
90 |         highway_w = tf.get_variable("highway_w", [output_size, output_size], dtype=tf.float32)
91 |         highway_b = tf.get_variable("highway_b", [output_size], dtype=tf.float32)
92 |         full_w = 
tf.get_variable("full_w", [output_size, output_size], dtype=tf.float32) 93 | full_b = tf.get_variable("full_b", [output_size], dtype=tf.float32) 94 | trans = tf.nn.tanh(tf.nn.xw_plus_b(in_val, full_w, full_b)) 95 | gate = tf.nn.sigmoid(tf.nn.xw_plus_b(in_val, highway_w, highway_b)) 96 | outputs = trans * gate + in_val * (1.0 - gate) 97 | outputs = tf.reshape(outputs, [batch_size, passage_len, output_size]) 98 | return outputs 99 | 100 | 101 | def multi_highway_layer(in_val, output_size, num_layers, scope=None): 102 | scope_name = 'highway_layer' 103 | if scope is not None: scope_name = scope 104 | for i in range(num_layers): 105 | cur_scope_name = scope_name + "-{}".format(i) 106 | in_val = highway_layer(in_val, output_size, scope=cur_scope_name) 107 | return in_val 108 | 109 | 110 | def cal_max_question_representation(question_representation, atten_scores): 111 | atten_positions = tf.argmax(atten_scores, axis=2, output_type=tf.int32) # [batch_size, passage_len] 112 | max_question_reps = layer_utils.collect_representation(question_representation, atten_positions) 113 | return max_question_reps 114 | 115 | 116 | def multi_perspective_match(feature_dim, repres1, repres2, is_training=True, dropout_rate=0.2, 117 | options=None, scope_name='mp-match', reuse=False): 118 | ''' 119 | :param repres1: [batch_size, len, feature_dim] 120 | :param repres2: [batch_size, len, feature_dim] 121 | :return: 122 | ''' 123 | input_shape = tf.shape(repres1) 124 | batch_size = input_shape[0] 125 | seq_length = input_shape[1] 126 | matching_result = [] 127 | with tf.variable_scope(scope_name, reuse=reuse): 128 | match_dim = 0 129 | if options['with_cosine']: 130 | cosine_value = layer_utils.cosine_distance(repres1, repres2, cosine_norm=False) 131 | cosine_value = tf.reshape(cosine_value, [batch_size, seq_length, 1]) 132 | matching_result.append(cosine_value) 133 | match_dim += 1 134 | 135 | if options['with_mp_cosine']: 136 | mp_cosine_params = tf.get_variable("mp_cosine", 
shape=[options['cosine_MP_dim'], feature_dim], 137 | dtype=tf.float32) 138 | mp_cosine_params = tf.expand_dims(mp_cosine_params, axis=0) 139 | mp_cosine_params = tf.expand_dims(mp_cosine_params, axis=0) 140 | repres1_flat = tf.expand_dims(repres1, axis=2) 141 | repres2_flat = tf.expand_dims(repres2, axis=2) 142 | mp_cosine_matching = layer_utils.cosine_distance(tf.multiply(repres1_flat, mp_cosine_params), 143 | repres2_flat, cosine_norm=False) 144 | matching_result.append(mp_cosine_matching) 145 | match_dim += options['cosine_MP_dim'] 146 | 147 | matching_result = tf.concat(axis=2, values=matching_result) 148 | return (matching_result, match_dim) 149 | 150 | 151 | def match_passage_with_question(passage_reps, question_reps, passage_mask, question_mask, passage_lengths, 152 | question_lengths, 153 | context_lstm_dim, scope=None, 154 | with_full_match=True, with_maxpool_match=True, with_attentive_match=True, 155 | with_max_attentive_match=True, 156 | is_training=True, options=None, dropout_rate=0, forward=True): 157 | passage_reps = tf.multiply(passage_reps, tf.expand_dims(passage_mask, -1)) 158 | question_reps = tf.multiply(question_reps, tf.expand_dims(question_mask, -1)) 159 | all_question_aware_representatins = [] 160 | dim = 0 161 | with tf.variable_scope(scope or "match_passage_with_question"): 162 | relevancy_matrix = cal_relevancy_matrix(question_reps, passage_reps) 163 | relevancy_matrix = mask_relevancy_matrix(relevancy_matrix, question_mask, passage_mask) 164 | # relevancy_matrix = layer_utils.calcuate_attention(passage_reps, question_reps, context_lstm_dim, context_lstm_dim, 165 | # scope_name="fw_attention", att_type=options.att_type, att_dim=options.att_dim, 166 | # remove_diagnoal=False, mask1=passage_mask, mask2=question_mask, is_training=is_training, dropout_rate=dropout_rate) 167 | 168 | all_question_aware_representatins.append(tf.reduce_max(relevancy_matrix, axis=2, keep_dims=True)) 169 | 
all_question_aware_representatins.append(tf.reduce_mean(relevancy_matrix, axis=2, keep_dims=True)) 170 | dim += 2 171 | if with_full_match: 172 | if forward: 173 | question_full_rep = layer_utils.collect_final_step_of_lstm(question_reps, question_lengths - 1) 174 | else: 175 | question_full_rep = question_reps[:, 0, :] 176 | 177 | passage_len = tf.shape(passage_reps)[1] 178 | question_full_rep = tf.expand_dims(question_full_rep, axis=1) 179 | question_full_rep = tf.tile(question_full_rep, 180 | [1, passage_len, 1]) # [batch_size, pasasge_len, feature_dim] 181 | 182 | (attentive_rep, match_dim) = multi_perspective_match(context_lstm_dim, 183 | passage_reps, question_full_rep, 184 | is_training=is_training, 185 | dropout_rate=options['dropout_rate'], 186 | options=options, scope_name='mp-match-full-match') 187 | all_question_aware_representatins.append(attentive_rep) 188 | dim += match_dim 189 | 190 | if with_maxpool_match: 191 | maxpooling_decomp_params = tf.get_variable("maxpooling_matching_decomp", 192 | shape=[options['cosine_MP_dim'], context_lstm_dim], 193 | dtype=tf.float32) 194 | maxpooling_rep = cal_maxpooling_matching(passage_reps, question_reps, maxpooling_decomp_params) 195 | all_question_aware_representatins.append(maxpooling_rep) 196 | dim += 2 * options['cosine_MP_dim'] 197 | 198 | if with_attentive_match: 199 | atten_scores = layer_utils.calcuate_attention(passage_reps, question_reps, context_lstm_dim, 200 | context_lstm_dim, 201 | scope_name="attention", att_type=options['att_type'], 202 | att_dim=options['att_dim'], 203 | remove_diagnoal=False, mask1=passage_mask, 204 | mask2=question_mask, is_training=is_training, 205 | dropout_rate=dropout_rate) 206 | att_question_contexts = tf.matmul(atten_scores, question_reps) 207 | (attentive_rep, match_dim) = multi_perspective_match(context_lstm_dim, 208 | passage_reps, att_question_contexts, 209 | is_training=is_training, 210 | dropout_rate=options['dropout_rate'], 211 | options=options, 
scope_name='mp-match-att_question') 212 | all_question_aware_representatins.append(attentive_rep) 213 | dim += match_dim 214 | 215 | if with_max_attentive_match: 216 | max_att = cal_max_question_representation(question_reps, relevancy_matrix) 217 | (max_attentive_rep, match_dim) = multi_perspective_match(context_lstm_dim, 218 | passage_reps, max_att, is_training=is_training, 219 | dropout_rate=options['dropout_rate'], 220 | options=options, scope_name='mp-match-max-att') 221 | all_question_aware_representatins.append(max_attentive_rep) 222 | dim += match_dim 223 | 224 | all_question_aware_representatins = tf.concat(axis=2, values=all_question_aware_representatins) 225 | return (all_question_aware_representatins, dim) 226 | 227 | 228 | def bilateral_match_func(in_question_repres, in_passage_repres, 229 | question_lengths, passage_lengths, question_mask, passage_mask, input_dim, is_training, 230 | options=None): 231 | question_aware_representatins = [] 232 | question_aware_dim = 0 233 | passage_aware_representatins = [] 234 | passage_aware_dim = 0 235 | 236 | # ====word level matching====== 237 | (match_reps, match_dim) = match_passage_with_question(in_passage_repres, in_question_repres, passage_mask, 238 | question_mask, passage_lengths, 239 | question_lengths, input_dim, scope="word_match_forward", 240 | with_full_match=False, 241 | with_maxpool_match=options['with_maxpool_match'], 242 | with_attentive_match=options['with_attentive_match'], 243 | with_max_attentive_match=options['with_max_attentive_match'], 244 | is_training=is_training, options=options, 245 | dropout_rate=options['dropout_rate'], forward=True) 246 | question_aware_representatins.append(match_reps) 247 | question_aware_dim += match_dim 248 | 249 | (match_reps, match_dim) = match_passage_with_question(in_question_repres, in_passage_repres, question_mask, 250 | passage_mask, question_lengths, 251 | passage_lengths, input_dim, scope="word_match_backward", 252 | with_full_match=False, 253 | 
with_maxpool_match=options['with_maxpool_match'], 254 | with_attentive_match=options['with_attentive_match'], 255 | with_max_attentive_match=options['with_max_attentive_match'], 256 | is_training=is_training, options=options, 257 | dropout_rate=options['dropout_rate'], forward=False) 258 | passage_aware_representatins.append(match_reps) 259 | passage_aware_dim += match_dim 260 | 261 | with tf.variable_scope('context_MP_matching'): 262 | for i in range(options['context_layer_num']): # support multiple context layer 263 | with tf.variable_scope('layer-{}'.format(i)): 264 | # contextual lstm for both passage and question 265 | in_question_repres = tf.multiply(in_question_repres, tf.expand_dims(question_mask, axis=-1)) 266 | in_passage_repres = tf.multiply(in_passage_repres, tf.expand_dims(passage_mask, axis=-1)) 267 | (question_context_representation_fw, question_context_representation_bw, 268 | in_question_repres) = layer_utils.my_lstm_layer( 269 | in_question_repres, options['context_lstm_dim'], input_lengths=question_lengths, 270 | scope_name="context_represent", 271 | reuse=False, is_training=is_training, dropout_rate=options['dropout_rate'], 272 | use_cudnn=options['use_cudnn']) 273 | (passage_context_representation_fw, passage_context_representation_bw, 274 | in_passage_repres) = layer_utils.my_lstm_layer( 275 | in_passage_repres, options['context_lstm_dim'], input_lengths=passage_lengths, 276 | scope_name="context_represent", 277 | reuse=True, is_training=is_training, dropout_rate=options['dropout_rate'], use_cudnn=options['use_cudnn']) 278 | 279 | # Multi-perspective matching 280 | with tf.variable_scope('left_MP_matching'): 281 | (match_reps, match_dim) = match_passage_with_question(passage_context_representation_fw, 282 | question_context_representation_fw, 283 | passage_mask, question_mask, passage_lengths, 284 | question_lengths, options['context_lstm_dim'], 285 | scope="forward_match", 286 | with_full_match=options['with_full_match'], 287 | 
with_maxpool_match=options['with_maxpool_match'], 288 | with_attentive_match=options['with_attentive_match'], 289 | with_max_attentive_match=options['with_max_attentive_match'], 290 | is_training=is_training, options=options, 291 | dropout_rate=options['dropout_rate'], 292 | forward=True) 293 | question_aware_representatins.append(match_reps) 294 | question_aware_dim += match_dim 295 | (match_reps, match_dim) = match_passage_with_question(passage_context_representation_bw, 296 | question_context_representation_bw, 297 | passage_mask, question_mask, passage_lengths, 298 | question_lengths, options['context_lstm_dim'], 299 | scope="backward_match", 300 | with_full_match=options['with_full_match'], 301 | with_maxpool_match=options['with_maxpool_match'], 302 | with_attentive_match=options['with_attentive_match'], 303 | with_max_attentive_match=options['with_max_attentive_match'], 304 | is_training=is_training, options=options, 305 | dropout_rate=options['dropout_rate'], 306 | forward=False) 307 | question_aware_representatins.append(match_reps) 308 | question_aware_dim += match_dim 309 | 310 | with tf.variable_scope('right_MP_matching'): 311 | (match_reps, match_dim) = match_passage_with_question(question_context_representation_fw, 312 | passage_context_representation_fw, 313 | question_mask, passage_mask, question_lengths, 314 | passage_lengths, options['context_lstm_dim'], 315 | scope="forward_match", 316 | with_full_match=options['with_full_match'], 317 | with_maxpool_match=options['with_maxpool_match'], 318 | with_attentive_match=options['with_attentive_match'], 319 | with_max_attentive_match=options['with_max_attentive_match'], 320 | is_training=is_training, options=options, 321 | dropout_rate=options['dropout_rate'], 322 | forward=True) 323 | passage_aware_representatins.append(match_reps) 324 | passage_aware_dim += match_dim 325 | (match_reps, match_dim) = match_passage_with_question(question_context_representation_bw, 326 | passage_context_representation_bw, 
327 | question_mask, passage_mask, question_lengths, 328 | passage_lengths, options['context_lstm_dim'], 329 | scope="backward_match", 330 | with_full_match=options['with_full_match'], 331 | with_maxpool_match=options['with_maxpool_match'], 332 | with_attentive_match=options['with_attentive_match'], 333 | with_max_attentive_match=options['with_max_attentive_match'], 334 | is_training=is_training, options=options, 335 | dropout_rate=options['dropout_rate'], 336 | forward=False) 337 | passage_aware_representatins.append(match_reps) 338 | passage_aware_dim += match_dim 339 | 340 | question_aware_representatins = tf.concat(axis=2, 341 | values=question_aware_representatins) # [batch_size, passage_len, question_aware_dim] 342 | passage_aware_representatins = tf.concat(axis=2, 343 | values=passage_aware_representatins) # [batch_size, question_len, question_aware_dim] 344 | 345 | if is_training: 346 | question_aware_representatins = tf.nn.dropout(question_aware_representatins, (1 - options['dropout_rate'])) 347 | passage_aware_representatins = tf.nn.dropout(passage_aware_representatins, (1 - options['dropout_rate'])) 348 | 349 | # ======Highway layer====== 350 | if options['with_match_highway']: 351 | with tf.variable_scope("left_matching_highway"): 352 | question_aware_representatins = multi_highway_layer(question_aware_representatins, question_aware_dim, 353 | options['highway_layer_num']) 354 | with tf.variable_scope("right_matching_highway"): 355 | passage_aware_representatins = multi_highway_layer(passage_aware_representatins, passage_aware_dim, 356 | options['highway_layer_num']) 357 | 358 | # ========Aggregation Layer====== 359 | aggregation_representation = [] 360 | aggregation_dim = 0 361 | 362 | qa_aggregation_input = question_aware_representatins 363 | pa_aggregation_input = passage_aware_representatins 364 | with tf.variable_scope('aggregation_layer'): 365 | for i in range(options['aggregation_layer_num']): # support multiple aggregation layer 366 | 
qa_aggregation_input = tf.multiply(qa_aggregation_input, tf.expand_dims(passage_mask, axis=-1)) 367 | (fw_rep, bw_rep, cur_aggregation_representation) = layer_utils.my_lstm_layer( 368 | qa_aggregation_input, options['aggregation_lstm_dim'], input_lengths=passage_lengths, 369 | scope_name='left_layer-{}'.format(i), 370 | reuse=False, is_training=is_training, dropout_rate=options['dropout_rate'], use_cudnn=options['use_cudnn']) 371 | fw_rep = layer_utils.collect_final_step_of_lstm(fw_rep, passage_lengths - 1) 372 | bw_rep = bw_rep[:, 0, :] 373 | aggregation_representation.append(fw_rep) 374 | aggregation_representation.append(bw_rep) 375 | aggregation_dim += 2 * options['aggregation_lstm_dim'] 376 | qa_aggregation_input = cur_aggregation_representation # [batch_size, passage_len, 2*aggregation_lstm_dim] 377 | 378 | pa_aggregation_input = tf.multiply(pa_aggregation_input, tf.expand_dims(question_mask, axis=-1)) 379 | (fw_rep, bw_rep, cur_aggregation_representation) = layer_utils.my_lstm_layer( 380 | pa_aggregation_input, options['aggregation_lstm_dim'], 381 | input_lengths=question_lengths, scope_name='right_layer-{}'.format(i), 382 | reuse=False, is_training=is_training, dropout_rate=options['dropout_rate'], use_cudnn=options['use_cudnn']) 383 | fw_rep = layer_utils.collect_final_step_of_lstm(fw_rep, question_lengths - 1) 384 | bw_rep = bw_rep[:, 0, :] 385 | aggregation_representation.append(fw_rep) 386 | aggregation_representation.append(bw_rep) 387 | aggregation_dim += 2 * options['aggregation_lstm_dim'] 388 | pa_aggregation_input = cur_aggregation_representation # [batch_size, passage_len, 2*aggregation_lstm_dim] 389 | 390 | aggregation_representation = tf.concat(axis=1, values=aggregation_representation) # [batch_size, aggregation_dim] 391 | 392 | # ======Highway layer====== 393 | if options['with_aggregation_highway']: 394 | with tf.variable_scope("aggregation_highway"): 395 | agg_shape = tf.shape(aggregation_representation) 396 | batch_size = agg_shape[0] 397 | 
aggregation_representation = tf.reshape(aggregation_representation, [1, batch_size, aggregation_dim]) 398 | aggregation_representation = multi_highway_layer(aggregation_representation, aggregation_dim, 399 | options['highway_layer_num']) 400 | aggregation_representation = tf.reshape(aggregation_representation, [batch_size, aggregation_dim]) 401 | 402 | return (aggregation_representation, aggregation_dim) 403 | -------------------------------------------------------------------------------- /Graph2Seq-master/main/model.py: -------------------------------------------------------------------------------- 1 | import tensorflow as tf 2 | from tensorflow.python.layers.core import Dense 3 | from tensorflow.python.ops.rnn_cell_impl import LSTMStateTuple 4 | import tensorflow.contrib.seq2seq as seq2seq 5 | 6 | from neigh_samplers import UniformNeighborSampler 7 | from aggregators import MeanAggregator, MaxPoolingAggregator, GatedMeanAggregator 8 | import numpy as np 9 | import match_utils 10 | 11 | class Graph2SeqNN(object): 12 | 13 | PAD = 0 14 | GO = 1 15 | EOS = 2 16 | 17 | def __init__(self, mode, conf, path_embed_method): 18 | 19 | self.mode = mode 20 | self.word_vocab_size = conf.word_vocab_size 21 | self.l2_lambda = conf.l2_lambda 22 | self.path_embed_method = path_embed_method 23 | # self.word_embedding_dim = conf.word_embedding_dim 24 | self.word_embedding_dim = conf.hidden_layer_dim 25 | self.encoder_hidden_dim = conf.encoder_hidden_dim 26 | 27 | # the setting for the GCN 28 | self.num_layers_decode = conf.num_layers_decode 29 | self.num_layers = conf.num_layers 30 | self.graph_encode_direction = conf.graph_encode_direction 31 | self.sample_layer_size = conf.sample_layer_size 32 | self.hidden_layer_dim = conf.hidden_layer_dim 33 | self.concat = conf.concat 34 | 35 | # the setting for the decoder 36 | self.beam_width = conf.beam_width 37 | self.decoder_type = conf.decoder_type 38 | self.seq_max_len = conf.seq_max_len 39 | 40 | self._text = tf.placeholder(tf.int32, 
[None, None]) 41 | self.decoder_seq_length = tf.placeholder(tf.int32, [None]) 42 | self.loss_weights = tf.placeholder(tf.float32, [None, None]) 43 | 44 | # the following place holders are for the gcn 45 | self.fw_adj_info = tf.placeholder(tf.int32, [None, None]) # the fw adj info for each node 46 | self.bw_adj_info = tf.placeholder(tf.int32, [None, None]) # the bw adj info for each node 47 | self.feature_info = tf.placeholder(tf.int32, [None, None]) # the feature info for each node 48 | self.batch_nodes = tf.placeholder(tf.int32, [None, None]) # the nodes for each batch 49 | 50 | self.sample_size_per_layer = tf.shape(self.fw_adj_info)[1] 51 | 52 | self.single_graph_nodes_size = tf.shape(self.batch_nodes)[1] 53 | self.attention = conf.attention 54 | self.dropout = conf.dropout 55 | self.fw_aggregators = [] 56 | self.bw_aggregators = [] 57 | 58 | self.if_pred_on_dev = False 59 | 60 | self.learning_rate = conf.learning_rate 61 | 62 | def _init_decoder_train_connectors(self): 63 | batch_size, sequence_size = tf.unstack(tf.shape(self._text)) 64 | self.batch_size = batch_size 65 | GO_SLICE = tf.ones([batch_size, 1], dtype=tf.int32) * self.GO 66 | EOS_SLICE = tf.ones([batch_size, 1], dtype=tf.int32) * self.PAD 67 | self.decoder_train_inputs = tf.concat([GO_SLICE, self._text], axis=1) 68 | self.decoder_train_length = self.decoder_seq_length + 1 69 | decoder_train_targets = tf.concat([self._text, EOS_SLICE], axis=1) 70 | _, decoder_train_targets_seq_len = tf.unstack(tf.shape(decoder_train_targets)) 71 | decoder_train_targets_eos_mask = tf.one_hot(self.decoder_train_length - 1, decoder_train_targets_seq_len, 72 | on_value=self.EOS, off_value=self.PAD, dtype=tf.int32) 73 | self.decoder_train_targets = tf.add(decoder_train_targets, decoder_train_targets_eos_mask) 74 | self.decoder_train_inputs_embedded = tf.nn.embedding_lookup(self.word_embeddings, self.decoder_train_inputs) 75 | 76 | 77 | 78 | def encode(self): 79 | with tf.variable_scope("embedding_layer"): 80 | 
pad_word_embedding = tf.zeros([1, self.word_embedding_dim]) # this is for the PAD symbol 81 | self.word_embeddings = tf.concat([pad_word_embedding, 82 | tf.get_variable('W_train', shape=[self.word_vocab_size,self.word_embedding_dim], 83 | initializer=tf.contrib.layers.xavier_initializer(), trainable=True)], 0) 84 | 85 | with tf.variable_scope("graph_encoding_layer"): 86 | 87 | # self.encoder_outputs, self.encoder_state = self.gcn_encode() 88 | 89 | # this is for optimizing gcn 90 | encoder_outputs, encoder_state = self.optimized_gcn_encode() 91 | 92 | source_sequence_length = tf.reshape( 93 | tf.ones([tf.shape(encoder_outputs)[0], 1], dtype=tf.int32) * self.single_graph_nodes_size, 94 | (tf.shape(encoder_outputs)[0],)) 95 | 96 | return encoder_outputs, encoder_state, source_sequence_length 97 | 98 | def encode_node_feature(self, word_embeddings, feature_info): 99 | # in some cases, we can use LSTM to produce the node feature representation 100 | # cell = self._build_encoder_cell(conf.num_layers, conf.dim) 101 | 102 | 103 | feature_embedded_chars = tf.nn.embedding_lookup(word_embeddings, feature_info) 104 | batch_size = tf.shape(feature_embedded_chars)[0] 105 | 106 | # node_repres = match_utils.multi_highway_layer(feature_embedded_chars, self.hidden_layer_dim, num_layers=1) 107 | # node_repres = tf.reshape(node_repres, [batch_size, -1]) 108 | # y = tf.shape(node_repres)[1] 109 | # 110 | # node_repres = tf.concat([tf.slice(node_repres, [0,0], [batch_size-1, y]), tf.zeros([1, y])], 0) 111 | 112 | node_repres = tf.reshape(feature_embedded_chars, [batch_size, -1]) 113 | 114 | return node_repres 115 | 116 | def optimized_gcn_encode(self): 117 | # [node_size, hidden_layer_dim] 118 | embedded_node_rep = self.encode_node_feature(self.word_embeddings, self.feature_info) 119 | 120 | fw_sampler = UniformNeighborSampler(self.fw_adj_info) 121 | bw_sampler = UniformNeighborSampler(self.bw_adj_info) 122 | nodes = tf.reshape(self.batch_nodes, [-1, ]) 123 | 124 | # batch_size = 
tf.shape(nodes)[0] 125 | 126 | # the fw_hidden and bw_hidden is the initial node embedding 127 | # [node_size, dim_size] 128 | fw_hidden = tf.nn.embedding_lookup(embedded_node_rep, nodes) 129 | bw_hidden = tf.nn.embedding_lookup(embedded_node_rep, nodes) 130 | 131 | # [node_size, adj_size] 132 | fw_sampled_neighbors = fw_sampler((nodes, self.sample_size_per_layer)) 133 | bw_sampled_neighbors = bw_sampler((nodes, self.sample_size_per_layer)) 134 | 135 | fw_sampled_neighbors_len = tf.constant(0) 136 | bw_sampled_neighbors_len = tf.constant(0) 137 | 138 | # sample 139 | for layer in range(self.sample_layer_size): 140 | if layer == 0: 141 | dim_mul = 1 142 | else: 143 | dim_mul = 2 144 | 145 | if layer > 6: 146 | fw_aggregator = self.fw_aggregators[6] 147 | else: 148 | fw_aggregator = MeanAggregator(dim_mul * self.hidden_layer_dim, self.hidden_layer_dim, concat=self.concat, mode=self.mode) 149 | self.fw_aggregators.append(fw_aggregator) 150 | 151 | # [node_size, adj_size, word_embedding_dim] 152 | if layer == 0: 153 | neigh_vec_hidden = tf.nn.embedding_lookup(embedded_node_rep, fw_sampled_neighbors) 154 | 155 | # compute the neighbor size 156 | tmp_sum = tf.reduce_sum(tf.nn.relu(neigh_vec_hidden), axis=2) 157 | tmp_mask = tf.sign(tmp_sum) 158 | fw_sampled_neighbors_len = tf.reduce_sum(tmp_mask, axis=1) 159 | 160 | else: 161 | neigh_vec_hidden = tf.nn.embedding_lookup( 162 | tf.concat([fw_hidden, tf.zeros([1, dim_mul * self.hidden_layer_dim])], 0), fw_sampled_neighbors) 163 | 164 | fw_hidden = fw_aggregator((fw_hidden, neigh_vec_hidden, fw_sampled_neighbors_len)) 165 | 166 | 167 | if self.graph_encode_direction == "bi": 168 | if layer > 6: 169 | bw_aggregator = self.bw_aggregators[6] 170 | else: 171 | bw_aggregator = MeanAggregator(dim_mul * self.hidden_layer_dim, self.hidden_layer_dim, concat=self.concat, mode=self.mode) 172 | self.bw_aggregators.append(bw_aggregator) 173 | 174 | if layer == 0: 175 | neigh_vec_hidden = tf.nn.embedding_lookup(embedded_node_rep, 
bw_sampled_neighbors) 176 | 177 | # compute the neighbor size 178 | tmp_sum = tf.reduce_sum(tf.nn.relu(neigh_vec_hidden), axis=2) 179 | tmp_mask = tf.sign(tmp_sum) 180 | bw_sampled_neighbors_len = tf.reduce_sum(tmp_mask, axis=1) 181 | 182 | else: 183 | neigh_vec_hidden = tf.nn.embedding_lookup( 184 | tf.concat([bw_hidden, tf.zeros([1, dim_mul * self.hidden_layer_dim])], 0), bw_sampled_neighbors) 185 | 186 | bw_hidden = bw_aggregator((bw_hidden, neigh_vec_hidden, bw_sampled_neighbors_len)) 187 | 188 | # hidden stores the representation for all nodes 189 | fw_hidden = tf.reshape(fw_hidden, [-1, self.single_graph_nodes_size, 2 * self.hidden_layer_dim]) 190 | if self.graph_encode_direction == "bi": 191 | bw_hidden = tf.reshape(bw_hidden, [-1, self.single_graph_nodes_size, 2 * self.hidden_layer_dim]) 192 | hidden = tf.concat([fw_hidden, bw_hidden], axis=2) 193 | else: 194 | hidden = fw_hidden 195 | 196 | hidden = tf.nn.relu(hidden) 197 | 198 | pooled = tf.reduce_max(hidden, 1) 199 | if self.graph_encode_direction == "bi": 200 | graph_embedding = tf.reshape(pooled, [-1, 4 * self.hidden_layer_dim]) 201 | else: 202 | graph_embedding = tf.reshape(pooled, [-1, 2 * self.hidden_layer_dim]) 203 | 204 | graph_embedding = LSTMStateTuple(c=graph_embedding, h=graph_embedding) 205 | 206 | # shape of hidden: [batch_size, single_graph_nodes_size, 4 * hidden_layer_dim] 207 | # shape of graph_embedding: ([batch_size, 4 * hidden_layer_dim], [batch_size, 4 * hidden_layer_dim]) 208 | return hidden, graph_embedding 209 | 210 | def decode(self, encoder_outputs, encoder_state, source_sequence_length): 211 | with tf.variable_scope("Decoder") as scope: 212 | beam_width = self.beam_width 213 | decoder_type = self.decoder_type 214 | seq_max_len = self.seq_max_len 215 | batch_size = tf.shape(encoder_outputs)[0] 216 | 217 | if self.path_embed_method == "lstm": 218 | self.decoder_cell = self._build_decode_cell() 219 | if self.mode == "test" and beam_width > 0: 220 | memory = 
seq2seq.tile_batch(encoder_outputs, multiplier=beam_width)  # use the local tensors; self.encoder_outputs etc. are never assigned
221 |                     source_sequence_length = seq2seq.tile_batch(source_sequence_length, multiplier=beam_width)
222 |                     encoder_state = seq2seq.tile_batch(encoder_state, multiplier=beam_width)
223 |                     batch_size = batch_size * beam_width
224 |                 else:
225 |                     memory = encoder_outputs
226 |                     source_sequence_length = source_sequence_length
227 |                     encoder_state = encoder_state
228 | 
229 |                 attention_mechanism = seq2seq.BahdanauAttention(self.hidden_layer_dim, memory,
230 |                                                                 memory_sequence_length=source_sequence_length)
231 |                 self.decoder_cell = seq2seq.AttentionWrapper(self.decoder_cell, attention_mechanism,
232 |                                                              attention_layer_size=self.hidden_layer_dim)
233 |                 self.decoder_initial_state = self.decoder_cell.zero_state(batch_size, tf.float32).clone(cell_state=encoder_state)
234 | 
235 |             projection_layer = Dense(self.word_vocab_size, use_bias=False)
236 | 
237 |             """For training the model"""
238 |             if self.mode == "train":
239 |                 decoder_train_helper = tf.contrib.seq2seq.TrainingHelper(self.decoder_train_inputs_embedded,
240 |                                                                          self.decoder_train_length)
241 |                 decoder_train = seq2seq.BasicDecoder(self.decoder_cell, decoder_train_helper,
242 |                                                      self.decoder_initial_state,
243 |                                                      projection_layer)
244 |                 decoder_outputs_train, decoder_states_train, decoder_seq_len_train = seq2seq.dynamic_decode(decoder_train)
245 |                 decoder_logits_train = decoder_outputs_train.rnn_output
246 |                 self.decoder_logits_train = tf.reshape(decoder_logits_train, [batch_size, -1, self.word_vocab_size])
247 | 
248 |             """For testing the model"""
249 |             # if self.mode == "infer" or self.if_pred_on_dev:
250 |             if decoder_type == "greedy":
251 |                 decoder_infer_helper = seq2seq.GreedyEmbeddingHelper(self.word_embeddings,
252 |                                                                      tf.ones([batch_size], dtype=tf.int32),
253 |                                                                      self.EOS)
254 |                 decoder_infer = seq2seq.BasicDecoder(self.decoder_cell, decoder_infer_helper,
255 |                                                      self.decoder_initial_state, projection_layer)
256 |             elif decoder_type == "beam":
257 |                 decoder_infer 
= seq2seq.BeamSearchDecoder(cell=self.decoder_cell, embedding=self.word_embeddings,
258 |                                                            start_tokens=tf.ones([batch_size], dtype=tf.int32),
259 |                                                            end_token=self.EOS,
260 |                                                            initial_state=self.decoder_initial_state,
261 |                                                            beam_width=beam_width,
262 |                                                            output_layer=projection_layer)
263 | 
264 |             decoder_outputs_infer, decoder_states_infer, decoder_seq_len_infer = seq2seq.dynamic_decode(decoder_infer,
265 |                                                                                                         maximum_iterations=seq_max_len)
266 | 
267 |             if decoder_type == "beam":
268 |                 self.decoder_logits_infer = tf.no_op()
269 |                 self.sample_id = decoder_outputs_infer.predicted_ids
270 | 
271 |             elif decoder_type == "greedy":
272 |                 self.decoder_logits_infer = decoder_outputs_infer.rnn_output
273 |                 self.sample_id = decoder_outputs_infer.sample_id
274 | 
275 |     def _build_decode_cell(self):
276 |         if self.num_layers == 1:
277 |             cell = tf.nn.rnn_cell.BasicLSTMCell(num_units=4 * self.hidden_layer_dim)
278 |             if self.mode == "train":
279 |                 cell = tf.nn.rnn_cell.DropoutWrapper(cell, 1 - self.dropout)
280 |             return cell
281 |         else:
282 |             cell_list = []
283 |             for i in range(self.num_layers):
284 |                 single_cell = tf.contrib.rnn.BasicLSTMCell(4 * self.hidden_layer_dim)  # self._decoder_hidden_size is never defined; match the single-layer cell size
285 |                 if self.mode == "train":
286 |                     single_cell = tf.nn.rnn_cell.DropoutWrapper(single_cell, 1 - self.dropout)
287 |                 cell_list.append(single_cell)
288 |             return tf.contrib.rnn.MultiRNNCell(cell_list)
289 | 
290 |     def _build_encoder_cell(self, num_layers, hidden_layer_dim):
291 |         if num_layers == 1:
292 |             cell = tf.nn.rnn_cell.BasicLSTMCell(hidden_layer_dim)
293 |             if self.mode == "train":
294 |                 cell = tf.nn.rnn_cell.DropoutWrapper(cell, 1 - self.dropout)
295 |             return cell
296 |         else:
297 |             cell_list = []
298 |             for i in range(num_layers):
299 |                 single_cell = tf.contrib.rnn.BasicLSTMCell(hidden_layer_dim)
300 |                 if self.mode == "train":
301 |                     single_cell = tf.nn.rnn_cell.DropoutWrapper(single_cell, 1 - self.dropout)
302 |                 cell_list.append(single_cell)
303 |             return tf.contrib.rnn.MultiRNNCell(cell_list)
304 | 
305 |     def 
_init_optimizer(self): 306 | crossent = tf.nn.sparse_softmax_cross_entropy_with_logits(labels=self.decoder_train_targets, logits=self.decoder_logits_train) 307 | decode_loss = (tf.reduce_sum(crossent * self.loss_weights) / tf.cast(self.batch_size, tf.float32)) 308 | 309 | train_loss = decode_loss 310 | 311 | for aggregator in self.fw_aggregators: 312 | for var in aggregator.vars.values(): 313 | train_loss += self.l2_lambda * tf.nn.l2_loss(var) 314 | 315 | for aggregator in self.bw_aggregators: 316 | for var in aggregator.vars.values(): 317 | train_loss += self.l2_lambda * tf.nn.l2_loss(var) 318 | 319 | self.loss_op = train_loss 320 | self.cross_entropy_sum = train_loss 321 | params = tf.trainable_variables() 322 | gradients = tf.gradients(train_loss, params) 323 | clipped_gradients, _ = tf.clip_by_global_norm(gradients, 1) 324 | optimizer = tf.train.AdamOptimizer(learning_rate=self.learning_rate) 325 | self.train_op = optimizer.apply_gradients(zip(clipped_gradients, params)) 326 | 327 | def _build_graph(self): 328 | encoder_outputs, encoder_state, source_sequence_length = self.encode() 329 | 330 | if self.mode == "train": 331 | self._init_decoder_train_connectors() 332 | 333 | self.decode(encoder_outputs=encoder_outputs, encoder_state=encoder_state, source_sequence_length=source_sequence_length) 334 | 335 | if self.mode == "train": 336 | self._init_optimizer() 337 | 338 | def act(self, sess, mode, dict, if_pred_on_dev): 339 | text = np.array(dict['seq']) 340 | decoder_seq_length = np.array(dict['decoder_seq_length']) 341 | loss_weights = np.array(dict['loss_weights']) 342 | batch_graph = dict['batch_graph'] 343 | fw_adj_info = batch_graph['g_fw_adj'] 344 | bw_adj_info = batch_graph['g_bw_adj'] 345 | feature_info = batch_graph['g_ids_features'] 346 | batch_nodes = batch_graph['g_nodes'] 347 | 348 | self.if_pred_on_dev = if_pred_on_dev 349 | 350 | feed_dict = { 351 | self._text: text, 352 | self.decoder_seq_length: decoder_seq_length, 353 | self.loss_weights: 
loss_weights, 354 | self.fw_adj_info: fw_adj_info, 355 | self.bw_adj_info: bw_adj_info, 356 | self.feature_info: feature_info, 357 | self.batch_nodes: batch_nodes 358 | } 359 | 360 | if mode == "train" and not if_pred_on_dev: 361 | output_feeds = [self.train_op, self.loss_op, self.cross_entropy_sum] 362 | elif mode == "test" or if_pred_on_dev: 363 | output_feeds = [self.sample_id] 364 | 365 | results = sess.run(output_feeds, feed_dict) 366 | return results -------------------------------------------------------------------------------- /Graph2Seq-master/main/neigh_samplers.py: -------------------------------------------------------------------------------- 1 | from layers import Layer 2 | import tensorflow as tf 3 | 4 | class UniformNeighborSampler(Layer): 5 | """ 6 | Uniformly samples neighbors. 7 | Assumes that adj lists are padded with random re-sampling 8 | """ 9 | def __init__(self, adj_info, **kwargs): 10 | super(UniformNeighborSampler, self).__init__(**kwargs) 11 | self.adj_info = adj_info 12 | 13 | def _call(self, inputs): 14 | ids, num_samples = inputs 15 | adj_lists = tf.nn.embedding_lookup(self.adj_info, ids) 16 | adj_lists = tf.transpose(tf.random_shuffle(tf.transpose(adj_lists)))  # shuffle the neighbor axis (random_shuffle permutes dim 0, hence the transposes) so the slice below is a uniform sample 17 | adj_lists = tf.slice(adj_lists, [0,0], [-1, num_samples]) 18 | return adj_lists -------------------------------------------------------------------------------- /Graph2Seq-master/main/pooling.py: -------------------------------------------------------------------------------- 1 | import tensorflow as tf 2 | 3 | 4 | def mean_pool(input_tensor, sequence_length=None): 5 | """ 6 | Given an input tensor (e.g., the outputs of an LSTM), do mean pooling 7 | over the second-to-last (sequence) dimension of the input. 8 | 9 | For example, if the input was the output of an LSTM of shape 10 | (batch_size, sequence_length, hidden_dim), this would 11 | calculate a mean pooling over the sequence dimension (taking the padding 12 | into account, if provided) to output a tensor of shape 13 | (batch_size, hidden_dim).
14 | 15 | Parameters 16 | ---------- 17 | input_tensor: Tensor 18 | An input tensor, preferably the output of a tensorflow RNN. 19 | The mean-pooled representation of this output will be calculated 20 | over the sequence dimension. 21 | 22 | sequence_length: Tensor, optional (default=None) 23 | A tensor of dimension (batch_size, ) indicating the length 24 | of the sequences before padding was applied. 25 | 26 | Returns 27 | ------- 28 | mean_pooled_output: Tensor 29 | A tensor of one less dimension than the input, with the size of the 30 | last dimension equal to the hidden dimension state size. 31 | """ 32 | with tf.name_scope("mean_pool"): 33 | # shape (batch_size, hidden_dim) 34 | input_tensor_sum = tf.reduce_sum(input_tensor, axis=-2) 35 | 36 | # If sequence_length is None, divide by the sequence length 37 | # as indicated by the input tensor. 38 | if sequence_length is None: 39 | sequence_length = tf.shape(input_tensor)[-2] 40 | 41 | # Expand sequence length from shape (batch_size,) to 42 | # (batch_size, 1) for broadcasting; the small epsilon guards against division by zero. 43 | expanded_sequence_length = tf.cast(tf.expand_dims(sequence_length, -1), 44 | "float32") + 1e-08 45 | 46 | # Now, divide by the length of each sequence. 
47 | # shape (batch_size, hidden_dim) 48 | mean_pooled_input = (input_tensor_sum / 49 | expanded_sequence_length) 50 | return mean_pooled_input 51 | 52 | def handle_pad_max_pooling(tensor, last_dim): 53 | tensor = tf.reshape(tensor, [-1, last_dim]) 54 | bs = tf.shape(tensor)[0] 55 | tt = tf.fill(tf.stack([bs, last_dim]), -1e9)  # large negative fill values 56 | cond = tf.not_equal(tensor, 0.0) 57 | res = tf.where(cond, tensor, tt)  # replace exact-zero (padding) entries so they never win the max 58 | return res 59 | 60 | def max_pool(input_tensor, last_dim, sequence_length=None): 61 | """ 62 | Given an input tensor, do max pooling over the second-to-last (sequence) dimension of the input 63 | :param input_tensor: tensor of shape (batch_size, sequence_length, last_dim) 64 | :param sequence_length: unused; kept for symmetry with mean_pool 65 | :return: tensor of shape (batch_size, last_dim) 66 | """ 67 | with tf.name_scope("max_pool"): 68 | # mid_dim is the sequence-length (middle) dimension of the input 69 | mid_dim = tf.shape(input_tensor)[1] 70 | input_tensor = handle_pad_max_pooling(input_tensor, last_dim) 71 | input_tensor = tf.reshape(input_tensor, [-1, mid_dim, last_dim]) 72 | input_tensor_max = tf.reduce_max(input_tensor, axis=-2) 73 | return input_tensor_max 74 | -------------------------------------------------------------------------------- /Graph2Seq-master/main/run_model.py: -------------------------------------------------------------------------------- 1 | import configure as conf 2 | import data_collector as data_collector 3 | import loaderAndwriter as disk_helper 4 | import numpy as np 5 | from model import Graph2SeqNN 6 | import tensorflow as tf 7 | import helpers as helpers 8 | import datetime 9 | import text_decoder 10 | from evaluator import evaluate 11 | import os 12 | import argparse 13 | import json 14 | 15 | def main(mode): 16 | 17 | word_idx = {} 18 | 19 | if mode == "train": 20 | epochs = conf.epochs 21 | train_batch_size = conf.train_batch_size 22 | 23 | # read the training data from a file 24 | print("reading training data into the mem ...") 25 | texts_train, graphs_train = data_collector.read_data(conf.train_data_path, word_idx, if_increase_dict=True) 26 | 27 | print("reading development data into the mem 
...") 28 | texts_dev, graphs_dev = data_collector.read_data(conf.dev_data_path, word_idx, if_increase_dict=False) 29 | 30 | print("writing word-idx mapping ...") 31 | disk_helper.write_word_idx(word_idx, conf.word_idx_file_path) 32 | 33 | print("vectoring training data ...") 34 | tv_train = data_collector.vectorize_data(word_idx, texts_train) 35 | 36 | print("vectoring dev data ...") 37 | tv_dev = data_collector.vectorize_data(word_idx, texts_dev) 38 | 39 | conf.word_vocab_size = len(word_idx.keys()) + 1 40 | 41 | with tf.Graph().as_default(): 42 | with tf.Session() as sess: 43 | model = Graph2SeqNN("train", conf, path_embed_method="lstm") 44 | 45 | model._build_graph() 46 | saver = tf.train.Saver(max_to_keep=None) 47 | sess.run(tf.global_variables_initializer()) 48 | 49 | def train_step(seqs, decoder_seq_length, loss_weights, batch_graph, if_pred_on_dev=False): 50 | dict = {} 51 | dict['seq'] = seqs 52 | dict['batch_graph'] = batch_graph 53 | dict['loss_weights'] = loss_weights 54 | dict['decoder_seq_length'] = decoder_seq_length 55 | 56 | if not if_pred_on_dev: 57 | _, loss_op, cross_entropy = model.act(sess, "train", dict, if_pred_on_dev) 58 | return loss_op, cross_entropy 59 | else: 60 | sample_id = model.act(sess, "train", dict, if_pred_on_dev) 61 | return sample_id 62 | 63 | best_acc_on_dev = 0.0 64 | for t in range(1, epochs + 1): 65 | n_train = len(texts_train) 66 | temp_order = list(range(n_train)) 67 | np.random.shuffle(temp_order) 68 | 69 | loss_sum = 0.0 70 | for start in range(0, n_train, train_batch_size): 71 | end = min(start+train_batch_size, n_train) 72 | tv = [] 73 | graphs = [] 74 | for _ in range(start, end): 75 | idx = temp_order[_] 76 | tv.append(tv_train[idx]) 77 | graphs.append(graphs_train[idx]) 78 | 79 | batch_graph = data_collector.cons_batch_graph(graphs) 80 | gv = data_collector.vectorize_batch_graph(batch_graph, word_idx) 81 | 82 | tv, tv_real_len, loss_weights = helpers.batch(tv) 83 | 84 | loss_op, cross_entropy = train_step(tv, 
tv_real_len, loss_weights, gv) 85 | loss_sum += loss_op 86 | 87 | #################### test the model on the dev data ######################### 88 | n_dev = len(texts_dev) 89 | dev_batch_size = conf.dev_batch_size 90 | 91 | idx_word = {} 92 | for w in word_idx: 93 | idx_word[word_idx[w]] = w 94 | 95 | pred_texts = [] 96 | for start in range(0, n_dev, dev_batch_size): 97 | end = min(start+dev_batch_size, n_dev) 98 | tv = [] 99 | graphs = [] 100 | for _ in range(start, end): 101 | tv.append(tv_dev[_]) 102 | graphs.append(graphs_dev[_]) 103 | 104 | batch_graph = data_collector.cons_batch_graph(graphs) 105 | gv = data_collector.vectorize_batch_graph(batch_graph, word_idx) 106 | 107 | tv, tv_real_len, loss_weights = helpers.batch(tv) 108 | 109 | sample_id = train_step(tv, tv_real_len, loss_weights, gv, if_pred_on_dev=True)[0] 110 | 111 | for tmp_id in sample_id: 112 | pred_texts.append(text_decoder.decode_text(tmp_id, idx_word)) 113 | 114 | acc = evaluate(type="acc", golds=texts_dev, preds=pred_texts) 115 | if_save_model = False 116 | if acc >= best_acc_on_dev: 117 | best_acc_on_dev = acc 118 | if_save_model = True 119 | 120 | time_str = datetime.datetime.now().isoformat() 121 | print('-----------------------') 122 | print('time:{}'.format(time_str)) 123 | print('Epoch', t) 124 | print('Acc on Dev: {}'.format(acc)) 125 | print('Best acc on Dev: {}'.format(best_acc_on_dev)) 126 | print('Loss on train:{}'.format(loss_sum)) 127 | if if_save_model: 128 | save_path = "../saved_model/" 129 | if not os.path.exists(save_path): 130 | os.makedirs(save_path) 131 | 132 | path = saver.save(sess, save_path + 'model', global_step=0) 133 | print("Already saved model to {}".format(path)) 134 | 135 | print('-----------------------') 136 | 137 | elif mode == "test": 138 | test_batch_size = conf.test_batch_size 139 | 140 | # read the test data from a file 141 | print("reading test data into the mem ...") 142 | texts_test, graphs_test = data_collector.read_data(conf.test_data_path, 
word_idx, if_increase_dict=False) 143 | 144 | print("reading word idx mapping from file") 145 | word_idx = disk_helper.read_word_idx_from_file(conf.word_idx_file_path) 146 | 147 | idx_word = {} 148 | for w in word_idx: 149 | idx_word[word_idx[w]] = w 150 | 151 | print("vectoring test data ...") 152 | tv_test = data_collector.vectorize_data(word_idx, texts_test) 153 | 154 | conf.word_vocab_size = len(word_idx.keys()) + 1 155 | 156 | with tf.Graph().as_default(): 157 | with tf.Session() as sess: 158 | model = Graph2SeqNN("test", conf, path_embed_method="lstm") 159 | model._build_graph() 160 | saver = tf.train.Saver(max_to_keep=None) 161 | 162 | model_path_name = "../saved_model/model-0" 163 | model_pred_path = "../saved_model/prediction.txt" 164 | 165 | saver.restore(sess, model_path_name) 166 | 167 | def test_step(seqs, decoder_seq_length, loss_weights, batch_graph): 168 | dict = {} 169 | dict['seq'] = seqs 170 | dict['batch_graph'] = batch_graph 171 | dict['loss_weights'] = loss_weights 172 | dict['decoder_seq_length'] = decoder_seq_length 173 | sample_id = model.act(sess, "test", dict, if_pred_on_dev=False) 174 | return sample_id 175 | 176 | n_test = len(texts_test) 177 | 178 | pred_texts = [] 179 | global_graphs = [] 180 | for start in range(0, n_test, test_batch_size): 181 | end = min(start + test_batch_size, n_test) 182 | tv = [] 183 | graphs = [] 184 | for _ in range(start, end): 185 | tv.append(tv_test[_]) 186 | graphs.append(graphs_test[_]) 187 | global_graphs.append(graphs_test[_]) 188 | 189 | batch_graph = data_collector.cons_batch_graph(graphs) 190 | gv = data_collector.vectorize_batch_graph(batch_graph, word_idx) 191 | tv, tv_real_len, loss_weights = helpers.batch(tv) 192 | 193 | sample_id = test_step(tv, tv_real_len, loss_weights, gv)[0] 194 | for tem_id in sample_id: 195 | pred_texts.append(text_decoder.decode_text(tem_id, idx_word)) 196 | 197 | acc = evaluate(type="acc", golds=texts_test, preds=pred_texts) 198 | print("acc on test set is 
{}".format(acc)) 199 | 200 | # write prediction result into a file 201 | with open(model_pred_path, 'w+') as f: 202 | for _ in range(len(global_graphs)): 203 | f.write("graph:\t"+json.dumps(global_graphs[_])+"\nGold:\t"+texts_test[_]+"\nPredicted:\t"+pred_texts[_]+"\n") 204 | if texts_test[_].strip() == pred_texts[_].strip(): 205 | f.write("Correct\n\n") 206 | else: 207 | f.write("Incorrect\n\n") 208 | 209 | 210 | if __name__ == "__main__": 211 | argparser = argparse.ArgumentParser() 212 | argparser.add_argument("mode", type=str, choices=["train", "test"]) 213 | argparser.add_argument("-sample_size_per_layer", type=int, default=4, help="sample size at each layer") 214 | argparser.add_argument("-sample_layer_size", type=int, default=4, help="sample layer size") 215 | argparser.add_argument("-epochs", type=int, default=100, help="training epochs") 216 | argparser.add_argument("-learning_rate", type=float, default=conf.learning_rate, help="learning rate") 217 | argparser.add_argument("-word_embedding_dim", type=int, default=conf.word_embedding_dim, help="word embedding dim") 218 | argparser.add_argument("-hidden_layer_dim", type=int, default=conf.hidden_layer_dim) 219 | 220 | config = argparser.parse_args() 221 | 222 | mode = config.mode 223 | conf.sample_layer_size = config.sample_layer_size 224 | conf.sample_size_per_layer = config.sample_size_per_layer 225 | conf.epochs = config.epochs 226 | conf.learning_rate = config.learning_rate 227 | conf.word_embedding_dim = config.word_embedding_dim 228 | conf.hidden_layer_dim = config.hidden_layer_dim 229 | 230 | main(mode) 231 | -------------------------------------------------------------------------------- /Graph2Seq-master/main/text_decoder.py: -------------------------------------------------------------------------------- 1 | import configure as conf 2 | 3 | def decode_text(pred_idx, idx_word): 4 | # if conf.decoder_type != "beam": 5 | # pred_idx = pred_idx[0] 6 | # else: 7 | # pred_idx = np.transpose(pred_idx) 8 | # 
pred_idx = pred_idx[0] 9 | 10 | text = "" 11 | for __ in range(len(pred_idx)): 12 | ids = pred_idx[__] 13 | if isinstance(ids, list): 14 | for id in ids: 15 | if id == 2: 16 | break 17 | if id != 0: 18 | text += idx_word[id] + " " 19 | else: 20 | text += conf.GO + " " 21 | else: 22 | if ids == 2: 23 | break 24 | if ids != 0: 25 | text += idx_word[ids] + " " 26 | else: 27 | text += conf.GO + " " 28 | return text -------------------------------------------------------------------------------- /Graph2Seq-master/saved_model/checkpoint: -------------------------------------------------------------------------------- 1 | model_checkpoint_path: "model-0" 2 | all_model_checkpoint_paths: "model-0" 3 | -------------------------------------------------------------------------------- /Graph2Seq-master/saved_model/model-0.data-00000-of-00001: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/IBM/Text-to-LogicForm/515e73db1795019007c1b3a3ed22b10c349f555a/Graph2Seq-master/saved_model/model-0.data-00000-of-00001 -------------------------------------------------------------------------------- /Graph2Seq-master/saved_model/model-0.index: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/IBM/Text-to-LogicForm/515e73db1795019007c1b3a3ed22b10c349f555a/Graph2Seq-master/saved_model/model-0.index -------------------------------------------------------------------------------- /Graph2Seq-master/saved_model/model-0.meta: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/IBM/Text-to-LogicForm/515e73db1795019007c1b3a3ed22b10c349f555a/Graph2Seq-master/saved_model/model-0.meta -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | Apache License 2 | Version 2.0, January 2004 3 | 
http://www.apache.org/licenses/ 4 | 5 | TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION 6 | 7 | 1. Definitions. 8 | 9 | "License" shall mean the terms and conditions for use, reproduction, 10 | and distribution as defined by Sections 1 through 9 of this document. 11 | 12 | "Licensor" shall mean the copyright owner or entity authorized by 13 | the copyright owner that is granting the License. 14 | 15 | "Legal Entity" shall mean the union of the acting entity and all 16 | other entities that control, are controlled by, or are under common 17 | control with that entity. For the purposes of this definition, 18 | "control" means (i) the power, direct or indirect, to cause the 19 | direction or management of such entity, whether by contract or 20 | otherwise, or (ii) ownership of fifty percent (50%) or more of the 21 | outstanding shares, or (iii) beneficial ownership of such entity. 22 | 23 | "You" (or "Your") shall mean an individual or Legal Entity 24 | exercising permissions granted by this License. 25 | 26 | "Source" form shall mean the preferred form for making modifications, 27 | including but not limited to software source code, documentation 28 | source, and configuration files. 29 | 30 | "Object" form shall mean any form resulting from mechanical 31 | transformation or translation of a Source form, including but 32 | not limited to compiled object code, generated documentation, 33 | and conversions to other media types. 34 | 35 | "Work" shall mean the work of authorship, whether in Source or 36 | Object form, made available under the License, as indicated by a 37 | copyright notice that is included in or attached to the work 38 | (an example is provided in the Appendix below). 
39 | 40 | "Derivative Works" shall mean any work, whether in Source or Object 41 | form, that is based on (or derived from) the Work and for which the 42 | editorial revisions, annotations, elaborations, or other modifications 43 | represent, as a whole, an original work of authorship. For the purposes 44 | of this License, Derivative Works shall not include works that remain 45 | separable from, or merely link (or bind by name) to the interfaces of, 46 | the Work and Derivative Works thereof. 47 | 48 | "Contribution" shall mean any work of authorship, including 49 | the original version of the Work and any modifications or additions 50 | to that Work or Derivative Works thereof, that is intentionally 51 | submitted to Licensor for inclusion in the Work by the copyright owner 52 | or by an individual or Legal Entity authorized to submit on behalf of 53 | the copyright owner. For the purposes of this definition, "submitted" 54 | means any form of electronic, verbal, or written communication sent 55 | to the Licensor or its representatives, including but not limited to 56 | communication on electronic mailing lists, source code control systems, 57 | and issue tracking systems that are managed by, or on behalf of, the 58 | Licensor for the purpose of discussing and improving the Work, but 59 | excluding communication that is conspicuously marked or otherwise 60 | designated in writing by the copyright owner as "Not a Contribution." 61 | 62 | "Contributor" shall mean Licensor and any individual or Legal Entity 63 | on behalf of whom a Contribution has been received by Licensor and 64 | subsequently incorporated within the Work. 65 | 66 | 2. Grant of Copyright License. 
Subject to the terms and conditions of 67 | this License, each Contributor hereby grants to You a perpetual, 68 | worldwide, non-exclusive, no-charge, royalty-free, irrevocable 69 | copyright license to reproduce, prepare Derivative Works of, 70 | publicly display, publicly perform, sublicense, and distribute the 71 | Work and such Derivative Works in Source or Object form. 72 | 73 | 3. Grant of Patent License. Subject to the terms and conditions of 74 | this License, each Contributor hereby grants to You a perpetual, 75 | worldwide, non-exclusive, no-charge, royalty-free, irrevocable 76 | (except as stated in this section) patent license to make, have made, 77 | use, offer to sell, sell, import, and otherwise transfer the Work, 78 | where such license applies only to those patent claims licensable 79 | by such Contributor that are necessarily infringed by their 80 | Contribution(s) alone or by combination of their Contribution(s) 81 | with the Work to which such Contribution(s) was submitted. If You 82 | institute patent litigation against any entity (including a 83 | cross-claim or counterclaim in a lawsuit) alleging that the Work 84 | or a Contribution incorporated within the Work constitutes direct 85 | or contributory patent infringement, then any patent licenses 86 | granted to You under this License for that Work shall terminate 87 | as of the date such litigation is filed. 88 | 89 | 4. Redistribution. 
You may reproduce and distribute copies of the 90 | Work or Derivative Works thereof in any medium, with or without 91 | modifications, and in Source or Object form, provided that You 92 | meet the following conditions: 93 | 94 | (a) You must give any other recipients of the Work or 95 | Derivative Works a copy of this License; and 96 | 97 | (b) You must cause any modified files to carry prominent notices 98 | stating that You changed the files; and 99 | 100 | (c) You must retain, in the Source form of any Derivative Works 101 | that You distribute, all copyright, patent, trademark, and 102 | attribution notices from the Source form of the Work, 103 | excluding those notices that do not pertain to any part of 104 | the Derivative Works; and 105 | 106 | (d) If the Work includes a "NOTICE" text file as part of its 107 | distribution, then any Derivative Works that You distribute must 108 | include a readable copy of the attribution notices contained 109 | within such NOTICE file, excluding those notices that do not 110 | pertain to any part of the Derivative Works, in at least one 111 | of the following places: within a NOTICE text file distributed 112 | as part of the Derivative Works; within the Source form or 113 | documentation, if provided along with the Derivative Works; or, 114 | within a display generated by the Derivative Works, if and 115 | wherever such third-party notices normally appear. The contents 116 | of the NOTICE file are for informational purposes only and 117 | do not modify the License. You may add Your own attribution 118 | notices within Derivative Works that You distribute, alongside 119 | or as an addendum to the NOTICE text from the Work, provided 120 | that such additional attribution notices cannot be construed 121 | as modifying the License. 
122 | 123 | You may add Your own copyright statement to Your modifications and 124 | may provide additional or different license terms and conditions 125 | for use, reproduction, or distribution of Your modifications, or 126 | for any such Derivative Works as a whole, provided Your use, 127 | reproduction, and distribution of the Work otherwise complies with 128 | the conditions stated in this License. 129 | 130 | 5. Submission of Contributions. Unless You explicitly state otherwise, 131 | any Contribution intentionally submitted for inclusion in the Work 132 | by You to the Licensor shall be under the terms and conditions of 133 | this License, without any additional terms or conditions. 134 | Notwithstanding the above, nothing herein shall supersede or modify 135 | the terms of any separate license agreement you may have executed 136 | with Licensor regarding such Contributions. 137 | 138 | 6. Trademarks. This License does not grant permission to use the trade 139 | names, trademarks, service marks, or product names of the Licensor, 140 | except as required for reasonable and customary use in describing the 141 | origin of the Work and reproducing the content of the NOTICE file. 142 | 143 | 7. Disclaimer of Warranty. Unless required by applicable law or 144 | agreed to in writing, Licensor provides the Work (and each 145 | Contributor provides its Contributions) on an "AS IS" BASIS, 146 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or 147 | implied, including, without limitation, any warranties or conditions 148 | of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A 149 | PARTICULAR PURPOSE. You are solely responsible for determining the 150 | appropriateness of using or redistributing the Work and assume any 151 | risks associated with Your exercise of permissions under this License. 152 | 153 | 8. Limitation of Liability. 
In no event and under no legal theory, 154 | whether in tort (including negligence), contract, or otherwise, 155 | unless required by applicable law (such as deliberate and grossly 156 | negligent acts) or agreed to in writing, shall any Contributor be 157 | liable to You for damages, including any direct, indirect, special, 158 | incidental, or consequential damages of any character arising as a 159 | result of this License or out of the use or inability to use the 160 | Work (including but not limited to damages for loss of goodwill, 161 | work stoppage, computer failure or malfunction, or any and all 162 | other commercial damages or losses), even if such Contributor 163 | has been advised of the possibility of such damages. 164 | 165 | 9. Accepting Warranty or Additional Liability. While redistributing 166 | the Work or Derivative Works thereof, You may choose to offer, 167 | and charge a fee for, acceptance of support, warranty, indemnity, 168 | or other liability obligations and/or rights consistent with this 169 | License. However, in accepting such obligations, You may act only 170 | on Your own behalf and on Your sole responsibility, not on behalf 171 | of any other Contributor, and only if You agree to indemnify, 172 | defend, and hold each Contributor harmless for any liability 173 | incurred by, or claims asserted against, such Contributor by reason 174 | of your accepting any such warranty or additional liability. 175 | 176 | END OF TERMS AND CONDITIONS 177 | 178 | APPENDIX: How to apply the Apache License to your work. 179 | 180 | To apply the Apache License to your work, attach the following 181 | boilerplate notice, with the fields enclosed by brackets "[]" 182 | replaced with your own identifying information. (Don't include 183 | the brackets!) The text should be enclosed in the appropriate 184 | comment syntax for the file format. 
We also recommend that a 185 | file or class name and description of purpose be included on the 186 | same "printed page" as the copyright notice for easier 187 | identification within third-party archives. 188 | 189 | Copyright [yyyy] [name of copyright owner] 190 | 191 | Licensed under the Apache License, Version 2.0 (the "License"); 192 | you may not use this file except in compliance with the License. 193 | You may obtain a copy of the License at 194 | 195 | http://www.apache.org/licenses/LICENSE-2.0 196 | 197 | Unless required by applicable law or agreed to in writing, software 198 | distributed under the License is distributed on an "AS IS" BASIS, 199 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 200 | See the License for the specific language governing permissions and 201 | limitations under the License. 202 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Text-to-LogicForm 2 | Text-to-LogicForm is a simple codebase for leveraging a syntactic graph for semantic parsing using a novel Graph2Seq model. 3 | 4 | # Required Dependencies and Tools 5 | Text-to-LogicForm consists of two parts of code. The first part is a novel Graph2Seq model that performs the job; 6 | its code can be found here: https://github.com/IBM/Graph2Seq. The second part is pre-processing code 7 | that converts text to a syntactic graph (this part will be released soon). 8 | 9 | # Graph2Seq 10 | Graph2Seq is a simple codebase for building a graph encoder and sequence decoder for NLP and other AI/ML/DL tasks.
11 | 12 | # How To Run The Codes 13 | To train your graph-to-sequence model, you need to: 14 | 15 | (1) Prepare your train/dev/test data in the following form: 16 | 17 | each line is a JSON object whose keys are "seq", "g_ids", "g_id_features", "g_adj": 18 | "seq" is the text that the decoder is supposed to generate 19 | "g_ids" is a mapping from the node ID to its ID in the graph 20 | "g_id_features" is a mapping from the node ID to its text features 21 | "g_adj" is a mapping from the node ID to its adjacent nodes (represented as their IDs) 22 | 23 | See data/no_cycle/train.data for examples. 24 | 25 | 26 | (2) Modify the hyper-parameters in main/configure.py according to your task. 27 | 28 | (3) Train the model by running: 29 | "python run_model.py train -sample_size_per_layer=xxx -sample_layer_size=yyy" 30 | The model that performs best on the dev data will be saved in the directory "saved_model". 31 | 32 | (4) Test the model by running: 33 | "python run_model.py test -sample_size_per_layer=xxx -sample_layer_size=yyy" 34 | The prediction result will be saved in saved_model/prediction.txt. 35 | 36 | 37 | # How To Cite The Codes 38 | Please cite our work if you use our code in your projects! 39 | 40 | Kun Xu, Lingfei Wu, Zhiguo Wang, Yansong Feng, and Vadim Sheinin, "Exploiting Rich Syntactic Information for Semantic Parsing with Graph-to-Sequence Model", in Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing. 41 | 42 | @article{xu2018exploiting,
43 | title={Exploiting rich syntactic information for semantic parsing with graph-to-sequence model},
44 | author={Xu, Kun and Wu, Lingfei and Wang, Zhiguo and Yu, Mo and Chen, Liwei and Sheinin, Vadim},
45 | journal={arXiv preprint arXiv:1808.07624},
46 | year={2018}
47 | } 48 | 49 | Kun Xu, Lingfei Wu, Zhiguo Wang, Yansong Feng, Michael Witbrock, and Vadim Sheinin (first and second authors contributed equally), "Graph2Seq: Graph to Sequence Learning with Attention-based Neural Networks", arXiv preprint arXiv:1804.00823. 50 | 51 | @article{xu2018graph2seq,
52 | title={Graph2Seq: Graph to Sequence Learning with Attention-based Neural Networks},
53 | author={Xu, Kun and Wu, Lingfei and Wang, Zhiguo and Feng, Yansong and Witbrock, Michael and Sheinin, Vadim},
54 | journal={arXiv preprint arXiv:1804.00823},
55 | year={2018}
56 | }
57 | 58 | ------------------------------------------------------ 59 | Contributors: Kun Xu, Lingfei Wu
60 | Created date: November 19, 2018
61 | Last update: November 19, 2018
62 | 63 | --------------------------------------------------------------------------------
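As a supplement to the data format described in the README (one JSON object per line with the keys "seq", "g_ids", "g_id_features", and "g_adj"), the sketch below builds a single hypothetical training line. The node IDs, features, and target sequence are invented for illustration and are not taken from data/no_cycle/train.data:

```python
import json

# One hypothetical training example in the line-per-JSON-object format
# described in the README; all values here are made up for illustration.
example = {
    # "seq": the text the decoder is supposed to generate
    "seq": "answer the question",
    # "g_ids": node ID -> its ID in the graph
    "g_ids": {"0": 0, "1": 1, "2": 2},
    # "g_id_features": node ID -> its text features
    "g_id_features": {"0": "answer", "1": "the", "2": "question"},
    # "g_adj": node ID -> adjacent node IDs
    "g_adj": {"0": ["1", "2"], "1": [], "2": []},
}

# Each example occupies exactly one line of the train/dev/test file.
line = json.dumps(example)
print(line)
```

A data file in this format is simply one such line per example, which is what data_collector.read_data expects to parse.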