├── Graph2Seq-master
│   ├── LICENSE
│   ├── README.md
│   ├── batch_helper.py
│   ├── data
│   │   ├── .DS_Store
│   │   ├── no_cycle
│   │   │   ├── dev.data
│   │   │   ├── test.data
│   │   │   └── train.data
│   │   └── word.idx
│   ├── data_creator.py
│   ├── main
│   │   ├── __init__.py
│   │   ├── __pycache__
│   │   │   ├── aggregators.cpython-36.pyc
│   │   │   ├── configure.cpython-36.pyc
│   │   │   ├── configure.cpython-37.pyc
│   │   │   ├── data_collector.cpython-36.pyc
│   │   │   ├── data_collector.cpython-37.pyc
│   │   │   ├── evaluator.cpython-36.pyc
│   │   │   ├── helpers.cpython-36.pyc
│   │   │   ├── inits.cpython-36.pyc
│   │   │   ├── layer_utils.cpython-36.pyc
│   │   │   ├── layers.cpython-36.pyc
│   │   │   ├── loaderAndwriter.cpython-36.pyc
│   │   │   ├── loaderAndwriter.cpython-37.pyc
│   │   │   ├── match_utils.cpython-36.pyc
│   │   │   ├── model.cpython-36.pyc
│   │   │   ├── model.cpython-37.pyc
│   │   │   ├── neigh_samplers.cpython-36.pyc
│   │   │   ├── pooling.cpython-36.pyc
│   │   │   └── text_decoder.cpython-36.pyc
│   │   ├── aggregators.py
│   │   ├── batch_helper.py
│   │   ├── configure.py
│   │   ├── data_collector.py
│   │   ├── data_creator.py
│   │   ├── evaluator.py
│   │   ├── helpers.py
│   │   ├── inits.py
│   │   ├── layer_utils.py
│   │   ├── layers.py
│   │   ├── loaderAndwriter.py
│   │   ├── match_utils.py
│   │   ├── model.py
│   │   ├── neigh_samplers.py
│   │   ├── pooling.py
│   │   ├── run_model.py
│   │   └── text_decoder.py
│   └── saved_model
│       ├── checkpoint
│       ├── model-0.data-00000-of-00001
│       ├── model-0.index
│       └── model-0.meta
├── LICENSE
└── README.md
/Graph2Seq-master/LICENSE:
--------------------------------------------------------------------------------
1 | Apache License
2 | Version 2.0, January 2004
3 | http://www.apache.org/licenses/
4 |
5 | TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
6 |
7 | 1. Definitions.
8 |
9 | "License" shall mean the terms and conditions for use, reproduction,
10 | and distribution as defined by Sections 1 through 9 of this document.
11 |
12 | "Licensor" shall mean the copyright owner or entity authorized by
13 | the copyright owner that is granting the License.
14 |
15 | "Legal Entity" shall mean the union of the acting entity and all
16 | other entities that control, are controlled by, or are under common
17 | control with that entity. For the purposes of this definition,
18 | "control" means (i) the power, direct or indirect, to cause the
19 | direction or management of such entity, whether by contract or
20 | otherwise, or (ii) ownership of fifty percent (50%) or more of the
21 | outstanding shares, or (iii) beneficial ownership of such entity.
22 |
23 | "You" (or "Your") shall mean an individual or Legal Entity
24 | exercising permissions granted by this License.
25 |
26 | "Source" form shall mean the preferred form for making modifications,
27 | including but not limited to software source code, documentation
28 | source, and configuration files.
29 |
30 | "Object" form shall mean any form resulting from mechanical
31 | transformation or translation of a Source form, including but
32 | not limited to compiled object code, generated documentation,
33 | and conversions to other media types.
34 |
35 | "Work" shall mean the work of authorship, whether in Source or
36 | Object form, made available under the License, as indicated by a
37 | copyright notice that is included in or attached to the work
38 | (an example is provided in the Appendix below).
39 |
40 | "Derivative Works" shall mean any work, whether in Source or Object
41 | form, that is based on (or derived from) the Work and for which the
42 | editorial revisions, annotations, elaborations, or other modifications
43 | represent, as a whole, an original work of authorship. For the purposes
44 | of this License, Derivative Works shall not include works that remain
45 | separable from, or merely link (or bind by name) to the interfaces of,
46 | the Work and Derivative Works thereof.
47 |
48 | "Contribution" shall mean any work of authorship, including
49 | the original version of the Work and any modifications or additions
50 | to that Work or Derivative Works thereof, that is intentionally
51 | submitted to Licensor for inclusion in the Work by the copyright owner
52 | or by an individual or Legal Entity authorized to submit on behalf of
53 | the copyright owner. For the purposes of this definition, "submitted"
54 | means any form of electronic, verbal, or written communication sent
55 | to the Licensor or its representatives, including but not limited to
56 | communication on electronic mailing lists, source code control systems,
57 | and issue tracking systems that are managed by, or on behalf of, the
58 | Licensor for the purpose of discussing and improving the Work, but
59 | excluding communication that is conspicuously marked or otherwise
60 | designated in writing by the copyright owner as "Not a Contribution."
61 |
62 | "Contributor" shall mean Licensor and any individual or Legal Entity
63 | on behalf of whom a Contribution has been received by Licensor and
64 | subsequently incorporated within the Work.
65 |
66 | 2. Grant of Copyright License. Subject to the terms and conditions of
67 | this License, each Contributor hereby grants to You a perpetual,
68 | worldwide, non-exclusive, no-charge, royalty-free, irrevocable
69 | copyright license to reproduce, prepare Derivative Works of,
70 | publicly display, publicly perform, sublicense, and distribute the
71 | Work and such Derivative Works in Source or Object form.
72 |
73 | 3. Grant of Patent License. Subject to the terms and conditions of
74 | this License, each Contributor hereby grants to You a perpetual,
75 | worldwide, non-exclusive, no-charge, royalty-free, irrevocable
76 | (except as stated in this section) patent license to make, have made,
77 | use, offer to sell, sell, import, and otherwise transfer the Work,
78 | where such license applies only to those patent claims licensable
79 | by such Contributor that are necessarily infringed by their
80 | Contribution(s) alone or by combination of their Contribution(s)
81 | with the Work to which such Contribution(s) was submitted. If You
82 | institute patent litigation against any entity (including a
83 | cross-claim or counterclaim in a lawsuit) alleging that the Work
84 | or a Contribution incorporated within the Work constitutes direct
85 | or contributory patent infringement, then any patent licenses
86 | granted to You under this License for that Work shall terminate
87 | as of the date such litigation is filed.
88 |
89 | 4. Redistribution. You may reproduce and distribute copies of the
90 | Work or Derivative Works thereof in any medium, with or without
91 | modifications, and in Source or Object form, provided that You
92 | meet the following conditions:
93 |
94 | (a) You must give any other recipients of the Work or
95 | Derivative Works a copy of this License; and
96 |
97 | (b) You must cause any modified files to carry prominent notices
98 | stating that You changed the files; and
99 |
100 | (c) You must retain, in the Source form of any Derivative Works
101 | that You distribute, all copyright, patent, trademark, and
102 | attribution notices from the Source form of the Work,
103 | excluding those notices that do not pertain to any part of
104 | the Derivative Works; and
105 |
106 | (d) If the Work includes a "NOTICE" text file as part of its
107 | distribution, then any Derivative Works that You distribute must
108 | include a readable copy of the attribution notices contained
109 | within such NOTICE file, excluding those notices that do not
110 | pertain to any part of the Derivative Works, in at least one
111 | of the following places: within a NOTICE text file distributed
112 | as part of the Derivative Works; within the Source form or
113 | documentation, if provided along with the Derivative Works; or,
114 | within a display generated by the Derivative Works, if and
115 | wherever such third-party notices normally appear. The contents
116 | of the NOTICE file are for informational purposes only and
117 | do not modify the License. You may add Your own attribution
118 | notices within Derivative Works that You distribute, alongside
119 | or as an addendum to the NOTICE text from the Work, provided
120 | that such additional attribution notices cannot be construed
121 | as modifying the License.
122 |
123 | You may add Your own copyright statement to Your modifications and
124 | may provide additional or different license terms and conditions
125 | for use, reproduction, or distribution of Your modifications, or
126 | for any such Derivative Works as a whole, provided Your use,
127 | reproduction, and distribution of the Work otherwise complies with
128 | the conditions stated in this License.
129 |
130 | 5. Submission of Contributions. Unless You explicitly state otherwise,
131 | any Contribution intentionally submitted for inclusion in the Work
132 | by You to the Licensor shall be under the terms and conditions of
133 | this License, without any additional terms or conditions.
134 | Notwithstanding the above, nothing herein shall supersede or modify
135 | the terms of any separate license agreement you may have executed
136 | with Licensor regarding such Contributions.
137 |
138 | 6. Trademarks. This License does not grant permission to use the trade
139 | names, trademarks, service marks, or product names of the Licensor,
140 | except as required for reasonable and customary use in describing the
141 | origin of the Work and reproducing the content of the NOTICE file.
142 |
143 | 7. Disclaimer of Warranty. Unless required by applicable law or
144 | agreed to in writing, Licensor provides the Work (and each
145 | Contributor provides its Contributions) on an "AS IS" BASIS,
146 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
147 | implied, including, without limitation, any warranties or conditions
148 | of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
149 | PARTICULAR PURPOSE. You are solely responsible for determining the
150 | appropriateness of using or redistributing the Work and assume any
151 | risks associated with Your exercise of permissions under this License.
152 |
153 | 8. Limitation of Liability. In no event and under no legal theory,
154 | whether in tort (including negligence), contract, or otherwise,
155 | unless required by applicable law (such as deliberate and grossly
156 | negligent acts) or agreed to in writing, shall any Contributor be
157 | liable to You for damages, including any direct, indirect, special,
158 | incidental, or consequential damages of any character arising as a
159 | result of this License or out of the use or inability to use the
160 | Work (including but not limited to damages for loss of goodwill,
161 | work stoppage, computer failure or malfunction, or any and all
162 | other commercial damages or losses), even if such Contributor
163 | has been advised of the possibility of such damages.
164 |
165 | 9. Accepting Warranty or Additional Liability. While redistributing
166 | the Work or Derivative Works thereof, You may choose to offer,
167 | and charge a fee for, acceptance of support, warranty, indemnity,
168 | or other liability obligations and/or rights consistent with this
169 | License. However, in accepting such obligations, You may act only
170 | on Your own behalf and on Your sole responsibility, not on behalf
171 | of any other Contributor, and only if You agree to indemnify,
172 | defend, and hold each Contributor harmless for any liability
173 | incurred by, or claims asserted against, such Contributor by reason
174 | of your accepting any such warranty or additional liability.
175 |
176 | END OF TERMS AND CONDITIONS
177 |
178 | APPENDIX: How to apply the Apache License to your work.
179 |
180 | To apply the Apache License to your work, attach the following
181 | boilerplate notice, with the fields enclosed by brackets "[]"
182 | replaced with your own identifying information. (Don't include
183 | the brackets!) The text should be enclosed in the appropriate
184 | comment syntax for the file format. We also recommend that a
185 | file or class name and description of purpose be included on the
186 | same "printed page" as the copyright notice for easier
187 | identification within third-party archives.
188 |
189 | Copyright [yyyy] [name of copyright owner]
190 |
191 | Licensed under the Apache License, Version 2.0 (the "License");
192 | you may not use this file except in compliance with the License.
193 | You may obtain a copy of the License at
194 |
195 | http://www.apache.org/licenses/LICENSE-2.0
196 |
197 | Unless required by applicable law or agreed to in writing, software
198 | distributed under the License is distributed on an "AS IS" BASIS,
199 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
200 | See the License for the specific language governing permissions and
201 | limitations under the License.
202 |
--------------------------------------------------------------------------------
/Graph2Seq-master/README.md:
--------------------------------------------------------------------------------
1 | # Graph2Seq
2 | Graph2Seq is a simple codebase for building a graph encoder and sequence decoder for NLP and other AI/ML/DL tasks.
3 |
4 | # How To Run The Codes
5 | To train your graph-to-sequence model, you need:
6 |
7 | (1) Prepare your train/dev/test data in the following form:
8 |
9 |     each line is a JSON object whose keys are "seq", "g_ids", "g_ids_features", "g_adj":
10 |     "seq" is the text that the decoder is expected to output
11 |     "g_ids" is a mapping from the node ID to its ID in the graph
12 |     "g_ids_features" is a mapping from the node ID to its text features
13 |     "g_adj" is a mapping from the node ID to its adjacent nodes (represented as their IDs)
14 |
15 | See data/no_cycle/train.data as examples.
16 |
17 |
18 | (2) Modify the hyper-parameters in main/configure.py according to your task.
19 |
20 | (3) Train the model by running the following command:
21 |     "python run_model.py train -sample_size_per_layer=xxx -sample_layer_size=yyy"
22 |     The model that performs best on the dev data will be saved in the directory "saved_model".
23 |
24 | (4) Test the model by running the following command:
25 |     "python run_model.py test -sample_size_per_layer=xxx -sample_layer_size=yyy"
26 |     The prediction results will be saved in saved_model/prediction.txt.
27 |
28 |
29 |
30 | # How To Cite The Codes
31 | Please cite our work if you like it or are using our code in your projects!
32 |
33 | Kun Xu, Lingfei Wu, Zhiguo Wang, Yansong Feng, Michael Witbrock, and Vadim Sheinin (first and second authors contributed equally), "Graph2Seq: Graph to Sequence Learning with Attention-based Neural Networks", arXiv preprint arXiv:1804.00823.
34 |
35 | @article{xu2018graph2seq,
36 | title={Graph2Seq: Graph to Sequence Learning with Attention-based Neural Networks},
37 | author={Xu, Kun and Wu, Lingfei and Wang, Zhiguo and Feng, Yansong and Witbrock, Michael and Sheinin, Vadim},
38 | journal={arXiv preprint arXiv:1804.00823},
39 | year={2018}
40 | }
41 |
42 | ------------------------------------------------------
43 | Contributors: Kun Xu, Lingfei Wu
44 | Created date: November 4, 2018
45 | Last update: November 4, 2018
46 |
47 |
--------------------------------------------------------------------------------
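
To make the data format described in the README concrete, here is an illustrative sketch of one line of train.data. The key names follow what data_creator.py writes; the 3-node path graph, node IDs, and features are made up for the example:

```python
import json

# One training example: a 3-node path 0 -> 1 -> 2,
# where node 0 is START and node 2 is END.
example = {
    "seq": "START 7 END",                    # the target output of the decoder
    "g_ids": {"0": 0, "1": 1, "2": 2},       # node ID -> its ID in the graph
    "g_ids_features": {"0": "START", "1": "7", "2": "END"},  # node ID -> text feature
    "g_adj": {"0": [1], "1": [2], "2": []},  # node ID -> adjacent node IDs
}

# Each line of train.data/dev.data/test.data is one such JSON object.
line = json.dumps(example)
```

Reading the data back is the inverse: `json.loads` on each line recovers the four keys.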
/Graph2Seq-master/batch_helper.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/IBM/Text-to-LogicForm/515e73db1795019007c1b3a3ed22b10c349f555a/Graph2Seq-master/batch_helper.py
--------------------------------------------------------------------------------
/Graph2Seq-master/data/.DS_Store:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/IBM/Text-to-LogicForm/515e73db1795019007c1b3a3ed22b10c349f555a/Graph2Seq-master/data/.DS_Store
--------------------------------------------------------------------------------
/Graph2Seq-master/data/word.idx:
--------------------------------------------------------------------------------
1 | 1
2 | 2
3 | 3
4 | START 4
5 | 6 5
6 | 5 6
7 | END 7
8 | 12 8
9 | 15 9
10 | 7 10
11 | 10 11
12 | 11 12
13 | 14 13
14 | 1 14
15 | 8 15
16 | 4 16
17 | 2 17
18 | 3 18
19 | 13 19
20 | 9 20
21 |
--------------------------------------------------------------------------------
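
word.idx stores one "token index" pair per line (the tokens of the first three entries did not survive this dump). The actual loading code lives in main/loaderAndwriter.py, which is not included above, so the following is only a sketch of a parser for this format; `load_word_idx` is a hypothetical helper name:

```python
def load_word_idx(lines):
    """Parse "token index" pairs into a dict mapping token -> int.

    Lines that do not contain exactly two fields (such as the damaged
    entries at the top of the file above) are skipped.
    """
    word2idx = {}
    for line in lines:
        parts = line.split()
        if len(parts) == 2:
            token, idx = parts
            word2idx[token] = int(idx)
    return word2idx

# A few entries copied from the file above; note that most "tokens"
# here are themselves digit strings, since the toy task emits numbers.
vocab = load_word_idx(["START 4", "6 5", "5 6", "END 7"])
```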
/Graph2Seq-master/data_creator.py:
--------------------------------------------------------------------------------
1 | """this python file is used to construct the fake data for the model"""
2 | import random
3 | import json
4 | import numpy as np
5 |
6 | from networkx import *
7 | import networkx.algorithms as nxalg
8 |
9 | def create_random_graph(type, filePath, numberOfCase, graph_scale):
10 | """
11 |
12 | :param type: the graph type
13 | :param filePath: the output file path
14 | :param numberOfCase: the number of examples
15 | :return:
16 | """
17 | with open(filePath, "w+") as f:
18 | degree = 0.0
19 | for _ in range(numberOfCase):
20 | info = {}
21 | graph_node_size = graph_scale
22 | edge_prob = 0.3
23 |
24 | while True:
25 | edge_count = 0.0
26 | if type == "random":
27 | graph = nx.gnp_random_graph(graph_node_size, edge_prob, directed=True)
28 | for id in graph.edge:
29 | edge_count += len(graph.edge[id])
30 | start = random.randint(0, graph_node_size - 1)
31 | adj = nx.shortest_path(graph, start)
32 |
33 | max_len = 0
34 | path = []
35 | paths = []
36 | for neighbor in adj:
37 | if len(adj[neighbor]) > max_len and neighbor != start:
38 | paths = []
39 | max_len = len(adj[neighbor])
40 | path = adj[neighbor]
41 | end = neighbor
42 | for p in nx.all_shortest_paths(graph, start, end):
43 | paths.append(p)
44 |
45 | if len(path) > 0 and path[0] == start and len(path) == 3 and len(paths) == 1:
46 | degree += edge_count / graph_node_size
47 | break
48 |
49 | elif type == "no-cycle":
50 | graph = nx.DiGraph()
51 | for i in range(graph_node_size):
52 | nodes = graph.nodes()
53 | if len(nodes) == 0:
54 | graph.add_node(i)
55 | else:
56 | size = random.randint(1, min(i, 2));
57 | fathers = random.sample(range(0, i), size)
58 | for father in fathers:
59 | graph.add_edge(father, i)
60 | for id in graph.edge:
61 | edge_count += len(graph.edge[id])
62 | start = 0
63 | end = graph_node_size-1
64 | path = nx.shortest_path(graph, 0, graph_node_size-1)
65 | paths = [p for p in nx.all_shortest_paths(graph, 0, graph_node_size-1)]
66 | if len(path) >= 4 and len(paths) == 1:
67 | degree += edge_count / graph_node_size
68 | break
69 |
70 | elif type == "baseline":
71 | num_nodes = graph_node_size
72 | graph = nx.random_graphs.connected_watts_strogatz_graph(num_nodes, 3, edge_prob)
73 | for id in graph.edge:
74 | edge_count += len(graph.edge[id])
75 | start, end = np.random.randint(num_nodes, size=2)
76 |
77 | if start == end:
78 | continue # reject trivial paths
79 |
80 | paths = list(nxalg.all_shortest_paths(graph, source=start, target=end))
81 |
82 | if len(paths) > 1:
83 | continue # reject when more than one shortest path
84 |
85 | path = paths[0]
86 |
87 | if len(path) != 4:
88 | continue
89 | degree += edge_count / graph_node_size
90 | break
91 |
92 | adj_list = graph.adjacency_list()
93 |
94 |
95 | g_ids = {}
96 | g_ids_features = {}
97 | g_adj = {}
98 | for i in range(graph_node_size):
99 | g_ids[i] = i
100 | if i == start:
101 | g_ids_features[i] = "START"
102 | elif i == end:
103 | g_ids_features[i] = "END"
104 | else:
105 | # g_ids_features[i] = str(i+10)
106 | g_ids_features[i] = str(random.randint(1, 15))
107 | g_adj[i] = adj_list[i]
108 |
109 | # print start, end, path
110 | text = ""
111 | for id in path:
112 | text += g_ids_features[id] + " "
113 |
114 | info["seq"] = text.strip()
115 | info["g_ids"] = g_ids
116 | info['g_ids_features'] = g_ids_features
117 | info['g_adj'] = g_adj
118 | f.write(json.dumps(info)+"\n")
119 |
120 | print("average degree in the graph is :{}".format(degree/numberOfCase))
121 |
122 | if __name__ == "__main__":
123 | create_random_graph("no-cycle", "data/no_cycle/train.data", 1000, graph_scale=100)
124 | create_random_graph("no-cycle", "data/no_cycle/dev.data", 1000, graph_scale=100)
125 | create_random_graph("no-cycle", "data/no_cycle/test.data", 1000, graph_scale=100)
126 |
127 |
128 |
--------------------------------------------------------------------------------
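
The "no-cycle" branch of data_creator.py guarantees acyclicity because node i only ever receives edges from nodes with smaller indices. A dependency-free sketch of that construction and of the shortest-path check (plain BFS standing in for nx.shortest_path; the function names are illustrative):

```python
import random
from collections import deque

def make_dag(n, rng):
    """Build a DAG the way the "no-cycle" branch does: node i receives
    edges from 1 or 2 randomly chosen earlier nodes, so every edge runs
    from a lower index to a higher one and no cycle can form."""
    adj = {i: [] for i in range(n)}
    for i in range(1, n):
        size = rng.randint(1, min(i, 2))
        for father in rng.sample(range(i), size):
            adj[father].append(i)
    return adj

def shortest_path(adj, start, end):
    """Plain BFS shortest path over an adjacency dict."""
    prev = {start: None}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        if u == end:  # walk the predecessor chain back to start
            path = []
            while u is not None:
                path.append(u)
                u = prev[u]
            return path[::-1]
        for v in adj[u]:
            if v not in prev:
                prev[v] = u
                queue.append(v)
    return None

adj = make_dag(8, random.Random(0))
path = shortest_path(adj, 0, 7)
```

Since every node except 0 has at least one "father" with a smaller index, node 0 is an ancestor of every node, so the path from node 0 to node graph_node_size - 1 always exists.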
/Graph2Seq-master/main/__init__.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/IBM/Text-to-LogicForm/515e73db1795019007c1b3a3ed22b10c349f555a/Graph2Seq-master/main/__init__.py
--------------------------------------------------------------------------------
/Graph2Seq-master/main/__pycache__/aggregators.cpython-36.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/IBM/Text-to-LogicForm/515e73db1795019007c1b3a3ed22b10c349f555a/Graph2Seq-master/main/__pycache__/aggregators.cpython-36.pyc
--------------------------------------------------------------------------------
/Graph2Seq-master/main/__pycache__/configure.cpython-36.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/IBM/Text-to-LogicForm/515e73db1795019007c1b3a3ed22b10c349f555a/Graph2Seq-master/main/__pycache__/configure.cpython-36.pyc
--------------------------------------------------------------------------------
/Graph2Seq-master/main/__pycache__/configure.cpython-37.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/IBM/Text-to-LogicForm/515e73db1795019007c1b3a3ed22b10c349f555a/Graph2Seq-master/main/__pycache__/configure.cpython-37.pyc
--------------------------------------------------------------------------------
/Graph2Seq-master/main/__pycache__/data_collector.cpython-36.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/IBM/Text-to-LogicForm/515e73db1795019007c1b3a3ed22b10c349f555a/Graph2Seq-master/main/__pycache__/data_collector.cpython-36.pyc
--------------------------------------------------------------------------------
/Graph2Seq-master/main/__pycache__/data_collector.cpython-37.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/IBM/Text-to-LogicForm/515e73db1795019007c1b3a3ed22b10c349f555a/Graph2Seq-master/main/__pycache__/data_collector.cpython-37.pyc
--------------------------------------------------------------------------------
/Graph2Seq-master/main/__pycache__/evaluator.cpython-36.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/IBM/Text-to-LogicForm/515e73db1795019007c1b3a3ed22b10c349f555a/Graph2Seq-master/main/__pycache__/evaluator.cpython-36.pyc
--------------------------------------------------------------------------------
/Graph2Seq-master/main/__pycache__/helpers.cpython-36.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/IBM/Text-to-LogicForm/515e73db1795019007c1b3a3ed22b10c349f555a/Graph2Seq-master/main/__pycache__/helpers.cpython-36.pyc
--------------------------------------------------------------------------------
/Graph2Seq-master/main/__pycache__/inits.cpython-36.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/IBM/Text-to-LogicForm/515e73db1795019007c1b3a3ed22b10c349f555a/Graph2Seq-master/main/__pycache__/inits.cpython-36.pyc
--------------------------------------------------------------------------------
/Graph2Seq-master/main/__pycache__/layer_utils.cpython-36.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/IBM/Text-to-LogicForm/515e73db1795019007c1b3a3ed22b10c349f555a/Graph2Seq-master/main/__pycache__/layer_utils.cpython-36.pyc
--------------------------------------------------------------------------------
/Graph2Seq-master/main/__pycache__/layers.cpython-36.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/IBM/Text-to-LogicForm/515e73db1795019007c1b3a3ed22b10c349f555a/Graph2Seq-master/main/__pycache__/layers.cpython-36.pyc
--------------------------------------------------------------------------------
/Graph2Seq-master/main/__pycache__/loaderAndwriter.cpython-36.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/IBM/Text-to-LogicForm/515e73db1795019007c1b3a3ed22b10c349f555a/Graph2Seq-master/main/__pycache__/loaderAndwriter.cpython-36.pyc
--------------------------------------------------------------------------------
/Graph2Seq-master/main/__pycache__/loaderAndwriter.cpython-37.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/IBM/Text-to-LogicForm/515e73db1795019007c1b3a3ed22b10c349f555a/Graph2Seq-master/main/__pycache__/loaderAndwriter.cpython-37.pyc
--------------------------------------------------------------------------------
/Graph2Seq-master/main/__pycache__/match_utils.cpython-36.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/IBM/Text-to-LogicForm/515e73db1795019007c1b3a3ed22b10c349f555a/Graph2Seq-master/main/__pycache__/match_utils.cpython-36.pyc
--------------------------------------------------------------------------------
/Graph2Seq-master/main/__pycache__/model.cpython-36.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/IBM/Text-to-LogicForm/515e73db1795019007c1b3a3ed22b10c349f555a/Graph2Seq-master/main/__pycache__/model.cpython-36.pyc
--------------------------------------------------------------------------------
/Graph2Seq-master/main/__pycache__/model.cpython-37.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/IBM/Text-to-LogicForm/515e73db1795019007c1b3a3ed22b10c349f555a/Graph2Seq-master/main/__pycache__/model.cpython-37.pyc
--------------------------------------------------------------------------------
/Graph2Seq-master/main/__pycache__/neigh_samplers.cpython-36.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/IBM/Text-to-LogicForm/515e73db1795019007c1b3a3ed22b10c349f555a/Graph2Seq-master/main/__pycache__/neigh_samplers.cpython-36.pyc
--------------------------------------------------------------------------------
/Graph2Seq-master/main/__pycache__/pooling.cpython-36.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/IBM/Text-to-LogicForm/515e73db1795019007c1b3a3ed22b10c349f555a/Graph2Seq-master/main/__pycache__/pooling.cpython-36.pyc
--------------------------------------------------------------------------------
/Graph2Seq-master/main/__pycache__/text_decoder.cpython-36.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/IBM/Text-to-LogicForm/515e73db1795019007c1b3a3ed22b10c349f555a/Graph2Seq-master/main/__pycache__/text_decoder.cpython-36.pyc
--------------------------------------------------------------------------------
/Graph2Seq-master/main/aggregators.py:
--------------------------------------------------------------------------------
1 | import tensorflow as tf
2 | from layers import Layer, Dense
3 | from inits import glorot, zeros
4 | from pooling import mean_pool
5 |
6 | class GatedMeanAggregator(Layer):
7 | def __init__(self, input_dim, output_dim, neigh_input_dim=None,
8 | dropout=0, bias=True, act=tf.nn.relu,
9 | name=None, concat=False, **kwargs):
10 | super(GatedMeanAggregator, self).__init__(**kwargs)
11 |
12 | self.dropout = dropout
13 | self.bias = bias
14 | self.act = act
15 | self.concat = concat
16 |
17 | if name is not None:
18 | name = '/' + name
19 | else:
20 | name = ''
21 |
22 | if neigh_input_dim == None:
23 | neigh_input_dim = input_dim
24 |
25 | if concat:
26 | self.output_dim = 2 * output_dim
27 |
28 | with tf.variable_scope(self.name + name + '_vars'):
29 | self.vars['neigh_weights'] = glorot([neigh_input_dim, output_dim],
30 | name='neigh_weights')
31 | self.vars['self_weights'] = glorot([input_dim, output_dim],
32 | name='self_weights')
33 | if self.bias:
34 | self.vars['bias'] = zeros([self.output_dim], name='bias')
35 |
36 | self.vars['gate_weights'] = glorot([2*output_dim, 2*output_dim],
37 | name='gate_weights')
38 | self.vars['gate_bias'] = zeros([2*output_dim], name='bias')
39 |
40 |
41 | self.input_dim = input_dim
42 | self.output_dim = output_dim
43 |
44 | def _call(self, inputs):
45 | self_vecs, neigh_vecs = inputs
46 |
47 | neigh_vecs = tf.nn.dropout(neigh_vecs, 1-self.dropout)
48 | self_vecs = tf.nn.dropout(self_vecs, 1-self.dropout)
49 |
50 | neigh_means = tf.reduce_mean(neigh_vecs, axis=1)
51 |
52 | # [nodes] x [out_dim]
53 | from_neighs = tf.matmul(neigh_means, self.vars['neigh_weights'])
54 |
55 | from_self = tf.matmul(self_vecs, self.vars["self_weights"])
56 |
57 | if not self.concat:
58 | output = tf.add_n([from_self, from_neighs])
59 | else:
60 | output = tf.concat([from_self, from_neighs], axis=1)
61 |
62 | # bias
63 | if self.bias:
64 | output += self.vars['bias']
65 |
66 | gate = tf.concat([from_self, from_neighs], axis=1)
67 | gate = tf.matmul(gate, self.vars["gate_weights"]) + self.vars["gate_bias"]
68 | gate = tf.nn.relu(gate)
69 |
70 | return gate*self.act(output)
71 |
72 | class MeanAggregator(Layer):
73 | """Aggregates via mean followed by matmul and non-linearity."""
74 |
75 | def __init__(self, input_dim, output_dim, neigh_input_dim=None,
76 | dropout=0, bias=True, act=tf.nn.relu,
77 | name=None, concat=False, mode="train", **kwargs):
78 | super(MeanAggregator, self).__init__(**kwargs)
79 |
80 | self.dropout = dropout
81 | self.bias = bias
82 | self.act = act
83 | self.concat = concat
84 | self.mode = mode
85 |
86 | if name is not None:
87 | name = '/' + name
88 | else:
89 | name = ''
90 |
91 | if neigh_input_dim == None:
92 | neigh_input_dim = input_dim
93 |
94 | if concat:
95 | self.output_dim = 2 * output_dim
96 |
97 | with tf.variable_scope(self.name + name + '_vars'):
98 | self.vars['neigh_weights'] = glorot([neigh_input_dim, output_dim],
99 | name='neigh_weights')
100 | self.vars['self_weights'] = glorot([input_dim, output_dim],
101 | name='self_weights')
102 | if self.bias:
103 | self.vars['bias'] = zeros([self.output_dim], name='bias')
104 |
105 | self.input_dim = input_dim
106 | self.output_dim = output_dim
107 |
108 | def _call(self, inputs):
109 | self_vecs, neigh_vecs, neigh_len = inputs
110 |
111 | if self.mode == "train":
112 | neigh_vecs = tf.nn.dropout(neigh_vecs, 1-self.dropout)
113 | self_vecs = tf.nn.dropout(self_vecs, 1-self.dropout)
114 |
115 | # reduce_mean performs better than mean_pool
116 | neigh_means = tf.reduce_mean(neigh_vecs, axis=1)
117 | # neigh_means = mean_pool(neigh_vecs, neigh_len)
118 |
119 | # [nodes] x [out_dim]
120 | from_neighs = tf.matmul(neigh_means, self.vars['neigh_weights'])
121 |
122 | from_self = tf.matmul(self_vecs, self.vars["self_weights"])
123 |
124 | if not self.concat:
125 | output = tf.add_n([from_self, from_neighs])
126 | else:
127 | output = tf.concat([from_self, from_neighs], axis=1)
128 |
129 | # bias
130 | if self.bias:
131 | output += self.vars['bias']
132 |
133 | return self.act(output)
134 |
135 | class MaxPoolingAggregator(Layer):
136 | """ Aggregates via max-pooling over MLP functions."""
137 | def __init__(self, input_dim, output_dim, model_size="small", neigh_input_dim=None,
138 | dropout=0., bias=True, act=tf.nn.relu, name=None, concat=False, **kwargs):
139 | super(MaxPoolingAggregator, self).__init__(**kwargs)
140 |
141 | self.dropout = dropout
142 | self.bias = bias
143 | self.act = act
144 | self.concat = concat
145 |
146 | if name is not None:
147 | name = '/' + name
148 | else:
149 | name = ''
150 |
151 | if neigh_input_dim is None:
152 | neigh_input_dim = input_dim
153 |
154 | # 2 * output_dim wide when self and neighbor outputs are concatenated
155 | self.output_dim = 2 * output_dim if concat else output_dim
156 |
157 | if model_size == "small":
158 | hidden_dim = self.hidden_dim = 50
159 | elif model_size == "big":
160 | hidden_dim = self.hidden_dim = 50  # note: identical to "small"; a larger value was presumably intended
161 |
162 | self.mlp_layers = []
163 | self.mlp_layers.append(Dense(input_dim=neigh_input_dim, output_dim=hidden_dim, act=tf.nn.relu,
164 | dropout=dropout, sparse_inputs=False, logging=self.logging))
165 |
166 | with tf.variable_scope(self.name + name + '_vars'):
167 |
168 | self.vars['neigh_weights'] = glorot([hidden_dim, output_dim], name='neigh_weights')
169 |
170 | self.vars['self_weights'] = glorot([input_dim, output_dim], name='self_weights')
171 |
172 | if self.bias:
173 | self.vars['bias'] = zeros([self.output_dim], name='bias')
174 |
175 | self.input_dim = input_dim
176 | if not concat: self.output_dim = output_dim  # keep 2 * output_dim when concatenating
177 | self.neigh_input_dim = neigh_input_dim
178 |
179 | def _call(self, inputs):
180 | self_vecs, neigh_vecs = inputs
181 | neigh_h = neigh_vecs
182 |
183 | dims = tf.shape(neigh_h)
184 | batch_size = dims[0]
185 | num_neighbors = dims[1]
186 |
187 | h_reshaped = tf.reshape(neigh_h, (batch_size * num_neighbors, self.neigh_input_dim))
188 |
189 | for l in self.mlp_layers:
190 | h_reshaped = l(h_reshaped)
191 | neigh_h = tf.reshape(h_reshaped, (batch_size, num_neighbors, self.hidden_dim))
192 | neigh_h = tf.reduce_max(neigh_h, axis=1)
193 |
194 | from_neighs = tf.matmul(neigh_h, self.vars['neigh_weights'])
195 | from_self = tf.matmul(self_vecs, self.vars["self_weights"])
196 |
197 | if not self.concat:
198 | output = tf.add_n([from_self, from_neighs])
199 | else:
200 | output = tf.concat([from_self, from_neighs], axis=1)
201 |
202 | # bias
203 | if self.bias:
204 | output += self.vars['bias']
205 | return self.act(output)
--------------------------------------------------------------------------------
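The MeanAggregator above boils down to a mean over sampled neighbor vectors followed by two linear maps, optionally concatenated. A minimal NumPy sketch of that forward pass, with illustrative names and shapes (`mean_aggregate` is not part of the repo's API; dropout, bias, and the activation are omitted):

```python
import numpy as np

def mean_aggregate(self_vecs, neigh_vecs, w_self, w_neigh, concat=True):
    """Mirror of MeanAggregator._call, without dropout, bias, or activation."""
    neigh_means = neigh_vecs.mean(axis=1)        # [nodes, neigh_input_dim]
    from_neighs = neigh_means @ w_neigh          # [nodes, output_dim]
    from_self = self_vecs @ w_self               # [nodes, output_dim]
    if concat:
        return np.concatenate([from_self, from_neighs], axis=1)
    return from_self + from_neighs

self_vecs = np.ones((4, 3))        # 4 nodes, input_dim = 3
neigh_vecs = np.ones((4, 5, 3))    # 5 sampled neighbors per node
w_self = np.ones((3, 2))
w_neigh = np.ones((3, 2))
out = mean_aggregate(self_vecs, neigh_vecs, w_self, w_neigh)
```

With `concat=True` the output is twice as wide as `output_dim`, which is why the constructor doubles `self.output_dim` in that case.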
/Graph2Seq-master/main/batch_helper.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/IBM/Text-to-LogicForm/515e73db1795019007c1b3a3ed22b10c349f555a/Graph2Seq-master/main/batch_helper.py
--------------------------------------------------------------------------------
/Graph2Seq-master/main/configure.py:
--------------------------------------------------------------------------------
1 | train_data_path = "../data/no_cycle/train.data"
2 | dev_data_path = "../data/no_cycle/dev.data"
3 | test_data_path = "../data/no_cycle/test.data"
4 |
5 | word_idx_file_path = "../data/word.idx"
6 |
7 | word_embedding_dim = 100
8 |
9 | train_batch_size = 32
10 | dev_batch_size = 500
11 | test_batch_size = 500
12 |
13 | l2_lambda = 0.000001
14 | learning_rate = 0.001
15 | epochs = 100
16 | encoder_hidden_dim = 200
17 | num_layers_decode = 1
18 | word_size_max = 1
19 |
20 | dropout = 0.0
21 |
22 | path_embed_method = "lstm" # cnn or lstm or bi-lstm
23 |
24 | unknown_word = "<unk>"
25 | PAD = "<pad>"
26 | GO = "<go>"
27 | EOS = "<eos>"
28 | deal_unknown_words = True
29 |
30 | seq_max_len = 11
31 |
32 | decoder_type = "greedy" # greedy, beam
33 | beam_width = 0
34 | attention = True
35 | num_layers = 1 # 1 or 2
36 |
37 | # the following are for the graph encoding method
38 | weight_decay = 0.0000
39 | sample_size_per_layer = 4
40 | sample_layer_size = 4
41 | hidden_layer_dim = 100
42 | feature_max_len = 1
43 | feature_encode_type = "uni"
44 | # graph_encode_method = "max-pooling" # "lstm" or "max-pooling"
45 | graph_encode_direction = "bi" # "single" or "bi"
46 | concat = True
47 |
48 | encoder = "gated_gcn" # "gated_gcn" "gcn" "seq"
49 |
50 | lstm_in_gcn = "none" # before, after, none
51 |
--------------------------------------------------------------------------------
/Graph2Seq-master/main/data_collector.py:
--------------------------------------------------------------------------------
1 | import numpy as np
2 | import json
3 | import configure as conf
4 | from collections import OrderedDict
5 |
6 | def read_data(input_path, word_idx, if_increase_dict):
7 | seqs = []
8 | graphs = []
9 |
10 | if if_increase_dict:
11 | word_idx[conf.GO] = 1
12 | word_idx[conf.EOS] = 2
13 | word_idx[conf.unknown_word] = 3
14 |
15 | with open(input_path, 'r') as f:
16 | lines = f.readlines()
17 | for line in lines:
18 | line = line.strip()
19 | jo = json.loads(line, object_pairs_hook=OrderedDict)
20 | seq = jo['seq']
21 | seqs.append(seq)
22 | if if_increase_dict:
23 | for w in seq.split():
24 | if w not in word_idx:
25 | word_idx[w] = len(word_idx) + 1
26 |
27 | for id in jo['g_ids_features']:
28 | features = jo['g_ids_features'][id]
29 | for w in features.split():
30 | if w not in word_idx:
31 | word_idx[w] = len(word_idx) + 1
32 |
33 | graph = {}
34 | graph['g_ids'] = jo['g_ids']
35 | graph['g_ids_features'] = jo['g_ids_features']
36 | graph['g_adj'] = jo['g_adj']
37 | graphs.append(graph)
38 |
39 | return seqs, graphs
40 |
41 | def vectorize_data(word_idx, texts):
42 | tv = []
43 | for text in texts:
44 | stv = []
45 | for w in text.split():
46 | if w not in word_idx:
47 | stv.append(word_idx[conf.unknown_word])
48 | else:
49 | stv.append(word_idx[w])
50 | tv.append(stv)
51 | return tv
52 |
53 | def cons_batch_graph(graphs):
54 | g_ids = {}
55 | g_ids_features = {}
56 | g_fw_adj = {}
57 | g_bw_adj = {}
58 | g_nodes = []
59 |
60 | for g in graphs:
61 | ids = g['g_ids']
62 | id_adj = g['g_adj']
63 | features = g['g_ids_features']
64 |
65 | nodes = []
66 |
67 | # first add all nodes to the batch graph and build a mapping from each graph-local id
68 | # to its batch-graph id; this mapping is used when creating fw_adj and bw_adj
69 |
70 | id_gid_map = {}
71 | offset = len(g_ids.keys())
72 | for id in ids:
73 | id = int(id)
74 | g_ids[offset + id] = len(g_ids.keys())
75 | g_ids_features[offset + id] = features[str(id)]
76 | id_gid_map[id] = offset + id
77 | nodes.append(offset + id)
78 | g_nodes.append(nodes)
79 |
80 | for id in id_adj:
81 | adj = id_adj[id]
82 | id = int(id)
83 | g_id = id_gid_map[id]
84 | if g_id not in g_fw_adj:
85 | g_fw_adj[g_id] = []
86 | for t in adj:
87 | t = int(t)
88 | g_t = id_gid_map[t]
89 | g_fw_adj[g_id].append(g_t)
90 | if g_t not in g_bw_adj:
91 | g_bw_adj[g_t] = []
92 | g_bw_adj[g_t].append(g_id)
93 |
94 | node_size = len(g_ids.keys())
95 | for id in range(node_size):
96 | if id not in g_fw_adj:
97 | g_fw_adj[id] = []
98 | if id not in g_bw_adj:
99 | g_bw_adj[id] = []
100 |
101 | graph = {}
102 | graph['g_ids'] = g_ids
103 | graph['g_ids_features'] = g_ids_features
104 | graph['g_nodes'] = g_nodes
105 | graph['g_fw_adj'] = g_fw_adj
106 | graph['g_bw_adj'] = g_bw_adj
107 |
108 | return graph
109 |
110 | def vectorize_batch_graph(graph, word_idx):
111 | # vectorize the graph feature and normalize the adj info
112 | id_features = graph['g_ids_features']
113 | gv = {}
114 | nv = []
115 | word_max_len = 0
116 | for id in id_features:
117 | feature = id_features[id]
118 | word_max_len = max(word_max_len, len(feature.split()))
119 | word_max_len = min(word_max_len, conf.word_size_max)
120 |
121 | for id in graph['g_ids_features']:
122 | feature = graph['g_ids_features'][id]
123 | fv = []
124 | for token in feature.split():
125 | if len(token) == 0:
126 | continue
127 | if token in word_idx:
128 | fv.append(word_idx[token])
129 | else:
130 | fv.append(word_idx[conf.unknown_word])
131 |
132 | for _ in range(word_max_len - len(fv)):
133 | fv.append(0)
134 | fv = fv[:word_max_len]
135 | nv.append(fv)
136 |
137 | nv.append([0 for temp in range(word_max_len)])
138 | gv['g_ids_features'] = np.array(nv)
139 |
140 | g_fw_adj = graph['g_fw_adj']
141 | g_fw_adj_v = []
142 |
143 | degree_max_size = 0
144 | for id in g_fw_adj:
145 | degree_max_size = max(degree_max_size, len(g_fw_adj[id]))
146 |
147 | g_bw_adj = graph['g_bw_adj']
148 | for id in g_bw_adj:
149 | degree_max_size = max(degree_max_size, len(g_bw_adj[id]))
150 |
151 | degree_max_size = min(degree_max_size, conf.sample_size_per_layer)
152 |
153 | for id in g_fw_adj:
154 | adj = g_fw_adj[id]
155 | for _ in range(degree_max_size - len(adj)):
156 | adj.append(len(g_fw_adj.keys()))
157 | adj = adj[:degree_max_size]
158 | g_fw_adj_v.append(adj)
159 |
160 | # PAD node directs to the PAD node
161 | g_fw_adj_v.append([len(g_fw_adj.keys()) for _ in range(degree_max_size)])
162 |
163 | g_bw_adj_v = []
164 | for id in g_bw_adj:
165 | adj = g_bw_adj[id]
166 | for _ in range(degree_max_size - len(adj)):
167 | adj.append(len(g_bw_adj.keys()))
168 | adj = adj[:degree_max_size]
169 | g_bw_adj_v.append(adj)
170 |
171 | # PAD node directs to the PAD node
172 | g_bw_adj_v.append([len(g_bw_adj.keys()) for _ in range(degree_max_size)])
173 |
174 | gv['g_ids'] = graph['g_ids']
175 | gv['g_nodes'] = np.array(graph['g_nodes'])
176 | gv['g_bw_adj'] = np.array(g_bw_adj_v)
177 | gv['g_fw_adj'] = np.array(g_fw_adj_v)
178 |
179 | return gv
180 |
--------------------------------------------------------------------------------
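`cons_batch_graph` merges a list of graphs into one batch graph by offsetting each graph's local node ids by the number of nodes already batched. A minimal pure-Python sketch of that offset bookkeeping on toy data (forward adjacency only; field shapes follow `read_data`'s output):

```python
# Toy input in the shape read_data produces: two graphs with local string ids.
graphs = [
    {"g_ids": {"0": 0, "1": 1, "2": 2},
     "g_adj": {"0": ["1"], "1": ["2"], "2": []}},
    {"g_ids": {"0": 0, "1": 1},
     "g_adj": {"0": ["1"], "1": []}},
]

g_ids, g_fw_adj = {}, {}
for g in graphs:
    offset = len(g_ids)                  # nodes already in the batch graph
    for local_id in g["g_ids"]:
        g_ids[offset + int(local_id)] = len(g_ids)
    for src, dsts in g["g_adj"].items():
        g_fw_adj.setdefault(offset + int(src), [])
        for dst in dsts:
            g_fw_adj[offset + int(src)].append(offset + int(dst))
```

The second graph's nodes land at batch ids 3 and 4, so its edge 0→1 becomes 3→4, exactly the translation `id_gid_map` performs above.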
/Graph2Seq-master/main/data_creator.py:
--------------------------------------------------------------------------------
1 | """this python file is used to construct the fake data for the model"""
2 | import random
3 | import json
4 | import numpy as np
5 |
6 | import networkx as nx  # note: this file uses networkx 1.x APIs (graph.edge, adjacency_list)
7 | import networkx.algorithms as nxalg
8 |
9 | def create_random_graph(type, filePath, numberOfCase, graph_scale):
10 | """
11 |
12 | :param type: the graph type
13 | :param filePath: the output file path
14 | :param numberOfCase: the number of examples
15 | :return:
16 | """
17 | with open(filePath, "w+") as f:
18 | degree = 0.0
19 | for _ in range(numberOfCase):
20 | info = {}
21 | graph_node_size = graph_scale
22 | edge_prob = 0.3
23 |
24 | while True:
25 | edge_count = 0.0
26 | if type == "random":
27 | graph = nx.gnp_random_graph(graph_node_size, edge_prob, directed=True)
28 | for id in graph.edge:
29 | edge_count += len(graph.edge[id])
30 | start = random.randint(0, graph_node_size - 1)
31 | adj = nx.shortest_path(graph, start)
32 |
33 | max_len = 0
34 | path = []
35 | paths = []
36 | for neighbor in adj:
37 | if len(adj[neighbor]) > max_len and neighbor != start:
38 | paths = []
39 | max_len = len(adj[neighbor])
40 | path = adj[neighbor]
41 | end = neighbor
42 | for p in nx.all_shortest_paths(graph, start, end):
43 | paths.append(p)
44 |
45 | if len(path) > 0 and path[0] == start and len(path) == 3 and len(paths) == 1:
46 | degree += edge_count / graph_node_size
47 | break
48 |
49 | elif type == "no-cycle":
50 | graph = nx.DiGraph()
51 | for i in range(graph_node_size):
52 | nodes = graph.nodes()
53 | if len(nodes) == 0:
54 | graph.add_node(i)
55 | else:
56 | size = random.randint(1, min(i, 2))
57 | fathers = random.sample(range(0, i), size)
58 | for father in fathers:
59 | graph.add_edge(father, i)
60 | for id in graph.edge:
61 | edge_count += len(graph.edge[id])
62 | start = 0
63 | end = graph_node_size-1
64 | path = nx.shortest_path(graph, 0, graph_node_size-1)
65 | paths = [p for p in nx.all_shortest_paths(graph, 0, graph_node_size-1)]
66 | if len(path) >= 4 and len(paths) == 1:
67 | degree += edge_count / graph_node_size
68 | break
69 |
70 | elif type == "baseline":
71 | num_nodes = graph_node_size
72 | graph = nx.random_graphs.connected_watts_strogatz_graph(num_nodes, 3, edge_prob)
73 | for id in graph.edge:
74 | edge_count += len(graph.edge[id])
75 | start, end = np.random.randint(num_nodes, size=2)
76 |
77 | if start == end:
78 | continue # reject trivial paths
79 |
80 | paths = list(nxalg.all_shortest_paths(graph, source=start, target=end))
81 |
82 | if len(paths) > 1:
83 | continue # reject when more than one shortest path
84 |
85 | path = paths[0]
86 |
87 | if len(path) != 4:
88 | continue
89 | degree += edge_count / graph_node_size
90 | break
91 |
92 | adj_list = graph.adjacency_list()
93 |
94 |
95 | g_ids = {}
96 | g_ids_features = {}
97 | g_adj = {}
98 | for i in range(graph_node_size):
99 | g_ids[i] = i
100 | if i == start:
101 | g_ids_features[i] = "START"
102 | elif i == end:
103 | g_ids_features[i] = "END"
104 | else:
105 | # g_ids_features[i] = str(i+10)
106 | g_ids_features[i] = str(random.randint(1, 15))
107 | g_adj[i] = adj_list[i]
108 |
109 | # print start, end, path
110 | text = ""
111 | for id in path:
112 | text += g_ids_features[id] + " "
113 |
114 | info["seq"] = text.strip()
115 | info["g_ids"] = g_ids
116 | info['g_ids_features'] = g_ids_features
117 | info['g_adj'] = g_adj
118 | f.write(json.dumps(info)+"\n")
119 |
120 | print("average degree in the graph is: {}".format(degree / numberOfCase))
121 |
122 | if __name__ == "__main__":
123 | create_random_graph("no-cycle", "data/no_cycle/train.data", 1000, graph_scale=100)
124 | create_random_graph("no-cycle", "data/no_cycle/dev.data", 1000, graph_scale=100)
125 | create_random_graph("no-cycle", "data/no_cycle/test.data", 1000, graph_scale=100)
126 |
127 |
128 |
--------------------------------------------------------------------------------
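Each line of the generated `*.data` files is one JSON object with the fields written above: `seq` (the shortest-path labels), `g_ids`, `g_ids_features`, and `g_adj`. A hand-written record in that shape, parsed the same way `read_data` does (the values are illustrative, not real generated data):

```python
import json
from collections import OrderedDict

# A hand-written record with the same fields data_creator emits.
line = json.dumps({
    "seq": "START 7 END",
    "g_ids": {"0": 0, "1": 1, "2": 2},
    "g_ids_features": {"0": "START", "1": "7", "2": "END"},
    "g_adj": {"0": [1], "1": [2], "2": []},
})

jo = json.loads(line, object_pairs_hook=OrderedDict)  # same parsing as read_data
# The target sequence is the concatenation of node features along the path.
path_labels = [jo["g_ids_features"][str(i)] for i in (0, 1, 2)]
```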
/Graph2Seq-master/main/evaluator.py:
--------------------------------------------------------------------------------
1 |
2 |
3 | def evaluate(type, golds, preds):
4 | assert len(golds) == len(preds)
5 | if type == "acc":
6 | correct = 0.0
7 | for _ in range(len(golds)):
8 | gold = golds[_]
9 | gold_str = " ".join(gold).strip()
10 |
11 | pred = preds[_]
12 | pred_str = " ".join(pred).strip()
13 |
14 | if gold_str == pred_str:
15 | correct += 1.0
16 | return correct/len(preds)
--------------------------------------------------------------------------------
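`evaluate` computes exact-match accuracy: a prediction scores only if its full token sequence equals the gold sequence. A usage sketch with the function condensed from the file above:

```python
def evaluate(type, golds, preds):
    """Condensed copy of evaluator.evaluate: exact-match accuracy."""
    assert len(golds) == len(preds)
    if type == "acc":
        correct = 0.0
        for gold, pred in zip(golds, preds):
            if " ".join(gold).strip() == " ".join(pred).strip():
                correct += 1.0
        return correct / len(preds)

golds = [["START", "7", "END"], ["START", "3", "END"]]
preds = [["START", "7", "END"], ["START", "9", "END"]]
acc = evaluate("acc", golds, preds)  # one exact match out of two
```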
/Graph2Seq-master/main/helpers.py:
--------------------------------------------------------------------------------
1 | import numpy as np
2 |
3 | def batch(inputs, max_sequence_length=None):
4 | """
5 | Args:
6 | inputs:
7 | list of sentences (integer lists)
8 | max_sequence_length:
9 | integer specifying how large the `max_time` dimension should be.
10 | If None, the maximum sequence length is used
11 |
12 | Outputs:
13 | inputs_batch_major:
14 | input sentences transformed into a batch-major matrix
15 | (shape [batch_size, max_time]) padded with 0s
16 | sequence_lengths:
17 | batch-sized list of integers specifying amount of active
18 | time steps in each input sequence
19 | """
20 | sequence_lengths = [len(seq) for seq in inputs]
21 | batch_size = len(inputs)
22 |
23 | if max_sequence_length is None:
24 | max_sequence_length = max(sequence_lengths)
25 |
26 | inputs_batch_major = np.zeros(shape=[batch_size, max_sequence_length], dtype=np.int32) # == PAD
27 |
28 | for i, seq in enumerate(inputs):
29 | for j, element in enumerate(seq):
30 | inputs_batch_major[i, j] = element
31 |
32 | loss_weights = []
33 | for _ in range(len(inputs)):
34 | weights = []
35 | for __ in range(len(inputs[_])+1):
36 | weights.append(1)
37 | for __ in range(max_sequence_length-len(inputs[_])):
38 | weights.append(0)
39 | loss_weights.append(weights)
40 |
41 | return inputs_batch_major, sequence_lengths, loss_weights
--------------------------------------------------------------------------------
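`batch` pads a list of integer sequences into a batch-major matrix and builds loss weights with one entry per real token plus one for the EOS step. A condensed copy with a small example:

```python
import numpy as np

def batch(inputs, max_sequence_length=None):
    """Condensed copy of helpers.batch for illustration."""
    sequence_lengths = [len(seq) for seq in inputs]
    if max_sequence_length is None:
        max_sequence_length = max(sequence_lengths)
    major = np.zeros((len(inputs), max_sequence_length), dtype=np.int32)  # 0 == PAD
    for i, seq in enumerate(inputs):
        major[i, :len(seq)] = seq
    # one weight per real token plus one for EOS, zeros over padding
    loss_weights = [[1] * (len(seq) + 1) + [0] * (max_sequence_length - len(seq))
                    for seq in inputs]
    return major, sequence_lengths, loss_weights

m, lens, w = batch([[4, 5, 6], [7]])
```

Note the weight rows are one entry longer than `max_sequence_length`, matching the original's `len(inputs[_]) + 1` ones.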
/Graph2Seq-master/main/inits.py:
--------------------------------------------------------------------------------
1 | import tensorflow as tf
2 | import numpy as np
3 |
4 | # DISCLAIMER:
5 | # Parts of this code file are derived from
6 | # https://github.com/tkipf/gcn
7 | # which is under the same MIT license as GraphSAGE
8 |
9 | def uniform(shape, scale=0.05, name=None):
10 | """Uniform init."""
11 | initial = tf.random_uniform(shape, minval=-scale, maxval=scale, dtype=tf.float32)
12 | return tf.Variable(initial, name=name)
13 |
14 |
15 | def glorot(shape, name=None):
16 | """Glorot & Bengio (AISTATS 2010) init."""
17 | init_range = np.sqrt(6.0/(shape[0]+shape[1]))
18 | initial = tf.random_uniform(shape, minval=-init_range, maxval=init_range, dtype=tf.float32)
19 | return tf.Variable(initial, name=name)
20 |
21 |
22 | def zeros(shape, name=None):
23 | """All zeros."""
24 | initial = tf.zeros(shape, dtype=tf.float32)
25 | return tf.Variable(initial, name=name)
26 |
27 | def ones(shape, name=None):
28 | """All ones."""
29 | initial = tf.ones(shape, dtype=tf.float32)
30 | return tf.Variable(initial, name=name)
31 |
--------------------------------------------------------------------------------
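`glorot` draws weights uniformly from ±sqrt(6 / (fan_in + fan_out)), the Glorot & Bengio initialization. The same range computed in NumPy (a framework-free sketch, not the repo's code):

```python
import numpy as np

def glorot_range(shape):
    """Bound used by inits.glorot: sqrt(6 / (fan_in + fan_out))."""
    return np.sqrt(6.0 / (shape[0] + shape[1]))

r = glorot_range((100, 200))
w = np.random.uniform(-r, r, size=(100, 200))  # all entries land inside [-r, r]
```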
/Graph2Seq-master/main/layer_utils.py:
--------------------------------------------------------------------------------
1 | import tensorflow as tf
2 | from tensorflow.python.ops import nn_ops
3 |
4 | def my_lstm_layer(input_reps, lstm_dim, input_lengths=None, scope_name=None, reuse=False, is_training=True,
5 | dropout_rate=0.2, use_cudnn=True):
6 | '''
7 | :param inputs: [batch_size, seq_len, feature_dim]
8 | :param lstm_dim:
9 | :param scope_name:
10 | :param reuse:
11 | :param is_training:
12 | :param dropout_rate:
13 | :return:
14 | '''
15 | input_reps = dropout_layer(input_reps, dropout_rate, is_training=is_training)
16 | with tf.variable_scope(scope_name, reuse=reuse):
17 | if use_cudnn:
18 | inputs = tf.transpose(input_reps, [1, 0, 2])
19 | lstm = tf.contrib.cudnn_rnn.CudnnLSTM(1, lstm_dim, direction="bidirectional",
20 | name="{}_cudnn_bi_lstm".format(scope_name), dropout=dropout_rate if is_training else 0)
21 | outputs, _ = lstm(inputs)
22 | outputs = tf.transpose(outputs, [1, 0, 2])
23 | f_rep = outputs[:, :, 0:lstm_dim]
24 | b_rep = outputs[:, :, lstm_dim:2*lstm_dim]
25 | else:
26 | context_lstm_cell_fw = tf.nn.rnn_cell.BasicLSTMCell(lstm_dim)
27 | context_lstm_cell_bw = tf.nn.rnn_cell.BasicLSTMCell(lstm_dim)
28 | if is_training:
29 | context_lstm_cell_fw = tf.nn.rnn_cell.DropoutWrapper(context_lstm_cell_fw, output_keep_prob=(1 - dropout_rate))
30 | context_lstm_cell_bw = tf.nn.rnn_cell.DropoutWrapper(context_lstm_cell_bw, output_keep_prob=(1 - dropout_rate))
31 | context_lstm_cell_fw = tf.nn.rnn_cell.MultiRNNCell([context_lstm_cell_fw])
32 | context_lstm_cell_bw = tf.nn.rnn_cell.MultiRNNCell([context_lstm_cell_bw])
33 |
34 | (f_rep, b_rep), _ = tf.nn.bidirectional_dynamic_rnn(
35 | context_lstm_cell_fw, context_lstm_cell_bw, input_reps, dtype=tf.float32,
36 | sequence_length=input_lengths) # [batch_size, question_len, context_lstm_dim]
37 | outputs = tf.concat(axis=2, values=[f_rep, b_rep])
38 | return (f_rep, b_rep, outputs)
39 |
40 | def dropout_layer(input_reps, dropout_rate, is_training=True):
41 | if is_training:
42 | output_repr = tf.nn.dropout(input_reps, (1 - dropout_rate))
43 | else:
44 | output_repr = input_reps
45 | return output_repr
46 |
47 | def cosine_distance(y1, y2, cosine_norm=True, eps=1e-6):  # returns cosine similarity, despite the name
48 | # cosine_norm = True
49 | # y1 [....,a, 1, d]
50 | # y2 [....,1, b, d]
51 | cosine_numerator = tf.reduce_sum(tf.multiply(y1, y2), axis=-1)
52 | if not cosine_norm:
53 | return tf.tanh(cosine_numerator)
54 | y1_norm = tf.sqrt(tf.maximum(tf.reduce_sum(tf.square(y1), axis=-1), eps))
55 | y2_norm = tf.sqrt(tf.maximum(tf.reduce_sum(tf.square(y2), axis=-1), eps))
56 | return cosine_numerator / y1_norm / y2_norm
57 |
58 | def euclidean_distance(y1, y2, eps=1e-6):
59 | distance = tf.sqrt(tf.maximum(tf.reduce_sum(tf.square(y1 - y2), axis=-1), eps))
60 | return distance
61 |
62 | def cross_entropy(logits, truth, mask=None):
63 | # logits: [batch_size, passage_len]
64 | # truth: [batch_size, passage_len]
65 | # mask: [batch_size, passage_len]
66 | if mask is not None: logits = tf.multiply(logits, mask)
67 | xdev = tf.subtract(logits, tf.expand_dims(tf.reduce_max(logits, 1), -1))
68 | log_predictions = tf.subtract(xdev, tf.expand_dims(tf.log(tf.reduce_sum(tf.exp(xdev),-1)),-1))
69 | result = tf.multiply(truth, log_predictions) # [batch_size, passage_len]
70 | if mask is not None: result = tf.multiply(result, mask) # [batch_size, passage_len]
71 | return tf.multiply(-1.0,tf.reduce_sum(result, -1)) # [batch_size]
72 |
73 | def projection_layer(in_val, input_size, output_size, activation_func=tf.tanh, scope=None):
74 | # in_val: [batch_size, passage_len, dim]
75 | input_shape = tf.shape(in_val)
76 | batch_size = input_shape[0]
77 | passage_len = input_shape[1]
78 | # feat_dim = input_shape[2]
79 | in_val = tf.reshape(in_val, [batch_size * passage_len, input_size])
80 | with tf.variable_scope(scope or "projection_layer"):
81 | full_w = tf.get_variable("full_w", [input_size, output_size], dtype=tf.float32)
82 | full_b = tf.get_variable("full_b", [output_size], dtype=tf.float32)
83 | outputs = activation_func(tf.nn.xw_plus_b(in_val, full_w, full_b))
84 | outputs = tf.reshape(outputs, [batch_size, passage_len, output_size])
85 | return outputs # [batch_size, passage_len, output_size]
86 |
87 | def highway_layer(in_val, output_size, activation_func=tf.tanh, scope=None):
88 | # in_val: [batch_size, passage_len, dim]
89 | input_shape = tf.shape(in_val)
90 | batch_size = input_shape[0]
91 | passage_len = input_shape[1]
92 | # feat_dim = input_shape[2]
93 | in_val = tf.reshape(in_val, [batch_size * passage_len, output_size])
94 | with tf.variable_scope(scope or "highway_layer"):
95 | highway_w = tf.get_variable("highway_w", [output_size, output_size], dtype=tf.float32)
96 | highway_b = tf.get_variable("highway_b", [output_size], dtype=tf.float32)
97 | full_w = tf.get_variable("full_w", [output_size, output_size], dtype=tf.float32)
98 | full_b = tf.get_variable("full_b", [output_size], dtype=tf.float32)
99 | trans = activation_func(tf.nn.xw_plus_b(in_val, full_w, full_b))
100 | gate = tf.nn.sigmoid(tf.nn.xw_plus_b(in_val, highway_w, highway_b))
101 | outputs = tf.add(tf.multiply(trans, gate), tf.multiply(in_val, tf.subtract(1.0, gate)), "y")
102 | outputs = tf.reshape(outputs, [batch_size, passage_len, output_size])
103 | return outputs
104 |
105 | def multi_highway_layer(in_val, output_size, num_layers, activation_func=tf.tanh, scope_name=None, reuse=False):
106 | with tf.variable_scope(scope_name, reuse=reuse):
107 | for i in range(num_layers):
108 | cur_scope_name = scope_name + "-{}".format(i)
109 | in_val = highway_layer(in_val, output_size,activation_func=activation_func, scope=cur_scope_name)
110 | return in_val
111 |
112 | def collect_representation(representation, positions):
113 | # representation: [batch_size, node_num, feature_dim]
114 | # positions: [batch_size, neigh_num]
115 | return collect_probs(representation, positions)
116 |
117 | def collect_final_step_of_lstm(lstm_representation, lengths):
118 | # lstm_representation: [batch_size, passage_length, dim]
119 | # lengths: [batch_size]
120 | lengths = tf.maximum(lengths, tf.zeros_like(lengths, dtype=tf.int32))
121 |
122 | batch_size = tf.shape(lengths)[0]
123 | batch_nums = tf.range(0, limit=batch_size) # shape (batch_size)
124 | indices = tf.stack((batch_nums, lengths), axis=1) # shape (batch_size, 2)
125 | result = tf.gather_nd(lstm_representation, indices, name='last-forward-lstm')
126 | return result # [batch_size, dim]
127 |
128 | def collect_probs(probs, positions):
129 | # probs [batch_size, chunks_size]
130 | # positions [batch_size, pair_size]
131 | batch_size = tf.shape(probs)[0]
132 | pair_size = tf.shape(positions)[1]
133 | batch_nums = tf.range(0, limit=batch_size) # shape (batch_size)
134 | batch_nums = tf.reshape(batch_nums, shape=[-1, 1]) # [batch_size, 1]
135 | batch_nums = tf.tile(batch_nums, multiples=[1, pair_size]) # [batch_size, pair_size]
136 |
137 | indices = tf.stack((batch_nums, positions), axis=2) # shape (batch_size, pair_size, 2)
138 | pair_probs = tf.gather_nd(probs, indices)
139 | # pair_probs = tf.reshape(pair_probs, shape=[batch_size, pair_size])
140 | return pair_probs
141 |
142 |
143 | def calcuate_attention(in_value_1, in_value_2, feature_dim1, feature_dim2, scope_name='att',
144 | att_type='symmetric', att_dim=20, remove_diagnoal=False, mask1=None, mask2=None, is_training=False, dropout_rate=0.2):
145 | input_shape = tf.shape(in_value_1)
146 | batch_size = input_shape[0]
147 | len_1 = input_shape[1]
148 | len_2 = tf.shape(in_value_2)[1]
149 |
150 | in_value_1 = dropout_layer(in_value_1, dropout_rate, is_training=is_training)
151 | in_value_2 = dropout_layer(in_value_2, dropout_rate, is_training=is_training)
152 | with tf.variable_scope(scope_name):
153 | # calculate attention ==> a: [batch_size, len_1, len_2]
154 | atten_w1 = tf.get_variable("atten_w1", [feature_dim1, att_dim], dtype=tf.float32)
155 | if feature_dim1 == feature_dim2: atten_w2 = atten_w1
156 | else: atten_w2 = tf.get_variable("atten_w2", [feature_dim2, att_dim], dtype=tf.float32)
157 | atten_value_1 = tf.matmul(tf.reshape(in_value_1, [batch_size * len_1, feature_dim1]), atten_w1) # [batch_size*len_1, feature_dim]
158 | atten_value_1 = tf.reshape(atten_value_1, [batch_size, len_1, att_dim])
159 | atten_value_2 = tf.matmul(tf.reshape(in_value_2, [batch_size * len_2, feature_dim2]), atten_w2) # [batch_size*len_2, feature_dim]
160 | atten_value_2 = tf.reshape(atten_value_2, [batch_size, len_2, att_dim])
161 |
162 |
163 | if att_type == 'additive':
164 | atten_b = tf.get_variable("atten_b", [att_dim], dtype=tf.float32)
165 | atten_v = tf.get_variable("atten_v", [1, att_dim], dtype=tf.float32)
166 | atten_value_1 = tf.expand_dims(atten_value_1, axis=2, name="atten_value_1") # [batch_size, len_1, 'x', feature_dim]
167 | atten_value_2 = tf.expand_dims(atten_value_2, axis=1, name="atten_value_2") # [batch_size, 'x', len_2, feature_dim]
168 | atten_value = atten_value_1 + atten_value_2 # + tf.expand_dims(tf.expand_dims(tf.expand_dims(atten_b, axis=0), axis=0), axis=0)
169 | atten_value = nn_ops.bias_add(atten_value, atten_b)
170 | atten_value = tf.tanh(atten_value) # [batch_size, len_1, len_2, feature_dim]
171 | atten_value = tf.reshape(atten_value, [-1, att_dim]) * atten_v # tf.expand_dims(atten_v, axis=0) # [batch_size*len_1*len_2, feature_dim]
172 | atten_value = tf.reduce_sum(atten_value, axis=-1)
173 | atten_value = tf.reshape(atten_value, [batch_size, len_1, len_2])
174 | else:
175 | atten_value_1 = tf.tanh(atten_value_1)
176 | # atten_value_1 = tf.nn.relu(atten_value_1)
177 | atten_value_2 = tf.tanh(atten_value_2)
178 | # atten_value_2 = tf.nn.relu(atten_value_2)
179 | diagnoal_params = tf.get_variable("diagnoal_params", [1, 1, att_dim], dtype=tf.float32)
180 | atten_value_1 = atten_value_1 * diagnoal_params
181 | atten_value = tf.matmul(atten_value_1, atten_value_2, transpose_b=True) # [batch_size, len_1, len_2]
182 |
183 | # normalize
184 | if remove_diagnoal:
185 | diagnoal = tf.ones([len_1], tf.float32) # [len1]
186 | diagnoal = 1.0 - tf.diag(diagnoal) # [len1, len1]
187 | diagnoal = tf.expand_dims(diagnoal, axis=0) # ['x', len1, len1]
188 | atten_value = atten_value * diagnoal
189 | if mask1 is not None: atten_value = tf.multiply(atten_value, tf.expand_dims(mask1, axis=-1))
190 | if mask2 is not None: atten_value = tf.multiply(atten_value, tf.expand_dims(mask2, axis=1))
191 | atten_value = tf.nn.softmax(atten_value, name='atten_value') # [batch_size, len_1, len_2]
192 | if remove_diagnoal: atten_value = atten_value * diagnoal
193 | if mask1 is not None: atten_value = tf.multiply(atten_value, tf.expand_dims(mask1, axis=-1))
194 | if mask2 is not None: atten_value = tf.multiply(atten_value, tf.expand_dims(mask2, axis=1))
195 |
196 | return atten_value
197 |
198 | def weighted_sum(atten_scores, in_values):
199 | '''
200 |
201 | :param atten_scores: # [batch_size, len1, len2]
202 | :param in_values: [batch_size, len2, dim]
203 | :return:
204 | '''
205 | return tf.matmul(atten_scores, in_values)
206 |
207 | def cal_relevancy_matrix(in_question_repres, in_passage_repres):
208 | in_question_repres_tmp = tf.expand_dims(in_question_repres, 1) # [batch_size, 1, question_len, dim]
209 | in_passage_repres_tmp = tf.expand_dims(in_passage_repres, 2) # [batch_size, passage_len, 1, dim]
210 | relevancy_matrix = cosine_distance(in_question_repres_tmp,in_passage_repres_tmp) # [batch_size, passage_len, question_len]
211 | return relevancy_matrix
212 |
213 | def mask_relevancy_matrix(relevancy_matrix, question_mask, passage_mask):
214 | # relevancy_matrix: [batch_size, passage_len, question_len]
215 | # question_mask: [batch_size, question_len]
216 | # passage_mask: [batch_size, passage_len]
217 | if question_mask is not None:
218 | relevancy_matrix = tf.multiply(relevancy_matrix, tf.expand_dims(question_mask, 1))
219 | relevancy_matrix = tf.multiply(relevancy_matrix, tf.expand_dims(passage_mask, 2))
220 | return relevancy_matrix
221 |
222 | def compute_gradients(tensor, var_list):
223 | grads = tf.gradients(tensor, var_list)
224 | return [grad if grad is not None else tf.zeros_like(var) for var, grad in zip(var_list, grads)]
--------------------------------------------------------------------------------
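`cosine_distance` (which actually computes a similarity) relies on broadcasting a `[..., a, 1, d]` tensor against `[..., 1, b, d]`, which is how `cal_relevancy_matrix` gets an all-pairs `[batch, passage_len, question_len]` matrix from two `expand_dims` calls. A NumPy analogue (`cosine_sim` is an illustrative name, not the repo's API):

```python
import numpy as np

def cosine_sim(y1, y2, eps=1e-6):
    """NumPy analogue of layer_utils.cosine_distance."""
    num = np.sum(y1 * y2, axis=-1)
    n1 = np.sqrt(np.maximum(np.sum(np.square(y1), axis=-1), eps))
    n2 = np.sqrt(np.maximum(np.sum(np.square(y2), axis=-1), eps))
    return num / n1 / n2

q = np.random.rand(2, 4, 3)   # [batch, question_len, dim]
p = np.random.rand(2, 5, 3)   # [batch, passage_len, dim]
# cal_relevancy_matrix's expand_dims trick, written with None-indexing:
rel = cosine_sim(q[:, None, :, :], p[:, :, None, :])  # [batch, passage_len, question_len]
```

Clamping the squared norms at `eps` before the square root avoids division by zero for all-zero (padding) vectors.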
/Graph2Seq-master/main/layers.py:
--------------------------------------------------------------------------------
1 | import tensorflow as tf
2 | from inits import zeros
3 | import configure as conf
4 |
5 | _LAYER_UIDS = {}
6 |
7 | def get_layer_uid(layer_name=''):
8 | """Helper function, assigns unique layer IDs."""
9 | if layer_name not in _LAYER_UIDS:
10 | _LAYER_UIDS[layer_name] = 1
11 | return 1
12 | else:
13 | _LAYER_UIDS[layer_name] += 1
14 | return _LAYER_UIDS[layer_name]
15 |
16 | class Layer(object):
17 | """Base layer class. Defines basic API for all layer objects.
18 | Implementation inspired by keras (http://keras.io).
19 | # Properties
20 | name: String, defines the variable scope of the layer.
21 | logging: Boolean, switches Tensorflow histogram logging on/off
22 |
23 | # Methods
24 | _call(inputs): Defines computation graph of layer
25 | (i.e. takes input, returns output)
26 | __call__(inputs): Wrapper for _call()
27 | """
28 |
29 | def __init__(self, **kwargs):
30 | allowed_kwargs = {'name', 'logging', 'model_size'}
31 | for kwarg in kwargs.keys():
32 | assert kwarg in allowed_kwargs, 'Invalid keyword argument: ' + kwarg
33 | name = kwargs.get('name')
34 | if not name:
35 | layer = self.__class__.__name__.lower()
36 | name = layer + '_' + str(get_layer_uid(layer))
37 | self.name = name
38 | self.vars = {}
39 | logging = kwargs.get('logging', False)
40 | self.logging = logging
41 | self.sparse_inputs = False
42 |
43 | def _call(self, inputs):
44 | return inputs
45 |
46 | def __call__(self, inputs):
47 | with tf.name_scope(self.name):
48 | outputs = self._call(inputs)
49 | return outputs
50 |
51 | class Dense(Layer):
52 | """Dense layer."""
53 | def __init__(self, input_dim, output_dim, dropout=0.,
54 | act=tf.nn.relu, placeholders=None, bias=True, featureless=False,
55 | sparse_inputs=False, **kwargs):
56 | super(Dense, self).__init__(**kwargs)
57 |
58 | self.dropout = dropout
59 |
60 | self.act = act
61 | self.featureless = featureless
62 | self.bias = bias
63 | self.input_dim = input_dim
64 | self.output_dim = output_dim
65 |
66 | # helper variable for sparse dropout
67 | self.sparse_inputs = sparse_inputs
68 | if sparse_inputs:
69 | self.num_features_nonzero = placeholders['num_features_nonzero']
70 |
71 | with tf.variable_scope(self.name + '_vars'):
72 | self.vars['weights'] = tf.get_variable('weights', shape=(input_dim, output_dim),
73 | dtype=tf.float32,
74 | initializer=tf.contrib.layers.xavier_initializer(),
75 | regularizer=tf.contrib.layers.l2_regularizer(conf.weight_decay))
76 | if self.bias:
77 | self.vars['bias'] = zeros([output_dim], name='bias')
78 |
79 | def _call(self, inputs):
80 | x = inputs
81 |
82 | # x = tf.nn.dropout(x, self.dropout)
83 |
84 | # transform
85 | output = tf.matmul(x, self.vars['weights'])
86 |
87 | # bias
88 | if self.bias:
89 | output += self.vars['bias']
90 |
91 | return self.act(output)
--------------------------------------------------------------------------------
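`get_layer_uid` keeps a per-name counter so that unnamed layers get unique variable scopes (`dense_1`, `dense_2`, ...). A self-contained copy showing that numbering:

```python
_LAYER_UIDS = {}

def get_layer_uid(layer_name=''):
    """Copy of layers.get_layer_uid: a per-name incrementing counter."""
    if layer_name not in _LAYER_UIDS:
        _LAYER_UIDS[layer_name] = 1
        return 1
    _LAYER_UIDS[layer_name] += 1
    return _LAYER_UIDS[layer_name]

# Three anonymous "dense" layers receive consecutive scope names.
names = ["dense_" + str(get_layer_uid("dense")) for _ in range(3)]
```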
/Graph2Seq-master/main/loaderAndwriter.py:
--------------------------------------------------------------------------------
1 | import codecs
2 | import numpy as np
3 | import os
4 |
5 | def load_word_embedding(embedding_path):
6 | with codecs.open(embedding_path, 'r', 'utf-8') as f:
7 | word_idx = {}
8 | vecs = []
9 | for line in f:
10 | line = line.strip()
11 | if len(line.split(" ")) == 2:
12 | continue
13 | info = line.split(' ')
14 | word = info[0]
15 | vec = [float(v) for v in info[1:]]
16 | if len(vec) != 300:
17 | continue
18 | vecs.append(vec)
19 |             word_idx[word] = len(word_idx) + 1  # indices start at 1; index 0 is reserved for PAD
20 |
21 | return word_idx, np.array(vecs)
22 |
23 | def write_word_idx(word_idx, path):
24 |     dir_name = os.path.dirname(path)
25 |     if dir_name and not os.path.exists(dir_name):
26 |         os.makedirs(dir_name)
27 |
28 | with codecs.open(path, 'w', 'utf-8') as f:
29 | for word in word_idx:
30 | f.write(word+" "+str(word_idx[word])+'\n')
31 |
32 | def read_word_idx_from_file(path):
33 | word_idx = {}
34 | with codecs.open(path, 'r', 'utf-8') as f:
35 | lines = f.readlines()
36 | for line in lines:
37 | info = line.strip().split(" ")
38 | if len(info) != 2:
39 | word_idx[' '] = int(info[0])
40 | else:
41 | word_idx[info[0]] = int(info[1])
42 | return word_idx
43 |
--------------------------------------------------------------------------------
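A quick round-trip sketch of the `word.idx` file format handled by `write_word_idx` and `read_word_idx_from_file` above, re-implemented minimally here so it runs standalone (the real functions use `codecs` with explicit UTF-8; the format is one `word index` pair per line):

```python
import os
import tempfile

def write_idx(word_idx, path):
    # one "word index" pair per line, mirroring write_word_idx above
    with open(path, 'w', encoding='utf-8') as f:
        for word, idx in word_idx.items():
            f.write(word + " " + str(idx) + "\n")

def read_idx(path):
    # mirrors read_word_idx_from_file, including the whitespace-word fallback
    word_idx = {}
    with open(path, 'r', encoding='utf-8') as f:
        for line in f:
            info = line.strip().split(" ")
            if len(info) != 2:
                word_idx[' '] = int(info[0])
            else:
                word_idx[info[0]] = int(info[1])
    return word_idx

path = os.path.join(tempfile.mkdtemp(), "word.idx")
write_idx({"hello": 1, "world": 2}, path)
assert read_idx(path) == {"hello": 1, "world": 2}
```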
/Graph2Seq-master/main/match_utils.py:
--------------------------------------------------------------------------------
1 | import tensorflow as tf
2 | import layer_utils
3 |
4 | eps = 1e-6
5 |
6 |
7 | def cosine_distance(y1, y2):  # note: despite the name, this returns cosine similarity
8 | # y1 [....,a, 1, d]
9 | # y2 [....,1, b, d]
10 | cosine_numerator = tf.reduce_sum(tf.multiply(y1, y2), axis=-1)
11 | y1_norm = tf.sqrt(tf.maximum(tf.reduce_sum(tf.square(y1), axis=-1), eps))
12 | y2_norm = tf.sqrt(tf.maximum(tf.reduce_sum(tf.square(y2), axis=-1), eps))
13 | return cosine_numerator / y1_norm / y2_norm
14 |
15 |
16 | def cal_relevancy_matrix(in_question_repres, in_passage_repres):
17 | in_question_repres_tmp = tf.expand_dims(in_question_repres, 1) # [batch_size, 1, question_len, dim]
18 | in_passage_repres_tmp = tf.expand_dims(in_passage_repres, 2) # [batch_size, passage_len, 1, dim]
19 | relevancy_matrix = cosine_distance(in_question_repres_tmp,
20 | in_passage_repres_tmp) # [batch_size, passage_len, question_len]
21 | return relevancy_matrix
22 |
23 |
24 | def mask_relevancy_matrix(relevancy_matrix, question_mask, passage_mask):
25 | # relevancy_matrix: [batch_size, passage_len, question_len]
26 | # question_mask: [batch_size, question_len]
27 |     # passage_mask: [batch_size, passage_len]
28 | relevancy_matrix = tf.multiply(relevancy_matrix, tf.expand_dims(question_mask, 1))
29 | relevancy_matrix = tf.multiply(relevancy_matrix, tf.expand_dims(passage_mask, 2))
30 | return relevancy_matrix
31 |
32 |
33 | def multi_perspective_expand_for_3D(in_tensor, decompose_params):
34 | in_tensor = tf.expand_dims(in_tensor, axis=2) # [batch_size, passage_len, 'x', dim]
35 |     decompose_params = tf.expand_dims(tf.expand_dims(decompose_params, axis=0), axis=0) # [1, 1, decompose_dim, dim]
36 |     return tf.multiply(in_tensor, decompose_params) # [batch_size, passage_len, decompose_dim, dim]
37 |
38 |
39 | def multi_perspective_expand_for_2D(in_tensor, decompose_params):
40 | in_tensor = tf.expand_dims(in_tensor, axis=1) # [batch_size, 'x', dim]
41 |     decompose_params = tf.expand_dims(decompose_params, axis=0) # [1, decompose_dim, dim]
42 |     return tf.multiply(in_tensor, decompose_params) # [batch_size, decompose_dim, dim]
43 |
44 |
45 | def cal_maxpooling_matching(passage_rep, question_rep, decompose_params):
46 |     # passage_rep: [batch_size, passage_len, dim]
47 |     # question_rep: [batch_size, question_len, dim]
48 |     # decompose_params: [decompose_dim, dim]
49 | 
50 |     def single_instance(x):
51 |         p = x[0]
52 |         q = x[1]
53 |         # p: [passage_len, dim], q: [question_len, dim]
54 |         p = multi_perspective_expand_for_2D(p, decompose_params) # [passage_len, decompose_dim, dim]
55 |         q = multi_perspective_expand_for_2D(q, decompose_params) # [question_len, decompose_dim, dim]
56 |         p = tf.expand_dims(p, 1) # [passage_len, 1, decompose_dim, dim]
57 |         q = tf.expand_dims(q, 0) # [1, question_len, decompose_dim, dim]
58 |         return cosine_distance(p, q) # [passage_len, question_len, decompose_dim]
59 | 
60 |     elems = (passage_rep, question_rep)
61 |     matching_matrix = tf.map_fn(single_instance, elems,
62 |                                 dtype=tf.float32) # [batch_size, passage_len, question_len, decompose_dim]
63 |     return tf.concat(axis=2, values=[tf.reduce_max(matching_matrix, axis=2), tf.reduce_mean(matching_matrix,
64 |                                     axis=2)]) # [batch_size, passage_len, 2*decompose_dim]
65 |
66 |
67 | def cross_entropy(logits, truth, mask):
68 | # logits: [batch_size, passage_len]
69 | # truth: [batch_size, passage_len]
70 | # mask: [batch_size, passage_len]
71 |
72 | # xdev = x - x.max()
73 | # return xdev - T.log(T.sum(T.exp(xdev)))
74 | logits = tf.multiply(logits, mask)
75 |     xdev = tf.subtract(logits, tf.expand_dims(tf.reduce_max(logits, 1), -1))
76 |     log_predictions = tf.subtract(xdev, tf.expand_dims(tf.log(tf.reduce_sum(tf.exp(xdev), -1)), -1))
77 | # return -T.sum(targets * log_predictions)
78 | result = tf.multiply(tf.multiply(truth, log_predictions), mask) # [batch_size, passage_len]
79 | return tf.multiply(-1.0, tf.reduce_sum(result, -1)) # [batch_size]
80 |
81 |
82 | def highway_layer(in_val, output_size, scope=None):
83 | # in_val: [batch_size, passage_len, dim]
84 | input_shape = tf.shape(in_val)
85 | batch_size = input_shape[0]
86 | passage_len = input_shape[1]
87 | # feat_dim = input_shape[2]
88 | in_val = tf.reshape(in_val, [batch_size * passage_len, output_size])
89 | with tf.variable_scope(scope or "highway_layer"):
90 | highway_w = tf.get_variable("highway_w", [output_size, output_size], dtype=tf.float32)
91 | highway_b = tf.get_variable("highway_b", [output_size], dtype=tf.float32)
92 | full_w = tf.get_variable("full_w", [output_size, output_size], dtype=tf.float32)
93 | full_b = tf.get_variable("full_b", [output_size], dtype=tf.float32)
94 | trans = tf.nn.tanh(tf.nn.xw_plus_b(in_val, full_w, full_b))
95 | gate = tf.nn.sigmoid(tf.nn.xw_plus_b(in_val, highway_w, highway_b))
96 | outputs = trans * gate + in_val * (1.0 - gate)
97 | outputs = tf.reshape(outputs, [batch_size, passage_len, output_size])
98 | return outputs
99 |
100 |
101 | def multi_highway_layer(in_val, output_size, num_layers, scope=None):
102 | scope_name = 'highway_layer'
103 | if scope is not None: scope_name = scope
104 | for i in range(num_layers):
105 | cur_scope_name = scope_name + "-{}".format(i)
106 | in_val = highway_layer(in_val, output_size, scope=cur_scope_name)
107 | return in_val
108 |
109 |
110 | def cal_max_question_representation(question_representation, atten_scores):
111 | atten_positions = tf.argmax(atten_scores, axis=2, output_type=tf.int32) # [batch_size, passage_len]
112 | max_question_reps = layer_utils.collect_representation(question_representation, atten_positions)
113 | return max_question_reps
114 |
115 |
116 | def multi_perspective_match(feature_dim, repres1, repres2, is_training=True, dropout_rate=0.2,
117 | options=None, scope_name='mp-match', reuse=False):
118 | '''
119 | :param repres1: [batch_size, len, feature_dim]
120 | :param repres2: [batch_size, len, feature_dim]
121 | :return:
122 | '''
123 | input_shape = tf.shape(repres1)
124 | batch_size = input_shape[0]
125 | seq_length = input_shape[1]
126 | matching_result = []
127 | with tf.variable_scope(scope_name, reuse=reuse):
128 | match_dim = 0
129 | if options['with_cosine']:
130 | cosine_value = layer_utils.cosine_distance(repres1, repres2, cosine_norm=False)
131 | cosine_value = tf.reshape(cosine_value, [batch_size, seq_length, 1])
132 | matching_result.append(cosine_value)
133 | match_dim += 1
134 |
135 | if options['with_mp_cosine']:
136 | mp_cosine_params = tf.get_variable("mp_cosine", shape=[options['cosine_MP_dim'], feature_dim],
137 | dtype=tf.float32)
138 | mp_cosine_params = tf.expand_dims(mp_cosine_params, axis=0)
139 | mp_cosine_params = tf.expand_dims(mp_cosine_params, axis=0)
140 | repres1_flat = tf.expand_dims(repres1, axis=2)
141 | repres2_flat = tf.expand_dims(repres2, axis=2)
142 | mp_cosine_matching = layer_utils.cosine_distance(tf.multiply(repres1_flat, mp_cosine_params),
143 | repres2_flat, cosine_norm=False)
144 | matching_result.append(mp_cosine_matching)
145 | match_dim += options['cosine_MP_dim']
146 |
147 | matching_result = tf.concat(axis=2, values=matching_result)
148 | return (matching_result, match_dim)
149 |
150 |
151 | def match_passage_with_question(passage_reps, question_reps, passage_mask, question_mask, passage_lengths,
152 | question_lengths,
153 | context_lstm_dim, scope=None,
154 | with_full_match=True, with_maxpool_match=True, with_attentive_match=True,
155 | with_max_attentive_match=True,
156 | is_training=True, options=None, dropout_rate=0, forward=True):
157 | passage_reps = tf.multiply(passage_reps, tf.expand_dims(passage_mask, -1))
158 | question_reps = tf.multiply(question_reps, tf.expand_dims(question_mask, -1))
159 | all_question_aware_representatins = []
160 | dim = 0
161 | with tf.variable_scope(scope or "match_passage_with_question"):
162 | relevancy_matrix = cal_relevancy_matrix(question_reps, passage_reps)
163 | relevancy_matrix = mask_relevancy_matrix(relevancy_matrix, question_mask, passage_mask)
164 | # relevancy_matrix = layer_utils.calcuate_attention(passage_reps, question_reps, context_lstm_dim, context_lstm_dim,
165 | # scope_name="fw_attention", att_type=options.att_type, att_dim=options.att_dim,
166 | # remove_diagnoal=False, mask1=passage_mask, mask2=question_mask, is_training=is_training, dropout_rate=dropout_rate)
167 |
168 | all_question_aware_representatins.append(tf.reduce_max(relevancy_matrix, axis=2, keep_dims=True))
169 | all_question_aware_representatins.append(tf.reduce_mean(relevancy_matrix, axis=2, keep_dims=True))
170 | dim += 2
171 | if with_full_match:
172 | if forward:
173 | question_full_rep = layer_utils.collect_final_step_of_lstm(question_reps, question_lengths - 1)
174 | else:
175 | question_full_rep = question_reps[:, 0, :]
176 |
177 | passage_len = tf.shape(passage_reps)[1]
178 | question_full_rep = tf.expand_dims(question_full_rep, axis=1)
179 | question_full_rep = tf.tile(question_full_rep,
180 |                                         [1, passage_len, 1]) # [batch_size, passage_len, feature_dim]
181 |
182 | (attentive_rep, match_dim) = multi_perspective_match(context_lstm_dim,
183 | passage_reps, question_full_rep,
184 | is_training=is_training,
185 | dropout_rate=options['dropout_rate'],
186 | options=options, scope_name='mp-match-full-match')
187 | all_question_aware_representatins.append(attentive_rep)
188 | dim += match_dim
189 |
190 | if with_maxpool_match:
191 | maxpooling_decomp_params = tf.get_variable("maxpooling_matching_decomp",
192 | shape=[options['cosine_MP_dim'], context_lstm_dim],
193 | dtype=tf.float32)
194 | maxpooling_rep = cal_maxpooling_matching(passage_reps, question_reps, maxpooling_decomp_params)
195 | all_question_aware_representatins.append(maxpooling_rep)
196 | dim += 2 * options['cosine_MP_dim']
197 |
198 | if with_attentive_match:
199 | atten_scores = layer_utils.calcuate_attention(passage_reps, question_reps, context_lstm_dim,
200 | context_lstm_dim,
201 | scope_name="attention", att_type=options['att_type'],
202 | att_dim=options['att_dim'],
203 | remove_diagnoal=False, mask1=passage_mask,
204 | mask2=question_mask, is_training=is_training,
205 | dropout_rate=dropout_rate)
206 | att_question_contexts = tf.matmul(atten_scores, question_reps)
207 | (attentive_rep, match_dim) = multi_perspective_match(context_lstm_dim,
208 | passage_reps, att_question_contexts,
209 | is_training=is_training,
210 | dropout_rate=options['dropout_rate'],
211 | options=options, scope_name='mp-match-att_question')
212 | all_question_aware_representatins.append(attentive_rep)
213 | dim += match_dim
214 |
215 | if with_max_attentive_match:
216 | max_att = cal_max_question_representation(question_reps, relevancy_matrix)
217 | (max_attentive_rep, match_dim) = multi_perspective_match(context_lstm_dim,
218 | passage_reps, max_att, is_training=is_training,
219 | dropout_rate=options['dropout_rate'],
220 | options=options, scope_name='mp-match-max-att')
221 | all_question_aware_representatins.append(max_attentive_rep)
222 | dim += match_dim
223 |
224 | all_question_aware_representatins = tf.concat(axis=2, values=all_question_aware_representatins)
225 | return (all_question_aware_representatins, dim)
226 |
227 |
228 | def bilateral_match_func(in_question_repres, in_passage_repres,
229 | question_lengths, passage_lengths, question_mask, passage_mask, input_dim, is_training,
230 | options=None):
231 | question_aware_representatins = []
232 | question_aware_dim = 0
233 | passage_aware_representatins = []
234 | passage_aware_dim = 0
235 |
236 | # ====word level matching======
237 | (match_reps, match_dim) = match_passage_with_question(in_passage_repres, in_question_repres, passage_mask,
238 | question_mask, passage_lengths,
239 | question_lengths, input_dim, scope="word_match_forward",
240 | with_full_match=False,
241 | with_maxpool_match=options['with_maxpool_match'],
242 | with_attentive_match=options['with_attentive_match'],
243 | with_max_attentive_match=options['with_max_attentive_match'],
244 | is_training=is_training, options=options,
245 | dropout_rate=options['dropout_rate'], forward=True)
246 | question_aware_representatins.append(match_reps)
247 | question_aware_dim += match_dim
248 |
249 | (match_reps, match_dim) = match_passage_with_question(in_question_repres, in_passage_repres, question_mask,
250 | passage_mask, question_lengths,
251 | passage_lengths, input_dim, scope="word_match_backward",
252 | with_full_match=False,
253 | with_maxpool_match=options['with_maxpool_match'],
254 | with_attentive_match=options['with_attentive_match'],
255 | with_max_attentive_match=options['with_max_attentive_match'],
256 | is_training=is_training, options=options,
257 | dropout_rate=options['dropout_rate'], forward=False)
258 | passage_aware_representatins.append(match_reps)
259 | passage_aware_dim += match_dim
260 |
261 | with tf.variable_scope('context_MP_matching'):
262 | for i in range(options['context_layer_num']): # support multiple context layer
263 | with tf.variable_scope('layer-{}'.format(i)):
264 | # contextual lstm for both passage and question
265 | in_question_repres = tf.multiply(in_question_repres, tf.expand_dims(question_mask, axis=-1))
266 | in_passage_repres = tf.multiply(in_passage_repres, tf.expand_dims(passage_mask, axis=-1))
267 | (question_context_representation_fw, question_context_representation_bw,
268 | in_question_repres) = layer_utils.my_lstm_layer(
269 | in_question_repres, options['context_lstm_dim'], input_lengths=question_lengths,
270 | scope_name="context_represent",
271 | reuse=False, is_training=is_training, dropout_rate=options['dropout_rate'],
272 | use_cudnn=options['use_cudnn'])
273 | (passage_context_representation_fw, passage_context_representation_bw,
274 | in_passage_repres) = layer_utils.my_lstm_layer(
275 | in_passage_repres, options['context_lstm_dim'], input_lengths=passage_lengths,
276 | scope_name="context_represent",
277 | reuse=True, is_training=is_training, dropout_rate=options['dropout_rate'], use_cudnn=options['use_cudnn'])
278 |
279 | # Multi-perspective matching
280 | with tf.variable_scope('left_MP_matching'):
281 | (match_reps, match_dim) = match_passage_with_question(passage_context_representation_fw,
282 | question_context_representation_fw,
283 | passage_mask, question_mask, passage_lengths,
284 | question_lengths, options['context_lstm_dim'],
285 | scope="forward_match",
286 | with_full_match=options['with_full_match'],
287 | with_maxpool_match=options['with_maxpool_match'],
288 | with_attentive_match=options['with_attentive_match'],
289 | with_max_attentive_match=options['with_max_attentive_match'],
290 | is_training=is_training, options=options,
291 | dropout_rate=options['dropout_rate'],
292 | forward=True)
293 | question_aware_representatins.append(match_reps)
294 | question_aware_dim += match_dim
295 | (match_reps, match_dim) = match_passage_with_question(passage_context_representation_bw,
296 | question_context_representation_bw,
297 | passage_mask, question_mask, passage_lengths,
298 | question_lengths, options['context_lstm_dim'],
299 | scope="backward_match",
300 | with_full_match=options['with_full_match'],
301 | with_maxpool_match=options['with_maxpool_match'],
302 | with_attentive_match=options['with_attentive_match'],
303 | with_max_attentive_match=options['with_max_attentive_match'],
304 | is_training=is_training, options=options,
305 | dropout_rate=options['dropout_rate'],
306 | forward=False)
307 | question_aware_representatins.append(match_reps)
308 | question_aware_dim += match_dim
309 |
310 | with tf.variable_scope('right_MP_matching'):
311 | (match_reps, match_dim) = match_passage_with_question(question_context_representation_fw,
312 | passage_context_representation_fw,
313 | question_mask, passage_mask, question_lengths,
314 | passage_lengths, options['context_lstm_dim'],
315 | scope="forward_match",
316 | with_full_match=options['with_full_match'],
317 | with_maxpool_match=options['with_maxpool_match'],
318 | with_attentive_match=options['with_attentive_match'],
319 | with_max_attentive_match=options['with_max_attentive_match'],
320 | is_training=is_training, options=options,
321 | dropout_rate=options['dropout_rate'],
322 | forward=True)
323 | passage_aware_representatins.append(match_reps)
324 | passage_aware_dim += match_dim
325 | (match_reps, match_dim) = match_passage_with_question(question_context_representation_bw,
326 | passage_context_representation_bw,
327 | question_mask, passage_mask, question_lengths,
328 | passage_lengths, options['context_lstm_dim'],
329 | scope="backward_match",
330 | with_full_match=options['with_full_match'],
331 | with_maxpool_match=options['with_maxpool_match'],
332 | with_attentive_match=options['with_attentive_match'],
333 | with_max_attentive_match=options['with_max_attentive_match'],
334 | is_training=is_training, options=options,
335 | dropout_rate=options['dropout_rate'],
336 | forward=False)
337 | passage_aware_representatins.append(match_reps)
338 | passage_aware_dim += match_dim
339 |
340 | question_aware_representatins = tf.concat(axis=2,
341 | values=question_aware_representatins) # [batch_size, passage_len, question_aware_dim]
342 | passage_aware_representatins = tf.concat(axis=2,
343 |                                              values=passage_aware_representatins) # [batch_size, question_len, passage_aware_dim]
344 |
345 | if is_training:
346 | question_aware_representatins = tf.nn.dropout(question_aware_representatins, (1 - options['dropout_rate']))
347 | passage_aware_representatins = tf.nn.dropout(passage_aware_representatins, (1 - options['dropout_rate']))
348 |
349 | # ======Highway layer======
350 | if options['with_match_highway']:
351 | with tf.variable_scope("left_matching_highway"):
352 | question_aware_representatins = multi_highway_layer(question_aware_representatins, question_aware_dim,
353 | options['highway_layer_num'])
354 | with tf.variable_scope("right_matching_highway"):
355 | passage_aware_representatins = multi_highway_layer(passage_aware_representatins, passage_aware_dim,
356 | options['highway_layer_num'])
357 |
358 | # ========Aggregation Layer======
359 | aggregation_representation = []
360 | aggregation_dim = 0
361 |
362 | qa_aggregation_input = question_aware_representatins
363 | pa_aggregation_input = passage_aware_representatins
364 | with tf.variable_scope('aggregation_layer'):
365 | for i in range(options['aggregation_layer_num']): # support multiple aggregation layer
366 | qa_aggregation_input = tf.multiply(qa_aggregation_input, tf.expand_dims(passage_mask, axis=-1))
367 | (fw_rep, bw_rep, cur_aggregation_representation) = layer_utils.my_lstm_layer(
368 | qa_aggregation_input, options['aggregation_lstm_dim'], input_lengths=passage_lengths,
369 | scope_name='left_layer-{}'.format(i),
370 | reuse=False, is_training=is_training, dropout_rate=options['dropout_rate'], use_cudnn=options['use_cudnn'])
371 | fw_rep = layer_utils.collect_final_step_of_lstm(fw_rep, passage_lengths - 1)
372 | bw_rep = bw_rep[:, 0, :]
373 | aggregation_representation.append(fw_rep)
374 | aggregation_representation.append(bw_rep)
375 | aggregation_dim += 2 * options['aggregation_lstm_dim']
376 | qa_aggregation_input = cur_aggregation_representation # [batch_size, passage_len, 2*aggregation_lstm_dim]
377 |
378 | pa_aggregation_input = tf.multiply(pa_aggregation_input, tf.expand_dims(question_mask, axis=-1))
379 | (fw_rep, bw_rep, cur_aggregation_representation) = layer_utils.my_lstm_layer(
380 | pa_aggregation_input, options['aggregation_lstm_dim'],
381 | input_lengths=question_lengths, scope_name='right_layer-{}'.format(i),
382 | reuse=False, is_training=is_training, dropout_rate=options['dropout_rate'], use_cudnn=options['use_cudnn'])
383 | fw_rep = layer_utils.collect_final_step_of_lstm(fw_rep, question_lengths - 1)
384 | bw_rep = bw_rep[:, 0, :]
385 | aggregation_representation.append(fw_rep)
386 | aggregation_representation.append(bw_rep)
387 | aggregation_dim += 2 * options['aggregation_lstm_dim']
388 | pa_aggregation_input = cur_aggregation_representation # [batch_size, passage_len, 2*aggregation_lstm_dim]
389 |
390 | aggregation_representation = tf.concat(axis=1, values=aggregation_representation) # [batch_size, aggregation_dim]
391 |
392 | # ======Highway layer======
393 | if options['with_aggregation_highway']:
394 | with tf.variable_scope("aggregation_highway"):
395 | agg_shape = tf.shape(aggregation_representation)
396 | batch_size = agg_shape[0]
397 | aggregation_representation = tf.reshape(aggregation_representation, [1, batch_size, aggregation_dim])
398 | aggregation_representation = multi_highway_layer(aggregation_representation, aggregation_dim,
399 | options['highway_layer_num'])
400 | aggregation_representation = tf.reshape(aggregation_representation, [batch_size, aggregation_dim])
401 |
402 | return (aggregation_representation, aggregation_dim)
403 |
--------------------------------------------------------------------------------
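`cal_relevancy_matrix` above computes pairwise cosine similarity between every passage position and every question position via broadcast `expand_dims`. A plain-Python sketch of the same computation for a single batch-free example, with the same `eps` clamp on the norms (illustrative only; the real code operates on TF tensors):

```python
import math

EPS = 1e-6

def cosine_sim(u, v):
    # matches cosine_distance above: dot / (||u|| * ||v||), norms clamped by eps
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(max(sum(a * a for a in u), EPS))
    nv = math.sqrt(max(sum(b * b for b in v), EPS))
    return dot / nu / nv

def relevancy_matrix(question, passage):
    # question: [question_len, dim]; passage: [passage_len, dim]
    # result: [passage_len, question_len], as in cal_relevancy_matrix
    return [[cosine_sim(q, p) for q in question] for p in passage]

m = relevancy_matrix([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0]])
# the passage vector matches question position 0 exactly, position 1 not at all
assert abs(m[0][0] - 1.0) < 1e-5 and abs(m[0][1]) < 1e-5
```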
/Graph2Seq-master/main/model.py:
--------------------------------------------------------------------------------
1 | import tensorflow as tf
2 | from tensorflow.python.layers.core import Dense
3 | from tensorflow.python.ops.rnn_cell_impl import LSTMStateTuple
4 | import tensorflow.contrib.seq2seq as seq2seq
5 |
6 | from neigh_samplers import UniformNeighborSampler
7 | from aggregators import MeanAggregator, MaxPoolingAggregator, GatedMeanAggregator
8 | import numpy as np
9 | import match_utils
10 |
11 | class Graph2SeqNN(object):
12 |
13 | PAD = 0
14 | GO = 1
15 | EOS = 2
16 |
17 | def __init__(self, mode, conf, path_embed_method):
18 |
19 | self.mode = mode
20 | self.word_vocab_size = conf.word_vocab_size
21 | self.l2_lambda = conf.l2_lambda
22 | self.path_embed_method = path_embed_method
23 | # self.word_embedding_dim = conf.word_embedding_dim
24 | self.word_embedding_dim = conf.hidden_layer_dim
25 | self.encoder_hidden_dim = conf.encoder_hidden_dim
26 |
27 | # the setting for the GCN
28 | self.num_layers_decode = conf.num_layers_decode
29 | self.num_layers = conf.num_layers
30 | self.graph_encode_direction = conf.graph_encode_direction
31 | self.sample_layer_size = conf.sample_layer_size
32 | self.hidden_layer_dim = conf.hidden_layer_dim
33 | self.concat = conf.concat
34 |
35 | # the setting for the decoder
36 | self.beam_width = conf.beam_width
37 | self.decoder_type = conf.decoder_type
38 | self.seq_max_len = conf.seq_max_len
39 |
40 | self._text = tf.placeholder(tf.int32, [None, None])
41 | self.decoder_seq_length = tf.placeholder(tf.int32, [None])
42 | self.loss_weights = tf.placeholder(tf.float32, [None, None])
43 |
44 | # the following place holders are for the gcn
45 | self.fw_adj_info = tf.placeholder(tf.int32, [None, None]) # the fw adj info for each node
46 | self.bw_adj_info = tf.placeholder(tf.int32, [None, None]) # the bw adj info for each node
47 | self.feature_info = tf.placeholder(tf.int32, [None, None]) # the feature info for each node
48 | self.batch_nodes = tf.placeholder(tf.int32, [None, None]) # the nodes for each batch
49 |
50 | self.sample_size_per_layer = tf.shape(self.fw_adj_info)[1]
51 |
52 | self.single_graph_nodes_size = tf.shape(self.batch_nodes)[1]
53 | self.attention = conf.attention
54 | self.dropout = conf.dropout
55 | self.fw_aggregators = []
56 | self.bw_aggregators = []
57 |
58 | self.if_pred_on_dev = False
59 |
60 | self.learning_rate = conf.learning_rate
61 |
62 | def _init_decoder_train_connectors(self):
63 | batch_size, sequence_size = tf.unstack(tf.shape(self._text))
64 | self.batch_size = batch_size
65 | GO_SLICE = tf.ones([batch_size, 1], dtype=tf.int32) * self.GO
66 |         PAD_SLICE = tf.ones([batch_size, 1], dtype=tf.int32) * self.PAD  # EOS itself is injected by the one-hot mask below
67 |         self.decoder_train_inputs = tf.concat([GO_SLICE, self._text], axis=1)
68 |         self.decoder_train_length = self.decoder_seq_length + 1
69 |         decoder_train_targets = tf.concat([self._text, PAD_SLICE], axis=1)
70 | _, decoder_train_targets_seq_len = tf.unstack(tf.shape(decoder_train_targets))
71 | decoder_train_targets_eos_mask = tf.one_hot(self.decoder_train_length - 1, decoder_train_targets_seq_len,
72 | on_value=self.EOS, off_value=self.PAD, dtype=tf.int32)
73 | self.decoder_train_targets = tf.add(decoder_train_targets, decoder_train_targets_eos_mask)
74 | self.decoder_train_inputs_embedded = tf.nn.embedding_lookup(self.word_embeddings, self.decoder_train_inputs)
75 |
76 |
77 |
78 | def encode(self):
79 | with tf.variable_scope("embedding_layer"):
80 | pad_word_embedding = tf.zeros([1, self.word_embedding_dim]) # this is for the PAD symbol
81 | self.word_embeddings = tf.concat([pad_word_embedding,
82 | tf.get_variable('W_train', shape=[self.word_vocab_size,self.word_embedding_dim],
83 | initializer=tf.contrib.layers.xavier_initializer(), trainable=True)], 0)
84 |
85 | with tf.variable_scope("graph_encoding_layer"):
86 |
87 | # self.encoder_outputs, self.encoder_state = self.gcn_encode()
88 |
89 | # this is for optimizing gcn
90 | encoder_outputs, encoder_state = self.optimized_gcn_encode()
91 |
92 | source_sequence_length = tf.reshape(
93 | tf.ones([tf.shape(encoder_outputs)[0], 1], dtype=tf.int32) * self.single_graph_nodes_size,
94 | (tf.shape(encoder_outputs)[0],))
95 |
96 | return encoder_outputs, encoder_state, source_sequence_length
97 |
98 | def encode_node_feature(self, word_embeddings, feature_info):
99 | # in some cases, we can use LSTM to produce the node feature representation
100 | # cell = self._build_encoder_cell(conf.num_layers, conf.dim)
101 |
102 |
103 | feature_embedded_chars = tf.nn.embedding_lookup(word_embeddings, feature_info)
104 | batch_size = tf.shape(feature_embedded_chars)[0]
105 |
106 | # node_repres = match_utils.multi_highway_layer(feature_embedded_chars, self.hidden_layer_dim, num_layers=1)
107 | # node_repres = tf.reshape(node_repres, [batch_size, -1])
108 | # y = tf.shape(node_repres)[1]
109 | #
110 | # node_repres = tf.concat([tf.slice(node_repres, [0,0], [batch_size-1, y]), tf.zeros([1, y])], 0)
111 |
112 | node_repres = tf.reshape(feature_embedded_chars, [batch_size, -1])
113 |
114 | return node_repres
115 |
116 | def optimized_gcn_encode(self):
117 | # [node_size, hidden_layer_dim]
118 | embedded_node_rep = self.encode_node_feature(self.word_embeddings, self.feature_info)
119 |
120 | fw_sampler = UniformNeighborSampler(self.fw_adj_info)
121 | bw_sampler = UniformNeighborSampler(self.bw_adj_info)
122 | nodes = tf.reshape(self.batch_nodes, [-1, ])
123 |
124 | # batch_size = tf.shape(nodes)[0]
125 |
126 | # the fw_hidden and bw_hidden is the initial node embedding
127 | # [node_size, dim_size]
128 | fw_hidden = tf.nn.embedding_lookup(embedded_node_rep, nodes)
129 | bw_hidden = tf.nn.embedding_lookup(embedded_node_rep, nodes)
130 |
131 | # [node_size, adj_size]
132 | fw_sampled_neighbors = fw_sampler((nodes, self.sample_size_per_layer))
133 | bw_sampled_neighbors = bw_sampler((nodes, self.sample_size_per_layer))
134 |
135 | fw_sampled_neighbors_len = tf.constant(0)
136 | bw_sampled_neighbors_len = tf.constant(0)
137 |
138 | # sample
139 | for layer in range(self.sample_layer_size):
140 | if layer == 0:
141 | dim_mul = 1
142 | else:
143 | dim_mul = 2
144 |
145 |             if layer > 6:  # reuse the layer-6 aggregator for deeper layers
146 | fw_aggregator = self.fw_aggregators[6]
147 | else:
148 | fw_aggregator = MeanAggregator(dim_mul * self.hidden_layer_dim, self.hidden_layer_dim, concat=self.concat, mode=self.mode)
149 | self.fw_aggregators.append(fw_aggregator)
150 |
151 | # [node_size, adj_size, word_embedding_dim]
152 | if layer == 0:
153 | neigh_vec_hidden = tf.nn.embedding_lookup(embedded_node_rep, fw_sampled_neighbors)
154 |
155 | # compute the neighbor size
156 | tmp_sum = tf.reduce_sum(tf.nn.relu(neigh_vec_hidden), axis=2)
157 | tmp_mask = tf.sign(tmp_sum)
158 | fw_sampled_neighbors_len = tf.reduce_sum(tmp_mask, axis=1)
159 |
160 | else:
161 | neigh_vec_hidden = tf.nn.embedding_lookup(
162 | tf.concat([fw_hidden, tf.zeros([1, dim_mul * self.hidden_layer_dim])], 0), fw_sampled_neighbors)
163 |
164 | fw_hidden = fw_aggregator((fw_hidden, neigh_vec_hidden, fw_sampled_neighbors_len))
165 |
166 |
167 | if self.graph_encode_direction == "bi":
168 |                 if layer > 6:  # reuse the layer-6 backward aggregator for deeper layers
169 | bw_aggregator = self.bw_aggregators[6]
170 | else:
171 | bw_aggregator = MeanAggregator(dim_mul * self.hidden_layer_dim, self.hidden_layer_dim, concat=self.concat, mode=self.mode)
172 | self.bw_aggregators.append(bw_aggregator)
173 |
174 | if layer == 0:
175 | neigh_vec_hidden = tf.nn.embedding_lookup(embedded_node_rep, bw_sampled_neighbors)
176 |
177 | # compute the neighbor size
178 | tmp_sum = tf.reduce_sum(tf.nn.relu(neigh_vec_hidden), axis=2)
179 | tmp_mask = tf.sign(tmp_sum)
180 | bw_sampled_neighbors_len = tf.reduce_sum(tmp_mask, axis=1)
181 |
182 | else:
183 | neigh_vec_hidden = tf.nn.embedding_lookup(
184 | tf.concat([bw_hidden, tf.zeros([1, dim_mul * self.hidden_layer_dim])], 0), bw_sampled_neighbors)
185 |
186 | bw_hidden = bw_aggregator((bw_hidden, neigh_vec_hidden, bw_sampled_neighbors_len))
187 |
188 | # hidden stores the representation for all nodes
189 | fw_hidden = tf.reshape(fw_hidden, [-1, self.single_graph_nodes_size, 2 * self.hidden_layer_dim])
190 | if self.graph_encode_direction == "bi":
191 | bw_hidden = tf.reshape(bw_hidden, [-1, self.single_graph_nodes_size, 2 * self.hidden_layer_dim])
192 | hidden = tf.concat([fw_hidden, bw_hidden], axis=2)
193 | else:
194 | hidden = fw_hidden
195 |
196 | hidden = tf.nn.relu(hidden)
197 |
198 | pooled = tf.reduce_max(hidden, 1)
199 | if self.graph_encode_direction == "bi":
200 | graph_embedding = tf.reshape(pooled, [-1, 4 * self.hidden_layer_dim])
201 | else:
202 | graph_embedding = tf.reshape(pooled, [-1, 2 * self.hidden_layer_dim])
203 |
204 | graph_embedding = LSTMStateTuple(c=graph_embedding, h=graph_embedding)
205 |
206 | # shape of hidden: [batch_size, single_graph_nodes_size, 4 * hidden_layer_dim]
207 | # shape of graph_embedding: ([batch_size, 4 * hidden_layer_dim], [batch_size, 4 * hidden_layer_dim])
208 | return hidden, graph_embedding
209 |
210 | def decode(self, encoder_outputs, encoder_state, source_sequence_length):
211 | with tf.variable_scope("Decoder") as scope:
212 | beam_width = self.beam_width
213 | decoder_type = self.decoder_type
214 | seq_max_len = self.seq_max_len
215 | batch_size = tf.shape(encoder_outputs)[0]
216 |
217 | if self.path_embed_method == "lstm":
218 | self.decoder_cell = self._build_decode_cell()
219 | if self.mode == "test" and beam_width > 0:
220 | memory = seq2seq.tile_batch(self.encoder_outputs, multiplier=beam_width)
221 | source_sequence_length = seq2seq.tile_batch(self.source_sequence_length, multiplier=beam_width)
222 | encoder_state = seq2seq.tile_batch(self.encoder_state, multiplier=beam_width)
223 | batch_size = self.batch_size * beam_width
224 | else:
225 | memory = encoder_outputs
226 | source_sequence_length = source_sequence_length
227 | encoder_state = encoder_state
228 |
229 | attention_mechanism = seq2seq.BahdanauAttention(self.hidden_layer_dim, memory,
230 | memory_sequence_length=source_sequence_length)
231 | self.decoder_cell = seq2seq.AttentionWrapper(self.decoder_cell, attention_mechanism,
232 | attention_layer_size=self.hidden_layer_dim)
233 | self.decoder_initial_state = self.decoder_cell.zero_state(batch_size, tf.float32).clone(cell_state=encoder_state)
234 |
235 | projection_layer = Dense(self.word_vocab_size, use_bias=False)
236 |
237 | """For training the model"""
238 | if self.mode == "train":
239 | decoder_train_helper = tf.contrib.seq2seq.TrainingHelper(self.decoder_train_inputs_embedded,
240 | self.decoder_train_length)
241 | decoder_train = seq2seq.BasicDecoder(self.decoder_cell, decoder_train_helper,
242 | self.decoder_initial_state,
243 | projection_layer)
244 | decoder_outputs_train, decoder_states_train, decoder_seq_len_train = seq2seq.dynamic_decode(decoder_train)
245 | decoder_logits_train = decoder_outputs_train.rnn_output
246 | self.decoder_logits_train = tf.reshape(decoder_logits_train, [batch_size, -1, self.word_vocab_size])
247 |
248 |             """For testing the model"""
249 | # if self.mode == "infer" or self.if_pred_on_dev:
250 | if decoder_type == "greedy":
251 | decoder_infer_helper = seq2seq.GreedyEmbeddingHelper(self.word_embeddings,
252 | tf.ones([batch_size], dtype=tf.int32),
253 | self.EOS)
254 | decoder_infer = seq2seq.BasicDecoder(self.decoder_cell, decoder_infer_helper,
255 | self.decoder_initial_state, projection_layer)
256 | elif decoder_type == "beam":
257 | decoder_infer = seq2seq.BeamSearchDecoder(cell=self.decoder_cell, embedding=self.word_embeddings,
258 | start_tokens=tf.ones([batch_size], dtype=tf.int32),
259 | end_token=self.EOS,
260 | initial_state=self.decoder_initial_state,
261 | beam_width=beam_width,
262 | output_layer=projection_layer)
263 |
264 | decoder_outputs_infer, decoder_states_infer, decoder_seq_len_infer = seq2seq.dynamic_decode(decoder_infer,
265 | maximum_iterations=seq_max_len)
266 |
267 | if decoder_type == "beam":
268 | self.decoder_logits_infer = tf.no_op()
269 | self.sample_id = decoder_outputs_infer.predicted_ids
270 |
271 | elif decoder_type == "greedy":
272 | self.decoder_logits_infer = decoder_outputs_infer.rnn_output
273 | self.sample_id = decoder_outputs_infer.sample_id
274 |
275 | def _build_decode_cell(self):
276 | if self.num_layers == 1:
277 | cell = tf.nn.rnn_cell.BasicLSTMCell(num_units=4*self.hidden_layer_dim)
278 | if self.mode == "train":
279 | cell = tf.nn.rnn_cell.DropoutWrapper(cell, 1 - self.dropout)
280 | return cell
281 | else:
282 | cell_list = []
283 | for i in range(self.num_layers):
284 | single_cell = tf.contrib.rnn.BasicLSTMCell(self._decoder_hidden_size)
285 | if self.mode == "train":
286 | single_cell = tf.nn.rnn_cell.DropoutWrapper(single_cell, 1 - self.dropout)
287 | cell_list.append(single_cell)
288 | return tf.contrib.rnn.MultiRNNCell(cell_list)
289 |
290 | def _build_encoder_cell(self, num_layers, hidden_layer_dim):
291 | if num_layers == 1:
292 | cell = tf.nn.rnn_cell.BasicLSTMCell(hidden_layer_dim)
293 | if self.mode == "train":
294 | cell = tf.nn.rnn_cell.DropoutWrapper(cell, 1 - self.dropout)
295 | return cell
296 | else:
297 | cell_list = []
298 | for i in range(num_layers):
299 | single_cell = tf.contrib.rnn.BasicLSTMCell(hidden_layer_dim)
300 | if self.mode == "train":
301 | single_cell = tf.nn.rnn_cell.DropoutWrapper(single_cell, 1 - self.dropout)
302 | cell_list.append(single_cell)
303 | return tf.contrib.rnn.MultiRNNCell(cell_list)
304 |
305 | def _init_optimizer(self):
306 | crossent = tf.nn.sparse_softmax_cross_entropy_with_logits(labels=self.decoder_train_targets, logits=self.decoder_logits_train)
307 | decode_loss = (tf.reduce_sum(crossent * self.loss_weights) / tf.cast(self.batch_size, tf.float32))
308 |
309 | train_loss = decode_loss
310 |
311 | for aggregator in self.fw_aggregators:
312 | for var in aggregator.vars.values():
313 | train_loss += self.l2_lambda * tf.nn.l2_loss(var)
314 |
315 | for aggregator in self.bw_aggregators:
316 | for var in aggregator.vars.values():
317 | train_loss += self.l2_lambda * tf.nn.l2_loss(var)
318 |
319 | self.loss_op = train_loss
320 | self.cross_entropy_sum = train_loss
321 | params = tf.trainable_variables()
322 | gradients = tf.gradients(train_loss, params)
323 | clipped_gradients, _ = tf.clip_by_global_norm(gradients, 1)
324 | optimizer = tf.train.AdamOptimizer(learning_rate=self.learning_rate)
325 | self.train_op = optimizer.apply_gradients(zip(clipped_gradients, params))
326 |
327 | def _build_graph(self):
328 | encoder_outputs, encoder_state, source_sequence_length = self.encode()
329 |
330 | if self.mode == "train":
331 | self._init_decoder_train_connectors()
332 |
333 | self.decode(encoder_outputs=encoder_outputs, encoder_state=encoder_state, source_sequence_length=source_sequence_length)
334 |
335 | if self.mode == "train":
336 | self._init_optimizer()
337 |
338 |     def act(self, sess, mode, data_dict, if_pred_on_dev):
339 |         text = np.array(data_dict['seq'])
340 |         decoder_seq_length = np.array(data_dict['decoder_seq_length'])
341 |         loss_weights = np.array(data_dict['loss_weights'])
342 |         batch_graph = data_dict['batch_graph']
343 | fw_adj_info = batch_graph['g_fw_adj']
344 | bw_adj_info = batch_graph['g_bw_adj']
345 | feature_info = batch_graph['g_ids_features']
346 | batch_nodes = batch_graph['g_nodes']
347 |
348 | self.if_pred_on_dev = if_pred_on_dev
349 |
350 | feed_dict = {
351 | self._text: text,
352 | self.decoder_seq_length: decoder_seq_length,
353 | self.loss_weights: loss_weights,
354 | self.fw_adj_info: fw_adj_info,
355 | self.bw_adj_info: bw_adj_info,
356 | self.feature_info: feature_info,
357 | self.batch_nodes: batch_nodes
358 | }
359 |
360 | if mode == "train" and not if_pred_on_dev:
361 | output_feeds = [self.train_op, self.loss_op, self.cross_entropy_sum]
362 | elif mode == "test" or if_pred_on_dev:
363 | output_feeds = [self.sample_id]
364 |
365 | results = sess.run(output_feeds, feed_dict)
366 | return results
--------------------------------------------------------------------------------
/Graph2Seq-master/main/neigh_samplers.py:
--------------------------------------------------------------------------------
1 | from layers import Layer
2 | import tensorflow as tf
3 |
4 | class UniformNeighborSampler(Layer):
5 | """
6 | Uniformly samples neighbors.
7 | Assumes that adj lists are padded with random re-sampling
8 | """
9 | def __init__(self, adj_info, **kwargs):
10 | super(UniformNeighborSampler, self).__init__(**kwargs)
11 | self.adj_info = adj_info
12 |
13 | def _call(self, inputs):
14 | ids, num_samples = inputs
15 | adj_lists = tf.nn.embedding_lookup(self.adj_info, ids)
16 |         adj_lists = tf.transpose(tf.random_shuffle(tf.transpose(adj_lists)))  # permute the neighbor axis (GraphSAGE-style) so the slice below is a uniform sample
17 | adj_lists = tf.slice(adj_lists, [0,0], [-1, num_samples])
18 | return adj_lists
--------------------------------------------------------------------------------
/Graph2Seq-master/main/pooling.py:
--------------------------------------------------------------------------------
1 | import tensorflow as tf
2 |
3 |
4 | def mean_pool(input_tensor, sequence_length=None):
5 |     """
6 |     Given an input tensor (e.g., the outputs of an LSTM), do mean pooling
7 |     over the sequence (second-to-last) dimension of the input.
8 |
9 |     For example, if the input was the output of an LSTM of shape
10 |     (batch_size, sequence_length, hidden_dim), this would
11 |     calculate a mean over the sequence dimension (taking the padding
12 |     into account, if provided) to output a tensor of shape
13 |     (batch_size, hidden_dim).
14 |
15 | Parameters
16 | ----------
17 | input_tensor: Tensor
18 | An input tensor, preferably the output of a tensorflow RNN.
19 |         The mean-pooled representation of this output will be calculated
20 |         over the sequence (second-to-last) dimension.
21 |
22 | sequence_length: Tensor, optional (default=None)
23 | A tensor of dimension (batch_size, ) indicating the length
24 | of the sequences before padding was applied.
25 |
26 | Returns
27 | -------
28 | mean_pooled_output: Tensor
29 | A tensor of one less dimension than the input, with the size of the
30 | last dimension equal to the hidden dimension state size.
31 | """
32 | with tf.name_scope("mean_pool"):
33 |         # shape (batch_size, hidden_dim)
34 | input_tensor_sum = tf.reduce_sum(input_tensor, axis=-2)
35 |
36 | # If sequence_length is None, divide by the sequence length
37 | # as indicated by the input tensor.
38 | if sequence_length is None:
39 | sequence_length = tf.shape(input_tensor)[-2]
40 |
41 | # Expand sequence length from shape (batch_size,) to
42 | # (batch_size, 1) for broadcasting to work.
43 | expanded_sequence_length = tf.cast(tf.expand_dims(sequence_length, -1),
44 | "float32") + 1e-08
45 |
46 | # Now, divide by the length of each sequence.
47 |         # shape (batch_size, hidden_dim)
48 | mean_pooled_input = (input_tensor_sum /
49 | expanded_sequence_length)
50 | return mean_pooled_input
51 |
52 | def handle_pad_max_pooling(tensor, last_dim):
53 | tensor = tf.reshape(tensor, [-1, last_dim])
54 | bs = tf.shape(tensor)[0]
55 | tt = tf.fill(tf.stack([bs, last_dim]), -1e9)
56 | cond = tf.not_equal(tensor, 0.0)
57 | res = tf.where(cond, tensor, tt)
58 | return res
59 |
60 | def max_pool(input_tensor, last_dim, sequence_length=None):
61 |     """
62 |     Max-pool over the sequence (second-to-last) dimension of the input;
63 |     zero (padding) entries are first masked to -1e9 so they never win the max.
64 |     :param input_tensor: tensor of shape (batch_size, sequence_length, last_dim)
65 |     :return: tensor of shape (batch_size, last_dim)
66 |     """
67 | with tf.name_scope("max_pool"):
68 |         # sequence-length (middle) dimension of [batch_size, sequence_length, last_dim]
69 | mid_dim = tf.shape(input_tensor)[1]
70 | input_tensor = handle_pad_max_pooling(input_tensor, last_dim)
71 | input_tensor = tf.reshape(input_tensor, [-1, mid_dim, last_dim])
72 | input_tensor_max = tf.reduce_max(input_tensor, axis=-2)
73 | return input_tensor_max
74 |
--------------------------------------------------------------------------------
/Graph2Seq-master/main/run_model.py:
--------------------------------------------------------------------------------
1 | import configure as conf
2 | import data_collector as data_collector
3 | import loaderAndwriter as disk_helper
4 | import numpy as np
5 | from model import Graph2SeqNN
6 | import tensorflow as tf
7 | import helpers as helpers
8 | import datetime
9 | import text_decoder
10 | from evaluator import evaluate
11 | import os
12 | import argparse
13 | import json
14 |
15 | def main(mode):
16 |
17 | word_idx = {}
18 |
19 | if mode == "train":
20 | epochs = conf.epochs
21 | train_batch_size = conf.train_batch_size
22 |
23 | # read the training data from a file
24 |         print("reading training data into memory ...")
25 | texts_train, graphs_train = data_collector.read_data(conf.train_data_path, word_idx, if_increase_dict=True)
26 |
27 |         print("reading development data into memory ...")
28 | texts_dev, graphs_dev = data_collector.read_data(conf.dev_data_path, word_idx, if_increase_dict=False)
29 |
30 | print("writing word-idx mapping ...")
31 | disk_helper.write_word_idx(word_idx, conf.word_idx_file_path)
32 |
33 |         print("vectorizing training data ...")
34 | tv_train = data_collector.vectorize_data(word_idx, texts_train)
35 |
36 |         print("vectorizing dev data ...")
37 | tv_dev = data_collector.vectorize_data(word_idx, texts_dev)
38 |
39 | conf.word_vocab_size = len(word_idx.keys()) + 1
40 |
41 | with tf.Graph().as_default():
42 | with tf.Session() as sess:
43 | model = Graph2SeqNN("train", conf, path_embed_method="lstm")
44 |
45 | model._build_graph()
46 | saver = tf.train.Saver(max_to_keep=None)
47 |                 sess.run(tf.global_variables_initializer())
48 |
49 | def train_step(seqs, decoder_seq_length, loss_weights, batch_graph, if_pred_on_dev=False):
50 |                     feed = {}  # avoid shadowing the built-in `dict`
51 |                     feed['seq'] = seqs
52 |                     feed['batch_graph'] = batch_graph
53 |                     feed['loss_weights'] = loss_weights
54 |                     feed['decoder_seq_length'] = decoder_seq_length
55 |
56 |                     if not if_pred_on_dev:
57 |                         _, loss_op, cross_entropy = model.act(sess, "train", feed, if_pred_on_dev)
58 |                         return loss_op, cross_entropy
59 |                     else:
60 |                         sample_id = model.act(sess, "train", feed, if_pred_on_dev)
61 | return sample_id
62 |
63 | best_acc_on_dev = 0.0
64 | for t in range(1, epochs + 1):
65 | n_train = len(texts_train)
66 | temp_order = list(range(n_train))
67 | np.random.shuffle(temp_order)
68 |
69 | loss_sum = 0.0
70 | for start in range(0, n_train, train_batch_size):
71 | end = min(start+train_batch_size, n_train)
72 | tv = []
73 | graphs = []
74 |                         for i in range(start, end):
75 |                             idx = temp_order[i]
76 |                             tv.append(tv_train[idx])
77 |                             graphs.append(graphs_train[idx])
78 |
79 | batch_graph = data_collector.cons_batch_graph(graphs)
80 | gv = data_collector.vectorize_batch_graph(batch_graph, word_idx)
81 |
82 | tv, tv_real_len, loss_weights = helpers.batch(tv)
83 |
84 | loss_op, cross_entropy = train_step(tv, tv_real_len, loss_weights, gv)
85 | loss_sum += loss_op
86 |
87 | #################### test the model on the dev data #########################
88 | n_dev = len(texts_dev)
89 | dev_batch_size = conf.dev_batch_size
90 |
91 | idx_word = {}
92 | for w in word_idx:
93 | idx_word[word_idx[w]] = w
94 |
95 | pred_texts = []
96 | for start in range(0, n_dev, dev_batch_size):
97 | end = min(start+dev_batch_size, n_dev)
98 | tv = []
99 | graphs = []
100 |                         for i in range(start, end):
101 |                             tv.append(tv_dev[i])
102 |                             graphs.append(graphs_dev[i])
103 |
104 | batch_graph = data_collector.cons_batch_graph(graphs)
105 | gv = data_collector.vectorize_batch_graph(batch_graph, word_idx)
106 |
107 | tv, tv_real_len, loss_weights = helpers.batch(tv)
108 |
109 | sample_id = train_step(tv, tv_real_len, loss_weights, gv, if_pred_on_dev=True)[0]
110 |
111 | for tmp_id in sample_id:
112 | pred_texts.append(text_decoder.decode_text(tmp_id, idx_word))
113 |
114 | acc = evaluate(type="acc", golds=texts_dev, preds=pred_texts)
115 | if_save_model = False
116 | if acc >= best_acc_on_dev:
117 | best_acc_on_dev = acc
118 | if_save_model = True
119 |
120 | time_str = datetime.datetime.now().isoformat()
121 | print('-----------------------')
122 | print('time:{}'.format(time_str))
123 | print('Epoch', t)
124 | print('Acc on Dev: {}'.format(acc))
125 | print('Best acc on Dev: {}'.format(best_acc_on_dev))
126 | print('Loss on train:{}'.format(loss_sum))
127 | if if_save_model:
128 | save_path = "../saved_model/"
129 | if not os.path.exists(save_path):
130 | os.makedirs(save_path)
131 |
132 | path = saver.save(sess, save_path + 'model', global_step=0)
133 | print("Already saved model to {}".format(path))
134 |
135 | print('-----------------------')
136 |
137 | elif mode == "test":
138 | test_batch_size = conf.test_batch_size
139 |
140 | # read the test data from a file
141 |         print("reading test data into memory ...")
142 | texts_test, graphs_test = data_collector.read_data(conf.test_data_path, word_idx, if_increase_dict=False)
143 |
144 | print("reading word idx mapping from file")
145 | word_idx = disk_helper.read_word_idx_from_file(conf.word_idx_file_path)
146 |
147 | idx_word = {}
148 | for w in word_idx:
149 | idx_word[word_idx[w]] = w
150 |
151 |         print("vectorizing test data ...")
152 | tv_test = data_collector.vectorize_data(word_idx, texts_test)
153 |
154 | conf.word_vocab_size = len(word_idx.keys()) + 1
155 |
156 | with tf.Graph().as_default():
157 | with tf.Session() as sess:
158 | model = Graph2SeqNN("test", conf, path_embed_method="lstm")
159 | model._build_graph()
160 | saver = tf.train.Saver(max_to_keep=None)
161 |
162 | model_path_name = "../saved_model/model-0"
163 | model_pred_path = "../saved_model/prediction.txt"
164 |
165 | saver.restore(sess, model_path_name)
166 |
167 | def test_step(seqs, decoder_seq_length, loss_weights, batch_graph):
168 |                     feed = {}  # avoid shadowing the built-in `dict`
169 |                     feed['seq'] = seqs
170 |                     feed['batch_graph'] = batch_graph
171 |                     feed['loss_weights'] = loss_weights
172 |                     feed['decoder_seq_length'] = decoder_seq_length
173 |                     sample_id = model.act(sess, "test", feed, if_pred_on_dev=False)
174 | return sample_id
175 |
176 | n_test = len(texts_test)
177 |
178 | pred_texts = []
179 | global_graphs = []
180 | for start in range(0, n_test, test_batch_size):
181 | end = min(start + test_batch_size, n_test)
182 | tv = []
183 | graphs = []
184 |                     for i in range(start, end):
185 |                         tv.append(tv_test[i])
186 |                         graphs.append(graphs_test[i])
187 |                         global_graphs.append(graphs_test[i])
188 |
189 | batch_graph = data_collector.cons_batch_graph(graphs)
190 | gv = data_collector.vectorize_batch_graph(batch_graph, word_idx)
191 | tv, tv_real_len, loss_weights = helpers.batch(tv)
192 |
193 | sample_id = test_step(tv, tv_real_len, loss_weights, gv)[0]
194 | for tem_id in sample_id:
195 | pred_texts.append(text_decoder.decode_text(tem_id, idx_word))
196 |
197 | acc = evaluate(type="acc", golds=texts_test, preds=pred_texts)
198 | print("acc on test set is {}".format(acc))
199 |
200 | # write prediction result into a file
201 | with open(model_pred_path, 'w+') as f:
202 |                     for i in range(len(global_graphs)):
203 |                         f.write("graph:\t"+json.dumps(global_graphs[i])+"\nGold:\t"+texts_test[i]+"\nPredicted:\t"+pred_texts[i]+"\n")
204 |                         if texts_test[i].strip() == pred_texts[i].strip():
205 | f.write("Correct\n\n")
206 | else:
207 | f.write("Incorrect\n\n")
208 |
209 |
210 | if __name__ == "__main__":
211 | argparser = argparse.ArgumentParser()
212 | argparser.add_argument("mode", type=str, choices=["train", "test"])
213 |     argparser.add_argument("-sample_size_per_layer", type=int, default=4, help="number of neighbors sampled at each layer")
214 |     argparser.add_argument("-sample_layer_size", type=int, default=4, help="number of sampling layers (hops)")
215 | argparser.add_argument("-epochs", type=int, default=100, help="training epochs")
216 | argparser.add_argument("-learning_rate", type=float, default=conf.learning_rate, help="learning rate")
217 | argparser.add_argument("-word_embedding_dim", type=int, default=conf.word_embedding_dim, help="word embedding dim")
218 | argparser.add_argument("-hidden_layer_dim", type=int, default=conf.hidden_layer_dim)
219 |
220 | config = argparser.parse_args()
221 |
222 | mode = config.mode
223 | conf.sample_layer_size = config.sample_layer_size
224 | conf.sample_size_per_layer = config.sample_size_per_layer
225 | conf.epochs = config.epochs
226 | conf.learning_rate = config.learning_rate
227 | conf.word_embedding_dim = config.word_embedding_dim
228 | conf.hidden_layer_dim = config.hidden_layer_dim
229 |
230 | main(mode)
231 |
--------------------------------------------------------------------------------
/Graph2Seq-master/main/text_decoder.py:
--------------------------------------------------------------------------------
1 | import configure as conf
2 |
3 | def decode_text(pred_idx, idx_word):
4 |     # pred_idx is either a flat list of token ids (greedy decoding)
5 |     # or a list of per-beam id lists (beam search output).
6 |     # Id 2 is treated as EOS (stop decoding) and id 0 as padding;
7 |     # padded positions emit the GO token instead of a vocabulary word.
8 |
9 |
10 | text = ""
11 |     for i in range(len(pred_idx)):
12 |         ids = pred_idx[i]
13 |         if isinstance(ids, list):
14 |             for tid in ids:
15 |                 if tid == 2:  # EOS
16 |                     break
17 |                 if tid != 0:
18 |                     text += idx_word[tid] + " "
19 |                 else:  # PAD
20 |                     text += conf.GO + " "
21 |         else:
22 |             if ids == 2:  # EOS
23 |                 break
24 |             if ids != 0:
25 |                 text += idx_word[ids] + " "
26 |             else:  # PAD
27 |                 text += conf.GO + " "
28 | return text
--------------------------------------------------------------------------------
/Graph2Seq-master/saved_model/checkpoint:
--------------------------------------------------------------------------------
1 | model_checkpoint_path: "model-0"
2 | all_model_checkpoint_paths: "model-0"
3 |
--------------------------------------------------------------------------------
/Graph2Seq-master/saved_model/model-0.data-00000-of-00001:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/IBM/Text-to-LogicForm/515e73db1795019007c1b3a3ed22b10c349f555a/Graph2Seq-master/saved_model/model-0.data-00000-of-00001
--------------------------------------------------------------------------------
/Graph2Seq-master/saved_model/model-0.index:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/IBM/Text-to-LogicForm/515e73db1795019007c1b3a3ed22b10c349f555a/Graph2Seq-master/saved_model/model-0.index
--------------------------------------------------------------------------------
/Graph2Seq-master/saved_model/model-0.meta:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/IBM/Text-to-LogicForm/515e73db1795019007c1b3a3ed22b10c349f555a/Graph2Seq-master/saved_model/model-0.meta
--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
1 | Apache License
2 | Version 2.0, January 2004
3 | http://www.apache.org/licenses/
4 |
5 | TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
6 |
7 | 1. Definitions.
8 |
9 | "License" shall mean the terms and conditions for use, reproduction,
10 | and distribution as defined by Sections 1 through 9 of this document.
11 |
12 | "Licensor" shall mean the copyright owner or entity authorized by
13 | the copyright owner that is granting the License.
14 |
15 | "Legal Entity" shall mean the union of the acting entity and all
16 | other entities that control, are controlled by, or are under common
17 | control with that entity. For the purposes of this definition,
18 | "control" means (i) the power, direct or indirect, to cause the
19 | direction or management of such entity, whether by contract or
20 | otherwise, or (ii) ownership of fifty percent (50%) or more of the
21 | outstanding shares, or (iii) beneficial ownership of such entity.
22 |
23 | "You" (or "Your") shall mean an individual or Legal Entity
24 | exercising permissions granted by this License.
25 |
26 | "Source" form shall mean the preferred form for making modifications,
27 | including but not limited to software source code, documentation
28 | source, and configuration files.
29 |
30 | "Object" form shall mean any form resulting from mechanical
31 | transformation or translation of a Source form, including but
32 | not limited to compiled object code, generated documentation,
33 | and conversions to other media types.
34 |
35 | "Work" shall mean the work of authorship, whether in Source or
36 | Object form, made available under the License, as indicated by a
37 | copyright notice that is included in or attached to the work
38 | (an example is provided in the Appendix below).
39 |
40 | "Derivative Works" shall mean any work, whether in Source or Object
41 | form, that is based on (or derived from) the Work and for which the
42 | editorial revisions, annotations, elaborations, or other modifications
43 | represent, as a whole, an original work of authorship. For the purposes
44 | of this License, Derivative Works shall not include works that remain
45 | separable from, or merely link (or bind by name) to the interfaces of,
46 | the Work and Derivative Works thereof.
47 |
48 | "Contribution" shall mean any work of authorship, including
49 | the original version of the Work and any modifications or additions
50 | to that Work or Derivative Works thereof, that is intentionally
51 | submitted to Licensor for inclusion in the Work by the copyright owner
52 | or by an individual or Legal Entity authorized to submit on behalf of
53 | the copyright owner. For the purposes of this definition, "submitted"
54 | means any form of electronic, verbal, or written communication sent
55 | to the Licensor or its representatives, including but not limited to
56 | communication on electronic mailing lists, source code control systems,
57 | and issue tracking systems that are managed by, or on behalf of, the
58 | Licensor for the purpose of discussing and improving the Work, but
59 | excluding communication that is conspicuously marked or otherwise
60 | designated in writing by the copyright owner as "Not a Contribution."
61 |
62 | "Contributor" shall mean Licensor and any individual or Legal Entity
63 | on behalf of whom a Contribution has been received by Licensor and
64 | subsequently incorporated within the Work.
65 |
66 | 2. Grant of Copyright License. Subject to the terms and conditions of
67 | this License, each Contributor hereby grants to You a perpetual,
68 | worldwide, non-exclusive, no-charge, royalty-free, irrevocable
69 | copyright license to reproduce, prepare Derivative Works of,
70 | publicly display, publicly perform, sublicense, and distribute the
71 | Work and such Derivative Works in Source or Object form.
72 |
73 | 3. Grant of Patent License. Subject to the terms and conditions of
74 | this License, each Contributor hereby grants to You a perpetual,
75 | worldwide, non-exclusive, no-charge, royalty-free, irrevocable
76 | (except as stated in this section) patent license to make, have made,
77 | use, offer to sell, sell, import, and otherwise transfer the Work,
78 | where such license applies only to those patent claims licensable
79 | by such Contributor that are necessarily infringed by their
80 | Contribution(s) alone or by combination of their Contribution(s)
81 | with the Work to which such Contribution(s) was submitted. If You
82 | institute patent litigation against any entity (including a
83 | cross-claim or counterclaim in a lawsuit) alleging that the Work
84 | or a Contribution incorporated within the Work constitutes direct
85 | or contributory patent infringement, then any patent licenses
86 | granted to You under this License for that Work shall terminate
87 | as of the date such litigation is filed.
88 |
89 | 4. Redistribution. You may reproduce and distribute copies of the
90 | Work or Derivative Works thereof in any medium, with or without
91 | modifications, and in Source or Object form, provided that You
92 | meet the following conditions:
93 |
94 | (a) You must give any other recipients of the Work or
95 | Derivative Works a copy of this License; and
96 |
97 | (b) You must cause any modified files to carry prominent notices
98 | stating that You changed the files; and
99 |
100 | (c) You must retain, in the Source form of any Derivative Works
101 | that You distribute, all copyright, patent, trademark, and
102 | attribution notices from the Source form of the Work,
103 | excluding those notices that do not pertain to any part of
104 | the Derivative Works; and
105 |
106 | (d) If the Work includes a "NOTICE" text file as part of its
107 | distribution, then any Derivative Works that You distribute must
108 | include a readable copy of the attribution notices contained
109 | within such NOTICE file, excluding those notices that do not
110 | pertain to any part of the Derivative Works, in at least one
111 | of the following places: within a NOTICE text file distributed
112 | as part of the Derivative Works; within the Source form or
113 | documentation, if provided along with the Derivative Works; or,
114 | within a display generated by the Derivative Works, if and
115 | wherever such third-party notices normally appear. The contents
116 | of the NOTICE file are for informational purposes only and
117 | do not modify the License. You may add Your own attribution
118 | notices within Derivative Works that You distribute, alongside
119 | or as an addendum to the NOTICE text from the Work, provided
120 | that such additional attribution notices cannot be construed
121 | as modifying the License.
122 |
123 | You may add Your own copyright statement to Your modifications and
124 | may provide additional or different license terms and conditions
125 | for use, reproduction, or distribution of Your modifications, or
126 | for any such Derivative Works as a whole, provided Your use,
127 | reproduction, and distribution of the Work otherwise complies with
128 | the conditions stated in this License.
129 |
130 | 5. Submission of Contributions. Unless You explicitly state otherwise,
131 | any Contribution intentionally submitted for inclusion in the Work
132 | by You to the Licensor shall be under the terms and conditions of
133 | this License, without any additional terms or conditions.
134 | Notwithstanding the above, nothing herein shall supersede or modify
135 | the terms of any separate license agreement you may have executed
136 | with Licensor regarding such Contributions.
137 |
138 | 6. Trademarks. This License does not grant permission to use the trade
139 | names, trademarks, service marks, or product names of the Licensor,
140 | except as required for reasonable and customary use in describing the
141 | origin of the Work and reproducing the content of the NOTICE file.
142 |
143 | 7. Disclaimer of Warranty. Unless required by applicable law or
144 | agreed to in writing, Licensor provides the Work (and each
145 | Contributor provides its Contributions) on an "AS IS" BASIS,
146 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
147 | implied, including, without limitation, any warranties or conditions
148 | of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
149 | PARTICULAR PURPOSE. You are solely responsible for determining the
150 | appropriateness of using or redistributing the Work and assume any
151 | risks associated with Your exercise of permissions under this License.
152 |
153 | 8. Limitation of Liability. In no event and under no legal theory,
154 | whether in tort (including negligence), contract, or otherwise,
155 | unless required by applicable law (such as deliberate and grossly
156 | negligent acts) or agreed to in writing, shall any Contributor be
157 | liable to You for damages, including any direct, indirect, special,
158 | incidental, or consequential damages of any character arising as a
159 | result of this License or out of the use or inability to use the
160 | Work (including but not limited to damages for loss of goodwill,
161 | work stoppage, computer failure or malfunction, or any and all
162 | other commercial damages or losses), even if such Contributor
163 | has been advised of the possibility of such damages.
164 |
165 | 9. Accepting Warranty or Additional Liability. While redistributing
166 | the Work or Derivative Works thereof, You may choose to offer,
167 | and charge a fee for, acceptance of support, warranty, indemnity,
168 | or other liability obligations and/or rights consistent with this
169 | License. However, in accepting such obligations, You may act only
170 | on Your own behalf and on Your sole responsibility, not on behalf
171 | of any other Contributor, and only if You agree to indemnify,
172 | defend, and hold each Contributor harmless for any liability
173 | incurred by, or claims asserted against, such Contributor by reason
174 | of your accepting any such warranty or additional liability.
175 |
176 | END OF TERMS AND CONDITIONS
177 |
178 | APPENDIX: How to apply the Apache License to your work.
179 |
180 | To apply the Apache License to your work, attach the following
181 | boilerplate notice, with the fields enclosed by brackets "[]"
182 | replaced with your own identifying information. (Don't include
183 | the brackets!) The text should be enclosed in the appropriate
184 | comment syntax for the file format. We also recommend that a
185 | file or class name and description of purpose be included on the
186 | same "printed page" as the copyright notice for easier
187 | identification within third-party archives.
188 |
189 | Copyright [yyyy] [name of copyright owner]
190 |
191 | Licensed under the Apache License, Version 2.0 (the "License");
192 | you may not use this file except in compliance with the License.
193 | You may obtain a copy of the License at
194 |
195 | http://www.apache.org/licenses/LICENSE-2.0
196 |
197 | Unless required by applicable law or agreed to in writing, software
198 | distributed under the License is distributed on an "AS IS" BASIS,
199 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
200 | See the License for the specific language governing permissions and
201 | limitations under the License.
202 |
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # Text-to-LogicForm
2 | Text-to-LogicForm is a simple codebase for leveraging a syntactic graph for semantic parsing using a novel graph-to-sequence model.
3 |
4 | # Possibly Required Dependency and Tools
5 | Text-to-LogicForm requires two parts of code. The first part is a novel Graph2Seq model that performs the job.
6 | The code can be found here: https://github.com/IBM/Graph2Seq. The second part is pre-processing code
7 | for converting text to a syntactic graph (this part of the code will be released soon).
8 |
9 | # Graph2Seq
10 | Graph2Seq is a simple code for building a graph-encoder and sequence-decoder for NLP and other AI/ML/DL tasks.
11 |
12 | # How To Run The Codes
13 | To train your graph-to-sequence model, you need:
14 |
15 | (1) Prepare your train/dev/test data in the following form:
16 |
17 | each line is a JSON object whose keys are "seq", "g_ids", "g_id_features", and "g_adj":
18 | "seq" is the text that the decoder is supposed to output
19 | "g_ids" is a mapping from the node ID to its ID in the graph
20 | "g_id_features" is a mapping from the node ID to its text features
21 | "g_adj" is a mapping from the node ID to its adjacent nodes (represented as their IDs)
22 |
23 | See data/no_cycle/train.data as examples.
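
As a sketch, one training line for a hypothetical three-node chain graph (nodes "a" -> "b" -> "c", target sequence "a b c") could be built like this; the node contents here are made up for illustration, so consult data/no_cycle/train.data for the authoritative format:

```python
import json

# Hypothetical example: a 3-node chain graph whose decoder target is "a b c".
example = {
    "seq": "a b c",                                   # text the decoder should output
    "g_ids": {"0": 0, "1": 1, "2": 2},                # node ID -> its ID in the graph
    "g_id_features": {"0": "a", "1": "b", "2": "c"},  # node ID -> text feature
    "g_adj": {"0": ["1"], "1": ["2"], "2": []},       # node ID -> adjacent node IDs
}

# Each line of train/dev/test data is one such JSON object on its own line.
line = json.dumps(example)
print(line)
```
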
24 |
25 |
26 | (2) Modify the hyper-parameters in main/configure.py according to your task
27 |
28 | (3) Train the model by running the following command:
29 | "python run_model.py train -sample_size_per_layer=xxx -sample_layer_size=yyy"
30 | The model that performs best on the dev data will be saved in the directory "saved_model"
31 |
32 | (4) Test the model by running the following command:
33 | "python run_model.py test -sample_size_per_layer=xxx -sample_layer_size=yyy"
34 | The prediction results will be saved in saved_model/prediction.txt
35 |
36 |
37 | # How To Cite The Codes
38 | Please cite our work if you like it or are using our code in your projects!
39 |
40 | Kun Xu, Lingfei Wu, Zhiguo Wang, Yansong Feng, and Vadim Sheinin, "Exploiting Rich Syntactic Information for Semantic Parsing with Graph-to-Sequence Model", In 2018 Conference on Empirical Methods in Natural Language Processing.
41 |
42 | @article{xu2018exploiting,
43 | title={Exploiting rich syntactic information for semantic parsing with graph-to-sequence model},
44 | author={Xu, Kun and Wu, Lingfei and Wang, Zhiguo and Yu, Mo and Chen, Liwei and Sheinin, Vadim},
45 | journal={arXiv preprint arXiv:1808.07624},
46 | year={2018}
47 | }
48 |
49 | Kun Xu, Lingfei Wu, Zhiguo Wang, Yansong Feng, Michael Witbrock, and Vadim Sheinin (first and second authors contributed equally), "Graph2Seq: Graph to Sequence Learning with Attention-based Neural Networks", arXiv preprint arXiv:1804.00823.
50 |
51 | @article{xu2018graph2seq,
52 | title={Graph2Seq: Graph to Sequence Learning with Attention-based Neural Networks},
53 | author={Xu, Kun and Wu, Lingfei and Wang, Zhiguo and Feng, Yansong and Witbrock, Michael and Sheinin, Vadim},
54 | journal={arXiv preprint arXiv:1804.00823},
55 | year={2018}
56 | }
57 |
58 | ------------------------------------------------------
59 | Contributors: Kun Xu, Lingfei Wu
60 | Created date: November 19, 2018
61 | Last update: November 19, 2018
62 |
63 |
--------------------------------------------------------------------------------