├── .ipynb_checkpoints
│   ├── gcn-node_classification-checkpoint.ipynb
│   └── gcn-prediction-checkpoint.ipynb
├── A gentle introduction about graph neural networks.pdf
├── GCN-Thomas Kipf.pdf
├── How_Powerful_are_GNN.pdf
├── Inductive biases, graph neural networks, attention and relational inference.pdf
├── README.md
├── figures
│   ├── gat_attention.png
│   ├── gcn_prediction.jpg
│   ├── graph_conv.jpg
│   └── gru.png
├── gnn.png
└── tutorials
    ├── .ipynb_checkpoints
    │   ├── ggnn-checkpoint.ipynb
    │   └── graph_attn-checkpoint.ipynb
    ├── gat.ipynb
    ├── gcn_classification.ipynb
    ├── gcn_prediction.ipynb
    └── ggnn.ipynb
/.ipynb_checkpoints/gcn-node_classification-checkpoint.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [],
3 | "metadata": {},
4 | "nbformat": 4,
5 | "nbformat_minor": 2
6 | }
7 |
--------------------------------------------------------------------------------
/.ipynb_checkpoints/gcn-prediction-checkpoint.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "markdown",
5 | "metadata": {},
6 | "source": [
7 | "# Prediction of molecular properties using GCN\n",
8 | "\n",
9 | "A tutorial on supervised learning of graph-level labels from graph inputs.\n",
10 | "I implemented GCN and GAT to predict molecular properties and observed better results from GAT.\n",
11 | "Please refer to the following article for full details.\n",
12 | "Ryu, Seongok, Jaechang Lim, and Woo Youn Kim. \"Deeply learning molecular structure-property relationships using graph attention neural network.\" arXiv preprint arXiv:1805.10988 (2018).\n",
13 | "\n",
14 | "There is a key difference between node classification and the prediction of labels from whole-graph inputs. The latter task has to satisfy permutation invariance with respect to node ordering. Therefore, in this tutorial we will implement readout functions that satisfy permutation invariance."
15 | ]
16 | },
17 | {
18 | "cell_type": "code",
19 | "execution_count": 1,
20 | "metadata": {},
21 | "outputs": [
22 | {
23 | "name": "stderr",
24 | "output_type": "stream",
25 | "text": [
26 | "/Users/Lulu/anaconda3/lib/python3.6/site-packages/h5py/__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.\n",
27 | " from ._conv import register_converters as _register_converters\n"
28 | ]
29 | }
30 | ],
31 | "source": [
32 | "import tensorflow as tf\n",
33 | "from IPython.display import Image"
34 | ]
35 | },
36 | {
37 | "cell_type": "markdown",
38 | "metadata": {},
39 | "source": [
40 | "The overall architecture is shown below."
41 | ]
42 | },
43 | {
44 | "cell_type": "code",
45 | "execution_count": 2,
46 | "metadata": {},
47 | "outputs": [
48 | {
49 | "data": {
50 | "image/jpeg": "\n",
51 | "text/plain": [
52 | ""
53 | ]
54 | },
55 | "execution_count": 2,
56 | "metadata": {},
57 | "output_type": "execute_result"
58 | }
59 | ],
60 | "source": [
61 | "Image('./figures/gcn_prediction.jpg')"
62 | ]
63 | },
64 | {
65 | "cell_type": "markdown",
66 | "metadata": {},
67 | "source": [
68 | "Let's assume that the adjacency matrix and the feature matrix have the same shapes as those used in the 'gcn-node_classification' tutorial."
69 | ]
70 | },
71 | {
72 | "cell_type": "code",
73 | "execution_count": 3,
74 | "metadata": {},
75 | "outputs": [],
76 | "source": [
77 | "num_nodes = 50\n",
78 | "num_features = 50\n",
79 | "X = tf.placeholder(tf.float64, [None, num_nodes, num_features])\n",
80 | "A = tf.placeholder(tf.float64, [None, num_nodes, num_nodes])\n",
81 | "Y_truth = tf.placeholder(tf.float64, [None,])"
82 | ]
83 | },
84 | {
85 | "cell_type": "markdown",
86 | "metadata": {},
87 | "source": [
88 | "The implementation of the graph convolution layer is also the same."
89 | ]
90 | },
91 | {
92 | "cell_type": "code",
93 | "execution_count": 4,
94 | "metadata": {},
95 | "outputs": [],
96 | "source": [
97 | "def graph_conv(_X, _A, output_dim):\n",
98 | "    output = tf.layers.dense(_X, units=output_dim, use_bias=True)\n",
99 | "    output = tf.matmul(_A, output)\n",
100 | "    output = tf.nn.relu(output)\n",
101 | "    return output"
102 | ]
103 | },
104 | {
105 | "cell_type": "markdown",
106 | "metadata": {},
107 | "source": [
108 | "We have to implement the readout function as described above.\n",
109 | "There are two types of readout: node-wise summation (nw) and graph gathering (gg).\n",
110 | "Their equations are as follows, respectively.\n",
111 | "\n",
112 | "$$ R_{nw} = \\tau(\\sum_{i \\in G} MLP(H_{i}^{L})) $$\n",
113 | "$$ R_{gg} = \\tau(\\sum_{i \\in G} \\sigma(MLP_1(H_{i}^{L} | H_{i}^{0})) \\odot MLP_2(H_{i}^{L}))$$\n",
114 | "\n",
115 | "Notation:\n",
116 | "* $ \\tau $ : ReLU activation (or another non-linear activation)\n",
117 | "* $ \\sigma $ : sigmoid activation\n",
118 | "* $ \\odot $ : element-wise multiplication (Hadamard product)\n",
119 | "* $ (\\cdot|\\cdot) $ : concatenation\n",
120 | " \n",
121 | "Please refer to the following article for more details.\n",
122 | "Gilmer, Justin, et al. \"Neural message passing for quantum chemistry.\" arXiv preprint arXiv:1704.01212 (2017)."
123 | ]
124 | },
125 | {
126 | "cell_type": "code",
127 | "execution_count": 5,
128 | "metadata": {},
129 | "outputs": [],
130 | "source": [
131 | "def readout_nw(_X, output_dim):\n",
132 | "    # _X : final node embeddings\n",
133 | "    output = tf.layers.dense(_X, output_dim, use_bias=True)\n",
134 | "    output = tf.reduce_sum(output, axis=1)\n",
135 | "    output = tf.nn.relu(output)\n",
136 | "    \n",
137 | "    return output"
138 | ]
139 | },
140 | {
141 | "cell_type": "code",
142 | "execution_count": 6,
143 | "metadata": {},
144 | "outputs": [],
145 | "source": [
146 | "def readout_gg(_X, X, output_dim):\n",
147 | "    # _X : final node embeddings\n",
148 | "    # X : initial node features\n",
149 | "    val1 = tf.layers.dense(tf.concat([_X, X], axis=2), output_dim, use_bias=True)\n",
150 | "    val1 = tf.nn.sigmoid(val1)\n",
151 | "    val2 = tf.layers.dense(_X, output_dim, use_bias=True)\n",
152 | "    output = tf.multiply(val1, val2)\n",
153 | "    output = tf.reduce_sum(output, axis=1)\n",
154 | "    output = tf.nn.relu(output)\n",
155 | "    \n",
156 | "    return output"
157 | ]
158 | },
159 | {
160 | "cell_type": "markdown",
161 | "metadata": {},
162 | "source": [
163 | "We have now prepared all the functions the architecture needs.\n",
164 | "The overall architecture is implemented as below."
165 | ]
166 | },
167 | {
168 | "cell_type": "code",
169 | "execution_count": 7,
170 | "metadata": {},
171 | "outputs": [
172 | {
173 | "data": {
174 | "text/plain": [
175 | ""
176 | ]
177 | },
178 | "execution_count": 7,
179 | "metadata": {},
180 | "output_type": "execute_result"
181 | }
182 | ],
183 | "source": [
184 | "gconv1 = graph_conv(X, A, 32)\n",
185 | "gconv2 = graph_conv(gconv1, A, 32)\n",
186 | "gconv3 = graph_conv(gconv2, A, 32)\n",
187 | "graph_feature = readout_gg(gconv3, X, 128)\n",
188 | "graph_feature"
189 | ]
190 | },
191 | {
192 | "cell_type": "code",
193 | "execution_count": 8,
194 | "metadata": {},
195 | "outputs": [
196 | {
197 | "data": {
198 | "text/plain": [
199 | ""
200 | ]
201 | },
202 | "execution_count": 8,
203 | "metadata": {},
204 | "output_type": "execute_result"
205 | }
206 | ],
207 | "source": [
208 | "Y_pred = tf.layers.dense(graph_feature, 128, use_bias=True, activation=tf.nn.relu)\n",
209 | "Y_pred = tf.layers.dense(Y_pred, 128, use_bias=True, activation=tf.nn.tanh)\n",
210 | "Y_pred = tf.layers.dense(Y_pred, 1, use_bias=True, activation=None)\n",
211 | "Y_pred"
212 | ]
213 | },
214 | {
215 | "cell_type": "markdown",
216 | "metadata": {},
217 | "source": [
218 | "The loss function to be minimized in this task is the mean squared error (the squared L2 norm of the prediction error, averaged over the batch)."
219 | ]
220 | },
221 | {
222 | "cell_type": "code",
223 | "execution_count": 9,
224 | "metadata": {},
225 | "outputs": [
226 | {
227 | "data": {
228 | "text/plain": [
229 | ""
230 | ]
231 | },
232 | "execution_count": 9,
233 | "metadata": {},
234 | "output_type": "execute_result"
235 | }
236 | ],
237 | "source": [
238 | "Y_pred = tf.reshape(Y_pred, shape=[-1])\n",
239 | "Y_truth = tf.reshape(Y_truth, shape=[-1])\n",
240 | "loss = tf.reduce_mean(tf.pow(Y_truth - Y_pred,2))\n",
241 | "loss"
242 | ]
243 | },
244 | {
245 | "cell_type": "markdown",
246 | "metadata": {},
247 | "source": [
248 | "We have now completed all the necessary preparations for training.\n",
249 | "\n",
250 | "I have uploaded all the code that implements supervised learning for predicting molecular properties to the 'gnn-molecule' folder.\n",
251 | "Preprocessing scripts are also included. I hope you enjoy graph neural networks from this moment on!\n"
252 | ]
253 | }
254 | ],
255 | "metadata": {
256 | "kernelspec": {
257 | "display_name": "Python 3",
258 | "language": "python",
259 | "name": "python3"
260 | },
261 | "language_info": {
262 | "codemirror_mode": {
263 | "name": "ipython",
264 | "version": 3
265 | },
266 | "file_extension": ".py",
267 | "mimetype": "text/x-python",
268 | "name": "python",
269 | "nbconvert_exporter": "python",
270 | "pygments_lexer": "ipython3",
271 | "version": "3.6.5"
272 | }
273 | },
274 | "nbformat": 4,
275 | "nbformat_minor": 2
276 | }
277 |
--------------------------------------------------------------------------------
/A gentle introduction about graph neural networks.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/SeongokRyu/Graph-neural-networks/e76cb1ad187d19fccad233b42b0806575f8e05e5/A gentle introduction about graph neural networks.pdf
--------------------------------------------------------------------------------
/GCN-Thomas Kipf.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/SeongokRyu/Graph-neural-networks/e76cb1ad187d19fccad233b42b0806575f8e05e5/GCN-Thomas Kipf.pdf
--------------------------------------------------------------------------------
/How_Powerful_are_GNN.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/SeongokRyu/Graph-neural-networks/e76cb1ad187d19fccad233b42b0806575f8e05e5/How_Powerful_are_GNN.pdf
--------------------------------------------------------------------------------
/Inductive biases, graph neural networks, attention and relational inference.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/SeongokRyu/Graph-neural-networks/e76cb1ad187d19fccad233b42b0806575f8e05e5/Inductive biases, graph neural networks, attention and relational inference.pdf
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # Graph-neural-networks
2 | 
3 |
4 | Image source : https://arxiv.org/abs/1705.07664
5 |
6 | I've been using graph neural networks (GNNs) mainly for molecular applications, because molecular structures can be represented as graphs. GNNs are interesting in that they can effectively model relationships or interactions between objects in a system. GNNs have various applications, such as molecular applications, network analysis, and physics modeling.
7 |
8 | I will introduce GNNs in this repository, from theoretical background to implementations using TensorFlow. I hope you enjoy GNNs from this moment on.
9 |
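As a rough illustration of what the tutorials build, here is a minimal NumPy sketch (not the TensorFlow code from the notebooks) of one graph convolution step, $H^{l+1} = \sigma(A(H^{l}W^{l}+b^{l}))$ with ReLU as $\sigma$, followed by a sum readout over nodes. The function names, the random weights, and the toy path graph are assumptions made only for this example.

```python
import numpy as np


def graph_conv(H, A, W, b):
    """One graph convolution step: ReLU(A @ (H @ W + b))."""
    return np.maximum(A @ (H @ W + b), 0.0)


def sum_readout(H):
    """Permutation-invariant readout: sum node states over the node axis."""
    return H.sum(axis=0)


rng = np.random.default_rng(0)
num_nodes, in_dim, out_dim = 4, 3, 2

# Toy path graph 0-1-2-3 with self-loops (hypothetical example data).
A = np.eye(num_nodes)
for i, j in [(0, 1), (1, 2), (2, 3)]:
    A[i, j] = A[j, i] = 1.0

H = rng.normal(size=(num_nodes, in_dim))   # node features
W = rng.normal(size=(in_dim, out_dim))     # layer weights (random, untrained)
b = np.zeros(out_dim)                      # layer bias

H_new = graph_conv(H, A, W, b)
graph_vec = sum_readout(H_new)

# Relabeling the nodes permutes the rows of H_new but leaves the readout unchanged.
perm = rng.permutation(num_nodes)
H_perm = graph_conv(H[perm], A[perm][:, perm], W, b)

print(H_new.shape, graph_vec.shape)
print(np.allclose(H_perm, H_new[perm]), np.allclose(sum_readout(H_perm), graph_vec))
```

The last two checks show why node-level outputs suit node classification (they follow the node ordering) while a summed readout suits whole-graph prediction (it ignores the ordering).
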
10 | ## References (continually updated) :
11 | ### Geometric Deep Learning and Surveys on Graph Neural Networks
12 | * Bronstein, Michael M., et al. "Geometric deep learning: going beyond euclidean data." IEEE Signal Processing Magazine 34.4 (2017): 18-42.
13 | * [NIPS 2017] Tutorial - Geometric deep learning on graphs and manifolds, https://nips.cc/Conferences/2017/Schedule?showEvent=8735
14 | * Goyal, Palash, and Emilio Ferrara. "Graph embedding techniques, applications, and performance: A survey." Knowledge-Based Systems 151 (2018): 78-94., https://github.com/palash1992/GEM
15 | * Awesome Graph Embedding And Representation Learning Papers, https://github.com/benedekrozemberczki/awesome-graph-embedding
16 | * Battaglia, Peter W., et al. "Relational inductive biases, deep learning, and graph networks." arXiv preprint arXiv:1806.01261 (2018).
17 |
18 | ### Graph Convolution Network (GCN)
19 | * Defferrard, Michaël, Xavier Bresson, and Pierre Vandergheynst. "Convolutional neural networks on graphs with fast localized spectral filtering." Advances in Neural Information Processing Systems. 2016.
20 | * Kipf, Thomas N., and Max Welling. "Semi-supervised classification with graph convolutional networks." arXiv preprint arXiv:1609.02907 (2016).
21 | * van den Berg, Rianne, Thomas N. Kipf, and Max Welling. "Graph Convolutional Matrix Completion." stat 1050 (2017): 7.
22 | * Schlichtkrull, Michael, et al. "Modeling relational data with graph convolutional networks." European Semantic Web Conference. Springer, Cham, 2018.
23 | * Levie, Ron, et al. "Cayleynets: Graph convolutional neural networks with complex rational spectral filters." arXiv preprint arXiv:1705.07664 (2017).
24 |
25 | ### Attention mechanism in GNN
26 | * Velickovic, Petar, et al. "Graph attention networks." arXiv preprint arXiv:1710.10903 (2017).
27 | * GRAM: Graph-based Attention Model for Healthcare Representation Learning
28 | * Lee, John Boaz, et al. "Attention Models in Graphs: A Survey." arXiv preprint arXiv:1807.07984 (2018).
29 |
30 | ### Message Passing Neural Network (MPNN)
31 | * Li, Yujia, et al. "Gated graph sequence neural networks." arXiv preprint arXiv:1511.05493 (2015).
32 | * Gilmer, J., Schoenholz, S. S., Riley, P. F., Vinyals, O., & Dahl, G. E. (2017). Neural message passing for quantum chemistry. arXiv preprint arXiv:1704.01212.
33 |
34 | ### Graph Autoencoder and Graph Generative Models
35 | * Kipf, Thomas N., and Max Welling. "Variational graph auto-encoders." arXiv preprint arXiv:1611.07308 (2016).
36 | * Simonovsky, Martin, and Nikos Komodakis. "GraphVAE: Towards Generation of Small Graphs Using Variational Autoencoders." arXiv preprint arXiv:1802.03480 (2018).
37 | * Liu, Qi, et al. "Constrained Graph Variational Autoencoders for Molecule Design." arXiv preprint arXiv:1805.09076 (2018).
38 | * Pan, Shirui, et al. "Adversarially Regularized Graph Autoencoder." arXiv preprint arXiv:1802.04407 (2018).
39 | * Li, Y., Vinyals, O., Dyer, C., Pascanu, R., & Battaglia, P. (2018). Learning deep generative models of graphs. arXiv preprint arXiv:1803.03324.
40 |
41 |
42 | ### Applications of GNN
43 | * Duvenaud, David K., et al. "Convolutional networks on graphs for learning molecular fingerprints." Advances in neural information processing systems. 2015.
44 | * Kearnes, Steven, et al. "Molecular graph convolutions: moving beyond fingerprints." Journal of computer-aided molecular design 30.8 (2016): 595-608.
45 | * Schütt, K. T., Arbabzadah, F., Chmiela, S., Müller, K. R., & Tkatchenko, A. (2017). Quantum-chemical insights from deep tensor neural networks. Nature communications, 8, 13890.
46 | * Wu, Z., Ramsundar, B., Feinberg, E. N., Gomes, J., Geniesse, C., Pappu, A. S., ... & Pande, V. (2018). MoleculeNet: a benchmark for molecular machine learning. Chemical Science, 9(2), 513-530.
47 | * Shang, C., Liu, Q., Chen, K. S., Sun, J., Lu, J., Yi, J., & Bi, J. (2018). Edge Attention-based Multi-Relational Graph Convolutional Networks. arXiv preprint arXiv:1802.04944.
48 | * Feinberg, Evan N., et al. "Spatial Graph Convolutions for Drug Discovery." arXiv preprint arXiv:1803.04465 (2018).
49 | * Jin, Wengong, Regina Barzilay, and Tommi Jaakkola. "Junction Tree Variational Autoencoder for Molecular Graph Generation." arXiv preprint arXiv:1802.04364 (2018).
50 | * Liu, Qi, et al. "Constrained Graph Variational Autoencoders for Molecule Design." arXiv preprint arXiv:1805.09076 (2018).
51 | * De Cao, Nicola, and Thomas Kipf. "MolGAN: An implicit generative model for small molecular graphs." arXiv preprint arXiv:1805.11973 (2018).
52 | * Selvan, Raghavendra, et al. "Extraction of Airways using Graph Neural Networks." arXiv preprint arXiv:1804.04436 (2018).
53 | * Kipf, Thomas, et al. "Neural relational inference for interacting systems." arXiv preprint arXiv:1802.04687 (2018).
54 |
--------------------------------------------------------------------------------
/figures/gat_attention.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/SeongokRyu/Graph-neural-networks/e76cb1ad187d19fccad233b42b0806575f8e05e5/figures/gat_attention.png
--------------------------------------------------------------------------------
/figures/gcn_prediction.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/SeongokRyu/Graph-neural-networks/e76cb1ad187d19fccad233b42b0806575f8e05e5/figures/gcn_prediction.jpg
--------------------------------------------------------------------------------
/figures/graph_conv.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/SeongokRyu/Graph-neural-networks/e76cb1ad187d19fccad233b42b0806575f8e05e5/figures/graph_conv.jpg
--------------------------------------------------------------------------------
/figures/gru.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/SeongokRyu/Graph-neural-networks/e76cb1ad187d19fccad233b42b0806575f8e05e5/figures/gru.png
--------------------------------------------------------------------------------
/gnn.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/SeongokRyu/Graph-neural-networks/e76cb1ad187d19fccad233b42b0806575f8e05e5/gnn.png
--------------------------------------------------------------------------------
/tutorials/.ipynb_checkpoints/ggnn-checkpoint.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "markdown",
5 | "metadata": {},
6 | "source": [
7 | "# Gated Graph Neural Network\n",
8 | "\n",
9 | "We have seen that GCN and GAT are CNN-like versions of graph neural networks. GGNN, on the other hand, is an RNN-like version of the node updating method. \n",
10 | "\n",
11 | "First, let's look at the message passing neural network (MPNN) framework. The MPNN framework updates the root node with the following formula.\n",
12 | "\n",
13 | "$$ H_{i}^{(l+1)} = U(H_{i}^{(l)}, m^{(l+1)}) $$\n",
14 | "\n",
15 | "The i-th node, which is the root node, is updated using the message state $m^{(l+1)}$ from its neighboring nodes and its previous node state $H_{i}^{(l)}$.\n",
16 | "\n",
17 | "The message state update can be written in general form as follows.\n",
18 | "\n",
19 | "$$ m^{(l+1)} = \\sum_{j \\in N_{i}} M(H_i^{(l)}, H_j^{(l)}, e_{ij}) $$\n",
20 | "\n",
21 | "If we know the initial edge information $e_{ij}$, we can update the message states differently for different relations. For example, a single bond, a double bond and an aromatic bond will each transfer a different message to the root node.\n",
22 | "For simplicity, we will only consider connectivity between node pairs, i.e., $A_{ij} = 1$ for connected node pairs and zero otherwise.\n",
23 | "\n",
24 | "In the GGNN framework, the message function is defined as a simple summation of the neighbor node states.\n",
25 | "\n",
26 | "$$ m^{(l+1)} = \\sum_{j \\in N_{i}} H_j^{(l)} $$\n",
27 | "\n",
28 | "A gated recurrent unit (GRU) is used for the node update. The node update can therefore be rewritten as follows.\n",
29 | "\n",
30 | "$$ H_i^{(l+1)} = GRU(H_i^{(l)}, \\sum_{j \\in N_i} H_j^{(l)}) $$\n",
31 | "\n",
32 | "We will implement the updating function in the GGNN framework."
33 | ]
34 | },
35 | {
36 | "cell_type": "code",
37 | "execution_count": 1,
38 | "metadata": {},
39 | "outputs": [
40 | {
41 | "name": "stderr",
42 | "output_type": "stream",
43 | "text": [
44 | "/Users/Lulu/anaconda3/lib/python3.6/site-packages/h5py/__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.\n",
45 | " from ._conv import register_converters as _register_converters\n"
46 | ]
47 | }
48 | ],
49 | "source": [
50 | "import tensorflow as tf"
51 | ]
52 | },
53 | {
54 | "cell_type": "code",
55 | "execution_count": 2,
56 | "metadata": {},
57 | "outputs": [],
58 | "source": [
59 | "def ggnn(_X, _A, output_dim, num_layer):\n",
60 | "    num_nodes = int(_X.get_shape()[1])\n",
61 | "    input_dim = int(_X.get_shape()[2])\n",
62 | "    \n",
63 | "    if input_dim != output_dim:\n",
64 | "        _X = tf.layers.dense(_X, units=output_dim, use_bias=False)\n",
65 | "    \n",
66 | "    # Message state\n",
67 | "    _m = tf.matmul(_A, _X)\n",
68 | "    \n",
69 | "    # Update node state using GRU cell\n",
70 | "    X_total = []\n",
71 | "    cell = tf.contrib.rnn.GRUCell(output_dim, name='GRUcell'+str(num_layer))\n",
72 | "    \n",
73 | "    for i in range(num_nodes):\n",
74 | "        mi = tf.expand_dims(_m[:,i,:],1)\n",
75 | "        hi = _X[:,i,:]\n",
76 | "        \n",
77 | "        _, _h = tf.nn.dynamic_rnn(cell, mi, initial_state=hi)\n",
78 | "        X_total.append(tf.expand_dims(_h, 1))\n",
79 | "    \n",
80 | "    output = tf.concat(X_total, 1)\n",
81 | "    \n",
82 | "    return output"
83 | ]
84 | },
85 | {
86 | "cell_type": "markdown",
87 | "metadata": {},
88 | "source": [
89 | "Let's check if our code is correct."
90 | ]
91 | },
92 | {
93 | "cell_type": "code",
94 | "execution_count": 3,
95 | "metadata": {},
96 | "outputs": [
97 | {
98 | "data": {
99 | "text/plain": [
100 | ""
101 | ]
102 | },
103 | "execution_count": 3,
104 | "metadata": {},
105 | "output_type": "execute_result"
106 | }
107 | ],
108 | "source": [
109 | "X = tf.placeholder(tf.float64, [None, 50, 58])\n",
110 | "A = tf.placeholder(tf.float64, [None, 50, 50])\n",
111 | "\n",
112 | "ggnn1 = ggnn(X, A, 32, 1)\n",
113 | "ggnn1"
114 | ]
115 | },
116 | {
117 | "cell_type": "markdown",
118 | "metadata": {},
119 | "source": [
120 | "That's right. We implemented the GGNN node updating method simply by using a GRU cell.\n",
121 | "\n",
122 | "However, I have implemented it here using a for loop. I wonder if it can be implemented better in TensorFlow. In particular, when a graph neural network is applied to molecules, a dynamic computational graph should be used because the number of atoms varies. If you know how to do this, please comment."
123 | ]
124 | }
125 | ],
126 | "metadata": {
127 | "kernelspec": {
128 | "display_name": "Python 3",
129 | "language": "python",
130 | "name": "python3"
131 | },
132 | "language_info": {
133 | "codemirror_mode": {
134 | "name": "ipython",
135 | "version": 3
136 | },
137 | "file_extension": ".py",
138 | "mimetype": "text/x-python",
139 | "name": "python",
140 | "nbconvert_exporter": "python",
141 | "pygments_lexer": "ipython3",
142 | "version": "3.6.5"
143 | }
144 | },
145 | "nbformat": 4,
146 | "nbformat_minor": 2
147 | }
148 |
--------------------------------------------------------------------------------
/tutorials/gcn_classification.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "markdown",
5 | "metadata": {},
6 | "source": [
7 | "# Graph Convolution Network\n",
8 | "\n",
9 | "A tutorial for practicing the implementation of a graph convolution network (GCN).\n",
10 | "We will implement graph convolution to classify the nodes of a network graph.\n",
11 | "The implementation is very similar to the one in T. Kipf's paper \"Semi-supervised classification with graph convolutional networks\".\n",
12 | "See T. Kipf's GitHub repository: https://github.com/tkipf/gcn"
13 | ]
14 | },
15 | {
16 | "cell_type": "code",
17 | "execution_count": 1,
18 | "metadata": {},
19 | "outputs": [
20 | {
21 | "name": "stderr",
22 | "output_type": "stream",
23 | "text": [
24 | "/Users/Lulu/anaconda3/lib/python3.6/site-packages/h5py/__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.\n",
25 | " from ._conv import register_converters as _register_converters\n"
26 | ]
27 | }
28 | ],
29 | "source": [
30 | "import tensorflow as tf\n",
31 | "from IPython.display import Image"
32 | ]
33 | },
34 | {
35 | "cell_type": "markdown",
36 | "metadata": {},
37 | "source": [
38 | "Assume that the graph inputs (adjacency matrix and node features) are given, and that the number of nodes and the number of features are both 50.\n",
39 | "The shape of the node feature matrix is therefore (batch_size, num_nodes, num_features), and the shape of the adjacency matrix is (batch_size, num_nodes, num_nodes).\n",
40 | "The number of labels is 10."
41 | ]
42 | },
43 | {
44 | "cell_type": "code",
45 | "execution_count": 2,
46 | "metadata": {},
47 | "outputs": [],
48 | "source": [
49 | "num_nodes = 50\n",
50 | "num_features = 50\n",
51 | "num_labels = 10\n",
52 | "X = tf.placeholder(tf.float64, shape=(None, num_nodes, num_features))\n",
53 | "A = tf.placeholder(tf.float64, shape=(None, num_nodes, num_nodes))\n",
54 | "Y_truth = tf.placeholder(tf.float64, shape=(None, num_nodes, num_labels))"
55 | ]
56 | },
57 | {
58 | "cell_type": "markdown",
59 | "metadata": {},
60 | "source": [
61 | "The equation of graph convolution is as below.\n",
62 | "$$ H^{l+1} = \\sigma(AH^{l}W^{l}) $$\n",
63 | "We will implement this equation in a function named graph_conv. The function receives the l-th layer node features, the adjacency matrix, and the output dimension of the node features as inputs. \n",
64 | "\n",
65 | "The original equation in T. Kipf's paper, shown above, does not use a bias term. However, I think using a bias is necessary, as below, because the bias term shifts the decision boundary. \n",
66 | "$$ H^{l+1} = \\sigma(A(H^{l}W^{l}+b^{l})) $$\n",
67 | "Therefore, I set 'use_bias=True' in the dense layer of the graph convolution."
68 | ]
69 | },
70 | {
71 | "cell_type": "code",
72 | "execution_count": 3,
73 | "metadata": {},
74 | "outputs": [],
75 | "source": [
76 | "def graph_conv(_X, _A, output_dim):\n",
77 | "    output = tf.layers.dense(_X, units=output_dim, use_bias=True)\n",
78 | "    output = tf.matmul(_A, output)\n",
79 | "    output = tf.nn.relu(output)\n",
80 | "    return output"
81 | ]
82 | },
83 | {
84 | "cell_type": "code",
85 | "execution_count": 4,
86 | "metadata": {},
87 | "outputs": [
88 | {
89 | "data": {
90 | "text/plain": [
91 | ""
92 | ]
93 | },
94 | "execution_count": 4,
95 | "metadata": {},
96 | "output_type": "execute_result"
97 | }
98 | ],
99 | "source": [
100 | "X_new = graph_conv(X, A, 32)\n",
101 | "X_new"
102 | ]
103 | },
104 | {
105 | "cell_type": "markdown",
106 | "metadata": {},
107 | "source": [
108 | "After a single graph convolution, we can confirm that the dimension of the node features is transformed from 50 to 32.\n",
109 | "We want to build a graph convolution network with three graph convolution layers and a softmax classifier, as below."
110 | ]
111 | },
112 | {
113 | "cell_type": "code",
114 | "execution_count": 5,
115 | "metadata": {},
116 | "outputs": [
117 | {
118 | "data": {
119 | "image/jpeg": "\n",
120 | "text/plain": [
121 | ""
122 | ]
123 | },
124 | "execution_count": 5,
125 | "metadata": {},
126 | "output_type": "execute_result"
127 | }
128 | ],
129 | "source": [
130 | "Image('./figures/graph_conv.jpg')"
131 | ]
132 | },
133 | {
134 | "cell_type": "code",
135 | "execution_count": 6,
136 | "metadata": {},
137 | "outputs": [
138 | {
139 | "data": {
140 | "text/plain": [
141 | ""
142 | ]
143 | },
144 | "execution_count": 6,
145 | "metadata": {},
146 | "output_type": "execute_result"
147 | }
148 | ],
149 | "source": [
150 | "gconv1 = graph_conv(X, A, 32)\n",
151 | "gconv2 = graph_conv(gconv1, A, 32)\n",
152 | "gconv3 = graph_conv(gconv2, A, 32)\n",
153 | "Y_pred = tf.nn.softmax(tf.layers.dense(gconv3, units=num_labels, use_bias=True), axis=2)\n",
154 | "Y_pred"
155 | ]
156 | },
157 | {
158 | "cell_type": "markdown",
159 | "metadata": {},
160 | "source": [
161 | "The shape of the final output is [batch_size, num_nodes=50, num_labels=10].\n",
162 | "Finally, we set the loss function to the cross-entropy loss, because this is a classification task."
163 | ]
164 | },
165 | {
166 | "cell_type": "code",
167 | "execution_count": 7,
168 | "metadata": {},
169 | "outputs": [],
170 | "source": [
171 | "# Y_truth is expected to be one-hot with the same shape as Y_pred: [batch_size, num_nodes, num_labels]\n",
172 | "loss = -tf.reduce_mean(tf.reduce_sum(Y_truth*tf.log(Y_pred+1e-5), axis=2))"
173 | ]
174 | },
175 | {
176 | "cell_type": "markdown",
177 | "metadata": {},
178 | "source": [
179 | "Alternatively, you can use one of the cross-entropy loss functions already implemented in TensorFlow.\n",
180 | "As is usual in supervised learning, we have to minimize the loss function.\n",
181 | "We do not have training data in this tutorial, so we will skip the training. "
182 | ]
183 | }
184 | ],
185 | "metadata": {
186 | "kernelspec": {
187 | "display_name": "Python 3",
188 | "language": "python",
189 | "name": "python3"
190 | },
191 | "language_info": {
192 | "codemirror_mode": {
193 | "name": "ipython",
194 | "version": 3
195 | },
196 | "file_extension": ".py",
197 | "mimetype": "text/x-python",
198 | "name": "python",
199 | "nbconvert_exporter": "python",
200 | "pygments_lexer": "ipython3",
201 | "version": "3.6.5"
202 | }
203 | },
204 | "nbformat": 4,
205 | "nbformat_minor": 2
206 | }
207 |
--------------------------------------------------------------------------------
/tutorials/gcn_prediction.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "markdown",
5 | "metadata": {},
6 | "source": [
7 | "# Prediction of molecular properties using GCN\n",
8 | "\n",
9 | "A tutorial on supervised learning of graph-level labels from graph inputs.\n",
10 | "I implemented GCN and GAT to predict molecular properties and observed better results from GAT.\n",
11 | "Please refer to the following article for full details.\n",
12 | "Ryu, Seongok, Jaechang Lim, and Woo Youn Kim. \"Deeply learning molecular structure-property relationships using graph attention neural network.\" arXiv preprint arXiv:1805.10988 (2018).\n",
13 | "\n",
14 | "There is a key difference between node classification and the prediction of labels from whole-graph inputs. The latter task has to satisfy permutation invariance with respect to node ordering. Therefore, in this tutorial we will implement readout functions that satisfy permutation invariance."
15 | ]
16 | },
17 | {
18 | "cell_type": "code",
19 | "execution_count": 1,
20 | "metadata": {},
21 | "outputs": [
22 | {
23 | "name": "stderr",
24 | "output_type": "stream",
25 | "text": [
26 | "/Users/Lulu/anaconda3/lib/python3.6/site-packages/h5py/__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.\n",
27 | " from ._conv import register_converters as _register_converters\n"
28 | ]
29 | }
30 | ],
31 | "source": [
32 | "import tensorflow as tf\n",
33 | "from IPython.display import Image"
34 | ]
35 | },
36 | {
37 | "cell_type": "markdown",
38 | "metadata": {},
39 | "source": [
40 | "The overall architecture is shown below."
41 | ]
42 | },
43 | {
44 | "cell_type": "code",
45 | "execution_count": 2,
46 | "metadata": {},
47 | "outputs": [
48 | {
49 | "data": {
50 | "image/jpeg": "\n",
51 | "text/plain": [
52 | ""
53 | ]
54 | },
55 | "execution_count": 2,
56 | "metadata": {},
57 | "output_type": "execute_result"
58 | }
59 | ],
60 | "source": [
61 | "Image('./figures/gcn_prediction.jpg')"
62 | ]
63 | },
64 | {
65 | "cell_type": "markdown",
66 | "metadata": {},
67 | "source": [
68 | "Let's assume that the adjacency matrix and the feature matrix have the same shapes as those used in the 'gcn-node_classification' tutorial."
69 | ]
70 | },
71 | {
72 | "cell_type": "code",
73 | "execution_count": 3,
74 | "metadata": {},
75 | "outputs": [],
76 | "source": [
77 | "num_nodes = 50\n",
78 | "num_features = 50\n",
79 | "X = tf.placeholder(tf.float64, [None, num_nodes, num_features])\n",
80 | "A = tf.placeholder(tf.float64, [None, num_nodes, num_nodes])\n",
81 | "Y_truth = tf.placeholder(tf.float64, [None,])"
82 | ]
83 | },
84 | {
85 | "cell_type": "markdown",
86 | "metadata": {},
87 | "source": [
88 | "The implementation of the graph convolution layer is also the same."
89 | ]
90 | },
91 | {
92 | "cell_type": "code",
93 | "execution_count": 4,
94 | "metadata": {},
95 | "outputs": [],
96 | "source": [
97 | "def graph_conv(_X, _A, output_dim):\n",
98 | "    output = tf.layers.dense(_X, units=output_dim, use_bias=True)\n",
99 | "    output = tf.matmul(_A, output)\n",
100 | "    output = tf.nn.relu(output)\n",
101 | "    return output"
102 | ]
103 | },
104 | {
105 | "cell_type": "markdown",
106 | "metadata": {},
107 | "source": [
108 | "We have to implement the readout function as described above.
\n",
109 | "There are two types of readout: node-wise summation (nw) and graph gathering (gg).
\n",
110 | "Their equations are as follows, respectively.\n",
111 | "\n",
112 | "$$ R_{nw} = \\tau(\\sum_{i \\in G} MLP(H_{i}^{L})) $$\n",
113 | "$$ R_{gg} = \\tau(\\sum_{i \\in G} \\sigma(MLP_1(H_{i}^{L} | H_{i}^{0})) \\odot MLP_2(H_{i}^{L}))$$\n",
114 | "\n",
115 | "Notations :\n",
116 | "* $ \\tau $ : ReLU activation (or other non-linear activations)\n",
117 | "* $ \\sigma $ : sigmoid activation\n",
118 | "* $ \\odot $ : element-wise multiplication (Hadamard product)\n",
119 | "* $ (\\cdot|\\cdot) $ : concatenation\n",
120 | " \n",
121 | "Please refer to the following article for more detail.
\n",
122 | "Gilmer, Justin, et al. \"Neural message passing for quantum chemistry.\" arXiv preprint arXiv:1704.01212 (2017)."
123 | ]
124 | },
125 | {
126 | "cell_type": "code",
127 | "execution_count": 5,
128 | "metadata": {},
129 | "outputs": [],
130 | "source": [
131 | "def readout_nw(_X, output_dim):\n",
132 | "    # _X : final node embeddings\n",
133 | "    output = tf.layers.dense(_X, output_dim, use_bias=True)\n",
134 | "    output = tf.reduce_sum(output, axis=1)\n",
135 | "    output = tf.nn.relu(output)\n",
136 | "    \n",
137 | "    return output"
138 | ]
139 | },
140 | {
141 | "cell_type": "code",
142 | "execution_count": 6,
143 | "metadata": {},
144 | "outputs": [],
145 | "source": [
146 | "def readout_gg(_X, X, output_dim):\n",
147 | "    # _X : final node embeddings\n",
148 | "    # X : initial node features\n",
149 | "    val1 = tf.layers.dense(tf.concat([_X, X], axis=2), output_dim, use_bias=True)\n",
150 | "    val1 = tf.nn.sigmoid(val1)\n",
151 | "    val2 = tf.layers.dense(_X, output_dim, use_bias=True)\n",
152 | "    output = tf.multiply(val1, val2)\n",
153 | "    output = tf.reduce_sum(output, axis=1)\n",
154 | "    output = tf.nn.relu(output)\n",
155 | "    \n",
156 | "    return output"
157 | ]
158 | },
159 | {
160 | "cell_type": "markdown",
161 | "metadata": {},
162 | "source": [
163 | "We have now prepared all the functions needed for the architecture.
\n",
164 | "The overall architecture can therefore be implemented as below."
165 | ]
166 | },
167 | {
168 | "cell_type": "code",
169 | "execution_count": 7,
170 | "metadata": {},
171 | "outputs": [
172 | {
173 | "data": {
174 | "text/plain": [
175 | ""
176 | ]
177 | },
178 | "execution_count": 7,
179 | "metadata": {},
180 | "output_type": "execute_result"
181 | }
182 | ],
183 | "source": [
184 | "gconv1 = graph_conv(X, A, 32)\n",
185 | "gconv2 = graph_conv(gconv1, A, 32)\n",
186 | "gconv3 = graph_conv(gconv2, A, 32)\n",
187 | "graph_feature = readout_gg(gconv3, gconv1, 128)\n",
188 | "graph_feature"
189 | ]
190 | },
191 | {
192 | "cell_type": "code",
193 | "execution_count": 8,
194 | "metadata": {},
195 | "outputs": [
196 | {
197 | "data": {
198 | "text/plain": [
199 | ""
200 | ]
201 | },
202 | "execution_count": 8,
203 | "metadata": {},
204 | "output_type": "execute_result"
205 | }
206 | ],
207 | "source": [
208 | "Y_pred = tf.layers.dense(graph_feature, 128, use_bias=True, activation=tf.nn.relu)\n",
209 | "Y_pred = tf.layers.dense(Y_pred, 128, use_bias=True, activation=tf.nn.tanh)\n",
210 | "Y_pred = tf.layers.dense(Y_pred, 1, use_bias=True, activation=None)\n",
211 | "Y_pred"
212 | ]
213 | },
214 | {
215 | "cell_type": "markdown",
216 | "metadata": {},
217 | "source": [
218 | "The loss function to be minimized in this task is the mean squared error, i.e. the mean of the squared differences between the true and predicted labels."
219 | ]
220 | },
221 | {
222 | "cell_type": "code",
223 | "execution_count": 9,
224 | "metadata": {},
225 | "outputs": [
226 | {
227 | "data": {
228 | "text/plain": [
229 | ""
230 | ]
231 | },
232 | "execution_count": 9,
233 | "metadata": {},
234 | "output_type": "execute_result"
235 | }
236 | ],
237 | "source": [
238 | "Y_pred = tf.reshape(Y_pred, shape=[-1])\n",
239 | "Y_truth = tf.reshape(Y_truth, shape=[-1])\n",
240 | "loss = tf.reduce_mean(tf.pow(Y_truth - Y_pred,2))\n",
241 | "loss"
242 | ]
243 | },
244 | {
245 | "cell_type": "markdown",
246 | "metadata": {},
247 | "source": [
248 | "Yes, we have completed all necessary preparations for training.\n",
249 | "\n",
250 | "I have uploaded all the code for supervised learning of molecular property prediction to the 'gnn-molecule' folder.
\n",
251 | "Scripts for preprocessing are also included. I hope you enjoy graph neural networks from this moment on!\n"
252 | ]
253 | }
254 | ],
255 | "metadata": {
256 | "kernelspec": {
257 | "display_name": "Python 3",
258 | "language": "python",
259 | "name": "python3"
260 | },
261 | "language_info": {
262 | "codemirror_mode": {
263 | "name": "ipython",
264 | "version": 3
265 | },
266 | "file_extension": ".py",
267 | "mimetype": "text/x-python",
268 | "name": "python",
269 | "nbconvert_exporter": "python",
270 | "pygments_lexer": "ipython3",
271 | "version": "3.6.5"
272 | }
273 | },
274 | "nbformat": 4,
275 | "nbformat_minor": 2
276 | }
277 |
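The permutation invariance claimed for the readout functions can be checked numerically. The following is a minimal NumPy sketch, not the TensorFlow code from the notebook: `W` and `b` are hypothetical weights of a single dense layer standing in for the MLP, and the tensor sizes are toy values chosen for illustration.

```python
import numpy as np

def readout_nw(H, W, b):
    """Node-wise summation readout: ReLU(sum_i (H_i W + b)).

    H: [batch, num_nodes, dim_in] final node embeddings.
    W, b: weights of one dense layer standing in for the MLP (an assumption).
    """
    node_out = H @ W + b               # same dense map applied to every node
    graph_out = node_out.sum(axis=1)   # sum over the node axis
    return np.maximum(graph_out, 0.0)  # ReLU

rng = np.random.default_rng(0)
H = rng.normal(size=(2, 5, 4))   # 2 graphs, 5 nodes, 4 features (toy sizes)
W = rng.normal(size=(4, 3))
b = rng.normal(size=3)

perm = rng.permutation(5)        # an arbitrary reordering of the nodes
out = readout_nw(H, W, b)
out_perm = readout_nw(H[:, perm, :], W, b)

# The two results should agree up to floating-point rounding.
print(np.allclose(out, out_perm))
```

Because the dense layer is applied to each node independently and the sum over nodes does not depend on their order, reordering the nodes should leave the graph-level output unchanged. The same argument applies to the graph gathering readout, as long as the initial features are permuted together with the final embeddings.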
--------------------------------------------------------------------------------
/tutorials/ggnn.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "markdown",
5 | "metadata": {},
6 | "source": [
7 | "# Gated Graph Neural Network\n",
8 | "\n",
9 | "We have seen that GCN and GAT are CNN-like versions of graph neural networks. GGNN, on the other hand, is an RNN-like version of the node updating method. \n",
10 | "\n",
11 | "First, let's look at the message passing neural network (MPNN) framework. The MPNN framework updates the root node with the following formula.
\n",
12 | "\n",
13 | "$$ H_{i}^{(l+1)} = U(H_{i}^{(l)}, m^{(l+1)}) $$\n",
14 | "\n",
15 | "The i-th node, which is the root node, is updated using the message state $m^{(l+1)}$ from its neighboring nodes and its previous node state $H_i^{(l)}$.
\n",
16 | "\n",
17 | "The message state update can be written in general form as follows.\n",
18 | "\n",
19 | "$$ m^{(l+1)} = \\sum_{j \\in N_{i}} M(H_i^{(l)}, H_j^{(l)}, e_{ij}) $$\n",
20 | "\n",
21 | "If we know the initial edge information $e_{ij}$, we can update the message states differently for different relations. For example, a single bond, a double bond and an aromatic bond will each pass a different message to the root node.
\n",
22 | "For simplicity, we will consider only the connectivity between node pairs, i.e., $A_{ij} = 1$ for connected node pairs and zero otherwise.\n",
23 | "\n",
24 | "In the GGNN framework, the message function is defined as a simple summation of the neighbor node states.\n",
25 | "\n",
26 | "$$ m^{(l+1)} = \\sum_{j \\in N_{i}} H_j^{(l)} $$\n",
27 | "\n",
28 | "The gated recurrent unit (GRU) is then used for the node update. The node update can finally be rewritten as follows.\n",
29 | "\n",
30 | "$$ H_i^{(l+1)} = GRU(H_i^{(l)}, \\sum_{j \\in N_i} H_j^{(l)}) $$\n",
31 | "\n",
32 | "We will implement the updating function in the GGNN framework."
33 | ]
34 | },
35 | {
36 | "cell_type": "code",
37 | "execution_count": 1,
38 | "metadata": {},
39 | "outputs": [
40 | {
41 | "name": "stderr",
42 | "output_type": "stream",
43 | "text": [
44 | "/Users/Lulu/anaconda3/lib/python3.6/site-packages/h5py/__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.\n",
45 | " from ._conv import register_converters as _register_converters\n"
46 | ]
47 | }
48 | ],
49 | "source": [
50 | "import tensorflow as tf"
51 | ]
52 | },
53 | {
54 | "cell_type": "code",
55 | "execution_count": 2,
56 | "metadata": {},
57 | "outputs": [],
58 | "source": [
59 | "def ggnn(_X, _A, output_dim, num_layer):\n",
60 | "    num_nodes = int(_X.get_shape()[1])\n",
61 | "    input_dim = int(_X.get_shape()[2])\n",
62 | "    \n",
63 | "    if input_dim != output_dim:\n",
64 | "        _X = tf.layers.dense(_X, units=output_dim, use_bias=False)\n",
65 | "    \n",
66 | "    # Message state\n",
67 | "    _m = tf.matmul(_A, _X)\n",
68 | "    \n",
69 | "    # Update node state using GRU cell\n",
70 | "    X_total = []\n",
71 | "    cell = tf.contrib.rnn.GRUCell(output_dim, name='GRUcell'+str(num_layer))\n",
72 | "    \n",
73 | "    for i in range(num_nodes):\n",
74 | "        mi = tf.expand_dims(_m[:,i,:],1)\n",
75 | "        hi = _X[:,i,:]\n",
76 | "        \n",
77 | "        _, _h = tf.nn.dynamic_rnn(cell, mi, initial_state=hi)\n",
78 | "        X_total.append(tf.expand_dims(_h, 1))\n",
79 | "    \n",
80 | "    output = tf.concat(X_total, 1)\n",
81 | "    \n",
82 | "    return output"
83 | ]
84 | },
85 | {
86 | "cell_type": "markdown",
87 | "metadata": {},
88 | "source": [
89 | "Let's check if our code is correct."
90 | ]
91 | },
92 | {
93 | "cell_type": "code",
94 | "execution_count": 3,
95 | "metadata": {},
96 | "outputs": [
97 | {
98 | "data": {
99 | "text/plain": [
100 | ""
101 | ]
102 | },
103 | "execution_count": 3,
104 | "metadata": {},
105 | "output_type": "execute_result"
106 | }
107 | ],
108 | "source": [
109 | "X = tf.placeholder(tf.float64, [None, 50, 58])\n",
110 | "A = tf.placeholder(tf.float64, [None, 50, 50])\n",
111 | "\n",
112 | "ggnn1 = ggnn(X, A, 32, 1)\n",
113 | "ggnn1"
114 | ]
115 | },
116 | {
117 | "cell_type": "markdown",
118 | "metadata": {},
119 | "source": [
120 | "That's right. We implemented the GGNN node update simply by using a GRU cell.\n",
121 | "\n",
122 | "However, I implemented it here with a for loop over the nodes, and I wonder whether it can be implemented better in TensorFlow. In particular, when a graph neural network is applied to molecules, a dynamic computational graph is needed because the number of atoms varies. If you know a better way, please comment."
123 | ]
124 | }
125 | ],
126 | "metadata": {
127 | "kernelspec": {
128 | "display_name": "Python 3",
129 | "language": "python",
130 | "name": "python3"
131 | },
132 | "language_info": {
133 | "codemirror_mode": {
134 | "name": "ipython",
135 | "version": 3
136 | },
137 | "file_extension": ".py",
138 | "mimetype": "text/x-python",
139 | "name": "python",
140 | "nbconvert_exporter": "python",
141 | "pygments_lexer": "ipython3",
142 | "version": "3.6.5"
143 | }
144 | },
145 | "nbformat": 4,
146 | "nbformat_minor": 2
147 | }
148 |
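The GGNN update described in this notebook (sum the neighbor states, then feed the result to a GRU together with the previous node state) can be sketched without a per-node loop, because the same GRU weights are shared by all nodes and each node takes a single step. The following is a NumPy illustration, not the TensorFlow implementation above. The gate layout, the omission of biases and the parameter names (`Wz`, `Uz`, and so on) are assumptions made for brevity and may differ in detail from `tf.contrib.rnn.GRUCell`.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_update(h, m, p):
    """One GRU step: message m is the input, h is the previous state.

    Biases are omitted for brevity (a simplifying assumption).
    """
    z = sigmoid(m @ p["Wz"] + h @ p["Uz"])        # update gate
    r = sigmoid(m @ p["Wr"] + h @ p["Ur"])        # reset gate
    c = np.tanh(m @ p["Wc"] + (r * h) @ p["Uc"])  # candidate state
    return z * h + (1.0 - z) * c

def ggnn_step(H, A, p):
    """Update all nodes at once. H: [batch, nodes, dim], A: [batch, nodes, nodes]."""
    M = A @ H                  # message: sum of neighbor states for every node
    return gru_update(H, M, p)

rng = np.random.default_rng(0)
dim = 3
params = {k: rng.normal(scale=0.5, size=(dim, dim))
          for k in ("Wz", "Uz", "Wr", "Ur", "Wc", "Uc")}

# Toy path graph 0-1-2-3, repeated for a batch of two graphs.
A_single = np.array([[0, 1, 0, 0],
                     [1, 0, 1, 0],
                     [0, 1, 0, 1],
                     [0, 0, 1, 0]], dtype=float)
A = np.stack([A_single, A_single])
H = rng.normal(size=(2, 4, dim))

H_next = ggnn_step(H, A, params)

# Reference: the same update computed node by node, as in the for-loop version.
M = A @ H
H_loop = np.stack([gru_update(H[:, i, :], M[:, i, :], params) for i in range(4)],
                  axis=1)
print(np.allclose(H_next, H_loop))
```

The vectorized and looped results should match, which suggests that the loop over nodes could be replaced by one cell call on tensors reshaped to `[batch * num_nodes, dim]`. Whether that carries over cleanly to the TensorFlow version is left as an assumption to verify.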
--------------------------------------------------------------------------------